Thoughts on partitioning optimizations

I’m supposed to be working on subquery optimization but I can’t get BUG#26630 out of my head.

The bug test case is small and easy to understand: looking at the EXPLAIN:

create table tbl_test( ... ) partition by range(num) (...);

explain partitions
select * from tbl_co c straight_join tbl_test t where t.num=c.num and reg=8;
..+-------+-------------+------+---------------+------+---------+------+------+-------------+
  | table | partitions  | type | possible_keys | key  | key_len | ref  | rows | Extra       |
..+-------+-------------+------+---------------+------+---------+------+------+-------------+
  | c     | NULL        | ALL  | NULL          | NULL | NULL    | NULL |   17 | Using where |
  | t     | p0,p1,p2,p3 | ALL  | NULL          | NULL | NULL    | NULL |   17 | Using where |
..+-------+-------------+------+---------------+------+---------+------+------+-------------+

we see that for table t no partition pruning is performed. While it is apparent that MySQL could do this (pieces in red font indicate what’s missing):

  for each row in table c
  {
     $p= <find partition that has rows such that t.num = c.num>;
     for each row in table t, within partition $p
       if (where clause matches)
         pass row combination to output;
  }

Partition Selection does something similar, but it will work only if there is an INDEX(t.num) and the optimizer choses to access table t via ref(t.num=c.num). This doesn’t hold in our case, so all partitions will be scanned every time.

Ok, now on to the general thoughts. MySQL has two kinds of partitioning optimizations:

  1. Partition pruning is performed statically, i.e. we look at the query and infer a set of partitions that do not need to be accessed, no matter which query execution plan will be used.
  2. Partition selection is the opposite: when we’re executing the query, and do an index lookup on
      tbl.keypart1=const1 AND tbl.keypart2=const2 AND ... AND tbl.keypartN=constN
    

    we check if those equalities allow us to determine one single partition that needs to be accessed. If yes, we access only that partition.

That is, we have

Partition pruning
  • static (can use predicates that depend only on this table)
  • thorough predicate analysis (can use >, <, BETWEEN, etc)
Partition selection
  • dynamic (can use predicates like tbl.x=tbl2.y)
  • can use equalities only

This dualism has its direct counterpart in MySQL table access methods. If we ignore “special” methods like fulltext and loose index scan, we have

range/index_merge
  • “static” access methods (can use predicates that depend only on this table)
  • thorough predicate analysis
ref-family methods
  • dynamic (can use predicates like tbl.x=tbl2.y)
  • can use equalities only

In fact, Partition Pruning is performed by creating a fake “index description”, running range/index_merge analyzer and then analyzing the obtained ranges.

Partition Selection could have been implemented in a similar way by re-using the ref-family methods analyzer: we could create a fake index description, run the analyzer, and then check if there are any potential ref accesses that use that index. If we have a ref-access candidate on

  tbl.partition_col = some_expresion(some_tables)

then we will know that we only need to access one partition, and we’ll know how to find out which. This solution is better than Partition Selection because

  • The requirement that we use an index that covers all partitioned columns will be lifted
  • The optimizer will know that only one partition will be accessed (currently this is discovered only at runtime) and will be able to take this into account
    • and it will be easy to show it in EXPLAIN, too
  • Architecturally, everything will look very nice:
    Engine User
    range/index_merge analyzer partition pruning
    ref-family analyzer partition selection

Having written all this I now realize that the example of BUG#26630 will probably not be resolved – it will still have to use all partitions because of the automatic unconditional use of “join buffering”. Well, hopefully the reporter does not really intend to run cross joins and has some will be satisfied with ability to use indexes that do not cover all partitioned fields.

Rant

Now a small rant. It seems all this is an example of Conway’s Law:

  1. Partition pruning was designed/implemented by people who were familiar with range/index_merge analyzer. Hence the reuse.
  2. Partition selection was designed/implemented by people who I beleive were not familiar with ref-family analyzer. Hence, no reuse. They were familiar with table handler interface and so partition selection went into the handler interface.

Funny.

5 Comments

  1. makefool says:

    May I ask a question?
    I use partitioning in my “Mysql 5.1.24-rc-community”, the table create SQL like this ” CREATE TABLE min5(
    sCode varchar( 7 ) NOT NULL ,
    sTime double NOT NULL ,
    sVolume float DEFAULT NULL
    )partition BY range(
    floor( sTime )
    )(
    PARTITION p0 VALUES LESS THAN( 1995 ) , PARTITION p1
    VALUES LESS THAN( 1996 ) , PARTITION p2
    VALUES LESS THAN MAXVALUE
    ) “,
    and I got a error “#1491 – The PARTITION function returns the wrong type”

    How to alter the SQL to reslove it, the ‘sTime’ must be float type.

    Like

  2. sergey says:

    makefool, good question 🙂

    It doesn’t allow the partitioning you used because its return type is FLOAT and not INTEGER

    I’ve tried casting to integer, used

    …partition BY range( convert(floor(sTime), signed))

    but that got me

    ERROR 1564 (HY000): This partition function is not allowed

    Looking at

    http://dev.mysql.com/doc/refman/5.1/en/partitioning-limitations-functions.html

    it seems that it is now impossible to use RANGE partitioning over FLOAT columns.

    This is seems to be an unintended effect of “tightening the bolts” partitioning team did for partitioning over complicated charsets/collations.

    Like

  3. sergey says:

    makefool, some more notes:

    * The desired partitioning scheme is allowed if you change column type from FLOAT to DECIMAL(m,n). Maybe that will work for you?

    * For the future, we have PARTITION BY(COLUMN_LIST(…)) syntax in the works (see WL#4065 http://forge.mysql.com/worklog/task.php?id=4065) for examples, which will allow to directly use range partitioning over a FLOAT column.

    Like

  4. sergey says:

    Filed the problem with inability to use range partitioning over FLOAT columns as BUG#36303 (http://bugs.mysql.com/36303)

    Like

  5. makefool says:

    Thank you, I use such :

    CREATE TABLE min5(
    sCode varchar( 7 ) NOT NULL ,
    sTime DECIMAL( 8, 4 ) NOT NULL ,
    sVolume float DEFAULT NULL
    )partition BY range(
    floor( sTime )
    )(
    PARTITION p0 VALUES LESS THAN( 1995 ) , PARTITION p1
    VALUES LESS THAN( 1996 ) , PARTITION p2
    VALUES LESS THAN MAXVALUE
    )

    It works fine.
    I hardly imagine I can exploring such a bug, because I use DataBase tool so obtuse.

    thank you again

    A beginner from P.R. China.

    Like

Leave a Comment

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s