Partition pruning tip: Use comparisons of column, not of partitioning function value

A customer issue drew my attention to this peculiarity: if partitioning is done by the value of some function, the partition pruning module will make use of comparisons of the partitioning column(s), but not of comparisons of the value of the partitioning function. Here is an example:

CREATE TABLE t1 ( recdate  DATETIME NOT NULL, ... )
PARTITION BY RANGE( TO_DAYS(recdate) ) (
  PARTITION p0 VALUES LESS THAN ( TO_DAYS('2007-01-01') ),
  PARTITION p1 VALUES LESS THAN ( TO_DAYS('2007-02-01') ),
  ...
);
EXPLAIN PARTITIONS SELECT * FROM t1 WHERE recdate='2007-01-15';
+----+-------------+-------+------------+------+-...
| id | select_type | table | partitions | type |
+----+-------------+-------+------------+------+-...
|  1 | SIMPLE      | t1    | p1         | ALL  |
+----+-------------+-------+------------+------+-...

EXPLAIN PARTITIONS SELECT * FROM t1 WHERE TO_DAYS(recdate)=TO_DAYS('2007-01-15');
+----+-------------+-------+----------------+------+-...
| id | select_type | table | partitions     | type |
+----+-------------+-------+----------------+------+-...
|  1 | SIMPLE      | t1    | p0,p1,p2,p3,p4 | ALL  |
+----+-------------+-------+----------------+------+-...

This also applies to cases where the partitioning function is a function of multiple columns. This gives us the following tip:


For partition pruning to work, the WHERE clause must compare the columns used by the partitioning function, not the result value of the partitioning function.

This doesn’t look like a totally natural limitation, but that is what is in effect for the time being.

The impact of this limitation and whether it can be lifted

The limitation can easily be worked around in cases where the partitioning function is unary and monotonic, i.e. where one can easily convert back and forth between a comparison of the partitioning function:

  monotonic_part_func(const1) < monotonic_part_func(part_col) < monotonic_part_func(const2)

and a comparison of the partitioning column:

const1 < part_col < const2
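The conversion for the monotonic unary case can be sketched in a few lines of Python. This is a toy illustration, not server code: `to_days()` is a stand-in for MySQL's TO_DAYS(), and `map_interval()` is a hypothetical helper name.

```python
from datetime import date

def to_days(d):
    # Toy stand-in for MySQL's TO_DAYS(): monotonically increasing in its argument.
    return d.toordinal()

def map_interval(f, const1, const2):
    """For a monotonically increasing unary f, the column interval
    const1 < part_col < const2 maps to the partition-function interval
    f(const1) < f(part_col) < f(const2), and vice versa."""
    return f(const1), f(const2)

lo, hi = map_interval(to_days, date(2007, 1, 1), date(2007, 2, 1))
# A row with recdate = '2007-01-15' falls inside both forms of the interval:
assert lo < to_days(date(2007, 1, 15)) < hi
```

Because f is monotonic, the two forms of the interval select exactly the same rows, which is why the conversion is safe in this case.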

The partition pruning module can do such conversions too, so it would not be difficult to make MySQL handle such cases automatically. The case of a non-monotonic or n-ary partitioning function is more difficult. Partition pruning uses range analysis over column ordering, that is, it analyzes the WHERE clause and converts it to a disjoint list of intervals like:

(a < part_col < b) OR (part_col = c) OR ...

and then finds which partitions those intervals fall into. We could switch to part_func(part_col) ordering, i.e. collect intervals like

(a < part_func(part_col) < b) OR (part_func(part_col) = c) OR ...

but this will make it hard to handle predicates like “a < part_col < b” (remember, I’m talking about the non-monotonic case, where X < Y does not imply part_func(X) < part_func(Y)).

We could do two partition pruning passes, one with part_col ordering and the other with part_func(part_col) ordering, but that is slow, ugly, and will not handle cases like

part_col=const1 OR part_func(part_col)=const2

We could stop doing the range analysis altogether and operate on sets of used partitions, that is, recursively process the WHERE clause using a set of rules like this:


# AND/OR recursion
partitions(cond1 AND cond2) := intersect(partitions(cond1), partitions(cond2))
partitions(cond1 OR cond2) := union(partitions(cond1), partitions(cond2))

# Various atomic predicates
partitions(part_col = C) := { partition_no(part_func(C)) }
partitions(part_col < C) := ...
...

but that will be slow, especially when there are many partitions. Also, this technique will work poorly (or get very complicated) in cases where the partitioning function uses several columns and comparisons of those columns are on different AND/OR branches of the WHERE clause.
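The rule set above can be sketched as a recursive Python function. Everything concrete here is an illustrative assumption: the condition-tree encoding, the identity `part_func`, and the hypothetical RANGE scheme behind `partition_no()`.

```python
N_PARTS = 4
ALL_PARTS = set(range(N_PARTS))

def part_func(col_value):
    return col_value          # identity partitioning function, for the sketch

def partition_no(func_value):
    # Hypothetical RANGE scheme: partition i holds func values [i*10, (i+1)*10).
    return min(func_value // 10, N_PARTS - 1)

def partitions(cond):
    """cond is a tuple tree: ("AND", c1, c2), ("OR", c1, c2),
    ("=", C) for part_col = C, ("<", C) for part_col < C."""
    op = cond[0]
    if op == "AND":
        return partitions(cond[1]) & partitions(cond[2])
    if op == "OR":
        return partitions(cond[1]) | partitions(cond[2])
    if op == "=":
        return {partition_no(part_func(cond[1]))}
    if op == "<":
        return set(range(partition_no(part_func(cond[1])) + 1))
    return set(ALL_PARTS)     # unrecognized predicate: cannot prune

# (part_col = 5 AND part_col < 25) OR part_col = 33
cond = ("OR", ("AND", ("=", 5), ("<", 25)), ("=", 33))
print(sorted(partitions(cond)))  # [0, 3]
```

The recursion itself is cheap; the cost the text worries about is in the atomic cases, which may have to enumerate many partitions per predicate.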

Neither solution is perfect, and each of them adds some overhead. It seems we’ll need to collect some user input before we decide what (if anything) we should do.

Use of join buffer is now visible in EXPLAIN

UPDATE:
* s/Using join cache/Using join buffer/, changed to show the final variants of EXPLAIN output as described here
* s/join_buff_size/join_buffer_size/


Starting from 5.1.18, EXPLAIN output may show “Using join buffer“, as in this example:

mysql> explain select * from t1, t2 where t1.col < 10 and t2.col < 'bar';
+----+-------------+-------+-------+-...-+--------------------------------+
| id | select_type | table | type  |     | Extra                          |
+----+-------------+-------+-------+-...-+--------------------------------+
|  1 | SIMPLE      | t1    | range |     | Using where                    |
|  1 | SIMPLE      | t2    | range |     | Using where; Using join buffer |
+----+-------------+-------+-------+-...-+--------------------------------+

The join buffer is actually not a new feature. It has been available in MySQL at least since version 4.0, and for all this time it has remained invisible and undocumented. The only thing that indicated its presence was the @@join_buffer_size server variable.

We’re trying to gradually make EXPLAIN show more information. Georgi Kodinov was fixing BUG#27531 and has used that occasion to make join buffering show up in EXPLAIN output.

If you already know how MySQL’s join buffering works, that’s all the news. If not, the remainder of this post has a hopefully readable explanation of how join buffering works and when it is used.

How join buffering works

Let’s start with regular Nested Loops Join. Suppose we have a join query

select * from t1, t2, t3 where t2.key1=t1.col1 and t3.key1<40;

and the query plan is as shown in this EXPLAIN output:

...-+-------+-------+---------------+------+---------+------------+------+-------------+
    | table | type  | possible_keys | key  | key_len | ref        | rows | Extra       |
...-+-------+-------+---------------+------+---------+------------+------+-------------+
    | t1    | ALL   | NULL          | NULL | NULL    | NULL       |   10 |             |
    | t2    | ref   | key1          | key1 | 5       | db.t1.col1 |    2 | Using where |
    | t3    | range | key1          | key1 | 5       | NULL       |   40 | Using where |
...-+-------+-------+---------------+------+---------+------------+------+-------------+

When no join buffering is used, the query will be executed as follows:

  for each record t1rec in table t1
  {
    for each record t2rec in t2 such that t2rec.key1=t1rec.col1
    {
      for each record t3rec in t3 such that t3rec.key1<40
      {
        pass the (t1rec, t2rec, t3rec) row combination to output;
      }
    }
  }

Graphically the execution flow can be depicted as follows (yellow are the table scans, blue are the table rows):

nl-join-no-buffering.png

From the code and picture we see that:

  • Table t2 is scanned several times, but each scan accesses a different part of the table
  • Table t3 is scanned many times, and all performed scans are identical

It is apparent that the second and subsequent scans of table t3 bring no new information and can be removed. We do not have to re-scan t3 for every row combination from tables t1 and t2. Instead, we can accumulate a batch of such row combinations and then do one t3 scan for all of them. This is what join buffering is.

In pseudo-code, the execution will look as follows:

  for each record t1rec in table t1
  {
    for each record t2rec in t2 such that t2rec.key1=t1rec.col1
    {
      put (t1rec, t2rec) into the buffer
      if (buffer is full)
        flush_buffer();
    }
  }
  flush_buffer();  /* flush the remaining row combinations */

  flush_buffer() {
    for each record t3rec in t3 such that t3rec.key1<40
    {
      for each record (t1rec, t2rec) in the buffer
        pass (t1rec, t2rec, t3rec) row combination to output;
    }
    empty the buffer;
  }

And graphically it will look as follows:

nl-join-buffering.png
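The pseudocode above can be turned into a small runnable sketch. The row format, table data, and `buf_size` (which plays the role of the join buffer capacity) are toy assumptions for illustration.

```python
def nested_loop_join(t1, t2, t3):
    out = []
    for r1 in t1:
        for r2 in (r for r in t2 if r["key1"] == r1["col1"]):
            for r3 in (r for r in t3 if r["key1"] < 40):  # t3 re-scanned every time
                out.append((r1, r2, r3))
    return out

def buffered_join(t1, t2, t3, buf_size=2):
    out, buf = [], []

    def flush_buffer():
        # One scan of t3 serves every buffered (r1, r2) combination.
        for r3 in (r for r in t3 if r["key1"] < 40):
            for r1, r2 in buf:
                out.append((r1, r2, r3))
        buf.clear()

    for r1 in t1:
        for r2 in (r for r in t2 if r["key1"] == r1["col1"]):
            buf.append((r1, r2))
            if len(buf) == buf_size:
                flush_buffer()
    flush_buffer()  # don't forget the leftover combinations
    return out

t1 = [{"col1": 1}, {"col1": 2}]
t2 = [{"key1": 1}, {"key1": 2}, {"key1": 2}]
t3 = [{"key1": 10}, {"key1": 50}]
# Same row combinations; buffering changes only the order they are produced in.
assert sorted(map(str, nested_loop_join(t1, t2, t3))) == \
       sorted(map(str, buffered_join(t1, t2, t3)))
```

Note that the buffered version emits rows in a different order, which is also true of the real join buffer: it trades output order for fewer scans of the last table.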

The EXPLAIN output will be as follows:

explain select * from t1,t2,t3 where t2.key1 = t1.col1 and t3.key1<40;
...-+-------+-------+---------------+------+---------+--------------+------+--------------------------------+
    | table | type  | possible_keys | key  | key_len | ref          | rows | Extra                          |
...-+-------+-------+---------------+------+---------+--------------+------+--------------------------------+
    | t1    | ALL   | NULL          | NULL | NULL    | NULL         |   10 |                                |
    | t2    | ref   | key1          | key1 | 5       | test.t1.col1 |    2 | Using where                    |
    | t3    | range | key1          | key1 | 5       | NULL         |   40 | Using where; Using join buffer |
...-+-------+-------+---------------+------+---------+--------------+------+--------------------------------+

In this example join buffering is used for one table, but it can be used for several tables as well. MySQL uses join buffering whenever it can: access to some table tbl_x will be buffered if

  • The SELECT does not have an ORDER BY clause
  • We’re not at top level “select” of a multi-table UPDATE
  • tbl_x is accessed using an “independent” access method: ALL, index, range, or index_merge.
  • tbl_x is not inner w.r.t. some outer join

The server variable @@join_buffer_size specifies how much MySQL should allocate for each buffer. That is, if two tables use buffering, MySQL will allocate two buffers of @@join_buffer_size bytes each.

Thoughts on partitioning optimizations

I’m supposed to be working on subquery optimization but I can’t get BUG#26630 out of my head.

The bug test case is small and easy to understand. Looking at the EXPLAIN:

create table tbl_test( ... ) partition by range(num) (...);

explain partitions
select * from tbl_co c straight_join tbl_test t where t.num=c.num and reg=8;
..+-------+-------------+------+---------------+------+---------+------+------+-------------+
  | table | partitions  | type | possible_keys | key  | key_len | ref  | rows | Extra       |
..+-------+-------------+------+---------------+------+---------+------+------+-------------+
  | c     | NULL        | ALL  | NULL          | NULL | NULL    | NULL |   17 | Using where |
  | t     | p0,p1,p2,p3 | ALL  | NULL          | NULL | NULL    | NULL |   17 | Using where |
..+-------+-------------+------+---------------+------+---------+------+------+-------------+

we see that no partition pruning is performed for table t, while it is apparent that MySQL could do this (the partition-related pieces of the pseudocode below are what’s missing):

  for each row in table c
  {
     $p= <find partition that has rows such that t.num = c.num>;
     for each row in table t, within partition $p
       if (where clause matches)
         pass row combination to output;
  }

Partition Selection does something similar, but it will work only if there is an INDEX(t.num) and the optimizer chooses to access table t via ref(t.num=c.num). This doesn’t hold in our case, so all partitions will be scanned every time.
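A sketch, in Python, of what the missing per-row pruning would do. The table data and the RANGE bounds are hypothetical, and `find_partition()` stands in for the “find partition” step in the pseudocode above.

```python
import bisect

# Hypothetical RANGE partitioning of t on t.num: VALUES LESS THAN (10), (20), (30), (40).
BOUNDS = [10, 20, 30, 40]
t_partitions = [[{"num": 5}], [{"num": 15}], [{"num": 25}], [{"num": 35}]]

def find_partition(num):
    # "$p = <find partition that has rows such that t.num = c.num>"
    return bisect.bisect_right(BOUNDS, num)

def join_with_pruning(c_rows):
    out = []
    for c in c_rows:                      # for each row in table c
        if c["reg"] != 8:                 # WHERE reg=8
            continue
        p = find_partition(c["num"])      # the step MySQL currently skips here
        for t in t_partitions[p]:         # scan only partition $p, not all of them
            if t["num"] == c["num"]:
                out.append((c, t))
    return out

c_rows = [{"num": 15, "reg": 8}, {"num": 25, "reg": 8}, {"num": 15, "reg": 9}]
assert len(join_with_pruning(c_rows)) == 2
```

Each outer row triggers a scan of exactly one partition instead of all four, which is the saving the bug report is asking for.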

Ok, now on to the general thoughts. MySQL has two kinds of partitioning optimizations:

  1. Partition pruning is performed statically, i.e. we look at the query and infer a set of partitions that do not need to be accessed, no matter which query execution plan will be used.
  2. Partition selection is the opposite: when we’re executing the query and do an index lookup on

       tbl.keypart1=const1 AND tbl.keypart2=const2 AND ... AND tbl.keypartN=constN

     we check if those equalities allow us to determine one single partition that needs to be accessed. If yes, we access only that partition.

That is, we have

Partition pruning
  • static (can use predicates that depend only on this table)
  • thorough predicate analysis (can use >, <, BETWEEN, etc)
Partition selection
  • dynamic (can use predicates like tbl.x=tbl2.y)
  • can use equalities only

This dualism has its direct counterpart in MySQL table access methods. If we ignore “special” methods like fulltext and loose index scan, we have

range/index_merge
  • “static” access methods (can use predicates that depend only on this table)
  • thorough predicate analysis
ref-family methods
  • dynamic (can use predicates like tbl.x=tbl2.y)
  • can use equalities only

In fact, Partition Pruning is performed by creating a fake “index description”, running the range/index_merge analyzer, and then analyzing the obtained ranges.

Partition Selection could have been implemented in a similar way by re-using the ref-family methods analyzer: we could create a fake index description, run the analyzer, and then check if there are any potential ref accesses that use that index. If we have a ref-access candidate on

  tbl.partition_col = some_expression(some_tables)

then we will know that we only need to access one partition, and we’ll know how to find out which. This solution is better than Partition Selection because

  • The requirement that we use an index that covers all partitioned columns will be lifted
  • The optimizer will know that only one partition will be accessed (currently this is discovered only at runtime) and will be able to take this into account
    • and it will be easy to show it in EXPLAIN, too
  • Architecturally, everything will look very nice:

      Engine                       User
      range/index_merge analyzer   partition pruning
      ref-family analyzer          partition selection

Having written all this, I now realize that the example of BUG#26630 will probably not be resolved – it will still have to use all partitions because of the automatic unconditional use of “join buffering”. Well, hopefully the reporter does not really intend to run cross joins and will be satisfied with the ability to use indexes that do not cover all partitioned fields.

Rant

Now a small rant. It seems all this is an example of Conway’s Law:

  1. Partition pruning was designed/implemented by people who were familiar with the range/index_merge analyzer. Hence the reuse.
  2. Partition selection was designed/implemented by people who, I believe, were not familiar with the ref-family analyzer. Hence, no reuse. They were familiar with the table handler interface, and so partition selection went into the handler interface.

Funny.

MySQL Optimizer and Prepared Statements

Performance-wise, the idea of Prepared Statements is that the server does certain pre-processing on the PREPARE command, and those steps are then omitted when the statement is executed. If the statement is executed several times, we get a saving of

cost_of_prepare_preprocessing * (#statement_executions – 1)

This makes one want to move the CPU- and IO-intensive query optimization into the PREPARE phase. Unfortunately, this would make the optimizer perform much worse: the optimizer’s decisions are based on external information, and much less information is available at PREPARE time. The most crucial differences are that

  • The values of the ‘?’ parameter markers are not yet known
  • The results of probes done in the queried tables cannot be relied on because the table data may change before the EXECUTE
  • [less important] Table and index statistics may change before the EXECUTE

Those limitations cause most of MySQL’s optimizations to be re-done at every EXECUTE command. To be precise, here is a table of what is done when:

Action                                        When
Query parse                                   PREPARE
Negation elimination                          PREPARE
Subquery re-writes                            PREPARE
Nested JOIN simplification                    First EXECUTE
OUTER->INNER JOIN conversions                 First EXECUTE
Partition pruning                             Every EXECUTE
COUNT/MIN/MAX elimination                     Every EXECUTE
Constant subexpression removal                Every EXECUTE
Equality propagation                          Every EXECUTE
Constant table detection                      Every EXECUTE
ref access analysis                           Every EXECUTE
range/index_merge analysis and optimization   Every EXECUTE
Join optimization                             Every EXECUTE

Basically, the current rule is as follows:

Optimizations that depend only on query syntax are performed either at PREPARE or at first EXECUTE. Optimizations that also depend on something else are performed at every EXECUTE.

If you’re only interested in the current state of affairs, that’s all you need to know. If you want to know what [I think] is going to happen in the future, read on.

The future

The above quoted rule is not a DBMS law of nature. In fact, it is already getting in the way of subquery rewrites (see e.g. BUG#27430, WL#3740) and some other server subsystems. We’ve recently had a discussion about other possible solutions. The smartest and most elegant solution was suggested by Timour:

  • Let every optimization record its “dependencies” – information that it depends on:
    • ref access analyzer: set of table indexes
    • constant table detector: the above + the set of tables that have 1 or 0 rows
    • range optimizer: all of the above + used values of parameter markers (if any) + results of records_in_range calls
    • … and so forth…
  • On subsequent EXECUTE, check which dependencies have changed and re-do the corresponding optimizations.

This approach would reduce the work done on the EXECUTE step to a set of checks. In most cases the checks will pass, and no re-optimization will be required. This approach is hard to implement, however: MySQL’s internal data structures do not naturally support “block undo”, so it is not easy to cancel and re-do an optimization.
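Timour’s dependency-tracking idea can be sketched like this. The class name `CachedPlan` and the tuple-snapshot representation of dependencies are mine, not server structures.

```python
class CachedPlan:
    """Each optimization records a snapshot of the inputs it depended on;
    on a later EXECUTE it is re-done only if those inputs changed."""
    def __init__(self):
        self.deps = {}     # optimization name -> snapshot of its inputs
        self.redone = []   # log of (re-)done optimizations, for the demo

    def run(self, name, current_inputs, optimize):
        if self.deps.get(name) != current_inputs:
            optimize()                     # (re-)do this optimization
            self.deps[name] = current_inputs
            self.redone.append(name)

plan = CachedPlan()
indexes = ("PRIMARY", "key1")
plan.run("ref_analysis", indexes, lambda: None)   # first EXECUTE: done
plan.run("ref_analysis", indexes, lambda: None)   # unchanged inputs: skipped
indexes = ("PRIMARY",)                            # an index was dropped
plan.run("ref_analysis", indexes, lambda: None)   # changed inputs: redone
print(plan.redone)  # ['ref_analysis', 'ref_analysis']
```

The hard part the post mentions is hidden inside `optimize()`: in the real server, re-doing one optimization means undoing its side effects on shared data structures first.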

For now we’ve settled for a less impressive solution:

  • We’ll gradually move query rewrites that depend on table DDLs into PREPARE phase
  • On EXECUTE we’ll check if DDLs of the used tables have changed. If they have, we’ll throw out the Prepared Statement and re-prepare it.

Re-preparing the statement is expensive, but it should be a very rare occurrence for most applications. This task is filed as WL#3813: Re-prepare the prepared statement if its underlying objects have changed.

How to find out if an outer join was converted to inner

After this post I got a question about how one can tell whether an outer join was converted to an inner join. You can find out by looking at the warning generated by EXPLAIN EXTENDED. If the outer join wasn’t converted, you’ll see it in the rewritten query in the warning:

mysql> explain extended select * from t1 left join (t2, t3) on t2.a= t1.a;
...
3 rows in set, 1 warning (0.00 sec)

mysql> show warnings\G
*************************** 1. row ***************************
  Level: Note
   Code: 1003
Message: select `test`.`t1`.`a` AS `a`,`test`.`t2`.`a` AS `a`,`test`.`t3`.`a`
AS `a` from `test`.`t1` left join (`test`.`t2` join `test`.`t3`) on ((`test`.`t2`.
`a` = `test`.`t1`.`a`)) where 1

In this query the LEFT JOIN is not converted to an inner join.

Now let’s try a query where the outer join will be converted:

mysql> explain extended select * from t1 left join (t2, t3) on t2.a= t1.a where t2.a < 10;
...

mysql> show warnings\G
*************************** 1. row ***************************
  Level: Note
   Code: 1003
Message: select `test`.`t1`.`a` AS `a`,`test`.`t2`.`a` AS `a`,`test`.`t3`.`a`
AS `a` from `test`.`t1` join `test`.`t2` join `test`.`t3` where ((`test`.`t1`.
`a` = `test`.`t2`.`a`) and (`test`.`t2`.`a` < 10))

Looking at the warning text, we see an inner join instead of the left join, and we also see that the ON clause has been merged into the WHERE.

Yes, those warning messages are hard to read: they have excessive quoting and the lines are too long. But at the moment certain kinds of query plan details are displayed only there, so skimming through the rewritten query text may pay off when you have doubts about what is going on.

Nested outer joins

Here is MySQL’s nested outer joins optimization cheat sheet:

  • Conversions:
    • RIGHT JOIN is converted to LEFT JOIN. FULL JOIN is not supported.
    • Outer joins are converted to inner joins when possible
  • Constraints on join order:

    • “Outer tables go first”
    • “No interleaving”
  • Table access rules:

    • “Inner” table access methods are constructed from parts of the ON condition. The WHERE condition can’t be used to construct table accesses.
    • Parts of the ON condition are checked as soon as possible
    • Parts of the WHERE condition
      – are not checked until we’ve found a row combination that matches the ON clause
      – are checked as soon as possible after that.

Or, in more detail:

Conversions

RIGHT JOIN to LEFT JOIN conversion is obvious:

  (t1 RIGHT JOIN t2 ON cond) = (t2 LEFT JOIN t1 ON cond)

Conversion from outer to inner join is possible when the result of the inner join will be the same. It will be the same if a row combination with a NULL-complemented row would not pass the WHERE clause. For example, if we look at the query

  t1 LEFT JOIN t2 ON some_cond WHERE t2.a=t1.b

we’ll see that a row combination with t2.a IS NULL will not satisfy the WHERE condition. Hence, this outer join can be converted to an inner join.

Constraints on join order

Outer tables go first
  Any outer table used in the outer join’s ON clause must come before all of the inner tables.
No interleaving
  Tables contained within an outer join must form a continuous sequence in the join order. Interleaving with tables that are outside of the outer join is not allowed.

Table access rules

Now, this requires some explanation. MySQL’s nested-loops join code tries to check parts of the WHERE as soon as possible. For example, when a query

SELECT * FROM
  t1,t2, ...
WHERE
  t1.col1=c1 AND
  t2.col1=t1.col2 AND t2.col2=c3 AND
  ...

is executed using a join order of (t1, t2,…), it proceeds according to this kind of scenario:

Inner join swimlanes

We can see here that the WHERE condition is split into parts that are checked “as early as possible”.

With outer joins it is more complicated. We need to know whether we’ll need to generate a NULL-complemented row combination. We won’t need to if there was a combination of inner tables that matched the ON (but not necessarily the WHERE) clause. The solution is to switch the checking of WHERE parts on and off.

The best way to show it is with an example. Suppose we have a query

 SELECT * FROM ... ot1 LEFT JOIN (it1, it2) ON somecond WHERE ...

and suppose the join order is (ot1, it1, it2, …). The execution will proceed in this manner:

outer join swimlanes

What’s visible there? When we start scanning table it1, we check only the ON condition. We can’t check the WHERE: we could discard some it1 row that is the only row matching the ON condition, conclude there were no matches, and erroneously generate the NULL-complemented row.

After we’ve found a match for the ON condition, we go back and check all the parts of the WHERE that we did not check for the above-mentioned reason.

After that, the execution proceeds as if this was an inner join, with ON merged into the WHERE clause.

The diagram also shows why we can’t use parts of the WHERE clause to create table access methods: there are times when we can’t use parts of the WHERE for filtering. We can always use parts of the ON, though.

Now it should be clear where all of the Table Access Rules came from.
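The on/off switching of WHERE checks can be sketched in Python. This is a toy single-condition version: real MySQL splits the WHERE into parts attached to individual tables and checks each as early as allowed, while this sketch checks the whole WHERE at the innermost table. The table data and condition lambdas are illustrative.

```python
def left_join_exec(ot1, it1, it2, on_cond, where_cond):
    out = []
    for o in ot1:
        matched = False
        for i1 in it1:
            for i2 in it2:
                if not on_cond(o, i1, i2):
                    continue               # WHERE checking is OFF while probing for an ON match
                matched = True
                if where_cond(o, i1, i2):  # WHERE checking back ON after the match
                    out.append((o, i1, i2))
        if not matched:
            row = (o, None, None)          # NULL-complemented row combination
            if where_cond(*row):
                out.append(row)
    return out

on_cond = lambda o, i1, i2: i1 == o                  # ON it1.a = ot1.a
where_cond = lambda o, i1, i2: i2 is None or i2 > 0  # a WHERE part involving it2
print(left_join_exec([1, 2], [1], [0, 5], on_cond, where_cond))
# [(1, 1, 5), (2, None, None)]
```

Checking `where_cond` while probing would have discarded ON matches and produced a spurious NULL-complemented row, which is exactly the mistake the text warns about.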

Subqueries: the new strategy for "NULL IN (SELECT …)"

I hope this is my last post about this topic. It seems we’ve resolved all of the issues, and I’ll now describe the user-visible consequences.
To recall, we’re talking about subquery predicates of the form

 (oe1, ..., oeN) [NOT] IN (SELECT ie1, ..., ieN FROM ... )

that are located in a context where it matters whether the predicate’s result is NULL or FALSE. The name “oe” stands for “outer expression”, and “ie” stands for “inner expression”.

MySQL evaluates queries “from outside to inside”, i.e. we first get the values of (oe1, ..., oeN) and then run the subquery and capture the rows it produces. An apparent and very useful optimization is to “inform” the subquery that we’re looking only for rows that have “ie1=oe1”, “ie2=oe2”, and so on. This is done by injecting appropriate equalities into the subquery’s WHERE (or HAVING) clause. That is,

 (oe1, ..., oeN) [NOT] IN (SELECT ie1, ..., ieN FROM ... )

becomes

  EXISTS (SELECT 1 /* ie1, ..., ieN */
           FROM ... WHERE subquery_where AND
                          oe1=ie1 AND
                          oe2=ie2 AND
                          ... AND
                          oeN=ieN)

However, this conversion is only valid if we ignore possible NULL values. If some of the iek can be NULL, we need to use oek=iek OR iek IS NULL instead. I’ve covered this case in detail here, in the “NULL problem in the right part” section.

Correct handling of cases where some oek IS NULL requires more radical changes. We’ve just made those changes, and here they are:

The new strategy

According to SQL’s interpretation of NULL as “unknown value”,

  NULL IN (non-empty-list-of-some-values) = NULL
  NULL IN () = FALSE

So, when we want to evaluate

  NULL IN (SELECT ie FROM ... )

we need to run the SELECT and see if it produces any rows. Note that we need to run the original SELECT, without any of the injected equalities mentioned in the previous section.

On the other hand, it is absolutely essential to have

  not_null_oe IN (SELECT ie FROM ...)

converted to

  EXISTS (SELECT 1 /* ie1 */ FROM ... WHERE ie1=not_null_oe ...)

If we don’t do this, subqueries will be terribly slow. We’ve solved this “inject or not inject” dilemma by wrapping the injected conditions into triggers. A subquery

  (oe1, ..., oeN) [NOT] IN (SELECT ie1, ..., ieN FROM ... )

is converted into

  EXISTS (SELECT 1 /* ie1, ..., ieN */
           FROM ... WHERE subquery_where AND
                          trigcond(oe1=ie1) AND
                          ... AND
                          trigcond(oeN=ieN)
         )

where each trigcond(X) is a special “magic” function defined as:

  trigcond(X) := X    when the "linked" outer expression oe_i is not NULL
  trigcond(X) := TRUE when the "linked" outer expression oe_i is NULL
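A minimal Python model of trigcond() and the resulting IN semantics, using Python’s None for SQL NULL. The helper names are mine, not server code, and for brevity the model ignores NULLs among the inner values.

```python
NULL = None  # model SQL NULL as Python None

def trigcond(outer_value, predicate_value):
    # trigcond(X) evaluates to X when the linked outer value is not NULL,
    # and is simply TRUE ("switched off") when the outer value is NULL.
    return True if outer_value is NULL else predicate_value

def in_subquery(oe, inner_values):
    """Models oe IN (SELECT ie FROM ...) via EXISTS with a triggered equality.
    Returns True, False, or NULL."""
    # With the trigger off (oe is NULL), every row "matches": effectively a
    # full scan that only tells us whether the subquery produces any rows.
    exists = any(trigcond(oe, ie == oe) for ie in inner_values)
    if oe is NULL:
        return NULL if exists else False  # NULL IN (non-empty) = NULL; NULL IN () = FALSE
    return exists

assert in_subquery(3, [1, 2, 3]) is True
assert in_subquery(9, [1, 2, 3]) is False
assert in_subquery(NULL, [1, 2, 3]) is NULL
assert in_subquery(NULL, []) is False
```

The model shows both halves of the trade-off: with a non-NULL outer value the equality filters rows as usual, while a NULL outer value disables the filter and forces a scan.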

Equalities that are wrapped into the trigcond() function are not first-class predicates for the query optimizer. Most optimizations cannot deal with predicates that may be turned on and off at query execution time, so they treat any trigcond(X) as an unknown function and ignore it. At the moment, triggered equalities can be used by these optimizations:

  1. Ref-optimizer: trigcond(X=Y [OR Y IS NULL]) can be used to construct ref, eq_ref or ref_or_null table accesses.
  2. Index lookup-based subquery execution engines: trigcond(X=Y) can be used to construct unique_subquery or index_subquery access.
  3. Table condition generator: if the subquery is a join of several tables, the triggered condition will be checked as soon as possible.

When the optimizer uses a triggered condition to create some kind of index-lookup-based access (#1 and #2 in the above list), it must have a strategy for the case when the condition is turned off. This “Plan B” strategy is always the same: do a full table scan. In EXPLAIN, Plan B shows up as “Full scan on NULL key” in the “Extra” column:

mysql> explain select t1.col1, t1.col1 IN (select t2.key1 from t2 where t2.col2=t1.col2) from t1\G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: t1
        ...
*************************** 2. row ***************************
           id: 2
  select_type: DEPENDENT SUBQUERY
        table: t2
         type: index_subquery
possible_keys: key1
          key: key1
      key_len: 5
          ref: func
         rows: 2
        Extra: Using where; Full scan on NULL key

And if you run EXPLAIN EXTENDED …; SHOW WARNINGS you can see the triggered condition:

*************************** 2. row ***************************
  Level: Note
   Code: 1003
Message: select `test`.`t1`.`col1` AS `col1`,<in_optimizer>(`test`.`t1`.`col1`,<exists>(<
index_lookup>(<cache>(`test`.`t1`.`col1`) in t2 on key1 checking NULL where (`test`.`t2`.
`col2` = `test`.`t1`.`col2`) having trigcond(<is_not_null_test>(`test`.`t2`.`key1`))))) AS
`t1.col1 IN (select t2.key1 from t2 where t2.col2=t1.col2)` from `test`.`t1`

Performance implications

The first apparent implication is that NULL IN (SELECT …) may now cause full table scans (slow!) where it previously did not. This is the price to pay for correct results.
For multi-table subqueries, the execution of NULL IN (SELECT …) is going to be particularly slow because the join optimizer doesn’t optimize for the case where the outer expression is NULL. It assumes that subquery evaluations with NULL on the left side are very rare, even if there are statistics that say otherwise.

On the other hand, if you have a left expression that may be NULL but actually never is, you will not lose any speed.

The practical hints are:

  • A column must be declared as NOT NULL if it really is. This is important for the other parts of the query optimizer too.
  • If you don’t really need the correct NULL/FALSE answer, you can easily avoid the slow execution path: just replace

         oe IN (SELECT ie FROM ...)

    with

         (oe IS NOT NULL) AND (oe IN (SELECT ie FROM ...))

    and NULL IN (SELECT …) will never be evaluated because MySQL stops evaluating AND parts as soon as the answer is clear.

The goal of this new strategy was to improve compliance, not speed. However, we did not intend to make anything unnecessarily slow. If something became slower for you, please file a bug; perhaps we’ll be able to do something about it.

Subqueries: NULLs and IN/=ANY problem fixed

A while ago I wrote about a problem MySQL had with NULLs and IN/=ANY subqueries. I was completely correct when I wrote that the fix wouldn’t be simple. It took 3 bug entries (BUG#8804, BUG#24085, BUG#24127), several review iterations by Igor, and the final patch is around 2,300 lines long.

The good news is that this patch solves the problem completely, and it is already in the 5.0.36 tree. The documentation is not yet updated; doing that is now on my todo list. There is quite a lot to document: we’ve had to introduce “switchable access methods”, where a table accessed using ref is sometimes accessed using a full table scan. (For the impatient: no, the new access method is not called ref_or_null_or_all :-), it is still ref[_or_null] but with “Full scan on NULL key” in the “Extra” column.)

I’ll post here when the docs become available.

An idea how to speed up nested-loops join a little

While working on subquery optimizations, I got an idea for how to speed up join execution a little. Read on.

The idea

Consider a query:

select * from t1, t2, t3 where t3.key=t1.col2 and t2.key=t1.col1

Suppose the join order is t1, t2, t3, i.e. the EXPLAIN is like

+-------+------+---------------+------+---------+--------------+-..
| table | type | possible_keys | key  | key_len | ref          |
+-------+------+---------------+------+---------+--------------+-..
| t1    | ALL  | NULL          | NULL | NULL    | NULL         |
| t2    | ref  | key           | key  | 5       | test.t1.col1 |
| t3    | ref  | key           | key  | 5       | test.t1.col2 |
+-------+------+---------------+------+---------+--------------+-..

The important property is that access to t3 is independent of access to t2. MySQL’s nested-loops join algorithm will run this as shown in this swimlane diagram:

NL execution diagram

Here we assume that

  • table t2 has 4 rows such that t2.key=t1.col1
  • table t3 doesn’t have any rows such that t3.key=t1.col2

As soon as we make the first index lookup in table t3 (the one with the blue border), we see that there will be no matching row combinations for this row of t1. Nevertheless, MySQL’s executioner will proceed to examine different rows in table t2 (marked with a red border). This is redundant and can easily be avoided.

The implementation

The executioner part is easy: just make the nested-loops join code “jump back” in cases like the one illustrated. If we don’t find a match, and the table access/selection condition does not depend on the preceding table(s), then go back to the last table that the table access does depend on. I don’t have a ready term for this; the working name is “short-cutting”.
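A runnable sketch of short-cutting, with toy tables. The `break` implements the “jump back”: since t3’s lookup key depends only on t1, it returns control to the t1 loop rather than trying further t2 rows.

```python
def nl_join(t1, t2, t3, shortcut):
    t3_scans, out = 0, []
    for r1 in t1:
        for r2 in (r for r in t2 if r["key"] == r1["col1"]):
            t3_scans += 1
            matches = [r for r in t3 if r["key"] == r1["col2"]]
            out.extend((r1, r2, r3) for r3 in matches)
            if shortcut and not matches:
                # t3's lookup key depends only on t1, so no other t2 row can
                # help: "jump back" to the next row of t1.
                break
    return out, t3_scans

t1 = [{"col1": 1, "col2": 99}]   # no t3 row has key=99
t2 = [{"key": 1}] * 4            # four t2 rows match t1.col1
t3 = [{"key": 7}]
_, plain = nl_join(t1, t2, t3, shortcut=False)
_, short = nl_join(t1, t2, t3, shortcut=True)
print(plain, short)  # 4 1
```

Both variants produce the same (empty) result, but short-cutting does one t3 probe instead of four, matching the redundant red-bordered lookups in the diagram above.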

The optimizer part is (as usual) more complicated. One way is to take the easy route: let the join optimizer (the part of the code that chooses the join order) remain unaware of short-cutting. Once the join order is produced, set up the executioner to perform short-cutting where appropriate.

The bad part is that the join optimizer doesn’t account for short-cutting when comparing the costs of various join orders, which may lead it to choose non-optimal join orders. However, taking short-cutting into account won’t be easy:

  • First, we’ll need to generate selection conditions for the join orders we consider (in other words, figure out if the EXPLAIN will have “Using where”). Our current way of doing this will probably be too expensive to do for each considered join order.
  • Second, we’ll need to get an estimate of how often short-cutting will occur. That is, we’ll need to know the probability that, for some *arbitrary* values of columns from the preceding tables, table T will have no rows that satisfy a given condition COND(preceding_tables, T). This estimate is likely to end up being some wild guess like “let’s use 0.5”.

At the moment the easy route seems like the way to go.

2007-02-11: Fixed typos in EXPLAIN output