An idea how to speed up nested-loops join a little

Working on subquery optimizations, got an idea how to speed up join execution a little. Read on.

The idea

Consider a query:

select * from t1, t2, t3 where t3.key=t1.col1 and t2.key=t1.col2

Suppose the join order is t1, t2, t3, i.e. the EXPLAIN is like

+-------+------+---------------+------+---------+--------------+-..
| table | type | possible_keys | key  | key_len | ref          |
+-------+------+---------------+------+---------+--------------+-..
| t1    | ALL  | NULL          | NULL | NULL    | NULL         |
| t2    | ref  | key           | key  | 5       | test.t1.col1 |
| t3    | ref  | key           | key  | 5       | test.t1.col2 |
+-------+------+---------------+------+---------+--------------+-..

The important property is that access to t3 is independent of access to t2. MySQL’s nested loops join algorithm will run this as in this swimlane diagram:

NL execution diagram

Here we assume that

table t2 has 4 rows such that t2.key=t1.col1
table t3 doesn’t have any rows such that t3.key=t1.col2

As soon as we make first index lookup in table t3 (the one with blue border), we see that there will be no matching row combinations for this row of t1. Nevertheless, MySQL executioner will proceed to examine different rows in table t2 (marked with red border). This is redundant and can be easily avoided.

The implementation

The executioner part is easy: just maked the nested-loops join code to “jump back” in the cases like the illustrated. If we don’t find a match and table access/selection condition does not depend on the preceding table(s), then go back to the last table that the table access depends on. I don’t have the ready term for this, the working name is “short-cutting”.

The optimizer part is (as usual) more complicated. One way is to take the easy route: let the join optimizer (the part of code that chooses the join order) remain unaware of the short-cutting. Once the join order is produced, set up the executioner to perform short-cutting where appopriate.

The bad part is that join optimizer doesn’t account for short-cutting when comparing costs of various join orders, which may lead it to choose non-optimal join orders. However, taking short-cutting into account won’t be easy:

First, we’ll need to generate selection conditions for the join orders we consider (in other words, figure out if the EXPLAIN will have “Using WHERE”). Our current way of doing this will probably be too expensive to be done for each considered join order.
Second, we’ll need to get somewhere an estimate of how often short-cutting will occur. That is, we’ll need to know probability that for some *arbitrary* values of columns from preceding tables, table T will have no rows that would satisfy given condition COND(preceding_tables, T). This estimate is likely to end up being some while guess like “lets use
0.5”.

At the moment the easy route seems like the way to go.

2007-02-11: Fixed typos in EXPLAIN ouput

5 thoughts on “An idea how to speed up nested-loops join a little”

Xaprb says:

January 29, 2007 at 5:04 pm

Perhaps another way to think of it is “pruning the execution tree” if you think of the nested-loop join as a depth-first search.

LikeLike

sergey says:

January 31, 2007 at 10:43 pm

Maybe, I’m not a terminology guru. Thanks for feedback, I’ll keep the “pruning the execution tree” option for the time we’ll be settling with which terms to use.

LikeLike

wolis says:

March 1, 2007 at 10:46 am

Can you show us the optimised version of the initial SQL or does “The executioner part is easy: just maked the nested-loops join code to ‘jump back'” explain it and I just need to learn what the ‘executioner’ is?

Thanks

LikeLike

Sergey Petrunia’s blog » Nested-loops join speedup idea promoted to WL task says:

April 18, 2007 at 5:22 am

[…] The idea how to speed up nested-loops join a little I’ve mentioned earlier has now been promoted into a worklog entry. It is now known as WL#3724 “Short-Cutting Join Execution: Speeding up star queries” and its text is publicly available at MySQLForge. […]

LikeLike

Andris says:

April 21, 2007 at 4:48 pm

Once you should look at WL#358 too. I think it’s an idea to speed up things too, and help client application developers. (My idea.)

LikeLike

Sergei Petrunia’s blog

Obervations made while working on MariaDB's query optimizer

An idea how to speed up nested-loops join a little

The idea

The implementation

5 thoughts on “An idea how to speed up nested-loops join a little”

Leave a comment Cancel reply

The idea

The implementation

Share this:

Related

5 thoughts on “An idea how to speed up nested-loops join a little”

Leave a comment Cancel reply