Performance-wise, the idea of Prepared Statements is that the server does certain pre-processing at PREPARE time, and those steps are then omitted when the statement is executed. If the statement is executed several times, the total saving is
cost_of_prepare_preprocessing * (#statement_executions – 1)
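In MySQL's SQL syntax, the flow looks like this (a minimal sketch; the table and values are made up):

```sql
-- Parsing and pre-processing happen once, at PREPARE time
PREPARE stmt FROM 'SELECT * FROM orders WHERE customer_id = ?';

-- Each EXECUTE skips the work that PREPARE has already done
SET @id = 1;
EXECUTE stmt USING @id;
SET @id = 2;
EXECUTE stmt USING @id;

DEALLOCATE PREPARE stmt;
```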
This makes one want to move the CPU- and IO-intensive query optimization into the PREPARE phase. Unfortunately, that would make the optimizer perform much worse: the optimizer’s decisions are based on external information, and much less of that information is available at PREPARE time. The most crucial differences are that
- The values of the ‘?’ parameter markers are not yet known (see the example after this list)
- The results of probes done in the queried tables cannot be relied on, because the table data may change before the EXECUTE
- [less important] Table and index statistics may change before the EXECUTE
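To see why the unknown marker values matter: for the statement below, the best access method depends entirely on the value bound at EXECUTE time, so it cannot be chosen at PREPARE time (a hypothetical table, for illustration):

```sql
CREATE TABLE t1 (a INT, b VARCHAR(100), KEY(a));
PREPARE stmt FROM 'SELECT * FROM t1 WHERE a < ?';

SET @v = 5;        -- selective: a range scan over KEY(a) is cheap
EXECUTE stmt USING @v;

SET @v = 1000000;  -- matches almost all rows: a full scan is cheaper
EXECUTE stmt USING @v;
```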
Those limitations cause most of MySQL’s optimizations to be re-done at every EXECUTE command. To be precise, here is a table of what is done when:
| Action | When |
|---|---|
| Query parse | PREPARE |
| Negation elimination | PREPARE |
| Subquery re-writes | PREPARE |
| Nested JOIN simplification | First EXECUTE |
| OUTER->INNER JOIN conversions | First EXECUTE |
| Partition pruning | Every EXECUTE |
| COUNT/MIN/MAX elimination | Every EXECUTE |
| Constant subexpression removal | Every EXECUTE |
| Equality propagation | Every EXECUTE |
| Constant table detection | Every EXECUTE |
| ref access analysis | Every EXECUTE |
| range/index_merge analysis and optimization | Every EXECUTE |
Basically, the current rule is as follows:
Optimizations that depend only on query syntax are performed either at PREPARE or at first EXECUTE. Optimizations that also depend on something else are performed at every EXECUTE.
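For instance, negation elimination needs nothing but the statement text, while range analysis also needs the marker values and table probes (a small illustration, assuming a table t2 with an indexed column a):

```sql
-- Syntax-only: NOT (a > 5) can be rewritten to a <= 5 at PREPARE
-- time without looking at any table data
PREPARE s1 FROM 'SELECT * FROM t2 WHERE NOT (a > 5)';

-- Data-dependent: the selectivity of "a < ?" is unknown until the
-- marker is bound, so range analysis must run at every EXECUTE
PREPARE s2 FROM 'SELECT * FROM t2 WHERE a < ?';
```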
If you’re only interested in the current state of affairs, that’s all you need to know. If you want to know what [I think] is going to happen in the future, read on.
The future
The above-quoted rule is not a DBMS law of nature. In fact, it is already getting in the way of subquery rewrites (see e.g. BUG#27430, WL#3740) and some other server subsystems. We’ve recently had a discussion about other possible solutions. The smartest and most elegant solution was suggested by Timour:
- Let every optimization record its “dependencies” – the information it depends on:
- ref access analyzer: the set of table indexes
- constant table detector: the above + the set of tables that have 1 or 0 rows (illustrated in the sketch after this list)
- range optimizer: all of the above + the used values of parameter markers (if any) + the results of records_in_range calls
- … and so forth…
- On a subsequent EXECUTE, check which dependencies have changed and re-do the corresponding optimizations.
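A sketch of how one of these dependencies can be invalidated between executions (hypothetical tables, for illustration):

```sql
CREATE TABLE tiny (a INT);
CREATE TABLE big  (b INT, KEY(b));
INSERT INTO tiny VALUES (1);   -- 'tiny' has exactly one row

PREPARE s FROM 'SELECT * FROM tiny, big WHERE big.b = tiny.a';
EXECUTE s;  -- 'tiny' is read at optimization time (constant table),
            -- and 'big' can be planned as ref access on b = const

INSERT INTO tiny VALUES (2);   -- the dependency "tiny has 0 or 1 rows"
EXECUTE s;  -- no longer holds, so under this scheme constant table
            -- detection (and everything built on it) is re-done
```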
This approach will reduce the work done at the EXECUTE step to a set of checks. In most cases the checks will pass, and no re-optimization will be required. The approach is hard to implement, however: MySQL’s internal data structures do not naturally support “block undo”, so it is not easy to cancel and re-do an optimization.
For now we’ve settled for a less impressive solution:
- We’ll gradually move query rewrites that depend on table DDLs into PREPARE phase
- On EXECUTE we’ll check if DDLs of the used tables have changed. If they have, we’ll throw out the Prepared Statement and re-prepare it.
Re-preparing the statement is expensive, but it should be a very rare occurrence for most applications. This task is filed as WL#3813: Re-prepare the prepared statement if its underlying objects have changed.
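In SQL terms, the intended behavior would look like this (a hypothetical table; this sketches the planned behavior, not what the server does today):

```sql
CREATE TABLE t3 (a INT);
PREPARE s FROM 'SELECT * FROM t3 WHERE a = 7';
EXECUTE s;                     -- planned as a full scan: no index yet

ALTER TABLE t3 ADD INDEX (a);  -- DDL of a used table has changed

EXECUTE s;                     -- the server detects the DDL change,
                               -- throws out the old statement and
                               -- re-prepares; ref access on the new
                               -- index is now possible
```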
A few points:
1) Many people are using prepared statements as a way to protect against SQL injection. In this light, they are often using prepared statements for single-shot queries.
2) As you point out, different data will lead to different optimal query plans; however, you also do not want to re-plan the query every time you get different data.
As such it seems to me you would want to do a couple of things:
1) give the user a way to decide whether he wants a general plan or one that is optimized for the data set in the very first execute
2) ideally, go in the direction you mentioned above and store multiple execution plans as the user executes statements with data sets that require different execution plans
While 2) is fairly hard to do, 1) should be fairly “easy”. PostgreSQL provides this through its different handling of named and unnamed prepared statements. IIRC, Oracle since 9i always defers execution planning until the very first execute.
> many people are using prepared statements as a way to
> protect against SQL injection. in this light they are
> often using prepared statements for single shot queries
Perhaps this case should be handled by using “client side” Prepared Statements – handling the PS on the client will spare one client-server roundtrip.
> 1) give the user a way to decide if he wants a general plan
> or one that is optimized to the data set in the very first
> execute
The problem is that MySQL’s Query Plans may strongly depend on properties like ‘table X has exactly one row with values (X,Y,Z)’, so the query plan from the 1st EXECUTE may not be valid for the second EXECUTE. We will need a dependency checker to handle cases like this.
When this is handled and there are no QEP-destroying changes, we will have the choice between re-running the [expensive] range/join optimization or re-using the results from the previous execution. This decision can be made based on user input (I’d prefer explicit hints over named/unnamed PS). (If we allow re-using optimization results, we’ll need to make the join/range/ref optimizers treat the ‘?’ markers as “unknown constants”, though.)
Thanks also for info on Postgres and Oracle.
I’ll amend the WL text based on this info and ideas.
I like the approach mentioned in Comment 2. From my experience, I would say that well over 90% of QEPs are identical, regardless of the constants supplied. I conclude that the “hang onto prepare” hint would be ‘the right thing’ most of the time.
This observation comes from monitoring the slowlogs on dozens of unrelated MySQL servers, collecting them over time, and comparing the latest 50 EXPLAINs (or at least running EXPLAIN all at once on the various SELECTs). Rarely do the QEPs differ by anything substantive.
OTOH, I have seen some statements come up with as many as 4 different QEPs. Sometimes the difference is in picking INDEX (a,b) vs INDEX (a,c) when all the query wants is to look up by “a”. Question: might the optimizer pick a different INDEX based on which index blocks happen to be cached at the moment?
> Perhaps this case should be handled by using “client side” Prepared
> Statements – handling the PS on the client will spare one
> client-server roundtrip.
What he means is that prepared statements protect against exploits that can’t be solved on the client side. MySQL’s escaping will not detect charset/collation changes at “runtime”, and will continue to escape for the charset/collation the connection was opened with.
Prepared statements don’t suffer from this, since they use a binary format.
The best of both worlds would be a query format that includes a statement part and then a binary data part, all delivered as a single message in a single roundtrip.