The votes have been counted and now there is a schedule for MySQL and Friends devroom at FOSDEM 2012.
I’m giving two talks:
The votes have been counted and now there is a schedule for MySQL and Friends devroom at FOSDEM 2012.
I’m giving two talks:
In an ideal world, you should never need to use optimizer hints. In the real world, hints are necessary. We needed hints even before we’ve made a release. The first requests came from our Quality engineer who complained about it being difficult to hit particular join orders while using a particular subquery strategy.
So, why not add the needed hints? MySQL already has the STRAIGHT_JOIN and FORCE INDEX, you can add more as required. The problem is that once you have transformations that change the query to be far enough from the original SQL, you don’t have a “natural” place or syntax for hints anymore.
For example, if you have a join
SELECT ... FROM table1, table2, table3 WHERE ....
and want a join order of (table2, table3, table1), you can write it as
SELECT ... FROM table2 STRAIGHT_JOIN table3 STRAIGHT_JOIN table1
and that gives the optimizer the join order while still looking like SQL. Now, how do you force the same join order when the query looks like this:
SELECT ...
FROM table1
WHERE
table1.column IN (SELECT table2.col1
FROM table2
WHERE
table2.col2 IN (SELECT table3.col3
FROM table3)
The obvious answers are
table2, table3, table1 together somewhereAnd since you now need to invent the whole new non-SQL syntax to specify query plans, the task becomes just too hard. It’s difficult to design a future-proof compact syntax for specifying query plans. My not-really-solution to the problem was to
@@debug_optimizer_prefer_join_prefix system variable. That way, I avoided messing with the parserAs a result, you can make the optimizer to “strongly prefer” some particular join prefixes. The preferences are applied to each select, though. If you have a UNION and want to have one preference for on branch of it, and another preference for the other, there is no way to achieve that.
Given this kind of limitations, I think the patch is useful for development or debugging, but not for production use. Everything (documentation, link to launchpad tree) are here, at the mariadb-53-optimizer-debugging page.
With MariaDB 5.3.3 Release Candidate out of the door, I could give some attention to documentation and improve the master What is MariaDB 5.3 page and pages linked from it.
I think the part about subquery optimizations should be fairly easy to read now, and our tech writer Daniel Bartholomew also did a pass over it. If you have a picture-type mind like I do, there is plenty of pictures, including the Subquery optimizations map.
For now, my goal was to just have a descriptions of all optimizations in place. We have also done substantial amount of benchmarking, but that data still waits to be processed and published.
The deadline for proposals has passed, and there are 37 talk proposals and 12 time slots. In order to have the best possible program, we need your help. There is public voting on proposals, votes are accepted via twitter or via email. Please Vote for MySQL and Friends at FOSDEM 2012 devroom talks. Thank you!
Following the tradition, there will be a “MySQL and Friends” devroom at FOSDEM 2012. The devroom is scheduled for Sunday 5th February 2012, whole day. If you are interested to give a talk, please submit it before December 26th through this submission form.
Three years ago MySQL has got @@optimizer_switch variable. It was introduced in MySQL 5.1.34, that is, we needed it so much that we’ve added it into the stable release.
In a nutshell,@@optimizer_switch held a comma-separated list of optimizer controls:
mysq> > select @@optimizer_switch; +------------------------------------------------------------------------------------------+ | @@optimizer_switch | +------------------------------------------------------------------------------------------+ | index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on| +------------------------------------------------------------------------------------------+ 1 row in set (0.00 sec)
One could set all settings at once:
mysql> set optimizer_switch='index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=off'; Query OK, 0 rows affected (0.01 sec)
or set individual flags:
mysql> set optimizer_switch='index_merge_sort_union=off'; Query OK, 0 rows affected (0.01 sec)
The reasons for putting all optimizer parameters into one variable were:
@@optimizer_switch solution allowed all that, and was very useful in optimizer development and troubleshooting. However, it is becoming a victim of its own success. In current development version of MariaDB @@optimizer_switch has 26 flags, and we’re thinking of adding at least two more before MariaDB 5.3 release. It now looks like this:
MariaDB [test]&> select @@optimizer_switch; *************************** 1. row *************************** @@optimizer_switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on, index_merge_intersection=on,index_merge_sort_intersection=off,index_condition_pushdown=on, derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=off, in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on, subquery_cache=on,mrr=on,mrr_cost_based=off,mrr_sort_keys=on,outer_join_with_cache=off, semijoin_with_cache=off,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on, optimize_join_buffer_size=on,table_elimination=on 1 row in set (0.00 sec)
It is rather difficult to check the value of some particular flag. Also, there is no easy way to get all settings for subquery optimization flags (other than knowing them by heart and checking each of them).
We at MariaDB are having a discussion about switching from single @@optimizer_switch variable to a set of variables like this:
optimizer.index_merge=on optimizer.index_merge.union=on optimizer.index_merge.sort_union=on optimizer.index_merge.intersection=on optimizer.index_merge.sort_intersection=off optimizer.index_condition_pushdown=on optimizer.join_cache.bka=on optimizer.join_cache.hashed=on optimizer.join_cache.incremental=on optimizer.join_cache.optimize_buffer_size=on optimizer.join_cache.outer_join=off optimizer.join_cache.semijoin=off optimizer.mrr=on optimizer.mrr.cost_based=off optimizer.mrr.sort_keys=on optimizer.semijoin=on optimizer.semijoin.firstmatch=on optimizer.semijoin.loosescan=on optimizer.subquery.cache=on optimizer.subquery.in_to_exists=on optimizer.subquery.materialization=off optimizer.subquery.partial_match_rowid_merge=on optimizer.subquery.partial_match_table_scan=on optimizer.table_elimination=on
There are various opinions about how we could make the switch while remaining compatible with the current approach, whether we should switch at all/the optimizer_switch variable/all variables to dotted.name.notation, etc.
If you have an opinion, now it is a good time to voice it at maria-developers@lists.launchpad.net.
At FOSDEM 2010, MariaDB was represented by Kristian Nielsen and me. Together with Vlad Kolesnikov we’ve manned the joint MariaDB & PBXT stand. I’ve had the main track talk titled “MariaDB: extra features that make it a better branch of MySQL” (slides), and
Kristian had a more general talk titled “Beyond MySQL GA: patches, storage engines, forks, and pre-releases” (slides) in MySQL devroom.
There were no other MySQL-related main track talks, so overall, MySQL ecosystem was represented by one devroom, one stand, and one talk. This is exactly as much as a certain other open source database had, with the exception that our stand didn’t have anything to match the blue plush elephants and pencils and mugs.
I’ve taken notes about questions asked at the stand, here they are:
Due to being one of the three people manning the stand I’ve missed most of the MySQL devroom. It would be nice to learn if there were any interesting discussions there.
The main focus of MariaDB staff has been the MariaDB 5.1.38 release in recent weeks, but this doesn’t mean that we’ve abandoned everything else for it. There are several non-release projects going on, one of which is adding binlog filtering and markup capabilities.
In order to see how the new features fit in, let’s first look at what binlog filtering options are already present in MySQL (and so, MariaDB):
| kind | Master | Slave | mysqlbinlog |
|---|---|---|---|
| DB-level filtering | –binlog-do-db –binlog-ignore-db |
–replicate-do-db=db –replicate-ignore-db=db |
–database=dbname |
| Table-level filtering | – | –replicate-do-table=db.tbl –replicate-ignore-table=db.tbl –replicate-wild-do-table=pattern –replicate-wild-ignore-table=pattern |
– |
| Database name rewrite | – | –replicate-rewrite-db=â€from->to†| – |
| Statement-verb filtering | – | – | – |
As long as MySQL had only statement-based replication, one could work around the blank space in mysqlbinlog column by processing mysqlbinlog output with perl/awk/etc scripts. With Row-Based Replication, mysqlbinlog’s output contains events that look like this:
BINLOG ' vjrjShMBAAAAJwAAAPcCAAAAABIAAAAAAAAAAmQyAAJ0MgABAwAB vjrjShkBAAAAIgAAABkDAAAQABIAAAAAAAEAAf/+AgAAAA== '/*!*/;
which practically prevents one from doing any processing on it with perl/awk or similar tools. We’ve got a request to fix this, and set to add the following filtering capabilities:
| kind | Master | Slave | mysqlbinlog |
|---|---|---|---|
| DB-level filtering | –binlog-do-db –binlog-ignore-db |
–replicate-do-db=db –replicate-ignore-db=db |
–database=dbname |
| Table-level filtering | – | –replicate-do-table=db.tbl –replicate-ignore-table=db.tbl –replicate-wild-do-table=pattern –replicate-wild-ignore-table=pattern |
MWL#40 |
| Database name rewrite | – | –replicate-rewrite-db=â€from->to†| MWL#36 |
| Statement-verb filtering | – | – | MWL#41 |
At this moment, MWL#36 has already been coded and pushed into mariadb-5.2 tree. The rest of the tasks will hopefully follow.
MWL#36’s --replicate-rewrite-db has the same limitations as slave’s --replicate-rewrite-db: cross-database updates and CREATE/DROP/ALTER DATABASE statements are not rewritten. We were lucky that slave’s replicate-rewrite-db had these limitations and we could follow them. The thing is, since the slave parses the queries, it is relatively easily for it to walk the parse tree and rewrite the database name wherever necessary (and thus provide handling all kinds of statements). If this was implemented on the slave, it would be very difficult to repeat in mysqlbinlog, since mysqlbinlog has no SQL parser and so is not able to reliably find references to database name in SQL statement text.
It seems we won’t be able to dodge this problem in MWL#40: Table-level filtering, though. According to the manual, replicate-ignore-table “Tells the slave thread to not replicate any statement that updates the specified table, even if any other tables might be updated by the same statement (source). In order to copy this behaviour, mysqlbinlog will need to be able to tell which tables are affected by each of the statements it processes. As written in the worklog entry, so far we see three reliable ways to do that
@@ignore_tables session variable and then let mysqlbinlog print SET ignore_table=... as the first statement in its output.All three approaches have certain drawbacks. The first seems like an overkill and will have no potential to ever be able to work for VIEWs. The second will increase the size of binary logs and won’t work for un-annotated binary logs produced by legacy servers. The third approach requires this @@ignore_tableshack in the server and doesn’t really do filtering, which might be a nuisance when one does some additional processing on mysqlbinlog‘s output … I’m still swaying.
MariaDB 5.1 beta is almost out, so it’s time to cover some of its features. The biggest optimizer feature is MWL#17 Table Elimination.
The basic idea behind table elimination is that sometimes it is possible to resolve the query without even accessing some of the tables that the query refers to. One can invent many kinds of such cases, but in Table Elimination we targeted only a certain class of SQL constructs that one ends up writing when they are querying highly-normalized data.
The sample queries were drawn from “Anchor Modeling”, a database modeling technique which takes normalization to the extreme. The slides at anchor modeling website have an in-depth explanation of Anchor modeling and its merits, but the part that’s important for table elimination can be shown with an example.
Suppose the database stores information about actors, together with their names, birthdates, and ratings, where ratings can change over time:

According to anchor modeling, each attribute should go into its own table:
-- the 'anchor' table which has only synthetic primary key create table ac_anchor(AC_ID int primary key); -- a table for 'name' attribute: create table ac_name(AC_ID int, ACNAM_name char(N), primary key(AC_ID)); -- a table for 'birthdate' attribute: create table ac_dob(AC_ID int, ACDOB_birthdate date, primary key(AC_ID)); -- a table for 'rating' attribute, which is historized: create table ac_rating(AC_ID int, ACRAT_rating int, ACRAT_fromdate date, primary key(AC_ID, ACRAT_fromdate));
With this approach it becomes easy to add/change/remove attributes, but this comes at a cost of added complexity in querying the data: in order to answer the simplest, select-star question of displaying actors and their current ratings one has to write outer joins:
-- Display actors, with their names and current ratings select ac_anchor.AC_ID, ACNAM_Name, ACDOB_birthdate, ACRAT_rating from ac_anchor left join ac_name on ac_anchor.AC_ID=ac_name.AC_ID left join ac_dob on ac_anchor.AC_ID=ac_dob.AC_ID left join ac_rating on (ac_anchor.AC_ID=ac_rating.AC_ID and ac_rating.ACRAT_fromdate = (select max(sub.ACRAT_fromdate) from ac_rating sub where sub.AC_ID = ac_rating.AC_ID))
Apparently one won’t want to write such join every time they need to access actor’s properties, so they’ll create a view:
create view actors as select ac_anchor.AC_ID, ACNAM_Name, ACDOB_birthdate, ACRAT_rating from <see the select above>
which will allow one to access data as if it was stored in a regular way:
select ACRAT_rating from actors where ACNAM_name='Gary Oldman'
And this is where table elimination will be needed
The first thing the optimizer will do will be to merge the VIEW definition into the query and obtain
select ACRAT_rating from ac_anchor left join ac_name on ac_anchor.AC_ID=ac_name.AC_ID left join ac_dob on ac_anchor.AC_ID=ac_dob.AC_ID left join ac_rating on (ac_anchor.AC_ID=ac_rating.AC_ID and ac_rating.ACRAT_fromdate = (select max(sub.ACRAT_fromdate) from ac_rating sub where sub.AC_ID = ac_rating.AC_ID)) where ACNAM_name='Gary Oldman'
Now, it’s important to realize that the obtained query has a useless part (highlighted in magenta). Indeed,
left join ac_dob on ac_dob.AC_ID=... will produce exactly one matching record:
primary key(ac_dob.AC_ID) guarantees that there will be at most one match for any value of ac_anchor.AC_ID,LEFT JOIN will generate a NULL-complemented “row”,This means that the … left join ac_dob on … part can be removed from the query and this is what Table Elimination module does. The detection logic is rather smart, for example it would be able to remove the … left join ac_rating on … part as well, together with the subquery (in the above example it won’t be removed because ac_rating used in the selection list of the query). Table Elimination module can also handle nested outer joins and multi-table outer joins.
One can check that table elimination is working by looking at the output of EXPLAIN [EXTENDED] and not finding there the tables that were eliminated:
MySQL [test]> explain select ACRAT_rating from actors where ACNAM_name='Gary Oldman'; +----+--------------------+-----------+--------+---------------+---------+---------+----------------------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+--------------------+-----------+--------+---------------+---------+---------+----------------------+------+-------------+ | 1 | PRIMARY | ac_anchor | index | PRIMARY | PRIMARY | 4 | NULL | 2 | Using index | | 1 | PRIMARY | ac_name | eq_ref | PRIMARY | PRIMARY | 4 | test.ac_anchor.AC_ID | 1 | Using where | | 1 | PRIMARY | ac_rating | ref | PRIMARY | PRIMARY | 4 | test.ac_anchor.AC_ID | 1 | | | 3 | DEPENDENT SUBQUERY | sub | ref | PRIMARY | PRIMARY | 4 | test.ac_rating.AC_ID | 1 | Using index | +----+--------------------+-----------+--------+---------------+---------+---------+----------------------+------+-------------+ 4 rows in set (0.01 sec)
Note that ac_dob table is not in the output. Now let’s try getting birthdate instead:
MySQL [test]> explain select ACDOB_birthdate from actors where ACNAM_name='Gary Oldman'; +----+-------------+-----------+--------+---------------+---------+---------+----------------------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-----------+--------+---------------+---------+---------+----------------------+------+-------------+ | 1 | PRIMARY | ac_anchor | index | PRIMARY | PRIMARY | 4 | NULL | 2 | Using index | | 1 | PRIMARY | ac_name | eq_ref | PRIMARY | PRIMARY | 4 | test.ac_anchor.AC_ID | 1 | Using where | | 1 | PRIMARY | ac_dob | eq_ref | PRIMARY | PRIMARY | 4 | test.ac_anchor.AC_ID | 1 | | +----+-------------+-----------+--------+---------------+---------+---------+----------------------+------+-------------+ 3 rows in set (0.01 sec)
The ac_dob table is there while ac_rating and the subquery are gone. Now, if we just want to check the name of the actor
MySQL [test]> explain select count(*) from actors where ACNAM_name='Gary Oldman'; +----+-------------+-----------+--------+---------------+---------+---------+----------------------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-----------+--------+---------------+---------+---------+----------------------+------+-------------+ | 1 | PRIMARY | ac_anchor | index | PRIMARY | PRIMARY | 4 | NULL | 2 | Using index | | 1 | PRIMARY | ac_name | eq_ref | PRIMARY | PRIMARY | 4 | test.ac_anchor.AC_ID | 1 | Using where | +----+-------------+-----------+--------+---------------+---------+---------+----------------------+------+-------------+ 2 rows in set (0.01 sec)
then it will eliminate both ac_dob and ac_rating tables.
Removing tables from a query is not expected to make the query slower, and it does not cut off any optimization opportunities, so we’ve made table elimination unconditional and don’t plan on having any kind of query hints for it.
I wanted to add an @@optimizer_switch flag anyway, just in case, but Monty was against that and eventually we have agreed that @@optimizer_switch will have table_elimination=on|off switch only in debug builds of the server.
And it has failed the testuite on every single build slave. I’ve filed BUG#45605, BUG#45630 (together with a patch), BUG#45631, BUG#45632. There is also rpl.rpl_innodb_bug28430 test failure which I didn’t report as I don’t yet have enough details about the build slave.
At the moment our setup works as follows: there is lp:~maria-captains/maria/mysql-5.1-testing branch, which our copy of lp:mysql-server. We periodically pull from the main tree into our copy, it’s a manual process. Buildbot watches for pushes to our copy and runs builds/tests after every push. The results are publicly available here.
UPDATE: Sun people say their pushbuild is nearly green (=all tests pass, on nearly all platforms). This is very odd, as our build slaves are nothing special – about half of them are recent Ubuntu installs on most popular architectures.