Please try your subqueries on MariaDB

MariaDB 5.3 is now GA, and MariaDB 5.5 is RC. One of the primary features in these releases is all-round coverage with subquery optimizations. Practically every kind of subquery available in SQL has got some new optimization.

We do a lot of testing, so these new optimizations should be now reasonably stable. What is missing is performance testing with real-world queries on real-world data. I expect most of the time you will see a speedup, however, there can also be cases where the new version will be slower. New optimizations make query plan search space much bigger, this means the new optimizer will have lots of room to make errors where previously was none.

Back at MySQL Ab, I could use bugs/support cases to do some analysis of how real-world queries are affected by the new optimizations.

Now, we don’t have access to that data anymore, and so are asking for help: If you’ve got some queries with subqueries, please try running them on the latest MariaDB 5.3 Stable or MariaDB 5.5 RC and let us know of the results.

3 Comments

  1. @Przemek, thanks a lot! I and the rest of MariaDB team really appreciate getting feedback on our work.

    Subquery cache should definitely work for double subqueries and UNIONs, I have no idea why you’re not seeing the speedup.

    Perhaps, the dataset is such that the cache is not useful?

    If you’re going to re-run the UNIONs/double-subquery cases, could you check the values of status variables subquery_cache_hit subquery_cache_miss before/after the query? (they show how subquery cache worked, see http://kb.askmonty.org/en/subquery-cache for details)

    Like

  2. Ok, here is example query where subquery_cache made no difference in execution time for me when MySQL did choose the wrong indexes:

    SELECT DISTINCT `page`.page_namespace AS page_namespace,
    `page`.page_title AS page_title,
    `page`.page_id AS page_id,
    rev_user,
    rev_user_text,
    rev_comment,
    rev_timestamp
    FROM `revision` AS rev,
    `page`
    WHERE 1 = 1
    AND `page`.page_namespace IN ( ‘0’, ‘0’ )
    AND `page`.page_is_redirect = 0
    AND `page`.page_id = rev.rev_page
    AND rev.rev_timestamp = (SELECT Min(rev_aux.rev_timestamp)
    FROM `revision` AS rev_aux
    WHERE rev_aux.rev_page = rev.rev_page)
    AND ‘Stryker117’ = (SELECT rev_user_text
    FROM `revision`
    WHERE `revision`.rev_page = page_id
    ORDER BY `revision`.rev_timestamp ASC
    LIMIT 1)
    AND `page`.page_namespace NOT IN ( 1100 )
    ORDER BY rev_timestamp DESC
    LIMIT 0, 5;

    EXPLAIN:
    +——+——————–+———-+——-+————————+—————-+———+——————-+——-+———————————————————————+
    | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
    +——+——————–+———-+——-+————————+—————-+———+——————-+——-+———————————————————————+
    | 1 | PRIMARY | page | range | PRIMARY,name_title | name_title | 4 | NULL | 34382 | Using index condition; Using where; Using temporary; Using filesort |
    | 1 | PRIMARY | rev | ref | PRIMARY,page_timestamp | PRIMARY | 4 | halo.page.page_id | 2 | Using where |
    | 3 | DEPENDENT SUBQUERY | revision | index | PRIMARY,page_timestamp | rev_timestamp | 14 | NULL | 1 | Using where |
    | 2 | DEPENDENT SUBQUERY | rev_aux | ref | PRIMARY,page_timestamp | page_timestamp | 4 | halo.rev.rev_page | 2 | Using index |
    +——+——————–+———-+——-+————————+—————-+———+——————-+——-+———————————————————————+

    When subquery_cache=off the status variables were of course equal to 0 after query run:
    show status like ‘sub%’;
    +———————+——-+
    | Variable_name | Value |
    +———————+——-+
    | Subquery_cache_hit | 0 |
    | Subquery_cache_miss | 0 |
    +———————+——-+

    and with subquery_cache=on they were:
    +———————+——-+
    | Variable_name | Value |
    +———————+——-+
    | Subquery_cache_hit | 9608 |
    | Subquery_cache_miss | 710 |
    +———————+——-+

    However in both cases the execution time was about the same (~1 hour 36 min).

    But after I forced the proper index usage:
    +——+——————–+———-+——-+————————+————+———+——————-+——-+———————————————————————+
    | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
    +——+——————–+———-+——-+————————+————+———+——————-+——-+———————————————————————+
    | 1 | PRIMARY | page | range | PRIMARY,name_title | name_title | 4 | NULL | 34382 | Using index condition; Using where; Using temporary; Using filesort |
    | 1 | PRIMARY | rev | ref | PRIMARY,page_timestamp | PRIMARY | 4 | halo.page.page_id | 4 | Using where |
    | 3 | DEPENDENT SUBQUERY | revision | ref | PRIMARY | PRIMARY | 4 | halo.page.page_id | 4 | Using where; Using filesort |
    | 2 | DEPENDENT SUBQUERY | rev_aux | ref | PRIMARY | PRIMARY | 4 | halo.rev.rev_page | 4 | |
    +——+——————–+———-+——-+————————+————+———+——————-+——-+———————————————————————+

    Times were bit better with subquery_cache enabled indeed (Subquery_cache_hit/miss the same as before):

    w/o subquery_cache:
    2.88 sec (cold buffer pool)
    1.22 sec (warm buffer pool)

    with subquery_cache:
    2.38 sec (cold buffer pool)
    0.77 sec (warm buffer pool)

    Like

Leave a Reply to spetrunia Cancel reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s