Combinatorial blowups can occur in Name Resolution, too.

Short: There are many places where (any) query optimizer has to take care to avoid combinatorial blowups of memory and/or processing time. But we had a real world query that had a combinatorial blowup at Name Resolution (aka “binding” or “semantic analysis”) phase.

Long: The trick is to use nested derived tables (or views) and use the column multiple times.

The example using views. Start from a base table:

create table t1 (
  f1 int,
  f2 int,
  ...
);

Then create a view with columns that refer to the base table multiple times:

create view v1 as
select 
  case 
    when f1 > f2 then f2
    when f1 > f3 then f3
    when s1 > f4 then f4
  end AS f1,
  ...
  case ...  as f2,
  ...
from 
  t1;

Here, v1.f1 has 9 references to columns of t1. Other columns have similar definitions.
Then, create view v2 that uses columns of v1 in a similar way:

create view v2 as
select 
  case 
    when f1 > f2 then f2
    when f1 > f3 then f3
    when s1 > f4 then f4
  end AS f1,
  ...
  case ...  as f2,
  ...
from 
  v1;

v2.f1 has 9 references to v1.f1. Each of those has 9 references to columns of t1, which gives 9*9=81 references to columns of t1.

Repeat this until you reach depth level 8, and now the query takes about 1 minute in MariaDB and MySQL. In PostgreSQL, it is killed due to running out of memory. All the details can be found in MDEV-31130.

In MariaDB (and MySQL), multiple references to the same column are shared, so query processing doesn’t require a combinatorial amount of memory (like it seems to do in PostgreSQL). But condition processing ignores shared references and so does require combinatorial amount of time.

One can imagine fixing this by extending all the code that processes scalar expressions to handle common sub-expressions. But I also imagine it will add a lot of complexity, all just to handle a fairly exotic case, right?

Leave a comment