Notes from Feature Request Bonanza session at Percona Performance conference

I was taking notes during “Open Q&A: Feature Request Bonanza” session at the Percona Performance conf. The session started at 9 pm at the last day of the conference, so the room wasn’t as full as it was for other sessions, but still there was an interesting discussion. I’ve missed several requests but more than 90% of stuff is there.

DISCLAIMER People are mentioned when I could both identify them (I was in the first row which rules out those in the back) and had time to note that, so names below are contact points and not indication of who was[n’t] there. I was also somewhat tired so please re-check the statements with their authors if you’re going to make any far-reaching conclusions based on the below:

  • The first request was for partial index support: create and use indexes that only have records that match a certain condition.
  • Pre-allocate space after table creation. Monty: CREATE TABLE statement has MIN_ROWS parameter already, and it’s honored by MyISAM.
  • PeterZ: Besides allocation of space at table creation, it would be nice if it was possible to allocate table’s space in extents. Monty: This is possible in Maria.
  • Somebody requests to specify fill factor for InnoDB pages. Domas: Technically it’s there, can be changed in gdb.
  • X: I want to specify the parameter for each table and index. Monty: MariaDB has support for name=value table parameters. They are passed to engine.
  • Monty: The engine should mark which options it recognized and warnings should be issued for unrecognized options.
  • Jeremy Cole requests online DDL. Monty: Maria has lazy add/remove column.
  • PeterZ asks about online ALTER TABLE. There’s a discussion about instant vs. online/background ALTER TABLE and what kind of operations can be performed instantly or online.
  • Domas would like to see online OPTIMIZE. It should be a background process which one start/stop or set to work at some limited rate so it can run without much impact on other activity. It should be possible to set to behave more aggressively.
  • PeterZ asks what OPTIMIZE should do for SSD drives. Somebody answers that SSD drives have high cost per megabyte of storage, so OPTIMIZE should reclaim wasted space.
  • Antony Curtis requests columns to have default values which are functions without use of triggres. Monty: this will require changes in .frm file format. They want MariaDB to be compatible.
  • Call for new datatypes anyone would like to see added
    • Ryan: column encryption
    • Plug-in abstract datatypes
    • Microsecond timestamp (Monty wants to add this)
    • PeterZ: blob/text data compression

    then there’s a discussion on where compression should be handled – inside storage engine or at connector level, or somewhere inside SQL runtime, etc. There are different opinions.

  • Somebody asks for more comprehensive features in general and a comprehensive set of DTrace probes in particular. There is a counterargument that it’s not possible to have static DTrace probes for every possible case and that one should use dynamic tracing. That requires knowledge of source code though.
    Monty requests a list of missing probes.
  • A request from Baron Schwartz: there are cases when MySQL will do ref access over columns of some unique index, but will use some non-unique index because it is covering all needed columns while the unique index isn’t. MySQL should use eq_ref access in such cases. He says he has run some benchmarks and there’s a 20% speed difference between ref and eq_ref. The engine doesn’t matter.
  • Susanne asks for DROP CASCADE. Domas doesn’t want anybody to run DROP CASCADE on his servers.
  • Ryan requests that index_merge optimizer is extended to allow sort-intersect strategy. At the moment we have just ‘intersect’ which can produce intersection of rowid-ordered index scans which means that it handles equality predicates:
    SELECT ... FROM tbl WHERE t.key1='foo' AND t.key=123
    but not range predicates:
    SELECT ... FROM tbl WHERE t.key1 LIKE 'foo%' AND t.key2 BETWEEN 123 AND 134

  • Alexey Rybak requests Bitmap indexes.
  • He also requests a fully asynchronous client library, one that would allow a client app can run many queries on many servers concurrently. It seems Drizzle has a new client library that does that. They don’t support binary protocol though. Monty intends to wait until Drizzle’s client library is stabilized, then add binary protocol support to it, and then see if it could be used instead of the standard library. Someone states that new PHP connector already supports asynchronous operation.
  • Somebody asks if there is any way to scale the RAM footprint of embedded MySQL. He says he has severe RAM (but not disk) constraints. I express doubt if database would work when OS has no space for disk cache but Monty says MyISAM is capable of operating reasonably decent in such settings.
  • Ryan requests InnoDB to have instant (auto-maintained) table checksums.
  • Domas requests “fuzzy replication”. Here I can’t make sense of my notes – it’s something about losing some of latest transactions but recovering to some consistent state, but I can’t remember how does all that relate to replication.
  • Ryan says it’s annoying that InnoDB takes everything offline when it detects corruption. He suggests that InnoDB should take offline only the corrupted table (which is feasible when one is using innodb_file_per_table option — sergeyp).
  • Monty says that Maria will shut down only the corrupted table and will automatically attempt to repair it. The audience wants the same to happen for partitions.
  • Somebody asks for something related to INSERT DELAYED. There’s a reply that that can be achieved in the application with SQL statement queuing.
  • Domas tells a story about Wikipedia having a number of tables with various counters, like number of pages in categories. They do update counters at the end of the transaction and in a number of cases counter update causes deadlock which otherwise would not occur. All changes to counters are commutative/additive/reversible actions, so it would be nice if the engine [or its extension] understood that used that knowledge to avoid deadlocks. Antony and Ryan mention they work around this problem by storing counters in non-transactional MyISAM tables
  • Somebody requests settable limits on how much memory a client can use. Domas mentions that the day before he has demonstrated how a half-megabyte query can consume gigs of RAM without using any buffers. BUG#42946 and BUG#27863 are mentioned as other examples of how one could cause excessive memory consumption with seemingly innocent statements. Besides that, queries consume engines’ internal buffer/cache resources and that is very difficult to account for. Monty says that the reality is that an experienced DBA is able to bring down the server.

That’s it. It will be interesting to get back to the list after one year and see if anything of it got implemented 🙂

Leave a Comment

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s