State of the Union for MySQL storage engines

By May 29, 2012Technical

(Feel free to skip to the next section if you’re familiar with MySQL’s storage engines)

If you’re not familiar with MySQL’s storage engines, it’s where your data hits the disk. You write SQL to insert or query the data, and leave the rest to MySQL. MySQL arranges that data into files on the disk by using what’s termed a storage engine. The choice of engine affects how fast your data can be inserted/queried/deleted, how it’s indexed, and how robustly it’s stored.

You don’t need to make a conscious choice – unless you explicitly specify, you’ll get the default of MyISAM. For many people this is fine, but it makes a difference when you start to accumulate a lot of data. If you’re after straightforward advice, skip down to InnoDB or MyISAM?

Storage engines again?

Every hosting company worth its salt has an outline of the differences between MySQL storage engines. It’s something we’ve covered before on a somewhat informal basis for individual customers, but we thought it’d be good to go hunting and see what, if anything has changed. In doing the research we decided we’d also take a more pragmatic approach, rather than reiterate the same bullet-pointed facts that seems to be part of the hivemind.

As an example, something you’ll often see is that MySQL supports about a dozen storage engines. This is true, but the reality is that there’s two storage engines (InnoDB and MyISAM) that you actually care about, maybe three if you’re “lucky”.

Furthermore, even if some of the others sound appealing, there’s a good chance they weren’t compiled into your distro’s packages. We’re restricting ourselves to current and mainstream distros because maintainability and reliability are king in a hosting environment; that means we’re dealing with MySQL 5.1 here. Ignoring the non-serious engines:

  • RHEL 6: MEMORY is available in addition to InnoDB and MyISAM
  • Debian squeeze: MEMORY and ARCHIVE in addition to InnoDB and MyISAM

InnoDB or MyISAM?

The short answer? InnoDB.

InnoDB has solid all-round performance characteristics, a decent feature set and good data safety. It replaced MyISAM as the default engine in MySQL 5.5, most likely for these reasons.

Choosing InnoDB can be done on a per-table basis, and is performed when the table is created, like so:

CREATE TABLE foobar (a int) ENGINE=InnoDB;

You can also easily convert existing MyISAM tables to InnoDB, note that this will take a while for big tables and may lock access:

ALTER TABLE foobar ENGINE=InnoDB;

The case for MyISAM

MyISAM performs well on SELECT-intensive workloads, and in low-concurrency applications (a small number of users at any given time). MyISAM also offers Full Text Indexing, which is great for content-searching applications. InnoDB doesn’t offer this unless you’re running a very new version (5.6) of MySQL, which is unlikely.

Against it, MyISAM isn’t great for write-heavy workloads (UPDATE and DELETE), and scales poorly when you have a lot of concurrent activity, mostly from its use of table-level locks. MyISAM doesn’t support transactions or referential integrity between tables (not a huge issue for loosely structured data). In addition, MyISAM tables can have a long recovery after a crash, as they need to be checked for consistency.

The case for InnoDB

InnoDB handles write-heavy workloads quite well and is properly ACID compliant. It uses a concurrency model that allows for row-level locking and transactional support, making it well suited for environments with a greater number of concurrent users. InnoDB supports foreign keys and referential integrity, meaning you can model your data structures with assurance that they’ll be honoured. InnoDB is crash-safe meaning that it can cleanly recover after a crash by replaying its journal, which is generally much faster than fully checking tables like MyISAM.

On the downside, InnoDB tends to be more demanding in terms of CPU and diskspace load. Depending on the workload, certain operations can be slower when compared to MyISAM due to the extra bookkeeping that InnoDB does. Unless you run a very new version of MySQL, InnoDB doesn’t support full-text indexes, which may be a deal breaker for you.

Alternatives to vanilla MySQL

So we’ve established that not a lot has changed recently insofar as storage engines go. What we have noticed instead is that there’s a few forks of MySQL now, each with their own advantages and quirks.

We suspect that development isn’t moving so quickly after Oracle bought out MySQL AB, so the community is tending to fork the existing codebase and add enhancements. There’s a few commercial forks as well, and that’s where you tend to see heavily-targeted storage engines with specific customers in mind.

MariaDB

MariaDB is a community fork, intended to be a drop-in replacement for MySQL, without the uncertainty of license terms after the acquisition by Oracle. MariaDB also include’s Percona’s XtraDB storage engine, an enhanced version of the InnoDB engine that should scale better in large, high performance environments.

MariaDB also includes the Aria storage engine, an in-development engine that aims to provide the best features of both MyISAM and InnoDB, tunable according to the user’s demands.

As an alternative, MariaDB should provide plenty of interesting options for someone who wants to play with newer developments, without a high barrier to entry.

Drizzle

Drizzle is an interesting fork that doesn’t intend to maintain compatibility with MySQL or its architecture, instead aiming for the same original goals (ease-of-use, reliability and performance) in a somewhat familiar package.

The driving force at the core of Drizzle is to create a “microkernel” and farm operations out to pluggable modules. This has enabled them to heavily rewrite a lot of functionality and add a lot of new features. Pretty much everything is now a plugin, such as storage engines, replication, handling of views and query rewriting, query optimisation, etc.

In terms of engines Drizzle retains InnoDB and adds BlitzDB, intended as a lightweight competitor to MyISAM and the in-memory options. Other engines like PBXT and XtraDB are available in branches.

Drizzle is a bit too exciting if you’re looking for a simple alternative to MySQL, but worth watching as it has some interesting ideas about scalability and performance taken to the n-th degree.

Percona

Percona is primarily a MySQL consulting and support company with their own fork of the database, Percona Server with XtraDB. The same XtraDB engine is included in MariaDB, though Percona’s server contains other performance enhancements to the core codebase, promising greater performance and scalability.

Percona is a drop-in replacement for MySQL. Furthermore, it’s already packaged for DEB and RPM-based systems so it’s dead easy to try out and see how it works for you. They also have a handy toolkit to make common operations much faster and/or less painful – we can highly recommend the MyISAM-to-InnoDB conversion tool for greatly reducing downtime on huge tables.

OurDelta

One last project worthy of note is OurDelta. Their project is about distributing up-to-date builds of MySQL for popular platforms. Their own page describes it best, but the goal is to deliver useful versions of MySQL and MariaDB with a current set of patches, helping to push the development and testing cycles along. If you’re stuck in an environment that doesn’t see much activity in terms of backported updates, this can be quite helpful.

Wrap-up

We haven’t touched on commercial forks of MySQL here, mainly because they tend to be specifically targeted and not readily accessible to casual users or people just wanting to experiment with pet projects. There’s no shortage of them though, often with paid Enterprise and free Community versions. Just from a quick search, examples include InfiniDB, solidDB, and InfoBright.

Of the alternatives mentioned, we’ve deployed with MariaDB, Percona and OurDelta for a few customers at their insistence. We can’t really quantify the benefits without proper benchmarking, but we can happily say that they’ve done the job as advertised and there’s no reason not to try them if you’re looking for something different.

Finally (and as always), we’re still fans of Postgres as an alternative to MySQL. It’s well supported, performs and scales superbly, is feature rich, and requires no choices when it comes to storage engines (there’s one, and it’s excellent). If you’re starting a new project and have the option, why not try Postgres.


Any questions or comments? Leave us a comment, or get in touch – feel free to mail us at [email protected], or call us on 1300-883-979 (+61-2-8296-5111 from outside Australia).