So you’ve just provisioned your shiny new OS instance with your host of choice, loaded in your confidential data, and away you go without a worry in the world, right? If your data consists only of captioned photos of cute furry animals, then all is well. Perhaps, however, your data is worth just a wee bit more than that (not that we don’t ♥ cute furry animals!).
Depending on your host and the product used, your data could be sitting on anything from locally attached disks to a NAS/SAN to some clustered, distributed block device or filesystem, with no easy way to determine who has access to it, what snapshots exist, what will happen to failed media, and so on. For certain customers with certain sensitive applications, that is simply not an acceptable risk.
To protect your data at rest, the most reassuring option is to utilise disk encryption within your operating system, where only you possess the key. Whilst this can present a bit of an operational challenge (i.e. how does the key get entered when the server reboots?), the show-stopper question is going to be “Can I actually run a production database with a heavy I/O load on encrypted storage?”. This post seeks to answer that question, or at least to help you figure it out for your own circumstances.
Our test server is a Dell PowerEdge R410 with:
- 2 × Intel Xeon E5620 CPUs
- 16 GB of RAM
- Dell PERC H700 hardware RAID controller
- 2 × Dell 50GB SSDs in a hardware RAID 1 array for the database
- 2 × 300GB 15K RPM SAS drives in a hardware RAID 1 array for the operating system
- Red Hat Enterprise Linux Server 6.0 (kernel 2.6.32-71.18.1.el6.x86_64)
The interesting thing to note in this setup is that the CPUs provide hardware-accelerated AES encryption via the AES instruction set (Intel’s marketing name is AES-NI). You can check whether your CPU and kernel support this feature by running the following at the command line:

```shell
grep aes /proc/cpuinfo
```
If the command produces output, then congratulations, your CPU supports the feature!
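Beyond the CPU flag, you can also check whether the kernel’s accelerated driver is actually loaded. A quick sketch (aesni_intel is the module name on recent kernels; older kernels may not ship it):

```shell
# Report whether the CPU advertises AES instructions and whether
# the accelerated kernel module (aesni_intel) is currently loaded.
if grep -q aes /proc/cpuinfo; then
    echo "CPU advertises AES instructions"
else
    echo "no AES instruction support detected"
fi
lsmod | grep aesni_intel || echo "aesni_intel module not loaded"
```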
The recommended way to do block-device-level encryption under Linux is to use the device mapper target dm-crypt. This is a straightforward component that is built into the Linux kernel and well supported in all modern Linux distributions. To set up full disk encryption, use your distribution’s installer when you initially provision your server. If you are not encrypting your entire OS (sans /boot) and data, then you will need to use the utility cryptsetup on the storage device you want to encrypt. For our testing, we are just encrypting the SSDs as follows:
```shell
cryptsetup luksFormat /dev/sdb -c aes-xts-plain -s 512
```
This command formats the storage device with a LUKS meta-data header. This header contains useful information including:
- the fact that it contains encrypted data (rather than just random data)
- a UUID
- the algorithm used
- the key size
- the key (securely encrypted with a passphrase that you provide at format time)
- checksums and other useful bits
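You can inspect most of these header fields yourself once the device is formatted. A sketch, guarded so it is harmless on machines without the device (substitute your own device for /dev/sdb):

```shell
DEV=/dev/sdb   # substitute your encrypted device
if [ -b "$DEV" ]; then
    # Confirm the LUKS header is present, then print its contents:
    # cipher, mode, key size, UUID and the populated key slots.
    cryptsetup isLuks "$DEV" && cryptsetup luksDump "$DEV"
else
    echo "$DEV is not a block device on this machine"
fi
```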
For high security, we are using a 512 bit AES key in XTS cipher mode (requires kernel >= 2.6.24; in XTS mode the 512 bit key is split into two 256 bit keys, one for the block cipher and one for the tweak). A small word of warning: whilst cryptsetup allows you to set up encryption on devices without any sort of identifying meta-data header, DON’T ever do that. If you need to hide the fact that you are using encryption, there are better ways to do it. Contact us for details if you really want to go down that path.
Once you have formatted the device, you then need to activate it by running the command:
```shell
cryptsetup luksOpen /dev/sdb testing
```
This will make the unencrypted block device available for you to use (i.e. mkfs, mount) as /dev/mapper/testing. To have the operating system set up the device at boot, you will need to put an entry into /etc/crypttab. Check the man page for details.
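As a sketch, creating a filesystem on the opened device and the matching crypttab entry might look like the following (the mapping name testing is from our example above; check crypttab(5) on your distribution for the exact field syntax):

```shell
# Create a filesystem on the decrypted mapping and mount it:
mkfs.ext4 /dev/mapper/testing
mount /dev/mapper/testing /mnt

# Example /etc/crypttab entry (fields: mapping name, underlying
# device, key source; "none" means prompt for the passphrase):
#   testing  /dev/sdb  none
```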
Finally, for best performance you want the implementation of AES that is most optimised for your hardware. You can check /proc/crypto to see which algorithms and drivers are available and their priority (a higher priority number wins). In descending order of performance, you want AES-NI, then AES-x86_64, with the generic AES implementation last. The highest-priority crypto implementation available at the time an encrypted block device is activated (via cryptsetup luksOpen) is the one used; loading a higher-priority driver into a running kernel afterwards has no effect on already-active devices.
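A quick way to see which AES driver currently wins is to pull the name, driver and priority fields out of /proc/crypto. A sketch (the exact field layout of /proc/crypto can vary between kernel versions):

```shell
# Print priority, algorithm name and driver for each registered
# implementation, highest priority first; entries mentioning "aes"
# show which driver dm-crypt will pick up at luksOpen time.
awk '/^name/ {name=$3} /^driver/ {drv=$3} /^priority/ {print $3, name, drv}' \
    /proc/crypto | grep aes | sort -rn | head
```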
The simple benchmark tool zcav is fine for our purposes; it only measures sustained read/write speed. For an I/O device, this benchmark would normally be of limited value, as the bottleneck is usually the rate of I/O requests that can be handled. For an encrypted block device, however, the bottleneck is instead the throughput itself (i.e. the sustained read/write speed), as the higher the throughput, the more work the CPU has to do.
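The invocation we have in mind looks roughly like this (a sketch: zcav ships with bonnie++, and the -c flag, one full read pass per device, may differ between versions; reading the raw device gives a no-decryption baseline against the dm-crypt mapping):

```shell
# One full read pass over each device; zcav writes block position
# versus throughput to stdout, ready for plotting.
zcav -c 1 /dev/sdb > raw.txt                  # raw device, no decryption
zcav -c 1 /dev/mapper/testing > encrypted.txt # through dm-crypt
```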
The output from zcav was fed into gnuplot to generate the following graphs:
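A minimal gnuplot driver script for this might look as follows (a sketch: raw.txt and encrypted.txt are hypothetical file names for saved zcav output, and zcav’s default columns of position and throughput are assumed):

```shell
# Plot two zcav runs on one graph (file names are placeholders).
gnuplot <<'EOF'
set terminal png size 800,600
set output "throughput.png"
set xlabel "Position (GB)"
set ylabel "Throughput (MB/s)"
plot "raw.txt" with lines title "raw device", \
     "encrypted.txt" with lines title "dm-crypt"
EOF
```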
The obvious thing to note in this graph is that the new AES instructions make a huge difference to performance. The impact of the encryption is relatively low and should be quite acceptable for a production system. Another interesting thing is the relatively low performance of the RAID array: being RAID 1, the controller should have been able to balance read requests over both drives and achieve roughly double the throughput seen here. I haven’t looked into the RAID settings, though, to see whether this can be tuned.
The throughput here with the AES instructions in use is almost identical.
The results certainly look promising for even relatively high-end workloads. One thing that is not measured here is the latency difference; it should be relatively straightforward to calculate the latency that encryption adds to every I/O request.
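As a back-of-the-envelope sketch, the added latency is roughly the request size divided by the per-core encryption throughput (the 150 MB/s single-core figure below is purely illustrative, not a number measured in this post):

```shell
# Added latency ≈ request size / single-core AES throughput.
# 150 MB/s is an assumed figure for illustration only.
awk 'BEGIN {
    size_mb = 4.0 / 1024          # a 4 KiB request, in MB
    mbps    = 150                 # assumed single-core throughput
    printf "%.1f microseconds per 4 KiB request\n", size_mb / mbps * 1e6
}'
```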
The numbers above can also be massively improved upon with a newer kernel featuring the dm-crypt scalability patches (the benchmark above was limited to using 1 core out of 8!).
The nice thing about using the built-in dm-crypt solution is that you can use it on physical as well as virtual machines. If you are after something a bit more turn-key, you may wish to look at a system whose drives, controller and BIOS support self-encrypting drives as an alternative.