A Year of OpenStack

March 2, 2015

In late 2013 we looked around the company and asked ourselves the questions any management team has to ask: what are we doing, where do we need to be, and what’s holding us back from getting there?

Because Anchor had grown organically across many years, our internal procedures for infrastructure management were spread across a number of tools that weren’t scalable and efficient enough to keep up with the demands of new sales. Furthermore, they weren’t compatible with a product-based, self-serve future.

There are a number of pieces to addressing that, but one thing was certain: API-driven, software-defined infrastructure would be tremendously useful. The recommendation, universally, was that we needed to consider OpenStack.
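
To make that concrete, here’s a minimal sketch of what API-driven provisioning looks like against an OpenStack cloud, using the openstacksdk Python library. The cloud name, image, flavour and network are placeholders, not anything from our environment:

    # Minimal sketch: booting an instance through the OpenStack APIs with
    # the openstacksdk Python library. All names here are placeholders.
    import openstack

    # Credentials are read from clouds.yaml or OS_* environment variables.
    conn = openstack.connect(cloud="example-cloud")

    image = conn.compute.find_image("ubuntu-server")
    flavor = conn.compute.find_flavor("m1.small")
    network = conn.network.find_network("private")

    server = conn.compute.create_server(
        name="demo-instance",
        image_id=image.id,
        flavor_id=flavor.id,
        networks=[{"uuid": network.id}],
    )

    # Block until Nova reports the instance as ACTIVE.
    server = conn.compute.wait_for_server(server)
    print(server.name, server.status)

Everything else in the stack (networks, volumes, images) is driven through the same kind of API calls, which is exactly what a self-serve product needs.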

Once we dug into the project, we were astonished at the degree of rigour and testing that had gone into vetting changes (since the beginning, there have been no un-gated commits to mainline anywhere in the project), and particularly impressed with the breadth of organisations contributing. These are not captive single-vendor in-house tools that just happen to be open sourced; OpenStack is a vibrant ecosystem with thousands of contributors from organisations around the world. It feels like Linux for the datacenter!

Great software is nothing without solid infrastructure and hardware to underpin it. Whilst evaluating our technical choices we aimed for best-in-class solutions, such as Ceph for our storage backend and Infiniband for networking.

A crucial part of any cloud infrastructure deployment is effectively managing storage. For years the state of the art was running beefy servers with large numbers of disks arranged in a RAID array of some description. This is wasteful; unused capacity in one server is of no help to a beleaguered machine elsewhere in the datacenter. Additionally it limits your capacity to deal with failures in a cost-effective manner.

In the age of cluster computing, we can do better. At any kind of scale you’re going to have, in aggregate, a massive number of disks. You might as well put them all to work. And if they’re all contributing to a storage pool, then any one failure represents only a small fraction of the available data and a large number of disks can contribute to re-establishing redundancy. Best of all, there are a number of viable open source storage technologies that allow you to make this vision a reality.

To achieve such a cluster, we chose Ceph, a magnificent distributed storage system. It has extraordinary scaling properties (other systems bottleneck somewhere between 500TB to 1PB; Ceph eats right through that boundary). It is built for the real world where server reboots and disk failures are an expected part of normal operations. Most importantly, action to distribute data and restore redundancy is taken autonomously by members of the storage cluster; once the system is running it needs very little attention.

We use Ceph for bulk object storage and to back the disks behind virtual machines. This is nice; it means we have (only) one storage technology to get good at managing.
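
As an illustration of what that looks like from a client’s point of view, here’s a sketch using the python-rados and python-rbd bindings that ship with Ceph. The pool names and image size are made up, and in practice Glance, Cinder and Nova create the RBD images for you rather than you doing it by hand:

    # Sketch: one Ceph cluster serving both object and block workloads,
    # via the python-rados and python-rbd bindings. Pool names and sizes
    # are illustrative only.
    import rados
    import rbd

    cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
    cluster.connect()

    # Object storage: write a blob into a pool and read it back.
    ioctx = cluster.open_ioctx("objects")
    ioctx.write_full("backup-2015-03-02.tar.gz", b"...archive bytes...")
    print(ioctx.read("backup-2015-03-02.tar.gz"))
    ioctx.close()

    # Block storage: create an RBD image that a VM can attach as a disk.
    ioctx = cluster.open_ioctx("volumes")
    rbd.RBD().create(ioctx, "vm-disk-0", 20 * 1024 ** 3)  # 20 GiB
    ioctx.close()

    cluster.shutdown()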

This is not a one-size-fits-all solution, however. Different workloads benefit from different classes of storage for cost effective performance, and we’ve selected appropriate hardware to address the requirements of each. These can be broadly described as: “fast”, “block”, and “object”.

Object storage is bulk capacity. It’s not slow, by any means, but it lives on large disks that can be provisioned in quantity. Data written there is durable, and while we don’t make any guarantees about performance, throughput is good. It’s a great place to store large bulk assets and backups.

Block storage is what holds the disk images backing the virtual machine instances in the Nova compute cluster. We’ve gone to quite some lengths to tune this: smaller, fast-spinning SAS hard disks, with SSD journals to absorb writes.

Finally, fast storage is a tier dedicated to database load. High performance databases need fast write acknowledgement and also demand fast random access. We provide that with a pure SSD layer; there’s no lag waiting on spinning disks, whether for write acknowledgement, journal flushes, or reads, be they random access or large bulk queries. And since these systems are dedicated to database storage they are tuned to those access patterns without being disrupted by the general I/O that typifies normal block devices.
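
In an OpenStack deployment, tiers like these would typically be surfaced to clients as Cinder volume types. The type names below (“fast” and “standard”) are hypothetical, but the sketch shows how a client would pick a tier when requesting a volume through the API:

    # Sketch: requesting volumes from different storage tiers through the
    # OpenStack block storage (Cinder) API. The volume type names are
    # hypothetical, standing in for an SSD tier and a SAS tier.
    import openstack

    conn = openstack.connect(cloud="example-cloud")

    # A database volume on the pure-SSD tier.
    db_volume = conn.block_storage.create_volume(
        name="postgres-data", size=100, volume_type="fast"
    )

    # A general-purpose data disk on the SAS-with-SSD-journal tier.
    data_volume = conn.block_storage.create_volume(
        name="app-server-data", size=40, volume_type="standard"
    )

    print(db_volume.id, data_volume.id)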

Ceph delivers on its promises, but there’s no free lunch here. As with other leading edge storage technologies, intra-cluster traffic can be quite significant. When Ceph decides to relocate a piece of data, rebalance replicas amongst its members, or restore redundancy after an outage, a lot of data can be in motion. So we wanted to ensure we had a storage backbone that wouldn’t give us any trouble. That dovetailed with another requirement: we also wanted primary access by client servers to be fast. Our solution to both? Infiniband.

We needed a lot more than Gigabit Ethernet. 1 Gb/s is actually okay for uplinks from servers to the core network and on to the internet, but for the kind of intra-storage traffic we knew we were in for, we wanted fat pipes. More than that, it was important that the time to transfer a disk image was not a significant factor in provisioning a new virtual machine. Most of all, we wanted a network technology that would move the bar so far that we could scale out for a considerable time to come and not have to rebuild our network stack each time we did so. Native Infiniband promises 56 Gb/s without overheads. That’s fast!
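
A quick back-of-envelope calculation shows why link speed matters for provisioning times. The 40 GiB image size below is purely illustrative:

    # Back-of-envelope: time to move a disk image at various link speeds,
    # ignoring protocol overhead. The 40 GiB image size is illustrative.
    GIB = 2 ** 30
    image_bits = 40 * GIB * 8  # a 40 GiB disk image, in bits

    for label, gbit_per_s in [
        ("1 GbE", 1),
        ("10 GbE", 10),
        ("IPoIB as tuned", 37),
        ("native Infiniband", 56),
    ]:
        seconds = image_bits / (gbit_per_s * 1e9)
        print(f"{label:>18}: {seconds:6.1f} s")

At 1 Gb/s that image takes nearly six minutes to copy; at the rate we measured over IP-over-Infiniband it’s under ten seconds, which keeps image transfer from dominating instance build time.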

Out of the box our initial testing gave us 11 Gb/s while running IP-over-Infiniband. Not a bad start. A week later, after a little tuning? 37 Gb/s. Wow. The network is not our problem. 🙂
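
For the curious, this kind of measurement is normally made with a standard tool such as iperf; nothing above depends on any particular one. As a very rough sketch of the idea, here is a minimal TCP throughput probe (a single Python thread won’t come close to saturating a 37 Gb/s link, so treat it purely as an illustration):

    # Minimal TCP throughput probe, in the spirit of tools like iperf:
    # one end listens and drains data, the other blasts a payload for a
    # few seconds and reports the achieved rate in Gb/s.
    import socket
    import sys
    import time

    PORT = 5001
    CHUNK = 1 << 20  # 1 MiB per send

    def serve():
        with socket.create_server(("", PORT)) as srv:
            conn, _ = srv.accept()
            with conn:
                while conn.recv(CHUNK):
                    pass

    def blast(host, seconds=5):
        sent, payload = 0, b"\0" * CHUNK
        with socket.create_connection((host, PORT)) as sock:
            deadline = time.time() + seconds
            while time.time() < deadline:
                sock.sendall(payload)
                sent += len(payload)
        print(f"{sent * 8 / seconds / 1e9:.2f} Gb/s")

    if __name__ == "__main__":
        serve() if sys.argv[1] == "serve" else blast(sys.argv[1])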

This is only part of the puzzle though. Good infrastructure will get you so far, but that’s not the same as taking your product to market. In a future post we’ll talk a little more about our deployment and how we’ve made it work for Anchor. We’re excited to be building the future of the company on the latest and greatest cloud architecture.

Anchor’s Cloud infrastructure is available to existing clients now. If you’re interested in direct API access, contact your account rep, or if you’re more interested in Anchor’s tradition of outstanding managed hosting, you can get a flexible and reactive managed service by asking about Anchor’s ManagedOps. And if e-commerce is your thing, you’re looking at the best people in the world at Magento hosting, taken to the next level with Anchor Fleet.

One Comment

  • derp says:

    Was Infiniband worth the investment? Are you claiming that using 10 Gb/s Ethernet (1.25 GB/s) instead would be a significant bottleneck to your Ceph cluster?

    What would be interesting is a post on the intricacies involved in configuring Infiniband to transfer data in this network, given the small amount of compatible software available compared to traditional Ethernet.
