GitHub: Speed matters

September 29, 2009 Technical, General

Impressions from the first article (in its first day) and the first 24 hours of the GitHub migration have led us at Anchor to believe that:

  1. GitHub is just as popular as we thought,
  2. The migration was worth it, as things are running much faster (just check your Twitter feeds, or better yet, check your GitHub source tree for no reason 😉); and,
  3. People are interested in what has gone under the hood of the new GitHub (insert your favorite fast car here; otherwise let's say a roadster).

Taking these three things into account, this installment discusses why things are so much faster after the migration than they were before it.

I said ‘faster’ and not ‘fast’ because GitHub is now as fast as any website should be. So yes, in comparison, GitHub is fast now, but it is akin to having ridden your bicycle on half-inflated tires: once they are fully inflated, your old bike is suddenly blazing fast. This is not a criticism of the former architecture, which had its merits when GitHub was founded. GitHub had simply reached a stage where an infrastructure architecture refresh was logical.

The main thing, in the large, that made this new architecture fast was that we were given a blank slate and a large amount of freedom to design an architecture that would do the job well. This is an incredibly rare thing, and it no doubt took a lot of courage on GitHub’s part. For that, we have to say “thank you” to the GitHub team for letting us have that freedom. I like to think that we’ve repaid that trust with a pretty awesome architecture that will serve them well for some time to come.

SCALE: When looking at the new architecture as a whole, the increased scale is immediately evident. GitHub now consumes far more hardware than ever before:

Old Infrastructure:

  • 10 VMs
  • 39 VCPUs
  • 54GB RAM

New Infrastructure:

  • 16 physical machines
  • 128 physical cores
  • 288GB RAM

Or for those who enjoy visual cues:

[Figure: resource comparison, old vs. new infrastructure]

It is a credit to the old infrastructure and GitHub’s code that it ran so well on so little (in comparison). The first credit for increased performance is increased scale.
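To put rough numbers on that increase, here is a small sketch that computes the growth factors from the two lists above. It treats a vCPU and a physical core as comparable units, which is a simplification on our part: a physical core with no virtualisation overhead will generally do more work than a vCPU on a shared host, so the real gain is likely larger than the raw ratio suggests.

```python
# Figures copied from the old/new infrastructure lists above.
old = {"machines": 10, "cores": 39, "ram_gb": 54}    # VMs and vCPUs
new = {"machines": 16, "cores": 128, "ram_gb": 288}  # physical machines and cores


def growth(before, after):
    """Return new/old ratio for each resource, rounded to two decimals."""
    return {key: round(after[key] / before[key], 2) for key in before}


factors = growth(old, new)
print(factors)
# {'machines': 1.6, 'cores': 3.28, 'ram_gb': 5.33}
```

So the new platform has roughly 3.3 times the cores and 5.3 times the RAM, spread over only 1.6 times as many machines.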

An important note regarding the hardware is that there is nothing special (or secret) about it. The entire solution runs on commodity hardware. No special black boxes doing scary things with packets and routes. No appliance servers. The solution architecture developed by Anchor can be used with hardware from any vendor (Dell, HP, IBM, SuperMicro, etc.). This vendor neutrality leaves GitHub unencumbered when scaling either up or out, a key issue when considering growth and future flexibility.

Note: The architecture’s flexibility allows the user repository storage to be expanded with a mix of vendor hardware (should GitHub ever change hardware vendor). Furthermore, any component can be exchanged for another vendor’s hardware with no change to GitHub’s architecture or software.

In a nutshell, the increased scale provides:

  • More GitHub front-end servers to service your requests;
  • More storage; and
  • More I/O bandwidth when working with your repository data

HARDWARE PERFORMANCE: The speed specifications of the underlying components are important, as is how that hardware is utilised.

Storage I/O: A common factor in poor performance with any solution is an I/O bottleneck at the storage level.  This pain was GitHub’s. To alleviate this, not only is the storage now distributed across several servers (distributing the I/O), but it is now running on direct-attached 15,000 RPM SAS disks on battery-backed hardware RAID. Therefore, the second credit for increased performance is faster storage.
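As a rough illustration of why spindle speed matters, the sketch below computes average rotational latency, i.e. the time for half a revolution, which is how long a disk waits on average for the right sector to come around. Only the 15,000 RPM figure comes from the text above; the 7,200 and 10,000 RPM rows are common spindle speeds included purely for comparison and say nothing about what the old infrastructure used. Seek time, RAID caching and queueing are deliberately ignored.

```python
def avg_rotational_latency_ms(rpm):
    """Average rotational latency in milliseconds (time for half a revolution)."""
    seconds_per_revolution = 60.0 / rpm
    return seconds_per_revolution / 2 * 1000


# 15000 is from the article; the other speeds are comparison points only.
for rpm in (7200, 10000, 15000):
    print(f"{rpm:>6} RPM: {avg_rotational_latency_ms(rpm):.2f} ms")
```

At 15,000 RPM the average wait is 2 ms per random access, before any benefit from the battery-backed RAID cache or from spreading requests over several servers.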

Direct access to hardware: Virtualisation is great. What isn’t great is when virtualisation is used as a universal solution. At Anchor we believe there is a place for virtualisation, and systems with massive I/O or CPU requirements are not that place. By moving resource-heavy systems onto dedicated hardware, any contention for resources between individual VMs is removed. The third credit goes to less overhead.

ARCHITECTURE: Throwing hardware at a scaling problem is an easy solution, but without the right division of resources and the right software to properly use it, it’s not going to run real fast.

For GitHub, this was their innovative Git command proxying systems, which do an excellent job of taking requests from the frontends (where users connect with their web browser, git client, or SSH client) and shipping them to the fileservers.  The database structure, filesystem layout, and code efficiency also contribute to this.

Given that the software isn’t our speciality, there’s not a lot for us to say about this, but GitHub are planning a series of posts on their blog, and I’m quite sure it’ll be enlightening.
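In the meantime, here is a very rough picture of what "shipping a request from a frontend to a fileserver" can involve. This is emphatically not GitHub's implementation, which we haven't described here; it is a minimal sketch that assumes a fixed list of fileservers and picks one per repository with a stable hash. The server names, the function name and the hashing scheme are all hypothetical, and a real system could just as easily use a lookup table.

```python
import hashlib

# Hypothetical fileserver names; the real hosts are not listed in this post.
FILESERVERS = ["fs1", "fs2", "fs3", "fs4"]


def fileserver_for(repo_path, servers=FILESERVERS):
    """Pick a fileserver for a repository using a stable hash of its path.

    The same path always maps to the same server, so any frontend can
    route a request without consulting shared state.
    """
    digest = hashlib.sha1(repo_path.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


# Example repository path, also hypothetical.
print(fileserver_for("someuser/somerepo.git"))
```

The point of the sketch is only that repositories, and therefore their I/O, end up spread across several machines instead of landing on one.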

TO REVIEW: The factors involved in GitHub’s faster response on the new infrastructure include (but are not limited to):

  • Increased Infrastructure (Scale)
  • Faster Hardware (Storage)
  • No resource contention (More resources per server)
  • Solid, scalable architecture (Awesomeness)

Keep an eye on this space as we delve into technology-specific posts about which 11 herbs and spices Anchor used to realise the new GitHub architecture.