The Automation Waltz

August 5, 2011 Technical, General

When you have a bunch of machines involved in a process, you need to ensure that various stages in this process have executed. If the target host is unavailable, you want a guarantee that the job will execute when the host becomes available again.  This is well beyond the capabilities of ssh in a for-loop.

In trying to solve this problem, we assessed tools like mcollective, but concluded that they were inappropriate for our environment.  mcollective in particular was ruled out because it was designed for a more homogeneous environment than the one here at Anchor.

When we realised we needed a different solution, a few of us gathered around a whiteboard and started enumerating our requirements.  The result was Orchestra, which we’re releasing today under the BSD License.

Orchestra is a suite for managing the reliable execution of tasks over a number of hosts. It is intended to be simple and secure, leaving more complicated coordination and tasks to tools better suited for those jobs.

Because everybody loves this web development stuff, it was developed to provide an interface that operates asynchronously in relation to the execution of the work being done, allowing for cheap and easy polling of job state.
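
To make the two ideas above concrete (a job that waits for its host to come back, and a client that polls for state instead of blocking), here is a minimal in-memory sketch in Go. It is not Orchestra's code or API: the names Queue, Submit, Poll and HostAvailable, the host name "web1", and the two-state lifecycle are all assumptions made up for illustration, and a real system would also need persistence and failure handling.

```go
package main

import (
	"fmt"
	"sync"
)

// State is the lifecycle of a job as seen by a polling client.
// The two states here are a simplification chosen for this sketch.
type State string

const (
	Queued   State = "queued"
	Finished State = "finished"
)

// Queue holds jobs per host until that host asks for work.
// Hypothetical type: this is not Orchestra's data model.
type Queue struct {
	mu      sync.Mutex
	nextID  int
	state   map[int]State
	pending map[string][]int // host name -> job IDs awaiting delivery
}

// NewQueue returns an empty queue ready for use.
func NewQueue() *Queue {
	return &Queue{state: map[int]State{}, pending: map[string][]int{}}
}

// Submit records a job for host and returns its ID immediately,
// whether or not the host is currently reachable.
func (q *Queue) Submit(host string) int {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.nextID++
	id := q.nextID
	q.state[id] = Queued
	q.pending[host] = append(q.pending[host], id)
	return id
}

// Poll reports the current state of a job without waiting on the work.
// The second result is false if the ID is unknown.
func (q *Queue) Poll(id int) (State, bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	s, ok := q.state[id]
	return s, ok
}

// HostAvailable is called when a host connects or reconnects. It runs
// every job queued for that host and returns how many were run.
func (q *Queue) HostAvailable(host string, run func(id int)) int {
	q.mu.Lock()
	ids := q.pending[host]
	delete(q.pending, host)
	q.mu.Unlock()

	for _, id := range ids {
		run(id) // the work itself happens outside the lock
		q.mu.Lock()
		q.state[id] = Finished
		q.mu.Unlock()
	}
	return len(ids)
}

func main() {
	q := NewQueue()
	id := q.Submit("web1") // host is "down": nothing runs yet
	s, _ := q.Poll(id)
	fmt.Println(id, s)

	q.HostAvailable("web1", func(id int) {}) // host comes back
	s, _ = q.Poll(id)
	fmt.Println(id, s)
}
```

The point of the sketch is the split between Submit and Poll: the caller gets an ID back straight away and checks on it whenever it likes, while delivery is driven by the host announcing itself rather than by the caller retrying.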

And last, but not least: having critical system services depend on potentially fragile language interpreters or their libraries is generally a really bad idea for reliability, so Orchestra is implemented entirely in Go – a statically typed, type-safe language that compiles to native code and includes many features derived from dynamic languages. Despite this, the work units it executes for you can be written in any language you like.

Orchestra itself is far from finished, but it’s working reliably enough that we’re already using it in production in a very limited capacity, and have been slowly extending its reach into new areas as appropriate.

We’d love you to take a look and see if it works for you; we’re open to suggestions and contributions for improvement. If you’re after a quick overview, doc/orchestra.tex is a good place to start, and the samples/ directory contains commented config files for the various daemons.

We don’t believe in duplicating functionality, so it’s assumed that your config, SSL certificates and scores are distributed by another automation tool – we use puppet. Utilities like daemontools or god do a great job of keeping your daemons running.