SSH ControlMaster: The Good, The Bad, The Ugly

February 24, 2010 Technical, General

Do you love SSH for the good it has done for mankind, but get annoyed by how long it takes to establish a connection over a high-latency connection? Perhaps you have a process that needs to make thousands of SSH connections, and you’d like a little extra speed from the whole thing. Either way, ControlMaster is your new best friend.

The concept is very simple — rather than each new SSH connection to a particular server opening up a new TCP connection, you instead multiplex all of your SSH connections down one TCP connection. The authentication only happens once, when the TCP connection is opened, and thereafter all your extra SSH sessions are sent down that connection.

If you’re SSHing between machines on the same LAN, or otherwise a short ping away, you probably wouldn’t notice the difference — the round-trip times are negligible. However, when you’re doing transcontinental SSHing (which we do often, when we’re managing customer machines in the US), it’s a godsend. On some trivial benchmarking I did when validating ControlMaster for our use, I found that we were saving nearly 2.5 seconds per connection — a drop from 3.3 seconds to 0.8. Mighty convenient.

It’s simple to use, too. If you just want to enable “opportunistic” multiplexing, you can do something as simple as this in your SSH config:

Host *
ControlMaster auto
ControlPath ~/.ssh/cm_socket/%[email protected]%h:%p

Then mkdir ~/.ssh/cm_socket, and you’re away. Any time a connection to a remote server exists, it’ll be used as the master for any other connections. Perusal of the ssh_config(5) manpage should give you the necessary hints to setup more restrictive configurations. If you need to disable control master for a given connection (the reasons why this might be necessary will be covered shortly), you can pass -S none to ssh (or set ControlPath none).

Whilst this basic setup is undeniable, pure, distilled awesome, there are some limitations and caveats to beware of. The first, and most important, is that SSH session multiplexing isn’t particularly stable when you try to put a lot of data down it from a lot of connections at once. This came to light fairly early on in my testing, when I stress-tested things by doing about 25 concurrent rsync runs all at once. The result was a large number of rsync sessions going “aiee!” and falling over. So, don’t do that.

The second, semi-related problem, is a simple bandwidth issue. For a given connection latency and TCP configuration, there is a hard limit to how fast you can send data, due to the time it takes to acknowledge the packets being received. When you’re multiplexing multiple file transfers down the one TCP connection, therefore, your total transfer speed will be limited by this TCP speed limit. Once again, it’s unlikely that this will cause you problems on a LAN (where round-trip delays are negligible), but in the high-latency world where connection sharing does the most good from a connection setup perspective, the speed limits will cause much wailing and gnashing of teeth. So, the take home message is: if you’re doing a lot of heavy data transfer over SSH, ControlMaster probably isn’t the solution for your problems. Instead, run multiple concurrent SSH connections, as the TCP speed limits are per-connection, so you can still fill your high-latency gigabit pipe — you just need lots of concurrent connections to do it (see also: BitTorrent).

Finally, there is something of an annoyance with ControlMaster, and it’ll probably confuse you mightily when you first come across it. Because all of your SSH sessions are multiplexed down a single TCP connection initiated by the first SSH session, that first session must stay alive until all of the other sessions are complete. This problem will manifest itself as an apparent “hang” when you log out of the remote session that is acting as the master — instead of getting your local prompt back, SSH will just sit there. If you Ctrl-C or otherwise kill this session, all of the other sessions you’ve got setup to that server will drop, so don’t do that. Instead, when you logout of all the other sessions, the master will then return to the local prompt.

If you’re doing a high volume of SSH connections to a particular remote endpoint, consider setting up a dedicated master connection — that way it’ll always be available (and you don’t have to worry about master logout hangs). I use a simple daemontools service, that runs ssh -MNn [email protected]. Works an absolute treat.