Speed Boost for Repeated SSH
If you lead a Linux life, you probably have a bunch of scripts automating it. Assuming you have access to more than one computer, it’s really easy to use SSH and execute stuff remotely. In my network, for example, I’ve automated daily reports. They connect to all the various servers I have around, collect a bunch of data, and twice a day I get an e-mail detailing any unexpected findings.
Now, this script has grown over the years. At the very beginning it was just checking ZFS status and health, then it evolved to check SMART data, and now it collects all things disk-related down to the serial number level. And that’s not the only one. I check connectivity, temperatures, backup states, server health, Docker states, and a bunch of other stuff. So the daily mail that used to come at 7:00 and 19:00 every day started, over time, taking over 30 minutes to produce. While this is not a critical issue, it started bugging me - why the heck does it take that long?
A short analysis later, the culprit was traced to the number of SSH commands those scripts execute. Just checking my disks executed commands over SSH more than 50 times. Per server. And the disk check wasn’t the only offender.
Now, the solution was a simple one - just optimize the darn scripts. And there were a lot of places to optimize, as I rarely cached command output. However, those optimizations would inevitably make my Bash scripts uglier. And we cannot have that.
Thus, I turned toward another approach - speeding up SSH itself. Back in the days when I first played with Ansible, I noticed that it keeps its connections open. At the time I mostly noticed it due to the issues it caused. But now I was thinking - if Ansible can reuse connections, can I?
And indeed I can. The secret lies in adding the following configuration to the ~/.ssh/config file:
ControlMaster auto
ControlPersist 1m
ControlPath ~/.ssh/.persist.%r@%h:%p
What this does is leave the SSH connection open and then reuse it, instead of going through the connection setup, key exchange, and authentication each time. Since that handshake is not the fastest thing out there, this saves a lot of CPU and round-trip time, speeding things up considerably. And, since the shared connection is still encrypted, you don’t lose anything.
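One rough way to see the effect for yourself is to time a batch of trivial remote commands, once with the three lines above commented out and once with them active. The sketch below assumes key-based login that needs no password prompt; the function name time_ssh_loop is made up for illustration and is not part of my report scripts.

```shell
#!/usr/bin/env bash
# Rough timing sketch. time_ssh_loop is a hypothetical helper name.
# It runs the remote command "true" COUNT times against HOST and prints
# the elapsed wall-clock time in whole seconds.
time_ssh_loop() {
    local host="$1" count="$2"
    local start end i=0
    start=$(date +%s)
    while [ "$i" -lt "$count" ]; do
        # -n keeps ssh from reading stdin, which matters inside loops.
        ssh -n "$host" true || return 1
        i=$((i + 1))
    done
    end=$(date +%s)
    echo "$((end - start))"
}
```

Calling something like `time_ssh_loop myserver 50` (where myserver stands in for one of your own hosts) before and after the configuration change should give a feel for how much of the run time was pure connection overhead.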
Setting ControlMaster to auto allows your SSH connection to reuse an existing connection if one exists and fall back to the “standard” behavior if one cannot be found. The location of the cached sockets is controlled using the ControlPath setting, and one should use a directory that is specific to the user. I like using .ssh rather than creating a separate directory, but any valid path will do as long as you parameterize it using %r, %h, and %p (remote user, host, and port) at a minimum. And lastly, how long an idle connection is kept open can be specified using the ControlPersist value. Here I like using 1 minute, as it gives me meaningful caching for script use while not keeping connections open so long that I need to kill them manually.
With this, the execution time for my scripts went from more than 30 minutes to less than 5. Not bad for a trivial change.