Overview of Checkpoint and Restore – live-migrating processes on a Linux system

We’re attending Linuxconf 2013 this week, held down in our fair capital, Canberra. There have been some great talks so far, and we thought we’d share one of the most interesting with you.

In a nutshell, Checkpoint and Restore In Userspace (CRIU) lets you take a point-in-time snapshot of a running process (checkpoint) and revive it later, either on the same system or on another one (restore). We’ll go over the difficulties in pulling this off, and what it’s good for.

Problems – the rabbit hole goes much deeper

At first blush, this sounds simple enough: dump the process’s memory and stash it away, then later restore it and fix up a few references in the kernel. Too easy! Not so fast, though: there are a lot of subtle problems…

