Migrating a physical server to a Virtual Machine

This page describes the process for migrating a currently-live physical machine to a new virtual server.

Assumptions

  • Typical victim will be an Anchor dedicated server and full root access is available.

  • Server runs a Redhat-type or Debian-type operating system
  • Server has network connectivity and is currently up and running.

Goals and Constraints

  • Integrity of data must not be compromised in any way
  • Perform conversion to a virtual machine running on one of our hosts
  • Physical server to be decommissioned once conversion has taken place, part of which will necessarily be done during the procedure (both systems can/should not be live simultaneously)
  • Minimise downtime caused by the procedure

Standing issues

  • Do older operating systems have drivers for all the virtual hardware?

Need-to-know and procedural assumptions

  • The target will be a clean VM with an extra 8gb virtual drive attached (in addition to the empty "main" drive for the new VM)
  • The extra drive holds a clean installation of xubuntu with all our useful tools
  • The extra drive, positioned as /dev/sdb, will be booted from
  • Once booted, the target can receive the source's rsync
  • We do this because the available rescue environments are too limited
  • We'll convert any old-style partitioning to new LVM style in the process, it's easy and practically 90% done by other necessary steps
  • We'll also remove any bind-mounts, as they're pretty annoying eh

Procedure

Check source server

  • Ensure there are no outstanding package updates. It'll be annoying later if a newer kernel is installed than the one currently running.

Create target

  1. Determine the actual disk usage of the live machine, and create a new VM with disk/s of appropriate size
    • Suggest leaving 50% room to grow. As an example: for a server with 100gb of data across all partitions, create a 150gb empty virtual disk (henceforth referred to as "vmdk")
  2. Attach the extra vmdk as a secondary SCSI device, typically node 0:1.

  3. Add NICs as required to match the source machine and attach to the appropriate VLAN/s
  4. Boot the target VM from the xubuntu installation.
  5. You might need to get the network configured on a free public IP address if you don't already have one. An address in the same subnet is ideal.
    • If the build network has already been configured, it may mess up your routing. Disconnect that interface in the VMware console and login again through the public IP.
  6. The xubuntu rescue installation should already accept password auth for root (if you have any problems, check PermitRootLogin and PasswordAuthentication in its sshd_config)
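
The sizing rule from step 1 can be roughed out in shell. This is only a sketch and not part of the original procedure: the function name suggest_vmdk_gb is made up for illustration, and it assumes df -P style output in 1K blocks. It sums the used space, adds the 50% headroom and rounds up to whole GB. If the source uses bind-mounts the same data may be counted more than once, so treat the result as an upper estimate.

```shell
# Hypothetical helper: reads `df -P` output on stdin (1K blocks), sums the
# "Used" column, adds 50% headroom and prints a size in whole GB (rounded up).
suggest_vmdk_gb() {
    awk 'NR > 1 { used += $3 }
         END {
             gb = used * 1.5 / 1048576
             whole = int(gb)
             if (whole < gb) whole++
             print whole
         }'
}

# On the source: local filesystems only
df -P -l | suggest_vmdk_gb
```

For the example in step 1 (100gb used), this prints 150.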

Setup partitioning on the target disk

Modern style

  1. Partition the "main", empty vmdk. Now is a good time to move to the single-partition scheme from our old four, so we'll do that
    • 100mb /boot partition with type of 0x83
    • Leave the rest as LVM physical volume, type 0x8e
    • For the following directions, assume a "normal" case of /dev/hdb on the target - the migrationhelper will probably be occupying /dev/hda

  2. Configure LVM on the target.
    • These commands are copied from dedicated/Convert_Ext3_to_LVM. RHEL's rescue environment doesn't expose them as executables in the path; if you end up there, run them as lvm subcommands instead (eg. lvm pvcreate /dev/hdb2)

      pvcreate /dev/hdb2
      vgcreate SERVERNAME /dev/hdb2
      lvcreate -L 1024M -n swap SERVERNAME
      lvcreate -l 100%FREE -n root SERVERNAME

      Check your handiwork with the pvs, vgs, and lvs commands. If things don't appear you may need to poke it a bit by using the pvscan, vgscan and lvscan commands.

  3. Create the filesystems for the target

    mkfs.ext3 /dev/hdb1
    mkfs.ext3 /dev/SERVERNAME/root
    mkswap /dev/SERVERNAME/swap
  4. Get the new filesystems mounted at /sysimage

    mkdir /sysimage
    mount /dev/SERVERNAME/root /sysimage
    mkdir /sysimage/boot
    mount /dev/hdb1 /sysimage/boot
    rmdir /sysimage/lost+found /sysimage/boot/lost+found

Old-school mode (Debian Sarge, no LVM 4 U)

  1. Partition the "main", empty vmdk. You're probably not using LVM, but you're using RAID, which we no longer need
    • Copy the partition table from the source machine with sfdisk. Dump the partitions, copy the file across, then splat the partition table onto the vmdk

      # On the source
      sfdisk -d /dev/hda > /root/hda_partitions
      
      # Copy it to the target with copy-pasta or scp
      
      # Splatter the partition table on the target
      cat /root/hda_partitions | sfdisk /dev/sda
    • Check that it looks okay on the target

      sfdisk -d /dev/sda
      # partition table of /dev/sda
      unit: sectors
      
      /dev/sda1 : start=       63, size=   979902, Id=83, bootable
      /dev/sda2 : start=   979965, size=  3903795, Id=82
      /dev/sda3 : start=  4883760, size=  3903795, Id=83
      /dev/sda4 : start=  8787555, size= 69368670, Id= 5
      /dev/sda5 : start=  8787618, size=  3903732, Id=83
      /dev/sda6 : start= 12691413, size= 65464812, Id=83
    • You want to tweak the partition types for non-RAID now. Run cfdisk and set partitions to type 0x83, except for the designated swap, which is 0x82
    • Write out the updated partition table, you should be done now
  2. Create the filesystems for the target

    mkfs.ext3 -L root /dev/sda1
    mkswap /dev/sda2
    mkfs.ext3 -L var /dev/sda3
    mkfs.ext3 -L usr /dev/sda5
    mkfs.ext3 -L data /dev/sda6
  3. Get the new filesystems mounted at /sysimage

    mkdir /sysimage
    mount /dev/sda1 /sysimage/
    cd /sysimage/
    rmdir lost+found/
    mkdir data usr var
    mount /dev/sda3 var/
    mount /dev/sda5 usr/
    mount /dev/sda6 data/
    rmdir */lost+found
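
As an alternative to retyping the partition types by hand in cfdisk (step 1), the dump can be rewritten before it's fed to sfdisk. This is a sketch under assumptions: the helper name raid_to_plain is made up, and it assumes the RAID members show up in the dump as type fd (Linux raid autodetect). It does not know which partition is meant to be swap, so you still need to set that one to 82 yourself.

```shell
# Hypothetical helper: rewrite "Linux raid autodetect" (fd) partition types
# to plain Linux (83) in an sfdisk dump read from stdin.
# Handles the old "Id=fd" dump format and the newer "type=fd" one.
raid_to_plain() {
    sed -e 's/Id=fd/Id=83/' -e 's/type=fd/type=83/'
}

# Example, on the target (file name as used in step 1):
# raid_to_plain < /root/hda_partitions | sfdisk /dev/sda
```

Check the result with sfdisk -d /dev/sda as before.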

Make decisions about how you'll do the rsync

We're in the enviable position of being able to completely reshape the disk subsystem without too much work at all.

  • If you're migrating from a physical server then you very likely have software RAID on the source. We can get rid of that! (you're already doing this)
  • If the source is a really old machine it might not be using LVM. We can make that better! (you would've done this just a moment ago if it applies to you)
  • You almost certainly have bindmounts on the source. Seeing as our new systems eschew that for one big root volume, we can ditch the bindmounts too.

This will definitely vary on a system-by-system basis, but I'm sure you can figure this out.

Bindmount-preserving method

  1. Login to the source server, we need to build a list of exclusions for the rsync process - the exclusions are stolen from our rdiff backup procedure

    exclude=$(awk 'BEGIN { exclude="bind" } /^nodev/ { exclude=(exclude "|" $2) } END { print exclude }' /proc/filesystems)
    mount | egrep "$exclude" | awk '{ print $3 } END { print "/selinux\n/dev\n/mnt\n/tmp" }' | sort -u > /NOCOPY

    Have a look at the output of mount on the source server, look for anything else you might not want to copy, such as external USB drives.

  2. An example NOCOPY file

    /dev
    /dev/pts
    /dev/shm
    /home
    /home/investor/olddata
    /mnt
    /opt
    /proc
    /proc/bus/usb
    /proc/sys/fs/binfmt_misc
    /selinux
    /sys
    /tmp
    /var/lib/mysql
    /var/lib/nfs/rpc_pipefs
    /var/lib/pgsql

Ditching bindmounts

Proceed as above for bindmount-preserving, but observe the following notes. These notes might not be 100% accurate or applicable to your system, so you need to use your head.

  1. When editing /etc/fstab and /etc/mtab (this is covered later on at the relevant stage anyway), delete any mention of bindmounts

    # POSSIBLE FSTAB
    proc /proc proc defaults 0 0
    /dev/bosun/root / ext3 defaults 1 1
    LABEL=/boot /boot ext3 defaults 1 1
    sysfs /sys sysfs defaults 0 0
    tmpfs /tmp tmpfs defaults,size=500m,nosuid,nodev 0 0
    
    # POSSIBLE MTAB
    /dev/mapper/bosun-root / ext3 rw 0 0
    /dev/sda1 /boot ext3 rw 0 0
    proc /proc proc rw 0 0
    sysfs /sys sysfs rw 0 0
    devpts /dev/pts devpts rw,gid=5,mode=620 0 0
    tmpfs /dev/shm tmpfs rw 0 0
    tmpfs /tmp tmpfs rw,nosuid,nodev,size=500m 0 0
  2. Create NOCOPY differently, you need to decide for yourself what needs to be excluded and what doesn't. In this example, I've chosen to copy /dev to make life easier later (no need to fiddle with MAKEDEV, but I still need to create the LVM devices). I've also ignored the bindmount sources from /data, and removed the places they're normally bound to. We're not running rsync with the -x flag, so rsync happily traverses between filesystems on the source.

    /dev/pts
    /dev/shm
    /data/home
    /mnt
    /data/opt
    /proc
    /proc/bus/usb
    /selinux
    /sys
    /data/var.lib.mysql
    /data/var.lib.postgres
    /data/var.lib.mysqlbackup
    /data/var.log
  3. We only mount the proc and sysfs special filesystems later on, /dev is already there.

Perform first (bulk) copy of data

  1. Schedule downtime for the source server in Nagios, this is probably going to take several hours
  2. Customer must be fully informed of what's happening, what we plan to do, how we plan to do it, what our contingencies are
  3. Setup some SSH keys on the source and target (migrationhelper), so root@source can SSH directly to root@migrationhelper without a passphrase
    • Lock down the from address on the migration helper for security, eg.

      from="202.4.12.23" ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAIEAmPHNFjlT/vJoXJPxnLSz5lsFreYNn/+H2Mk7kli5QVip6RAyKXobAPjKiZclLRcM/gPiwuTvzxIvYzcbvDEzV3+sxKmm6v38OBYtwa/rKDDoP9XUnASLDLJ0ESZtTgbzM5tbheY2pHPPkRhnkW2OII8tZkZXs67rCd52k8cNBo0= root@SOURCESERVER
  4. Start the first rsync on the source server. Any currently-open files will be skipped, but they'll be copied in the next step
    • We're copying the entire filesystem tree, with exclusions for non-real files
    • Note that we're not using the -x flag - this means that filesystem boundaries will be crossed, so it's important that your NOCOPY file is good.

    • It's important that you use --numeric-ids on the rsync. It's not a problem 99% of the time, but unless the systems are identical there's a chance that uids and gids on the source don't match those on the migration helper. This would cause pain and problems later on.

    • Your magic invocations:

      rsync -avn --delete --numeric-ids --exclude-from=/NOCOPY / root@migrationhelper:/sysimage/   # best to dry-run it first, see that it looks sane enough, but it'll be a bit verbose for easy reading
      rsync -av --bwlimit=5120 --delete --numeric-ids --exclude-from=/NOCOPY / root@migrationhelper:/sysimage/   # bandwidth limiting is not a bad idea, for everyone's sake
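
For step 3 above, the restricted authorized_keys entry can be generated rather than hand-edited. This is just a sketch: the function name restricted_key_line and the key file path in the example are made up, and the IP is the one from the example entry.

```shell
# Hypothetical helper: print an authorized_keys line that only accepts the
# given public key when the connection comes from the given address.
restricted_key_line() {
    # $1 = address allowed to use the key, $2 = path to the public key file
    printf 'from="%s" %s\n' "$1" "$(cat "$2")"
}

# Example, on the migration helper (paths are illustrative):
# restricted_key_line 202.4.12.23 /tmp/source_id_rsa.pub >> /root/.ssh/authorized_keys
```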

Perform final copy of data

  1. Shut down all services on the source, other than networking. Check that all possible files are closed, by using lsof

  2. Re-run the rsync to get the remainder of the files.
    • The target should now resemble the source machine, as though it just had the power pulled out
  3. Re-enable services on the live machine, if you're assured that it's okay to lose changes made after this point.
  4. Or take the source machine down now, in a state where it's available to be used again at a moment's notice

    • You could do this by administratively shutting down the switch-port

Take a pre-munge snapshot

  • If using VMware: take a VMware snapshot now, label it "post-copy, pre-cleanup"
  • If using Xen: unfortunately it's not possible to promote a snapshot to a fully-fledged volume. Thus, a snapshot is great for practising, but no good for keeping a copy to roll-back to (not trivially, anyway. you could dd from the unmunged snapshot, but it'd be slow)

Migration munging

This is where we massage the physical machine into the VM environment.

  1. Mount the necessary magic filesystems in the chroot on the migrationhelper

    mkdir /sysimage/proc
    mount -t proc proc /sysimage/proc
    
    mkdir /sysimage/sys
    mount -t sysfs sysfs /sysimage/sys
    
    # only do this if you're not rsync'ing `/dev`, this depends on your earlier decisions
    mkdir /sysimage/dev
    mount -o bind /dev /sysimage/dev
  2. Jump into the chroot

    chroot /sysimage
  3. Ensure that all necessary mountpoints exist - I think this is just for bind-mounts. Non-exhaustive examples

    mkdir /home
    mkdir /opt
    mkdir /var/lib/mysql
    mkdir /var/lib/pgsql
    mkdir /var/lib/postgres  # <-- Debian Sarge(?)
    mkdir /var/lib/mysqlbackup
    mkdir /var/spool
  4. We typically expect some temp space, tmpfs is our usual nowadays

    mkdir /tmp
    mount -t tmpfs tmpfs /tmp -o defaults,size=500m,nosuid,nodev
    
    # If the source isn't using tmpfs, your /tmp needs to be world-writeable and sticky
    chmod 777 /tmp
    chmod +t /tmp
  5. On Debian (all versions?) the initramdisk expects to mount a tmpfs to /lib/init/rw/

    mkdir /lib/init/rw
  6. Take a backup copy of fstab, mtab and the grub configuration. It's not strictly needed, but it's helpful if you need to refer to a pre-change state quickly

    cp /etc/fstab /root/
    cp /etc/mtab /root/
    cp /boot/grub/menu.lst  /root/   # for Debian-type
    cp /boot/grub/grub.conf /root/   # for Redhat-type
  7. Correct /etc/fstab to reflect changes in partitioning. This will be dependent on the particular system you're working with, this is just a guide.

    • Remove references to /usr, /var, /data, swap

    • You may wish to add sysfs if it's not already there (sysfs is a newish thing); ensure /sys exists

    • Update the root mount and add the entry for /boot, etc.

    • An example of an entirely reasonable fstab

      /dev/mapper/SERVERNAME-root     /               ext3    defaults,noatime,nodiratime     1 1
      /dev/hda1                       /boot           ext3    defaults,noatime,nodiratime     1 2
      /dev/mapper/SERVERNAME-swap     swap            swap    defaults        0 0
      
      none                            /proc           proc    defaults        0 0
      none                            /sys            sysfs   defaults        0 0
      tmpfs                           /tmp            tmpfs   defaults,size=500m,nosuid,nodev 0 0
      none                            /dev/pts        devpts  gid=5,mode=620  0 0
      none                            /dev/shm        tmpfs   defaults        0 0
  8. Correct /etc/mtab to reflect the changes in the fstab, you have to figure out the specifics for yourself.

    • Remove usbfs if you see it in mtab, the OS will be confused next time it boots and tries to mount usbfs again, or fails to mount it

    • A typical mtab example

      /dev/mapper/SERVERNAME-root / ext3 rw 0 0
      /dev/hda1 /boot ext3 rw 0 0
      proc /proc proc rw 0 0
      sysfs /sys sysfs rw 0 0
      devpts /dev/pts devpts rw,gid=5,mode=620 0 0
      tmpfs /dev/shm tmpfs rw 0 0
      tmpfs /tmp tmpfs rw,nosuid,nodev,size=500m 0 0
      none /proc/sys/fs/binfmt_misc binfmt_misc rw 0 0
      sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw 0 0
  9. grep /etc for references to md0 and friends (will depend on config of source server), make notes on anything you find, use your judgement to determine whether they'll need to be updated

    grep -r md0 /etc/*
    grep -r md1 /etc/*
    # ...and so on for each md device on the source
  10. Correct any issues relating to your grep of /etc - some examples from my own work:

    • Delete /etc/lilo.conf.anaconda

    • Delete /etc/lvm/.cache or /etc/lvm/cache

    • Remove array definitions from /etc/mdadm/mdadm.conf or remove it entirely

    • Remove lines referring to md-raid from /etc/rc.d/rc.sysinit

  11. Check out /etc/inittab (may not apply on very new systems using upstart). Check for any gettys on physical serial lines (ttyS0, for example). These generally won't work on a VM, and shouldn't be needed, so comment them out.

  12. Check out /etc/modprobe.conf for anything relevant, correct appropriately (doesn't apply to Debian-type, apparently)

    • Of particular importance is controller driver modules and whatnot
    • This is for Redhat

      alias eth0 8139cp
      alias scsi_hostadapter ata_piix
    • This is for Redhat on KVM, it's different!

      alias scsi_hostadapter ata_piix
      # not sure what the deal is with this next one
      alias scsi_hostadapter1 virtio_blk
      alias eth0 virtio_net
      alias ethX virtio_net
      ...
  13. The initrd almost certainly needs rebuilding due to the sudden change in drive controller
    • This will rely on correct detection of LVM, which means we need to bindmount /dev from outside the chroot

      exit   # jump out of the chroot
      mount -o bind /dev /sysimage/dev
      chroot /sysimage   # again
    • Backup the existing initrd, just in case (it can't really help you anyway and there's no risk of losing anything here)

      cp /boot/initrd-XXX-XXX.yyyy.img /boot/initrd-XXX-XXX.yyyy.img.BACKUP
    • Redhat-type uses mkinitrd, which you throw some commandline options at. Figure out the correct uname -r for yourself in the chroot, as it'll misdetect the running kernel of the migrationhelper

      mkinitrd -v -f /boot/initrd-`uname -r`.img `uname -r`
      
      # Example for RHEL4 in a Centos 5 migrationhelper
      mkinitrd -v -f /boot/initrd-2.6.9-89.0.11.EL.img 2.6.9-89.0.11.EL

      Check the output and look for copying of a statically-linked lvm binary, which looks like this:

      /sbin/lvm.static -> /var/tmp/initrd.yD5560/bin/lvm
      /etc/lvm -> /var/tmp/initrd.yD5560/etc/lvm
      `/etc/lvm/lvm.conf' -> `/var/tmp/initrd.yD5560/etc/lvm/lvm.conf'
      • Under VMware: This may (but shouldn't) require use of --with=MODULENAME to get the necessary modules for the VMware virtualised hardware

        • mptbase

        • mptspi

        • ata_piix

        • mptscsih - unconfirmed

    • Debian-type (Etch and later) uses mkinitramfs, which has lots of magic - unsure of any necessary extra parameters
      1. Dig around in /etc/initramfs-tools/ and see if you need to modify anything. Modules can be added in modules, other config is in initramfs.conf.

      2. While you're there, delete anything in conf.d/resume

      3. For some reason the update flag doesn't work, so just delete and recreate the initrd; you need to specify the version you want

        update-initramfs -v -d -k 2.6.18-5-686
        update-initramfs -v -c -k 2.6.18-5-686
      4. Check the output of mkinitramfs and look for the modules you expect it to have

    • Debian-type (Sarge and earlier) uses mkinitrd instead. You can probably specify extra modules in /etc/mkinitrd/. Another catch is that update-initrd only does 2.2 and 2.4 kernels, WTF guys!? You have to use mkinitrd instead.

      # update-initrd won't handle a 2.6 kernel
      update-initrd 2.6.8-4-686
      
      # so build the initrd with mkinitrd directly
      mkinitrd -k -o /boot/initrd.img-2.6.8-4-686.NEW 2.6.8-4-686
      • If you get some error about there being no physical volumes, you probably need to activate them. pvscan will usually do this, but in some cases it may say it finds nothing. Chances are /etc/lvm/lvm.conf is configured to ignore the relevant block devices for some reason. Try one of the following settings, you can then also run vgscan and lvscan if you're still having no love.

            # By default we accept every block device
            # filter = [ "a/.*/" ]
        
            # Exclude the cdrom drive
            filter = [ "r|/dev/cdrom|" ]
        
            # If sysfs is mounted (2.6 kernels) restrict device scanning to 
            # the block devices it believes are valid.
            # 1 enables; 0 disables.
            sysfs_scan = 0
    • For Debian Sarge this may still not work when you reboot. If that's the case, just get an initrd from a clean Sarge installation straight to VMware.
  14. Correct /boot/grub/{menu.lst,grub.conf}, you'll need to use your judgement to make sure you get everything

    • If a Debian-type system then you've got automagic sections to update - just quietly, update-grub can probably do this for you; I'm not sure
      • kopt=

    • root=(hd0,0)

    • kernel /vmlinuz-2.6.18-92.1.22.el5 ro root=/dev/SERVERNAME/root

    • initrd /initrd-2.6.18-92.1.22.el5.img

  15. Fix the MBR with grub, as per post-driveswap steps from our Linux software RAID failure procedure

    #roughly
    device (hd0) /dev/sda
    root (hd0,0)
    setup (hd0)
  16. Check the boot-time device map in /boot/grub/device.map, make corrections as appropriate

  17. Check the network configuration for any hardcoded MAC addresses. If present, update them to use the MAC address/es shown in ip link

  18. For Debian: there are some details here about udev and consistent network interface naming. You should probably update whatever line/s is in /etc/udev/rules.d/z25_persistent-net.rules and cross your fingers

  19. If you're not sure of the root password, now is a good time to set it to something known. Note the current password hash on a ticket, then set it to something internally-obvious, like test123

    touch /root/current_root_password_hash
    chmod 0600 /root/current_root_password_hash
    grep ^root /etc/shadow > /root/current_root_password_hash
    passwd
  20. Exit the chroot, I think you're done.
  21. Unmount everything under /sysimage - this isn't strictly necessary, but I've seen the migrationhelper get hung up and have trouble unmounting things during shutdown
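
Step 13 warns that uname -r inside the chroot reports the migrationhelper's kernel, not the target's. One way to find the version to pass to mkinitrd or update-initramfs is to list what has modules installed. This is a sketch: the function name installed_kernels is made up, and it assumes every installed kernel has a directory under /lib/modules.

```shell
# Hypothetical helper: list kernel versions that have modules installed
# under the given root (default: /), one per line.
installed_kernels() {
    ls -1 "${1:-/}/lib/modules"
}

# Example, inside the chroot:
# installed_kernels
# Or from outside it:
# installed_kernels /sysimage
```

If more than one version is listed, cross-check against the kernel line in the grub configuration to see which one will actually boot.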

Take a post-munge snapshot

If using VMware, take another snapshot now, and label it "post-copy, post-cleanup, pre-reboot". If anything comes up wrong, you can roll back to this state, where everything is still mounted and easily munge-able

Bringing it up

Things should now just work. Boot the VM and test to make sure everything's working as expected

Watch out for the network coming up; the source server should be downed or otherwise "shot in the head". Alternatively, fudge the network config in Xen/VMware to ensure it can't break anything, then bring it up while the source machine is still running.

  1. Shutdown the migrationhelper
  2. Detach the target disk from migration helper. For Xen this means commenting it out in the config, for VMware this means "disconnecting" it
  3. Attach the target disk to the VM it really belongs to. This is basically nothing to do in Xen, and an appropriate "connection" in VMware
  4. Check that the hardware and network configuration looks sane as far as Xen/VMware is concerned
  5. Boot the VM
  6. If in VMware, jump into the BIOS and sanity-check the hard drives and boot order

Fingers crossed! This should hopefully work.

Tidying up the VPS

At your leisure, complete the switch of the live and virtual servers and finish patching up the target

  • Put back the root password ASAP. If you're unsure about the old root password matching our records (and the customer is cool with it), just reset the password so it matches our records. If you're reinstating the old password hash, use the vipw command to do it.

  • If backups were configured on the source server, fiddle eth0.241 to eth1
  • Ensure that the VM will autostart when the host boots
    • Xen
      1. There's a symlink in /etc/xen/auto/ to ../servername.cfg

    • VMware
      1. VMI control panel
      2. Click on the host
      3. Choose the Configuration tab
      4. Select "Virtual Machine Startup/Shutdown" from the main pane, under Software
      5. Click "Properties" in the top-right of the main pane
      6. Find the VM and shuffle it up into the Any Order section
  • Are backups configured and working?
    • You'll almost certainly need to revise your list of inclusions/exclusions, now that you've restructured the filesystem mounts
  • Are all SSL sites working? This is a good way to test that the network is all good
  • Remove any raid checks from nagios, it's not needed any more
  • what else are we forgetting..?

Optimising

  • Adjust installed packages appropriate to a VM, taken from HardeningInitialInstallation - a VM simply doesn't need a lot of things

    • avahi*
    • nfs-utils*
    • portmap*
    • pcsc*
    • smartmontools
    • kudzu
    • firstboot-tui
    • system-config*
    • mdadm
  • Check that swap sizing is appropriate. We used to suggest ditching the swap partition and using a swapfile. Do that if you need to, directions at LinuxSwapfile.

Housekeeping

  1. Shutdown the migration helper (if applicable)
  2. Update the provisioning system
    1. Rename the old asset to SERVERNAME.old
    2. You can then create a new VM with the same name, this is the same as the usual VM creation procedure
    3. Create HDD and NIC sub-resources as applicable
    4. The VPS is marked as In Use
    5. The old server is marked To Be Decommissioned
    6. If the buildsheet exists, dupe that page and cleanup both as best as possible
  3. Arrange to have the physical server physically switched off so it can't accidentally come up and screw you before it's prepared
  4. Create a ticket for the decommissioning of the old physical host
  5. Clean up the snapshots if everything is okay

