Easy, automated backups with rdiff-backup

Let me start by saying rdiff-backup is exactly the tool I set out to find when looking for a backup solution. It makes use of librsync so you get all the goodness of the rsync remote-delta optimization algorithm, plus incremental backups. To top it off, your most recent backup is stored as a simple mirror so recovering your most recent backup of a given file is just a matter of (s)cp'ing it.
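To make the recovery story a little more concrete, here is a minimal shell sketch of both cases: grabbing the latest copy straight from the mirror, and asking rdiff-backup for an older version. The function names, the paths in the comments, and the RDIFF_BACKUP override variable are assumptions made up for illustration; the only rdiff-backup feature used is its -r (restore as of a given time) option.

```shell
#!/bin/sh
# Sketch only: function names and paths are hypothetical.
# RDIFF_BACKUP can be overridden (e.g. for a dry run); defaults to the real binary.
RDIFF_BACKUP="${RDIFF_BACKUP:-rdiff-backup}"

# Latest version: the mirror is a plain directory tree, so a copy is enough.
restore_latest() {
    mirror="$1"; relpath="$2"; dest="$3"
    cp -p "$mirror/$relpath" "$dest"
}

# Older version: let rdiff-backup rebuild the file as it was at a given
# time, e.g. "10D" for ten days ago.
restore_as_of() {
    when="$1"; mirror="$2"; relpath="$3"; dest="$4"
    "$RDIFF_BACKUP" -r "$when" "$mirror/$relpath" "$dest"
}

# Example calls (assumed paths):
# restore_latest /path/to/mirror notes.txt /tmp/notes.txt
# restore_as_of 10D /path/to/mirror notes.txt /tmp/notes.old.txt
```

The first function is really just cp with extra steps, which is the whole point: no special tool is needed for the common case.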

To set up automated, unattended backups (the holy grail of data backups), I followed a good HOWTO guide for setting up unattended backups, so I won't duplicate it here; I'll just summarize and detail the parts that are unique to Debian or ARM/embedded systems (I'm running Debian squeeze on a SheevaPlug). For simplicity, I'll use the same machine names as the original HOWTO: kittie for the backup server and fishie for the machine to be backed up. I'm only backing up one home directory on fishie, for a hypothetical user we'll call nemo.

  1. First, install rdiff-backup on kittie and fishie:
    sudo apt-get install rdiff-backup
  2. Create an account for your backup bot:
    adduser --system --group --disabled-login --disabled-password --home /var/backupbot backupbot
    My 1 TB USB hard drive is mounted at /var so that is where I place the home directory. This also follows the FHS guidelines.
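  3. Create a public/private key pair for backupbot (leave the passphrase empty when ssh-keygen asks, so that cron can use the key unattended), copy the public key to the authorized_keys file of nemo@fishie, then test it: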
    sudo -s -u backupbot
    ssh-keygen -f ~backupbot/.ssh/id_fishie
    ssh-copy-id -i ~backupbot/.ssh/id_fishie.pub nemo@fishie
    ssh -i ~backupbot/.ssh/id_fishie nemo@fishie
  4. Assuming the above commands worked, you should now be logged in as nemo@fishie. Open an editor and prepend the following, followed by a single space, to the long line of gibberish that was just added to ~nemo/.ssh/authorized_keys:
    command="rdiff-backup --server --restrict-read-only /home",from="kittie",no-port-forwarding,no-X11-forwarding,no-pty 
    Now any SSH session that is authenticated using this key will (1) automatically launch rdiff-backup in server mode and (2) have several SSH features disabled (for extra security).
  5. As backupbot, open ~backupbot/.ssh/config (sudo -u backupbot emacs ~backupbot/.ssh/config) and enter the following:
    host nemo-backup
        hostname fishie
        user nemo
        identityfile /var/backupbot/.ssh/id_fishie
        compression no
        protocol 2
    Some notes about these options:
    • I don't use root to log into the remote machines because I'm only backing up home directories. This means you should take care not to create files within your home directory that you do not have read access to (or at least keep them confined to one subdirectory so that rdiff-backup can skip them with a simple --exclude option). To find and fix any files that you may have already created while (e.g.) running as root, use the following command:
      sudo find ~nemo -not -user nemo -exec chown nemo:nemo {} +
    • The first time I ran a backup I used 'compression yes'. I promise never to do that to you again, poor little SheevaPlug. If you have a slower link or a faster CPU, compression may make sense. For further optimization you may want to experiment with the 'cipher blowfish' option. It can supposedly run at 88% of the speed of 'cipher none', but I haven't had a chance to play with it myself.
  6. The last step is to automate the backup. Run sudo -u backupbot EDITOR=emacs crontab -e. Add the following crontab line:
    30 0 * * * rdiff-backup nemo-backup::/home/nemo /var/backupbot/nemo && rdiff-backup --remove-older-than 3M /var/backupbot/nemo && echo "Backup successful"
    This will run a backup every night at 12:30 AM and delete any incremental backups older than 3 months.
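If the one-line crontab entry gets unwieldy, the same logic can live in a small wrapper script. What follows is only a sketch under a few assumptions: the function name, the exit codes, and the RDIFF_BACKUP override variable are my own inventions, and the script location is up to you. It runs the same two rdiff-backup commands as the crontab line, but says which step failed.

```shell
#!/bin/sh
# Sketch only: the function name and return codes are hypothetical.
# RDIFF_BACKUP can be overridden (e.g. for a dry run); defaults to the real binary.
RDIFF_BACKUP="${RDIFF_BACKUP:-rdiff-backup}"

nightly_backup() {
    src="$1"     # e.g. nemo-backup::/home/nemo
    dest="$2"    # e.g. /var/backupbot/nemo
    keep="$3"    # e.g. 3M

    # Step 1: pull a new increment from the remote machine.
    if ! "$RDIFF_BACKUP" "$src" "$dest"; then
        echo "Backup FAILED for $src" >&2
        return 1
    fi
    # Step 2: prune increments older than the retention period.
    if ! "$RDIFF_BACKUP" --remove-older-than "$keep" "$dest"; then
        echo "Backup ok, but pruning FAILED for $dest" >&2
        return 2
    fi
    echo "Backup successful"
}

# Example call for the real script (uncomment to use):
# nightly_backup nemo-backup::/home/nemo /var/backupbot/nemo 3M
```

One thing to watch for: as far as I know, rdiff-backup refuses to remove more than one increment at a time unless you pass --force, so if the job has been off for a while the pruning step may complain until you run it once by hand with that option.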

I'll do a follow-up post on how I set up postfix with Gmail's SMTP servers so that each user gets an email each morning with the status of the backup.
