Having a solid backup system in place for any project is pretty much essential, but if you’re anything like me it’s something that’s often neglected. Let’s be honest with ourselves: backups are boring, and we’d much rather spend our time working on fun things than botching together shell scripts that copy files around. For this reason, until last night the backup system for my web server consisted of a USB hard drive connected to my home server and a single rsync command run by cron. This has always been perfectly adequate, but it only leaves about a day to restore a deleted file before the next run removes it from the backup forever, which is often a worry.

Luckily we have rdiff-backup, which works in a comparable way to rsync but with one very important difference: it stores the incremental history of every file. Changed a file 2 months ago and need to revert to the previous version? No problem! This is certainly a much more intelligent approach than simply keeping a mirror of the directory.

How it works

The current version of the folder being backed up is stored as it would be with rsync, as a direct mirror. A special folder called rdiff-backup-data is added inside it to store the reverse binary diffs for each file. Each time the backup runs, only the changes to a file are stored. This is a very efficient use of space and means the contents of a file can be reconstructed as they were at any backed-up point in time. For large files you’ll need a bit of CPU time to restore, but nothing too painful really.
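
To get a feel for what is stored, you can ask rdiff-backup to list the increments it holds. This is only a rough sketch: show_history is a helper name I made up, and the path is the backup folder used in the examples further down.

```shell
# Backup folder used in the examples in this post.
BACKUP_DIR="/media/backups/home/jacek"

# show_history is a hypothetical helper, not part of rdiff-backup.
show_history() {
    # The reverse diffs live in the rdiff-backup-data folder inside the backup.
    if [ ! -d "$1/rdiff-backup-data" ]; then
        echo "no rdiff-backup-data folder in $1" >&2
        return 1
    fi
    # One line per point in time the backup can be restored to.
    rdiff-backup --list-increments "$1"
    # The same list, plus how much space each increment takes up.
    rdiff-backup --list-increment-sizes "$1"
}

# Usage: show_history "$BACKUP_DIR"
```

The size listing is handy later on when deciding how much history you can afford to keep.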

Backup

rdiff-backup is designed to back up a specific folder into another folder, so the arguments don’t _quite_ have the same meaning as they do for rsync. If you wanted to make a backup of everything in /home/jacek and store it in /media/backups/home/jacek, you would do something along the lines of

rdiff-backup --backup-mode --create-full-path -v 5 /home/jacek /media/backups/home/jacek
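
If the backup lives on a USB drive and runs from cron, it is probably worth checking that the drive is actually mounted first, otherwise --create-full-path will happily build the folders on whatever disk is underneath. A minimal sketch, assuming the drive is mounted at /media/backups (my guess, adjust to suit); run_backup is a made-up helper name.

```shell
#!/bin/sh
SRC="/home/jacek"
DEST="/media/backups/home/jacek"
MOUNT="/media/backups"   # assumed mount point of the USB drive

# run_backup is a hypothetical helper, not part of rdiff-backup.
run_backup() {
    # $1 = mount point, $2 = source folder, $3 = backup folder
    if ! mountpoint -q "$1"; then
        echo "$1 is not mounted, skipping backup" >&2
        return 1
    fi
    rdiff-backup --backup-mode --create-full-path "$2" "$3"
}

# Usage, for example at the end of a script called by cron:
# run_backup "$MOUNT" "$SRC" "$DEST"
```

I left out -v 5 here since cron tends to mail you everything a job prints.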

Restore

To restore a file you need to know where it was and the time it existed at. If I wanted to restore my desktop wallpaper to how it was last week I’d run

rdiff-backup -r 7D /media/backups/home/jacek/wallpaper.png /home/jacek/wallpaper.png
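
The time given to -r doesn’t have to be a number of days. Here is a rough sketch that restores into a scratch folder, so you can check you have the right version before replacing the live file. restore_preview is a helper name I made up, and the date in the comments is just an example.

```shell
BACKUP="/media/backups/home/jacek"

# restore_preview is a hypothetical helper, not part of rdiff-backup.
# It restores one file into a fresh scratch folder and prints where it put it.
restore_preview() {
    # $1 = time, $2 = path of the file relative to the backup folder
    scratch=$(mktemp -d) || return 1
    rdiff-backup -r "$1" "$BACKUP/$2" "$scratch/$(basename "$2")" || return 1
    echo "$scratch/$(basename "$2")"
}

# Some of the time formats -r accepts:
# restore_preview 7D wallpaper.png           # as it was 7 days ago
# restore_preview 2W wallpaper.png           # as it was 2 weeks ago
# restore_preview 3B wallpaper.png           # at the time of the 3rd newest increment
# restore_preview 2011-03-01 wallpaper.png   # at midnight on a given date
# restore_preview now wallpaper.png          # the most recent backed-up version
```

Once you’re happy with the copy in the scratch folder you can move it into place by hand.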

The history tree can be mounted as a filesystem for easy, if occasionally a little slow, browsing with another tool called rdiff-backup-fs.

rdiff-backup-fs /mnt/restore /media/backups/home/jacek

Now you can browse all of the files in /mnt/restore and copy them out manually. Be warned though, this will compute the diff for every file visible in the folders you open and can be a bit slow.

Removing old data

You probably don’t want the size of the backup growing forever with no limit. Luckily there is a really simple command to clear up old data.

rdiff-backup --remove-older-than 6M --force /media/backups/home/jacek

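This will remove any diff data that is more than 6 months old, so you won’t be able to restore to points in time before that. The current mirror is left alone. The --force is there because rdiff-backup otherwise refuses to delete more than one increment in a single run. You’ll need to experiment to get the right time for your storage space.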
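
One thing to watch with the time argument is that the unit letters are case sensitive: 6M is six months but 6m is six minutes, which combined with --force would throw away nearly all of your history. Here is a small sketch of a guard against that. Both is_safe_age and prune_backup are names I made up, and only accepting D, W, M and Y is my own choice rather than anything rdiff-backup requires.

```shell
# is_safe_age is a hypothetical helper, not part of rdiff-backup.
# It accepts a whole number followed by D, W, M or Y (days, weeks,
# months, years) and rejects everything else, including s, m and h.
is_safe_age() {
    num=${1%?}          # everything except the last character
    unit=${1#"$num"}    # the last character
    case "$num" in
        ''|*[!0-9]*) return 1 ;;
    esac
    case "$unit" in
        D|W|M|Y) return 0 ;;
        *) return 1 ;;
    esac
}

# prune_backup is also hypothetical: it only runs the real command
# once the age has passed the check above.
prune_backup() {
    # $1 = age, $2 = backup folder
    if ! is_safe_age "$1"; then
        echo "refusing to prune with age '$1'" >&2
        return 1
    fi
    rdiff-backup --remove-older-than "$1" --force "$2"
}

# Usage: prune_backup 6M /media/backups/home/jacek
```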

Installing

Super simple, just use your favourite package manager:

apt-get install rdiff-backup rdiff-backup-fs