Requirements
Over the last few years I have used various hacks of my own for backing up my desktop(s). There are dozens of packaged backup solutions in Debian/Ubuntu already, but none of them quite fit my requirements:
- KISS! no fancy web UI, storage formats, or millions of plugins and configuration files; backups should just be a normally accessible directory
- Supports a standard backup strategy: daily backups for the last week, weekly backups for the last month, permanent monthly backups. This must not require my computer to be switched on all the time.
- Runs as my own user, so that I don’t need to set it up each time I reinstall my box
- No interactivity; any backup solution that requires me to do anything regularly is doomed to fail.
- Push-style backup to my server through ssh (or derived, like scp or rsync)
- Supports per-directory filtering to avoid backing up unnecessary stuff; my upload bandwidth is very small (e.g. I don’t want to include ~/.cache, and in ~/evolution I want to ignore the cache subdir).
rsnapshot
I have used rsnapshot as a basis for about a year now. It was originally intended for pull-style backups, but that does not work for home setups behind a NAT. It is easy to use it push-style as well, though (details later). It is essentially a fancy wrapper around good old trusted rsync, which is why I liked it from the start: it creates a full tree copy of your data for each snapshot and uses hardlinks to avoid storing duplicate files. So restoring is easy and robust; you can use any file browser to get to your data.
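To illustrate why hardlinked snapshots are cheap yet look like full copies, here is a minimal sketch. It is not what rsnapshot runs internally: it uses GNU `cp -al` as a stand-in for the hardlinking step, and the function name and the `snap.0`/`snap.1` directory names are made up for the demonstration.

```shell
# Hypothetical demo: build two "snapshots" in a temporary directory and
# print the path of that directory.
demo_hardlink_snapshot() {
    work=$(mktemp -d) || return 1

    # Older snapshot with two files.
    mkdir "$work/snap.1"
    echo "unchanged" > "$work/snap.1/a.txt"
    echo "old" > "$work/snap.1/b.txt"

    # The newer snapshot starts as a hardlinked copy (GNU cp): every file
    # shares its data with the older snapshot, so it costs almost no space.
    cp -al "$work/snap.1" "$work/snap.0"

    # A changed file is replaced, not edited in place, so the older
    # snapshot keeps its own version.
    rm "$work/snap.0/b.txt"
    echo "new" > "$work/snap.0/b.txt"

    echo "$work"
}
```

After running it, `ls -li` on both directories should show the same inode number for `a.txt` in each snapshot and different ones for `b.txt`.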
File selection
If in doubt, the backup should include a file rather than exclude it. I value completeness over small storage size, and I just check the volume of a snapshot from time to time to ensure that it doesn’t grow too big (it’s currently on the order of 200 MB, which is small enough for daily deltas to be pushed through a slow DSL uplink without much pain). So my approach is to back up everything in /home/martin except explicitly configured files. For configuring the blacklist I use per-directory .rsync-filter files (which rsync supports natively).
Excerpt of my ~/.rsync-filter:
# global ignores
- *.log
- *.cache
- .*.swp
- .swp
- .*.lock
# only direct subdirectories/files
- /.ICEauth*
- /.Trash
- /.aptitude
- /.ccache
[…]
- /download
- /ubuntu
[…]
(In case you wonder, everything in ubuntu/ is either in the Ubuntu archive or in bzr, so no need to include this.)
Another example is ~/.Private/mozilla/firefox/t3znsw4q.default/.rsync-filter:
- /url*.sqlite*
- /*.bak
- /Cache
- /adblockplus
- /OfflineCache
Please see man rsync, section “FILTER RULES”, for the details of the syntax.
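Before trusting a new set of filter rules, it can help to preview what rsync would actually pick up. The following is a small sketch under the assumption that a dry run against an empty scratch directory is a good enough approximation; the function name `preview_backup` is invented here. It relies on rsync's `-F` option (read per-directory .rsync-filter files), `-n` (dry run) and `--out-format`.

```shell
# Hypothetical helper: list every path rsync would transfer from the given
# directory, honouring per-directory .rsync-filter files. Nothing is copied.
preview_backup() {
    src=$1
    # Empty scratch directory as the destination, so that every file which
    # passes the filters shows up as "to be transferred".
    empty=$(mktemp -d) || return 1
    # -a archive mode, -F per-directory .rsync-filter, -n dry run;
    # %n prints just the file name of each item.
    rsync -aFn --out-format='%n' "$src/" "$empty/"
    status=$?
    rmdir "$empty"
    return $status
}
```

For example, `preview_backup ~ | less` shows the candidate list for the whole home directory. Note that with a single `-F` the .rsync-filter files themselves are part of the backup.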
Having and maintaining a sensible arrangement of your home directory is by far the most difficult aspect of backup, if you need to be stingy with bandwidth.
rsnapshot configuration
You need a central configuration file ~/.rsnapshotrc. The most important settings are the paths to back up, the destination directory, and the modes (daily/weekly/monthly). In addition I include my crontab in the backup, and add a post-backup action to rsync the backup tree to my server. Backups go to /var/backups/$USER on my systems, which is a different partition than /home (can’t stress that enough; today’s file systems are good, but not infallible).
# Note: rsnapshot requires tabs, not spaces, between a keyword and its values.
config_version	1.2
snapshot_root	/var/backups/martin
cmd_rsync	/usr/bin/rsync
link_dest	1
one_fs	1
lockfile	/home/martin/.rsnapshot.lock
rsync_long_args	-F --delete --numeric-ids --delete-excluded
cmd_preexec	/bin/sh -c 'crontab -l > ~/.crontab'
cmd_postexec	/bin/rm ~/.crontab
cmd_postexec	/usr/bin/rsync -e 'ssh -i /home/martin/.ssh/id-backup_rsa' -aHzvPy --delete /var/backups/martin/ piware.de:backup/tick-home
interval	daily	7
interval	weekly	4
interval	monthly	6
backup	/home/martin	martin-home
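A configuration like this is easy to get subtly wrong, so a quick check before the first real run is worthwhile. The sketch below wraps two rsnapshot invocations I am confident exist, `configtest` and the `-t` test mode; the function name and the `RSNAPSHOT` and `CONFIG` variables are illustrative assumptions, not part of my setup.

```shell
# Assumed names for illustration; adjust to taste.
RSNAPSHOT=${RSNAPSHOT:-rsnapshot}
CONFIG=${CONFIG:-$HOME/.rsnapshotrc}

# Hypothetical helper: validate the config, then show what a daily run would do.
check_backup_config() {
    # configtest parses the file and reports syntax errors.
    "$RSNAPSHOT" -c "$CONFIG" configtest || return 1
    # -t prints the commands a daily run would execute without touching anything.
    "$RSNAPSHOT" -c "$CONFIG" -t daily
}
```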
cronnery
The last piece of the puzzle is a script which calls rsnapshot regularly with the desired mode. I wrote a small shell script (which lives in ~/bin/backup) which determines the age of the last daily/weekly/monthly backup, and calls rsnapshot with the correct mode argument. It doesn’t do anything if the last backup was done less than a day ago, so it’s designed to be called very often.
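The logic of such a wrapper can be sketched roughly as follows. This is not the actual ~/bin/backup, just one possible way to do it: it assumes per-mode stamp files in a made-up `~/.backup-stamps` directory to remember when each mode last ran, thresholds of 1, 7 and 30 days, and GNU `stat`. All function and variable names are hypothetical.

```shell
# Assumed names and locations, for illustration only.
RSNAPSHOT=${RSNAPSHOT:-rsnapshot}
CONFIG=${CONFIG:-$HOME/.rsnapshotrc}
STAMP_DIR=${STAMP_DIR:-$HOME/.backup-stamps}

# is_due MODE DAYS: succeed if MODE never ran or last ran at least DAYS days ago.
is_due() {
    stamp="$STAMP_DIR/$1"
    [ -e "$stamp" ] || return 0
    now=$(date +%s)
    last=$(stat -c %Y "$stamp")    # modification time in seconds (GNU stat)
    [ $(( now - last )) -ge $(( $2 * 86400 )) ]
}

# run_backup: call rsnapshot for every mode that is due, coarsest first, so
# that the oldest snapshots are rotated up before a new daily one is taken.
run_backup() {
    mkdir -p "$STAMP_DIR" || return 1
    for spec in monthly:30 weekly:7 daily:1; do
        mode=${spec%%:*}
        days=${spec##*:}
        if is_due "$mode" "$days"; then
            "$RSNAPSHOT" -c "$CONFIG" "$mode" || return 1
            # Only record the run if rsnapshot succeeded.
            touch "$STAMP_DIR/$mode"
        fi
    done
}
```

Saved as a script with a final `run_backup` line, this does nothing when called again within a day, and it catches up on the next call if the machine was switched off when a backup became due.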
The actual cron job just needs to call it every hour:
$ crontab -l
# m h dom mon dow command
05 * * * * $HOME/bin/backup >/dev/null
And voilà, from now on I have e.g. yesterday’s backup on piware.de:backup/tick-home/daily.1/.