Backing up Using Rsync

This is a short guide on how to use rsync to back up important files from your self-managed machine to your DICE file space.

Introduction

The first resources you will need are the rsync man page and the rsync documentation. You also need an installation of rsync. Almost all Linux distributions come with an rsync package, often installed as part of the basic installation. You should refer to your distribution's documentation for more information. Mac OS X also has the rsync command line tool installed as part of the standard installation; it is accessible using the Terminal application. Contributions on how best to install rsync on Windows are welcome.

It is worth mentioning at this point the most important command line option available when using rsync: -n (aka --dry-run). When used in conjunction with the -v (aka --verbose) option, this lets you see what will happen before you try it for real. You should always test your rsync commands with the options -v -n (this can be shortened to -vn).

Be careful: if you get your source and destination wrong, you can end up deleting files rather than copying them.

Using Rsync

At its simplest, backing up your home directory on your self-managed machine can be done with the command:

        bash$ rsync -e ssh -av $HOME/ <remote user>@<remote server>:/home/<remote username>/Backup

The -e ssh option instructs rsync to use ssh as the transport mechanism and allows you to transfer files to any machine you can SSH into. The -av options select archive mode for the transfer and request verbose output.

The $HOME/ part is the source location, also referred to as the root of transfer. All other actions are performed relative to this directory. <remote user>@<remote server> refers to the remote username and the remote server you wish to transfer to, for example joe@myserver.inf.ed.ac.uk. The /home/<remote username>/Backup part refers to the destination directory on the remote server to transfer files to, for example /home/joe/Backup.

Note that the /home/<remote username>/Backup directory must already exist on the remote side.
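
One way to make sure the destination directory exists is to create it over ssh before the first transfer. The sketch below is only an illustration: the account and path are the example values used above, and the run helper and the DO_IT variable are hypothetical names introduced here so that the commands are printed rather than executed by default.

```shell
# Assumed example values; replace them with your own account and server.
remote="joe@myserver.inf.ed.ac.uk"
dest="/home/joe/Backup"

# Hypothetical helper: print each command unless DO_IT=yes is set.
run() {
    if [ "${DO_IT:-no}" = "yes" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# mkdir -p succeeds whether or not the directory already exists.
run ssh "$remote" mkdir -p "$dest"

# The basic backup command from above, with the values filled in.
run rsync -e ssh -av "$HOME/" "$remote:$dest"
```

Running the script with DO_IT=yes in the environment executes the two commands for real.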

For most users this basic recipe will need a little tweaking, since backing up all of your home directory is likely to be overkill and to cause storage space problems on the destination side.

A more practical example is:

        bash$ rsync -e ssh -vrlpt \
            $HOME/ \
            <remote user>@<remote server>:/home/<remote user>/Backup \
            --include "/Desktop/" \
            --include "/Desktop/**" \
            --include "/Documents/" \
            --include "/Documents/**" \
            --include "/Library/" \
            --include "/Library/Mail/" \
            --include "/Library/Mail/**" \
            --exclude "*"

In this example the -av options have been replaced with -vrlpt. This is because the -a option implies -rlptgoD, which will attempt to preserve the group and user ownership attributes. This is not always suitable if the users and groups at the two ends have different meanings, which is quite likely to be the case on a self-managed machine. Omitting the -g and -o options ensures that the files on the remote side are owned by your remote user and by the default group for that user.

The -D option allows rsync to handle device files. You are unlikely to have any of these in your home directory, which makes this option unnecessary.
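
The following local sketch illustrates what the reduced option set still preserves: -p keeps permissions and -t keeps modification times. The directory and file names are invented for the example. Because both ends belong to the same local user, it cannot show the ownership behaviour described above.

```shell
# Throwaway local directories stand in for the two ends of the transfer.
work=$(mktemp -d)
mkdir -p "$work/src" "$work/dst"
echo "private" > "$work/src/key.txt"

# Give the source file restrictive permissions and an old timestamp.
chmod 600 "$work/src/key.txt"
touch -t 200509210000 "$work/src/key.txt"

# -r recurse, -l keep symlinks, -p keep permissions, -t keep times.
rsync -rlpt "$work/src/" "$work/dst"

# Both listings should show the same mode and date.
ls -l "$work/src/key.txt" "$work/dst/key.txt"
```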

The important parts are the --include lines and the --exclude "*" at the end. The --include options come in sets:

        --include "/Documents/"
        --include "/Documents/**"

The first line directs rsync to include the directory /Documents/ in the set of files to be copied. The second line directs rsync to include all files and subdirectories (recursively, due to the -r option to rsync) below the /Documents/ directory. All the include patterns are relative to the root of transfer, which is specified by the source argument to rsync, in this case $HOME/.

When there are multiple levels of directory structure above the target directory, the rules are similar but with an extra --include rule for each parent directory:

        --include "/Library/"
        --include "/Library/Mail/"
        --include "/Library/Mail/**"

There must be an --include rule for each directory level that makes up the desired source directory. The above rules will copy the /Library/Mail/ directory structure to the destination, along with all the files inside the /Library/Mail/ directory. They will not copy any other files in the /Library/ tree unless you add an --include "/Library/**" rule. Using these patterns you can build up a very precise picture of what you want to transfer.

The --exclude "*" option is present because the default rule for rsync transfers is to include everything relative to the root of transfer (at least if the -r option is included). When only a subset of the files is to be transferred, the basic strategy is to specify everything you do want using --include rules and exclude everything else.
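
The effect of these rules can be checked safely on a local copy before pointing rsync at a real server. The sketch below builds a small made-up directory tree (all file names are invented for the example) and applies the same style of --include and --exclude rules to it.

```shell
# Build a miniature stand-in for a home directory.
work=$(mktemp -d)
src="$work/home"
dst="$work/Backup"
mkdir -p "$src/Documents/reports" "$src/Library/Mail" \
         "$src/Library/Caches" "$src/Music" "$dst"
echo a > "$src/Documents/reports/q1.txt"
echo b > "$src/Library/Mail/inbox"
echo c > "$src/Library/Caches/junk"
echo d > "$src/Music/song"
echo e > "$src/top-level.txt"

# Include Documents and Library/Mail, then exclude everything else.
rsync -vrlpt \
    --include "/Documents/" \
    --include "/Documents/**" \
    --include "/Library/" \
    --include "/Library/Mail/" \
    --include "/Library/Mail/**" \
    --exclude "*" \
    "$src/" "$dst"

# List what actually arrived at the destination.
find "$dst" -type f | sort
```

Only the files under Documents and Library/Mail should appear in the final listing; Library/Caches, Music and the top-level file are caught by the closing --exclude "*" rule.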

The best way to pass complex --include and --exclude rulesets to rsync is to use the --include-from=FILE option. This allows the sets of rules to be listed in a file rather than on the command line. The notation is similar to that of the --include and --exclude options, although a little shorter: a line starting with "+ " is an include rule and a line starting with "- " is an exclude rule. An example file for --include-from=FILE, equivalent to the rules above, would be:

            + /Desktop/
            + /Desktop/**
            + /Documents/
            + /Documents/**
            + /Library/
            + /Library/Mail/
            + /Library/Mail/**
            - *

This would allow you to use the command:

        bash$ rsync -e ssh -vrlpt \
            $HOME/ \
            <remote user>@<remote server>:/home/<remote user>/Backup \
            --include-from=<name of file containing the rules above>
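
A rules file can be tried out locally in the same way. This sketch writes a small rules file and runs rsync with --include-from against a made-up directory tree; the file name backup.rules and the directory contents are assumptions for illustration only.

```shell
# Build a miniature stand-in for a home directory.
work=$(mktemp -d)
src="$work/home"
dst="$work/Backup"
rules="$work/backup.rules"
mkdir -p "$src/Desktop" "$src/Library/Mail" "$src/Library/Caches" "$dst"
echo a > "$src/Desktop/todo.txt"
echo b > "$src/Library/Mail/inbox"
echo c > "$src/Library/Caches/junk"

# Lines starting with "+ " include, lines starting with "- " exclude;
# lines starting with "#" are treated as comments.
cat > "$rules" <<'EOF'
# Keep the Desktop and the mail store only.
+ /Desktop/
+ /Desktop/**
+ /Library/
+ /Library/Mail/
+ /Library/Mail/**
- *
EOF

rsync -vrlpt --include-from="$rules" "$src/" "$dst"

# List what actually arrived at the destination.
find "$dst" -type f | sort
```

Keeping the rules in a file makes it easy to grow the list over time without editing a long command line.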

For more information on --include and --exclude patterns see the EXCLUDE PATTERNS section of the man page.

Other Resources

This page has some Tips and Tricks. For examples of more advanced use of rsync for backup see this very good article.

-- CarwynEdwards - 21 Sep 2005

Topic revision: r3 - 27 Sep 2005 - 04:36:24 - CarwynEdwards
 