Duplicity Backup

By Paulus, 24 June, 2018

What is Duplicity?

Duplicity is a command-line tool for making incremental backups to local or remote storage. Supported backends include Amazon S3, Rackspace Cloud, Dropbox, Google Docs, rsync, SSH, FTP, FTPS, and WebDAV.

Prerequisites

  • gpg
  • swaks
  • duplicity
  • duply (optional)
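
On CentOS, for example, duplicity, duply, and swaks are all available from the EPEL repository, so the prerequisites can be installed with yum (package names may differ on other distributions):

# yum -y install epel-release
# yum -y install gnupg2 duplicity duply swaks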

Generating GPG Keys

Although duplicity uses GPG encryption by default, encryption can be skipped by adding the --no-encryption option to the command.

# gpg --gen-key
or
# gpg --full-generate-key

Depending on the version of GPG you have installed, either --gen-key or --full-generate-key is the option you want in order to specify the following settings.

Select what kind of key you want:
RSA and RSA
DSA and Elgamal
DSA (sign only)
RSA (sign only)

If you selected RSA and RSA, select the key length in bits, how long the key should be valid for, and information such as your real name, email address, and a comment. Finally, enter a passphrase, which you will need to remember. While the key is being generated, do some other work on the machine, such as typing on the keyboard, moving the mouse, or generating disk activity, so that enough entropy is available.

gpg: checking the trustdb
gpg: 3 marginal(s) needed, 1 complete(s) needed, PGP trust model
gpg: depth: 0 valid: 1 signed: 0 trust: 0-, 0q, 0n, 0m, 0f, 1u
pub 4096R/AE6DDA2A 2015-10-12
Key fingerprint = 2A5A 9EFD D958 D0C5 FCAA 713A E51E 324A AE6D DA2A
uid [ultimate] Paul Lyon (Backup Key) <xxxx@xxxx.xxx>
sub 4096R/16E8FE75 2015-10-12

The GPG key ID in this case is AE6DDA2A.
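
If you would rather script this step, GPG also supports unattended key generation from a parameter file. The following is only a sketch: the name, email, and passphrase are placeholders taken from the interactive example, and newer GPG releases may additionally want --pinentry-mode loopback to accept a passphrase in batch mode.

# cat > keyparams <<'EOF'
Key-Type: RSA
Key-Length: 4096
Subkey-Type: RSA
Subkey-Length: 4096
Name-Real: Paul Lyon
Name-Comment: Backup Key
Name-Email: xxxx@xxxx.xxx
Expire-Date: 0
Passphrase: CHANGE_ME
%commit
EOF
# gpg --batch --gen-key keyparams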

Back up your keys; if you lose your private key, you will no longer be able to restore your backups.

# gpg --export -a "Paul Lyon" > public.key
# gpg --export-secret-key -a "Paul Lyon" > private.key

To import them, run the following commands:

# gpg --import public.key
# gpg --allow-secret-key-import --import private.key
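
On a new machine the imported key will be untrusted, and GPG may complain when duplicity encrypts to it. One way to handle this (passing --always-trust to GPG via duplicity's --gpg-options is an alternative) is to mark the key as ultimately trusted in gpg's interactive editor:

# gpg --edit-key AE6DDA2A
gpg> trust
(select 5 = I trust ultimately, confirm, then quit)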

Backing Up

To back up locally, without any encryption:

# duplicity --no-encryption /home/paulus/Pictures file:///mnt/backup/paulus

To back up locally with GPG encryption:

# duplicity --sign-key AE6DDA2A /home/paulus/Pictures file:///mnt/backup/paulus
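
Duplicity performs a full backup the first time and incremental backups on subsequent runs; either behaviour can be forced explicitly by naming the action:

# duplicity full --sign-key AE6DDA2A /home/paulus/Pictures file:///mnt/backup/paulus
# duplicity incremental --sign-key AE6DDA2A /home/paulus/Pictures file:///mnt/backup/paulus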

Other useful options include --exclude, --exclude-device-files, --exclude-filelist, --exclude-regexp, --include, --include-filelist, and --include-regexp, which let you be selective about which files do or do not end up in the backup.
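
For example, to back up only the JPEG files under Pictures, include them and exclude everything else (the order matters, since duplicity applies the first matching selection option):

# duplicity --no-encryption --include '/home/paulus/Pictures/**.jpg' --exclude '**' /home/paulus/Pictures file:///mnt/backup/paulus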

Amazon S3

I've found that backing up to Amazon S3 can be a bit finicky, resulting in different errors on different distributions. For example, on a dedicated server through A Small Orange running CentOS, exporting AWS_ACCESS_KEY and AWS_SECRET_KEY did not seem to work; however, using duply and adding the access key and secret to the duply configuration did.

If you are having any issues, try different combinations of the following:

  • Add or remove the --s3-use-new-style option.
  • Try s3://<amazon s3 endpoint url>/<container>[/<directory>], s3://<container>[/<directory>], or replacing s3 with s3+http (see the example after this list). This usually solves the "No connection to backend" error; see this bug report: https://bugs.launchpad.net/duplicity/+bug/1278529. You can get a list of Amazon endpoints here: http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region
  • If you're using a European bucket, make sure to use the --s3-european-buckets option.
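
For example, a backup that spells out the endpoint explicitly might look like this (the bucket name and region endpoint are placeholders; the credentials are exported as shown further down):

# duplicity --sign-key AE6DDA2A --s3-use-new-style /home/paulus/Pictures s3://s3-us-west-2.amazonaws.com/my-backup-bucket/paulus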

In order to upload backups, an IAM (Identity and Access Management) user needs to be created. This is not a comprehensive guide to managing users; it is just meant to demonstrate how to set up a basic user to get your backups into Amazon's cloud storage. In a larger organization, more care should be given to user creation.

If you selected the option to create an access key for each user, you will be given an access key and a secret access key for every user created.

Download the credentials because this will be the only time that you will see the secret access key.

In order to be able to upload the backups to Amazon S3, the user associated with the access keys must have permission to do so. This is done by attaching a policy to the user account, either by adding the user to a group that has the required policies attached or by attaching those policies to the user account directly.
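
If you have the AWS CLI installed, one of Amazon's managed S3 policies can be attached from the command line as well (the user name here is hypothetical, and in practice a tighter custom policy scoped to a single bucket is preferable):

# aws iam attach-user-policy --user-name duplicity-backup --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess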

# export PASSPHRASE="GPG_KEY_PASSPHRASE"
# export AWS_ACCESS_KEY="YOUR_AWS_ACCESS_KEY"
# export AWS_SECRET_ACCESS_KEY="YOUR_AWS_SECRET_ACCESS_KEY"
# duplicity --sign-key AE6DDA2A --s3-use-new-style /home/paulus/Pictures s3+http://Paulus-Pictures
# unset PASSPHRASE
# unset AWS_ACCESS_KEY
# unset AWS_SECRET_ACCESS_KEY
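
For unattended backups, the same sequence can be wrapped in a small script and run from cron. This is only a sketch; the key ID, credentials, and bucket are placeholders:

#!/bin/bash
# backup-pictures.sh - nightly duplicity backup to S3
export PASSPHRASE="GPG_KEY_PASSPHRASE"
export AWS_ACCESS_KEY="YOUR_AWS_ACCESS_KEY"
export AWS_SECRET_ACCESS_KEY="YOUR_AWS_SECRET_ACCESS_KEY"
duplicity --sign-key AE6DDA2A --s3-use-new-style /home/paulus/Pictures s3+http://Paulus-Pictures
unset PASSPHRASE AWS_ACCESS_KEY AWS_SECRET_ACCESS_KEY

A crontab entry such as "0 2 * * * /usr/local/bin/backup-pictures.sh" would then run it nightly at 2 AM.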

Dropbox

In order to upload to Dropbox, you will need the Dropbox SDK. The easiest way to get it is with the Python package installer (pip). If you're running RHEL or CentOS, you will need to install the python-pip package from the EPEL repository.

# yum -y install python-pip

Install the Dropbox SDK.

# pip install dropbox

During the installation, pip may state some packages need to be upgraded. In my case I had to upgrade pip, urllib3, six, and requests. Use pip’s install command to upgrade any packages that need to be updated:

# pip install --upgrade pip urllib3 six requests

When running a backup for the first time, it must be interactive so that duplicity can store the OAuth token.

# duplicity /home/paulus/Photos dpbx:///Photos
url: https://www.dropbox.com/1/oauth/authorize?oauth_token=TOKEN

Visit the URL and authorize the application to access your files. Once access has been granted, a duplicity subdirectory is created under Apps (along with the Apps directory itself, if it doesn't already exist). The destination in the URL determines where the files will be stored relative to that duplicity folder:

dpbx:/// - Apps/duplicity
dpbx:///Photos - Apps/duplicity/Photos
dpbx:///backups/Photos - Apps/backups/Photos

The duplicity folder is tied to one OAuth token. If you want to back up to Dropbox from multiple machines, you will need to copy the .dropbox.token_store.txt file. In the event that you get an "UnsupportedBackendScheme: scheme not supported in dpbx:///backup" error, look for the file called dpbxbackend.py and uncomment the last line:

duplicity.backend.register_backend("dpbx", DPBXBackend)

I found the file in the /usr/lib64/python2.7/site-packages/duplicity/backends directory.
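
Copying the token to a second machine is a one-liner (the host name is a placeholder; the token file normally sits in the home directory of the user running duplicity):

# scp ~/.dropbox.token_store.txt backup2.example.com:~/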

Listing Files

It's possible to list the contents of a backup collection with the use of the list-current-files command:

# duplicity list-current-files file:///home/backup/paulus

To list the files as of a previous backup point, use the --time option with a date or time argument:

# duplicity list-current-files --time 20151018T040815Z file:///home/backup/paulus

Verifying

After performing a backup, verify it:

# duplicity verify --no-encryption file:///mnt/backup/paulus /home/paulus/Pictures
# duplicity verify --sign-key=AE6DDA2A file:///mnt/backup/paulus /home/paulus/Pictures

To verify backups in an Amazon S3 container:

# duplicity verify --s3-use-new-style s3+http://Paulus-Pictures /home/paulus/Pictures

If everything is OK, you will see this:

Local and Remote metadata are synchronized, no sync needed.
Last full backup date: Mon Oct 12 02:27:24 2015
Verify complete: 5702 files compared, 0 differences found.

Restoring Files

When restoring files, you specify the location the files will be restored to. To do a full restore from a backup:

# duplicity restore file:///mnt/backup/paulus /home/paulus/Pictures

To restore from a certain point in time, use the --time option:

# duplicity restore --time 20151018T040815Z --sign-key AE6DDA2A file:///mnt/backup/paulus /home/paulus/Pictures

Instead of restoring every file, you can specify a single directory or file by using the --file-to-restore option. This option takes a relative path to the file you want to restore. This path is obtained by running the list-current-files command.

# duplicity restore --sign-key AE6DDA2A --file-to-restore cat.jpg file:///mnt/backup/paulus /home/paulus/restored
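
The --time and --file-to-restore options can also be combined, for example to pull back a whole subdirectory as it existed at an earlier backup point (the directory name here is just an example):

# duplicity restore --sign-key AE6DDA2A --time 20151018T040815Z --file-to-restore Vacation file:///mnt/backup/paulus /home/paulus/restored/Vacation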

Duply

Duply is a wrapper for duplicity that makes backing up, verifying, and restoring easier by storing options that build the duplicity command in a configuration file. The first thing to do is create a profile.

# duply amazon create

The profile directory is created either in the user's home directory or in /etc/duply. Within the profile directory are two files: conf and exclude. The most important file is the conf file.

# gpg encryption settings, simple settings:
#  GPG_KEY='disabled' - disables encryption alltogether
#  GPG_KEY='<key1>[,<key2>]'; GPG_PW='pass' - encrypt with keys,
#   sign if secret key of key1 is available use GPG_PW for sign & decrypt
#  Note: you can specify keys via all methods described in gpg manpage,
#        section "How to specify a user ID", escape commas (,) via backslash (\)
#        e.g. 'Mueller, Horst', 'Bernd' -> 'Mueller\, Horst, Bernd'
#        as they are used to separate the entries
#  GPG_PW='passphrase' - symmetric encryption using passphrase only
GPG_KEY='_KEY_ID_'
GPG_PW='_GPG_PASSWORD_'
# gpg encryption settings in detail (extended settings)
#  the above settings translate to the following more specific settings
#  GPG_KEYS_ENC='<keyid1>[,<keyid2>,...]' - list of pubkeys to encrypt to
#  GPG_KEY_SIGN='<keyid1>|disabled' - a secret key for signing
#  GPG_PW='<passphrase>' - needed for signing, decryption and symmetric
#   encryption. If you want to deliver different passphrases for e.g. 
#   several keys or symmetric encryption plus key signing you can use
#   gpg-agent. Simply make sure that GPG_AGENT_INFO is set in environment.
#   also see "A NOTE ON SYMMETRIC ENCRYPTION AND SIGNING" in duplicity manpage 
# notes on en/decryption
#  private key and passphrase will only be needed for decryption or signing.
#  decryption happens on restore and incrementals (compare archdir contents).
#  for security reasons it makes sense to separate the signing key from the
#  encryption keys. https://answers.launchpad.net/duplicity/+question/107216
#GPG_KEYS_ENC='<pubkey1>,<pubkey2>,...'
#GPG_KEY_SIGN='<prvkey>'
# set if signing key passphrase differs from encryption (key) passphrase
# NOTE: available since duplicity 0.6.14, translates to SIGN_PASSPHRASE
#GPG_PW_SIGN='<signpass>'

# uncomment and set a file path or name force duply to use this gpg executable
# available in duplicity 0.7.04 and above (currently unreleased 06/2015)
#GPG='/usr/local/gpg-2.1/bin/gpg'

# gpg options passed from duplicity to gpg process (default='')
# e.g. "--trust-model pgp|classic|direct|always" 
#   or "--compress-algo=bzip2 --bzip2-compress-level=9"
#   or "--personal-cipher-preferences AES256,AES192,AES..."
#   or "--homedir ~/.duply" - keep keyring and gpg settings duply specific
#   or "--pinentry-mode loopback" - needed for GPG 2.1+ _and_
#      also enable allow-loopback-pinentry in your .gnupg/gpg-agent.conf
#GPG_OPTS=''

# disable preliminary tests with the following setting
#GPG_TEST='disabled'

# backend, credentials & location of the backup target (URL-Format)
# generic syntax is
#   scheme://[user[:password]@]host[:port]/[/]path
# eg.
#   sftp://bob:secret@backupserver.com//home/bob/dupbkp
# for details and available backends see duplicity manpage, section URL Format
#   http://duplicity.nongnu.org/duplicity.1.html#sect7
# NOTE:
#   some backends (eg. cloudfiles) need additional env vars to be set to
#   work properly, when in doubt consult the man page mentioned above.
# ATTENTION:
#   characters other than A-Za-z0-9.-_.~ in the URL have to be
#   replaced by their url encoded pendants, see
#     http://en.wikipedia.org/wiki/Url_encoding
#   if you define the credentials as TARGET_USER, TARGET_PASS below duply
#   will try to url_encode them for you if the need arises.
TARGET='scheme://user[:password]@host[:port]/[/]path'
# optionally the username/password can be defined as extra variables
# setting them here _and_ in TARGET results in an error
#TARGET_USER='_backend_username_'
#TARGET_PASS='_backend_password_'

# base directory to backup
SOURCE='/path/of/source'

# a command that runs duplicity e.g. 
#  shape bandwidth use via trickle
#  "trickle -s -u 640 -d 5120" # 5Mb up, 40Mb down"
#DUPL_PRECMD=""

# exclude folders containing exclusion file (since duplicity 0.5.14)
# Uncomment the following two lines to enable this setting.
#FILENAME='.duplicity-ignore'
#DUPL_PARAMS="$DUPL_PARAMS --exclude-if-present '$FILENAME'"

# Time frame for old backups to keep, Used for the "purge" command.  
# see duplicity man page, chapter TIME_FORMATS)
#MAX_AGE=1M

# Number of full backups to keep. Used for the "purge-full" command. 
# See duplicity man page, action "remove-all-but-n-full".
#MAX_FULL_BACKUPS=1

# Number of full backups for which incrementals will be kept for.
# Used for the "purge-incr" command.
# See duplicity man page, action "remove-all-inc-of-but-n-full".
#MAX_FULLS_WITH_INCRS=1

# activates duplicity --full-if-older-than option (since duplicity v0.4.4.RC3) 
# forces a full backup if last full backup reaches a specified age, for the 
# format of MAX_FULLBKP_AGE see duplicity man page, chapter TIME_FORMATS
# Uncomment the following two lines to enable this setting.
#MAX_FULLBKP_AGE=1M
#DUPL_PARAMS="$DUPL_PARAMS --full-if-older-than $MAX_FULLBKP_AGE " 

# sets duplicity --volsize option (available since v0.4.3.RC7)
# set the size of backup chunks to VOLSIZE MB instead of the default 25MB.
# VOLSIZE must be number of MB's to set the volume size to.
# Uncomment the following two lines to enable this setting. 
#VOLSIZE=50
#DUPL_PARAMS="$DUPL_PARAMS --volsize $VOLSIZE "

# verbosity of output (error 0, warning 1-2, notice 3-4, info 5-8, debug 9)
# default is 4, if not set
#VERBOSITY=5

# temporary file space. at least the size of the biggest file in backup
# for a successful restoration process. (default is '/tmp', if not set)
#TEMP_DIR=/tmp

# Modifies archive-dir option (since 0.6.0) Defines a folder that holds 
# unencrypted meta data of the backup, enabling new incrementals without the 
# need to decrypt backend metadata first. If empty or deleted somehow, the 
# private key and it's password are needed.
# NOTE: This is confidential data. Put it somewhere safe. It can grow quite 
#       big over time so you might want to put it not in the home dir.
# default '~/.cache/duplicity/duply_<profile>/'
# if set  '${ARCH_DIR}/<profile>'
#ARCH_DIR=/some/space/safe/.duply-cache

# DEPRECATED setting
# sets duplicity --time-separator option (since v0.4.4.RC2) to allow users 
# to change the time separator from ':' to another character that will work 
# on their system.  HINT: For Windows SMB shares, use --time-separator='_'.
# NOTE: '-' is not valid as it conflicts with date separator.
# ATTENTION: only use this with duplicity < 0.5.10, since then default file 
#            naming is compatible and this option is pending depreciation 
#DUPL_PARAMS="$DUPL_PARAMS --time-separator _ "

# DEPRECATED setting
# activates duplicity --short-filenames option, when uploading to a file
# system that can't have filenames longer than 30 characters (e.g. Mac OS 8)
# or have problems with ':' as part of the filename (e.g. Microsoft Windows)
# ATTENTION: only use this with duplicity < 0.5.10, later versions default file 
#            naming is compatible and this option is pending depreciation
#DUPL_PARAMS="$DUPL_PARAMS --short-filenames "

# more duplicity command line options can be added in the following way
# don't forget to leave a separating space char at the end
#DUPL_PARAMS="$DUPL_PARAMS --put_your_options_here "

The generated conf file contains a lot of documentation that will assist you in configuring the backup profile. The other file that was created, exclude, holds a list of files and directories that should not be backed up, for example:

- **/.android/avd
- **/.cache
- **/.**history
- **/lost+found
- **/Trash
- /home/backup

To exclude files or directories from being backed up, list each one on its own line, starting with a minus sign (-). The double asterisk (**) is a wildcard; for example, **/Trash will exclude all directories named Trash. Two other files worth mentioning, but not created by default, are the pre and post files; these contain commands that should be run before and after the backup.
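
Tying the profile back to the Amazon S3 example, a minimally filled-in conf might look like the following. This is only a sketch: the key ID, bucket, and credentials are placeholders, and because duply sources conf as a shell script, the exported AWS variables (AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, the names boto expects) are handed straight through to duplicity.

GPG_KEY='AE6DDA2A'
GPG_PW='GPG_KEY_PASSPHRASE'
TARGET='s3+http://Paulus-Pictures'
SOURCE='/home/paulus/Pictures'
MAX_FULL_BACKUPS=2
DUPL_PARAMS="$DUPL_PARAMS --s3-use-new-style "
export AWS_ACCESS_KEY_ID='YOUR_AWS_ACCESS_KEY'
export AWS_SECRET_ACCESS_KEY='YOUR_AWS_SECRET_ACCESS_KEY'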

List Backup Sets
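
The output below comes from duply's status command, which lists the backup chains in a profile (shown here for the same "home" profile used in the later examples):

duply home status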

Found primary backup chain with matching signature chain:
-------------------------
Chain start time: Wed May 16 23:02:34 2018
Chain end time: Wed Jun 13 21:45:02 2018
Number of contained backup sets: 11
Total number of contained volumes: 795
 Type of backup set:                            Time:      Num volumes:
                Full         Wed May 16 23:02:34 2018               427
         Incremental         Wed May 23 22:59:23 2018               196
         Incremental         Sat May 26 11:38:07 2018                 3
         Incremental         Fri Jun  1 06:09:54 2018                41
         Incremental         Mon Jun  4 20:56:32 2018                15
         Incremental         Thu Jun  7 20:14:00 2018                 1
         Incremental         Thu Jun  7 21:38:52 2018                 1
         Incremental         Fri Jun  8 20:10:09 2018                 1
         Incremental         Mon Jun 11 20:19:46 2018                 1
         Incremental         Tue Jun 12 19:57:03 2018                 1
         Incremental         Wed Jun 13 21:45:02 2018               108
-------------------------
Also found 6 backup sets not part of any chain,
and 1 incomplete backup set.
These may be deleted by running duplicity with the "cleanup" command.
Using temporary directory /home/backup/tmp/duplicity-kS6MWk-tempdir
--- Finished state OK at 01:35:20.613 - Runtime 00:00:04.929 ---

Cleaning Up

Duply does not automatically clean up unneeded backups. In order to free up space, you must instruct duply to purge old backups by issuing the purge command, which removes backups older than the MAX_AGE set in the conf file. Without --force, duply only lists what would be removed; with it, the backups are actually deleted.

duply home purge --force
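
Duply also understands the purge-full and purge-incr commands, which enforce the MAX_FULL_BACKUPS and MAX_FULLS_WITH_INCRS settings described in the conf file, for example:

duply home purge-full --force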

Restoring

To restore a full backup, specify the profile and the location you want to restore the backup to.

duply home restore /mnt/restore

Restoring a single file is similar to restoring a full backup, except that in addition to the profile you also pass the file to be restored, where to put it, and optionally which version of the file to fetch (an age or date in duplicity's time format; 7D below means the version from seven days ago).

duply home fetch home/paulus/.bash_profile /home/paulus/.bash_profile 7D

Conclusion

Duplicity and duply are great for making personal backups because they are quick and easy to use. To ensure that your data is safe even if your computer is stolen or damaged, you can easily back up to a service provider. As great as these tools are, they aren't a solution I would implement in a business or data center; if you are looking for a tool for a business environment, I would recommend looking at Bacula.