Flexbackup using SSH

Flexible backup of remote systems using a secure data channel.

Copyright C.S. Kirk, Seattle, USA, July 2008

Overview

This paper describes a method for using the 'flexbackup' [1] Perl script to back up whole systems or directories remotely and securely over ssh, tailored specifically to FreeBSD operating systems. The method uses 'afio' to bundle compressed files into a labelled archive. These archives (backups) are built on the backup host and do not require temporary storage on the prime_host, which holds the original data.

The complete system requires the same binaries/programs to exist on both the prime machine (prime_host) and the backup machine (backup_host). The prime machine holds the original data/files to be backed up; the backup machine stores the backed-up files.

The method described here involves:

  1. Outlining a limited/basic backup strategy - (push/pull-local/remote, levels, turnover, reporting)
  2. Installing flexbackup and patching
  3. Configuring the backup machine including mounting the fixed-disk and configuring fstab, installing and configuring binaries such as sudo, afio, tar, etc (on both machines)
  4. Creating a new user account (on both machines)
  5. Creating directories and applying permissions strategies
  6. Configuring ssh keys and transferring to hosts
  7. Configuring sudoers
  8. Configuring flexbackup.conf (targets to backup, log directories, tmp directory, backup destination, report verbiage)
  9. Archive bundling using afio
  10. Restoration of files after backup
  11. Configuring cron (crontab)

The backup/restoration strategy

Typical backup/restoration strategies are the subject of many well-written articles and are better dealt with outside this paper. However, it is reasonable to place this work in context with a brief outline. A backup strategy and plan should be part of an over-arching policy and planning for Disaster Recovery but may nevertheless exist autonomously. Any plan should address the need for regular, effective backups and ensure that they can be restored (or files extracted) in a manner appropriate for the organisation and its daily activities, and in time to meet business needs. Effectiveness should be assessed regularly by reviewing the completeness of the backed-up data and whether it can be restored within a predetermined time-frame based on business demands.

The use of 'flexbackup' as part of this procedure might be realistic for small organisations (or small networks in a large organisation) or for personal use. It might be considered for users who need to make backups either locally within the machine such as to a designated fixed-disk or remotely such as to a dedicated backup machine in a small network environment. When correctly configured it controls a simple backup typically as a 'Push' strategy in which data is 'pushed' from the originating source to the backup directory or backup disk/tape in the same machine. However it can operate remotely as a 'Pull' strategy using a secure connection (such as ssh) and this is discussed below.

The script can control storage to both fixed-disk and to tape.

Installing and patching flexbackup

At the time of writing (July 2008) the latest stable version of 'flexbackup' was version 1.2.1 (2003). The script has been the subject of comment and further work, and the following patches improve or correct its behaviour in important areas. These patches were primarily written by contributors to the 'Gentoo' operating system. Intellectual property remains with the code/patch authors, but the patches are available in accordance with the GPL. The patch authors are acknowledged and thanks are expressed for permission to cite them herein. All patch files and related material are available at the FreeBSD-linked ports repository [2].

The patches are:

# flexbackup-1.2.1-lzma.patch
# flexbackup-1.2.1-secure-tempfile.patch
# flexbackup-1.2.1-bash.patch
# flexbackup-1.2.1-mbuffer-switch.patch
# flexbackup-1.2.1-remote-bufftest.patch
# flexbackup-1.2.1-prune.patch
# flexbackup-1.2.1-spaces-in-filenames.patch
# flexbackup-1.2.1-afio-large-files.patch

Note: The file flexbackup-1.2.1-CAN-2005-2965.patch (not listed above) was superseded by flexbackup-1.2.1-secure-tempfile.patch and should not be used. The files above are listed in the order in which they are recommended for installation.

Note 2: These patches are already incorporated in the FreeBSD-specific port patch under [2] and do not need to be applied manually.

After patching, flexbackup should be installed following the instructions supplied with the original package. This will typically install the Perl script in /usr/local/bin and copy a configuration file to /usr/local/etc. (See also the later notes on running 'flexbackup' from 'cron'.)

Configuring the backup machine

Where a dedicated host in a small network is to store the backup data, that host must be configured to work with the flexbackup script. In such a case the activity becomes a 'Pull' operation, in which management is handled by the backup host and data is 'pulled' to it. For clarity the following definitions apply:

prime_host
The host which stores all original data, images and binaries which are to be backed-up, (example IP address: 10.0.0.15).
backup_host
The host which will receive the backup copies and store them in archives pending restoration. This system has the flexbackup script and configuration, (example IP address: 10.0.0.10).

The backup_host (10.0.0.10) should ideally have a dedicated fixed-disk or tape system (not the root system disk) to store the backed-up data. This should be mounted and tested before use (i.e. the backup system must survive auto-rebooting after a power failure and correctly mount the backup-disk).
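
The mount requirement above can be checked before each backup run. The following is a minimal sketch, assuming a POSIX shell with 'df' and 'awk' available; the function name is_mount_point is hypothetical and is not part of 'flexbackup'.

```shell
#!/bin/sh
# Succeeds only if DIR is itself a mount point.
# 'df -P' prints a header line, then one line whose sixth field is the
# mount point of the file system that holds DIR.
is_mount_point() {
    dir=$1
    mounted_on=$(df -P "$dir" 2>/dev/null | awk 'NR == 2 { print $6 }')
    [ "$mounted_on" = "$dir" ]
}

# Hypothetical use at the start of a backup job:
# is_mount_point /backups || { echo "backup disk not mounted" >&2; exit 1; }
```

Refusing to run when the check fails avoids silently filling the root disk of the backup_host if the dedicated disk did not mount after a reboot.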

The 'flexbackup' script pulls data to the backup_host but uses binaries/programs on both the prime_host and the backup_host. It is imperative that these binaries/programs exist in the same places on both machines to avoid errors. To implement this mirrored approach the following example binaries exist in the same directories on both machines (note that if 'flexbackup' is installed using FreeBSD ports, additional configuration can be automated by running 'make config' before 'make'; this will ensure that afio and the compression utilities are also installed on the system):

/usr/local/bin/flexbackup
/usr/local/etc/flexbackup.conf
/usr/bin/find
/usr/bin/gzip
/usr/local/bin/afio
/usr/local/bin/sudo
/usr/local/bin/gtar
/tmp

Note that the configuration file (flexbackup.conf) can specify the location of a temporary directory. Care should be taken to ensure that this is writable by a non-privileged user. It is recommended that the /tmp directory is used, as this is available on both systems and is also scheduled for clearing at intervals.
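
Since mismatched paths are a likely source of errors, a quick check can be run on each host. This is a sketch only: the function name check_binaries is hypothetical, and the paths in the usage comment are those listed above.

```shell
#!/bin/sh
# Report each given path as present or missing.
# Returns success only if every path exists and is executable.
check_binaries() {
    missing=0
    for p in "$@"; do
        if [ -x "$p" ]; then
            echo "ok       $p"
        else
            echo "MISSING  $p"
            missing=1
        fi
    done
    return "$missing"
}

# Example, to be run on both prime_host and backup_host:
# check_binaries /usr/bin/find /usr/bin/gzip /usr/local/bin/afio \
#     /usr/local/bin/sudo /usr/local/bin/gtar
```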

Creating a new user account

Prudent security principles dictate that it is unwise to permit direct root access remotely (even over ssh). To maintain this principle yet permit backing up of all data (including restricted configuration and data files) it is necessary to adopt an indirect method. In this case a non-privileged user account is created, complete with normal shell access. It is recommended to use the same username and password on both machines. An example would be to use the 'adduser' script (as root) to create a default user named flex.

<backup_host># adduser
Username: flex
...

<prime_host># adduser
Username: flex
...

Creating directories and applying permission strategies

It is recommended that a directory is created with a recognizable name such as (/backups) to aid restoration procedures and identification.

The contents of backup files are no less important than those which they represent and the security procedures and integrity of the system should also now include the backup host and those directories which hold archived data. Security principles might be reviewed to prevent unauthorized access whilst retaining directory trees and permission structures of archive files. One suggested outline is detailed below.

# For a directory structure where the directory /backups contains subdirectories segregating data archives
#
# as root for the flex group (in this example case 'flex' is the group name for backup activity)

<backup_host># cd /
<backup_host># mkdir -m 2750 backups

# 2750 note: the leading 2 is the setgid bit. On a directory it means that new files
# and subdirectories inherit the directory's group (flex, once ownership is set below).

<backup_host># cd backups

# typically create directories to segregate backup archives

<backup_host># mkdir -m 0770 laptop
<backup_host># mkdir -m 0770 main_machine

# create directories to hold important data

<backup_host># mkdir -m 0770 log
<backup_host># mkdir -m 0770 stamps

# note the subdirectories restrict file access to file owners and group owners

<backup_host># cd /
<backup_host># chown -R flex:flex backups
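
The same layout can be rehearsed in a scratch location before it is created for real. The sketch below assumes a POSIX shell; the function name make_backup_tree is hypothetical, and the chown step is deliberately left out so that it can be tried without root privileges.

```shell
#!/bin/sh
# Create the backup directory layout under ROOT (default: /backups).
# Modes follow the example above: 2750 for the top level, 0770 below.
make_backup_tree() {
    root=${1:-/backups}
    mkdir -p "$root" && chmod 2750 "$root" || return 1
    for d in laptop main_machine log stamps; do
        mkdir -p "$root/$d" && chmod 0770 "$root/$d" || return 1
    done
}

# Example rehearsal:
# make_backup_tree /tmp/backups_test && ls -ld /tmp/backups_test /tmp/backups_test/*
```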

Configuring ssh keys and transferring to hosts

The backup system can be driven manually at any time or can be automated using timing scheduled in crontab.

Automation requires that no password is requested, which is achieved by creating keys for ssh authentication. The basic requirement is to create a key pair on the backup_host and transfer the public key to the prime_host to facilitate access. The following example (logged in as flex) might be used:

<backup_host>$ ssh-keygen -t dsa

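This will create a public/private dsa key pair, which can be saved to the default location as follows (the public key is written alongside it as id_dsa.pub). Note that recent OpenSSH releases disable DSA keys by default; on such systems a supported key type should be substituted and the filenames adjusted accordingly.

/usr/home/flex/.ssh/id_dsa

The public key (id_dsa.pub) should be transferred to the prime_host (preferably using scp); the private key remains on the backup_host: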

<backup_host>$ cd /usr/home/flex/.ssh/
<backup_host>$ scp id_dsa.pub flex@prime_host_ip:
Password:
...

In the above example, terminating with only a colon (:) stores the file with the same filename as the original.

It is necessary to connect with the prime_host (ssh) to complete the transfer by creating a directory with a new file and to change operational permissions:

<backup_host>$ ssh flex@prime_host_ip
Password:
<prime_host>$ pwd
/usr/home/flex
<prime_host>$ mkdir -m 0700 .ssh
<prime_host>$ cat id_dsa.pub >> .ssh/authorized_keys
<prime_host>$ chmod 0600 .ssh/authorized_keys
<prime_host>$ rm id_dsa.pub
<prime_host>$ exit
<backup_host>$

This outcome should be checked by logging in to the prime_host as the user flex using ssh. Access should be granted without any password being requested. If a password is requested, the procedure was ineffective and should be repeated. Note that if the user is logged in as root on the backup_host it is not possible to then log in as flex on the prime_host using the key, since by default ssh would search for the key of the current user in /root/.ssh. This would fail. It is therefore necessary to switch user initially to flex (on the backup_host) and to log in as expanded below:

# Where the target is using a key pair this will succeed as follows (where the user is flex)
<backup_host>$ ssh flex@prime_host_ip
<prime_host>$
...

# Where the target is using a key pair this will fail as follows (where the user is root)
<backup_host># ssh flex@prime_host_ip
Password:
...

# Where the target is using a key pair this will succeed as follows (where the user is initially root)
<backup_host># su - flex
<backup_host>$ ssh flex@prime_host_ip
<prime_host>$
...
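
The manual check above can also be scripted, which is useful because a cron job cannot answer a password prompt. The following is a minimal sketch: BatchMode and ConnectTimeout are standard OpenSSH client options, while the function name check_key_login and the SSH_CMD override are hypothetical and exist only so that the function can be dry-run.

```shell
#!/bin/sh
# Succeeds only if key-based login to TARGET works without any prompt.
# BatchMode=yes makes ssh fail instead of asking for a password.
# SSH_CMD (hypothetical) replaces the ssh command for dry runs.
check_key_login() {
    target=$1    # e.g. flex@10.0.0.15
    ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=10 "$target" true
}

# Example, run as flex on the backup_host:
# check_key_login flex@10.0.0.15 && echo "key login ok"
```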

Configuring sudoers

Access for non-privileged users is restricted and controlled, typically by file permissions and other measures. To permit the user 'flex' to create full and complete backups of all required data it is necessary to extend a limited permission capability. This is achieved using sudo. FreeBSD does not have this program installed by default. It is necessary to install it using a package or ported source such as in /usr/ports/security/.

Changes need to be made in the configuration file sudoers (stored in /usr/local/etc/). The configuration makes certain binaries/programs available to the user 'flex' with elevated privileges and without requesting a confirmatory password. This is necessary because although the non-privileged user can already use these binaries, it is not permitted to use them on certain restricted files, such as the configurations in /etc or in the accounts of other users, without additional permissions. Furthermore, a given right to use programs across a wider file set normally still requires confirmation by password to prevent unauthorized use. In this case, to fully automate the system yet retain the denial of root-login, the sudoers file includes the NOPASSWD directive. Since sudo reads the whole configuration file and applies the last matching entry (a 'last match wins' basis), the NOPASSWD directive should be the LAST LINE in the file so that no later entry overrides it, as follows:

<backup_host># vi /usr/local/etc/sudoers
# add all on one line at end of file
flex ALL=NOPASSWD: /usr/bin/flexbackup, /bin/pax, /usr/bin/find, /usr/local/bin/gtar, /usr/bin/gzip, /usr/local/bin/afio

<prime_host># vi /usr/local/etc/sudoers
# add all on one line at end of file
flex ALL=NOPASSWD: /usr/bin/flexbackup, /bin/pax, /usr/bin/find, /usr/local/bin/gtar, /usr/bin/gzip, /usr/local/bin/afio

Note that in this case the procedure is calling 'Gnu tar - gtar' to enable seamless backing up of a range of other machines. The installation of 'findutils' is recommended for use with 'flexbackup' due to its reliance on pattern matching. In later versions, the FreeBSD 'find' is workable as it includes pattern matching (regex) capabilities in common with those of 'Gnu find - gfind'. Historically, use of Linux-type (Gnu) programs was necessary as 'flexbackup' made use of regex pattern matching and piping which the more formal BSD-unix did not support.
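
Because the sudoers entry must be identical on both hosts and must sit on one line, it can help to generate it rather than type it twice. The sketch below is illustrative only; the function name sudoers_line is hypothetical.

```shell
#!/bin/sh
# Print a one-line sudoers entry granting USER password-less use of
# the commands given as the remaining arguments.
sudoers_line() {
    user=$1
    shift
    list=
    for cmd in "$@"; do
        list="${list:+$list, }$cmd"    # join with ", " after the first item
    done
    printf '%s ALL=NOPASSWD: %s\n' "$user" "$list"
}

# Example:
# sudoers_line flex /usr/bin/find /usr/local/bin/gtar /usr/bin/gzip /usr/local/bin/afio
```

The generated line still has to be added to sudoers by hand. Editing with 'visudo', or checking a candidate file with 'visudo -c -f file', catches syntax errors before they can lock sudo out.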

A review of access and permissions is also recommended and in particular a check to ensure that sshd_config still prohibits remote root-logins.

In /etc/ssh/sshd_config

...
PermitRootLogin no
...

If necessary reload/restart sshd with the new configuration

<prime_host># /etc/rc.d/sshd reload
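
The check on sshd_config can be automated as part of a periodic review. This is a rough sketch, assuming the setting appears as a plain uncommented line; it does not interpret Match blocks or repeated directives, and the function name root_login_disabled is hypothetical.

```shell
#!/bin/sh
# Succeeds if FILE contains an uncommented "PermitRootLogin no" line.
root_login_disabled() {
    file=${1:-/etc/ssh/sshd_config}
    grep -Eiq '^[[:space:]]*PermitRootLogin[[:space:]]+no([[:space:]]|$)' "$file"
}

# Example:
# root_login_disabled || echo "WARNING: remote root login is not disabled" >&2
```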

The configuration file - flexbackup.conf

'Flexbackup' achieves flexibility with ease-of-use through the provision of an editable configuration file stored in /usr/local/etc.

This has various opportunities for change and the user should make a careful study of the 'flexbackup' manual and the notes in the configuration file to adapt it for individual use. However, the following examples might provide guidance and were taken from a working system.

# extracts from /usr/local/etc/flexbackup.conf

# Archive type?
$type = 'afio';

# each routine is callable by reference to backup sets
# typically these are individual directories or (space-separated) lists of directory groups
# where the prime_host for example is 10.0.0.15

# example ip addresses only - not for use in a live system
$set{'home'} = "10.0.0.15:/usr/home";
$set{'etc'} = "10.0.0.15:/etc";
$set{'laptop_backup'} = "128.199.199.199:/usr/home";

$compress = 'gzip'; # one of false/gzip/bzip2/lzop/zip/compress/hardware
$compr_level = '4'; # compression level (1-9) (for gzip/bzip2/lzop/zip)

# Buffering program - to help tape streaming
$buffer = 'false'; # one of false/buffer/mbuffer

$device = '/backups'; # the directory on backup_host

$remoteshell = 'ssh'; # command for remote shell (rsh/ssh/ssh2)

# in many test cases it is important to see details
# of files as they are compressed and piped

# in normal environments particularly in the case
# of large directories this should be set to false
$verbose = 'false';

# Simplify file names by removing recording dates. This simplifies automated operations
# Files are overwritten at subsequent backup runs
# Note: Use a shell script to rotate/archive files before overwriting if necessary
$staticfiles = 'true';

$logdir = '/backups/log/flexbackup_log'; # directory for log files
$comp_log = 'gzip'; # compress log? false/gzip/bzip2/lzop/compress/zip
$staticlogs = 'false'; # static log filenames w/ no date stamp
$prefix = ''; # log files will start with this prefix
$tmpdir = '/tmp'; # used for temporary refdate files, etc
$stampdir = '/backups/stamps'; # directory for backup timestamps
$index = '/backups/log/flexbackup_index'; # DB filename for tape indexes
$keyfile = '00-index-key'; # filename for keyfile if archiving to dir
$sprefix = ''; # stamp files will start with this prefix

# Paths

$path{'tar'} = 'sudo /usr/local/bin/gtar';
$path{'pax'} = 'sudo /bin/pax';
$path{'find'} = 'sudo /usr/bin/find';
$path{'afio'} = 'sudo /usr/local/bin/afio';
$path{'ssh'} = '/usr/bin/ssh';

Backup types

There are three types of backup and these (as with other backup systems) are based on file date stamps. A 'full' backup archives all files regardless of age. A 'differential' backup is the middle ground between a full backup and an incremental backup and archives all files changed/added since the last full backup. Finally, an 'incremental' backup archives only the files changed/added since the last backup of any type. Thus, typically, the differential backup is ideal as a weekly catch-all and an incremental is ideal for daily backups. 'Flexbackup' also permits (though it is not discussed here) numeric identification of backup levels, which opens the possibility of a more refined backup policy.

Example rotation

An idealised schedule is a full backup monthly, a differential backup weekly and an incremental backup daily. Once a weekly differential has been taken, the preceding daily incrementals may be deleted; this cycle repeats weekly until the next monthly full backup, at which point the whole sequence starts again. Copies may be rotated out to tape or another device/partition until they are finally retired.
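
The rotation can be expressed as a small decision function. The sketch below assumes the particular days used in the cron examples of this paper (full on the first Sunday of the month, differential on Saturdays, incremental on weekdays); the function name backup_level is hypothetical.

```shell
#!/bin/sh
# Print the backup level for a given day.
#   dow: day of week as printed by 'date +%u' (1 = Monday ... 7 = Sunday)
#   dom: day of month as printed by 'date +%d' (01-31)
backup_level() {
    dow=$1
    dom=${2#0}    # strip one leading zero so that "07" compares as 7
    if [ "$dow" -eq 7 ] && [ "$dom" -le 7 ]; then
        echo full            # first Sunday of the month
    elif [ "$dow" -eq 6 ]; then
        echo differential    # every Saturday
    elif [ "$dow" -le 5 ]; then
        echo incremental     # Monday to Friday
    else
        echo none            # remaining Sundays
    fi
}

# Example: backup_level "$(date +%u)" "$(date +%d)"
```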

Additional system information

A useful feature of FreeBSD is the integration of package information. If (shortly before any backup) a call is made to pkg_info with output directed to a directory which is to be backed up, then a list of all packages installed on the system at the time of the backup will be preserved. Indeed, using this list as an input to pkg_add enables a virtually complete suite of programs to be reinstalled if necessary in support of disaster recovery procedures.

<prime_host># pkg_info > pkg_info_output.txt

Example shell script to be called via cron:

#!/bin/sh
# filename: pkgs_installed.sh
pkg_info > /usr/home/flex/pkg_info_output.txt

<prime_host># chmod 0755 pkgs_installed.sh

Archive bundling - afio

In the method described, the selected files are compressed using 'gzip'. The resulting (and numerous) compressed files need to be aggregated into an archive. There are many capable methods but in this example case 'afio' has been selected as it offers a number of useful attributes.

It is recommended that the latest version of afio be used. The version tested was 2.5.

Restoration of files after backup

Restoration should be tested thoroughly to ensure adequacy of procedure and retention of dates and permissions. Typically, restoration is conducted in a controlled manner to avoid overwriting important data files or disrupting/corrupting working systems, and this is discussed below (notably, backups using the 'dump' method conveniently have an interactive restoration sequence). There are two main restoration events which could be initiated. It is not possible to initiate a restoration command remotely. Thus, backup archives will need to be transferred to replacement machines in the event of disaster recovery, or to temporary directories in the case of partial reconstructions. It is therefore helpful to the restoration process if a full working version of flexbackup exists on the prime host. Single files may be retrieved in any event: it is relatively trivial to extract the file on the backup host and transfer only the retrieved file rather than a large backup archive.

Typical restoration commands are:

To list the contents of an afio bundled archive (it creates its own output log file in the current directory and rushes a copy to stdout to test speed reading skills!)

flexbackup -list name_of_the_archive.afio-gz

Interestingly, this log file is suffixed with the date-time of the enquiry to avoid confusion with other logs.

To extract a single known file:

flexbackup -extract name_of_the_archive.afio-gz -onefile name_of_the_requested_file

To extract a number of known files:

flexbackup -extract name_of_the_archive.afio-gz -flist file_containing_list_of_files

To extract all files stored in an archive:

flexbackup -extract name_of_the_archive.afio-gz
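
To avoid overwriting live files, an archive can first be extracted into a scratch directory and inspected there. The sketch below assumes that 'flexbackup -extract' restores into the current working directory, which should be confirmed against the 'flexbackup' manual; the function name restore_to_scratch and the FLEXBACKUP override are hypothetical.

```shell
#!/bin/sh
# Extract ARCHIVE (absolute path) into a fresh scratch directory and
# print the directory name on stdout.
# FLEXBACKUP (hypothetical) replaces the flexbackup command for dry runs.
restore_to_scratch() {
    archive=$1
    scratch=$(mktemp -d "${TMPDIR:-/tmp}/restore.XXXXXX") || return 1
    # Run in a subshell so the caller's working directory is unchanged;
    # the tool's own output goes to stderr to keep stdout clean.
    ( cd "$scratch" && ${FLEXBACKUP:-flexbackup} -extract "$archive" >&2 ) || return 1
    echo "$scratch"
}

# Example:
# dir=$(restore_to_scratch /backups/name_of_the_archive.afio-gz) && ls -lR "$dir"
```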

Configuring cron (crontab)

Cron is a useful program to automate many tasks, typically backup and rotation of files. It can be incorporated into a backup procedure but program files may need to be copied to additional locations in order to operate correctly and completely.

Initially, check and/or amend the editor. Typically, implement 'vi' using this command:

$ export EDITOR=vi
$ sudo crontab -e

Furthermore, when the crontab is saved using the above command, cron picks up the amendments automatically; the daemon does not need to be restarted manually.

Cron operates in its own, very limited, environment. Amongst other things, the PATH is strictly limited. Typically this is restricted to /usr/sbin and /usr/bin but could be modified.

Ultimately, security principles should prevail and it is advisable to leave the restrictions in place and to change the location of a limited number of programs rather than to open the environment to accommodate them into the cron path. In a test case all that was required was to copy sudo from /usr/local/bin to /usr/bin and to leave redirection of other programs (such as Gnu tar) as a statement within 'flexbackup.conf'.

Important note: It might be that cron can execute /usr/local/bin/sudo without problems. In that case you do not need to change anything nor link or copy the sudo binary to another directory. You are advised to give /usr/local/bin/sudo a try before copying or linking it. The crontab below uses /usr/bin/sudo - change it as necessary to point to the path of your sudo binary.

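A typical set of cron entries for running the backups as the unprivileged user 'flex' is as follows. The sixth field (the user name 'flex') belongs to the system crontab format used in /etc/crontab; in a per-user crontab edited with 'crontab -e' that field must be omitted: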

# example for recording packages on the system prior to backing up
00  01  1-7  *  *  flex  /usr/home/flex/pkgs_installed.sh

# example for remote backup of a server (full) at 2am on the first Sunday of the month
00  02  1-7  *  *  flex  /usr/bin/flexbackup -wday 7 -set all -level full >> /usr/home/flex/backup_output.txt

# example for remote backup of a server (differential) at 2am every Saturday
00  02  *  *  6  flex  /usr/bin/flexbackup -set all -level differential >> /usr/home/flex/backup_output.txt

# example for remote backup of a server (incremental) at 2am daily (weekdays)
00  02  *  *  1-5  flex  /usr/bin/flexbackup -set all -level incremental >> /usr/home/flex/backup_output.txt

# example for remote backup of laptop at a time when it will be available (say 11:30 am)
30  11  *  *  *  flex  /usr/bin/flexbackup -set laptop_backup -level full >> /usr/home/flex/backup_output.txt
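
The entries above append only standard output to the log file, so error messages and the exit status are not recorded there. One possible wrapper is sketched below; the function name run_logged is hypothetical.

```shell
#!/bin/sh
# Run a command, appending its stdout and stderr plus timestamped
# start/exit lines to LOGFILE. Returns the command's exit status.
run_logged() {
    logfile=$1
    shift
    echo "$(date '+%Y-%m-%d %H:%M:%S') start: $*" >> "$logfile"
    status=0
    "$@" >> "$logfile" 2>&1 || status=$?
    echo "$(date '+%Y-%m-%d %H:%M:%S') exit=$status" >> "$logfile"
    return "$status"
}

# Example, inside a script called from cron:
# run_logged /usr/home/flex/backup_output.txt \
#     /usr/bin/flexbackup -set all -level incremental
```

A non-zero exit line in the log gives a simple marker to search for when reviewing whether the nightly runs succeeded.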

References

[1] Flexbackup is a Perl script written by Edwin Huffstutler and is available in its original form (v. 1.2.1 2003) at: http://sourceforge.net/projects/flexbackup

[2] Patch files are available at: http://www.freebsd.org/cgi/cvsweb.cgi/ports/sysutils/flexbackup/

Author and Acknowledgements

The author, Dr. Chris Kirk is an independent researcher in academia and industry, (USA and UK): drcskirk@fastmail.fm

Dr. Kirk gratefully acknowledges the help and assistance of:

Version