18.3 Software for Backups
There
are a number of software packages that
allow you to perform backups. Some are vendor-specific, and others
are quite commonly available. Each may have particular benefits in a
particular environment. We'll outline a few of the
more common ones here, including a few that you might not otherwise
consider. You should consult your local documentation to see if there
are special programs available with your system.
Standard Unix
files are direct-access files; in other
words, you can specify an offset from the beginning of the file, and
then read and write from that location. If you have ever had
experience with older mainframe systems that only allowed files to be
accessed sequentially, you know how important random access is for
many things, including building random-access databases.
An interesting case occurs when a program references beyond the
"end" of the file and then writes.
What goes into the space between the old end-of-file and the data
just now written? Zero-filled bytes would seem to be appropriate, as
there is really nothing there.
Now consider that the span could be millions of bytes long, and there
is really nothing there. If Unix were to allocate disk blocks for all
that space, it could possibly exhaust the free space available.
Instead, values are set internal to the inode and file data pointers
so that only blocks needed to hold written data are allocated. The
remaining span represents a hole that Unix remembers. Files with
holes are sometimes called sparse files.
Attempts to read any of those blocks simply return zero values.
An attempt to write to any location in the hole results in a real disk
block being allocated and written, so everything continues to appear
normal. (One way to identify these files is to compare the size
reported by ls -l with the size reported by
ls -s.)
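The comparison can be tried on a scratch file. The following is a minimal sketch, not a definitive test: it creates a file with a hole by seeking past the start before writing, then reports the apparent size and the allocated size. The file name and sizes are arbitrary, and du -k is used in place of ls -s only because its units (kilobytes) are predictable; the allocated figure depends on the filesystem, and a filesystem without support for holes will allocate the whole span.

```shell
#!/bin/sh
# Scratch directory and file name are arbitrary choices for this sketch.
tmpdir=$(mktemp -d)
sparse="$tmpdir/sparse.dat"

# Seek 10 MB past the start of the file, then write a single byte.
# Nothing is ever written to the first 10 MB, so it becomes a hole.
dd if=/dev/zero of="$sparse" bs=1 count=1 seek=10485760 2>/dev/null

# Apparent size in bytes: the figure that ls -l reports.
apparent=$(wc -c < "$sparse" | tr -d ' ')

# Space actually allocated, in kilobytes.
allocated_kb=$(du -k "$sparse" | awk '{print $1}')

echo "apparent size:  $apparent bytes"
echo "allocated size: $allocated_kb KB"

# The first column of ls -ls is the allocated size in blocks.
ls -ls "$sparse"

rm -rf "$tmpdir"
```

A large gap between the two figures marks the file as sparse.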
Small files with large holes can be a serious concern to backup
software, depending on how your software handles them. Simple copy
programs will try to read the file sequentially, and the result is a
stream with lots of zero bytes. When copied into a new file, blocks
are actually allocated for the whole span, and lots of space may be
wasted. More intelligent programs avoid this problem: dump
reads the raw device, using the actual inode and set of data pointers,
and GNU tar with the -S option detects the holes and records them in
the archive. Such programs save and restore only the blocks actually
allocated, thus saving both tape and file storage.
Keep these comments in mind if you try to copy or archive a file that
appears to be larger in size than the disk it resides on. Copying a
file with holes to another device can cause you to suddenly run out
of disk space.
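The effect of the -S option on archive size can be seen with a small experiment. This sketch assumes GNU tar; other tar implementations may treat -S differently or ignore it when creating an archive. The file names are arbitrary.

```shell
#!/bin/sh
# Assumes GNU tar; -S (--sparse) is the option named in the text.
work=$(mktemp -d)

# A 10 MB hole followed by a single byte of real data.
dd if=/dev/zero of="$work/sparse.dat" bs=1 count=1 seek=10485760 2>/dev/null

# Archive the same file twice: once naively, once with hole detection.
tar -cf  "$work/plain.tar"  -C "$work" sparse.dat
tar -cSf "$work/sparse.tar" -C "$work" sparse.dat

plain_size=$(wc -c < "$work/plain.tar" | tr -d ' ')
sparse_size=$(wc -c < "$work/sparse.tar" | tr -d ' ')

echo "without -S: $plain_size bytes"
echo "with -S:    $sparse_size bytes"
```

With GNU tar, the archive made without -S should be slightly larger than the file's apparent size, while the one made with -S should be only a few kilobytes.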
18.3.1 Simple Local Copies
The simplest form of backup is to make
simple copies of your files and directories. You might make those
copies to local disk, to removable disk, to tape, or to some other
media. Some file copy programs will properly duplicate modification
and access times, and copy owner and protection information, if you
are the superuser or if the files belong to you. They seldom recreate
links, however. Examples include:
- cp
-
The standard command for copying individual files. Some versions
support a -R or
-r option to copy an
entire directory tree.
- dd
-
This command can be used to copy a whole disk partition at one time
by specifying the names of partition device files as arguments. This
process should be done with great care if the source partition is
mounted: in such a case, the device should be for the
block version of the disk rather than the
character version. Never
copy onto a mounted partition—unless you want to destroy the
partition and cause an abrupt system halt!
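As a rough illustration of how much a copy preserves, the sketch below copies a small directory tree twice with cp, once with -R alone and once with -R and -p. The -p option is not named in the text above; it is the POSIX option that asks cp to preserve modification times, modes, and (where permitted) ownership. Directory and file names are arbitrary.

```shell
#!/bin/sh
work=$(mktemp -d)
mkdir -p "$work/src/docs"
echo "draft" > "$work/src/docs/notes.txt"

# Give the source file an old modification time so the difference is visible.
touch -t 200001010000 "$work/src/docs/notes.txt"

# -R copies the whole tree; -p additionally preserves times and modes.
cp -R  "$work/src" "$work/plain"
cp -Rp "$work/src" "$work/preserved"

# The plain copy carries the time of the copy; the preserved copy keeps
# the original modification time.
ls -l "$work/plain/docs/notes.txt" "$work/preserved/docs/notes.txt"
```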
Be careful when backing up live filesystems! If
you're not going to bring your system down to
single-user mode during backups (and few users are willing to
tolerate this kind of downtime), you should be aware of how your
backup procedure will handle attempts to back up a file
that's in use by another process, particularly a
process that may lock the file, write to the file, or unlink the file
during the backup process. In some cases, you may need to write a
script to temporarily stop certain processes (such as relational
databases) during the backup and restart them afterwards in order to
be sure that the backup file is not corrupted.
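A wrapper script of the kind described might look like the following sketch. The functions stop_service and start_service are hypothetical placeholders, not real commands: here they only write to a log, and their bodies would be replaced with whatever stops and starts the database at your site. The point of the sketch is the trap, which restarts the service even if the backup step fails.

```shell
#!/bin/sh
# Hypothetical hooks: placeholders only. Replace the bodies with the
# real commands that stop and start your database.
stop_service()  { echo "stop"  >> "$logfile"; }
start_service() { echo "start" >> "$logfile"; }

# Scratch data standing in for the database files.
work=$(mktemp -d)
logfile="$work/backup.log"
mkdir "$work/data"
echo "table contents" > "$work/data/table.db"

(
    # Run start_service when this subshell exits, whether or not
    # the backup succeeded.
    trap start_service EXIT
    stop_service
    tar cf "$work/data.tar" -C "$work" data
    echo "backup exit status: $?" >> "$logfile"
)

cat "$logfile"
```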
18.3.2 Simple Archives
There are several programs that are
available to make simple archives packed into disk files or onto
tape. These are usually capable of storing all directory information
about a file, and restoring much of it if the correct options are
used. Running these programs may result in a change of either (or
both) the atime and the
ctime of items archived, however (see Chapter 6).
- ar
-
Simple file
archiver. Largely obsolete for backups (although still used for
creating Unix libraries).
- tar
-
Simple tape
archiver. Can write archives to files, tapes, or elsewhere. It
appears to be the most widely used of the simple archive programs.
- cpio
-
Another simple
archive program. With the correct options, this program can create
portable archives, even of binary files, by writing its header
information in plain ASCII.
- pax
-
The portable
archiver/exchange tool, which is defined in the POSIX standard. This
program combines tar and
cpio functionality. It uses
tar as its default file format.
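The basic cycle with any of these archivers is create, list, and extract. The sketch below shows it with tar, writing the archive to a disk file rather than a tape; the directory and file names are arbitrary.

```shell
#!/bin/sh
work=$(mktemp -d)
mkdir -p "$work/src/project" "$work/restore"
echo "hello" > "$work/src/project/readme.txt"

# c = create, f = archive file. -C changes directory first, so the
# archive stores relative paths rather than absolute ones.
tar cf "$work/project.tar" -C "$work/src" project

# t = list the table of contents without extracting anything.
listing=$(tar tf "$work/project.tar")
echo "$listing"

# x = extract, p = restore the recorded permissions.
tar xpf "$work/project.tar" -C "$work/restore"

# Confirm that the restored file matches the original.
cmp "$work/src/project/readme.txt" "$work/restore/project/readme.txt" &&
    echo "restored copy matches"
```

Listing an archive before relying on it is a cheap check that the backup is readable.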
18.3.3 Specialized Backup Programs
There are several dedicated backup
programs:
- dump/restore
-
This program is the "classic" one
for archiving a whole partition at once, and for the associated file
restorations. Many
versions of this program exist; all back up from the raw disk device,
thus bypassing calls that would change any of the times present in
inodes for files and directories. This program can also make the
backups quickly.
- backup
-
Some SVR4-based systems have a suite of programs named, collectively,
backup. These are also designed specifically to
do backups of files and whole filesystems.
18.3.4 Network Backup Systems
A few programs can be used to do
backups across a network link. Thus, you can do backups on one
machine and write the results to another. An obvious example would be
using a program that can write to stdout, and
then piping the output to a remote shell. Some programs provide for
compression (to improve backup speed on slower networks) and/or
encryption of the data stream:
- rdump/rrestore
-
A network version of the dump and
restore commands. It uses a dedicated process on
a machine that has a tape drive, and sends the data to that process.
Thus, it allows a tape drive to be shared by a whole network of
machines.
- rsync
-
A program
designed to remotely synchronize two filesystems. One filesystem is
the master; changes in that one are propagated to the slave.
rsync is well suited to logfiles: if a
100 MB file has 1 MB appended, rsync can
detect this and copy over only the last megabyte.
- scp
-
Enables you to copy a file or a whole directory tree to a remote
machine using the SSH protocol, which avoids sending cleartext
passwords over the network and encrypts the data stream. It is
based on the older rcp command, which is
insecure.
- unison
-
Designed for
two-way synchronization between two or more filesystems. When
unison first runs, it creates a database that
describes the current state of both filesystems. Thereafter, it can
automatically propagate file additions, changes, and deletions from
one filesystem to the other.
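The "write to stdout and pipe to a remote shell" approach mentioned at the start of this section can be sketched as follows. REMOTE and DEST are assumptions made for illustration: by default the receiving command runs locally through sh -c so the pipeline can be tried without a second machine, and a real transfer would set REMOTE to something like "ssh backuphost" (a hypothetical host name) and DEST to a path on that host.

```shell
#!/bin/sh
# REMOTE is the command that runs the receiving side. "sh -c" keeps
# everything local; "ssh backuphost" (hypothetical) would send the
# archive across the network instead.
REMOTE=${REMOTE:-"sh -c"}

work=$(mktemp -d)
mkdir "$work/u"
echo "payroll" > "$work/u/data.txt"

# DEST is where the receiving side stores the archive.
DEST=${DEST:-"$work/u.tar"}

# tar writes the archive to standard output ("-"); the receiving
# shell reads it from standard input and stores it.
tar cf - -C "$work" u | $REMOTE "cat > '$DEST'"

# With the local default, the stored archive can be listed directly.
tar tf "$DEST"
```

Because the archive travels through a pipe, a compression or encryption filter can be placed between tar and the remote shell.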
There are also several backup programs specifically designed to back
up data from clients to a tape drive on a central server over a
network. The central server is typically outfitted with a large tape
drive or jukebox and is configured to back up the clients at night.
- Amanda
-
The
Advanced Maryland Automatic Network Disk Archiver (http://www.amanda.org). Amanda is a
free-software client/server backup system that's over 10
years old and still actively maintained. The backup server (the host
with the tape drive) connects to each backup client and instructs it
to transfer data, which the server writes to tape using standard Unix
utilities such as dump or
tar. It is compatible with many tape drives and
changers, and has its own tape management system. In conjunction with
Samba, it can back up Windows hosts as well.
- Commercial solutions
-
Like Amanda, most commercial backup
systems are based on a client/server architecture to allow a backup
server to perform unattended backups of Unix, Windows, and Macintosh
hosts over a network. Key features in commercial offerings are:
Unfortunately, there are drawbacks for many uses, notably a lack of
portability across multiple platforms and incompatibility with sites
that may not have the software installed. Be sure to fully evaluate
the conditions under which you'll need to use the
program and decide on a backup strategy before purchasing the
software.
18.3.5 Encrypting Your Backups
You can improvise your own backup encryption if
you have an encryption program that can be used as a filter and you
use a backup program that can write to standard output, such as the
dump, cpio, or
tar commands. For example, to make an encrypted
tape archive using the tar command and the OpenSSL
encryption program, you might use the following command:
# tar cf - dirs and files | openssl enc -des3 -salt | dd bs=10240 of=/dev/rmt8
Although software encryption is not foolproof (for example, the
software encryption program can be compromised to record all
passwords), this method is certainly preferable to storing sensitive
information on unencrypted backups.
Here is an example: suppose that you have the
OpenSSL
encryption program, which can prompt the user for a passphrase and
then encrypt its standard input to standard output. You could use
this program with the
dump (called ufsdump under
Solaris) program to back up the filesystem /u to
the device /dev/rmt8 with the command:
# dump f - /u | openssl enc -des3 -salt | dd bs=10240 of=/dev/rmt8
enter des-ede3-cbc encryption password:
If you wanted to back up the filesystem with
tar, you would instead use the command:
# tar cf - /u | openssl enc -des3 -salt | dd bs=10240 of=/dev/rmt8
enter des-ede3-cbc encryption password:
To read these files back, you would use the following command
sequences:
# dd bs=10240 if=/dev/rmt8 | openssl enc -d -des3 -salt | restore fi -
enter des-ede3-cbc decryption password:
and:
# dd bs=10240 if=/dev/rmt8 | openssl enc -d -des3 -salt | tar xpBfv -
enter des-ede3-cbc decryption password:
In both of the backup examples, the backup programs are instructed to send
the backup of the filesystems to standard output. The output is then
encrypted and written to the tape drive.
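Before trusting an encrypted backup, it is worth checking that it can be decrypted and read. The sketch below runs the same pipeline as the examples above against a disk file instead of a tape, then decrypts it and lists the contents. Two things are assumptions made for illustration: the passphrase is supplied through the environment variable BACKUP_PASS using openssl's -pass env: option so that the script runs without prompting, and the value shown is a placeholder. Recent OpenSSL releases may print a warning about the key derivation used by this form of the command.

```shell
#!/bin/sh
work=$(mktemp -d)
mkdir "$work/u"
echo "secret ledger" > "$work/u/ledger.txt"

# For illustration only: a real passphrase should never be hard-coded
# in a script. BACKUP_PASS is an arbitrary variable name.
BACKUP_PASS='example passphrase'
export BACKUP_PASS

# Back up: tar to standard output, encrypt, and write to a file that
# stands in for the tape device.
tar cf - -C "$work" u |
    openssl enc -des3 -salt -pass env:BACKUP_PASS > "$work/u.tar.enc"

# Verify: decrypt and list the archive without extracting anything.
listing=$(openssl enc -d -des3 -pass env:BACKUP_PASS < "$work/u.tar.enc" | tar tf -)
echo "$listing"
```

If the passphrase is wrong, openssl reports a decryption error and tar sees no usable archive, so an empty listing is a sign that the backup cannot be restored.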
If you encrypt the backup of a filesystem and you forget the key, the
information stored on the backup will be unusable. Also, note that
many systems do not encrypt individual files separately; you may have
to decrypt (and in some cases restore) the entire partition that you
backed up in order to restore a single file.