19.5 Using tar to Create and Unpack Archives

Many UNIX users think of tar (20.1) as a utility for creating tapes. As with most UNIX utilities, though, that's only the beginning. For example, you can use tar for copying directory trees (18.16).

One common use for tar is creating archive files that can be shipped to other systems. We've already seen a utility for creating shell archives (19.2), but there are a lot of things that a shell archive can't do. tar is very useful when you're sending binary data; I've seen some shar utilities that can handle binary data, but they're rare, and I don't particularly like the way they do it. If you use tar, you can package several directories into an archive, you can send directories that include links, you can preserve file ownership and access permissions, etc.

To create a tar archive, use the c (create) and f (filename) options to save tar's output in a file:

% cd /home/src/fsf
% tar cf emacs.tar emacs

This command puts everything in the emacs directory into a file (called a tar file) named emacs.tar. You can then give this file to other users, via FTP, UUCP (1.33), or any other means.
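Before you ship an archive, it's worth checking what went into it. Here's a minimal sketch, assuming a throwaway stand-in for the emacs directory (the two sample files are made up for illustration) and a system with mktemp. It builds the archive as above, then lists it with tar's t (table of contents) option, which reads the archive without unpacking anything:

```shell
#!/bin/sh
set -e

# Work in a scratch directory so nothing real gets touched.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# A tiny stand-in for the emacs source tree (hypothetical contents).
mkdir -p emacs/lisp
echo 'int main(void) { return 0; }' > emacs/main.c
echo '(message "hello")' > emacs/lisp/hello.el

# c = create, f = write the archive to the named file.
tar cf emacs.tar emacs

# t = list the archive's members without extracting them.
listing=$(tar tf emacs.tar)
echo "$listing"
```

The listing should show the directory and every file under it, with the emacs/ prefix that the files will carry when someone unpacks the archive.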

Archives (no matter how you make them) are usually rather large, so it's common to compress (24.7) them, with a command like:

% gzip emacs.tar

This creates the file emacs.tar.gz, which should be significantly smaller than the original tar archive.
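How much smaller depends on the data. The sketch below is one way to measure it; the demo directory and its very repetitive text file are invented for the test, so the savings you see here will be much better than you'd get on real source code or on data that's already compressed:

```shell
#!/bin/sh
set -e

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Very repetitive text, so it compresses unusually well.
mkdir demo
i=0
while [ "$i" -lt 2000 ]; do
    echo "the quick brown fox jumps over the lazy dog"
    i=$((i + 1))
done > demo/notes.txt

tar cf demo.tar demo
# The $(( ... )) wrapper strips the padding some versions of wc print.
before=$(( $(wc -c < demo.tar) ))

# gzip replaces demo.tar with demo.tar.gz.
gzip demo.tar
after=$(( $(wc -c < demo.tar.gz) ))

echo "tar: $before bytes, tar.gz: $after bytes"
```

Notice that gzip removes the uncompressed file once the compressed copy is safely written, so you end up with one file, not two.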

If you're going to use UUCP or FTP to transfer the file, this is good enough; both UUCP and FTP know how to handle binary data. Often, though, you'd like to send the archive via electronic mail (1.33), and some mail programs only know how to handle ASCII (51.3) data. In that case, you'll need to create an ASCII version. To do this, use the uuencode (52.9) command. Its first argument is the file to read; its second is the name that uudecode will give the decoded file on the receiving end. To encode the file directly and keep its name, give the name twice:

% uuencode emacs.tar.gz emacs.tar.gz > emacs.tar.gz.uu

You can then insert emacs.tar.gz.uu into a mail message and send it to someone. Of course, the ASCII-only encoding isn't as compact as the original binary file: it's about 33 percent larger. [1]

[1] If so, why bother gzipping? Why not forget about both gzip and uuencode? Well, you can't. Remember that tar files are binary files to start with - even if every file in the archive is an ASCII text file. You'd need to uuencode the tar file before mailing it anyway - so you'd still pay the 33 percent size penalty that uuencode incurs. Using gzip minimizes the damage.
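Where does the 33 percent come from? Traditional uuencode turns every 3 input bytes into 4 printable characters, and it packs 45 input bytes into each full line (that's what the leading M on each encoded line says). The arithmetic can be sketched in the shell; the 45000-byte input size is just an assumed round number, and the sketch ignores the short begin and end lines:

```shell
#!/bin/sh

# Assumed input size in bytes; a multiple of 45 keeps the arithmetic exact.
bytes=45000

# Traditional uuencode packs 45 input bytes into each full line.
lines=$(( bytes / 45 ))

# Each full line carries 60 encoded characters (3 bytes become 4 characters)...
payload=$(( lines * 60 ))
# ...plus one length character at the front and a newline at the end.
body=$(( lines * 62 ))

echo "encoding alone: $(( (payload - bytes) * 100 / bytes )) percent larger"
echo "with per-line overhead: $(( (body - bytes) * 100 / bytes )) percent larger"
```

So the 33 percent figure is the cost of the 3-to-4 encoding itself; the length character and newline on each line push the real file a few percent higher.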

If you'd rather, you can combine the steps above into one pipeline. Giving tar the archive filename - (13.13) tells it to write to its standard output. That feeds the archive down the pipe to gzip and then to uuencode, which reads its standard input when you give it only one name:

% tar cf - emacs | gzip | uuencode emacs.tar.gz > emacs.tar.gz.uu
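If you want to convince yourself that the pipeline really skips the intermediate files, here's a small sketch to try. It assumes a made-up demo directory, and since uuencode isn't installed everywhere these days, it adds that stage only where the command is found:

```shell
#!/bin/sh
set -e

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

mkdir demo
echo "sample text" > demo/readme.txt

# "-" as the archive name sends the archive to standard output,
# so no intermediate demo.tar is ever written to disk.
tar cf - demo | gzip > demo.tar.gz

# Check the result by reading it back through a pipe.
members=$(gzip -dc demo.tar.gz | tar tf -)
echo "$members"

# Add the uuencode stage only where the command is installed.
# With a single argument, uuencode reads its standard input.
if command -v uuencode >/dev/null 2>&1; then
    tar cf - demo | gzip | uuencode demo.tar.gz > demo.tar.gz.uu
    head -n 1 demo.tar.gz.uu
fi
```

The name you hand to uuencode here is only a label for the begin line; no file called demo.tar.gz has to exist when the pipeline runs.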

What happens when you receive a uuencoded, compressed tar file? The same thing, in reverse. You'll get a mail message that (after the various header lines) looks something like this:

begin 644 emacs.tar.gz
M+DQ0"D%L;"!O9B!T:&5S92!P<F]B;&5M<R!C86X@8F4@<V]L=F5D(&)Y(")L
M:6YK<RPB(&$@;65C:&%N:7-M('=H:6-H"F%L;&]W<R!A(&9I;&4@=&\@:&%V
M92!T=V\@;W(@;6]R92!N86UE<RX@(%5.25@@<')O=FED97,@='=O(&1I9F9E
M<F5N= IK:6YD<R!O9B!L:6YK<SH*+DQS($(*+DQI"EQF0DAA<F0@;&EN:W-<

So you save the message in a file, complete with headers. Let's say you call this file mailstuff. How do you get the original files back? Use the following sequence of commands:

% uudecode mailstuff
% gunzip emacs.tar.gz
% tar xf emacs.tar

The uudecode command skips over the mail headers, finds the begin line, and creates the file named there, emacs.tar.gz. Then gunzip recreates your original tar file, and tar xf extracts the individual files from the archive. Article 19.7 shows a more efficient method - and also explains the tar o option, which many System V users will need.
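When I get an archive from someone else, I like to unpack it in a scratch directory and look before I extract. The sketch below plays both sender and receiver so it can compare the results; it leaves out the uudecode step, and the sample files and the received directory are invented for the example:

```shell
#!/bin/sh
set -e

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Sender side: build a small compressed archive (hypothetical contents).
mkdir -p emacs/lisp
echo 'sample source' > emacs/main.c
echo 'sample lisp' > emacs/lisp/hello.el
tar cf emacs.tar emacs
gzip emacs.tar

# Receiver side: unpack in a separate directory.
mkdir received
cp emacs.tar.gz received/
cd received
gunzip emacs.tar.gz     # leaves emacs.tar in place of emacs.tar.gz
tar tf emacs.tar        # look before unpacking
tar xf emacs.tar        # x = extract
cd ..

# diff -r prints nothing and exits 0 when the two trees match.
result=different
diff -r emacs received/emacs && result=identical
echo "$result"
```

Listing with tar tf first tells you whether the archive will unpack into its own subdirectory or scatter files across the directory you're sitting in.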

By the way, tar is so flexible precisely because of UNIX's file-oriented design: everything, even a tape drive, "looks like" a file. So tar creates a certain kind of file and sends it out into the world; it usually lands on a tape, but you can put it somewhere else if you want. With most operating systems, a tape utility would know how to talk to a tape drive, and that's all.

- ML