Chapter 7. Upgrading Software and the Kernel

In this chapter, we'll show you how to upgrade software on your system, including rebuilding and installing a new operating system kernel. Although most Linux distributions provide some automated means to install, remove, and upgrade specific software packages on your system, it is often necessary to install software by hand.

Non-expert users will find it easiest to install and upgrade software by using a package system, which most distributions provide. If you don't use a package system, installations and upgrades are more complicated than with most commercial operating systems. Even though precompiled binaries are available, you may have to uncompress them and unpack them from an archive file. You may also have to create symbolic links or set environment variables so that the binaries know where to look for the resources they use. In other cases, you'll need to compile the software yourself from sources.

Another common Linux activity is building the kernel. This is an important task for several reasons. First of all, you may find yourself in a position where you need to upgrade your current kernel to a newer version, to pick up new features or hardware support. Second, building the kernel yourself allows you to select which features you do (and do not) want included in the compiled kernel.

Why is the ability to select features a win for you? All kernel code and data are "locked down" in memory; that is, they cannot be swapped out to disk. For example, if you use a kernel image with support for hardware you do not have or use, the memory consumed by the support for that hardware cannot be reclaimed for use by user applications. Customizing the kernel allows you to trim it down for your needs.

It should be noted here that most distributions today ship with modularized kernels.
This means that the kernel they install by default contains only the minimum functionality needed to bring up the system; everything else is then contained in modules that add any additionally needed functionality on demand. We will talk about modules in much greater detail later.

7.1. Archive and Compression Utilities

When installing or upgrading software on Unix systems, the first things you need to be familiar with are the tools used for compressing and archiving files. Dozens of such utilities are available. Some of these (such as tar and compress) date back to the earliest days of Unix; others (such as gzip and bzip2) are relative newcomers. The main goal of these utilities is to archive files (that is, to pack many files together into a single file for easy transportation or backup) and to compress files (to reduce the amount of disk space required to store a particular file or set of files).

In this section, we're going to discuss the most common file formats and utilities you're likely to run into. For instance, a near-universal convention in the Unix world is to transport files or software as a tar archive, compressed using compress or gzip. In order to create or unpack these files yourself, you'll need to know the tools of the trade. The tools are most often used when installing new software or creating backups, the subject of the following two sections in this chapter.

7.1.1. Using gzip and bzip2

gzip is a fast and efficient compression program distributed by the GNU project. The basic function of gzip is to take a file, compress it, save the compressed version as filename.gz, and remove the original, uncompressed file. The original file is removed only if gzip is successful; it is very difficult to accidentally delete a file in this manner. Of course, being GNU software, gzip has more options than you want to think about, and many aspects of its behavior can be modified using command-line options.

First, let's say that we have a large file named garbage.txt:

    rutabaga% ls -l garbage.txt
    -rw-r--r-- 1 mdw hack 312996 Nov 17 21:44 garbage.txt

To compress this file using gzip, we simply use the command:

    gzip garbage.txt

This replaces garbage.txt with the compressed file garbage.txt.gz. What we end up with is the following:

    rutabaga% gzip garbage.txt
    rutabaga% ls -l garbage.txt.gz
    -rw-r--r-- 1 mdw hack 103441 Nov 17 21:44 garbage.txt.gz

Note that garbage.txt is removed when gzip completes.

You can give gzip a list of filenames; it compresses each file in the list, storing each with a .gz extension. (Unlike the zip program for Unix and MS-DOS systems, gzip will not, by default, compress several files into a single .gz archive. That's what tar is for; see the next section.)

How efficiently a file is compressed depends upon its format and contents. For example, many graphics file formats (such as PNG and JPEG) are already well compressed, and gzip will have little or no effect upon such files. Files that compress well usually include plain-text files, and binary files, such as executables and libraries. You can get information on a gzipped file using gzip -l. For example:

    rutabaga% gzip -l garbage.txt.gz
    compressed  uncompr. ratio uncompressed_name
        103115    312996 67.0% garbage.txt

To get our original file back from the compressed version, we use gunzip, as in:

    gunzip garbage.txt.gz

After doing this, we get:

    rutabaga% gunzip garbage.txt.gz
    rutabaga% ls -l garbage.txt
    -rw-r--r-- 1 mdw hack 312996 Nov 17 21:44 garbage.txt

which is identical to the original file. Note that when you gunzip a file, the compressed version is removed once the uncompression is complete. Instead of using gunzip, you can also use gzip -d (e.g., if gunzip happens not to be installed).

gzip stores the name of the original, uncompressed file in the compressed version.
This way, if the compressed filename (including the .gz extension) is too long for the filesystem type (say, you're compressing a file on an MS-DOS filesystem with 8.3 filenames), the original filename can be restored using gunzip even if the compressed file had a truncated name. To uncompress a file to its original filename, use the -N option with gunzip. To see the value of this option, consider the following sequence of commands:

    rutabaga% gzip garbage.txt
    rutabaga% mv garbage.txt.gz rubbish.txt.gz

If we were to gunzip rubbish.txt.gz at this point, the uncompressed file would be named rubbish.txt, after the new (compressed) filename. However, with the -N option, we get:

    rutabaga% gunzip -N rubbish.txt.gz
    rutabaga% ls -l garbage.txt
    -rw-r--r-- 1 mdw hack 312996 Nov 17 21:44 garbage.txt

gzip and gunzip can also compress or uncompress data from standard input and output. If gzip is given no filenames to compress, it attempts to compress data read from standard input. Likewise, if you use the -c option with gunzip, it writes uncompressed data to standard output. For example, you could pipe the output of a command to gzip to compress the output stream and save it to a file in one step, as in:

    rutabaga% ls -laR $HOME | gzip > filelist.gz

This will produce a recursive directory listing of your home directory and save it in the compressed file filelist.gz. You can display the contents of this file with the command:

    rutabaga% gunzip -c filelist.gz | more

This will uncompress filelist.gz and pipe the output to the more command. When you use gunzip -c, the file on disk remains compressed.

The zcat command is identical to gunzip -c. You can think of this as a version of cat for compressed files. Linux even has a version of the pager less for compressed files, called zless.

When compressing files, you can use one of the options -1, -2, through -9 to specify the speed and quality of the compression used.
-1 (also --fast) specifies the fastest method, which compresses the files less compactly, while -9 (also --best) uses the slowest, but best compression method. If you don't specify one of these options the default is -6. None of these options has any bearing on how you use gunzip; gunzip will be able to uncompress the file no matter what speed option you use.

gzip is relatively new in the Unix world. The compression programs used on most Unix systems are compress and uncompress, which were included in the original Berkeley versions of Unix. compress and uncompress are very much like gzip and gunzip, respectively; compress saves compressed files as filename.Z as opposed to filename.gz, and uses a slightly less efficient compression algorithm.

However, the free software community has been moving to gzip for several reasons. First of all, gzip works better. Second, there has been a patent dispute over the compression algorithm used by compress, the results of which could prevent third parties from implementing the compress algorithm on their own. Because of this, the Free Software Foundation urged a move to gzip, which at least the Linux community has embraced. gzip has been ported to many architectures, and many others are following suit. Happily, gunzip is able to uncompress the .Z format files produced by compress.

Another compression/decompression program has also emerged to take the lead from gzip. bzip2 is the new kid on the block and sports even better compression (on the average about 10-20% better than gzip), at the expense of longer compression times. You cannot use bunzip2 to uncompress files compressed with gzip and vice versa, and because you cannot expect everybody to have bunzip2 installed on their machine, you might want to confine yourself to gzip for the time being if you want to send the compressed file to somebody else.
However, it pays to have bzip2 installed because more and more FTP servers now provide bzip2-compressed packages in order to conserve disk space and bandwidth. You can recognize bzip2-compressed files by their .bz2 filename extension.

While the command-line options of bzip2 are not exactly the same as those of gzip, those that have been described in this section are. For more information, see the bzip2(1) manual page.

The bottom line is that you should use gzip/gunzip or bzip2/bunzip2 for your compression needs. If you encounter a file with the extension .Z, it was probably produced by compress, and gunzip can uncompress it for you.

Earlier versions of gzip used .z (lowercase) instead of .gz as the compressed-filename extension. Because of the potential confusion with .Z, this was changed. At any rate, gunzip retains backwards compatibility with a number of filename extensions and file types.

7.1.2. Using tar

tar is a general-purpose archiving utility capable of packing many files into a single archive file, while retaining information needed to restore the files fully, such as file permissions and ownership. The name tar stands for tape archive because the tool was originally used to archive files as backups on tape. However, use of tar is not at all restricted to making tape backups, as we'll see.

The format of the tar command is:

    tar functionoptions files...

where function is a single letter indicating the operation to perform, options is a list of (single-letter) options to that function, and files is the list of files to pack or unpack in an archive. (Note that function is not separated from options by any space.)

function can be one of the following:

    c   To create a new archive.
    x   To extract files from an archive.
    t   To list the contents of an archive.
    r   To append files to the end of an archive.
    u   To update files that are newer than those in the archive.
    d   To compare files in the archive to those in the filesystem.

You'll rarely use most of these functions; the more commonly used are c, x, and t.

The most common options are:

    v            To print each file as it is archived or extracted.
    f filename   To specify that the tar file to be read or written is filename.
    k            To keep any existing files when extracting, rather than overwriting them.
    z            To compress or uncompress the archive with gzip (GNU tar).
    j            Like z, but using bzip2 (newer versions of GNU tar).

There are others, which we will cover later in this section.

Although the tar syntax might appear complex at first, in practice it's quite simple. For example, say we have a directory named mt, containing these files:

    rutabaga% ls -l mt
    total 37
    -rw-r--r-- 1 root root 24 Sep 21 1993 Makefile
    -rw-r--r-- 1 root root 847 Sep 21 1993 README
    -rwxr-xr-x 1 root root 9220 Nov 16 19:03 mt
    -rw-r--r-- 1 root root 2775 Aug 7 1993 mt.1
    -rw-r--r-- 1 root root 6421 Aug 7 1993 mt.c
    -rw-r--r-- 1 root root 3948 Nov 16 19:02 mt.o
    -rw-r--r-- 1 root root 11204 Sep 5 1993 st_info.txt

We wish to pack the contents of this directory into a single tar archive. To do this, we use the command:

    tar cf mt.tar mt

The first argument to tar is the function (here, c, for create) followed by any options. Here, we use the option f mt.tar to specify that the resulting tar archive be named mt.tar. The last argument is the name of the file or files to archive; in this case, we give the name of a directory, so tar packs all files in that directory into the archive.

Note that the first argument to tar must be the function letter and options. Because of this, there's no reason to use a hyphen (-) to precede the options as many Unix commands require. tar allows you to use a hyphen, as in:

    tar -cf mt.tar mt

but it's really not necessary. In some versions of tar, the first letter must be the function, as in c, t, or x. In other versions, the order of letters does not matter.

The function letters as described here follow the so-called "old option style." There is also a newer "short option style" in which you precede the function options with a hyphen, and a "long option style" in which you use long option names with two hyphens. See the Info page for tar for more details if you are interested.

Be careful to remember the archive filename if you use the cf function letters. Otherwise tar will overwrite the first file in your list of files to pack because it will mistake that for the archive filename!
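
The create function can be tried without risking real files by working in a scratch directory. The following is only a minimal sketch, assuming a POSIX shell with mktemp and a tar that accepts the old option style; the directory demo and its two files are hypothetical stand-ins for the mt directory used in the text.

```shell
# Work in a throwaway directory so no real files are touched.
work=$(mktemp -d)
cd "$work"

# Hypothetical sample tree standing in for the mt directory.
mkdir demo
echo "all:" > demo/Makefile
echo "read me" > demo/README

# c = create, f demo.tar = archive name. The archive name must follow
# the option letters directly, before the list of files to pack.
tar cf demo.tar demo

# t = list the table of contents, to confirm what was stored.
tar tf demo.tar
```

Listing the archive right after creating it is a cheap check that the archive name and the file list ended up in the intended order.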

It is often a good idea to use the v option with tar; this lists each file as it is archived. For example:

    rutabaga% tar cvf mt.tar mt
    mt/
    mt/st_info.txt
    mt/README
    mt/mt.1
    mt/Makefile
    mt/mt.c
    mt/mt.o
    mt/mt

If you use v multiple times, additional information will be printed, as in:

    rutabaga% tar cvvf mt.tar mt
    drwxr-xr-x root/root 0 Nov 16 19:03 1994 mt/
    -rw-r--r-- root/root 11204 Sep 5 13:10 1993 mt/st_info.txt
    -rw-r--r-- root/root 847 Sep 21 16:37 1993 mt/README
    -rw-r--r-- root/root 2775 Aug 7 09:50 1993 mt/mt.1
    -rw-r--r-- root/root 24 Sep 21 16:03 1993 mt/Makefile
    -rw-r--r-- root/root 6421 Aug 7 09:50 1993 mt/mt.c
    -rw-r--r-- root/root 3948 Nov 16 19:02 1994 mt/mt.o
    -rwxr-xr-x root/root 9220 Nov 16 19:03 1994 mt/mt

This is especially useful as it lets you verify that tar is doing the right thing.

In some versions of tar, f must be the last letter in the list of options. This is because tar expects the f option to be followed by a filename: the name of the tar file to read from or write to. If you don't specify f filename at all, tar assumes for historical reasons that it should use the device /dev/rmt0 (that is, the first tape drive). In Section 8.1, in Chapter 8, we'll talk about using tar in conjunction with a tape drive to make backups.

Now, we can give the file mt.tar to other people, and they can extract it on their own system. To do this, they would use the command:

    tar xvf mt.tar

This creates the subdirectory mt and places all the original files into it, with the same permissions as found on the original system. The new files will be owned by the user running the tar xvf (you) unless you are running as root, in which case the original owner is preserved. The x option stands for "extract." The v option is used again here to list each file as it is extracted.
This produces:

    courgette% tar xvf mt.tar
    mt/
    mt/st_info.txt
    mt/README
    mt/mt.1
    mt/Makefile
    mt/mt.c
    mt/mt.o
    mt/mt

We can see that tar saves the pathname of each file relative to the location where the tar file was originally created. That is, when we created the archive using tar cf mt.tar mt, the only input filename we specified was mt, the name of the directory containing the files. Therefore, tar stores the directory itself and all the files below that directory in the tar file. When we extract the tar file, the directory mt is created and the files placed into it, which is the exact inverse of what was done to create the archive.

By default, tar extracts all tar files relative to the current directory where you execute tar. For example, if you were to pack up the contents of your /bin directory with the command:

    tar cvf bin.tar /bin

tar would give the warning:

    tar: Removing leading / from absolute pathnames in the archive.

What this means is that the files are stored in the archive within the subdirectory bin. When this tar file is extracted, the directory bin is created in the working directory of tar, not as /bin on the system where the extraction is being done. This is very important and is meant to prevent terrible mistakes when extracting tar files. Otherwise, extracting a tar file packed as, say, /bin would trash the contents of your /bin directory when you extracted it.[22] If you really wanted to extract such a tar file into /bin, you would extract it from the root directory, /. You can override this behavior using the P option when packing tar files, but it's not recommended you do so.

[22] Some (older) implementations of Unix (e.g., Sinix and Solaris) do just that.

Another way to create the tar file mt.tar would have been to cd into the mt directory itself, and use a command, such as:

    tar cvf mt.tar *

This way the mt subdirectory would not be stored in the tar file; when extracted, the files would be placed directly in your current working directory.
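
The difference between the two packing styles shows up when the tables of contents are compared side by side. The sketch below assumes a POSIX shell with mktemp; the directory demo, its files, and the archive names with-dir.tar and flat.tar are made up for illustration. Unlike the command in the text, the second archive is written one level above the directory being packed, which keeps the archive out of the directory it describes.

```shell
# Scratch area with a small hypothetical tree.
work=$(mktemp -d)
cd "$work"
mkdir demo
echo "one" > demo/a.txt
echo "two" > demo/b.txt

# Packed from the parent directory: member names keep the demo/ prefix.
tar cf with-dir.tar demo

# Packed from inside the directory: member names carry no prefix.
# The subshell leaves the current directory of the outer shell unchanged.
(cd demo && tar cf ../flat.tar *)

echo "with-dir.tar:"; tar tf with-dir.tar
echo "flat.tar:";     tar tf flat.tar
```

Extracting with-dir.tar recreates demo and fills it, while extracting flat.tar drops a.txt and b.txt straight into whatever directory tar is run from.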

One fine point of tar etiquette is to always pack tar files so that they have a subdirectory at the top level, as we did in the first example with tar cvf mt.tar mt. Therefore, when the archive is extracted, the subdirectory is also created and any files placed there. This way you can ensure that the files won't be placed directly in your current working directory; they will be tucked out of the way and prevent confusion. This also saves the person doing the extraction the trouble of having to create a separate directory (should they wish to do so) to unpack the tar file. Of course, there are plenty of situations where you wouldn't want to do this. So much for etiquette.

When creating archives, you can, of course, give tar a list of files or directories to pack into the archive. In the first example, we have given tar the single directory mt, but in the previous paragraph we used the wildcard *, which the shell expands into the list of filenames in the current directory.

Before extracting a tar file, it's usually a good idea to take a look at its table of contents to determine how it was packed. This way you can determine whether you do need to create a subdirectory yourself where you can unpack the archive. A command, such as:

    tar tvf tarfile

lists the table of contents for the named tarfile. Note that when using the t function, only one v is required to get the long file listing, as in this example:

    courgette% tar tvf mt.tar
    drwxr-xr-x root/root 0 Nov 16 19:03 1994 mt/
    -rw-r--r-- root/root 11204 Sep 5 13:10 1993 mt/st_info.txt
    -rw-r--r-- root/root 847 Sep 21 16:37 1993 mt/README
    -rw-r--r-- root/root 2775 Aug 7 09:50 1993 mt/mt.1
    -rw-r--r-- root/root 24 Sep 21 16:03 1993 mt/Makefile
    -rw-r--r-- root/root 6421 Aug 7 09:50 1993 mt/mt.c
    -rw-r--r-- root/root 3948 Nov 16 19:02 1994 mt/mt.o
    -rwxr-xr-x root/root 9220 Nov 16 19:03 1994 mt/mt

No extraction is being done here; we're just displaying the archive's table of contents.
We can see from the filenames that this file was packed with all files in the subdirectory mt so that when we extract the tar file, the directory mt will be created and the files placed there.

You can also extract individual files from a tar archive. To do this, use the command:

    tar xvf tarfile files

where files is the list of files to extract. As we've seen, if you don't specify any files, tar extracts the entire archive.

When specifying individual files to extract, you must give the full pathname as it is stored in the tar file. For example, if we wanted to grab just the file mt.c from the previous archive mt.tar, we'd use the command:

    tar xvf mt.tar mt/mt.c

This would create the subdirectory mt and place the file mt.c within it.

tar has many more options than those mentioned here. These are the features that you're likely to use most of the time, but GNU tar, in particular, has extensions that make it ideal for creating backups and the like. See the tar manual page and the following section for more information.

7.1.3. Using tar with gzip and bzip2

tar does not compress the data stored in its archives in any way. If you are creating a tar file from three 200K files, you'll end up with an archive of about 600K. It is common practice to compress tar archives with gzip (or the older compress program). You could create a gzipped tar file using the commands:

    tar cvf tarfile files...
    gzip -9 tarfile

But that's so cumbersome, and requires you to have enough space to store the uncompressed tar file before you gzip it.

A much trickier way to accomplish the same task is to use an interesting feature of tar that allows you to write an archive to standard output. If you specify - as the tar file to read or write, the data will be read from or written to standard input or output. For example, we can create a gzipped tar file using the command:

    tar cvf - files... | gzip -9 > tarfile.tar.gz

Here, tar creates an archive from the named files and writes it to standard output; next, gzip reads the data from standard input, compresses it, and writes the result to its own standard output; finally, we redirect the gzipped tar file to tarfile.tar.gz.

We could extract such a tar file using the command:

    gunzip -c tarfile.tar.gz | tar xvf -

gunzip uncompresses the named archive file and writes the result to standard output, which is read by tar on standard input and extracted. Isn't Unix fun?

Of course, both commands are rather cumbersome to type. Luckily, the GNU version of tar provides the z option which automatically creates or extracts gzipped archives. (We saved the discussion of this option until now, so you'd truly appreciate its convenience.) For example, we could use the commands:

    tar cvzf tarfile.tar.gz files...

and:

    tar xvzf tarfile.tar.gz

to create and extract gzipped tar files. Note that you should name the files created in this way with the .tar.gz filename extensions (or the equally often used .tgz, which also works on systems with limited filename capabilities) to make their format obvious. The z option works just as well with other tar functions such as t.

Only the GNU version of tar supports the z option; if you are using tar on another Unix system, you may have to use one of the longer commands to accomplish the same tasks. Nearly all Linux systems use GNU tar.

When you want to use tar in conjunction with bzip2, you need to tell tar about your compression program preferences, like this:

    tar cvf tarfile.tar.bz2 --use-compress-program=bzip2 files...

or, shorter:

    tar cvjf tarfile.tar.bz2 files...

The last version works only with newer versions of GNU tar that support the j option.

Keeping this in mind, you could write short shell scripts or aliases to handle cookbook tar file creation and extraction for you.
Under bash, you could include the following functions in your .bashrc:

    # Create directory.tar.gz from the directory named in $1.
    tarc () {
        tar czvf "$1.tar.gz" "$1"
    }

    # Extract the gzipped tar file named in $1.
    tarx () {
        tar xzvf "$1"
    }

    # List the table of contents of the gzipped tar file named in $1.
    tart () {
        tar tzvf "$1"
    }

With these functions, to create a gzipped tar file from a single directory, you could use the command:

    tarc directory

The resulting archive file would be named directory.tar.gz. (Be sure that there's no trailing slash on the directory name; otherwise the archive will be created as .tar.gz within the given directory.) To list the table of contents of a gzipped tar file, just use:

    tart file.tar.gz

Or, to extract such an archive, use:

    tarx file.tar.gz

As a final note, we would like to mention that files created with gzip and/or tar can be unpacked with the well-known WinZip utility on Windows systems. WinZip doesn't have support for bzip2 yet, though. If you, on the other hand, get a file in .zip format, you can unpack it on your Linux system using the unzip command.

7.1.4. tar Tricks

Because tar saves the ownership and permissions of files in the archive and retains the full directory structure, as well as symbolic and hard links, using tar is an excellent way to copy or move an entire directory tree from one place to another on the same system (or even between different systems, as we'll see). Using the - syntax described earlier, you can write a tar file to standard output, which is read and extracted on standard input elsewhere.

For example, say that we have a directory containing two subdirectories: from-stuff and to-stuff. from-stuff contains an entire tree of files, symbolic links, and so forth, something that is difficult to mirror precisely using a recursive cp. In order to mirror the entire tree beneath from-stuff to to-stuff, we could use the commands:

    cd from-stuff
    tar cf - . | (cd ../to-stuff; tar xvf -)

Simple and elegant, right? We start in the directory from-stuff and create a tar file of the current directory, which is written to standard output.
This archive is read by a subshell (the commands contained within parentheses); the subshell does a cd to the target directory, ../to-stuff (relative to from-stuff, that is), and then runs tar xvf, reading from standard input. No tar file is ever written to disk; the data is sent entirely via pipe from one tar process to another. The second tar process has the v option that prints each file as it's extracted; in this way, we can verify that the command is working as expected.

In fact, you could transfer directory trees from one machine to another (via the network) using this trick; just include an appropriate rsh (or ssh) command within the subshell on the right side of the pipe. The remote shell would execute tar to read the archive on its standard input. (Actually, GNU tar has facilities to read or write tar files automatically from other machines over the network; see the tar(1) manual page for details.)

Copyright © 2003 O'Reilly & Associates. All rights reserved.