Upgrading Software and the Kernel (Running Linux)

In this chapter, we'll show you how to upgrade software on your system, including rebuilding and installing a new operating system kernel. Although most Linux distributions provide some automated means to install, remove, and upgrade specific software packages on your system, it is often necessary to install software by hand. The kernel is the operating system itself. It is a set of routines and data that is loaded by the system at boot time and controls everything on the system: software access to hardware devices, scheduling of user processes, memory management, and more. Building your own kernel is often beneficial, as you can select which features you want included in the operating system.

Installing and upgrading free software is usually more complicated than installing commercial products. Even when you have precompiled binaries available, you may have to uncompress them and unpack them from an archive file. You may also have to create symbolic links or set environment variables so that the binaries know where to look for the resources they use. In other cases, you'll need to compile the software yourself from sources.

Another common Linux activity is building the kernel. This is an important task for several reasons. First, you may need to upgrade your current kernel to a newer version to pick up new features or hardware support. Second, building the kernel yourself allows you to select which features you do (and do not) want included in the compiled kernel.

Why is the ability to select features a win for you? All kernel code and data is "locked down" in memory; that is, it cannot be swapped out to disk. For example, if you use a kernel image with drivers for hardware you do not have or use, the memory consumed by those hardware drivers cannot be reclaimed for use by user applications. Customizing the kernel allows you to trim it down for your needs.

7.1. Archive and Compression Utilities

When installing or upgrading software on Unix systems, the first things you need to be familiar with are the tools used for compressing and archiving files. There are dozens of such utilities available. Some of these (such as tar and compress) date back to the earliest days of Unix; others (such as gzip) are relative newcomers. The main goal of these utilities is to archive files (that is, to pack many files together into a single file for easy transportation or backup) and to compress files (to reduce the amount of disk space required to store a particular file or set of files).

In this section, we're going to discuss the most common file formats and utilities you're likely to run into. For instance, a near-universal convention in the Unix world is to transport files or software as a tar archive, compressed using compress or gzip. In order to create or unpack these files yourself, you'll need to know the tools of the trade. The tools are most often used when installing new software or creating backups--the subject of the following two sections in this chapter.

7.1.1. Using gzip and bzip2

gzip is a fast and efficient compression program distributed by the GNU project. The basic function of gzip is to take a file, compress it, save the compressed version as filename.gz, and remove the original, uncompressed file. The original file is removed only if gzip is successful; it is very difficult to accidentally delete a file in this manner. Of course, being GNU software, gzip has more options than you want to think about, and many aspects of its behavior can be modified using command-line options.

First, let's say that we have a large file named garbage.txt:

rutabaga% ls -l garbage.txt 
-rw-r--r--   1 mdw      hack       312996 Nov 17 21:44 garbage.txt

To compress this file using gzip, we simply use the command:

gzip garbage.txt

This replaces garbage.txt with the compressed file garbage.txt.gz. What we end up with is the following:

rutabaga% gzip garbage.txt 
rutabaga% ls -l garbage.txt.gz 
-rw-r--r--   1 mdw      hack       103441 Nov 17 21:44 garbage.txt.gz

Note that garbage.txt is removed when gzip completes.

You can give gzip a list of filenames; it compresses each file in the list, storing each with a .gz extension. (Unlike the zip program for Unix and MS-DOS systems, gzip will not, by default, compress several files into a single .gz archive. That's what tar is for; see the next section.)

How efficiently a file is compressed depends upon its format and contents. For example, many graphics file formats (such as GIF and JPEG) are already well compressed, and gzip will have little or no effect upon such files. Files that compress well usually include plain-text files, and binary files such as executables and libraries. You can get information on a gzipped file using gzip -l. For example:

rutabaga% gzip -l garbage.txt.gz 
compressed  uncompr. ratio uncompressed_name 
   103115    312996  67.0% garbage.txt

To get our original file back from the compressed version, we use gunzip, as in:

gunzip garbage.txt.gz

After doing this, we get:

rutabaga% gunzip garbage.txt.gz 
rutabaga% ls -l garbage.txt 
-rw-r--r--   1 mdw      hack       312996 Nov 17 21:44 garbage.txt

which is identical to the original file. Note that when you gunzip a file, the compressed version is removed once the uncompression is complete.
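If you want to convince yourself that the round trip really is lossless, a checksum comparison is one way to do it. The following is a minimal sketch, not part of the original session: the scratch directory and the generated sample file are assumptions made for illustration.

```shell
# Work in a scratch directory so none of your own files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a sample text file (a stand-in for garbage.txt).
seq 1 50000 > garbage.txt
before=$(cksum < garbage.txt)

gzip garbage.txt          # leaves only garbage.txt.gz
gunzip garbage.txt.gz     # leaves only garbage.txt again
after=$(cksum < garbage.txt)

# Identical checksums mean the uncompressed data matches the original.
if [ "$before" = "$after" ]; then
    echo "round trip OK"
fi
```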

gzip stores the name of the original, uncompressed file in the compressed version. This way, if the compressed filename (including the .gz extension) is too long for the filesystem type (say, you're compressing a file on an MS-DOS filesystem with 8.3 filenames), the original filename can be restored using gunzip even if the compressed file had a truncated name. To uncompress a file to its original filename, use the -N option with gunzip. To see the value of this option, consider the following sequence of commands:

rutabaga% gzip garbage.txt 
rutabaga% mv garbage.txt.gz rubbish.txt.gz

If we were to gunzip rubbish.txt.gz at this point, the uncompressed file would be named rubbish.txt, after the new (compressed) filename. However, with the -N option, we get:

rutabaga% gunzip -N rubbish.txt.gz 
rutabaga% ls -l garbage.txt 
-rw-r--r--   1 mdw      hack       312996 Nov 17 21:44 garbage.txt
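The same sequence can be tried safely in a scratch directory. This is a small sketch under the assumption that your gzip stores the original name when compressing, which is the default behavior described above; the file contents are made up.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "some sample text" > garbage.txt
gzip garbage.txt                      # stores the name garbage.txt inside the .gz
mv garbage.txt.gz rubbish.txt.gz      # rename only the compressed file

gunzip -N rubbish.txt.gz              # -N restores the stored name
ls                                    # lists what was restored
```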

gzip and gunzip can also compress or uncompress data from standard input and output. If gzip is given no filenames to compress, it attempts to compress data read from standard input. Likewise, if you use the -c option with gunzip, it writes uncompressed data to standard output. For example, you could pipe the output of a command to gzip to compress the output stream and save it to a file in one step, as in:

rutabaga% ls -laR $HOME | gzip > filelist.gz

This will produce a recursive directory listing of your home directory and save it in the compressed file filelist.gz. You can display the contents of this file with the command:

rutabaga% gunzip -c filelist.gz | more

This will uncompress filelist.gz and pipe the output to the more command. When you use gunzip -c, the file on disk remains compressed.

The zcat command is identical to gunzip -c. You can think of this as a version of cat for compressed files. Linux even has a version of the pager less for compressed files, called zless.
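A quick way to see that the two commands behave alike is to capture the output of each and compare. The sketch below uses a made-up three-line file; zcat reads from standard input here, which is an illustrative choice rather than a requirement.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Compress data arriving on standard input.
printf 'alpha\nbeta\ngamma\n' | gzip > list.gz

via_gunzip=$(gunzip -c list.gz)
via_zcat=$(zcat < list.gz)

[ "$via_gunzip" = "$via_zcat" ] && echo "same output"
ls list.gz        # the compressed file is still on disk
```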

When compressing files, you can use one of the options -1, -2, through -9 to specify the speed and quality of the compression used. -1 (also --fast) specifies the fastest method, which compresses the files less compactly, while -9 (also --best) uses the slowest, but best compression method. If you don't specify one of these options, the default is -6. None of these options has any bearing on how you use gunzip; gunzip will be able to uncompress the file no matter what speed option you use.
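To get a feel for the trade-off on your own data, you can compare the compressed sizes directly. This is only a sketch: the sample file is generated for the purpose, and the sizes you see will depend on the data and on your version of gzip.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
seq 1 50000 > sample.txt

# -c writes to standard output, so sample.txt itself is left alone.
orig=$(wc -c < sample.txt | tr -d ' ')
fast=$(gzip -1 -c sample.txt | wc -c | tr -d ' ')
best=$(gzip -9 -c sample.txt | wc -c | tr -d ' ')
echo "original: $orig bytes   -1: $fast bytes   -9: $best bytes"

# gunzip needs no matching option, whatever level was used to compress.
gzip -9 -c sample.txt | gunzip -c | cmp - sample.txt && echo "restored intact"
```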

gzip is relatively new in the Unix world. The compression programs used on most Unix systems are compress and uncompress, which were included in the original Berkeley versions of Unix. compress and uncompress are very much like gzip and gunzip, respectively; compress saves compressed files as filename.Z as opposed to filename.gz, and uses a slightly less efficient compression algorithm.

However, the free software community has been moving to gzip for several reasons. First of all, gzip works better. Second, there has been a patent dispute over the compression algorithm used by compress--the results of which could prevent third parties from implementing the compress algorithm on their own. Because of this, the Free Software Foundation urged a move to gzip, which at least the Linux community has embraced. gzip has been ported to many architectures, and many others are following suit. Happily, gunzip is able to uncompress the .Z format files produced by compress.

Another compression/decompression program has also emerged to take the lead from gzip. bzip2 is the new kid on the block and sports even better compression (on the average about 10-20 percent better than gzip), at the expense of longer compression times. You cannot use bunzip2 to uncompress files compressed with gzip and vice versa, and since you cannot expect everybody to have bunzip2 installed on their machine, you might want to confine yourself to gzip for the time being if you want to send the compressed file to somebody else. However, it pays to have bzip2 installed, because more and more FTP servers now provide bzip2-compressed packages in order to conserve disk space and bandwidth. You can recognize bzip2-compressed files from their typical .bz2 file name extension.

While the command-line options of bzip2 are not exactly the same as those of gzip, those that have been described in this section are. For more information, see the bzip2 manual page.
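As a rough illustration, the options used earlier with gzip carry over unchanged. The sketch below assumes bzip2 may or may not be installed, so it checks first; the sample file is made up.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
seq 1 50000 > sample.txt
before=$(cksum < sample.txt)

if command -v bzip2 >/dev/null 2>&1; then
    bzip2 -9 sample.txt                     # replaces sample.txt with sample.txt.bz2
    bunzip2 -c sample.txt.bz2 | wc -l       # -c leaves the file on disk compressed
    bunzip2 sample.txt.bz2                  # restores sample.txt
    result="restored with bunzip2"
else
    result="bzip2 is not installed; sample.txt left as it was"
fi

after=$(cksum < sample.txt)
echo "$result"
```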

The bottom line is that you should use gzip/gunzip or bzip2/bunzip2 for your compression needs. If you encounter a file with the extension .Z, it was probably produced by compress, and gunzip can uncompress it for you.

Earlier versions of gzip used .z (lowercase) instead of .gz as the compressed-filename extension. Because of the potential confusion with .Z, this was changed. At any rate, gunzip retains backwards-compatibility with a number of filename extensions and file types.

7.1.2. Using tar

tar is a general-purpose archiving utility capable of packing many files into a single archive file while retaining information such as file permissions and ownership. The name tar stands for tape archive, because the tool was originally used to archive files as backups on tape. However, use of tar is not at all restricted to making tape backups, as we'll see.

The format of the tar command is:

tar functionoptions files…

where function is a single letter indicating the operation to perform, options is a list of (single-letter) options to that function, and files is the list of files to pack or unpack in an archive. (Note that function is not separated from options by any space.)

function can be one of:

c: To create a new archive
x: To extract files from an archive
t: To list the contents of an archive
r: To append files to the end of an archive
u: To update files that are newer than those in the archive
d: To compare files in the archive to those in the filesystem

You'll rarely use most of these functions; the more commonly used are c, x, and t.

The most common options are:

v: To print verbose information when packing or unpacking archives; tar shows each file as it is archived or restored. It is good practice to use this so that you can see what actually happens (unless, of course, you are writing shell scripts)
k: To keep any existing files when extracting--that is, to not overwrite any existing files which are contained within the tar file
f filename: To specify that the tar file to be read or written is filename
z: To specify that the data to be written to the tar file should be compressed or that the data in the tar file is compressed with gzip

There are others, which we will cover later in this section.
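The k option is easiest to understand by watching it protect a file. In this sketch (directory and file names are made up), we change a file after archiving it and then extract with k. Some versions of tar report the skipped files as errors while others skip them silently, so the example ignores tar's messages and exit status.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

mkdir mt
echo "archived version" > mt/README
tar cf mt.tar mt

# Change the file on disk, then extract without overwriting it.
echo "local edits" > mt/README
tar xkf mt.tar 2>/dev/null || true

cat mt/README     # the file on disk should still hold the local edits
```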

Although the tar syntax might appear complex at first, in practice it's quite simple. For example, say we have a directory named mt, containing these files:

rutabaga% ls -l mt 
total 37 
-rw-r--r--   1 root     root           24 Sep 21  1993 Makefile 
-rw-r--r--   1 root     root          847 Sep 21  1993 README 
-rwxr-xr-x   1 root     root         9220 Nov 16 19:03 mt 
-rw-r--r--   1 root     root         2775 Aug  7  1993 mt.1 
-rw-r--r--   1 root     root         6421 Aug  7  1993 mt.c 
-rw-r--r--   1 root     root         3948 Nov 16 19:02 mt.o 
-rw-r--r--   1 root     root        11204 Sep  5  1993 st_info.txt

We wish to pack the contents of this directory into a single tar archive. To do this, we use the command:

tar cf mt.tar mt

The first argument to tar is the function (here, c, for create) followed by any options. Here, we use the one option f mt.tar, to specify that the resulting tar archive be named mt.tar. The last argument is the name of the file or files to archive; in this case, we give the name of a directory, so tar packs all files in that directory into the archive.

Note that the first argument to tar must be a function letter followed by a list of options. Because of this, there's no reason to use a hyphen (-) to precede the options as many Unix commands require. tar allows you to use a hyphen, as in:

tar -cf mt.tar mt

but it's really not necessary. In some versions of tar, the first letter must be the function, as in c, t, or x. In other versions, the order of letters does not matter.

The function letters as described here follow the so-called "old option style." There is also a newer "short option style" where you precede the function options with a hyphen, and a "long option style," where you use long option names with two hyphens. See the Info page for tar for more details if you are interested.
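For comparison, here is the same archive created and listed in each of the three styles. It is a sketch that assumes a tar accepting all three (GNU tar does); the directory contents are made up.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir mt
echo "hello" > mt/README

tar cf old-style.tar mt                  # old option style
tar -cf short-style.tar mt               # short option style
tar --create --file=long-style.tar mt    # long option style

old=$(tar tf old-style.tar)
short=$(tar -tf short-style.tar)
long=$(tar --list --file=long-style.tar)

[ "$old" = "$short" ] && [ "$short" = "$long" ] && echo "all three archives list the same members"
```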

It is often a good idea to use the v option with tar; this lists each file as it is archived. For example:

rutabaga% tar cvf mt.tar mt 
mt/ 
mt/st_info.txt 
mt/README 
mt/mt.1 
mt/Makefile 
mt/mt.c 
mt/mt.o 
mt/mt

If you use v multiple times, additional information will be printed, as in:

rutabaga% tar cvvf mt.tar mt 
drwxr-xr-x root/root         0 Nov 16 19:03 1994 mt/ 
-rw-r--r-- root/root     11204 Sep  5 13:10 1993 mt/st_info.txt 
-rw-r--r-- root/root       847 Sep 21 16:37 1993 mt/README 
-rw-r--r-- root/root      2775 Aug  7 09:50 1993 mt/mt.1 
-rw-r--r-- root/root        24 Sep 21 16:03 1993 mt/Makefile 
-rw-r--r-- root/root      6421 Aug  7 09:50 1993 mt/mt.c 
-rw-r--r-- root/root      3948 Nov 16 19:02 1994 mt/mt.o 
-rwxr-xr-x root/root      9220 Nov 16 19:03 1994 mt/mt

This is especially useful as it lets you verify that tar is doing the right thing.

In some versions of tar, f must be the last letter in the list of options. This is because tar expects the f option to be followed by a filename--the name of the tar file to read from or write to. If you don't specify f filename at all, tar assumes for historical reasons that it should use the device /dev/rmt0 (that is, the first tape drive). In Section 8.1, "Making Backups," in Chapter 8, "Other Administrative Tasks," we'll talk about using tar in conjunction with a tape drive to make backups.

Now, we can give the file mt.tar to other people, and they can extract it on their own system. To do this, they would use the command:

tar xvf mt.tar

This creates the subdirectory mt and places all the original files into it, with the same permissions as found on the original system. The new files will be owned by the user running the tar xvf (you) unless you are running as root, in which case the original owner is preserved. The x function stands for "extract." The v option is used again here to list each file as it is extracted. This produces:

courgette% tar xvf mt.tar 
mt/ 
mt/st_info.txt 
mt/README 
mt/mt.1 
mt/Makefile 
mt/mt.c 
mt/mt.o 
mt/mt

We can see that tar saves the pathname of each file relative to the location where the tar file was originally created. That is, when we created the archive using tar cf mt.tar mt, the only input filename we specified was mt, the name of the directory containing the files. Therefore, tar stores the directory itself and all of the files below that directory in the tar file. When we extract the tar file, the directory mt is created and the files placed into it, which is the exact inverse of what was done to create the archive.

By default, tar extracts all tar files relative to the current directory where you execute tar. For example, if you were to pack up the contents of your /bin directory with the command:

tar cvf bin.tar /bin

tar would give the warning:

tar: Removing leading / from absolute path names in the archive.

What this means is that the files are stored in the archive within the subdirectory bin. When this tar file is extracted, the directory bin is created in the working directory of tar--not as /bin on the system where the extraction is being done. This is very important and is meant to prevent terrible mistakes when extracting tar files. Otherwise, extracting a tar file packed as, say, /bin, would trash the contents of your /bin directory when you extracted it. If you really wanted to extract such a tar file into /bin, you would extract it from the root directory, /. You can override this behavior using the P option when packing tar files, but it's not recommended you do so.
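You can see the effect without going anywhere near /bin. The sketch below archives a scratch directory by its absolute pathname and then lists the archive; the directory and file names are made up, and the warning that tar prints is discarded to keep the output short.

```shell
workdir=$(mktemp -d)
mkdir "$workdir/data"
echo "hello" > "$workdir/data/file.txt"
cd "$workdir" || exit 1

# Pack using an absolute pathname.
tar cf abs.tar "$workdir/data" 2>/dev/null

# The stored names no longer begin with a slash.
first=$(tar tf abs.tar | head -n 1)
echo "$first"
```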

Another way to create the tar file mt.tar would have been to cd into the mt directory itself, and use a command such as:

tar cvf mt.tar *

This way the mt subdirectory would not be stored in the tar file; when extracted, the files would be placed directly in your current working directory. One fine point of tar etiquette is to always pack tar files so that they contain a subdirectory, as we did in the first example with tar cvf mt.tar mt. Therefore, when the archive is extracted, the subdirectory is also created and any files placed there. This way you can ensure that the files won't be placed directly in your current working directory; they will be tucked out of the way and prevent confusion. This also saves the person doing the extraction the trouble of having to create a separate directory (should they wish to do so) to unpack the tar file. Of course, there are plenty of situations where you wouldn't want to do this. So much for etiquette.

When creating archives, you can, of course, give tar a list of files or directories to pack into the archive. In the first example, we have given tar the single directory mt, but in the previous paragraph we used the wildcard *, which the shell expands into the list of filenames in the current directory.
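The difference between the two ways of packing shows up as soon as you list the archives. This sketch builds a tiny stand-in for the mt directory (the file contents are made up) and packs it both ways.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir mt
echo "readme" > mt/README
echo "source" > mt/mt.c

tar cf with-subdir.tar mt            # stores mt/ and everything below it
(cd mt && tar cf ../flat.tar *)      # stores only the files themselves

echo "with-subdir.tar:"; tar tf with-subdir.tar
echo "flat.tar:";        tar tf flat.tar
```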

Before extracting a tar file, it's usually a good idea to take a look at its table of contents to determine how it was packed. This way you can determine whether you do need to create a subdirectory yourself where you can unpack the archive. A command such as:

tar tvf tarfile

lists the table of contents for the named tarfile. Note that when using the t function, only one v is required to get the long file listing, as in this example:

courgette% tar tvf mt.tar
drwxr-xr-x root/root         0 Nov 16 19:03 1994 mt/ 
-rw-r--r-- root/root     11204 Sep  5 13:10 1993 mt/st_info.txt 
-rw-r--r-- root/root       847 Sep 21 16:37 1993 mt/README 
-rw-r--r-- root/root      2775 Aug  7 09:50 1993 mt/mt.1 
-rw-r--r-- root/root        24 Sep 21 16:03 1993 mt/Makefile 
-rw-r--r-- root/root      6421 Aug  7 09:50 1993 mt/mt.c 
-rw-r--r-- root/root      3948 Nov 16 19:02 1994 mt/mt.o 
-rwxr-xr-x root/root      9220 Nov 16 19:03 1994 mt/mt

No extraction is being done here; we're just displaying the archive's table of contents. We can see from the filenames that this file was packed with all files in the subdirectory mt, so that when we extract the tar file, the directory mt will be created, and the files placed there.
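If you unpack many archives, the check can be automated. The following sketch counts the distinct top-level names in an archive; it is one possible approach, and the archive it inspects is a small one created on the spot. Note that an archive holding a single top-level file would also produce a count of one.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir mt
echo "readme" > mt/README
echo "source" > mt/mt.c
tar cf mt.tar mt

# Keep the first path component of each member and drop the duplicates.
top_level=$(tar tf mt.tar | cut -d/ -f1 | sort -u)
count=$(printf '%s\n' "$top_level" | wc -l | tr -d ' ')

if [ "$count" -eq 1 ]; then
    echo "unpacks into a single top-level entry: $top_level"
else
    echo "unpacks $count top-level entries; consider extracting in a new directory"
fi
```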

You can also extract individual files from a tar archive. To do this, use the command:

tar xvf tarfile files

where files is the list of files to extract. As we've seen, if you don't specify any files, tar extracts the entire archive.

When specifying individual files to extract, you must give the full pathname as it is stored in the tar file. For example, if we wanted to grab just the file mt.c from the previous archive mt.tar, we'd use the command:

tar xvf mt.tar mt/mt.c

This would create the subdirectory mt and place the file mt.c within it.
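Here is that extraction played out on a small stand-in archive (the file contents are made up), showing that only the requested file comes back.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir mt
echo "readme" > mt/README
echo "source" > mt/mt.c
tar cf mt.tar mt
rm -r mt                     # start from a clean slate

tar xvf mt.tar mt/mt.c       # full pathname as stored in the archive
ls mt                        # shows what was extracted
```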

tar has many more options than those mentioned here. These are the features that you're likely to use most of the time, but GNU tar, in particular, has extensions that make it ideal for creating backups and the like. See the tar manual page and the following section for more information.

7.1.3. Using tar with gzip

tar does not compress the data stored in its archives in any way. If you are creating a tar file from three 200K files, you'll end up with an archive of about 600K. It is common practice to compress tar archives with gzip (or the older compress program). You could create a gzipped tar file using the commands:

tar cvf tarfile files…
gzip -9 tarfile

But that's so cumbersome, and requires you to have enough space to store the uncompressed tar file before you gzip it.

A much trickier way to accomplish the same task is to use an interesting feature of tar that allows you to write an archive to standard output. If you specify - as the tar file to read or write, the data will be read from or written to standard input or output. For example, we can create a gzipped tar file using the command:

tar cvf - files… | gzip -9 > tarfile.tar.gz

Here, tar creates an archive from the named files and writes it to standard output; next, gzip reads the data from standard input, compresses it, and writes the result to its own standard output; finally, we redirect the gzipped tar file to tarfile.tar.gz.

We could extract such a tar file using the command:

gunzip -c tarfile.tar.gz | tar xvf -

gunzip uncompresses the named archive file and writes the result to standard output, which is read by tar on standard input and extracted. Isn't Unix fun?
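Both pipelines can be exercised together. In this sketch the directory names are made up; the archive is unpacked into a separate directory so the result can be compared with the original, and no uncompressed tar file is ever written to disk.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir mt unpacked
echo "readme" > mt/README

# Pack and compress in one step.
tar cf - mt | gzip -9 > mt.tar.gz

# Uncompress and unpack in one step, inside another directory.
gunzip -c mt.tar.gz | (cd unpacked && tar xf -)

cmp mt/README unpacked/mt/README && echo "contents match"
ls
```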

Of course, both of these commands are rather cumbersome to type. Luckily, the GNU version of tar provides the z option which automatically creates or extracts gzipped archives. (We saved the discussion of this option until now, so you'd truly appreciate its convenience.) For example, we could use the commands:

tar cvzf tarfile.tar.gz files…

and:

tar xvzf tarfile.tar.gz

to create and extract gzipped tar files. Note that you should name the files created in this way with the .tar.gz filename extensions (or the equally often used .tgz, which also works on systems with limited filename capabilities), to make their format obvious. The z option works just as well with other tar functions such as t.

Only the GNU version of tar supports the z option; if you are using tar on another Unix system, you may have to use one of the longer commands to accomplish the same tasks. Nearly all Linux systems use GNU tar.
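The two approaches are interchangeable: an archive made with the z option can be read with the longer pipeline, and the other way around. The sketch below assumes a tar that supports z; the directory contents are made up.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir mt
echo "readme" > mt/README

tar czf with-z.tar.gz mt                  # let tar run gzip for us
tar cf - mt | gzip > with-pipe.tar.gz     # the longer way

# Read each archive with the opposite method.
listing_z=$(tar tzf with-pipe.tar.gz)
listing_pipe=$(gunzip -c with-z.tar.gz | tar tf -)

[ "$listing_z" = "$listing_pipe" ] && echo "both archives list the same members"
```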

When you want to use tar in conjunction with bzip2, you need to tell tar about your compression program preferences like this:

tar cvf tarfile.tar.bz2 --use-compress-program=bzip2 files...

or, shorter:

tar cvf tarfile.tar.bz2 --use=bzip2 files...

or, shorter still:

tar cvjf tarfile.tar.bz2 files...

The last form works only with versions of GNU tar that support the j option; some older releases used the letter I for the same purpose.

Keeping this in mind, you could write short shell scripts or aliases to handle cookbook tar file creation and extraction for you. Under bash, you could include the following functions in your .bashrc:

tarc () { tar czvf "$1.tar.gz" "$1"; }
tarx () { tar xzvf "$1"; }
tart () { tar tzvf "$1"; }

With these functions, to create a gzipped tar file from a single directory, you could use the command:

tarc directory

The resulting archive file would be named directory.tar.gz. (Be sure that there's no trailing slash on the directory name; otherwise the archive will be created as .tar.gz within the given directory.) To list the table of contents of a gzipped tar file, just use:

tart file.tar.gz

Or, to extract such an archive, use:

tarx file.tar.gz
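If the trailing-slash caveat worries you, the function can strip the slash itself. This variant is a sketch of one way to do it, not the only one; it relies on the standard shell expansion ${1%/}, and the directory used to try it out is made up.

```shell
tarc () {
    # ${1%/} is the first argument with one trailing slash removed, if present.
    tar czvf "${1%/}.tar.gz" "${1%/}"
}

workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir mt
echo "readme" > mt/README

tarc mt/      # the trailing slash no longer causes trouble
ls
```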

7.1.4. tar Tricks

Because tar saves the ownership and permissions of files in the archive and retains the full directory structure, as well as symbolic and hard links, using tar is an excellent way to copy or move an entire directory tree from one place to another on the same system (or even between different systems, as we'll see). Using the - syntax described earlier, you can write a tar file to standard output, which is read and extracted on standard input elsewhere.

For example, say that we have a directory containing two subdirectories: from-stuff and to-stuff. from-stuff contains an entire tree of files, symbolic links, and so forth--something that is difficult to mirror precisely using a recursive cp. In order to mirror the entire tree beneath from-stuff to to-stuff, we could use the commands:

cd from-stuff 
tar cf - . | (cd ../to-stuff; tar xvf -)

Simple and elegant, right? We start in the directory from-stuff and create a tar file of the current directory, which is written to standard output. This archive is read by a subshell (the commands contained within parentheses); the subshell does a cd to the target directory, ../to-stuff (relative to from-stuff, that is), and then runs tar xvf, reading from standard input. No tar file is ever written to disk; the data is sent entirely via pipe from one tar process to another. The second tar process has the v option that prints each file as it's extracted; in this way, we can verify that the command is working as expected.
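To try this without risking real data, you can build a small tree containing a symbolic link and check that the link survives the copy. The file and link names below are made up, and the subshell uses && rather than ; so that tar does not run if the cd fails.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p from-stuff/sub to-stuff
echo "data" > from-stuff/sub/file.txt
ln -s sub/file.txt from-stuff/shortcut

cd from-stuff || exit 1
tar cf - . | (cd ../to-stuff && tar xf -)
cd ..

ls -l to-stuff      # shortcut should appear as a link, not as a copy of the file
```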

In fact, you could transfer directory trees from one machine to another (via the network) using this trick; just include an appropriate rsh command within the subshell on the right side of the pipe. The remote shell would execute tar to read the archive on its standard input. (Actually, GNU tar has facilities to read or write tar files automatically from other machines over the network; see the tar manual page for details.)
