Upgrading Software (Running Linux)

7.2. Upgrading Software

Linux is a fast-moving target. Because of the cooperative nature of the project, new software is always becoming available, and programs are constantly being updated with newer versions. This is especially true of the Linux kernel, which has many groups of people working on it. During the development process, it's not uncommon for a new kernel patch to be released on a nightly basis. While other parts of the system may not be as dynamic, the same principles apply.

With this constant development, how can you possibly hope to stay on top of the most recent versions of your system software? The short answer is, you can't. While there are people out there who have a need to stay current with, say, the nightly kernel patch release, for the most part, there's no reason to bother upgrading your software this often. In this section, we're going to talk about why and when to upgrade and show you how to upgrade several important parts of the system.

When should you upgrade? In general, you should consider upgrading a portion of your system only when you have a demonstrated need to upgrade. For example, if you hear of a new release of some application that fixes important bugs (that is, those bugs that actually affect your personal use of the application), you might want to consider upgrading that application. If the new version of the program provides new features you might find useful, or has a performance boost over your present version, it's also a good idea to upgrade. When your machine is connected to the Internet, another good reason for upgrading is to plug a security hole that has been recently reported. However, upgrading just for the sake of having the newest version of a particular program is probably silly.

Upgrading can sometimes be a painful thing to do. For example, you might want to upgrade a program that requires the newest versions of the compiler, libraries, and other software in order to run. Upgrading this program will also require you to upgrade several other parts of the system, which can be a time-consuming process. On the other hand, this can be seen as an argument for keeping your software up to date; if your compiler and libraries are current, upgrading the program in question won't be a problem.

How can you find out about new versions of Linux software? The best way is to watch the Usenet newsgroup comp.os.linux.announce (see Section 1.10.3, "Usenet Newsgroups", in Chapter 1, "Introduction to Linux"), where announcements of new software releases and other important information are posted. If you have Internet access, you can then download the software via FTP and install it on your system. Another good source of information about new Linux software is the web site http://www.freshmeat.net.

If you don't have access to Usenet or the Internet, the best way to keep in touch with recent developments is to pay for a CD-ROM subscription. Here you receive an updated copy of the various Linux FTP sites, on CD-ROM, every couple of months. This service is available from a number of Linux vendors. It's a good thing to have, even if you have Internet access.

This brings us to another issue: what's the best upgrade method? Some people feel it's easier to completely upgrade the system by reinstalling everything from scratch whenever a new version of their favorite distribution is released. This way you don't have to worry about various versions of the software working together. For those without Internet access, this may indeed be the easiest method; if you receive a new CD-ROM only once every two months, a great deal of your software may be out of date.

It's our opinion, however, that reinstallation is not a good upgrade plan at all. Most of the current Linux distributions are not meant to be upgraded in this way, and a complete reinstallation may be complex or time-consuming. Also, if you plan to upgrade in this manner, you generally lose all your modifications and customizations to the system, and you'll have to make backups of your users' home directories and any other important files that would be deleted during a reinstallation. Many novices choose this upgrade path because it's the easiest to follow. In actuality, not much changes from release to release, so a complete reinstallation is usually unnecessary and can be avoided with a little upgrading know-how.

In this section, we'll show you how to upgrade various pieces of your system individually. We'll show you how to upgrade your system libraries and compiler, as well as give you a generic method for installing new software. In the following section, we'll talk about building a new kernel.

7.2.1. Upgrading Libraries

Most of the programs on a Linux system are compiled to use shared libraries. These libraries contain useful functions common to many programs. Instead of storing a copy of these routines in each program that calls them, the libraries are contained in files on the system that are read by all programs at run-time. That is, when a program is executed, the code from the program file itself is read, followed by any routines from the shared library files. This saves a great deal of disk space; only one copy of the library routines is stored on disk.

In some instances, it's necessary to compile a program to have its own copy of the library routines (usually for debugging) instead of using the routines from the shared libraries. We say that programs built in this way are statically linked, while programs built to use shared libraries are dynamically linked.

Therefore, dynamically linked executables depend upon the presence of the shared libraries on disk. Shared libraries are implemented in such a way that the programs compiled to use them generally don't depend on the version of the available libraries. This means that you can upgrade your shared libraries, and all programs that are built to use those libraries will automatically use the new routines. (There is an exception: if major changes are made to a library, the old programs won't work with the new library. You'll know this is the case because the major version number is different; we'll explain more later. In this case, you keep both the old and new libraries around. All your old executables will continue to use the old libraries, and any new programs that are compiled will use the new libraries.)

When you build a program to use shared libraries, a piece of code is added to the program that causes it to execute ld.so, the dynamic linker, when the program is started. ld.so is responsible for finding the shared libraries the program needs and loading the routines into memory. Dynamically linked programs are also linked against "stub" routines, which simply take the place of the actual shared library routines in the executable. ld.so replaces the stub routine with the code from the libraries when the program is executed.

The ldd command can be used to list the shared libraries on which a given executable depends. For example:

rutabaga% ldd /usr/bin/X11/xterm
        libXaw.so.6 => /usr/X11R6/lib/libXaw.so.6.0
        libXt.so.6 => /usr/X11R6/lib/libXt.so.6.0
        libX11.so.6 => /usr/X11R6/lib/libX11.so.6.0
        libc.so.5 => /lib/libc.so.5.0.9

Here, we see that the xterm program depends on the four shared libraries libXaw, libXt, libX11, and libc. (The first three are related to the X Window System, and the last is the standard C library.) We also see the version numbers of the libraries for which the program was compiled (that is, the version of the stub routines used), and the name of the file which contains each shared library. This is the file that ld.so will find when the program is executed.

In order to use a shared library, the version of the stub routines (in the executable) must be compatible with the version of the shared libraries. Basically, a library is compatible if its major version number matches that of the stub routines. The major version number is the part before the first period in the version number; in 6.0, the major number is 6. This way, if a program was compiled with version 6.0 of the stub routines, shared library versions 6.1, 6.2, and so forth could be used by the executable. In Section 13.1.7, "More Fun with Libraries", in Chapter 13, "Programming Languages", we describe how to use shared libraries with your own programs.
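The compatibility rule can be expressed in a few lines of shell. This is only an illustrative sketch of the rule as stated above, not how ld.so itself is implemented; the function names major_of and compatible are our own.

```shell
# major_of NAME: print the major version number found in a library
# file name such as libXaw.so.6.0 (hypothetical helper).
major_of() {
    # Drop everything up to and including ".so.", leaving e.g. "6.0".
    version=${1##*.so.}
    # Keep only the part before the first period.
    printf '%s\n' "${version%%.*}"
}

# compatible A B: succeed if both names carry the same major number.
compatible() {
    [ "$(major_of "$1")" = "$(major_of "$2")" ]
}

if compatible libXaw.so.6.0 libXaw.so.6.2; then
    echo "same major version: the executable can use this library"
fi
```

Under this rule, a program built against the 6.0 stubs accepts libXaw.so.6.2 but would reject a library whose name ends in .so.7.0.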

The file /etc/ld.so.conf contains a list of directories that ld.so searches to find shared library files. An example of such a file is:

/usr/lib
/usr/local/lib
/usr/X11R6/lib

ld.so always looks in /lib and /usr/lib, regardless of the contents of ld.so.conf. Usually, there's no reason to modify this file, and the environment variable LD_LIBRARY_PATH can add additional directories to this search path (e.g., if you have your own private shared libraries that shouldn't be used systemwide). However, if you do add entries to /etc/ld.so.conf or upgrade or install additional libraries on your system, be sure to use the ldconfig command which will regenerate the shared library cache in /etc/ld.so.cache from the ld.so search path. This cache is used by ld.so to find libraries quickly at runtime without actually having to search the directories on its path. For more information, check the manual pages for ld.so and ldconfig.
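As a minimal sketch of the LD_LIBRARY_PATH approach, the following adds a private library directory to the search path for the current shell session. The directory $HOME/lib is an assumption chosen for illustration; substitute wherever your own libraries live.

```shell
# Hypothetical directory holding private shared libraries.
private_libs="$HOME/lib"

# Prepend it to the search path, keeping any entries already present.
LD_LIBRARY_PATH="$private_libs${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
export LD_LIBRARY_PATH

echo "$LD_LIBRARY_PATH"

# After editing /etc/ld.so.conf or installing a library systemwide,
# the cache would instead be rebuilt, as root, with:
#   ldconfig
```

The setting affects only programs started from this shell, which is what you want for libraries that shouldn't be used systemwide.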

Now that you understand how shared libraries are used, let's move on to upgrading them. The two libraries that are most commonly updated are libc (the standard C library) and libm (the math library). For each shared library, there are two separate files:

library.a: This is the static version of the library. When a program is statically linked, routines are copied from this file directly into the executable, so the executable contains its own copy of the library routines.
library.so.version: This is the shared library image itself. When a program is dynamically linked, the stub routines from this file are copied into the executable, allowing ld.so to locate the shared library at runtime. When the program is executed, ld.so copies routines from the shared library into memory for use by the program. If a program is dynamically linked, the library.a file is not used for this library.

For the libc library, you'll have files such as libc.a and libc.so.5.2.18. The .a files are generally kept in /usr/lib, while .so files are kept in /lib. When you compile a program, either the .a or the .so file is used for linking, and the compiler looks in /lib and /usr/lib (as well as a variety of other places) by default. If you have your own libraries, you can keep these files anywhere, and control where the linker looks with the -L option to the compiler. See Section 13.1.7, "More Fun with Libraries", in Chapter 13, "Programming Languages", for details.

The shared library image, library.so.version, is kept in /lib for most systemwide libraries. Shared library images can be found in any of the directories that ld.so searches at runtime; these include /lib, /usr/lib, and the directories listed in ld.so.conf. See the ld.so manual page for details.

If you look in /lib, you'll see a collection of files such as the following:

lrwxrwxrwx  1 root  root      14 Oct 23 13:25 libc.so.5 -> libc.so.5.2.18
-rwxr-xr-x  1 root  root  623620 Oct 23 13:24 libc.so.5.2.18
lrwxrwxrwx  1 root  root      15 Oct 17 22:17 libvga.so.1 -> libvga.so.1.2.10
-rwxr-xr-x  1 root  root  128004 Oct 17 22:17 libvga.so.1.2.10

Here, we see the shared library images for two libraries--libc and libvga. Note that each image has a symbolic link to it, named library.so.major, where major is the major version number of the library. The minor number is omitted because ld.so searches for a library only by its major version number. When ld.so sees a program that has been compiled with the stubs for version 5.2.18 of libc, it looks for a file called libc.so.5 in its search path. Here, /lib/libc.so.5 is a symbolic link to /lib/libc.so.5.2.18, the actual version of the library we have installed.
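The naming convention is mechanical enough to compute. The sketch below derives the name of the symbolic link from the name of an image file; link_name is a hypothetical helper of our own, not a standard utility.

```shell
# link_name IMAGE: print the library.so.major name that should point
# at an image file such as libc.so.5.2.18 (hypothetical helper).
link_name() {
    image=$1
    base=${image%%.so.*}      # e.g. "libc"
    version=${image#*.so.}    # e.g. "5.2.18"
    # Keep only the major number, the part before the first period.
    printf '%s.so.%s\n' "$base" "${version%%.*}"
}

link_name libc.so.5.2.18
link_name libvga.so.1.2.10
```

For the two images in the listing above, this prints libc.so.5 and libvga.so.1, matching the links shown.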

When you upgrade a library, you must replace the .a and .so.version files corresponding to the library. Replacing the .a file is easy: just copy the new version over it. However, you must use some caution when replacing the shared library image, .so.version; most of the programs on the system depend on those images, so you can't simply delete them or rename them. To put this another way, the symbolic link library.so.major must always point to a valid library image. To accomplish this, first copy the new image file to /lib, and then change the symbolic link to point to the new file in one step, using ln -sf. This is demonstrated in the following example.

Let's say you're upgrading from Version 5.2.18 of the libc library to Version 5.4.47. You should have the files libc.a and libc.so.5.4.47. First, copy the .a file to the appropriate location, overwriting the old version:

rutabaga# cp libc.a /usr/lib

Now, copy the new image file to /lib (or wherever the library image should be):

rutabaga# cp libc.so.5.4.47 /lib

Now, if you use the command ls -l /lib/libc* you should see something like:

lrwxrwxrwx  1 root  root      14 Oct 23 13:25 libc.so.5 -> libc.so.5.2.18
-rwxr-xr-x  1 root  root  623620 Oct 23 13:24 libc.so.5.2.18
-rwxr-xr-x  1 root  root  720310 Nov 16 11:02 libc.so.5.4.47

To update the symbolic link to point to the new library, use the command:

rutabaga# ln -sf /lib/libc.so.5.4.47 /lib/libc.so.5

This gives you:

lrwxrwxrwx  1 root  root      14 Oct 23 13:25 libc.so.5 -> /lib/libc.so.5.4.47
-rwxr-xr-x  1 root  root  623620 Oct 23 13:24 libc.so.5.2.18
-rwxr-xr-x  1 root  root  720310 Nov 16 11:02 libc.so.5.4.47

Now you can safely remove the old image file, libc.so.5.2.18. You must use ln -sf to replace the symbolic link in one step; this matters most for libraries that nearly every program depends on, such as libc.

If you were to remove the symbolic link first, and then attempt to use ln -s to add it again, more than likely ln would not be able to execute because the symbolic link is gone, and as far as ld.so is concerned, the libc library can't be found. Once the link is gone, nearly all the programs on your system will be unable to execute. Be very careful when updating shared library images.

Whenever you upgrade or add a library to the system, it's not a bad idea to run ldconfig to regenerate the library cache used by ld.so. In some cases, a new library may not be recognized by ld.so until you run ldconfig.
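If you'd like to rehearse the procedure without touching /lib, you can try it in a scratch directory first. The following is a minimal sketch; the library name libdemo and the empty placeholder files are hypothetical stand-ins for real library images.

```shell
# Work in a throwaway directory so nothing under /lib is affected.
demo=$(mktemp -d)
(
    cd "$demo" || exit 1

    # Hypothetical "old" image and the major-version link pointing at it.
    : > libdemo.so.5.2.18
    ln -s libdemo.so.5.2.18 libdemo.so.5

    # Step 1: put the new image next to the old one.
    : > libdemo.so.5.4.47

    # Step 2: repoint the existing link with a single ln -sf command,
    # rather than removing it and creating it again.
    ln -sf libdemo.so.5.4.47 libdemo.so.5

    # Step 3: only now remove the old image.
    rm libdemo.so.5.2.18

    ls -l
)
```

The order is the point: at no time between the steps does libdemo.so.5 name a missing file. On a real system you would follow this with ldconfig, run as root.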

The Linux community is currently moving from the old libc version 5 to the new so-called glibc2, also called libc6. In principle, this is not different from any other incompatible library update, but in practice this brings all kinds of problems because exchanging the C library in an incompatible manner affects each and every program on the system. While the new glibc2 has several advantages--among other things it is thread-safe, meaning that it makes it a lot easier to write programs that do more than one thing at a time--many people consider it still unstable. In addition, you cannot run programs compiled for one version with the other library version. If you want to run a program for which you do not have the sources, you will have to install the C library version that this program needs. Fortunately, it is possible to have both versions on your system, albeit with some problems. Those distributions that have already switched to glibc2 usually provide an installable package named something like "libc5 compatibility"; install this package if you want to be able to run software compiled with the old C library.

One question remains: where can you obtain the new versions of libraries? Several of the basic system libraries (libc, libm, and so on) can be downloaded from the directory /pub/Linux/GCC on ftp://metalab.unc.edu. It contains the Linux versions of the gcc compiler, libraries, include files, and other utilities. Each file there should have a README or release file that describes what to do and how to install it. Other libraries are maintained and archived separately. At any rate, all libraries you install should include the .a and .so.version files, as well as a set of include files for use with the compiler.

7.2.2. Upgrading the Compiler

One other important part of the system to keep up to date is the C compiler and related utilities. These include gcc (the GNU C and C++ compiler itself), the linker, the assembler, the C preprocessor, and various include files and libraries used by the compiler itself. All are included in the Linux gcc distribution. Usually, a new version of gcc is released along with new versions of the libc library and include files, and each requires the other.

You can find the current gcc release for Linux on the various FTP archives, including /pub/Linux/GCC on ftp://metalab.unc.edu. The release notes there should tell you what to do. Usually, upgrading the compiler is a simple matter of unpacking several tar files as root, and possibly removing some additional files. If you don't have Internet access, you can obtain the newest compiler from CD-ROM archives of the FTP sites, as described earlier.

To find out what version of gcc you have, use the command:

gcc -v

This should tell you something like:

Reading specs from /usr/lib/gcc-lib/i486-linux/2.8.1/specs 
gcc version 2.8.1

Note that gcc itself is just a front-end to the actual compiler and code-generation tools found under:

/usr/lib/gcc-lib/machine/version

gcc (usually in /usr/bin) can be used with multiple versions of the compiler proper, with the -V option. In Section 13.1, "Programming with gcc", in Chapter 13, "Programming Languages", we describe the use of gcc in detail.

If you are developing software in C++, it might also be a good idea to use egcs, a new version of gcc that is much more robust than gcc itself and supports most of the modern C++ features. Unfortunately, egcs, older versions of gcc (up to version 2.7.2), and newer versions of gcc (from version 2.8.0) all use different and incompatible object file formats, which means that you should recompile all your C++ libraries and applications if you migrate from one compiler to another. The Free Software Foundation has announced recently that egcs will become its default compiler, thus replacing gcc, so these considerations might be obsolete soon.

7.2.3. Upgrading Other Software

Of course, you'll have to periodically upgrade other pieces of your system. As discussed in the previous section, it's usually easier and best to upgrade only those applications you need to upgrade. For example, if you never use Emacs on your system, why bother keeping up-to-date with the most recent version of Emacs? For that matter, you may not need to stay completely current with oft-used applications. If something works for you, there's little need to upgrade.

In order to upgrade other applications, you'll have to obtain the newest release of the software. This is usually available as a gzipped or compressed tar file. Such a package could come in several forms. The most common are binary distributions, where the binaries and related files are archived and ready to unpack on your system, and source distributions, where the source code (or portions of the source code) for the software is provided, and you have to issue commands to compile and install it on your system.

Shared libraries make distributing software in binary form easy; as long as you have a version of the libraries installed that is compatible with the library stubs used to build the program, you're set. However, in many cases, it is easier (and a good idea) to release a program as source. Not only does this make the source code available to you for inspection and further development, it allows you to build the application specifically for your system, with your own libraries. Many programs allow you to specify certain options at compile-time, such as selectively including various features in the program when built. This kind of customization isn't possible if you get prebuilt binaries.

There's also a security issue at play when installing binaries without source code. Although on Unix systems viruses are nearly unheard of,[31] it's not difficult to write a "trojan horse," a program that appears to do something useful, but in actuality causes damage to the system. For example, someone could write an application that includes the "feature" of deleting all files in the home directory of the user executing the program. Because the program would be running with the permissions of the user executing it, the program itself has the ability to do this kind of damage. (Of course, the Unix security mechanism prevents damage being done to other users' files or to any important system files owned by root.)

[31]A "virus" in the classic sense is a program that attaches to a "host," which runs when the host is executed. On Unix systems, this usually requires root privileges to do any harm, and if programmers could obtain such privileges, they probably wouldn't bother with a virus.

While having source won't necessarily prevent this from happening (do you read the source code for every program you compile on your system?), at least it gives you a way to verify what the program is really doing. A programmer would have to make a certain effort to prevent such a trojan horse from being discovered, but if you install binaries blindly, you are setting yourself up for trouble.

At any rate, dealing with source and binary distributions of software is quite simple. If the package is released as a tar file, first use the tar t option to determine how the files have been archived. In the case of binary distributions, you may be able to unpack the tar file directly on your system, say from / or /usr. When doing this, be sure to delete any old versions of the program and its support files (those that aren't overwritten by the new tar file). If the old executable comes before the new one on your path, you'll continue to run the old version unless you remove it.
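As a minimal sketch of this inspect-then-unpack habit, the following builds a tiny archive in a scratch directory, lists it with the t option, and only then extracts it. The package name foobar-1.0 and its contents are hypothetical, and the z option assumes the archive is gzipped.

```shell
work=$(mktemp -d)

# Build a small stand-in for a downloaded package (hypothetical contents).
mkdir -p "$work/build/foobar-1.0"
echo 'placeholder' > "$work/build/foobar-1.0/README"
tar -czf "$work/foobar-1.0.tar.gz" -C "$work/build" foobar-1.0

# Look before you unpack: t lists the archive without extracting anything.
listing=$(tar -tzf "$work/foobar-1.0.tar.gz")
printf '%s\n' "$listing"

# Every path starts with foobar-1.0/, so the archive brings its own
# subdirectory and can be unpacked directly into the target directory.
mkdir "$work/unpacked"
tar -xzf "$work/foobar-1.0.tar.gz" -C "$work/unpacked"
```

If the listing had shown bare filenames or paths such as usr/bin/foobar instead, you would know to create a directory first or to unpack from / or /usr as appropriate.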

Many distributions now use a special packaging system that makes installing and uninstalling software a lot easier. There are several packaging systems available, but most distributions, including Red Hat, SuSE, and Caldera, use the RPM system, which we will cover in the next section. The Debian distribution uses its own .deb system, which is not covered here.

Source distributions are a bit trickier. First, you must unpack the sources into a directory of their own. Most systems use /usr/src for just this. Because you usually don't have to be root to build a software package (you will usually require root permissions to install the program once compiled!), it might be a good idea to make /usr/src writable by all users, with the command:

chmod 1777 /usr/src

This allows any user to create subdirectories of /usr/src and place files there. The first 1 in the mode is the "sticky" bit, which prevents users from deleting each other's subdirectories.
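You can see the effect of this mode on a scratch directory without being root. In the sketch below, a temporary directory stands in for /usr/src; that substitution is ours, made only so the example is harmless to run.

```shell
# A scratch directory stands in for /usr/src, so root is not required.
src=$(mktemp -d)
chmod 1777 "$src"

# The first ten characters of the long listing are the file type and mode.
mode=$(ls -ld "$src" | cut -c1-10)
echo "$mode"
```

This prints drwxrwxrwt: the directory is readable, writable, and searchable by everyone, and the final t (in place of the usual x) shows that the sticky bit is set.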

You can now create a subdirectory under /usr/src and unpack the tar file there, or you can unpack the tar file directly from /usr/src if the archive contains a subdirectory of its own.

Once the sources are available, the next step is to read any README files or installation notes included with the sources. Nearly all packages include such documentation. The basic method used to build most programs is:

1. Check the Makefile. This file contains instructions for make, which controls the compiler to build programs. Many applications require you to edit minor aspects of the Makefile for your own system; this should be self-explanatory. The installation notes will tell you if you have to do this. If you need more help with the Makefile, read Section 13.2, "Makefiles", in Chapter 13, "Programming Languages". If there is no Makefile in the package, you might have to generate it first. See item 3 for how to do this.
2. Possibly edit other files associated with the program. Some applications require you to edit a file named config.h; again, this will be explained in the installation instructions.
3. Possibly run a configuration script. Such a script is used to determine what facilities are available on your system, which is necessary to build more complex applications.

Specifically, when the sources do not contain a Makefile in the top-level directory, but instead a file called Makefile.in and a file called configure, the package has been built with the Autoconf system. In this (increasingly common) case, you run the configuration script like this:
```
./configure
```
The ./ should be used so that the local configure is run, and not another configure program that might accidentally be in your path. Some packages let you pass options to configure that often enable or disable specific features of the package. Once the configure script has run, you can proceed with the next step.
4. Run make. Generally, this executes the appropriate compilation commands as given in the Makefile. In many cases you'll have to give a "target" to make, as in make all or make install. These are two common targets; the former is usually not necessary but can be used to build all targets listed in a Makefile (e.g., if the package includes several programs, but only one is compiled by default); the latter is often used to install the executables and support files on the system after compilation. For this reason, make install is usually run as root.
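The decision between running configure and going straight to make depends only on which files the package ships. As a rough sketch of that logic, the hypothetical helper below (next_build_step is our own name, not part of any package) looks at a source directory and reports which command to try next. It is no substitute for reading the installation notes.

```shell
# next_build_step DIR: print the command the steps above suggest next
# for the source tree in DIR (hypothetical helper).
next_build_step() {
    dir=$1
    if [ -f "$dir/Makefile" ]; then
        # A Makefile is already present: check it, then build.
        echo "make"
    elif [ -f "$dir/configure" ] && [ -f "$dir/Makefile.in" ]; then
        # Autoconf-style package: generate the Makefile first.
        echo "./configure"
    else
        echo "read the README"
    fi
}

# Example with a scratch directory laid out like an Autoconf package.
pkg=$(mktemp -d)
: > "$pkg/configure"
: > "$pkg/Makefile.in"
next_build_step "$pkg"
```

For the scratch directory in the example, the helper prints ./configure; once a Makefile exists there, it prints make instead.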

You might have problems compiling or installing new software on your system, especially if the program in question hasn't been tested under Linux, or depends on other software you don't have installed. In Chapter 13, "Programming Languages", we talk about the compiler, make, and related tools in detail.

Most software packages include manual pages and other files, in addition to the source and executables. The installation script (if there is one) will place these files in the appropriate location. In the case of manual pages, you'll find files with names such as foobar.1 or foobar.man. These files are usually nroff source files, which are formatted to produce the human-readable pages displayed by the man command. If the manual page source has a numeric extension, such as .1, copy it to the directory /usr/man/man1, where 1 is the number used in the filename extension. (This corresponds to the manual "section" number; for most user programs, it is 1.) If the file has an extension such as .man, it usually suffices to copy the file to /usr/man/man1, renaming the .man extension to .1.
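The renaming rule for manual pages can be sketched as a small shell function. The helper name man_dest is hypothetical, and treating every .man file as belonging to section 1 is an assumption that holds for most user programs but not all; check the page itself if in doubt.

```shell
# man_dest FILE: print where a manual page source would be installed
# under /usr/man, following the rule described above (hypothetical helper).
man_dest() {
    file=$(basename "$1")
    name=${file%.*}     # e.g. "foobar"
    ext=${file##*.}     # e.g. "1" or "man"
    case $ext in
        [1-9]) section=$ext ;;   # numeric extension names the section
        man)   section=1 ;;      # assumption: a user program, section 1
        *)     echo "man_dest: unrecognized extension: $ext" >&2
               return 1 ;;
    esac
    printf '/usr/man/man%s/%s.%s\n' "$section" "$name" "$section"
}

man_dest foobar.1
man_dest foobar.man
```

Both calls print /usr/man/man1/foobar.1, so the actual installation would be a cp to that path, run as root.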