4.3 How Unix Implements Passwords

This section describes how passwords are implemented inside the Unix operating system for both locally administered and network-based systems.

4.3.1 The /etc/passwd File

Traditionally, Unix uses the /etc/passwd file to keep track of every user on the system. The /etc/passwd file contains the username, real name, identification information, and basic account information for each user. Each line in the file contains a database record; the record fields are separated by a colon (:).

You can use the cat command to display your system's /etc/passwd file. Here are a few sample lines from a typical file:

root:x:0:1:System Operator:/:/bin/ksh
daemon:x:1:1::/tmp:
uucp:x:4:4::/var/spool/uucppublic:/usr/lib/uucp/uucico
rachel:x:181:100:Rachel Cohen:/u/rachel:/bin/ksh
arlin:x:182:100:Arlin Steinberg:/u/arlin:/bin/csh

The first three accounts, root, daemon, and uucp, are system accounts, while rachel and arlin are accounts for individual users.

The individual fields of the /etc/passwd file have fairly straightforward meanings. Table 4-1 explains a sample line from the file shown above.

Table 4-1. Example /etc/passwd fields
Field	Contents
rachel	Username.
x	Holding place for the user's "encrypted password." Traditionally, this field actually stored the user's encrypted password. Modern Unix systems store encrypted passwords in a separate file (the shadow password file) that can be accessed only by privileged users.
181	User's user identification number (UID).
100	User's group identification number (GID).
Rachel Cohen	User's full name (also known as the GECOS or GCOS field).^[12]
/u/rachel	User's home directory.
/bin/ksh	User's shell.^[13]

^[12] When Unix was first written, it ran on a small minicomputer. Many users at Bell Labs used their Unix accounts to create batch jobs to be run via Remote Job Entry (RJE) on the bigger GECOS computer in the Labs. The user identification information for the RJE was kept in the /etc/passwd file as part of the standard user identification. GECOS stood for General Electric Computer Operating System; GE was one of several major companies that made computers around that time.

^[13] An empty field for the shell name does not mean that the user has no shell; instead, it means that a default shell, usually the Korn shell (/bin/ksh) or Bourne shell (/bin/sh), should be used. To prevent a user from logging in, the program /bin/false is often used as the "shell."
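As a minimal sketch of the record layout in Table 4-1, the following Python splits one line of the file into its seven colon-separated fields. The names PasswdEntry and parse_passwd_line are our own, chosen for illustration; they are not part of any Unix library.

```python
from typing import NamedTuple


class PasswdEntry(NamedTuple):
    """One /etc/passwd record, with fields in the order shown in Table 4-1."""
    username: str
    password: str   # "x" when the real value lives in a shadow file
    uid: int
    gid: int
    gecos: str      # full name (GECOS/GCOS field)
    home: str
    shell: str      # may be empty, meaning "use the default shell"


def parse_passwd_line(line: str) -> PasswdEntry:
    """Split a single colon-separated record into named fields."""
    fields = line.rstrip("\n").split(":")
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, found {len(fields)}")
    name, password, uid, gid, gecos, home, shell = fields
    return PasswdEntry(name, password, int(uid), int(gid), gecos, home, shell)


entry = parse_passwd_line("rachel:x:181:100:Rachel Cohen:/u/rachel:/bin/ksh")
print(entry.username, entry.uid, entry.shell)
```

Note that an entry such as daemon's, which ends in a colon, parses to an empty shell field rather than an error.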

Passwords were traditionally stored in the /etc/passwd file in an encrypted format (hence the file's name). However, because of advances in processor speed, encrypted passwords are now almost universally stored in separate shadow password files, which are described later.

The meanings of the UID and GID fields are described in Chapter 5.

4.3.2 The Unix Encrypted Password System

When Unix requests your password, it needs some way of determining that the password you type is the correct one. Many early computer systems (and quite a few still around today!) kept the passwords for all of their accounts plainly visible in a so-called "password file" that contained exactly that: passwords. Under normal circumstances, the system protected the passwords so that they could be accessed only by privileged users and operating system utilities. But through accident, programming error, or deliberate act, the contents of the password file could occasionally become available to unprivileged users. This scenario is illustrated in the following remembrance:

Perhaps the most memorable such occasion occurred in the early 1960s when a system administrator on the CTSS system at MIT was editing the password file and another system administrator was editing the daily message that is printed on everyone's terminal on login. Due to a software design error, the temporary editor files of the two users were interchanged and thus, for a time, the password file was printed on every terminal when it was logged in.

Robert Morris and Ken Thompson, "Password Security: A Case History," Communications of the ACM, November 1979.

The real danger posed by such systems, explained Morris and Thompson, is not that software problems might someday cause a recurrence of this event, but that people can make copies of the password file and purloin them without the knowledge of the system administrator. For example, if the password file is saved on backup tapes, then those backups must be kept in a physically secure place. If a backup tape is stolen, then everybody's password needs to be changed.

Unix avoids this problem by not keeping actual passwords anywhere on the system. Instead, Unix stores a value that is generated by using the password to encrypt a block of zero bits with a one-way function called crypt( ); the result of the calculation was traditionally stored in the /etc/passwd file.^[14] When you try to log in, the program /bin/login does not decrypt the stored password. Instead, /bin/login takes the password that you typed, uses it to transform another block of zeros, and compares the newly transformed block with the block stored in the /etc/passwd file. If the two encrypted results match, the system lets you in.

^[14] These days, the encrypted password is stored either in the shadow password file or on a network-based server, as we'll see in a later section.
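The store-and-compare idea can be sketched in a few lines of Python. This is an illustration only: the function one_way below uses SHA-256 as a stand-in for crypt( ) (it is not the DES-based algorithm), and the stored table and the check_login function are hypothetical names of our own.

```python
import hashlib
import hmac


def one_way(password: str) -> str:
    """Stand-in one-way function. ASSUMPTION: SHA-256 replaces crypt( ) here."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


# What the system keeps: only the transformed value, never the password itself.
stored = {"rachel": one_way("nutmeg")}


def check_login(username: str, typed_password: str) -> bool:
    """Transform the typed password and compare it with the stored value."""
    expected = stored.get(username)
    if expected is None:
        return False
    # Nothing is ever decrypted; the two transformed values are compared.
    return hmac.compare_digest(one_way(typed_password), expected)


print(check_login("rachel", "nutmeg"))
print(check_login("rachel", "wrong guess"))
```

The point of the design is that a stolen copy of the stored table reveals only transformed values, not the passwords that produced them.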

The security of this approach rests upon the strength of the encryption algorithm and the difficulty of guessing the user's password. To date, the crypt( ) algorithm and its successors have proven highly resistant to attacks. Unfortunately, users have a habit of picking easy-to-guess passwords, which creates the need for shadow password files.

4.3.2.1 The traditional crypt( ) algorithm

The algorithm that traditional crypt( ) uses is based on the Data Encryption Standard (DES) of the National Institute of Standards and Technology (NIST). In normal operation, DES uses a 56-bit key (eight 7-bit ASCII characters, for instance) to encrypt blocks of original text, or cleartext, that are 64 bits in length. The resulting 64-bit blocks of encrypted text, or ciphertext, cannot easily be decrypted to the original cleartext without knowing the original 56-bit key.

The Unix crypt( ) function takes the user's password as the encryption key and uses it to encrypt a 64-bit block of zeros. The resulting 64-bit block of ciphertext is then encrypted again with the user's password; the process is repeated a total of 25 times. The final 64 bits are unpacked into a string of 11 printable characters that are stored in the shadow password file.^[15]

^[15] Each of the 11 characters holds six bits of the result, represented as one of 64 characters in the set ".", "/", 0-9, A-Z, a-z, in that order. Thus, the value 0 is represented as ".", and 32 is the letter "U".
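The unpacking step can be illustrated with a short Python sketch. The 64-character alphabet follows the order given in the footnote. The exact bit order (most significant bits first, with two zero bits of padding to fill the eleventh character) is our assumption for illustration, and encode_64_bits is a hypothetical helper name.

```python
import string

# ".", "/", 0-9, A-Z, a-z, in that order: 64 symbols, six bits each.
ALPHABET = "./" + string.digits + string.ascii_uppercase + string.ascii_lowercase


def encode_64_bits(block: int) -> str:
    """Unpack a 64-bit value into 11 printable characters.

    ASSUMPTION: bits are taken most-significant first and the value is
    padded with two zero bits so that it fills 11 groups of six bits.
    """
    if not 0 <= block < 2 ** 64:
        raise ValueError("block must fit in 64 bits")
    padded = block << 2  # 66 bits = 11 groups of 6
    return "".join(ALPHABET[(padded >> shift) & 0x3F]
                   for shift in range(60, -1, -6))


print(ALPHABET[0], ALPHABET[32])   # the two values mentioned in the footnote
print(encode_64_bits(0))
```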

Don't confuse the crypt( ) algorithm with the crypt encryption program. The crypt program uses a different encryption system from crypt( ) and is very easy to break. See Chapter 7 for more details.

Although the source code to crypt( ) is readily available, no technique has been discovered (or publicized) to translate the encrypted password back into the original password. Such reverse translation may not even be possible. As a result, the only known way to defeat Unix password security is via a brute-force attack (see the next note), or by a dictionary attack. A dictionary attack is conducted by choosing likely passwords (as from a dictionary), encrypting them, and comparing the results with the value stored in /etc/passwd. This approach to breaking a cryptographic cipher is also called a key search or password cracking. It is made easier by the fact that the DES-based crypt( ) uses only the first eight characters of the password as its key; dictionaries need only contain passwords of eight characters or fewer.
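A dictionary attack of the kind just described might look like the following Python sketch. The function toy_crypt is a stand-in of our own (SHA-256, not DES) that imitates only one property of the traditional algorithm: it ignores everything past the eighth character. The name dictionary_attack and the word list are likewise hypothetical.

```python
import hashlib
from typing import Iterable, Optional


def toy_crypt(password: str) -> str:
    """Stand-in for crypt( ). ASSUMPTION: SHA-256 replaces the DES-based
    transform; only the eight-character truncation is imitated."""
    return hashlib.sha256(password[:8].encode("utf-8")).hexdigest()


def dictionary_attack(stored: str, candidates: Iterable[str]) -> Optional[str]:
    """Encrypt each likely password and compare it with the stored value."""
    for word in candidates:
        if toy_crypt(word) == stored:
            return word
    return None


stolen_value = toy_crypt("nutmeg")
print(dictionary_attack(stolen_value, ["ellen1", "Sharon", "nutmeg"]))
```

Because only the first eight characters matter in this model, a guess of "password" matches a stored value made from "password123", which is why the attacker's dictionary never needs longer entries.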

Robert Morris and Ken Thompson designed crypt( ) to make a key search computationally expensive. The idea was to make a dictionary attack take too long to be practical. At the time, software implementations of DES were quite slow; iterating the encryption process 25 times made the process of encrypting a single password 25 times slower still. On the original PDP-11 processors upon which Unix was designed, nearly a full second of computer time was required to encrypt a single password. To eliminate the possibility of using DES hardware encryption chips, which were a thousand times faster than software running on a PDP-11, Morris and Thompson modified the DES tables used by their software implementation, rendering the two incompatible. The same modification also served to prevent a bad guy from simply pre-encrypting an entire dictionary and storing it.

What was the modification? Morris and Thompson added a bit of salt, as we'll describe in the next section.

There is no published or known method to easily decrypt DES-encrypted text without knowing the key. Of course, "easily" has a different meaning for cryptographers than for mere mortals. To decrypt something encrypted with DES is computationally expensive; using the fastest current, general-purpose computers might take hundreds of years.

However, computers have grown so much faster in the past 25 years that it is now possible to test millions of passwords in a relatively short amount of time.

4.3.2.2 Unix salt

As table salt adds zest to popcorn, the salt that Morris and Thompson sprinkled into the DES algorithm added a little more spice and variety. The DES salt is a 12-bit number, between 0 and 4,095, which slightly changes the result of the DES function. Each of the 4,096 different salts makes a password encrypt a different way.

When you change your password, the /bin/passwd program selects a salt based on the time of day. The salt is converted into a two-character string and is stored in the /etc/passwd file along with the encrypted "password."^[16] In this manner, when you type your password at login time, the same salt is used again. Unix stores the salt as the first two characters of the encrypted password.

^[16] By now, you know that what is stored in the /etc/passwd file is not really the encrypted password. However, everyone calls it that, and we will do the same from here on. Otherwise, we'll need to keep typing "the superencrypted block of zeros that is used to verify the user's password" everywhere in the book, filling many extra pages and contributing to the premature demise of yet more trees.

Table 4-2 shows how a few different words encrypt with different salts.

Table 4-2. Passwords and salts
Password	Salt	Encrypted password
`nutmeg`	Mi	MiqkFWCm1fNJI
`ellen1`	ri	ri79KNd7V6.Sk
`Sharon`	./	./2aN7ysff3qM
`norahs`	am	amfIADT2iqjAf
`norahs`	7a	7azfT5tIdyh0I

Notice that the last password, norahs, was encrypted two different ways with two different salts. As a side effect, the salt makes it possible for a user to have the same password on a number of different computers and to keep this fact a secret (usually), even from somebody who has access to the /etc/passwd files on all of those computers; two systems would not likely assign the same salt to the user, thus ensuring that the encrypted password field is different.^[17]

^[17] This case occurs only when the user actually types in his password on the second computer. Unfortunately, in practice, system administrators commonly cut and paste /etc/passwd entries from one computer to another when they build accounts for users on new computers. As a result, others can easily tell when a user has the same password on more than one system.
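The salt handling described in this section can be sketched as follows. The split of a 13-character entry into a two-character salt and an 11-character result follows the text; the bit order used by salt_to_chars (low six bits first) is an assumption for illustration, and both function names are our own.

```python
import string

# Same 64-symbol alphabet used for the encrypted result.
ALPHABET = "./" + string.digits + string.ascii_uppercase + string.ascii_lowercase


def salt_to_chars(salt: int) -> str:
    """Convert a 12-bit salt (0 to 4,095) into a two-character string.

    ASSUMPTION: the first character carries the low six bits.
    """
    if not 0 <= salt < 4096:
        raise ValueError("salt must be a 12-bit number")
    return ALPHABET[salt & 0x3F] + ALPHABET[(salt >> 6) & 0x3F]


def split_encrypted(field: str) -> tuple:
    """Separate the stored salt from the rest of the encrypted password."""
    return field[:2], field[2:]


# The two entries for "norahs" in Table 4-2 differ only because of the salt.
for field in ("amfIADT2iqjAf", "7azfT5tIdyh0I"):
    salt, rest = split_encrypted(field)
    print(salt, rest, len(rest))
```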

On the Importance of Encrypted Passwords

Alec Muffett, the author of the Crack program (discussed in Table 19-1), related an entertaining story to us about the reuse of passwords in more than one place, which we paraphrase here.

A student friend of Alec's (call him Bob) spent a co-op year at a major computer company site. During his vacations and on holidays, he'd come back to school and play AberMUD (a network-based game) on Alec's computer. One of Bob's responsibilities at the company involved system management. The company was concerned about security, so all passwords were random strings of letters with no sensible pattern or order.

One day, Alec fed the AberMUD passwords into his development version of the Crack program as a dictionary, because they were stored on his machine as plaintext. He then ran this file against his system user-password file, and found a few student account passwords. He had the students change their passwords, and he then forgot about the matter.

Some time later, Alec posted a revised version of the Crack program and associated files to the Usenet. They ended up in one of the Usenet sources newsgroups and were distributed quite widely. Eventually, after a trip of thousands of miles around the world, they came to Bob's company. Bob, being a concerned administrator, decided to download the files and check them against his company's passwords. Imagine Bob's shock and horror when the widely distributed Crack promptly churned out a match for his randomly chosen, super-secret root password!

The moral of the story is that you should teach your users never to use their account passwords for other purposes, such as games or web sites. They never know when those passwords might come back to haunt them! For developers, the moral is that all programs (even games) should store passwords encrypted with one-way hash functions.

In recent years the security provided by the salt has diminished significantly. Having a salt means that the same password can encrypt in 4,096 different ways. This makes it much harder for an attacker to build a reverse dictionary for translating encrypted passwords back into their unencrypted form: to build a reverse dictionary of 100,000 words, an attacker would need to have 409,600,000 entries. But with 8-character passwords and 13-character encrypted passwords, 409,600,000 entries fit in roughly 8 GB of storage.
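The storage estimate is simple arithmetic, worked here in Python. We assume each entry is stored as the raw 8-character password followed by the 13-character encrypted form, with no separators or index overhead.

```python
WORDS = 100_000          # size of the attacker's word list
SALTS = 2 ** 12          # 4,096 possible salts
PASSWORD_BYTES = 8       # longest password crypt( ) distinguishes
ENCRYPTED_BYTES = 13     # 2-character salt + 11-character result

entries = WORDS * SALTS
total_bytes = entries * (PASSWORD_BYTES + ENCRYPTED_BYTES)

print(f"{entries:,} entries")
print(f"{total_bytes:,} bytes")
print(f"{total_bytes / 2 ** 30:.2f} GiB")
```

At 21 bytes per entry the table comes to about 8.6 billion bytes, which is consistent with the rough figure in the text.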

Another problem with the salt was an error in implementation: many systems selected which salt to use based on the time of day, which made some salts more likely than others.

4.3.2.3 crypt16( ), DES Extended, and Modular Crypt Format

Modern Unix systems have improved the security of the crypt( ) function by changing the underlying encryption algorithm. Instead of a modified DES, a variety of other algorithms have been adopted, including Blowfish and MD5. The advantage of these new algorithms is that more characters of the password are significant, and there are many more possible values for the salt; both of these changes significantly improve the strength of the underlying encrypted password system. The disadvantage is that the encrypted passwords on these systems will not be compatible with the encrypted passwords on other systems.

Because of the widespread use of the original Unix password encryption algorithm, Unix vendors have gone to great lengths to ensure compatibility. Thus, the crypt( ) function called with a traditional salt will always use the original DES-based algorithm. To use one of the newer algorithms you must use either a different function call (some vendors use bigcrypt( ) or crypt16( )) or a different salt value. Consult your documentation to find out what is appropriate for your system.

The DES Extended format is a technique for increasing the number of DES rounds and extending the salt from 2¹² to 2²⁴ possible values. This format has limited use on modern Unix systems but is included on many to provide backwards compatibility.

The Modular Crypt Format (MCF) specifies an extensible scheme for formatting encrypted passwords. MCF is one of the most popular formats for encrypted passwords around today. Here is an example of an MCF-encrypted password:

$1$EqkVUoQ2$4VLpJuZ.Q2wm6TAiyYt75.

Dollar signs are used to delimit the MCF fields, as described in Table 4-3.

Table 4-3. The modular crypt format
Field	Purpose	Notes
#1	Specifies encryption algorithm to use	1 specifies MD5; 2 specifies Blowfish.
#2	Salt	Limited to 16 characters.
#3	Encrypted password	Does not include salt, unlike traditional Unix crypt( ) function.
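As a sketch of the layout in Table 4-3, the following Python splits an MCF string on its dollar signs. The function name parse_mcf is hypothetical, and the algorithm table covers only the two identifiers listed above; real systems may define others with additional fields.

```python
# Identifiers taken from Table 4-3; anything else is reported as unknown.
ALGORITHMS = {"1": "MD5", "2": "Blowfish"}


def parse_mcf(field: str) -> dict:
    """Split an MCF-encrypted password into algorithm, salt, and hash."""
    parts = field.split("$")
    # A leading "$" produces an empty first element, then the three fields.
    if len(parts) != 4 or parts[0] != "":
        raise ValueError("not a three-field MCF string")
    _, ident, salt, hashed = parts
    return {
        "algorithm": ALGORITHMS.get(ident, "unknown"),
        "salt": salt,
        "hash": hashed,
    }


print(parse_mcf("$1$EqkVUoQ2$4VLpJuZ.Q2wm6TAiyYt75."))
```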

4.3.2.4 The shadow password and master password files

Although changes to the encrypted password system (as described in the previous section) have improved the security of encrypted passwords, they have failed to fundamentally address the weakness exploited by password crackers: people pick passwords that are easy to guess. If an attacker can obtain a copy of the password file, it is a simple matter to guess passwords, perform the encryption transform, and compare against the file.

Ultimately, the best way to deal with the problem of poorly-chosen passwords is to eliminate reusable passwords entirely by using one-time passwords, some form of biometrics, or a token-based authentication system. Because such systems can be awkward or expensive, modern Unix systems have adopted a second approach called shadow password files or master password files.

As the name implies, a shadow password file is a secondary password file that shadows the primary password file. On Solaris and Linux systems, the shadow password is usually stored in the file /etc/shadow and contains the encrypted password and a password expiration date. The /etc/shadow file is protected so that it can be read only by the superuser. Thus, an attacker cannot obtain a copy to use in verifying guesses of passwords.

Instead of a shadow password file, FreeBSD uses a master password file. This file, /etc/master.passwd, is a complete password file that includes usernames, passwords, and other account information. The /etc/passwd file is identical to the /etc/master.passwd file, except that all encrypted passwords have been changed to the letter "x".

Mac OS X stores all account information in the NetInfo network-based account management system. Mac OS X does this for all computers, even for standalone computers that are never placed on a network. The version of NetInfo that is supplied in Mac OS X 10.0 and 10.1 does not provide for shadow passwords, although the /etc/master.passwd file is present and is used during boot-up.

4.3.3 One-Time Passwords

The most effective way to minimize the danger of bad passwords is not to use conventional passwords at all. Instead, your site can install software and/or hardware to allow one-time passwords. A one-time password is exactly that: a password that is used only once.

There are two popular techniques for implementing one-time passwords:

Hardware tokens: An example is the RSA SecurID card, which displays a new PIN or password for each login. Some token-based systems display a different code every minute. Other token-based systems look like little calculators. When you attempt to log in, you are presented with a challenge. You type this challenge into your calculator, type in your personal identification number, and then type the resulting number that is displayed into the computer.
Codebooks: These list valid passwords. Each password is crossed off the list after it is used. S/Key is a popular codebook system.^[18]

^[18] More correctly, it is a one-time pad and not a codebook.
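The codebook idea, in which each password is crossed off after use, can be sketched in a few lines of Python. This is a toy model of the bookkeeping only, not of how S/Key generates or checks its passwords; the Codebook class and the sample phrases are hypothetical.

```python
class Codebook:
    """A list of valid one-time passwords; each works exactly once."""

    def __init__(self, passwords):
        self._unused = set(passwords)

    def verify(self, attempt: str) -> bool:
        """Accept an unused password and cross it off the list."""
        if attempt in self._unused:
            self._unused.remove(attempt)
            return True
        return False


# Hypothetical entries, for illustration only.
book = Codebook(["oak fig tin", "red elm cup"])
print(book.verify("oak fig tin"))   # first use
print(book.verify("oak fig tin"))   # replay of the same password
```

An eavesdropper who captures a password in transit gains nothing, because it has already been crossed off by the time it could be replayed.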

One-time passwords can be implemented as a replacement for conventional passwords or in addition to them. In a typical S/Key environment, you enter the S/Key password instead of your standard Unix password. For example:

login: darrel
Password: says rusk wag hut gwen loge

Last login: Wed Jul  5 08:11:33 from r2.nitroba.com
You have new mail.
%

All of these one-time password systems provide an astounding improvement in security over the conventional system. Unfortunately, because they require either the installation of special software or the purchase of additional hardware, they are not as widespread at this time in the Unix marketplace as they should be. However, many major companies and government agencies have moved to using these one-time methods. (See Table 19-1 for additional details.)

4.3.4 Public Key Authentication

Another approach to solving the problem of passwords is to do away with them entirely and use an alternative authentication system. One popular authentication system that has been used in recent years is based on public key cryptography (described in Chapter 7).

In a public key authentication system, each user generates a pair of "keys": two long numbers with the interesting property that a message encoded with one of the keys can be decoded only using the other key. The user keeps one of the keys private on his local computer (and often protects its privacy by encrypting the key itself with a password), and provides the other, public key to the remote server. When the user wants to log into the server, the server selects a random number, encodes it with the user's public key, and sends it to the user. By decrypting the random number using his private key and returning it to the server (possibly re-encrypted with the server's public key), the user proves that he is in possession of the private key and is therefore authentic. In a similar fashion, the server can authenticate itself to the user, so that the user is sure that he's logging into the correct machine.
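The challenge-and-response exchange can be illustrated with a deliberately tiny RSA key pair in Python. Everything here is a toy: the primes are far too small to be secure, no padding is used, and the function names are our own. Real systems rely on a vetted cryptographic library and much longer keys.

```python
import secrets

# Toy key pair. ASSUMPTION: textbook RSA with tiny primes, for illustration.
P, Q = 61, 53
N = P * Q                              # modulus shared by both keys
E = 17                                 # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (Python 3.8+)

PUBLIC_KEY = (N, E)    # given to the remote server
PRIVATE_KEY = (N, D)   # kept on the user's own computer


def server_make_challenge(public_key):
    """Pick a random number and encode it with the user's public key."""
    n, e = public_key
    number = secrets.randbelow(n - 2) + 2
    return number, pow(number, e, n)


def user_answer(private_key, encoded):
    """Decode the challenge; only the private key holder can do this."""
    n, d = private_key
    return pow(encoded, d, n)


number, challenge = server_make_challenge(PUBLIC_KEY)
response = user_answer(PRIVATE_KEY, challenge)
print("authenticated" if response == number else "rejected")
```

The server never learns the private key; it checks only that the returned number matches the one it chose.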

Public key authentication systems have two fundamental problems. The first problem is the management of private keys. Private keys must be kept secure at all costs. Typically, private keys are encrypted with a passphrase to protect them, but all of the caveats about choosing a good password (and not transmitting it where others can eavesdrop) apply.

The second problem is the certification of public keys. If an attacker can substitute his public key for someone else's (or for that of a server to which you wish to connect) all your communication will be visible to the attacker. One solution to this problem is to use a secure channel to exchange public keys. With the Secure Shell (ssh), the public key is merely copied to the remote system (after logging in with a password or another non-public key method) and put into a file in the user's home directory called ~/.ssh/authorized_keys.

A more sophisticated technique for distributing public keys involves the creation of a public key infrastructure (PKI). A group of users and system administrators could all certify their keys to one another in person, or each could have his own key certified by a common person or organization that everyone trusts to verify the identities associated with the keys. SSL, the Secure Sockets Layer, provides transparent support for PKI.