Passwords (Building Internet Firewalls, 2nd Edition)

21.2. Passwords

Since many authentication mechanisms depend on passwords, it's important to understand how passwords can be compromised. There are three ways of getting around a requirement for a fixed password:

Find a way to get the system to tell you the password.
Find a way to authenticate without knowing the password.
Guess what the password is.

Systems like Telnet, where the password is passed across the network in cleartext, make it easy for attackers to get the actual password. So do systems where the password is stored on the computer in some way where it can be decrypted (this is common in third-party applications for Windows NT). Although this sort of problem is regrettably common, it's a pretty obvious design flaw, and systems designed with any regard for security usually manage to avoid it. Authentication that involves a cleartext or decryptable password is unacceptably poor for most purposes.

Systems that pass a hashed password around, but use the same hashed password each time, give attackers a way to authenticate without knowing the password. They can simply grab the hashed string and use it. This is not as convenient as knowing the actual password, since you can't use normal client programs and you usually don't get more than one kind of access, but it's still quite sufficient to do damage. Authentication that uses reusable tokens is also unacceptably poor for most purposes.

These days, most systems avoid these pitfalls, leaving attackers to guess what the password is. The only way to actually prevent password guessing is to use true one-time passwords where the password is different every time. However, it is possible to make it much more difficult by making the passwords relatively unpredictable and by making it hard to check the accuracy of guesses.

The first step in making passwords unpredictable is to make certain that there are a large number of possible passwords. This means allowing a wide variety of different characters and as many characters as possible. The mathematics are not intuitive to many people, who have a habit of thinking that if you double the size of the character set or the length of the password, you will double the number of possible passwords. Actually, you do much better than that. Suppose that you have a one-character password, and it can be any lowercase letter. In that case, there are 26 possible passwords. Adding a second character gets you 26 squared, or 676. If you allow both upper- and lowercase, that doubles the one-character case (there are now 52 possibilities), but it brings two characters up to 52 squared, or 2,704.

A standard Unix password is 8 characters long. The size of the character set it uses is a matter of some dispute; theoretically, any ASCII character, including control characters, is acceptable, giving roughly 127 characters. Most of them are difficult to type at best and frequently impossible to get typed in rather than interpreted. In practice, it's more reasonable to assume that the possible characters are limited to the ones actually on a keyboard, which leaves 84 characters. This gives roughly 2.47 * 10¹⁵ possibilities (247 trillion in the United States, 247 billion elsewhere).[132] The algorithm used to produce the strings stored in Unix password files adds two additional "salt" characters. The "salt" makes it more difficult (by a factor of 4,096) to create a dictionary or index that could be used to look up the results of the algorithm to find the password.

[132]The United States and other English-speaking countries use different names for large numbers. Since English-speakers in general are not able to convert the names easily and may be unaware of the problem, we provide both.

Standard Windows NT passwords are 14 characters long, again with a possible character set of about 84 characters. In theory, that would allow for 8.70 * 10²⁶ possible passwords (87 septillion in the United States, and 87 quadrillion elsewhere). This is much larger than the number of possible Unix passwords, but Windows NT passwords are often stored and transmitted using an old format known as LanMan, which is also the name of many other things. The LanMan format greatly decreases the number of possible passwords. First, LanMan is case-insensitive; it uppercases all letters, so it uses only 68 characters. With only 68 characters, this would theoretically give 4.52 * 10²⁵ possibilities for a 14-character password. Second, it splits the password into two 7-character halves and hashes them independently. This is a terrible error because it turns the 14-character LanMan password into two independent 7-character passwords. Each 7-character password only has 6.72 * 10¹² possibilities. This means that it is much easier to create a lookup dictionary for 14-character LanMan format passwords than it is for an 8-character Unix password.

any people figure that splitting the password into two halves means that there are half as many possibilities to search, but the truth is much worse than that. In general, when a 14-character password is hashed correctly it has a theoretical 6 trillion times (6 billion times, outside the United States) as many possibilities as a 7-character password and is therefore this many times more difficult to create a lookup dictionary or search. Because of the split, a 14-character password in LanMan format has only twice as many possibilities as a 7-character password, and this is a very large reduction in security.

Windows NT also has a newer hash format that hashes all 14 characters at the same time, forcing attackers to search the entire space of possible passwords. However, if a Windows NT machine is willing to support clients that use LanMan hashes for authentication, it will store passwords in the older format unless they contain characters that are illegal in LanMan passwords.

The reason that storage formats are important is that attackers have to check the validity of their guesses. It's impractical to try large numbers of possibilities by trying to log into machines. It is relatively slow to start with, and operating systems intentionally slow down responses when multiple failed attempts are made to log in to the same account. Furthermore, even the least attentive system administrator is apt to notice something wrong after the first million or so attempts. Most methods of trying to authenticate take at least a second, and at that rate, it's over 11 days to try a measly million possibilities. On the other hand, an attacker who has the password in its stored form can generate the stored form for each guess and compare them, without the delay or the risk of interacting with the system under attack. Depending on the particular stored form in use and the technology that the attacker is bringing to bear, it may be as slow as hundreds of attempts per second or as fast as millions of attempts per second. Ideally, you would like to prevent people from getting the stored form at all, but you would also like the stored form to be as difficult as possible to compare things against.

It would be possible to build a specialized Unix password cracker similar to the machine "Deep Crack",[133] which could start with a Unix password hash and find the matching password in less than one day by brute force (meaning that every possible password would be tried). It is also possible to do it with a general-purpose computer, but it would take longer. LanMan hashes are much easier to crack; modern, general-purpose computers can brute-force them in a week. Windows NT password hashes, on the other hand, are significantly more difficult than Unix password hashes.

[133]"Deep Crack" is a machine for searching for DES keys; it was built by the Electronic Frontier Foundation and described in the book Cracking DES: Secrets of Encryption Research, Wiretap Politics, and Chip Design, by the Electronic Frontier Foundation (O'Reilly & Associates, 1998).

Unfortunately, all this talk about brute force password cracking, password lengths, and encryption formats misses the important problem with most password systems; people pick really bad passwords. Regardless of the trillions of possible passwords that people could be using, at most sites anywhere from 30 to 70 percent of the passwords can be guessed using only thousands of possibilities (common words and women's names, in general). Few people will voluntarily use special characters or passwords of any significant length. Most of them, left to their own devices will use something they find highly memorable (supposing that they don't just use "password", of course), probably their own name or the name of somebody they love.

Automatically generated passwords may be better, or they may be worse. Systems that generate passwords for users also tend to work in a relatively small range (they have to, to get things that people will remember instead of writing down), and an attacker who knows the algorithm used will be able to search only the possible passwords the generator could come up with. Packages that provide an administrative account with a randomly generated password often do even worse, choosing something that's dependent on the machine they're installed on, or a hexadecimal representation of some number (with only 16 possible letters, it doesn't matter how impeccably random your data is, there still aren't very many different passwords to search).