Key Components of Cryptographic Systems (Building Internet Firewalls, 2nd Edition)

C.2. Key Components of Cryptographic Systems

Cryptography is used for multiple things, and cryptographic systems are built out of multiple parts. Encryption is the best-known and most obvious technique, but in order to understand how cryptography is used, you need to understand several other techniques as well, including those used for cryptographic hashing, integrity protection, and the generation of random numbers.

C.2.1. Encryption

Encryption is the process of reversibly hiding information. When you encrypt something, you take a piece of data (called the plaintext) and apply a process that produces another piece of data (called the ciphertext). There must also be a process for turning the ciphertext back into plaintext.

It's not practical to think up a new algorithm every time you want to encrypt something. Useful encryption algorithms therefore use an extra piece of data called a key. In order to decrypt data, you need not only the decryption algorithm, but also the key; this means that you can use the same algorithm to encrypt different things and share them with different people. If an encryption algorithm doesn't use a key, then as soon as you know the decryption algorithm, you can decrypt all the things encrypted with the algorithm.

Cryptographic wisdom considers an encryption process secure when the only way to recover a plaintext message from ciphertext is to know or discover the key. Furthermore, this rule should still be true even if every other detail about the encryption process is known. In fact, cryptographers become very concerned about the security of an algorithm when more than a key must be kept secret (for instance, when the algorithm or an important part of it is not disclosed).

There are some things that encryption cannot do. When you encrypt something, you conceal its content, which doesn't necessarily conceal everything useful about it. Attackers may be able to gain important information by looking at the lengths of messages, or the times they're sent, or the senders and receivers; none of that will be changed by encryption.

Furthermore, encrypting something does not protect it from being changed. Somebody who has access to the ciphertext can change it. They won't be able to predict exactly what plaintext you'll get, but it will almost certainly be different. This sort of tampering is normally easy for a human to detect in text but not so if the plaintext was binary data. Since computers are unable to comprehend text, they cannot detect tampering by inspection as humans can. In order to protect a message from modification, and in a way that a computer can detect, you must use integrity protection, which we will discuss. Although some integrity protection systems use encryption, encryption by itself does not provide integrity protection.

C.2.1.1. Kinds of encryption algorithms

There are two main families of encryption algorithms: symmetric and public key. A symmetric algorithm uses one key for both encryption and decryption. A public key algorithm uses two matching keys; when you encrypt with one of the keys, you use the other one to decrypt.

The primary characteristic of an encryption algorithm is its strength, which is the amount of effort it takes for somebody to turn the ciphertext into plaintext without having the key. There's no absolute measure of the strength of an encryption algorithm. It's possible to say that some algorithms are very weak (for instance, replacing every letter with the letter three characters later in the alphabet, so that "b" becomes "e" and "c" becomes "f", is definitely a very weak algorithm), and it's possible to say that some algorithms have no known weaknesses. When we discuss particular algorithms, we will mention algorithms that are known to be attackable, or weak. A wide range of algorithms are available that don't have known weaknesses, and in most circumstances, you can and should choose one of them.

In the real world, strength is not the only important characteristic of an encryption algorithm. Some algorithms are much faster to execute than others, for instance, and the speed may depend on the kind of processor you have available. Some algorithms require more memory than others. Some algorithms have legal restrictions on their use, which are different in different countries. Some algorithms are patented and are expensive or difficult to license.

Public key and symmetric algorithms differ on all of these fronts (most notably, symmetric algorithms are much faster than public key algorithms), but the most important difference between them comes from the extra capabilities of public key algorithms. With a symmetric key algorithm, anybody who can encrypt a message can also decrypt the message, and it is therefore important to keep the one key secure so that it is known only by the parties who are trying to communicate. With a public key algorithm, you can make one key public while keeping the other one completely private, so that only you know it.

Having a public key and a private key opens up a number of new uses of cryptography. For instance, if you encrypt something with your private key, you don't keep the data secret (anybody who has the public key can read it), but you do prove that you encrypted it (because only you have the private key). This is called signing. Digital signatures are discussed in detail later in this appendix. Anybody who wants to send you something secret can encrypt it with your public key, and be sure that only you can read it. In fact, some ways of using public key technology don't actually allow you to conceal data; you can only use them for authentication; some public key algorithms are not encryption algorithms.

Just as there are differences between public key algorithms and symmetric algorithms, there are differences among symmetric algorithms. Some work on fixed-sized chunks of data and are called block ciphers. Others, called stream ciphers, work on an arbitrary sequence of bits or bytes. There are various ways, called modes, to extend a block cipher so that it can encrypt more than just a single block of data. Stream ciphers are naturally designed to handle an arbitrarily sized stream of data.

The encryption of variable amounts of data is usually called bulk encryption. In this case, anything bigger than about 64 bits is considered "bulk". Almost all bulk encryption is done with symmetric algorithms because of the speed difference between symmetric and public key encryption. It is frequently desirable to use bulk encryption in situations where the communicating parties don't already have a common symmetric encryption key. An extremely common way to solve this problem is to combine public key cryptography with symmetric key cryptography. For instance, the PGP package, commonly used for bulk encryption of electronic mail, uses a symmetric key algorithm to encrypt the body of a mail message, and then uses public key encryption to encrypt the symmetric key and sends the encrypted symmetric key with the message.

C.2.1.2. Encryption algorithms and key length

One of the important ways algorithms differ is in the keys they use. As we've discussed before, in order to decrypt something that was encrypted with a strong algorithm, you have to know what the key is. Therefore, one way that people attack encryption algorithms is to try to figure out the key. There are obviously many ways to go about this, most of which are not under the control of the encryption algorithm (for instance, you can go around looking for places where the key is recorded and not adequately secured).

However, some ways of figuring out keys are based on how the algorithm works. With a symmetric key algorithm, the easiest way to figure out a key is to try all the possible keys until you find the correct one (knowing you have found the correct key is a separate, and tricky, problem). The more possible keys there are, the harder it is. With a public key algorithm, you can try to calculate a private key based on the public key using existing mathematical techniques if you know that you have a fast enough computer with enough memory, or you can invent a new mathematical theory for solving the equations used in the public key algorithm. Either of these methodologies will be easier than trying all possible private keys.

For any given algorithm, the longer the key is, the harder it is to find it out. On the other hand, you can't directly compare key lengths between different kinds of algorithms because the ways that they can be attacked are different. A 128-bit key is a pretty strong key for a symmetric algorithm but a pretty weak one for most public key algorithms. This is because 128 bits is a lot of keys to search but not large enough to prevent mathematical attacks on public key algorithms. Since different public key algorithms use different relationships between the private key and the public key, key lengths can't always be compared even between different public key algorithms.

If you know that trying all possible keys is the only way to find a key, you can be reasonably confident about your security; the speed of light imposes a theoretical limit on how fast computations can be performed, so if the key is big enough, nobody can be sure to find it within the estimated lifetime of the universe. (It's always possible to get lucky when you're trying keys, but you're much more likely to win a lottery.) The key length required to be big enough is surprisingly short; it's under 128 bits.

The situation with public keys is not as simple. Currently known techniques for finding private keys are difficult on today's computers, but it has yet to be proven that there are no faster techniques.

It's important to distinguish between key length and block length when discussing block ciphers. These are often but not always the same, and the block length is the length most often given. When somebody refers to a "64-bit block cipher", that is the length of the block, and not necessarily the length of the key. This can be horribly confusing, since for other ciphers the key length is the only thing normally specified in bits like that. Pay close attention to which length is being specified; a "64-bit public key algorithm" has a 64-bit key, while a "64-bit block cipher" might have any key length at all.

C.2.2. Cryptographic Hashes, Checksums, and Message Digests

A checksum is a number calculated on a given set of data that is designed to detect changes or errors in communication of that data. This is useful for a communications channel; if a sender calculates a checksum as data is being sent and a receiver does the same as it is being received, the two can simply compare checksums to see if the data arrived intact or if an error occurred during transmission. Another use might be to store the checksum and then repeat the calculation at a later time. If the two checksums are different, then something changed in the data.

A checksum is usually just several bytes long and will take up much less space than the original data. While this makes them easier to store, it also means that there must be situations where a checksum will give the same answer for two different pieces of data. This is called a collision, and checksum algorithms are designed to make collisions unlikely to occur for the differences they are designed to detect. Checksum algorithms for communications are designed to detect random bursts of errors or chunks of missing data because they are the kinds of differences that you often get on a telephone line or a radio transmission (to humans listening, these errors sound like clicks or pops).

What if the error was not random and a deliberate change was intended? If so, is it possible to make a deliberate change and keep the checksum the same? For many checksums, this is certainly possible because the checksums are not designed to make this difficult. There are ways to design checksum algorithms so that it is impossibly difficult to make a deliberate change and still have the checksum match. Algorithms designed this way are called cryptographic hash functions, cryptographic checksums, or message digest functions.

Note that the terminology used for the different techniques and uses of cryptographic hashes is confusing and overlapping. As a result, terms are not used very consistently; about the best you can say is that any term involving integrity, digest, or the acronym MAC (Message Authentication Code) probably refers to some process that uses some kind of hash. If you care about the details, investigate them, rather than trusting that terms are used consistently from one document to another.

The term hash comes from another situation where it is useful to have a short fixed-length string that you can generate consistently from a larger string and that has large changes in the output as a result of small changes to the input. Hash algorithms and checksum algorithms are not always used for the same purposes, but if you extend either concept for cryptographic security, you reach the following set of conditions for a cryptographic hash:

It must be practically impossible to deliberately create data that has a hash value that matches another. This can be achieved by designing the algorithm so that it cannot be reversed and run backwards (you can't start with a hash value and use a method to create data that computes to the same hash).
The hash must be of a large enough size so that you cannot create a list of files, one for each value the hash can have, and match a given hash that way. In practical terms, this means that a useful hash should be at least 128 bits and preferably 160 bits or more in size.
If you change something only very slightly in the data, the hash will change by a large amount. Changing one bit in the data should change about half the bits in the hash. This makes it hard for people to slowly adjust individual bits in the data to achieve a desired hash value.

There are two principal uses for cryptographic hashes. First, they are used to detect changes in data. If you have a cryptographic hash for a piece of data and the hash is kept secure, you can be sure that if you recalculate the hash and it is the same, then the data hasn't been modified. This is the basis for digital signatures, which are discussed later in this appendix.

Cryptographic hashes are also frequently used in authentication systems. In most cases, passwords are not encrypted, but hashed. If you use encryption on a password, it is possible to decrypt it and get back the password, which an attacker could then use somewhere else. This makes storing and sending encrypted passwords dangerous. Instead, secure systems store a hash of the password and, when a user wants to authenticate to the system, compute a hash of the password provided by the user. If the hashes match, the user must know the correct password.

This technique does not help if a user can directly provide the hash, instead of the password (for instance, if the user is logging in over a network and what the system actually receives is the hash). If the hash is sent around, it simply becomes another fixed and reusable password. (This is discussed further in Chapter 21, "Authentication and Auditing Services".) For this reason, good network authentication systems use a random piece of data that changes on every transaction, often called a nonce. Adding this changing value into the information that gets hashed and exchanged prevents eavesdroppers from reading and reusing the hashes.

It is possible to use a block cipher in a special way to calculate a cryptographic hash. When a cipher is used like this, the resulting checksum may be called a Message Authentication Code (MAC), a Message Integrity Code (MIC), or a Message Digest Check (MDC).

The use of block ciphers for calculating cryptographic hashes is often seen in older cryptographic protocols. In particular, they frequently use the Data Encryption Standard (DES) to produce 64-bit values. Most modern cryptographic protocols explicitly use cryptographic hash algorithms. One reason is that cryptographic hash algorithms tend to produce results that are 128 to 160 bits in size. This significantly reduces the chances of being able to come up with a different piece of data that produces the same hash value.

C.2.3. Integrity Protection

One of the most important uses of cryptographic hashes is to provide integrity protection. A cryptographic hash can be used to verify that no changes to data have been made, but this won't work if you include the hash value with the data. All someone wishing to change the data would have to do is replace the hash value with a new value computed from the changed data. Some way is needed to protect the cryptographic hash from being changed to the desired new value.

One way is to encrypt the cryptographic hash using a public key algorithm; this is essentially a digital signature, as we discuss later. Because of the slowness of public key encryption, it is not always a usable solution. Alternatives have been developed that are based on the calculation of cryptographic hashes, which include a secret key as part of the calculation. Without that key, someone wishing to change the data cannot recalculate a new and valid checksum. Many recent Internet protocols use a method called HMAC [188] for combining a key with a cryptographic hash. The HMAC technique can be used with any cryptographic hash function that produces at least 128 bits of output. HMAC is described in RFC 2104.

[188]HMAC, although it is capitalized like an acronym, does not have an official expansion in the standard. In the paper originally describing the algorithm, it stood for "Hash-based Message Authentication Code".

C.2.4. Random Numbers

Many parts of cryptography depend on being able to generate random numbers, or at least numbers that can't be predicted. For instance, if you need to send authentication information across the network in a form that's not reusable, you will want an unpredictable changing value to add in with the information. If an attacker can figure out what that value is, the attacker may be able to use the value to authenticate or to figure out the authentication data for later use.

Computers are very bad at being random because they are designed to be able to repeatedly calculate the same answer if given the same data to work with. If you have an algorithm for giving you random numbers, you will get the same list of random numbers whenever you run it. This is why random number generators on computers are called pseudo-random. Knowing the algorithm and the starting information, and having a faster computer, would allow you to predict what the numbers were going to be. Truly random numbers cannot be predicted.

So where can a source of acceptable random numbers come from? It is easy if the computer has some hardware to generate random numbers; if not, then some random information can be obtained from peripheral devices attached to the computer. It has to be done very carefully because it is easy to overestimate how random some sources of information are. Sampling a fast running clock is not really a good way to get random numbers because typically only a small number of bits will change each time the clock is sampled.[189]

[189]RFC 1750 is an excellent source of information about random numbers for use in security applications.

Several free Unix operating systems, including Linux, have special code as part of the low-level device drivers, which continuously collects random input events from the keyboard and other hardware devices and allows applications to obtain a source of random data.