Chapter 9. Message Digests

In this chapter, we're going to look at the API that implements the ability to create and verify message digests. The ability to create a message digest is one of the standard engines provided by the Sun default security provider. You can therefore reasonably expect every Java implementation to create message digests.

Message digests are the simplest of the standard engines that compose the security provider architecture, so they provide a good starting point in our examination of those engines. In addition, message digests provide the first link in creating and verifying a digital signature--the most important goal of the provider architecture. However, message digests are useful entities in their own right, since a message digest can verify that data has not been tampered with--up to a point. As we'll see, there are certain limitations on the security of a message digest that is transmitted along with the data it represents.

Message digests are implemented through a single class:

public abstract class MessageDigest extends MessageDigestSpi: Implement operations to create and verify a message digest.

In Java 1.1, there is no MessageDigestSpi class, and the MessageDigest class simply extends Object. That difference is important only if you want to implement your own message digest class, which we'll do later in the chapter.

Like all engines in the Java security package, the MessageDigest class (java.security.MessageDigest) is an abstract class; it defines an interface that all message digests must have, but the implementation details of a particular message digest class are hidden in the private classes that accompany a security provider. This allows a developer to use the message digest class without knowing the details of a message digest implementation by operating on the public methods of the message digest class, and it allows providers of a security package to implement their own message digests by implementing the abstract methods of the class. We'll examine the message digest class from the perspectives of both developer and implementor in this chapter.

9.1. Using the Message Digest Class

For a developer who wants to operate on a message digest, the first step is to obtain an instance of the message digest class. Since the message digest class is abstract, this cannot be done directly; instead, the developer must use one of these methods:

public static MessageDigest getInstance(String algorithm)

public static MessageDigest getInstance(String algorithm, String provider)

Return an instance of the message digest class that implements the given algorithm. In the first case, the security providers are searched in order following the process we outlined in Chapter 8, "Security Providers"; otherwise, only the given provider is searched. Valid names for the default Sun security provider are SHA, SHA-1, and MD5. If no provider can be found that implements the given algorithm, a NoSuchAlgorithmException is thrown. If the named provider cannot be found, a NoSuchProviderException is thrown.
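As a minimal sketch of how the two lookups and their exceptions fit together, consider the following. The class name DigestLookup and its find() helpers are hypothetical names introduced only for illustration, and returning null on failure is just one way to report a missing algorithm or provider:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.NoSuchProviderException;

// Hypothetical helper class; not part of the Java security API.
class DigestLookup {
    // Search all installed providers in order; null if none implements the algorithm.
    static MessageDigest find(String algorithm) {
        try {
            return MessageDigest.getInstance(algorithm);
        } catch (NoSuchAlgorithmException e) {
            return null;
        }
    }

    // Search only the named provider; null if the provider or the algorithm is missing.
    static MessageDigest find(String algorithm, String provider) {
        try {
            return MessageDigest.getInstance(algorithm, provider);
        } catch (NoSuchAlgorithmException | NoSuchProviderException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println("SHA-1 found: " + (find("SHA-1") != null));
        System.out.println("Bogus algorithm found: " + (find("NoSuchDigest") != null));
        System.out.println("Bogus provider found: " + (find("SHA-1", "NoSuchProvider") != null));
    }
}
```

In a real program you would more likely let the exceptions propagate or report them to the user; swallowing them here simply keeps the sketch short.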

Once a message digest object has been obtained, the developer can operate on that object with these methods:

public void update(byte input)

public void update(byte[] input)

public void update(byte[] input, int offset, int length)

Add the specified data to the digest. The first of these methods adds a single byte to the data, the second adds the entire array of bytes, and the third adds only the specified subset of the array of data.

These methods may be called in any order and any number of times to add the desired data to the digest. Consecutive calls to these methods append data to the internal accumulation of data over which the digest will be calculated.
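The three update() methods can be mixed freely, as this short sketch suggests. It feeds the same bytes to a digest once as a whole array and once as a mixture of single bytes and an array subset, then compares the results. The class name UpdateVariants and its methods are hypothetical, and the choice of SHA-1 is an assumption made for the example:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical demonstration class; not part of the Java security API.
class UpdateVariants {
    static MessageDigest sha1() {
        try {
            return MessageDigest.getInstance("SHA-1");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    // True if feeding the data in pieces yields the same digest as feeding it whole.
    static boolean sameDigest(byte[] data) {
        MessageDigest md = sha1();

        // The entire array in one call.
        md.update(data);
        byte[] whole = md.digest();   // digest() also resets md for reuse

        // First half one byte at a time, second half as a subset of the array.
        int half = data.length / 2;
        for (int i = 0; i < half; i++) {
            md.update(data[i]);
        }
        md.update(data, half, data.length - half);
        byte[] pieces = md.digest();

        return MessageDigest.isEqual(whole, pieces);
    }

    public static void main(String[] args) {
        byte[] data = "Sleep no more".getBytes(StandardCharsets.UTF_8);
        System.out.println("Digests match: " + sameDigest(data));
    }
}
```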

public byte[] digest()

public byte[] digest(byte[] input)

Compute the message digest on the accumulated data (optionally adding the specified data before performing the computation). The resulting digest is returned as a byte array. Once a digest has been calculated, the internal state of the algorithm is reset, so that the object may be reused at this point to create a new message digest.
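To illustrate the reset behavior, the sketch below reuses one digest object for two unrelated messages and checks that the second result matches what a brand-new object would produce. The class and method names are hypothetical, and SHA-1 and UTF-8 are assumptions made for the example:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical demonstration class; not part of the Java security API.
class DigestReuse {
    static MessageDigest sha1() {
        try {
            return MessageDigest.getInstance("SHA-1");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    // True if a reused digest object gives the same answer for b as a fresh one.
    static boolean reusedMatchesFresh(byte[] a, byte[] b) {
        MessageDigest md = sha1();

        md.update(a);
        md.digest();                      // digest of a; md is now reset

        // digest(input) adds b and then computes the digest in one call.
        byte[] reused = md.digest(b);

        byte[] fresh = sha1().digest(b);  // same data through a new object
        return MessageDigest.isEqual(reused, fresh);
    }

    public static void main(String[] args) {
        byte[] a = "first message".getBytes(StandardCharsets.UTF_8);
        byte[] b = "second message".getBytes(StandardCharsets.UTF_8);
        System.out.println("Reused object agrees: " + reusedMatchesFresh(a, b));
    }
}
```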

public int digest(byte[] output, int offset, int len)

Compute the message digest on the accumulated data and place the answer into the provided array, starting at the given offset and copying at most len bytes. Most implementations do not return a partial digest, so if the amount of space in the buffer (taking into account its offset) is not sufficient to store the digest, a DigestException is thrown. This method returns the size of the digest.
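The following sketch shows one way to call this form of digest(), using getDigestLength() to size the buffer. The class DigestIntoBuffer and its digestInto() method are hypothetical names, and reporting an undersized buffer as -1 is a choice made for the example rather than anything the API prescribes. The behavior for a too-small buffer is what we would expect from the default provider; other providers may differ:

```java
import java.security.DigestException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical demonstration class; not part of the Java security API.
class DigestIntoBuffer {
    // Digest data into out, starting at offset. Returns the number of bytes
    // written, or -1 if the space after the offset cannot hold the digest.
    static int digestInto(byte[] data, byte[] out, int offset) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            md.update(data);
            // Offer all the space that remains after the offset.
            return md.digest(out, offset, out.length - offset);
        } catch (DigestException e) {
            return -1;   // not enough room for the whole digest
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        int size = MessageDigest.getInstance("SHA-1").getDigestLength();
        byte[] data = {1, 2, 3};
        byte[] roomy = new byte[4 + size];     // 4 bytes of header space, then the digest
        byte[] cramped = new byte[size - 1];   // one byte too short
        System.out.println("Roomy buffer: " + digestInto(data, roomy, 4));
        System.out.println("Cramped buffer: " + digestInto(data, cramped, 0));
    }
}
```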

public static boolean isEqual(byte digestA[], byte digestB[])

Compare two digests for equality. Two digests are considered equal only if each byte in the first digest is exactly equal to each byte in the second digest and the digests are the same length.

public void reset()

Reset the digest object by discarding all accumulated data and resetting the algorithm that is used to implement the digest. This is equivalent to creating a new instance of the object. In addition, this method throws away any information that the toString() method would have printed (see below).

public final String getAlgorithm()

Return the string representing the algorithm name (e.g., SHA).

public String toString()

A string representation of a digest by default contains the name of the class implementing the digest, the words "Message Digest," and the bytes that were returned by a previous call to the digest() method. If the digest() method has not been called, or if the reset() method has been called, then "<incomplete>" is printed instead of the digest. An example string looks like:

sun.security.provider.SHA Message Digest \
		<0a808982fee54fd74a86aae72eff7991328ff32b>

public Object clone() throws CloneNotSupportedException

Return a clone of the object. Message digest implementations need to implement the clone() method because some internal operations on the digest object require a call to the digest() method, which resets the digest. These operations are typically done on a clone of the object so that the state of the original object is not changed.
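The same technique is available to developers who want an intermediate digest without disturbing the running calculation. Here is a minimal sketch, assuming the underlying implementation supports cloning (the default provider's digests do, but a provider is free to throw CloneNotSupportedException). The class RunningDigest and its peek() method are hypothetical names:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical demonstration class; not part of the Java security API.
class RunningDigest {
    static MessageDigest sha1() {
        try {
            return MessageDigest.getInstance("SHA-1");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    // Digest of everything fed to md so far; md itself is left untouched.
    static byte[] peek(MessageDigest md) {
        try {
            MessageDigest copy = (MessageDigest) md.clone();
            return copy.digest();   // resets only the clone
        } catch (CloneNotSupportedException e) {
            throw new IllegalStateException("This digest cannot be cloned", e);
        }
    }

    // True if the intermediate and final digests both match independent calculations.
    static boolean demo() {
        byte[] part1 = "Hello, ".getBytes(StandardCharsets.UTF_8);
        byte[] part2 = "world".getBytes(StandardCharsets.UTF_8);
        byte[] all = "Hello, world".getBytes(StandardCharsets.UTF_8);

        MessageDigest md = sha1();
        md.update(part1);
        byte[] partial = peek(md);   // digest of "Hello, " only
        md.update(part2);
        byte[] full = md.digest();   // digest of "Hello, world"

        return MessageDigest.isEqual(partial, sha1().digest(part1))
            && MessageDigest.isEqual(full, sha1().digest(all));
    }

    public static void main(String[] args) {
        System.out.println("Clone left the original intact: " + demo());
    }
}
```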

public final int getDigestLength()

Return the length of the array of bytes that is returned from the digest() method. This value is usually constant (i.e., it does not depend on the amount of data that has been sent through the update() method).

Let's see an example of how all of this works. As a simple case, let's say that we want to save a simple string to a file, but we're worried that the file might be corrupted when we read the string back in. Hence, in addition to saving the string, we must save a message digest. We do this by saving the serialized string object followed by the serialized array of bytes that constitute the message digest.

In order to save the pieces of data, we use this code:

import java.io.FileOutputStream;
import java.io.ObjectOutputStream;
import java.security.MessageDigest;

public class Send {
	public static void main(String args[]) {
		try {
			FileOutputStream fos = new FileOutputStream("test");
			MessageDigest md = MessageDigest.getInstance("SHA");
			ObjectOutputStream oos = new ObjectOutputStream(fos);
			String data = "This have I thought good to deliver thee, "+
				"that thou mightst not lose the dues of rejoicing " +
				"by being ignorant of what greatness is promised thee.";
			byte buf[] = data.getBytes();
			// Feed the bytes of the string to the digest
			md.update(buf);
			// Save the string, followed by its digest
			oos.writeObject(data);
			oos.writeObject(md.digest());
			// Flush the stream and release the file
			oos.close();
		} catch (Exception e) {
			System.out.println(e);
		}
	}
}

That's all there is to creating a digest of some data. The call to the getInstance() method finds a message digest object that implements the SHA message digest algorithm. After creating our data--which in this case is a simple string--we pass that data to the update() method of the message digest. In practice, this code could be slightly more complicated, since all the data might not be available at once. As far as the message digest object is concerned, though, that situation would just require multiple calls to the update() method instead of a single call (it can also be handled with digest streams, which we'll examine next). Once we've loaded all the data into the object, it is a simple matter to create the digest itself (with the digest() method) and then save our data objects to the file.

Similarly, to retrieve this data we need only read the object back in and verify the message digest. In order to verify the message digest, we must recompute the digest over the data we received and test to make sure the digest is equivalent to the original digest:

import java.io.FileInputStream;
import java.io.ObjectInputStream;
import java.security.MessageDigest;

public class Receive {
	public static void main(String args[]) {
		try {
			FileInputStream fis = new FileInputStream("test");
			ObjectInputStream ois = new ObjectInputStream(fis);
			// The first object in the file is the data itself
			Object o = ois.readObject();
			if (!(o instanceof String)) {
				System.out.println("Unexpected data in file");
				System.exit(-1);
			}
			String data = (String) o;
			System.out.println("Got message " + data);
			// The second object is the digest saved by the sender
			o = ois.readObject();
			if (!(o instanceof byte[])) {
				System.out.println("Unexpected data in file");
				System.exit(-1);
			}
			byte origDigest[] = (byte []) o;
			ois.close();
			// Recompute the digest and compare it to the saved one
			MessageDigest md = MessageDigest.getInstance("SHA");
			md.update(data.getBytes());
			if (MessageDigest.isEqual(md.digest(), origDigest))
				System.out.println("Message is valid");
			else System.out.println("Message was corrupted");
		} catch (Exception e) {
			System.out.println(e);
		}
	}
}

Once again, if the data was not available all at once, we would need to make multiple calls to the update() method as the data arrived. We do not, however, need to make sure that calls to the update() methods between the Send and Receive classes match in any sense; that is, if we called the update() method four times in the Send class, we do not need to call the update() method four times (with the same data) in the Receive class--we can call it once, five times, or whatever. The calculation of the digest is unaffected by how the data was placed into the message digest object--as long as the order of the bytes presented to the various calls to the update() methods is the same.
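To make this concrete, here is a sketch that digests the same data three ways: in a single call, through a stream read 7 bytes at a time, and through a stream read with a 4096-byte buffer. The class ChunkedDigest and its methods are hypothetical names, and the ByteArrayInputStream merely stands in for whatever source would deliver the data in pieces:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical demonstration class; not part of the Java security API.
class ChunkedDigest {
    static MessageDigest sha1() {
        try {
            return MessageDigest.getInstance("SHA-1");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    // Read the stream to its end, updating the digest with each chunk as it arrives.
    static byte[] digestStream(InputStream in, int bufferSize) {
        MessageDigest md = sha1();
        byte[] buf = new byte[bufferSize];
        try {
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);   // only the bytes actually read
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return md.digest();
    }

    // True if the digest is the same no matter how the data was split up.
    static boolean chunkSizeIsIrrelevant(byte[] data) {
        byte[] oneShot = sha1().digest(data);
        byte[] small = digestStream(new ByteArrayInputStream(data), 7);
        byte[] large = digestStream(new ByteArrayInputStream(data), 4096);
        return MessageDigest.isEqual(oneShot, small)
            && MessageDigest.isEqual(oneShot, large);
    }

    public static void main(String[] args) {
        byte[] data = ("This have I thought good to deliver thee, "
            + "that thou mightst not lose the dues of rejoicing")
            .getBytes(StandardCharsets.UTF_8);
        System.out.println("Chunk size is irrelevant: " + chunkSizeIsIrrelevant(data));
    }
}
```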

9.1.1. Secure Message Digests

As we stated in Chapter 7, "Introduction to Cryptography", the message digest by itself does not give us a very high level of security. We can tell whether the output file in this example has somehow been corrupted, because the text that we read in won't produce the same message digest that was saved with the file. But there's nothing to prevent someone from changing both the text and the digest stored in the file in such a way that the new digest reflects the altered text.

There are various ways in which a message digest can be made into a Message Authentication Code (MAC), but the Java security API does not provide any standard techniques for doing so. One popular way is to encrypt the message digest using the encryption engine (if one is available to you)--which, in fact, is really a variation of a digital signature.

If we are not able to encrypt the digest, all is not lost; we can also use a passphrase along with the message digest in order to calculate a secure message digest (or MAC). This requires that both the sender and receiver of the data have a shared passphrase that they have kept secret.

Using this passphrase, calculating a MAC requires that we:

Calculate the message digest of the secret passphrase concatenated with the data:

MessageDigest md = MessageDigest.getInstance("SHA");
String data = "This have I thought good to deliver thee, " +
				"that thou mightst not lose the dues of rejoicing " +
				"by being ignorant of what greatness is promised thee.";
String passphrase = "Sleep no more";
byte dataBytes[] = data.getBytes();
byte passBytes[] = passphrase.getBytes();
md.update(passBytes);
md.update(dataBytes);
byte digest1[] = md.digest();

Calculate the message digest of the secret passphrase concatenated with the just-calculated digest:

```
md.update(passBytes);
md.update(digest1);
byte mac[] = md.digest();
```

We can substitute this code into our original Send example, writing out the data string and the MAC to the file. Note that we can use the same message digest object to calculate both digests, since the object is reset after a call to the digest() method. Also note that the first digest we calculate is not saved to the file: we save only the data and the MAC. Of course, we must make similar changes to the Receive example; if the MACs are equal, the data was not modified in transit.
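Put together, the two steps might be packaged as a pair of helper methods, one for the sender and one for the receiver. This is only a sketch of the passphrase scheme described in this section: the class PassphraseMac and its mac() and verify() methods are hypothetical names, and the use of SHA-1 and UTF-8 encoding are assumptions made for the example:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class; not part of the Java security API.
class PassphraseMac {
    // Sender side: compute the MAC of the data under the shared passphrase.
    static byte[] mac(byte[] passBytes, byte[] dataBytes) {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("SHA-1");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
        // Step 1: digest of the passphrase concatenated with the data.
        md.update(passBytes);
        md.update(dataBytes);
        byte[] digest1 = md.digest();   // md is reset and can be reused

        // Step 2: digest of the passphrase concatenated with the first digest.
        md.update(passBytes);
        md.update(digest1);
        return md.digest();
    }

    // Receiver side: recompute the MAC and compare it with the one received.
    static boolean verify(byte[] passBytes, byte[] dataBytes, byte[] receivedMac) {
        return MessageDigest.isEqual(mac(passBytes, dataBytes), receivedMac);
    }

    public static void main(String[] args) {
        byte[] pass = "Sleep no more".getBytes(StandardCharsets.UTF_8);
        byte[] data = "that thou mightst not lose the dues of rejoicing"
            .getBytes(StandardCharsets.UTF_8);
        byte[] sent = mac(pass, data);

        byte[] wrongPass = "Sleep some more".getBytes(StandardCharsets.UTF_8);
        System.out.println("Same passphrase: " + verify(pass, data, sent));
        System.out.println("Wrong passphrase: " + verify(wrongPass, data, sent));
    }
}
```

Someone who alters the data without knowing the passphrase cannot produce a matching MAC, so verify() fails on the receiving side.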

As long as we use exactly the same data for the passphrase in both the transmitting and receiving class, the message digests (that is, the MACs) still compare as equal. That gives a certain level of security to the message digest, but it requires that the sender and the receiver agree on what data to use for the passphrase; the passphrase cannot be transmitted along with the text. In this case, the security of the message digest depends upon the security of the passphrase. Normally, of course, you would prompt for that passphrase rather than hardcoding it into the source as we've done above.
