Chapter 2. Java Language Security
The first components of the Java sandbox that we will examine are those components that are built into the Java language itself. These components primarily protect memory resources on the user's machine, although they have some benefit to the Java API as well. Hence, they are primarily concerned with guaranteeing the integrity of the memory of the machine that is hosting a program: in a nutshell, the security features within the Java language want to ensure that a program will be unable to discern or modify sensitive information that may reside in the memory of a user's machine. In terms of applets, these protections also mean that applets will be unable to determine information about each other; each applet is given, in essence, its own memory space in which to operate.
In this chapter, we'll look at the features of the Java language that provide this type of security. We'll also look at how these features are enforced, including a look at Java's bytecode verifier. With a few exceptions, the information in this chapter is largely informational; because the features we are going to discuss are immutable within the Java language, there are fewer programming considerations than we'll find in later chapters. However, the information we'll present here is crucial in understanding the entire Java security story; it is very helpful in ensuring that your Java environment is secure and in assessing the security risks that Java deployment might pose. The security of the Java environment is dependent on the security of each of its pieces, and the Java language forms the first fundamental piece of that security.
As we discuss the language features in this chapter, keep in mind that we're only dealing with the Java language itself--as is the common thread of this book, all security features we're going to discuss do not apply when the language in question is not Java. If you use Java's native interface to run arbitrary C code, that C code will be able to do pretty much anything it wants to do, even when it violates the precepts we're outlining in this chapter.
2.1. Java Language Security Constructs
In this chapter, we're going to be concerned primarily with how Java operates on things that are in memory on a particular machine. Within a Java program, every entity--that is, every object reference and every primitive data element--has an access level associated with it. To review, this access level may be:
The notion of assigning data entities an access level is certainly not exclusive to Java; it's a hallmark of many object-oriented languages. Since the Java language borrows heavily from C++, it's not surprising that it would borrow the basic notion of these access levels from C++ as well (although there are slight differences between the meanings of these access modifiers in Java and in C++).
As a result of this borrowing, the use of these access modifiers is generally thought of in terms of the advantage such modifiers bring to program design: one of the hallmarks of object-oriented design is that it permits data hiding and data encapsulation. This encapsulation ensures that objects may only be operated upon through the interface the object provides to the world, instead of being operated upon by directly manipulating the object's data elements. These and other design-related advantages are indeed important in developing large, robust, object-oriented systems. But in Java, these advantages are only part of the story.
In a language like C++, if I create a CreditCard object that encapsulates my mother's maiden name and my account number, I would probably decide that those entities should be private to the object and provide the appropriate methods to operate on those entities. But nothing in C++ prevents me from cheating and accessing those entities through a variety of back-door operations. The C++ compiler is likely to complain if I write code that attempts to access a private variable of another class, but the C++ runtime isn't going to care if I convert a pointer to that class into an arbitrary memory pointer and start scanning through memory until I find a location that contains a string with 16 digits--a possible account number. In C++ systems, no one typically worried about such occurrences because all parts of the system were presumed to originate from the same place: it's my program, and if I want to work around my data model to get access to that data, so be it.
Things change with Java. I might be surfing to play some cool game applet on www.EvilSite.org, and then I might go shopping at www.Acme.com. When my Java wallet applet runs, I'd hate for the applet that is still running from www.EvilSite.org to be able to access the private CreditCard object that's contained in my Java wallet--and while it's necessary for www.Acme.com to know that I have a valid CreditCard object, I don't necessarily feel comfortable telling them my mother's maiden name. Because I'm now in the midst of a dynamic system with active programs from multiple sites, I need to make sure that the data entities are accessed by only those objects that are supposed to have access to them. It's obvious that I want protection from EvilSite.org, whom I don't want to know about the CreditCard object contained in my Java wallet. But I also want to be protected from Acme.com, a site I feel relatively comfortable about, but who should not be granted access to all the data elements of an object that it must use.
This is only one example of why the Java platform must provide memory integrity--that is, it must ensure that entities in memory are accessed only when they are allowed to be, and that these entities cannot be somehow corrupted. To that end, Java always enforces the following rules:
These are the techniques by which the Java language ensures that memory locations are read and written only when such access should normally be allowed. This restriction protects the user's machine from the outside: if I download an applet onto my machine, I don't want that applet accessing the private variables of my CreditCard class. However, if that applet has a private variable within it, nothing prevents me (depending on my operating system) from using a program outside of the browser to scan the memory on my system and figure out somehow what value that particular variable has. Similarly, nothing prevents me from having another program outside the browser change the value of a particular variable that is held in memory on my machine.
If you're an applet developer and are worried about this type of problem, you're pretty much on your own to come up with a solution to it. This might be particularly troublesome if you had, say, a variable somewhere in your applet that held a Boolean value indicating whether or not the user was licensed for a particular operation; a very clever user can go outside the browser and manipulate the machine's memory so that the integrity of your licensing scheme is violated. This problem is not new to Java, but it's not solved by Java either.
2.1.1. Object Serialization and Memory Integrity
There is one general exception to the rules about public, private, and protected access in Java. Object serialization is a feature of Java that allows an object to be written as a series of bytes; when those bytes are read someplace else, a new object is created that has the same state as the original object. Object serialization has two main purposes: it's used extensively in the RMI API to allow clients and servers to exchange objects, and it's used whenever you need to save a particular object to disk and want to recreate the object at some later point in time.
The murky issue here is just what constitutes an object's state. In the case of our CreditCard object, the account number is pretty basic to creating that object, but it's a variable that needs to be private for the reasons we've been discussing. In order for object serialization to work, it must have access to those private variables so it can correctly save and restore the object's state. That's why the object serialization API can access and save all private variables of an object (as well as its default, protected, and public variables). Similarly, the object serialization API is able to store those values back into the private data members when the object is actually reconstituted.
Depending on your perspective, this is a good thing or a bad thing. From a security perspective, it can be a bad thing: if the CreditCard object is saved to disk, something else can come along and read all that information from the disk file. Worse yet, the file could be edited in such a way that the object will be recreated in a completely different state than it originally had, with potentially damaging results.
In theory, this is the same problem we just discussed about influences outside the browser being able to read and write the private data of objects that are held in memory (which may help to explain why object serialization works this way by default). In practice, however, it's much easier to change the data in a binary file than to figure out how to access and change the value of an object in memory. Hence, object serialization has two additional mechanisms associated with it that make it more secure.
The first of these is that object serialization can only occur on objects that implement the java.io.Serializable interface (or its subclass, the java.io.Externalizable interface). The Serializable interface requires no methods, so it can be thought of simply as a flag to the virtual machine that says: "Hey, virtual machine--I've thought about the security aspects of this class, and it's okay if you serialize it by writing out all its data." By default, an object is not serializable, lest its internal private state be violated.
The second of these mechanisms is that object serialization respects the transient keyword associated with a variable: if our account number in the CreditCard class were declared as private transient, then object serialization would not be allowed to read or write that particular variable. This lets us design classes that can be stored and reconstituted without showing their private data to the world.
Of course, a CreditCard object without an account number is worthless; what we really need is something that can save and reconstitute the transient data in such a way that the data can't be compromised. This can be achieved by having our class implement the writeObject() and readObject() methods. The writeObject() method is responsible for writing out all data in the class; it typically uses the defaultWriteObject() method to write out all non-transient data, and then it writes the transient data out in any format it desires. Similarly, the readObject() method uses the defaultReadObject() method to read the data and then must restore the corresponding transient data. It's your decision how to save and reconstitute the transient data so that its integrity is preserved, but this will mean that you'll want to use one of the encryption APIs we'll discuss in Chapter 13, "Encryption".
Storing and reconstituting the transient data can also be achieved by implementing the Externalizable interface and implementing the writeExternal() and the readExternal() methods of that interface. The difference in this case is that these two methods are now responsible for saving and reconstituting the entire state of the object--no data can be stored or reconstituted by any default methods.
Using either of these techniques, you have the ability to protect any sensitive data contained in your objects, even if you choose to share those objects over the network or save those objects to some sort of persistent storage.
Copyright © 2001 O'Reilly & Associates. All rights reserved.