home | O'Reilly's CD bookshelfs | FreeBSD | Linux | Cisco | Cisco Exam  


Exploring Java

Previous Chapter 7
Basic Utility Classes
Next
 

7.4 Vectors and Hashtables

Vectors and hashtables are collection classes. Each stores a group of objects according to a particular retrieval scheme. Aside from that, they are not particularly closely related things. A hashtable is a dictionary; it stores and retrieves objects by a key value. A vector, on the other hand, holds an ordered collection of elements. It's essentially a dynamic array. Both of these, however, have more subtle characteristics in common. First, they are two of the most useful aspects of the core Java distribution. Second, they both take full advantage of Java's dynamic nature at the expense of some of its more static type safety.

If you work with dictionaries or associative arrays in other languages, you should understand how useful these classes are. If you are someone who has worked in C or another static language, you should find collections to be truly magical. They are part of what makes Java powerful and dynamic. Being able to work with lists of objects and make associations between them is an abstraction from the details of the types. It lets you think about the problems at a higher level and saves you from having to reproduce common structures every time you need them.

java.util.Vector

A Vector is a dynamic array; it can grow to accommodate new items. You can also insert and remove elements at arbitrary positions within it. As with other mutable objects in Java, Vector is thread-safe. The Vector class works directly with the type Object, so we can use them with instances of any kind of class.[3] We can even put different kinds of Objects in a Vector together; the Vector doesn't know the difference.

[3] In C++, where classes don't derive from a single Object class that supplies a base type and common methods, the elements of a collection would usually be derived from some common collectable class. This forces the use of multiple inheritance and brings its associated problems.

As you might guess, this is where things get tricky. To do anything useful with an Object after we take it back out of a Vector, we have to cast it back (narrow) it to its original type. This can be done with safety in Java because the cast is checked at run-time. Java throws a ClassCastException if we try to cast an object to the wrong type. However, this need for casting means that your code must remember types or methodically test them with instanceof. That is the price we pay for having a completely dynamic collection class that operates on all types.

You might wonder if you can subclass Vector to produce a class that looks like a Vector, but that works on just one type of element in a type-safe way. Unfortunately, the answer is no. We could override Vector's methods to make a Vector that rejects the wrong type of element at run-time, but this does not provide any new compile-time, static type safety. In C++, templates provide a safe mechanism for parameterizing types by restricting the types of objects used at compile-time. The keyword generic is a reserved word in Java. This means that it's possible that future versions might support C++-style templates, using generic to allow statically checked parameterized types.

We can construct a Vector with default characteristics and add elements to it using addElement() and insertElement():

Vector things = new Vector(); 
 
String one = "one"; 
String two = "two"; 
String three = "three"; 
 
things.addElement( one ); 
things.addElement( three ); 
things.insertElementAt( two, 1 ); 

things now contains three String objects in the order "one," "two," and "three." We can retrieve objects by their position with elementAt(), firstElement(), and lastElement():

String s1 = (String)things.firstElement();       // "one" 
String s3 = (String)things.lastElement();        // "three" 
String s2 = (String)things.elementAt(1);         // "two" 

We have to cast each Object back to a String in order to assign it a String reference. ClassCastException is a type of RuntimeException, so we can neglect to guard for the exception if we are feeling confident about the type we are retrieving. Often, as in this example, you'll just have one type of object in the Vector. If we were unsure about the types of objects we were retrieving, we would want to be prepared to catch the ClassCastException or test the type explicitly with the instanceof operator.

We can search for an item in a Vector with the indexOf() method:

int i = things.indexOf( three );                 // i = 2 

indexOf() returns a value of -1 if the object is not found. As a convenience, we can also use contains() simply to test for the presence of the object.

Finally, removeElement() removes a specified Object from the Vector:

things.removeElement( two ); 

The element formerly at position three now becomes the second element.

The size() method reports the number of objects currently in the Vector. You might think of using this to loop through all elements of a Vector, using elementAt() to get at each element. This works just fine, but there is a more general way to operate on a complete set of elements like those in a Vector.

java.util.Enumeration

The java.util.Enumeration interface can be used by any sort of set to provide serial access to its elements. An object that implements the Enumeration interface presents two methods: nextElement() and hasMoreElements(). nextElement() returns an Object type, so it can be used with any kind of collection. As with taking objects from a Vector, you need to know or determine what the objects are and cast them to the appropriate types before using them.

Enumeration is useful because any type of object can implement the interface and then use it to provide access to its elements. If you have an object that handles a set of values, you should think about implementing the Enumeration interface. Simply provide a hasMoreElements() test and a nextElement() iterator and declare that your class implements java.util.Enumeration. One advantage of an Enumeration is that you don't have to provide all values up front; you can provide each value as it's requested with nextElement(). And since Enumeration is an interface, you can write general routines that operate on all of the elements Enumeration.

An Enumeration does not guarantee the order in which elements are returned, however, so if order is important you don't want to use an Enumeration. You can iterate through the elements in an Enumeration only once; there is no way to reset it to the beginning or move backwards through the elements.

A Vector returns an Enumeration of its contents when we call the elements() method:

Enumeration e = things.elements(); 
 
while ( e.hasMoreElements() ) { 
    String s = (String)e.nextElement(); 
    System.out.println( s ): 
} 

The above code loops three times, as call nextElement(), to fetch our strings. The actual type of object returned by elements() is a VectorEnumeration, but we don't have to worry about that. We can always refer to an Enumeration simply by its interface.

Note that Vector does not implement the Enumeration interface. If it did, that would put a serious limitation on Vector because we could cycle through the elements in it only once. That's clearly not the purpose of a Vector, which is why Vector instead provides a method that returns an Enumeration.

java.util.Hashtable

As I said earlier, a hashtable is a dictionary, similar to an associative array. A hashtable stores and retrieves elements with key values; they are very useful for things like caches and minimalist databases. When you store a value in a hashtable, you associate a key with that value. When you need to look up the value, the hashtable retrieves it efficiently using the key. The name hashtable itself refers to how the indexing and storage of elements is performed, as we'll discuss shortly. First I want to talk about how to use a hashtable.

The java.util.Hashtable class implements a hashtable that, like Vector, operates on the type Object. A Hashtable stores an element of type Object and associates it with a key, also of type Object. In this way, we can index arbitrary types of elements using arbitrary types as keys. As with Vector, casting is generally required to narrow objects back to their original type after pulling them out of a hashtable.

A Hashtable is quite easy to use. We can use the put() method to store items:

Hashtable dates = new Hashtable(); 
 
dates.put( "christmas", 
    new GregorianCalendar( 1997, Calendar.DECEMBER, 25 ) ); 
dates.put( "independence", 
    new GregorianCalendar( 1997, Calendar.JULY, 4 ) ); 
dates.put( "groundhog", 
    new GregorianCalendar( 1997, Calendar.FEBRUARY, 2 ) ); 

First we create a new Hashtable. Then we add three GregorianCalendar objects to the hashtable, using String objects as keys. The key is the first argument to put(); the value is the second. Only one value can be stored per key. If we try to store a second object under a key that already exists in the Hashtable, the old element is booted out and replaced by the new one. The return value of the put() method is normally null, but if the call to put() results in replacing an element, the method instead returns the old stored Object.

We can now use the get() method to retrieve each of the above dates by name, using the String key by which it was indexed:

GregorianCalendar g = (GregorianCalendar)dates.get( "christmas" ); 

The get() method returns a null value if no element exists for the given key. The cast is required to narrow the returned object back to type GregorianCalendar. I hope you can see the advantage of using a Hashtable over a regular array. Each value is indexed by a key instead of a simple number, so unlike a simple array, we don't have to remember where each GregorianCalendar is stored.

Once we've put a value in a Hashtable, we can take it back out with the remove() method, again using the key to access the value:

dates.remove("christmas"); 

We can test for the existence of a key with containsKey():

if ( dates.containsKey( "groundhog" ) ) {    // yes 

Just like with a Vector, we're dealing with a set of items. Actually, we're dealing with two sets: keys and values. The Hashtable class has two methods, keys() and elements(), for getting at these sets. The keys() method returns an Enumeration of the keys for all of the elements in the Hashtable. We can use this Enumeration to loop through all of the keys:

for (Enumeration e = dates.keys(); e.hasMoreElements(); ) { 
    String key = (String)e.nextElement(); 
    ... 
} 

Similarly, elements() provides an Enumeration of the elements themselves.

Hashcodes and key values

If you've used a hashtable before, you've probably guessed that there's more going on behind the scenes than I've let on so far. An element in a hashtable is not associated with its key by identity, but by something called a hashcode. Every object in Java has an identifying hashcode value determined by its hashCode() method, which is inherited from the Object class. When you store an element in a hashtable, the hashcode of the key object registers the element internally. Later, when you retrieve the item, that same hashcode looks it up efficiently.

A hashcode is usually a random-looking integer value based on the contents of an object, so it's different for different instances of a class. Two objects that have different hashcodes serve as unique keys in a hashtable; each object can reference a different stored object. Two objects that have the same hashcode value, on the other hand, appear to a hashtable as the same key. They can't coexist as keys to different objects in the hashtable.

Generally, we want our object instances to have unique hash codes, so we can put arbitrary items in a hashtable and index them with arbitrary keys. The default hashCode() method in the Object class simply assigns each object instance a unique number to be used as a hashcode. If a class does not override this method, each instance of the class will have a unique hashcode. This is sufficient for most objects.

However, it's also useful to allow equivalent objects to serve as equivalent keys. String objects provide a good example of this case. Although Java does its best to consolidate them, a literal string that appears multiple times in Java source code is often represented by different String objects at run-time. If each of these String objects has a different hash code, even though the literal value is the same, we could not use strings as keys in a hashtable, like we did the in above examples.

The solution is to ensure that equivalent String objects return the same hashcode value so that they can act as equivalent keys. The String class overrides the default hashCode() method so that equivalent String objects return the same hash code, while different String objects have unique hashcodes. This is possible because String objects are immutable; the contents can't change, so neither can the hashcode.

A few other classes in the Java API also override the default hashCode() method in order to provide equivalent hashcodes for equivalent objects. For example, each of the primitive wrapper classes provides a hashCode() method for this purpose. Other objects likely to be used as hashtable keys, such as Color, Date, File, and URL, also implement their own hashCode()methods.

So now maybe you're wondering when you need to override the default hashCode() method in your objects. If you're creating a class to use for keys in a hashtable, think about whether the class supports the idea of "equivalent objects." If so, you should implement a hashCode() method that returns the same hashcode value for equivalent objects.

To accomplish this, you need to define the hashcode of an object to be some suitably complex and arbitrary function of the contents of that object. The only criterion for the function is that it should be almost certain to provide different values for different contents of the object. Because the capacity of an integer is limited, hashcode values are not guaranteed to be unique. This limitation is not normally a problem though, as there are 2^32 possible hashcodes to choose from. The more sensitive the hashcode function is to small differences in the contents the better. A hashtable works most efficiently when the hashcode values are as randomly and evenly distributed as possible. As an example, you could produce a hashcode for a String object by adding the character values at each position in the string and multiplying the result by some number, producing a large random-looking integer.

java.util.Dictionary

java.util.Dictionary is the abstract superclass of Hashtable. It lays out the basic get(), put(), and remove() functionality for dictionary-style collections. You could derive other types of dictionaries from this class. For example, you could implement a dictionary with a different storage format, such as a binary tree.


Previous Home Next
Dates Book Index Properties

Java in a Nutshell Java Language Reference Java AWT Java Fundamental Classes Exploring Java