Networking in Java (Java Distributed Computing)

We saw in Chapter 1, "Introduction," how the socket and stream classes in the java.net and java.io packages can be used to do basic networking between agents. In this chapter we take a more detailed look at the networking support in Java, as the foundation for distributed systems. The topics we'll cover include sockets, streams, the URL-related classes for HTTP, and the ClassLoader.

We'll look at these topics in order of increasing abstraction from the networking perspective: sockets first, since they are the most primitive communication object in the Java API; then streams, which let you impose some order on the data flowing over these sockets; next, the classes associated with the HTTP protocol, namely the URL, URLConnection, and ContentHandler classes; and finally the ClassLoader, which, when coupled with the others, offers the ability to transmit actual Java classes over the wire.

2.1. Sockets and Streams

The java.net package provides an object-oriented framework for the creation and use of Internet Protocol (IP)[1] sockets. In this section, we'll take a look at these classes and what they offer.

2.1.1. IP Addressing

Before communicating with another party, you must first know how to address your messages so they can be delivered correctly. Notice that I didn't say that you need to know where the other party is located--once a scheme for encoding a location is established, you simply need to know the other party's encoded address to communicate. On IP networks, the addressing scheme in use is based on hosts and port numbers.

A given host computer on an IP network has a hostname and a numeric address. Either of these, in its fully qualified form, is a unique identifier for a host on the network. The JavaSoft home page, for example, resides on a host named www.javasoft.com, which currently has the IP address 204.160.241.98. Either of these addresses can be used to locate the machine on an IP network. The textual name for the machine is called its Domain Name System (DNS) name, which can be thought of as a kind of alias for the numeric IP address.

In the Java API, the InetAddress class represents an IP address. You can query an InetAddress for the name of the host using its getHostName() method, and for its numeric address using getAddress(). Notice that, even though we can uniquely specify a host with its IP address, we do not necessarily know its physical location. I look at the web pages on www.javasoft.com regularly, but I don't know where the machine is (though I could guess that it's in California somewhere). Conversely, even if I knew where the machine was physically, it wouldn't do me a bit of good if I didn't know its IP address (unless someone was kind enough to label the machine with it, or left a terminal window open on the server's console for me to get its IP address directly).
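As a small illustration of these two queries, the sketch below looks up an InetAddress and prints its name and numeric address. The class name AddressLookup and the helper dotted() are our own inventions for this example, and the helper assumes a four-byte IPv4 address.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class AddressLookup {
  // Formats the raw bytes returned by getAddress() in dotted-decimal
  // form. This assumes an IPv4 address (four bytes).
  static String dotted(InetAddress addr) {
    byte[] raw = addr.getAddress();
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < raw.length; i++) {
      if (i > 0) sb.append('.');
      sb.append(raw[i] & 0xff);   // Java bytes are signed, so mask to 0-255
    }
    return sb.toString();
  }

  public static void main(String[] argv) throws UnknownHostException {
    // A numeric literal such as 127.0.0.1 is parsed without a DNS lookup
    String host = (argv.length > 0) ? argv[0] : "127.0.0.1";
    InetAddress addr = InetAddress.getByName(host);
    System.out.println("Host name: " + addr.getHostName());
    System.out.println("Address:   " + dotted(addr));
  }
}
```

If all you want is the printable form of the address, InetAddress also offers getHostAddress(), which returns it directly as a String.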

Now, you typically don't want to communicate with a given host, but rather with one or many agent processes running on the host. To engage in network communications, each process must associate itself with a port on the host, identified by a number. HTTP servers, for example, typically attach themselves to port 80 on their host machine. When you ask to connect to http://www.javasoft.com/ from your web browser, the browser automatically assumes the default port and attempts to connect to the process running on www.javasoft.com listening to port 80. If this process is an HTTP server process that understands the commands that the browser is sending, the browser and the server will commence communications.

This host/port scheme is the basis of the IP addressing protocol, and is supported directly in the Java API. All network connections are specified using an InetAddress and a port number. The Java environment does the hard work of initiating the IP protocol communications and creating Java objects that represent these network connections.

2.1.2. Your Basic Socket

At the core of Java's networking support are the Socket and DatagramSocket classes in java.net. These classes define channels for communication between processes over an IP network. A new socket is created by specifying a host, either by name or with an InetAddress object, and a port number on the host.

There are two basic flavors of network sockets on IP networks: those that use the Transmission Control Protocol (TCP) and those that use the User Datagram Protocol (UDP). TCP is a reliable protocol in which data packets are guaranteed to be delivered, and delivered in order. If a packet expected at the receiving end of a TCP socket doesn't arrive in a set period of time, then it is assumed lost, and the packet is requested from the sender again. The receiver doesn't move on to the next packet until the first is received. UDP, on the other hand, makes no guarantees about delivery of packets, or the order in which packets are delivered. The sender transmits a UDP packet, and it either makes it to the receiver or it doesn't.

TCP sockets are used in the large majority of IP applications. UDP sockets are typically used in bandwidth-limited applications, where the overhead associated with resending packets is not tolerable. A good example of this is real-time network audio applications. If you are delivering packets of audio information over the network to be played in real-time, then there is no point in resending a late packet. By the time it gets delivered it will be useless, since the audio track must play continuously and sequentially, without backtracking.

The Socket class is used for creating TCP connections over an IP network. A Socket is typically created using an InetAddress to specify the remote host, and the port number on that host to connect to. A process on the remote host must be listening on that port number for incoming connection requests. In Java, this can be done using a ServerSocket.
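The listening side might look something like the following sketch. The class name GreetingServer, the serveOne() method, and the greeting it sends are assumptions made for this example; the part that matters is the ServerSocket constructor and the call to accept().

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class GreetingServer {
  static final byte[] GREETING = "hello\n".getBytes(StandardCharsets.US_ASCII);

  // Waits for one connection request on the listening socket and
  // sends a fixed greeting to the client that connected
  static void serveOne(ServerSocket s) throws IOException {
    try (Socket clientConn = s.accept()) {   // blocks until a client connects
      OutputStream out = clientConn.getOutputStream();
      out.write(GREETING);
      out.flush();
    }
  }

  public static void main(String[] argv) throws IOException {
    // Listen to port 5000 on the local host for connection requests
    try (ServerSocket s = new ServerSocket(5000)) {
      while (true) {
        serveOne(s);
      }
    }
  }
}
```

Each call to accept() returns a new Socket dedicated to one client, so the ServerSocket itself stays free to accept further requests.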

On the client side, the code simply creates a socket to the remote host on the specified port (5000, for example).
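A minimal client along these lines is sketched below. The class name SimpleClient and the readFirstByte() helper are hypothetical, and the default host of "localhost" is only a stand-in for wherever the server process is running.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.Socket;

class SimpleClient {
  // Connects to host:port and returns the first byte the server sends,
  // or -1 if the server closes the connection without sending anything
  static int readFirstByte(InetAddress host, int port) throws IOException {
    try (Socket conn = new Socket(host, port)) {
      InputStream in = conn.getInputStream();
      return in.read();
    }
  }

  public static void main(String[] argv) throws IOException {
    // "localhost" is a placeholder for the remote host's name
    String name = (argv.length > 0) ? argv[0] : "localhost";
    InetAddress server = InetAddress.getByName(name);
    System.out.println("First byte: " + readFirstByte(server, 5000));
  }
}
```

The Socket constructor does not return until the connection is made, so once it completes the input and output streams are ready to use.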

UDP socket connections are created and used through the DatagramSocket and DatagramPacket classes. A DatagramSocket sends and receives data using UDP packets, represented as DatagramPacket objects. Before two agents can talk to each other over a UDP connection, they both have to have a DatagramSocket connected to a port on their local machines. This is done by simply creating a DatagramSocket object.

Passing a port number (5000, say) to the DatagramSocket constructor connects the UDP socket to that specific port on the local host. If we don't particularly care which port is used, then we can construct the DatagramSocket without specifying the port. An unused port on the local host will be used, and we can find out which one by asking the new socket for its port number.
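Both ways of creating the socket can be sketched as follows. The class name UdpSockets and its two helper methods are assumptions made for illustration; the constructors and getLocalPort() are the standard java.net calls.

```java
import java.net.DatagramSocket;
import java.net.SocketException;

class UdpSockets {
  // Binds a UDP socket to a specific local port; a SocketException
  // is thrown if that port is already taken
  static DatagramSocket onPort(int port) throws SocketException {
    return new DatagramSocket(port);
  }

  // Binds a UDP socket to whatever unused local port the system picks
  static DatagramSocket onAnyPort() throws SocketException {
    return new DatagramSocket();
  }

  public static void main(String[] argv) throws SocketException {
    try (DatagramSocket fixed = onPort(5000);
         DatagramSocket any = onAnyPort()) {
      System.out.println("Fixed socket is on port " + fixed.getLocalPort());
      System.out.println("System chose port " + any.getLocalPort());
    }
  }
}
```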

In order for two agents to send data to each other over a UDP socket, they must know the host name and port number of each other's socket connection. So they will either have preordained ports for each other and will create DatagramSockets using these port numbers, or they will create a socket on a random local port and transmit their port numbers to each other over another connection.

Data is sent over a DatagramSocket using DatagramPacket objects. Each DatagramPacket contains a data buffer, the address of the remote host to send the data to, and the port number the remote agent is listening to. So to send a buffer of data to a process listening to port 5000 on host my.host.com, we would build a DatagramPacket from the buffer, the InetAddress for my.host.com, and the port number 5000, and pass it to the send() method of our DatagramSocket.

The remote process can receive the data in the form of a DatagramPacket by calling the receive() method on its DatagramSocket. The received DatagramPacket will have the host address and port of the sender filled in as a side-effect of the call.
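To make the send and receive steps concrete, here is a sketch that runs both ends in one program over the loopback address. The class name UdpExchange, the helper methods, the 512-byte buffer, and the two-second timeout are all choices made for this example; a real pair of agents would run on different hosts, such as my.host.com.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

class UdpExchange {
  // Sends the bytes in data to the agent listening on host:port
  static void send(DatagramSocket from, byte[] data, InetAddress host, int port)
      throws IOException {
    DatagramPacket packet = new DatagramPacket(data, data.length, host, port);
    from.send(packet);
  }

  // Blocks until a packet arrives. On return, the packet carries the
  // sender's address and port as well as the data.
  static DatagramPacket receive(DatagramSocket on, int maxBytes)
      throws IOException {
    DatagramPacket packet = new DatagramPacket(new byte[maxBytes], maxBytes);
    on.receive(packet);
    return packet;
  }

  public static void main(String[] argv) throws IOException {
    InetAddress local = InetAddress.getLoopbackAddress();
    try (DatagramSocket receiver = new DatagramSocket(0, local);
         DatagramSocket sender = new DatagramSocket(0, local)) {
      // Give up after two seconds rather than wait forever for a lost packet
      receiver.setSoTimeout(2000);
      byte[] msg = "hello".getBytes(StandardCharsets.UTF_8);
      send(sender, msg, local, receiver.getLocalPort());
      DatagramPacket p = receive(receiver, 512);
      String text = new String(p.getData(), 0, p.getLength(),
                               StandardCharsets.UTF_8);
      System.out.println("Got \"" + text + "\" from "
                         + p.getAddress() + ":" + p.getPort());
    }
  }
}
```

Note the use of getLength() when decoding the packet: the buffer is usually larger than the data that actually arrived.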

Note that in all of the examples, we would have to catch the appropriate exceptions and handle them. Sending a DatagramPacket, for example, can generate an IOException if the network transmission fails for some reason. A robust networked program will catch this exception and behave appropriately, perhaps by resending the packet if the application warrants, or perhaps by simply noting the lost packet and continuing.

2.1.3. Multicast Sockets

There is a subset of the IP protocol that supports multicasting. Multicasting can be thought of as broadcasting data over a network connection to many connected agents, as opposed to unicasting packets between two agents on a normal connection. Multicasting is done using UDP packets that are broadcast out on a multicast IP address. Any agent "listening in" to that IP address will receive the data packets that are broadcast. The analogy to radio and television broadcasting is no accident--the very first practical uses of multicast IP were for broadcasting audio and video over the Internet from special events.[2]

Java supports multicast IP through the java.net.MulticastSocket class, which is an extension of the DatagramSocket class. Joining a multicast group is done almost the same way that you would establish a UDP connection between two agents. Each agent that wants to listen on the multicast address creates a MulticastSocket and then joins the multicast session by calling the joinGroup() method on the MulticastSocket.

Once the connection to the multicast session is established, the agent can read data being broadcast on the multicast "channel" by calling receive() on the MulticastSocket, just as it would on a DatagramSocket.

Data can also be sent out on the multicast channel to all the other listening agents using the send() method on the MulticastSocket.

Once the broadcast is over, or we simply want to stop listening, we can disconnect from the session using the leaveGroup() method.
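The whole lifecycle (join, receive, leave) can be sketched in a few lines. The session address 230.0.0.1 and port 4446 are hypothetical values picked for illustration, as are the class name MulticastListener and the groupAddress() helper. The sketch uses the two-argument forms of joinGroup() and leaveGroup(), since the single-argument forms are deprecated in recent Java releases.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.MulticastSocket;

class MulticastListener {
  // Resolves a session address, rejecting anything that is not a
  // multicast address (224.0.0.0 through 239.255.255.255 for IPv4)
  static InetAddress groupAddress(String name) throws IOException {
    InetAddress addr = InetAddress.getByName(name);
    if (!addr.isMulticastAddress()) {
      throw new IllegalArgumentException(name + " is not a multicast address");
    }
    return addr;
  }

  public static void main(String[] argv) throws IOException {
    // Hypothetical session address and port, chosen only for illustration
    InetAddress group = groupAddress("230.0.0.1");
    int port = 4446;
    InetSocketAddress session = new InetSocketAddress(group, port);

    try (MulticastSocket ms = new MulticastSocket(port)) {
      // Join the session; null selects the default network interface
      ms.joinGroup(session, null);

      // Read one packet broadcast on the multicast channel
      DatagramPacket packet = new DatagramPacket(new byte[512], 512);
      ms.receive(packet);
      System.out.println("Received " + packet.getLength() + " bytes from "
                         + packet.getAddress());

      // Stop listening to the session
      ms.leaveGroup(session, null);
    }
  }
}
```

As written, main() blocks in receive() until some other agent sends a packet to the session, so it needs a sender and a multicast-capable network to run to completion.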

Multicasting is useful when we want to connect many agents together on a common communication channel. Shared audio and video channels are the most obvious uses, but multicasting can also be applied in collaborative tools like shared whiteboards, or between application servers performing synchronization tasks, like load balancing. However, since multicast IP is based on UDP, you have to be willing to accept the possibility of losing some data along the way, and dealing with it gracefully. Also, since clients can join a multicast session asynchronously, they have to be ready to synchronize themselves with the current state of the multicast session when they join.

2.1.4. Streams, Readers, and Writers for Input and Output

Once we make a connection between two processes over the network, we need a simple, easy way to send and receive data in different formats over the connection. Java provides this through the stream classes in the java.io package. Included in the java.io package are the InputStream and OutputStream classes and their subclasses for byte-based I/O, and the Reader and Writer classes and their subclasses for character-based I/O. The InputStream and OutputStream classes handle data as bytes, with basic methods for reading and writing bytes and byte arrays. Their subclasses can connect to various sources and destinations (files, string buffers), and provide methods for directly sending and receiving basic Java data types, like floating-point values. The Reader and Writer classes transmit data in the form of 16-bit Unicode characters, which provides a platform-independent way to send and receive textual data. Like the InputStream and OutputStream subclasses, the subclasses of Reader and Writer specialize in terms of their source and destination types.

A Socket, once it's created, can be queried for its input/output streams using getInputStream() and getOutputStream(). These methods return instances of InputStream and OutputStream, respectively. If you need to exchange mostly character-based data between two agents in your distributed system, then you can wrap the InputStream with an InputStreamReader (a subclass of Reader), or the OutputStream with an OutputStreamWriter (a subclass of Writer).
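The wrapping itself takes only a line or two, as the sketch below suggests. The class name TextChannel and its two methods are our own, and the choice of UTF-8 as the character encoding is an assumption; both agents simply have to agree on one.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class TextChannel {
  // Writes one line of text to a byte stream, such as the one
  // returned by a Socket's getOutputStream()
  static void sendLine(OutputStream out, String line) throws IOException {
    Writer w = new OutputStreamWriter(out, StandardCharsets.UTF_8);
    w.write(line);
    w.write('\n');
    w.flush();   // push the encoded bytes through to the underlying stream
  }

  // Reads one line of text from a byte stream, such as the one
  // returned by a Socket's getInputStream(). The BufferedReader may
  // read ahead, so a long-lived connection should keep one reader
  // around rather than call this method repeatedly.
  static String receiveLine(InputStream in) throws IOException {
    BufferedReader r = new BufferedReader(
        new InputStreamReader(in, StandardCharsets.UTF_8));
    return r.readLine();
  }
}
```

Given a connected Socket called conn (a hypothetical variable), one agent would call sendLine(conn.getOutputStream(), "hello") and the other receiveLine(conn.getInputStream()).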

Another way to create an interprocess communication link is to use the java.lang.Runtime class to execute a process, then obtain the input and output streams from the returned Process object, as shown in Example 2-1. You would do this if you had a local subtask that needed to run in a separate process, but with which you still needed to exchange messages.

Example 2-1. Interprocess I/O Using Runtime-Executed Processes

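Runtime r = Runtime.getRuntime();
// Launch the subprocess; exec() throws an IOException if it can't start
Process p = r.exec("/bin/ls /tmp");
// Stream from which we read the subprocess's standard output
InputStream in = p.getInputStream();
// Stream on which we write to the subprocess's standard input
OutputStream out = p.getOutputStream();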

From the abstract I/O classes, the java.io package offers several specializations that vary the format of the data transmitted over the stream, as well as the type of data source or receiver at the ends of the stream. The InputStream, OutputStream, Reader, and Writer classes provide basic interfaces for data I/O (read() and write() methods that just transfer bytes, byte arrays, characters, and character arrays). To define data types and communication protocols on top of these base classes, Java offers the FilterInputStream and FilterOutputStream classes for byte-oriented I/O, and the FilterReader and FilterWriter classes for character-based I/O. Subclasses of these offer a higher level of control and structure to the data transfers.

A BufferedInputStream or BufferedReader uses a memory buffer for efficient reading of data. The overhead associated with data read requests is minimized by performing large data reads into a buffer, and offering data to the caller from the local buffer until it's been exhausted. This feature can be used to minimize the latency associated with slow source devices and communication media. The BufferedOutputStream or BufferedWriter performs the same service on outgoing data.

A PushbackInputStream or PushbackReader provides a buffer for pushing back data onto the incoming data stream. This is useful in parsing applications, where the next branch in the parse tree is determined by peeking at the next few bytes or characters in the stream, and then letting the subparser operate on the data.

The other interesting subclasses of FilterInputStream and FilterOutputStream are the DataInputStream and DataOutputStream classes. These classes read and write Java data primitives in a portable binary format. There aren't similar subclasses of FilterReader and FilterWriter, since Readers and Writers only transfer character data, and the serialized form of a Java data type is represented in bytes.
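A short sketch may help show how these filters stack on top of one another. Here a DataOutputStream is layered over a BufferedOutputStream to encode two values, and a PushbackInputStream is used to peek at the first byte before a DataInputStream decodes them. The class name FilterDemo and its helpers are made up for this example, and byte arrays stand in for a network connection.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.PushbackInputStream;

class FilterDemo {
  // Writes an int and a double in DataOutputStream's binary format
  static byte[] encode(int count, double value) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    DataOutputStream dout =
        new DataOutputStream(new BufferedOutputStream(bytes));
    dout.writeInt(count);
    dout.writeDouble(value);
    dout.flush();   // empty the buffer into the byte array
    return bytes.toByteArray();
  }

  // Looks at the next byte without consuming it
  static int peek(PushbackInputStream in) throws IOException {
    int b = in.read();
    if (b != -1) in.unread(b);
    return b;
  }

  public static void main(String[] argv) throws IOException {
    byte[] data = encode(3, 2.5);
    PushbackInputStream pin =
        new PushbackInputStream(new ByteArrayInputStream(data));
    System.out.println("First byte: " + peek(pin));
    // The peeked byte is still available to the data stream
    DataInputStream din = new DataInputStream(pin);
    System.out.println(din.readInt() + " " + din.readDouble());
  }
}
```

Forgetting the flush() is a common mistake with buffered output: without it, the encoded bytes may still be sitting in the buffer when the byte array is read.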

Besides being useful in their own right for manipulating and formatting input/output data streams, the subclasses of FilterInputStream, FilterOutputStream, FilterReader, and FilterWriter are also well suited for further specialization to define application-specific data stream protocols. Each of the filter stream classes offers a constructor that accepts an InputStream or OutputStream as an argument. Likewise, the FilterReader class has a constructor that accepts a Reader, and FilterWriter has a constructor that accepts a Writer object. In each case, the constructor argument is taken as the source or sink of the stream that is to be filtered, which enables the construction of stream filter "pipelines." So defining a special-purpose data protocol is simply a matter of subclassing from an appropriate I/O class, and wrapping an existing data source or sink with the new filter.

For example, if we wanted to read an XDR-formatted[3] data stream, we could write a subclass of FilterInputStream that would offer the same methods to read Java primitive data types as DataInputStream, but would be implemented to parse the XDR format, rather than the portable binary format of the DataInputStream. Example 2-2 shows a skeleton for the input version of this kind of stream; Example 2-3 shows a sample application using the stream. The application first connects to a host and port, where presumably another process is waiting to accept this connection. The remote process uses XDR-formatted data to communicate, so we wrap the input stream from the socket connection with our XDRInputStream and begin reading data.

Example 2-2. An InputStream Subclass for Reading XDR-Formatted Data

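package dcj.examples;

import java.io.*;

public class XDRInputStream extends FilterInputStream {
  public XDRInputStream(InputStream in) {
    super(in);
  }

  // Methods analogous to those of DataInputStream, to be implemented
  // to read XDR-formatted data. This is only a skeleton, so each body
  // is a stub that a real implementation would replace.

  public boolean readBoolean() throws IOException { throw notImplemented(); }
  public byte    readByte() throws IOException { throw notImplemented(); }
  public int     readUnsignedByte() throws IOException { throw notImplemented(); }
  public int     readInt() throws IOException { throw notImplemented(); }
  public float   readFloat() throws IOException { throw notImplemented(); }
  // Other readXXX() methods omitted in this example...

  private static UnsupportedOperationException notImplemented() {
    return new UnsupportedOperationException("XDR parsing not implemented");
  }

  // We'll assume this stream doesn't support mark/reset operations

  public boolean markSupported() { return false; }
}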
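Example 2-2 leaves the parsing itself out. As a rough idea of what a few of the method bodies might contain, the sketch below reads the three simplest XDR types. The class name SimpleXDRInputStream is hypothetical and this is not a complete XDR reader; it relies only on the facts that XDR encodes integers as four bytes with the most significant byte first, booleans as the integers 0 and 1, and floats as four-byte IEEE single-precision values.

```java
import java.io.EOFException;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

class SimpleXDRInputStream extends FilterInputStream {
  SimpleXDRInputStream(InputStream in) {
    super(in);
  }

  // XDR integers are four bytes, most significant byte first
  public int readInt() throws IOException {
    int result = 0;
    for (int i = 0; i < 4; i++) {
      int b = in.read();
      if (b < 0) throw new EOFException();
      result = (result << 8) | b;
    }
    return result;
  }

  // XDR booleans are encoded as the integers 0 (false) and 1 (true)
  public boolean readBoolean() throws IOException {
    return readInt() != 0;
  }

  // XDR floats are four-byte IEEE single-precision values
  public float readFloat() throws IOException {
    return Float.intBitsToFloat(readInt());
  }

  // This stream doesn't support mark/reset operations
  public boolean markSupported() { return false; }
}
```

Because every XDR item occupies a multiple of four bytes, all three methods can be built on readInt().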

Example 2-3. Example XDRInputStream Client

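import dcj.examples.XDRInputStream;
import java.io.*;
import java.net.*;

class XDRInputExample
{
  public static void main(String argv[])
    {
      if (argv.length < 1)
        {
          System.out.println("Usage: java XDRInputExample host [port]");
          System.exit(1);
        }
      String host = argv[0];

      // Default port is 5001
      int port = 5001;

      // Use the port given on the command line, if there is one
      if (argv.length > 1)
        {
          try
            {
              port = Integer.parseInt(argv[1]);
            }
          catch (NumberFormatException e)
            {
              System.out.println("Bad port number given, using default "
                                 + port);
            }
        }

      // Try connecting to specified host and port
      Socket serverConn = null;
      try { serverConn = new Socket(host, port); }
      catch (UnknownHostException e)
        {
          System.out.println("Bad host name given.");
          System.exit(1);
        }
      catch (IOException e)
        {
          System.out.println("Couldn't connect to " + host + ":" + port);
          System.exit(1);
        }

      try
        {
          // Wrap an XDR stream around the input stream
          XDRInputStream xin =
            new XDRInputStream(serverConn.getInputStream());

          // Start reading expected data from XDR-formatted stream
          int numVals = xin.readInt();
          float val1 = xin.readFloat();
          // ...
        }
      catch (IOException e)
        {
          System.out.println("Error reading from server: "
                             + e.getMessage());
        }
    }
}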

The classes in the java.io package also offer the ability to specialize the sources and destinations of data. Table 2-1 summarizes the various stream, writer, and reader classes in java.io, and the types of sources and destinations that they can access. The purpose and use of the file, byte-array, and string classes are fairly obvious, and we won't spend any time going into detail about them here, since we'll see them being used in some of the examples later in the book. The stream classes that allow communication between threads deserve some explanation, though.

Table 2-1. Source and Destination Types Supported by java.io

The PipedInputStream and PipedOutputStream classes access data from each other. That is, a PipedInputStream reads data from a PipedOutputStream, and a PipedOutputStream writes data to a PipedInputStream. This class design allows the developer to establish data pipes between threads in the same process. Example 2-4 and Example 2-5 show client and server classes that use piped streams to transfer information, and Example 2-6 shows an application of these classes.

Example 2-4. A Piped Client

package dcj.examples;

import java.io.*;

public class PipedClient extends Thread
{
  PipedInputStream pin;
  PipedOutputStream pout;

  public PipedClient(PipedInputStream in, PipedOutputStream out)
  {
    pin = in;
    pout = out;
  }

  public void run()
  {
    // Wrap a data stream around the input and output streams
    DataInputStream din = new DataInputStream(pin);
    DataOutputStream dout = new DataOutputStream(pout);

    // Say hello to the server...
    try
      {
        System.out.println("PipedClient: Writing greeting to server...");
        // writeBytes() sends one byte per character, which is what
        // readLine() on the receiving end expects
        dout.writeBytes("hello from PipedClient\n");
        dout.flush();
      }
    catch (IOException e)
      {
        System.out.println("PipedClient: Couldn't send greeting.");
        System.exit(1);
      }

    // See if it says hello back...
    try
      {
        System.out.println("PipedClient: Reading response from server...");
        String response = din.readLine();
        System.out.println("PipedClient: Server said: \"" 
                           + response + "\"");
      }
    catch (IOException e)
      {
        System.out.println("PipedClient: Couldn't get response.");
      }

    // Returning from run() ends the thread, so no call to the
    // deprecated Thread.stop() is needed
  }
}

The example shows two threads, a client and a server, talking to each other over piped streams. The PipedClient class accepts a PipedInputStream and PipedOutputStream as constructor arguments; the PipedServer class does the same. Both are extensions of the Thread class. The client attempts to send a "hello" message to the server over its output stream, then listens for a response on its input stream. The server listens for the "hello" from the client on its input stream, then sends a response back on its output stream. The PipedStreamExample class sets up the stream connections for the threads by creating two pairs of piped streams. It then creates a PipedClient and a PipedServer, sends each the input stream from one pair and the output stream from the other, and tells each of them to start their threads. The important feature of this example is that the piped streams are connected to each other within the same process, and are not connected to any remote hosts.

Example 2-5. A Piped Server

package dcj.examples;

import java.io.*;

public class PipedServer extends Thread
{
  PipedInputStream pin;
  PipedOutputStream pout;

  public PipedServer(PipedInputStream in, PipedOutputStream out)
  {
    pin = in;
    pout = out;
  }

  public void run()
  {
    // Wrap a data stream around the input and output streams
    DataInputStream din = new DataInputStream(pin);
    DataOutputStream dout = new DataOutputStream(pout);

    // Wait for the client to say hello...
    try
      {
        System.out.println("PipedServer: Reading from client...");
        String clientHello = din.readLine();
        System.out.println("PipedServer: Client said: \""
                           + clientHello + "\"");
      }
    catch (IOException e)
      {
        System.out.println("PipedServer: Couldn't get hello from client.");
        // End the thread by returning; Thread.stop() is deprecated
        return;
      }

    // ...and say hello back.
    try
      {
        System.out.println("PipedServer: Writing response to client...");
        // writeBytes() sends one byte per character, which is what
        // readLine() on the receiving end expects
        dout.writeBytes("hello I am the server.\n");
        dout.flush();
      }
    catch (IOException e)
      {
        System.out.println("PipedServer: Couldn't send response to client.");
      }
  }
}

Example 2-6. Piped Stream Application

package dcj.examples;

// PipedClient and PipedServer are in this same package, so only
// the java.io classes need to be imported
import java.io.*;

class PipedStreamExample {
  public static void main(String argv[]) {
    // Make two pairs of connected piped streams
    PipedInputStream pinc = null;
    PipedInputStream pins = null;
    PipedOutputStream poutc = null;
    PipedOutputStream pouts = null;
    
    try {
      pinc = new PipedInputStream();
      pins = new PipedInputStream();
      poutc = new PipedOutputStream(pins);
      pouts = new PipedOutputStream(pinc);
    }
    catch (IOException e) {
      System.out.println(
        "PipedStreamExample: Failed to build piped streams.");
      System.exit(1);
    }

    // Make the client and server threads, connected by the streams
    PipedClient pc = new PipedClient(pinc, poutc);
    PipedServer ps = new PipedServer(pins, pouts);

    // Start the threads
    System.out.println("Starting server...");
    ps.start();
    System.out.println("Starting client...");
    pc.start();

    // Wait for threads to end
    try {
      ps.join();
      pc.join();
    }
    catch (InterruptedException e) {}

    System.exit(0);
  }
}

Note that a similar scenario could be set up using the PipedReader and PipedWriter classes, if you knew the two threads were going to exchange character arrays.
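A character-based version of the same idea might look like the sketch below, in which one thread writes a line of text into a PipedWriter and the calling thread reads it from the connected PipedReader. The class name PipedTextExample and the exchange() method are assumptions made for this example, and it uses a single pipe rather than the two pairs in Example 2-6.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PipedReader;
import java.io.PipedWriter;
import java.io.PrintWriter;

class PipedTextExample {
  // Starts a thread that writes one line into the pipe, then reads
  // that line back on the calling thread
  static String exchange(String message)
      throws IOException, InterruptedException {
    PipedWriter pw = new PipedWriter();
    PipedReader pr = new PipedReader(pw);   // connect the two ends

    Thread sender = new Thread(() -> {
      // Closing the PrintWriter also closes the pipe's write end
      try (PrintWriter out = new PrintWriter(pw)) {
        out.println(message);
      }
    });
    sender.start();

    String line;
    try (BufferedReader in = new BufferedReader(pr)) {
      line = in.readLine();   // blocks until the sender has written a line
    }
    sender.join();
    return line;
  }

  public static void main(String[] argv) throws Exception {
    System.out.println("Received: "
                       + exchange("hello from the writer thread"));
  }
}
```

Since the pipe carries characters rather than bytes, there is no encoding to agree on, which is the main attraction of the reader and writer versions.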
