Security (Java Servlet Programming)

So far we have imagined that our servlets exist in a perfect world, where everyone is trustworthy and nobody locks their doors at night. Sadly, that's a 1950s fantasy world: the truth is that the Internet has its share of fiendish rogues. As companies place more and more emphasis on online commerce and begin to load their Intranets with sensitive information, security has become one of the most important topics in web programming.

Security is the science of keeping sensitive information in the hands of authorized users. On the web, this boils down to three important issues:

A client wants to be sure that it is talking to a legitimate server (authentication), and it also want to be sure that any information it transmits, such as credit card numbers, is not subject to eavesdropping (confidentiality). The server is also concerned with authentication and confidentiality. If a company is selling a service or providing sensitive information to its own employees, it has a vested interest in making sure that nobody but an authorized user can access it. And both sides need integrity to make sure that whatever information they send gets to the other party unaltered.

Authentication, confidentiality, and integrity are all linked by digital certificate technology. Digital certificates allow web servers and clients to use advanced cryptographic techniques to handle identification and encryption in a secure manner. Thanks to Java's built-in support for digital certificates, servlets are an excellent platform for deploying secure web applications that use digital certificate technology. We'll be taking a closer look at them later.

Security is also about making sure that crackers can't gain access to the sensitive data on your web server. Because Java was designed from the ground up as a secure, network-oriented language, it is possible to leverage the built-in security features and make sure that server add-ons from third parties are almost as safe as the ones you write yourself.

This chapter introduces the basics of web security and digital certificate technology in the context of using servlets. It also discusses how to maintain the security of your web server when running servlets from untrusted third-parties. You'll notice that this chapter takes a higher-level approach and shows fewer examples than previous chapters. The reason is that many of the topics in this chapter require web server-specific administration to implement. The servlets just tag along for the ride.

Finally, a note of caution. We are just a couple of servlet programmers, and we disclaim all responsibility for any security-related incidents that might result from following our advice. For a much more complete overview of web security technology and procedures, see Web Security & Commerce by Simson Garfinkel with Gene Spafford (O'Reilly). Of course, they probably won't accept responsibility either.

8.1. HTTP Authentication

As we discussed briefly in Chapter 4, "Retrieving Information", the HTTP protocol provides built-in authentication support--called basic authentication--based on a simple challenge/response, username/password model. With this technique, the web server maintains a database of usernames and passwords and identifies certain resources (files, directories, servlets, etc.) as protected. When a user requests access to a protected resource, the server responds with a request for the client's username and password. At this point, the browser usually pops up a dialog box where the user enters the information, and that input is sent back to the server as part of a second authorized request. If the submitted username and password match the information in the server's database, access is granted. The whole authentication process is handled by the server itself.

Basic authentication is very weak. It provides no confidentiality, no integrity, and only the most basic authentication. The problem is that passwords are transmitted over the network, thinly disguised by a well-known and easily reversed Base64 encoding. Anyone monitoring the TCP/IP data stream has full and immediate access to all the information being exchanged, including the username and password. Plus, passwords are often stored on the server in clear text, making them vulnerable to anyone cracking into the server's file system. While it's certainly better than nothing, sites that rely exclusively on basic authentication cannot be considered really secure.

Digest authentication is a variation on the basic authentication scheme. Instead of transmitting a password over the network directly, a digest of the password is used instead. The digest is produced by taking a hash (using the very secure MD5 encryption algorithm) of the username, password, URI, HTTP request method, and a randomly generated "nonce" value provided by the server. Both sides of the transaction know the password and use it to compute digests. If the digests match, access is granted. Transactions are thus somewhat more secure than they would be otherwise because digests are valid for only a single URI request and nonce value. The server, however, must still maintain a database of the original passwords. And, as of this writing, digest authentication is not supported by very many browsers.

The moral of the story is that HTTP authentication can be useful in low-security environments. For example, a site that charges for access to content--say, an online newspaper--is more concerned with ease of use and administration than lock-tight security, so HTTP authentication is often sufficient.

8.1.1. Retrieving Authentication Information

A servlet can retrieve information about the server's authentication using two methods introduced in Chapter 4, "Retrieving Information": getRemoteUser() and getAuthType(). Example 8-1 shows a simple servlet that tells the client its name and what kind of authentication has been performed (basic, digest, or some alternative). To see this servlet in action, you should install it in your web server and protect it with a basic or digest security scheme. Because web server implementations vary, you'll need to check your server documentation for the specifics on how to set this up.

Example 8-1. Snooping the authorization information

import java.io.*;
import javax.servlet.*;
import javax.servlet.http.*;

public class AuthorizationSnoop extends HttpServlet { 

  public void doGet(HttpServletRequest req, HttpServletResponse res)
                               throws ServletException, IOException {
    res.setContentType("text/html");
    PrintWriter out = res.getWriter();

    out.println("<HTML><HEAD><TITLE>Authorization Snoop</TITLE></HEAD><BODY>");

    out.println("<H1>This is a password protected resource</H1>");
    out.println("<PRE>");
    out.println("User Name: " + req.getRemoteUser());
    out.println("Authorization Type: " + req.getAuthType());
    out.println("</PRE>");
    out.println("</BODY></HTML>");  
  }
}

8.1.2. Custom Authorization

Normally, client authentication is handled by the web server. The server administrator tells the server which resources are to be restricted to which users, and information about those users (such as their passwords) is somehow made available to the server.

This is often good enough, but sometimes the desired security policy cannot be implemented by the server. Maybe the user list needs to be stored in a format that is not readable by the server. Or maybe you want any username to be allowed, as long as it is given with the appropriate "skeleton key" password. To handle these situations, we can use servlets. A servlet can be implemented so that it learns about users from a specially formatted file or a relational database; it can also be written to enforce any security policy you like. Such a servlet can even add, remove, or manipulate user entries--something that isn't supported directly in the Servlet API, except through proprietary server extensions.[1]

A servlet uses status codes and HTTP headers to manage its own security policy. The servlet receives encoded authorization credentials in the Authorization header. If it chooses to deny those credentials, it does so by sending the SC_UNAUTHORIZED status code and a WWW-Authenticate header that describes the desired credentials. A web server normally handles these details without involving its servlets, but for a servlet to do its own authorization, it must handle these details itself, while the server is told not to restrict access to the servlet.

The Authorization header, if sent by the client, contains the client's username and password. With the basic authorization scheme, the Authorization header contains the string of "username:password" encoded in Base64. For example, the username of "webmaster" with the password "try2gueSS" is sent in an Authorization header with the value:

If a servlet needs to, it can send an WWW-Authenticate header to tell the client the authorization scheme and the realm against which users will be verified. A realm is simply a collection of user accounts and protected resources. For example, to tell the client to use basic authorization for the realm "Admin", the WWW-Authenticate header is:

Example 8-2 shows a servlet that performs custom authorization, receiving an Authorization header and sending the SC_UNAUTHORIZED status code and WWW-Authenticate header when necessary. The servlet restricts access to its "top-secret stuff" to those users (and passwords) it recognizes in its user list. For this example, the list is kept in a simple Hashtable and its contents are hard-coded; this would, of course, be replaced with some other mechanism, such as an external relational database, for a production servlet.

To retrieve the Base64-encoded username and password, the servlet needs to use a Base64 decoder. Fortunately, there are several freely available decoders. For this servlet, we have chosen to use the sun.misc.BASE64Decoder class that accompanies the JDK. Being in the sun.* hierarchy means it's unsupported and subject to change, but it also means it's probably already on your system. You can find the details of Base64 encoding in RFC 1521 at http://www.ietf.org/rfc/rfc1521.txt.

Example 8-2. Security in a servlet

import java.io.*;
import java.util.*;
import javax.servlet.*;
import javax.servlet.http.*;

public class CustomAuth extends HttpServlet { 

  Hashtable users = new Hashtable();

  public void init(ServletConfig config) throws ServletException {
    super.init(config);
    users.put("Wallace:cheese",     "allowed");
    users.put("Gromit:sheepnapper", "allowed");
    users.put("Penguin:evil",       "allowed");
  }

  public void doGet(HttpServletRequest req, HttpServletResponse res)
                               throws ServletException, IOException {
        PrintWriter out = res.getWriter();

    // Get Authorization header
    String auth = req.getHeader("Authorization");

    // Do we allow that user?
    if (!allowUser(auth)) {
      // Not allowed, so report he's unauthorized
      res.setHeader("WWW-Authenticate", "BASIC realm=\"users\"");
      res.sendError(res.SC_UNAUTHORIZED);
      // Could offer to add him to the allowed user list
    }
    else {
      // Allowed, so show him the secret stuff
      out.println("Top-secret stuff");
    }
  }

  // This method checks the user information sent in the Authorization
  // header against the database of users maintained in the users Hashtable.
  protected boolean allowUser(String auth) throws IOException {
    if (auth == null) return false;  // no auth

    if (!auth.toUpperCase().startsWith("BASIC ")) 
      return false;  // we only do BASIC

    // Get encoded user and password, comes after "BASIC "
    String userpassEncoded = auth.substring(6);

    // Decode it, using any base 64 decoder
    sun.misc.BASE64Decoder dec = new sun.misc.BASE64Decoder();
    String userpassDecoded = new String(dec.decodeBuffer(userpassEncoded));
    
    // Check our user list to see if that user and password are "allowed"
    if ("allowed".equals(users.get(userpassDecoded)))
      return true;
    else
      return false;
  }
}

Although the web server is told to grant any client access to this servlet, the servlet sends its top-secret output only to those users it recognizes. With a few modifications, it could allow any user with a trusted skeleton password. Or, like anonymous FTP, it could allow the "anonymous" username with any email address given as the password.

Custom authorization can be used for more than restricting access to a single servlet. Were we to add this logic to our ViewFile servlet, we could implement a custom access policy for an entire set of files. Were we to create a special subclass of HttpServlet and add this logic to that, we could easily restrict access to every servlet derived from that subclass. Our point is this: with custom authorization, the security policy limitations of the server do not limit the possible security policy implementations of its servlets.

8.1.3. Form-based Custom Authorization

Servlets can also perform custom authorization without relying on HTTP authorization, by using HTML forms and session tracking instead. It's a bit more effort to give users a well-designed, descriptive, and friendly login page. For example, imagine you're developing an online banking site. Would you rather let the browser present a generic prompt for username and password or provide your customers with a custom login form that politely asks for specific banking credentials, as shown in Figure 8-1?

Figure 8-1. An online banking login screen

Many banks and other online services have chosen to use form-based custom authorization. Implementing such a system is relatively straightforward with servlets. First, we need the login page. It can be written like any other HTML form. Example 8-3 shows a sample login.html file that generates the form shown in Figure 8-2.

Example 8-3. The login.html file

<HTML>
<TITLE>Login</TITLE>
<BODY>
<FORM ACTION=/servlet/LoginHandler METHOD=POST>
<CENTER>
<TABLE BORDER=0>
<TR><TD COLSPAN=2>
<P ALIGN=center>
Welcome!  Please enter your Name<br>
 and Password to log in.
</TD></TR>

<TR><TD>
<P ALIGN=right><B>Name:</B>
</TD>
<TD>
<P><INPUT TYPE=text NAME="name" VALUE="" SIZE=15>
</TD></TR>

<TR><TD>
<P ALIGN=right><B>Password:</B>
</TD>
<TD>
<P><INPUT TYPE=password NAME="passwd" VALUE="" SIZE=15>
</TD></TR>

<TR><TD COLSPAN=2>
<CENTER>
<INPUT TYPE=submit VALUE="  OK   ">
</CENTER>
</TD></TR>
</TABLE>
</BODY></HTML>

Figure 8-2. A friendly login form

This form asks the client for her name and password, then submits the information to the LoginHandler servlet that validates the login. We'll see the code for LoginHandler soon, but first we should ask ourselves, "When is the client going to see this login page?" It's clear she can browse to this login page directly, perhaps following a link on the site's front page. But what if she tries to access a protected resource directly without first logging in? In that case, she should be redirected to this login page and, after a successful login, be redirected back to the original target. The process should work as seamlessly as having the browser pop open a window--except in this case the site pops open an intermediary page.

Example 8-4 shows a servlet that implements this redirection behavior. It outputs its secret data only if the client's session object indicates she has already logged in. If she hasn't logged in, the servlet saves the request URL in her session for later use, and then redirects her to the login page for validation.

Example 8-4. A protected resource

import java.io.*;
import java.util.*;
import javax.servlet.*;
import javax.servlet.http.*;

public class ProtectedResource extends HttpServlet {

  public void doGet(HttpServletRequest req, HttpServletResponse res)
                               throws ServletException, IOException {
    res.setContentType("text/plain");
    PrintWriter out = res.getWriter();

    // Get the session
    HttpSession session = req.getSession(true);

    // Does the session indicate this user already logged in?
    Object done = session.getValue("logon.isDone");  // marker object
    if (done == null) {
      // No logon.isDone means he hasn't logged in.
      // Save the request URL as the true target and redirect to the login page.
      session.putValue("login.target",
                       HttpUtils.getRequestURL(req).toString());
      res.sendRedirect(req.getScheme() + "://" +
                       req.getServerName() + ":" + req.getServerPort() +
                       "/login.html");
      return;
    }

    // If we get here, the user has logged in and can see the goods
    out.println("Unpublished O'Reilly book manuscripts await you!");
  }
}

This servlet sees if the client has already logged in by checking her session for an object with the name "logon.isDone". If such an object exists, the servlet knows that the client has already logged in and therefore allows her to see the secret goods. If it doesn't exist, the client must not have logged in, so the servlet saves the request URL under the name "login.target", and then redirects the client to the login page. Under form-based custom authorization, all protected resources (or the servlets that serve them) have to implement this behavior. Subclassing, or the use of a utility class, can simplify this task.

Now for the login handler. After the client enters her information on the login form, the data is posted to the LoginHandler servlet shown in Example 8-5. This servlet checks the username and password for validity. If the client fails the check, she is told that access is denied. If the client passes, that fact is recorded in her session object and she is immediately redirected to the original target.

Example 8-5. Handling a login

import java.io.*;
import java.util.*;
import javax.servlet.*;
import javax.servlet.http.*;

public class LoginHandler extends HttpServlet {

  public void doPost(HttpServletRequest req, HttpServletResponse res)
                                throws ServletException, IOException {
    res.setContentType("text/html");
    PrintWriter out = res.getWriter();

    // Get the user's name and password
    String name = req.getParameter("name");
    String passwd = req.getParameter("passwd");

    // Check the name and password for validity
    if (!allowUser(name, passwd)) {
      out.println("<HTML><HEAD><TITLE>Access Denied</TITLE></HEAD>");
      out.println("<BODY>Your login and password are invalid.<BR>");
      out.println("You may want to <A HREF=\"/login.html\">try again</A>");
      out.println("</BODY></HTML>");
    }
    else {
      // Valid login. Make a note in the session object.
      HttpSession session = req.getSession(true);
      session.putValue("logon.isDone", name);  // just a marker object

      // Try redirecting the client to the page he first tried to access
      try {
        String target = (String) session.getValue("login.target");
        if (target != null)
          res.sendRedirect(target);
        return;
      }
      catch (Exception ignored) { }

      // Couldn't redirect to the target. Redirect to the site's home page.
      res.sendRedirect(req.getScheme() + "://" + 
                       req.getServerName() + ":" + req.getServerPort());
    }
  }

  protected boolean allowUser(String user, String passwd) {
    return true;  // trust everyone
  }
}

The actual validity check in this servlet is quite simple: it assumes any username and password are valid. That keeps things simple, so we can concentrate on how the servlet behaves when the login is successful. The servlet saves the user's name (any old object will do) in the client's session under the name "logon.isDone", as a marker that tells all protected resources this client is okay. It then redirects the client to the original target saved as "login.target", seamlessly sending her where she wanted to go in the first place. If that fails for some reason, the servlet redirects the user to the site's home page.

Chapter 8. Security

Contents: