9.5. Protecting Data on the Web

The Web isn't a secure environment. The open nature of the networking and web protocols (TCP, IP, and HTTP) has allowed the development of many tools that can listen in on data transmitted between browsers and web servers. It is easy to snoop on passing traffic and read the contents of HTTP requests and responses. With a little extra effort, a hacker can manipulate traffic and even masquerade as another user.

If an application transmits sensitive information over the Web, an encrypted connection should be provided between the browser and the web server. The information that would warrant an encrypted connection includes:
Here we focus on the common method of encrypting data sent over the Web: the Secure Sockets Layer (SSL). We discuss the basic mechanics of SSL in this section, and provide an installation and configuration guide for SSL and Apache as part of Appendix A. This section isn't designed to cover the enormous topic of encryption. We limit our brief discussion to the features of SSL and how SSL can protect web traffic. More details about cryptographic systems can be found in the references listed in Appendix E.

9.5.1. The Secure Sockets Layer Protocol

The data that is sent between web servers and browsers can be protected using the encryption services of the Secure Sockets Layer protocol, SSL. The SSL protocol addresses three goals:
SSL was originally developed by Netscape, and there are two versions: SSL v2.0 and SSL v3.0. We don't detail the differences here, but Version 3.0 supports more security features than 2.0. The SSL protocol isn't a standard as such, and the Transport Layer Security 1.0 (TLS) protocol has been proposed by the Internet Engineering Task Force (IETF) as an SSL v3.0 replacement.

9.5.1.1. SSL architecture

To understand how SSL works, you need to consider how browsers and web servers actually send and receive HTTP messages. Browsers send HTTP requests by calling on the host system's TCP/IP networking software, the software that does the work of sending and receiving data over the Internet. When a request is to be sent (for example, when a user clicks on a hypertext link), the browser formulates the HTTP request in memory and uses the host's TCP/IP network service to send the request to the server. TCP/IP doesn't care that the message is HTTP; it is only responsible for getting the complete message to the destination. When a web server receives a message, data is read from its host's TCP/IP service and then interpreted as HTTP. We discuss the relationship between HTTP and TCP/IP in more detail in Appendix B.

As shown in Figure 9-4, the SSL protocol operates as a layer between the browser and the TCP/IP services provided by the host. A browser passes the HTTP message to the SSL layer to be encrypted before the message is passed to the host's TCP/IP service. The SSL layer, configured into the web server, decrypts the message from the TCP/IP service and then passes it to the web server. Once SSL is installed and the web server is configured correctly, the HTTP requests and responses are automatically encrypted. There is no scripting required to use the SSL services.

Figure 9-4. HTTP clients and servers, SSL, and the network layer that implements TCP/IP

Because SSL sits between HTTP and TCP/IP, secure web sites technically don't serve HTTP, at least not directly over TCP.
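The layering described above can be sketched from the client side in Python. This is only an illustration, not how any particular browser is implemented: the function names build_request, read_all, and fetch are hypothetical, and Python's standard ssl module negotiates TLS, the successor to SSL, rather than SSL v2.0 or v3.0. The point it tries to show is that the HTTP message is built the same way in both cases, and only the layer it is handed to changes.

```python
import socket
import ssl


def build_request(host: str, path: str = "/") -> bytes:
    """Formulate a plain HTTP/1.1 request in memory (hypothetical helper)."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def read_all(sock) -> bytes:
    """Read from a connected socket until the peer closes the connection."""
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)


def fetch(host: str, path: str = "/", port: int = 443, secure: bool = True) -> bytes:
    """Send the same HTTP bytes either straight over TCP or through a TLS layer."""
    request = build_request(host, path)
    with socket.create_connection((host, port), timeout=10) as tcp_sock:
        if not secure:
            # Plain HTTP: the message goes directly to the TCP/IP service.
            tcp_sock.sendall(request)
            return read_all(tcp_sock)
        # Secure case: the TLS layer sits between the HTTP message and TCP.
        # It encrypts what is sent and decrypts what is received.
        context = ssl.create_default_context()
        with context.wrap_socket(tcp_sock, server_hostname=host) as tls_sock:
            tls_sock.sendall(request)
            return read_all(tls_sock)
```

Calling fetch("secure.example.com") would attempt a real network connection, so the host name is a placeholder. Note that build_request knows nothing about encryption, which mirrors the statement that no scripting is required to use the secure layer once it is in place.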
URLs that locate resources on a secure server begin with https://, which means HTTP over SSL. The default port for an SSL service is 443, not port 80 as with HTTP; for example, when a browser connects to https://secure.example.com, it makes a TCP/IP connection to port 443 on secure.example.com. Most browsers and web servers can support SSL, but keys and certificates need to be included in the configuration of the server (and possibly the browser, if client certification is required).

9.5.1.2. Cipher suites

To provide a service that addresses the goals of privacy, integrity, and authentication, SSL uses a combination of cryptographic techniques and functions, such as message digests, digital certificates, and, of course, encryption. There are many different standard algorithms that implement these functions, and SSL can use different combinations to meet particular requirements, such as being legal to use in a particular country! When an SSL connection is established, clients and servers negotiate the best combination of techniques (based on common capabilities) to ensure the highest level of protection. The combinations of techniques that can be negotiated are known as cipher suites.

9.5.1.3. SSL sessions

When a browser connects to a secure site, the SSL protocol performs the following four steps:
These four steps briefly summarize the network handshaking between the browser and server when SSL is used. Once the browser and server have completed these steps, the HTTP request can be encrypted by SSL and sent to the web server.

The SSL handshaking is slow, and if it were to occur with every HTTP request, the performance of a secure web site would be poor. To improve performance, SSL uses the concept of sessions to allow multiple requests to share the negotiated cipher suite, the shared secret key, and the certificates. An SSL session is managed by the SSL software and isn't the same as a PHP session.

9.5.1.4. Certificates and Certification Authorities

A signed digital certificate encodes information so that the integrity of the information and the signature can be tested. The information contained in a certificate that is used by SSL includes details about the organization and the organization's public key. The public key that is contained in a certificate matches a private key held by the organization that is configured into the organization's web server. The browser uses the public key when an SSL session is established to encrypt a secret. The secret can only be decrypted using the private key configured into the organization's server. Encryption techniques that use a public and private key are known as asymmetric, and SSL uses asymmetric encryption to exchange a secret key. The secret key can then be used to encrypt the messages transmitted over the Internet.

A signed certificate also contains details about the Certification Authority (CA). The CA digitally signs a certificate by adding its own organization details, an encrypted digest of the certificate, and its own public key. With this information encoded, the complete signed certificate can be verified as being correct.

There are dozens, perhaps hundreds, of CAs. A browser, or the user confronted by a browser warning, can't be expected to recognize the digital signatures from all these authorities.
The X.509 certificate standard solves this problem by allowing issuing CAs to have their signatures digitally signed by a more authoritative CA, which can in turn have its signature signed by yet another, more trusted CA. Eventually the chain of signatures ends with that of a root Certification Authority. It is the certificates from the root CAs that are often preinstalled in a browser. Some browsers allow users to add their own trusted certificates.

Self-signed certificates can be created and used to configure a web server with SSL. We show how to create self-signed certificates in Appendix A. But will they be trusted? For secure applications, the answer is probably not.

Copyright © 2003 O'Reilly & Associates. All rights reserved.