TCP Connections
To understand port forwarding,
it's important to know some details about TCP, the Transmission
Control Protocol. TCP is a fundamental building block of the
Internet. Built on top of IP, it is the transport mechanism for many
application-level Internet protocols such as FTP, Telnet, HTTP, SMTP,
POP, IMAP, and SSH itself.
TCP comes with strong guarantees. A TCP connection is a virtual,
full-duplex circuit between two communicating parties, acting like a
two-way pipe. Either side may write any number of bytes at any time
to the pipe, and the bytes are guaranteed to arrive unaltered and in
order at the other side. The mechanisms that implement these
guarantees, though, are designed to counter transmission problems in
the network, for example by routing around failed links or by
retransmitting data corrupted by noise or lost due to temporary
network congestion.
They aren't effective against deliberate attempts to steal a
connection or alter data in transit. SSH provides this protection
that TCP alone lacks.
If an application doesn't need these guarantees about data
integrity and order, or doesn't want the overhead associated
with them, another protocol called User Datagram Protocol
(UDP) often suffices. It is
packet-oriented, and has no guarantees of delivery or packet
ordering. Some protocols that run over UDP are NFS, DNS, DHCP,
NetBIOS, TFTP, Kerberos, SYSLOG, and NTP.
When a program establishes a TCP connection to a service, it needs
two pieces of information: the IP address of the destination machine
and a way to identify the desired service. TCP (and UDP) use a
positive integer, called a port
number, to identify a service. For example,
SSH uses port 22, telnet uses port 23, and IMAP
uses port 143. Port numbers allow multiple services at the same IP
address.
The combination of an IP address and a port number is called a
socket.
For example, if you run telnet to connect to
port 23 on the machine at IP address 128.220.91.4, the socket is
denoted "(128.220.91.4,23)." Simply put, when you make a
TCP connection, its destination is a socket. The source (client
program) also has a socket on its end of the connection, and the
connection as a whole is completely defined by the pair of source and
destination sockets.
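As a small illustration of the socket pair, the following Python sketch opens a TCP connection over the loopback interface and prints the source and target sockets as seen from each end. The loopback address 127.0.0.1 and the use of port 0 (which asks the operating system to pick any free port) are assumptions made for the demonstration, not part of the discussion above.

```python
import socket

# A listener on the loopback address; port 0 lets the OS choose a free port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
target_socket = server.getsockname()      # (IP address, port) of the listener

# The client connects; the OS assigns its source socket automatically.
client = socket.create_connection(target_socket)
accepted, _ = server.accept()

# Each end reports the same pair of sockets, just from opposite directions.
client_view = (client.getsockname(), client.getpeername())
server_view = (accepted.getpeername(), accepted.getsockname())

print("client sees: source %s -> target %s" % client_view)
print("server sees: source %s -> target %s" % server_view)

client.close()
accepted.close()
server.close()
```

Both lines print the same (source, target) pair, which is what identifies this particular connection.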
In order for a connection attempt to a socket to succeed, something
must be "listening" on that socket. That is, a program running on the
destination machine
must ask TCP to accept connection requests on that port and to pass
the connections on to the program. If you've ever attempted a
TCP connection and received the response "connection
refused," it means that the remote machine is up and running,
but nothing is listening on the target socket.
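The difference between a listening and a non-listening socket can be observed with a short Python sketch. It assumes the loopback address 127.0.0.1 and finds an unused port by binding to port 0 and then closing the probe socket; the helper name try_connect is hypothetical. There is a small chance another process claims the freed port in between, so treat this as a demonstration rather than a reliable port scanner.

```python
import socket

def try_connect(host, port):
    """Attempt a TCP connection and report whether it was accepted or refused."""
    try:
        with socket.create_connection((host, port), timeout=5):
            return "connected"
    except ConnectionRefusedError:
        return "refused"

# A socket with something listening on it.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
listening_port = listener.getsockname()[1]

# A port with nothing listening: bind a probe to get a free port number,
# then close it so the port is unused again.
probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
probe.bind(("127.0.0.1", 0))
unused_port = probe.getsockname()[1]
probe.close()

result_listening = try_connect("127.0.0.1", listening_port)
result_unused = try_connect("127.0.0.1", unused_port)
print("port %d: %s" % (listening_port, result_listening))
print("port %d: %s" % (unused_port, result_unused))

listener.close()
```

The machine is up in both cases; only the presence of a listener differs between the two attempts.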
How does a client program know the target port number of a listening server? Port numbers for many protocols are standardized, assigned by the
Internet Assigned Numbers Authority or IANA. (IANA's complete list of port numbers is found at
http://www.isi.edu/in-notes/iana/assignments/port-numbers.)
For instance, the TCP port number assigned to the NNTP (Usenet news) protocol is 119. Therefore, news servers listen on port 119, and newsreaders (clients) connect to them via port 119. More specifically, if a newsreader is configured to talk to a news server at IP address 10.1.2.3, it requests a TCP connection to the socket (10.1.2.3,119).
Port numbers aren't always hardcoded into programs. Many operating systems let applications refer to protocols by name, instead of number, by defining a table of TCP names and port numbers. Programs can then look up port numbers by the protocol name. Under Unix, the table is often contained in the file /etc/services or the NIS services map, and queries are performed using the library routines getservbyname(), getservbyport(), and related procedures. Other environments allow servers to register their listening ports dynamically via a naming service, such as the AppleTalk Name Binding Protocol or DNS's WKS and SRV records.
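Python's socket module wraps the same library routines, so a lookup by name can be sketched as follows. The helper lookup_port is a hypothetical name introduced here, and the results depend on the services table of the machine where the code runs; on a system without such a table the lookups simply return None.

```python
import socket

def lookup_port(service, protocol="tcp"):
    """Return the port number registered for a service name, or None if unknown."""
    try:
        return socket.getservbyname(service, protocol)
    except OSError:
        # Raised when the service/protocol pair is not in the services table.
        return None

for name in ("ssh", "telnet", "imap", "nntp"):
    print(name, lookup_port(name))

# The reverse lookup maps a port number back to a service name.
ssh_port = lookup_port("ssh")
if ssh_port is not None:
    print(ssh_port, socket.getservbyport(ssh_port, "tcp"))
```

A program written this way picks up local changes to the table without being recompiled or reconfigured.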
So far, we've discussed the port number used by a TCP server when a TCP client program wants to connect. We call this the target port number. The client also uses a port number, called the source port number, so the server can transmit to the client. If you combine the client's IP address and its source port number, you get the client's socket.
Unlike target port numbers, source port numbers aren't
standard. In most cases, in fact, neither the client nor the server
cares which source port number is used by the client. Often a client
will let TCP select an unused port number for the source. (The
Berkeley r-commands, however, do care about source ports. [Section 3.4.2.3, "Trusted-host authentication (Rhosts and RhostsRSA)"]) If you examine the existing TCP connections on a machine with a command such as netstat -a or lsof -i tcp, you will see connections to the well-known port numbers for common services (e.g., 23 for Telnet, 22 for SSH), with large, apparently random source port numbers on the other end. Those source ports were chosen from the range of unassigned ports by TCP on the machines initiating those connections.
Once established, a TCP connection is completely determined by the combination of its source and target sockets. Therefore, multiple TCP clients may connect to the same target socket. If the connections originate from different hosts, the IP address portions of their source sockets will differ, distinguishing the connections. If they come from two different programs running on the same host, TCP on that host ensures they have different source port numbers.
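The same-host case can be checked with a short Python sketch in which two clients connect to one listening socket. The loopback address and the OS-chosen listening port are assumptions for the demonstration; the point is only that TCP hands each client a distinct source port.

```python
import socket

# One listening socket on the loopback address, on a port chosen by the OS.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(2)
target = server.getsockname()

# Two clients on the same host connect to the same target socket.
first = socket.create_connection(target)
second = socket.create_connection(target)

first_source = first.getsockname()
second_source = second.getsockname()
first_target = first.getpeername()
second_target = second.getpeername()

print("first:  %s -> %s" % (first_source, first_target))
print("second: %s -> %s" % (second_source, second_target))

for sock in (first, second, server):
    sock.close()
```

The target sockets are identical and the source IP addresses match, so the differing source port numbers are what keep the two connections apart.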