
The Kernel Statement


While the kernel interface isn't technically a routing protocol, it has many characteristics of one, and GateD handles it similarly to one. The routes GateD chooses to install in the kernel forwarding table are those that will actually be used by the kernel to forward packets.

The add, delete and change operations GateD must use to update the typical kernel forwarding table take a non-trivial amount of time. This does not present a problem for older routing protocols (RIP, EGP), which are not particularly time critical and do not easily handle very large numbers of routes anyway. The newer routing protocols (OSPF, BGP) have stricter timing requirements and are often used to process many more routes. The speed of the kernel interface becomes critical when these protocols are used.

To prevent GateD from locking up for significant periods of time installing large numbers of routes (up to a minute or more has been observed on real networks), the processing of these routes is now done in batches. The size of these batches may be controlled by the tuning parameters described below, but normally the default parameters will provide the proper functionality.
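The batching idea can be sketched as follows. This is a minimal illustration in Python, not GateD's implementation: the names `install_in_batches` and `batch_limit` are hypothetical, and the kernel table is modelled as a plain dictionary.

```python
from collections import deque


def install_in_batches(pending, kernel_table, batch_limit):
    """Apply at most `batch_limit` queued updates, then return.

    `pending` is a deque of (operation, destination, gateway) tuples.
    Returns the number of updates applied so the caller can decide
    whether another batch is needed.
    """
    applied = 0
    while pending and applied < batch_limit:
        op, dest, gateway = pending.popleft()
        if op == "delete":
            kernel_table.pop(dest, None)
        else:  # "add" and "change" both end with dest -> gateway
            kernel_table[dest] = gateway
        applied += 1
    return applied


# Hypothetical workload: 25 routes queued, processed 10 at a time.
queue = deque(("add", f"10.0.{i}.0/24", "192.0.2.1") for i in range(25))
table = {}
batches = []
while queue:
    batches.append(install_in_batches(queue, table, batch_limit=10))
    # A real daemon would return to its event loop here and service
    # protocol timers before starting the next batch.
```

Bounding the work done per batch is what keeps the daemon responsive: time-critical protocol processing can run between batches instead of waiting for the whole queue to drain.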

During normal shutdown processing, GateD deletes all the routes it has installed in the kernel forwarding table, except for those marked with retain. Optionally, GateD can leave all routes in the kernel forwarding table by not deleting any of them. In this case, changes are made to ensure that routes with a retain indication are installed in the table. This is useful on systems with large numbers of routes, because the routes do not have to be re-installed when GateD restarts, which can greatly reduce the time it takes to recover from a restart.
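One way to read the two shutdown behaviours is as set operations. The sketch below is an interpretation of the paragraph above, not GateD code; `routes_left_at_exit` and `flush_at_exit` are hypothetical names.

```python
def routes_left_at_exit(installed, retain, flush_at_exit=True):
    """Return the set of routes remaining in the kernel after shutdown.

    installed -- routes the daemon currently has in the kernel table
    retain    -- routes marked with a retain indication
    """
    if flush_at_exit:
        # Normal shutdown: installed routes are deleted unless retained.
        return set(installed) & set(retain)
    # No flush: nothing is deleted, and retained routes are made sure
    # to be present in the table.
    return set(installed) | set(retain)


installed = {"10.1.0.0/16", "10.2.0.0/16", "0.0.0.0/0"}
retain = {"0.0.0.0/0"}
after_flush = routes_left_at_exit(installed, retain)
after_noflush = routes_left_at_exit(installed, retain, flush_at_exit=False)
```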

Forwarding tables and Routing tables


The table in the kernel that controls the forwarding of packets is a forwarding table, also known in ISO speak as a forwarding information base, or FIB. The table that GateD uses internally to store routing information it learns from routing protocols is a routing table, known in ISO speak as a routing information base, or RIB. The routing table is used to collect and store routes from various protocols. For each unique combination of network and mask an active route is chosen; this is the route with the best (numerically smallest) preference. All the active routes are installed in the kernel forwarding table. The entries in this table are what the kernel actually uses to forward packets.
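The selection rule can be illustrated with a short Python sketch. The `Route` record, the `select_active` function and the preference values are assumptions made for the example; only the rule itself (one active route per network and mask, smallest preference wins) comes from the text.

```python
from collections import namedtuple

Route = namedtuple("Route", "network mask protocol preference gateway")


def select_active(rib):
    """Return {(network, mask): Route} holding the best route per key."""
    active = {}
    for route in rib:
        key = (route.network, route.mask)
        best = active.get(key)
        # Numerically smaller preference is better.
        if best is None or route.preference < best.preference:
            active[key] = route
    return active


rib = [
    Route("10.1.0.0", "255.255.0.0", "rip", 100, "192.0.2.1"),
    Route("10.1.0.0", "255.255.0.0", "ospf", 10, "192.0.2.2"),
    Route("10.1.0.0", "255.255.255.0", "rip", 100, "192.0.2.3"),
]
fib = select_active(rib)
```

The two routes for 10.1.0.0 with mask 255.255.0.0 compete and the one with preference 10 becomes active. The route with mask 255.255.255.0 has a different key, so it is active on its own.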

Updating the Forwarding Table


There are two main methods of updating the kernel FIB: the ioctl() interface and the routing socket interface. Their characteristics are described here.

Updating the Forwarding Table with the ioctl interface


The ioctl() interface to the forwarding table was introduced and widely distributed in BSD 4.3. It is a one-way interface: it only allows GateD to update the kernel forwarding table. It has several other limitations:

Fixed subnet masks
The BSD 4.3 networking code assumed that all subnets of a given network had the same subnet mask. This limitation is enforced by the kernel. The network mask is not stored in the kernel forwarding table, but determined when a packet is forwarded by searching for interfaces on the same network.

One way interface
GateD is able to update the kernel forwarding table, but it is not aware of other modifications of the forwarding table. GateD is able to listen to ICMP messages and guess how the kernel has updated the forwarding table in response to ICMP redirects.

Blind updates
GateD is not able to detect changes to the forwarding table resulting from the use of the route command by the system administrator. Use of the route command on systems that use the ioctl() interface is strongly discouraged while GateD is running.

Changes not supported
In all known implementations, no change operation is supported. To change a route that exists in the kernel, the route must be deleted and a new one added.
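The last limitation can be illustrated with a small sketch in which a change is emulated as a delete followed by an add. The dictionary stands in for the kernel table and `change_route` is a hypothetical helper, not a GateD or kernel function.

```python
def change_route(kernel_table, dest, new_gateway):
    """Emulate a change on a table that only supports add and delete."""
    if dest not in kernel_table:
        raise KeyError(f"no route to {dest}")
    del kernel_table[dest]            # step 1: delete the old route
    # Between the two steps no route to `dest` exists at all.
    kernel_table[dest] = new_gateway  # step 2: add the replacement


table = {"10.1.0.0": "192.0.2.1"}
change_route(table, "10.1.0.0", "192.0.2.2")
```

Because the update takes two separate operations, there is a short window in which the destination has no route, which an atomic change operation would avoid.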

Updating the Forwarding Table with the routing socket interface


The routing socket interface to the kernel forwarding table was introduced in BSD 4.3 Reno, widely distributed in BSD 4.3 Net/2 and improved in BSD 4.4. This interface is simply a socket, similar to a UDP socket, on which the kernel and GateD exchange messages. It has several advantages over the ioctl() interface:

Variable subnet masks
The network mask is passed to the kernel explicitly. This allows different masks to be used on subnets of the same network. It also allows routes with masks that are more general than the natural mask to be used. This is known as classless routing.

Two way interface
Not only is GateD able to change the kernel forwarding table with this interface, but the kernel can also report changes to the forwarding table to GateD. The most interesting of these is an indication that a redirect has modified the kernel forwarding table; this means that GateD no longer needs to monitor ICMP messages to learn about redirects. In addition, there is an indication of whether the kernel processed the redirect, so GateD can safely ignore redirect messages that the kernel did not process.

Updates visible
Changes to the routing table by other processes, including the route(8) command, are received via the routing socket. This allows GateD to ensure that the kernel forwarding table is in sync with the routing table. It also allows the system administrator to perform some operations with the route(8) command while GateD is running.

Changes supported
There is a functioning change message that allows routes in the kernel to be atomically changed. Some early versions of the routing socket code had bugs in the change message processing. There are compilation time and configuration time options that cause delete and add sequences to be used in lieu of change messages.

New levels of kernel/GateD communications may be added by adding new message types.
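To make the variable subnet mask point concrete, the sketch below computes the classful (natural) prefix length of an address and then performs a lookup over routes whose masks are both shorter and longer than the natural one. It uses Python's standard `ipaddress` module. The table contents and gateway labels are invented, and the longest-prefix-match lookup is an assumption about how a classless table is searched, not a description of any particular kernel.

```python
import ipaddress


def natural_prefix_len(address):
    """Classful (natural) prefix length for an IPv4 address."""
    first_octet = int(ipaddress.IPv4Address(address)) >> 24
    if first_octet < 128:
        return 8    # class A
    if first_octet < 192:
        return 16   # class B
    if first_octet < 224:
        return 24   # class C
    raise ValueError("class D/E addresses have no natural mask")


def lookup(table, address):
    """Longest-prefix match over {IPv4Network: gateway}."""
    addr = ipaddress.IPv4Address(address)
    matches = [net for net in table if addr in net]
    if not matches:
        return None
    return table[max(matches, key=lambda net: net.prefixlen)]


table = {
    # More general than the natural /24 mask (an aggregate).
    ipaddress.IPv4Network("192.168.0.0/16"): "gw-aggregate",
    ipaddress.IPv4Network("192.168.5.0/24"): "gw-natural",
    # More specific than the natural mask (a subnet).
    ipaddress.IPv4Network("192.168.5.128/25"): "gw-subnet",
}
```

With explicit masks all three routes can coexist, which is not possible when the mask has to be inferred from the address class and the local interfaces.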

Reading the Forwarding Table


When GateD starts up, it reads the kernel forwarding table and installs corresponding routes in the routing table. These routes are called remnants and are timed out after a configured interval (which defaults to 3 minutes), or as soon as a more attractive route is learned. This allows forwarding to occur during the time it takes the routing protocols to start learning routes.
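The remnant lifetime rule can be sketched as below. The function and parameter names are hypothetical, and the sketch assumes that any route learned from a protocol counts as more attractive than a remnant for the same destination.

```python
REMNANT_HOLD_SECONDS = 180  # the 3-minute default mentioned above


def surviving_remnants(remnants, learned, now, started_at,
                       hold=REMNANT_HOLD_SECONDS):
    """Return the remnant destinations that should still be kept.

    remnants -- destinations read from the kernel at startup
    learned  -- destinations for which a protocol has since supplied
                a route (assumed to be more attractive)
    """
    if now - started_at >= hold:
        return set()  # hold time expired: all remnants are removed
    return set(remnants) - set(learned)


remnants = {"10.1.0.0/16", "10.2.0.0/16"}
early = surviving_remnants(remnants, {"10.1.0.0/16"}, now=60, started_at=0)
late = surviving_remnants(remnants, {"10.1.0.0/16"}, now=180, started_at=0)
```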

There are three main methods for reading the forwarding table from the kernel.

Reading forwarding table via kmem

On many systems, especially those based on BSD 4.3, GateD must have knowledge of the kernel's data structures and poke around in the kernel to read the current state of the forwarding table. This method is slow and subject to error if the kernel forwarding table is updated while GateD is in the middle of reading it. This can happen if the system administrator uses the route command, or an ICMP redirect message is received while GateD is starting up.

Due to an oversight, some systems, such as OSF/1, which are based on BSD 4.3 Reno or later, do not have the getkerninfo() system call described below, which allows GateD to read routes from the kernel without knowing about kernel internal structures. On these systems it is necessary to read the kernel radix tree by poking around in kernel memory. This is even more error prone than reading the hash-based forwarding table.

Reading the forwarding table via getkerninfo/sysctl

Besides the routing socket, BSD 4.3 Reno introduced the getkerninfo() system call. This call allows a user process (of which GateD is one) to read various information from the kernel without knowledge of the kernel data structures. In the case of the forwarding table, it is returned to GateD atomically as a series of routing socket messages. This prevents the problem associated with the forwarding table changing while GateD is in the process of reading it.

BSD 4.4 changed the getkerninfo() interface into the sysctl() interface, which takes different parameters, but otherwise functions identically.
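A buffer returned as "a series of routing socket messages" can be walked message by message using the length carried in each header. The sketch below assumes each message starts with a 16-bit length, an 8-bit version and an 8-bit type in native byte order, as the BSD `rt_msghdr` structure does; everything after those four bytes is platform specific and is treated here as an opaque payload. The helper names and the synthetic messages are invented for the demonstration.

```python
import struct

# Assumed header prefix: unsigned short length, unsigned char version,
# unsigned char type, in native byte order.
HEADER = struct.Struct("=HBB")


def split_messages(buffer):
    """Yield (version, type, payload) for each message in `buffer`."""
    offset = 0
    while offset + HEADER.size <= len(buffer):
        msglen, version, msgtype = HEADER.unpack_from(buffer, offset)
        if msglen < HEADER.size or offset + msglen > len(buffer):
            raise ValueError("truncated or corrupt message")
        payload = buffer[offset + HEADER.size:offset + msglen]
        yield version, msgtype, payload
        offset += msglen  # the length field covers the whole message
    if offset != len(buffer):
        raise ValueError("trailing bytes after last message")


def make_message(version, msgtype, payload):
    """Build a synthetic message for the demonstration below."""
    return HEADER.pack(HEADER.size + len(payload), version, msgtype) + payload


snapshot = make_message(3, 4, b"route-one") + make_message(3, 4, b"route-two!")
messages = list(split_messages(snapshot))
```

Because the whole table arrives in one buffer, the reader works on a consistent snapshot rather than on kernel memory that may change underneath it.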

Reading the forwarding table via OS specific methods

Some operating systems, for example SunOS 5, define their own method of reading the kernel forwarding table. The SunOS 5 version is similar in concept to the getkerninfo() method.

Reading the interface list


The kernel support subsystem of GateD is responsible for periodically reading the status of the kernel's physical and protocol interfaces. GateD detects changes in the interface list and notifies the protocols so they can start or stop instances or peers. The interface list is read in one of two ways:

Reading the interface list with SIOCGIFCONF


On systems based on BSD 4.3, 4.3 Reno and 4.3 Net/2, the SIOCGIFCONF ioctl interface is used to read the kernel interface list. With this method, a list of interfaces and some basic information about them is returned by the SIOCGIFCONF call. Other information must be learned by issuing further ioctls to obtain the interface network mask, flags, MTU, metric, destination address (for point-to-point interfaces) and broadcast address (for broadcast-capable interfaces).

GateD re-reads this list every 15 seconds, looking for changes. When the routing socket is in use, it also re-reads it whenever a message is received indicating a change in routing configuration. Receipt of a SIGUSR2 signal also causes GateD to re-read the list. This interval may be explicitly configured in the interface configuration.
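However the list is obtained, detecting changes amounts to comparing successive scans. The following sketch shows one way to do that comparison. The data layout (a dictionary of attribute dictionaries keyed by interface name) and the interface names are assumptions made for the example.

```python
def diff_interfaces(previous, current):
    """Compare two scans; each maps interface name -> attribute dict."""
    added = sorted(set(current) - set(previous))
    removed = sorted(set(previous) - set(current))
    changed = sorted(
        name for name in set(previous) & set(current)
        if previous[name] != current[name]
    )
    return added, removed, changed


earlier = {
    "le0": {"addr": "192.0.2.10", "mask": "255.255.255.0", "up": True},
    "sl0": {"addr": "198.51.100.1", "mask": "255.255.255.252", "up": True},
}
later = {
    "le0": {"addr": "192.0.2.10", "mask": "255.255.255.0", "up": False},
    "le1": {"addr": "203.0.113.5", "mask": "255.255.255.0", "up": True},
}
added, removed, changed = diff_interfaces(earlier, later)
```

Each of the three result lists would correspond to a notification to the protocols, for example to start an instance on a new interface or stop one on an interface that has gone away.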

Reading the interface list with sysctl

BSD 4.4 added the ability to read the kernel interface list via the sysctl system call. The interface status is returned atomically as a list of routing socket messages which GateD parses for the required information.

BSD 4.4 also added routing socket messages to report interface status changes immediately. This allows GateD to react quickly to changes in interface configuration.

When this method is in use, GateD re-reads the interface list only once a minute. It also re-reads it on routing table change indications and when a SIGUSR2 is received. This interval may be explicitly configured in the interface configuration.

Reading interface physical addresses

Later versions of the getkerninfo() and sysctl() interfaces return the interface physical addresses as part of the interface information. On most systems where this information is not returned, GateD scans the kernel physical interface list for this information for interfaces with IFF_BROADCAST set, assuming that their drivers are handled the same as Ethernet drivers. On some systems, such as SunOS 4 and SunOS 5, system-specific interfaces are used to learn this information.

The interface physical addresses are useful for IS-IS. For IP protocols, they are not currently used, but may be in the future.

Reading kernel variables


At startup, GateD reads some special variables out of the kernel. This is usually done with the nlist() (or kvm_nlist()) system call, but some systems use different methods.

The variables read include the status of UDP checksum creation and generation, IP forwarding and kernel version (for informational purposes). On systems where the routing table is read directly from kernel memory, the root of the hash table or radix tree routing table is read. On systems where interface physical addresses are not supplied by other means, the root of the interface list is read.

Special route flags

The later BSD-based kernels support the special route flags described here.

RTF_REJECT
Instead of forwarding a packet like a normal route, routes with RTF_REJECT cause packets to be dropped and unreachable messages to be sent to the packet originators. This flag is only valid on routes pointing at the loopback interface.

RTF_BLACKHOLE
Like the RTF_REJECT flag, routes with RTF_BLACKHOLE cause packets to be dropped, but unreachable messages are not sent. This flag is only valid on routes pointing at the loopback interface.

RTF_STATIC
When GateD starts, it reads all the routes currently in the kernel forwarding table. Besides interface routes, it usually marks everything else as a remnant from a previous run of GateD and deletes it after a few minutes. This means that routes added with the route command will not be retained after GateD has started.

To fix this, the RTF_STATIC flag was added. When the route command is used to install a route that is not an interface route, it sets the RTF_STATIC flag. This signals to GateD that the route was added by the system administrator and should be retained.
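The effect of the three flags can be summarised in a short sketch. The numeric values are those found in 4.4BSD's net/route.h and should be treated as an assumption, since other systems may define them differently. The two functions are hypothetical and only restate the behaviour described above.

```python
# Flag values as defined in 4.4BSD's <net/route.h>; treat them as an
# assumption, since other systems may use different values.
RTF_REJECT = 0x8
RTF_STATIC = 0x800
RTF_BLACKHOLE = 0x1000


def forward_action(flags):
    """Describe what happens to a packet that matches a route."""
    if flags & RTF_REJECT:
        return "drop, send unreachable"
    if flags & RTF_BLACKHOLE:
        return "drop silently"
    return "forward"


def startup_disposition(flags, is_interface_route):
    """Classify a route found in the kernel when the daemon starts."""
    if is_interface_route or flags & RTF_STATIC:
        return "retain"
    return "remnant"
```

For example, `startup_disposition(RTF_STATIC, False)` classifies an administrator-installed route as retained, while the same route without the flag would be treated as a remnant and eventually deleted.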

Kernel Configuration

    kernel {
        options
            [ nochange ]
            [ noflushatexit ]
            [ remnantholdtime time ]
            ;
        routes number ;
        flash
            [ limit number ]
            [ type interface | interior | all ]
            ;
        background
            [ limit number ]
            [ priority flash | higher | lower ]
            ;
        traceoptions trace_options ;
    } ;

Tracing options

While the kernel interface isn't technically a routing protocol, in many cases it is handled as one. The following two options are useful when specified on the command line, because the code that uses them is executed before the configuration file is parsed.

symbols
Symbols read from the kernel, by nlist() or a similar interface.

iflist
Interface list scan. This option is useful when entered from the command line, as the first interface list scan is performed before the configuration file is parsed.

The following tracing options may only be specified in the configuration file. They are not valid from the command line.

remnants
Routes read from the kernel when GateD starts.

request
Requests by GateD to Add/Delete/Change routes in the kernel forwarding table.

The following general option and packet-tracing options only apply on systems that use the routing socket to exchange routing information with the kernel. They do not apply on systems that use the old BSD 4.3 ioctl() interface to the kernel.

info
Informational messages received from the routing socket, such as TCP lossage, routing lookup failure, and route resolution requests. GateD does not currently process these messages; it just logs the information if requested.

Packet tracing options (which may be modified with detail, send and recv):

routes
Routes exchanged with the kernel, including Add/Delete/Change messages and Add/Delete/Change messages received from other processes.

redirect
Redirect messages received from the kernel.

interface
Interface status messages received from the kernel. These are only supported on systems with networking code derived from BSD 4.4.

other
Other messages received from the kernel, including those mentioned in the info type above.


Laurent Joncheray
Wed Jun 12 15:35:22 EDT 1996