home | O'Reilly's CD bookshelfs | FreeBSD | Linux | Cisco

Next: Configuration support Up: Developer's Guide Previous: Neighbor communication

Routes storage mechanism

This component of the Gated core functionality provides a central repository for different protocol instances' currently preferred routes to destinations. It also provides an efficient way to store and process BGP route attribute information.

Gated Route Database

Gated provides a central repository for storing one or more of a protocol instance's preferred routes to a given destination. The route database module embodies a set of rules which select which protocol instance's route (and, if a protocol instance stores more than one route which one of these) to a same destination should be chosen.

Different protocol modules use the route database in different ways. OSPF maintains its link state database separately and instantiates routes into the routing database when it performs a link state computation. From OSPF, there can be only one route to a given destination. RIP and BGP add routes received in updates from peers; so the routing database may contain as many BGP routes to a destination as there are peers.

A sequence of atomic changes to the route database are made by explicitly locking the database. Whenever the database is unlocked, the route database module informs all protocol instances of all changes that have taken place; this can result in route update messages or unreachables being sent.

A detailed description of the data types, macros and function definitions of the route database may be found here.

Route Database Data Structures

The main data structure in the route database is the rt_entry. This is a placeholder for route-specific information for a particular destination/mask pair: next-hop, metric, preference, AS path and so on.

All routes to the same destination/mask pair are chained together in a doubly-linked list. The head of this list is a rt_head structure. A rt_entry contains a back pointer to this structure. The rt_head also contains a pointer to the "active" (i.e. most preferred) route to this destination/mask, a pointer to the previously "active" route and other information.

The Gated route database maintains a single radix tree for all the routes in the database (actually, one per address family). Tree searches are keyed on destination/mask pairs. The internal nodes of this tree are radix_nodes, and each internal node points to a rt_head structure.

A figure helps explain the routing database data structures better.

Some other data structures are of interest to protocol implementations. The rt_list structure contains a list of changes that have occurred to the database; this list is communicated to all protocol instances every time a change occurs. The rt_parms is used to temporarily store route parameters when parsing routing protocol messages; a rt_entry is later created for the route.

When Gated selects the ``active'' route, it notifies all protocol instances (this corresponds to an OSPF instance, or a BGP connection) of the new ``active'' route. A protocol instance's export policy may allow it to announce that route. The rt_bits field encodes the protocol instances that are currently announcing a particular route. This field is a bit mask; each protocol instance is allocated a ``route bit'' from a global bitmask -- if a particular bit in the rt_bits bitmask is set, it means that the protocol instance allocated that bit is announcing the route.

Protocol instances maintain their own maps of routes they have stored in the routing database. Given a rt_entry, how do protocol instances find out which of their own data structures contain references to the rt_entry? Such protocol specific data is stored in a byte array in the route entry's rt_head. The position of this data within the byte array is determined by the protocol's ``route bit''. To each protocol instance, Gated statically allocates block of locations in this byte array where it may store protocol specific data. Gated maintains a global bitmask of allocated locations in the byte array (in the rttsi_map) as well as a mapping from a protocol's route bit to the allocated locations ( rtbit_info). These data structures are better explained using the following figure.

Route Database Entry Points

Protocol instances call rt_add() to add a route with the specified parameters and to the specified destination/mask pair to the route database. This function allocates a rt_head and inserts that into the radix tree (if necessary, when no other routes to the same destination exist).

The function rt_delete() logically deletes a route from the route database. Resources allocated to the route are not released immediately, pending notification of the delete to all protocol instances that may be announcing the route.

The functions rttsi_alloc(), rttsi_set(), and rttsi_get(), respectively allocate a contiguous block of bytes for storing protocol-specific data in a rt_head entry, store and retrieve protocol instance specific data from a specified rt_head.

When a protocol instance receives an update for a route previously sent from a peer, it calls rt_change_aspath() to update the route entry with the new information. As an optimization, if the route in question is currently ``active'', Gated stores the changes separately and applies them when the route is re-advertised.

The function rthlist_all() walks through a specified address family's radix tree and returns a list of all rt_head entries in the tree. Variant functions exist for returning the list of all rt_entry elements in the tree, or the list of those specifying a certain criterion.

Finally, rt_new_policy() is called when a re-configuration occurs, and rt_flash_update() is queued by rt_close and called from the task scheduler to notify the kernel routing table and all protocol instances of the most recent routing table changes.

Route Database Internals

The functions rt_table_add(), rt_table_locate(), rt_table_delete(), respectively add, find, and delete a rt_head entry from a radix tree, given a destination address and mask. These are standard radix tree operations.

The functions prefixed by rt_event_ flag a rt_entry with a specified state change. Actions related to a state change (e.g. when a route is deleted, a new active route must be computed) are also performed here.

The function rt_insert() encodes the routing database's rules on selecting the ``active'' route. The route with the highest preference is selected for the ``active'' route. To break ties, routes with shorter AS paths are preferred. To break ties among these, routes with the lowest next-hop addresses are chosen.

Gated AS Path Module

The Gated AS path module provides a facility to parse and otherwise manipulate BGP attributes. It is ostensibly separated from the BGP implementation under the expectation that other protocols (IS-IS) will also need similar facilities in the future.

In the abstract, given a set of attributes (origin, AS path and optional transitive attributes), this facility returns a descriptor for those attributes. This module also maintains the invariant that identical attribute sets share the same descriptor.

A detailed description of the data types, macros and function definitions of the AS path module may be found here.

AS Path Data Structures

The main data structure of the AS path module is an attribute descriptor block as_path. This is a variable sized block containing the origin attribute, the AS path, and any optional transitive attributes found in a route's attribute list.

If two routes have the same attributes, they share the same as_path struct. The descriptor contains a reference count of the number of routes sharing an AS path.

An AS path referenced by one or more routes is stored in a global hash table path_list, for quick retrieval. The hash value is a function of the AS path length, the origin attribute, and the length of the optional transitive attributes.

One other data structure of interest internally is the as_path_list structure. This is used to aggregate a number of different routes; a list of these structures denotes the AS paths of the routes contributing to the aggregate.

AS Path Module Entry Points

The primary function of the AS path module is performed by two routines aspath_attr() and aspath_format_v4(). Given a pointer to a BGP update message containing an attribute list, aspath_attr() returns a pointer to a as_path struct having the same attributes. aspath_format_v4() performs the opposite function: given an as_path struct, it formulates a BGP update message attributes part using the contents of the as_path descriptor.

Whenever routing updates are received and the Gated routing database changes, Gated updates its forwarding table. Before it does so, it tries to aggregate the route changes, if possible. aspath_do_aggregation() is used to compute the AS path of the aggregate route.

Finally, aspath_rt_build() is used to associate an AS path descriptor with a rt_entry. This simply ups the reference count in the common case.

AS Path Module Internals

The AS path hash table manipulation is relatively straightforward. aspath_find() implements the invariant of having matching attribute sets be represented by a single descriptor.

AS path memory allocation uses the fixed block allocator for the two common cases of attribute blocks being either smaller than 32 bytes or 64 bytes, and uses malloc() otherwise.

The Gated implementation allows a router running Gated to belong to a number of different ASs. An AS path may have been advertised to BGP instances running in different ASs on the same router. When this router advertises the route, it must correctly include all the local ASs which received this advertisement and passed it on. The AS path module maintains an array of "local" AS numbers. The aslocal_*() set of functions is used to manipulate this array. The as_path descriptor maintains a bitmask of local ASs (a bit position in the bitmask indicates the index into the local AS array) which received this AS path. When the AS path module formulates an update message, it ensures that it lists all the local ASs relevant to the AS path as an AS set.

Routing database module type declarations

Type rt_changes

This structure is part of the rt_head structure. The latter heads a list of routes and contains a pointer to the active route being advertised to this destination. This structure contains changes (in metric, preference, AS path etc) if any, that have occurred in the active route.

type rt_aggr_entry

When a protocol aggregates a number of different destinations, the routes to be aggregated are stored in a doubly-linked list. This element forms the head of this doubly-linked list. It contains a back-pointer to the rt_entry for this route.

type rt_aggr_head

The head of a list of rt_aggr_entrys. This also contains the AS paths of the contributing routes.

type rt_head

The routing database in gated is a radix tree, each internal node of which contains a pointer to an rt_head structure. This structure contains a destination address and a doubly linked list of rt_entrys for routes to this destination. This structure also contains a back pointer to the internal node in the radix tree.

type rt_entry

This structure contains one of the routes known to the GateD routing database to a given destination. The information contained in this descriptor includes how the route was learned (IGP/EGP etc.), the next hop to the destination, the metric/preference value, the AS paths and so on.

type rt_parms

A temporary structure used to hold route parameters until all sanity checks have been done and the route has been added to the database.

type rt_parms

Each rt_head entry has a linked list of target specific information (tsi). This is a linked list of 16byte chunks of data which is route and protocol specific (for instance, the tsi data for a BGP route points to the corresponding bgp_rto_entry that contains this route). The tsi's for each protocol are stored sequentially; the rttsi*() set of routines is used to access this data.

type rt_list

Route flash updates to be processed are stored in this structure. An rt_list structure is a descriptor block at the head of a page. Descriptor blocks to routes to be processed are store in the page after this descriptor block. Subsequent pages are chained in a list through the rtl_next field in the descriptor.

Routing database module macros

macro RTBIT_*

Each protocol or protocol peer (in peer-based protocols) is assigned a bit. The rtbit_map array contains a pointer to protocol specific state for each protocol. Each rt_entry contains a bit mask which specifies which protocol(s) this route was learned from. These macros manipulate this bitmask. The GateD routing database module maintains a mapping from bit number to the task that has the bit registered with the module.

macro RT_ALLRT(_REV)

Macro to sequentially traverse, from the beginning (resp. end), a linked list of rt_entrys.

macro RT_ALLRT(_REV)_END

The corresponding macros for ending the loop traversal.

macro RT_LIST_*

These macros manipulate the rt_lists containing flash updates. These include macros to traverse the list one element at a time as well as macros to allocate the list.

Routing database module function definitions

function rttsi_alloc()

The rttsi_map array is a bitmask of all the tsi positions currently allocated. We simply find the first block of unallocated data bytes and proceed to allocate from there. This ensures that, for each protocol, we allocate tsi's contiguously. This function sets the index field in the rtbit_map array entry accordingly.

function rttsi_get()

Traverse a tsi list in a given rt_head structure and find out the data corresponding to the specified rtbit. Copy the data out onto the specified buffer.

function rttsi_set()

Traverse a tsi list in a given rt_head structure and write the specified data to the tsi location specified by the rtbit.

function rttsi_reset()

Traverse a tsi list in a given rt_head structure and reset the data bytes at the tsi location specified by the rtbit.

function rttsi_release()

Walk through a tsi list in the specified rt_head structure and release each block of tsi data.

function rttsi_free()

This changes the bitmap of tsi allocations to reflect the fact that the specified rtbit_map entry is no longer represented in the bitmap.

function rth_remove()

This removes all the resources associated with the specified rt_head including the tsi list and the radix tree node to which the rt_head belongs. The list of routes themselves are taken care of elsewhere.

function rth_locate()

For a given destination/mask pair, call the tree search routine to find a matching internal node with a non-null external data. If one exists, return the external data. Otherwise, allocate an rt_head entry, initialize it and insert it into the radix tree.

function rtchanges_free()

Free the resources associated with an rt_changes structure. These include any AS paths and address information.

function rt_free()

Free the resources (AS path, rt_head if necessary) for the given rt_entry structure.

function select_active()

Called whenever a list of routes to a destination changes (i.e. either a route is added or one is removed or both). This function selects the first route in the list of routes pointed to by the specified rt_head. This is used as the active route unless a holddown protocol is advertising the currently active route.

function rt_remove()

Remove a route entry from the list to which it belongs. Be sure to decrement the count of routes in the queue head.

function rt_remove()

In a list of routes, the routes are ordered active first, followed by hidden and "deleted" routes. To insert a "deleted" route, add to end of list. To insert a hidden route, search list from the back until you find an appropriate location to insert. Finally, to insert an active route, insert in order of preference. For routes of the same preference, aspath lengths and numerical origin IDs are used to break ties.

function rt_event_*()

This set of functions is called when a route state transition happens. Examples of such transitions include a route becoming active/inactive, routes changing preferences or next-hops etc. The action taken depends on the state (or combination of states) the route is currently in, and usually involves setting the state flag appropriately and (in some cases) could also involve re-queuing the route and recomputing the new active route.

function rt_new_policy()

Called when gated is initialized or after a reconfiguration. Remove any existing flash list and build a complete flash list from the entire routing database. Update the kernel routing table.

function rt_flash_update()

Called when gated is shutting down. Simply process the existing flash list by recomputing the aggregate list and flashing the kernel routing table with the given list.

function rtbit_alloc()

The rtbit_map array contains a map of bits allocated to a protocol or protocol peer. Traverse through this array to find the first unallocated bit and allocate it to this protocol. Also allocate a tsi block from the rttsi_map for this bit number and for the specified data size.

function rtbit_free()

Free the specified rtbit_map entry.

function rtbit_set()

Given a rt_entry, set the specified protocol bit to indicate that this protocol or protocol instance is announcing this route.

function rtbit_reset()

Reset the specified protocol bit in the specified route entry.

function rtbit_reset_all()

Reset the specified protocol bit in all existing routes. Make sure to clear out the tsi fields in each route header to which such route entries may belong.

function rt_locate*()

Given a destination address and mask, locate the rt_entry corresponding to this destination and additionally matching certain criteria (e.g. matching state and protocol, gateway entry etc.). Traverse the radix tree, find the matching rt_head, and traverse the rt_list until you find the corresponding rt_entry. Appears to be used in RIP and OSPF, not in BGP.

function rt_lookup()

Similar to rt_lookup*(), this function finds route to a destination/mask matching a set of criteria. However, it scans the entire routing table for aggregates to this destination and finds the most specific aggregate matching this criteria. Seems to be used mostly in the SNMP code.

function rt_add()

Add a route to the GateD routing database, given the various parameters to the route in a rt_parms struct. This function is called when updates are received in BGP, for instance. Simply locate the head of the queue into which this rt_entry should go into. Allocate a rt_entry copy the parameters (preference, metrics etc) into it. Add the rt_entry to the list of routes advertised by the specified router. Allocate an AS path to the route if necessary. Finally, call rt_insert() to insert the route into the list in preference order.

function rt_change_aspath()

Inaptly named, called when an update is received for any existing route and one or more of the route's attributes (metric, AS path, preference) might have changed. Most complicated processing in this case: is when a next-hop (or set of next-hops for a multi-path) has changed.

function rt_delete()

Called when an unreachable has been received. This simply removes the route from its gateway list, and marks it for deletion (which has to be done carefully if it is on a flash list, for example).

function rt_default*()

A configuration may specify a default route with a specified next hop and preference. These set of routines are used to add/delete or reinitialize the default route after a re-configuration.

Next: Configuration support Up: Developer's Guide Previous: Neighbor communication

Laurent Joncheray
Wed Jun 12 15:35:22 EDT 1996

bigmir)net

221