This component of the Gated core functionality provides a central repository for different protocol instances' currently preferred routes to destinations. It also provides an efficient way to store and process BGP route attribute information.
Gated provides a central repository for storing one or more of a protocol instance's preferred routes to a given destination. The route database module embodies a set of rules which select which protocol instance's route (and, if a protocol instance stores more than one route which one of these) to a same destination should be chosen.
Different protocol modules use the route database in different ways. OSPF maintains its link state database separately and instantiates routes into the routing database when it performs a link state computation. From OSPF, there can be only one route to a given destination. RIP and BGP add routes received in updates from peers; so the routing database may contain as many BGP routes to a destination as there are peers.
A sequence of atomic changes to the route database are made by explicitly locking the database. Whenever the database is unlocked, the route database module informs all protocol instances of all changes that have taken place; this can result in route update messages or unreachables being sent.
A detailed description of the data types, macros and function definitions of the route database may be found here.
The main data structure in the route database is the rt_entry. This is a placeholder for route-specific information for a particular destination/mask pair: next-hop, metric, preference, AS path and so on.
All routes to the same destination/mask pair are chained together in a doubly-linked list. The head of this list is a rt_head structure. A rt_entry contains a back pointer to this structure. The rt_head also contains a pointer to the "active" (i.e. most preferred) route to this destination/mask, a pointer to the previously "active" route and other information.
The Gated route database maintains a single radix tree for all the routes in the database (actually, one per address family). Tree searches are keyed on destination/mask pairs. The internal nodes of this tree are radix_nodes, and each internal node points to a rt_head structure.
A figure helps explain the routing database data structures better.
Some other data structures are of interest to protocol implementations. The rt_list structure contains a list of changes that have occurred to the database; this list is communicated to all protocol instances every time a change occurs. The rt_parms is used to temporarily store route parameters when parsing routing protocol messages; a rt_entry is later created for the route.
When Gated selects the ``active'' route, it notifies all protocol instances (this corresponds to an OSPF instance, or a BGP connection) of the new ``active'' route. A protocol instance's export policy may allow it to announce that route. The rt_bits field encodes the protocol instances that are currently announcing a particular route. This field is a bit mask; each protocol instance is allocated a ``route bit'' from a global bitmask -- if a particular bit in the rt_bits bitmask is set, it means that the protocol instance allocated that bit is announcing the route.
Protocol instances maintain their own maps of routes they have stored in the routing database. Given a rt_entry, how do protocol instances find out which of their own data structures contain references to the rt_entry? Such protocol specific data is stored in a byte array in the route entry's rt_head. The position of this data within the byte array is determined by the protocol's ``route bit''. To each protocol instance, Gated statically allocates block of locations in this byte array where it may store protocol specific data. Gated maintains a global bitmask of allocated locations in the byte array (in the rttsi_map) as well as a mapping from a protocol's route bit to the allocated locations ( rtbit_info). These data structures are better explained using the following figure.
Protocol instances call rt_add() to add a route with the specified parameters and to the specified destination/mask pair to the route database. This function allocates a rt_head and inserts that into the radix tree (if necessary, when no other routes to the same destination exist).
The function rt_delete() logically deletes a route from the route database. Resources allocated to the route are not released immediately, pending notification of the delete to all protocol instances that may be announcing the route.
The functions rttsi_alloc(), rttsi_set(), and rttsi_get(), respectively allocate a contiguous block of bytes for storing protocol-specific data in a rt_head entry, store and retrieve protocol instance specific data from a specified rt_head.
When a protocol instance receives an update for a route previously sent from a peer, it calls rt_change_aspath() to update the route entry with the new information. As an optimization, if the route in question is currently ``active'', Gated stores the changes separately and applies them when the route is re-advertised.
The function rthlist_all() walks through a specified address family's radix tree and returns a list of all rt_head entries in the tree. Variant functions exist for returning the list of all rt_entry elements in the tree, or the list of those specifying a certain criterion.
Finally, rt_new_policy() is called when a re-configuration occurs, and rt_flash_update() is queued by rt_close and called from the task scheduler to notify the kernel routing table and all protocol instances of the most recent routing table changes.
The functions rt_table_add(), rt_table_locate(), rt_table_delete(), respectively add, find, and delete a rt_head entry from a radix tree, given a destination address and mask. These are standard radix tree operations.
The functions prefixed by rt_event_ flag a rt_entry with a specified state change. Actions related to a state change (e.g. when a route is deleted, a new active route must be computed) are also performed here.
The function rt_insert() encodes the routing database's rules on selecting the ``active'' route. The route with the highest preference is selected for the ``active'' route. To break ties, routes with shorter AS paths are preferred. To break ties among these, routes with the lowest next-hop addresses are chosen.
The Gated AS path module provides a facility to parse and otherwise manipulate BGP attributes. It is ostensibly separated from the BGP implementation under the expectation that other protocols (IS-IS) will also need similar facilities in the future.
In the abstract, given a set of attributes (origin, AS path and optional transitive attributes), this facility returns a descriptor for those attributes. This module also maintains the invariant that identical attribute sets share the same descriptor.
A detailed description of the data types, macros and function definitions of the AS path module may be found here.
The main data structure of the AS path module is an attribute descriptor block as_path. This is a variable sized block containing the origin attribute, the AS path, and any optional transitive attributes found in a route's attribute list.
If two routes have the same attributes, they share the same as_path struct. The descriptor contains a reference count of the number of routes sharing an AS path.
An AS path referenced by one or more routes is stored in a global hash table path_list, for quick retrieval. The hash value is a function of the AS path length, the origin attribute, and the length of the optional transitive attributes.
One other data structure of interest internally is the as_path_list structure. This is used to aggregate a number of different routes; a list of these structures denotes the AS paths of the routes contributing to the aggregate.
The primary function of the AS path module is performed by two routines aspath_attr() and aspath_format_v4(). Given a pointer to a BGP update message containing an attribute list, aspath_attr() returns a pointer to a as_path struct having the same attributes. aspath_format_v4() performs the opposite function: given an as_path struct, it formulates a BGP update message attributes part using the contents of the as_path descriptor.
Whenever routing updates are received and the Gated routing database changes, Gated updates its forwarding table. Before it does so, it tries to aggregate the route changes, if possible. aspath_do_aggregation() is used to compute the AS path of the aggregate route.
Finally, aspath_rt_build() is used to associate an AS path descriptor with a rt_entry. This simply ups the reference count in the common case.
The AS path hash table manipulation is relatively straightforward. aspath_find() implements the invariant of having matching attribute sets be represented by a single descriptor.
AS path memory allocation uses the fixed block allocator for the two common cases of attribute blocks being either smaller than 32 bytes or 64 bytes, and uses malloc() otherwise.
The Gated implementation allows a router running Gated to belong to a number of different ASs. An AS path may have been advertised to BGP instances running in different ASs on the same router. When this router advertises the route, it must correctly include all the local ASs which received this advertisement and passed it on. The AS path module maintains an array of "local" AS numbers. The aslocal_*() set of functions is used to manipulate this array. The as_path descriptor maintains a bitmask of local ASs (a bit position in the bitmask indicates the index into the local AS array) which received this AS path. When the AS path module formulates an update message, it ensures that it lists all the local ASs relevant to the AS path as an AS set.
This structure is part of the rt_head structure. The latter heads a list of routes and contains a pointer to the active route being advertised to this destination. This structure contains changes (in metric, preference, AS path etc) if any, that have occurred in the active route.
When a protocol aggregates a number of different destinations, the routes to be aggregated are stored in a doubly-linked list. This element forms the head of this doubly-linked list. It contains a back-pointer to the rt_entry for this route.
The head of a list of rt_aggr_entrys. This also contains the AS paths of the contributing routes.
The routing database in gated is a radix tree, each internal node of which contains a pointer to an rt_head structure. This structure contains a destination address and a doubly linked list of rt_entrys for routes to this destination. This structure also contains a back pointer to the internal node in the radix tree.
This structure contains one of the routes known to the GateD routing database to a given destination. The information contained in this descriptor includes how the route was learned (IGP/EGP etc.), the next hop to the destination, the metric/preference value, the AS paths and so on.
A temporary structure used to hold route parameters until all sanity checks have been done and the route has been added to the database.
Each rt_head entry has a linked list of target specific information (tsi). This is a linked list of 16byte chunks of data which is route and protocol specific (for instance, the tsi data for a BGP route points to the corresponding bgp_rto_entry that contains this route). The tsi's for each protocol are stored sequentially; the rttsi*() set of routines is used to access this data.
Route flash updates to be processed are stored in this structure. An rt_list structure is a descriptor block at the head of a page. Descriptor blocks to routes to be processed are store in the page after this descriptor block. Subsequent pages are chained in a list through the rtl_next field in the descriptor.
Each protocol or protocol peer (in peer-based protocols) is assigned a bit. The rtbit_map array contains a pointer to protocol specific state for each protocol. Each rt_entry contains a bit mask which specifies which protocol(s) this route was learned from. These macros manipulate this bitmask. The GateD routing database module maintains a mapping from bit number to the task that has the bit registered with the module.
Macro to sequentially traverse, from the beginning (resp. end), a linked list of rt_entrys.
The corresponding macros for ending the loop traversal.
These macros manipulate the rt_lists containing flash updates. These include macros to traverse the list one element at a time as well as macros to allocate the list.
The rttsi_map array is a bitmask of all the tsi positions currently allocated. We simply find the first block of unallocated data bytes and proceed to allocate from there. This ensures that, for each protocol, we allocate tsi's contiguously. This function sets the index field in the rtbit_map array entry accordingly.
Traverse a tsi list in a given rt_head structure and find out the data corresponding to the specified rtbit. Copy the data out onto the specified buffer.
Traverse a tsi list in a given rt_head structure and write the specified data to the tsi location specified by the rtbit.
Traverse a tsi list in a given rt_head structure and reset the data bytes at the tsi location specified by the rtbit.
Walk through a tsi list in the specified rt_head structure and release each block of tsi data.
This changes the bitmap of tsi allocations to reflect the fact that the specified rtbit_map entry is no longer represented in the bitmap.
This removes all the resources associated with the specified rt_head including the tsi list and the radix tree node to which the rt_head belongs. The list of routes themselves are taken care of elsewhere.
For a given destination/mask pair, call the tree search routine to find a matching internal node with a non-null external data. If one exists, return the external data. Otherwise, allocate an rt_head entry, initialize it and insert it into the radix tree.
Free the resources associated with an rt_changes structure. These include any AS paths and address information.
Free the resources (AS path, rt_head if necessary) for the given rt_entry structure.
Called whenever a list of routes to a destination changes (i.e. either a route is added or one is removed or both). This function selects the first route in the list of routes pointed to by the specified rt_head. This is used as the active route unless a holddown protocol is advertising the currently active route.
Remove a route entry from the list to which it belongs. Be sure to decrement the count of routes in the queue head.
In a list of routes, the routes are ordered active first, followed by hidden and "deleted" routes. To insert a "deleted" route, add to end of list. To insert a hidden route, search list from the back until you find an appropriate location to insert. Finally, to insert an active route, insert in order of preference. For routes of the same preference, aspath lengths and numerical origin IDs are used to break ties.
This set of functions is called when a route state transition happens. Examples of such transitions include a route becoming active/inactive, routes changing preferences or next-hops etc. The action taken depends on the state (or combination of states) the route is currently in, and usually involves setting the state flag appropriately and (in some cases) could also involve re-queuing the route and recomputing the new active route.
Called when gated is initialized or after a reconfiguration. Remove any existing flash list and build a complete flash list from the entire routing database. Update the kernel routing table.
Called when gated is shutting down. Simply process the existing flash list by recomputing the aggregate list and flashing the kernel routing table with the given list.
The rtbit_map array contains a map of bits allocated to a protocol or protocol peer. Traverse through this array to find the first unallocated bit and allocate it to this protocol. Also allocate a tsi block from the rttsi_map for this bit number and for the specified data size.
Free the specified rtbit_map entry.
Given a rt_entry, set the specified protocol bit to indicate that this protocol or protocol instance is announcing this route.
Reset the specified protocol bit in the specified route entry.
Reset the specified protocol bit in all existing routes. Make sure to clear out the tsi fields in each route header to which such route entries may belong.
Given a destination address and mask, locate the rt_entry corresponding to this destination and additionally matching certain criteria (e.g. matching state and protocol, gateway entry etc.). Traverse the radix tree, find the matching rt_head, and traverse the rt_list until you find the corresponding rt_entry. Appears to be used in RIP and OSPF, not in BGP.
Similar to rt_lookup*(), this function finds route to a destination/mask matching a set of criteria. However, it scans the entire routing table for aggregates to this destination and finds the most specific aggregate matching this criteria. Seems to be used mostly in the SNMP code.
Add a route to the GateD routing database, given the various parameters to the route in a rt_parms struct. This function is called when updates are received in BGP, for instance. Simply locate the head of the queue into which this rt_entry should go into. Allocate a rt_entry copy the parameters (preference, metrics etc) into it. Add the rt_entry to the list of routes advertised by the specified router. Allocate an AS path to the route if necessary. Finally, call rt_insert() to insert the route into the list in preference order.
Inaptly named, called when an update is received for any existing route and one or more of the route's attributes (metric, AS path, preference) might have changed. Most complicated processing in this case: is when a next-hop (or set of next-hops for a multi-path) has changed.
Called when an unreachable has been received. This simply removes the route from its gateway list, and marks it for deletion (which has to be done carefully if it is on a flash list, for example).
A configuration may specify a default route with a specified next hop and preference. These set of routines are used to add/delete or reinitialize the default route after a re-configuration.