RIB Quarantining
RIB quarantining solves the problem in the interaction between routing protocols and the RIB. The problem
is a persistent oscillation between the RIB and routing protocols that occurs when a route is continuously
inserted and then withdrawn from the RIB, resulting in a spike in CPU use until the problem is resolved. If
there is no damping on the oscillation, then both the protocol process and the RIB process have high CPU
use, affecting the rest of the system as well as blocking out other protocol and RIB operations. This problem
occurs when a particular combination of routes is received and installed in the RIB. This problem typically
happens as a result of a network misconfiguration. However, because the misconfiguration is across the
network, it is not possible to detect the problem at configuration time on any single router.
The quarantining mechanism detects mutually recursive routes and quarantines the last route that completes
the mutual recursion. The quarantined route is periodically evaluated to see if the mutual recursion has gone
away. If the recursion still exists, the route remains quarantined. If the recursion has gone away, the route is
released from its quarantine.
The following steps are used to quarantine a route:
1. RIB detects when a particular problematic path is installed.
2. RIB sends a notification to the protocol that installed the path.
3. When the protocol receives the quarantine notification about the problem route, it marks the route as being
“quarantined.” If it is a BGP route, BGP does not advertise reachability for the route to its neighbors.
4. Periodically, RIB tests all its quarantined paths to see if they can now safely be installed (moved from
quarantined to "Ok to use" state). A notification is sent to the protocol to indicate that the path is now safe
to use.
Route and Label Consistency Checker
The Route Consistency Checker and Label Consistency Checker (RCC/LCC) are command-line tools that
can be used to verify consistency between control plane and data plane route and label programming in IOS
XR software.
Routers in production networks may end up in a state where the forwarding information does not match the
control plane information. Possible causes of this include fabric or transport failures between the Route
Processor (RP) and the line cards (LCs), or issues with the Forwarding Information Base (FIB). RCC/LCC
can be used to identify and provide detailed information about resultant inconsistencies between the control
plane and data plane. This information can be used to further investigate and diagnose the cause of forwarding
problems and traffic loss.
RCC/LCC can be run in two modes. It can be triggered from using the appropriate command modes as an
on-demand, one-time scan (On-demand Scan), or be configured to run at defined intervals in the background
during normal router operation (Background Scan). RCC compares the Routing Information Base (RIB)
against the Forwarding Information Base (FIB) while LCC compares the Label Switching Database (LSD)
against the FIB. When an inconsistency is detected, RCC/LCC output will identify the specific route or label
and identify the type of inconsistency detected as well as provide additional data that will assist with further
troubleshooting.
RCC runs on the Route Processor. FIB checks for errors on the line card and forwards first the 20 error reports
to RCC. RCC receives error reports from all nodes, summarizes them (checks for exact match), and adds it
to two queues, soft or hard. Each queue has a limit of 1000 error reports and there is no prioritization in the
Routing Configuration Guide for Cisco NCS 5500 Series Routers, IOS XR Release 6.3.x
124
Implementing and Monitoring RIB
RIB Quarantining