OmniSwitch OS6860/OS6900/OS10K Troubleshooting Guide Part No.032996-00 Rev.A
AOS Release 7.X and 8.X January 2015
Alcatel-Lucent Page 86 of 148
which means that an active OSPF process running on the primary CMM of the Master chassis controls data
forwarding in all the NIs of the system.
During system startup, OSPF is loaded (task spawned) on all the CMMs of the system, which includes the
primary and secondary CMM of every chassis in the system. However, OSPF is activated ONLY on the
primary CMM of the Master chassis. The active OSPF process enables the OSPF interfaces, sends Hello messages,
discovers neighboring routers, elects the Designated Router (DR), and exchanges link-state advertisements (LSAs).
Once the LSA exchanges are completed, OSPF calculates the Shortest Path First (SPF) table and instructs
IPRM to install the routes into all the NIs of the system. The neighbor router information and SPF table
information are NOT synced with the OSPF processes running on the other CMMs in the system. This is
intentional: it minimizes the possibility of routing loops and/or black holes caused by a lack of database
synchronization between the Master and Slave chassis. The OSPF process on each of the other CMMs completes
its initialization and waits for a takeover message from the Chassis Supervisor. Until then, it does not send
or receive any protocol messages.
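
The activation rule described above can be illustrated with a minimal Python sketch. It is a simplified model, not AOS code: the names `OspfTask`, `TaskState`, and `start_system` are hypothetical and exist only for this illustration. The sketch shows an OSPF task spawned on every CMM, with only the task on the primary CMM of the Master chassis becoming active.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class TaskState(Enum):
    STANDBY = "standby"  # initialized, waiting for a takeover message
    ACTIVE = "active"    # runs the protocol and has routes installed


@dataclass
class OspfTask:
    """Hypothetical model of one OSPF task on one CMM."""
    chassis_role: str  # "master" or "slave"
    cmm_role: str      # "primary" or "secondary"
    state: TaskState = TaskState.STANDBY

    def may_send_protocol_packets(self) -> bool:
        # Only the active instance sends or receives Hellos and LSAs.
        return self.state is TaskState.ACTIVE


def start_system(tasks: List[OspfTask]) -> None:
    """Activate OSPF only on the primary CMM of the Master chassis."""
    for task in tasks:
        is_master_primary = (
            task.chassis_role == "master" and task.cmm_role == "primary"
        )
        task.state = TaskState.ACTIVE if is_master_primary else TaskState.STANDBY


# A two-chassis system: the task is spawned on all four CMMs.
tasks = [
    OspfTask("master", "primary"),
    OspfTask("master", "secondary"),
    OspfTask("slave", "primary"),
    OspfTask("slave", "secondary"),
]
start_system(tasks)
active = [t for t in tasks if t.may_send_protocol_packets()]
```

After `start_system` runs, `active` holds a single task, the one on the primary CMM of the Master chassis. The other three remain in standby and exchange no protocol packets.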
OSPF process during Virtual Chassis Takeover
When the Master chassis is reset or powered down, the Slave chassis takes over the control functions. During the
takeover process, the Chassis Supervisor on the primary CMM of the Slave chassis sends a takeover message to
its OSPF task. On receiving the takeover message, the OSPF task on the primary CMM of the Slave chassis
is activated. The OSPF neighbor table and LSA database are rebuilt on the Slave chassis. The forwarding
tables in the NIs are intentionally left intact throughout the takeover process to allow continuous
forwarding of traffic across the CMM takeover. However, traffic forwarding is still disrupted briefly during
takeover, for the following reason. When the newly activated OSPF task forms adjacencies with the neighboring
routers, the sequence numbers previously used in protocol packets (DB Descriptor packets) have not been
retained across the takeover, which causes the neighboring routers to reset the adjacencies. This resetting of
OSPF adjacencies results in the neighboring routers flushing the forwarding table entries in their NIs,
causing traffic disruption.
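
The failure mode described above can be sketched from the neighboring router's point of view. This is a deliberately simplified model, not the full OSPF neighbor state machine: `NeighborView`, `receive_dd`, and the sequence number values are hypothetical. It assumes that a neighbor holding a FULL adjacency resets that adjacency and flushes the associated routes when a DB Descriptor packet carries an unexpected sequence number.

```python
from dataclasses import dataclass


@dataclass
class NeighborView:
    """What a neighboring router remembers about the router under takeover."""
    state: str = "FULL"
    expected_dd_seq: int = 1042   # illustrative value from before the takeover
    routes_installed: bool = True

    def receive_dd(self, seq: int) -> str:
        if seq == self.expected_dd_seq:
            self.expected_dd_seq += 1
            return "accepted"
        # Sequence number mismatch: the adjacency is reset and the routes
        # learned through it are flushed from the forwarding table.
        self.state = "EXSTART"
        self.routes_installed = False
        return "adjacency reset"


# The newly activated OSPF task has no memory of the old sequence numbers,
# so it starts from a fresh value.
neighbor = NeighborView()
result = neighbor.receive_dd(seq=1)
```

In this model `result` is `"adjacency reset"` and `neighbor.routes_installed` becomes `False`, which corresponds to the brief traffic disruption described above.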
OSPF Graceful Restart (Unplanned)
To overcome the traffic disruption caused by the adjacency reset during takeover, the OSPF graceful restart
feature is implemented. An OSPF takeover can be either planned or unplanned. Since AOS7 supports only unplanned
graceful restart, this document discusses only unplanned graceful restart. An unplanned OSPF restart can be
caused by the Master chassis being powered down or by a process crash on the Master CMM. When the OSPF task on
the Slave chassis receives the takeover message, it checks whether the graceful restart feature is enabled. If
it is, OSPF enters graceful restart mode. On entering graceful restart, OSPF on the restarting router first
sends a Graceful LSA Update message to the neighboring routers on each enabled OSPF interface. On receiving the
Graceful LSA Update message, a neighboring router enters helper mode, in which it will NOT reset the adjacency
because of a sequence number mismatch in protocol packets. The neighboring router continues to advertise the
LSAs of the restarting router until the restarting router forms a FULL adjacency. Once the restarting router
forms a FULL adjacency with its neighboring router, it sends a Graceful LSA to terminate the graceful restart
period.
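
The exchange described above can be sketched as a small simulation. It is an illustrative model only: `HelperNeighbor`, `graceful_takeover`, and the log strings are hypothetical and do not correspond to AOS internals. The sketch contrasts a takeover with graceful restart enabled against one without it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class HelperNeighbor:
    """Hypothetical model of a neighboring router during a takeover."""
    state: str = "FULL"
    helping: bool = False
    routes_installed: bool = True
    log: List[str] = field(default_factory=list)

    def receive_grace_lsa(self, terminating: bool = False) -> None:
        if terminating:
            self.helping = False
            self.log.append("exit helper mode")
        elif self.state == "FULL":
            # Helper mode is entered only for a neighbor that is still FULL.
            self.helping = True
            self.log.append("enter helper mode")

    def receive_out_of_sequence_packet(self) -> None:
        if self.helping:
            # In helper mode the mismatch does not reset the adjacency.
            self.log.append("mismatch tolerated")
            return
        self.state = "EXSTART"
        self.routes_installed = False
        self.log.append("adjacency reset")


def graceful_takeover(neighbor: HelperNeighbor, restart_enabled: bool) -> None:
    """Replay the takeover as seen by one neighboring router."""
    if restart_enabled:
        neighbor.receive_grace_lsa()           # sent first by the restarting router
    neighbor.receive_out_of_sequence_packet()  # protocol packets with new sequence numbers
    if neighbor.helping:
        # The adjacency is FULL again, so the restarting router ends the period.
        neighbor.receive_grace_lsa(terminating=True)


with_gr = HelperNeighbor()
graceful_takeover(with_gr, restart_enabled=True)

without_gr = HelperNeighbor()
graceful_takeover(without_gr, restart_enabled=False)
```

With graceful restart enabled, the neighbor keeps its routes installed for the whole exchange. Without it, the first out-of-sequence packet resets the adjacency and the routes are flushed.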
Requirements for supporting graceful restart:
The neighbor relationship between the restarting router and the neighbor router must be in the
“FULL” state on the neighbor router for it to process the Graceful LSA Update from the restarting router. If
for any reason the neighbor goes down before the Graceful LSA Update message arrives, the neighbor
router simply discards the LSA, and the OSPF adjacency restarts (flushing the forwarding table) when
out-of-sequence protocol packets are received.
There must not be any OSPF topology change in the network during the graceful restart period. If
the neighboring router detects any OSPF network topology change, it updates the SPF table and
resets the forwarding table in the NI.
The Graceful LSA Update message must be sent to the neighbor before the first Hello packet, to
avoid an adjacency reset due to an OSPF state mismatch in the Hello packet.
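
When troubleshooting, the three requirements above can be treated as a checklist. The following sketch is one possible way to encode them. The function name `graceful_restart_blockers` and its parameters are hypothetical, and the inputs would have to be gathered by hand from the neighbor tables and packet captures.

```python
from typing import List


def graceful_restart_blockers(
    neighbor_state: str,
    topology_changed: bool,
    first_packet_sent: str,
) -> List[str]:
    """Return the reasons a graceful restart would fail; empty if none."""
    blockers = []
    if neighbor_state != "FULL":
        blockers.append("neighbor not FULL: Graceful LSA Update is discarded")
    if topology_changed:
        blockers.append("topology changed: neighbor recomputes SPF and resets forwarding table")
    if first_packet_sent != "grace_lsa":
        blockers.append("Hello sent before Graceful LSA Update: adjacency reset on state mismatch")
    return blockers


ok = graceful_restart_blockers("FULL", False, "grace_lsa")
failed = graceful_restart_blockers("INIT", True, "hello")
```

An empty list means all three requirements are met. Each entry in a non-empty list names the requirement that was violated and its consequence as described above.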
Sub-second convergence during Virtual Chassis Takeover
To achieve sub-second convergence during VC takeover, all of the requirements for OSPF graceful restart listed
above must be met.