15 Failover
In this section:
• What is failover? (page 108)
• What happens during failover? (page 108)
• What does the user need to do? (page 110)
• Running failback (page 110)
What is failover?
Failover occurs when active software sessions stop running on one node of a couplet within a
cluster and are moved to, and restarted on, the other node in that couplet. All configured VTL
devices, NAS shares, replication mappings and StoreOnce Catalyst stores become available within
15 minutes of failing on their original node. There are two types of failover: controlled and uncontrolled.
NOTE: Failover occurs only between nodes in a couplet and not between couplets in a cluster.
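The couplet rule in the note above can be illustrated with a minimal Python sketch. The node names, the COUPLETS layout and the failover_target function are hypothetical and exist only for illustration; they are not part of the B6000 software or its CLI.

```python
from typing import List, Optional, Tuple

# Hypothetical cluster layout: each couplet is a pair of node names.
COUPLETS: List[Tuple[str, str]] = [("node1", "node2"), ("node3", "node4")]


def failover_target(failed_node: str,
                    couplets: List[Tuple[str, str]] = COUPLETS) -> Optional[str]:
    """Return the partner node in the same couplet, or None if the node is unknown.

    Failover never crosses couplets, so only the paired node is a candidate.
    """
    for first, second in couplets:
        if failed_node == first:
            return second
        if failed_node == second:
            return first
    return None
```

In this sketch, a failure of node1 can only ever move its sessions to node2; node3 and node4 are never considered, even though they belong to the same cluster.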
Controlled failover
A controlled failover occurs when the failing node is still able to actively manage any open
connections at the point of failure. It can complete a graceful shutdown of all the virtual devices
running on that node before the node is removed from use in the cluster.
Controlled failover currently applies to certain system maintenance tasks, such as a maintenance
update to one of the hardware elements, for example the storage controller. In this case, the
virtual devices on the failing node are able to report known error or check conditions to the
backup application via the appropriate interface. When the service set from the “failed” node is
restarted on the paired node, its power-on checks are quicker and simpler because it was stopped
in a controlled manner.
Uncontrolled failover
An uncontrolled failover occurs when the active node suffers a rapid, catastrophic hardware failure,
or when another trigger event causes the active B6000 Management Console to invoke its capability
to power down the failing node.
Some examples are:
• The B6000 Management Console does not receive the Service Set node heartbeat.
• The node server behaves as if it is powered off, for example when the power button is pressed.
• All of the node's internal network communications fail, leading to loss of heartbeat.
• Software kernel panic, leading to loss of heartbeat.
• Both SAS connections to the node’s local storage fail.
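The trigger conditions listed above can be summarised in a short Python sketch. This is an illustration only, not the Management Console's actual logic: the NodeStatus fields, the function name and the 30-second heartbeat timeout are all assumptions, since this guide does not state how the heartbeat is evaluated.

```python
from dataclasses import dataclass

# Assumed value for illustration; the real heartbeat timeout is not documented here.
HEARTBEAT_TIMEOUT_S = 30.0


@dataclass
class NodeStatus:
    """Hypothetical snapshot of a node as seen by the management console."""
    seconds_since_heartbeat: float
    powered_on: bool
    sas_paths_up: int  # working SAS connections to local storage (0, 1 or 2)


def uncontrolled_failover_needed(status: NodeStatus,
                                 timeout: float = HEARTBEAT_TIMEOUT_S) -> bool:
    """Return True if any of the listed trigger conditions is met."""
    if not status.powered_on:
        return True  # node behaves as if powered off
    if status.seconds_since_heartbeat > timeout:
        return True  # heartbeat lost (network failure, kernel panic, ...)
    if status.sas_paths_up == 0:
        return True  # both SAS connections to local storage have failed
    return False
```

Internal network failure and kernel panic are not checked separately in this sketch because, as the list notes, both show up as a loss of heartbeat. A single failed SAS connection does not trigger failover here, since the list names only the failure of both connections.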
What happens during failover?
For an example with screenshots, see What happens to the GUI during failover.
GUI and CLI
When a node fails over and it does not host the currently active B6000 Management Console,
there is no change to the operation of the GUI or CLI.