Understanding System Redundancy with Dual MSMs Installed—Modular Switches Only
ExtremeWare XOS 11.3 Concepts Guide
After one application completes bulk checkpointing, the next application proceeds with its bulk
checkpointing.
To monitor the checkpointing status, use the
show checkpoint-data {<process>} command.
To view the status of bulk checkpointing and see if the backup MSM is synchronized with the master
MSM, use the
show switch {detail} command.
Dynamic Checkpointing
After an application transfers its saved state to the backup MSM, dynamic checkpointing requires that
any new configuration information or state changes that occur on the master be immediately relayed to
the backup. This ensures that the backup has the most up-to-date and accurate information.
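The two phases described above can be sketched as a small Python model. This is only an illustration of the idea, not how ExtremeWare XOS is implemented: the class names BackupNode and MasterApp, the application names app_a and app_b, and the key/value state are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class BackupNode:
    """Hypothetical stand-in for the backup MSM's copy of application state."""
    state: dict = field(default_factory=dict)

    def receive(self, app: str, key: str, value: str) -> None:
        # Store the relayed value under the owning application's name.
        self.state.setdefault(app, {})[key] = value


@dataclass
class MasterApp:
    """Hypothetical application running on the master MSM."""
    name: str
    backup: BackupNode
    state: dict = field(default_factory=dict)
    bulk_done: bool = False

    def bulk_checkpoint(self) -> None:
        # Bulk phase: transfer everything saved so far in one pass.
        for key, value in self.state.items():
            self.backup.receive(self.name, key, value)
        self.bulk_done = True

    def update(self, key: str, value: str) -> None:
        # Dynamic phase: once bulk is done, relay each change immediately.
        self.state[key] = value
        if self.bulk_done:
            self.backup.receive(self.name, key, value)


backup = BackupNode()
apps = [
    MasterApp("app_a", backup, {"x": "1"}),
    MasterApp("app_b", backup, {"y": "2"}),
]

# Applications bulk checkpoint one after another, not in parallel.
for app in apps:
    app.bulk_checkpoint()

# A later change on the master reaches the backup right away.
apps[0].update("x", "3")
print(backup.state)
```

In this sketch the backup holds the same data as the master after every update, which is the property dynamic checkpointing is meant to provide.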
Viewing Checkpoint Statistics
Use the following command to view and check the status of one or more processes being copied from
the master to the backup MSM:
show checkpoint-data {<process>}
This command is also helpful in debugging synchronization problems that occur at run time.
This command displays, as a percentage, how much copying each process has completed, along with the
traffic statistics between that process on the master MSM and its counterpart on the backup MSM.
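As a rough illustration of what a completion percentage means here, the following Python sketch derives one from two counters. The function name percent_copied, the counter values, and the process names are hypothetical; the sketch does not reproduce the command's actual output or the way the switch computes its figures.

```python
def percent_copied(copied: int, total: int) -> float:
    """Return copy progress as a percentage (hypothetical counters)."""
    if total == 0:
        # Assumption for this sketch: nothing to copy counts as complete.
        return 100.0
    return 100.0 * copied / total


# Hypothetical (copied, total) counters for two processes.
progress = {"app_a": (50, 200), "app_b": (10, 10)}

for name, (copied, total) in progress.items():
    print(f"{name}: {percent_copied(copied, total):.0f}%")
```

A process that stays below 100 percent for a long time would be a natural place to start when debugging a synchronization problem.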
Viewing Node Status
ExtremeWare XOS allows you to view node statistical information. Each node (MSM) installed in your
system is self-sufficient and runs the ExtremeWare XOS management applications. By reviewing the node
status output, you can see the general health of the system along with other node parameters.
To view node status, use the following command:
show node {detail}
Table 10 lists the node states reported by the switch.
Table 10: Node states
Node State    Description
BACKUP        In the backup state, this node becomes the master node if the master fails or enters the
              DOWN state. The backup node also receives the checkpoint state data from the master.
DOWN          In the down state, the node is not available to participate in leader election. The node
              enters this state during any user action, other than a failure, that makes the node
              unavailable for management. Examples of user actions are:
              • Upgrading the software
              • Rebooting the system using the reboot command
              • Initiating an MSM failover using the run msm-failover command
              • Synchronizing the MSMs' software and configuration in non-volatile storage using the
                synchronize command