Dell EMC ME4 Series - Test a Recovery Plan; Failover and Failback


Test a recovery plan

You can automatically create a non-disruptive, isolated testing environment on the recovery site by using replication and connecting virtual machines to your isolated testing network. You can also save test results for viewing and export at any time.

Testing a recovery plan exercises nearly every aspect of a recovery plan, though several concessions are made to avoid disruption of ongoing operations. While testing a recovery plan has no lasting effects on either site, running a recovery plan has significant effects on both sites.

You should run test recoveries as often as needed. Testing a recovery plan does not affect replication or the ongoing operations of either site (though it might temporarily suspend the selected local virtual machines at the recovery site if recoveries are configured to do so). You can cancel a recovery plan test at any time.

In the case of planned migrations, a recovery stops replication after a final synchronization of the source and the target. Note that for disaster recoveries, virtual machines are restored to the most recent available state, as determined by the recovery point objective (RPO). After the final replication is completed, SRM makes changes at both sites that require significant time and effort to reverse. Because of this, the privilege to test a recovery plan and the privilege to run a recovery plan must be separately assigned.

When a test failover to the recovery site is requested, SRM performs the following steps:

1 Determines the latest recovery point for each replicated volume.

2 Creates a writeable test snapshot for each recovery point, with a name in the form srannnnnn, where nnnnnn is a monotonically increasing number.

3 Maps the test snapshots to the appropriate ESXi hosts on the recovery site.

When testing stops, the test snapshots are unmapped and deleted.
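The test-failover steps above can be pictured with a small, purely illustrative Python model. It is not the storage adapter's actual code: the class `SnapshotTestRun`, its methods, and the host and volume names are hypothetical, and the zero-padded six-digit counter is only one reading of the srannnnnn naming form.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class SnapshotTestRun:
    """Toy model of a test failover; every name here is hypothetical."""
    hosts: list
    # Monotonically increasing counter used to build snapshot names.
    _counter: count = field(default_factory=lambda: count(1))
    # snapshot name -> (volume, recovery point, hosts it is mapped to)
    mapped: dict = field(default_factory=dict)

    def start(self, recovery_points):
        """recovery_points maps a volume name to its recovery-point timestamps."""
        for volume, points in recovery_points.items():
            latest = max(points)                     # step 1: latest recovery point
            name = f"sra{next(self._counter):06d}"   # step 2: writeable test snapshot
            # step 3: map the snapshot to the recovery-site hosts
            self.mapped[name] = (volume, latest, list(self.hosts))
        return sorted(self.mapped)

    def stop(self):
        """When testing stops, unmap and delete every test snapshot."""
        removed = sorted(self.mapped)
        self.mapped.clear()
        return removed


run = SnapshotTestRun(hosts=["esxi-a", "esxi-b"])
names = run.start({"vol1": [10, 30, 20], "vol2": [5]})
removed = run.stop()
```

In this sketch each replicated volume gets exactly one test snapshot, taken from its newest recovery point, and stopping the test leaves nothing mapped.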

Failover and failback

Failback is the process of setting the replication environment back to its original state at the protected site prior to failover. Failback with SRM is an automated process that occurs after recovery. This makes the failback process of the protected virtual machines relatively simple in the case of a planned migration. If the entire SRM environment remains intact after recovery, failback is done by running the reprotect recovery steps with SRM, followed by running the recovery plan again, which moves the virtual machines configured within their protection groups back to the original protected SRM site.
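The order of operations for a failback after a planned migration (reprotect first, then run the recovery plan again) can be sketched as a tiny state model. This is an assumption-laden illustration, not SRM's API: `ReplicationPair`, `reprotect`, `run_recovery_plan`, and `failback` are hypothetical names, and the model treats reprotect as simply reversing the replication direction.

```python
from dataclasses import dataclass


@dataclass
class ReplicationPair:
    """Toy model of two sites; all names are hypothetical."""
    protected: str  # site that currently replicates outward
    recovery: str   # site that currently receives replicas
    vms_at: str     # site where the virtual machines are running

    def run_recovery_plan(self):
        # Running the plan moves the VMs to the current recovery site.
        self.vms_at = self.recovery

    def reprotect(self):
        # Modeled as reversing replication so that the site now running
        # the VMs becomes the protected site.
        if self.vms_at != self.recovery:
            raise RuntimeError("reprotect is only modeled after a recovery")
        self.protected, self.recovery = self.recovery, self.protected


def failback(pair):
    """Reprotect, then run the recovery plan again."""
    pair.reprotect()
    pair.run_recovery_plan()
    return pair


pair = ReplicationPair(protected="site-a", recovery="site-b", vms_at="site-a")
pair.run_recovery_plan()            # planned migration: VMs move to site-b
location_after_failover = pair.vms_at
failback(pair)                      # VMs return to site-a
```

In this model the virtual machines end up back at the original site, while replication still points from site-b to site-a until the pair is reprotected once more.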

In disaster scenarios, failback steps vary with respect to the degree of failure at the protected site. For example, the failover could have been due to a storage system failure or the loss of the entire data center. The manual configuration of failback is important because the protected site may have a different hardware or SAN configuration after a disaster. After failback is configured, SRM can manage and automate it like any planned SRM failover. The recovery steps can differ based on the conditions of the last failover that occurred. If failback follows an unplanned failover, a full data re-mirroring between the two sites may be required. This step usually takes most of the time in a failback scenario.

All recovery plans in SRM include an initial attempt to synchronize data between the protection and recovery sites, even during a disaster recovery scenario.

During the disaster recovery, an initial attempt will be made to shut down the protection group’s virtual machines and establish a final synchronization between the sites. This is designed to ensure that virtual machines are static and quiescent before running the recovery plan, in order to minimize data loss wherever possible. If the protected site is no longer available, the recovery plan will continue to execute and will run to completion even if errors are encountered.
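The "attempt, then continue regardless" behavior described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the function `run_disaster_recovery`, its callback parameters, and the use of `ConnectionError` to signal an unreachable protected site are all hypothetical and do not reflect how SRM is implemented.

```python
def run_disaster_recovery(shutdown_vms, final_sync, recover_vms):
    """Try the protected-site steps, record failures, and recover anyway.

    All three arguments are hypothetical callbacks supplied by the caller.
    Returns the list of errors encountered along the way.
    """
    errors = []
    for name, step in (("shutdown", shutdown_vms), ("final sync", final_sync)):
        try:
            step()  # initial attempt against the protected site
        except ConnectionError as exc:
            # Protected site unavailable: note the error and keep going.
            errors.append(f"{name}: {exc}")
    recover_vms()  # recovery proceeds at the recovery site either way
    return errors


def unreachable():
    raise ConnectionError("protected site unavailable")


recovered = []
errors = run_disaster_recovery(
    shutdown_vms=unreachable,
    final_sync=unreachable,
    recover_vms=lambda: recovered.append("vm-1"),
)
```

Here both protected-site steps fail, yet the recovery callback still runs and the failures are reported to the caller rather than aborting the plan.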

This new attribute minimizes the possibility of data loss during a disaster recovery, balancing the requirement for virtual machine consistency with the ability to achieve aggressive recovery-point objectives.

