IBM z13s
Appendix I. GDPS Virtual Appliance
An HA/DR implementation has various levels:
• High availability (HA): The attribute of a system to provide service during defined periods
at agreed-upon levels by masking unplanned outages from users. HA employs component
duplication (hardware and software), automated failure detection, retry, bypass, and
reconfiguration.
• Continuous operations (CO): The attribute of a system to operate continuously and mask
planned outages from users. CO provides the means to minimize planned downtime
during maintenance windows. It employs nondisruptive hardware and software changes,
nondisruptive configuration, and software coexistence.
• Continuous availability (CA): The attribute of a system to deliver nondisruptive service to the
user 7 days a week, 24 hours a day (there are no planned or unplanned outages).
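The HA techniques named above (component duplication, automated failure detection, retry, and bypass) can be illustrated with a minimal Python sketch. It is not taken from GDPS or any IBM product: the function name `call_with_failover`, the `retries` parameter, and the idea of modeling duplicated components as plain callables are all assumptions made for illustration.

```python
from typing import Callable, Optional, Sequence


def call_with_failover(components: Sequence[Callable[[], str]],
                       retries: int = 2) -> str:
    """Return the first successful result from a list of duplicated components.

    Hypothetical helper: each component is tried up to `retries` times
    (retry); if it keeps failing, it is skipped (bypass) and the next
    duplicate is tried.
    """
    last_error: Optional[Exception] = None
    for component in components:
        for _ in range(retries):
            try:
                return component()
            except Exception as exc:
                # Failure detection: remember the error and retry.
                last_error = exc
        # Retries exhausted: bypass this component and try the next one.
    raise RuntimeError("all components failed") from last_error
```

A caller would pass the primary component first and its duplicates after it, for example `call_with_failover([primary, backup])`. The user of the service sees a result as long as at least one duplicate responds, which is the sense in which the outage is masked.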
A system outage can occur for various reasons. Outages can be categorized as either
planned or unplanned.
Planned outages can be caused by the following situations:
• Backups
• Operating system installation and maintenance
• Application software maintenance
• Hardware and software upgrades
Unplanned outages can be caused by the following situations:
• Non-disaster events such as:
  - Application failure
  - Operator errors (human error)
  - Power outages
  - Network failure
  - Hardware and software failures
• Disaster events such as:
  - Natural disasters or other catastrophes that damage the production facilities beyond
    usability (for example, fire, flood, earthquake, or bombing)
  - Failure of a regional power grid
  - Outages that require a recovery procedure at an off-site location
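The outage categories listed above can be written down as a small lookup structure, which is sometimes convenient when tagging incident records. The following Python sketch is illustrative only: the `OutageKind` enum, the `OUTAGE_CAUSES` table, the short cause labels, and the `is_planned` helper are hypothetical names, and the table covers only some of the causes from the lists.

```python
from enum import Enum


class OutageKind(Enum):
    PLANNED = "planned"
    UNPLANNED_NON_DISASTER = "unplanned, non-disaster"
    UNPLANNED_DISASTER = "unplanned, disaster"


# Causes come from the lists above; the short string labels are
# illustrative assumptions, not identifiers from any IBM product.
OUTAGE_CAUSES = {
    "backup": OutageKind.PLANNED,
    "os_maintenance": OutageKind.PLANNED,
    "application_maintenance": OutageKind.PLANNED,
    "hardware_or_software_upgrade": OutageKind.PLANNED,
    "application_failure": OutageKind.UNPLANNED_NON_DISASTER,
    "operator_error": OutageKind.UNPLANNED_NON_DISASTER,
    "power_outage": OutageKind.UNPLANNED_NON_DISASTER,
    "network_failure": OutageKind.UNPLANNED_NON_DISASTER,
    "natural_disaster": OutageKind.UNPLANNED_DISASTER,
    "regional_power_grid_failure": OutageKind.UNPLANNED_DISASTER,
}


def is_planned(cause: str) -> bool:
    """Return True if the given cause label maps to a planned outage."""
    return OUTAGE_CAUSES[cause] is OutageKind.PLANNED
```

An unknown label raises `KeyError`, which keeps unclassified causes from being silently treated as planned or unplanned.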
Automation is key when implementing an HA/DR solution. The major benefits of an automated
solution are as follows:
• Provides a reliable, consistent recovery time objective (RTO)
• Provides consistent and predictable recovery time as the environment scales
• Reduces infrastructure management cost and the staff skills required
• Reduces or eliminates human intervention, and therefore the probability of human error
• Facilitates regular testing for repeatable and reliable results of business continuity
procedures
• Helps maintain recovery readiness by managing and monitoring servers, data replication,
workload, and network, with notification of events that occur within the environment
