IBM Power 570 User Manual

Chapter 4. Continuous availability and manageability 95
Draft Document for Review September 2, 2008 5:05 pm 4405ch04 Continuous availability and manageability.fm
The use of redundant parts allows the system to remain operational:
• Redundant service processor
The redundant service processor function, which manages the service processors when one
fails, is available for system configurations with two or more CEC enclosures. This
function requires that the HMC be attached to the Service Interface Card in
both CEC enclosure 1 and CEC enclosure 2. The Service Interface Cards in these two
enclosures must be connected using an external Power Control cable (FC 6006 or
similar).
• Processor power regulators
A third Processor Power Regulator is required to provide redundant power support to
either one or two processor cards in the enclosure. All CEC enclosures are shipped with
three Processor Power Regulators (FC 5625) except for the system configurations with
one or two FC 5620 processors in a single CEC enclosure.
• Redundant spare memory bits in cache, directories, and main memory
• Redundant and hot-swap cooling
• Redundant and hot-swap power supplies
For maximum availability, it is highly recommended to connect the power cords from the same
system to two separate PDUs in the rack, and to connect each PDU to an independent power
source.
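The redundancy prerequisites listed above can be expressed as a simple configuration check. The following is a minimal sketch in Python. The `SystemConfig` structure, its field names, and both functions are hypothetical and do not correspond to any IBM tool, HMC command, or firmware interface.

```python
from dataclasses import dataclass, field


@dataclass
class SystemConfig:
    """Hypothetical description of one system; not an IBM data format."""
    cec_enclosures: int
    # Enclosure numbers whose Service Interface Card has the HMC attached.
    hmc_attached_enclosures: set = field(default_factory=set)
    # True if an external Power Control cable links the Service Interface Cards.
    power_control_cable: bool = False
    # Maps each PDU used by the system to the power source feeding that PDU.
    pdu_to_source: dict = field(default_factory=dict)


def service_processor_redundancy_gaps(cfg):
    """Return the unmet prerequisites for the redundant service processor function."""
    gaps = []
    if cfg.cec_enclosures < 2:
        gaps.append("two or more CEC enclosures are required")
    if not {1, 2} <= cfg.hmc_attached_enclosures:
        gaps.append("attach the HMC to the Service Interface Card in CEC enclosures 1 and 2")
    if not cfg.power_control_cable:
        gaps.append("connect the Service Interface Cards with an external Power Control cable")
    return gaps


def power_feed_gaps(cfg):
    """Return the unmet power-feed recommendations for maximum availability."""
    gaps = []
    if len(cfg.pdu_to_source) < 2:
        gaps.append("connect the power cords to two separate PDUs")
    elif len(set(cfg.pdu_to_source.values())) < len(cfg.pdu_to_source):
        gaps.append("connect each PDU to an independent power source")
    return gaps
```

An empty list from either function means that the corresponding recommendations in this sketch are satisfied. The checks only restate the text above and do not verify feature codes or cabling details.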
4.1.4 Continuous field monitoring
Aided by the IBM First Failure Data Capture (FFDC) methodology and the associated error
reporting strategy, commodity managers build an accurate profile of the types of failures that
might occur, and initiate programs to enable corrective actions. The IBM support team also
continually analyzes critical system faults, testing to determine if system firmware and
maintenance procedures and tools are effectively handling and recording faults as designed.
See section 4.3.1, “Detecting errors” on page 105.
4.2 Availability
IBM's extensive system of FFDC error checkers also supports a strategy of Predictive Failure
Analysis®: the ability to track intermittent correctable errors and to vary components off-line
before they reach the point of hard failure causing a crash.
This methodology supports IBM's autonomic computing initiative. The primary RAS design
goal of any POWER processor-based server is to prevent unexpected application loss due to
unscheduled server hardware outages. To accomplish this goal, the system has a quality
design that includes the following critical attributes:
• Self-diagnosis and self-correction during run time
• Automatic reconfiguration to mitigate potential problems from suspect hardware
• The ability to self-heal or to automatically substitute good components for failing
components
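The Predictive Failure Analysis strategy described in this section tracks intermittent correctable errors and takes a component off-line before it fails hard. It can be illustrated with a small threshold counter. This is a minimal sketch in Python, not the service processor's actual logic: the `ErrorTracker` class, the component names, and the error limit are assumptions made for illustration.

```python
from collections import Counter


class ErrorTracker:
    """Counts correctable errors per component and marks components off-line."""

    def __init__(self, error_limit):
        # error_limit is an assumed, illustrative threshold;
        # real limits are defined by the system firmware.
        self.error_limit = error_limit
        self.counts = Counter()
        self.offline = set()

    def record_correctable_error(self, component):
        """Record one correctable error; return True if the component is off-line."""
        if component in self.offline:
            return True
        self.counts[component] += 1
        if self.counts[component] >= self.error_limit:
            # The component is varied off-line before it reaches a hard failure.
            self.offline.add(component)
        return component in self.offline
```

For example, with `ErrorTracker(error_limit=3)`, the third correctable error reported against the same component marks it off-line, while other components are unaffected.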
4.2.1 Detecting and deallocating failing components
Runtime correctable or recoverable errors are monitored to determine if there is a pattern of
errors. If a component reaches a predefined error limit, the service processor initiates an
