114 IBM BladeCenter PS700, PS701, and PS702 Technical Overview and Introduction
On POWER7 processor-based servers, hardware and software failures are recorded in the
system log. An error log analysis (ELA) routine analyzes the error, forwards the event to the IVM Service Focal
Point (SFP) application running on the blade, and notifies the system administrator that it has
isolated a likely cause of the system problem. The service processor event log also records
unrecoverable checkstop conditions, forwards them to the SFP application and the
BladeCenter AMM, and notifies the system administrator. After the information is logged in
the SFP application and AMM event log, a call-home service request is initiated if the system
or BladeCenter AMM is properly configured. The pertinent failure data, with service parts
information and part locations, is sent to an IBM Service organization, along with customer
contact information, specific system-related data (such as the machine type, model, and
serial number), and error log data related to the failure.
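The reporting path described above can be summarized in a short sketch. This is an illustrative model only, not IBM code: the Event class, the report_path function, and the step names are hypothetical, and the sketch assumes that only unrecoverable checkstop conditions are forwarded to the AMM.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Event:
    """Hypothetical stand-in for a logged failure."""
    description: str
    checkstop: bool = False  # True for an unrecoverable checkstop condition


def report_path(event: Event, call_home_configured: bool) -> List[str]:
    """Return the ordered reporting steps for one event, as described in the text."""
    steps = ["log in SFP application"]
    if event.checkstop:
        # Assumption: only checkstops are also forwarded to the BladeCenter AMM.
        steps.append("log in AMM event log")
    steps.append("notify system administrator")
    if call_home_configured:
        # Call home happens only when the system or AMM is configured for it.
        steps.append("initiate call-home service request")
    return steps


if __name__ == "__main__":
    for step in report_path(Event("example checkstop", checkstop=True), True):
        print(step)
```

The list form keeps the ordering explicit: logging always precedes notification, and the call-home request is the last, optional step.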
Error logging and analysis
When the root cause of an error has been identified by a fault isolation component, an error
log entry is created with basic data:
- An error code uniquely describing the error event
- The location of the failing component
- The part number of the component to be replaced, including pertinent data such as
  engineering and manufacturing levels
- Return codes
- Resource identifiers
- First-failure data capture (FFDC) data
The entry also includes data about the effect that the repair will have on the system. Error
log routines in the operating system can use this information to decide whether to contact
service and support, send a notification message, or continue without an alert.
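As a rough illustration, the error log entry and the three possible outcomes can be modeled as follows. The field names, the two boolean repair-effect flags, and the decision policy are all assumptions made for this sketch. The source lists what an entry contains but does not specify its layout or the rules the operating system applies.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class Action(Enum):
    CONTACT_SERVICE = "contact service and support"
    NOTIFY = "send a notification message"
    CONTINUE = "continue without an alert"


@dataclass
class ErrorLogEntry:
    """Hypothetical layout for the basic data listed above."""
    error_code: str   # uniquely describes the error event
    location: str     # location of the failing component
    part_number: str  # part to be replaced
    part_levels: Dict[str, str] = field(default_factory=dict)  # engineering/manufacturing levels
    return_codes: List[int] = field(default_factory=list)
    resource_ids: List[str] = field(default_factory=list)
    ffdc_data: bytes = b""  # first-failure data capture data
    # Assumed encoding of "the effect that the repair will have on the system".
    needs_replacement: bool = False
    repair_disrupts_system: bool = False


def choose_action(entry: ErrorLogEntry) -> Action:
    """Illustrative policy only; the real decision rules are not given in the text."""
    if entry.needs_replacement:
        return Action.CONTACT_SERVICE
    if entry.repair_disrupts_system:
        return Action.NOTIFY
    return Action.CONTINUE


if __name__ == "__main__":
    # Placeholder values, not real IBM error codes or part numbers.
    entry = ErrorLogEntry("EXAMPLE-CODE", "EXAMPLE-LOCATION", "EXAMPLE-PART",
                          needs_replacement=True)
    print(choose_action(entry).value)
```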
Service Focal Point
A critical requirement in a logically partitioned environment is to ensure that errors are not lost
before being reported for service, and that each error is reported only once, regardless
of how many logical partitions experience its potential effect. The Manage
Serviceable Events task, under the Service Focal Point section of the IVM user interface
(Figure 4-3 on page 115), aggregates duplicate error reports and ensures
that all errors are recorded for review and management on the single blade that IVM controls.
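The aggregation requirement can be sketched as a small deduplication routine. This is a minimal sketch, not the SFP implementation: the aggregate_reports function is hypothetical, and it assumes that two reports describe the same error when their error code and location match, which the source does not state.

```python
from typing import Dict, Iterable, List, Tuple

# One report: (partition name, error code, location of the failing component).
Report = Tuple[str, str, str]


def aggregate_reports(reports: Iterable[Report]) -> Dict[Tuple[str, str], List[str]]:
    """Collapse duplicate reports into one serviceable event per (error code, location).

    Each event keeps the list of partitions that reported it, in first-seen order,
    so no report is lost even though the error is recorded only once.
    """
    events: Dict[Tuple[str, str], List[str]] = {}
    for partition, error_code, location in reports:
        partitions = events.setdefault((error_code, location), [])
        if partition not in partitions:
            partitions.append(partition)
    return events


if __name__ == "__main__":
    # Placeholder partition names, codes, and locations.
    sample = [
        ("lpar1", "CODE-A", "LOC-1"),
        ("lpar2", "CODE-A", "LOC-1"),  # same error seen by a second partition
        ("lpar1", "CODE-B", "LOC-2"),
    ]
    for key, partitions in aggregate_reports(sample).items():
        print(key, partitions)
```

Keying on the error rather than on the reporting partition is what allows one shared hardware fault to produce a single service request.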