Chapter 4. Continuous availability and manageability  181
4.3.3  Reporting
In the unlikely event that a system hardware or environmentally induced failure is diagnosed, 
IBM Power Systems servers report the error through a number of mechanisms. The analysis 
result is stored in system NVRAM. Error log analysis (ELA) can be used to display the failure 
cause and the physical location of the failing hardware.
With the integrated service processor, the system can automatically send an alert through a 
phone line to a pager, or call for service, in the event of a critical system failure. 
A hardware fault also illuminates the amber system fault LED, located on the system unit, to 
alert the user of an internal hardware problem.
On POWER7+ processor-based servers, hardware and software failures are recorded in the 
system log. When a management console is attached, an ELA routine analyzes the error, 
forwards the event to the Service Focal Point (SFP) application running on the management 
console, and can notify the system administrator that it has isolated a likely 
cause of the system problem. The service processor event log also records unrecoverable 
checkstop conditions, forwards them to the SFP application, and notifies the system 
administrator. After the information is logged in the SFP application, if the system is properly 
configured, a call-home service request is initiated and the pertinent failure data with service 
parts information and part locations is sent to the IBM service organization. This information 
also contains the client contact information as defined in the Electronic Service Agent 
(ESA) guided set-up wizard.
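The reporting path described above can be sketched as a short routine. This is only an illustration of the ordering of the steps, not IBM code: the function name, its parameters, and the step strings are all hypothetical.

```python
from typing import List


def handle_failure(cause: str,
                   console_attached: bool,
                   call_home_configured: bool,
                   client_contact: str) -> List[str]:
    """Return the ordered reporting steps for one failure (illustrative only)."""
    # Failures are always recorded in the system log.
    steps = [f"recorded in system log: {cause}"]

    # Without a management console, nothing is forwarded to SFP.
    if not console_attached:
        return steps

    steps.append("analyzed by ELA")
    steps.append("forwarded to SFP")
    steps.append("system administrator notified")

    # Call home happens only after SFP logging, and only if configured.
    if call_home_configured:
        steps.append(
            "call-home request sent with failure data, parts information, "
            f"part locations, and client contact {client_contact}"
        )
    return steps
```

For example, calling handle_failure with console_attached=True and call_home_configured=False returns four steps and stops before the call-home request.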
Error logging and analysis
When the root cause of an error is identified by a fault isolation component, an error log entry 
is created with basic data such as the following examples:
• An error code that uniquely describes the error event
• The location of the failing component
• The part number of the component to be replaced, including pertinent data such as 
  engineering and manufacturing levels
• Return codes
• Resource identifiers
• FFDC data
Data that describes the effect that the repair will have on the system is also 
included. Error log routines in the operating system and FSP can then use this information 
to decide whether the fault is a call-home candidate. If the fault requires support 
intervention, a call is placed with service and support, and a notification is sent to the 
contact that is defined in the ESA guided set-up wizard.
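As a rough sketch, the error log entry and the call-home decision might be modeled as follows. The class, field names, and the needs_support flag are hypothetical and chosen for illustration. They do not reflect the actual log format or the decision logic used by the operating system or FSP.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ErrorLogEntry:
    """Hypothetical model of the basic data listed above; not an IBM format."""
    error_code: str                # uniquely describes the error event
    location: str                  # location of the failing component
    part_number: str               # part to be replaced
    engineering_level: str = ""    # pertinent part data
    manufacturing_level: str = ""
    return_codes: List[int] = field(default_factory=list)
    resource_ids: List[str] = field(default_factory=list)
    ffdc: bytes = b""              # FFDC data captured at the time of failure
    repair_effect: str = ""        # effect the repair will have on the system
    needs_support: bool = False    # assumed flag: support intervention required


def is_call_home_candidate(entry: ErrorLogEntry) -> bool:
    """Assumed policy: call home only when support intervention is required."""
    return entry.needs_support


def report(entry: ErrorLogEntry, contact: str) -> List[str]:
    """Return the actions taken for an entry, as plain strings."""
    if not is_call_home_candidate(entry):
        return []
    return [
        f"call placed with service and support for {entry.error_code}",
        f"notification sent to {contact}",
    ]
```

In this sketch, an entry with needs_support=False produces no actions, while one with needs_support=True produces both the service call and the notification to the configured contact.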
Remote support
The Resource Monitoring and Control (RMC) subsystem is delivered as part of the base 
operating system, including the operating system that runs on the Hardware Management 
Console. RMC provides a secure transport mechanism across the LAN interface between the 
operating system and the Hardware Management Console and is used by the operating 
system diagnostic application for transmitting error information. It also performs a number of 
other functions, but these are not used for the service infrastructure.