Chapter 4. Continuous availability and manageability 163
Figure 4-4 shows a schematic of fault isolation register implementation.
Figure 4-4 Schematic of FIR implementation
Fault isolation
The service processor interprets the error data that the FFDC checkers capture (saved in
the FIRs or through other firmware-related data capture methods) to determine the root cause
of the error event.
Root cause analysis might indicate that the event is recoverable, meaning that a service
action point or need for repair has not been reached. Alternatively, it might indicate that a
service action point has been reached, where the event exceeded a predetermined threshold
or was unrecoverable. Based on the isolation analysis, recoverable error-threshold counts
can be incremented. No specific service action is necessary when the event is recoverable.
When the event requires a service action, additional required information is collected to
service the fault. For unrecoverable errors or for recoverable events that meet or exceed their
service threshold, meaning that a service action point has been reached, a request for
service is initiated through an error logging component.
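The decision flow described above can be sketched in a few lines of Python. This is only an illustrative model, not the service processor firmware: the class name FaultIsolator, the signature strings, and the threshold values are all hypothetical, chosen to show how a recoverable event increments a threshold count while an unrecoverable event, or a recoverable one that meets its threshold, results in a service request.

```python
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical default; actual thresholds are predetermined per error type.
DEFAULT_THRESHOLD = 3


@dataclass
class FaultIsolator:
    """Toy model of threshold-based fault isolation (all names are assumptions)."""
    thresholds: dict = field(default_factory=dict)        # signature -> service threshold
    counts: Counter = field(default_factory=Counter)      # recoverable-error counts
    service_requests: list = field(default_factory=list)  # stand-in for the error log

    def handle(self, fir_signature: str, recoverable: bool) -> str:
        """Classify one error event identified by its FIR signature."""
        if recoverable:
            # Recoverable event: increment its threshold count.
            self.counts[fir_signature] += 1
            limit = self.thresholds.get(fir_signature, DEFAULT_THRESHOLD)
            if self.counts[fir_signature] < limit:
                # Service action point not reached; nothing to do.
                return "no_action"
        # Unrecoverable, or the count met or exceeded the threshold:
        # request service through the (simulated) error logging component.
        self.service_requests.append(fir_signature)
        return "service_requested"


if __name__ == "__main__":
    isolator = FaultIsolator(thresholds={"L2_CE": 2})
    print(isolator.handle("L2_CE", recoverable=True))
    print(isolator.handle("L2_CE", recoverable=True))
    print(isolator.handle("MEM_UE", recoverable=False))
```

In this sketch, the first correctable event on the hypothetical "L2_CE" signature stays below its threshold, the second one meets it and triggers a service request, and the unrecoverable "MEM_UE" event triggers a request immediately.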
[Figure 4-4 labels: CPU, L1, L2 / L3, Memory, Disk; Error checkers; Fault isolation register (FIR), unique fingerprint of each captured error; Service Processor; Log error; Non-volatile RAM]