Intel® 5000 Series Chipsets Server Board Family Datasheet Error Reporting and Handling
Revision 1.1
Intel order number D38960-004
5.2 Error Handling and Logging
This section defines how the system BIOS handles errors. It covers the role of the BIOS in
error handling and the interaction between the BIOS, platform hardware, and server management
firmware. It also describes error-logging techniques and defines the beep codes for errors.
5.2.1 Error Sources and Types
Server management must correctly and consistently handle system errors. System errors that
can be enabled and disabled, individually or as a group, fall into the following categories:
PCI bus
Memory single- and multi-bit errors
Sensors
Errors detected during POST and logged as POST errors
Sensors are managed by the BMC. The BMC is capable of receiving event messages from
individual sensors and logging system events.
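
The idea of enabling and disabling error categories individually or as a group can be pictured as a bit mask with one bit per category. The following is only an illustrative sketch: the names ErrorSource, ALL_SOURCES, enable, disable, and is_enabled are hypothetical and do not come from the BIOS or BMC firmware, and the bit values are arbitrary.

```python
from enum import Flag


class ErrorSource(Flag):
    """Hypothetical bit per error category listed above (values are arbitrary)."""
    PCI_BUS = 1
    MEMORY = 2    # single- and multi-bit memory errors
    SENSORS = 4
    POST = 8


# Convenience constant for "every category as a group" (an assumption of this sketch).
ALL_SOURCES = (ErrorSource.PCI_BUS | ErrorSource.MEMORY
               | ErrorSource.SENSORS | ErrorSource.POST)


def enable(mask: ErrorSource, sources: ErrorSource) -> ErrorSource:
    """Return a mask with the given category or group of categories enabled."""
    return mask | sources


def disable(mask: ErrorSource, sources: ErrorSource) -> ErrorSource:
    """Return a mask with the given category or group of categories disabled."""
    return mask & ~sources


def is_enabled(mask: ErrorSource, source: ErrorSource) -> bool:
    """True only if every bit in `source` is set in `mask`."""
    return (mask & source) == source


if __name__ == "__main__":
    mask = disable(ALL_SOURCES, ErrorSource.MEMORY)            # one category
    mask = disable(mask, ErrorSource.PCI_BUS | ErrorSource.POST)  # a group
    print(mask, is_enabled(mask, ErrorSource.SENSORS))
```

A single flag value covers the individual case and an OR of several flags covers the group case, so the same two functions serve both.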
5.2.2 Error Logging via SMI Handler
The SMI handler handles and logs system-level events that are not visible to server
management firmware. The SMI handler pre-processes all system errors, even those that
would normally generate an NMI.
The SMI handler sends a command to the BMC to log the event and provides the data to be
logged. For example, the BIOS programs the hardware to generate an SMI on a single-bit
memory error and logs the location of the failed FBDIMM in the system event log. System
events that are handled by the BIOS generate SMIs. After the BIOS finishes logging the error,
it asserts the NMI if needed.
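
The ordering described above (log the event through the BMC first, then assert the NMI only if one is needed) can be modeled in a few lines. This is a simplified sketch, not the actual SMI handler: SystemEvent, RecordingBmc, smi_handler, and the needs_nmi flag are hypothetical names, the slot labels in the usage example are made up, and how the real BIOS decides that an NMI is needed is not specified here.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class SystemEvent:
    """Hypothetical description of an error that triggered an SMI."""
    kind: str        # e.g. "single-bit memory error"
    location: str    # e.g. the label of the failed FBDIMM
    needs_nmi: bool  # assumption: decided elsewhere, passed in as a flag


class RecordingBmc:
    """Stand-in for the BMC: it only records what it is asked to log."""

    def __init__(self) -> None:
        self.logged: List[Tuple[str, str]] = []

    def log_event(self, kind: str, location: str) -> None:
        self.logged.append((kind, location))


def smi_handler(event: SystemEvent, bmc: RecordingBmc,
                assert_nmi: Callable[[], None]) -> bool:
    """Log the event via the BMC, then assert the NMI if the event needs one.

    Returns True if the NMI was asserted.
    """
    bmc.log_event(event.kind, event.location)  # logging always happens first
    if event.needs_nmi:
        assert_nmi()
        return True
    return False


if __name__ == "__main__":
    bmc = RecordingBmc()
    event = SystemEvent("single-bit memory error", "FBDIMM A1", needs_nmi=False)
    smi_handler(event, bmc, assert_nmi=lambda: print("NMI asserted"))
    print(bmc.logged)
```

Passing the NMI action in as a callable keeps the ordering visible and lets the sketch run without any hardware access.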
5.2.2.1 PCI Bus Error
The PCI bus defines two error pins, PERR# and SERR#. These are used for reporting PCI
parity errors and system errors, respectively. The BIOS can be instructed to enable or disable
reporting PERR# and SERR# through NMI. Disabling NMI for PERR# and/or SERR# also
disables logging of the corresponding event.
In the case of PERR#, the PCI bus master has the option to retry the offending transaction, or to
report it using SERR#. All other PCI-related errors are reported by SERR#. All PCI-to-PCI
bridges are configured so that they generate SERR# on the primary interface whenever there is
SERR# on the secondary side, as long as SERR# is enabled in BIOS Setup. The same is true
for PERR#.
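
The coupling between NMI reporting and logging, and the bridge forwarding rule, can be summarized in a small model. This is a rough sketch under stated assumptions: PciErrorSetup, bridge_forwards, and report_pci_error are hypothetical names, and the sketch treats the NMI reporting option and the BIOS Setup enable for each signal as one and the same switch, which the text above does not state explicitly.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PciErrorSetup:
    """Hypothetical BIOS Setup options, one switch per PCI error signal."""
    perr_enabled: bool = True
    serr_enabled: bool = True

    def enabled(self, signal: str) -> bool:
        if signal == "PERR#":
            return self.perr_enabled
        if signal == "SERR#":
            return self.serr_enabled
        raise ValueError(f"unknown PCI error signal: {signal!r}")


def bridge_forwards(signal: str, setup: PciErrorSetup) -> bool:
    """A PCI-to-PCI bridge passes the signal from its secondary side to its
    primary interface only while that signal is enabled."""
    return setup.enabled(signal)


def report_pci_error(signal: str, setup: PciErrorSetup,
                     bridges_crossed: int = 0) -> Tuple[bool, bool]:
    """Return (logged, nmi_asserted) for an error raised behind N bridges."""
    for _ in range(bridges_crossed):
        if not bridge_forwards(signal, setup):
            return (False, False)  # the signal never reaches the host side
    reported = setup.enabled(signal)
    # Disabling NMI for a signal also disables logging of the event.
    return (reported, reported)


if __name__ == "__main__":
    print(report_pci_error("SERR#", PciErrorSetup(), bridges_crossed=2))
    print(report_pci_error("PERR#", PciErrorSetup(perr_enabled=False)))
```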
5.2.2.2 PCI Express* Errors
Fatal and critical PCI Express* errors are logged as PCI system errors and promoted to an NMI.
All non-critical PCI Express errors are logged as PCI parity errors.
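
The severity mapping above amounts to a two-way classification, sketched below. PcieSeverity, Disposition, and classify_pcie_error are hypothetical names. The text does not say whether non-critical errors raise an NMI; the sketch assumes they do not.

```python
from enum import Enum
from typing import NamedTuple


class PcieSeverity(Enum):
    """Hypothetical severity labels matching the wording above."""
    FATAL = "fatal"
    CRITICAL = "critical"
    NON_CRITICAL = "non-critical"


class Disposition(NamedTuple):
    logged_as: str  # the event type recorded in the log
    nmi: bool       # whether the error is promoted to an NMI


def classify_pcie_error(severity: PcieSeverity) -> Disposition:
    """Map a PCI Express error severity to its log type and NMI behavior."""
    if severity in (PcieSeverity.FATAL, PcieSeverity.CRITICAL):
        return Disposition(logged_as="PCI system error", nmi=True)
    # Assumption: non-critical errors are logged only, with no NMI.
    return Disposition(logged_as="PCI parity error", nmi=False)


if __name__ == "__main__":
    for severity in PcieSeverity:
        print(severity.value, classify_pcie_error(severity))
```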