Intel® Server Board SE7520BD2 Technical Product Specification Error Reporting and Handling
Revision 1.3 Intel Confidential
5.4.2 SMI Handler
The SMI handler is used to handle and log system level events that are not visible to the server
management firmware. If SEL Error Logging is disabled in Setup, no SMI signals are
generated on system errors. If it is enabled, the SMI handler preprocesses all system errors, even
those that would normally generate an NMI. The SMI handler sends a command to
the BMC to log the event and provides the data to be logged. For example, the BIOS programs the
hardware to generate an SMI on a single-bit memory error and logs the location of the failed
DIMM in the SEL. System events that are handled by the BIOS generate an SMI.
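The enable/disable behavior described above can be sketched in C as follows. This is only an illustration under stated assumptions: the sys_error_t values, the bmc_log_fn callback type, and smi_handle_error are hypothetical names, not part of the BIOS or BMC interface, and the actual BMC command is abstracted behind the callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical error classes; the real BIOS encoding is not given here. */
typedef enum {
    ERR_PCI_PERR,
    ERR_PCI_SERR,
    ERR_PROC_BUS,
    ERR_MEM_SINGLE_BIT,
    ERR_MEM_DOUBLE_BIT
} sys_error_t;

/* Hypothetical stand-in for "send a command to the BMC to log the event". */
typedef void (*bmc_log_fn)(sys_error_t err, void *ctx);

/*
 * Models the policy in the text: with SEL Error Logging disabled no SMI is
 * generated, so nothing is logged; with it enabled, every system error is
 * preprocessed by the SMI handler and passed to the BMC.
 * Returns true if the event was handed to the BMC.
 */
bool smi_handle_error(bool sel_logging_enabled, sys_error_t err,
                      bmc_log_fn log_to_bmc, void *ctx)
{
    assert(log_to_bmc != NULL);
    if (!sel_logging_enabled)
        return false;          /* no SMI is raised in this configuration */
    log_to_bmc(err, ctx);
    return true;
}

/* Example callback that only counts events; ctx points to an unsigned counter. */
void count_logged_event(sys_error_t err, void *ctx)
{
    (void)err;
    *(unsigned *)ctx += 1;
}
```

Passing count_logged_event with a pointer to a counter shows the two cases: the counter stays at zero when logging is disabled and increments once per error when it is enabled.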
5.4.2.1 PCI Bus Error
The PCI bus defines two error pins, PERR# for reporting parity errors, and SERR# for reporting
system errors. The BIOS can be instructed to enable or disable reporting PERR# and SERR#
errors through an NMI¹. For PERR#, the PCI bus master has the option to retry the offending
transaction, or to report it using SERR#. All other PCI-related errors are reported by SERR#.
SERR# is routed to the NMI if bit 2 of I/O port 61h is set to 0. If SERR# is enabled in BIOS
Setup, all PCI-to-PCI bridges generate an SERR# on the primary interface whenever an
SERR# occurs on the secondary side of the bridge. The same is true for PERR#.
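The bit-2 routing rule can be illustrated with a small C sketch that works on a value already read from the register. The macro and function names are hypothetical. A real handler would access the port with platform I/O instructions and would need to account for the other bits of that register, which this section does not describe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 2 of the register: 0 routes SERR# to NMI, 1 does not (per the text). */
#define PORT61_SERR_BIT 0x04u

/* True when the given register value routes SERR# to the NMI. */
bool serr_routed_to_nmi(uint8_t port61_value)
{
    return (port61_value & PORT61_SERR_BIT) == 0;
}

/*
 * Returns the register value with only bit 2 changed so that SERR# is, or is
 * not, routed to the NMI. All other bits are passed through unchanged.
 */
uint8_t with_serr_nmi_routing(uint8_t port61_value, bool route_to_nmi)
{
    uint8_t result = route_to_nmi
        ? (uint8_t)(port61_value & ~PORT61_SERR_BIT)
        : (uint8_t)(port61_value | PORT61_SERR_BIT);
    assert(serr_routed_to_nmi(result) == route_to_nmi);
    return result;
}
```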
5.4.2.2 Processor Bus Error
The BIOS enables the error correction and detection capabilities of the processors by setting
appropriate bits in the processor model specific register (MSR) and appropriate bits inside the
chipset.
In the case of irrecoverable errors on the host processor bus, proper execution of the SMI
handler cannot be guaranteed and the SMI handler cannot be relied upon to log such
conditions. The BIOS SMI handler will record the error to the SEL only if the system has not
experienced a catastrophic failure that compromises the integrity of the SMI handler.
5.4.2.3 Memory Bus Error
The hardware is programmed to generate an SMI on single-bit data errors in the memory array
if ECC memory is installed. The SMI handler records the error and the DIMM location to the
SEL. Double-bit errors in the memory array are mapped to the SMI because the BMC cannot
determine the location of the bad DIMM. Because a double-bit error may have corrupted the
contents of SMRAM, the SMI handler logs the failing DIMM number to the BMC only if the
SMRAM contents are still valid. The ability to isolate the failure to a single DIMM may not be
available on certain platforms and/or during early POST.
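As an illustration of the data the SMI handler might hand to the BMC for a memory error, the following C sketch fills in a 16-byte record using the generic IPMI SEL system event layout. The generator ID, sensor number, and the placement of the DIMM index in event data 3 are assumptions made for this example. This section does not give the board's actual encoding, and the record ID and timestamp are left at zero on the assumption that the BMC fills them in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SEL_RECORD_LEN               16
#define SEL_RECORD_TYPE_SYSTEM       0x02  /* IPMI system event record */
#define SEL_EVM_REV                  0x04  /* IPMI 1.5/2.0 event message format */
#define SENSOR_TYPE_MEMORY           0x0C  /* IPMI sensor type: Memory */
#define EVENT_TYPE_SENSOR_SPECIFIC   0x6F
#define MEM_OFFSET_CORRECTABLE_ECC   0x00  /* single-bit (correctable) error */
#define MEM_OFFSET_UNCORRECTABLE_ECC 0x01  /* double-bit (uncorrectable) error */

/* Placeholder values, NOT taken from the board documentation. */
#define HYP_GENERATOR_ID   0x0001u  /* hypothetical BIOS software ID */
#define HYP_MEM_SENSOR_NUM 0x08u    /* hypothetical memory sensor number */

/* Builds a SEL system event record for an ECC error on the given DIMM. */
void build_memory_ecc_sel_record(uint8_t rec[SEL_RECORD_LEN],
                                 bool uncorrectable, uint8_t dimm_index)
{
    assert(rec != NULL);
    memset(rec, 0, SEL_RECORD_LEN);  /* record ID (0-1), timestamp (3-6) stay 0 */
    rec[2]  = SEL_RECORD_TYPE_SYSTEM;
    rec[7]  = (uint8_t)(HYP_GENERATOR_ID & 0xFFu);
    rec[8]  = (uint8_t)(HYP_GENERATOR_ID >> 8);
    rec[9]  = SEL_EVM_REV;
    rec[10] = SENSOR_TYPE_MEMORY;
    rec[11] = (uint8_t)HYP_MEM_SENSOR_NUM;
    rec[12] = EVENT_TYPE_SENSOR_SPECIFIC;   /* bit 7 clear: assertion event */
    /* Event data 1: bits 5:4 = 10b marks event data 3 as an OEM code (assumed). */
    rec[13] = (uint8_t)(0x20u | (uncorrectable ? MEM_OFFSET_UNCORRECTABLE_ECC
                                               : MEM_OFFSET_CORRECTABLE_ECC));
    rec[14] = 0xFF;                         /* event data 2 unspecified */
    rec[15] = dimm_index;                   /* assumed DIMM location encoding */
}
```

A handler following the text would call this only after confirming that SMRAM contents are still valid, since the DIMM index itself is derived from state held there.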
5.4.2.4 System Limit Error
The BMC monitors system operational limits. It manages the A/D converter, defines voltage
and temperature limits, and monitors fan sensors and chassis intrusion. Any sensor values outside
of specified limits are fully handled by the BMC. The BIOS does not generate an SMI to the host
processor for these types of system events.
Refer to the platform's BMC External Product Specification for details on various sensors and
how they are managed.
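The kind of limit check the BMC performs can be sketched in C as below. This is a generic illustration, not the BMC firmware's implementation: the sensor_limits_t structure, the state names, and the use of raw integer readings are all assumptions, and the actual sensors and thresholds are defined in the BMC External Product Specification.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical limit pair for one sensor, in raw A/D counts. */
typedef struct {
    int lower;
    int upper;
} sensor_limits_t;

typedef enum {
    SENSOR_OK,
    SENSOR_BELOW_LIMIT,
    SENSOR_ABOVE_LIMIT
} sensor_state_t;

/* Classifies a reading against its limits; values on a limit count as OK. */
sensor_state_t check_sensor(const sensor_limits_t *lim, int reading)
{
    assert(lim != NULL && lim->lower <= lim->upper);
    if (reading < lim->lower)
        return SENSOR_BELOW_LIMIT;
    if (reading > lim->upper)
        return SENSOR_ABOVE_LIMIT;
    return SENSOR_OK;
}
```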
¹ Disabling NMI for PERR# and/or SERR# also disables logging of the corresponding event.