Intel 5000 Series User Manual

Intel® 5000 Series Chipsets Server Board Family Datasheet System BIOS
Revision 1.1
Intel order number D38960-004
3.3.10.1.2 Faulty Links
FBDIMM technology is a serial technology. Therefore, errors or failures can occur on the serial
path between FBDIMMs. These errors are different from ECC errors, and do not necessarily
occur as a result of faulty FBDIMMs. The BIOS keeps track of such link-level failures.
In general, when a link failure occurs, the BIOS will disable all FBDIMMs on that link. If all
installed FBDIMMs are on the same faulty link, the BIOS will generate POST code 0xE1 to
indicate that the system has no usable memory, and then halt the system.
If a link failure occurs during normal operation at runtime (after POST), the BIOS will signal a
fatal error and apply the policies related to fatal error handling.
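The POST-time link policy described above can be sketched as follows. This is a minimal illustration, not the BIOS implementation: the function name, the link identifiers, and the FBDIMM names are hypothetical, and only the POST code 0xE1 comes from the text.

```python
POST_CODE_NO_USABLE_MEMORY = 0xE1  # POST code named in the text


def handle_link_failures(dimms_by_link, faulty_links):
    """Return (usable_dimms, post_code) for a hypothetical memory map.

    dimms_by_link: dict mapping a link id to the list of FBDIMMs on that link
    faulty_links:  set of link ids on which a link-level failure was detected
    post_code is None while usable memory remains.
    """
    usable = []
    for link, dimms in dimms_by_link.items():
        if link in faulty_links:
            continue  # every FBDIMM on a faulty link is disabled
        usable.extend(dimms)
    if not usable:
        # No usable memory: report the POST code; the caller would then halt.
        return usable, POST_CODE_NO_USABLE_MEMORY
    return usable, None
```

For example, with FBDIMMs on links "A" and "B" and a failure on "A", only the FBDIMMs on "B" remain usable. If every installed FBDIMM sits on "A", the function returns 0xE1.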
The BIOS handles memory errors through a variety of platform-specific policies. Each of these
policies is aimed at giving the system administrator comprehensive diagnostic support
for system recovery following the failure.
The BIOS uses error counters on the Intel® 5000 Series Chipsets and internal software counters
to track the number of correctable and multi-bit correctable errors that occur at runtime. The
chipset increments these counters when an error occurs. The count also decays at
a given rate, programmable by the BIOS. Because of this behavior, the counters
are termed leaky bucket counters.
3.3.10.1.3 Error Counters and Thresholds
The leaky bucket counters provide a measurement of the frequency of errors. The BIOS
configures and uses the leaky bucket counters and the decay rate such that it can be notified of
a failing FBDIMM. A failing FBDIMM will typically generate a burst of errors in a short period of
time, which is detected by the leaky bucket algorithm. The chipset maintains separate internal
leaky bucket counters for correctable and multi-bit correctable errors respectively.
The BIOS initializes the correctable ECC error leaky bucket counters to a value of 10.
These counters are maintained on a per-rank basis. A rank is a pair of FBDIMMs on
adjacent channels functioning in lock-stepped mode.
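The leaky bucket behavior can be illustrated with a small software model. This sketch rests on several assumptions: the counter counts up toward a threshold of 10, the decay removes one count per fixed interval, and the class name, the `decay_interval` parameter, and the 60-second value are all hypothetical. The actual chipset register behavior and decay rate are not given in the text.

```python
class LeakyBucketCounter:
    """Software model of one per-rank leaky bucket counter (illustrative only)."""

    def __init__(self, threshold=10, decay_interval=60.0):
        self.threshold = threshold            # value of 10 taken from the text
        self.decay_interval = decay_interval  # seconds per leaked count (assumed)
        self.count = 0
        self.last_decay = 0.0

    def _decay(self, now):
        # Leak one count for every full decay interval that has elapsed.
        leaked = int((now - self.last_decay) // self.decay_interval)
        if leaked > 0:
            self.count = max(0, self.count - leaked)
            self.last_decay += leaked * self.decay_interval

    def record_error(self, now):
        """Record one error at time `now`; return True once the threshold is reached."""
        self._decay(now)
        self.count += 1
        return self.count >= self.threshold
```

Under these assumptions, ten errors arriving one second apart reach the threshold, while errors arriving two minutes apart never accumulate, because each one has leaked away before the next arrives. That is the burst-detection property attributed to the counters above.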
3.3.10.1.3.1 BIOS Policies on Correctable Errors
For each correctable error that occurs before the threshold is reached, the BIOS will log a
Correctable Error SEL entry. No other action will be taken, and the system will continue to
function normally.
When the error count reaches the threshold of 10, the BIOS logs a SEL entry to indicate the
correctable error. In addition, the following steps occur:
1. If sparing is enabled, the chipset initiates a spare fail-over to a spare FBDIMM. In all
other memory configurations, future correctable errors are masked and no longer
reported to the SEL.
2. The BIOS logs a Max Threshold Reached SEL event.
3. The BIOS sends a DIMM Failed event to the BMC. This causes the BMC to light the
system fault LEDs to indicate memory performance degradation and an assertion of the
failed FBDIMM.
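The correctable error policy above can be summarized as a decision function. This is a sketch under stated assumptions: the function name and the action strings are hypothetical labels for the steps in the text, not actual SEL record formats or BMC commands, and `count` is assumed to be the rank's error count including the current error.

```python
def on_correctable_error(count, sparing_enabled, threshold=10):
    """Return the ordered list of actions taken for one correctable error."""
    # A Correctable Error SEL entry is logged both below and at the threshold.
    actions = ["log SEL: Correctable Error"]
    if count < threshold:
        return actions  # below threshold: no other action is taken

    # Step 1: spare fail-over if sparing is enabled, otherwise mask future errors.
    if sparing_enabled:
        actions.append("chipset: spare fail-over to spare FBDIMM")
    else:
        actions.append("mask future correctable errors")
    # Step 2: record that the threshold was reached.
    actions.append("log SEL: Max Threshold Reached")
    # Step 3: notify the BMC, which lights the system fault LEDs.
    actions.append("send DIMM Failed event to BMC")
    return actions
```

Calling `on_correctable_error(3, sparing_enabled=True)` yields only the SEL log action. Calling it with a count of 10 yields all four actions, with the second depending on whether sparing is enabled.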
