Intel® Server System S7000FC4UR TPS BIOS Initialization
Revision 1.0
14.2.14.4.4 Recoverable Error Handling in Redundant / Mirror Mode
Memory write cycles are issued to both mirror domains (that is, both MCH branches). Memory read
cycles can be issued to either mirror domain (branch).
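The write and read behavior described above can be illustrated with a small software model. This is only a sketch for the reader, not a description of the chipset implementation: the class name `MirroredMemory`, the dictionary-based storage, and the alternating read selector are all assumptions introduced for illustration.

```python
class MirroredMemory:
    """Toy model of two mirror domains (branches) that hold identical data."""

    def __init__(self):
        self.branches = [{}, {}]  # branch 0 and branch 1 (hypothetical storage)
        self.next_read = 0        # hypothetical selector; real hardware policy is not specified here

    def write(self, addr, value):
        # A write cycle is issued to both mirror domains.
        for branch in self.branches:
            branch[addr] = value

    def read(self, addr):
        # A read cycle may be served by either branch; alternate for illustration.
        branch = self.branches[self.next_read]
        self.next_read ^= 1
        return branch[addr]


mem = MirroredMemory()
mem.write(0x10, 0xAB)
first = mem.read(0x10)   # served by branch 0 in this model
second = mem.read(0x10)  # served by branch 1 in this model
```

Because every write lands in both domains, either branch can satisfy a later read, which is what makes the recovery paths in the following subsections possible.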
In the event of a memory subsystem recoverable error, the chipset hardware attempts to issue
an AMB fast reset to both branches. In addition, it may retry the failing transaction depending on
whether the error occurs on a memory write or memory read transaction.
14.2.14.4.4.1 AMB Fast Reset Fails
If the AMB fast reset fails on both branches, or on the alternate branch, the following applies:
• Recovery is generally not possible for either memory write or memory read transactions.
• The BIOS attempts to log a SEL entry for an Uncorrectable ECC Memory Error and
generates an NMI to halt the system.
If the AMB fast reset fails only on the branch generating the error, the following actions result:
• The chipset disables the branch.
• The BIOS logs SEL entries indicating an Uncorrectable ECC Memory Error and a
transition to non-redundant mode.
• On memory write transactions, the data has also been written to the alternate domain,
so the system continues operation.
• On memory read transactions, the system retries the transaction on the alternate branch
in non-redundant mode. If the error persists on retry, the BIOS attempts to log another
SEL entry for an Uncorrectable ECC Memory Error and generates an NMI to halt
the system. If the retry is successful, the system continues operation in non-redundant
mode.
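The two failure cases in this subsection can be summarized as a decision function. The following is a minimal sketch under stated assumptions: the function name, its parameters, and the action strings are hypothetical labels chosen for illustration, and the real sequence is carried out by chipset hardware and BIOS, not by code of this form.

```python
def handle_fast_reset_failure(failed_branches, error_branch, is_read, retry_ok=True):
    """Return the ordered actions taken when an AMB fast reset fails.

    failed_branches: set of branch ids (0 or 1) whose fast reset failed
    error_branch:    branch id that generated the original error
    is_read:         True for a memory read, False for a memory write
    retry_ok:        outcome of the read retry on the alternate branch
    """
    alternate = 1 - error_branch
    if alternate in failed_branches:
        # Both branches failed, or only the alternate failed: no recovery.
        return ["log SEL: uncorrectable ECC", "NMI halt"]
    if error_branch not in failed_branches:
        raise ValueError("no fast reset failure to handle")

    # Only the branch that generated the error failed its fast reset.
    actions = [
        "disable branch",
        "log SEL: uncorrectable ECC",
        "log SEL: transition to non-redundant mode",
    ]
    if not is_read:
        # The write already reached the alternate domain.
        actions.append("continue")
    elif retry_ok:
        # The read retry on the alternate branch succeeded.
        actions.append("continue in non-redundant mode")
    else:
        # The read retry on the alternate branch also failed.
        actions += ["log SEL: uncorrectable ECC", "NMI halt"]
    return actions


outcome = handle_fast_reset_failure({0}, error_branch=0, is_read=True, retry_ok=False)
```

In this model, a halt occurs whenever no healthy copy of the data can be reached, while a write always survives the loss of the erroring branch because the mirror copy already exists.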
14.2.14.4.4.2 AMB Fast Reset Succeeds
If the AMB fast reset succeeds on both branches, the following actions result:
• The chipset retries the memory transaction.
• If the retry is successful, the system continues operation. If the chipset detects another
Uncorrectable ECC Error on a subsequent read cycle to the same location, the branch is
disabled and the system transitions to non-redundant mode.
• If a memory write retry fails, the chipset disables the branch. The BIOS reports a SEL
entry indicating an Uncorrectable ECC Memory Error and a transition to non-redundant
mode before continuing operation.
• If a memory read retry fails (at this point both branches have failed), the BIOS reports a
SEL entry indicating an Uncorrectable ECC Memory Error. It then generates an NMI to
halt the system.
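The outcomes after a successful fast reset can likewise be sketched as a decision function. As before, this is an illustrative model only: the function name, the parameter `later_read_error`, and the action strings are assumptions introduced here and do not correspond to any documented BIOS or chipset interface.

```python
def handle_fast_reset_success(is_read, retry_ok, later_read_error=False):
    """Return the ordered actions taken after the fast reset succeeds on both branches.

    is_read:          True for a memory read, False for a memory write
    retry_ok:         outcome of the chipset's retry of the transaction
    later_read_error: True if a subsequent read of the same location
                      hits another Uncorrectable ECC Error
    """
    actions = ["retry transaction"]
    if retry_ok:
        actions.append("continue")
        if later_read_error:
            # A repeat error at the same location takes the branch out of service.
            actions += ["disable branch", "transition to non-redundant mode"]
        return actions
    if not is_read:
        # Failed write retry: drop the branch and keep running on the mirror.
        return actions + [
            "disable branch",
            "log SEL: uncorrectable ECC",
            "log SEL: transition to non-redundant mode",
            "continue",
        ]
    # Failed read retry: both branches have failed at this point.
    return actions + ["log SEL: uncorrectable ECC", "NMI halt"]


write_failure = handle_fast_reset_success(is_read=False, retry_ok=False)
read_failure = handle_fast_reset_success(is_read=True, retry_ok=False)
```

The asymmetry between the last two cases mirrors the text: a failed write retry still leaves a usable copy in the other domain, whereas a failed read retry means no branch could return valid data.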