IBM Power 570 User Manual

Chapter 4. Continuous availability and manageability 99
Draft Document for Review September 2, 2008 5:05 pm 4405ch04 Continuous availability and manageability.fm
Memory protection
Memory and cache arrays comprise data bit lines that feed into a memory word. A memory
word is addressed by the system as a single element. Depending on the size and
addressability of the memory element, each data bit line may include thousands of individual
bits or memory cells. For example:
• A single memory module on a Dual Inline Memory Module (DIMM) can have a capacity of
  1 Gb, and supply eight bit lines of data for an ECC word. In this case, each bit line in the
  ECC word holds 128 Mb behind it, corresponding to more than 128 million memory cell
  addresses.
• A 32 KB L1 cache with a 16-byte memory word, on the other hand, would have only 2 Kb
  behind each memory bit line.
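The arithmetic behind these two figures can be checked with a short sketch. The helper name bits_per_bit_line is hypothetical and used only for illustration; the capacities and word widths are the ones quoted above.

```python
def bits_per_bit_line(capacity_bits, bit_lines):
    """Number of memory cells that sit behind one data bit line."""
    return capacity_bits // bit_lines

GBIT = 2**30  # one gigabit, in bits

# 1 Gb memory module supplying eight bit lines of an ECC word
dram_bits = bits_per_bit_line(1 * GBIT, 8)

# 32 KB L1 cache with a 16-byte (128-bit) memory word
l1_bits = bits_per_bit_line(32 * 1024 * 8, 16 * 8)

print(dram_bits)  # 134217728 cells, that is 128 Mb
print(l1_bits)    # 2048 cells, that is 2 Kb
```

The two results differ by a factor of 65,536, which is why a scheme sized for the L1 cache does not carry over to main store.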
A memory protection architecture that provides good error resilience for a relatively small L1
cache might be inadequate for protecting the much larger system main store. Therefore,
a variety of protection methods are used in POWER6 processor-based systems to
avoid uncorrectable errors in memory.
Memory protection plans must take into account many factors, including:
• Size
• Desired performance
• Memory array manufacturing characteristics
POWER6 processor-based systems have a number of protection schemes designed to
prevent, protect against, or limit the effect of errors in main memory. These capabilities include:
Hardware scrubbing
    Hardware scrubbing is a method used to deal with soft errors. IBM
    POWER6 processor-based systems periodically address all
    memory locations, and any memory locations with an ECC error are
    rewritten with the correct data.
Error correcting code
    Error correcting code (ECC) allows a system to detect up to two
    errors in a memory word and correct one of them. However, without
    additional correction techniques, a system will fail if more than
    one bit is corrupted.
Chipkill™
    Chipkill is an enhancement to ECC that enables a system to
    sustain the failure of an entire DRAM chip. Chipkill spreads the bit
    lines from a DRAM over multiple ECC words, so that a catastrophic
    DRAM failure would affect at most one bit in each word. Barring a
    future single-bit error, the system can continue indefinitely in this
    state with no performance degradation until the failed DIMM can be
    replaced.
Redundant bit steering
    IBM systems use redundant bit steering to avoid situations where
    multiple single-bit errors align to create a multi-bit error. In the event
    that an IBM POWER6 processor-based system detects an
    abnormal number of errors on a bit line, it can dynamically steer the
    data stored at this bit line into one of a number of spare lines.
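The way scrubbing, ECC, and Chipkill-style bit spreading work together can be illustrated with a small sketch. Everything in it is an assumption made for illustration and not the POWER6 implementation: the ECC word is a toy 8-bit extended Hamming code carrying 4 data bits, the memory is eight hypothetical 1-bit-wide chips that each hold one bit of every word, and a failed chip is modeled as reading all zeros. The names encode, decode, and SpreadMemory are invented here.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit single-error-correct, double-error-detect word."""
    word = [0] * 8  # index 0: overall parity; indexes 1..7: Hamming positions
    word[3], word[5], word[6], word[7] = [(nibble >> i) & 1 for i in range(4)]
    word[1] = word[3] ^ word[5] ^ word[7]
    word[2] = word[3] ^ word[6] ^ word[7]
    word[4] = word[5] ^ word[6] ^ word[7]
    word[0] = sum(word[1:]) % 2
    return word

def decode(word):
    """Return (nibble or None, status); a single flipped bit is fixed in place."""
    syndrome = 0
    for pos in range(1, 8):
        if word[pos]:
            syndrome ^= pos
    if sum(word) % 2 == 1:          # odd number of flips: treat as one error
        word[syndrome] ^= 1         # syndrome 0 means the parity bit itself
        status = "corrected"
    elif syndrome:                  # even number of flips: detected, not fixable
        return None, "uncorrectable"
    else:
        status = "ok"
    nibble = word[3] | word[5] << 1 | word[6] << 2 | word[7] << 3
    return nibble, status

class SpreadMemory:
    """Eight 1-bit-wide chips (assumed layout): chip c holds bit c of every word."""
    def __init__(self, nibbles):
        words = [encode(n) for n in nibbles]
        self.chips = [[w[c] for w in words] for c in range(8)]
        self.dead = None            # index of a failed chip, if any

    def read_word(self, addr):
        return [0 if c == self.dead else self.chips[c][addr] for c in range(8)]

    def write_word(self, addr, word):
        for c in range(8):
            self.chips[c][addr] = word[c]

    def scrub(self):
        """Address every location; rewrite words in which ECC corrected a bit."""
        fixed = 0
        for addr in range(len(self.chips[0])):
            word = self.read_word(addr)
            _, status = decode(word)
            if status == "corrected":
                self.write_word(addr, word)
                fixed += 1
        return fixed

data = [0x0, 0x5, 0xA, 0xF, 0x3, 0xC]
mem = SpreadMemory(data)

mem.chips[6][2] ^= 1                # soft error: one cell flips
fixed = mem.scrub()                 # scrubbing finds and rewrites that word
after_scrub = [decode(mem.read_word(a)) for a in range(len(data))]

mem.dead = 3                        # an entire chip fails
recovered = [decode(mem.read_word(a))[0] for a in range(len(data))]

two_flips = encode(0x9)
two_flips[1] ^= 1
two_flips[6] ^= 1
double_error = decode(two_flips)    # two errors in one word are only detected
```

Because each chip contributes only one bit to any word in this layout, the chip failure shows up as at most one error per word and every word still decodes to its original data. A second flipped bit in any of those words would then be uncorrectable, which matches the "barring a future single-bit error" caveat above.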