IBM Power 570 User Manual

Chapter 4. Continuous availability and manageability
Draft Document for Review September 2, 2008 5:05 pm (4405ch04 Continuous availability and manageability.fm)
Memory protection
Memory and cache arrays comprise data bit lines that feed into a memory word. A memory
word is addressed by the system as a single element. Depending on the size and
addressability of the memory element, each data bit line may include thousands of individual
bits or memory cells. For example:
• A single memory module on a Dual Inline Memory Module (DIMM) can have a capacity of
  1 Gb, and supply eight bit lines of data for an ECC word. In this case, each bit line in the
  ECC word holds 128 Mb behind it, corresponding to more than 128 million memory cell
  addresses.
• A 32 KB L1 cache with a 16-byte memory word, on the other hand, would have only 2 Kb
  behind each memory bit line.
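The two figures above follow from dividing the array capacity by the number of bit lines in a word. A minimal sketch of that arithmetic, assuming binary prefixes (1 Gb = 2^30 bits, 1 KB = 2^10 bytes); the helper name `bits_per_bit_line` is illustrative and not taken from the manual:

```python
# Assumption: binary prefixes, so 1 Gb = 2**30 bits and 1 KB = 2**10 bytes.
KB_BITS = 8 * 2**10   # bits in one kilobyte
KBIT = 2**10          # one kilobit
MBIT = 2**20          # one megabit
GBIT = 2**30          # one gigabit


def bits_per_bit_line(capacity_bits, bit_lines):
    """Number of memory cells that sit behind a single data bit line."""
    return capacity_bits // bit_lines


# 1 Gb memory module supplying eight bit lines of an ECC word.
dram_cells = bits_per_bit_line(1 * GBIT, 8)

# 32 KB L1 cache with a 16-byte (128-bit) memory word.
l1_cells = bits_per_bit_line(32 * KB_BITS, 16 * 8)

print(dram_cells, dram_cells // MBIT)   # cells per line, and the same in Mb
print(l1_cells, l1_cells // KBIT)       # cells per line, and the same in Kb
print(dram_cells // l1_cells)           # how many times more cells per line in main store
```

Under these assumptions each main-store bit line covers 65,536 times as many cells as an L1 bit line, which is why the two arrays call for different protection schemes.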
A memory protection architecture that provides good error resilience for a relatively small L1
cache might be inadequate for protecting the much larger system main store. Therefore,
a variety of protection methods are used in POWER6 processor-based systems to
avoid uncorrectable errors in memory.
Memory protection plans must take into account many factors, including:
• Size
• Desired performance
• Memory array manufacturing characteristics
POWER6 processor-based systems have a number of protection schemes designed to
prevent, protect against, or limit the effect of errors in main memory. These capabilities include:
Hardware scrubbing
  Hardware scrubbing is a method used to deal with soft errors. IBM
  POWER6 processor-based systems periodically address all
  memory locations, and any location with an ECC error is
  rewritten with the correct data.
Error correcting code
  Error correcting code (ECC) allows a system to detect up to two
  errors in a memory word and correct one of them. However, without
  additional correction techniques, if more than one bit is corrupted, a
  system will fail.
Chipkill™
  Chipkill is an enhancement to ECC that enables a system to
  sustain the failure of an entire DRAM. Chipkill spreads the bit lines
  from a DRAM over multiple ECC words, so that a catastrophic
  DRAM failure would affect at most one bit in each word. Barring a
  future single-bit error, the system can continue indefinitely in this
  state with no performance degradation until the failed DIMM can be
  replaced.
Redundant bit steering
  IBM systems use redundant bit steering to avoid situations where
  multiple single-bit errors align to create a multi-bit error. In the event
  that an IBM POWER6 processor-based system detects an
  abnormal number of errors on a bit line, it can dynamically steer the
  data stored at this bit line into one of a number of spare lines.
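The interaction between ECC and Chipkill described above can be illustrated with a toy model. The sketch below is not the POWER6 implementation: it assumes a tiny extended Hamming(8,4) code (single-error correct, double-error detect) and a hypothetical layout of eight chips, each supplying one bit to each of eight words. The names `encode`, `decode`, and `fail_chip` are illustrative only.

```python
# Toy model only: an extended Hamming(8,4) SEC-DED code and a hypothetical
# layout of 8 chips feeding 8 words. Real ECC words are far wider.
DATA_POSITIONS = (3, 5, 6, 7)   # non-power-of-two positions carry data bits


def encode(nibble):
    """Encode 4 data bits as an 8-bit SEC-DED codeword (list of 0/1)."""
    code = [0] * 8                      # index 0: overall parity; 1..7: Hamming(7,4)
    for i, pos in enumerate(DATA_POSITIONS):
        code[pos] = (nibble >> i) & 1
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) % 2         # makes the total number of 1s even
    return code


def decode(code):
    """Return (status, nibble); nibble is None when the word is uncorrectable."""
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos             # XOR of set positions is 0 for a valid word
    parity_odd = sum(code) % 2 == 1
    fixed = list(code)
    if parity_odd:                      # odd number of flips: treat as a single error
        fixed[syndrome] ^= 1            # syndrome 0 means the parity bit itself
        status = "corrected"
    elif syndrome:                      # even parity, nonzero syndrome: two errors
        return "uncorrectable", None
    else:
        status = "ok"
    nibble = sum(fixed[pos] << i for i, pos in enumerate(DATA_POSITIONS))
    return status, nibble


def fail_chip(words, chip):
    """Spread layout: chip i supplies bit i of every word.

    Worst case assumed: every bit read from the dead chip is inverted.
    """
    return [[bit ^ 1 if pos == chip else bit for pos, bit in enumerate(word)]
            for word in words]


data = [0x3, 0xA, 0xF, 0x0, 0x7, 0xC, 0x5, 0x9]
words = [encode(n) for n in data]

# A whole chip fails, but each word sees only one bad bit, so ECC recovers all data.
damaged = fail_chip(words, chip=5)
recovered = [decode(w) for w in damaged]
print(all(r == ("corrected", n) for r, n in zip(recovered, data)))

# A further single-bit error in an already damaged word exceeds what ECC can fix.
damaged[0][1] ^= 1
print(decode(damaged[0]))
```

In this model the second error is what makes a word uncorrectable, which is the situation hardware scrubbing and redundant bit steering aim to avoid by rewriting soft errors and moving data off a failing bit line before errors accumulate.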

IBM Power 570 Specifications

General
Brand: IBM
Model: Power 570
Category: Server
Language: English

Summary

Chapter 1. General description

1.1 System specifications

Lists general system specifications including operating temperature, humidity, noise, and altitude.

1.2 Physical package

Details the physical attributes and dimensions of the CEC drawer building blocks.

1.3 System features

Outlines key features like core configurations, memory capacity, and disk drive support.

1.3.1 Processor card features

Describes processor card types, frequencies, cache, and Capacity on Demand (CoD) options.

1.3.2 Memory features

Details memory feature codes, capacities, frequencies, and population rules.

1.3.4 I/O drawers

Explains the types of I/O drawers, their slots, and connectivity options.

1.4 System racks

Covers rack compatibility, features, and installation considerations for the system.

1.4.1 IBM 7014 Model T00 rack

Describes the features and specifications of the 1.8-meter IBM 7014 Model T00 rack.

1.4.4 Intelligent Power Distribution Unit (iPDU)

Details the characteristics and function of the Intelligent Power Distribution Unit.

Chapter 2. Architecture and technical overview

2.1 The POWER6 processor

Explains the POWER6 processor's enhancements, core architecture, and advanced features.

2.1.1 Decimal floating point

Details the decimal floating-point processor's support for data types and instructions.

2.3 Processor cards

Describes the POWER6 processor cards, their layout, and memory interfaces.

2.4 Memory subsystem

Covers the memory controller, DIMM slots, and memory architecture.

2.4.1 Fully buffered DIMM

Explains the fully buffered DIMM technology for enhanced memory performance.

2.7 Integrated Virtual Ethernet adapter

Details the IVE adapter, its features, ports, and system integration.

2.8 PCI adapters

Discusses PCI and PCIe adapter types, slots, and general support.

2.8.1 LAN adapters

Lists available LAN adapters for connecting to a local area network.

2.8.3 iSCSI

Explains the iSCSI protocol for storage transport over IP networks.

2.9 Internal storage

Covers the internal disk subsystem using SAS interface and DASD backplane.

2.10 External I/O subsystems

Describes external I/O drawers like 7311-D11, 7311-D20, and 7314-G30.

2.10.1 7311 Model D11 I/O drawers

Details the 7311 Model D11 I/O drawer's features and slot configurations.

2.12 Hardware Management Console

Explains the HMC's role in managing system tasks and partitions.

Chapter 3. Virtualization

3.1 POWER Hypervisor

Introduces the POWER Hypervisor as a core component for system virtualization.

Virtual SCSI

Describes the virtual SCSI mechanism for storage virtualization using VIO Server.

Virtual Ethernet

Explains the virtual Ethernet switch function for secure inter-partition communication.

3.2 Logical partitioning

Discusses LPARs and virtualization for resource utilization and configuration.

3.2.2 Micro-Partitioning

Details Micro-Partitioning for allocating processor fractions to logical partitions.

3.3 PowerVM

Covers the PowerVM platform for industry-leading virtualization.

3.3.1 PowerVM editions

Outlines the functional elements of PowerVM Standard and Enterprise editions.

3.3.2 Virtual I/O Server

Explains the VIO Server's role in sharing physical resources among logical partitions.

3.3.4 PowerVM Live Partition Mobility

Describes moving running logical partitions between systems without disruption.

3.4 System Planning Tool

Explains the SPT for designing system configurations and planning partitions.

Chapter 4. Continuous availability and manageability

4.1 Reliability

Discusses the design principles for achieving high system reliability.

4.1.1 Designed for reliability

Covers design choices that reduce failure opportunities and improve reliability.

4.2 Availability

Details features that prevent unexpected application loss due to outages.

4.2.1 Detecting and deallocating failing components

Explains monitoring and deconfiguring faulty hardware to avoid system outages.

4.3 Serviceability

Outlines the strategy for efficient system service and repair.

4.3.1 Detecting errors

Covers the critical ability to accurately detect system errors.

4.3.2 Diagnosing problems

Explains how systems perform self-diagnosis using hardware and OS logic.

4.3.5 Locating and repairing the problem

Details methods for quickly identifying and replacing service parts.

4.5 Manageability

Covers functions and tools for efficient system management.

4.5.1 Service processor

Describes the service processor's role in monitoring, managing, and error detection.

4.5.6 IBM System p firmware maintenance

Explains the process of managing and installing microcode updates.

Related publications

IBM Redbooks

Lists IBM Redbooks relevant for detailed discussion of topics.

Online resources

Provides links to relevant IBM websites for further information.
