IBM System/370 Guide

SECTION 50: RELIABILITY, AVAILABILITY, AND SERVICEABILITY (RAS) FEATURES

50:05 INTRODUCTION

With the growth of online data processing activities, as distinguished from traditional batch accounting functions, the availability of the data processing system becomes an essential factor in company operations, and complete system failure is extremely disruptive. Because online processing is increasingly common and the System/370 Model 165 is designed to operate in such an environment, IBM has provided an extensive group of advanced reliability, availability, and serviceability (RAS) features for the Model 165. These RAS features are designed to improve the reliability of hardware, to increase the availability of the computing system, and to improve the serviceability of system hardware components.

The RAS features of the System/370 Model 165 are designed to reduce the frequency and impact of system interruptions that are caused by hardware failure and necessitate a re-IPL, as follows:

- More reliable components, such as integrated circuits with fewer connections, will be used to improve hardware reliability.
- Recovery facilities, both hardware and programming systems, not available on System/360 Models 65 and 75, are provided to reduce considerably the number of failures that cause a complete system termination. This permits deferred maintenance.
- Repair procedures include more online diagnosis and repair of malfunctions concurrently with normal job execution in a multiprogramming environment, in order to reduce the effect of such repairs on system unavailable time.

Each RAS feature, recovery or repair, is discussed in the remainder of this section.

The following recovery features are implemented in hardware:

- CPU retry of most failing CPU operations, including those caused by a buffer malfunction
- ECC validity checking on processor storage to correct all single-bit errors
- I/O operation retry facilities, including channel retry data and channel/control unit command retry procedures, to correct failing I/O operations
- Expanded machine check interrupt facilities to facilitate better error recording and recovery procedures

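
The idea behind correcting single-bit errors with an error checking and correction (ECC) code can be illustrated with a minimal sketch. It uses a Hamming(7,4) code, chosen here only for brevity: this is an assumption for illustration and not the code used by Model 165 processor storage, which this section does not describe. The function names are hypothetical.

```python
def hamming74_encode(value):
    """Encode a 4-bit value (0..15) as a 7-bit codeword (list of bits).

    Illustrative only: bit positions are numbered 1..7, with check bits
    at positions 1, 2, and 4 and data bits at positions 3, 5, 6, and 7.
    """
    data = [(value >> i) & 1 for i in range(4)]  # data[0] is the low-order bit
    code = [0] * 8                               # index 0 is unused
    code[3], code[5], code[6], code[7] = data
    code[1] = code[3] ^ code[5] ^ code[7]        # covers positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]        # covers positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]        # covers positions 4, 5, 6, 7
    return code[1:]


def hamming74_decode(bits):
    """Return (value, syndrome); correct one flipped bit if present.

    A syndrome of 0 means no error was detected. Otherwise the syndrome
    is the 1-based position of the single bit that was corrected.
    """
    code = [0] + list(bits)
    syndrome = 0
    for position in range(1, 8):
        if code[position]:
            syndrome ^= position                 # XOR of positions holding a 1
    if syndrome:
        code[syndrome] ^= 1                      # flip the failing bit back
    data = [code[3], code[5], code[6], code[7]]
    value = sum(bit << i for i, bit in enumerate(data))
    return value, syndrome
```

Under these assumptions, flipping any one of the seven stored bits still yields the original value on decode, and the syndrome identifies which bit failed so that the error can be recorded.
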
The following recovery features are provided by programming systems:

- Recovery management support (RMS) to handle the expanded machine check interrupt and channel retry data. Model 165 machine check handler (MCH) and channel check handler (CCH) routines are provided for OS MFT and MVT only.
- Error recovery procedures (ERP) to retry failing I/O device and channel operations

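
The general pattern of retrying a failing operation while recording each error can be sketched as follows. This is a simplified illustration under stated assumptions, not the logic of the OS error recovery procedures: the function name, the retry limit, and the use of IOError to represent a failing I/O operation are all hypothetical.

```python
def run_with_retry(operation, max_retries=3):
    """Run operation(), retrying after each IOError up to max_retries times.

    Returns (result, error_log), where error_log holds one
    (attempt_number, message) entry per failed attempt. Raises
    RuntimeError if the operation still fails after the last retry.
    The retry limit is an assumed value chosen for illustration.
    """
    error_log = []
    for attempt in range(1 + max_retries):       # first try plus the retries
        try:
            result = operation()
        except IOError as err:
            error_log.append((attempt, str(err)))  # record the failure, then retry
            continue
        return result, error_log
    raise RuntimeError(
        "operation failed after %d attempts" % (1 + max_retries)
    )
```

In this sketch an intermittent failure that clears on a later attempt is invisible to the caller except through the error log, while a solid failure is reported only after the retry limit is exhausted.
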