
Cray CRAY-1 - Division Algorithm

Division algorithm
The CRAY-1 performs floating point division by the method of reciprocal approximation. This facilitates the hardware implementation of a fully segmented functional unit. Operands may enter the reciprocal unit each clock period because of this segmentation. In vector mode, results are produced at a one clock period rate. These results may be used in other vector operations during chaining because all functional units in the CRAY-1 have the same result rate.

The division algorithm that computes S1/S2 to full precision requires four operations:

1. S3 = 1/S2            Reciprocal approximation
2. S4 = (2 - S3 * S2)   Reciprocal iteration
3. S5 = S1 * S3         Numerator * approximation
4. S6 = S4 * S5         Half-precision quotient * correction factor
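
The four operations can be sketched in software as follows. This is only an illustration, not the hardware's method of forming the approximation: `approx_reciprocal` is a hypothetical stand-in that truncates the true reciprocal to 30 significant bits, and Python's 53-bit doubles are used in place of the CRAY-1 floating point format.

```python
import math


def approx_reciprocal(x: float, bits: int = 30) -> float:
    """Hypothetical stand-in for the reciprocal approximation:
    the true reciprocal truncated to `bits` significant bits."""
    mantissa, exponent = math.frexp(1.0 / x)  # 0.5 <= |mantissa| < 1
    scale = 1 << bits
    return math.ldexp(math.trunc(mantissa * scale) / scale, exponent)


def half_precision_divide(s1: float, s2: float) -> float:
    """Operations 1 and 3 only: numerator * reciprocal approximation."""
    return s1 * approx_reciprocal(s2)


def divide(s1: float, s2: float) -> float:
    """All four operations: S1/S2 with one Newton correction."""
    s3 = approx_reciprocal(s2)  # 1. reciprocal approximation
    s4 = 2.0 - s3 * s2          # 2. reciprocal iteration (correction factor)
    s5 = s1 * s3                # 3. numerator * approximation
    s6 = s4 * s5                # 4. half-precision quotient * correction factor
    return s6


def relative_error(value: float, exact: float) -> float:
    return abs(value - exact) / abs(exact)


if __name__ == "__main__":
    exact = 7.0 / 3.0
    print(relative_error(half_precision_divide(7.0, 3.0), exact))
    print(relative_error(divide(7.0, 3.0), exact))
```

In this sketch the half-precision quotient carries the error of the truncated reciprocal, while the corrected quotient is limited only by the rounding of the double-precision multiplies.
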
The approximation is based on Newton's method. The reciprocal approximation at step 1 is correct to 30 bits. The additional Newton iteration at step 2 increases this accuracy to 47 bits. This iteration is applied as a correction factor with a full-precision multiply operation.
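
The gain in accuracy from the Newton iteration can be seen from a short worked derivation in exact arithmetic. The symbol ε is introduced here (it is not in the original) for the relative error of the step 1 approximation, and step 2 is taken to be S4 = 2 - S3 * S2:

```latex
\begin{aligned}
S_3 &= \frac{1-\varepsilon}{S_2}, \\
S_4 &= 2 - S_3 S_2 = 1 + \varepsilon, \\
S_6 &= S_4 S_5 = (1+\varepsilon)\, S_1 S_3 = \frac{S_1}{S_2}\,\bigl(1-\varepsilon^{2}\bigr).
\end{aligned}
```

The relative error is squared by the correction. With ε near 2^-30 the remaining term ε² is near 2^-60, so the final accuracy is presumably set by the rounding of the multiplies rather than by the iteration itself, which would be consistent with the 47-bit figure quoted above.
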
Where 31 bits of accuracy is sufficient, the reciprocal approximation instruction may be used with the half-precision multiply to produce a half-precision quotient. The 18 low-order bits of the half-precision results are returned as zeros with a round applied to the low-order bit of the 30-bit result.
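
The effect of the half-precision round on a coefficient can be sketched as below. The 48-bit width is inferred from the 18 zeroed bits plus the 30-bit result, the round is assumed to add one half of the lowest kept bit, and the function name is hypothetical. Renormalization after a carry out of the top bit is not modeled.

```python
COEFF_BITS = 48  # assumed coefficient width: 30 kept bits + 18 zeroed bits
ZERO_BITS = 18   # low-order bits returned as zeros (from the text)


def half_precision_round(coeff: int) -> int:
    """Round an integer coefficient to its upper 30 bits and zero the rest.

    Assumes the round adds half of the lowest kept bit. A carry out of
    bit 47 is not handled here.
    """
    if not 0 <= coeff < (1 << COEFF_BITS):
        raise ValueError("coefficient must fit in 48 bits")
    rounded = coeff + (1 << (ZERO_BITS - 1))  # add half of the lowest kept bit
    return (rounded >> ZERO_BITS) << ZERO_BITS  # clear the 18 low-order bits


if __name__ == "__main__":
    print(hex(half_precision_round(0x80000001FFFF)))
    print(hex(half_precision_round(0x800000020000)))
```
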
A scalar quotient is computed in 29 clock periods since operations 2 and 3 issue in successive clock periods. A vector quotient requires effectively three vector times since operations 1 and 3 are chained together. This hides one of the multiply operations. A vector time is one clock period for each element in the vector. For example, two 50-element vectors are divided in about 3 * 50 clock periods. This estimate does not include overhead associated with the functional units.
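
The vector estimate above amounts to a single multiplication. A minimal sketch, with hypothetical names and, like the text, ignoring functional unit overhead:

```python
SCALAR_QUOTIENT_CLOCK_PERIODS = 29  # one scalar quotient, from the text
VECTOR_TIMES_PER_QUOTIENT = 3       # effective vector times, from the text


def vector_divide_clock_periods(n_elements: int) -> int:
    """Approximate clock periods to divide two n-element vectors.

    One vector time is one clock period per element; functional unit
    overhead is not included.
    """
    if n_elements < 0:
        raise ValueError("n_elements must be non-negative")
    return VECTOR_TIMES_PER_QUOTIENT * n_elements


if __name__ == "__main__":
    print(vector_divide_clock_periods(50))  # the 50-element example: 3 * 50
```
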
2240004   3-30   E
