Dispatch and Execution Timing 4-19
18524C/0—Nov1996 AMD-K5 Processor Technical Reference Manual
4.2.4 Floating-Point Instructions
Floating-point ROPs are always dispatched in pairs to the FPU
reservation station. The first ROP conveys the lower halves of
the A and B operands, and it always has the fpfill ROP type.
The second ROP conveys the upper halves of the operands, as
well as the numeric opcode. Data from both ROPs is merged in
the reservation station and must be converted into an internal
floating-point format before it can be issued to the add pipe
(fadd), multiply pipe (fmul), or detect pipe (fmv). It takes one
cycle to perform the conversion, and this delay is incurred
whenever the source of the data is the register file or one of
the other functional units (e.g., load/store, ALU). If data is
being forwarded from the FPU itself, however, no format
conversion is required and operands are fast-forwarded from the
back end of a pipe to the front of any other pipe without the
one-cycle delay.
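The format-conversion rule above can be sketched as a small lookup. This is an illustrative model only, not a description of the hardware: the source labels and the function name `conversion_delay` are hypothetical, chosen here to mirror the operand sources and pipes named in the text.

```python
# Hypothetical labels for operand sources; the names are assumptions of
# this sketch and do not come from the manual.
CONVERTED_SOURCES = {"register_file", "load_store", "alu"}  # need format conversion
FPU_PIPES = {"fadd", "fmul", "fmv"}                         # fast-forwarded, no conversion


def conversion_delay(source: str) -> int:
    """Return the extra cycles spent converting an operand to the internal
    floating-point format, based on where the operand comes from."""
    if source in FPU_PIPES:
        # Data forwarded from the FPU is already in the internal format.
        return 0
    if source in CONVERTED_SOURCES:
        # Register file or a non-FPU functional unit: one conversion cycle.
        return 1
    raise ValueError(f"unknown operand source: {source!r}")
```

For example, `conversion_delay("register_file")` yields 1, while `conversion_delay("fmul")` yields 0 because the operand is fast-forwarded from the back end of an FPU pipe.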
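The FPU latencies for add, subtract, and reverse-subtract
instructions assume that cancellation (the loss of leading
significant bits when nearly equal values are subtracted) does
not occur in the adder/subtractor. If cancellation does occur,
an extra cycle is required to normalize the result.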
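To make the notion of cancellation concrete, the following sketch counts how many leading significand bits are lost in a subtraction. It uses Python's double-precision floats rather than the FPU's internal format, and the function name `cancelled_bits` is hypothetical. This section does not say how many bits must cancel before the extra normalization cycle is incurred, so the sketch only measures the effect.

```python
import math

# Significand width of a Python float (IEEE 754 double). This is an
# assumption of the sketch, not the width of the K5's internal format.
SIGNIFICAND_BITS = 53


def cancelled_bits(a: float, b: float) -> int:
    """Count the leading significand bits lost when computing a - b.

    The count is the drop in binary exponent from the larger-magnitude
    operand to the result, which is the left shift needed to normalize.
    """
    result = a - b
    if result == 0.0:
        # Convention of this sketch: an exactly zero result is treated
        # as complete cancellation.
        return SIGNIFICAND_BITS
    _, exp_operand = math.frexp(max(abs(a), abs(b)))
    _, exp_result = math.frexp(result)
    return max(0, exp_operand - exp_result)
```

Subtracting 1.0 from 1.0 + 2**-20 cancels 20 leading bits, whereas subtracting -1.0 from 1.0 (an effective addition) cancels none.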
Table 4-3 shows the execution-unit usage for each
floating-point instruction, along with relative cycle numbers for
dispatch and execution of the associated ROPs for the instruction.
Table 4-3. Floating-Point Instructions

Instruction                              Fastpath or  Execution
Mnemonic          Opcode Format          Microcoded   Unit       Timing
----------------  ---------------------  -----------  ---------  ------
FABS              0_0x_11011001_100_xxx  F            fpfill     1/2/4
                                                      fmv        1/2/4
FADD ST, ST(i)    0_0x_11011000_000_xxx  F            fpfill     1/2/5
                                                      fadd       1/2/5
FADD ST(i), ST    0_0x_11011000_000_xxx  F            fpfill     1/2/5
                                                      fadd       1/2/5
FADD real_32      0_1x_11011000_000_xxx  F            ld         1/1
                                                      fpfill     1/3/6
                                                      fadd       1/3/6
FADD real_64      0_1x_11011100_000_xxx  M            ld         1/1
                                                      ld         1/2
                                                      fpfill     1/4/7
                                                      fadd       1/4/7
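The rows of Table 4-3 listed above can be handled programmatically, for instance to compare the register and memory forms of FADD. The sketch below transcribes those rows and splits each timing entry into its relative cycle numbers. The names `TABLE_4_3`, `parse_timing`, and `last_cycle` are hypothetical, and the sketch does not name the individual timing fields because this section does not define them.

```python
from typing import Dict, List, Tuple

# Rows transcribed from Table 4-3: mnemonic -> list of
# (execution unit, timing entry), in the order the ROPs are listed.
TABLE_4_3: Dict[str, List[Tuple[str, str]]] = {
    "FABS": [("fpfill", "1/2/4"), ("fmv", "1/2/4")],
    "FADD ST, ST(i)": [("fpfill", "1/2/5"), ("fadd", "1/2/5")],
    "FADD ST(i), ST": [("fpfill", "1/2/5"), ("fadd", "1/2/5")],
    "FADD real_32": [("ld", "1/1"), ("fpfill", "1/3/6"), ("fadd", "1/3/6")],
    "FADD real_64": [("ld", "1/1"), ("ld", "1/2"),
                     ("fpfill", "1/4/7"), ("fadd", "1/4/7")],
}


def parse_timing(timing: str) -> Tuple[int, ...]:
    """Split a timing entry such as '1/3/6' into its relative cycle numbers."""
    return tuple(int(field) for field in timing.split("/"))


def last_cycle(rops: List[Tuple[str, str]]) -> int:
    """Return the largest relative cycle number in any ROP's timing entry."""
    return max(max(parse_timing(timing)) for _, timing in rops)
```

With these rows, `last_cycle` gives 5 for the register forms of FADD, 6 for FADD real_32, and 7 for FADD real_64, which reflects the additional load ROPs that the memory forms dispatch.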