2-10 Internal Architecture
AMD-K5 Processor Technical Reference Manual 18524C/0—Nov1996
Floating-Point Unit
The IEEE 854-compatible floating-point unit (FPU) can issue
pipelined ROPs from its 2-entry reservation station at the rate
of one per clock. One ROP can be issued to either the add or
multiply pipeline in each clock, even when the operations are
separated by an exchange ROP. The add and multiply pipelines
use a common pre-detect unit and rounder. The rounder can
return one result per clock.
When data is loaded from memory, it is converted to an internal
82-bit extended format before being stored in the stack. A value
in this format occupies two of the internal 41-bit operand or
result buses.
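The width arithmetic above (one 82-bit value carried on two 41-bit
buses) can be illustrated with a short sketch. The function names and
the choice to place the low 41 bits on the first bus are assumptions
made for illustration; the text here does not say how the 82-bit value
is divided between the two buses.

```python
from typing import Tuple

BUS_WIDTH = 41                    # width of one internal operand/result bus
INTERNAL_WIDTH = 2 * BUS_WIDTH    # 82-bit internal extended format


def split_across_buses(value: int) -> Tuple[int, int]:
    """Split an 82-bit value into (low, high) 41-bit halves.

    Assumption: the low half travels on the first bus. The real
    assignment of bits to buses is not given in the source text.
    """
    if not 0 <= value < (1 << INTERNAL_WIDTH):
        raise ValueError("value does not fit in 82 bits")
    mask = (1 << BUS_WIDTH) - 1
    return value & mask, value >> BUS_WIDTH


def join_from_buses(low: int, high: int) -> int:
    """Reassemble the 82-bit value from its two 41-bit halves."""
    return (high << BUS_WIDTH) | low
```

Splitting and rejoining are exact inverses, so no bits of the internal
format are lost when a value is carried on two buses at once.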
Load/Store Units
Two load/store units read and write data-cache and memory
operands. A shared, 4-entry reservation station buffers incoming
ROPs, and a shared, 4-entry store buffer accepts outgoing
speculative-state operands destined for the data cache or
memory. The reservation station is dual-ported and the store
buffer is single-ported, so the processor can perform two loads
or one load and one store per clock.
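The port limit described above (two loads, or one load and one store,
per clock) can be sketched as a toy issue model. The function name, the
'L'/'S' encoding, and the strictly in-order greedy pairing are
assumptions made for illustration. The sketch ignores reservation-station
and store-buffer occupancy and every other pairing restriction.

```python
def clocks_needed(ops: str) -> int:
    """Count clocks to issue a string of 'L' (load) and 'S' (store) ops.

    Hypothetical model: ops issue in order, and two adjacent ops share
    a clock unless both are stores (the store buffer is single-ported).
    """
    if set(ops) - {"L", "S"}:
        raise ValueError("ops may contain only 'L' and 'S'")
    clocks = 0
    i = 0
    while i < len(ops):
        has_next = i + 1 < len(ops)
        both_stores = has_next and ops[i] == "S" and ops[i + 1] == "S"
        if has_next and not both_stores:
            i += 2          # two loads, or one load and one store
        else:
            i += 1          # a lone op, or a store that cannot pair
        clocks += 1
    return clocks
```

Under this model a run of loads issues two per clock, while a run of
stores issues one per clock.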
Each unit holds copies of segment-descriptor fields so that it
can calculate logical and linear addresses and check protection
variables and segment limits. Data loaded by one instruction in
a load/store unit can be used by another instruction in another
execution unit in the next clock. There is no load-use penalty.
The data cache can be accessed in a single clock. These low
latencies provide an important performance advantage
because a majority of x86 instructions in typical desktop
programs involve memory as one of their operands.
The load/store units can service two accesses in parallel (two
loads or one load and one store), except when a load and a store
target the same data-cache index and bank, or when one of the
accesses is an I/O load, a locked access, a segment-descriptor
load, a data breakpoint, or the first half of a misaligned access.
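The pairing rule in the paragraph above can be written as a small
predicate. This is a minimal sketch, not the hardware's logic: the
`Access` record, its field names, and `can_service_together` are
hypothetical, and the index and bank are treated as opaque integers
because the text does not define how they are derived from an address.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    """One data-cache access (hypothetical record for illustration)."""
    is_store: bool
    index: int = 0                       # data-cache index (opaque here)
    bank: int = 0                        # data-cache bank (opaque here)
    io_load: bool = False
    locked: bool = False
    descriptor_load: bool = False
    data_breakpoint: bool = False
    first_half_misaligned: bool = False

    def blocks_pairing(self) -> bool:
        # Any of the listed special cases prevents parallel service.
        return (self.io_load or self.locked or self.descriptor_load
                or self.data_breakpoint or self.first_half_misaligned)


def can_service_together(a: Access, b: Access) -> bool:
    """True if the two accesses may be serviced in the same clock."""
    if a.is_store and b.is_store:
        return False                     # only loads pair with a store
    if a.blocks_pairing() or b.blocks_pairing():
        return False
    one_load_one_store = a.is_store != b.is_store
    same_slot = (a.index, a.bank) == (b.index, b.bank)
    return not (one_load_one_store and same_slot)
```

For example, a load and a store to different banks pair, while the same
two accesses with matching index and bank do not.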
Branch Unit
The branch unit has a 2-entry reservation station and executes
correctly predicted branches with zero delay. The unit executes
calls, returns, conditional jumps, conditional byte-sets,
floating-point exchanges, and microbranches. Speculative
execution occurs whenever a conditional-branch instruction
executes. The branch unit is the only execution unit that decodes
condition codes and supports speculative flag input operands.