18524C/0—Nov1996 AMD-K5 Processor Technical Reference Manual

2.1 Prefetch and Predecode

Figure 2-1 (top-left corner) shows the processor’s prefetch and predecode logic being fed with data from the external bus via the memory management unit. Prefetching attempts to keep the instruction cache and prefetch cache filled ahead of the execution pipeline’s fetch requirements. The processor prefetches only during fetch-stage misses in the instruction cache, which typically occur during taken branches.

When a miss occurs, the prefetcher initiates a 32-byte burst memory read cycle on the bus to fill a prefetch cache. For cacheable accesses, the prefetch cache also fills 32-byte lines in the instruction cache. For non-cacheable accesses, the prefetch cache provides instructions directly to the execution pipeline.
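The miss handling described above can be sketched as a small software model. This is an illustration only, not the processor's control logic: the class `ToyFetchUnit`, the helper `line_base`, and the `bursts` counter are hypothetical names, and the sketch assumes that each burst covers the 32-byte-aligned line containing the missed address.

```python
LINE_SIZE = 32  # bytes per burst read and per instruction-cache line (from the text)


def line_base(linear_addr: int) -> int:
    """Return the base of the 32-byte-aligned line containing linear_addr."""
    return linear_addr & ~(LINE_SIZE - 1)


class ToyFetchUnit:
    """Toy model of a fetch-stage miss (hypothetical, for illustration)."""

    def __init__(self, memory: bytes):
        self.memory = memory        # stands in for external memory behind the bus
        self.icache = {}            # line base address -> 32 bytes
        self.prefetch_cache = None  # (line base address, bytes) of the last burst
        self.bursts = 0             # number of burst reads issued so far

    def fetch_line(self, linear_addr: int, cacheable: bool) -> bytes:
        base = line_base(linear_addr)
        if base in self.icache:
            # Instruction-cache hit: no prefetch is started.
            return self.icache[base]
        # Miss: one 32-byte burst read fills the prefetch cache.
        data = self.memory[base:base + LINE_SIZE]
        self.bursts += 1
        self.prefetch_cache = (base, data)
        if cacheable:
            # Cacheable access: the line is also written to the instruction cache.
            self.icache[base] = data
        # In both cases the bytes are handed on to the pipeline.
        return data
```

In this model, a second fetch from the same cacheable line hits the instruction cache and issues no burst, while repeated non-cacheable fetches issue a new burst each time because nothing is retained in the instruction cache.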

The instruction cache contains a copy of certain fields in the current code-segment descriptor. During a taken branch, the fetch logic adds the code-segment base to the effective address and places the resulting linear address in the prefetch program counter, which then increments as a linear address along a sequential stream. All branches during prefetching are assumed to be not taken.
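The address arithmetic above can be illustrated with a short sketch. The function names are hypothetical, and two details are assumptions not stated in this section: that linear addresses wrap at 32 bits, and that the caller chooses the step by which the prefetch program counter advances.

```python
MASK32 = 0xFFFFFFFF  # assumption: linear addresses are 32 bits wide


def branch_target_linear(cs_base: int, effective_addr: int) -> int:
    """Linear address = code-segment base + effective address (wrapped to 32 bits)."""
    return (cs_base + effective_addr) & MASK32


def sequential_stream(start_linear: int, step: int, count: int):
    """Yield `count` prefetch addresses along a sequential linear stream.

    No branch is ever followed here, which mirrors the rule that branches
    met during prefetching are assumed to be not taken.
    """
    pc = start_linear
    for _ in range(count):
        yield pc
        pc = (pc + step) & MASK32
```

For example, with a code-segment base of 0x10000 and an effective address of 0x1234, `branch_target_linear` returns 0x11234, and the stream continues linearly from whatever start address is loaded into it.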

The processor predecodes its x86 instruction stream in the same clock in which x86 instructions come out of the prefetch cache. An x86 instruction can be from 1 to 15 bytes long. Predecoding annotates each instruction byte with information that later enables the decode stage of the pipeline to perform more efficiently. The predecode information identifies whether the byte is the start and/or end of an x86 instruction, whether it is an opcode byte, and the number of internal RISC operations (ROPs) it will require at the decode stage. The predecode information is stored in the instruction cache with each x86 instruction byte. It is passed during instruction fetching to the decode stage, where it allows multiple x86 instructions to be decoded in parallel. This avoids delaying the decode of one instruction until the decode of the prior instruction has determined its ending byte.
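The benefit of per-byte predecode information can be shown with a minimal sketch. The record `PredecodedByte` and the functions `annotate` and `instruction_spans` are hypothetical, and the field layout does not reflect the actual bit encoding. The sketch also assumes that the ROP count is repeated on every byte of an instruction, and it takes instruction lengths from the caller rather than deriving them from the bytes as real predecode logic must.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PredecodedByte:
    value: int     # the raw instruction byte
    start: bool    # first byte of an x86 instruction
    end: bool      # last byte of an x86 instruction
    opcode: bool   # this byte is the opcode byte
    rops: int      # ROPs the instruction needs at the decode stage


def annotate(code: bytes, layout):
    """Attach predecode bits, given (length, opcode_offset, rops) per instruction.

    `layout` is supplied by the caller; it is a stand-in for the length and
    ROP information that hardware predecode would work out for itself.
    """
    tagged = []
    pos = 0
    for length, opcode_offset, rops in layout:
        for i in range(length):
            tagged.append(PredecodedByte(
                value=code[pos + i],
                start=(i == 0),
                end=(i == length - 1),
                opcode=(i == opcode_offset),
                rops=rops,
            ))
        pos += length
    return tagged


def instruction_spans(tagged):
    """Locate every instruction as (start_index, end_index) from the bits alone."""
    starts = [i for i, b in enumerate(tagged) if b.start]
    ends = [i for i, b in enumerate(tagged) if b.end]
    return list(zip(starts, ends))


# NOP; MOV EAX, imm32; operand-size prefix + INC. The ROP counts of 1 are
# placeholder values for illustration, not figures taken from the manual.
code = bytes([0x90, 0xB8, 0x78, 0x56, 0x34, 0x12, 0x66, 0x40])
tagged = annotate(code, [(1, 0, 1), (5, 0, 1), (2, 1, 1)])
spans = instruction_spans(tagged)
```

The point of the sketch is that `instruction_spans` never inspects the byte values: each boundary is read directly from the stored start and end bits, so no instruction has to wait for the length of the one before it to be decoded.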

Manufacturer: AMD
Model:
Architecture: x86
Microarchitecture:
Introduction Year: 1996
Clock Speed: 75 to 133 MHz
Core Count:
Socket: Socket 7
Core stepping: SSA/5, 5k86
Voltage: 3.3 V
Transistors: 4.3 million
L1 Cache: 8 KB (data) + 16 KB (instruction)
FSB: 50 MHz to 66 MHz
Process Technology: 350 nm
