AMD AMD5K86 - Fetch

AMD5K86 Processor Technical Reference Manual
18524B/0-Mar1996

2.2.1 Fetch

The processor can fetch up to 16 bytes per clock out of the instruction cache. Fetching begins with the calculation of the linear address for the next instruction along a predicted branch of the x86 instruction stream. The address accesses the instruction cache or, during a miss, the prefetch cache. Fetching can occur along a single execution stream with up to three taken branches. Fetches that miss both the instruction cache and prefetch cache are driven to the prefetcher.
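
The lookup order described above can be illustrated with a minimal Python sketch. It is not a model of the actual hardware: the function name `fetch`, the dictionary-based caches, and the indexing by 16-byte block are all assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple

# Assumption: blocks are indexed at 16-byte granularity, matching the
# "up to 16 bytes per clock" figure in the text.
BLOCK_BYTES = 16


def fetch(linear_address: int,
          icache: Dict[int, bytes],
          prefetch_cache: Dict[int, bytes]) -> Tuple[str, Optional[bytes]]:
    """Return (source, data) for one fetch, following the order in the text."""
    block = linear_address // BLOCK_BYTES  # hypothetical block index
    if block in icache:
        return "icache", icache[block]
    if block in prefetch_cache:
        # Instruction-cache miss: try the prefetch cache next.
        return "prefetch_cache", prefetch_cache[block]
    # Miss in both caches: the request is driven to the prefetcher.
    return "prefetcher", None
```

For example, with `icache = {0x10: b"\x90" * 16}` and an empty prefetch cache, `fetch(0x100, icache, {})` is served from the instruction cache, while `fetch(0x300, icache, {})` falls through to the prefetcher.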

In addition to fetching instructions, the fetch logic handles branch predictions and detects conditions requiring pipeline invalidation and restarting, such as context switches or branches into cache lines that do not contain the correct predecode state.

Branches are dynamically predicted on a cache-line basis using a 1-bit algorithm. Each of the 1024 instruction-cache lines has a tag that predicts the last byte in the cache line to be executed, whether or not the branch will be taken, and the cache index of the branch target (called the successor index). When the caches are invalidated, all branch predictions are cleared.
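
As a rough illustration of the per-line prediction tag, the sketch below stores the three predicted items named in the text for each of the 1024 lines. The class and field names, the extra `valid` flag, and the field widths are assumptions; the manual text here does not give the tag's bit layout.

```python
from dataclasses import dataclass
from typing import List

NUM_LINES = 1024  # from the text: 1024 instruction-cache lines


@dataclass
class PredictionTag:
    valid: bool = False       # hypothetical flag: False means nothing recorded
    last_byte: int = 0        # predicted last byte executed in the line
    taken: bool = False       # the single prediction bit
    successor_index: int = 0  # cache index of the predicted branch target


def new_table() -> List[PredictionTag]:
    """Build one cleared tag per instruction-cache line."""
    return [PredictionTag() for _ in range(NUM_LINES)]


def invalidate(table: List[PredictionTag]) -> None:
    """Clear every prediction, as happens when the caches are invalidated."""
    for i in range(len(table)):
        table[i] = PredictionTag()
```

Keeping one tag per line, rather than per branch, is what "on a cache-line basis" means here: a line with several branches still carries a single prediction.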

During prefetch, all branch instructions are predicted as not-taken. Later, if the execution of a branch instruction reveals a misprediction, the fetch unit backs out of the branch by invalidating all speculative states in the prefetch cache, reorder buffer, load/store reservation station, and store buffer. Then, for cacheable instructions, the branch prediction stored in the instruction cache is updated while the correct branch target is fetched. Prediction updates are disabled when the branch instruction is non-cacheable, because no prediction information is saved for non-cacheable instructions.
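
The update rule in this paragraph can be sketched as follows. This is a simplified model under stated assumptions: the names `Prediction` and `resolve_branch` are hypothetical, a wrong target on a taken branch is treated as a misprediction, and the invalidation of speculative state is not modeled at all.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    taken: bool = False       # branches start out predicted not-taken
    successor_index: int = 0  # cache index of the predicted target


def resolve_branch(pred: Prediction, actual_taken: bool,
                   actual_target_index: int, cacheable: bool) -> bool:
    """Return True on a misprediction; update pred only when cacheable."""
    # Assumption: a taken branch with the wrong target also counts as a miss.
    mispredicted = (pred.taken != actual_taken) or (
        actual_taken and pred.successor_index != actual_target_index)
    if mispredicted and cacheable:
        # 1-bit scheme: the stored prediction becomes the latest outcome.
        pred.taken = actual_taken
        if actual_taken:
            pred.successor_index = actual_target_index
    # Non-cacheable branches never update: no prediction is saved for them.
    return mispredicted
```

With a 1-bit scheme a single misprediction is enough to flip the stored direction, so a branch that alternates between taken and not-taken would mispredict every time in this model.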

In typical x86 desktop programs, a branch occurs about once every seven x86 instructions. Without branch prediction, branch targets remain unresolved until the execution phase, which creates pipeline delays. The processor's branch-prediction mechanism accurately predicts 70% to 85% of branches (depending on program behavior) and has a misprediction penalty of only three processor clocks.
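
To put these figures together, the back-of-the-envelope calculation below estimates the average misprediction cost per instruction. The formula is a simplifying assumption (one branch per seven instructions, a fixed three-clock penalty, no other stalls), not a figure from the manual, and the function name is hypothetical.

```python
def mispredict_cost_per_instruction(accuracy: float,
                                    instructions_per_branch: float = 7.0,
                                    penalty_clocks: float = 3.0) -> float:
    """Average misprediction clocks charged to each instruction."""
    branch_rate = 1.0 / instructions_per_branch  # branches per instruction
    return branch_rate * (1.0 - accuracy) * penalty_clocks


# Evaluate both ends of the accuracy range quoted in the text.
for acc in (0.70, 0.85):
    cost = mispredict_cost_per_instruction(acc)
    print(f"{acc:.0%} accurate: {cost:.3f} clocks/instruction")
```

Under these assumptions the cost works out to roughly 0.13 clocks per instruction at 70% accuracy and roughly 0.06 at 85%.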

Internal Architecture
