
Intel ARCHITECTURE IA-32 User Manual

568 pages
Optimizing Cache Usage 6
• Balance single-pass versus multi-pass execution
• Resolve memory bank conflict issues
• Resolve cache management issues
The subsequent sections discuss all the above items.
Software Prefetch Scheduling Distance
Determining the ideal prefetch placement in the code depends on many
architectural parameters, including the amount of memory to be
prefetched, cache lookup latency, system memory latency, and the
estimated number of computation cycles. The ideal distance for
prefetching data is processor- and platform-dependent. If the distance is
too short, the prefetch hides little or none of the fetch latency behind
computation. If the prefetch is issued too far ahead, the prefetched data
may be evicted from the cache before it is actually required.
Since prefetch distance is not a well-defined metric, this discussion
defines a new term, prefetch scheduling distance (PSD), which is
measured in loop iterations. For large loops, the prefetch
scheduling distance can be set to 1; that is, prefetch instructions are
scheduled one iteration ahead. For small loop bodies, that is, loop
iterations with little computation, the prefetch scheduling distance must
be more than one iteration.
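The idea of scheduling a prefetch a fixed number of iterations ahead can be sketched in C as follows. This is a minimal illustration, not code from the manual: the function name sum_with_prefetch, the 64-byte cache line size, and the choice of the T0 hint are all assumptions made for the example. One outer-loop iteration processes one cache line of floats, and the prefetch targets the line that will be needed psd iterations later.

```c
#include <stddef.h>

/* Use the SSE prefetch intrinsic where available; otherwise compile the
 * prefetch away so the sketch still builds (assumption for portability). */
#if defined(__SSE__) || defined(_M_X64) || defined(_M_IX86_FP)
#include <xmmintrin.h>
#define PREFETCH_LINE(p) _mm_prefetch((const char *)(p), _MM_HINT_T0)
#else
#define PREFETCH_LINE(p) ((void)(p))
#endif

/* Assumed: 64-byte cache lines and 4-byte floats, so 16 floats per line. */
#define FLOATS_PER_LINE 16

/* Sums n floats, issuing a prefetch psd iterations (cache lines) ahead of
 * the line currently being processed. psd is the prefetch scheduling
 * distance, measured in loop iterations. */
float sum_with_prefetch(const float *a, size_t n, size_t psd)
{
    float sum = 0.0f;
    size_t ahead = psd * FLOATS_PER_LINE;

    for (size_t i = 0; i < n; i += FLOATS_PER_LINE) {
        /* Only prefetch addresses that lie inside the array. */
        if (i + ahead < n)
            PREFETCH_LINE(&a[i + ahead]);

        /* The "computation" for this iteration: consume one cache line. */
        size_t end = (i + FLOATS_PER_LINE < n) ? i + FLOATS_PER_LINE : n;
        for (size_t j = i; j < end; ++j)
            sum += a[j];
    }
    return sum;
}
```

With so little work per iteration, this loop is the "small loop body" case, so a psd greater than 1 would normally be tried; the best value has to be found by measurement on the target processor and platform.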
A simplified equation for computing PSD is derived from a
mathematical model. For the simplified equation, the complete mathematical
model, and the methodology for determining prefetch distance, refer to
Appendix E, “Mathematics of Prefetch Scheduling Distance”.
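As a rough first approximation only, and not the equation from Appendix E, the distance can be estimated by dividing the latency to be hidden by the cycles of computation available per iteration and rounding up. The helper below is a hypothetical sketch; the name psd_estimate and both parameters are assumptions introduced for illustration, and the cycle counts must come from measurement or platform documentation.

```c
/* Rough PSD estimate: the number of iterations whose computation is needed
 * to cover the fetch latency, rounded up and never less than 1.
 * This is a generic rule of thumb, not the Appendix E model.
 * Returns 0 if cycles_per_iter is 0, since the estimate is undefined. */
unsigned psd_estimate(unsigned latency_cycles, unsigned cycles_per_iter)
{
    if (cycles_per_iter == 0)
        return 0;

    /* Integer ceiling of latency_cycles / cycles_per_iter. */
    unsigned psd = (latency_cycles + cycles_per_iter - 1) / cycles_per_iter;

    return psd ? psd : 1;
}
```

For instance, psd_estimate(200, 50) yields 4, while a large loop body such as psd_estimate(100, 400) yields 1, which matches the guidance that large loops can prefetch one iteration ahead and small loops need more.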
Example 6-3 illustrates the use of a prefetch within the loop body. The
prefetch scheduling distance is set to 3, esi is effectively the pointer to a
line, edx is the address of the data being referenced, and xmm1-xmm4 are
the data used in computation. Example 6-4 uses two independent cache


Intel ARCHITECTURE IA-32 Specifications

General
Instruction Set: x86
Instruction Set Type: CISC
Memory Segmentation: Supported
Operating Modes: Real mode, Protected mode, Virtual 8086 mode
Max Physical Address Size: 36 bits (with PAE)
Max Virtual Address Size: 32 bits
Architecture: IA-32 (Intel Architecture 32-bit)
Addressable Memory: 4 GB (up to 64 GB with Physical Address Extension)
Floating Point Registers: 8 x 80-bit
MMX Registers: 8 x 64-bit
SSE Registers: 8 x 128-bit
Registers: General-purpose registers (EAX, EBX, ECX, EDX, ESI, EDI, ESP, EBP), Segment registers (CS, DS, SS, ES, FS, GS), Instruction pointer (EIP), Flags register (EFLAGS)
Floating Point Unit: Yes (x87)
