Intel ARCHITECTURE IA-32 User Manual

IA-32 Intel® Architecture Processor Family Overview
Reordering loads with respect to each other can prevent a load miss
from stalling later loads. Reordering loads with respect to other loads
and stores to different addresses can enable more parallelism, allowing
the machine to execute operations as soon as their inputs are ready.
Writes to memory are always carried out in program order to maintain
program correctness.
A cache miss for a load does not prevent other loads from issuing and
completing. The Pentium 4 processor supports up to four (or eight for
Pentium 4 processor with CPUID signature corresponding to family 15,
model 3) outstanding load misses that can be serviced either by on-chip
caches or by memory.
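
The difference between independent and dependent loads can be illustrated with a minimal C sketch. The `struct node` type and both function names are hypothetical and chosen only for illustration. The code shows the dependence structure only; how many misses actually overlap depends on the processor and on the limits described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list node, used only for this illustration. */
struct node {
    int value;
    struct node *next;
};

/*
 * Independent loads: the address of a[i] does not depend on the result
 * of any earlier load, so a miss on one element need not prevent the
 * loads of later elements from issuing.
 */
long sum_array(const int *a, size_t n)
{
    assert(a != NULL || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

/*
 * Dependent loads: the address of each node comes from the previous
 * load of p->next, so a miss on one node delays every later node.
 */
long sum_list(const struct node *p)
{
    long total = 0;
    while (p != NULL) {
        total += p->value;
        p = p->next;
    }
    return total;
}
```

Both functions compute the same kind of sum, but only the array version gives the hardware several loads whose addresses are known ahead of time.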
Store buffers improve performance by allowing the processor to
continue executing instructions without having to wait until a write to
memory and/or cache is complete. Writes are generally not on the
critical path for dependence chains, so it is often beneficial to delay
writes for more efficient use of memory-access bus cycles.
Store Forwarding
Loads can be moved before stores that occurred earlier in the program if
they are not predicted to load from the same linear address. If they do
read from the same linear address, they have to wait for the store data to
become available. However, with store forwarding, they do not have to
wait for the store to write to the memory hierarchy and retire. The data
from the store can be forwarded directly to the load, as long as the
following conditions are met:
• Sequence: the data to be forwarded to the load has been generated
  by a programmatically earlier store that has already executed
• Size: the bytes loaded must be a subset of the bytes stored (the
  subset need not be proper, that is, the load may read exactly the
  bytes stored)
• Alignment: the store cannot wrap around a cache line boundary, and
  the linear address of the load must be the same as that of the store
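
The three conditions above can be written as a small predicate. This is a simplified model of the stated rules and not a description of the hardware logic. The `struct access` type, the `store_can_forward` name, and the 64-byte line size are assumptions made for this sketch; the text above does not state a cache line size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed cache line size in bytes; not given in the text above. */
#define ASSUMED_LINE_SIZE 64u

/* Hypothetical description of one memory access. */
struct access {
    uint32_t addr;  /* linear address of the first byte */
    uint32_t size;  /* number of bytes accessed, must be > 0 */
};

/* Returns true if the load may take its data directly from the store. */
bool store_can_forward(struct access store, struct access load,
                       bool store_executed)
{
    assert(store.size > 0 && load.size > 0);

    /* Sequence: the earlier store must already have executed. */
    if (!store_executed)
        return false;

    /* Alignment: the store must not wrap around a cache line boundary. */
    uint64_t first_line = store.addr / ASSUMED_LINE_SIZE;
    uint64_t last_line =
        ((uint64_t)store.addr + store.size - 1) / ASSUMED_LINE_SIZE;
    if (first_line != last_line)
        return false;

    /* Alignment: the load must start at the same linear address. */
    if (load.addr != store.addr)
        return false;

    /* Size: with equal start addresses, the loaded bytes are a subset
       of the stored bytes exactly when the load is not wider. */
    return load.size <= store.size;
}
```

Under this model, a 4-byte store followed by a 2-byte load from the same address qualifies, while a 2-byte store followed by a 4-byte load from that address does not, because the load needs bytes the store did not write.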

Intel ARCHITECTURE IA-32 Specifications

General
Instruction Set: x86
Instruction Set Type: CISC
Memory Segmentation: Supported
Operating Modes: Real mode, Protected mode, Virtual 8086 mode
Max Physical Address Size: 36 bits (with PAE)
Max Virtual Address Size: 32 bits
Architecture: IA-32 (Intel Architecture 32-bit)
Addressable Memory: 4 GB (with Physical Address Extension up to 64 GB)
Floating Point Registers: 8 x 80-bit
MMX Registers: 8 x 64-bit
SSE Registers: 8 x 128-bit
Registers: General-purpose registers (EAX, EBX, ECX, EDX, ESI, EDI, ESP, EBP), Segment registers (CS, DS, SS, ES, FS, GS), Instruction pointer (EIP), Flags register (EFLAGS)
Floating Point Unit: Yes (x87)