Intel ARCHITECTURE IA-32 User Manual
568 pages
IA-32 Intel® Architecture Optimization
Store Forwarding
The processor's memory system only sends stores to memory (including
cache) after store retirement. However, store data can be forwarded
from a store to a subsequent load from the same address to give a much
shorter store-load latency.
There are two kinds of requirements for store forwarding. If these
requirements are violated, store forwarding cannot occur and the load
must get its data from the cache (so the store must write its data back to
the cache first). This incurs a penalty that is related to pipeline depth.
The first requirement pertains to the size and alignment of the
store-forwarding data. This restriction is likely to have a high impact on
overall application performance. Typically, the performance penalty due to
violating this restriction can be prevented. Several examples of coding
pitfalls that cause store-forwarding stalls and solutions to these pitfalls
are discussed in detail in the “Store-to-Load-Forwarding Restriction on
Size and Alignment” section. The second requirement is the availability
of data, discussed in the “Store-forwarding Restriction on Data
Availability” section.
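
As a rough illustration of the size restriction (the detailed rules are in the sections named above), the sketch below contrasts a wide load that follows two narrower stores with a version that builds the same value in a register. The function names are hypothetical and not taken from the manual. The equality of the two results assumes a little-endian target such as IA-32, and an optimizing compiler may remove the memory traffic in the first function entirely, so it is meant only to show the access pattern.

```c
#include <stdint.h>
#include <string.h>

/* Pitfall pattern: two 16-bit stores followed by one 32-bit load of the
   same bytes. The load needs data from both stores, so it typically
   cannot be forwarded from either one. */
uint32_t combine_via_memory(uint16_t lo, uint16_t hi)
{
    uint16_t halves[2];
    uint32_t word;

    halves[0] = lo;                       /* 16-bit store */
    halves[1] = hi;                       /* 16-bit store */
    memcpy(&word, halves, sizeof word);   /* 32-bit load spanning both */
    return word;
}

/* Alternative: assemble the value with shifts and ORs so that no load
   depends on earlier narrow stores. */
uint32_t combine_in_register(uint16_t lo, uint16_t hi)
{
    return (uint32_t)lo | ((uint32_t)hi << 16);
}
```

On a little-endian machine both functions return the same value; only the second avoids the store-then-wider-load sequence.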
A good practice is to eliminate redundant load operations; some
guidelines follow.
It may be possible to keep a temporary scalar variable in a register and
never write it to memory. Generally, such a variable must not be
accessible via indirect pointers. Moving a variable to a register
eliminates all loads and stores of that variable and eliminates potential
problems associated with store forwarding. However, it also increases
register pressure.
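
A minimal sketch of this guideline, using hypothetical function names: the first version accumulates through a pointer, and because `*total` could alias an element of `data`, a compiler generally has to load and store it on every iteration. The second keeps the running sum in a local variable whose address is never taken, so it can stay in a register.

```c
#include <stddef.h>

/* Accumulates through a pointer: each iteration is a load, an add and a
   store to *total, and every load depends on the previous store. */
void sum_through_pointer(const int *data, size_t n, int *total)
{
    *total = 0;
    for (size_t i = 0; i < n; i++)
        *total += data[i];
}

/* Accumulates in a local that is not reachable through any pointer, so
   it can live in a register; memory is written once at the end. */
void sum_in_local(const int *data, size_t n, int *total)
{
    int acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += data[i];
    *total = acc;
}
```

Both functions produce the same result when `total` does not point into `data`. The second costs one extra live register inside the loop, which is the register-pressure trade-off mentioned above.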
Load instructions tend to start chains of computation. Since the
out-of-order engine is based on data dependence, load instructions play a
significant role in the engine's capability to execute at a high rate.
Eliminating loads should be given a high priority.
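
One common way to remove a redundant load, sketched here under assumptions, is to read a loop-invariant value once before the loop. The `struct params` type and both function names are hypothetical. Because `out` and `p->scale` have compatible types, a compiler may have to assume that a store to `out[i]` can change `p->scale` and reload it each time.

```c
#include <stddef.h>

/* Hypothetical parameter block, used only for this illustration. */
struct params {
    int scale;
};

/* p->scale is read inside the loop; since out[i] might alias *p, the
   compiler may need to reload it on every iteration. */
void scale_reload(int *out, const int *in, size_t n, const struct params *p)
{
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] * p->scale;
}

/* One load up front; every multiply then depends on a register value
   rather than on a fresh load. */
void scale_hoisted(int *out, const int *in, size_t n, const struct params *p)
{
    const int scale = p->scale;
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] * scale;
}
```

The two versions are equivalent only when `out` does not overlap `*p`; the hoisted form makes that assumption explicit in the source.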

Intel ARCHITECTURE IA-32 Specifications

General
Instruction Set: x86
Instruction Set Type: CISC
Memory Segmentation: Supported
Operating Modes: Real mode, Protected mode, Virtual 8086 mode
Max Physical Address Size: 36 bits (with PAE)
Max Virtual Address Size: 32 bits
Architecture: IA-32 (Intel Architecture 32-bit)
Addressable Memory: 4 GB (with Physical Address Extension up to 64 GB)
Floating Point Registers: 8 x 80-bit
MMX Registers: 8 x 64-bit
SSE Registers: 8 x 128-bit
Registers: General-purpose registers (EAX, EBX, ECX, EDX, ESI, EDI, ESP, EBP), Segment registers (CS, DS, SS, ES, FS, GS), Instruction pointer (EIP), Flags register (EFLAGS)
Floating Point Unit: Yes (x87)