
Intel ARCHITECTURE IA-32 User Manual

Intel ARCHITECTURE IA-32
568 pages
General Optimization Guidelines 2
2-91
Using memory as a destination operand may further reduce register
pressure at the slight risk of making trace cache packing more difficult.
On the Pentium 4 processor, loading a value from memory into a register
and then adding that register to memory is faster than the alternative
sequence of adding a value from memory to a register and then storing
the register to memory. The first sequence also uses one fewer μop than
the second.
Assembly/Compiler Coding Rule 59. (ML impact, M generality) Give
preference to adding a register to memory (memory is the destination) instead
of adding memory to a register. Also, give preference to adding a register to
memory over loading the memory, adding two registers and storing the result.
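As a rough illustration of Rule 59, the C sketch below writes the same update in two ways. C does not let the programmer select instructions, so the IA-32 forms shown in the comments are only what a compiler might plausibly emit, not a guarantee. The function names are hypothetical and do not come from the manual.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: memory is the destination of the add.
 * A compiler targeting IA-32 might translate the body to something like
 *     mov eax, [value]
 *     add [total], eax
 * which is the register-to-memory form that Rule 59 prefers.
 * The actual instruction choice is up to the compiler. */
static void add_to_memory(int *total, const int *value)
{
    assert(total != NULL && value != NULL);
    *total += *value;
}

/* Hypothetical helper: the explicit load / add / store spelling,
 * corresponding roughly to
 *     mov eax, [total]
 *     add eax, [value]
 *     mov [total], eax
 * An optimizing compiler is free to turn this into the form above. */
static void add_via_register(int *total, const int *value)
{
    assert(total != NULL && value != NULL);
    int tmp = *total;   /* load the destination */
    tmp += *value;      /* add memory to the register copy */
    *total = tmp;       /* store the result back */
}
```

Both helpers compute the same result; the rule concerns only which instruction sequence ends up in the generated code, so inspecting the compiler's assembly output is the way to confirm which form was chosen.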
Assembly/Compiler Coding Rule 60. (M impact, M generality) When the
address of a store is unknown, subsequent loads cannot be scheduled to
execute out of order ahead of the store, which limits the out-of-order
execution of the processor. When the address of a store is computed by a
potentially long-latency operation (such as a load that might miss the data
cache), attempt to reorder subsequent loads ahead of the store.
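The reordering described in Rule 60 can be sketched at the source level in C. The example assumes a hypothetical index array `slot` whose load may miss the data cache, which makes the store address available late. Moving the loads ahead of the store is only valid if the caller guarantees that `a` and `b` never overlap `table`. All names here are illustrative assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Store first: the loads of a[i] and b[i] follow, in program order,
 * a store whose address depends on the load of slot[i]. */
static int store_then_load(int *table, const size_t *slot,
                           const int *a, const int *b, size_t i, int v)
{
    assert(table && slot && a && b);
    table[slot[i]] = v;       /* store address depends on slot[i] */
    return a[i] + b[i];       /* loads placed after the store */
}

/* Loads first: the independent loads precede the store in program
 * order. ASSUMPTION: a and b do not alias table; otherwise the two
 * functions could return different values. */
static int load_then_store(int *table, const size_t *slot,
                           const int *a, const int *b, size_t i, int v)
{
    assert(table && slot && a && b);
    int sum = a[i] + b[i];    /* independent loads issued first */
    table[slot[i]] = v;       /* store with the late-known address */
    return sum;
}
```

Because the compiler generally cannot prove that the pointers do not alias, it tends to keep the source order of these loads and the store, so the programmer's ordering usually matters here. Checking the generated assembly remains advisable.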
Instruction Scheduling
Ideally, scheduling or pipelining should be done in a way that optimizes
performance across all processor generations. This section presents
scheduling rules that can improve the performance of your code on the
Pentium 4 processor.
Latencies and Resource Constraints
Assembly/Compiler Coding Rule 61. (M impact, MH generality) Calculate
store addresses as early as possible to avoid having stores block loads.
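One way to read Rule 61 at the source level is to compute the destination pointer before a long computation rather than at the point of the store. The sketch below is a minimal illustration under that assumption. The names `table`, `slot`, and `data` are hypothetical, and a compiler may still schedule the address computation differently.

```c
#include <assert.h>
#include <stddef.h>

/* Computes the store address up front, then does the work that
 * produces the value, then stores through the pointer. */
static void sum_into_slot(int *table, const size_t *slot, size_t i,
                          const int *data, size_t n)
{
    assert(table && slot && data);
    int *dst = &table[slot[i]];     /* store address calculated early */
    int sum = 0;
    for (size_t k = 0; k < n; k++)  /* work that produces the value */
        sum += data[k];
    *dst = sum;                     /* address is already available */
}
```

With the address resolved before the loop, the final store does not have to wait on the load of `slot[i]`, so loads that follow it are less likely to be held up behind a store with an unknown address.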
Example 2-25 Recombining LOAD/OP Code into REG,MEM Form
LOAD reg1, mem1
... code that does not write to reg1...
OP reg2, reg1
... code that does not use reg1 ...


Intel ARCHITECTURE IA-32 Specifications

General
Instruction Set: x86
Instruction Set Type: CISC
Memory Segmentation: Supported
Operating Modes: Real mode, Protected mode, Virtual 8086 mode
Max Physical Address Size: 36 bits (with PAE)
Max Virtual Address Size: 32 bits
Architecture: IA-32 (Intel Architecture 32-bit)
Addressable Memory: 4 GB (with Physical Address Extension up to 64 GB)
Floating Point Registers: 8 x 80-bit
MMX Registers: 8 x 64-bit
SSE Registers: 8 x 128-bit
Registers: General-purpose registers (EAX, EBX, ECX, EDX, ESI, EDI, ESP, EBP), Segment registers (CS, DS, SS, ES, FS, GS), Instruction pointer (EIP), Flags register (EFLAGS)
Floating Point Unit: Yes (x87)
