Intel ARCHITECTURE IA-32 User Manual

568 pages
IA-32 Intel® Architecture Optimization
2-100
order engine. When tuning, note that all IA-32-based processors have very
high branch prediction rates. Consistently mispredicted branches are rare.
Use these instructions only if the increase in computation time is less than
the expected cost of a mispredicted branch. 2-16
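The trade-off above can be sketched in C. This assumes that "these instructions" refers to the SETcc and CMOVcc family; the start of the rule falls on a page not shown here, so that reading is an assumption. The function names are hypothetical, and whether a compiler actually emits a conditional move depends on the compiler, its flags, and the target processor.

```c
#include <stdint.h>

/* Branching form: inexpensive when the branch is well predicted. */
int32_t max_branch(int32_t a, int32_t b)
{
    if (a > b)
        return a;
    return b;
}

/* Select form: optimizing compilers often, but not always, lower this
 * to a compare followed by a conditional move. */
int32_t max_select(int32_t a, int32_t b)
{
    return (a > b) ? a : b;
}

/* Mask form: no branch at the source level, at the cost of a few extra
 * ALU operations on every call, which is the added computation time the
 * rule asks you to weigh against the misprediction cost. */
int32_t max_mask(int32_t a, int32_t b)
{
    /* mask is all ones when a > b, all zeros otherwise */
    uint32_t mask = (uint32_t)0 - (uint32_t)(a > b);
    /* Converting back to int32_t wraps on mainstream compilers
     * (formally implementation-defined for values above INT32_MAX). */
    return (int32_t)(((uint32_t)a & mask) | ((uint32_t)b & ~mask));
}
```

All three functions return the same value; only the amount of unconditional work differs. For a comparison that is almost always true or almost always false, the branching form is likely the better choice.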
Assembly/Compiler Coding Rule 3. (M impact, H generality) Arrange
code to be consistent with the static branch prediction algorithm: make
the fall-through code following a conditional branch be the likely target
for a branch with a forward target, and make the fall-through code
following a conditional branch be the unlikely target for a branch with a
backward target. 2-19
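As a minimal sketch of Rule 3 at the C level, the function below keeps the common case on the fall-through path of a forward branch and relies on the loop's backward branch being taken. The `UNLIKELY` macro and `sum_checked` are hypothetical names; `__builtin_expect` is a GCC/Clang extension that only hints at layout, so the final branch arrangement remains the compiler's decision.

```c
#include <stddef.h>

#if defined(__GNUC__)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define UNLIKELY(x) (x)   /* no hint available: plain condition */
#endif

/* Sums n ints into *out. Returns 0 on success, -1 on a NULL argument. */
int sum_checked(const int *buf, size_t n, long *out)
{
    /* Rare error path: a forward branch that is normally not taken,
     * so the likely code is the fall-through. */
    if (UNLIKELY(buf == NULL || out == NULL))
        return -1;

    long total = 0;
    /* The loop's closing branch targets earlier code (backward) and is
     * normally taken, so the unlikely loop exit is the fall-through. */
    for (size_t i = 0; i < n; i++)
        total += buf[i];

    *out = total;
    return 0;
}
```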
Assembly/Compiler Coding Rule 4. (MH impact, MH generality)
Near calls must be matched with near returns, and far calls must be
matched with far returns. Pushing the return address on the stack and
jumping to the routine to be called is not recommended since it creates a
mismatch in calls and returns. 2-21
Assembly/Compiler Coding Rule 5. (MH impact, MH generality)
Selectively inline a function where doing so decreases code size, or if the
function is small and the call site is frequently executed. 2-22
Assembly/Compiler Coding Rule 6. (H impact, M generality) Do not
inline a function if doing so increases the working set size beyond what
will fit in the trace cache. 2-22
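Rules 5 and 6 pull in opposite directions, so a concrete case may help. The sketch below shows the kind of function Rule 5 favors: a tiny helper called from a hot loop. The names `clamp_u8` and `clamp_buffer` are hypothetical, and `static inline` is only a request; the compiler may decline, and for a larger helper called from many sites Rule 6 suggests leaving it out of line.

```c
#include <stddef.h>
#include <stdint.h>

/* Small helper: a couple of compares, a good inlining candidate. */
static inline uint8_t clamp_u8(int32_t v)
{
    if (v < 0)
        return 0;
    if (v > 255)
        return 255;
    return (uint8_t)v;
}

/* Frequently executed call site: one call per element. Inlining here
 * removes the call/return pair from every iteration. */
void clamp_buffer(const int32_t *in, uint8_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = clamp_u8(in[i]);
}
```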
Assembly/Compiler Coding Rule 7. (ML impact, ML generality) If
there are more than 16 nested calls and returns in rapid succession,
consider transforming the program, for example, with inline, to reduce the
call depth. 2-22
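Inlining is the transformation the rule names. Another way to reduce call depth, shown here purely as an illustration and not taken from the manual, is to replace recursion with a loop. The `node` type and both function names are hypothetical.

```c
#include <stddef.h>

struct node {
    struct node *next;
};

/* Recursive form: call depth grows with the list length, so a list of
 * more than 16 nodes produces more than 16 nested calls and returns. */
size_t length_recursive(const struct node *n)
{
    if (n == NULL)
        return 0;
    return 1 + length_recursive(n->next);
}

/* Iterative form: same result with a constant call depth. */
size_t length_iterative(const struct node *n)
{
    size_t count = 0;
    while (n != NULL) {
        count++;
        n = n->next;
    }
    return count;
}
```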
Assembly/Compiler Coding Rule 8. (ML impact, ML generality)
Favor inlining small functions that contain branches with poor prediction
rates. If a branch misprediction results in a RETURN being prematurely
predicted as taken, a performance penalty may be incurred. 2-22
Assembly/Compiler Coding Rule 9. (L impact, L generality) If the last
statement in a function is a call to another function, consider converting
the call to a jump. This will save the call/return overhead as well as an
entry in the return stack buffer. 2-22
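C has no direct way to write the jump described in Rule 9, but placing the call in tail position gives the compiler the opportunity to make the conversion. The sketch below contrasts the two shapes; all function names are hypothetical, and whether a jump is actually emitted depends on the compiler and optimization level.

```c
/* Callee used by both wrappers. */
int helper(int x)
{
    return x * 2 + 1;
}

/* The call is the last action and its result is returned unchanged, so
 * a compiler may replace call + ret with a single jump to helper. */
int wrapper_tail(int x)
{
    int y = x + 3;
    return helper(y);
}

/* Work remains after the call (the addition), so control must come
 * back here and the call cannot become a jump. */
int wrapper_not_tail(int x)
{
    int y = x + 3;
    return helper(y) + 1;
}
```

In hand-written assembly the same idea applies directly: when nothing follows the final call, it can be replaced by a jump to the callee, provided the stack is already in the state the callee's return expects.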

Intel ARCHITECTURE IA-32 Specifications

General
Instruction Set: x86
Instruction Set Type: CISC
Memory Segmentation: Supported
Operating Modes: Real mode, Protected mode, Virtual 8086 mode
Max Physical Address Size: 36 bits (with PAE)
Max Virtual Address Size: 32 bits
Architecture: IA-32 (Intel Architecture 32-bit)
Addressable Memory: 4 GB (up to 64 GB with Physical Address Extension)
Floating Point Registers: 8 x 80-bit
MMX Registers: 8 x 64-bit
SSE Registers: 8 x 128-bit
Registers: General-purpose registers (EAX, EBX, ECX, EDX, ESI, EDI, ESP, EBP), Segment registers (CS, DS, SS, ES, FS, GS), Instruction pointer (EIP), Flags register (EFLAGS)
Floating Point Unit: Yes (x87)
