Intel ARCHITECTURE IA-32 User Manual

568 pages
Chapter 2: General Optimization Guidelines
User/Source Coding Rules
User/Source Coding Rule 1. (M impact, L generality) If an indirect branch
has two or more common taken targets, and at least one of those targets is
correlated with the branch history leading up to the branch, then convert the
indirect branch into a tree in which one or more indirect branches are preceded
by conditional branches to those targets. Apply this “peeling” procedure to the
common target of an indirect branch that correlates with branch history. (page 2-24)
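As a rough illustration of the peeling described in Rule 1, the sketch below places a conditional branch with a direct call in front of an indirect call. The handler type and the functions `handle_common` and `handle_rare` are hypothetical names introduced only for this example; which target is actually "common" has to come from profiling the real code.

```c
/* Hypothetical handler type and handlers, for illustration only. */
typedef int (*handler_fn)(int);

int handle_common(int x) { return x + 1; }  /* assumed to be the frequent target */
int handle_rare(int x)   { return x * 2; }  /* assumed to be an infrequent target */

/* Before: every target is reached through a single indirect call. */
int dispatch_indirect(handler_fn fn, int x)
{
    return fn(x);
}

/* After: the common target is peeled into a conditional branch followed by
 * a direct call. Only the remaining targets use the indirect call. */
int dispatch_peeled(handler_fn fn, int x)
{
    if (fn == handle_common)
        return handle_common(x);
    return fn(x);
}
```

Both dispatchers return the same result for any handler; the peeled form only changes which kind of branch the processor has to predict. An optimizing compiler may inline or rearrange either version, so the effect should be confirmed by measurement.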
User/Source Coding Rule 2. (H impact, M generality) Pad data structures
defined in the source code so that every data element is aligned to a natural
operand-size address boundary. If the operands are packed in a SIMD
instruction, align to the packed element size (64- or 128-bit). (page 2-39)
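One way to express the padding in Rule 2 is with C11 alignment specifiers, as in the minimal sketch below. The structure names and fields are hypothetical, and the sketch assumes a C11 compiler that provides `<stdalign.h>`; older toolchains need compiler-specific alignment attributes instead.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: each field starts on its natural boundary. */
struct sample_record {
    uint8_t  tag;              /* offset 0 */
    uint8_t  pad[3];           /* explicit padding so 'count' starts at offset 4 */
    uint32_t count;            /* 4-byte operand on a 4-byte boundary */
    alignas(8) double value;   /* 8-byte operand on an 8-byte boundary */
};

/* Hypothetical SIMD operand: four packed floats aligned to 128 bits. */
struct simd_block {
    alignas(16) float lane[4];
};
```

The explicit `alignas(8)` matters on IA-32 ABIs where a `double` inside a structure would otherwise only be guaranteed 4-byte alignment.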
User/Source Coding Rule 3. (M impact, L generality) Beware of false
sharing within a cache line (64 bytes) on Pentium 4, Intel Xeon, and
Pentium M processors, and within a 128-byte sector on Pentium 4 and Intel
Xeon processors. (page 2-42)
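A common way to avoid the false sharing named in Rule 3 is to give each independently written item its own cache line. The sketch below assumes a C11 compiler and uses a hypothetical per-thread counter type; the constant 64 comes from the cache line size in the rule, and a value of 128 would be the analogous choice for the sector size on Pentium 4 and Intel Xeon processors.

```c
#include <stdalign.h>

/* Line size taken from Rule 3; use 128 to separate items by a full sector. */
#define CACHE_LINE_BYTES 64

/* Hypothetical per-thread counter. The alignment requirement also rounds the
 * structure size up to a whole cache line, so array elements never share one. */
struct padded_counter {
    alignas(CACHE_LINE_BYTES) long value;
};

/* One element per thread; each element occupies its own cache line. */
struct padded_counter counters[4];
```

The cost is memory: each counter now takes a full line, so this layout is worth using only for data that different threads write frequently.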
User/Source Coding Rule 4. (H impact, ML generality) Consider using a
special memory allocation library to avoid aliasing. (page 2-46)
User/Source Coding Rule 5. (M impact, M generality) When padding
variable declarations to avoid aliasing, the greatest benefit comes from
avoiding aliasing on second-level cache lines, suggesting an offset of 128 bytes
or more. (page 2-46)
User/Source Coding Rule 6. (H impact, H generality) Optimization
techniques such as blocking, loop interchange, loop skewing, and packing are
best done by the compiler. Optimize data structures to either fit in one-half of
the first-level cache or in the second-level cache; turn on loop optimizations
in the compiler to enhance locality for nested loops. (page 2-52)
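Rule 6 recommends leaving transformations such as loop interchange to the compiler. The sketch below only illustrates what that transformation does, using a hypothetical 64 x 64 integer matrix: both functions compute the same sum, but the second walks memory with unit stride. The names and the matrix size are assumptions made for the example.

```c
#define N 64

/* Fill the matrix with distinct values 0 .. N*N-1 (helper for the example). */
void fill_matrix(int a[N][N])
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            a[i][j] = i * N + j;
}

/* Column-first traversal: consecutive accesses are N ints apart in memory. */
long sum_column_order(int a[N][N])
{
    long sum = 0;
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            sum += a[i][j];
    return sum;
}

/* After loop interchange: consecutive accesses are adjacent in memory. */
long sum_row_order(int a[N][N])
{
    long sum = 0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            sum += a[i][j];
    return sum;
}
```

With loop optimizations enabled, a compiler may perform this interchange on its own, which is the point of the rule; the source-level work is better spent on sizing the data to fit the cache.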
User/Source Coding Rule 7. (M impact, ML generality) If there is a blend
of reads and writes on the bus, changing the code to separate these bus
transactions into read phases and write phases can help performance. Note,
however, that the order of read and write operations on the bus is not the
same as the order in which they appear in the program. (page 2-52)
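The separation into phases that Rule 7 describes can be sketched at the source level as follows. The function names, the chunk size of 16 elements, and the scaling operation are all hypothetical; the sketch assumes the small staging buffer stays in the first-level cache, so that traffic to `src` and `dst` is grouped into reads followed by writes. As the rule notes, the processor may still reorder the actual bus transactions.

```c
#include <stddef.h>

/* Chunk size is an assumption for the example, not a tuned value. */
#define PHASE_ELEMS 16

/* Interleaved version: each iteration reads src and then writes dst. */
void scale_interleaved(const int *src, int *dst, size_t n, int k)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * k;
}

/* Phased version: read a chunk into a local buffer, then write it out. */
void scale_phased(const int *src, int *dst, size_t n, int k)
{
    int buf[PHASE_ELEMS];

    for (size_t base = 0; base < n; base += PHASE_ELEMS) {
        size_t chunk = (n - base < PHASE_ELEMS) ? (n - base) : PHASE_ELEMS;

        for (size_t i = 0; i < chunk; i++)   /* read phase */
            buf[i] = src[base + i] * k;

        for (size_t i = 0; i < chunk; i++)   /* write phase */
            dst[base + i] = buf[i];
    }
}
```

Both functions produce identical output; whether the phased form is faster depends on the memory system and the compiler, so it should be checked by measurement.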

Intel ARCHITECTURE IA-32 Specifications

General
Instruction Set: x86
Instruction Set Type: CISC
Memory Segmentation: Supported
Operating Modes: Real mode, Protected mode, Virtual 8086 mode
Max Physical Address Size: 36 bits (with PAE)
Max Virtual Address Size: 32 bits
Architecture: IA-32 (Intel Architecture 32-bit)
Addressable Memory: 4 GB (with Physical Address Extension up to 64 GB)
Floating Point Registers: 8 x 80-bit
MMX Registers: 8 x 64-bit
SSE Registers: 8 x 128-bit
Registers: General-purpose registers (EAX, EBX, ECX, EDX, ESI, EDI, ESP, EBP), Segment registers (CS, DS, SS, ES, FS, GS), Instruction pointer (EIP), Flags register (EFLAGS)
Floating Point Unit: Yes (x87)
