Intel ARCHITECTURE IA-32 User Manual

568 pages
General Optimization Guidelines 2
2-59
to early out). However, be careful of introducing more than a total of two
values for the floating-point control word, or there will be a large performance
penalty. See “Floating-point Modes”.
User/Source Coding Rule 13. (H impact, ML generality) Use fast
float-to-int routines, FISTTP, or SSE2 instructions. If coding these routines, use
the fisttp instruction if SSE3 is available, or the cvttss2si and cvttsd2si
instructions if coding with Streaming SIMD Extensions 2.
Many libraries do more work than is necessary. The FISTTP instruction
in SSE3 can convert floating-point values to 16-bit, 32-bit or 64-bit
integers using truncation without accessing the floating-point control
word (FCW). The cvttss2si/cvttsd2si instructions save many µops
and some store-forwarding delays compared with some compiler
implementations, because they avoid changing the rounding mode.
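As a rough illustration of the SSE2 path, the sketch below truncates floating-point values through compiler intrinsics that map to cvttss2si and cvttsd2si. The helper names float_to_int_trunc and double_to_int_trunc are hypothetical, and the plain-cast fallback is an assumption for builds without SSE2. FISTTP is not shown because it has no standard C intrinsic.

```c
#include <assert.h>
#include <math.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics; also pulls in the SSE header */
#endif

/* Hypothetical helper: truncate a float toward zero without
   reprogramming the x87 floating-point control word. */
int float_to_int_trunc(float x)
{
    assert(isfinite(x));  /* NaN and infinity have no integer value */
#if defined(__SSE2__)
    return _mm_cvttss_si32(_mm_set_ss(x));  /* cvttss2si */
#else
    return (int)x;  /* assumption: portable fallback, also truncates */
#endif
}

/* Hypothetical helper: same idea for double precision. */
int double_to_int_trunc(double x)
{
    assert(isfinite(x));
#if defined(__SSE2__)
    return _mm_cvttsd_si32(_mm_set_sd(x));  /* cvttsd2si */
#else
    return (int)x;
#endif
}
```

Both helpers truncate toward zero, so float_to_int_trunc(-2.9f) yields -2 rather than -3. Values outside the range of int are not handled in this sketch.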
User/Source Coding Rule 14. (M impact, ML generality) Break dependence
chains where possible.
Removing data dependences enables the out-of-order engine to extract
more ILP from the code. When summing up the elements of an array,
use partial sums instead of a single accumulator. For example, to
calculate z = a + b + c + d, instead of:
x = a + b;
y = x + c;
z = y + d;
use:
x = a + b;
y = c + d;
z = x + y;
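The same idea applied to the array case mentioned above can be sketched as follows. The function names sum_serial and sum_partial and the choice of four accumulators are assumptions made for illustration; the best number of partial sums depends on the processor and is not stated in the text.

```c
#include <assert.h>
#include <stddef.h>

/* Single accumulator: each addition depends on the previous one,
   so the additions form one long dependence chain. */
double sum_serial(const double *a, size_t n)
{
    assert(a != NULL || n == 0);
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Four partial sums (an assumed count): the four chains are
   independent of each other and can execute in parallel. */
double sum_partial(const double *a, size_t n)
{
    assert(a != NULL || n == 0);
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    for (; i < n; i++)  /* leftover elements when n is not a multiple of 4 */
        s0 += a[i];
    return (s0 + s1) + (s2 + s3);
}
```

Because floating-point addition is not associative, the two functions can differ in the last bits of the result for general inputs; they agree exactly when every intermediate sum is representable.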
User/Source Coding Rule 15. (M impact, ML generality) Usually, math
libraries take advantage of the transcendental instructions (for example,
fsin) when evaluating elementary functions. If there is no critical need to
evaluate the transcendental functions using the extended precision of 80 bits,
applications should consider an alternate, software-based approach, such as a
look-up-table-based algorithm using interpolation techniques. It is possible to
improve transcendental performance with these techniques by choosing the
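
The look-up-table approach named in Rule 15 can be sketched as below: a minimal example, assuming a 17-entry table of precomputed sine values over [0, π/2] and linear interpolation between neighbouring entries. The name table_sin, the table size, and the restricted input range are all assumptions for illustration, not part of the manual.

```c
#include <assert.h>

#define SIN_TABLE_STEPS 16  /* assumed table resolution */

static const double HALF_PI = 1.5707963267948966;

/* sin(k * pi/32) for k = 0..16, rounded to 10 decimal places. */
static const double sin_table[SIN_TABLE_STEPS + 1] = {
    0.0000000000, 0.0980171403, 0.1950903220, 0.2902846773,
    0.3826834324, 0.4713967368, 0.5555702330, 0.6343932842,
    0.7071067812, 0.7730104534, 0.8314696123, 0.8819212643,
    0.9238795325, 0.9569403357, 0.9807852804, 0.9951847267,
    1.0000000000
};

/* Hypothetical helper: approximate sin(x) for x in [0, pi/2] by
   linear interpolation between the two nearest table entries. */
double table_sin(double x)
{
    assert(x >= 0.0 && x <= HALF_PI);
    double pos = x * (SIN_TABLE_STEPS / HALF_PI);  /* position in table units */
    int i = (int)pos;                              /* index of lower entry */
    if (i >= SIN_TABLE_STEPS)
        return sin_table[SIN_TABLE_STEPS];         /* x at the upper end */
    double frac = pos - i;                         /* fraction between entries */
    return sin_table[i] + frac * (sin_table[i + 1] - sin_table[i]);
}
```

With a table this coarse the absolute error is roughly 1e-3 at worst; a finer table or a higher-order interpolation trades memory or arithmetic for accuracy.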

Intel ARCHITECTURE IA-32 Specifications

General
Instruction Set: x86
Instruction Set Type: CISC
Memory Segmentation: Supported
Operating Modes: Real mode, Protected mode, Virtual 8086 mode
Max Physical Address Size: 36 bits (with PAE)
Max Virtual Address Size: 32 bits
Architecture: IA-32 (Intel Architecture 32-bit)
Addressable Memory: 4 GB (with Physical Address Extension up to 64 GB)
Floating Point Registers: 8 x 80-bit
MMX Registers: 8 x 64-bit
SSE Registers: 8 x 128-bit
Registers: General-purpose registers (EAX, EBX, ECX, EDX, ESI, EDI, ESP, EBP), Segment registers (CS, DS, SS, ES, FS, GS), Instruction pointer (EIP), Flags register (EFLAGS)
Floating Point Unit: Yes (x87)