Intel ARCHITECTURE IA-32 User Manual
568 pages
General Optimization Guidelines 2
2-71
Recommendation: Use the compiler switch that generates SSE2 scalar
floating-point code rather than x87 code.
When working with scalar SSE/SSE2 code, pay attention to the need for
clearing the content of unused slots in an xmm register and the
associated performance impact. For example, loading data from
memory with movss or movsd causes an extra micro-op for zeroing
the upper part of the xmm register.
On Pentium M, Intel Core Solo and Intel Core Duo processors, this
penalty can be avoided by using movlpd. However, using movlpd
causes a performance penalty on Pentium 4 processors.
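
The difference between the two load styles can be sketched in C with SSE2 intrinsics. This is only an illustration of the register contents, not a performance test: the function names are hypothetical, the intrinsics `_mm_load_sd` and `_mm_loadl_pd` are assumed to map to movsd and movlpd respectively, and a compiler is free to pick other instructions. A plain C fallback is used when SSE2 is not available.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define HAVE_SSE2 1
#else
#define HAVE_SSE2 0
#endif

/* movsd-style load (hypothetical helper): the low lane receives *src
 * and the upper lane is zeroed. */
void load_low_zero_upper(const double *src, double out[2])
{
    assert(src != NULL && out != NULL);
#if HAVE_SSE2
    __m128d v = _mm_load_sd(src);
    _mm_storeu_pd(out, v);
#else
    out[0] = *src;
    out[1] = 0.0;
#endif
}

/* movlpd-style load (hypothetical helper): the low lane receives *src
 * and the upper lane keeps whatever prev[1] held, so no zeroing is needed. */
void load_low_keep_upper(const double *src, const double prev[2], double out[2])
{
    assert(src != NULL && prev != NULL && out != NULL);
#if HAVE_SSE2
    __m128d v = _mm_loadu_pd(prev);
    v = _mm_loadl_pd(v, src);
    _mm_storeu_pd(out, v);
#else
    out[0] = *src;
    out[1] = prev[1];
#endif
}
```

Because the second form preserves the old upper half, it also makes the result depend on the previous register contents, which is one reason the trade-off differs between processor families.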
Another situation occurs when mixing single-precision and
double-precision code. On Pentium 4 processors, using cvtss2sd incurs
a performance penalty relative to the alternative sequence:
xorps xmm1, xmm1
movss xmm1, xmm2
cvtps2pd xmm1, xmm1
On Intel Core Solo and Intel Core Duo processors, using cvtss2sd is
preferable to the alternative sequence.
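
As a rough sketch, the two conversion paths can be written with SSE2 intrinsics to confirm that they produce the same value. The function names are hypothetical, and the mapping of each intrinsic to the instruction named in its comment is an assumption: an optimizing compiler may emit a different sequence. A plain cast is used as a fallback when SSE2 is not available.

```c
#include <assert.h>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define HAVE_SSE2 1
#else
#define HAVE_SSE2 0
#endif

/* Direct scalar conversion, in the style of cvtss2sd. */
double widen_direct(float x)
{
#if HAVE_SSE2
    __m128 s = _mm_set_ss(x);
    __m128d d = _mm_cvtss_sd(_mm_setzero_pd(), s);
    return _mm_cvtsd_f64(d);
#else
    return (double)x;
#endif
}

/* Alternative sequence: clear the register, move the scalar in,
 * then use the packed conversion and take the low lane. */
double widen_packed(float x)
{
#if HAVE_SSE2
    __m128 src = _mm_set_ss(x);
    __m128 tmp = _mm_setzero_ps();   /* xorps    xmm1, xmm1 */
    tmp = _mm_move_ss(tmp, src);     /* movss    xmm1, xmm2 */
    __m128d d = _mm_cvtps_pd(tmp);   /* cvtps2pd xmm1, xmm1 */
    return _mm_cvtsd_f64(d);
#else
    return (double)x;
#endif
}

/* Self-check that both paths agree; x must not be NaN,
 * since NaN never compares equal to itself. */
void check_widen(float x)
{
    assert(widen_direct(x) == widen_packed(x));
}
```

Both paths are numerically equivalent, so the choice between them is purely a matter of which one is cheaper on the target processor.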
Memory Operands
Double-precision floating-point operands that are eight-byte aligned
have better performance than operands that are not eight-byte aligned,
since they are less likely to incur penalties for cache and MOB splits.
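
A minimal sketch of the alignment check, in C: an eight-byte operand that starts on an eight-byte boundary can never straddle a power-of-two line boundary, while a misaligned one can. The helper names are hypothetical, and the line size is passed as a parameter because the text does not state one (64 bytes is used here only as an assumed, typical value).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if p is eight-byte aligned, 0 otherwise. */
int is_aligned8(const void *p)
{
    return ((uintptr_t)p & 7u) == 0;
}

/* Returns 1 if an eight-byte operand starting at p would cross a
 * boundary of line_size bytes. line_size must be a power of two
 * and at least 8. */
int crosses_line(const void *p, size_t line_size)
{
    assert(line_size >= 8 && (line_size & (line_size - 1)) == 0);
    uintptr_t offset = (uintptr_t)p & (line_size - 1);
    return offset + sizeof(double) > line_size;
}
```

For example, with a 64-byte line, an operand at offset 56 fits exactly in the line, whereas one at offset 60 spills four bytes into the next line.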
A floating-point operation on a memory operand requires that the operand
be loaded from memory. This incurs an additional µop, which can have
a minor negative impact on front-end bandwidth. Additionally, memory
operands may cause a data cache miss, which incurs a penalty.


Intel ARCHITECTURE IA-32 Specifications

General
Instruction Set: x86
Instruction Set Type: CISC
Memory Segmentation: Supported
Operating Modes: Real mode, Protected mode, Virtual 8086 mode
Max Physical Address Size: 36 bits (with PAE)
Max Virtual Address Size: 32 bits
Architecture: IA-32 (Intel Architecture 32-bit)
Addressable Memory: 4 GB (with Physical Address Extension up to 64 GB)
Floating Point Registers: 8 x 80-bit
MMX Registers: 8 x 64-bit
SSE Registers: 8 x 128-bit
Registers: General-purpose registers (EAX, EBX, ECX, EDX, ESI, EDI, ESP, EBP), Segment registers (CS, DS, SS, ES, FS, GS), Instruction pointer (EIP), Flags register (EFLAGS)
Floating Point Unit: Yes (x87)