Intel ARCHITECTURE IA-32 User Manual
568 pages
General Optimization Guidelines 2
FPU control word (FCW), such as when performing conversions to
integers. On Pentium M, Intel Core Solo and Intel Core Duo processors,
FLDCW is improved over previous generations.
Specifically, the optimization for FLDCW allows programmers to
alternate between two constant values efficiently. For the FLDCW
optimization to be effective, the two constant FCW values may differ
only in the following 5 bits of the FCW:
FCW[8-9] precision control
FCW[10-11] rounding control
FCW[12] infinity control
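The bit restriction above can be checked ahead of time for any pair of constants. The following is a minimal sketch, not taken from the manual: the macro names and the function fcw_pair_differs_only_in_fast_bits are hypothetical, and the masks are derived only from the bit positions listed above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits the text lists as free to differ between the two FCW constants. */
#define FCW_PC_MASK   0x0300u  /* FCW[8-9]   precision control */
#define FCW_RC_MASK   0x0C00u  /* FCW[10-11] rounding control  */
#define FCW_IC_MASK   0x1000u  /* FCW[12]    infinity control  */
#define FCW_FAST_MASK (FCW_PC_MASK | FCW_RC_MASK | FCW_IC_MASK)

/* Hypothetical helper: returns true when a and b differ only in the
   five bits listed above, that is, when every other bit is identical. */
bool fcw_pair_differs_only_in_fast_bits(uint16_t a, uint16_t b)
{
    return ((a ^ b) & ~FCW_FAST_MASK) == 0;
}
```

For example, 0x037F (the x87 default after FINIT) and 0x0F7F (the same value with rounding control set to truncate) differ only in FCW[10-11], so the pair passes the check. Changing an exception mask bit, as in 0x037F versus 0x037E, does not.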
If programmers need to modify other bits (for example, the mask bits)
in the FCW, the FLDCW instruction is still an expensive operation.
In situations where an application cycles between three (or more)
constant values, the FLDCW optimization does not apply and the
performance degradation occurs for each FLDCW instruction.
One solution to this problem is to choose two constant FCW values,
take advantage of the optimization of the FLDCW instruction to
alternate between only these two constant FCW values, and devise some
means to accomplish the task that requires the third FCW value without
actually changing the FCW to a third constant value. An alternative
solution is to structure the code so that, for periods of time, the
application alternates between only two constant FCW values. When the
application later alternates between a different pair of FCW values,
the performance degradation occurs only during the transition.
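When restructuring code into such phases, it can help to audit the FCW values each phase is expected to load. The sketch below is an illustration only: fcw_distinct_capped is a hypothetical helper, and it inspects a planned list of constants rather than anything the processor reports.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical audit helper: returns the number of distinct FCW values
   in seq[0..n-1], capped at 3. A result of 1 or 2 means the phase
   stays within two constants; 3 means it uses three or more. */
int fcw_distinct_capped(const uint16_t *seq, size_t n)
{
    uint16_t seen[2];
    int count = 0;

    for (size_t i = 0; i < n; i++) {
        int known = 0;
        for (int j = 0; j < count; j++) {
            if (seen[j] == seq[i])
                known = 1;
        }
        if (known)
            continue;
        if (count == 2)
            return 3;          /* a third distinct value appeared */
        seen[count++] = seq[i];
    }
    return count;
}
```

A phase that loads 0x037F, 0x0F7F, 0x037F yields 2. Adding a third constant such as 0x077F to the same phase yields 3, which suggests moving that work into a separate phase.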
It is expected that SIMD applications are unlikely to alternate FTZ and
DAZ mode values. Consequently, the SIMD control word does not have
the short latencies that the floating-point control register does. A
read of the MXCSR register has a fairly long latency, and a write to
the register is a serializing instruction.
There is no separate control word for single and double precision; both
use the same modes. Notably, this applies to both FTZ and DAZ modes.
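Because MXCSR reads are slow and writes serialize, one reasonable pattern is to compute the desired control value once and write it once, for example at thread start, rather than toggling FTZ or DAZ around individual operations. The sketch below only builds the value; the macro and function names are hypothetical, and the bit positions used (FTZ is MXCSR bit 15, DAZ is bit 6) are the architectural ones.

```c
#include <stdint.h>

#define MXCSR_DAZ 0x0040u  /* bit 6:  denormals-are-zero */
#define MXCSR_FTZ 0x8000u  /* bit 15: flush-to-zero      */

/* Hypothetical helper: returns csr with both FTZ and DAZ enabled and
   every other bit left unchanged. The same modes then apply to both
   single- and double-precision SIMD operations. */
uint32_t mxcsr_with_ftz_daz(uint32_t csr)
{
    return csr | MXCSR_FTZ | MXCSR_DAZ;
}
```

On compilers that provide `<xmmintrin.h>`, the value could be applied once with `_mm_setcsr(mxcsr_with_ftz_daz(_mm_getcsr()))`. Starting from the power-on default of 0x1F80, the helper returns 0x9FC0.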

Intel ARCHITECTURE IA-32 Specifications

General
Instruction Set: x86
Instruction Set Type: CISC
Memory Segmentation: Supported
Operating Modes: Real mode, Protected mode, Virtual 8086 mode
Max Physical Address Size: 36 bits (with PAE)
Max Virtual Address Size: 32 bits
Architecture: IA-32 (Intel Architecture 32-bit)
Addressable Memory: 4 GB (with Physical Address Extension up to 64 GB)
Floating Point Registers: 8 x 80-bit
MMX Registers: 8 x 64-bit
SSE Registers: 8 x 128-bit
Registers: General-purpose registers (EAX, EBX, ECX, EDX, ESI, EDI, ESP, EBP), Segment registers (CS, DS, SS, ES, FS, GS), Instruction pointer (EIP), Flags register (EFLAGS)
Floating Point Unit: Yes (x87)