Intel ARCHITECTURE IA-32 User Manual
IA-32 Intel® Architecture Optimization
Guidelines for Optimizing Floating-point Code
User/Source Coding Rule 10. (M impact, M generality) Enable the
compiler's use of SSE, SSE2 or SSE3 instructions with appropriate switches.
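As a minimal sketch of the kind of code Rule 10 targets, the loop below is a plain C scaled-add that a compiler may map onto SSE/SSE2 instructions once the relevant switch is enabled. The function name scaled_add is hypothetical and not from the manual. With GCC, one commonly used combination is -O2 -msse2 -mfpmath=sse; the exact switches differ between compilers and versions, so treat this as an assumption to verify against your compiler's documentation.

```c
#include <stddef.h>

/* Hypothetical example: y[i] += a * x[i] for n single-precision elements.
 * `restrict` promises the compiler that x and y do not overlap, which
 * removes one common obstacle to generating packed SSE code. */
void scaled_add(float a, const float *restrict x, float *restrict y, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        y[i] += a * x[i];
    }
}
```

Inspecting the assembly output for this function with and without the switch is a quick way to confirm whether the compiler is actually emitting SSE instructions.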
Follow this procedure to investigate the performance of your
floating-point application:
• Understand how the compiler handles floating-point code.
• Look at the assembly dump and see what transforms are already
  performed on the program.
• Study the loop nests in the application that dominate the execution
  time.
• Determine why the compiler is not creating the fastest code.
• See if there is a dependence that can be resolved.
• Determine the problem area: bus bandwidth, cache locality, trace
  cache bandwidth or instruction latency. Focus on optimizing the
  problem area. For example, adding prefetch instructions will not
  help if the bus is already saturated. If trace cache bandwidth is the
  problem, added prefetch µops may degrade performance.
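The step about resolving a dependence can be illustrated with a floating-point reduction. The sketch below is not from the manual, and the function names sum_serial and sum_split are hypothetical. The first version carries one dependence chain through every add; the second splits the sum across four independent accumulators so consecutive adds need not wait on each other. Note that splitting changes the order of floating-point additions, so results can differ in the last bits for general data.

```c
#include <stddef.h>

/* Single accumulator: every add depends on the result of the previous one. */
float sum_serial(const float *v, size_t n)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        s += v[i];
    }
    return s;
}

/* Four accumulators: the four adds in each iteration are independent,
 * so add latency can overlap. The tail loop handles n not divisible by 4. */
float sum_split(const float *v, size_t n)
{
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += v[i];
        s1 += v[i + 1];
        s2 += v[i + 2];
        s3 += v[i + 3];
    }
    for (; i < n; ++i) {
        s0 += v[i];
    }
    return (s0 + s1) + (s2 + s3);
}
```

Whether the split version is faster depends on which resource is the actual bottleneck, which is why the procedure asks you to identify the problem area first.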
For floating-point coding, follow all the general coding
recommendations discussed in this chapter, including:
• blocking the cache
• using prefetch
• enabling vectorization
• unrolling loops
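As one illustration of the first item, cache blocking, here is a hedged sketch of a tiled matrix transpose. The function name transpose_blocked and the tile edge BLOCK are assumptions chosen for illustration; a suitable tile size depends on the cache sizes of the target processor and should be measured rather than taken from this example.

```c
#include <stddef.h>

#define BLOCK 32 /* assumed tile edge in elements; tune for the target cache */

/* Transpose an n x n row-major matrix from src into dst, one tile at a
 * time, so that the rows of src and dst touched by a tile stay cached
 * while the tile is processed. src and dst must not overlap. */
void transpose_blocked(const float *src, float *dst, size_t n)
{
    for (size_t ib = 0; ib < n; ib += BLOCK) {
        for (size_t jb = 0; jb < n; jb += BLOCK) {
            /* Clamp the tile at the matrix edge. */
            size_t imax = (ib + BLOCK < n) ? ib + BLOCK : n;
            size_t jmax = (jb + BLOCK < n) ? jb + BLOCK : n;
            for (size_t i = ib; i < imax; ++i) {
                for (size_t j = jb; j < jmax; ++j) {
                    dst[j * n + i] = src[i * n + j];
                }
            }
        }
    }
}
```

The inner two loops are ordinary; only the outer tiling changes the memory access pattern, which keeps the transformation easy to verify against an unblocked version.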
User/Source Coding Rule 11. (H impact, ML generality) Make sure your
application stays in range to avoid denormal values and underflows.
Out-of-range numbers cause very high overhead.
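One way to keep a repeatedly scaled value in range, sketched here in portable C, is to test each result for a denormal (subnormal) value and replace it with zero. The helper names flush_subnormal and decay_steps are hypothetical and not from the manual; hardware flush-to-zero modes are an alternative that this sketch does not show, and replacing tiny values with zero is only acceptable when the application can tolerate that loss of precision.

```c
#include <math.h>

/* Return 0.0f for subnormal inputs and x unchanged otherwise. */
float flush_subnormal(float x)
{
    return (fpclassify(x) == FP_SUBNORMAL) ? 0.0f : x;
}

/* Multiply state by decay for the given number of steps, flushing any
 * subnormal intermediate to zero so later steps never operate on
 * denormal values. */
float decay_steps(float state, float decay, int steps)
{
    for (int i = 0; i < steps; ++i) {
        state = flush_subnormal(state * decay);
    }
    return state;
}
```

A decaying filter state or envelope is a typical place where values drift toward the denormal range, so a check like this is usually applied at that point rather than throughout the program.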
User/Source Coding Rule 12. (M impact, ML generality) Do not use double
precision unless necessary. Set the precision control (PC) field in the x87 FPU
control word to “Single Precision”. This allows single precision (32-bit)
computation to complete faster on some operations (for example, divides due