Intel ARCHITECTURE IA-32 - Using SIMD Floating-point with x87 Floating-point; Scalar Floating-point Code

Optimizing for SIMD Floating-point Applications 5

• Is the data arranged for efficient utilization of the SIMD floating-point registers?

• Is this application targeted for processors without SIMD floating-point instructions?

For more details, see the section on “Considerations for Code Conversion to SIMD Programming” in Chapter 3.

Using SIMD Floating-point with x87 Floating-point

Because the XMM registers used for SIMD floating-point computations are separate registers and are not mapped onto the existing x87 floating-point stack, SIMD floating-point code can be mixed with either x87 floating-point or 64-bit SIMD integer code.
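The paragraph above can be illustrated with a minimal sketch in C. It assumes an x86 compiler with SSE enabled and the standard `<xmmintrin.h>` intrinsics. It also assumes that `long double` arithmetic is compiled to x87 instructions, which is typical for GCC and Clang on x86 but is not guaranteed by the C language. The function name `dot4_mixed` is hypothetical.

```c
#include <assert.h>
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_mul_ps, ... */

/* Hypothetical helper: returns a[0]*b[0] + ... + a[3]*b[3].
   The four multiplies use one packed SSE multiply in an XMM register.
   The accumulation uses long double, which is ASSUMED to be compiled
   to x87 code (typical for GCC/Clang on x86, not guaranteed). */
long double dot4_mixed(const float a[4], const float b[4])
{
    float prod[4];
    long double acc = 0.0L;  /* x87 side, under the assumption above */
    int i;

    assert(a && b);

    /* SIMD side: four single-precision products at once. */
    _mm_storeu_ps(prod, _mm_mul_ps(_mm_loadu_ps(a), _mm_loadu_ps(b)));

    /* No state-switch instruction is issued between the SIMD multiply
       and the accumulation loop. */
    for (i = 0; i < 4; ++i)
        acc += prod[i];

    return acc;
}
```

Calling `dot4_mixed` with `{1, 2, 3, 4}` and `{5, 6, 7, 8}` returns 70. Whether the accumulation really executes on the x87 unit depends on the compiler and target, so inspect the generated assembly before relying on it.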

Scalar Floating-point Code

Some SIMD floating-point instructions operate only on the least-significant operand (the lowest data element) in the SIMD register. These instructions are known as scalar instructions. They allow the XMM registers to be used for general-purpose floating-point computations.
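The difference between a scalar and a packed instruction can be sketched as follows. This is an illustration only, assuming an x86 compiler with SSE enabled. The function names `add_scalar` and `add_packed` are hypothetical. The intrinsics `_mm_add_ss` and `_mm_add_ps` typically compile to the ADDSS and ADDPS instructions.

```c
#include <assert.h>
#include <xmmintrin.h>  /* SSE intrinsics; assumes SSE is enabled */

/* Scalar add: only element 0 (the least-significant element) is summed.
   Elements 1..3 of the result are passed through unchanged from a. */
void add_scalar(const float a[4], const float b[4], float out[4])
{
    assert(a && b && out);
    _mm_storeu_ps(out, _mm_add_ss(_mm_loadu_ps(a), _mm_loadu_ps(b)));
}

/* Packed add, for comparison: all four elements are summed. */
void add_packed(const float a[4], const float b[4], float out[4])
{
    assert(a && b && out);
    _mm_storeu_ps(out, _mm_add_ps(_mm_loadu_ps(a), _mm_loadu_ps(b)));
}
```

With `a = {1, 2, 3, 4}` and `b = {10, 20, 30, 40}`, `add_scalar` produces `{11, 2, 3, 4}` and `add_packed` produces `{11, 22, 33, 44}`.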

In terms of performance, scalar floating-point code can match or exceed x87 floating-point code, and it has the following advantages:

• SIMD floating-point code uses a flat register model, whereas x87 floating-point code uses a stack model. Using scalar floating-point code eliminates the need for fxch instructions, which have some performance limits on the Intel Pentium 4 processor.

• It can be mixed with MMX technology code without penalty.

• It supports flush-to-zero mode.

• It has shorter latencies than x87 floating-point code.
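The flush-to-zero mode listed above can be sketched in C as follows. This is a minimal illustration, assuming an x86 compiler with SSE enabled and the `_MM_SET_FLUSH_ZERO_MODE` and `_MM_GET_FLUSH_ZERO_MODE` macros from `<xmmintrin.h>`. The helper name `scale_with_ftz` is hypothetical. It performs one scalar multiply with the mode switched on or off and then restores the caller's MXCSR setting.

```c
#include <assert.h>
#include <float.h>      /* FLT_MIN, referenced in the usage note */
#include <xmmintrin.h>  /* _mm_getcsr, _mm_setcsr, _MM_SET_FLUSH_ZERO_MODE */

/* Hypothetical helper: computes x * factor with a scalar SSE multiply,
   with flush-to-zero enabled (ftz_on != 0) or disabled (ftz_on == 0)
   for the duration of the call. */
float scale_with_ftz(float x, float factor, int ftz_on)
{
    unsigned int saved = _mm_getcsr();  /* save the caller's MXCSR */
    unsigned int mode = ftz_on ? _MM_FLUSH_ZERO_ON : _MM_FLUSH_ZERO_OFF;

    _MM_SET_FLUSH_ZERO_MODE(mode);
    assert(_MM_GET_FLUSH_ZERO_MODE() == mode);

    /* volatile keeps the compiler from folding the multiply at
       compile time, so it executes under the selected mode. */
    volatile float vx = x;
    volatile float vf = factor;
    volatile float result =
        _mm_cvtss_f32(_mm_mul_ss(_mm_set_ss(vx), _mm_set_ss(vf)));

    _mm_setcsr(saved);                  /* restore the caller's MXCSR */
    return result;
}
```

For example, `scale_with_ftz(FLT_MIN, 0.3f, 1)` underflows and is flushed to zero, while `scale_with_ftz(FLT_MIN, 0.3f, 0)` returns a small nonzero denormal value. Results that do not underflow are unaffected by the mode.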
