
Intel ARCHITECTURE IA-32 User Manual

IA-32 Intel® Architecture Optimization
In Example 3-11, fvec.h is the class definition file and F32vec4 is the
class representing an array of four floats. The “+” and “=” operators are
overloaded so that the Streaming SIMD Extensions implementation shown in
the previous example is abstracted out, or hidden, from the developer.
The result closely resembles the original code, which allows for simpler
and faster programming.
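To illustrate how operator overloading can hide the underlying instructions, here is a minimal sketch of a wrapper class. It is not the contents of fvec.h: the class name Vec4, its load and store helpers, and the function add4 are hypothetical names chosen for this illustration. The sketch assumes an x86 compiler that provides the SSE intrinsics header xmmintrin.h.

```cpp
#include <xmmintrin.h>  // SSE intrinsics: __m128, _mm_load_ps, _mm_add_ps, _mm_store_ps

// Hypothetical stand-in for a class like F32vec4: four packed floats
// with an overloaded "+". This is an illustration, not Intel's class.
class Vec4 {
public:
    Vec4() : v_(_mm_setzero_ps()) {}
    explicit Vec4(__m128 v) : v_(v) {}

    // Load four floats from a 16-byte-aligned address.
    static Vec4 load(const float *p) { return Vec4(_mm_load_ps(p)); }

    // Store four floats to a 16-byte-aligned address.
    void store(float *p) const { _mm_store_ps(p, v_); }

    // The intrinsic call is hidden behind the operator.
    friend Vec4 operator+(const Vec4 &x, const Vec4 &y) {
        return Vec4(_mm_add_ps(x.v_, y.v_));
    }

private:
    __m128 v_;
};

// Adds four floats element by element.
// Assumption: a, b and c all point to 16-byte-aligned storage.
void add4(const float *a, const float *b, float *c) {
    (Vec4::load(a) + Vec4::load(b)).store(c);
}
```

The caller of add4 never sees an intrinsic, which is the same effect the overloaded operators of the vector classes are described as providing.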
Again, the example assumes that the arrays passed to the routine are
already aligned to a 16-byte boundary.
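One way a caller might satisfy the alignment assumption is sketched below. It relies on the alignas specifier from C++11 and later, which is an assumption about the toolchain and is not something the manual prescribes. The helper is_aligned16 and the array names are hypothetical.

```cpp
#include <cstdint>

// Returns true if p sits on a 16-byte boundary.
bool is_aligned16(const void *p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}

// alignas(16) asks the compiler to place each array on a 16-byte boundary,
// so the arrays can be passed to a routine that requires that alignment.
alignas(16) float in_a[4] = {1.0f, 2.0f, 3.0f, 4.0f};
alignas(16) float in_b[4] = {5.0f, 6.0f, 7.0f, 8.0f};
alignas(16) float out_c[4];
```

A check such as is_aligned16(in_a) can be placed in a debug assertion before calling a routine that assumes aligned arguments.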
Automatic Vectorization
The Intel C++ Compiler provides an optimization mechanism by which
loops, such as the one in Example 3-8, can be automatically vectorized,
or converted into Streaming SIMD Extensions code. The compiler uses
techniques similar to those used by a programmer to identify whether a
loop is suitable for conversion to SIMD. This involves determining
whether the following might prevent vectorization:
• the layout of the loop and the data structures used
• dependencies amongst the data accesses in each iteration and across
  iterations
Once the compiler has made such a determination, it can generate
vectorized code for the loop, allowing the application to use the SIMD
instructions.
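The dependency criterion can be illustrated with two loops. This is a sketch for illustration only: the function names are hypothetical, and whether a given compiler actually vectorizes either loop depends on the compiler and its settings.

```cpp
#include <cstddef>

// Each iteration reads and writes only element i, so no iteration depends
// on another. Assumption: c does not overlap a or b; overlapping arrays
// would introduce a dependence across iterations.
void add_arrays(const float *a, const float *b, float *c, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        c[i] = a[i] + b[i];
    }
}

// Iteration i reads the value written by iteration i - 1. This
// loop-carried dependence means the iterations cannot simply be
// executed four at a time.
void running_sum(float *a, std::size_t n) {
    for (std::size_t i = 1; i < n; ++i) {
        a[i] = a[i] + a[i - 1];
    }
}
```

The first loop has the shape of a typical vectorization candidate, while the second shows the kind of cross-iteration dependency that can prevent it.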
Example 3-11 C++ Code Using the Vector Classes
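    #include <fvec.h>

    // a, b and c must already be aligned to a 16-byte boundary.
    void add(float *a, float *b, float *c)
    {
        // View each float array as one vector of four packed floats.
        F32vec4 *av = (F32vec4 *) a;
        F32vec4 *bv = (F32vec4 *) b;
        F32vec4 *cv = (F32vec4 *) c;

        // The overloaded "+" adds all four elements; "=" stores the result.
        *cv = *av + *bv;
    }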


Intel ARCHITECTURE IA-32 Specifications

General
Instruction Set: x86
Instruction Set Type: CISC
Memory Segmentation: Supported
Operating Modes: Real mode, Protected mode, Virtual 8086 mode
Max Physical Address Size: 36 bits (with PAE)
Max Virtual Address Size: 32 bits
Architecture: IA-32 (Intel Architecture 32-bit)
Addressable Memory: 4 GB (up to 64 GB with Physical Address Extension)
Floating Point Registers: 8 x 80-bit
MMX Registers: 8 x 64-bit
SSE Registers: 8 x 128-bit
Registers: General-purpose registers (EAX, EBX, ECX, EDX, ESI, EDI, ESP, EBP), Segment registers (CS, DS, SS, ES, FS, GS), Instruction pointer (EIP), Flags register (EFLAGS)
Floating Point Unit: Yes (x87)
