Intel ARCHITECTURE IA-32 User Manual

IA-32 Intel® Architecture Optimization
Note that this can be applied to both SIMD integer and SIMD
floating-point code.
If there are multiple consumers of an instance of a register, group the
consumers together as closely as possible. However, the consumers
should not be scheduled near the producer.
SIMD Optimizations and Microarchitectures
Pentium M, Intel Core Solo and Intel Core Duo processors have a
different microarchitecture than the Intel NetBurst® microarchitecture.
The following sub-section discusses optimizing SIMD code targeting Intel
Core Solo and Intel Core Duo processors.
The register-register variants of the following instructions have improved
performance on Intel Core Solo and Intel Core Duo processors relative to
Pentium M processors, because on the newer processors these instructions
consist of two micro-ops instead of three. The relevant instructions are:
unpcklps, unpckhps, packsswb, packuswb, packssdw, pshufd, shuffps and
shuffpd.
top_of_loop:
    movq    mm0, [A + eax]
    pcmpgtw mm0, [B + eax]     ; Create compare mask (all ones where A > B)
    movq    mm1, [D + eax]
    pand    mm1, mm0           ; Drop elements where A <= B
    pandn   mm0, [E + eax]     ; Drop elements where A > B
    por     mm0, mm1           ; Create single word
    movq    [C + eax], mm0
    add     eax, 8
    cmp     eax, MAX_ELEMENT*2
    jle     top_of_loop
Example 3-21 Emulation of Conditional Moves (continued)
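
The mask-and-merge sequence in the listing above can be modeled one 16-bit element at a time in portable C. The following is a minimal sketch for illustration only, not code from the manual: the function names cmpgt_mask16, select16 and conditional_select_words are hypothetical, and the D, E and C arrays are assumed to be treated as raw 16-bit patterns. It shows how a compare mask replaces the branch: the mask keeps D where A > B, its complement keeps E elsewhere, and the two are ORed together.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of pcmpgtw for one 16-bit lane (hypothetical helper):
   all ones if a > b as signed values, otherwise all zeros. */
static uint16_t cmpgt_mask16(int16_t a, int16_t b)
{
    return (a > b) ? (uint16_t)0xFFFFu : (uint16_t)0x0000u;
}

/* Branch-free merge for one lane (hypothetical helper):
   yields d where mask is all ones and e where mask is all zeros. */
static uint16_t select16(uint16_t mask, uint16_t d, uint16_t e)
{
    uint16_t keep_d = (uint16_t)(d & mask);             /* models pand  */
    uint16_t keep_e = (uint16_t)(e & (uint16_t)~mask);  /* models pandn */
    return (uint16_t)(keep_d | keep_e);                 /* models por   */
}

/* Element-wise equivalent of: c[i] = (a[i] > b[i]) ? d[i] : e[i],
   written without a data-dependent branch in the merge step. */
void conditional_select_words(uint16_t *c, const int16_t *a, const int16_t *b,
                              const uint16_t *d, const uint16_t *e, size_t n)
{
    assert(c != NULL && a != NULL && b != NULL && d != NULL && e != NULL);
    for (size_t i = 0; i < n; i++) {
        uint16_t mask = cmpgt_mask16(a[i], b[i]);
        c[i] = select16(mask, d[i], e[i]);
    }
}
```

Calling conditional_select_words(c, a, b, d, e, n) fills c with the element of d wherever a[i] > b[i] and with the element of e otherwise. The MMX listing performs the same three logical steps on four words at once, which is where the speedup over a branching scalar loop would come from.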
