Intel ARCHITECTURE IA-32 User Manual

Optimizing for SIMD Integer Applications 4
4-37
These transformations, in general, increase the number of instructions
required to perform the desired operation. For Pentium II, Pentium III,
and Pentium 4 processors, the benefit of avoiding forwarding problems
outweighs the performance penalty due to the increased number of
instructions, making the transformations worthwhile.
Supplemental Techniques for Avoiding Cache Line Splits
Video processing applications sometimes cannot avoid loading data from
memory addresses that are not aligned to a 16-byte boundary. An
example of this situation is when each line in a video frame is averaged
by shifting horizontally half a pixel. Example 4-28 shows a common
operation in video processing that loads data from memory addresses not
aligned to a 16-byte boundary. As video processing traverses each line in
the video frame, it will experience at least one cache line split for each
64 bytes loaded from memory.
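The split count mentioned above can be checked with a small address calculation. The following is a minimal sketch in C, not code from the manual: the function names `crosses_cache_line` and `count_splits` are hypothetical, and the 64-byte cache line and 16-byte load size are assumptions taken from the paragraph above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if a load of `size` bytes starting at address `addr`
   straddles a cache line boundary, assuming `line` bytes per line. */
int crosses_cache_line(uintptr_t addr, size_t size, size_t line)
{
    assert(size > 0 && size <= line);
    return (addr % line) + size > line;
}

/* Counts how many 16-byte loads split a 64-byte cache line when a row
   of `row_bytes` bytes is read in 16-byte steps starting at `base`. */
size_t count_splits(uintptr_t base, size_t row_bytes)
{
    size_t splits = 0;
    for (size_t off = 0; off + 16 <= row_bytes; off += 16)
        splits += (size_t)crosses_cache_line(base + off, 16, 64);
    return splits;
}
```

With a 16-byte-aligned base no load splits a line; with a base that is off by one byte, one load in every four (one per 64 bytes) straddles a boundary.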
Example 4-28 An Example of Video Processing with Cache Line Splits
// Average half-pels horizontally (on the “x” axis),
// from one reference frame only.
nextLinesLoop:
movdqu xmm0, XMMWORD PTR [edx] // may not be 16B aligned
movdqu xmm1, XMMWORD PTR [edx+1]
movdqu xmm2, XMMWORD PTR [edx+eax]
movdqu xmm3, XMMWORD PTR [edx+eax+1]
pavgb xmm0, xmm1 // average line 0 with itself shifted by one byte
pavgb xmm2, xmm3 // same for the next line (stride in eax)
movdqa XMMWORD PTR [ecx], xmm0 // destination must be 16B aligned
movdqa XMMWORD PTR [ecx+eax], xmm2
// (repeat ...)
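For readers who want to check results against the SIMD loop, the per-byte operation can be written as a portable scalar reference. This is a sketch under stated assumptions, not code from the manual: the names `avg_round_u8` and `halfpel_avg_row` are hypothetical, and it assumes the intent is to average each source byte with its right-hand neighbor using the rounding that PAVGB applies, (a + b + 1) >> 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rounded average of two unsigned bytes, matching the per-byte
   result of PAVGB: (a + b + 1) >> 1, computed without overflow. */
uint8_t avg_round_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(((unsigned)a + (unsigned)b + 1u) >> 1);
}

/* Horizontal half-pel average of one row:
   dst[i] = avg(src[i], src[i + 1]) for i in [0, n).
   Note that this reads n + 1 bytes from src. */
void halfpel_avg_row(uint8_t *dst, const uint8_t *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; ++i)
        dst[i] = avg_round_u8(src[i], src[i + 1]);
}
```

The read at `src[i + 1]` is the scalar counterpart of the load from `[edx+1]`: whatever the alignment of the row start, one of the two source addresses is necessarily misaligned.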


Intel ARCHITECTURE IA-32 Specifications

General
Instruction Set: x86
Instruction Set Type: CISC
Memory Segmentation: Supported
Operating Modes: Real mode, Protected mode, Virtual 8086 mode
Max Physical Address Size: 36 bits (with PAE)
Max Virtual Address Size: 32 bits
Architecture: IA-32 (Intel Architecture 32-bit)
Addressable Memory: 4 GB (with Physical Address Extension up to 64 GB)
Floating Point Registers: 8 x 80-bit
MMX Registers: 8 x 64-bit
SSE Registers: 8 x 128-bit
Registers: General-purpose registers (EAX, EBX, ECX, EDX, ESI, EDI, ESP, EBP), Segment registers (CS, DS, SS, ES, FS, GS), Instruction pointer (EIP), Flags register (EFLAGS)
Floating Point Unit: Yes (x87)
