EasyManuals Logo

Intel ARCHITECTURE IA-32 User Manual

Intel ARCHITECTURE IA-32
568 pages
To Next Page IconTo Next Page
To Next Page IconTo Next Page
To Previous Page IconTo Previous Page
To Previous Page IconTo Previous Page
Page #9 background imageLoading...
Page #9 background image
ix
Data Alignment........................................................................................................................ 5-4
Data Arrangement............................................................................................................ 5-4
Vertical versus Horizontal Computation...................................................................... 5-5
Data Swizzling............................................................................................................ 5-9
Data Deswizzling...................................................................................................... 5-14
Using MMX Technology Code for Copy or Shuffling Functions................................ 5-17
Horizontal ADD Using SSE....................................................................................... 5-18
Use of cvttps2pi/cvttss2si Instructions .................................................................................. 5-21
Flush-to-Zero and Denormals-are-Zero Modes .................................................................... 5-22
SIMD Floating-point Programming Using SSE3 ................................................................... 5-22
SSE3 and Complex Arithmetics ..................................................................................... 5-23
SSE3 and Horizontal Computation................................................................................. 5-26
SIMD Optimizations and Microarchitectures .................................................................. 5-27
Packed Floating-Point Performance......................................................................... 5-27
Chapter 6 Optimizing Cache Usage
General Prefetch Coding Guidelines....................................................................................... 6-2
Hardware Prefetching of Data................................................................................................. 6-4
Prefetch and Cacheability Instructions.................................................................................... 6-5
Prefetch................................................................................................................................... 6-6
Software Data Prefetch .................................................................................................... 6-6
The Prefetch Instructions – Pentium 4 Processor Implementation................................... 6-8
Prefetch and Load Instructions......................................................................................... 6-8
Cacheability Control................................................................................................................ 6-9
The Non-temporal Store Instructions.............................................................................. 6-10
Fencing..................................................................................................................... 6-10
Streaming Non-temporal Stores ............................................................................... 6-10
Memory Type and Non-temporal Stores................................................................... 6-11
Write-Combining....................................................................................................... 6-12
Streaming Store Usage Models...................................................................................... 6-13
Coherent Requests................................................................................................... 6-13
Non-coherent requests ............................................................................................. 6-13
Streaming Store Instruction Descriptions ....................................................................... 6-14
The fence Instructions.................................................................................................... 6-15
The sfence Instruction .............................................................................................. 6-15
The lfence Instruction ............................................................................................... 6-16
The mfence Instruction............................................................................................. 6-16
The clflush Instruction .................................................................................................... 6-17
Memory Optimization Using Prefetch.................................................................................... 6-18
Software-controlled Prefetch .......................................................................................... 6-18

Table of Contents

Questions and Answers:

Question and Answer IconNeed help?

Do you have a question about the Intel ARCHITECTURE IA-32 and is the answer not in the manual?

Intel ARCHITECTURE IA-32 Specifications

General IconGeneral
Instruction Setx86
Instruction Set TypeCISC
Memory SegmentationSupported
Operating ModesReal mode, Protected mode, Virtual 8086 mode
Max Physical Address Size36 bits (with PAE)
Max Virtual Address Size32 bits
ArchitectureIA-32 (Intel Architecture 32-bit)
Addressable Memory4 GB (with Physical Address Extension up to 64 GB)
Floating Point Registers8 x 80-bit
MMX Registers8 x 64-bit
SSE Registers8 x 128-bit
RegistersGeneral-purpose registers (EAX, EBX, ECX, EDX, ESI, EDI, ESP, EBP), Segment registers (CS, DS, SS, ES, FS, GS), Instruction pointer (EIP), Flags register (EFLAGS)
Floating Point UnitYes (x87)

Related product manuals