Intel ARCHITECTURE IA-32 - Page 565

Intel ARCHITECTURE IA-32
568 pages
Index
Index-5
O
optimizing cache utilization
  cache management, 6-44
  examples, 6-15
  non-temporal store instructions, 6-10
  prefetch and load, 6-9
  prefetch instructions, 6-8
  prefetching, 6-7
  SFENCE instruction, 6-15, 6-16
  streaming, non-temporal stores, 6-10
optimizing floating-point applications
  copying, shuffling, 5-17
  data arrangement, 5-4
  data deswizzling, 5-14
  data swizzling using intrinsics, 5-12
  horizontal ADD, 5-18
  planning considerations, 5-2
  rules and suggestions, 5-1
  scalar code, 5-3
  vertical versus horizontal computation, 5-5
optimizing floating-point code, 2-58
P
pack instruction, 4-10
pack instructions, 4-8
packed average (byte or word), 4-31
packed multiply high unsigned, 4-30
packed shuffle word, 4-18
packed signed integer word maximum, 4-29
packed sum of absolute differences, 4-30
parallelism, 3-12, E-7
parameter alignment, D-4
partial memory accesses, 4-35
PAVGB instruction, 4-31
PAVGW instruction, 4-31
Pentium Processor Extreme Edition, 1-39
Performance and Usage Models
  Multithreading, 7-2
  Performance and Usage Models, 7-2
Performance Library Suite, A-14
  optimizations, A-16
PEXTRW instruction, 4-13
PGO. See profile-guided optimization
PINSRW instruction, 4-14
PMINSW instruction, 4-29
PMINUB instruction, 4-30
PMOVMSKB instruction, 4-16
PMULHUW instruction, 4-30
predictable memory access patterns, 6-7
prefetch and cacheability instructions, 6-4
prefetch and load instructions, 6-8
prefetch concatenation, 6-26, 6-28
prefetch instruction, 6-1
prefetch instruction considerations, 6-24
  cache blocking techniques, 6-34
  concatenation, 6-26
  minimizing prefetches number, 6-29
  no preloading or prefetch, E-6
  prefetch scheduling distance, E-5
  scheduling distance, 6-25
  single-pass execution, 6-3, 6-41
  spread prefetch with computation instructions, 6-32
  strip-mining, 6-37
prefetch instructions, 6-7
prefetch scheduling distance, 6-25, E-5, E-7, E-10
prefetch use
  predictable memory access patterns, 6-7
  time-consuming innermost loops, 6-7
prefetching concept, 6-6
prefetchnta instruction, 6-36
profile-guided optimization, A-7
prolog sequences, 2-90
PSADBW instruction, 4-30
PSHUF instruction, 4-18
P-states, 9-1