
Intel ARCHITECTURE IA-32 - Figure 6-7 Examples of Prefetch and Strip-Mining for Temporally Adjacent and Non-Adjacent Passes Loops

Intel ARCHITECTURE IA-32
568 pages
IA-32 Intel® Architecture Optimization
Figure 6-7 shows how prefetch instructions and strip-mining can be
applied to increase performance in both of these scenarios.
For Pentium 4 processors, the left scenario shows a graphical
implementation of using prefetchnta to prefetch data into selected
ways of the second-level cache only (SM1 denotes strip-mining one way
of the second-level cache), minimizing second-level cache pollution.
Use prefetchnta if the data is touched only once during the entire
execution pass, in order to minimize cache pollution in the
higher-level caches. Provided the prefetch was issued far enough
ahead, the data is immediately available when the read access is
issued.
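The prefetchnta usage described above can be sketched in C as follows.
This is a minimal illustration, not code from the manual: the function
name two_pass_strip_mined, the two passes (sum and sum of squares), the
64-byte cache line, and the prefetch distance are all assumptions chosen
for the example. A real prefetch distance has to be tuned for the target
processor. The __builtin_prefetch fallback assumes a GCC or Clang
compiler on a non-x86 target.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__SSE__) || defined(_M_X64) || defined(_M_IX86)
#include <xmmintrin.h>
/* prefetchnta via the SSE intrinsic */
#define PREFETCH_NTA(p) _mm_prefetch((const char *)(p), _MM_HINT_NTA)
#else
/* Assumed fallback (GCC/Clang): read access, lowest temporal locality */
#define PREFETCH_NTA(p) __builtin_prefetch((p), 0, 0)
#endif

enum {
    CACHE_LINE_FLOATS = 16,                 /* assumes 64-byte cache lines */
    PREFETCH_AHEAD = 4 * CACHE_LINE_FLOATS  /* hypothetical prefetch distance */
};

/* Runs two passes over the data strip by strip, so that the second pass
 * reuses each strip while it is still cached (temporally adjacent passes).
 * out[0] receives the sum, out[1] the sum of squares. */
void two_pass_strip_mined(const float *data, size_t n, size_t strip,
                          float out[2])
{
    assert(strip > 0);
    float sum = 0.0f, sum_sq = 0.0f;

    for (size_t base = 0; base < n; base += strip) {
        size_t end = (n - base < strip) ? n : base + strip;

        /* Pass 1: first touch of the strip. Issue one non-temporal
         * prefetch per cache line, a fixed distance ahead of the read. */
        for (size_t i = base; i < end; ++i) {
            if (i % CACHE_LINE_FLOATS == 0 && i + PREFETCH_AHEAD < n)
                PREFETCH_NTA(&data[i + PREFETCH_AHEAD]);
            sum += data[i];
        }

        /* Pass 2: reuse the same strip before moving to the next one. */
        for (size_t i = base; i < end; ++i)
            sum_sq += data[i] * data[i];
    }

    out[0] = sum;
    out[1] = sum_sq;
}
```

The strip size would normally be chosen so that one strip fits in the
portion of the cache that prefetchnta fills. The prefetch is only a
hint, so the results are the same with or without it.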
Figure 6-7 Examples of Prefetch and Strip-mining for Temporally Adjacent and
Non-Adjacent Passes Loops
[Figure: two panels.
Temporally adjacent passes (left): Prefetchnta Dataset A, Reuse Dataset A,
Prefetchnta Dataset B, Reuse Dataset B; strips labeled SM1.
Temporally non-adjacent passes (right): Prefetcht0 Dataset A, Prefetcht0
Dataset B, Reuse Dataset A, Reuse Dataset B; strip labeled SM2.]
