Intel ARCHITECTURE IA-32

IA-32 Intel® Architecture Optimization

Current implementations of the BSQ_cache_reference event do not
distinguish between programmatic read and write misses. Programmatic
writes that miss must get the rest of the cache line and merge the new
data. Such a request is called a read for ownership (RFO). To the
BSQ_cache_reference hardware, both a programmatic read and an RFO look
like a data bus read, and are counted as such. Further distinction
between programmatic reads and RFOs may be provided in future
implementations.

Current implementations of the BSQ_cache_reference event can suffer
from perceived over- or under-counting. References are based on BSQ
allocations, as described above. Consequently, read misses are
generally counted once per 128-byte line BSQ allocation (whether one or
both sectors are referenced), but read and write (RFO) hits and most
write (RFO) misses are counted once per 64-byte line, the size of a core
reference. As a result, read misses can appear undercounted by up to a
factor of two relative to read and write (RFO) hits and write (RFO)
misses. This granularity mismatch cannot always be corrected for, which
makes it difficult to correlate event counts with the number of
programmatic misses and hits. If the user knows that both sectors in a
128-byte line are always referenced soon after each other, then the
number of read misses can be multiplied by two to adjust miss counts to
a 64-byte granularity.
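
The multiply-by-two adjustment described above can be sketched as
follows. This is a minimal illustration only: the function and parameter
names are hypothetical and not part of any Intel tool, and the doubling
is valid only under the stated assumption that both sectors of every
128-byte line are referenced soon after each other.

```python
def adjust_read_misses(read_miss_count, both_sectors_always_referenced):
    """Rescale a raw read-miss event count to 64-byte granularity.

    Hypothetical helper for illustration; the names are assumptions.
    """
    if read_miss_count < 0:
        raise ValueError("event counts cannot be negative")
    if both_sectors_always_referenced:
        # Each counted read miss stands for one 128-byte BSQ allocation,
        # i.e. two 64-byte sectors, so the count is doubled.
        return read_miss_count * 2
    # Without that guarantee the mismatch cannot be corrected reliably,
    # so the raw count is returned unchanged.
    return read_miss_count
```

For example, adjust_read_misses(1000, True) returns 2000, which can then
be compared with hit and RFO-miss counts at the same 64-byte
granularity.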

Prefetches themselves are not counted as either hits or misses, as of
Pentium 4 and Intel Xeon processors with a CPUID signature of 0xf21.
However, Pentium 4 processor implementations with a CPUID signature of
0xf07 and earlier have the problem that reads to lines that are already
being prefetched are counted as hits in addition to misses, thus
overcounting hits.
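
When post-processing counts from several machines, the signature
dependence above could be encoded along these lines. This is a sketch
under two labeled assumptions that the text does not confirm: that
signatures can be ordered by plain numeric comparison, and that "as of
0xf21" covers 0xf21 and later signatures. The function name and the
returned labels are hypothetical. Signatures strictly between 0xf07 and
0xf21 are not described in the text, so they are reported as
unspecified.

```python
def prefetch_hit_counting(cpuid_signature):
    """Classify how prefetch-related reads affect hit counts.

    Hypothetical helper; numeric ordering of signatures is an assumption.
    """
    if cpuid_signature <= 0xF07:
        # Reads to lines already being prefetched are counted as hits
        # in addition to misses, so hits are overcounted.
        return "hits_overcounted"
    if cpuid_signature >= 0xF21:
        # Prefetches themselves are counted as neither hits nor misses.
        return "prefetches_not_counted"
    # The text makes no statement about signatures in between.
    return "unspecified"
```

A caller might use the result to decide whether hit counts from a given
machine should be treated as an upper bound rather than an exact value.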

The number of “Reads Non-prefetch from the Processor” is a good
approximation of the number of outermost cache misses due to loads or
RFOs, for the writeback memory type.
