Intel ARCHITECTURE IA-32 User Manual

Intel ARCHITECTURE IA-32
568 pages
Using Performance Monitoring Events B
B-15
Usage Notes on Bus Activities
A number of performance metrics in Table B-1 are based on
IOQ_active_entries and BSQ_active_entries. The next three paragraphs
provide information on the various bus transaction underway metrics. These
metrics nominally measure the end-to-end latency of transactions
entering the BSQ; i.e., the aggregate sum of the allocation-to-deallocation
durations for the BSQ entries used for all individual
transactions in the processor. They can be divided by the corresponding
number-of-transactions metrics (i.e., those that measure allocations) to
approximate an average latency per transaction. However, that
approximation can be significantly higher than the number of cycles it
takes to get the first chunk of data for the demand fetch (e.g., load),
because the entire transaction must be completed before deallocation.
That latency includes deallocation overheads and the time to fetch the
other half of the 128-byte line, a fetch known as an adjacent-sector
prefetch. Since adjacent-sector prefetches have lower priority than
demand fetches, there is a high probability on a heavily utilized system
that the adjacent-sector prefetch will have to wait until the next bus
arbitration cycle from that processor. Note also that on current
implementations, the granularities at which BSQ_allocation and
BSQ_active_entries count can differ, leading to a possible 2-times
overcounting of latencies for non-partial programmatic loads.
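The division described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, its parameters, and the sample counter values are hypothetical and do not come from Table B-1 or from any measurement. The result is an approximation of end-to-end BSQ latency, not of the time to the first chunk of data, and it may be inflated by the granularity mismatch noted above.

```python
def average_bsq_latency(underway_cycles: int, allocations: int) -> float:
    """Approximate the average BSQ latency per transaction, in cycles.

    underway_cycles: reading of a "transactions underway" metric, i.e. the
        summed allocation-to-deallocation durations (hypothetical input).
    allocations: reading of the matching number-of-transactions metric
        (hypothetical input).
    """
    if allocations <= 0:
        raise ValueError("allocations must be a positive count")
    if underway_cycles < 0:
        raise ValueError("underway_cycles cannot be negative")
    return underway_cycles / allocations


# Hypothetical counter readings, chosen only to show the arithmetic.
print(average_bsq_latency(1_200_000, 4_000))  # prints 300.0
```

Because the figure covers the whole transaction, including the adjacent-sector prefetch, it is better read as an upper-leaning estimate than as the load-to-use latency of a demand fetch.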
Users of the bus transaction underway metrics would be best served by
employing them for relative comparisons across BSQ latencies of all
transactions. Users who want to do cycle-by-cycle or type-by-type
analysis should be aware that this event is known to be inaccurate for
the “UC Reads Chunk Underway” and “Write WC partial underway”
metrics. Relative changes to the average of all BSQ latencies should be
viewed as an indication that overall memory performance has changed.
That memory performance change may or may not be reflected in the
measured FSB latencies.
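As a minimal sketch of such a relative comparison, the Python below computes the fractional change in average BSQ latency between two runs. The function name, the tuple layout, and the counter values are assumptions made for illustration; they are not part of the manual and are not measured data.

```python
from typing import Tuple

# (underway_cycles, allocations) for one run; a hypothetical layout.
CounterPair = Tuple[int, int]


def relative_latency_change(baseline: CounterPair, current: CounterPair) -> float:
    """Return the fractional change in average BSQ latency between two runs.

    A positive result means the average latency per transaction rose
    relative to the baseline run; a negative result means it fell.
    """
    base_cycles, base_allocs = baseline
    cur_cycles, cur_allocs = current
    if base_allocs <= 0 or cur_allocs <= 0:
        raise ValueError("allocation counts must be positive")
    base_avg = base_cycles / base_allocs
    cur_avg = cur_cycles / cur_allocs
    if base_avg <= 0:
        raise ValueError("baseline average latency must be positive")
    return (cur_avg - base_avg) / base_avg


# Hypothetical readings: 300 cycles/transaction before, 345 after.
change = relative_latency_change((1_200_000, 4_000), (1_380_000, 4_000))
print(f"{change:.1%}")  # prints 15.0%
```

Comparing ratios rather than absolute cycle counts keeps any systematic overcounting common to both runs from dominating the result, although it does not correct for the per-type inaccuracies mentioned above.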
Also note that for Pentium 4 and Intel Xeon Processor implementations
with an integrated 3rd-level cache, BSQ entries are allocated for all
2nd-level writebacks (replaced lines), not just those that become bus

Intel ARCHITECTURE IA-32 Specifications

General
Instruction Set: x86
Instruction Set Type: CISC
Memory Segmentation: Supported
Operating Modes: Real mode, Protected mode, Virtual 8086 mode
Max Physical Address Size: 36 bits (with PAE)
Max Virtual Address Size: 32 bits
Architecture: IA-32 (Intel Architecture 32-bit)
Addressable Memory: 4 GB (with Physical Address Extension up to 64 GB)
Floating Point Registers: 8 x 80-bit
MMX Registers: 8 x 64-bit
SSE Registers: 8 x 128-bit
Registers: General-purpose registers (EAX, EBX, ECX, EDX, ESI, EDI, ESP, EBP), Segment registers (CS, DS, SS, ES, FS, GS), Instruction pointer (EIP), Flags register (EFLAGS)
Floating Point Unit: Yes (x87)