Intel ARCHITECTURE IA-32 User Manual

IA-32 Intel® Architecture Optimization
Microarchitecture Notes
Trace Cache Events
The trace cache is not directly comparable to an instruction cache. The
two are organized very differently. For example, a trace can span many
lines' worth of instruction-cache data. As with most microarchitectural
elements, trace cache performance is only an issue if something else is
not a bigger bottleneck. If an application is bus-bandwidth bound, the
rate at which the front end delivers uops to the core may be
irrelevant. When front-end bandwidth is an issue, the trace cache, in
deliver mode, can issue uops to the core faster than either the decoder
(build mode) or the microcode store (the MS ROM). Thus the percentage of
time spent in trace cache deliver mode or, similarly, the percentage of
all uops (bogus and non-bogus) that come from the trace cache can be a
useful metric for determining front-end performance.
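The two ratios described above can be sketched in a few lines of Python.
This is only an illustration: the function and parameter names are
hypothetical, and the inputs stand for raw counts that a profiling tool
would have to supply (cycles spent in deliver mode, total cycles, and
uops broken down by source). It does not reproduce the exact event
definitions used by any particular tool.

```python
def deliver_mode_pct(deliver_cycles, total_cycles):
    """Percentage of cycles the trace cache spent in deliver mode.

    Both arguments are hypothetical raw counter values.
    """
    if total_cycles <= 0:
        raise ValueError("total_cycles must be positive")
    return 100.0 * deliver_cycles / total_cycles


def tc_uop_pct(uops_from_tc, uops_from_decoder, uops_from_ms_rom):
    """Percentage of all uops (bogus and non-bogus) that came from the
    trace cache in deliver mode rather than from the decoder (build
    mode) or the MS ROM. Argument names are assumptions.
    """
    total = uops_from_tc + uops_from_decoder + uops_from_ms_rom
    if total == 0:
        return 0.0  # nothing was delivered, so no meaningful ratio
    return 100.0 * uops_from_tc / total


if __name__ == "__main__":
    print(deliver_mode_pct(750, 1000))
    print(tc_uop_pct(800, 150, 50))
```

Either ratio is only worth examining once a larger bottleneck, such as
bus bandwidth, has been ruled out.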
The metric that is most analogous to an instruction cache miss is a trace
cache miss. An unsuccessful lookup of the trace cache (colloquially, a
miss) is not interesting, per se, if we are in build mode and don't find a
trace available; we just keep building traces. The only “penalty” in that
case is that we continue to have a lower front-end bandwidth. The trace
cache miss metric that is currently used is not just any TC miss, but
rather one that is incurred while the machine is already in deliver mode;
i.e., when a 15-20 cycle penalty is paid. Again, care must be exercised:
a small average number of TC misses per instruction does not indicate
good front-end performance if the percentage of time in deliver mode is
also low.
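The caution above, that a low TC miss rate means little when deliver-mode
residency is also low, can be expressed as a small check. This is a
sketch under stated assumptions: the function name, the labels it
returns, and both thresholds are hypothetical placeholders chosen for
illustration, not values taken from the manual.

```python
def assess_front_end(tc_deliver_misses, instructions_retired,
                     deliver_mode_pct,
                     min_deliver_pct=80.0,       # placeholder threshold
                     max_misses_per_kinstr=1.0): # placeholder threshold
    """Classify front-end behaviour from two hypothetical inputs:
    TC misses incurred in deliver mode, and the percentage of time
    spent in deliver mode.
    """
    if instructions_retired <= 0:
        raise ValueError("instructions_retired must be positive")

    # A miss rate measured while rarely in deliver mode says little:
    # most of the time was spent in the slower build mode anyway.
    if deliver_mode_pct < min_deliver_pct:
        return "inconclusive"

    misses_per_kinstr = 1000.0 * tc_deliver_misses / instructions_retired
    if misses_per_kinstr > max_misses_per_kinstr:
        return "tc_misses_high"
    return "ok"


if __name__ == "__main__":
    print(assess_front_end(10, 100_000, deliver_mode_pct=30.0))
    print(assess_front_end(10, 100_000, deliver_mode_pct=90.0))
```

Checking deliver-mode residency first keeps a small miss count from
being read as good front-end performance when the machine was mostly in
build mode.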
Bus and Memory Metrics
In order to correctly interpret the observed counts of performance
metrics related to bus events, it is helpful to understand transaction
sizes, when entries are allocated in different queues, and how sectoring
and prefetching affect counts.
