Intel ARCHITECTURE IA-32 User Manual
568 pages
Optimizing Cache Usage 6
6-47
The memory copy algorithm can be optimized using the Streaming
SIMD Extensions with these considerations:
• alignment of data
• proper layout of pages in memory
• cache size
• interaction of the translation lookaside buffer (TLB) with memory
  accesses
• combining prefetch and streaming-store instructions
The guidelines discussed in this chapter come into play in this simple
example. TLB priming is required for the Pentium 4 processor just as it
is for the Pentium III processor, since software prefetch instructions will
not initiate page table walks on either processor.
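As a rough illustration of how alignment, prefetch, and streaming stores
fit together, here is a minimal sketch in C using the SSE intrinsics from
<xmmintrin.h>. It is not the manual's own listing. The function name
stream_copy, the 64-byte line size, and the 256-byte prefetch distance are
assumptions chosen for illustration, and the sketch falls back to memcpy
when SSE is not available at compile time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define HAVE_SSE 1
#endif

/* Assumed values for illustration only; tune for the target processor. */
#define FLOATS_PER_LINE 16u  /* 64-byte cache line / 4-byte float */
#define PREFETCH_AHEAD  64u  /* prefetch distance in floats (256 bytes) */

/* Hypothetical helper: copy n_floats floats from src to dst.
 * Preconditions: both pointers are 16-byte aligned and n_floats is a
 * multiple of 4, so aligned loads and streaming stores are legal. */
void stream_copy(float *dst, const float *src, size_t n_floats)
{
    assert(((uintptr_t)dst % 16) == 0);
    assert(((uintptr_t)src % 16) == 0);
    assert(n_floats % 4 == 0);

#ifdef HAVE_SSE
    for (size_t i = 0; i < n_floats; i += 4) {
        /* Once per cache line, hint that data further ahead will be
         * needed soon. The bounds check keeps the address in range. */
        if (i % FLOATS_PER_LINE == 0 && i + PREFETCH_AHEAD < n_floats) {
            _mm_prefetch((const char *)(src + i + PREFETCH_AHEAD),
                         _MM_HINT_NTA);
        }
        __m128 v = _mm_load_ps(src + i); /* aligned 16-byte load */
        _mm_stream_ps(dst + i, v);       /* non-temporal store */
    }
    _mm_sfence(); /* make the streaming stores globally visible */
#else
    memcpy(dst, src, n_floats * sizeof(float));
#endif
}
```

The streaming stores write the destination without first reading it into
the caches, which is why this combination suits large copies whose
destination is not reused soon. The sketch does not prime the TLB; it
relies on the prefetched addresses already having valid translations.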
TLB Priming
The TLB is a fast memory buffer that speeds up the translation of a
virtual memory address to a physical memory address by providing fast
access to page table entries. If a memory page is accessed and its page
table entry is not resident in the TLB, a TLB miss results and the page
table entry must be read from memory.
The TLB miss results in a performance degradation since another
memory access must be performed (assuming that the translation is not
already present in the processor caches) to update the TLB. The TLB
can be preloaded with the page table entry for the next desired page by
accessing (or touching) an address in that page. This is similar to
prefetch, but instead of a data cache line the page table entry is being
loaded in advance of its use. This helps ensure that the page table
entry is resident in the TLB, so that a subsequent prefetch to that page
proceeds as requested.
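
The touching described above can be sketched in portable C as one
ordinary load per page. This is only an illustration under stated
assumptions: the name prime_tlb is hypothetical, and the page size is
passed in by the caller (4 KB is a common IA-32 page size, but larger
pages exist). The volatile qualifier is there so the compiler does not
discard loads whose values are never used.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: touch each page overlapped by [buf, buf + len)
 * with an ordinary load. Unlike a software prefetch, an ordinary load
 * triggers a page table walk on a TLB miss.
 * Returns the number of loads issued. */
size_t prime_tlb(const void *buf, size_t len, size_t page_size)
{
    assert(page_size > 0);

    const volatile unsigned char *p = (const volatile unsigned char *)buf;
    size_t loads = 0;

    /* One load per page-sized stride, starting at the first byte. */
    for (size_t off = 0; off < len; off += page_size) {
        (void)p[off]; /* volatile read: performed, value discarded */
        loads++;
    }

    /* If buf is not page-aligned, the range can end in a page the
     * strided loop never reached, so touch the last byte as well. */
    if (len > 0) {
        (void)p[len - 1];
        loads++;
    }
    return loads;
}
```

In a copy loop, a call such as prime_tlb(next_block, block_bytes, 4096)
would typically be issued for the next block before that block's
prefetches, so that the prefetches find their translations already in
the TLB. Here next_block and block_bytes are placeholder names.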