Intel ARCHITECTURE IA-32 User Manual
568 pages
Multi-Core and Hyper-Threading Technology 7
latency of scattered memory reads can be improved by issuing multiple
memory reads back-to-back to overlap multiple outstanding memory
read transactions. The average latency of back-to-back bus reads is
likely to be lower than the average latency of scattered reads
interspersed with other bus transactions. This is because only the first
memory read needs to wait for the full delay of a cache miss.
User/Source Coding Rule 29. (M impact, M generality) Consider issuing
multiple memory reads back-to-back so that they overlap, to improve effective
cache miss latencies.
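
A minimal sketch of Rule 29 in C, under stated assumptions: the function name gather_sum_batched, the batch size of 4, and the table/index layout are hypothetical and chosen only for illustration. The idea is to issue several independent loads before any dependent work, so that a processor capable of multiple outstanding misses can overlap them. Whether the reads actually overlap depends on the compiler and the hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BATCH 4 /* hypothetical batch size, not a value from the manual */

/* Sum table[idx[i]] for i in [0, n).
 * The four loads in each batch are independent of one another and are
 * written back-to-back, ahead of the arithmetic that consumes them. */
uint64_t gather_sum_batched(const uint32_t *table, const size_t *idx, size_t n)
{
    assert(table != NULL && idx != NULL);

    uint64_t sum = 0;
    size_t i = 0;

    for (; i + BATCH <= n; i += BATCH) {
        /* Scattered reads issued together. */
        uint32_t a = table[idx[i]];
        uint32_t b = table[idx[i + 1]];
        uint32_t c = table[idx[i + 2]];
        uint32_t d = table[idx[i + 3]];

        /* Dependent work only after all four loads have been issued. */
        sum += (uint64_t)a + b + c + d;
    }

    /* Remaining elements when n is not a multiple of BATCH. */
    for (; i < n; i++)
        sum += table[idx[i]];

    return sum;
}
```

The result is identical to a simple one-element-at-a-time loop; only the ordering of the reads changes.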
Another technique to reduce effective memory latency is possible if one
can adjust the data access pattern such that the access strides causing
successive cache misses in the last-level cache are predominantly less
than the trigger threshold distance of the automatic hardware prefetcher.
See “Example of Effective Latency Reduction with H/W Prefetch” in
Chapter 6.
User/Source Coding Rule 30. (M impact, M generality) Consider adjusting
the sequencing of memory references such that the distribution of distances
between successive cache misses of the last-level cache peaks towards 64 bytes.
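
One common way to apply Rule 30 is to choose the loop order so that memory is walked sequentially. The sketch below is illustrative only: the function names and the flat row-major matrix layout are assumptions, and it further assumes 4-byte int elements and the 64-byte line size mentioned in this section. Both functions compute the same sum; they differ only in the distance between successive accesses.

```c
#include <assert.h>
#include <stddef.h>

/* Row-first traversal of a row-major matrix: addresses increase by
 * sizeof(int) per access, so successive cache misses should fall on
 * adjacent 64-byte lines. */
long sum_row_first(const int *m, size_t rows, size_t cols)
{
    assert(m != NULL);
    long sum = 0;
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            sum += m[r * cols + c];
    return sum;
}

/* Column-first traversal of the same data: successive accesses are
 * cols * sizeof(int) bytes apart, so for wide rows the distance between
 * successive misses is far larger than one cache line. */
long sum_column_first(const int *m, size_t rows, size_t cols)
{
    assert(m != NULL);
    long sum = 0;
    for (size_t c = 0; c < cols; c++)
        for (size_t r = 0; r < rows; r++)
            sum += m[r * cols + c];
    return sum;
}
```

Under these assumptions the row-first version is the one whose miss distances cluster at 64 bytes; whether the hardware prefetcher is actually triggered depends on the processor.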
Use Full Write Transactions to Achieve Higher Data Rate
Write transactions across the bus can result in writes to physical memory
using either the full line size of 64 bytes or less than the full line size.
The latter is referred to as a partial write. Typically, writes to writeback
(WB) memory addresses are full-size and writes to write-combine (WC)
or uncacheable (UC) type memory addresses result in partial writes.
Both cached WB store operations and WC store operations utilize a set
of six WC buffers (64 bytes wide) to manage the traffic of write
transactions. When competing traffic closes a WC buffer before all
writes to the buffer are finished, this results in a series of 8-byte partial
bus transactions rather than a single 64-byte write transaction.
User/Source Coding Rule 31. (M impact, M generality) Use full write
transactions to achieve higher data throughput.
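
A hedged sketch of one way to follow Rule 31 when a loop writes to two destinations: finish all stores to one 64-byte line before storing anywhere else, rather than alternating between destinations on every element. The function name split_pairs_by_line and the data layout are hypothetical, and the sketch assumes that out_a and out_b are 64-byte aligned. It only arranges the stores so that each line is written contiguously; whether the processor then emits full-line transactions depends on the memory type and on other traffic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LINE_BYTES 64                               /* line size given in the text */
#define PER_LINE (LINE_BYTES / sizeof(uint32_t))    /* 16 elements per line */

/* Split n interleaved (a, b) pairs into two output arrays.
 * Each pass writes up to one full line of out_a, then up to one full
 * line of out_b, so writes to a line are not interleaved with writes
 * to the other destination. */
void split_pairs_by_line(const uint32_t *pairs, uint32_t *out_a,
                         uint32_t *out_b, size_t n)
{
    assert(pairs != NULL && out_a != NULL && out_b != NULL);

    size_t i = 0;
    while (i < n) {
        size_t chunk = (n - i < PER_LINE) ? (n - i) : PER_LINE;

        /* Complete one line of the first destination. */
        for (size_t k = 0; k < chunk; k++)
            out_a[i + k] = pairs[2 * (i + k)];

        /* Then complete the matching line of the second destination. */
        for (size_t k = 0; k < chunk; k++)
            out_b[i + k] = pairs[2 * (i + k) + 1];

        i += chunk;
    }
}
```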

