
Intel ARCHITECTURE IA-32 User Manual

IA-32 Intel® Architecture Optimization
same DRAM page have shorter latencies than sequential accesses to
different DRAM pages. In many systems the latency for a page miss
(that is, an access to a different page instead of the page previously
accessed) can be twice as large as the latency of a memory page hit
(access to the same page as the previous access). Therefore, if the loads
and stores of the memory fill cycle are to the same DRAM page, a
significant increase in the bandwidth of the memory fill cycles can be
achieved.
Increasing UC and WC Store Bandwidth by Using Aligned Stores
Using aligned stores to fill UC or WC memory yields higher
bandwidth than using unaligned stores. If a UC store or some WC stores
cross a cache line boundary, a single store results in two transactions
on the bus, reducing the efficiency of the bus transactions. By aligning
the stores to the size of the stores, you eliminate the possibility of
crossing a cache line boundary, and the stores will not be split into
separate transactions.
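
The alignment rule above can be checked with a little address arithmetic.
The following is a minimal sketch in C, not code from the manual: the
64-byte line size and the helper names crosses_cache_line and
is_naturally_aligned are assumptions made for illustration, and the actual
line size depends on the processor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines; the real size is processor specific. */
#define CACHE_LINE_SIZE 64u

/* Hypothetical helper: returns 1 if a store of `size` bytes starting at
 * `addr` touches two different cache lines, 0 otherwise. */
int crosses_cache_line(uintptr_t addr, size_t size)
{
    assert(size > 0 && size <= CACHE_LINE_SIZE);
    uintptr_t first_line = addr / CACHE_LINE_SIZE;
    uintptr_t last_line = (addr + size - 1) / CACHE_LINE_SIZE;
    return first_line != last_line;
}

/* Hypothetical helper: returns 1 if `addr` is a multiple of `size`.
 * `size` must be a power of two. */
int is_naturally_aligned(uintptr_t addr, size_t size)
{
    assert(size > 0 && (size & (size - 1)) == 0);
    return (addr & (size - 1)) == 0;
}
```

For example, an 8-byte store at address 60 spans bytes 60 through 67 and
so crosses the boundary at 64, while the same store at address 56 stays
within one line. When the store size is a power of two no larger than the
line size, a store aligned to its own size cannot cross a line boundary.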
Converting from 64-bit to 128-bit SIMD Integer
SSE2 defines 128-bit integer instructions that form a superset of the
integer instructions currently available in MMX technology; the extended
instructions operate in the same way, simply on data that is twice as
wide. This simplifies porting of current 64-bit integer applications.
However, there are a few additional considerations:
Computation instructions that use a memory operand that may not
be aligned to a 16-byte boundary must be replaced with an
unaligned 128-bit load (movdqu) followed by the same computation
operation using register operands instead. Using 128-bit integer
computation instructions with memory operands that are not 16-byte
aligned results in a General Protection fault. The unaligned
128-bit load and store is not as efficient as the corresponding
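
The load-then-compute pattern described in this consideration can be
sketched with the SSE2 intrinsics from <emmintrin.h>. This is an
illustrative sketch rather than code from the manual: the function name
add_epi16_unaligned is hypothetical, and the scalar fallback is included
only so the example also builds where SSE2 is unavailable. A compiler
typically emits movdqu for _mm_loadu_si128 and _mm_storeu_si128, and a
register-to-register paddw for _mm_add_epi16.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__) || defined(_M_X64) || \
    (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define HAVE_SSE2 1
#else
#define HAVE_SSE2 0
#endif

/* Hypothetical helper: dst[i] = a[i] + b[i] for eight 16-bit lanes with
 * wraparound. None of the pointers needs to be 16-byte aligned. */
void add_epi16_unaligned(int16_t dst[8], const int16_t a[8], const void *b)
{
#if HAVE_SSE2
    /* Unaligned 128-bit loads bring the operands into registers first. */
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    /* The computation then uses register operands only. */
    __m128i sum = _mm_add_epi16(va, vb);
    /* Unaligned 128-bit store of the result. */
    _mm_storeu_si128((__m128i *)dst, sum);
#else
    /* Scalar fallback with the same wrapping semantics. */
    int16_t tmp[8];
    memcpy(tmp, b, sizeof tmp);
    for (int i = 0; i < 8; ++i)
        dst[i] = (int16_t)(uint16_t)((uint16_t)a[i] + (uint16_t)tmp[i]);
#endif
}
```

Calling the function with b pointing one byte past the start of a buffer
exercises the unaligned path: the explicit unaligned load reads the data
without relying on a 16-byte-aligned memory operand.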
