Intel ARCHITECTURE IA-32 User Manual

Coding for SIMD Architectures 3
By adding the padding variable pad, the structure is now 8 bytes, and if
the first element is aligned to 8 bytes (64 bits), all following elements
will also be aligned. The sample declaration follows:
typedef struct { short x, y, z; char a; char pad; } Point;
Point pt[N];
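
The size and alignment claims above can be checked at compile time and at run time. The following is a minimal sketch, assuming a C11 compiler and a 2-byte short (as on typical IA-32 toolchains). The value of N and the helper all_points_aligned are hypothetical and chosen only for illustration.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* size_t */
#include <stdint.h>   /* uintptr_t */

#define N 16  /* hypothetical element count, for illustration only */

typedef struct { short x, y, z; char a; char pad; } Point;

/* Assumption: short is 2 bytes, so 3 shorts + 2 chars = 8 bytes. */
static_assert(sizeof(short) == 2, "this sketch assumes a 2-byte short");
static_assert(sizeof(Point) == 8, "Point should be 8 bytes with the pad member");

/* Force the first element onto an 8-byte (64-bit) boundary. */
static _Alignas(8) Point pt[N];

/* Returns 1 if every element of pt starts on an 8-byte boundary, else 0. */
int all_points_aligned(void)
{
    for (size_t i = 0; i < N; i++) {
        if (((uintptr_t)&pt[i] % 8) != 0) {
            return 0;
        }
    }
    return 1;
}
```

Because each element is exactly 8 bytes, aligning the first element is enough: element i starts 8*i bytes after the first.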
Using Arrays to Make Data Contiguous
In the following code,
for (i=0; i<N; i++) pt[i].y *= scale;
the second dimension, y, is multiplied by a scaling value. Here the
for loop accesses only the y member of each element of the array pt, so
the data it touches is not contiguous. This can degrade the performance
of the application by increasing cache misses, by making poor use of
each cache line that is fetched, and by increasing the chance of accesses
that span multiple cache lines.
The following declaration allows you to vectorize the scaling operation
and further improve the alignment of the data access patterns:
short ptx[N], pty[N], ptz[N];
for (i=0; i<N; i++) pty[i] *= scale;
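
To make the contrast concrete, the two layouts can be placed side by side. This is a minimal sketch, not the manual's own code: the function names scale_y_aos, scale_y_soa and layouts_agree, and the value of N, are hypothetical. It shows only that both layouts compute the same result while the second walks contiguous memory.

```c
#include <assert.h>
#include <stddef.h>

#define N 8  /* hypothetical element count, for illustration only */

typedef struct { short x, y, z; char a; char pad; } Point;

/* Array of structures: consecutive y values are sizeof(Point) bytes apart. */
void scale_y_aos(Point *pt, size_t n, short scale)
{
    for (size_t i = 0; i < n; i++) {
        pt[i].y *= scale;
    }
}

/* Separate arrays: consecutive y values are adjacent in memory. */
void scale_y_soa(short *pty, size_t n, short scale)
{
    for (size_t i = 0; i < n; i++) {
        pty[i] *= scale;
    }
}

/* Fills both layouts with the same y values, scales them, and
   returns 1 if the results match element by element, else 0. */
int layouts_agree(short scale)
{
    Point pt[N];
    short pty[N];

    for (size_t i = 0; i < N; i++) {
        pt[i].x = 0; pt[i].z = 0; pt[i].a = 0; pt[i].pad = 0;
        pt[i].y = (short)(i + 1);
        pty[i]  = (short)(i + 1);
    }
    scale_y_aos(pt, N, scale);
    scale_y_soa(pty, N, scale);

    for (size_t i = 0; i < N; i++) {
        if (pt[i].y != pty[i]) {
            return 0;
        }
    }
    return 1;
}
```

In the first loop only 2 of every 8 bytes fetched are used. In the second, every byte fetched is a y value, which is the access pattern a vectorizing compiler or hand-written SIMD code can load directly.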
With SIMD technology, the choice of data organization becomes more
important and should be made carefully, based on the operations that
will be performed on the data. In some applications, traditional data
arrangements may not lead to the maximum performance.
A simple example of this is an FIR filter. An FIR filter is effectively a
vector dot product whose length is the number of coefficient taps.
Consider the following code:
(data[j]*coeff[0] + data[j+1]*coeff[1] + ... + data[j+num_of_taps-1]*coeff[num_of_taps-1])
If in the code above the filter operation of data element i is the vector
dot product that begins at data element j, then the filter operation of
data element i+1 begins at data element j+1.
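
The sliding dot product described above can be written as a plain scalar loop. The following is a minimal sketch under stated assumptions: the function name fir_filter, its signature, and the int accumulator are hypothetical, and output element i is taken to start at data element j = i. It illustrates the access pattern only and is not a SIMD implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Computes out[j] = data[j]*coeff[0] + ... + data[j+num_taps-1]*coeff[num_taps-1]
   for every j whose window fits inside data.
   Returns the number of outputs written: num_samples - num_taps + 1.
   The caller must provide room for that many ints in out. */
size_t fir_filter(const short *data, size_t num_samples,
                  const short *coeff, size_t num_taps,
                  int *out)
{
    assert(num_taps > 0 && num_samples >= num_taps);

    size_t num_out = num_samples - num_taps + 1;
    for (size_t j = 0; j < num_out; j++) {
        int acc = 0;
        /* Dot product of the coefficients with the window starting at j. */
        for (size_t k = 0; k < num_taps; k++) {
            acc += data[j + k] * coeff[k];
        }
        out[j] = acc;
    }
    return num_out;
}
```

Each output's window starts one element later than the previous one, so at most one window in any run of consecutive outputs begins on an aligned boundary. This is why data arrangement matters for this kernel.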

Intel ARCHITECTURE IA-32 Specifications

General
Instruction Set: x86
Instruction Set Type: CISC
Memory Segmentation: Supported
Operating Modes: Real mode, Protected mode, Virtual 8086 mode
Max Physical Address Size: 36 bits (with PAE)
Max Virtual Address Size: 32 bits
Architecture: IA-32 (Intel Architecture 32-bit)
Addressable Memory: 4 GB (up to 64 GB with Physical Address Extension)
Floating Point Registers: 8 x 80-bit
MMX Registers: 8 x 64-bit
SSE Registers: 8 x 128-bit
Registers: General-purpose registers (EAX, EBX, ECX, EDX, ESI, EDI, ESP, EBP), Segment registers (CS, DS, SS, ES, FS, GS), Instruction pointer (EIP), Flags register (EFLAGS)
Floating Point Unit: Yes (x87)
