
Intel ARCHITECTURE IA-32 - Figure 5-3 Horizontal Add Using movhlps/movlhps

568 pages
Optimizing for SIMD Floating-point Applications 5
Figure 5-3 Horizontal Add Using movhlps/movlhps
Example 5-9 Horizontal Add Using movhlps/movlhps
void horiz_add(Vertex_soa *in, float *out) {
    __asm {
        mov     ecx, in             // load structure addresses
        mov     edx, out
        movaps  xmm0, [ecx]         // load A1 A2 A3 A4 => xmm0
        movaps  xmm1, [ecx+16]      // load B1 B2 B3 B4 => xmm1
        movaps  xmm2, [ecx+32]      // load C1 C2 C3 C4 => xmm2
        movaps  xmm3, [ecx+48]      // load D1 D2 D3 D4 => xmm3
continued
[Figure 5-3 data flow, reconstructed from the figure labels]
Inputs:            xmm0 = A1 A2 A3 A4    xmm1 = B1 B2 B3 B4
                   xmm2 = C1 C2 C3 C4    xmm3 = D1 D2 D3 D4
MOVLHPS, MOVHLPS:  A1 A2 B1 B2  and  A3 A4 B3 B4
                   C1 C2 D1 D2  and  C3 C4 D3 D4
ADDPS, ADDPS:      A1+A3 A2+A4 B1+B3 B2+B4
                   C1+C3 C2+C4 D1+D3 D2+D4
SHUFPS, SHUFPS:    A1+A3 B1+B3 C1+C3 D1+D3
                   A2+A4 B2+B4 C2+C4 D2+D4
ADDPS:             A1+A2+A3+A4 B1+B2+B3+B4 C1+C2+C3+C4 D1+D2+D3+D4
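
The data flow of Figure 5-3 can also be sketched with SSE intrinsics in C. This is not the manual's listing (Example 5-9 uses inline assembly and continues beyond this page). It is a minimal illustration under assumptions: the function name horiz_add_sketch is hypothetical, the input is taken to be 16 contiguous floats (rows A, B, C, D) rather than the Vertex_soa type, and unaligned loads and stores are used so the caller does not have to guarantee the 16-byte alignment that movaps requires.

```c
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_movelh_ps, _mm_movehl_ps, ... */

/* Hypothetical helper, not from the manual: computes the four row sums
 * of a 4x4 block of floats using the movlhps/movhlps pattern of Figure 5-3.
 * in  : 16 floats laid out as A1..A4, B1..B4, C1..C4, D1..D4 (assumed layout)
 * out : 4 floats receiving sum(A), sum(B), sum(C), sum(D)
 */
void horiz_add_sketch(const float *in, float *out)
{
    /* Unaligned loads (movups); the manual's example uses aligned movaps. */
    __m128 a = _mm_loadu_ps(in);       /* A1 A2 A3 A4 */
    __m128 b = _mm_loadu_ps(in + 4);   /* B1 B2 B3 B4 */
    __m128 c = _mm_loadu_ps(in + 8);   /* C1 C2 C3 C4 */
    __m128 d = _mm_loadu_ps(in + 12);  /* D1 D2 D3 D4 */

    /* MOVLHPS / MOVHLPS: pair the low halves and the high halves. */
    __m128 ab_lo = _mm_movelh_ps(a, b);  /* A1 A2 B1 B2 */
    __m128 ab_hi = _mm_movehl_ps(b, a);  /* A3 A4 B3 B4 */
    __m128 cd_lo = _mm_movelh_ps(c, d);  /* C1 C2 D1 D2 */
    __m128 cd_hi = _mm_movehl_ps(d, c);  /* C3 C4 D3 D4 */

    /* ADDPS: first level of partial sums. */
    __m128 ab = _mm_add_ps(ab_lo, ab_hi);  /* A1+A3 A2+A4 B1+B3 B2+B4 */
    __m128 cd = _mm_add_ps(cd_lo, cd_hi);  /* C1+C3 C2+C4 D1+D3 D2+D4 */

    /* SHUFPS: gather elements 0 and 2, then elements 1 and 3, of each operand. */
    __m128 even = _mm_shuffle_ps(ab, cd, _MM_SHUFFLE(2, 0, 2, 0)); /* A1+A3 B1+B3 C1+C3 D1+D3 */
    __m128 odd  = _mm_shuffle_ps(ab, cd, _MM_SHUFFLE(3, 1, 3, 1)); /* A2+A4 B2+B4 C2+C4 D2+D4 */

    /* ADDPS: final sums, one per input row. */
    _mm_storeu_ps(out, _mm_add_ps(even, odd));
}
```

Calling horiz_add_sketch on an array holding 1.0 through 16.0 should fill out with the four row sums (1+2+3+4, 5+6+7+8, and so on). When the data is known to be 16-byte aligned, _mm_load_ps and _mm_store_ps could replace the unaligned variants to match the movaps loads in Example 5-9.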
