Intel Xeon Phi User Manual

Intel® Xeon Phi™ Coprocessor DEVELOPER'S QUICK START GUIDE
float reduction(float *data, int size)
{
    float ret = 0.f;
    /* Copy `size` elements of data to the coprocessor and run the loop there. */
    #pragma offload target(mic) in(data:length(size))
    for (int i = 0; i < size; ++i)
    {
        ret += data[i];
    }
    return ret;
}
Code Example 2: Serial Reduction with Offload
Vector Reduction with Offload
Each core on the Intel® Xeon Phi™ Coprocessor has a vector processing unit (VPU). Auto-vectorization is
enabled by default in the offload compiler. Alternatively, as in the example below, the programmer can use
the Intel® Cilk™ Plus Extended Array Notation to maximize vectorization and take advantage of the 32 512-bit
vector registers in each Intel® MIC Architecture core. The offloaded code is executed by a single thread on a
single core. That thread calls the built-in reduction function __sec_reduce_add(), which uses those vector
registers to reduce the array sixteen elements at a time (one 512-bit register holds sixteen 32-bit floats).
float reduction(float *data, int size)
{
    float ret = 0;
    #pragma offload target(mic) in(data:length(size))
    ret = __sec_reduce_add(data[0:size]); // Intel® Cilk™ Plus Extended Array Notation
    return ret;
}
Code Example 3: Vector Reduction with Offload in C/C++
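To make "sixteen at a time" concrete, the following is a minimal host-side sketch in portable C of how a 16-lane reduction proceeds. It is an illustration only, not what the compiler actually generates: the function name reduction_lanes and the LANES constant are hypothetical, and the real instruction sequence and summation order are chosen by the compiler and hardware.

```c
#include <stddef.h>

/* Assumption for illustration: 512-bit register / 32-bit float = 16 lanes. */
#define LANES 16

/* Hypothetical helper: scalar emulation of a 16-wide vector reduction. */
float reduction_lanes(const float *data, size_t size)
{
    float partial[LANES] = {0.0f};  /* one running sum per vector lane */
    float ret = 0.0f;
    size_t i = 0;

    /* Main loop: consume 16 elements per iteration, one per lane. */
    for (; i + LANES <= size; i += LANES) {
        for (size_t lane = 0; lane < LANES; ++lane) {
            partial[lane] += data[i + lane];
        }
    }

    /* Remainder loop: fewer than 16 elements are left. */
    for (; i < size; ++i) {
        ret += data[i];
    }

    /* Horizontal add: combine the 16 partial sums into one scalar. */
    for (size_t lane = 0; lane < LANES; ++lane) {
        ret += partial[lane];
    }
    return ret;
}
```

Because floating-point addition is not associative, a lane-wise sum like this can differ in the last bits from the serial loop in Code Example 2.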
Asynchronous Offload and Data Transfer
Asynchronous offload and asynchronous data transfer between the host and the Intel® Xeon Phi™ Coprocessor
are available. For details, see the “About Asynchronous Computation” and “About Asynchronous Data Transfer”
sections in the Intel® C++ Compiler User and Reference Guide (under “Key Features/Programming for the Intel®
MIC Architecture”).
For an example showing the use of asynchronous offload and transfer, refer to
/opt/intel/composerxe/Samples/en_US/C++/mic_samples/intro_sampleC/sampleC13.c
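As a rough sketch of what an asynchronous offload can look like, the reduction from Code Example 2 might be started with a signal clause and collected later with an offload_wait pragma, letting the host do other work in between. This is an assumption-laden illustration, not the content of sampleC13.c: the function name reduction_async, the host_data parameters, and the tag variable done are hypothetical, and the exact clause rules should be checked against the compiler guide. A compiler without offload support ignores the pragmas and runs everything on the host.

```c
/* Hypothetical example: overlap a coprocessor reduction with host work. */
float reduction_async(float *data, int size,
                      const float *host_data, int host_size)
{
    float ret = 0.f;       /* sum computed by the offloaded loop */
    float host_ret = 0.f;  /* sum computed on the host meanwhile */
    char done;             /* its address serves as the signal tag */

    /* Start the offload on coprocessor 0 and return immediately. */
    #pragma offload target(mic:0) in(data:length(size)) inout(ret) signal(&done)
    for (int i = 0; i < size; ++i)
    {
        ret += data[i];
    }

    /* Host work that does not touch data or ret. */
    for (int i = 0; i < host_size; ++i)
    {
        host_ret += host_data[i];
    }

    /* Block until the offload tagged with &done has finished. */
    #pragma offload_wait target(mic:0) wait(&done)

    return ret + host_ret;
}
```

The value of ret should only be read after the wait; before that point the offloaded loop may still be running.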
Note that when using the Explicit Memory Copy Model in C/C++, arrays are supported provided that the array
element type is a scalar or a bitwise-copyable struct or class. Arrays of pointers are therefore not
supported. For complex C/C++ data structures, use the Implicit Memory Copy Model. Please consult the section
“Restrictions on Offload Code Using a Pragma” in the Intel® C++ Compiler User and Reference Guide for more
information.
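The restriction above can be illustrated with a small sketch: an array whose elements are a plain struct of floats can be named in an in clause, because each element can be copied byte for byte. The Point type and the sum_x function are hypothetical names introduced only for this example.

```c
/* Hypothetical element type: contains no pointers, so it is bitwise copyable. */
typedef struct {
    float x;
    float y;
} Point;

/* Sums the x fields of n points; the array is copied to the coprocessor. */
float sum_x(Point *pts, int n)
{
    float ret = 0.f;
    #pragma offload target(mic) in(pts:length(n))
    for (int i = 0; i < n; ++i)
    {
        ret += pts[i].x;
    }
    return ret;
}

/* By contrast, an array of pointers such as `float *rows[8]` would only
   transfer the pointer values, not the data they point to, which is why
   the Explicit Memory Copy Model does not support it. */
```

A compiler without offload support ignores the pragma, so the same function also runs unchanged on the host.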
Using the Offload Compiler Implicit Memory Copy Model
Intel Composer XE 2013 SP1 includes two additional keyword extensions for C and C++ (but not Fortran) that
provide a shared memory offload programming model appropriate for dealing with complex, pointer-based
