IBM Power7 - Compiler Versions and Optimization Levels

POWER7 and POWER7+ Optimization and Tuning Guide

6.1 Compiler versions and optimization levels

The IBM XL compilers are updated periodically to improve application performance and add

processor-specific tuning and capabilities. The XLC11/XLF13 compilers for AIX and Linux are

the first versions to include the capabilities of POWER7, and are the preferred version for

projects that target current generation systems. The newer XLC12/XLF14 compilers provide

performance improvements, and are preferred for template-heavy C++ codes.

The enterprise Linux distributions (RHEL 6.1 with GCC 4.4 and SLES 11 SP1 with GCC 4.3) include

GCC compilers with POWER7 enabled (using the -mcpu and -mtune options), but do not have

the latest Higher Order Optimizations. For the GNU GCC, G++ and gfortran compilers on

Linux, the IBM Advance Toolchain 4.0 (GCC 4.5) and 5.0 (GCC 4.6) versions contain

releases that are preferred for POWER7. XLF is preferred over gfortran for its higher

floating-point performance.

For all production codes, it is imperative to enable a minimum level of compiler optimization by

adding the -O option for the XL compilers, or -O2 with the GNU compilers (-O3 is the preferred

option). Without optimization, the compiler focuses on fast compilation and

debuggability, and it generates code that performs poorly at run time. In practice, many projects set

up a dual build environment, with a development build without optimization for use during

development and debugging, and a production build with optimization to be used for

performance verification and production delivery.
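
The dual build setup described above can be sketched as a small table of flags per build flavor. This is a minimal illustration, not a recommended build system: the helper name compile_command, the file names, and the use of -g for the development build are assumptions (the text says only that the development build omits optimization). The production flags are the minimum levels named in the text.

```python
import shlex

# Hypothetical flag sets for the two build flavors; adjust to the project.
# "-g" (debug information) is an assumption, not taken from the guide.
BUILD_FLAGS = {
    "xl": {
        "development": ["-g"],         # no -O option: unoptimized
        "production": ["-O"],          # minimum level named in the text
    },
    "gnu": {
        "development": ["-g", "-O0"],  # explicitly unoptimized
        "production": ["-O2"],         # minimum; -O3 is the preferred level
    },
}

# Compiler driver commands for each family.
DRIVERS = {"xl": "xlc", "gnu": "gcc"}


def compile_command(family, flavor, source, output):
    """Return the argv list that compiles `source` into `output`."""
    flags = BUILD_FLAGS[family][flavor]
    return [DRIVERS[family], *flags, "-o", output, source]


if __name__ == "__main__":
    # Print the four command lines without running any compiler.
    for family in ("xl", "gnu"):
        for flavor in ("development", "production"):
            cmd = compile_command(family, flavor, "app.c", f"app_{flavor}")
            print(shlex.join(cmd))
```

Keeping both flavors in one table makes it harder for the production build to ship without optimization by accident.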

For projects with increased focus on runtime performance, you should take advantage of the

more advanced compiler optimization. For numerical or compute-intensive codes, the XL

compiler options -O3 or -qhot -O3 enable loop transformations, which improve program

performance by restructuring loops to make their execution more efficient on the target

system. These options perform aggressive transformations that can sometimes cause minor

differences in the precision of floating-point computations. If that is a concern, the original

program semantics can be fully recovered with the -qstrict option.

For GCC, the suggested level of optimization for such codes is -O3. The GCC default is a strict

mode, but the -ffast-math option disables strict mode. The -Ofast option combines -O3 with

-ffast-math in a single option. Other important options include -fpeel-loops,

-funroll-loops, -ftree-vectorize, -fvect-cost-model, and -mcmodel=medium.
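
Floating-point addition is not associative, which is why non-strict modes (omitting -qstrict with the XL compilers, or -ffast-math with GCC) can change results. The following minimal sketch shows the effect in Python, whose floats are IEEE double precision. The function names and data are illustrative assumptions, and the regrouping only mimics what an unrolled or vectorized loop might do; it is not what any particular compiler emits.

```python
def sum_in_order(values):
    """Add the values strictly left to right, as the source is written."""
    total = 0.0
    for v in values:
        total += v
    return total


def sum_reassociated(values):
    """Add even- and odd-indexed values separately, then combine.

    This imitates the regrouping of a loop unrolled by two with
    separate partial sums.
    """
    even = 0.0
    odd = 0.0
    for i, v in enumerate(values):
        if i % 2 == 0:
            even += v
        else:
            odd += v
    return even + odd


data = [1e16, 1.0, -1e16, 1.0]

# In order: 1e16 + 1.0 rounds back to 1e16, so the first 1.0 is lost.
print(sum_in_order(data))      # 1.0

# Regrouped: the large values cancel exactly, so both 1.0 terms survive.
print(sum_reassociated(data))  # 2.0
```

Neither result is wrong from the compiler's point of view. Strict mode guarantees only that the evaluation order written in the source is preserved.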

By default, these compilers generate code that runs on various Power Systems. Options

should be added to exclude older processor chips that are not supported by the target

application. This configuration might enable better code generation as the compiler takes

advantage of capabilities not available on those older systems.

There are two major XL compiler options to control this support:

• -qarch: Indicates the oldest processor chip generation that the binary file supports.

• -qtune: Indicates the processor chip generation of most interest for performance.

For example, for an application that must run on POWER6 systems, but for which most users

are on a POWER7 system, the appropriate combination is -qarch=pwr6 -qtune=pwr7. For an

application that must run well across both POWER6 and POWER7 systems in current

common usage, consider using -qtune=balanced.

On GCC, the equivalent options are -mcpu and -mtune. So, for an application that must run on

POWER6, but which is usually run on POWER7, the options are -mcpu=power6

and -mtune=power7.
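
The correspondence between the XL and GCC options can be sketched as a small lookup helper. The function name target_flags and the table layout are hypothetical. Only the POWER6 and POWER7 spellings that appear in the text are included, and -qtune=balanced is left out for brevity.

```python
# Spellings of each processor generation for the two compiler families.
# Only POWER6 and POWER7 appear in the text; extend the table as needed.
CHIP_NAMES = {
    "xl": {"POWER6": "pwr6", "POWER7": "pwr7"},
    "gnu": {"POWER6": "power6", "POWER7": "power7"},
}


def target_flags(family, oldest, tuned_for):
    """Return the architecture and tuning flags for one compiler family.

    family    -- "xl" or "gnu"
    oldest    -- oldest chip generation the binary must run on
    tuned_for -- chip generation of most interest for performance
    """
    names = CHIP_NAMES[family]
    if family == "xl":
        return [f"-qarch={names[oldest]}", f"-qtune={names[tuned_for]}"]
    return [f"-mcpu={names[oldest]}", f"-mtune={names[tuned_for]}"]


# An application that must run on POWER6 but is usually run on POWER7:
print(target_flags("xl", "POWER6", "POWER7"))   # ['-qarch=pwr6', '-qtune=pwr7']
print(target_flags("gnu", "POWER6", "POWER7"))  # ['-mcpu=power6', '-mtune=power7']
```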
