
IBM Power7 - Deeper Empirical Analysis

174 POWER7 and POWER7+ Optimization and Tuning Guide
• A RISC-based, superscalar, out-of-order execution processor chip such as POWER7
requires more aggressive inlining and loop unrolling to capitalize on the larger register set
and superscalar design point. Also, automatic vectorization is not enabled at this lower
(-O2) optimization level, so the vector registers and ISA features go unused.
• In GCC, you must specify the -O3 optimization level and inform the compiler that you are
running on a newer processor chip with the Vector ISA extensions. In fact, with GCC, you
need both -O3 and -mcpu=power7 for the compiler to generate code that capitalizes on the
new VSX feature of POWER7.
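As a minimal sketch of the kind of code these options affect, the loop below is a simple candidate for automatic vectorization. The file name saxpy.c and the function name are illustrative assumptions, not taken from the guide, and whether a given GCC release actually vectorizes the loop depends on the compiler version.

```c
/* Illustrative build line, using the options named in the text:
 *     gcc -O3 -mcpu=power7 -c saxpy.c
 * The file and function names are hypothetical. */
#include <assert.h>
#include <stddef.h>

/* Computes y[i] = a * x[i] + y[i] for i in [0, n).
 * The restrict qualifiers promise the compiler that x and y do not
 * overlap, which removes one common obstacle to vectorization. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    assert(n == 0 || (x != NULL && y != NULL));
    for (size_t i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}
```

Newer GCC releases can report which loops were vectorized through the -fopt-info-vec option, which is one way to confirm that the options had the intended effect.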
One source of optimized libraries is the IBM Advance Toolchain for PowerLinux. The Advance
Toolchain provides alternative runtime libraries for all the common POSIX C language, Math,
and pthread libraries that are highly optimized (-O3 and -mcpu=) for multiple Power platforms
(including POWER7). The Advance Toolchain runtime RPM provides multiple CPU-tuned
library instances and automatically selects the library version that is optimized for the
POWER5, POWER6, or POWER7 machine on which it runs.
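The guide does not describe how the selection is made. As a hedged illustration only, the sketch below shows one way a Linux program can ask the kernel which processor platform it is running on, by reading the AT_PLATFORM entry of the auxiliary vector with the glibc getauxval() call. The function name platform_name is hypothetical, and nothing here is claimed to be the Advance Toolchain's own mechanism.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/auxv.h>   /* getauxval(), AT_PLATFORM (Linux, glibc 2.16 or later) */

/* Returns the platform string reported by the kernel, or "unknown"
 * when the auxiliary vector has no AT_PLATFORM entry. */
const char *platform_name(void)
{
    unsigned long value = getauxval(AT_PLATFORM);
    return value != 0 ? (const char *)value : "unknown";
}

/* Prints the platform string; the exact text depends on the machine. */
void print_platform(void)
{
    const char *name = platform_name();
    assert(name != NULL);
    printf("platform: %s\n", name);
}
```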
If specific open source or third-party libraries dominate the execution profile of your
application, ask the distribution or library product owner to provide a build that uses
higher optimization. Alternatively, for open source library packages, you can build
your own optimized binary version of those packages.
Deeper empirical analysis
If simple recompilation with higher optimization options or even a more capable compiler does
not provide acceptable performance, then deeper analysis is required. The IBM SDK for
PowerLinux integrates the following analysis tools:
• Migration Assistant analysis of non-performing code and data types
• Application-specific hotspot profiling
• Source Code Advisor (SCA) analysis for non-performing code idioms and induced
execution hazards
The Migration Assistant analyzes the source code directly and does not require a running
binary application for analysis. Profiling and the SCA do require compiled application binary
files and an application-specific benchmark or repeatable workload for analysis.
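A repeatable workload can be as simple as a driver that feeds the code under study the same input on every run. The following is a minimal sketch under stated assumptions: the workload function stands in for the application's real hot path, the fixed-seed generator is only there to make the input deterministic, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Deterministic linear congruential generator, so that every run
 * sees exactly the same input sequence for a given seed. */
static uint32_t lcg_next(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return *state;
}

/* Hypothetical stand-in for the application's hot path.
 * Returns a checksum so that runs can be compared for correctness. */
uint64_t workload(uint32_t seed, size_t iterations)
{
    uint32_t state = seed;
    uint64_t checksum = 0;
    for (size_t i = 0; i < iterations; i++) {
        checksum += lcg_next(&state) >> 16;
    }
    return checksum;
}

/* Runs the workload once and returns the CPU time used, in seconds.
 * The checksum is stored through checksum_out. */
double time_workload(uint32_t seed, size_t iterations, uint64_t *checksum_out)
{
    assert(checksum_out != NULL);
    clock_t start = clock();
    *checksum_out = workload(seed, iterations);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Because the seed and iteration count fix the input, two profiling runs exercise the same code paths, and an unchanged checksum confirms that a rebuilt binary still computes the same result.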
The Migration Assistant
For applications that originate on another platform, the Migration Assistant (MA) can identify
non-portable code that must be addressed for a successful port to Power Systems. The MA
uses the Eclipse infrastructure to analyze:
• Endian-dependent unions and structures
• Casts with potential endian issues
• Non-portable data types
• Non-portable inline assembler code
• Non-portable or architecture-dependent compiler built-ins
• Proprietary or architecture-specific APIs
Use of non-portable data types and inline assembler code can cause poor performance on the
POWER processor and must always be investigated and addressed.
