162 POWER7 and POWER7+ Optimization and Tuning Guide
More information about Rational Performance Advisor, including a trial download, can be
found in Rational Developer for Power Systems Software, available at:
http://www.ibm.com/software/rational/products/rdp/
AIX
This section introduces tools and techniques that are used for optimizing software for a
combination of Power Systems and AIX. The intended audience for this section is software
development teams. As such, this section does not address performance topics that are
related to capacity planning or to system-level performance monitoring and tuning.
For capacity planning, see the IBM Systems Workload Estimator, available at:
http://www-912.ibm.com/estimator
For system-level performance monitoring and tuning information for AIX, see Performance
Management, available at:
http://publib.boulder.ibm.com/infocenter/aix/v7r1/index.jsp?topic=/com.ibm.aix.prftungd/doc/prftungd/multiple_page_size_support.htm
The bedrock of any empirically based software optimization effort is a suite of repeatable
benchmark tests. To be useful, such tests must be representative of the manner in which
users interact with the software. For many commercial applications, a benchmark test
simulates the actions of multiple users that drive a prescribed mix of application transactions.
Here, the fundamental measure of performance is throughput (the number of transactions
that are completed in a given period) with an acceptable response time. Other applications are
more batch-oriented: a few jobs are started, and the time that they take to complete is
measured. Whichever benchmark style is used, it must be repeatable: within some small
tolerance (typically a few percent), running the benchmark several times on the same setup
must yield the same result.
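The repeatability requirement can be checked mechanically. The following Python sketch is
illustrative only: the function names, the sample transaction counts, the 60-second interval,
and the 3% default tolerance are assumptions that were chosen for the example, not values
from this guide. It computes throughput for several runs and tests whether every run falls
within the tolerance of the mean.

```python
from statistics import mean
from typing import Sequence


def throughput(transactions: int, elapsed_seconds: float) -> float:
    """Transactions completed per second over the measurement interval."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return transactions / elapsed_seconds


def max_relative_deviation(results: Sequence[float]) -> float:
    """Largest deviation of any run from the mean, as a fraction of the mean."""
    if len(results) < 2:
        raise ValueError("need at least two runs to judge repeatability")
    avg = mean(results)
    return max(abs(r - avg) for r in results) / avg


def is_repeatable(results: Sequence[float], tolerance: float = 0.03) -> bool:
    """True if every run is within `tolerance` of the mean (0.03 is an assumed default)."""
    return max_relative_deviation(results) <= tolerance


# Hypothetical transaction counts from three 60-second runs on the same setup.
runs = [throughput(count, 60.0) for count in (120_300, 119_100, 121_200)]
print("repeatable:", is_repeatable(runs))
```

If a benchmark fails such a check, the variation between runs is a reason to stabilize the
test setup before any optimization results are compared.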
Tools and techniques that are employed in software performance analysis focus on
pinpointing aspects of the software that inhibit performance. At a high level, the two most
common inhibitors to application performance are:
- Areas of code that consume large amounts of CPU resources. This consumption is usually
  caused by inefficient algorithms, poor coding practices, or inadequate compiler optimization.
- Waiting for locks or external events. Locks are used to serialize execution through critical
  sections, that is, sections of code where the need for data consistency requires that only
  one software thread run at a time. An example of waiting for an external event is the system
  waiting for a disk I/O to complete. Although the amount of time that an application must
  wait for external events might be outside of the control of the application (for example, the
  time that is required for a disk I/O depends on the type of storage employed), simply being
  aware that the application must wait for such an event can open the door to potential
  optimizations.
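A rough first step in telling these two inhibitors apart is to compare elapsed (wall-clock)
time with the CPU time that the process consumed. The following Python sketch is a minimal
illustration under stated assumptions: the Timing and measure names are hypothetical, and
time.sleep stands in for a disk I/O or lock wait. It is not a substitute for the system
profiling tools. Work that is limited by CPU consumption shows a wait fraction near zero,
while work that spends its time blocked shows a wait fraction near one.

```python
import time
from typing import Callable, NamedTuple


class Timing(NamedTuple):
    wall_seconds: float  # elapsed time as seen by the user
    cpu_seconds: float   # CPU time consumed by the whole process

    @property
    def wait_fraction(self) -> float:
        """Share of elapsed time in which the process was not running on a CPU."""
        if self.wall_seconds <= 0:
            return 0.0
        # process_time() sums all threads, so CPU time can exceed wall time
        # in a multithreaded process; clamp the result at zero in that case.
        return max(0.0, 1.0 - self.cpu_seconds / self.wall_seconds)


def measure(work: Callable[[], object]) -> Timing:
    """Run `work` once and record wall-clock and CPU time for the call."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    work()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return Timing(wall, cpu)


def simulated_io_wait() -> None:
    """Stand-in for a blocking disk I/O or lock wait (an assumption for this sketch)."""
    time.sleep(0.05)


timing = measure(simulated_io_wait)
print(f"wall={timing.wall_seconds:.3f}s cpu={timing.cpu_seconds:.3f}s "
      f"wait fraction={timing.wait_fraction:.2f}")
```

A high wait fraction does not identify what the application is waiting for. It only indicates
that further investigation should focus on locks and external events rather than on CPU
consumption.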