Freescale Semiconductor MC68881 - Page 217

5.1.2 Optimization of Code for the MC68882
A program that runs successfully on the MC68881 runs on the MC68882 with improved
performance. However, the code can be optimized to exploit the features of the MC68882
for the maximum performance improvement. Optimization requires the following steps:
1. Unroll any rolled loops to obtain at least a 2x unrolled version,
2. Eliminate register conflicts by rearranging FMOVE instructions, and
3. Rearrange FMOVE instructions so that the fastest FMOVE instructions follow the fastest
arithmetic instructions, and the slowest FMOVE instructions follow the slowest
arithmetic instructions.
5.1.2.1 UNROLLING LOOPS. A rolled loop consists of the instructions to perform the
operations of the loop once using a single index value during each iteration. The performance
of the MC68882 is improved by unrolling the loop, so that an iteration performs those
operations more than once, using two or more index values. The recommended 2x unrolled
version performs the operations twice.
The rolled version of a loop allows little optimization; a register conflict is inevitable. The
2x unrolled version can use different floating-point data registers for each repetition of the
instructions. The FMOVE instructions can be placed in the optimum locations.
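The difference between the rolled and the 2x unrolled form can be sketched in a high-level language. The following Python fragment is only a structural illustration, not MC68882 code: the function names, the operation a*x + b, and the use of the local variables fp0 and fp1 as stand-ins for floating-point data registers are assumptions made for this example.

```python
def scale_rolled(xs, a, b):
    """Rolled loop: one index value per iteration, a single temporary (fp0)."""
    out = []
    for i in range(len(xs)):
        fp0 = xs[i] * a      # arithmetic result lands in the only temporary
        fp0 = fp0 + b
        out.append(fp0)      # the store must use that same temporary
    return out


def scale_unrolled_2x(xs, a, b):
    """2x unrolled loop: two index values per iteration, two temporaries."""
    out = [0.0] * len(xs)
    for k in range(len(xs) // 2):
        i = 2 * k
        fp0 = xs[i] * a          # first repetition uses fp0
        fp1 = xs[i + 1] * a      # second repetition uses fp1
        fp0 = fp0 + b
        fp1 = fp1 + b
        out[i] = fp0             # the two stores are independent of each
        out[i + 1] = fp1         # other and can be placed where they fit best
    if len(xs) % 2:              # leftover element when the length is odd
        out[-1] = xs[-1] * a + b
    return out
```

Both functions return the same values. The unrolled form gives each repetition its own temporary, which is the property that lets the corresponding assembly code use different floating-point data registers and move the FMOVE instructions freely.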
5.1.2.2 AVOIDING REGISTER CONFLICTS. The following rules define conflicts between
floating-point data registers.
A register conflict occurs when the destination register of an instruction is the source
register of the following instruction, and that instruction is a fully-concurrent instruction
listed in Table 5-5. For example:
FADD.D <ea>,FP0
FMOVE.D FP0,<ea>      FP0 conflicts
A register conflict occurs when the destination register of an instruction is the destination
register of the following instruction, and that instruction is a fully-concurrent
instruction listed in Table 5-5. For example:
FADD.D <ea>,FP0
FMOVE.D <ea>,FP0      FP0 conflicts
No other combination of source and destination registers of two consecutive instructions
causes a register conflict.
The second case (where an FMOVE instruction uses the same destination register as the
preceding instruction) is an unlikely case, since the result of the first instruction is lost.
However, the MC68882 provides the same result as the MC68881 even for this case.
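The conflict rules above can be expressed as a small check over two consecutive instructions. The following Python sketch is a hypothetical aid for reasoning about instruction order, not a tool from the manual. The Insn type and the conflict_register function are invented for this example, and the set of fully-concurrent instructions is passed in by the caller because Table 5-5 is not reproduced here; using only FMOVE in that set is an assumption.

```python
from typing import NamedTuple, Optional

# The eight floating-point data registers, FP0 through FP7.
FP_REGS = {f"FP{i}" for i in range(8)}


class Insn(NamedTuple):
    """Hypothetical model of one instruction: mnemonic, source, destination."""
    op: str    # for example "FADD.D" or "FMOVE.D"
    src: str   # a register name such as "FP0", or "<ea>" for a memory operand
    dst: str


def conflict_register(first: Insn, second: Insn,
                      fully_concurrent: set) -> Optional[str]:
    """Return the conflicting register for two consecutive instructions,
    or None when the rules above report no conflict."""
    mnemonic = second.op.split(".")[0]       # drop the size suffix
    if mnemonic not in fully_concurrent:     # rules apply only to these
        return None
    if first.dst not in FP_REGS:             # only FP data registers conflict
        return None
    if first.dst in (second.src, second.dst):
        return first.dst                     # source or destination match
    return None
```

For instance, with the assumed set {"FMOVE"}, the pair FADD.D <ea>,FP0 followed by FMOVE.D FP0,<ea> reports FP0, while changing the FMOVE source to FP1 reports no conflict.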
5.1.2.3 ARRANGING FMOVE INSTRUCTIONS. The FMOVE instruction is fully concurrent
when the operands are in binary real data format, no register conflicts exist, and the notes
of Table 5-5 do not apply. However, the execution time of the FMOVE instruction is hidden
completely only when the overlap time of the preceding instruction exceeds the execution
time of the FMOVE instruction. Thus, the fastest FMOVE instructions should follow the
fastest arithmetic instructions, FADD, for example. Also, the slowest FMOVE instructions
should follow the slowest arithmetic instructions, such as FMUL. Refer to the tables of
MC68881/MC68882 USER'S MANUAL
FREESCALE
5-9