Freescale Semiconductor MC68881


5.1.2 Optimization of Code for the MC68882

A program that runs successfully on the MC68881 runs on the MC68882 with improved performance. However, the code can be optimized to exploit the features of the MC68882 for the maximum performance improvement. Optimization requires the following steps:

1. Unroll any rolled loops to obtain at least a 2x unrolled version,

2. Eliminate register conflicts by rearranging FMOVE instructions, and

3. Rearrange FMOVE instructions so that the fastest FMOVE instructions follow the fastest arithmetic instructions, and the slowest FMOVE instructions follow the slowest arithmetic instructions.

5.1.2.1 UNROLLING LOOPS. A rolled loop consists of the instructions that perform the operations of the loop once per iteration, using a single index value. The performance of the MC68882 is improved by unrolling the loop so that each iteration performs those operations more than once, using two or more index values. The recommended 2x unrolled version performs the operations twice.

The rolled version of a loop allows little optimization; a register conflict is inevitable. The 2x unrolled version can use different floating-point data registers for each repetition of the instructions. The FMOVE instructions can be placed in the optimum locations.
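The difference between a rolled and a 2x unrolled loop can be sketched as follows. This is an illustrative fragment, not an example taken from the manual: it assumes an element-by-element product of two double-precision arrays, with A0 and A1 pointing to the source arrays, A2 pointing to a non-overlapping result array, and D0 preloaded with the DBRA loop count. The labels ROLLED and UNROLL and the ";" comment delimiter are assumptions about the assembler in use.

```asm
; Rolled loop: one element per iteration (assumes D0 = count - 1)
ROLLED  FMOVE.D (A0)+,FP0       ; FP0 = a[i]
        FMUL.D  (A1)+,FP0       ; FP0 = a[i] * b[i]
        FMOVE.D FP0,(A2)+       ; c[i] = FP0; FP0 conflicts with the FMUL
        DBRA    D0,ROLLED

; 2x unrolled loop: two elements per iteration
; (assumes an even element count and D0 = count/2 - 1)
UNROLL  FMOVE.D (A0)+,FP0       ; FP0 = a[i]
        FMUL.D  (A1)+,FP0       ; FP0 = a[i] * b[i]
        FMOVE.D (A0)+,FP1       ; FP1 = a[i+1]; preceding destination is FP0
        FMUL.D  (A1)+,FP1       ; FP1 = a[i+1] * b[i+1]
        FMOVE.D FP0,(A2)+       ; c[i] = FP0; preceding destination is FP1
        FMOVE.D FP1,(A2)+       ; c[i+1] = FP1
        DBRA    D0,UNROLL
```

In the rolled version the only place for the store is directly after the FMUL that produces its operand. In the unrolled version the second repetition uses FP1, so the load of a[i+1] and the store of c[i] each follow an arithmetic instruction that wrote a different register.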

5.1.2.2 AVOIDING REGISTER CONFLICTS. The following rules define conflicts between floating-point data registers.

• A register conflict occurs when the destination register of an instruction is the source of the following instruction, and the following instruction is a fully-concurrent instruction listed in Table 5-5. For example:

    FADD.D   <ea>,FP0
    FMOVE.D  FP0,<ea>     FP0 conflicts

• A register conflict occurs when the destination register of an instruction is the destination register of the following instruction, and the following instruction is a fully-concurrent instruction listed in Table 5-5. For example:

    FADD.D   <ea>,FP0
    FMOVE.D  <ea>,FP0     FP0 conflicts

• No other combination of source and destination registers of two consecutive instructions causes a register conflict.

The second case (where an FMOVE instruction uses the same destination register as the preceding instruction) is unlikely, since the result of the first instruction is lost. However, the MC68882 provides the same result as the MC68881 even in this case.
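As an illustration of the first rule, the straight-line fragment below shows a conflicting order and one possible rearrangement. It is a sketch under stated assumptions rather than a sequence from the manual: the address registers A0 through A3 are assumed to point to non-overlapping double-precision operands, so moving the first store past the second FADD does not change the values stored, and the ";" comment delimiter is an assumption about the assembler.

```asm
; Conflicting order: each FMOVE reads the register written by the
; instruction immediately before it
        FADD.D  (A0)+,FP0       ; FP0 = FP0 + operand at (A0)
        FMOVE.D FP0,(A2)+       ; FP0 conflicts
        FADD.D  (A1)+,FP1       ; FP1 = FP1 + operand at (A1)
        FMOVE.D FP1,(A3)+       ; FP1 conflicts

; Rearranged order: same operations, but no FMOVE uses the
; destination register of the instruction immediately before it
        FADD.D  (A0)+,FP0       ; FP0 = FP0 + operand at (A0)
        FADD.D  (A1)+,FP1       ; FP1 = FP1 + operand at (A1)
        FMOVE.D FP0,(A2)+       ; preceding destination is FP1
        FMOVE.D FP1,(A3)+       ; preceding destination is memory
```

The rearrangement only reorders instructions; whether each FMOVE is then fully concurrent still depends on the conditions and notes of Table 5-5.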

5.1.2.3 ARRANGING FMOVE INSTRUCTIONS. The FMOVE instruction is fully concurrent when the operands are in binary real data format, no register conflicts exist, and the notes of Table 5-5 do not apply. However, the execution time of the FMOVE instruction is hidden completely only when the overlap time of the preceding instruction exceeds the execution time of the FMOVE instruction. Thus, the fastest FMOVE instructions should follow the fastest arithmetic instructions, such as FADD, and the slowest FMOVE instructions should follow the slowest arithmetic instructions, such as FMUL. Refer to the tables of

MC68881/MC68882 USER'S MANUAL

FREESCALE

