i960 Processor Compiler User's Guide
The left diagram shows that path A➠B➠D is heavily traveled and would
thus be detected as a superblock candidate. To form a superblock from
this candidate, it is necessary to remove the arc C➠D. This is done as
shown in the middle diagram. Block D is duplicated, and block C is
altered to flow to D'. The dashed arc from block B to block D indicates
that these two blocks are likely to be merged into a single block.
This merging increases the scope of the local optimizer and of the
scheduler, both of which work on a single block at a time. The
superblock loop containing only blocks A, B, and D is formed in the
diagram on the right. An empty header block, H, has been created, and
the original single loop in the middle diagram now becomes two loops: a
nested superblock loop headed by A and an outer loop headed by H.
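
The duplication step from the middle diagram can be sketched in a few lines
of Python. This is only an illustration, not the compiler's actual
implementation. The control-flow graph is an assumption reconstructed from
the description (A branches to B or C, both flow to D, and D loops back to A
or leaves through a hypothetical "exit" block), and tail_duplicate is a
made-up helper name. The sketch stops before the header block H is created.

```python
def tail_duplicate(cfg, trace):
    """Return a copy of `cfg` in which `trace` has no side entrances.

    cfg   : dict mapping a block name to the list of its successors
    trace : list of block names on the heavily traveled path, in order
    """
    new_cfg = {block: list(succs) for block, succs in cfg.items()}

    def side_preds(i):
        # Predecessors of trace[i] other than the preceding trace block.
        block, prev = trace[i], trace[i - 1]
        return [p for p, succs in cfg.items() if block in succs and p != prev]

    # The first trace block (after the head) that has a side entrance.
    first = next((i for i in range(1, len(trace)) if side_preds(i)), None)
    if first is None:
        return new_cfg  # already a superblock candidate with a single entry

    # Duplicate every block from that point to the end of the trace.
    copy_of = {block: block + "'" for block in trace[first:]}
    for i in range(first, len(trace)):
        block = trace[i]
        nxt = trace[i + 1] if i + 1 < len(trace) else None
        # The copy keeps the original successors, except that the edge
        # to the next trace block goes to that block's copy instead.
        new_cfg[copy_of[block]] = [copy_of[s] if s == nxt else s
                                   for s in cfg[block]]
        # Redirect each side entrance to the copy.
        for pred in side_preds(i):
            new_cfg[pred] = [copy_of[block] if s == block else s
                             for s in new_cfg[pred]]
    return new_cfg


# Assumed shape of the left diagram.
cfg = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["A", "exit"], "exit": []}
superblock_cfg = tail_duplicate(cfg, ["A", "B", "D"])
```

In the result, C flows to D' and the only remaining predecessor of D is B,
which is what allows B and D to be merged later.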
The fundamental advantage of superblock formation is the
removal of data dependencies. In the diagram on the left, any data
modifications in block C must be considered when optimizing the loop.
These modifications often have a negative effect, inhibiting the classic
loop optimizations. For example, if block C contains a procedure call, it
appears to modify all memory variables. Optimizations involving memory
references are inhibited in this case. In the diagram on the right, data
modifications in block C do not affect loop optimizations in the superblock
loop ABD.
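
The effect of a procedure call in block C can be modeled with a toy
legality check for hoisting a memory load out of a loop. This is a
simplified sketch under stated assumptions, not the compiler's analysis: the
per-block write sets, the ALL_MEMORY marker, the variable name "limit", and
the function can_hoist_load are all hypothetical.

```python
ALL_MEMORY = "*"  # hypothetical marker: the block may modify any memory variable


def can_hoist_load(variable, loop_blocks, writes):
    """Return True if no block in the loop may modify `variable`."""
    for block in loop_blocks:
        modified = writes.get(block, set())
        if ALL_MEMORY in modified or variable in modified:
            return False
    return True


# Block C contains a procedure call, so it is treated as writing all memory.
writes = {"A": set(), "B": set(), "C": {ALL_MEMORY}, "D": set()}

original_loop = ["A", "B", "C", "D"]   # loop in the left diagram
superblock_loop = ["A", "B", "D"]      # inner loop in the right diagram

hoist_in_original = can_hoist_load("limit", original_loop, writes)
hoist_in_superblock = can_hoist_load("limit", superblock_loop, writes)
```

With block C inside the loop the load cannot be hoisted; once C lies outside
the superblock loop ABD, the same check succeeds.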
Profile-based Branch-prediction Bit Setting
Without program profile data, the compiler uses a fixed rule for setting the
branch-prediction bits for the processor.
With program profile data, the compiler sets the branch-prediction bits
according to the branch behavior recorded in the profile. For the profiled
program, this setting generally predicts branches better than the fixed rule.
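
As a rough illustration of the two cases, the choice of a prediction bit for
one conditional branch might look like the sketch below. The text does not
state what the fixed rule is or how the profile counts are weighed, so the
fixed rule is passed in as a placeholder argument and the majority comparison
is an assumption. predict_taken and its parameters are hypothetical names.

```python
def predict_taken(taken_count, not_taken_count, fixed_rule_taken):
    """Return True to predict 'taken' for a branch, False otherwise.

    taken_count, not_taken_count : profile counts for this branch
    fixed_rule_taken             : what the fixed rule would predict
                                   (placeholder; the rule is not specified)
    """
    if taken_count == not_taken_count:
        # No profile data (both counts zero) or no clear majority:
        # fall back to the fixed rule.
        return fixed_rule_taken
    return taken_count > not_taken_count


mostly_taken = predict_taken(90, 10, fixed_rule_taken=False)
rarely_taken = predict_taken(3, 97, fixed_rule_taken=True)
unprofiled = predict_taken(0, 0, fixed_rule_taken=True)
```

A branch without usable profile counts simply keeps whatever the fixed rule
would have chosen.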