Unlike in previous versions of GCC, -fbranch-probabilities has a positive effect on the generated code. We tested our work on the i386 GNU/Linux platform, but the majority of platforms should be functional in the 3.1 and 3.2 development trees. In the cfg-branch tree we briefly tested PowerPC and SPARC and found them functional.
On the cfg-branch we added the following command-line options to control the new optimization passes:
The webizer pass improves register allocation and common subexpression elimination in cases where a single variable is used in multiple independent contexts, such as i serving as the counter of several loops.
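The effect described for the webizer can be sketched at the source level. This is only an illustration, assuming the pass behaves as if each independent use of the variable had been given its own name; the function names are hypothetical and the real pass works on GCC's internal representation, not on C source.

```c
#include <stddef.h>

/* One variable `i` is the counter of two unrelated loops, so a naive
   allocator treats both uses as a single long-lived value. */
long sum_then_max_shared(const int *a, size_t n)
{
    size_t i;
    long sum = 0;
    int max = n ? a[0] : 0;

    for (i = 0; i < n; i++)
        sum += a[i];
    for (i = 0; i < n; i++)
        if (a[i] > max)
            max = a[i];
    return sum + max;
}

/* Hand-written equivalent of what splitting the variable achieves:
   each loop gets its own counter, which can be allocated and
   optimized independently of the other. */
long sum_then_max_split(const int *a, size_t n)
{
    long sum = 0;
    int max = n ? a[0] : 0;

    for (size_t i1 = 0; i1 < n; i1++)
        sum += a[i1];
    for (size_t i2 = 0; i2 < n; i2++)
        if (a[i2] > max)
            max = a[i2];
    return sum + max;
}
```

Both functions compute the same result; only the lifetime of the counters differs.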
Tracer performs code duplication in order to help other optimizers. The resulting code is larger, but should run faster unless code cache limits are hit.
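A source-level sketch of the kind of duplication the tracer description refers to, assuming it amounts to copying the code after a join point into each incoming path. The function names are hypothetical, and the second function is written by hand to show what later optimizers could derive from the duplicated code.

```c
/* Both arms of the conditional join before the shared tail, so the
   tail must be compiled for an unknown value of x. */
int tail_joined(int a, int b)
{
    int x;

    if (a > 0)
        x = 1;
    else
        x = b;
    return x * 4 + a;   /* shared tail */
}

/* The tail is duplicated into each arm. On the first path x is known
   to be 1, so the multiplication folds to a constant. The code is
   larger, but each path is simpler. */
int tail_duplicated(int a, int b)
{
    if (a > 0)
        return 4 + a;   /* copy of the tail with x == 1 */
    return b * 4 + a;   /* copy of the tail with x == b */
}
```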
Basic block reordering reduces the number of taken conditional jumps in the code, resulting in better instruction decoder performance and a smaller code cache footprint.
Function reordering further improves code locality and avoids code cache conflicts. This option has an effect only when profile feedback is available and the target assembler supports named sections.
Loop unrolling duplicates the loop body several times to improve other optimizations and instruction decoder performance. By default, unrolling is done only for loops where the iteration counter can be identified. --param max-unrolled-insns= and --param max-unroll-times= may be used to control the amount of unrolling. The first parameter specifies the number of instructions in the loop body that the unroller attempts to reach, while the second limits the number of copies of the loop body that are made.
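The transformation can be illustrated by hand in C. This is a minimal sketch, assuming an unroll factor of four chosen arbitrarily for the example; the function names are hypothetical, and the compiler performs the equivalent rewrite on its internal representation, choosing the factor within the limits given by the parameters above.

```c
#include <stddef.h>

/* Original loop: one addition and one branch per element. */
long sum_rolled(const int *a, size_t n)
{
    long s = 0;

    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Same loop with the body copied four times. The main loop takes one
   branch per four elements; a short remainder loop handles the
   leftover 0..3 elements. */
long sum_unrolled4(const int *a, size_t n)
{
    long s = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        s += a[i];
        s += a[i + 1];
        s += a[i + 2];
        s += a[i + 3];
    }
    for (; i < n; i++)
        s += a[i];
    return s;
}
```

The unrolled version is larger, which is why the number of copies and the size of the resulting body are bounded.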
Same as -fnew-unroll-loops, but all loops are unrolled.
Loop peeling places copies of the loop body in front of the loop itself. For loops with small average iteration counts, the peeled copies can often avoid entering the loop at all. The peeled loop body can also be better optimized by other optimization passes and scheduled into the code just before the loop.
Again, the --param max-peeled-insns= and --param max-peel-times= options can be used, with meanings analogous to the -fnew-unroll-loops parameters.
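Peeling can likewise be sketched by hand. The example below assumes a single peeled iteration and uses hypothetical function names; it is meant only to show the shape of the result, not the exact code the compiler produces.

```c
#include <stddef.h>

/* Original loop: counts characters up to the terminating NUL. */
size_t length_rolled(const char *s)
{
    size_t n = 0;

    while (s[n] != '\0')
        n++;
    return n;
}

/* One iteration peeled off in front of the loop. For empty strings
   the loop is never reached, and the peeled copy is straight-line
   code that can be scheduled with whatever precedes it. */
size_t length_peeled1(const char *s)
{
    size_t n = 0;

    if (s[n] == '\0')       /* peeled copy of the loop test */
        return n;
    n++;                    /* peeled copy of the loop body */

    while (s[n] != '\0')    /* remaining iterations */
        n++;
    return n;
}
```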
Loop unswitching removes loop-invariant conditionals from the body of a loop by duplicating the loop and moving the conditional out of the body, in front of the copies. This usually results in better performance at the cost of larger code size.
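A hand-written C sketch of unswitching follows. The function and parameter names are hypothetical; the point is only that the test of the invariant flag runs once instead of once per iteration, at the price of having two copies of the loop.

```c
#include <stddef.h>

/* The flag use_scale does not change inside the loop, yet it is
   tested on every iteration. */
void fill_switched(int *dst, const int *src, size_t n, int use_scale, int k)
{
    for (size_t i = 0; i < n; i++) {
        if (use_scale)
            dst[i] = src[i] * k;
        else
            dst[i] = src[i];
    }
}

/* Unswitched form: the invariant test is done once, and each branch
   gets its own copy of the loop with a conditional-free body. */
void fill_unswitched(int *dst, const int *src, size_t n, int use_scale, int k)
{
    if (use_scale) {
        for (size_t i = 0; i < n; i++)
            dst[i] = src[i] * k;
    } else {
        for (size_t i = 0; i < n; i++)
            dst[i] = src[i];
    }
}
```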
Midlevel RTL is an alternate intermediate code representation in GCC that may be used for more aggressive optimizations. For some targets, midlevel RTL is required for loop unswitching and loop unrolling to be effective.
This option is on by default.
We have added an attribute noprofile for disabling profiling (see [1] for details).
Jan Hubicka 2003-05-04