The Profile

In many cases compiler must make choice whether to trade speed in one part of code for speed in another, or trade code size for code speed. In such cases it is useful to know information about how often given block executes and that is the purpose for maintaining profile within flow graph.

GCC allows the profile to be either feedback based or statically estimated.

The feedback based profile is produced by compiling the program with instrumentation, executing it on the train run and reading the numbers of executions of basic block edges back to the compiler while re-compiling the program to produce final executable. This method provides very accurate information about where program spends most of time on the train run. Whether it matches the average run depends, of course, on the choice of train data set, but several studies has shown, that the behavior of program usually changes just marginally over different data sets.

When profile feedback is not available, compiler attempts to predict the behavior of each branch in the program using a set of heuristics (see predict.def for details) and compute estimated frequencies of each basic block by propagating the probabilities over the graph.

Each basic block contains two fields -- frequency and count. Frequency is an estimation how often is basic block executed within a function and is represented as integer scaled in the range 0-BB_FREQ_BASE. Most frequent basic block in function is initially set to BB_FREQ_BASE and rest of frequencies are scaled according to that. During optimization, the frequency of most frequent basic block can both decrease (for instance by loop unrolling) or grow (for instance by cross-jumping optimization).

The count contains number of execution measured during training run and is nonzero only when profile feedback is available. This value is represented as 64bit integer. Most optimization passes can use only the frequencies of basic block, while few passes may want to know exact counts. The frequencies should always match the counts after scaling, however during updating of the profile information numerical error may accumulate into quite large errors.

Similarly each edge contains probability field--an integer in range from 0 to REG_BR_PROB_BASE. It represents probability of passing control from the end of source basic block to the destination. The probability that control flow arrives via given edge to the destination basic block is called reverse probability and is not directly represented, but it may be easily computed from frequencies of basic blocks. EDGE_FREQUENCY macro is available to compute how frequently is given edge taken. count field is present for each edge as well, representing same information as for basic block.

The basic block frequencies are not represented in the RTL instruction stream, the edge frequencies are represented only for conditional jump via REG_BR_PROB, since they are used when instructions are output to the assembly file and flow graph is no longer maintained.

Jan Hubicka 2003-05-04