Implementation

First we have to know what is stored in a register (in RTL representation). We have added one more field to the RTL representation. This field contains a pointer to structure describing (a part of) a variable stored in register. We use a hash table for these structures. Information about content of a register is updated when using sub-registers, allocating registers or when doing Live-Range Splitting. There is also similar description of variable stored in memory.

The Variable Tracking pass does data-flow analysis. Computing data-flow on RTL itself would be complicated, so it first scans each basic block and remembers each use of register/memory in the basic block. Each such a record contains the register/memory, the instruction that it is in and the purpose (LT_SET_DEST9.2, LT_PARAM9.3, LT_CLOBBERED9.4) or the register/memory in the instruction.

Now we perform the data-flow analysis to propagate the variable locations. We use Hybrid Search Algorithm[8] that is faster than the traditional iterative or worklist algorithms (actually hybrid search algorithm is a combination of iterative and worklist algorithm). First, we initialize the IN and OUT sets for each basic block. The sets are actually arrays of link-lists, each link-list contains the information about (a part of) a variable that is stored in corresponding register. The memory references will be remembered in a hash table. We use link-list for registers because there can be more than one variable assigned to a register (see an example of code where register allocator can assign two variables to one register) but the number will be quite low so the link-list is probably the fastest solution.

Example: Register allocator can assign the same register to variable A and B in this code.
\begin{algorithmic}
\IF{condition}
\STATE{define value of variable A}
\ELSE
\STA...
...nor B}
\IF{condition}
\STATE{use A}
\ELSE
\STATE{use B}
\ENDIF
\end{algorithmic}

Then we clear the "visited" bitmap and set the "pending" bitmap and put all blocks to "worklist" (Fibonacci heap). The blocks in worklist are ordered in reverse completion order of Depth-First Search (DFS) which should speed up the data-flow analysis. We take the first block from the worklist and start search from it. Given a basic block, the IN set of the block is computed as a union of predecessors' OUT sets. Then we clear the pending bit and set the visited bit, and compute the OUT set from the IN set and the recorded purposes of each register/memory in the basic block. If the OUT set has changed we set the successors' pending bits. The search continues in the unvisited successors.

We are passing the locations of variables by NOTE_INSN_VAR_LOCATION notes to the debug info writer. Each note describes the location of one variable at the point in instruction stream where the note is. The note contains a list of pairs (offset, location) because variable can have several parts and each part has its own location (for example 64-bit integer variable on 32-bit machine has usually 2 parts). There is no need to emit a note for each variable before each instruction, we only emit these notes where the location of variable changes.

After the data-flow finishes, we know where the variables are in the beginning and the end of each basic block. We emit two groups of notes for each basic block:

  1. Notes for the changes between the OUT set of the previous block and the IN set of the current block (the OUT set of the "previous" block of the first block is empty by definition) will be emitted before the head of actual basic block.
  2. Notes for changes of variable locations because of the effects of instructions in actual basic block. If the purpose of the reg/mem in the instruction is LT_SET_DEST or LT_CLOBBERED the note will be emitted after the instruction that the reg/mem is in (because the instruction sets/destroys the value). If the purpose of the reg/mem is LT_PARAM the note will be emitted before the instruction because a variable is already in the location before this instruction.

Jan Hubicka 2003-05-04