Subsections
The project was leaded by David
Bednárek13.1. Following students
participated in the project directly:
- Zdenek Dvorák
- 13.2
Zdenek implemented the thread safe profiling code, added infrastructure for
high level branch predictor algorithms.
Major part of his work is related to loop optimizations. He implemented
new data structure to hold natural loop tree and modified natural loop discovery
code originally contributed by Michael Hayes to produce it. On the top of it
he implemented loop unswitching, loop unrolling and loop peeling algorithms,
one of the most successful new optimizations in our project.
He also made numerous contributions to the control flow graph handling code,
most importantly he changed the basic block representation from linearly ordered
array to double linked list avoiding quadratical complexity of many important
algorithms in new CFG aware code. This required many changes of code in
compiler that uses basic blocks.
He also found and fixed many latent bugs in the compiler and our new improved
CFG manipulation code.
- Jan Hubicka
- 13.3
Jan had special role in project by being the only participant with previous
knowledge of GCC internals and its development model. In early stages of
project he helped other participants to familiarize with project by solving
relatively simple problems and has been concentrating on adding infrastructure
necessary for CFG and profile manipulation. He also added new branch prediction
heuristics and a pass to estimate profile.
Predating the official start of project, he implemented first prototype of
working basic block profiler, extended branch prediction pass and modified few
passes to use the information. Since the information was not maintained
only register allocation used the information. He also
implemented bare bones of future cfg-cleanup pass.
In later stages of project he implemented few new optimization passes (tracer,
webizer, register coalescing), but continued to concentrate on GCC cleanups and
infrastructural changes and later on mid-level RTL implementation. He also
reviewed majority of contributions of other participants before integrating to
the source tree.
- Pavel Nejedlý
- 13.4
Major part of Pavel's work is concentrated on the basic block profiler. He
implemented new, robust and extensible file format to hold profiles. His
format allows overall information to be summarized which is used to identify better hot
spots in the program and to do code placement decisions as well as user errors
(such as using outdated profiles) to be detected and reported. In future his
work will allow more sophisticated profile merging algorithm, as new format
stores each train run separately instead of merging them all at once.
Pavel has also reimplemented the interface between profile instrumentation
generated by compiler and the libgcc runtime in cleaner and more extensible
way.
Last his major contribution is new data structure to hold dominance information
that allows fast dynamical updates and is necessary for Zdenek's new loop optimizer
code.
- Josef Zlomek
- 13.5
In early stages Josef has found and fixed major problem of Michael's natural
loop discovery code. Later he has implemented and tuned the new basic block
reordering algorithm based on software trace cache. His implementation is
in many way superior to the one described by original article. He also
improved and fixed the cfglayout module he used for his work.
Most important contribution is probably the new variable tracking pass fixing
the aged defect of GCC debugging information output code allowing us to enable
more optimizations at default level.
Josef has also reviewed lots of code done by Jan Hubicka and provided
useful comments and fixed important bugs.
Following people has contributed to our project:
- Andreas Jaeger
-
Andreas set up the SPEC2000 tester to do daily benchmarking of our development tree
which allowed us to easily tune new optimizations on nontrivial benchmark suite.
He also set up an extended weekly tester which provided feedback for our branch
prediction heuristics, and kindly benchmarked results of our work that are presented in
this documentation.
- Richard Henderson
-
Richard, one of the most active maintainers of GCC, has reviewed and approved for
inclusion to the mainline the essential parts of our work. He also provided
number of important ideas. Without his help our project would not be possible
at all.
He also helped us to redesign scope tree maintenance code in cfglayout.c.
- Jeff Law
-
Jeff pushed us to implement midRTL.
- Daniel Berlin
-
Daniel implemented and tested Dwarf2 output routine based on Josef's
variable tracking pass. He also tested and fixed our version of compiler on
powerPC and added improvements to the loop unrolling debug output code.
Our documentation has been written by all participants of the project each
describing his portion of work. The printed section contains introduction part
and experimental results. The later section describing data structures and
interfaces has been committed for inclusion in the official GCC manual
[1].
Major portion of documentation however is in the sources. This is dictated
by GNU Coding Standards [1] and maintainers of GCC project. Since
GCC contains a lots of aged and complicated code, good documentation is essential.
Each function should have description of purpose of its arguments and return
value, in addition to commenting all nontrivial steps in the function body.
Each of our newly added modules contains summary of algorithm in the
introductory comment.
Since our project is integrated to the GCC source tree, we provide short
overview of locations of our changes. We have implemented from scratch following
modules:
cfgcleanup.c, cfgloop.c, cfgloopanal.c, tracer.c,
web.c, coalesce.c, loop-new.c unroll-new.c, unroll-new.c
var-tracking.c, bb-reorder.c13.6, midrtl.c, et-forest.c, et-forest.h,
predict.def and analyze_branches script.
We made major changes to following modules:
- flow.c
- We split out the CFG handling code into cfg.c,
cfganal.c, cfgrtl.c, cfgloop.c and importantly changed
(de facto reimplemented) all the contents. We also added support for
easy dynamic updating of liveness information.
- emit-rtl.c
- We modified all routines to update the flow graph.
- toplev.c
- We reorganized the
rest_of_compilation
function.
- cfglayout.c
- We rewrote the code to handle syntactic scope nest tree and
changed all interfaces. We added functionality for code duplication and reorganized
the way how the insn chain is put together.
- gcse.c
- GCSE now maintains liveness information and uses our code
hoisting infrastructure code to avoid limitations of previous implementation.
- df.c
- Since we are the first users of Michael Hayes' data flow module,
we had to fix bugs and some design defects. We also cleaned up the code.
- gen*.c
- All these functions had to be updated to handle midRTL.
- recog.c
- The most of mid level RTL code is present here.
- optabs.c
- We had to modify all RTL generation routines to produce mid level RTL code.
- ifcvt.c
- contains double test optimization pass.
- predict.c
- contains new branch predictor implementation.
We basically rewrote this module from scratch.
- profile.c, final.c, libgcc.c
- contain updated profiler implementation and changed interfaces.
- reg-stack.c,reg-class.c,local.c,global.c
- Updated to use the profile.
- stmt.c
- contains high level branch predictors.
Because of the nature of project, we had to touch almost all other back-end modules in GCC as well.
Exact progress of the project is described in ChangeLog, ChangeLog.6 and ChangeLog.cfg files in the
GCC subdirectory, but here we point out some dates we consider especially important for our project.
- Jul 28
- First prototype of new branch predictor and profiler integrated to GCC + modification of register allocator. (Jan Hubicka)
- Sep 10
- First version of cfgcleanup integrated. (Jan Hubicka)
- Sep 25
- Large reorganization of CFG related modules. flow.c breakup (Jan Hubicka)
- Nov 12
- CFG efforts moved from mainline to cfg-branch due to destabilization.
- Nov 12
- Natural loop discovery code finally works. (Josef Zlomek)
- Nov 12
- Cross-jumping and jump threading merged to mainline. (Jan Hubicka)
- Nov 13
- Old jump optimization pass is finally dead. (Jan Hubicka)
- Nov 15
- High level branch prediction integrated. (Zdenek Dvorák)
- Nov 17
- CFG layout duplication code and preliminary tracer implementation. (Jan Hubicka)
- Nov 26
- Double test combining pass. (Jan Hubicka)
- Dec 11
- First version of software trace cache implementation. (Josef Zlomek)
- Dec 12
- Tracer now produces better code.
- Dec 13
- Major of our changes are now in the mainline tree. Thanks to Richard Henderson.
- Dec 15
- GCC mainline feature freeze.
- Jan 1
- New profiler file format. (Pavel Nejedlý)
- Jan 2
- Thread safe profiling integrated. (Zdenek Dvorák)
- Jan 9
- Software trace cache now produces better code (new loop rotation code). (Josef Zlomek)
- Jan 16
- Code hoisting infrastructure and GCSE revamp. (Jan Hubicka)
- Jan 21
- Midlevel RTL prototype. (Jan Hubicka)
- Jan 23
- Variable tracking code. (Jan Hubicka)
- Jan 24
- New natural loop code. (Zdenek Dvorák)
- Feb 13
- Finished integration of most of cleanups, fixes and infrastructural updates to the mainline. Thanks to Richard Henderson.
- Feb 15
- Final mainline freeze.
- Feb 19
- Major cfglayout cleanups and fixes. (Jan Hubicka)
- Feb 21
- Loop code updates integrated. (Zdenek Dvorák)
- Feb 25
- Loop unswitching code. (Zdenek Dvorák)
- Mar 18
- Loop unrolling and peeling code. (Zdenek Dvorák)
- Mar 25
- New debug output code. (Daniel Berlin, Josef Zlomek)
- Apr 1
- Feature freeze.
- Apr 3
- PowerPC and Sparc bootstrapped.
- Apr 9
- Fast dominance tree updating code integrated. (Pavel Nejedlý)
Jan Hubicka
2003-05-04