Updated September 8, 2010: Our translation validator implementation has been improved dramatically since our first release. The current version uses an alternative to structural analysis which allows us to handle more functions. In addition, we have improved our validation results for our original optimizations, and added new optimizations. The new results are summarized in the table below.

optimized functionsValidated optimizationsAlarmsRatio
GVN 535 480 55 89%
SCCP 78 71 9 91%
LICM 88 86 2 98%
DCE 82 82 0 100%
LD 32 32 0 100%
New Optimiations
ADCE 82 82 0 100%
CONSTPROP 6 2 2 50%
DIE 82 82 0 100%

Using our prototype implementation of LLVM M.D., we have run three experiments. First, we have a number of difficult, hand-written examples that we have optimized using LLVM or by hand. These hand-written examples are designed to test extreme cases of both correct and incorrect optimizations. Those examples include optimizations such as global constant and copy propagation and folding, common-subexpression elimination with global value numbering or lazy code motion, list, trace and global scheduling, loop fission and fusion, loop invariant code motion, loop deletion, sparse conditional constant propagation, and loop unswitching.

Our second experiment uses the sqlite3 embedded database library. This library is comprised of approximately 100,000 lines of C code in over 1300 different functions. We optimize the sqlite3 library using different optimizations and try to validate each transformed function. The results we have obtained are summarized in the Table below, where GVN stands for global value numbering, SCCP stands for sparse conditional constant propagation, LICM stands for loop invariant code motion, DCE stands for dead code elimination, and LD stands for loop deletion. Note that the current implementation of LLVM uses alias analysis, and eliminates some partially redundant computations and redundant loads. We believe that it implements an AWZ-like GVN analysis.

optimized functionsValidated optimizationsAlarmsRatio
GVN 350 326 24 93%
SCCP 42 35 7 83%
LICM 79 76 3 96%
DCE 43 43 0 100%
LD 25 25 0 100%

The second column is the number of functions for which we observe one or more transformations. For instance, for GVN (using alias information), we have observed that 350 functions were transformed, of which we validated 326, and produced 24 alarms.

Finally, for our third experiment, we have used LLVM M.D. to validate optimizations on C code produced by The Mathworks Real-time workshop from a combined Stateflow/Simulink model. The model is a classic example of the temperature of a plant along with it's cooling system in Simulink, and a control-command program that takes as input the current temperature and decides when fans should be turned on or off. The important conclusion from this experiment is that for such code, that respects the safety-critical guidelines for C, we can restructure every function and apply the validator, even for inter-procedural optimization.