Interactions of data-flow analyses
Deep common sub-expressions
Using available-expressions analysis to identify redundant expressions only
works for expressions that are textually identical. For example, an application of common-subexpression elimination will recognize that t1 in the
code fragment
t1 = b + c; a = t1 + d;
has the same value as does t2 in
t2 = b + c; e = t2 + d;
as long as the variables b and c have not been redefined in between. It
does not, however, recognize that a and e are also the same. It is
possible to find such "deep" common subexpressions by re-applying common
subexpression elimination until no new common subexpressions are found
on one round.
Multiple-passes do not always help
One data-flow analysis can improve the results of another data-flow analysis and vice-versa.
Classic example: constant propagation and unreachable code elimination: performing constant propagation
and folding may replace branch predicates with constant boolean values, enabling more code to be identified as unreachable; conversely, eliminating unreachable code can remove non-constant assignments to variables, thus improving the precision of constant-propagation.
There are many more examples of mutually-beneficial interactions of different data-flow analyses.
Phase-ordering problem: if two or more analyses are mutually beneficial, then any ordering of the analyses in which each is run only once may yield sub-optimal results. Some common responses to the
phase-ordering problem:
- Carefully choose the order; allow the same analysis to be performed multiple times if needed.
- Use a meta-fixed-point loop that applies all analyses in a sequence and keeps repeating it, until there is no change.
Unfortunately, in the presence of loops, even the meta-fixed-point approach may yield sub-optimal
result. This is so because: when analyzing a loop, optimistic initial assumptions must be made simultaneously
for all mutually beneficial analyses to reach the best solution: performing the analyses separately, in effect, makes the pessimistic assumptions about the solutions of all other analyses, from which it is not possible to recover, simply by iterating the separate analyses.
Example:
x := 10;
while (...) {
if (x == 10) {
DoSomething();
} else {
DoSomethingElse();
x := x + 1;
}
}
y := x;
If we do constant-propagation analysis and dead-code elimination separately, we will never be able
to simplify this loop. On the other hand, a super-analysis that does the two analyses simultaneously would result in the following optimized code:
x := 10;
while (...) {
DoSomething();
}
y := 10;
So, the phase-ordering problem is not simply an ordering problem, but a problem that can get you stuck in a local minimum even if you try all possible orders, as compared to a super-analysis that composes the analyses together.
In general, developing such a super-analysis is intractable and so the best-effort solution is to try multiple passes in some sequence and repeatedly, until nothing changes.