Interactions of data-flow analyses

Deep common sub-expressions

Using available-expressions analysis to identify redundant expressions only works for expressions that are textually identical. For example, an application of common-subexpression elimination will recognize that t1 in the code fragment

t1 = b + c; a = t1 + d;

has the same value as does t2 in

t2 = b + c; e = t2 + d;

as long as the variables b and c have not been redefined in between. It does not, however, recognize that a and e are also the same. It is possible to find such "deep" common subexpressions by re-applying common subexpression elimination until no new common subexpressions are found on one round.

Multiple-passes do not always help

One data-flow analysis can improve the results of another data-flow analysis and vice-versa. Classic example: constant propagation and unreachable code elimination: performing constant propagation and folding may replace branch predicates with constant boolean values, enabling more code to be identified as unreachable; conversely, eliminating unreachable code can remove non-constant assignments to variables, thus improving the precision of constant-propagation. There are many more examples of mutually-beneficial interactions of different data-flow analyses. Phase-ordering problem: if two or more analyses are mutually beneficial, then any ordering of the analyses in which each is run only once may yield sub-optimal results. Some common responses to the phase-ordering problem:

Carefully choose the order; allow the same analysis to be performed multiple times if needed.
Use a meta-fixed-point loop that applies all analyses in a sequence and keeps repeating it, until there is no change.

Unfortunately, in the presence of loops, even the meta-fixed-point approach may yield sub-optimal result. This is so because: when analyzing a loop, optimistic initial assumptions must be made simultaneously for all mutually beneficial analyses to reach the best solution: performing the analyses separately, in effect, makes the pessimistic assumptions about the solutions of all other analyses, from which it is not possible to recover, simply by iterating the separate analyses.

Example:

x := 10;
while (...) {
  if (x == 10) {
    DoSomething();
  } else {
    DoSomethingElse();
    x := x + 1;
  }
}
y := x;

If we do constant-propagation analysis and dead-code elimination separately, we will never be able to simplify this loop. On the other hand, a super-analysis that does the two analyses simultaneously would result in the following optimized code:

x := 10;
while (...) {
  DoSomething();
}
y := 10;

So, the phase-ordering problem is not simply an ordering problem, but a problem that can get you stuck in a local minimum even if you try all possible orders, as compared to a super-analysis that composes the analyses together.

In general, developing such a super-analysis is intractable and so the best-effort solution is to try multiple passes in some sequence and repeatedly, until nothing changes.