### An example loop transformation: loop inversion

t = m + n;
for (i = 0; i < n; i++) {
A[i] = t;
}

to
t = m + n;
if (n > 0) {
i = 0;
do {
A[i] = t;
i++;
} while (i < n);
}

This can be thought-of as a special case of loop peeling (peeling of the first few iterations in the loop header). Here, we peel-off the check of the first iteration. It happens to be a very common optimization in modern compilers as it can be applied to any loop.

Optimizations that make a loop faster are usually very consequential and so much effort has gone into this.

### What is a loop?

Goal:
• Define a loop in graph-theoretic terms (control-flow graph)
• Not sensitive to input syntax: a uniform treatment of all loops (e.g., for, while, repeat-until, do-while, etc.).
Informally, a natural loop has
• edges that form at least a cycle.
• a single entry point.
Example:
BB1->BB2
BB1->BB3
BB2->BB4
BB3->BB4
BB4->BB5
BB5->BB4
BB5->BB2

While BB4-BB5 are a natural loop, BB2-BB4-BB5 are not a natural loop (BB3 can jump into the middle of this loop).

Most loops in common programs are natural loops.

### Dominators

• Node d dominates node n in a graph (d dom n) if:
• Every path from the start node to n goes through d.
• A node dominates itself.
• Immediate dominance: d idom n: d dom n, d != n, \not_exists m, s.t., d dom m and m dom n.
• Immediate dominance relationships form a tree.

### Definition of Natural loops

• A single entry point: header(d): a header dominates all nodes in the loop.
• A back edge (n -> d) in a flow graph is an edge whose destination dominates its source (d dom n)
• The natural loop of a back edge (n->d) is d+ {nodes that can reach n without going through d}.

### Algorithm to find natural loops

1. Find the dominator relations in a flow graph.
2. Identify the back edges.
3. Find the natural loop associated with the back edge.

### Finding dominators

A node d dominates n in a graph (d dom n) if every path from the start node to n goes through d.
• Domain of values: all sets of basic blocks.
• Direction: forward
• Meet operator: intersection
• Transfer function: f(x) = x \union b
• Boundary condition: out[entry] = {start}
• Initialization to Top for all non-entry blocks: out[b] = set of all blocks
• Bottom: empty set
Can do in one reverse postorder pass.

### Finding back edges

Any edge whose destination dominate source.

### Constructing natural loops

Remove d from the graph, find all predecessors of n.

### Inner loops

• If two loops do not have the same header:
• They are either disjoint, or
• One is entirely contained (nested within) the other
• Inner loop: the one that contains no other loop.
• If two loops share the same header:
• Hard to tell which is the inner loop
• Combine as one.

### Induction variables

Example
BB1->BB2
BB2->BB2
BB2->BB3

BB1: a=0
BB2: a = a+1

• Summarize the loop body with a transfer function fb: m' = fb(m)
• m'(a)=m(a)+1; m'(x)=m(x) for x != a
• Summarize the effect after i iterations: m' = fb^i(m)
• m'(a)=m(a)+i; m'(x)=m(x) for x != a
• Can now execute each iteration independently (e.g., in parallel).
• Summarize the effect of the entire loop with n iterations: m' = fb^n(m)
• If n is a constant: m'(a)=m(a)+n; m'(x)=m(x) for x != a
• If n is unknown: m'(a)=unknown; m'(x)=m(x) for x != a
• Can now execute each iteration independently (e.g., in parallel).
• Closure of a function: direct evaluation of i applications of a function.

### Region-based analysis

• A region: a group of nodes with a single entry and possibly multiple exits
• Assume a reducible flow graph
• Find natural loops: each defines:
• loop body (acyclic graph of sub-regions)
• loop (one sub-region and back edges)
• Bottom-up analysis: compute transfer function from entry to all exits for each sub-region.
• Top-down analysis: Apply the summary transfer functions (for each sub-region) to compute data flow values of the entire region with its loop's back-edges.

### Example: a semi-lattice of transfer functions

• Example: reaching definition: fi(x) = geni \union (x - killi).
• o : composition of transfer functions (for paths): (f2 o f1)(x) = gen2 \union ((gen1 \union (x-kill1)) - kill2).
• ^ : meet of transfer functions (for confluence points): union: (f1 ^ f2)(x) = (gen1 \union (x-kill1)) \union (gen2 \union (x-kill2))
• *(kleene star) operator: closure of transfer functions (for cycles): f* = meet over f^0, f^1, f^2, ..., f^n for n->\infinity. f^0 is the identity function.
• In this example: f^2(x) = f(x)
• f*(x) = x \union f(x) = gen \union x

### Bottom-up analysis

• Acyclic region: find transfer function to end of every subregion. Apply composition and meet of transfer functions based on paths.
• Cyclic region:
• After (i-1)st complete iterations
• fj: transfer function from loop header to back-edge source.
• fj^(i-1)
• For each loop exit e:
• fe: transfer function from loop head to exit
• After i iterations: fe o fj^(i-1)
• After an unknown number of iterations: fe o fj*

### Top-down analysis

• Given boundary value at the beginning of program: in[entry]
• Acyclic region R, for each subregion S:
• out[S] = f(in[R]), where f is the transfer function from in[R] to the exit of S
• in[S] = in[R] if S is the header of region R
• in[S] = \meet (out[p]) for all predecessors p of S (if S is not the header of region R)
• Cyclic region R, with one subregion S:
• in[S] = fj*(in[R]), where fj is the transfer function from in[R] to back-edge source
Region-based analysis broken down into bottom-up and top-down analysis is useful for more precise analysis by passing appropriate summaries from bottom-up analysis to top-down analysis. For example, if we have two functions f and g (callers), calling the same function h (callee). Using a summary callee function for h allows us to avoid flow of information from f to g and g to f. In this case, the call to h can be summarized as a transfer function with one entry and one exit.

### Induction variables

Example1
for (j = ...) {
//A[j] = 0
t1 = 4*j
t2 = &A
t3 = t1+t2
*t3 = 0
}
//t1 = 4j; t3 = &A+4j in iteration j

to
t3 = &A
for (j = ...) {
*t3=0
t3 = t3+4
}


Example2

loop:
b = a
a = a + 1
c = a + b
d = r + b
e = f
f = read()
g = 2 * f
a = a + 1
p = p + 1
q = p + a

Let m be a map of var->val at start of loop body.

At the end of each statement:

b = m(a)
a = m(a) + 1
c = 2*m(a) + 1
d = m(r) + m(a)
e = m(f)
f = unknown
g = unknown
a = m(a) + 2
p = m(p) + 1
q = m(p) + m(a) + 3


Let m' be a map of var->val at start of loop at end of i-1 iteration

m'(r)=m(r)
m'(a)=m(a)+2i-2
m'(p)=m(p)+i-1
m'(f)=unknown


At end of each statement in i'th iter:

b=m(a)+2i-2
a=m(a)+2i-1
c=2m(a)+4i-3
d=m(r)+m(a)+2i-2
e=unknown
f=unknown
g=unknown
a=m(a)+2i
p=m(p)+i
q=m(p)+m(a)+3i

Wherever possible, represent values as affine expressions on the loop iteration index.

### Strength reduction

Replace multiplications with additions
for (i = ....) {
j = c0 + c1*i
... j ...
}

to
j = c0 + c1;
for (i = ...) {
... j ...
j += c1
}


### Induction Variable

Variables
• All the variables in the program
• A normalized index i for each loop, with values 1, 2, ... for 1st, 2nd iterations etc.
Find variables that are afffine expressions of loop indices and loop-invariant variables. Formulate as a bottom-up region based analysis.

### Transfer function of a statement

• Let f be the transfer function of statement: x = c0+c1*y + c2*z
• m' = f(m)
• For each variable v, m'(v) =
• m(v) if v != x
• c0 + c1*m(y) + c2*m(z) if v = x
• unknown (bottom) otherwise

### Composition of transfer functions

• Composition of transfer function: f2 o f1 : --> f1 --> f2 -->
• Let m1=f1(m) and m'=f2of1(m)
• Let f2 be the transfer function of the statement: x = c0 + c1*y + c2*z
• For each variable v, m'(v) =
• m1(v) if v != x
• c0+c1*m1(y)+c2*m2(z) if v=x and m1(y),m1(z) is not unknown
• unknown otherwise

### Meet of transfer function

• Meet of transfer function: f2 ^ f1
• Let m1 = f1(m), m2=f2(m), m' = f1^f2(m)
• For each variable v, m'(v) =
• m1(v) if m1(v)=m2(v)
• unknown otherwise

### Closure

• Closure of transffer function: f^i
• Let m1=f(m0), m(i-1) = f^(i-1)(m0), m(i) = fof^(i-1)(m0)
• After i-1 iterations
• Loop invariant: m1(v) = m0(v) => m(i-1)(v) = m0(v)
• Basic induction: m1(v) = m0(v) + c => m(i-1)(v) = m0(v) + c(i-1)
• After the ith iteration:
• m1(v) = c0 + \sum_k{ck*m0(vk)} + \sum_j{cj*m0(vj)} where vk is a loop invariant variable, and vj is a basic induction variable incremented by Cj (capital C is different from small c)
• => mi(v) = c0 + \sum_k{ck*m(i-1)(vk)} + \sum_j{cj*m(i-1)(vj)}
• => mi(v) = c0 + \sum_k{ck*m0(vk)} + \sum_j{cj*(m0(vj) + Cj(i-1))}
• otherwise => mi(v) = unknown
• After unknown number of iterations: Let m* = f*(m)
• m1(v) = c0 + \sum_k{ck*m0(vk)} where vk is a loop invariant variable
• => m*(v) = c0 + \sum_k{ck*m0(vk)}
• otherwise => m*(v) = unknown

Summary: region-based analysis is an alternate algorithm for iterative data-flow. It works in transfer function space with composition (o), meet (^) and kleene start (*) operators. Bottom-up analysis summarizes the effect of regions with transfer functions.

Induction variable analysis is useful for strength reduction and for symbolic analysis for parallelization.