**Loop invariant operations**

for (i = 0; i < n; i++) { A[i] = m + n; }to (loop invariant code motion)

t = m + n; for (i = 0; i < n; i++) { A[i] = t; }

**Partial Redundancy Elimination (aka lazy code motion)**

Subsumes

- Common subexpression elimination
- Loop invariant code motion
- plus new capability
BB1->BB2 BB1->BB3 BB2->BB4 BB3->BB4 BB2: a = b + c BB3: b = 7 BB4: d = b + c

can be changed toBB1->BB2 BB1->BB3 BB2->BB4 BB3->BB4 BB2: t = b + c a = t BB3: b = 7 t = b + c BB3: d = t

Can we place calculations of`b+c`

such that no path re-executes the same expression? Algorithm based on PLDI92 paper by Knoop et. al. on "Lazy code motion" - Propose separate forward and backward passes.
- Problem: critical edges
BB1 --> BB3 BB2 --> BB3 BB2 --> BB4 BB1: c = a+b BB3: d = a+b

A critical edge:- Source basic block has multiple successors
- Destination basic block has multiple predecessors

- Introduce a new basic block to eliminate critical edges:
BB1 --> BB3 BB2 --> newBB BB2 --> BB4 newBB --> BB4 BB1: t = a+b c = t newBB: t = a+b BB3: d = t

Algorithm Prelims:

- Assume every statement is a basic block
- We will only place statements at the beginning of a basic block
- Add a basic block for every edge that leads to a basic block with multiple predecessors

Algorithm Overview:

- Determine where the expressions are to be computed
- Consists of four data-flow problems:
- Optimal computation (passes 1 and 2)
- No calculation of expression that is not used, along any execution path.
- No redundancy along any execution path.

- Operations are executed at the latest possible time to minimize register usage (pass 3)
- No unnecessary copy assignments (pass 4)

- Optimal computation (passes 1 and 2)

Example1:

BB1-->BB2 BB1-->BB3 BB1-->BB4 BB2-->BB5 BB3-->BB5 BB4-->BB6 BB2: x = a + b BB5: y = a + b BB4: a = 10 BB6: z = a + bchange to

BB1-->BB2 BB1-->BB3 BB1-->BB4 BB2-->BB5 BB3-->BB5 BB4-->BB6 BB2: x = a + b BB3: t = a + b BB5: y = t BB4: a = 10 t = a + b BB6: z = tNotice that we put t = a+b where a+b is "needed", i.e., it is needed on all paths in future

Example 2

BB1-->BB2 BB2-->BB3 BB3-->BB2 BB3-->BB4 BB2: x = a + bchange to

BB1-->BB2 BB2-->BB3 BB3-->BB2 BB3-->BB4 BB1: t = a+b BB2: x = tAgain, t=a+b is "needed" on all paths in future

Example 3

BB1-->BB2 BB2-->BB3 BB3-->BB4 BB4-->BB5 BB4-->BB6 BB5-->BB3 BB6-->BB3 BB1: x = a+b BB4: y=a+b BB6: a=10change to

BB1-->BB2 BB2-->BB3 BB3-->BB4 BB4-->BB5 BB4-->BB6 BB5-->BB3 BB6-->newBB newBB-->BB3 BB1: t = a+b x=t BB4: y=t BB6: a=10 newBB: t=a+b

Example 4 (not all redundancy can be eliminated)

BB1-->BB2 BB2-->BB3 BB3-->BB4 BB4-->BB5 BB5-->BB3 BB3-->BB6 BB6-->exit BB1: x=a+b BB5: y=a+b BB6: a=10change to

BB1-->BB2 BB2-->BB3 BB3-->BB4 BB4-->BB5 BB3-->BB6 BB6-->newBB newBB-->BB3 BB6-->exit BB1: t=a+b x=t BB5: y=t BB6: a=10 newBB: t=a+bThere is some redundancy in the transformed code which is unavoidable. Notice that the path from newBB to exit computes a+b even though it was not required. The other option is:

BB1-->BB2 BB2-->BB3 BB3-->BB4 BB4-->BB5 BB5-->BB3 BB3-->BB6 BB6-->exit BB1: t=a+b x=t BB5: t = a+b y=t BB6: a=10But now a+b is computed twice on the path B2->B3->B4->B5 (this is the solution that our analysis will yield).

- Domain of values: sets of expressions
- Direction: backward
- Meet operator: intersection. Dictates the transfer function on successor-predecessor edges
- Transfer function across an instruction: F(x) = EUse \union (x - EKill). Euse is the set of expressions used by the instruction (e.g.,
`{a+b}`

for`x=a+b`

). Ekill is the set of expressions*killed*by this expression, e.g., all expressions containing`x`

are killed by`x=a+b`

. - Boundary condition: Needed(exit,in)=empty.
- Initialization: for all b (not exit), in[b] = set of all expressions

**Proposal 1**:

- Place the expression at the frontier of "needed expressions" (between "not needed" and "needed")
- But does it eliminate all redundancies? (e.g., may retain some redundancies)
- And is that the best placement? (e.g., may be too early)

Example1

BB1->BB2 BB2->BB3 BB3->BB4 BB3->exit BB1: x=a+b BB4: y=a+bNow,

`a+b`

is needed at BB4 entry but not needed at BB3 exit. Yet computing `t=a+b`

on the edge BB3->BB4 is redundant because the expression `a+b`

is already available (or not Example2

BB1->BB2 BB1->BB3 BB1->BB4 BB2->BB5 BB3->BB5 BB4: a=10 BB5: y=a+bHere, the frontier occurs at the end of BB1 but that is too early to compute

`t=a+b`

as we will take up an extra register for a long time unnecessarily!
Example3 (a more convincing example of why you should not compute a value too early)

BB1->BB2 BB1->BB3 BB2->BB4 BB4->BB2 BB4->BB5 BB5->BB6 BB3->BB7 BB7->BB6 BB3: a=b+c BB6: y=b+cThe frontier is at the end of BB1 but that would eat-up a register for the entire duration of the BB2-BB4 loop which seems too wasteful.

- Assume: an expression will be calculated at the earliest point it is needed.
- Forward pass: Missing expression
- An expression is missing at
`p`

if its value has not been "needed" by any basic block along SOME path reaching`p`

. Will solve the redundancy problem with the first example. - Domain of values: sets of expressions
- Direction: forward
- Transfer function at instruction b: f(x) = EKill \union (x - Needed.in[b]). It is missing if it has been killed. If it is needed then it is not missing.
- Meet = union
- Boundary condition: out[entry] = set of all expressions.
- Initialization for all other b (not entry): out[b] = empty set.

Place the expression at the earliest point we need it and is missing

- earliest(b)=needed.in[b] \intersect missing.in[b]
- Algorithm: for all basic block b, if
`x+y \in earliest[b]`

, at beginning of`b`

, create a new variable`t`

, such that`t=x+y`

, and replace every original`x+y`

by`t`

. - But it may be too early! Solves example1 but the problems with example2 and example3 remain.

`e`

is postponed at a program point `p`

if: for all paths leading to `p`

, we have seen the earliest placement of `e`

but not a subsequent use. Take the third example again:
BB1->BB2 BB1->BB3 BB2->BB4 BB4->BB2 BB4->BB5 BB5->BB6 BB3->BB7 BB7->BB6 BB3: a=b+c BB6: y=b+cHere,

`b+c`

is - Domain of values: sets of expressions
- Direction: forward
- Transfer function for instruction b: f(x)=(earliest[b] \union x) - EUse
- Meet : Intersection
- Boundary condition: out[entry] = empty set
- Initialization for all b (not entry): out[b] = set of all expressions

`latest[b] = (earliest[b] \union postponed.in[b]) \intersection (EUse \union ExprsForWhichSomeSuccessorIsNotPostponedOrEarliest)`

- ExprsForWhichSomeSuccessorIsNotThePostponedOrEarliest = \union_over_successors (\not earliest[b] \and \not postponed.in[b]).

BB1->BB2 BB2->BB3 BB1: x=a+bNo need to generate temporaries in this case. But our algorithm will. Do a liveness analysis to see if the expression is used later (without an intervening

`latest`

as a `latest`

indicates
recomputation).
- Domain of values: set of expressions
- Direction: backward
- Transfer function: f(x) = (EUse \union x) - latest[b]
- Meet operator: union
- Boundary condition: in[exit] = empty set
- Initialization: in[b] = empty set

- Backward pass to compute needed expressions
- Forward pass to compute missing expressions
- Earliest = intersection of needed and missing
- Forward pass using earliest placement to compute (can be)-postponed expressions
- Latest = frontier of (can be)-postponed expressions
- Backward pass to eliminate unused temporary variable assignments
- For all basic blocks b: if
`x+y \in (latest[b] \intersect used.out[b])`

:- at beginning of
`b`

: add new`t=x+y`

- replace original
`x+y`

by`t`

only if:`x+y \in (EUse \intersect \not(latest[b] \insersect \not(used.out[b])))`

- If it is in
`latest[b]`

and not in`used.out[b]`

, this use of the expression does not need to be (and has not been) replaced by`t`

, so let it be as it is.

- at beginning of