Intermediate Language

Designing an IR is something of an art: it is hard to claim that any one IR is the best possible, in the sense of enabling the best code generation and optimization. We will consider an IR that resembles a high-level assembly language.

Each instruction is of the form

x = y op z
x = op y

The expression x + y*z is translated as

t1 = y * z
t2 = x + t1

In this representation, each subexpression has a "name": a consequence of allowing only one operation per instruction.
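To make the one-operation-per-instruction restriction concrete, here is a minimal Python sketch of the two instruction forms together with a tiny interpreter for them. The names `Instr` and `run`, the operator tables, and the sample values are illustrative assumptions, not part of these notes.

```python
from dataclasses import dataclass
from typing import Optional
import operator

# Hypothetical encoding of the two instruction forms:
#   x = y op z   (binary: z is set)
#   x = op y     (unary:  z is None)
@dataclass
class Instr:
    dest: str
    op: str
    y: str
    z: Optional[str] = None

# Assumed operator set, chosen only for this illustration.
BINARY = {"+": operator.add, "-": operator.sub, "*": operator.mul}
UNARY = {"-": operator.neg}

def run(program, env):
    """Execute instructions in order; env maps register names to values."""
    env = dict(env)  # copy so the caller's environment is not mutated
    for ins in program:
        if ins.z is None:
            env[ins.dest] = UNARY[ins.op](env[ins.y])
        else:
            env[ins.dest] = BINARY[ins.op](env[ins.y], env[ins.z])
    return env

# x + y*z: every subexpression gets its own name (t1, t2).
program = [
    Instr("t1", "*", "y", "z"),
    Instr("t2", "+", "x", "t1"),
]
result = run(program, {"x": 2, "y": 3, "z": 4})
```

Because each instruction names its result, the value of every subexpression remains available in `result` after execution, which is what later optimization passes would rely on.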

IR code generation is very similar to assembly code generation, except that it may use any number of IR registers to hold intermediate results.

igen(e, t): generate code that computes the value of e into register t

Example:

igen(e1+e2, t) =
  igen(e1, t1)  //t1 is a fresh register
  igen(e2, t2)  //t2 is a fresh register
  t = t1 + t2

Having an unlimited number of registers makes IR code generation simple. Contrast this with the stack machine, where we used stack slots to hold intermediate results (requiring many instructions to save and restore them); here we can simply coin a new register name and store the result in it.
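The igen scheme can be sketched in a few lines of Python. This is only an illustration under stated assumptions: expressions are encoded as either a variable name (a string) or a tuple `(op, e1, e2)`, the leaf case emits a copy instruction `t = x` (a form not listed above), and `make_igen` and `fresh` are hypothetical helper names.

```python
from itertools import count

def make_igen():
    """Build an igen function with its own fresh-register counter."""
    counter = count(1)

    def fresh():
        # Coin a new register name: t1, t2, t3, ...
        return f"t{next(counter)}"

    def igen(e, t):
        """Return IR instructions (as strings) that compute e into register t."""
        if isinstance(e, str):
            # Assumed leaf case: copy the variable into the target register.
            return [f"{t} = {e}"]
        op, e1, e2 = e
        t1 = fresh()  # fresh register for the left operand
        t2 = fresh()  # fresh register for the right operand
        return igen(e1, t1) + igen(e2, t2) + [f"{t} = {t1} {op} {t2}"]

    return igen

igen = make_igen()
# x + y*z, computed into the (arbitrarily named) target register t0
code = igen(("+", "x", ("*", "y", "z")), "t0")
for line in code:
    print(line)
```

Note that no instruction is spent saving or restoring intermediate values: each recursive call just writes into a newly coined register. The copies of leaves into fresh registers are redundant, and a later pass would normally remove them.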

LLVM IR is one example of an IR. It resembles three-address code, but its opcodes are usually higher-level than those of assembly.

Example of complexities in LLVM IR: