COL731: Advanced Compiler Techniques

Lab 1 : Experiment with LLVM's SLPVectorizer

Due date: 20th August 2023, 11:59pm IST

5 marks total

  1. Clone the LLVM Repository at https://github.com/llvm/llvm-project.
  2. Study and experiment with the SLPVectorizer at path llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp.
  3. Create a report on your experiments that must contain at least the following information:
Submission: Email a PDF copy of your report to the instructor. The PDF filename should contain your entry-number and name. For example if your entry number is 2121CS1000 and your name is ABC, then your filename should be 2121CS1000_ABC.pdf

Submissions

Lab 2 : GEMM

Due date: 24th September 2023, 11:59pm IST

5 marks total

We have already studied about dependence analysis and loop transformations in class. In this assignment, we will look at the potential of loop optimization through the polyhedral model. Compilation using polyhedral model involves representation of programs (especially those involving nested loops and arrays) to parametric polyhedra and exploiting combinatorial and geometrical optimizations on these objects to analyze and optimize the programs. The interest of using polyhedral representations is that they can be manipulated or optimized with algorithms whose complexity depends on their structure and not on the number of elements they represent. Furthermore, generic and compact solutions can be designed that depend on program parameters (e.g., loop bounds, tile sizes, array bounds).

We will use Polly, a high-level loop and data-locality optimizer for LLVM IR for performing experiments in this assignment. There exists other tools, like Graphite (for GCC) and PLUTO, which also support polyhedral compilation.

  1. Clone the Polybench/C benchmark at https://github.com/MatthiasJReisinger/PolyBenchC-4.2.1 and look at the the General matrix-matrix multiplication (GEMM) benchmark at linear-algebra/blas/gemm (direct link.
  2. Use the following flags to compile GEMM:
  3. Study the Polly optimization framework and how gemm is optimized using it. Note that, polly is not enabled by default in Clang/LLVM to automatically optimize C/C++ code during compilation. It needs to be enabled by passing explicit flags. Please refer the below mentioned documents for further details.
  4. Report the run times for gemm.c, when compiled with
  5. Analyze the x86 assembly code generated in each of the above four cases. Write (if possible) a faster x86 implementation using intrinsics or inline assembly than the above code generated by Clang (with Polly) for the GEMM program. You can use any of the supported instruction set extensions for your manual optimization (SSE, FMA, AVX). Report the assembly generated in all four cases, your analysis for these assembly codes, your manually optimized source code and x86 assembly. Report the run times for your implementation. We provide some relevant references below:
Submission: Email a PDF copy of your report to the instructor. The PDF filename should contain your entry-number and name. For example if your entry number is 2121CS1000 and your name is ABC, then your filename should be 2121CS1000_ABC.pdf

Submissions

Lab 3 : Experiment with LLVM's Polly

Due date: 26th October 2023, 11:59pm IST

5 marks total

  1. Through experiments on compiling small hand-written C programs (written by you), identify the strengths and limitations of the Polly framework. Report your findings (along with the hand-written C programs that you used).
  2. In particular, identify the algorithm used in Polly to eliminate the predicates (guards) on individual statements through partitioning. For example, consider a case where we are partitioning an inner loop, whose statement bounds (for each statement) are 1-D intervals where the lower and upper bounds are given by affine expressions of the surrounding loop indices. Write example programs where:
    1. The lower and upper bounds of each interval are statically identifiable during partitioning (after the space partitioning algorithm has identified a schedule and empty iterations have been eliminated).
    2. The lower and upper bounds of each interval are not statically identifiable (after the space partitioning algorithm has identified a schedule and empty iterations have been eliminated).
    For each example, report what method is used by the Polly framework to identify the lower/upper bounds of each partition. Do they resort to runtime generation of partition bounds (to eliminate guards) or do they simply leave the guards in (to avoid the overhead of identifying partition bounds at runtime).
Submission: Email a PDF copy of your report to the instructor. The PDF filename should contain your entry-number and name. For example if your entry number is 2121CS1000 and your name is ABC, then your filename should be 2121CS1000_ABC.pdf

Submissions

Lab 4 : Formal Verification

Due date: 19th November 2023, 11:59pm IST

Submission: Email a PDF copy of your report to the instructor. The PDF filename should contain your entry-number and name. For example if your entry number is 2121CS1000 and your name is ABC, then your filename should be 2121CS1000_ABC.pdf

Submissions