COL731: Advanced Compiler Techniques

Lab 1 : Experiment with LLVM's SLPVectorizer

Due date: 20th August 2023, 11:59pm IST

5 marks total

Clone the LLVM Repository at https://github.com/llvm/llvm-project.
Study and experiment with the SLPVectorizer at path llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp.
Create a report on your experiments that must contain at least the following information:
- What is the algorithm of LLVM's SLP Vectorizer, and how does it differ from the SLP paper that we studied in class? Show the pseudo-code of LLVM's SLP vectorizer, similar to how it is shown in the SLP paper.
- What are the LLVM analysis passes that the SLP Vectorizer depends upon?
- What are the functions defined in the CPP file and what is the brief summary of what each function does?
- Compile an example vectorizable program using clang12 -O3 -mllvm -print-after-all -mllvm -print-before-all -c example.c. Compare the input/output of the SLP vectorizer pass, and discuss the transformations in your report.
  - Choose your example vectorizable program carefully, so that the discussion becomes more interesting.
  - Identify at least one analysis or transformation pass (on which the SLP vectorizer depends) such that if you disable that pass, the SLP vectorization becomes less effective (or ineffective) on your example program. Explain why this happens.

Submission: Email a PDF copy of your report to the instructor. The PDF filename should contain your entry-number and name. For example if your entry number is 2121CS1000 and your name is ABC, then your filename should be 2121CS1000_ABC.pdf

Submissions

Lab 2 : GEMM

Due date: 24th September 2023, 11:59pm IST

5 marks total

We have already studied about dependence analysis and loop transformations in class. In this assignment, we will look at the potential of loop optimization through the polyhedral model. Compilation using polyhedral model involves representation of programs (especially those involving nested loops and arrays) to parametric polyhedra and exploiting combinatorial and geometrical optimizations on these objects to analyze and optimize the programs. The interest of using polyhedral representations is that they can be manipulated or optimized with algorithms whose complexity depends on their structure and not on the number of elements they represent. Furthermore, generic and compact solutions can be designed that depend on program parameters (e.g., loop bounds, tile sizes, array bounds).

We will use Polly, a high-level loop and data-locality optimizer for LLVM IR for performing experiments in this assignment. There exists other tools, like Graphite (for GCC) and PLUTO, which also support polyhedral compilation.

Clone the Polybench/C benchmark at https://github.com/MatthiasJReisinger/PolyBenchC-4.2.1 and look at the the General matrix-matrix multiplication (GEMM) benchmark at linear-algebra/blas/gemm (direct link.
Use the following flags to compile GEMM:
- -DPOLYBENCH_TIME
- -DEXTRALARGE_DATASET
- -DDATA_TYPE_IS_INT
Study the Polly optimization framework and how gemm is optimized using it. Note that, polly is not enabled by default in Clang/LLVM to automatically optimize C/C++ code during compilation. It needs to be enabled by passing explicit flags. Please refer the below mentioned documents for further details.
Report the run times for gemm.c, when compiled with
- GCC (-O3)
- ICC (-O3)
- Clang (-O3) without polly extension
- Clang (-O3) with polly extension
Analyze the x86 assembly code generated in each of the above four cases. Write (if possible) a faster x86 implementation using intrinsics or inline assembly than the above code generated by Clang (with Polly) for the GEMM program. You can use any of the supported instruction set extensions for your manual optimization (SSE, FMA, AVX). Report the assembly generated in all four cases, your analysis for these assembly codes, your manually optimized source code and x86 assembly. Report the run times for your implementation. We provide some relevant references below:
- Intel Intrinsics Guide
- Extended Asm - Assembler Instructions with C Expression Operands

Submissions

Lab 3 : Experiment with LLVM's Polly

Due date: 26th October 2023, 11:59pm IST

5 marks total

Through experiments on compiling small hand-written C programs (written by you), identify the strengths and limitations of the Polly framework. Report your findings (along with the hand-written C programs that you used).
In particular, identify the algorithm used in Polly to eliminate the predicates (guards) on individual statements through partitioning. For example, consider a case where we are partitioning an inner loop, whose statement bounds (for each statement) are 1-D intervals where the lower and upper bounds are given by affine expressions of the surrounding loop indices. Write example programs where:
1. The lower and upper bounds of each interval are statically identifiable during partitioning (after the space partitioning algorithm has identified a schedule and empty iterations have been eliminated).
2. The lower and upper bounds of each interval are not statically identifiable (after the space partitioning algorithm has identified a schedule and empty iterations have been eliminated).
For each example, report what method is used by the Polly framework to identify the lower/upper bounds of each partition. Do they resort to runtime generation of partition bounds (to eliminate guards) or do they simply leave the guards in (to avoid the overhead of identifying partition bounds at runtime).

Submissions

Lab 4 : Formal Verification

Due date: 19th November 2023, 11:59pm IST

Experiment with Alive2. Show:
1. One example where Alive correctly validates a peephole optimization
2. One example where Alive incorrectly validates an (incorrect) peephole optimization. Ideally it should be manually hard to see that this peephole optimization is incorrect.
3. One example where Alive fails to validate a (correct) peephole optimization, e.g., due to a timeout.
You will be evaluated on the creativity and the insightfulness (and where applicable, simplicity) of the example you present.
Experiment with Z3:
1. Check commutativity of multiplication for integers and bitvectors
2. Check if subtraction is the inverse of addition for integers
3. Check if subtraction is the inverse of addition for bitvectors
4. Check if division is the inverse of multiplication for non-zero integers
5. Check if division is the inverse of multiplication for non-zero bitvectors
6. Let "slt" and "ult" represent signed and unsigned less-than comparison operators. For which of the four combinations is x₁lt(a-b,0) equivalent to x₂lt(a,b), for x_1,2 being one of "s" or "u"?
Report your findings.
Experiment with CompCert
- Download and install the CompCert compiler
  - sudo apt install opam
  - opam install menhir menhirLib coq
  - ./configure x86_32-linux
  - make all
- Using some compute-intensive programs, compare the performance of CompCert-generated executables with the optimized executables generated by a production compiler like LLVM/Clang and GCC.
- CompCert is a certified compiler written in the Coq proof assistant. See if you can understand the high-level proof statement of CompCert. Follow the steps below.
  - Read Section 1.2 of the CompCert manual to understand the ``Semantic Preservation Theorem''.
  - Read Theorem transf_c_program_correct in file driver/Compiler.v. This is the top-level correctness theorem for a compilation. Can you try explaining this theorem?
    - [Optional, will not be graded] Try understanding the C language semantics, available in file cfrontend/Csem.v. Also, try understanding the assembly language semantics, available in file x86/Asm.v. Then, try explaining the transf_c_program_correct in light of these semantics.
  - Now, let's look at a particular transformation pass within the compiler pipeline. Read Theorem trans_program_correct in file backend/Constpropproof.v. See if you can explain this theorem informally by taking a look at the following:
    - Take a look at Definition transf_prog in file backend/Constprop.v
    - Take a look at Lemma transf_program_match in file backend/Constpropproof.v
  The full proof specification is relatively deep. For full credit in this part of the question, it would suffice if you only provide an informal explanation of what is happening till a shallow depth e.g., you can just stop at Definition match_prog in backend/Constpropproof.v (by explaining informally what you think it is proving). However, you are welcome to delve deeper if you prefer (you will not be penalized for not delving deeper).

Submission: Email a PDF copy of your report to the instructor. The PDF filename should contain your entry-number and name. For example if your entry number is 2121CS1000 and your name is ABC, then your filename should be 2121CS1000_ABC.pdf

Submissions