COL718 : High-Performance Computing : About
Why study architecture of high-performance computers?
With processor clock speeds saturating, several new ideas
have been developed or proposed to obtain the next level of performance
improvements in computer systems. These ideas usually span multiple system
layers, including architecture, compiler support, and OS support.
Given the rapidly evolving nature of modern compute-intensive workloads,
this is one of the busiest areas of research in computer science today.
Course topics
The following are the tentative course topics; the exact topics will evolve as
the course progresses.
- Integer instruction sets and their evolution : in particular, we will study the modern x86 instruction set in detail.
- Floating-point instruction sets, data-movement instruction sets
- Out-of-order superscalar architectures : we will study techniques to measure different micro-architectural characteristics of your desktop/laptop/mobile processors through software micro-benchmarks
- Store buffers, Speculation
- Mathematical frameworks to reason about latency and throughput : Amdahl's law, Little's law
- Memory hierarchy : caching structures, types; micro-benchmarks to measure the memory hierarchy; compiler algorithms to exploit the memory hierarchy.
- Virtual memory : huge page support, OS algorithms to manage huge pages; implications for big-data workloads.
- Instruction prefetching, trace caches
- Networking hardware/software : high-throughput network programming interfaces, multi-queue hardware, scheduling and fairness.
- Storage element characteristics : SSD, HDD, throughput/latency optimization algorithms and tradeoffs.
- Programming models such as MapReduce and TensorFlow for domain-specific applications.
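As a rough illustration of the two laws named in the topic list, the sketch below evaluates Amdahl's law (speedup = 1 / ((1 - p) + p / n) for a parallelizable fraction p on n processing units) and Little's law (average occupancy = throughput x latency). The function names and the numbers in the usage section are hypothetical, chosen only for illustration; they are not taken from the course material.

```python
def amdahl_speedup(parallel_fraction: float, n_units: int) -> float:
    """Overall speedup when a fraction of the work is spread over n_units.

    Amdahl's law: speedup = 1 / ((1 - p) + p / n).
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if n_units < 1:
        raise ValueError("n_units must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_units)


def littles_law_occupancy(throughput: float, latency: float) -> float:
    """Average number of items in flight in a steady-state system.

    Little's law: occupancy = throughput * latency, where throughput is in
    items per unit time and latency is in the same time unit.
    """
    if throughput < 0 or latency < 0:
        raise ValueError("throughput and latency must be non-negative")
    return throughput * latency


if __name__ == "__main__":
    # Hypothetical workload: 90% of the run time is parallelizable.
    for n in (2, 8, 64, 1024):
        print(f"n = {n:5d}  speedup = {amdahl_speedup(0.9, n):.2f}")

    # Hypothetical memory system: 100 ns latency, and we want to complete
    # one cache-line request every 2 ns (0.5 requests per ns).
    in_flight = littles_law_occupancy(throughput=0.5, latency=100.0)
    print(f"outstanding requests needed: {in_flight:.0f}")
```

With these assumed numbers, the speedup stays below 10 no matter how many units are added, because the 10% serial portion dominates. The Little's law call suggests that sustaining the assumed request rate would need about 50 requests in flight at once, which is the kind of reasoning used to size buffers and queues.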
Lab assignments
Assignments will primarily involve programming and
measurement on x86-based systems to supplement the
course material. The assignment load is expected to be moderate.