Modern Parallel Programming
This page will be updated regularly. Please visit (and reload) often.

Parallel Programming
Course# COL 730 (Sem I, 2024-2025)

Mon, Thu 3:30-5:00 (LH622, LH619)

Instructor

Subodh Kumar <subodh@cse.*>
Office hour: Wed 12-1 (IIA-429)

TA

Aryan Dua <Aryan.Dua.cs520@cse.*>

Replace * with iitd.ac.in

News & Announcements


Course Information

This is a first course in parallel programming and does not require any previous parallel computing experience. Knowledge of Data structures and Operating Systems is required. L-T-P: 3-0-2.

With the growing number of cores on a chip, programming them efficiently has become an indispensable skill. Modern Parallel Programming is a hands-on course involving significant parallel programming on compute clusters, multi-core CPUs, and many-core GPUs.

This course will cover a fair bit of theory so the fundamentals are established, as well as a broad set of system-level issues in programming modern systems correctly and efficiently. A good background in C/C++ is expected. You will also learn OpenMP, MPI, CUDA, and more.

Contents: Parallel performance metrics, Models of parallel computation, Parallel computer organization, Parallel programming environments, Load distribution, Throughput, Latency and Latency hiding, Memory and Data Organizations, Inter-process communication, Distributed memory architecture, Interconnection network and routing, Shared memory architecture, Memory consistency, Non-uniform memory, Parallel Algorithm techniques: Searching, Sorting, Prefix operations, Pointer Jumping, Divide-and-Conquer, Partitioning, Pipelining, Accelerated Cascading, Symmetry Breaking, Synchronization (Locked/Lock-free).


Tentative outline

Part I

Introduction
From serial to parallel thinking: common gotchas
Performance metrics - speedup, utilization, efficiency, scalability
Models of Parallel Computation
SIMD, MIMD
PRAM (EREW, CREW, CRCW), NC
How useful are these models for modern machines
Parallel Computer Organization
Pipelining and Throughput
Latency and Latency hiding
Memory Organization
Inter-process communication
Interconnection network
Message passing
Shared/Distributed memory
Basic Parallel Algorithmic Techniques
Pointer Jumping, Divide-and-Conquer, Partitioning, Pipelining, Accelerated Cascading, Symmetry Breaking, Synchronization (Locked, Lock-free)
Parallel Algorithms
Data organization for shared/distributed memory
Min/Max, Sum
Searching, Merging, Sorting, Prefix operations
Example applications

Part II

Writing Parallel Programs
GPU-Compute Architecture, CUDA, Memory organization in CUDA
Multi-Core CPU programming, OpenMP, MPI
Performance evaluation and scalability

Academic Integrity Code

Academic honesty is required in all your work. Academic honesty also means not accepting dishonesty from others. You must solve all programming assignments entirely on your own, except where group work is explicitly authorised. This means you must neither take others' work nor show, give, or otherwise allow others to take your program code, problem solutions, or other work.

This means you must protect your code from access by others. Do not leave it where others can find it. Do not give it to someone for submission on your behalf. Do not use any fragment of code obtained online or from someone else, except what is explicitly authorised as a part of the course. When authorised, any non-original code that you do use must be clearly identified with due reference to the source. Falsifying program output or results is also cheating.

When in doubt, please ask your professor.

Students who are caught cheating will be given a 0 and a letter grade penalty. A second violation will result in summary failure in the course.


Assignments & Grading

Work                      Points   Tentative Schedule
Assignment 1              10       Due Aug 26, 11:59 pm
Assignment 2              10       Due Oct 2, 11:59 pm
Assignment 3              10       Due Nov 1, 11:59 pm
Assignment 4              10       Due Dec 1, 11:59 pm
Quizzes and discussion    10
Minor                     20
Major                     30

Late Policy: A total of five days of late submission is allowed across the first three assignments. Use them when you need them. Beyond these days, you lose 0.5 marks per day of delay.

Attendance Policy: 100% participation is required. Intimation is required for any absence.

Audit Policy: At least 40% in total AND at least 30% each in exams, assignments, and quizzes. 100% participation is required.


Recommended Text

An Introduction to Parallel Programming by Subodh Kumar

Introduction to Parallel Computing by Ananth Grama, George Karypis, Vipin Kumar, and Anshul Gupta

Parallel Programming in C with MPI and OpenMP by M J Quinn

Programming Massively Parallel Processors by D. Kirk and W. Hwu

Further Reading