00a-CourseOverview

CSCI 4125
Programming for Performance
Andrew Rau-Chaplin
arc@cs.dal.ca
www.cs.dal.ca/~arc
Course Objectives

Explore techniques for designing, implementing and evaluating efficient programs for:
Sequential computers
Shared-Memory Multiprocessors
Distributed Memory Multicomputers
Make it go fast!
Performance-oriented dev cycle

Techniques and tools for a performance-oriented development cycle:
Algorithm design
Implementation
Benchmarking/evaluation
Performance tuning
Quantifying performance

Themes include:







evaluation of performance
design of test data sets
issues of stability/reliability
scalability
common performance enhancing techniques
parallel algorithm design techniques
identification and elimination of dependencies
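
As a small illustration of the last theme, the sketch below shows one common kind of loop-carried dependency and one way to remove it. The function names `fill_dependent` and `fill_independent` are hypothetical and are not taken from the course material. In the first loop every iteration needs the value of `x` left by the previous one; in the second each element is computed from the index alone, so iterations can run in any order. Assume the two versions may differ in floating-point rounding for general inputs.

```c
#include <assert.h>
#include <stddef.h>

/* Loop-carried dependency: iteration i needs the value of x
 * produced by iteration i-1, so iterations must run in order. */
void fill_dependent(double *out, size_t n, double start, double step)
{
    assert(out != NULL || n == 0);
    double x = start;
    for (size_t i = 0; i < n; i++) {
        out[i] = x;
        x += step;
    }
}

/* Dependency removed: each element depends only on its index,
 * so iterations are independent and could be split across threads. */
void fill_independent(double *out, size_t n, double start, double step)
{
    assert(out != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        out[i] = start + (double)i * step;
}
```

With values such as `start = 1.0` and `step = 0.5`, which are exactly representable, both functions fill the array identically.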
Skills Development







How to design experiments/benchmarks
How to use statistics in performance evaluation
How to instrument code to obtain reliable timings
How to use compiler switches
How to use a profiler and performance tuning tools
How to use debugger/tracing tools
How to plot performance results
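
A minimal sketch of two of these skills together, instrumenting code for timings and summarizing repeated measurements with a simple statistic. It assumes a POSIX system that provides `clock_gettime` with `CLOCK_MONOTONIC`. The kernel `sum_array` and the helpers `now_seconds`, `median` and `time_sum_array` are hypothetical names chosen for illustration, not tools prescribed by the course. The median of several runs is used here because it is less sensitive to outliers than the mean; other summaries (minimum, confidence intervals) may suit other experiments better.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Wall-clock time in seconds from a monotonic clock (POSIX). */
static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* qsort comparator for doubles. */
static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Median of n samples; sorts the array in place. */
double median(double *samples, size_t n)
{
    assert(n > 0);
    qsort(samples, n, sizeof samples[0], cmp_double);
    return (n % 2) ? samples[n / 2]
                   : 0.5 * (samples[n / 2 - 1] + samples[n / 2]);
}

/* Hypothetical kernel to be timed: sums an array. */
double sum_array(const double *a, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Times `reps` runs of sum_array after one untimed warm-up run.
 * Returns the median run time in seconds, or -1.0 if allocation fails.
 * The computed sum is stored in *result so the work is not optimized away. */
double time_sum_array(const double *a, size_t n, int reps, double *result)
{
    assert(reps > 0);
    double *samples = malloc((size_t)reps * sizeof *samples);
    if (!samples)
        return -1.0;

    volatile double sink = sum_array(a, n);   /* warm-up: caches, page faults */
    for (int r = 0; r < reps; r++) {
        double t0 = now_seconds();
        sink = sum_array(a, n);
        samples[r] = now_seconds() - t0;
    }
    *result = sink;

    double med = median(samples, (size_t)reps);
    free(samples);
    return med;
}
```

A caller would pass its data and a repetition count, for example `time_sum_array(a, n, 9, &result)`, and report the returned median together with the problem size and the compiler switches used.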
Topics






Introduction to Parallelism
Parallel Programming
Parallel Architectures
Parallel Algorithms
Parallel Applications
Other Parallel Architectures & Algorithms
Official Outline



This course explores the design, implementation, and evaluation of computer programs for applications in which performance is a central issue.
In the sequential and multi-core settings, it explores topics such as profiling, cache effects, I/O performance, floating-point issues, multi-threading, and performance tuning techniques.
It introduces techniques for the design, implementation and evaluation of programs for Multicore processors, Shared-Memory Multiprocessors (SMPs) and Distributed Memory Multicomputers (Clusters).
Resources

Course web page: www.cs.dal.ca/~arc/teaching/CSc4125
All notes, readings, assignments
Parallel Machines




Your laptop!
CGM6 & CGM7
Hugh
Readings


Sorry, no textbook!
Readings will be assigned
Books







Introduction to High Performance Computing for Scientists and Engineers, by Georg Hager and Gerhard Wellein
Parallel Programming, by Peter Pacheco, Morgan Kaufmann
Structured Parallel Programming, by Michael McCool, Arch D. Robison, and James Reinders
Parallel Programming in C with MPI and OpenMP, by Quinn
Parallel Programming with Intel Parallel Studio XE, by S. Blair-Chappell and A. Stokes
Using OpenMP: Portable Shared Memory Parallel Programming, by Barbara Chapman, Gabriele Jost and Ruud van der Pas
Parallel Programming in OpenMP, by Rohit Chandra, Dave Kohr, Jeff McDonald, Morgan Kaufmann
Prerequisites

Knowledge of C
CSCI 3120: Operating Systems

Good to have:
CSCI 3110: Analysis of Algorithms
Course Evaluation





Assignments (3): 30%
Seminar: 20%
Project: 40%
Participation: 10%
See course web page for assignment copies and due dates
Assignments

Selected From







Sequential Optimization
OpenMP
Cilk
Threading Building Blocks
MPI
Hadoop
CUDA/OpenCL
Project


Select your own topic
Either:
Optimize an existing codebase, or
Design and implement an efficient new code
Components: literature/code review, some research or programming work, final paper, presentation
Main deliverables: demo plus conference-style paper
Questions



Why are you taking this course?
Which performance-oriented technologies are you interested in?
How will you know if the course has been a success for you?