Introduction

What class is this?
- CS 484 – Parallel Processing

Who am I?
- Mark Clement
- 2214 TMCB
- clement@cs.byu.edu
- 422-7608

When are office hours?
- MW 10-12 and by appointment
Course and Text

Homepage (everything is here)
- BYU Learning Suite

Text
- Online tutorials & papers
Course Objectives

Students will understand and demonstrate the ability to develop:
- Shared memory parallel programs (OpenMP & Pthreads)
- Distributed memory parallel programs (MPI)
- Data parallel programs (CUDA)

Students will understand and be able to implement several basic parallel algorithms and load balancing techniques.
Lectures

- Follow the schedule on the homepage.
- We will move quickly.
- READ BEFORE CLASS!!!!!
- There will be in-class quizzes - no makeups.
- I will post lecture notes on the web.
What is expected of you?

- READ!!!

Assignments
- C/C++ on the supercomputer & CS Open Labs
- All assignments include a report. The report IS the assignment. The program is what you did to accomplish the assignment. I will grade your writing.
- Submit through Learning Suite

Exams
- In the Testing Center
- One 8 1/2 x 11 sheet of notes allowed
What is expected of you?

Get an account on the university supercomputers:
- Go to fsl.byu.edu
- Register
- Do it today!!!
- Read through the batch jobs tutorial for marylou
- Schedule a simple hello world using a PBS script (a sketch of hello.c follows):
  - copy ~mjc22/hello/hello.c and ~mjc22/hello/hello.pbs
  - edit hello.pbs
  - mpicc hello.c
  - sbatch hello.pbs
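For reference, a minimal MPI hello world of the kind hello.c likely contains (a sketch under that assumption; the actual file in ~mjc22/hello may differ):

```c
/* hello.c - minimal MPI hello world (illustrative sketch; the real
   ~mjc22/hello/hello.c may differ). Build with: mpicc hello.c */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);               /* start the MPI runtime */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank); /* this process's id */
    MPI_Comm_size(MPI_COMM_WORLD, &size); /* total number of processes */

    printf("Hello from rank %d of %d\n", rank, size);

    MPI_Finalize();                       /* shut down MPI */
    return 0;
}
```

The hello.pbs file is the batch script that tells the scheduler how many nodes to use and what command to run; sbatch submits it to the queue.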
Grading

Grade Distribution
- Homework & labs   60%
- Midterm           15%
- Final             15%
- Project           10%

Grade Scale
- 94% and above     A
- 90% - 93.9%       A-
- 80% - 89.9%       B-, B, B+
- 65% - 79.9%       C-, C, C+

I expect everyone to get a good grade!
Policies

Late Work
- I don't like late work. However.....
- 10% off per school day, limited to 70% off

Programming Assignments
- To get better than a C-, you must complete ALL labs
- If you don't complete all the labs, your grade ceiling is a C-
Other Policies

Honor Code
- I expect you to follow the honor code, including the dress and grooming standards.
- You can work together in groups on the homework and laboratories from a conceptual perspective, but the answers that you give and the programs that you write should be your own, not copies of other students' work.
- Your reports should focus on your own ideas and things you have learned from your experimentation.
Other Policies

Cheating
- Cheating in any form will NOT be tolerated. This includes copying any part of a homework assignment or programming lab.
- Any assignment turned in that is not your own work will be given a negative full-value score. Any case of multiple offenses or cheating on a test will result in failure of the class.

Systems Abuse Policy
- Abuse in any form will result in immediate suspension of your account(s).
Other Policies

Preventing Sexual Harassment
- Title IX of the Education Amendments of 1972 prohibits sex discrimination against any participant in an educational program or activity receiving federal funds. Title IX covers discrimination in programs, admissions, activities, and student-to-student sexual harassment.
- BYU's policy against sexual harassment extends not only to employees but to students as well.
- If you encounter unlawful sexual harassment or gender-based discrimination, please talk to your professor; contact the Equal Employment Office at 377-5895 or 367-5629 (24 hours); or contact the Honor Code Office at 378-2847.
Other Policies

Students With Disabilities
- Brigham Young University is committed to providing a working and learning atmosphere that reasonably accommodates qualified persons with disabilities. If you have any disability which may impair your ability to complete this course successfully, please contact the Services for Students with Disabilities Office (378-2767). Reasonable academic accommodations are reviewed for all students who have qualified documented disabilities.
Other Policies

Children in the Classroom
- The study of Computer Science requires an exceptional degree of concentration and focus. Having small children in class is often a distraction that degrades the educational experience for the whole class. Please make other arrangements for child care rather than bringing children to class with you. If there are extenuating circumstances, please talk with your instructor in advance.
What will we do in this class?

- Focus on shared memory and message passing programming
  - Pthreads
  - OpenMP
  - MPI
- Data parallel programming with CUDA
- Write code for the supercomputers
- Study parallel algorithms and parallelization
What is Parallelism?

Multiple tasks working at the same time on the same problem.
Parallel Computing

What is a parallel computer?
- A set of processors that are able to work cooperatively to solve a computational problem

Examples
- Parallel supercomputers
- Clusters of workstations
- Symmetric multiprocessors
- Multi-core processors
Won't serial computers be fast enough?

Moore's Law
- Double in speed every 18 months

Predictions of need
- The British government in the 1940s predicted it would only need about 2-3 computers
- The market for Cray was predicted to be about 10 machines

Problem
- These predictions don't take new applications into account.
Applications Drive Supercomputing

Traditional
- Weather simulation and prediction
- Climate modeling
- Chemical and physical computing

New applications
- Collaborative environments
- DNA
- Virtual reality
- Parallel databases
- Games
- Photoshop
Application Needs

Graphics
- 10^9 volume elements
- 200 operations per element
- Real-time display

Weather & Climate
- A 10-year simulation involves 10^16 operations
- Accuracy can be improved by higher-resolution grids, which involve more operations.
Cost-Performance Trend

[Figure: cost vs. performance curves for serial computers by decade, 1960s through 1990s.]
What does this suggest?

- More performance is easy up to a point.
- Significant performance increases of current serial computers beyond the saturation point are extremely expensive.
- Connecting large numbers of microprocessors into a parallel computer overcomes the saturation point.
  - Cost stays low and performance increases.
Computer Design

Single processor performance has lately been increased by raising the level of internal parallelism:
- Multiple functional units
- Pipelining

Higher performance gains come from incorporating multiple "computers on a chip."
- The gigahertz race is over (Intel won)
- Multiple cores are where performance will come from
Computer Performance

[Figure: peak performance from 1950 to 2000 on a log scale, climbing from roughly 1e2 FLOPS (Eniac) through the IBM 704, IBM 7090, CDC 7600, Cray 1, Cray X-MP, and Cray C90 toward 1e12 FLOPS (TFLOPS) with the IBM SP-2.]
Communication Performance

- Early 1990s: Ethernet        10 Mbps
- Mid 1990s:   FDDI            100 Mbps
- Mid 1990s:   ATM             100s of Mbps
- Late 1990s:  Fast Ethernet   100 Mbps
- Late 1990s:  Gig Ethernet    1000 Mbps
- Gbps is now commonplace
Performance Summary

- Applications are demanding more speed.
- Performance trends
  - Processors are increasing in speed.
  - Communication performance is increasing.
- Future
  - Performance trends suggest a future where parallelism pervades all computing.
  - Concurrency is key to performance increases.
Parallel Processing Architectures

Architectures
- Single computer with lots of processors
- Multiple interconnected computers

Architecture governs programming
- Shared memory and locks
- Message passing
Shared Memory Computers

Uniform Memory Access (UMA)
- All processors access the same memory
- Can lead to a memory bottleneck

Non-Uniform Memory Access (NUMA)
- Different speed access to different parts of memory
- More scalable
Message Passing Computers

[Figure: distributed memory / distributed shared memory - processors and memory modules connected by an interconnection network.]
Message Passing Architectures

Requires some form of interconnection

The network is the bottleneck
- Latency and bandwidth
- Diameter
- Bisection bandwidth
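For concreteness, the standard diameter and bisection-width figures for the p-node topologies on the next slides (textbook values, not from the slides):

```latex
% Diameter and bisection width of common p-node networks
\begin{tabular}{lcc}
Topology  & Diameter              & Bisection width \\
Ring      & $\lfloor p/2 \rfloor$ & $2$             \\
2-D mesh  & $2(\sqrt{p}-1)$       & $\sqrt{p}$      \\
Hypercube & $\log_2 p$            & $p/2$           \\
\end{tabular}
```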
Message Passing Architectures

- Line/Ring
- Mesh/Torus

Message Passing Architectures

- Tree/Fat Tree

Message Passing Architectures

- Hypercube

Message Passing Architectures

- Switched
  - Nice, but also has limitations
Parallel Programming Properties

Concurrency
- Performance should increase by employing multiple processors.

Scalability
- Performance should continue to increase as we add more processors.

Locality of Reference
- Performance will be greater if we only access local memory.

Architecture affects each of the above!
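A standard way to quantify these properties (textbook definitions, not from the slide): with T_1 the serial time and T_p the time on p processors,

```latex
\[
\text{Speedup: } S(p) = \frac{T_1}{T_p}, \qquad
\text{Efficiency: } E(p) = \frac{S(p)}{p}
\]
% A program scales well when E(p) stays near 1 as p grows.
```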
Parallel Programming Models

- Threads
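As a taste of the threads model, a minimal Pthreads sketch in C (illustrative only; the array, thread count, and names are mine):

```c
/* threads_sketch.c - minimal Pthreads example (illustrative sketch).
   Two threads each sum half of an array into their own slot, so no
   lock is needed for the partial sums. Build with: gcc -pthread */
#include <pthread.h>
#include <stdio.h>

#define N 8
static int data[N] = {1, 2, 3, 4, 5, 6, 7, 8};
static long partial[2];

static void *sum_half(void *arg)
{
    long id = (long)arg;                   /* thread id: 0 or 1 */
    for (int i = id * N / 2; i < (id + 1) * N / 2; i++)
        partial[id] += data[i];            /* private slot, no lock */
    return NULL;
}

int main(void)
{
    pthread_t t[2];
    for (long id = 0; id < 2; id++)
        pthread_create(&t[id], NULL, sum_half, (void *)id);
    for (int id = 0; id < 2; id++)
        pthread_join(t[id], NULL);         /* wait for both threads */
    printf("sum = %ld\n", partial[0] + partial[1]);
    return 0;
}
```

Both threads read and write one shared address space; that is the defining feature of this model.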
Parallel Programming Models

- Message Passing
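A minimal message-passing sketch in C with MPI (illustrative; the payload and ranks are mine). Here there is no shared memory at all; data moves only via explicit sends and receives:

```c
/* mpi_sketch.c - minimal message passing example (illustrative sketch).
   Rank 0 sends an integer to rank 1. Run with at least 2 processes,
   e.g. mpicc mpi_sketch.c && mpirun -np 2 ./a.out */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, value = 42;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        /* send one int, tag 0, to rank 1 */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* receive one int, tag 0, from rank 0 */
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}
```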
Parallel Programming Models

- Data Parallel
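The course uses CUDA for data parallelism; as a CPU-side sketch of the same idea in C (using OpenMP, not CUDA - the same operation applied to every element in parallel):

```c
/* data_parallel_sketch.c - the data parallel idea in C with OpenMP
   (a stand-in for the CUDA version used in this course).
   Build with: gcc -fopenmp data_parallel_sketch.c */
#include <omp.h>
#include <stdio.h>

#define N 1000000

int main(void)
{
    static float a[N], b[N];

    for (int i = 0; i < N; i++)
        a[i] = (float)i;

    /* Each iteration is independent, so the iterations are divided
       among the available cores; in CUDA, each element would instead
       be handled by one GPU thread. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        b[i] = 2.0f * a[i] + 1.0f;

    printf("b[N-1] = %f\n", b[N - 1]);
    return 0;
}
```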
So What?

- In this class you will learn to program using each of the parallel programming models.
- We will talk about the advantages and disadvantages of each model.
- We will also learn about common parallel algorithms and techniques.