Concurrency The need for speed

advertisement
Concurrency
The need for speed
Why concurrency?

Moore’s law:
1.
2.
The number of components on a chip doubles about every 18 months
The speed of computation doubles about every 18 months

Moore’s law (version 1) continues to hold
Moore’s law (version 2) ended in about 2003
But we’ve gotten to depend on faster and faster machines!

In addition,




Some problems are more easily and naturally solved with concurrency
There are qualitative, as well as quantitative, benefits to faster programs
2
Terminology

The terms thread and process each refer to a single sequential
flow of execution



Parallel processing: Threads running simultaneously, on different
cores (processors), in the same computer


Separate threads run in the same memory space, and interact by reading
from and writing to shared memory locations
Separate processes each have their own memory, and interact by sending
messages
“Parallelism”is essentially a hardware term.
Concurrent processing: Threads running asynchronously, on one
or more cores, usually in the same computer

“Concurrency” is essentially a software term.
3
Synchronicity

Synchronous processes run in parallel, such that each process
“knows” how far along in the computation the other threads are


Asynchronous means that you cannot tell whether operation A in
thread #1 happens before, during, or after operation B in thread
#2



Example: A vector machine, which does the same operation at the same
time to every element of an array
Asynchronous processes may be running simultaneously on different
cores, or they may be sharing time on the same core
Asynchronous processes may be running on distributed computers
An asynchronous function call starts another function executing
but does not wait for it to return


The calling thread just keeps on going
Some other mechanism is used for retrieving the function result later
4
Concurrency with shared state

Shared state: Threads run in the same memory space


Modifying a piece of data while another thread is attempting to read or modify it
will result in inconsistent state
Synchronization must be used to temporarily lock data to give one thread exclusive
access to it



Used by the C family of languages, including Java and Scala
An operation, or block of code, is atomic if it happens “all at once”




Immutable data need not be locked
No other thread can access the same data while the operation is being performed
In a higher-level language, there are essentially no atomic actions
Synchronization is used to make actions atomic.
Check-then-act is a common error pattern


if(check(x)) {act(x)} where check(x) and act(x) are each
synchronized
Another process may modify x in the interval between checking and acting
5
Problems

Race conditions: If two or more processes try to write to the same
data space, or one tries to write and one tries to read, it is
indeterminate which happens first




The processes may even interleave, to produce inconsistent data
Deadlock: Two or more processes are each waiting for data from
the other, or are waiting for a lock held by the other.
Livelock: Two or more processes each repeatedly change state in
an attempt to avoid deadlock, but in so doing continue to block
one another
Starvation: A process never gets an opportunity to run, possibly
because other processes have higher priority
6
Software Transactional Memory


Software Transactional Memory (STM) uses shared state, but
avoids locks
Approach:



Make a snapshot of all relevant data
Perform the operation
Check if the relevant data has changed since the snapshot



STM is optimistic, while locking is pessimistic



If not, atomically replace the data
If so, throw away the result and perform the computation again
We lock because the data might change as we work
STM works well when the data is not likely to change
STM is built in to Clojure, but available in other languages
7
Concurrency without shared state

Actors, with message passing




Processes run in disjoint memory spaces, and all
communication is by sending messages from one process to
another
Each process has a queue of received messages
This approach avoids many of the problems of the shared state
model, but requires processes to be very lightweight (cheap)
This is the approach used by Erlang, and subsequently by
Clojure and Scala
8
Does concurrency help?

Concurrency does not always make things faster

Overhead is work to establish the concurrency that a sequential
program does not need to do





Threads may need to wait for a lock, processes may need to wait for a
message
Unless all processors get exactly the same amount of work,
some will be idle
Processes may have to contend for resources
Shared state requires synchronization, which is expensive
Some things can only be done sequentially
9
Overhead

Overhead is any cost incurred by the parallel algorithm but not by
the corresponding sequential algorithm



Communication among threads and processes (a single thread has no other
threads with which to communicate)
Synchronization is when one thread or process has to wait for results or
events from another thread or process
Contention for a shared resource, such as memory




Java’s synchronized is used to wait for a lock to become free
Extra computation to combine the results of the various threads or
processes
Extra memory may be needed to give each thread or process the memory
required to do its job
Messages must be marshalled (packaged) for transmission and
unmarshalled when received
10
Idle time

Idle time results when


There is a load imbalance--one process may have much less
work to do than another
A process must wait for access to memory or some other
shared resource


Data is registers is most quickly accessed
Data in a cache is next most quickly accessed
 A level 1 cache is the fastest, but also the smallest
 A level 2 cache is larger, but slower
 Memory--RAM--is much slower
 Disk access is very much slower
11
Dependencies

A dependency is when one thread or process requires
the result of another thread or process

Example: (a + b) * (c + d)




The additions can be done in parallel
The multiplication must wait for the results of the additions
Of course, at this level, the hardware itself handles the parallelism
Threads or processors that depend on results from other
threads or processors must wait for those results
12
Amdahl’s law





Some proportion P of a program can be made to run in
parallel, while the remaining (1 - P) must remain
sequential
If there are N processors, then the computation can be
done in (1 - P) + P/N time
The maximum speedup is then
1
.
(1 - P) + P/N
As N goes to infinity, the maximum speedup is 1/(1 - P)
For example, if P = 0.75, the maximum speedup is
(1/0.25), or four times
13
Consequences of Amdahl’s law





If 75% of a process can be parallelized, and there are four
processors, then the possible speedup is
1 / ((1 - 0.75) + 0.75/4) = 2.286
But with 40 processors--ten times as many--the speedup is only
1 / ((1 - 0.75) + 0.75/40) = 3.721
This has led many people (including Amdahl) to conclude that
having lots of processors won’t help very much
However....
For many problems, as the data set gets larger,



The inherently sequential part of the program remains (fairly) constant
Thus, the sequential proportion P becomes smaller
So: The greater the volume of data, the more speedup we can get
14
Why functional programming?

Here are the three most important reasons that functional
programming is better for concurrency than imperative
programming:
Immutable values are automatically thread safe
Immutable values are automatically thread safe
Immutable values are automatically thread safe



Why not functional programming?


Functional languages—Lisp, Haskell, ML, OCaml—have
long been regarded as only for ivory-tower academics
Functional languages are “weird” (meaning: unfamiliar)
15
What’s happening now?

Moore’s law has ended







Instead of getting faster processors, we’re now getting more of
them
Consequently, parallelism, and concurrency, have become much
more important
Python has gotten more functional
Other languages are getting more functional
Microsoft is starting to promote F# (based on ML)
Java 8 will have some functional features
Scala is a hybrid object/functional language


Scala is designed to be a “better Java”
Like many new languages, Scala runs on the JVM
16
The End
17
I’m thinking of something.
0%
0%
er
al
0%
M
in
bl
e
C.
Ve
ge
ta
B.
Animal
Vegetable
Mineral
An
im
al
A.
18
Download