Threads and Multithreading 26-Jul-16

Threads

A Thread is a single flow of control



When you step through a program, you are following a
Thread
A Thread is an Object you can create and control
Java programs use multiple Threads

The “main Thread” is the one that starts at your program’s
main method


Other Threads will start at a run() method
There are other Threads running, for example, the one that
does garbage collection
States of a Thread

A Thread can be in one of four states:






Ready: all set to run
Running: actually doing something
Waiting, or blocked: needs something
Dead: will never do anything again
State names vary across textbooks
You have some control, but the Java scheduler has more
[State diagram with nodes: start, ready, running, waiting, dead]
Two ways of creating Threads

You can extend the Thread class:




class LongComputation extends Thread {…}
Limiting, since you can only extend one class
Your class must contain a public void run()
method
Or you can implement the Runnable interface:



class LongComputation implements
Runnable {…}
Your class must contain a public void run()
method
This approach is more generally useful
Extending Thread


class LongComputation extends Thread {
public void run() { code for this thread }
Anything else you want in this class
}
LongComputation lc = new LongComputation();




A newly created Thread has not started yet (Java reports its state as NEW); it becomes Ready once start() is called
To start the lc Thread running, call lc.start();
start() is a request to the scheduler to run the Thread—it
usually does not happen right away
The Thread should eventually enter the Running state
Implementing Runnable


class LongComputation implements Runnable {…}
The Runnable interface requires run()
This is the “main” method of your new Thread

LongComputation lc = new LongComputation();
Thread myThread = new Thread(lc);
To start the Thread running, call myThread.start();
You do not write the start() method; it’s provided by Java
As always, start() is a request to the scheduler to run the Thread; it may not happen right away
Starting and ending a Thread




Every Thread has a start() method
Do not write or override start()
You call start() to request a Thread to run
The scheduler then (eventually) calls run()

You must supply public void run()
This is where you put the code that the Thread is going to run
Nothing will prevent you from calling run() directly
If you do, it will run as an ordinary method, not in a new Thread

When the run() method finishes (or returns), the Thread “dies”
A dead Thread is truly dead: it still exists as an object, but it cannot be used again
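
A minimal sketch of the difference between calling run() directly and calling start(). The class names RunVersusStart and NameRecorder and the thread name "worker-1" are made up for this illustration; they are not from the slides.

```java
class RunVersusStart {
    /** Records the name of whichever thread ends up executing run(). */
    static class NameRecorder implements Runnable {
        volatile String ranOn;

        @Override
        public void run() {
            ranOn = Thread.currentThread().getName();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        NameRecorder direct = new NameRecorder();
        direct.run();                    // ordinary call: runs on the calling thread
        System.out.println("run() executed on:   " + direct.ranOn);

        NameRecorder started = new NameRecorder();
        Thread worker = new Thread(started, "worker-1");
        worker.start();                  // asks the scheduler to call run() on a new thread
        worker.join();                   // wait until run() has finished
        System.out.println("start() executed on: " + started.ranOn);
    }
}
```

The first line of output names the calling thread; the second names the new thread.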
Extending Thread: summary
class LongComputation extends Thread {
public void run() {
while (okToRun) {...}
}
}
LongComputation lc = new LongComputation ();
lc.start();
Implementing Runnable: summary
class LongComputation extends Applet
implements Runnable {
public void run() {
while (okToRun) { ... }
}
}
LongComputation lc = new LongComputation ( );
Thread myThread = new Thread(lc);
myThread.start( );
Daemon Threads



There are two kinds of Threads—user Threads and daemon
Threads
A daemon Thread is a “helper Thread”
A Java program continues to run until all its user Threads have
died

This means it’s very easy to have forgotten Threads running in the
background, invisibly tying up machine resources

A Java program will exit if the only Threads left are daemon
Threads

To make a Thread into a daemon Thread, call
thread.setDaemon(true), before starting the Thread
You can ask a Thread if it is a daemon Thread with the method
thread.isDaemon()
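
A small sketch of a daemon Thread, assuming a background "ticker" that would otherwise loop forever. The names DaemonDemo and newBackgroundTicker are hypothetical.

```java
class DaemonDemo {
    /** Builds (but does not start) a daemon thread that loops until interrupted. */
    static Thread newBackgroundTicker() {
        Thread ticker = new Thread(() -> {
            while (true) {
                try {
                    Thread.sleep(100);
                } catch (InterruptedException e) {
                    return;              // asked to stop: leave run()
                }
            }
        }, "ticker");
        ticker.setDaemon(true);          // must be called before start()
        return ticker;
    }

    public static void main(String[] args) {
        Thread ticker = newBackgroundTicker();
        ticker.start();
        System.out.println("ticker is daemon: " + ticker.isDaemon());
        // main returns here; only a daemon thread is left, so the program exits
    }
}
```

Without the setDaemon(true) call, this program would never exit, because the ticker would be a user Thread.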

Sleeping

Thread.sleep(long milliseconds);

A millisecond is 1/1000 of a second

try { Thread.sleep(1000); }
catch (InterruptedException e) {}

sleep only works for the current Thread

Under normal circumstances, nothing in the Java system
will interrupt your sleeping Thread…but
Your program can call someThread.interrupt()
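
A hedged sketch of what happens when a sleeping Thread is interrupted: sleep() throws InterruptedException and the Thread can react. The class and method names are illustrative only.

```java
class SleepInterruptDemo {
    /** Starts a sleeper, interrupts it, and reports whether the sleep was cut short. */
    static boolean sleepWasInterrupted(long maxMillis) throws InterruptedException {
        boolean[] interrupted = { false };
        Thread sleeper = new Thread(() -> {
            try {
                Thread.sleep(maxMillis);
            } catch (InterruptedException e) {
                interrupted[0] = true;   // woken early by interrupt()
            }
        });
        sleeper.start();
        sleeper.interrupt();             // request: stop sleeping now
        sleeper.join();                  // join also makes the sleeper's write visible
        return interrupted[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(sleepWasInterrupted(60_000)); // prints true
    }
}
```

An empty catch block, as on the slide, silently ignores the interruption; recording or acting on it is usually the safer choice.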

Things a Thread can do





Thread.sleep(milliseconds)
Thread.yield()
Thread me = Thread.currentThread();
int myPriority = me.getPriority();
me.setPriority(Thread.NORM_PRIORITY);
A higher-priority Thread is generally run in preference to a lower-priority Thread, but this is a hint to the scheduler, not a guarantee
Don’t mess with priorities unless you are really sure of what you are doing
if (otherThread.isAlive()) {…}
otherThread.join();
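
A minimal sketch of waiting for another Thread with join(), assuming a worker that adds up the numbers 1..n. JoinDemo and sumTo are names invented for this example.

```java
class JoinDemo {
    /** Computes 1 + 2 + ... + n on a separate thread and waits for the answer. */
    static long sumTo(int n) throws InterruptedException {
        long[] result = new long[1];
        Thread worker = new Thread(() -> {
            long sum = 0;
            for (int i = 1; i <= n; i++) sum += i;
            result[0] = sum;
        });
        worker.start();
        worker.join();                   // block until the worker's run() has finished
        if (worker.isAlive()) throw new IllegalStateException("join() returned early");
        return result[0];                // safe to read: join() orders the write before this
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(sumTo(100));  // prints 5050
    }
}
```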
Things a Thread should NOT do



The Thread controls its own destiny
Original Java provided these methods, which were deprecated almost immediately:






myThread.stop()
myThread.suspend()
myThread.resume()
These methods literally cannot be used in a safe way!
Never use these methods!
If you want a Thread to do something, you should ask it
(nicely)

For example, see the use of an okToRun variable in some of the
previous slides
Controlling another Thread

volatile boolean okToRun = true; // volatile, so the other Thread sees the change
lc.start();
// do stuff
okToRun = false;

public void run() {
    while (okToRun) {…}
}

This is just meant to be suggestive—typically, the
okToRun variable would be in a different class than
the run() method
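
One way the okToRun idea might look as a complete program. This is a sketch, not the slides' own code: the StoppableWorker class and its pleaseStop() method are hypothetical, and the flag is declared volatile (an addition) so that the worker is guaranteed to see the change.

```java
class StoppableWorker implements Runnable {
    private volatile boolean okToRun = true;   // written by one thread, read by another
    private long iterations = 0;               // only touched by the worker thread

    /** Asks the worker (nicely) to finish. */
    public void pleaseStop() { okToRun = false; }

    public long iterations() { return iterations; }

    @Override
    public void run() {
        while (okToRun) {
            iterations++;                      // stand-in for real work
            Thread.yield();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        StoppableWorker worker = new StoppableWorker();
        Thread t = new Thread(worker);
        t.start();
        Thread.sleep(50);                      // let it work for a while
        worker.pleaseStop();
        t.join();                              // wait for run() to return
        System.out.println("stopped after " + worker.iterations() + " iterations");
    }
}
```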
Thread pools

A thread pool is a collection of reusable threads


This can save a lot of the overhead of creating and disposing
of threads
Very basic introduction (Java 5+):



import java.util.concurrent.*;
...
ExecutorService exec = Executors.newFixedThreadPool(20);
Create some Runnable objects (objects that implement public void run())
exec.execute(Some Runnable object)
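
A very small sketch of the thread pool steps above, filled out so that it runs. The class PoolDemo, the method runTasks, and the use of an AtomicInteger to count finished tasks are choices made for this illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class PoolDemo {
    /** Runs taskCount trivial tasks on a fixed pool and returns how many completed. */
    static int runTasks(int taskCount, int poolSize) throws InterruptedException {
        ExecutorService exec = Executors.newFixedThreadPool(poolSize);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < taskCount; i++) {
            exec.execute(() -> completed.incrementAndGet());  // each task is a Runnable
        }
        exec.shutdown();                                      // accept no new tasks
        if (!exec.awaitTermination(10, TimeUnit.SECONDS)) {
            throw new IllegalStateException("tasks did not finish in time");
        }
        return completed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runTasks(100, 4));                 // prints 100
    }
}
```

Calling shutdown() matters: the pool's threads are user Threads, so a pool that is never shut down can keep the program alive.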
A problem
int k = 0;
Thread #1:
k = k + 1;


Thread #2:
System.out.print(k);
What gets printed as the value of k?
This is a trivial example of what is, in general, a
very difficult problem
Concurrency vs. parallelism



Briefly, “parallel” is a hardware term, while
“concurrent” is a software term
Parallel: Different computations are actually happening
at the same time
Concurrent: Different computations might be happening
at the same time (if you have the hardware for it)



Or one might happen before another
Or their execution may be interleaved in any arbitrary order
Concurrent programming can be done on:



a single computer with a single processor
a single computer with multiple processors
multiple computers networked together
Shared state


Processes run in completely different parts of memory, and
cannot modify one another’s data
Threads run in the same part of memory and share access
to data



If the data is immutable, reading it from different Threads does not
cause any problems
If the data is mutable, then it describes the state of the computation
(it changes as the computation proceeds), and we have shared state
Shared state makes concurrent programming difficult


Concurrent programs are nondeterministic—things could happen
in many different orders
Every possible ordering, no matter how unlikely, must produce
correct results
Problems

Concurrency can lead to data corruption:


Race conditions—if two or more processes try to write to the same data
space, or one tries to write and one tries to read, it is indeterminate which
happens first
Concurrency can lead to “freezing up” and other flow problems:

Deadlock—two or more processes are each waiting for data from the
other, or are waiting for the other to finish

Livelock—two or more processes each repeatedly change state in an
attempt to avoid deadlock, but in so doing continue to block one another

Starvation—a process never gets an opportunity to run, possibly because
other processes have higher priority
Why bother with concurrency?

We use concurrency to make programs “faster”

“Faster” may mean more responsive
We need threads, even on single core machines, to move slow operations out of the GUI

“Faster” may mean the computation completes sooner
We can:
    Break a computation into separate parts
    Distribute these partial computations to several cores
    Collect the partial results into a single result
Thread creation, communication between threads, and thread disposal constitute overhead, which is not present in the sequential version
Due to overhead costs, it is not unusual for first attempts at using concurrency to result in a slower program
Really getting much speedup requires lots of experimentation, timing tests, and tuning the code

Good performance is not platform independent
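
The break / distribute / collect pattern can be sketched as follows, assuming the computation is summing an int array. ParallelSum and its parameters are hypothetical; whether this is actually faster than a plain loop depends on the array size and the machine.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelSum {
    /** Sums data using `parts` tasks (parts must be at least 1). */
    static long sum(int[] data, int parts) throws InterruptedException, ExecutionException {
        ExecutorService exec = Executors.newFixedThreadPool(parts);
        try {
            List<Future<Long>> partials = new ArrayList<>();
            int chunk = (data.length + parts - 1) / parts;     // break into parts
            for (int p = 0; p < parts; p++) {
                final int from = Math.min(data.length, p * chunk);
                final int to = Math.min(data.length, from + chunk);
                Callable<Long> task = () -> {
                    long s = 0;
                    for (int i = from; i < to; i++) s += data[i];
                    return s;
                };
                partials.add(exec.submit(task));               // distribute
            }
            long total = 0;
            for (Future<Long> f : partials) total += f.get();  // collect
            return total;
        } finally {
            exec.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        int[] data = new int[1000];
        for (int i = 0; i < data.length; i++) data[i] = i + 1;
        System.out.println(sum(data, 4));                      // prints 500500
    }
}
```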
Atomicity


An operation is atomic if it appears to happen “all at once”
Suppose one Thread is copying an array, while a concurrent
Thread is sorting it





The first Thread might get the sorted or the unsorted array
Much worse, the first Thread might get a mixture of the two, with
some elements duplicated and others lost
We don’t want the operations to overlap; we want one to
finish before the other one starts
In other words, we want these operations to be “atomic”
We can do this by “locking” the array, so that only one
process at a time can access it, and others have to wait
Synchronization


Synchronization provides a “locking” mechanism
Here’s how it works:





Pick an object, any object
Method A synchronizes on that object
Method B synchronizes on the same object
Whichever method, A or B, happens to synchronize first, the
other method has to wait until it is done
In the case of sorting vs. copying an array, one obvious
object to synchronize on is the array itself


Or, you might want to synchronize on whatever object
contains the array
But you could literally synchronize on any object, as long as
the two methods synchronize on the same object
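
A sketch of the sorting vs. copying case, synchronizing on the array itself. The class ArrayLocking and its two methods are invented for this example.

```java
import java.util.Arrays;

class ArrayLocking {
    /** Method A: sorts while holding the array's lock. */
    static void sortInPlace(int[] array) {
        synchronized (array) {
            Arrays.sort(array);
        }
    }

    /** Method B: copies while holding the same lock. */
    static int[] snapshot(int[] array) {
        synchronized (array) {
            return array.clone();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        int[] data = {5, 3, 1, 4, 2};
        Thread sorter = new Thread(() -> sortInPlace(data));
        sorter.start();
        int[] copy = snapshot(data);  // unsorted or sorted, never a mixture
        sorter.join();
        System.out.println(Arrays.toString(copy));
    }
}
```

Which of the two results is printed can differ from run to run; what the lock rules out is a half-sorted copy.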
Ways to synchronize

You can explicitly say which object to synchronize on:
synchronized (obj) { code }
Notice that synchronized is being used as a statement

You can synchronize an instance method:
synchronized void sort(array) { code }
The object used for synchronization is the object (instance) executing this method
Thus, other synchronized instance methods cannot run at the same time, for this instance

You can synchronize a static method:
synchronized static void sort(array) { code }
The object used for synchronization is the class object, which is an object representing this entire class
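
The three forms side by side, as a sketch. The class SyncForms and its counters are hypothetical; each counter is guarded by a different lock object.

```java
class SyncForms {
    private final Object lock = new Object();
    private int a;          // guarded by lock
    private int b;          // guarded by this
    private static int c;   // guarded by SyncForms.class

    void incrementA() {
        synchronized (lock) { a++; }                // synchronized statement, explicit object
    }
    synchronized void incrementB() { b++; }         // instance method: locks this
    static synchronized void incrementC() { c++; }  // static method: locks the class object

    int getA() { synchronized (lock) { return a; } }
    synchronized int getB() { return b; }
    static synchronized int getC() { return c; }

    public static void main(String[] args) throws InterruptedException {
        SyncForms s = new SyncForms();
        Runnable work = () -> {
            for (int i = 0; i < 10_000; i++) {
                s.incrementA();
                s.incrementB();
                incrementC();
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(s.getA() + " " + s.getB() + " " + getC()); // prints "20000 20000 20000"
    }
}
```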
Synchronization is re-entrant




Suppose a Thread obtains a lock on some object:
synchronized (thing) { code }
Then the code calls another method that tries to
synchronize on the same object:
synchronized (thing) { some other code }
This works! If a Thread has a lock on some object, it is
okay for that same Thread to ask for the lock again
Synchronization prevents some other Thread from
getting the same lock
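
A tiny sketch of re-entrancy, using Thread.holdsLock to confirm that the lock is still held inside the nested block. ReentrantDemo and the field name thing (borrowed from the slide) are illustrative.

```java
class ReentrantDemo {
    private final Object thing = new Object();

    int outer() {
        synchronized (thing) {
            return inner() + 1;          // calls a method that locks thing again
        }
    }

    int inner() {
        synchronized (thing) {           // same thread, same lock: does not block
            return Thread.holdsLock(thing) ? 1 : 0;
        }
    }

    public static void main(String[] args) {
        System.out.println(new ReentrantDemo().outer()); // prints 2
    }
}
```

If locks were not re-entrant, outer() would deadlock against itself.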
Data invariants

Any publicly available method that modifies an object
should take it from one valid state to another valid state


A data invariant is a logical condition (possibly quite
complex) that describes what it means for an object to be valid
Any method that “partially” updates an object must be private


This is a fundamental rule of all object-oriented programming
Any method that modifies a shared object must be
atomic

Example:



Suppose you have a Fraction object with value 10/15
You want to reduce this Fraction to lowest terms: 2/3
It is unsafe to update the numerator in one atomic step and the denominator in another; they must both be changed in a single atomic operation
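
A sketch of the Fraction example. The slides do not show a Fraction class, so everything here (fields, constructor, reduce(), gcd()) is an assumption made for illustration.

```java
class Fraction {
    private int numerator;
    private int denominator;   // invariant: never zero

    Fraction(int numerator, int denominator) {
        if (denominator == 0) throw new IllegalArgumentException("zero denominator");
        this.numerator = numerator;
        this.denominator = denominator;
    }

    /** Changes both fields under one lock, so no thread can see 2/15 or 10/3. */
    synchronized void reduce() {
        int g = gcd(Math.abs(numerator), Math.abs(denominator));
        numerator /= g;
        denominator /= g;
    }

    @Override
    public synchronized String toString() {
        return numerator + "/" + denominator;
    }

    private static int gcd(int a, int b) {
        return b == 0 ? a : gcd(b, a % b);
    }

    public static void main(String[] args) {
        Fraction f = new Fraction(10, 15);
        f.reduce();
        System.out.println(f);           // prints 2/3
    }
}
```

Readers (here, toString) synchronize too; otherwise they could still observe the half-updated state.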
When to synchronize




A method that mutates the state of an object should move it
from one valid state to another
In the process, the data may become temporarily invalid
Invalid data should never be visible to another Thread
Suppose mutable data items A, B, and C are in some sort of
relationship to one another



Trivial example: A + B = C
Often, these data items would be fields in a single object
Fundamental rule: If mutable data can be accessed by
more than one thread, then every access to it, everywhere,
must be synchronized. No exceptions!
If you don’t always synchronize
Atomic actions


An operation, or block of code, is atomic if it happens “all at once,” that is, no other Thread can access the same data while the operation is being performed
x++; looks atomic, but at the machine level, it’s actually three separate operations:
1. load x into a register
2. add 1 to the register
3. store the register back in x

Suppose you are maintaining a stack as an array:
void push(Object item) {
    this.top = this.top + 1;
    this.array[this.top] = item;
}
You need to synchronize this method, and every other access to the stack, to make the push operation atomic
Atomic actions that maintain data invariants are thread-safe; compound (non-atomic) actions are not
This is another good reason for encapsulating your objects
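
A sketch of the array-based stack with every access synchronized. The class name SyncStack, the fixed capacity, and the pop() and size() methods are additions for this example.

```java
class SyncStack {
    private final Object[] array;
    private int top = -1;                 // index of the top item; -1 when empty

    SyncStack(int capacity) {
        array = new Object[capacity];
    }

    synchronized void push(Object item) {
        if (top + 1 == array.length) throw new IllegalStateException("stack is full");
        top = top + 1;                    // these two steps now happen "all at once"
        array[top] = item;
    }

    synchronized Object pop() {
        if (top < 0) throw new IllegalStateException("stack is empty");
        Object item = array[top];
        array[top] = null;
        top = top - 1;
        return item;
    }

    synchronized int size() {
        return top + 1;
    }

    public static void main(String[] args) throws InterruptedException {
        SyncStack stack = new SyncStack(2000);
        Runnable pusher = () -> { for (int i = 0; i < 1000; i++) stack.push(i); };
        Thread t1 = new Thread(pusher);
        Thread t2 = new Thread(pusher);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(stack.size()); // prints 2000
    }
}
```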
Check-then-act


A Vector is like an ArrayList, but is synchronized
Hence, the following code looks reasonable:
if (!myVector.contains(someObject)) { // check
    myVector.add(someObject);         // act
}
But there is a “gap” between checking the Vector and adding to it
During this gap, some other Thread may have added the object to the Vector
Check-then-act code, as in this example, is unsafe
You must ensure that no other Thread executes during the gap
synchronized(myVector) {
    if (!myVector.contains(someObject)) {
        myVector.add(someObject);
    }
}
So, what good is it that Vector is synchronized?
It means that each call to a Vector operation is atomic
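
The synchronized check-then-act can be packaged as a helper. This is a sketch with hypothetical names (AddIfAbsent, addIfAbsent); it relies on the fact that Vector's own methods lock the Vector object itself.

```java
import java.util.Vector;

class AddIfAbsent {
    /** Adds item only if it is not already present; returns true if it was added. */
    static <T> boolean addIfAbsent(Vector<T> vector, T item) {
        synchronized (vector) {           // closes the gap between check and act
            if (!vector.contains(item)) {
                vector.add(item);
                return true;
            }
            return false;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Vector<String> names = new Vector<>();
        Runnable adder = () -> addIfAbsent(names, "alice");
        Thread t1 = new Thread(adder);
        Thread t2 = new Thread(adder);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(names.size()); // prints 1
    }
}
```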
Synchronization is on an object

Synchronization can be done on any object
Synchronization is on objects, not on variables
Suppose you have
synchronized(myVector) { … }
Then it is okay to modify myVector—that is, change the values of its fields
It is not okay to say myVector = new Vector();

Synchronization is expensive








Synchronization entails a certain amount of overhead
Synchronization limits parallelism (obviously, since it keeps other Threads from
executing)
Synchronization can lead to deadlock
Moral: Don’t synchronize everything!
Local variables

A variable that is strictly local to a method is thread-safe
This is because every entry to a method gets a new copy of that variable

If a variable is a primitive type, it is thread-safe
Except for long and double! (Reads and writes of a shared, non-volatile long or double are not guaranteed to be atomic)

If a variable holds an immutable object (such as a String) it is thread-safe, because all immutable objects are thread-safe
If a variable holds a mutable object, and there is no way to access that variable from outside the method, then it can be made thread-safe
An Object passed in as a parameter is not thread-safe (unless immutable)
An Object returned as a value is not thread-safe (unless immutable)
An Object that has references to data outside the method is not thread-safe
Thread deaths


A Thread “dies” (finishes) when its run method finishes
There are two kinds of Threads: daemon Threads and non-daemon Threads
When all non-daemon Threads die, the daemon Threads are automatically terminated
If the main Thread quits, the program will appear to quit, but other non-daemon Threads may continue to run
These Threads will persist until the Java process is killed (or you reboot your computer)
A Thread is by default the same type (daemon or non-daemon) as the Thread that creates it
There is a method: void setDaemon(boolean on)
Calling someOtherThread.join() allows “this” Thread to wait for some other Thread to finish
Communication between Threads



Threads can communicate via shared, mutable data
Since the data is mutable, all accesses to it must be synchronized
Example:



synchronized(someObj) { flag = !flag; }
synchronized(someObj) { if (flag) doSomething(); }
The first version of Java provided methods to allow one thread to
control another thread: suspend, resume, stop, destroy



These methods were not safe and were deprecated almost immediately—
never use them!
They are still there because Java never throws anything away
If you want one Thread to control another Thread, do so via shared data
Use existing tools


There’s no point in trying to make something thread-safe if a
carefully crafted thread-safe version exists in the Java libraries
java.util.concurrent has (among other goodies):





ConcurrentHashMap
ConcurrentLinkedQueue
ThreadPoolExecutor
FutureTask
And java.util.concurrent.atomic has thread-safe methods on
single variables, such as these in AtomicInteger:




int addAndGet(int delta)
int getAndAdd(int delta)
boolean compareAndSet(int expect, int update)
void lazySet(int newValue)
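
A short sketch of AtomicInteger in use. The class AtomicCounterDemo and the method countConcurrently are made up for this example; the AtomicInteger calls are the ones listed above.

```java
import java.util.concurrent.atomic.AtomicInteger;

class AtomicCounterDemo {
    /** Has several threads bump one shared counter, with no synchronized blocks. */
    static int countConcurrently(int threads, int perThread) throws InterruptedException {
        AtomicInteger counter = new AtomicInteger();          // starts at 0
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) counter.addAndGet(1);
            });
            workers[t].start();
        }
        for (Thread w : workers) w.join();
        return counter.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(countConcurrently(4, 10_000));     // prints 40000

        AtomicInteger n = new AtomicInteger(5);
        System.out.println(n.getAndAdd(3));                   // prints 5 (old value); n is now 8
        System.out.println(n.compareAndSet(8, 10));           // prints true; n is now 10
        System.out.println(n.compareAndSet(8, 11));           // prints false; n stays 10
    }
}
```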
Advice

Any data that can be made immutable, should be made
immutable





This applies especially to input data--make sure it’s completely read in
before you work with it, then don’t allow changes
All mutable data should be carefully encapsulated (confined to
the class in which it occurs)
All access to mutable data (writing and reading it) must be
synchronized
All operations that modify the state of data, such that validity
conditions may be temporarily violated during the operation,
must be made atomic (so that the data is valid both before and
after the operation)
Be careful not to leave Threads running after the program
finishes
Debugging



“Program testing can be used to show the presence of bugs, but never to show their absence!” -- Edsger Dijkstra
Concurrent programs are nondeterministic: Given
exactly the same data and the same starting conditions,
they may or may not do the same thing
It is virtually impossible to completely test concurrent
programs; therefore:



Test the non-concurrent parts as thoroughly as you can
Be extremely careful with concurrency; you have to depend
much more on programming discipline, much less on testing
Document your concurrency policy carefully, in order to make
the program more maintainable in the future
The End