Lecture 9

Instructor
Neelima Gupta
ngupta@cs.du.ac.in
Parallel Algorithms
Instructor: Ms Neelima Gupta
Thanks to: Tejinder Kaur (35, MCS '09)
• Parallel computing: solving a problem on multiple processors.
• S(n) is the sequential time to solve a problem of size n.
• T(n,p) is the parallel time to solve the problem on p processors.
• W(n) is the work done by a parallel algorithm: W(n) = T(n,p) · p
• A parallel algorithm is optimal if its work matches the time of the best known sequential algorithm, i.e. if W(n) = O(S(n)).
• Speedup is the factor by which the running time improves when more processors are used: speedup = S(n) / T(n,p)
Consider the problem of computing the sum of n numbers.
Sequential time = Θ(n)
We have 2 processors, p1 and p2, and the numbers are
2, 3, 4, 5, 1, 11, 13, 10, 7, 8
Initially all the numbers are with p1, which sends half of them to p2. Both p1 and p2 compute the sums of their halves and send the sums s1 and s2 to each other, so both have the final sum.
[Figure: p1 holds 2, 3, 4, 5, 1 and p2 holds 11, 13, 10, 7, 8; after exchanging sums, each holds s1 + s2.]
Communication time (exchanging s1 and s2) = Θ(1)
Computation time = Θ(n/2)
T(n,2) = Θ(n/2)
W(n) = (n/2) · 2 = n
Hence this algorithm is optimal.
Speedup = n / (n/2) = 2
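The two-processor scheme can be sketched in Python as follows. This is only an illustration: the function name `two_processor_sum` is hypothetical, and two worker threads stand in for p1 and p2 to model the structure of the algorithm, not to demonstrate a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def two_processor_sum(values):
    """Split the input in half, sum each half on its own worker, then combine."""
    mid = len(values) // 2
    halves = [values[:mid], values[mid:]]  # p1 keeps the first half, p2 gets the second
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Each worker performs about n/2 additions.
        s1, s2 = pool.map(sum, halves)
    # Exchanging s1 and s2 is the constant-time communication step.
    return s1, s2, s1 + s2


numbers = [2, 3, 4, 5, 1, 11, 13, 10, 7, 8]
s1, s2, total = two_processor_sum(numbers)
print(s1, s2, total)  # 15 49 64
```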
PARALLEL MODELS
Distributed Computing
[Figure: five independent machines M1, M2, M3, M4, M5.]
There are several independent machines. They communicate with each other by passing messages. The final result comes from all the independent machines.
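SHARED MEMORY MODEL
• All the processors read from and write to the same memory.
• There is no direct communication between processors; they interact only through the shared memory.
• Processors cannot write to the same location at the same time, but they can read at the same time.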
[Figure: processors p1, p2, p3, ..., pn all connected to a shared memory.]
Models for concurrency in the shared memory model
• EREW (Exclusive Read, Exclusive Write)
• CREW (Concurrent Read, Exclusive Write)
• CRCW (Concurrent Read, Concurrent Write)
The weakest is EREW.
CREW is stronger than EREW but weaker than CRCW.
If we go from CRCW to CREW, there is a slowdown by a factor of O(log n).
CRCW Model with respect to searching
The CRCW model has three variants:
• Common: all processors should write the same value.
• Arbitrary: processors can write different values, but any one value gets written.
• Priority: processors can write different values, but the processor given the highest priority gets to write its value.
Made By : Deepika Kamboj ( Roll No.7, MSc '11 )
Searching for a key
[Figure: on a CRCW machine, processors p1, p2, p3, ..., pi, ..., pn each compare their element x1, x2, x3, ..., xi, ..., xn with the key in parallel. The OUTPUT cell starts at 0 and becomes 1 once a match is found.]
Thanks to 'PREETI'
VERSION 1 OF SEARCHING
• To determine whether the given KEY exists.
• Model used: CRCW (Common, Priority, or Arbitrary). Every matching processor writes the same value 1, so any of the three variants works.
Example for version 1
Key = 7
Values held by p1, ..., p6: 12, 7, 22, 15, 7, 30
COMPARISON: 12≠7, 7=7, 22≠7, 15≠7, 7=7, 30≠7
OUTPUT: initially 0. p2 and p5 find a match and both write 1, so the final OUTPUT is 1.
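A sequential simulation of the existence search may help make the Common write rule concrete. This is a minimal sketch: the name `crcw_common_exists` is hypothetical, and the list comprehension stands in for the single parallel comparison step.

```python
def crcw_common_exists(values, key):
    """Simulate one Common CRCW step: return 1 if key occurs in values, else 0."""
    output = 0  # shared OUTPUT cell, initialised to 0
    # One parallel step: processor i compares values[i] with the key,
    # and every processor that finds a match attempts to write 1.
    attempted_writes = [1 for x in values if x == key]
    # Common rule: all concurrent writers must be writing the same value.
    assert len(set(attempted_writes)) <= 1
    if attempted_writes:
        output = attempted_writes[0]
    return output


print(crcw_common_exists([12, 7, 22, 15, 7, 30], 7))  # 1
print(crcw_common_exists([12, 7, 22, 15, 7, 30], 8))  # 0
```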
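VERSION 2 OF SEARCHING
• To find the id of a processor that holds the KEY.
• Model used: CRCW (variants considered: Common, Priority, Arbitrary). Matching processors write different ids, so the Common rule does not apply directly; the Priority or Arbitrary rule resolves the write.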
Example for version 2
Key = 7
Values held by p1, ..., p6: 12, 7, 22, 15, 7, 30
COMPARISON: 12≠7, 7=7, 22≠7, 15≠7, 7=7, 30≠7
OUTPUT: initially 0. p2 and p5 find a match and each writes its own id, so the final OUTPUT is p5 or p2.
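The "p5 or p2" outcome corresponds to the Arbitrary write rule, which can be sketched as below. The function name `crcw_arbitrary_find` is hypothetical, processor ids are taken to be 1-based, and a random choice stands in for "any one value gets written".

```python
import random


def crcw_arbitrary_find(values, key, rng=random):
    """Simulate an Arbitrary CRCW write: return the 1-based id of some
    processor whose element equals key, or 0 if no processor matches."""
    # One parallel step: every matching processor tries to write its own id.
    attempted_writes = [i + 1 for i, x in enumerate(values) if x == key]
    if not attempted_writes:
        return 0  # OUTPUT keeps its initial value
    # Arbitrary rule: exactly one of the attempted writes succeeds,
    # and the algorithm may not assume which one.
    return rng.choice(attempted_writes)


result = crcw_arbitrary_find([12, 7, 22, 15, 7, 30], 7)
print(result)  # either 2 or 5
```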
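VERSION 3 OF SEARCHING
• To find the LEFTMOST OCCURRENCE of the given KEY.
• Model used: CRCW Priority, with lower-numbered processors given higher priority. Common and Arbitrary (×) cannot guarantee that the leftmost match is the value written.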
Example for version 3
Key = 7
Values held by p1, ..., p6: 12, 7, 22, 15, 7, 30
COMPARISON: 12≠7, 7=7, 22≠7, 15≠7, 7=7, 30≠7
OUTPUT: initially 0. p2 and p5 find a match; p2 holds the leftmost occurrence, so the final OUTPUT is p2.
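The leftmost-occurrence search can be simulated under the Priority rule as follows. This sketch assumes that the lowest-numbered processor has the highest priority; the name `crcw_priority_leftmost` is hypothetical.

```python
def crcw_priority_leftmost(values, key):
    """Simulate a Priority CRCW write: return the 1-based id of the leftmost
    processor whose element equals key, or 0 if no processor matches."""
    # One parallel step: every matching processor tries to write its own id.
    attempted_writes = [i + 1 for i, x in enumerate(values) if x == key]
    if not attempted_writes:
        return 0  # OUTPUT keeps its initial value
    # Priority rule (assumed here: smaller id means higher priority),
    # so the write of the leftmost matching processor is the one that lands.
    return min(attempted_writes)


print(crcw_priority_leftmost([12, 7, 22, 15, 7, 30], 7))  # 2
```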
SUM PROBLEM
Find the sum of n numbers using n processors.
[Figure: a binary tree over the inputs a1, a2, a3, a4, ..., an. The levels use n processors, then n/2 processors, then n/4 processors, down to 1 processor at the root.]
The height of this tree is log n.
Each step takes constant time.
Hence this algorithm takes O(log n) time.
W(n) = n · log n = n log n
Speedup = n / log n
This algorithm is not optimal, as half of the processors are idle in the first step and the number of idle processors increases in further steps.
What if we use n/log n processors?
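The tree-based sum can be simulated sequentially to count its parallel steps. In this sketch (the name `tree_sum` is hypothetical), each pass of the loop models one parallel step in which all pairs are added at once.

```python
def tree_sum(values):
    """Return (total, parallel_steps) for a pairwise tree reduction."""
    level = list(values)
    if not level:
        return 0, 0
    steps = 0
    while len(level) > 1:
        # One parallel step: each active processor adds one adjacent pair.
        next_level = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            next_level.append(level[-1])  # an unpaired value is carried up unchanged
        level = next_level
        steps += 1
    return level[0], steps


print(tree_sum([1, 2, 3, 4, 5, 6, 7, 8]))  # (36, 3)
```

With n = 8 inputs the loop runs log n = 3 times, and the number of additions per step drops from 4 to 2 to 1, which is the growing idleness that makes the work n log n.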
Since the number of processors is n/log n, each processor gets log n values.
Take m = n/log n.
Each processor has log n values and computes their sum, so m partial sums s1, s2, ..., sm are generated.
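Computing each partial sum takes log n time.
The partial sums are then added in a tree of height log m, which takes log m <= log n time.
So T(n,p) <= 2 log n = O(log n)
W(n) = (n/log n) · O(log n) = O(n) = O(S(n)), as the sequential time is O(n).
Hence this algorithm is optimal.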
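The two-phase scheme with n/log n processors can be sketched as below. The name `blocked_sum` is hypothetical, logarithms are taken base 2, and the block size is rounded up when log n is not an integer; these are assumptions of the sketch.

```python
import math


def blocked_sum(values):
    """Return (total, processors_used, parallel_steps) for the two-phase sum."""
    n = len(values)
    if n == 0:
        return 0, 0, 0
    block = max(1, math.ceil(math.log2(n))) if n > 1 else 1
    blocks = [values[i:i + block] for i in range(0, n, block)]
    # Phase 1: each processor sums its own block sequentially.
    # All blocks are processed in parallel, so this costs block - 1 steps.
    partial = [sum(b) for b in blocks]
    steps = block - 1
    # Phase 2: tree reduction over the m partial sums, log m steps.
    level = partial
    while len(level) > 1:
        level = [sum(level[i:i + 2]) for i in range(0, len(level), 2)]
        steps += 1
    return level[0], len(blocks), steps


print(blocked_sum(list(range(1, 17))))  # (136, 4, 5)
```

For n = 16 this uses 4 processors and 5 parallel steps, within the 2 log n = 8 bound.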
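SORTING
Sort n numbers in parallel with n processors.
Initially each processor has one element.
[Figure: a merge tree over a1, a2, a3, a4, ..., an. The first level performs n/2 merges producing lists of size 2, the next level n/4 merges producing lists of size 4, and the last level 1 merge producing a list of size n.]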
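The last step takes n units of time, and each earlier step takes half the time of the step after it:
n + n/2 + n/4 + ... + 2 <= 2n
So it takes O(n) time.
W(n) = n · O(n) = O(n^2)
This is not optimal, since sequential sorting takes O(n log n) time.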
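The merge-tree sort and its time bound can be checked with a sequential simulation. In this sketch the names `merge` and `parallel_merge_sort` are hypothetical, and the cost of a level is taken to be the length of its longest merged list, since all merges in a level run in parallel.

```python
def merge(left, right):
    """Standard two-way merge of two sorted lists."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out


def parallel_merge_sort(values):
    """Return (sorted_list, parallel_time) for the level-by-level merge tree."""
    runs = [[v] for v in values]  # initially each processor holds one element
    time = 0
    while len(runs) > 1:
        merged = [
            merge(runs[i], runs[i + 1]) if i + 1 < len(runs) else runs[i]
            for i in range(0, len(runs), 2)
        ]
        time += max(len(r) for r in merged)  # one level costs its longest merge
        runs = merged
    return (runs[0] if runs else []), time


print(parallel_merge_sort([5, 2, 7, 1, 8, 3, 6, 4]))
# ([1, 2, 3, 4, 5, 6, 7, 8], 14)
```

For n = 8 the levels cost 2 + 4 + 8 = 14 units, which is within the 2n = 16 bound.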
Instructor: Ms Neelima Gupta
Thanks to: Surbhi Tripathi (27, MCS '09)
Definition: Prefix Sums
Given: a set of n values A = {a0, a1, ..., a_{n-1}}
We want to find the prefix sums S0, S1, ..., S_{n-1}, where
S0 = a0
S1 = a1 + a0
...
S_{n-1} = a_{n-1} + ... + a1 + a0
STEP - II
Inputs: a0 a1 a2 a3 | a4 a5 a6 a7
Results of the previous step:
P1: s1
P2: a2 o a3
P3: a4 o a5
P4: a6 o a7
Results of step II:
p1: s2 (s1 o a2)
p2: s3 (s1 o a2 o a3)
p3: a4 o a5 o a6
p4: a4 o a5 o a6 o a7
STEP - III
p1: s4 (s3 o a4)
p2: s5 (s3 o a4 o a5)
p3: s6 (s3 o a4 o a5 o a6)
p4: s7 (s3 o a4 o a5 o a6 o a7)
CREW Model
Computations of prefix sums do not require any
concurrent writes.
TIME COMPLEXITY
To compute the prefix sums of n numbers:
The number of prefix sums computed doubles at each step, so computing n prefix sums gives a tree of height log n.
Each step takes constant time.
So computing n prefix sums using n processors in parallel takes O(log n) time.
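The doubling scheme can be simulated sequentially to confirm the step count. In this sketch the name `prefix_sums` is hypothetical and addition stands in for the operator o. Each pass of the outer loop models one parallel step: every element in the right half of a block pair adds the last prefix of the left half, which several processors read at the same time and no two processors write to the same cell.

```python
def prefix_sums(values):
    """Return (prefix_sums, parallel_steps) using block doubling."""
    s = list(values)
    n = len(s)
    steps = 0
    block = 1
    while block < n:
        new = list(s)
        for start in range(0, n, 2 * block):
            left_last = start + block - 1  # holds the total of the left block
            for i in range(start + block, min(start + 2 * block, n)):
                # Concurrent read of s[left_last], exclusive write to new[i].
                new[i] = s[left_last] + s[i]
        s = new
        steps += 1
        block *= 2
    return s, steps


print(prefix_sums([1, 2, 3, 4, 5, 6, 7, 8]))
# ([1, 3, 6, 10, 15, 21, 28, 36], 3)
```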