ECE 1747H: Parallel Programming
Lecture 1-part2: More on parallelism
and dependences -- synchronization
Synchronization
• All programming models give the user the
ability to control the ordering of events on
different processors.
• This facility is called synchronization.
Example 1
f() { a = 1; b = 2; c = 3; }
g() { d = 4; e = 5; f = 6; }
main() { f(); g(); }
• No dependences between f and g.
• Thus, f and g can be run in parallel.
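As a rough illustration of Example 1, the sketch below runs f and g as two POSIX threads. The choice of pthreads, the `void *` thread signatures, the helper `run_example1`, and the renaming of g's third variable to `f_var` (in C, the slide's `f = 6` would clash with the function name `f`) are assumptions made for illustration, not part of the lecture.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Shared variables from Example 1. g's "f" is renamed f_var here
   (assumption) because f is already the name of a function. */
int a, b, c, d, e, f_var;

static void *f(void *arg) { (void)arg; a = 1; b = 2; c = 3; return NULL; }
static void *g(void *arg) { (void)arg; d = 4; e = 5; f_var = 6; return NULL; }

/* Hypothetical driver: starts f and g concurrently and waits for both. */
int run_example1(void) {
    pthread_t tf, tg;
    int rc = pthread_create(&tf, NULL, f, NULL);
    assert(rc == 0);
    rc = pthread_create(&tg, NULL, g, NULL);
    assert(rc == 0);
    pthread_join(tf, NULL);
    pthread_join(tg, NULL);
    /* f and g touch disjoint variables, so the result does not
       depend on how the two threads interleave. */
    return a + b + c + d + e + f_var;
}
```

Because f and g write disjoint variables, no ordering between the two threads has to be enforced; the only synchronization is the final join.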
Example 2
f() { a = 1; b = 2; c = 3; }
g() { a = 4; b = 5; c = 6; }
main() { f(); g(); }
• Dependences between f and g.
• Thus, f and g cannot be run in parallel.
Example 2 (continued)
f() { a = 1; b = 2; c = 3; }
g() { a = 4; b = 5; c = 6; }
main() { f(); g(); }
• Dependences are between assignments to a,
assignments to b, assignments to c.
• No other dependences.
• Therefore, we only need to enforce these
dependences.
Synchronization Facility
• Suppose we had a set of primitives,
signal(x) and wait(x).
• wait(x) blocks unless a signal(x) has
occurred.
• signal(x) does not block, but causes a
wait(x) to unblock, or causes a future
wait(x) not to block.
Example 2 (continued)
Original:
f() { a = 1; b = 2; c = 3; }
g() { a = 4; b = 5; c = 6; }
main() { f(); g(); }
With synchronization:
f() { a = 1; signal(e_a); b = 2; signal(e_b); c = 3; signal(e_c); }
g() { wait(e_a); a = 4; wait(e_b); b = 5; wait(e_c); c = 6; }
main() { f(); g(); }
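The signal(x) and wait(x) primitives described above could be realized in several ways. The following is a minimal sketch, assuming POSIX threads, that builds them from a mutex and a condition variable. The type `event_t` and the names `event_init`, `event_signal` and `event_wait` are hypothetical; the plain names `signal` and `wait` are avoided because the C library already uses them for unrelated functions.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical event type: a flag guarded by a mutex and a condition variable. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            signaled;
} event_t;

void event_init(event_t *x) {
    int rc = pthread_mutex_init(&x->lock, NULL);
    assert(rc == 0);
    rc = pthread_cond_init(&x->cond, NULL);
    assert(rc == 0);
    x->signaled = false;
}

/* signal(x): does not block on the event; wakes any current waiters
   and makes every future wait(x) return immediately. */
void event_signal(event_t *x) {
    pthread_mutex_lock(&x->lock);
    x->signaled = true;
    pthread_cond_broadcast(&x->cond);
    pthread_mutex_unlock(&x->lock);
}

/* wait(x): blocks unless a signal(x) has already occurred. */
void event_wait(event_t *x) {
    pthread_mutex_lock(&x->lock);
    while (!x->signaled)                      /* loop guards against spurious wakeups */
        pthread_cond_wait(&x->cond, &x->lock);
    pthread_mutex_unlock(&x->lock);
}
```

The flag is what lets a signal that arrives before the wait still be observed: a condition variable alone does not remember past wakeups.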
Example 2 (continued)
f():               g():
a = 1;             wait(e_a);
signal(e_a);       a = 4;
b = 2;             wait(e_b);
signal(e_b);       b = 5;
c = 3;             wait(e_c);
signal(e_c);       c = 6;
• Execution is (mostly) parallel and correct.
• Dependences are “covered” by synchronization.
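Example 2 can be tried out with real threads. The sketch below assumes POSIX threads and uses POSIX semaphores as a stand-in for the events: `sem_post` plays the role of signal and `sem_wait` the role of wait, which matches the slides' semantics here because each event is signaled once and waited on once. The driver `run_example2` is a hypothetical helper.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>

int a, b, c;
static sem_t e_a, e_b, e_c;   /* one event per dependence to be covered */

static void *f(void *arg) {
    (void)arg;
    a = 1; sem_post(&e_a);    /* signal(e_a) */
    b = 2; sem_post(&e_b);    /* signal(e_b) */
    c = 3; sem_post(&e_c);    /* signal(e_c) */
    return NULL;
}

static void *g(void *arg) {
    (void)arg;
    sem_wait(&e_a); a = 4;    /* wait(e_a) */
    sem_wait(&e_b); b = 5;    /* wait(e_b) */
    sem_wait(&e_c); c = 6;    /* wait(e_c) */
    return NULL;
}

/* Hypothetical driver: g is started first on purpose; the waits
   still force each of its assignments to follow the one in f. */
int run_example2(void) {
    pthread_t tf, tg;
    int rc = sem_init(&e_a, 0, 0);
    assert(rc == 0);
    rc = sem_init(&e_b, 0, 0);
    assert(rc == 0);
    rc = sem_init(&e_c, 0, 0);
    assert(rc == 0);
    rc = pthread_create(&tg, NULL, g, NULL);
    assert(rc == 0);
    rc = pthread_create(&tf, NULL, f, NULL);
    assert(rc == 0);
    pthread_join(tf, NULL);
    pthread_join(tg, NULL);
    return a * 100 + b * 10 + c;
}
```

Whatever the scheduling, the final values are those of the sequential program (a = 4, b = 5, c = 6), since every assignment in g waits for the matching assignment in f.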
About synchronization
• Synchronization is necessary to make some
programs execute correctly in parallel.
• However, synchronization is expensive.
• Therefore, synchronization needs to be reduced, or
sometimes we need to give up on parallelism.
Example 3
Original:
f() { a=1; b=2; c=3; }
g() { d=4; e=5; a=6; }
main() { f(); g(); }
With synchronization:
f() { a=1; signal(e_a); b=2; c=3; }
g() { d=4; e=5; wait(e_a); a=6; }
main() { f(); g(); }
Example 4
for( i=1; i<100; i++ ) {
a[i] = …;
…;
… = a[i-1];
}
• Loop-carried dependence, not parallelizable
Example 4 (continued)
for( i=...; i<...; i++ ) {
a[i] = …;
signal(e_a[i]);
…;
wait(e_a[i-1]);
… = a[i-1];
}
Example 4 (continued)
• Note that here it matters which iterations are
assigned to which processor.
• It does not matter for correctness, but it
matters for performance.
• Cyclic assignment is probably best.
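To make the cyclic assignment concrete, here is a minimal sketch of the synchronized loop of Example 4, assuming POSIX threads and one POSIX semaphore per array element as the event e_a[i]. The thread count `P`, the loop bodies (`2 * i` and `a[i-1] + 1`, which stand in for the elided "…" parts), the second array `b`, and the driver `run_example4` are all assumptions for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdint.h>

#define N 100   /* loop bound from Example 4 */
#define P 4     /* number of threads (assumption) */

int a[N], b[N];
static sem_t e_a[N];   /* e_a[i] is signaled once a[i] has been written */

/* Thread t runs iterations 1+t, 1+t+P, 1+t+2P, ... (cyclic assignment). */
static void *worker(void *arg) {
    int t = (int)(intptr_t)arg;
    for (int i = 1 + t; i < N; i += P) {
        a[i] = 2 * i;            /* stands in for "a[i] = ..." */
        sem_post(&e_a[i]);       /* signal(e_a[i]) */
        sem_wait(&e_a[i - 1]);   /* wait(e_a[i-1]) */
        b[i] = a[i - 1] + 1;     /* stands in for "... = a[i-1]" */
    }
    return NULL;
}

/* Hypothetical driver: returns 1 if every b[i] saw the right a[i-1]. */
int run_example4(void) {
    pthread_t th[P];
    for (int i = 0; i < N; i++) {
        int rc = sem_init(&e_a[i], 0, 0);
        assert(rc == 0);
    }
    a[0] = 0;
    sem_post(&e_a[0]);           /* a[0] is available before the loop starts */
    for (int t = 0; t < P; t++) {
        int rc = pthread_create(&th[t], NULL, worker, (void *)(intptr_t)t);
        assert(rc == 0);
    }
    for (int t = 0; t < P; t++) pthread_join(th[t], NULL);
    int ok = 1;
    for (int i = 1; i < N; i++) ok &= (b[i] == 2 * (i - 1) + 1);
    return ok;
}
```

With this cyclic scheme, consecutive iterations run on different threads, so a thread rarely waits long for its predecessor. With a block assignment, the first iteration of each block would instead wait until the preceding thread had reached the last iteration of its own block.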
Example 5
for( i=0; i<100; i++ ) a[i] = f(i);
x = g(a);
for( i=0; i<100; i++ ) b[i] = x + h( a[i] );
• First loop can be run in parallel.
• Middle statement is sequential.
• Second loop can be run in parallel.
Example 5 (continued)
• We will need to make parallel execution
stop after first loop and resume at the
beginning of the second loop.
• Two (standard) ways of doing that:
– fork() - join()
– barrier synchronization
Fork-Join Synchronization
• fork() causes a number of processes to be
created and to be run in parallel.
• join() causes all these processes to wait until
all of them have executed a join().
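The fork-join approach to Example 5 might look as follows with POSIX threads. This is only a sketch: the slides' fork() and join() are abstract primitives, and here they are approximated by one hypothetical helper, `fork_join`, which creates `P` threads and then waits for all of them. The bodies of `f`, `g` and `h`, the thread count, and the block partitioning of the iterations are assumptions.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define N 100   /* loop bound from Example 5 */
#define P 4     /* number of threads (assumption) */

double a[N], b[N], x;

/* Hypothetical stand-ins for the f, g and h of Example 5. */
static double f(int i) { return (double)i; }
static double g(const double *v) {
    double s = 0.0;
    for (int i = 0; i < N; i++) s += v[i];
    return s;
}
static double h(double v) { return 2.0 * v; }

/* Each thread handles one contiguous block of iterations. */
static void *loop1(void *arg) {
    int t = (int)(intptr_t)arg;
    for (int i = t * N / P; i < (t + 1) * N / P; i++) a[i] = f(i);
    return NULL;
}
static void *loop2(void *arg) {
    int t = (int)(intptr_t)arg;
    for (int i = t * N / P; i < (t + 1) * N / P; i++) b[i] = x + h(a[i]);
    return NULL;
}

/* fork: start P threads on body; join: wait until all have finished. */
static void fork_join(void *(*body)(void *)) {
    pthread_t th[P];
    for (int t = 0; t < P; t++) {
        int rc = pthread_create(&th[t], NULL, body, (void *)(intptr_t)t);
        assert(rc == 0);
    }
    for (int t = 0; t < P; t++) pthread_join(th[t], NULL);
}

void run_example5(void) {
    fork_join(loop1);   /* first loop, in parallel */
    x = g(a);           /* sequential middle statement */
    fork_join(loop2);   /* second loop, in parallel */
}
```

Between the two parallel phases only the calling thread runs, so `x = g(a)` sees all of `a` and is itself visible to every thread of the second phase.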
Example 5 (continued)
fork();
for( i=...; i<...; i++ ) a[i] = f(i);
join();
x = g(a);
fork();
for( i=...; i<...; i++ ) b[i] = x + h( a[i] );
join();
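As an alternative to forking and joining twice, Example 5 can also be written with barrier synchronization: the threads are created once and meet at a barrier between phases. The sketch below assumes POSIX threads with `pthread_barrier_t`; the bodies of `f`, `g` and `h`, the thread count `P`, the choice of thread 0 for the sequential statement, and the driver `run_example5_barrier` are assumptions for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define N 100   /* loop bound from Example 5 */
#define P 4     /* number of threads (assumption) */

double a[N], b[N], x;
static pthread_barrier_t bar;

/* Hypothetical stand-ins for the f, g and h of Example 5. */
static double f(int i) { return (double)i; }
static double g(const double *v) {
    double s = 0.0;
    for (int i = 0; i < N; i++) s += v[i];
    return s;
}
static double h(double v) { return 2.0 * v; }

static void *worker(void *arg) {
    int t = (int)(intptr_t)arg;
    int lo = t * N / P, hi = (t + 1) * N / P;
    for (int i = lo; i < hi; i++) a[i] = f(i);
    pthread_barrier_wait(&bar);   /* all of a[] has been written */
    if (t == 0) x = g(a);         /* sequential middle statement */
    pthread_barrier_wait(&bar);   /* x is ready for everyone */
    for (int i = lo; i < hi; i++) b[i] = x + h(a[i]);
    return NULL;
}

void run_example5_barrier(void) {
    pthread_t th[P];
    int rc = pthread_barrier_init(&bar, NULL, P);
    assert(rc == 0);
    for (int t = 0; t < P; t++) {
        rc = pthread_create(&th[t], NULL, worker, (void *)(intptr_t)t);
        assert(rc == 0);
    }
    for (int t = 0; t < P; t++) pthread_join(th[t], NULL);
    pthread_barrier_destroy(&bar);
}
```

Two barriers are needed: the first ensures `a` is complete before `g(a)` runs, the second ensures `x` is written before any thread starts the second loop.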
Example 6
sum = 0.0;
for( i=0; i<100; i++ ) sum += a[i];
• Iterations have dependence on sum.
• Cannot be parallelized, but ...
Example 6 (continued)
for( k=0; k<...; k++ ) psum[k] = 0.0;
fork();
for( j=…; j<…; j++ ) psum[k] += a[j];   /* k: index of the executing process */
join();
sum = 0.0;
for( k=0; k<...; k++ ) sum += psum[k];
Reduction
• This pattern is very common.
• Many parallel programming systems have
explicit support for it, called reduction.
sum = reduce( +, a, 0, 100 );
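Where a system offers no built-in reduction, the pattern of Example 6 can be written by hand. The sketch below assumes POSIX threads: each thread accumulates a private partial sum over its block of `a`, and the partial sums are combined sequentially after the join. The thread count `P`, the array `psum`, and the helper `parallel_sum` (a hypothetical analogue of `reduce( +, a, 0, 100 )`) are assumptions.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define N 100   /* number of elements, as in Example 6 */
#define P 4     /* number of threads (assumption) */

double a[N];
static double psum[P];   /* one partial sum per thread */

/* Thread k sums its own contiguous block of a[]. */
static void *partial(void *arg) {
    int k = (int)(intptr_t)arg;
    double s = 0.0;       /* local accumulator: no sharing inside the loop */
    for (int j = k * N / P; j < (k + 1) * N / P; j++) s += a[j];
    psum[k] = s;
    return NULL;
}

/* Hypothetical hand-written analogue of reduce( +, a, 0, 100 ). */
double parallel_sum(void) {
    pthread_t th[P];
    for (int k = 0; k < P; k++) {
        int rc = pthread_create(&th[k], NULL, partial, (void *)(intptr_t)k);
        assert(rc == 0);
    }
    for (int k = 0; k < P; k++) pthread_join(th[k], NULL);
    double sum = 0.0;
    for (int k = 0; k < P; k++) sum += psum[k];   /* sequential combine */
    return sum;
}
```

This works because addition is associative, so the order in which the blocks are combined does not change the mathematical result; with floating-point values the rounding may differ slightly from the sequential loop.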
Final word on synchronization
• Many different synchronization constructs
exist in different programming models.
• Dependences have to be “covered” by
appropriate synchronization.
• Synchronization is often expensive.