ODD/EVEN PARALLEL PREFIX ALGORITHM

The odd/even algorithm uses another divide-and-conquer approach: divide the N inputs into two groups, those with odd indices and those with even indices. A construction that uses only one prefix computation of half the size is shown in the figure above. Each odd-numbered input is first added to the even-numbered input that follows it, giving N/2 pairwise sums. The prefix computation is then performed on these N/2 sums, yielding the correct result for all even-numbered outputs. The odd-numbered outputs are then obtained by adding each even-numbered result to the next higher odd-numbered input.

The figure above shows the data dependence graph of the odd/even algorithm for a vector of 8 elements.

When the odd/even construction is applied recursively, two operations are added to the depth for each division of the size by two (fig. 2.6). This is not true for the special case of four inputs (fig. 2.7). A simple diagram shows that the construction of $P^{oe}(4)$ from $P^{oe}(2)$ yields a depth of 2 instead of 3. Doing the construction for $P^{oe}(N)$, where $N = 2^k$, down to $P^{oe}(4)$, the depth of which is 2, yields

$$\mathrm{Depth}(P^{oe}(2^k)) = \sum_{i=3}^{k} 2 + \mathrm{Depth}(P^{oe}(2^2)) = 2k - 2 = 2\log_2 N - 2, \quad k \ge 2.$$

A similar repetition of the equation

$$\mathrm{Size}(P^{oe}(2^k)) = \mathrm{Size}(P^{oe}(2^{k-1})) + 2^k - 1,$$

down to $\mathrm{Size}(P^{oe}(2)) = 1$, yields

$$\mathrm{Size}(P^{oe}(2^k)) = \sum_{i=1}^{k} (2^i - 1) = 2^{k+1} - k - 2 = 2N - \log_2 N - 2, \quad k > 0.$$

Thus $P^{ul}$ requires about $0.25\log_2 N$ times as many operations as $P^{oe}$, but $P^{oe}$ has a depth about twice that of $P^{ul}$. If the number of processors is so small that some work must be done sequentially simply for lack of processors, $P^{oe}$ may complete faster than $P^{ul}$.

CHARACTERIZING ALGORITHM BEHAVIOR FOR LARGE PROBLEM SIZE

It is useful to be able to estimate how the computational cost of solving a problem grows as the size of the problem grows. It is often sufficient to characterize the asymptotic behavior. For such cases the “big O” and “small o” notations are used.

Let f(n) and g(n) be functions of an integer n.

Definition: The notation f(n)=O(g(n)) means that there exist a constant c and an integer N such that for all n≥N, |f(n)|<c|g(n)|.
Definition: The notation f(n)=Ω(g(n)) means that there exist a positive constant c and an integer N such that for all n≥N, |f(n)|>c|g(n)|.

Definition: If both f(n)=O(g(n)) and f(n)=Ω(g(n)), then f(n)=Θ(g(n)).

Definition: The notation f(n)=o(g(n)) means that for any ε>0 there exists an integer N such that for all n≥N, |f(n)|<ε|g(n)|. That is, f(n)/g(n) tends to zero as n increases.

Definition: The notation f(n)=ω(g(n)) means that for every C, however large, there exists an integer N such that for all n≥N, |f(n)|>C|g(n)|. That is, |f(n)/g(n)| grows without bound as n increases.

Examples:

f ( n) n3 2n sin (n 3 ) 3 M M

$$\mathrm{Depth}(P^{ul}(N)) = O(\log_2 N), \qquad \mathrm{Depth}(P^{oe}(N)) = O(\log_2 N)$$

PROGRAMMING PARALLEL PREFIX

Let us first consider sequential pseudocode for the $P^{oe}$ algorithm:

    level := 2;
    while level <= N begin
        for i := level step level until N
            V[i] := V[i] + V[i - level/2];
        level := level*2;
    end;
    level := level/2;
    if level = N then level := level/2;
    while level > 1 begin
        for i := level + level/2 step level until N
            V[i] := V[i] + V[i - level/2];
        level := level/2;
    end;

PROGRAMMING PARALLEL PREFIX (CONT 1)

Let us now consider MIMD odd/even parallel prefix code for process j of NP processes, NP<=N:

    private level, j, i;
    shared N, NP, V[1:N];
    level := 2;
    while level <= N begin
        for i := level + (j-1)*level step level*NP until N
            V[i] := V[i] + V[i - level/2];
        barrier;
        level := 2*level;
    end;
    level := level/2;
    if level = N then level := level/2;
    while level > 1 begin
        for i := level + level/2 + (j-1)*level step level*NP until N
            V[i] := V[i] + V[i - level/2];
        barrier;
        level := level/2;
    end;

When a processor executes the barrier statement, it waits until all other processors have entered their barrier statements and then continues. The barrier is a synchronization primitive.
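The MIMD pseudocode above can be tried out with a minimal Python sketch. It is an illustration under stated assumptions, not the slides' own implementation: the function name `odd_even_prefix`, the parameter `num_procs` (standing in for NP), and the use of Python threads with `threading.Barrier` are all choices made here. The list `v` is padded with a dummy element so that indices match the 1-based V[1:N]. Because of Python's global interpreter lock the threads give no real speedup; the sketch only shows where each process works and where it must wait.

```python
import threading
from itertools import accumulate


def odd_even_prefix(values, num_procs=4):
    """Inclusive prefix sums of `values`, following the MIMD odd/even pseudocode.

    Hypothetical helper for illustration; `num_procs` plays the role of NP.
    """
    n = len(values)
    if n == 0:
        return []
    v = [0] + list(values)                  # v[1..n] mirrors the shared V[1:N]
    num_procs = max(1, min(num_procs, n))   # the pseudocode assumes NP <= N
    barrier = threading.Barrier(num_procs)  # all processes meet here after each level

    def worker(j):                          # j runs from 1 to num_procs
        level = 2                           # private to each process
        # First phase: combine at strides 2, 4, 8, ...
        while level <= n:
            for i in range(level + (j - 1) * level, n + 1, level * num_procs):
                v[i] += v[i - level // 2]
            barrier.wait()
            level *= 2
        level //= 2
        if level == n:
            level //= 2
        # Second phase: fill in the remaining outputs at shrinking strides.
        while level > 1:
            start = level + level // 2 + (j - 1) * level
            for i in range(start, n + 1, level * num_procs):
                v[i] += v[i - level // 2]
            barrier.wait()
            level //= 2

    threads = [threading.Thread(target=worker, args=(j,))
               for j in range(1, num_procs + 1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return v[1:]


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5, 6, 7, 8]
    result = odd_even_prefix(data, num_procs=3)
    print(result)                           # [1, 3, 6, 10, 15, 21, 28, 36]
    assert result == list(accumulate(data))
```

Every thread computes the same sequence of `level` values from N alone, so all of them reach the barrier the same number of times. Within one level, the elements that are written and the elements that are read are disjoint, which is why a single barrier per level is enough.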