Probability of Compound Events
Shantanu Dutt
ECE Dept., Univ. of Illinois at Chicago

Basics: Mutually Exclusive Events

• p(A) will denote the prob. of event A.
• Two events A, B are mutually exclusive (ME) if one event happening precludes the possibility of the other. In other words, p(both A and B happen) = p(A ∩ B) = 0.
• E.g., event Di is a stranger's b'day being on the i'th day of the week, 1 <= i <= 7. p(Di) = 1/7 for each i. However, given that event D4 has happened (the stranger's b'day is found to be on Thurs.), the prob. of each of the other events is 0. Thus the Di's are ME.
• Another ex. is blocks of code in an if-then-else chain. The event that a particular block Bj will be executed in the current pass through the if-then-else chain is ME w/ the execution of the other blocks.
• If A and B are ME, then p(either A or B happens) = p(A ∪ B) = p(A) + p(B); see Fig. 1.
• If A, B are not ME, then p(A ∪ B) = p(A) + p(B) - p(A ∩ B), since we are counting p(A ∩ B) twice in p(A) + p(B); see Fig. 2.
• Similarly, p(A ∪ B ∪ C) = p(A) + p(B) + p(C) - p(A ∩ B) - p(B ∩ C) - p(A ∩ C) + p(A ∩ B ∩ C), since we included p(A ∩ B ∩ C) 3 times in p(A) + p(B) + p(C) but also excluded it 3 times in - p(A ∩ B) - p(B ∩ C) - p(A ∩ C), and so need to add it back once; see Fig. 3.
• The general formulation for p(E1 ∪ E2 ∪ ... ∪ En) is obtained from a generalization of the above inclusion-exclusion pattern and is formally called the inclusion-exclusion principle; see http://en.wikipedia.org/wiki/Inclusion%E2%80%93exclusion_principle

[Fig. 1: Disjoint events A and B in the universe U.]
[Fig. 2: Overlapping events A and B in U, with intersection A ∩ B.]
[Fig. 3: Overlapping events A, B and C in U, with regions A ∩ B, A ∩ C, B ∩ C and A ∩ B ∩ C.]

Basics: Independent Events

• Two events A, B are independent if either event happening does not affect the probability of the other. This is expressed as p(A/B) = p(A) and p(B/A) = p(B), where p(A/B) is the probability of event A happening given that B has happened.
• E.g., Event A = it rains today; Event B = there are dark clouds in the sky; Event C = it is very sunny; Event D = today is Wed.
- Then A and D are independent, i.e., p(A) is not affected by D happening, or p(A/D) = p(A). Also, p(D/A) = p(D). Similarly, B and D, and C and D, are independent.
- But A and B are not independent, and statistics will tell us that p(A/B) > p(A) and p(B/A) > p(B).
- Similarly, A and C are not independent, and p(A/C) < p(A) and p(C/A) < p(C).
• If A and B are independent, then p(A ∩ B) = p(A)*p(B). Otherwise, p(A ∩ B) = p(A)*p(B/A) = p(B)*p(A/B).
• E.g., in the above ex., say p(A) = 0.2 and p(B) = 0.25. Then p(A ∩ B) will be much higher than p(A)*p(B) = 0.05: the probability that both A and B happen will be very close to p(A) = 0.2 (i.e., p(B/A) is close to 1), since if it is raining it is very likely that the clouds are very dark.
• Whether 2 events are independent is determined logically, based on the given event scenario, OR statistically. Similarly, the formulation of p(A/B) is determined from the underlying p(A) and p(B) formulations OR from statistics.
• Once independence or non-independence and p(A/B) are determined by non-probabilistic methods, p(A ∩ B) can be determined.

[Fig. 4: Overlapping events A and B in U, with intersection A ∩ B. Note that A ∩ B being non-empty has no bearing on whether A and B are indep. or not, which is determined by other means.]

Exclusive Atomic Events

• Scenario: There is a box w/ a red ball in it. A group of n randomly picked individuals are placed equidistant from the box. At the count of 3, they are to rush to the box and take the ball out. Clearly, only one person can get the ball. See Figs. 5 and 6.
• The atomic event Ei is that person i gets the ball, 1 <= i <= n. p(Ei) = 1/n.
• Also, since only one person can get the ball, the Ei's are pair-wise mutually exclusive (i.e., ME w.r.t. each other).
• Hence the prob. that either person i or person j gets the ball is p(Ei ∪ Ej) = p(Ei) + p(Ej) = 2/n.
• Similarly, the prob. that any person in a subset of k people gets the ball = k/n.
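The sum rule for the ME events Ei can be checked with exact arithmetic. The sketch below is only an illustration under the slide's assumption that each of the n persons is equally likely to get the ball; the function names `p_subset_gets_ball` and `p_by_enumeration` are hypothetical and not part of the original slides.

```python
from fractions import Fraction

def p_subset_gets_ball(n, persons):
    """p(union of E_i for i in persons), obtained by adding p(E_i) = 1/n.

    Adding is valid only because the E_i are mutually exclusive.
    """
    persons = set(persons)  # a repeated index must not be counted twice
    return sum((Fraction(1, n) for _ in persons), Fraction(0))

def p_by_enumeration(n, persons):
    """Same probability, found by listing the n equally likely winners."""
    persons = set(persons)
    favorable = [w for w in range(1, n + 1) if w in persons]
    return Fraction(len(favorable), n)

# Assumed example values: n = 5 persons.
print(p_subset_gets_ball(5, [1, 2]))   # 2/5, i.e. 2/n
print(p_by_enumeration(5, [1, 2, 4]))  # 3/5, i.e. k/n with k = 3
```

Both functions agree for any subset, which is the k/n result stated above.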
• Consider event (Ei)' = event that person i does not get the ball. p((Ei)') = 1 - p(Ei) [by definition of the complement event] = (n-1)/n.
• (Ei)' and (Ej)' are not ME (both can happen), and they are also not independent. The latter is so since, if person j does not get the ball, the prob. of person i getting it increases to 1/(n-1), i.e., p(Ei/(Ej)') = 1/(n-1) != p(Ei). Thus p((Ei)'/(Ej)') = 1 - p(Ei/(Ej)') = 1 - 1/(n-1) = (n-2)/(n-1) != p((Ei)').
• Thus p((Ei)' ∩ (Ej)') = p((Ej)')*p((Ei)'/(Ej)') = ((n-1)/n)*((n-2)/(n-1)) = (n-2)/n.
• The prob. that one of person i or j gets the ball, p(Ei ∪ Ej), can also be derived (whether or not these events are ME) as: 1 - prob.(neither of them gets the ball) = 1 - p((Ei)' ∩ (Ej)') = 1 - (n-2)/n = 2/n.
• If 2 events A, B are not ME, then p(A ∪ B) = 1 - p(A' ∩ B') is an easier derivation than determining p(A ∪ B) using the inclusion-exclusion principle, especially if the number of events in the union is > 2.
• For A, B, C: p(A ∪ B ∪ C) = 1 - p(A' ∩ B' ∩ C') = 1 - p(A'/(B' ∩ C'))*p(B' ∩ C') = 1 - p(A'/(B' ∩ C'))*p(B'/C')*p(C'). For the above scenario, this is 1 - (1 - p(A/(B' ∩ C')))*(n-2)/n = 1 - (1 - 1/(n-2))*(n-2)/n = 1 - (n-3)/n = 3/n.
• Thus for the ex. in Fig. 5, p(E1 ∪ E3) can be obtained either as p(E1) + p(E3) (as they are ME) = 2/3, or by the more general approach (since ME atomic events are rare): 1 - p(E1' ∩ E3') = 1 - p(E1'/E3')*p(E3') = 1 - (1 - p(E1/E3'))*(1 - p(E3)) = 1 - (1 - 1/2)*(1 - 1/3) = 1 - (1/2)*(2/3) = 2/3.

[Fig. 5: Three persons competing for a single ball; p(E1) = p(E2) = p(E3) = 1/3.]
[Fig. 6: Three ME events E1, E2, E3.]

Exclusive Atomic Events (cont'd)

• Scenario: In an indirect tree topology, the top switch (see Fig. 7) is the sole communication route between the (P/2)-processor subsets in its left and right subtrees. We need to know the message load on this switch when each processor sends a msg. to another random processor.
• We do this by focusing on a single processor v (= 4 in Fig. 7) and determining the prob. of its msg. going to the (P/2)-proc. subset on the other side of the tree.
• The atomic events are {Mi: proc. i gets the msg. from v}, w/ only Mv = ∅, since v does not send a msg. to itself. For the other Mi's, q = p(Mi) = 1/(P-1) (= 1/5 in Fig. 7), as the msg. is sent to a random dest. & thus the prob. distr. is uniform.
• The events Mi are also clearly ME, as there is only 1 msg. and only 1 proc. will get it. Thus p(M1 ∪ M2) = p(M1) + p(M2) = 2/(P-1) (= 2/5 in the ex.).
• In general, p(a proc. in the other (P/2)-proc. subset getting the msg. from v) = p(M1 ∪ M2 ∪ ... ∪ M(P/2)) = p(M1) + p(M2) + ... + p(M(P/2)) = (P/2)*q (= 3/5 for the Fig. 7 ex.).
• Similarly, p(a proc. in the other (P/4)-subset within v's (P/2)-subset getting the msg.) = (P/4)*q.
• Also, as in the prev. "ball-grabbing" scenario, the Mi's are not independent, but similarly p(M1 ∪ M2 ∪ ... ∪ M(P/2)) can also be derived as 1 - p(M1' ∩ M2' ∩ ... ∩ M(P/2)') = 1 - p(M1')*p(M2'/M1')*p(M3'/(M1' ∩ M2'))* ... *p(M(P/2)'/(M1' ∩ M2' ∩ ... ∩ M(P/2 - 1)')) = 1 - [(P-2)/(P-1)]*[(P-3)/(P-2)]* ... *[(P - P/2 - 1)/(P - P/2)] = 1 - (P/2 - 1)/(P-1) = (P/2)/(P-1) = (P/2)*q.
• Thus the msg. load (expected no. of msgs.) on the top switch in a random msg. pattern is P*(P/2)*q = (P^2)*q/2. On a switch below the root/top one, the msg. load is (P/2)*(P/4)*q = (P^2)*q/8.
• Thus the msg. load on the top switch is 4 times that of a switch below it, and so on, leading to the rationale for a fat-tree topology in which the switch size and # of links double (or increase by a factor > 1 and <= 2) toward the root, so that msg. latency is not high due to contention/collision.
• They double instead of increasing by a factor of 4, as msg. patterns are mostly not random but more structured and more localized (e.g., in recursive reduction; what is the relative load here?).

[Fig. 7: Top switch in an indirect tree with P/2 processors (1-3) on the left and P/2 processors (4-6) on the right; p(Mi) = 1/(P-1) = 1/5. Scenario in which proc. 4 sends a single message to a random destination. Q: What is the prob. of the msg. going through the top switch?]
[Fig. 8: Five ME events M1, M2, M3, M4, M5.]

Non-ME Atomic Events

• Scenario: In an indirect tree topology, the top switch (see Fig. 9) is the sole communication route between the (P/2)-processor subsets in its left and right subtrees. We need to know the message load on this switch when each processor sends a msg. to another random processor.
• We consider one (P/2)-subset sending msgs. randomly and determine the prob. of at least 1 msg. reaching the other (P/2)-subset. This will be representative of the load on the top sw. Assume we did not do the previous analysis for a single-proc. msg.
• The atomic events are {Mi,j: proc. i gets the msg. from j}, w/ only Mj,j = ∅ for each j, since a proc. does not send a msg. to itself. For the other Mi,j's, p(Mi,j) = 1/(P-1) = q (= 1/5 in Fig. 9), as the msg. is sent to a random dest. & thus the prob. distr. is uniform.
• The events Mi,j for a constant j are clearly ME, as there is only 1 msg. from j and only 1 proc. will get it. However, the Mi,j's taken over all i and j are not ME, as more than one (in fact up to P/2) of these events can happen simultaneously.
• Thus p(at least 1 msg. coming to the left (P/2)-subset from the right (P/2)-subset) = p(∪ over i in left, j in right of Mi,j) != Σ over i in left, j in right of p(Mi,j). By symmetry, the same prob. holds for msgs. going from the left subset to the right one; the notation below takes the receiver i in the left subset and the sender j in the right subset.
• Further, using the union/sum approach w/ the inclusion-exclusion principle becomes very cumbersome (though it will yield the correct result).
• However, using the "complement of the intersection of the complements" approach, p(∪ over i in left, j in right of Mi,j) = 1 - p(∩ over i in left, j in right of (Mi,j)') = 1 - p(∩ over i in left of (∩ over j in right of (Mi,j)')).
• Let Mi be the event that i gets at least 1 msg. from the right. Thus p((Mi)') = p(i gets no msgs. from the right) = p(∩ over j in right of (Mi,j)').
• Event (Mi,j)' = i does not get a msg. from j. Any two events (Mi,j)' and (Mi,k)', j != k, are independent, as the occurrence of one does not affect that of the other.
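The ME and independence claims about the atomic events Mi,j can be checked by exhaustive enumeration on a small instance. The sketch below is only an illustration and assumes the P = 6 tree of Fig. 9 with receivers i in the left half (1-3) and senders j in the right half (4-6), each sender picking one of the other 5 processors uniformly at random; the helper names (`outcomes`, `prob`, `M`, `neg`, `both`) are hypothetical.

```python
from fractions import Fraction
from itertools import product

P = 6
SENDERS = (4, 5, 6)  # assumed numbering: right half sends, left half is 1-3

def outcomes():
    """All equally likely destination tuples, one destination per sender."""
    choices = [[d for d in range(1, P + 1) if d != s] for s in SENDERS]
    return list(product(*choices))

def prob(event):
    """Exact probability of an event given as a predicate on an outcome."""
    outs = outcomes()
    return Fraction(sum(1 for o in outs if event(o)), len(outs))

def M(i, j):
    """Event M_{i,j}: proc. i gets the msg. sent by proc. j."""
    k = SENDERS.index(j)
    return lambda o: o[k] == i

def neg(e):
    return lambda o: not e(o)

def both(e, f):
    return lambda o: e(o) and f(o)

# Same sender j = 4: M_{1,4} and M_{2,4} cannot both happen (ME).
print(prob(both(M(1, 4), M(2, 4))))  # 0
# Different senders: M_{1,4} and M_{1,5} can both happen (not ME).
print(prob(both(M(1, 4), M(1, 5))))  # 1/25
# Complements for different senders are independent: the joint prob. factors.
lhs = prob(both(neg(M(1, 4)), neg(M(1, 5))))
rhs = prob(neg(M(1, 4))) * prob(neg(M(1, 5)))
print(lhs, rhs)                      # 16/25 16/25
```

The last check is the property that lets the prob. of an intersection over senders be written as a product of per-sender probabilities.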
• Thus p((Mi)') = p(∩ over j in right of (Mi,j)') = ∏ over j in right of p((Mi,j)') = ∏ over j in right of (1 - 1/(P-1)) = ∏ over j in right of (P-2)/(P-1) = [(P-2)/(P-1)]^(P/2).

[Fig. 9: Top switch in an indirect tree with P/2 processors (1-3) on the left and P/2 processors (4-6) on the right; p(Mi,j) = 1/(P-1) = 1/5. Scenario in which the processors in one (P/2)-proc. subset send one message each to a random destination. Q: What is the prob. of at least 1 msg. going through the top switch? This prob. is representative of the msg. load on the top switch under this scenario.]

Non-ME Atomic Events (contd.)

• However, (Mi)' and (Mk)', i != k, are not independent: no msgs. recvd. at i (i.e., (Mi)' occurring) increases p(Mk), since it increases each component p(Mk,j) to 1/(P-2), and thus decreases p((Mk)').
• Thus p(Mi' ∩ Mk') = p(Mi')*p(Mk'/Mi'), where p(Mk'/Mi') = p(∩ over j in right of (Mk,j)'/Mi'). The events (Mk,j)'/Mi' are indep. across changing j's.
• Thus p(∩ over j in right of (Mk,j)'/Mi') = ∏ over j in right of p((Mk,j)'/Mi') = ∏ over j in right of (1 - 1/(P-2)) = ∏ over j in right of (P-3)/(P-2) = [(P-3)/(P-2)]^(P/2).
• Thus p(Mi' ∩ Mk') = p(Mi')*p(Mk'/Mi') = [(P-2)/(P-1)]^(P/2) * [(P-3)/(P-2)]^(P/2) = [(P-3)/(P-1)]^(P/2).
• Extrapolating in an obvious way, p(∩ over i in left of (∩ over j in right of (Mi,j)')) = p(∩ over i in left of Mi') = [(P-1-(P/2))/(P-1)]^(P/2) = [((P/2)-1)/(P-1)]^(P/2).
• Hence p(∪ over i in left, j in right of Mi,j) = 1 - p(∩ over i in left, j in right of (Mi,j)') = 1 - [((P/2)-1)/(P-1)]^(P/2).
• For the ex. in Fig. 9 this prob. = 1 - (2/5)^3 = (125 - 8)/125 = 117/125 = 0.936. For Fig. 10 it is 1 - (3/7)^4 = (2401 - 81)/2401 = 2320/2401 = 0.966.
• Similarly, p(a proc. in one of the (P/4)-subsets within the left (P/2)-subset getting at least 1 msg. from the other (P/4)-subset) [this will be via the 2nd topmost switch] = 1 - [((3P/4)-1)/(P-1)]^(P/4) = (for Fig. 10) 1 - (5/7)^2 = 24/49 = 0.49 (almost 1/2 of that for the top switch).
• Thus again, the load (in terms of prob. of >= 1 msg. through it) on the top switch is about 2X that of the switch below it, and so forth.

[Fig. 10: Top switch in an indirect tree with P/2 processors (1-4) on the left and P/2 processors (5-8) on the right; p(Mi,j) = 1/(P-1) = 1/7. Scenario in which the processors in one (P/2)-proc. subset send one message each to a random destination. Q: What is the prob. of at least 1 msg. going through the top switch? This prob. is representative of the msg. load on the top switch under this scenario.]

Non-ME Atomic Events (contd.)

• Via union and the incl.-excl. principle, p(at least 1 msg. to left from right for Fig. 9) = (see Fig. 11) p(M1 ∪ M2 ∪ M3) = p(M1) + p(M2) + p(M3) - p(M1 ∩ M2) - p(M1 ∩ M3) - p(M2 ∩ M3) + p(M1 ∩ M2 ∩ M3).
• This will be very involved, as each component is itself quite involved. E.g., p(M1) = 1 - p(M1') (which itself uses the complement approach) = 1 - [(P-2)/(P-1)]^(P/2) = (for Fig. 9) 1 - (4/5)^3 = (125 - 64)/125 = 61/125 = p(M2) = p(M3).
• p(M1 ∩ M2) = p(M1) - p(M1 ∩ M2'); p(M1 ∩ M2') = p(M2') - p(M1' ∩ M2').
• From earlier, p(M1' ∩ M2') = [(P-3)/(P-1)]^(P/2) = (3/5)^3 = 27/125.
• Thus p(M1 ∩ M2') = p(M2') - [(P-3)/(P-1)]^(P/2) = [(P-2)/(P-1)]^(P/2) - [(P-3)/(P-1)]^(P/2) = (4/5)^3 - (3/5)^3 = (64 - 27)/125 = 37/125.
• p(M1 ∩ M2) = p(M1) - p(M1 ∩ M2') = 61/125 - 37/125 = 24/125.
• p(M1 ∩ M2 ∩ M3) = p(M1) - p(M1 ∩ (M2 ∩ M3)').
• p(M1 ∩ (M2 ∩ M3)') = p(M1 ∩ (M2' ∪ M3')) = p((M1 ∩ M2') ∪ (M1 ∩ M3')) = p(M1 ∩ M2') + p(M1 ∩ M3') - p(M1 ∩ M2' ∩ M3').
• p(M1 ∩ M2' ∩ M3') = p(M2' ∩ M3')*p(M1/(M2' ∩ M3')).
• p(M1/(M2' ∩ M3')) = 1 - p(M1'/(M2' ∩ M3')) = 1 - [1 - 1/(P-3)]^(P/2) = 1 - [(P-4)/(P-3)]^(P/2) = 1 - (2/3)^3 = (27 - 8)/27 = 19/27.
• p(M1 ∩ M2' ∩ M3') = p(M2' ∩ M3')*p(M1/(M2' ∩ M3')) = (27/125)*(19/27) = 19/125.
• p(M1 ∩ (M2 ∩ M3)') = p(M1 ∩ M2') + p(M1 ∩ M3') - p(M1 ∩ M2' ∩ M3') = 37/125 + 37/125 - 19/125 = 55/125.
• p(M1 ∩ M2 ∩ M3) = p(M1) - p(M1 ∩ (M2 ∩ M3)') = 61/125 - 55/125 = 6/125 = 0.048.
• Thus p(M1 ∪ M2 ∪ M3) = p(M1) + p(M2) + p(M3) - p(M1 ∩ M2) - p(M1 ∩ M3) - p(M2 ∩ M3) + p(M1 ∩ M2 ∩ M3) = 3*61/125 - 3*24/125 + 0.048 = 3*37/125 + 0.048 = 0.888 + 0.048 = 0.936 (same as in the prev. slide).

[Fig. 11: Non-ME events Mi for left processors 1-3 for the scenario of Fig. 9, shown as overlapping sets M1, M2, M3 in U with regions M1 ∩ M2, M1 ∩ M3, M2 ∩ M3 and M1 ∩ M2 ∩ M3.]
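Both routes to the top-switch probability can be cross-checked by brute force on small trees. The following is a minimal sketch, not the author's method: it assumes processors 1..P/2 form the left half and P/2+1..P the right half, that each right-half processor sends one msg. to a uniformly random other processor, and that P is even. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product

def outcomes(P):
    """All equally likely destination tuples, one per right-half sender."""
    half = P // 2
    senders = range(half + 1, P + 1)
    choices = [[d for d in range(1, P + 1) if d != s] for s in senders]
    return list(product(*choices))

def prob(P, event):
    """Exact probability of an event given as a predicate on an outcome."""
    outs = outcomes(P)
    return Fraction(sum(1 for o in outs if event(o)), len(outs))

def closed_form(P):
    """1 - [((P/2) - 1)/(P - 1)]^(P/2), the complement-approach result."""
    half = P // 2
    return 1 - Fraction(half - 1, P - 1) ** half

def crosses_top_switch(P):
    """Event: at least one msg. lands in the left half."""
    half = P // 2
    return lambda o: any(d <= half for d in o)

def receives(i):
    """Event M_i: proc. i gets at least one msg."""
    return lambda o: i in o

def union_by_inclusion_exclusion(P):
    """p(M_1 U ... U M_{P/2}) summed term by term over all subsets."""
    events = [receives(i) for i in range(1, P // 2 + 1)]
    total = Fraction(0)
    for r in range(1, len(events) + 1):
        sign = 1 if r % 2 == 1 else -1
        for group in combinations(events, r):
            total += sign * prob(P, lambda o, g=group: all(e(o) for e in g))
    return total

print(prob(6, crosses_top_switch(6)))   # 117/125
print(closed_form(6))                   # 117/125
print(union_by_inclusion_exclusion(6))  # 117/125
print(prob(8, crosses_top_switch(8)))   # 2320/2401
```

For P = 6 the enumeration covers 5^3 = 125 outcomes and for P = 8 it covers 7^4 = 2401, so the counts 117 and 2320 can be read directly from the fractions above. The enumeration grows as (P-1)^(P/2), so it is practical only for small P.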