PROBABILITY AND STATISTICS FOR ENGINEERING
Probability Theory
Hossein Sameti
Department of Computer Engineering
Sharif University of Technology

Probability And Statistics
Source: http://ocw.mit.edu

PROBABILITY THEORY
1. Basics

Random Phenomena, Experiments
- Study of random phenomena: different outcomes, and outcomes that have certain underlying patterns about them.
- Experiment: repeatable conditions.
- Certain elementary events $E_i$ occur in different but completely uncertain ways.
- Probability of the event $E_i$: $P(E_i) \ge 0$.

6/28/2016 Sharif University of Technology

Probability Definitions: Laplace's Classical Definition
The probability of an event $A$ is defined a priori, without actual experimentation, as
$$P(A) = \frac{\text{number of outcomes favorable to } A}{\text{total number of possible outcomes}},$$
provided all these outcomes are equally likely.
Example
- A box with $n$ white and $m$ red balls; elementary outcomes: {white, red}. Probability of "selecting a white ball": $P = \frac{n}{n+m}$.
- $P(\text{a given number is divisible by a prime } p) = \frac{1}{p}$.

Probability Definitions: Relative Frequency Definition
The probability of an event $A$ is defined as
$$P(A) = \lim_{n \to \infty} \frac{n_A}{n},$$
where
- $n_A$ is the number of occurrences of $A$,
- $n$ is the total number of trials.

Probability Definitions: Example
1. The probability that a given number is divisible by a prime $p$: exactly $\lfloor n/p \rfloor$ of the integers $1, \ldots, n$ are multiples of $p$, so
$$P(\text{a given number is divisible by } p) = \lim_{n \to \infty} \frac{\lfloor n/p \rfloor}{n} = \frac{1}{p}.$$

Counting - Remark
General Product Rule: if an operation consists of $k$ steps, each of which can be performed in $n_i$ ways ($i = 1, 2, \ldots, k$), then the entire operation can be performed in $\prod_{i=1}^{k} n_i = n_1 n_2 \cdots n_k$ ways.
Example
- Number of PINs
- Number of elements in a Cartesian product
- Number of PINs without repetition
- Number of input/output tables for a circuit with $n$ input signals
- Number of iterations in nested loops

Permutations and Combinations - Remark
- If order matters, choose $k$ from $n$: permutations, $P(n, k) = \frac{n!}{(n-k)!}$.
- If order doesn't matter, choose $k$ from $n$: combinations, $\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$.
Example
A fair coin is tossed 7 times. What is the probability of obtaining 3 heads?
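The question of 3 heads in 7 tosses can be checked with the classical definition and the combinations formula. The following is a minimal Python sketch, not part of the original slides: the helper name `prob_exactly_k_heads` is hypothetical, and it assumes a fair coin so that all $2^n$ toss sequences are equally likely.

```python
from fractions import Fraction
from math import comb


def prob_exactly_k_heads(n, k):
    """Classical definition: favorable sequences / all equally likely sequences.

    Assumes a fair coin, so each of the 2**n toss sequences is equally likely.
    """
    favorable = comb(n, k)   # ways to choose which k of the n tosses are heads
    total = 2 ** n           # product rule: 2 outcomes per toss, n tosses
    return Fraction(favorable, total)


p = prob_exactly_k_heads(7, 3)
print(p, float(p))  # 35/128 0.2734375
```

Keeping the result as a `Fraction` makes it easy to compare against a hand computation before converting to a decimal.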
What is the probability of obtaining at most 3 heads?

Example: The Birthday Problem
Suppose you have a class of 23 students. Would you think it likely or unlikely that at least two students will have the same birthday? It turns out that the probability of at least two of 23 people having the same birthday is about 0.5 (50%).

Axioms of Probability - Basics
The axiomatic approach to probability, due to Kolmogorov, is developed through a set of axioms. The totality of all events known a priori constitutes a set $\Omega$, the set of all experimental outcomes.

Axioms of Probability - Basics
$A$ and $B$ are subsets of $\Omega$.
[Figure: Venn diagrams of $A \cup B$, $A \cap B$, and $\bar{A}$]

Mutually Exclusiveness and Partitions
$A$ and $B$ are said to be mutually exclusive if $A \cap B = \varnothing$.
A partition of $\Omega$ is a collection of mutually exclusive (ME) subsets $A_i$ of $\Omega$ such that their union is $\Omega$:
$$A_i \cap A_j = \varnothing \ (i \ne j), \qquad \bigcup_{i=1}^{n} A_i = \Omega.$$
[Figure: two disjoint sets $A$ and $B$, and a partition $A_1, A_2, \ldots, A_n$ of $\Omega$]

De Morgan's Laws
$$\overline{A \cup B} = \bar{A} \cap \bar{B}; \qquad \overline{A \cap B} = \bar{A} \cup \bar{B}$$
[Figure: Venn diagrams illustrating De Morgan's laws]

Events
Often it is meaningful to talk about at least some of the subsets of $\Omega$ as events, for which we must have a mechanism to compute their probabilities.
Example
Tossing two coins simultaneously: $\Omega = \{HH, HT, TH, TT\}$.
$A$: the event "head has occurred at least once", $A = \{HH, HT, TH\}$.

Events and Set Operators
- "Does an outcome belong to A or B?" ($A \cup B$)
- "Does an outcome belong to A and B?" ($A \cap B$)
- "Does an outcome fall outside A?" ($\bar{A}$)
These sets also qualify as events. We shall formalize this using the notion of a field.

Fields
A collection of subsets of a nonempty set $\Omega$ forms a field $F$ if
(i) $\Omega \in F$
(ii) If $A \in F$, then $\bar{A} \in F$
(iii) If $A \in F$ and $B \in F$, then $A \cup B \in F$.
Using (i) - (iii), it is easy to show that sets such as $\varnothing$ and $A \cap B$ also belong to $F$ (for example, $A \cap B = \overline{\bar{A} \cup \bar{B}}$ by De Morgan's laws).
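For a finite sample space, the three field conditions can be checked mechanically. The sketch below is illustrative only: the function name `is_field` is hypothetical, and the two-coin sample space is used as an assumed test case.

```python
def is_field(omega, collection):
    """Check conditions (i)-(iii) for a finite collection of subsets of omega."""
    omega = frozenset(omega)
    sets = {frozenset(s) for s in collection}
    if omega not in sets:                 # (i) omega itself must be a member
        return False
    for a in sets:
        if omega - a not in sets:         # (ii) closed under complement
            return False
    for a in sets:
        for b in sets:
            if a | b not in sets:         # (iii) closed under union
                return False
    return True


omega = {"HH", "HT", "TH", "TT"}
A = {"HH", "HT", "TH"}                    # "head has occurred at least once"

print(is_field(omega, [set(), A, omega - A, omega]))  # True
print(is_field(omega, [A, omega]))                    # False: complement of A is missing
```

The smallest field containing a single event $A$ is $\{\varnothing, A, \bar{A}, \Omega\}$, which is what the first call verifies.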
Fields
If $A \in F$ and $B \in F$, then $F$ also contains $\bar{A}$, $\bar{B}$, $A \cup B$, $A \cap B$, $\bar{A} \cup \bar{B}$, and so on.
We shall reserve the term event only for members of $F$.
Assuming that the probabilities $P(E_i)$ of the elementary outcomes $E_i$ of $\Omega$ are defined a priori, the three axioms of probability defined below can be used to assign probabilities to more "complicated" events.

Axioms of Probability
For any event $A$, we assign a number $P(A)$, called the probability of the event $A$, such that
(i) $P(A) \ge 0$
(ii) $P(\Omega) = 1$
(iii) If $A \cap B = \varnothing$, then $P(A \cup B) = P(A) + P(B)$.
Conclusions:
- $P(\bar{A}) = 1 - P(A)$
- $P(\varnothing) = 0$

Probability of the Union of Two Non-ME Sets
Since $A \cup B = A \cup \bar{A}B$, where $A$ and $\bar{A}B$ are mutually exclusive, and $B = AB \cup \bar{A}B$, we have
$$P(A \cup B) = P(A) + P(\bar{A}B) = P(A) + P(B) - P(AB).$$
[Figure: Venn diagram of $A$, $B$, and $\bar{A}B$]

Union of Events
Is the union $A = \bigcup_{i=1}^{\infty} A_i$ of a denumerably infinite collection of pairwise disjoint events $A_i$ an event? If so, what is $P(A)$?
We cannot use the third probability axiom to compute $P(A)$, since it only deals with two (or a finite number of) M.E. events.

An Example for Intuitive Understanding
In an experiment where the same coin is tossed indefinitely, define $A$ = "head eventually appears". Our intuitive experience surely tells us that $A$ is an event.
If
$$A_n = \{\text{head appears for the 1st time on the } n\text{th toss}\} = \{\underbrace{t, t, \ldots, t}_{n-1}, h\},$$
we have
$$A = A_1 \cup A_2 \cup A_3 \cup \cdots.$$
Extension of the previous notions must be done, based on our intuition, as new axioms.

σ-Field (Definition)
A field $F$ is a σ-field if, in addition to the three mentioned conditions, we have the following:
- For every sequence $A_i$, $i = 1, 2, \ldots$, of pairwise disjoint events belonging to $F$, their union also belongs to $F$.

Extending the Axioms of Probability
(iv) If the $A_i$ are pairwise mutually exclusive, then
$$P\left(\bigcup_{n=1}^{\infty} A_n\right) = \sum_{n=1}^{\infty} P(A_n).$$
From experience we know that if we keep tossing a coin, eventually a head must show up: $P(A) = 1$.
But $A = \bigcup_{n=1}^{\infty} A_n$, so using the fourth probability axiom we have
$$P(A) = P\left(\bigcup_{n=1}^{\infty} A_n\right) = \sum_{n=1}^{\infty} P(A_n).$$

Reasonability
In the previously mentioned coin tossing experiment, for a fair coin only one of the $2^n$ equally likely outcomes of the first $n$ tosses is in favor of $A_n$, so
$$P(A_n) = \frac{1}{2^n}, \qquad \sum_{n=1}^{\infty} P(A_n) = \sum_{n=1}^{\infty} \frac{1}{2^n} = 1,$$
which agrees with $P(A) = 1$. So the fourth axiom seems reasonable.
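The reasonableness check for the coin-tossing experiment can be sketched numerically. The code below is illustrative only: it assumes a fair coin, so that $P(A_n) = 2^{-n}$, and the helper name `prob_head_within` is hypothetical. It computes the partial sums of $\sum_n P(A_n)$, which equal $1 - 2^{-N}$ and approach 1.

```python
from fractions import Fraction


def prob_head_within(n_max):
    """P(A_1 U ... U A_n_max) for pairwise disjoint A_n, assuming a fair coin.

    By finite additivity this is the sum of P(A_n) = 2**-n for n = 1..n_max.
    """
    return sum(Fraction(1, 2 ** n) for n in range(1, n_max + 1))


for n_max in (1, 5, 10, 20):
    p = prob_head_within(n_max)
    # The gap to 1 is exactly the probability of n_max tails in a row.
    print(n_max, float(p), float(1 - p))
```

Each partial sum falls short of 1 by exactly $2^{-N}$, the probability that the first $N$ tosses are all tails, so the infinite sum agrees with $P(A) = 1$.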
Summary: Probability Models
The triplet $(\Omega, F, P)$:
- $\Omega$ is a nonempty set of elementary events
- $F$ is a σ-field of subsets of $\Omega$
- $P$ is a probability measure on the sets in $F$ subject to the four axioms
The probability of more complicated events must follow this framework by deduction.

Conditional Probability
In $N$ independent trials, suppose $N_A$, $N_B$, $N_{AB}$ denote the number of times events $A$, $B$ and $AB$ occur respectively. According to the frequency interpretation of probability, for large $N$,
$$P(A) \approx \frac{N_A}{N}, \quad P(B) \approx \frac{N_B}{N}, \quad P(AB) \approx \frac{N_{AB}}{N}.$$
Among the $N_A$ occurrences of $A$, only $N_{AB}$ of them are also found among the $N_B$ occurrences of $B$. Thus the following is a measure of "the event $A$ given that $B$ has already occurred":
$$\frac{N_{AB}}{N_B} = \frac{N_{AB}/N}{N_B/N} \approx \frac{P(AB)}{P(B)}.$$

Satisfying Probability Axioms
We represent this measure by $P(A \mid B)$ and define:
$$P(A \mid B) = \frac{P(AB)}{P(B)}, \qquad P(B) \ne 0.$$
As we will show, the above definition is a valid one, as it satisfies all probability axioms discussed earlier.

Satisfying Probability Axioms
(i) $P(A \mid B) = \frac{P(AB)}{P(B)} \ge 0$, since $P(AB) \ge 0$ and $P(B) > 0$.
(ii) $P(\Omega \mid B) = \frac{P(\Omega B)}{P(B)} = \frac{P(B)}{P(B)} = 1$.
(iii) If $A \cap C = \varnothing$, then $AB$ and $CB$ are also mutually exclusive, so $P(A \cup C \mid B) = \frac{P(AB) + P(CB)}{P(B)} = P(A \mid B) + P(C \mid B)$.

Properties of Conditional Probability
Example
In a dice tossing experiment,
- $A$: outcome is even
- $B$: outcome is 2.
Since $B \subset A$, $P(A \mid B) = \frac{P(AB)}{P(B)} = \frac{P(B)}{P(B)} = 1 > P(A) = \frac{1}{2}$. The statement that $B$ has occurred makes the odds for $A$ greater.

Law of Total Probability
We can use the conditional probability to express the probability of a complicated event in terms of "simpler" related events.
Suppose that $A_1, A_2, \ldots, A_n$ form a partition of $\Omega$:
$$A_i \cap A_j = \varnothing \ (i \ne j), \qquad \bigcup_{i=1}^{n} A_i = \Omega.$$
So $B = BA_1 \cup BA_2 \cup \cdots \cup BA_n$ with the $BA_i$ mutually exclusive, and
$$P(B) = \sum_{i=1}^{n} P(BA_i) = \sum_{i=1}^{n} P(B \mid A_i) P(A_i).$$

Conditional Probability and Independence
$A$ and $B$ are said to be independent events if
$$P(AB) = P(A) P(B).$$
This definition is a probabilistic statement, not a set-theoretic notion such as mutual exclusiveness.
If $A$ and $B$ are independent,
$$P(A \mid B) = \frac{P(AB)}{P(B)} = \frac{P(A) P(B)}{P(B)} = P(A).$$
Thus knowing that the event $B$ has occurred does not shed any more light on the event $A$.

Independence - Example
Example
From a box containing 6 white and 4 black balls, we remove two balls at random without replacement. What is the probability that the first one is white and the second one is black?
$$P(W_1 \cap B_2) = ?$$
$$W_1 \cap B_2 = W_1 B_2 = B_2 W_1, \qquad P(W_1 B_2) = P(B_2 W_1) = P(B_2 \mid W_1) P(W_1).$$
$$P(W_1) = \frac{6}{6+4} = \frac{6}{10} = \frac{3}{5}, \qquad P(B_2 \mid W_1) = \frac{4}{5+4} = \frac{4}{9},$$
$$P(W_1 B_2) = \frac{3}{5} \cdot \frac{4}{9} = \frac{12}{45} = \frac{4}{15}.$$

Example - continued
Are $W_1$ and $B_2$ independent?
Removing the first ball has two possible outcomes: $W_1$ and $B_1$.
These outcomes form a partition because $W_1 \cup B_1 = \Omega$ and $W_1 \cap B_1 = \varnothing$.
So,
$$P(B_2) = P(B_2 \mid W_1) P(W_1) + P(B_2 \mid B_1) P(B_1) = \frac{4}{5+4} \cdot \frac{3}{5} + \frac{3}{6+3} \cdot \frac{4}{10} = \frac{4}{9} \cdot \frac{3}{5} + \frac{1}{3} \cdot \frac{2}{5} = \frac{4+2}{15} = \frac{2}{5},$$
$$P(B_2) P(W_1) = \frac{2}{5} \cdot \frac{3}{5} = \frac{6}{25} \ne P(B_2 W_1) = \frac{4}{15}.$$
Thus the two events are not independent.

General Definition of Independence
Independence between 2 or more events:
Events $A_1, A_2, \ldots, A_n$ are mutually independent if, for all possible subcollections of $k \le n$ events $A_{i_1}, \ldots, A_{i_k}$:
$$P(A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_k}) = \prod_{j=1}^{k} P(A_{i_j}).$$
Example
In the experiment of rolling a die, $A = \{2, 4, 6\}$, $B = \{1, 2, 3, 4\}$, $C = \{1, 2, 4\}$.
Are events $A$ and $B$ independent? What about $A$ and $C$?
Source: http://ocw.mit.edu

Bayes' Theorem
We have:
$$P(A \mid B) = \frac{P(AB)}{P(B)}.$$
Thus,
$$P(AB) = P(A \mid B) P(B).$$
Also,
$$P(B \mid A) = \frac{P(BA)}{P(A)} = \frac{P(AB)}{P(A)}, \qquad P(AB) = P(B \mid A) P(A).$$
Hence
$$P(A \mid B) P(B) = P(B \mid A) P(A),$$
or equivalently, for $P(B) \ne 0$,
$$P(A \mid B) = \frac{P(B \mid A) P(A)}{P(B)}.$$

Bayesian Updating: Application of Bayes' Theorem
Suppose that $A$ and $B$ are dependent events and $A$ has an a priori probability of $P(A)$.
How does knowing that $B$ has occurred affect the probability of $A$?
The new probability can be computed using Bayes' theorem.
Bayes' theorem shows how to incorporate the knowledge that $B$ has occurred to calculate the new probability of $A$.
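As an illustration of Bayesian updating, the sketch below reuses the box of 6 white and 4 black balls from the independence example and asks how observing "second ball is black" changes the probability that the first ball was white. This posterior is not computed in the original slides; the helper name `bayes_update` is hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def bayes_update(prior_a, prob_b_given_a, prob_b):
    """Return P(A|B) = P(B|A) P(A) / P(B); P(B) must be nonzero."""
    if prob_b == 0:
        raise ValueError("P(B) must be nonzero")
    return prob_b_given_a * prior_a / prob_b


# Priors for the first draw (6 white, 4 black, no replacement).
p_w1 = Fraction(6, 10)
p_b1 = Fraction(4, 10)

# Likelihoods of a black second ball given the first draw.
p_b2_given_w1 = Fraction(4, 9)
p_b2_given_b1 = Fraction(3, 9)

# Law of total probability over the partition {W1, B1}.
p_b2 = p_b2_given_w1 * p_w1 + p_b2_given_b1 * p_b1

posterior_w1 = bayes_update(p_w1, p_b2_given_w1, p_b2)
print(p_b2, posterior_w1)  # 2/5 2/3
```

The prior $P(W_1) = 3/5$ is updated to $P(W_1 \mid B_2) = 2/3$, which is another way of seeing that $W_1$ and $B_2$ are dependent.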
Bayesian Updating - Example
Example
Suppose there is a new music device in the market that plays a new digital format called MP∞. Since it's new, it's not 100% reliable.
You know that
- 20% of the new devices don't work at all,
- 30% last only for 1 year,
- and the rest last for 5 years.
If you buy one and it works fine, what is the probability that it will last for 5 years?
Source: http://ocw.mit.edu

Generalization of Bayes' Theorem
A more general version of Bayes' theorem involves a partition of $\Omega$:
$$P(A_i \mid B) = \frac{P(B \mid A_i) P(A_i)}{P(B)} = \frac{P(B \mid A_i) P(A_i)}{\sum_{j=1}^{n} P(B \mid A_j) P(A_j)},$$
in which $A_i$, $i = 1, \ldots, n$, represents a collection of mutually exclusive events with associated a priori probabilities $P(A_i)$, $i = 1, \ldots, n$.
With the new information "$B$ has occurred", the information about $A_i$ can be updated by the $n$ conditional probabilities $P(B \mid A_i)$, $i = 1, \ldots, n$.

Bayes' Theorem - Example
Example
1. Two boxes, $B_1$ and $B_2$, contain 100 and 200 light bulbs respectively. The first box has 15 defective bulbs and the second 5. Suppose a box is selected at random and one bulb is picked out.
a) What is the probability that it is defective?
With $P(B_1) = P(B_2) = \frac{1}{2}$, $P(D \mid B_1) = \frac{15}{100} = 0.15$ and $P(D \mid B_2) = \frac{5}{200} = 0.025$, the law of total probability gives
$$P(D) = P(D \mid B_1) P(B_1) + P(D \mid B_2) P(B_2) = 0.15 \times \frac{1}{2} + 0.025 \times \frac{1}{2} = 0.0875.$$

Example - Continued
Suppose we test the bulb and it is found to be defective. What is the probability that it came from box 1?
$$P(B_1 \mid D) = \frac{P(D \mid B_1) P(B_1)}{P(D)} = \frac{0.15 \times 1/2}{0.0875} = 0.8571.$$
Note that initially $P(B_1) = 0.5$; but because of the greater ratio of defective bulbs in $B_1$, this probability increases after the bulb is determined to be defective.
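The light bulb example can be reproduced with a short Python sketch of the partition form of Bayes' theorem. The function name `posterior_over_partition` is hypothetical; the priors and defect ratios are taken from the example, and exact fractions are used internally.

```python
from fractions import Fraction


def posterior_over_partition(priors, likelihoods):
    """Return (P(B), [P(A_i|B)]) for a partition A_1..A_n.

    priors[i] is P(A_i) and likelihoods[i] is P(B|A_i).
    """
    joint = [p * l for p, l in zip(priors, likelihoods)]   # P(B|A_i) P(A_i)
    evidence = sum(joint)                                  # law of total probability
    if evidence == 0:
        raise ValueError("P(B) must be nonzero")
    return evidence, [j / evidence for j in joint]


priors = [Fraction(1, 2), Fraction(1, 2)]             # box chosen at random
likelihoods = [Fraction(15, 100), Fraction(5, 200)]   # P(D|B1), P(D|B2)

p_defective, posteriors = posterior_over_partition(priors, likelihoods)
print(float(p_defective))                       # 0.0875
print([round(float(p), 4) for p in posteriors]) # [0.8571, 0.1429]
```

The posteriors sum to 1 by construction, so $P(B_2 \mid D) = 1 - P(B_1 \mid D)$ comes out of the same call.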