1/18/2024

[Last time] Repeated Combination
$H^n_r = C^{n+r-1}_r$

Conditional Probability
$P(B \mid A)$ is the probability of event B occurring, given that event A has already occurred.
• The conditional probability of an event B given an event A, denoted as $P(B \mid A)$, is:
  $P(B \mid A) = \frac{P(A \cap B)}{P(A)}$ for $P(A) > 0$.
• From a relative frequency perspective of n equally likely outcomes:
  – $P(A)$ = (number of outcomes in A) / n
  – $P(A \cap B)$ = (number of outcomes in $A \cap B$) / n
  – $P(B \mid A)$ = (number of outcomes in $A \cap B$) / (number of outcomes in A)

Example 2-22
400 parts are classified by surface flaws and as functionally defective. The table below gives 4 probabilities conditioned on flaws.

Parts Classified
              Surface Flaws
Defective     Yes (F)   No (F')   Total
Yes (D)         10        18        28
No (D')         30       342       372
Total           40       360       400

<sol>
$P(D \mid F) = 10/40 = 0.25$, $P(D' \mid F) = 30/40 = 0.75$
$P(D \mid F') = 18/360 = 0.05$, $P(D' \mid F') = 342/360 = 0.95$

Multiplication Rule
• The conditional probability can be rewritten to generalize a multiplication rule.
  $P(A \cap B) = P(B \mid A) \cdot P(A) = P(A \mid B) \cdot P(B)$
[Note] $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$, $P(B \mid A) = \frac{P(A \cap B)}{P(A)}$

Two Mutually Exclusive Subsets
• A and A' are mutually exclusive.
• $A \cap B$ and $A' \cap B$ are mutually exclusive.
• $B = (A \cap B) \cup (A' \cap B)$

Total Probability Rule
For any two events A and B:
$P(B) = P(B \cap A) + P(B \cap A')$
$\quad\;\; = P(B \mid A) \cdot P(A) + P(B \mid A') \cdot P(A')$ (by the multiplication rule)

Example 2-27: Semiconductor Contamination
Information about product failure based on chip manufacturing process contamination is given below. Find the probability of failure.
Probability of Failure   Level of Contamination   Probability of Level
        0.1                     High                      0.2
        0.005                   Not High                  0.8

<sol>
Let F denote the event that the product fails.
Let H denote the event that the chip is exposed to high contamination during manufacture.
$P(F) = P(F \mid H) \cdot P(H) + P(F \mid H') \cdot P(H') = 0.1(0.2) + 0.005(0.8) = 0.024$

Total Probability Rule (Multiple Events)
• A collection of sets $E_1, E_2, \ldots, E_k$ such that $E_1 \cup E_2 \cup \cdots \cup E_k = S$ is said to be exhaustive.
• Assume $E_1, E_2, \ldots, E_k$ are k mutually exclusive and exhaustive sets. Then
  $P(B) = P(B \cap E_1) + P(B \cap E_2) + \cdots + P(B \cap E_k)$
  $\quad\;\; = P(B \mid E_1) \cdot P(E_1) + P(B \mid E_2) \cdot P(E_2) + \cdots + P(B \mid E_k) \cdot P(E_k)$

Example 2-28: Semiconductor Failures

Probability of Failure   Level of Contamination   Probability of Level
        0.100                   High                      0.2
        0.010                   Medium                    0.3
        0.001                   Low                       0.5

Find $P(F)$.
<sol>
$P(F) = 0.100(0.2) + 0.010(0.3) + 0.001(0.5) = 0.0235$

Event Independence
• Two events are independent if any one of the following equivalent statements is true:
  1. $P(A \mid B) = P(A)$
  2. $P(B \mid A) = P(B)$
  3. $P(A \cap B) = P(A) \cdot P(B)$
• This means that the occurrence of one event has no impact on the probability of occurrence of the other event.

Example 2-30: Flaws and Functions
Table 1 provides an example of 400 parts classified by surface flaws and as (functionally) defective. Suppose that the situation is different and follows Table 2. Let F denote the event that the part has surface flaws. Let D denote the event that the part is defective. The data show whether the events are independent.

TABLE 1 Parts Classified
              Surface Flaws
Defective     Yes (F)   No (F')   Total
Yes (D)         10        18        28
No (D')         30       342       372
Total           40       360       400

TABLE 2 Parts Classified (data changed)
              Surface Flaws
Defective     Yes (F)   No (F')   Total
Yes (D)          2        18        20
No (D')         38       342       380
Total           40       360       400

<sol>
Table 1: $P(F \mid D) = 10/28 \approx 0.36 \neq P(F) = 40/400 = 0.1$, so F and D are not independent.
[Note] Table 2: $P(F \mid D) = 2/20 = 0.1 = P(F) = 40/400 = 0.1$, so F and D are independent.

Example: We randomly draw two cards from a deck of 52 cards without replacement. Let A: the first card is K and B: the second card is K. Are events A and B independent?
<sol>
$P(B \mid A) = 3/51$, while $P(B) = \frac{3}{51} \cdot \frac{4}{52} + \frac{4}{51} \cdot \frac{48}{52} = \frac{4}{52}$. Since $P(B \mid A) \neq P(B)$, A and B are not independent.

Example: We randomly draw two cards from a deck of 52 cards with replacement.
Let A: the first card is K and B: the second card is K. Are events A and B independent?
<sol>
With replacement, the deck is restored before the second draw, so $P(B \mid A) = 4/52 = P(B)$. A and B are independent.

[Note] Let A and B be two events with $P(A) \neq 0$ and $P(B) \neq 0$.
1. If A and B are independent, then A and B are not mutually exclusive.
2. If A and B are mutually exclusive, then A and B are not independent.
3. If A and B are independent, then A' & B', A' & B, and A & B' are each independent as well.
<pf>
1. Independence gives $P(A \cap B) = P(A) \cdot P(B) > 0$, so $A \cap B$ is not empty.
2. This is the contrapositive of 1.
3. $P(A \cap B') = P(A) - P(A \cap B) = P(A) - P(A) \cdot P(B) = P(A) \cdot P(B')$. The other two pairs follow in the same way.

Bayes’ Theorem:
Let the event of interest B happen under event A with a known conditional probability $P(B \mid A)$. Assume the probability of A is known (prior probability). Then the conditional (posterior) probability of A given that event B happened is:
$P(A \mid B) = \frac{P(B \mid A) \cdot P(A)}{P(B)}$ for $P(B) > 0$

Example 2-36 (Example 2-27)

Probability of Failure   Level of Contamination   Probability of Level
        0.1                     High                      0.2
        0.005                   Not High                  0.8

Find the probability of a High level of contamination given that a failure occurred.
<sol>
$P(F) = 0.1(0.2) + 0.005(0.8) = 0.024$
$P(H \mid F) = \frac{P(F \mid H) \cdot P(H)}{P(F)} = \frac{0.1(0.2)}{0.024} \approx 0.83$

Bayes’ Theorem with Total Probability
If $E_1, E_2, \ldots, E_k$ are k mutually exclusive and exhaustive events and B is any event,
$P(E_1 \mid B) = \frac{P(B \mid E_1) \cdot P(E_1)}{P(B)} = \frac{P(B \mid E_1) \cdot P(E_1)}{P(B \mid E_1) \cdot P(E_1) + \cdots + P(B \mid E_k) \cdot P(E_k)}$ for $P(B) > 0$
[Note]
- The denominator is the total probability expression for $P(B)$.
- The numerator is always one term of the denominator.

Example: The following problem was posed by Casscells, Schoenberger, and Graboys (1978) to 60 students and staff at an elite medical school:
If a test to detect a disease whose prevalence is 1/1000 has a false positive rate of 5%, what is the chance that a person found to have a positive result actually has the disease, assuming you know nothing about the person’s symptoms or signs?
Assume that the probability of a positive result given the disease is 1.
Let D denote the event that you have the disease.
Let S denote the event that your test is positive.
$P(D \mid S) = \frac{P(S \mid D) \cdot P(D)}{P(S \mid D) \cdot P(D) + P(S \mid D') \cdot P(D')} = \frac{1(0.001)}{1(0.001) + 0.05(0.999)} \approx 0.0196 \approx 2\%$
Casscells et al. found that only 18% of participants gave this answer.
The most frequent response was 95%.
Before the test, your chance was 0.1%. After the positive result, your chance is now about 2%.

Example: Drug test for MLB players
Suppose 7% of all major league baseball players use an illegal steroid. A league test is considered 96% effective in detecting the illegal steroid, but unfortunately, it also incorrectly tests positive (indicating steroid use) 12% of the time for a person who does not have the illegal steroid in their system.
1. What proportion of the league players will test positive for the illegal steroid?
2. Given that someone tests positive for an illegal steroid, what is the probability that they do not have an illegal steroid in their system?
3. Suppose four players have taken steroids. What is the probability all of them are detected by the test?
<sol>
A: test positive
S: a player takes steroid
Given: $P(S) = 0.07$, $P(A \mid S) = 0.96$, $P(A \mid S') = 0.12$
1. $P(A) = P(A \mid S) \cdot P(S) + P(A \mid S') \cdot P(S') = 0.96(0.07) + 0.12(0.93) = 0.1788$
2. $P(S' \mid A) = \frac{P(A \mid S') \cdot P(S')}{P(A)} = \frac{0.1116}{0.1788} \approx 0.624$
3. Assuming the four test results are independent: $0.96^4 \approx 0.849$

Naïve Bayes:
Let S: Spam email. A, B, C, D are just four key words from the email you try to classify.
$P(S \mid A, B, C, D) = \frac{P(A, B, C, D \mid S) \cdot P(S)}{P(A, B, C, D)} = \frac{P(A, B, C, D \mid S) \cdot P(S)}{P(A, B, C, D \mid S) \cdot P(S) + P(A, B, C, D \mid S') \cdot P(S')}$
Assume the key words are conditionally independent given S and given S':
$P(A, B, C, D \mid S) = P(A \mid S) \cdot P(B \mid S) \cdot P(C \mid S) \cdot P(D \mid S)$
$P(A, B, C, D \mid S') = P(A \mid S') \cdot P(B \mid S') \cdot P(C \mid S') \cdot P(D \mid S')$
You should be able to calculate all the probabilities on the right from the training data.
$P(S' \mid A, B, C, D) = 1 - P(S \mid A, B, C, D)$
∴ If $P(S \mid A, B, C, D) \geq 0.5$ → classify the email as spam! Otherwise, the email is not spam.
For example:
If $P(S \mid A, B, C, D) = 0.4 < 0.5$ → not spam
If $P(S \mid A, B, C, D) = 0.6 \geq 0.5$ → spam
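
The Bayes and Naïve Bayes calculations above can be sketched in a few lines of Python. This is only an illustration, not a full classifier: the function names `posterior` and `naive_bayes_spam` are made up for this sketch, and the key-word probabilities and the spam prior of 0.3 are hypothetical values standing in for estimates from training data. The disease-test numbers (prevalence 1/1000, false positive rate 5%, detection probability 1) are the ones from the notes.

```python
from math import prod


def posterior(priors, likelihoods):
    """Bayes' theorem with total probability.

    priors[i]      = P(E_i) for mutually exclusive, exhaustive events E_i
    likelihoods[i] = P(B | E_i)
    Returns the list of posteriors P(E_i | B).
    """
    joint = [p * l for p, l in zip(priors, likelihoods)]  # P(B and E_i)
    total = sum(joint)  # P(B), by the total probability rule
    if total <= 0:
        raise ValueError("P(B) must be positive")
    return [j / total for j in joint]


def naive_bayes_spam(p_spam, word_given_spam, word_given_ham):
    """P(S | key words), assuming the key words are conditionally
    independent given S and given S'."""
    like_spam = prod(word_given_spam)  # P(A|S) * P(B|S) * ...
    like_ham = prod(word_given_ham)    # P(A|S') * P(B|S') * ...
    return posterior([p_spam, 1 - p_spam], [like_spam, like_ham])[0]


# Disease-test example from the notes: P(D) = 0.001, P(S|D) = 1, P(S|D') = 0.05.
p_disease = posterior([0.001, 0.999], [1.0, 0.05])[0]

# Hypothetical estimates for key words A, B, C, D (not from the notes).
p_s = naive_bayes_spam(
    p_spam=0.3,
    word_given_spam=[0.8, 0.6, 0.5, 0.7],
    word_given_ham=[0.1, 0.3, 0.4, 0.2],
)
label = "spam" if p_s >= 0.5 else "not spam"
print(round(p_disease, 4), round(p_s, 4), label)
```

With these inputs, `p_disease` comes out near 0.0196, which is the 2% answer in the disease-test example, and the hypothetical email lands above the 0.5 threshold. The same `posterior` helper also covers Example 2-36 when called as `posterior([0.2, 0.8], [0.1, 0.005])`.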