Presentation 5. Probability

What are we doing now?
So far in this course we have covered:
1. Descriptive Statistics (plots and summaries of variables)
2. Gathering Data (types of studies and design issues)
3. Introduction to Inference (statistical procedures to answer questions about the population; so far we have seen the chi-square test)
Now we are going to learn about probability and random variables, which will help us understand inference more completely (i.e., where p-values come from, how we are able to draw conclusions based on samples, etc.).

Random Circumstances
A random circumstance is one in which the outcome is unpredictable.
Examples:
1. Selecting a card at random from a deck.
2. Rolling a six-sided die.
3. Selecting an individual at random and recording their eye color.
4. Selecting an individual at random and recording their weight.

Probability
Probability is how likely it is that a particular outcome will be the result of a random circumstance.
Examples:
1. Probability of selecting a spade.
2. Probability of rolling a 4.
3. Probability the person you selected has green eyes.
4. Probability the person you select weighs more than 200 lbs.

More on Probability
Probability is a number between 0 and 1.
The probabilities of all outcomes of a random circumstance add up to 1 (or 100%).
The set of all possible outcomes of a random circumstance is called the sample space. For example:
1. Tossing a coin: sample space = {head, tail}
2. Rolling a die: sample space = {1,2,3,4,5,6}
3. Rolling two dice: sample space = {(1,1),(1,2),…,(6,6)}

Relative Frequency Interpretation of Probability
The word probability has two different interpretations. One is the statistical definition; the other (which might be slightly different) is the use of the word probability in everyday life.
In situations that we can imagine repeating many times, we define the probability of a specific outcome as the proportion of times it would occur over the long run.
This is also called the relative frequency of that particular outcome.

Example
Consider tosses of a coin. Let us plot the proportion of heads on the nth toss.
[Figure: two plots of the proportion of heads against the number of flips, one for 0 to 1000 flips and one for 0 to 500 flips.]

Example
Suppose we toss a fair coin 2,000 times. It is quite unlikely (about a 1.8% chance) that we get exactly 1,000 heads and 1,000 tails. However, the proportion of heads and of tails will approach 0.5 (50%) as we make more and more tosses.
We have to consider the outcome in the long run. Suppose we toss a coin 3 times and get 1 head and 2 tails. Based on this information, we cannot assign 1/3 as the probability of getting a head and 2/3 as the probability of getting a tail. Lack of repetition creates a problem in this situation. To correct this error, we need to toss the coin a large number of times.

Determining the Relative Frequency Probability of an Outcome

Method 1: Make an Assumption about the Physical World
Example 7.2 A Simple Lottery
Choose a three-digit number between 000 and 999. The player wins if his or her three-digit number is chosen.
Suppose the 1000 possible three-digit numbers (000, 001, 002, . . . , 999) are equally likely.
In the long run, a player should win about 1 out of 1000 times. This does not mean a player will win exactly once in every thousand plays.

Method 2: Observe the Relative Frequency
Example 7.4 The Probability of Lost Luggage
“1 in 176 passengers on U.S. airline carriers will temporarily lose their luggage.”

Example 1: A fair die is rolled once. Determine the probability of each of the following outcomes.
a. Six dots:
b. One or two dots:
c. An even number of dots:
d. More than 5 dots:
e. More than or equal to 5 dots:
f. Less than 5 dots:

Example 2: When 190 students were asked to pick a number from 1 to 10, the number of students selecting each number was as follows:

Number:     1   2   3   4   5   6   7   8   9   10   TOTAL
Frequency:  2   9   22  21  18  23  56  19  14  6    190

What is the approximate probability that someone asked to pick a number from 1 to 10 will pick
1. The number 3?
2. One of the two extremes, 1 or 10?
3. An odd number?

Some Definitions
The sample space is the collection of all possible outcomes of a random circumstance (we will denote it by S).
A simple event is one possible outcome of a random circumstance.
An event is a collection of one or more simple events.
Example: Consider rolling a die once.
The sample space is S = {1,2,3,4,5,6}.
The simple events in the sample space are {1}, {2}, {3}, {4}, {5}, {6}.
The event of obtaining an odd outcome, {1,3,5}, is made up of the simple events {1}, {3}, {5}.

Assigning Probabilities to Simple Events
P(A) = probability of the event A
Conditions for Valid Probabilities
1. Each probability is between 0 and 1.
2. The sum of the probabilities over all possible simple events is 1.
Equally Likely Simple Events
If there are k simple events in the sample space and they are all equally likely, then the probability of the occurrence of each one is 1/k.

Complementary Events
The complement of the event A, denoted by A^c, is the collection of all simple events that are not in A.
[Figure: diagram of A and its complement A^c.]
Note: P(A) + P(A^c) = 1
Example 7.2 A Simple Lottery (cont.)
A = player buying a single ticket wins
A^c = player does not win
P(A) = 1/1000, so P(A^c) = 999/1000

Mutually Exclusive Events
Two events are mutually exclusive, or equivalently disjoint, if they do not contain any of the same simple events (outcomes).
[Figure: diagram of events A and B.]
Example 7.2 A Simple Lottery (cont.)
A = all three digits are the same
B = the first and last digits are different
The events A and B are mutually exclusive (disjoint), but they are not complementary.
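
The lottery events A (all three digits the same) and B (first and last digits different) can be checked by listing the sample space directly. The following is a minimal sketch, assuming the 1000 three-digit numbers are equally likely as in Example 7.2; the helper name `prob` and the variable names are illustrative choices, not part of the course material.

```python
from fractions import Fraction

# Sample space: the 1000 strings "000" .. "999" (assumed equally likely).
S = {f"{n:03d}" for n in range(1000)}

def prob(event):
    """Probability of an event (a subset of S) when all outcomes are equally likely."""
    return Fraction(len(event), len(S))

# A = all three digits are the same; B = first and last digits are different.
A = {s for s in S if s[0] == s[1] == s[2]}
B = {s for s in S if s[0] != s[2]}

print(prob(A))          # 1/100
print(prob(B))          # 9/10
print(A & B == set())   # True: no shared outcomes, so A and B are mutually exclusive
print(A | B == S)       # False: together they do not cover S, so not complementary
print(prob(S - A))      # 99/100, which equals 1 - P(A)
```

Outcomes such as 121 belong to neither A nor B, which is why the two events are disjoint without being complements of each other.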

Independent Events
Two events are independent of each other if knowing that one will occur (or has occurred) does not change the probability that the other occurs.
Are mutually exclusive events independent?

Conditional Probability
The conditional probability of the event B, given that the event A occurs, is written as P(B|A).
1. If A and B are independent, then P(B|A) = ____
2. If A and B are mutually exclusive, then P(B|A) = ____

Rules for Finding Probabilities
Probability Axioms
1. 0 ≤ P(A) ≤ 1, for every event A.
2. P(S) = 1
3. For disjoint events A1, A2, …: P(A1 or A2 or …) = P(A1) + P(A2) + …
Note: [Figure: diagram of overlapping events A and B.]
The event “A or B” is the whole grey area.
The event “A and B” is the darkest area of the diagram.

Consequences of these axioms… some basic rules
Rule 1: P(A^c) = 1 - P(A)
since P(A or A^c) = P(S) = 1 and also P(A or A^c) = P(A) + P(A^c).
For A1, A2 disjoint: P(A1 and A2) = P(Ø) = 0, where Ø denotes an event that cannot occur.
Rule 2: P(A or B) = P(A) + P(B) - P(A and B)
For A and B disjoint: P(A or B) = P(A) + P(B)
Rule 3: P(A and B) = P(A)P(B|A)
For A and B independent: P(A and B) = P(A)P(B)
For several independent events: P(A1 and A2 and … and An) = P(A1)P(A2)…P(An)
Rule 4: P(B|A) = P(A and B)/P(A)
P(A|B) = P(A and B)/P(B)

Example 7.12 Probability Two Strangers Both Share Your Birth Month
Event A = 1st stranger shares your birth month, P(A) = 1/12
Event B = 2nd stranger shares your birth month, P(B) = 1/12
Note: Events A and B are independent.
P(both strangers share your birth month) = P(A and B) = P(A)P(B) = (1/12)(1/12) = 1/144 ≈ 0.007
Note: The probability that 4 unrelated strangers all share your birth month would be (1/12)^4.

Example 7.13 Alicia Answering
There are 50 students, and the professor randomly selects students to answer 3 questions.
If we know Alicia is picked to answer one of the questions, what is the probability it was the first question?
A = Alicia is selected to answer Question 1, P(A) = 1/50
B = Alicia is selected to answer any one of the questions, P(B) = 3/50
Since A is a subset of B, P(A and B) = 1/50 (draw a diagram).
P(A|B) = P(A and B)/P(B) = (1/50)/(3/50) = 1/3

Example 1
Studies on depression indicate that a particular course of treatment improves the condition of 72% of those on whom it is used, does not affect 10%, and worsens the condition of the rest. A person suffering from depression is treated using this method.
What is the probability that his condition will worsen?
What is the probability that the treatment is not detrimental to his condition?

Example 2
Data gathered at a particular blood center show that 0.1% of all donors test positive for HIV and 1% test positive for herpes. Suppose 1.05% test positive for one or the other of these problems.
1. What is the probability that a randomly selected donor will have neither problem?
2. Would you be surprised to find a donor with both problems?
3. Are these two problems independent?

Example 3
Approximately 50% of the population is male, 68% of the population drinks to some extent, and 38.5% drinks to some extent and is male.
Given that a randomly selected individual is male, find the probability that he drinks.
Is a person's drinking status independent of gender?
Is the same true if 34% drinks to some extent and is male?
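
One way to work through Example 3 is to apply Rule 4 for the conditional probability and then compare P(A and B) with P(A)P(B) to check independence. The sketch below uses only the percentages stated in the example; the function names `conditional` and `is_independent` are hypothetical helpers, and the tolerance argument is an assumption added to absorb floating-point rounding.

```python
import math

def conditional(p_a_and_b, p_a):
    """Rule 4: P(B | A) = P(A and B) / P(A)."""
    return p_a_and_b / p_a

def is_independent(p_a, p_b, p_a_and_b, tol=1e-9):
    """A and B are independent when P(A and B) equals P(A) * P(B)."""
    return math.isclose(p_a_and_b, p_a * p_b, abs_tol=tol)

p_male, p_drinks = 0.50, 0.68

# Case 1: 38.5% of the population is male and drinks.
print(conditional(0.385, p_male))                # P(drinks | male)
print(is_independent(p_male, p_drinks, 0.385))   # False

# Case 2: 34% of the population is male and drinks.
print(conditional(0.34, p_male))                 # P(drinks | male)
print(is_independent(p_male, p_drinks, 0.34))    # True
```

In the second case P(drinks | male) comes out equal to P(drinks), which is the definition of independence given earlier: knowing the person is male does not change the probability that the person drinks.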