STAT 110 - Section 5
Lecture 21
Professor Hao Wang
University of South Carolina
Spring 2012

Chapter 18 – Probability Models
• probability model – describes all possible outcomes and says how to assign probabilities to any collection of outcomes
• sample space – the collection of all unique outcomes of a random circumstance
• event – a collection of outcomes

Die Example
Suppose you are asked to roll a die with 6 faces. What is the sample space?
Possible events are
• Roll is an even number
• Roll is an odd number
• Roll is 5 or 6

What about a roulette wheel?
• Sample space?
• Examples of events?

Probability Rules
1. Any probability is a number between 0 and 1. So for any event A we know that 0 ≤ P(A) ≤ 1.

Probability Rules
2. All possible outcomes together must have probability 1.
• An outcome must occur on every trial.
• The sum of the probabilities for all possible outcomes must be exactly 1.

Marital Status of a Random Sample of Women Ages 25 to 29
• Consider the following assignment of probabilities:

Marital Status    Probability
Never married     0.386
Married           0.555
Widowed           0.004
Divorced          0.055

• Each of the probabilities is a number between 0 and 1.
• The probabilities total to 1: 0.386 + 0.555 + 0.004 + 0.055 = 1
Any assignment of probabilities to all individual outcomes that satisfies Rules 1 and 2 is legitimate.

What does the probability of D need to be to make this a probability model?
P(A)=0.3  P(B)=0.2  P(C)=0.1  P(D)=?
A) 0.0
B) 0.1
C) 0.2
D) 0.3
E) 0.4

Which of the following is not a possible probability model?
A) P(A)=0.3 P(B)=0.4 P(C)=0.3
B) P(A)=0.3 P(B)=0.7
C) P(A)=1.0
D) P(A)=0.3 P(B)=0.6 P(C)=0.2

Incoherent
If a set of probabilities doesn’t satisfy Rules 1 and 2, we say the probabilities are incoherent. This often occurs with someone’s personal probabilities in complicated situations.

Probability Rules
3. The probability that an event does not occur is 1 minus the probability that the event does occur.
This is known as the complement rule.
• Suppose that P(A) = .70
• Using this rule we can determine P(not A):
  P(not A) = 1 - P(A) = 1 - .70 = .30
The event “not A” is known as the complement of A, which can be written as A^c.

• Suppose the probability of a horse winning a race is 0.85. What is the probability of the horse not winning?
A) 0.85
B) 0.15
C) 0.7
D) 0.2

Probability Rules
4. If two events have no outcomes in common, the probability that one or the other occurs is the sum of their individual probabilities. Events that have no outcomes in common are said to be disjoint.

Suppose events A and B are disjoint and you know that P(A) = .40 and P(B) = .35. What is P(A or B)?

If P(A)=0.5 and P(B)=0.4 and A and B are disjoint, then what is P(A or B)?
A) 0.1
B) 0.2
C) 0.4
D) 0.5
E) 0.9

• The probability a student is in honors math is 0.25, the probability a student is in honors science is 0.3, and the probability a student is in both is 0.2.
• What is the probability a student is in at least one honors class?

Venn Diagrams

• The probability it will rain Wednesday AM is 30%. The probability it will rain Wednesday PM is 30%. The probability it will rain both Wednesday AM and Wednesday PM is 10%. What is the probability it will rain on Wednesday?
A) 20%
B) 30%
C) 40%
D) 50%
E) 60%

• Rule 5. Multiplication Rule: If two events, A and B, are independent, then P(A and B) = P(A)P(B)
• Independence means that the occurrence of event A does not change the probability that event B occurs.

• What is the probability that the first roll of a die is even, and the second roll is odd?
A) 1/36
B) 1/12
C) 1/4
D) 1/2

Review for Midterm II

• You have data on returns on common stocks for all years since 1945. To show clearly how returns have changed over time, your best choice of graph is a
A) line graph
B) bar graph
C) pie chart
D) histogram

• In our class poll, information was collected on your height in inches. This is an example of a ____________ variable.
A) categorical
B) quantitative

9) This data set is best described as:
a) Skewed Left
b) Symmetric
c) Skewed Right
d) Bimodal

10) The mean of this data set is:
a) Approximately equal to the median
b) Greater than the median
c) Less than the median
d) Can’t tell from the picture

Questions 14-15 are based on the data set: 4 2 6 5 12 3 9

14) The median is:
a) 5.0
b) 6.0
c) 7.0
d) 8.0
e) 9.0

15) The third quartile (Q3) is:
a) 2.0
b) 3.0
c) 5.0
d) 7.5
e) 9.0

16) A list of 20 exam scores ranges from 64 to 98. If a typo was made and the 64 was entered as a 4, then
a) The mean would become smaller and the median would stay the same
b) The mean would stay the same and the median would become smaller
c) Both the mean and the median would stay the same
d) Both the mean and the median would become smaller

20) For a standard normal distribution, what can we say about the percentage of the data that is within two standard deviations of the mean?
a) 68%
b) at least 75%
c) at least 88.9%
d) 95%
e) Can’t tell from the given information

Questions 23-24 refer to the heights of a group of men that are approximately normally distributed with a mean of 70 inches and a standard deviation of 3 inches.

23) Approximately what percentage of men in this group are between 73 inches and 76 inches?
a) 13.5%
b) 27%
c) 34%
d) 47.5%
e) 81.5%

24) What percentile would someone in this group who was 67 inches tall be?
a) 0.15%
b) 2.5%
c) 16%
d) 50%
e) 84%

z-score = (observation - mean) / standard deviation

25) Physics classes are notorious for having very low grades on their exams (that get curved at the end of the semester). The first exam’s scores in one semester were approximately normally distributed with a mean of 40 and a standard deviation of 10.
The second exam’s scores were approximately normally distributed with a mean of 30 and a standard deviation of 5. Use the z-scores, where z-score = (observation - mean) / standard deviation, to determine which would be a better score: a 50 on exam 1 or a 40 on exam 2?
a) 50 on exam 1
b) 40 on exam 2
c) The two scores are equivalent

• Suppose an algebra professor found that the correlation between study time (in hours) and exam score (out of 100) is +.80, and the regression line was found to be y = 20 + 4x. He arrived at this equation through years of collecting data on his students, all of whom reported studying anywhere from 0 to 20 hours for his exams. For which values of study time does the professor’s regression equation make sense in terms of predicting exam scores?
a. Between 0 and 20 hours.
b. Between 0 and 100 hours.
c. Anything greater than or equal to 0 hours.
d. It is not possible to predict exam score with study time.
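
The z-score comparison asked for in question 25 can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `z_score` and the variable names are assumptions made here, not part of the lecture, and the means and standard deviations are the ones stated in the question.

```python
def z_score(observation, mean, standard_deviation):
    """Return how many standard deviations an observation lies from the mean."""
    return (observation - mean) / standard_deviation


# Exam 1: mean 40, standard deviation 10; the score being compared is 50.
z_exam1 = z_score(50, mean=40, standard_deviation=10)

# Exam 2: mean 30, standard deviation 5; the score being compared is 40.
z_exam2 = z_score(40, mean=30, standard_deviation=5)

print("z-score for 50 on exam 1:", z_exam1)
print("z-score for 40 on exam 2:", z_exam2)
```

The score with the larger z-score sits further above its own exam’s mean, measured in that exam’s standard deviations, so z-scores allow scores from differently scaled exams to be compared.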