Probability
Introduction to Statistics, Chapter 6
Feb 11-16, 2010, Classes #8-9

Inferential Statistics
• By knowing the make-up of a population, we can determine the probability of obtaining specific samples.
• In this way, probability gives us a connection between populations and samples.
• We try to use a sample to make an inference about the population.
• See marbles example on pp. 135-136.

Probability Definition
In a situation where several different outcomes are possible, we define the probability of any particular outcome as a fraction or proportion.

Probability Formula
P(event) = (# of outcomes classified as the event) / (total # of possible outcomes)
The probability of event A, p(A), is the ratio of the number of outcomes that include event A to the total number of possible outcomes.

Probability Definition
Example: What is the probability of obtaining an Ace out of a deck of cards?
p(Ace) = 4/52 = .077, or about 7.7%

Probability Formula
Example: What is the probability that a selected person has a birthday in October (assume 365 days in a year)?
Step 1: How many chances are there to have a birthday in a year?
Step 2: How many chances are there to have a birthday in October?
Step 3: The probability that a randomly selected person has a birthday in October is:
P(October birthday) = 31/365 = 0.0849

Probability Definition
Probability values are contained in a limited range (0-1).
• If P=0, the event will never occur.
• If P=1, the event will always occur.
[Figure: probability scale running from 1 (Certain) through Likely, 0.5 (50-50 chance), and Unlikely, down to 0 (Impossible)]

Probability Definition
Probability can be expressed as a fraction, a decimal, or a percentage.
P = 3/4    P = 0.75    P = 75%
*All of the above values are equal*

Simple (or Independent) Random Sampling
• Each individual in the population has an equal chance of being selected: p = 1/N
• If more than one individual is to be selected for the sample, there must be constant probability for each and every selection.
• In cards, 1/52 for each draw.

Example 1: Selecting n = 3 cards from a full deck of cards…
p(Ace) = 4/52 (draw is a 3)
p(Ace) = 4/51 (draw is a 2)
p(Ace) = 4/50 (draw is an Ace)
If we drew a 4th card, what would be the probability? p(Ace) = ?/??

What is the problem with Example 1?
Remember our rules for simple random sampling… How can we fix it? See next slide…

Sampling with replacement
To keep the probabilities from changing from one selection to the next, it is necessary to return each selected individual to the population before you make the next selection.

Example of Sampling with Replacement
You draw a name out of the hat and record it; you put the name back, and it can be chosen again.

Example 1, but this time with sampling with replacement: Selecting n = 3 cards from a full deck of cards…
p(Ace) = 4/52 (draw is a 3)
p(Ace) = 4/52 (draw is a 2)
p(Ace) = 4/52 (draw is an Ace)
If we drew a 4th card, what would be the probability? p(Ace) = ?/??

Probability and Frequency Distributions
• Because probability and proportion are equivalent, a particular proportion of a graph corresponds to a particular probability in the population.
• Thus, whenever a population is presented in a frequency distribution graph, it is possible to represent probabilities as proportions of the graph.

Example 2: What is the probability of scoring 70 or above on this exam? p(X ≥ 70) = ?

Example 3
What is the probability of drawing an exam of B or better out of the pile of all 47 exams?
[Figure: bar chart of exam grades, y-axis: Frequency (0 to 30); tallest bar labeled 28]
[Figure, continued: frequency bar chart of exam grades; x-axis: GRADE (A, B+, B, C+, C, D+, D, F); remaining bar labels: 9, 4, 2, 2, 2, 0, 0]

Probability and the Normal Distribution
• Statisticians often identify sections of a normal distribution by using z-scores.
• Recall that z-scores measure positions in a distribution in terms of standard deviations from the mean.
  – z = 1.00 is 1 SD above the mean
  – z = -2.00 is 2 SDs below the mean

Probability and the Normal Distribution
From the normal distribution graph you can then determine the percentage of scores falling above or below a z-score, between the mean and a z-score, or between two z-scores. See next slide.

The Normal Curve
[Figure: normal curve with Mean = 65, S = 4; y-axis: Relative Frequency; x-axis: scores 51 to 79, marked from -3S to +3S. 68.26% of cases fall within ±1S, 95.44% within ±2S, and 99.73% within ±3S. The six bands from -3S to +3S hold about 2%, 14%, 34%, 34%, 14%, and 2% of cases.]

The normal distribution following a z-score transformation
Note:
1. Left and right sides of the distribution have the same proportions.
2. Proportions apply to any normal distribution.

Probability and the normal distribution
• Why is this important?
  – We can now describe X-values (raw scores) in terms of probability.
• Example:
  – What is the probability of randomly selecting a person who is taller than 80 inches?
  – What is the probability of randomly selecting a person who is shorter than 74 inches?
[Figure: normal distribution of heights with SD = 6; axis marked at 68, 74, and 80]

The Unit Normal Curve Table
The normal curve table gives the precise percentage of scores between the mean (z score of 0) and any other z score.
Can be used to determine:
• Proportion of scores above or below a particular z score
• Proportion of scores between the mean and a particular z score
• Proportion of scores between two z scores
NOTE: Using a z score table assumes that we are dealing with a normal distribution. If scores are drawn from a non-normal distribution (e.g., a rectangular distribution), converting them to z scores does not produce a normal distribution.

The Unit Normal Table
The unit normal table lists relationships between z-score locations and proportions in a normal distribution.
Table B.1 in Appendix B, pp. 584-587

Using Appendix B.1
Appendix B.1 has four columns: (A), (B), (C), and (D).
(A) = Z scores
(B) = Proportion in body
(C) = Proportion in tail
(D) = Proportion between mean and z

The Unit Normal Table
Column B: Proportion in body

The Unit Normal Table
Column C: Proportion in tail
Check this in Table B.1

Using Appendix B.1 to Find Areas Below a Score
• Appendix B.1 can be used to find the areas above and below a score.
• First compute the Z score.
• Make a rough sketch of the normal curve and shade in the area in which you are interested.
Check this in Table B.1
• The area below Z = 1.67 is 0.4525 + 0.5000 = 0.9525 (the proportion between the mean and z, plus the half of the distribution below the mean).
• Areas can be expressed as percentages: 0.9525 = 95.25%

The Unit Normal Table
• The body (column B) always corresponds to the larger part of the distribution, whether it is on the right-hand side or the left-hand side.
• The tail (column C) is always the smaller section, whether it is on the right or the left.

The Unit Normal Table
• The proportions on the right-hand side equal the proportions on the left-hand side.
• Proportions are always positive.
• The two proportions (body and tail) always add to 1.00.

The Unit Normal Table
The table lists proportions of the normal distribution for a full range of possible z-score values.
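The table lookups described above can also be checked numerically. The sketch below is not part of the original slides: it assumes Python 3.8+ and uses the standard library's `statistics.NormalDist` in place of the printed Table B.1. The helper name `table_row` is hypothetical, chosen only for illustration.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, SD 1): the distribution that
# the unit normal table tabulates.
UNIT_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def table_row(z):
    """Return (body, tail, mean_to_z) for a z-score, mirroring
    columns B, C, and D of a unit normal table.

    The table lists only non-negative z values because the curve is
    symmetric, so the sign of z is dropped here as well.
    """
    z = abs(z)
    body = UNIT_NORMAL.cdf(z)   # column B: the larger part
    tail = 1.0 - body           # column C: the smaller part
    mean_to_z = body - 0.5      # column D: between the mean and z
    return body, tail, mean_to_z


body, tail, mean_to_z = table_row(1.67)
# Area below z = 1.67 is column D plus the lower half: 0.4525 + 0.5000
print(round(body, 4), round(tail, 4), round(mean_to_z, 4))
# prints: 0.9525 0.0475 0.4525
```

Body and tail add to 1.00 for every z, and `table_row(-1.67)` returns the same three proportions as `table_row(1.67)`, which matches the symmetry notes on the slides.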
See chart on the next slide
[Figure: A portion of the unit normal table]

The Unit Normal Table
It's important to know how to use it:
If you know the specific location (z-score) in a normal distribution, you can use the table to look up the corresponding proportions.
*Z-Score ~~~> Proportion*
And also…

The Unit Normal Table
If you know a specific proportion, you can use the table to look up the exact z-score location in the distribution.
*Proportion ~~~> Z-Score*

Probabilities, proportions, and scores (X values)
The process…
1. Transform the X values into z-scores (Chapter 5).
2. Use the unit normal table to look up the proportions corresponding to the z-score values.
See chart on the next slide
[Figure: A map for probability problems]

Example 4 – Road Race Results
Stacy ran: 37 minutes
Mean: 36 minutes
SD: 4 minutes
What percentage of people took longer than Stacy to finish the race? p(X > 37) = ?

Example 5: What is the probability of randomly selecting someone with an IQ score less than 120?
IQ scores form a normal distribution with mean μ = 100 and SD σ = 15.
Our question: p(X < 120) = ?

Example 5

Finding scores corresponding to specific proportions or probabilities
We can also find an X value that corresponds to a specific proportion in the distribution. See next slide.

Example 6
What is the IQ score (X value) needed to be in the top 80% of the distribution?
*This formula is now needed: X = μ + zσ

Example 7
After an exam, you learn that the mean for the class is 60, with a standard deviation of 10.
• Suppose your exam score is 70. What is your Z-score? Where, relative to the mean, does your score lie? What is the probability associated with your score (use the Z table, Appendix B.1)?
• What if your score is 72? Calculate your Z-score. What percentage of students have a score below your score? Above? What percentile are you at?
• What if your mark is 55%? Calculate your Z-score. What percentage of students have a score below your score? Above? What percentile are you at?
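The two-step process (X to z-score, then z-score to proportion) and the reverse lookup can be sketched in code for the numbers in Examples 4 to 7. This is an illustration rather than the slides' own method: it assumes each distribution is normal, uses Python's standard-library `statistics.NormalDist` instead of Table B.1, and the function names are hypothetical. Because it does not round z to two decimals first, results can differ from table answers in the third or fourth decimal place.

```python
from statistics import NormalDist


def z_score(x, mean, sd):
    """Step 1: transform a raw score X into a z-score."""
    return (x - mean) / sd


def prob_below(x, mean, sd):
    """Step 2: proportion of a normal distribution below X."""
    return NormalDist(mean, sd).cdf(x)


def prob_above(x, mean, sd):
    """Proportion of a normal distribution above X."""
    return 1.0 - prob_below(x, mean, sd)


def score_for_top_proportion(p_top, mean, sd):
    """Reverse lookup: find the z-score with proportion p_top above it,
    then convert back to a raw score with X = mean + z * sd."""
    z = NormalDist().inv_cdf(1.0 - p_top)
    return mean + z * sd


# Example 4: Stacy ran 37 minutes; mean 36, SD 4.
race_z = z_score(37, 36, 4)            # 0.25
race_longer = prob_above(37, 36, 4)    # about 0.4013

# Example 5: IQ below 120; mean 100, SD 15.
iq_below_120 = prob_below(120, 100, 15)    # about 0.9088

# Example 6: IQ score with 80% of the distribution above it.
iq_top_80 = score_for_top_proportion(0.80, 100, 15)    # about 87.4

# Example 7: exam scores with mean 60 and SD 10.
exam = {
    score: (z_score(score, 60, 10), prob_below(score, 60, 10))
    for score in (70, 72, 55)
}

for score, (z, below) in exam.items():
    print(f"score {score}: z = {z:+.2f}, below = {below:.4f}, above = {1 - below:.4f}")
```

The proportion below a score, expressed as a percentage, is the percentile asked for in Example 7.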
Another Question…
What if you want to know how much better or worse you did than someone else? Suppose you have 72% and your classmate has 55%. How much better is your score?

Probability:
Let's say your classmate won't show you the mark… How can you make an informed guess about what your classmate's mark might be? What is the probability that your classmate has a mark between 60% (the mean) and 70% (1 SD above the mean)?

Credits
http://publish.uwo.ca/~pakvis/The%20Normal%20Curve.ppt
http://homepages.wmich.edu/~malavosi/Chapter6PPT_S_05.ppt