Chapter 2 Notes
Describing Location in a Distribution

2.1 – Measures of Relative Standing and Density Curves

Where do I stand compared to others?

NOTATION:
              Mean                 Standard Deviation
Sample        $\bar{x}$            $S_x$ or $s$
Population    $\mu$ ("mu")         $\sigma$ ("sigma")

Note: Never use $\sigma_x$ on the calculator.

Sally wants to go to college. She recently took a really hard AP Stats test and got her score back. She asked several of her friends for their scores.

Chapter 1 Test Results
90 72 79 85 85 84 76 86 89
69 60 80 54 48 49 72 82 90

Stemplot of the scores:
4 | 8 9
5 | 4
6 | 0 9
7 | 2 2 6 9
8 | 0 2 4 5 5 6 9
9 | 0 0

Standardizing Values and Z-Scores
We want to see where a test score falls relative to the center of the data (the mean), measured in units of the data's variability (the standard deviation).

$z = \frac{x - \text{mean}}{\text{standard deviation}}$

This tells us how many standard deviations the test score is away from the mean.
• Standardized values have no units.
• z-scores measure the distance of each data value from the mean in standard deviations.
• A negative z-score tells us that the data value is below the mean, while a positive z-score tells us that the data value is above the mean.

Benefits of Standardizing
• Standardized values have been converted from their original units to the standard statistical unit of standard deviations from the mean.
• Thus, we can compare values that are measured on different scales, with different units, or from different populations.

Back to z-scores
• Standardizing data into z-scores shifts the data by subtracting the mean and rescales the values by dividing by their standard deviation.
  – Standardizing into z-scores does not change the shape of the distribution.
  – Standardizing into z-scores changes the center by making the mean 0.
  – Standardizing into z-scores changes the spread by making the standard deviation 1.

To standardize: Convert the mean to 0. Convert the standard deviation to 1.

When Is a z-score BIG?
• A z-score gives us an indication of how unusual a value is because it tells us how far it is from the mean.
• A data value that sits right at the mean has a z-score equal to 0.
• A z-score of 1 means the data value is 1 standard deviation above the mean.
• A z-score of –1 means the data value is 1 standard deviation below the mean.

Example #1
Find Sally's z-score and comment on how she did on the test.
90 72 79 85 85 84 76 86 89
69 60 80 54 48 49 72 82 90

$z = \frac{x - \text{mean}}{\text{standard deviation}} = \frac{80 - 75}{13.86} = 0.3607$

Her score is only 0.36 standard deviations above the mean test score, so she is slightly above average.

Example #2
The distribution of the duration of human pregnancies (i.e. the number of days between conception and birth) has been found to be approximately normal with mean $\mu = 266$ and standard deviation $\sigma = 16$.

a. What is the z-score for a human pregnancy of 266 days?
$z = \frac{x - \mu}{\sigma} = \frac{266 - 266}{16} = \frac{0}{16} = 0$
That's the mean!

b. What is the z-score for a human pregnancy of 250 days?
$z = \frac{x - \mu}{\sigma} = \frac{250 - 266}{16} = \frac{-16}{16} = -1$
One standard deviation below the mean.

Example #3
Adult female Dalmatians weigh an average of 50 pounds with a standard deviation of 3.3 pounds. Adult female Boxers weigh an average of 57.5 pounds with a standard deviation of 1.7 pounds. Mike owns an underweight Dalmatian and an underweight Boxer. The Dalmatian weighs 45 pounds and the Boxer weighs 52 pounds. Which dog is more underweight? Explain.

Dalmatian: $z = \frac{x - \mu}{\sigma} = \frac{45 - 50}{3.3} = \frac{-5}{3.3} = -1.515$
Boxer: $z = \frac{x - \mu}{\sigma} = \frac{52 - 57.5}{1.7} = \frac{-5.5}{1.7} = -3.235$

The Boxer is more underweight: its weight is VERY low compared to other Boxers.

Hmm…so how unusual is -3.235?
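The z-scores in Examples #1 and #3 can be checked with a few lines of Python. This is only a sketch: the helper name `z_score` is made up for illustration, and it assumes Sally's score is the 80 used in the worked solution.

```python
from statistics import mean, stdev

def z_score(x, center, spread):
    """Number of standard deviations that x lies from the center."""
    return (x - center) / spread

# Example #1: the 18 test scores from the notes.
scores = [90, 72, 79, 85, 85, 84, 76, 86, 89,
          69, 60, 80, 54, 48, 49, 72, 82, 90]
x_bar = mean(scores)   # sample mean
s = stdev(scores)      # sample standard deviation (divides by n - 1), like Sx
sally_z = z_score(80, x_bar, s)

# Example #3: compare two dogs measured against different breeds.
dalmatian_z = z_score(45, 50, 3.3)
boxer_z = z_score(52, 57.5, 1.7)

print(f"mean = {x_bar}, s = {s:.2f}, Sally z = {sally_z:.2f}")
print(f"Dalmatian z = {dalmatian_z:.3f}, Boxer z = {boxer_z:.3f}")
```

Because both dogs are converted to the same unit (standard deviations from their own breed's mean), the more negative z-score identifies the more underweight dog.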
Percentile: Percent of the observations at or below a value

Chebyshev's Inequality
In any distribution, the percent of observations falling within k standard deviations of the mean is at least

$100\left(1 - \frac{1}{k^2}\right)$

[Figure: a distribution with the mean marked and k standard deviations marked on either side]

Example #4
Use Chebyshev's inequality to find the minimum percent of observations that fall within:
a. One standard deviation of the mean
$100\left(1 - \frac{1}{1^2}\right) = 100(1 - 1) = 100(0) = 0\%$
b. Two standard deviations of the mean
$100\left(1 - \frac{1}{2^2}\right) = 100(1 - 0.25) = 100(0.75) = 75\%$
c. Three standard deviations of the mean
$100\left(1 - \frac{1}{3^2}\right) = 100(1 - 0.111) = 100(0.889) = 88.9\%$
d. Four standard deviations of the mean
$100\left(1 - \frac{1}{4^2}\right) = 100(1 - 0.0625) = 100(0.9375) = 93.75\%$

Density Curve:
• Mathematical model that describes a set of data.
• Always on or above the x-axis
• Total area under curve = 1

Example #5
Using the following uniform density curve, answer the questions.
[Figure: uniform density curve, a rectangle with base 8 and height 0.125]

a. Verify that this is a density curve.
A = bh
A = (8)(0.125)
A = 1

b. What is the probability that the random variable has a value less than 3?
A = bh
A = (3)(0.125)
A = 0.375

c. What is the probability that the random variable has a value between 3 and 5?
A = bh
A = (2)(0.125)
A = 0.25

d. What is the percentile for the variable that has a value of 6?
A = bh
A = (6)(0.125)
A = 0.75, so a value of 6 is at the 75th percentile.

e. What value for the variable is at the 25th percentile?
A = bh
0.25 = (b)(0.125)
2 = b

In a density curve:
• The mean, median, and quartiles can be located by eye.
• The mean, $\mu$, is the balance point of the curve, if it were made of solid material.
• The median, M, divides the area under the curve in half.
• The quartiles with the median divide the area under the curve into quarters.

The mean and the median are equal when the density curve is symmetric. The median is a measure of center that is resistant to skew and outliers. The mean is not.
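The Chebyshev bounds in Example #4 and the rectangle areas in Example #5 can be reproduced with a short Python sketch. The function names are illustrative only, and the sketch assumes the uniform curve runs from 0 to 8 with height 0.125, which is inferred from the worked areas because the figure is not reproduced in these notes.

```python
def chebyshev_min_percent(k):
    """Minimum percent of observations within k standard deviations of the mean."""
    if k <= 0:
        raise ValueError("k must be positive")
    return 100 * (1 - 1 / k**2)

def uniform_area(lo, hi, left=0.0, right=8.0, height=0.125):
    """Area under the assumed uniform density curve between lo and hi (A = bh)."""
    lo = max(lo, left)     # clip the interval to where the curve is nonzero
    hi = min(hi, right)
    return max(hi - lo, 0.0) * height

# Example #4: bounds for k = 1, 2, 3, 4
for k in (1, 2, 3, 4):
    print(k, round(chebyshev_min_percent(k), 2))

# Example #5: parts a through d
total_area = uniform_area(0, 8)      # should be 1 for a density curve
p_less_than_3 = uniform_area(0, 3)
p_3_to_5 = uniform_area(3, 5)
percentile_of_6 = uniform_area(0, 6)

# Part e: solve 0.25 = b * 0.125 for the base b
value_at_25th = 0.25 / 0.125
print(total_area, p_less_than_3, p_3_to_5, percentile_of_6, value_at_25th)
```

Note that Chebyshev's bound holds for any distribution, which is why it is so much weaker than what a specific curve such as the uniform one gives.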
[Figure: three density curves, each with its mean and median marked]

Example #6
A group of 78 third-grade students in a Midwestern elementary school took a "self-concept" test that measured how well they felt about themselves. Higher scores indicate more positive self-concepts. A histogram of these students' self-concept scores is given below. Draw an appropriate density curve for summarizing the histogram on the graph. How would you describe the shape of this density curve?

[Histogram: Frequency (0 to 20) versus SelfConcept score (15 to 85)]

Answer: Skewed left.

Example #7
Label A, B, and C as either the mean, median, or mode for each picture.

Picture 1: A = mode, B = median, C = mean
Picture 2: A = mean, median, and mode; B = n/a; C = n/a
Picture 3: A = mean, B = median, C = mode

Example #8
For the density curve below, which of the following is true?
[Figure: density curve]
a. The mean and median are equal.
b. The mean is greater than the median.
c. The mean is less than the median.
d. The mean could be either greater than or less than the median.
e. The mean is 0.5.

2.2 – Normal Distributions

"Think of how stupid the average person is, and realize half of them are stupider than that."

Normal Curves: Density curves that are symmetric, unimodal, and bell-shaped.

A Normal curve is used to compare information from different populations or to find the percentile of a certain value.

$N(\mu, \sigma)$ where $\mu$ = mean and $\sigma$ = standard deviation of the population.

The mean locates the center of the curve, and the standard deviation measures the spread about the mean.
The NORMAL (or BELL-SHAPED) DISTRIBUTION describes many different data sets, like:
• Scores on a midterm exam
• Weights of people
• Calorie consumption per day
• Lengths of pregnancies
• Heights of people
• IQ
• The number of M&M's in a 1 lb bag

The 68-95-99.7 Rule (Empirical Rule):
In the normal distribution with mean $\mu$ and standard deviation $\sigma$:
• 68% of the observations fall within $\sigma$ of the mean $\mu$, or $\mu \pm \sigma$
• 95% of the observations fall within $2\sigma$ of $\mu$, or $\mu \pm 2\sigma$
• 99.7% of the observations fall within $3\sigma$ of $\mu$, or $\mu \pm 3\sigma$

Example #5
Find the area between each of the deviations. Why are these not percentiles?

So, how do you know if it is normal?

Assessing Normality: Looks can be deceiving!

Method #1: Make a histogram
1. Find the mean and sd
2. Measure the intervals for 1, 2, and 3 sd
3. Count how many observations fall between these standard deviations
4. Compare to the 68-95-99.7 Rule

Remember the Dalmatians?

Example #9: Here is a sample of 25 Dalmatians from a local vet's office. Determine if their weights follow a normal distribution.

53 48 52 51 52 46 45 47 48 52 51 55 51
48 46 47 54 51 51 46 55 56 50 44 49

Mean = 49.92 lbs
S.D. = 3.34 lbs

Interval (lbs)       Count
39.90 to 43.24       0
43.24 to 46.58       5
46.58 to 49.92       6
49.92 to 53.26       10
53.26 to 56.60       4
56.60 to 59.94       0

Within 1 sd of the mean: 16 of 25 = 64%
Within 2 sd of the mean: 25 of 25 = 100%

Gosh, tough call!

Method #2: Normal Probability Plot
1. Plug data into a list
2. Make a Normal Probability Plot
3. If the data are normal, the plot will be close to a straight line

[Figure: example normal probability plots (z-score versus data) for Normal, Left Skew, Right Skew, and Bimodal data]

Calculator Tip: Normal Probability Plot
Statplot – Normal Probability Plot

Back to the dogs…

Example #10: Make a normal probability plot of the weights of Dalmatians.

Example #11
The plot shown at the right is a Normal probability plot for a set of data. The data value is plotted on the x axis, and the standardized value is plotted on the y axis. Which statement is true for these data?
(a) The data are clearly Normally distributed.
(b) The data are approximately Normally distributed.
(c) The data are clearly skewed to the left.
(d) The data are clearly skewed to the right.
(e) There is insufficient information to determine the shape of the distribution.

Day 2 of 2.2!

The Standard Normal Curve:
The requirement is that the data must be normally distributed. We will use TABLE A. (In the text, you have TABLE A in the front 2 pages of the book.)

How can you find me?

Example #12
Let's use TABLE A to find the following:
Draw the picture first, shade the region you want, and look up the z-score in Table A to find the proportion (probability, percentage) to the left of that z-score. The proportion is also the probability that the value of a particular member of the population will fall in the given interval. Each picture is a standard Normal curve with $\mu = 0$ and $\sigma = 1$.

a. $P(Z < -2.20) = 0.0139$

b. $P(Z > 2.20) = 1 - P(Z < 2.20) = 1 - 0.9861 = 0.0139$

c. $P(Z > -0.95) = 1 - P(Z < -0.95) = 1 - 0.1711 = 0.8289$

d. $P(Z < 1.25) = 0.8944$

e. $P(-1.04 < Z < 3.01) = P(Z < 3.01) - P(Z < -1.04) = 0.9987 - 0.1492 = 0.8495$

f. $P(0.15 < Z < 1.41) = P(Z < 1.41) - P(Z < 0.15) = 0.9207 - 0.5596 = 0.3611$

Calculator Tip: Catalog Help
Apps – CtlgHelp – Enter – Enter

Calculator Tip: Z-score Probabilities
2nd Dist – normalcdf(lower bound, upper bound, mean, sd)

Example #13
Suppose that the fuel efficiency (in miles per gallon) of a Beetle varies with each tank of gas according to a normal distribution with mean $\mu = 34$ and standard deviation $\sigma = 3.5$ miles per gallon.

a. What proportion of all tanks would get 29 miles per gallon or less?
$z = \frac{x - \mu}{\sigma} = \frac{29 - 34}{3.5} = \frac{-5}{3.5} \approx -1.43$
$P(Z < -1.43) = 0.0764$

b. What proportion of all tanks would get 40 miles per gallon or more?
$z = \frac{x - \mu}{\sigma} = \frac{40 - 34}{3.5} = \frac{6}{3.5} \approx 1.71$
$P(Z > 1.71) = 1 - P(Z < 1.71) = 1 - 0.9564 = 0.0436$

c. What proportion of all tanks would get between 27 and 42 miles per gallon?
$z = \frac{27 - 34}{3.5} = \frac{-7}{3.5} = -2$
$z = \frac{42 - 34}{3.5} = \frac{8}{3.5} \approx 2.29$
$P(-2 < Z < 2.29) = P(Z < 2.29) - P(Z < -2) = 0.9890 - 0.0228 = 0.9662$

d. What proportion of all tanks would get 47 miles per gallon or more? Less than 47 miles per gallon?
$z = \frac{47 - 34}{3.5} = \frac{13}{3.5} \approx 3.71$
$P(Z < 3.71) \approx 1$
$P(Z > 3.71) \approx 0$

Example #14:
Golf courses have a wide range of difficulty. Similarly, players differ in ability. In order to adjust for variations between players, they are often assigned a handicap score. To adjust for variations between courses, a handicapper decides to compare the golfer's score against the data from the course. Suppose course A plays at a mean score of 76 with a standard deviation of 8 strokes, with a normal distribution of scores. The mean score for course B is 80 with a standard deviation of 6 strokes, and the scores are normally distributed. If a golfer regularly shoots an 80 on course A, what should be the comparable score on course B?

Course A: $\mu = 76$, $\sigma = 8$, $x = 80$
Course B: $\mu = 80$, $\sigma = 6$, $x = ?$

Course A: $z = \frac{x - \mu}{\sigma} = \frac{80 - 76}{8} = \frac{4}{8} = 0.5$
Course B: $z = \frac{x - \mu}{\sigma}$
$0.5 = \frac{x - 80}{6}$
$3 = x - 80$
$83 = x$

What if I know the percentile, but not X?
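The Example #13 and Example #14 calculations above can be cross-checked in Python with `statistics.NormalDist` from the standard library (Python 3.8 or later), which plays the same role as normalcdf on the calculator. This is a sketch, not the method used in class; the variable names are illustrative. Because the software does not round z to two decimal places, its answers can differ from the Table A answers in the third or fourth decimal place.

```python
from statistics import NormalDist

# Example #13: fuel efficiency ~ N(34, 3.5)
mpg = NormalDist(mu=34, sigma=3.5)

p_29_or_less = mpg.cdf(29)                # area to the left of 29
p_40_or_more = 1 - mpg.cdf(40)            # area to the right of 40
p_27_to_42 = mpg.cdf(42) - mpg.cdf(27)    # area between 27 and 42
p_47_or_more = 1 - mpg.cdf(47)            # essentially 0
p_less_than_47 = mpg.cdf(47)              # essentially 1

print(f"{p_29_or_less:.4f} {p_40_or_more:.4f} {p_27_to_42:.4f}")
print(f"{p_47_or_more:.4f} {p_less_than_47:.4f}")

# Example #14: match z-scores across two golf courses.
z_course_a = (80 - 76) / 8                # standardize on course A
score_course_b = 80 + z_course_a * 6      # un-standardize on course B
print(z_course_a, score_course_b)
```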
Inverse Normal Probability Calculations: Un-standardizing

Example #15: What value(s) of Z cut off the region described?
Each picture is a standard Normal curve with $\mu = 0$ and $\sigma = 1$.

a. The lowest 11%
Z = -1.23

b. The highest 30%
Z = 0.52

c. The highest 7%
Z = 1.48

d. The middle 50% (from the 25th to the 75th percentile)
Z = 0.67 and Z = -0.67

Want Z?
Calculator Tip: Z-score given probabilities
2nd Dist – invNorm(%)

Want X?
Calculator Tip: X-value given probabilities
2nd Dist – invNorm(%, mean, sd)

Steps for the Inverse Probability Calculation:
• Draw a picture
• Identify the z-value from the given value of the proportion – look up the proportion in the MIDDLE of Table A.
• Solve for x:
$z = \frac{x - \mu}{\sigma}$, so $x = z\sigma + \mu$

Example #16
A British company called Molebegon removes unwanted moles from gardens. In 1995, the European Union announced that the tiny moles are just too difficult to catch. They will not attempt to catch the smallest 10%. Molebegon's past records indicate that weights of moles are normally distributed with a mean of 150 grams and a standard deviation of 32 grams. What is the cutoff weight for the moles they will catch?

$\mu = 150$, $\sigma = 32$, area to the left = 0.10, so $z = -1.28$
$x = z\sigma + \mu$
$x = (-1.28)(32) + 150$
$x = -40.96 + 150$
$x = 109.04$ grams

Example #17
ACT versus SAT, I
There are two major tests of readiness for college, the ACT and the SAT. ACT scores are reported on a scale from 1 to 36. The distribution of ACT scores in recent years has been roughly Normal with mean $\mu = 20.9$ and standard deviation $\sigma = 4.8$. SAT scores (prior to 2005) were reported on a scale from 400 to 1600. SAT scores have been roughly Normal with mean $\mu = 1026$ and standard deviation $\sigma = 209$. The following exercises are based on this information.

a. Jose scores 1287 on the SAT. Assuming that both tests measure the same thing, what score on the ACT is equivalent to Jose's SAT score? Explain.

SAT: $\mu = 1026$, $\sigma = 209$, $x = 1287$
ACT: $\mu = 20.9$, $\sigma = 4.8$, $x = ?$

SAT: $z = \frac{x - \mu}{\sigma} = \frac{1287 - 1026}{209} = \frac{261}{209} \approx 1.2488$ (about 1.25)
ACT: $z = \frac{x - \mu}{\sigma}$
$1.2488 = \frac{x - 20.9}{4.8}$
$5.99 = x - 20.9$
$26.89 = x$

b. Reports on a student's ACT or SAT usually give the percentile as well as the actual score. Tonya scores 1318 on the SAT. What is her percentile? Show your method.

$z = \frac{x - \mu}{\sigma} = \frac{1318 - 1026}{209} = 1.40$
$P(Z < 1.40) = 0.9192$
The 91st percentile

c. The quartiles of any distribution are the values with cumulative proportions 0.25 and 0.75. What are the quartiles of the distribution of ACT scores? Show your method.

First quartile (25% of the area to the left, $z = -0.67$):
$x = z\sigma + \mu$
$x = (-0.67)(4.8) + 20.9$
$x = -3.216 + 20.9$
$x = 17.684$

Third quartile (25% of the area to the right, $z = 0.67$):
$x = z\sigma + \mu$
$x = (0.67)(4.8) + 20.9$
$x = 3.216 + 20.9$
$x = 24.116$
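The inverse calculations in Examples #15 through #17 can be checked in Python with `NormalDist.inv_cdf` from the standard library (Python 3.8 or later), which behaves like invNorm on the calculator: it takes the area to the LEFT of the cutoff. This is a sketch with illustrative variable names; its answers differ slightly from the worked solutions because Table A rounds z to two decimal places.

```python
from statistics import NormalDist

standard = NormalDist()   # standard Normal: mean 0, sd 1

# Example #15: always convert the region to an area to the left first.
z_lowest_11 = standard.inv_cdf(0.11)        # lowest 11%
z_highest_30 = standard.inv_cdf(1 - 0.30)   # highest 30% -> 70% to the left
z_highest_7 = standard.inv_cdf(1 - 0.07)    # highest 7%  -> 93% to the left
z_q1 = standard.inv_cdf(0.25)               # middle 50% runs from the
z_q3 = standard.inv_cdf(0.75)               # 25th to the 75th percentile

# Example #16: cutoff weight for the smallest 10% of moles.
mole_cutoff = NormalDist(mu=150, sigma=32).inv_cdf(0.10)

# Example #17c: quartiles of ACT scores.
act = NormalDist(mu=20.9, sigma=4.8)
act_q1 = act.inv_cdf(0.25)
act_q3 = act.inv_cdf(0.75)

print(f"{z_lowest_11:.2f} {z_highest_30:.2f} {z_highest_7:.2f} {z_q1:.2f} {z_q3:.2f}")
print(f"{mole_cutoff:.1f} {act_q1:.1f} {act_q3:.1f}")
```

Passing the mean and standard deviation to `NormalDist` does the un-standardizing step $x = z\sigma + \mu$ automatically, just as invNorm(%, mean, sd) does.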