The Standard Deviation as a Ruler

A student got a 67/75 on the first exam and a 64/75 on the second exam.
She was disappointed that she did not score as well on the second exam.
To her surprise, the professor said she actually did better on the second exam, relative to the rest of the class.

How can this be?
Both exams exhibit variation in the scores.
However, that variation may be different from one exam to the next.
The standard deviation provides a ruler for comparing the two exam scores.

Summarizing Exam Scores

Exam 1
– Score: 67
– Mean: $\bar{y} = 59.5$
– Standard Deviation: $s = 8.61$
Exam 2
– Score: 64
– Mean: $\bar{y} = 50.1$
– Standard Deviation: $s = 11.86$

Standardizing

Look at the number of standard deviations the score is from the mean.
$z = \frac{y - \bar{y}}{s}$

Standardized Exam Scores

Exam 1
– Score: 67
– $z = \frac{67 - 59.5}{8.61} = 0.87$
Exam 2
– Score: 64
– $z = \frac{64 - 50.1}{11.86} = 1.17$

On exam 1, the 67 was 0.87 standard deviations better than the mean.
On exam 2, the 64 was 1.17 standard deviations better than the mean.

Standardizing

Shifts the distribution by subtracting off the mean.
Rescales the distribution by dividing by the standard deviation.

Distribution of Low Temps

[Histogram: Count versus Low Temperature (°F)]

Shifting the Distribution

[Histogram: Count versus Low Temperature – 32 (°F)]

Shifting

            Temperature (°F)    Temp – 32 (°F)
Median      24.0 °F             -8 °F
Mean        24.4 °F             -7.6 °F
IQR         16.0 °F             16.0 °F
Std Dev     11.22 °F            11.22 °F

When adding (or subtracting) a constant:
– Measures of position and center increase (or decrease) by that constant.
– Measures of spread do not change.
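The exam comparison can be checked with a few lines of Python. This is only a sketch: the function name `z_score` and the variable names are illustrative choices, and the numbers are the exam summaries given in the slides.

```python
def z_score(y, y_bar, s):
    """Number of standard deviations that the value y lies from the mean y_bar."""
    return (y - y_bar) / s

# Exam summaries as stated in the slides: score, class mean, standard deviation.
z_exam1 = z_score(67, 59.5, 8.61)
z_exam2 = z_score(64, 50.1, 11.86)

print(round(z_exam1, 2), round(z_exam2, 2))  # 0.87 1.17

# The lower raw score is the better result relative to the class.
print(z_exam2 > z_exam1)  # True
```

Measured in standard deviations, the 64 on exam 2 sits farther above its class mean than the 67 on exam 1.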
Rescaling

[Histogram: Count versus Low Temperature (°C)]

Converting to °C multiplies Temp – 32 by 5/9.

            Temp – 32 (°F)    Temperature (°C)
Median      -8 °F             -4.4 °C
Mean        -7.6 °F           -4.2 °C
IQR         16.0 °F           8.9 °C
Std Dev     11.22 °F          6.24 °C

When multiplying (or dividing) by a constant:
– All measures of position, center and spread are multiplied (or divided) by that constant.

Standardizing

Standardizing does not change the shape of the distribution.
Standardizing changes the center by making the mean 0.
Standardizing changes the spread by making the standard deviation 1.

Normal Models

Our conceptualization of what the distribution of an entire population of values would look like.
Characterized by population parameters: μ and σ.

[Histogram: Percent versus Height]

Describe the sample

Shape is symmetric and mounded in the middle.
Centered at 60 inches.
Spread between 45 and 75 inches.
30% of the sample is between 60 and 65 inches.

Normal Models

Our conceptualization of what the distribution of an entire population of values would look like.
Characterized by a bell-shaped curve with population parameters
– Population mean = μ
– Population standard deviation = σ

Sample Data

[Density histogram: Density versus Height]

Normal Model

[Normal density curve: Density versus Height (inches)]

Normal Model

Population – all items of interest. Example: All children age 5 to 19. Variable: Height
Sample – a few items from the population. Example: 550 children.

[Two panels: Normal density curve for the population, Density versus Height (inches); density histogram for the sample, Density versus Height]

Normal Model

Height
Center:
– Population mean, μ = 60 in.
Spread:
– Population standard deviation, σ = 6 in.
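The shifting, rescaling, and standardizing rules from the earlier slides can be verified numerically. The sketch below uses Python's standard `statistics` module. The list `temps_f` is a small made-up data set, not the low-temperature data summarized in the slides, so the printed values will not match the slide tables.

```python
from statistics import mean, median, stdev

# Hypothetical low temperatures in °F, invented purely for illustration.
temps_f = [5, 14, 23, 24, 32, 41, 50]

# Shifting: subtract a constant from every value.
shifted = [t - 32 for t in temps_f]

# Rescaling: multiply every shifted value by a constant (5/9 gives °C).
celsius = [s * 5 / 9 for s in shifted]

# Standardizing: shift by the mean, then rescale by the standard deviation.
y_bar, s = mean(temps_f), stdev(temps_f)
z = [(t - y_bar) / s for t in temps_f]

# Center moves by the constant; spread is unchanged.
print(mean(temps_f), mean(shifted), median(temps_f), median(shifted))
print(stdev(temps_f), stdev(shifted))

# Center and spread are both multiplied by 5/9.
print(mean(celsius), stdev(celsius))

# Standardized values have mean 0 and standard deviation 1 (up to rounding).
print(mean(z), stdev(z))
```

Any other data set could be substituted for `temps_f`; the relationships between the summaries hold regardless of the values.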
68-95-99.7 Rule

For Normal Models
– 68% of the values fall within 1 standard deviation of the mean.
– 95% of the values fall within 2 standard deviations of the mean.
– 99.7% of the values fall within 3 standard deviations of the mean.

Normal Model - Height

68% of the values fall between 60 – 6 = 54 and 60 + 6 = 66.
95% of the values fall between 60 – 12 = 48 and 60 + 12 = 72.
99.7% of the values fall between 60 – 18 = 42 and 60 + 18 = 78.

From Heights to Percentages

What percentage of heights fall above 70 inches?
Draw a picture.
How far away from the mean is 70 in terms of number of standard deviations?

Normal Model

[Normal density curve: Density versus Height (inches), with the question "Shaded area?"]

Standardizing

$z = \frac{y - \mu}{\sigma} = \frac{70 - 60}{6} = 1.67$

Standard Normal Model

Table Z: Areas under the standard Normal curve in the back of your text.
Online: http://davidmlane.com/hyperstat/z_table.html

From Percentages to Heights

What height corresponds to the 75th percentile?
Draw a picture.
The 75th percentile is how many standard deviations away from the mean?

Normal Model

[Normal density curve: Density versus Height (inches), split into areas of 25%, 50%, and 25%]

Standard Normal Model

Use Table Z again (in the text or online, as above) to find the z-score for the 75th percentile.

Reverse Standardizing

$z = \frac{y - \mu}{\sigma}$
$0.67 = \frac{y - 60}{6}$
$y = 6 \times 0.67 + 60 = 64.02$

Do Data Come from a Normal Model?

The histogram should be mounded in the middle and symmetric.
The data plotted on a normal probability (quantile) plot should follow a diagonal line.
– The normal quantile plot is an option in JMP: Analyze – Distribution.

Octane ratings – 40 gallons of gasoline taken from randomly selected gas stations.
Amplifier gain – the amount (decibels) an amplifier increases the signal.
Height – 550 children age 5 to 19.
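The height calculations (the 68-95-99.7 rule, the share above 70 inches, and the 75th percentile) can also be done in software instead of Table Z. The sketch below assumes the Normal model from the slides, μ = 60 and σ = 6, and uses `NormalDist` from Python's standard `statistics` module. The variable names are illustrative. Because the software does not round z to two decimals, results can differ slightly from the table-based answers.

```python
from statistics import NormalDist

# Normal model for height from the slides: mu = 60 in., sigma = 6 in.
height = NormalDist(mu=60, sigma=6)

# 68-95-99.7 rule: share of values within k standard deviations of the mean.
within = {k: height.cdf(60 + 6 * k) - height.cdf(60 - 6 * k) for k in (1, 2, 3)}
for k, share in within.items():
    print(k, round(100 * share, 1))

# From heights to percentages: area above 70 inches.
z_70 = (70 - 60) / 6
above_70 = 1 - height.cdf(70)
print(round(z_70, 2), above_70)

# From percentages to heights: the 75th percentile.
z_75 = NormalDist().inv_cdf(0.75)   # standard Normal z-score
p75 = height.inv_cdf(0.75)          # same as 60 + 6 * z_75
print(round(z_75, 2), p75)
```

The 75th percentile comes out slightly above the 64.02 in the slides because the slides round z to 0.67 before reverse standardizing.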
[Normal quantile plot and histogram: Count versus Octane Rating]

[Normal quantile plot and histogram: Count versus Amplifier Gain (dB)]

[Normal quantile plot and histogram: Count versus a variable ranging from 45 to 75 (axis label not recoverable)]

Nearly normal?

Is the histogram basically symmetric and mounded in the middle?
Do the points on the Normal Quantile plot fall close to the red diagonal (Normal model) line?
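For readers without JMP, the idea behind a normal quantile plot can be sketched in Python. This is not JMP's implementation. The plotting position `(i - 0.5) / n` is one common convention assumed here, the helper names are hypothetical, and the data are simulated from a Normal model rather than taken from the 550 children in the slides.

```python
from statistics import NormalDist, mean

def normal_quantile_points(data):
    """Pair each sorted value with the standard Normal quantile for its rank."""
    ordered = sorted(data)
    n = len(ordered)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # (i - 0.5) / n is an assumed plotting position; other conventions exist.
    quantiles = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(quantiles, ordered))

def correlation(pairs):
    """Pearson correlation of the (x, y) pairs; near 1 means nearly a straight line."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Simulated heights (NOT the slide data), drawn from Normal(60, 6).
simulated = NormalDist(60, 6).samples(550, seed=1)
points = normal_quantile_points(simulated)
r = correlation(points)
print(len(points), r)
```

Plotting the pairs in `points` gives the quantile plot. For data that follow a Normal model the points fall close to a straight line, so `r` is close to 1.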