2.5 Normal Distributions and z-scores

Comparing marks
• Stephanie and Tavia are both in the running for the Data Management award. Stephanie has 94% and Tavia has 92%, from different classes.
• Who deserves the award?
• What if I told you it was Tavia?
• Why?
• Stephanie's class: mean 78, $\sigma = 9.36$
• Tavia's class: mean 73, $\sigma = 8.19$
[Figure: normal density curves for the two classes, with marks from 50 to 100 on the horizontal axis]
• Distributions are different
• Fair comparison not possible …yet

Standard Normal Distribution
• $X \sim N(0, 1^2)$
• Mean 0, standard deviation 1
• Can translate each element of a normal distribution to the standard normal distribution by finding the number of standard deviations a given score is away from the mean
– This process is called standardizing
[Figure: standard normal density curve, with x from -8 to 8 on the horizontal axis]

z-scores
• z = the number of standard deviations a given score x is above or below the mean
• $z = \frac{x - \bar{x}}{\sigma}$, where $\bar{x}$ is the mean and $\sigma$ is the standard deviation
– Positive z-score: value lies above the mean
– Negative z-score: value lies below the mean

Example 1: Calculating z-scores
• Consider the distribution $X \sim N(14, 4^2)$
• Find the number of standard deviations each piece of data lies above or below the mean.
• A) x = 11: $z = \frac{x - \bar{x}}{\sigma} = \frac{11 - 14}{4} = -0.75$
• B) x = 21.5: $z = \frac{x - \bar{x}}{\sigma} = \frac{21.5 - 14}{4} \approx 1.88$
• Note: z-scores are always rounded to 2 decimal places

Example 2: Comparing data using z-scores
• Stephanie and Tavia are both in the running for the Data Management award. Stephanie has 94% and Tavia has 92%. Stephanie's class has a mean of 78 and $\sigma = 9.36$; Tavia's class has a mean of 73 and $\sigma = 8.19$. Who deserves the award?
• Use z-scores:
– Stephanie: $z = \frac{x - \bar{x}}{\sigma} = \frac{94 - 78}{9.36} \approx 1.71$
– Tavia: $z = \frac{x - \bar{x}}{\sigma} = \frac{92 - 73}{8.19} \approx 2.32$
• Tavia's z-score is higher, so her result is better relative to her class.

z-Score Table
• Appendix B, pp. 398-399 of text
• Gives the proportion of data with a z-score equal to or less than a given value
• Example: $P(z < -2.34) = 0.0096$ (row -2.3, column 0.04)

   z      0.00     0.01     0.02     0.03     0.04
 -2.9   0.0019   0.0018   0.0018   0.0017   0.0016
 -2.8   0.0026   0.0025   0.0024   0.0023   0.0023
 -2.7   0.0035   0.0034   0.0033   0.0032   0.0031
 -2.6   0.0047   0.0045   0.0044   0.0043   0.0041
 -2.5   0.0062   0.0060   0.0059   0.0057   0.0055
 -2.4   0.0082   0.0080   0.0078   0.0075   0.0073
 -2.3   0.0107   0.0104   0.0102   0.0099   0.0096
 -2.2   0.0139   0.0136   0.0132   0.0129   0.0125

• Only 0.96% of the data has a lower z-score, and 1 - 0.0096 = 0.9904, so 99.04% of the data has a higher z-score

Note
• Notice the z-score table does not go above 2.99 or below -2.99
• Any value with a z-score above 3 or below -3 is considered an outlier
– If z > 2.99, treat the percent of data below it as approximately 100%
– If z < -2.99, treat the percent of data below it as approximately 0%
• If z = 0, $P(z < 0) = 50\%$
– The data point is the mean

Percentiles
• The kth percentile is the data value that is greater than k% of the population
• Example: z = 0.40
– $P(z < 0.40) = 0.6554$
– 65.54% of the data are below this data point. It is in the 66th percentile.
• Example: z = 1.67
– $P(z < 1.67) = 0.9525$
– 95.25% of the data are below this data point. It is in the 96th percentile.

Example 3: Finding Ranges
• Given $X \sim N(7, 2.2^2)$, find the percent of data that lies in the following intervals:
• A) 3 < x < 6
– For x = 3: $z = \frac{x - \bar{x}}{\sigma} = \frac{3 - 7}{2.2} \approx -1.82$, and $P(z < -1.82) = 0.0344$
– For x = 6: $z = \frac{x - \bar{x}}{\sigma} = \frac{6 - 7}{2.2} \approx -0.45$, and $P(z < -0.45) = 0.3264$
– $P(3 < x < 6) = P(-1.82 < z < -0.45) = 0.3264 - 0.0344 = 0.2920$
– So 29.20% of the data lies in this interval.
• B) x > 8.6
– For x = 8.6: $z = \frac{x - \bar{x}}{\sigma} = \frac{8.6 - 7}{2.2} \approx 0.73$
– $P(x < 8.6) = P(z < 0.73) = 0.7673$
– $P(x > 8.6) = 1 - P(x < 8.6) = 1 - 0.7673 = 0.2327$
– So 23.27% of the data lies above 8.6.
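
Checking the examples with code
The worked examples above can be checked with a short Python sketch. It is only an illustration, not part of the textbook method: the helper names (`z_score`, `table_lookup`, `percent_between`, `percent_above`) are made up for this example, and it assumes Python 3.8 or later so that `statistics.NormalDist` can stand in for the printed z-score table. z-scores are rounded to 2 decimal places and probabilities to 4, to mirror how the table is used.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
STANDARD_NORMAL = NormalDist(0, 1)


def z_score(x, mean, sd):
    """Standard deviations that x lies above (+) or below (-) the mean, to 2 places."""
    return round((x - mean) / sd, 2)


def table_lookup(z):
    """P(Z < z), rounded to 4 places like a printed z-score table (hypothetical helper)."""
    return round(STANDARD_NORMAL.cdf(z), 4)


def percent_between(low, high, mean, sd):
    """Proportion of data with low < x < high: subtract the two table values."""
    p_high = table_lookup(z_score(high, mean, sd))
    p_low = table_lookup(z_score(low, mean, sd))
    return round(p_high - p_low, 4)


def percent_above(x, mean, sd):
    """Proportion of data above x: 1 minus the table value."""
    return round(1 - table_lookup(z_score(x, mean, sd)), 4)


if __name__ == "__main__":
    # Example 2: compare the two marks relative to their own classes.
    print("Stephanie:", z_score(94, 78, 9.36))
    print("Tavia:    ", z_score(92, 73, 8.19))

    # Example 3: X ~ N(7, 2.2^2).
    print("P(3 < x < 6):", percent_between(3, 6, 7, 2.2))
    print("P(x > 8.6):  ", percent_above(8.6, 7, 2.2))
```

Rounding z before the lookup is a deliberate choice here: it keeps the results in step with the table-based answers, at the cost of a small difference from the unrounded probabilities.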