6.3 – Standard Deviation and Z-Scores
While interquartile range is an effective measure of spread, it is awkward to calculate and has limited
usefulness. A more useful measure of spread, from a mathematical point of view, is the standard
deviation. It is a complex measure of spread with some interesting properties.
Recall that deviation is the distance a particular piece of data is from the mean.
Variance is a measure of dispersion found by averaging the squared deviations of the data from the
mean.
Standard deviation is a measure of dispersion that is the square root of the variance.
$$\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$$
where 𝜎 is the standard deviation, 𝑥̅ is the mean, and 𝑛 is the number of data values.
What does this mean?
- The standard deviation is the square root of the average squared distance of the data from the
mean. The smaller the standard deviation, the more compact the data set.
- So, if most of the data is clustered around the mean, then the standard deviation will be small.
- If the data is widely scattered, the standard deviation will be large.
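The definitions above can be sketched in a few lines of Python. This is only an illustration: the function name `population_std_dev` and the two small data sets are made up for this sketch and are not part of the lesson.

```python
import math

def population_std_dev(data):
    """Standard deviation as defined above: divide by n, then take the square root."""
    n = len(data)
    mean = sum(data) / n
    # Squared deviation of each value from the mean
    squared_deviations = [(x - mean) ** 2 for x in data]
    variance = sum(squared_deviations) / n  # average squared deviation
    return math.sqrt(variance)

# Hypothetical data sets, both with a mean of 50
clustered = [49, 50, 50, 51]   # values close to the mean
scattered = [20, 40, 60, 80]   # values far from the mean

print(population_std_dev(clustered))
print(population_std_dev(scattered))
```

The clustered set gives a standard deviation below 1, while the scattered set gives one above 22, which matches the interpretation in the list above.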
Example 1: The following are test scores for two students. Who performed better?
Kyle: 73, 56, 92, 67, 88, 34, 77, 65
Janice: 69, 64, 73, 88, 67, 75, 61, 68
First, we calculate the mean for each student.
Kyle:
$$\bar{x} = \frac{34 + 56 + 65 + 67 + 73 + 77 + 88 + 92}{8} = \frac{552}{8} = 69$$
Janice:
$$\bar{x} = \frac{61 + 64 + 67 + 68 + 69 + 73 + 75 + 88}{8} = \frac{565}{8} \approx 70.6$$
Next, we will calculate the standard deviations of each student to see who was more consistent with
their marks.
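Kyle:
$$\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$$
$$\sigma = \sqrt{\frac{(34 - 69)^2 + (56 - 69)^2 + (65 - 69)^2 + (67 - 69)^2 + (73 - 69)^2 + (77 - 69)^2 + (88 - 69)^2 + (92 - 69)^2}{8}}$$
$$\sigma = \sqrt{\frac{2384}{8}}$$
$$\sigma \approx 17.26$$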
Mathematics of Data Management (MDM4UC)
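Janice:
$$\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$$
$$\sigma = \sqrt{\frac{(61 - 70.6)^2 + (64 - 70.6)^2 + (67 - 70.6)^2 + (68 - 70.6)^2 + (69 - 70.6)^2 + (73 - 70.6)^2 + (75 - 70.6)^2 + (88 - 70.6)^2}{8}}$$
$$\sigma = \sqrt{\frac{485.88}{8}}$$
$$\sigma \approx 7.79$$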
Kyle and Janice had averages that were about the same (69% vs. 70.6%). Kyle had the larger range of
values, and Janice’s lower standard deviation shows that she was more consistent in her grades.
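The results of Example 1 can be checked with Python's standard `statistics` module. This is a quick verification sketch, not part of the original lesson; `pstdev` is the population standard deviation, which divides by 𝑛 as the formula above does.

```python
import statistics

kyle = [73, 56, 92, 67, 88, 34, 77, 65]
janice = [69, 64, 73, 88, 67, 75, 61, 68]

for name, scores in (("Kyle", kyle), ("Janice", janice)):
    mean = statistics.mean(scores)
    sigma = statistics.pstdev(scores)  # population standard deviation (divides by n)
    print(f"{name}: mean = {mean:.1f}, sigma = {sigma:.2f}")

# Kyle: mean = 69.0, sigma = 17.26
# Janice: mean = 70.6, sigma = 7.79
```

Note that `statistics.stdev` (without the `p`) divides by 𝑛 − 1 instead and would give slightly larger values than the ones in this example.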
Sometimes, as with your major project, many data entries occur more than once. In this case, we use a
standard deviation that incorporates frequency into the formula:
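$$\sigma = \sqrt{\frac{\sum f(x - \bar{x})^2}{n}}$$
where 𝑓 is the frequency of each value 𝑥 and 𝑛 is the total of the frequencies.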
Example 2: Candies are packed into bags and sold to people to pass out at Hallowe’en. The number of
candies in each bag is close, but not always the same. The following table summarizes the number of
candies in a sample of bags.
Number of Candies    Frequency
45                   4
46                   12
47                   15
48                   14
49                   9
Calculate the mean and the standard deviation.
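The mean is a weighted mean, with the frequencies acting as the weights 𝑤:
$$\bar{x} = \frac{\sum xw}{\sum w} = \frac{4(45) + 12(46) + 15(47) + 14(48) + 9(49)}{4 + 12 + 15 + 14 + 9} = \frac{2550}{54} \approx 47.2$$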
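$$\sigma = \sqrt{\frac{\sum f(x - \bar{x})^2}{n}}$$
$$\sigma = \sqrt{\frac{4(45 - 47.2)^2 + 12(46 - 47.2)^2 + 15(47 - 47.2)^2 + 14(48 - 47.2)^2 + 9(49 - 47.2)^2}{54}}$$
$$\sigma \approx \sqrt{\frac{75.3}{54}}$$
$$\sigma \approx 1.18$$
(The sum 75.3 comes from the unrounded mean; with the rounded mean 47.2 it is 75.36. Either way, 𝜎
rounds to 1.18.)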
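As a check on Example 2, the frequency form of the formula can be sketched in Python. The dictionary name `frequencies` is chosen for this sketch only; it maps each number of candies to the number of bags with that count.

```python
import math

# Number of candies per bag -> number of bags (from the table in Example 2)
frequencies = {45: 4, 46: 12, 47: 15, 48: 14, 49: 9}

n = sum(frequencies.values())                          # total number of bags
mean = sum(x * f for x, f in frequencies.items()) / n  # frequency-weighted mean
variance = sum(f * (x - mean) ** 2 for x, f in frequencies.items()) / n
sigma = math.sqrt(variance)

print(n, round(mean, 1), round(sigma, 2))  # 54 47.2 1.18
```

Multiplying each squared deviation by its frequency gives the same result as listing all 54 values individually and applying the ungrouped formula.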
Z-Scores
A z-score indicates how many standard deviations a data value lies from the mean.
$$z = \frac{x - \bar{x}}{\sigma}$$
Example 3: The final percentages in a class of grade 12 students are as follows:
54, 88, 64, 63, 47, 64, 43, 83, 69, 31, 71, 77, 52, 52, 15, 85, 62, 72, 78, 68, 73, 53, 65
The information was entered into a spreadsheet and the following statistical analyses were determined:
𝑛 = 23
𝑥̅ = 62.1
𝜎 = 17
There are two brothers in the class: Bryan and Brayden. If they achieved marks of 88% and 54%
respectively, how many standard deviations from the mean is each of their marks?
Bryan:
$$z = \frac{x - \bar{x}}{\sigma} = \frac{88 - 62.1}{17} \approx 1.52$$
Bryan is 1.52 standard deviations above the mean.
Brayden:
$$z = \frac{x - \bar{x}}{\sigma} = \frac{54 - 62.1}{17} \approx -0.48$$
Brayden is 0.48 standard deviations below the mean.
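The z-scores in Example 3 can be reproduced from the raw marks with a short Python sketch. Two assumptions here are not from the lesson: the helper name `z_score` is made up, and the spreadsheet is assumed to have used the population standard deviation (`statistics.pstdev`). The code also uses the unrounded mean and standard deviation rather than 62.1 and 17.

```python
import statistics

marks = [54, 88, 64, 63, 47, 64, 43, 83, 69, 31, 71, 77,
         52, 52, 15, 85, 62, 72, 78, 68, 73, 53, 65]

def z_score(x, mean, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sigma

mean = statistics.mean(marks)     # about 62.1
sigma = statistics.pstdev(marks)  # about 17 (population standard deviation, an assumption)

print(f"Bryan:   z = {z_score(88, mean, sigma):.2f}")
print(f"Brayden: z = {z_score(54, mean, sigma):.2f}")
```

A positive z-score means the mark is above the mean and a negative one means it is below, so the sign carries the "above" or "below" from the written answers.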
Practice: (Page 286) #1, 2, 3, 4