Standard Deviation Notes

An article on peanut butter reported the following scores (quality ratings on a
scale of 0 to 100) for various brands. Construct a comparative stem-and-leaf
plot and compare the graphs.
 Creamy: 56 65 40 44 45 50 62 40 56 36 56 30 39 68 22 53 41 50 30
 Crunchy: 62 62 80 53 52 47 75 50 56 42 34 62 47 42 40 36 34 75
Center: The center of creamy is roughly 45, whereas the center of crunchy is higher at 51.
Shape: Both are unimodal, but crunchy is skewed to the right while creamy is more symmetric.
Spread: The ranges for creamy and crunchy are equal at 46. There don’t seem to be any gaps in either distribution.
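A minimal Python sketch of the comparative (back-to-back) stem-and-leaf plot, using the tens digit as the stem and the ones digit as the leaf. The function names `stem_leaves` and `back_to_back` are made up for this illustration, and the centers are taken to be medians, which is an assumption about how the values above were found.

```python
from collections import defaultdict
from statistics import median

creamy = [56, 65, 40, 44, 45, 50, 62, 40, 56, 36, 56, 30, 39, 68, 22, 53, 41, 50, 30]
crunchy = [62, 62, 80, 53, 52, 47, 75, 50, 56, 42, 34, 62, 47, 42, 40, 36, 34, 75]

def stem_leaves(scores):
    """Group the ones digits (leaves) under their tens digits (stems)."""
    groups = defaultdict(list)
    for score in sorted(scores):
        groups[score // 10].append(score % 10)
    return groups

def back_to_back(left, right):
    """Return the plot lines: left leaves | stem | right leaves."""
    left_groups, right_groups = stem_leaves(left), stem_leaves(right)
    low = min(min(left), min(right)) // 10
    high = max(max(left), max(right)) // 10
    lines = []
    for stem in range(low, high + 1):
        # Left-hand leaves read outward from the stem, so reverse them.
        left_part = " ".join(str(d) for d in reversed(left_groups.get(stem, [])))
        right_part = " ".join(str(d) for d in right_groups.get(stem, []))
        lines.append(f"{left_part:>15} | {stem} | {right_part}")
    return lines

print(f"{'Creamy':>15} |   | Crunchy")
print("\n".join(back_to_back(creamy, crunchy)))
print("medians:", median(creamy), median(crunchy))
print("ranges:", max(creamy) - min(creamy), max(crunchy) - min(crunchy))
```

Reading the two sides against the shared stems makes the center, shape, and spread comparison easier to see than two separate plots.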
Variation
Which Brand of Paint is better? Why?
Brand A   Brand B
10        35
60        45
50        30
30        35
40        40
20        25
Standard Deviation
 It’s a measure of the typical or average deviation
(difference) from the mean.
Variance
 This is the average of the squared distance from
the mean.
Does the Average Help?
 Paint A: Avg = 210/6 = 35 months
 Paint B: Avg = 210/6 = 35 months
 On average, both last 35 months before fading.
The average is no help in deciding which to buy.
Consider the Spread
 Paint A: Spread = 60 – 10 = 50 months
 Paint B: Spread = 45 – 25 = 20 months
 Paint B has the smaller spread, which
means that it performs more consistently.
Choose paint B.
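The comparison can be checked with a short Python sketch. It assumes the paint values alternate Brand A, Brand B down the list, which is the reading that gives both totals of 210 and the ranges shown. The helper names `mean` and `value_range` are only for illustration.

```python
# Months before fading; the split into brands is an assumption (see above).
paint_a = [10, 60, 50, 30, 40, 20]
paint_b = [35, 45, 30, 35, 40, 25]

def mean(values):
    """Sum of the values divided by how many there are."""
    return sum(values) / len(values)

def value_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)

for name, months in (("A", paint_a), ("B", paint_b)):
    print(f"Paint {name}: mean = {mean(months)}, range = {value_range(months)}")
```

The means tie, so only the spread separates the two brands.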
Formulas for Variance and St. Deviation
Population
 Variance: $\sigma^2 = \frac{1}{N} \sum (x - \bar{x})^2$
 Standard Deviation: $\sigma_x = \sqrt{\frac{1}{N} \sum (x - \bar{x})^2}$
Sample
 Variance: $s^2 = \frac{1}{n-1} \sum (x - \bar{x})^2$
 Standard Deviation: $s_x = \sqrt{\frac{1}{n-1} \sum (x - \bar{x})^2}$
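For checking hand calculations, Python's standard `statistics` module has one function per formula: `pvariance` and `pstdev` divide by N, while `variance` and `stdev` divide by n - 1. The sketch below runs all four on the same six values (assumed here to be the Brand B paint data) purely to show how the two divisors differ. Whether a data set counts as a population or a sample depends on the problem.

```python
import statistics

# Assumed to be the Brand B paint data, in months.
data = [35, 45, 30, 35, 40, 25]

population_variance = statistics.pvariance(data)  # divides by N
population_sd = statistics.pstdev(data)
sample_variance = statistics.variance(data)       # divides by n - 1
sample_sd = statistics.stdev(data)

print("population:", population_variance, population_sd)
print("sample:    ", sample_variance, sample_sd)
```

Dividing by n - 1 always gives the larger result, and the gap shrinks as the number of values grows.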
Standard Deviation
 A more powerful approach to determining how much
individual data values vary.
 This is a measure of the average distance of the
observations from their mean.
 Like the mean, the standard deviation is appropriate only
for symmetric data!
 The use of squared deviations makes the standard
deviation even more sensitive than the mean to
outliers!
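A rough Python sketch of the outlier point. The six starting values are assumed to be the Brand B paint data, and the extra value of 120 is a made-up outlier added only for illustration.

```python
from statistics import mean, stdev

data = [35, 45, 30, 35, 40, 25]  # assumed Brand B paint data
with_outlier = data + [120]      # 120 is a hypothetical outlier

# How many times larger each statistic becomes once the outlier is added.
mean_ratio = mean(with_outlier) / mean(data)
sd_ratio = stdev(with_outlier) / stdev(data)

print(f"mean grows by a factor of {mean_ratio:.2f}")
print(f"standard deviation grows by a factor of {sd_ratio:.2f}")
```

The single extra value moves the standard deviation proportionally much more than it moves the mean.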
Standard Deviation
 One way to think about spread is to examine how far
each data value is from the mean.
 This difference is called a deviation.
 We could just average the deviations, but the positive
and negative differences always cancel each other
out! So, the average deviation is always 0, which is not very
helpful!
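A quick Python check of the canceling, using six values assumed to be the Brand A paint data (mean 35):

```python
data = [10, 60, 50, 30, 40, 20]  # assumed Brand A paint data
mean = sum(data) / len(data)

# Each deviation is the value minus the mean.
deviations = [x - mean for x in data]
average_deviation = sum(deviations) / len(deviations)

print("deviations:", deviations)
print("sum:", sum(deviations), "average:", average_deviation)
```

The values below the mean contribute negative deviations that exactly offset the positive ones.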
Finding Variance
 To keep them from canceling out, we square each
deviation.
 Squaring never gives a negative value, so the sum will not
be zero (unless every value equals the mean)!
 Squaring also emphasizes larger differences – a feature that
turns out to be good and bad.
 When we add up these squared deviations and find their
average (almost), we call the result the variance.
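The steps above can be sketched in Python. The data are assumed to be the Brand A paint values, and the "almost" average is taken to mean dividing by n - 1, as in the sample formula.

```python
data = [10, 60, 50, 30, 40, 20]  # assumed Brand A paint data
n = len(data)
mean = sum(data) / n

# Square each deviation so positives and negatives cannot cancel.
squared_deviations = [(x - mean) ** 2 for x in data]
total = sum(squared_deviations)

true_average = total / n          # divide by n
sample_variance = total / (n - 1) # the "almost" average

# Share of the total that comes from the two largest deviations.
largest_two_share = sum(sorted(squared_deviations)[-2:]) / total

print("squared deviations:", squared_deviations)
print("sum:", total, "sample variance:", sample_variance)
print(f"two largest deviations supply {largest_two_share:.0%} of the sum")
```

The last line shows the emphasis on larger differences: a couple of far-out values can supply most of the total.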
Finding Standard Deviation
 The variance is the average of the squared distances from the mean.
 Variance will play an important role later – but it has a
problem as a measure of spread.
 Whatever the units of the original data are, the variance is
in squared units – we want measures of spread to have the
same units as the data, so to get back to the original units,
we take the square root of $s^2$.
 The result, s, is the standard deviation.
Let’s look at the data again on the number of pets
owned by a group of 9 children.
1 3 4 4 4 5 7 8 9
Recall that the mean was 5 pets.
Let’s take a graphical look at the
“deviations” from the mean:
Let’s Find the Standard Deviation and Variance of the Data Set of Pets
Data: 1 3 4 4 4 5 7 8 9     Mean = 5
Pets $x$   Deviations $x - \bar{x}$   Squared Deviations $(x - \bar{x})^2$
1          1 - 5 = -4                 (-4)^2 = 16
3          3 - 5 = -2                 (-2)^2 = 4
4          4 - 5 = -1                 (-1)^2 = 1
4          4 - 5 = -1                 (-1)^2 = 1
4          4 - 5 = -1                 (-1)^2 = 1
5          5 - 5 = 0                  0^2 = 0
7          7 - 5 = 2                  2^2 = 4
8          8 - 5 = 3                  3^2 = 9
9          9 - 5 = 4                  4^2 = 16
Sum        0                          52
Find Variance:
$s^2 = \frac{1}{n-1} \sum (x - \bar{x})^2 = \frac{1}{8}(52) = \frac{52}{8} = 6.5$
This is the “average” squared deviation.
Find the Standard Deviation:
$s_x = \sqrt{\frac{1}{n-1} \sum (x - \bar{x})^2} = \sqrt{s^2} = \sqrt{6.5} \approx 2.55$
This 2.55 is roughly the average distance of the
values in the data set from the mean.
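The pets calculation can be reproduced step by step in Python. This is only a check of the arithmetic, and the variable names are illustrative.

```python
import math

pets = [1, 3, 4, 4, 4, 5, 7, 8, 9]
n = len(pets)
mean = sum(pets) / n

# One row per child: value, deviation, squared deviation.
rows = [(x, x - mean, (x - mean) ** 2) for x in pets]
sum_of_deviations = sum(dev for _, dev, _ in rows)
sum_of_squares = sum(sq for _, _, sq in rows)

variance = sum_of_squares / (n - 1)     # sample variance
standard_deviation = math.sqrt(variance)

for x, dev, sq in rows:
    print(f"{x:>2} {dev:>6} {sq:>6}")
print("sums:", sum_of_deviations, sum_of_squares)
print("variance:", variance, "standard deviation:", round(standard_deviation, 2))
```

The deviations sum to 0 and the squared deviations sum to 52, which gives a variance of 52/8 = 6.5.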
Find the Standard Deviation and Variance
Values: 14 13 20 22 18 19 13
Values   Deviations   Squared Deviations
14
13
20
22
18
19
13
$s^2 \approx 13.333$
$s \approx 3.65$
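One way to work the exercise with the sample formulas, written out as a check. The mean of the seven values is 17 and the squared deviations add up to 80:

```latex
\begin{align*}
\bar{x} &= \frac{14 + 13 + 20 + 22 + 18 + 19 + 13}{7} = \frac{119}{7} = 17 \\
\sum (x - \bar{x})^2 &= (-3)^2 + (-4)^2 + 3^2 + 5^2 + 1^2 + 2^2 + (-4)^2 = 80 \\
s^2 &= \frac{80}{7 - 1} \approx 13.333 \\
s &= \sqrt{s^2} \approx 3.65
\end{align*}
```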
Homework
 Worksheet