Lesson 2 - Standard Deviation and Variance

An article on peanut butter reported the following scores (quality ratings on a
scale of 0 to 100) for various brands. Construct a comparative stem-and-leaf
plot and compare the graphs.
 Creamy:
56, 65, 40, 44, 45, 50, 62, 40, 56, 36, 56, 30, 39, 68, 22, 53, 41, 50, 30
 Crunchy:
62, 62, 80, 53, 52, 47, 75, 50, 56, 42, 34, 62, 47, 42, 40, 36, 34, 75
Center: The center of creamy is roughly 45, whereas the center for crunchy is higher at 51.
Shape: Both are unimodal, but crunchy is skewed to the right while creamy is more symmetric.
Spread: The ranges for creamy and crunchy are equal at 46. There do not seem to be any gaps in either distribution.
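The center and spread figures quoted here can be checked with a short script. This is a minimal sketch that assumes "center" means the median, which matches the values 45 and 51 given above; the helper name center_and_range is made up for illustration.

```python
from statistics import median

creamy = [56, 65, 40, 44, 45, 50, 62, 40, 56, 36, 56, 30, 39, 68, 22, 53, 41, 50, 30]
crunchy = [62, 62, 80, 53, 52, 47, 75, 50, 56, 42, 34, 62, 47, 42, 40, 36, 34, 75]

def center_and_range(scores):
    """Return (median, range) for a list of scores (hypothetical helper)."""
    return median(scores), max(scores) - min(scores)

creamy_center, creamy_range = center_and_range(creamy)
crunchy_center, crunchy_range = center_and_range(crunchy)

print("Creamy:  center", creamy_center, "range", creamy_range)
print("Crunchy: center", crunchy_center, "range", crunchy_range)
```

The two ranges come out equal, which is why range alone cannot separate the two brands and a finer measure of spread is useful.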
Variation
Objective
Calculate standard deviation and variance.
Relevance
To be able to analyze a set of data and see how far each value is above or below the mean of the set.
Which Brand of Paint is better? Why?
Brand A    Brand B
10         35
60         45
50         30
30         35
40         40
20         25
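One way to start on this question is to compare the mean and the range of each brand. The sketch below assumes the values above are read as (Brand A, Brand B) pairs, row by row; that reading is an assumption about the original table layout.

```python
from statistics import mean

# Assumption: the listed values alternate Brand A, Brand B, row by row.
brand_a = [10, 60, 50, 30, 40, 20]
brand_b = [35, 45, 30, 35, 40, 25]

for name, values in (("Brand A", brand_a), ("Brand B", brand_b)):
    spread = max(values) - min(values)  # range = largest value minus smallest
    print(name, "mean:", mean(values), "range:", spread)
```

Under that reading both brands have the same mean, so the mean alone cannot decide between them; the difference shows up only in how widely the values vary.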
Standard Deviation
 This is a measure of the average distance of the observations from their mean.
Variance
 This is the average of the squared distance from the mean.
Formulas for Variance and Standard Deviation

Formula for Population
Variance: $\sigma^2 = \frac{1}{N}\sum (x - \mu)^2$
Standard Deviation: $\sigma = \sqrt{\frac{1}{N}\sum (x - \mu)^2}$

Formula for Sample
Variance: $s^2 = \frac{1}{n-1}\sum (x - \bar{x})^2$
Standard Deviation: $s = \sqrt{\frac{1}{n-1}\sum (x - \bar{x})^2}$
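The difference between the population and sample formulas is only the divisor (N versus n - 1). The following is a minimal sketch of both; the function names and the data values are made up for illustration, and the results are compared against Python's standard statistics module.

```python
import math
import statistics

def population_variance(values):
    """Average squared deviation from the mean, dividing by N."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    x_bar = sum(values) / len(values)
    return sum((x - x_bar) ** 2 for x in values) / (len(values) - 1)

# Hypothetical data, invented for this example only.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print("population variance:", population_variance(data))
print("population std dev: ", math.sqrt(population_variance(data)))
print("sample variance:    ", sample_variance(data))
print("sample std dev:     ", math.sqrt(sample_variance(data)))

# Cross-check against the standard library.
print(math.isclose(population_variance(data), statistics.pvariance(data)))
print(math.isclose(sample_variance(data), statistics.variance(data)))
```

Dividing by n - 1 always gives a slightly larger result than dividing by N for the same data.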
Standard Deviation Facts
 This is a measure of the average distance of the observations from their mean.
 Like the mean, the standard deviation is appropriate only for symmetric data!
 The use of squared deviations makes the standard deviation even more sensitive than the mean to outliers! (Affected by extreme values)
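The effect of an extreme value can be seen by adding one outlier to a small data set. The numbers below are invented purely for illustration, so this is a sketch of the idea rather than a general proof.

```python
from statistics import mean, stdev

# Hypothetical values, invented for illustration only.
values = [4, 5, 5, 6, 5]
with_outlier = values + [25]  # the same data plus one extreme value

# How many times larger each statistic becomes once the outlier is added.
mean_ratio = mean(with_outlier) / mean(values)
sd_ratio = stdev(with_outlier) / stdev(values)

print("mean grows by a factor of", round(mean_ratio, 2))
print("standard deviation grows by a factor of", round(sd_ratio, 2))
```

In this example the standard deviation changes by a much larger factor than the mean does.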
Standard Deviation Facts
 One way to think about spread is to examine how far each data value is from the mean.
 This difference is called a deviation.
 We could just average the deviations, but the positive and negative differences always cancel each other out! So, the average deviation is always 0, which is not very helpful!
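The cancellation is easy to see numerically. This short sketch uses the pets data that appears later in this lesson (mean of 5 pets).

```python
pets = [1, 3, 4, 4, 4, 5, 7, 8, 9]

mean = sum(pets) / len(pets)
deviations = [x - mean for x in pets]  # each value minus the mean

print("deviations:", deviations)
print("sum of deviations:", sum(deviations))
```

The negative deviations (values below the mean) exactly offset the positive ones, so their total, and therefore their average, is zero.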
Variance Facts
 To keep them from canceling out, we square each deviation.
 Squaring never gives a negative value, so the squared deviations cannot cancel each other out!
 Affected by extreme values.
 When we add up these squared deviations and find their average (almost), we call the result the variance.
Variance
 This is the average of the squared distance from the mean.
 Variance will play an important role later, but it has a problem as a measure of spread.
 Whatever the units of the original data are, the variance is in squared units. We want measures of spread to have the same units as the data, so to get back to the original units, we take the square root of $s^2$.
 The result, $s$, is the standard deviation.
Let’s look at the data again on the number of pets owned by a group of 9 children.
1 3 4 4 4 5 7 8 9
Recall that the mean was 5 pets.
Let’s take a graphical look at the
“deviations” from the mean:
Let’s Find the Standard Deviation and Variance of the Data Set of Pets
1 3 4 4 4 5 7 8 9
Mean = 5

Pets (x)    Deviations (x - mean)    Squared Deviations (x - mean)^2
1           1 - 5 = -4               (-4)^2 = 16
3           3 - 5 = -2               (-2)^2 = 4
4           4 - 5 = -1               (-1)^2 = 1
4           4 - 5 = -1               (-1)^2 = 1
4           4 - 5 = -1               (-1)^2 = 1
5           5 - 5 = 0                0^2 = 0
7           7 - 5 = 2                2^2 = 4
8           8 - 5 = 3                3^2 = 9
9           9 - 5 = 4                4^2 = 16
            Sum = 0                  Sum = 52
Find Variance:
$s^2 = \frac{1}{n-1}\sum (x - \bar{x})^2$
$s^2 = \frac{1}{8}(52) = \frac{52}{8} = 6.5$
This is the “average” squared deviation.
Find the Standard Deviation:
$s = \sqrt{\frac{1}{n-1}\sum (x - \bar{x})^2}$
$s = \sqrt{s^2} = \sqrt{6.5} \approx 2.55$
This 2.55 is roughly the average distance of the values in the data set from the mean.
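The same calculation can be followed step by step in code. This is a minimal sketch using the sample formula (divide by n - 1) on the pets data; the variable names are chosen for illustration.

```python
import math

pets = [1, 3, 4, 4, 4, 5, 7, 8, 9]

n = len(pets)
mean = sum(pets) / n                                # 45 / 9
squared_deviations = [(x - mean) ** 2 for x in pets]
sum_sq = sum(squared_deviations)                    # sum of squared deviations
variance = sum_sq / (n - 1)                         # divide by n - 1 = 8
std_dev = math.sqrt(variance)                       # back to the original units

print("mean:", mean)
print("sum of squared deviations:", sum_sq)
print("variance:", variance)
print("standard deviation:", round(std_dev, 2))
```

The printed values agree with the hand calculation: a sum of squared deviations of 52, a variance of 6.5, and a standard deviation of about 2.55.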
Find the Standard Deviation and Variance
Values: 14 13 20 22 18 19 13
Mean = 17

Values    Deviations    Squared Deviations
14        -3            9
13        -4            16
20        3             9
22        5             25
18        1             1
19        2             4
13        -4            16
                        Sum = 80

$s^2 = \frac{1}{n-1}\sum (x - \bar{x})^2 = \frac{1}{6}(80) = \frac{80}{6} \approx 13.33$
$s = \sqrt{13.33} \approx 3.65$
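A quick way to check this answer is Python's standard statistics module, whose variance and stdev functions use the sample (n - 1) formulas. This is offered only as a verification sketch.

```python
from statistics import mean, stdev, variance

values = [14, 13, 20, 22, 18, 19, 13]

sample_mean = mean(values)
sample_variance = variance(values)  # sample variance, divides by n - 1
sample_sd = stdev(values)           # square root of the sample variance

print("mean:", sample_mean)
print("variance:", round(sample_variance, 2))
print("standard deviation:", round(sample_sd, 2))
```

The rounded results match the worked values of 13.33 for the variance and 3.65 for the standard deviation.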
Homework
 Worksheet