2.5 Measures of Variation

Lecture 4: Measures of Variation
Review of Lecture 3: Measures of Center

• Given a stem-and-leaf plot, be able to find the mean, median, and mode.

Stem (tens) | Leaves (units)
4 | 02
5 | 000122
6 | 47

» Mean
• (40+42+3*50+51+2*52+64+67)/10 = 518/10 = 51.8
» Median
• Average of the 5th and 6th values: (50+51)/2 = 50.5
» Mode
• 50
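As a quick check of the review numbers above, the following Python sketch rebuilds the data from the stem-and-leaf plot and uses the standard-library `statistics` module. The names `stems` and `data` are illustrative choices, not part of the lecture.

```python
import statistics

# Stem-and-leaf plot from the review slide: stem = tens digit, leaves = units digits.
stems = {4: "02", 5: "000122", 6: "47"}

# Expand each (stem, leaf) pair into a data value, e.g. stem 4 with leaf 2 -> 42.
data = [10 * stem + int(leaf) for stem, leaves in stems.items() for leaf in leaves]

mean = statistics.mean(data)
median = statistics.median(data)  # average of the 5th and 6th sorted values (n = 10)
mode = statistics.mode(data)

print(sorted(data))
print(mean, median, mode)
```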
• Given a regular frequency distribution, be able to find the sample size, mean, and median.

# of phones (x) | f | fx | Cum Freq
4 | 2 | 8 | 2
3 | 4 | 12 | 6
2 | 5 | 10 | 11
1 | 16 | 16 | 27 (median group)
0 | 13 | 0 | 40 = n

» Sample size
• 2+4+5+16+13 = 40
» Mean
• (8+12+10+16+0)/40 = 1.15
» Median
• Average of the two middle values = 1
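The frequency-distribution review can be reproduced the same way. This is a minimal sketch, assuming the table maps each number of phones x to its frequency f; the dictionary `freq` and the expanded list `values` are illustrative names.

```python
# Frequency table from the review slide: value x -> frequency f.
freq = {4: 2, 3: 4, 2: 5, 1: 16, 0: 13}

n = sum(freq.values())                          # sample size
mean = sum(x * f for x, f in freq.items()) / n  # sum(f * x) / n

# Expand to a sorted list so the two middle values can be read off directly.
values = sorted(x for x, f in freq.items() for _ in range(f))
median = (values[n // 2 - 1] + values[n // 2]) / 2  # n is even here

print(n, mean, median)
```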
Statistics handles variation. Thus this section is one of
the most important sections in the entire book.

Definition

Measure of Variation (Measure of Dispersion): a measure that helps us describe the spread of a data set.

Candidates:
• Range
• Standard Deviation, Variance
• Coefficient of Variation

Definition

The range of a set of data is the difference between the highest value and the lowest value.

Range = (Highest value) – (Lowest value)

Example: Range of {1, 3, 14} is 14 - 1 = 13.

Standard Deviation

The standard deviation of a set of values is a measure of variation of values about the mean.

We introduce two standard deviations:
• Sample standard deviation
• Population standard deviation

Sample Standard Deviation Formula

Formula 2-4: $s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$

where x is a data value, $\bar{x}$ is the sample mean, and n is the sample size.

Sample Standard Deviation (Shortcut Formula)

Formula 2-5: $s = \sqrt{\dfrac{n \left(\sum x^2\right) - \left(\sum x\right)^2}{n (n - 1)}}$
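
To see that Formula 2-4 and the shortcut Formula 2-5 give the same result, here is a small Python sketch applied to the data set {1, 3, 14} from the range example. The function names `sd_definition` and `sd_shortcut` are made up for illustration.

```python
import math

def sd_definition(xs):
    """Sample standard deviation via Formula 2-4 (deviations from the mean)."""
    n = len(xs)
    mean = sum(xs) / n
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))

def sd_shortcut(xs):
    """Sample standard deviation via the shortcut Formula 2-5 (sums only)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_x2 = sum(x * x for x in xs)
    return math.sqrt((n * sum_x2 - sum_x ** 2) / (n * (n - 1)))

data = [1, 3, 14]                   # data set from the range example
data_range = max(data) - min(data)  # highest value minus lowest value

print(data_range, sd_definition(data), sd_shortcut(data))
```

The shortcut form needs only the running sums of x and x², which is why it is convenient for hand calculation.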

Example: Publix check-out waiting times in minutes

Data: 1, 4, 10. Find the sample mean and sample standard deviation.

x | x - x̄ | (x - x̄)² | x²
1 | 1 - 5 = -4 | 16 | 1
4 | -1 | 1 | 16
10 | 5 | 25 | 100
Σx = 15 | | Σ(x - x̄)² = 42 | Σx² = 117

n = 3
x̄ = 15/3 = 5.0 min

Using Formula 2-4:
$s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\dfrac{42}{3 - 1}} = \sqrt{21} = 4.6$ min

Using the shortcut formula:
$s = \sqrt{\dfrac{n \sum x^2 - \left(\sum x\right)^2}{n(n - 1)}} = \sqrt{\dfrac{3(117) - (15)^2}{3(3 - 1)}} = \sqrt{\dfrac{351 - 225}{6}} = \sqrt{\dfrac{126}{6}} = \sqrt{21} = 4.6$ min

Standard Deviation Key Points

• The standard deviation is a measure of variation of all values from the mean.
• The value of the standard deviation s is usually positive and always non-negative.
• The value of the standard deviation s can increase dramatically with the inclusion of one or more outliers (data values far away from all others).
• The units of the standard deviation s are the same as the units of the original data values.

Population Standard Deviation

$\sigma = \sqrt{\dfrac{\sum (x - \mu)^2}{N}}$

This formula is similar to Formula 2-4, but instead the population mean and population size are used.

Variance

• The variance of a set of values is a measure of variation equal to the square of the standard deviation.
• Sample variance s²: square of the sample standard deviation s
• Population variance σ²: square of the population standard deviation σ

Variance - Notation

Variance = standard deviation squared

s²: sample variance
σ²: population variance

Round-off Rule for Measures of Variation

Carry one more decimal place than is present in the original set of data.

Round only the final answer, not values in the middle of a calculation.
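
The waiting-time example, the sample versus population distinction, and the variance definitions can be checked with Python's standard-library `statistics` module. This is only a sketch: the list name `waits` is illustrative, and the extra value 60 is a hypothetical outlier that does not appear in the lecture data.

```python
import statistics

waits = [1, 4, 10]  # Publix check-out waiting times in minutes

s = statistics.stdev(waits)           # sample standard deviation (divides by n - 1)
sigma = statistics.pstdev(waits)      # population standard deviation (divides by N)
s2 = statistics.variance(waits)       # sample variance, the square of s
sigma2 = statistics.pvariance(waits)  # population variance, the square of sigma

# Round-off rule: the data are whole minutes, so report one decimal place,
# and round only at the end.
print(round(s, 1), round(sigma, 1), s2, sigma2)

# Hypothetical outlier (NOT in the lecture data) to show how strongly s reacts.
with_outlier = waits + [60]
s_outlier = statistics.stdev(with_outlier)
print(round(s_outlier, 1))
```

Treating the same three values as a whole population gives a smaller result than treating them as a sample, because the divisor is N rather than n - 1.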

Definition

The coefficient of variation (or CV) for a set of sample or population data, expressed as a percent, describes the standard deviation relative to the mean.

Sample: $CV = \dfrac{s}{\bar{x}} \cdot 100\%$
Population: $CV = \dfrac{\sigma}{\mu} \cdot 100\%$

• A measure well suited to comparing variation between populations
• Because it has no units, it makes comparing apples and pears possible.

Example: How to compare the variability in heights and weights of men?

Sample: 40 males were randomly selected. The summarized statistics are given below.

Variable | Sample mean | Sample standard deviation
Height | 68.34 in | 3.02 in
Weight | 172.55 lb | 26.33 lb

Solution: Use CV to compare the variability.

Heights: $CV = \dfrac{s}{\bar{x}} \cdot 100\% = \dfrac{3.02}{68.34} \cdot 100\% = 4.42\%$
Weights: $CV = \dfrac{s}{\bar{x}} \cdot 100\% = \dfrac{26.33}{172.55} \cdot 100\% = 15.26\%$

Conclusion: Heights (with CV = 4.42%) have considerably less variation than weights (with CV = 15.26%).

Standard Deviation from a Frequency Distribution

Formula 2-6: $s = \sqrt{\dfrac{n \left[\sum (f \cdot x^2)\right] - \left[\sum (f \cdot x)\right]^2}{n (n - 1)}}$

Use the class midpoints as the x values.

Example: Number of TV sets owned by households

• A random sample of 80 households was selected.
• The number of TV sets owned by each household is given below.

TV sets (x) | # of Households (f) | fx | fx²
0 | 4 | 0 | 0
1 | 33 | 33 | 33
2 | 28 | 56 | 112
3 | 10 | 30 | 90
4 | 5 | 20 | 80
Total | 80 | 139 | 315

Compute:
(a) the sample mean
(b) the sample standard deviation

(a) $\bar{x} = \dfrac{139}{80} = 1.7$ sets

(b) $s = \sqrt{\dfrac{n \sum (f x^2) - \left(\sum f x\right)^2}{n(n - 1)}} = \sqrt{\dfrac{80(315) - (139)^2}{80(80 - 1)}} = \sqrt{\dfrac{5879}{6320}} = 1.0$ sets

Estimation of Standard Deviation
Range Rule of Thumb

For estimating a value of the standard deviation s, use

$s \approx \dfrac{\text{Range}}{4}$

where range = (highest value) – (lowest value).
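
The coefficient-of-variation comparison and the TV-set frequency table can both be recomputed in a few lines of Python. This is a sketch under the assumption that the table lists each value x with its frequency f; `cv_percent` and `tv` are hypothetical names introduced here.

```python
import math

def cv_percent(sd, mean):
    """Coefficient of variation as a percent: (sd / mean) * 100."""
    return sd / mean * 100

# Summary statistics for the 40 sampled males.
cv_height = cv_percent(3.02, 68.34)    # inches cancel out
cv_weight = cv_percent(26.33, 172.55)  # pounds cancel out
print(round(cv_height, 2), round(cv_weight, 2))

# TV-set frequency table: number of sets x -> number of households f.
tv = {0: 4, 1: 33, 2: 28, 3: 10, 4: 5}
n = sum(tv.values())
sum_fx = sum(f * x for x, f in tv.items())
sum_fx2 = sum(f * x * x for x, f in tv.items())

mean_tv = sum_fx / n
s_tv = math.sqrt((n * sum_fx2 - sum_fx ** 2) / (n * (n - 1)))  # Formula 2-6
print(round(mean_tv, 1), round(s_tv, 1))
```

Because the units cancel in the ratio, the two CV values can be compared directly even though one variable is measured in inches and the other in pounds.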

Estimation of Standard Deviation
Range Rule of Thumb

For interpreting a known value of the standard deviation s, find rough estimates of the minimum and maximum “usual” values by using:

Minimum “usual” value ≈ (mean) – 2 × (standard deviation)
Maximum “usual” value ≈ (mean) + 2 × (standard deviation)
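
Both uses of the range rule of thumb can be sketched in Python. The helper names `estimate_sd` and `usual_range` are hypothetical, and applying the rule to the TV-set and height figures from the earlier examples is an illustration added here, not something worked on the slides.

```python
def estimate_sd(values):
    """Range rule of thumb: s is roughly (highest - lowest) / 4."""
    return (max(values) - min(values)) / 4

def usual_range(mean, sd):
    """Rough minimum and maximum 'usual' values: mean -/+ 2 standard deviations."""
    return mean - 2 * sd, mean + 2 * sd

# TV-set counts in the household example run from 0 to 4.
s_estimate = estimate_sd([0, 1, 2, 3, 4])
print(s_estimate)

# Heights of the sampled males: mean 68.34 in, s = 3.02 in.
low, high = usual_range(68.34, 3.02)
print(round(low, 2), round(high, 2))
```

The estimate is deliberately rough; it is meant for a quick sanity check, not as a replacement for Formula 2-4.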

Definition

The Empirical (68-95-99.7) Rule

For data sets having a distribution that is approximately bell shaped, the following properties apply:
• About 68% of all values fall within 1 standard deviation of the mean
• About 95% of all values fall within 2 standard deviations of the mean
• About 99.7% of all values fall within 3 standard deviations of the mean

[FIGURE 2-13: The Empirical Rule]
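
The three percentages can be checked against an idealized bell curve. The sketch below assumes an exactly normal distribution, modeled with `statistics.NormalDist` from the Python standard library (Python 3.8 or later); real data that is only approximately bell shaped will match these figures only roughly.

```python
from statistics import NormalDist

# Standard normal distribution as the idealized bell-shaped curve (an assumption).
z = NormalDist(mu=0, sigma=1)

# Proportion of values within k standard deviations of the mean, for k = 1, 2, 3.
within = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

for k, p in within.items():
    print(k, round(p * 100, 1))
```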

Recap

In this section we have looked at:
• Range
• Standard deviation of a sample and population
• Variance of a sample and population
• Coefficient of Variation (CV)
• Standard deviation using a frequency distribution
• Range Rule of Thumb
• Empirical Rule

Homework Assignment 4

• Problems 2.5: 1, 3, 7, 9, 11, 17, 23, 25, 27, 31
• Read: Section 2.6, Measures of Relative Standing.