Module5 - CLSU Open University

Module 5
Measures of Dispersion
Introduction
In the previous module, you were taught how to compute the mean,
median and the mode both for grouped and ungrouped quantitative
variables. These are summarizing figures used to describe a set of
observations. In some instances, however, these measures may not be
sufficient for description, especially if the intention is to look at how
variable or how different the observations are from each other. To do this,
you may use another group of indices, the measures of dispersion or
variation.
A measure of dispersion indicates the degree of spread or
variability of a given set of data. Since the data are not all alike,
assessing the extent to which they differ from one another is one of the
concerns of this module. Four measures of dispersion will be discussed,
namely: range, variance, standard deviation and coefficient of variation.
Objectives:
After going through this module, you should be able to:
1. Compute and interpret the various measures of dispersion such
as the range, variance, standard deviation and coefficient of
variation for a set of ungrouped data.
2. Compute and interpret the various measures of dispersion such
as the range, variance, standard deviation and coefficient of
variation for a set of grouped data.
The Range
Because the range is simply the difference between the highest and
the lowest measurements when the data are ungrouped, or the difference
between the true upper limit of the last class and the true lower limit of
the first class when the data are grouped, students like you usually
consider it the simplest measure of variability. If H represents the highest
value and L the lowest value, then the range R for ungrouped data is
$$R = H - L$$
If $H_u$ represents the upper limit of the last class and $L_l$ the lower
limit of the first class, then the range R for grouped data is
$$R = H_u - L_l$$
The range, as a measure of dispersion, is not difficult to calculate
and understand, and there is a natural curiosity about the minimum and
maximum values. Nonetheless, it is generally not a useful measure of
variation. Its main shortcoming is that it gives no indication of the
dispersion of the values that fall between the two extremes, which makes
the range a highly unstable measure of variability.
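The two range formulas can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this sketch, the ungrouped data are the five family sizes used in SAQ2 of this module, and the class limits are those of the math test score table that appears later in this module.

```python
def range_ungrouped(values):
    """R = H - L: highest value minus lowest value."""
    return max(values) - min(values)


def range_grouped(class_limits):
    """R = Hu - Ll for a list of (lower, upper) class limits.

    Assumes the classes are listed in ascending order.
    """
    first_lower = class_limits[0][0]   # lower limit of the first class
    last_upper = class_limits[-1][1]   # upper limit of the last class
    return last_upper - first_lower


children = [3, 4, 7, 6, 2]  # number of children in 5 families
classes = [(50, 54), (55, 59), (60, 64), (65, 69), (70, 74), (75, 79)]

print(range_ungrouped(children))  # 7 - 2 = 5
print(range_grouped(classes))     # 79 - 50 = 29
```

Notice that both functions look only at the two extremes, which is exactly why the range says nothing about how the values in between are spread.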
SAQ1
a. Compute and interpret the range of heights of the 12
basketball players cited in SAQ2, Module 4.
b. Compute an approximation of the range of mathematics
scores given in Activity 1, Module 4.
ASAQ1
a. The range of the heights of the 12 basketball players cited
in SAQ2, Module 4 is 24. This is the difference between 204,
the highest value, and 180, the lowest value.
b. Since these data are grouped, the range is 69. This is
the difference between the upper limit of the last class
(which is equal to 89) and the lower limit of the first
class (which is equal to 20).
The Variance and Standard Deviation (Ungrouped Data)
Let us now tackle the variance. Unfortunately, this is one of the
measures which students find conceptually difficult to comprehend.
i) Population variance and standard deviation
$$\sigma^2 = \frac{\sum X_i^2 - \left(\sum X_i\right)^2 / N}{N}$$
$$\sigma = \sqrt{\sigma^2}$$
where:
$\sigma^2$ (read "sigma squared") = population variance
$\sigma$ = population standard deviation
$N$ = population size
$X_i$ = the ith observed value for the variable x
ii) Sample variance and standard deviation
$$s^2 = \frac{\sum X_i^2 - \left(\sum X_i\right)^2 / n}{n - 1}$$
$$s = \sqrt{s^2}$$
where:
$s^2$ = sample variance
$s$ = sample standard deviation
$n$ = sample size
$X_i$ = the ith observed value for the variable x
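For example, with $\sum X_i^2 = 114$, $\sum X_i = 22$ and $n = 5$:
$$s^2 = \frac{\sum X_i^2 - \left(\sum X_i\right)^2 / n}{n - 1} = \frac{114 - (22)^2 / 5}{4} = \frac{114 - 484/5}{4} = \frac{114 - 96.8}{4} = \frac{17.2}{4} = 4.3$$
$$s = \sqrt{4.3} = 2.07$$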
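The computing formulas above can be checked with a short Python sketch. The function names are hypothetical and chosen only for this illustration; the data are the numbers of children of the 5 families used in SAQ2, and Python's standard `statistics` module serves as an independent check.

```python
import math
import statistics


def sample_variance_shortcut(values):
    """s^2 = [sum(Xi^2) - (sum(Xi))^2 / n] / (n - 1)."""
    n = len(values)
    sum_x = sum(values)                     # sum of the observations
    sum_x_sq = sum(x * x for x in values)   # sum of the squared observations
    return (sum_x_sq - sum_x ** 2 / n) / (n - 1)


def population_variance_shortcut(values):
    """sigma^2 = [sum(Xi^2) - (sum(Xi))^2 / N] / N."""
    n = len(values)
    sum_x = sum(values)
    sum_x_sq = sum(x * x for x in values)
    return (sum_x_sq - sum_x ** 2 / n) / n


children = [3, 4, 7, 6, 2]  # number of children in 5 families

s2 = sample_variance_shortcut(children)
s = math.sqrt(s2)
print(round(s2, 2), round(s, 2))  # 4.3 2.07

# Cross-check against the standard library's sample variance.
print(math.isclose(s2, statistics.variance(children)))  # True
```

The only difference between the two functions is the final divisor: N for a population and n - 1 for a sample.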
SAQ2
a. Once again, refer to item no. 1 of SAQ3, Module 4. Fill in the second
column of the table below.

Number of children ($X_i$)    $X_i^2$
3
4
7
6
2
Total:

b. Compute $\sum X_i$, $\left(\sum X_i\right)^2$ and $\sum X_i^2$.
c. Compute and interpret the variance and standard deviation for the
number of children of 5 families.
ASAQ2
a. The notation $X_i^2$ indicates the square of the individual
observations. Hence, with the help of a calculator, you should not
experience any problem obtaining the results tabulated below.

$X_i$     $X_i^2$
3         9
4         16
7         49
6         36
2         4
Total:
b. The answers to this SAQ are $\sum X_i = 22$, $\left(\sum X_i\right)^2 = 484$ and
$\sum X_i^2 = 114$.
Did you notice that $\left(\sum X_i\right)^2$ is easily obtained by taking the total
of the $X_i$ column, then squaring this total? Do this step by step. Fill
in the blanks below.
Total of the $X_i$ column = $\sum X_i$ = __________
Square of the total = $\left(\sum X_i\right)^2$ = (__________)² = __________
On the other hand, $\sum X_i^2$ is obtained by squaring the individual
observations first, then taking the sum of the squared observations.
In fact, this is simply the sum of the $X_i^2$ column in the table above.
Well, are you now confident that you can discern the difference
between $\left(\sum X_i\right)^2$ and $\sum X_i^2$? Good.
c. The variance of the number of children is 4.3 and the standard
deviation is 2.07. How do your results compare with these? If
you got them right, you've done a good job. If you failed to get the
results, please follow through the proper steps in the computation.
So how did you interpret the variance of 4.3? It should have
been "The average of the squared deviations from the mean number of
children is 4.3." How about the standard deviation of 2.07?
Theoretically, it can be interpreted as the square root of the average of the
squared deviations from the mean number of children. In layman's
terms, the greater the value of the standard deviation, the more the
observations scatter from the mean.
The standard deviation is the most important measure of dispersion.
It is affected by the value of each observation. It is the most stable and is,
therefore, the most reliable measure of variability.
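To see the "squared deviations from the mean" interpretation at work, the sketch below computes the standard deviation directly from the deviations, dividing by n - 1 as in the sample formula of this module. The first data set is the number of children of the 5 families; the second, `clustered`, is hypothetical data invented for this illustration, chosen to have the same mean but values that sit closer to it.

```python
import math


def sample_std_dev(values):
    """Square root of the sum of squared deviations from the mean over n - 1."""
    n = len(values)
    mean = sum(values) / n
    squared_devs = [(x - mean) ** 2 for x in values]  # (Xi - mean)^2 for each Xi
    return math.sqrt(sum(squared_devs) / (n - 1))


children = [3, 4, 7, 6, 2]    # data from SAQ2; mean = 22 / 5 = 4.4
clustered = [4, 4, 5, 4, 5]   # hypothetical data, also with mean 4.4

print(round(sample_std_dev(children), 2))   # 2.07
print(round(sample_std_dev(clustered), 2))  # 0.55
```

Both sets have the same mean, yet the first has a much larger standard deviation because its observations scatter farther from that mean.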
The Variance and Standard Deviation (Grouped Data)
Let us now go to the computation of the variance and standard
deviation for grouped data. Consider the frequency distribution table of
math test scores given after the formulas below.
By this time, I trust that you have developed enough facility with
statistical notations. I'd like you to study carefully the formulas below for
the computation of the different measures of variation:
$$\text{Range} = H_u - L_l$$
$$s^2 = \frac{\sum f_i X_i^2 - \left(\sum f_i X_i\right)^2 / n}{n - 1}$$
$$s = \sqrt{s^2}$$
where:
$H_u$ = upper limit of the last class
$L_l$ = lower limit of the first class
$s^2$ = variance
$s$ = standard deviation
$X_i$ = midpoint or class mark of the ith class
$f_i$ = frequency of the ith class
$n = \sum f_i$ = total frequency or total observations

Math Test Scores   Freq ($f_i$)   Midpoint ($X_i$)   $f_i X_i$   $X_i^2$   $f_i X_i^2$
50 – 54             6             52                  312        2704      16224
55 – 59            18             57                 1026        3249      58482
60 – 64            23             62                 1426        3844      88412
65 – 69            11             67                  737        4489      49379
70 – 74             7             72                  504        5184      36288
75 – 79             5             77                  385        5929      29645
Total              n = 70         $\sum f_i X_i = 4390$          $\sum f_i X_i^2 = 278430$

Using the above formulas, we have
$$R = 79 - 50 = 29$$
$$s^2 = \frac{278430 - (4390)^2 / 70}{70 - 1} = \frac{278430 - 275315.71}{69} = \frac{3114.29}{69} = 45.1346$$
$$s = \sqrt{s^2} = \sqrt{45.1346} = 6.7182$$
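The grouped-data computation can be reproduced with a short Python sketch. The variable names are chosen for this illustration only; the midpoints and frequencies are taken from the math test score table in this section.

```python
import math

# (midpoint Xi, frequency fi) for each class of the math test score table
table = [(52, 6), (57, 18), (62, 23), (67, 11), (72, 7), (77, 5)]

n = sum(f for _, f in table)                 # total frequency
sum_fx = sum(f * x for x, f in table)        # sum of fi * Xi
sum_fx2 = sum(f * x * x for x, f in table)   # sum of fi * Xi^2

# s^2 = [sum(fi*Xi^2) - (sum(fi*Xi))^2 / n] / (n - 1)
variance = (sum_fx2 - sum_fx ** 2 / n) / (n - 1)
std_dev = math.sqrt(variance)

print(n, sum_fx, sum_fx2)                     # 70 4390 278430
print(round(variance, 4), round(std_dev, 4))  # 45.1346 6.7182
```

Because every score in a class is represented by the class midpoint, these figures approximate the variance and standard deviation of the original, ungrouped scores.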
The Coefficient of Variation
The coefficient of variation is a measure of relative dispersion which
may be used for comparing the variability of two sets of data. This
measure of variation is computed with the use of the formula (Walpole,
1982)
$$CV = \frac{s}{\bar{x}} (100\%)$$
where: $s$ = standard deviation
$\bar{x}$ = mean = $\sum f_i X_i / n$
From the frequency distribution table, we have
$$\bar{x} = \frac{\sum f_i X_i}{n} = \frac{4390}{70} = 62.7$$
$$CV = \frac{6.7182}{62.7} (100\%) = 10.71\%$$
The computed CV of 10.71% indicates that the variability or the
degree of differences of the mathematics test scores of the respondent
students is relatively low.
Another example of the computation of CV and its usefulness is as
follows:
The mean mathematics achievement test score of one section of
first year high school students is 55 with a standard deviation of 15. In
another section in the same school, the mean is 30 and the standard
deviation is 10. Do the scores of the first group fluctuate about its mean
more than those of the second group? (or is the first group more variable
than the second group?)
Computation:
First group:
$$CV_1 = \frac{s}{\bar{x}} (100\%) = \frac{15}{55} (100\%) = 27.3\%$$
Second group:
$$CV_2 = \frac{s}{\bar{x}} (100\%) = \frac{10}{30} (100\%) = 33.3\%$$
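Since the coefficient of variation is higher for the second group
than for the first group, the individual scores of the second group are
more widely dispersed about their mean than those of the first group.
The answer to the question is therefore no: relative to its mean, the
first group is less variable than the second, even though its standard
deviation is larger.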
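The coefficient of variation is simple enough to express as a one-line function. The sketch below uses a hypothetical function name and reproduces the figures of this section: the two class sections compared above and the math test score table (mean 62.7, standard deviation 6.7182).

```python
def coefficient_of_variation(std_dev, mean):
    """CV = (s / mean) * 100, expressed in percent."""
    return std_dev / mean * 100


cv1 = coefficient_of_variation(15, 55)  # first section
cv2 = coefficient_of_variation(10, 30)  # second section
print(round(cv1, 1), round(cv2, 1))     # 27.3 33.3

# The second section is relatively more variable despite its smaller s.
print(cv2 > cv1)                        # True

cv_table = coefficient_of_variation(6.7182, 62.7)  # math test score table
print(round(cv_table, 2))               # 10.71
```

Dividing by the mean is what makes the two sections comparable: a standard deviation of 10 is large relative to a mean of 30, while 15 is comparatively small relative to a mean of 55.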
Activity
1a. Complete the table below.

Scores   No. of Students ($f_i$)   Midpoint ($X_i$)   $f_i X_i$   $X_i^2$   $f_i X_i^2$
20-29     5
30-39     9
40-49    11
50-59    22
60-69    13
70-79     7
80-89     3
$n = \sum f_i$ =

1b. Compute the range, variance, standard deviation and coefficient of
variation and interpret the results.
2. Compute the range, variance, standard deviation and coefficient of
variation of the following teachers' efficiency grades obtained by 25
faculty members. Do not group the data:

98  97  96  95  94
93  92  91  90  88
88  88  86  85  84
83  82  81  80  79
78  77  76  75  74