Measures of Dispersion
Week 4
Dispersion
• Two groups of three students
Group 1: 4, 7, 10
Group 2: 7, 7, 7
• Mean mark
Group 1: (4 + 7 + 10)/3 = 21/3 = 7
Group 2: (7 + 7 + 7)/3 = 21/3 = 7
• Same mean mark, but Group 1’s marks are widely spread while Group 2’s are all the same
• The following diagram reinforces this point
Range
• The absolute difference between the highest and lowest value of the raw data
• Group of students: 4, 7, 10
• Range = Maximum – Minimum = 10 – 4 = 6
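As a minimal sketch of the range calculation, the following Python applies Maximum – Minimum to the two groups of marks from the Dispersion slide. The function name `value_range` is an illustrative choice, not something from the slides.

```python
# Marks for the two groups from the Dispersion slide.
group_1 = [4, 7, 10]
group_2 = [7, 7, 7]


def value_range(values):
    """Return the range: maximum minus minimum of the raw data."""
    return max(values) - min(values)


print(value_range(group_1))  # 10 - 4 = 6
print(value_range(group_2))  # 7 - 7 = 0
```

Both groups have the same mean, but the range separates them: 6 for Group 1 and 0 for Group 2.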
Interquartile Range
• This is the absolute difference between the upper and lower quartiles of the distribution.
• Interquartile Range = Upper Quartile - Lower Quartile
• See the next slides for estimating quartiles
Quartiles (1)
• Upper quartile: the value for which 25% of the distribution is above it and 75% below
• Lower quartile: the value for which 75% of the distribution is above it and 25% below
Quartiles (2)
• If the data is ungrouped, put the data in order in an array
• Find the quartile position, then estimate its value, as previously for the median
• Upper quartile (Q3): position = 3(n + 1)/4
• Lower quartile (Q1): position = (n + 1)/4
Quartiles (3)
Example: ungrouped data:
3, 5, 6, 9, 15, 27, 30, 35, 37
• Lower quartile: position = (n + 1)/4 = (9 + 1)/4 = 2.5th
Lower quartile: value = 5.5
(mid-way between 2nd and 3rd number in array)
• Upper quartile: position = 3(n + 1)/4 = 3(9 + 1)/4 = 7.5th
Upper quartile: value = 32.5
(mid-way between 7th and 8th number in array)
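The position method can be sketched in Python as follows. The function names are illustrative, and resolving a fractional position by linear interpolation is an assumption: the slides only show the mid-way (.5) case, which interpolation reproduces.

```python
def value_at_position(ordered, position):
    """Value at a 1-based, possibly fractional, position in an ordered list.

    Assumption: a fractional position is resolved by linear interpolation
    between the two neighbouring values. Requires 1 <= position <= len(ordered).
    """
    lower_index = int(position)        # whole part of the position (1-based)
    fraction = position - lower_index  # fractional part of the position
    lower_value = ordered[lower_index - 1]
    if fraction == 0:
        return lower_value
    upper_value = ordered[lower_index]
    return lower_value + fraction * (upper_value - lower_value)


def quartiles(data):
    """Return (Q1, Q3) using positions (n + 1)/4 and 3(n + 1)/4.

    Assumes at least 3 values so that both positions fall inside the array.
    """
    ordered = sorted(data)             # put the data in order in an array
    n = len(ordered)
    q1 = value_at_position(ordered, (n + 1) / 4)
    q3 = value_at_position(ordered, 3 * (n + 1) / 4)
    return q1, q3


data = [3, 5, 6, 9, 15, 27, 30, 35, 37]
print(quartiles(data))  # (5.5, 32.5)
```

For the nine values above, the positions are 2.5 and 7.5, so the sketch returns the mid-way values 5.5 and 32.5.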
Quartiles (4)
• Grouped data: use the same approach as for estimating the median for grouped data in week 4, except this time use the quartile positions
Semi-Interquartile Range
• This is half the interquartile range. It is sometimes called the Quartile Deviation
• Semi-Interquartile Range = (Upper Quartile - Lower Quartile)/2
Example
Using previous ungrouped data
Interquartile range = UQ - LQ
= 32.5 – 5.5 = 27
Semi-interquartile range = (UQ - LQ)/2
= (32.5 – 5.5)/2
= 27/2 = 13.5
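The same figures can be checked with Python's standard library. This is a sketch, assuming Python 3.8 or later, where `statistics.quantiles` with `method="exclusive"` places the cut points at positions i(n + 1)/4, the same positions used on these slides.

```python
import statistics

data = [3, 5, 6, 9, 15, 27, 30, 35, 37]

# With n=4 the function returns the three cut points Q1, Q2 (median), Q3.
q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")

interquartile_range = q3 - q1
semi_interquartile_range = interquartile_range / 2

print(q1, q3)                    # 5.5 32.5
print(interquartile_range)       # 27.0
print(semi_interquartile_range)  # 13.5
```

Other software may use a different quartile convention and so give slightly different values for the same data.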
Mean Deviation
• Average of the absolute deviations from the arithmetic mean (ignoring the sign)
• When two straight lines (rather than curved brackets) surround a number or variable it is referred to as the modulus and we ignore the sign
Mean Deviation of ungrouped data
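• $X_1 = 2$, $X_2 = 4$, $X_3 = 3$, so $\bar{X} = 3$
• MD $= \frac{\sum_{i=1}^{n} |X_i - \bar{X}|}{n} = \frac{|2 - 3| + |4 - 3| + |3 - 3|}{3} = \frac{1 + 1 + 0}{3} = \frac{2}{3}$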
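A short Python sketch of the mean deviation, using the three values from this slide. The function name `mean_deviation` is an illustrative choice; `abs()` plays the role of the modulus.

```python
def mean_deviation(values):
    """Average of the absolute deviations from the arithmetic mean."""
    n = len(values)
    mean = sum(values) / n
    # abs() drops the sign of each deviation, as the modulus bars do.
    return sum(abs(x - mean) for x in values) / n


values = [2, 4, 3]
print(mean_deviation(values))  # 2/3, about 0.667
```

The mean is 3, the absolute deviations are 1, 1 and 0, and their average is 2/3.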
Variance
• If we square all the deviations from the arithmetic mean, then we no longer need to bother with dropping the signs since all the values will be positive.
• We can then replace the straight line brackets (modulus) for the Mean Deviation with the more usual round brackets.
• Variance is the average of the squared deviations from the arithmetic mean
Variance: ungrouped data (1)
• Variance $= \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n}$
• To calculate the variance
1. Calculate the mean value $\bar{X}$
2. Subtract the mean from each value in turn, that is, find $(X_i - \bar{X})$
3. Square each answer to get $(X_i - \bar{X})^2$
Variance: ungrouped data (2)
4. Add up all these squared values to get $\sum_{i=1}^{n} (X_i - \bar{X})^2$
5. Divide the result by n to get $\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n}$
6. You now have the average of the squared deviations from the mean (in square units)
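The steps above can be sketched in Python, one line per step. The function name is illustrative, and the data are the Group 1 marks from the Dispersion slide.

```python
def variance_by_steps(values):
    """Variance as the average of the squared deviations from the mean."""
    n = len(values)
    mean = sum(values) / n                    # step 1: the mean
    deviations = [x - mean for x in values]   # step 2: X_i minus the mean
    squared = [d ** 2 for d in deviations]    # step 3: square each deviation
    total = sum(squared)                      # step 4: add the squares
    return total / n                          # step 5: divide by n


print(variance_by_steps([4, 7, 10]))  # 6.0
```

Note that this divides by n, as on the slides, rather than by n - 1.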
Standard deviation (SD)
• This is simply the square root of the variance
• An advantage is that we avoid the square units of the variance
• The larger the SD, the larger the average dispersion of the data from the mean
• The smaller the SD, the smaller the average dispersion of the data from the mean
Example 1: variance/standard deviation
$x_i$     $x_i - \bar{x}$     $(x_i - \bar{x})^2$
4         4 – 7 = -3          (-3)² = 9
7         7 – 7 = 0           0² = 0
10        10 – 7 = 3          3² = 9
Total                         18
Solutions
Variance $= \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n} = \frac{18}{3} = 6$
Standard deviation is square root of 6
= 2.449 units
Example 2: variance/standard deviation
$x_i$     $x_i - \bar{x}$     $(x_i - \bar{x})^2$
7         7 – 7 = 0           0² = 0
7         7 – 7 = 0           0² = 0
7         7 – 7 = 0           0² = 0
Total                         0
Solution
Variance $= \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n} = \frac{0}{3} = 0$
Standard deviation is square root of 0 = 0 i.e. there is no spread of values
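Both examples can be checked with Python's standard library. This is a sketch: `statistics.pvariance` and `statistics.pstdev` are the "population" versions that divide by n, matching the definition on these slides, whereas `statistics.variance` and `statistics.stdev` divide by n - 1 and would give different answers.

```python
import statistics

group_1 = [4, 7, 10]  # Example 1
group_2 = [7, 7, 7]   # Example 2

for name, marks in (("Group 1", group_1), ("Group 2", group_2)):
    variance = statistics.pvariance(marks)  # average squared deviation (divide by n)
    sd = statistics.pstdev(marks)           # square root of that variance
    print(name, "variance:", variance, "SD:", round(sd, 3))
```

Group 1 has variance 6 and a standard deviation of about 2.449; Group 2 has variance 0 and standard deviation 0.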
Variance of grouped data
$S^2 = \frac{\sum_{i=1}^{j} F_i X_i^2}{\sum_{i=1}^{j} F_i} - \left( \frac{\sum_{i=1}^{j} F_i X_i}{\sum_{i=1}^{j} F_i} \right)^2$
where $F_i$ = frequency of the ith class interval
$X_i$ = mid point of the ith class interval
j = number of class intervals
Price of item (£)    No. of items sold
LCB     UCB     Fi      Xi      FiXi    FiXi^2
1.5     2.5     15      2       30      60
2.5     3.5     2       3       6       18
3.5     4.5     19      4       76      304
4.5     5.5     10      5       50      250
5.5     6.5     14      6       84      504
Total           60              246     1136
$S^2 = \frac{1136}{60} - \left( \frac{246}{60} \right)^2$
$S^2 = 18.93 - 4.1^2$
$S^2 = 18.93 - 16.81$
$S^2 = 2.12$ (in £ squared)
$S = \sqrt{2.12} \approx$ £1.46
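The grouped calculation can be sketched in Python from the class boundaries and frequencies in the table. The variable names are illustrative; the midpoint of each class is taken as the average of its lower and upper class boundaries.

```python
import math

# (lower class boundary, upper class boundary, frequency) from the table.
classes = [
    (1.5, 2.5, 15),
    (2.5, 3.5, 2),
    (3.5, 4.5, 19),
    (4.5, 5.5, 10),
    (5.5, 6.5, 14),
]

sum_f = sum_fx = sum_fx2 = 0.0
for lcb, ucb, freq in classes:
    midpoint = (lcb + ucb) / 2          # X_i
    sum_f += freq                       # running total of F_i
    sum_fx += freq * midpoint           # running total of F_i * X_i
    sum_fx2 += freq * midpoint ** 2     # running total of F_i * X_i^2

mean = sum_fx / sum_f
variance = sum_fx2 / sum_f - mean ** 2  # S^2
sd = math.sqrt(variance)                # S

print(sum_f, sum_fx, sum_fx2)           # 60.0 246.0 1136.0
print(round(mean, 2), round(variance, 2), round(sd, 2))  # 4.1 2.12 1.46
```

Working without intermediate rounding gives a variance of about 2.123 and a standard deviation of about £1.46.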
Co-efficient of variation (C of V)
• A measure of relative dispersion
• Given by $\frac{S}{\bar{X}}$, i.e. the standard deviation divided by the arithmetic mean of the data.
• Data sets with a higher co-efficient of variation have higher relative dispersion
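As a sketch, the co-efficient of variation for the two groups of marks from the Dispersion slide can be computed as follows. The function name is illustrative, the standard deviation divides by n as elsewhere in these slides, and the mean is assumed to be non-zero.

```python
import statistics


def coefficient_of_variation(values):
    """Standard deviation divided by the arithmetic mean (mean must not be 0)."""
    mean = statistics.fmean(values)
    return statistics.pstdev(values) / mean


print(round(coefficient_of_variation([4, 7, 10]), 3))  # 0.35
print(coefficient_of_variation([7, 7, 7]))             # 0.0
```

Group 1 has the higher co-efficient of variation, so it has the higher relative dispersion. Because the measure is a ratio, it has no units, which makes it usable for comparing data sets measured on different scales.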