Measures of Spread
The measures of spread or dispersion of a data set are quantities that indicate how
closely a set of data clusters around its centre.
A) Deviation
A deviation is the difference between an individual value in a set of data and the
mean for the data.
For a population, deviation = $x - \mu$
For a sample, deviation = $x - \bar{x}$
Note:
- The larger the size of the deviation, the greater the spread in the data.
- Values less than the mean have negative deviations.
B) Standard Deviation
The standard deviation is the square root of the mean of the squares of the
deviations
σ - symbol for standard deviation
s - standard deviation of a sample
Population Standard Deviation
$\sigma = \sqrt{\dfrac{\sum (x - \mu)^2}{N}}$
Sample Standard Deviation
$s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$
where N is the number of data in the population and n is the number in the
sample
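The two formulas differ only in the divisor. A minimal Python sketch of both is given here; the function names and the data values are illustrative assumptions and do not come from the notes.

```python
import math

def population_std(values):
    """Square root of the mean of the squared deviations (divide by N)."""
    n = len(values)
    mu = sum(values) / n
    return math.sqrt(sum((x - mu) ** 2 for x in values) / n)

def sample_std(values):
    """Same calculation for a sample: divide by n - 1 instead of n."""
    n = len(values)
    x_bar = sum(values) / n
    return math.sqrt(sum((x - x_bar) ** 2 for x in values) / (n - 1))

data = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative values, not from the notes
print(population_std(data))       # 2.0
print(sample_std(data))           # slightly larger, because of the n - 1 divisor
```

For the same data the sample formula always gives the larger value, and the gap shrinks as n grows.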
C) Variance
The mean of the squares of the deviations is another useful measure.
This quantity is called the variance and is equal to the square of the standard
deviation.
Population Variance
$\sigma^2 = \dfrac{\sum (x - \mu)^2}{N}$
Sample Variance
$s^2 = \dfrac{\sum (x - \bar{x})^2}{n - 1}$
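The relationship between variance and standard deviation can be checked with Python's standard `statistics` module. This is only a sketch: the data values are illustrative assumptions, not taken from the notes.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]        # illustrative values, not from the notes

pop_var = statistics.pvariance(data)    # population variance, divides by N
samp_var = statistics.variance(data)    # sample variance, divides by n - 1

# In each case the variance equals the square of the matching standard deviation.
print(math.isclose(pop_var, statistics.pstdev(data) ** 2))
print(math.isclose(samp_var, statistics.stdev(data) ** 2))
```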
Example 1:
Calculate the mean and standard deviation of Alice's commuting time in
minutes from the following data:
55  68  83  59  68  75  62  78  97  83
Solution:
The mean is 72.8 minutes.
Standard Deviation:
Commuting Time (x) | (x - μ) | (x - μ)²
55 |  |
68 |  |
83 |  |
59 |  |
68 |  |
75 |  |
62 |  |
78 |  |
97 |  |
83 |  |
$\sum (x - \mu)^2 =$
$\sigma = \sqrt{\dfrac{\sum (x - \mu)^2}{N}} =$
Alice's average commuting time is ____ minutes with a standard
deviation of _____ minutes.
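The tabulation for Example 1 can be checked with a short Python sketch. It assumes, as the table headings with μ and N suggest, that the ten commuting times are treated as a population. The variable names are illustrative.

```python
import math

times = [55, 68, 83, 59, 68, 75, 62, 78, 97, 83]   # commuting times from Example 1

N = len(times)
mu = sum(times) / N                                 # population mean

print(f"{'x':>4} {'x - mu':>8} {'(x - mu)^2':>12}")
sum_sq = 0.0
for x in times:
    dev = x - mu                                    # deviation of this value
    sum_sq += dev ** 2
    print(f"{x:>4} {dev:>8.1f} {dev ** 2:>12.2f}")

sigma = math.sqrt(sum_sq / N)                       # population standard deviation
print(f"mean = {mu:.1f}, sum of squares = {sum_sq:.2f}, sigma = {sigma:.1f}")
```

If the data were instead regarded as a sample of Alice's commutes, the divisor would be `N - 1` and the result would be slightly larger.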
D) Grouped Data
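Formulas
For a population,
$\sigma \approx \sqrt{\dfrac{\sum f_i (m_i - \mu)^2}{N}}$
For a sample,
$s \approx \sqrt{\dfrac{\sum f_i (m_i - \bar{x})^2}{n - 1}}$
where $f_i$ is the frequency for a given interval and $m_i$ is the midpoint of the interval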
Example 2:
Determine the mean and standard deviation for the following.
Average daily interest ($) accumulated in savings accounts
Interest ($) | Frequency
0.50-10.50 | 32
10.50-20.50 | 11
20.50-30.50 | 5
30.50-40.50 | 2
Solution:
Mean =
Standard deviation =
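The grouped-data formulas can be applied to the Example 2 table as sketched below. The notes do not say whether the 50 accounts are a population or a sample, so both divisors are computed. The variable names are illustrative assumptions.

```python
import math

# Class intervals and frequencies from the Example 2 table
intervals = [(0.50, 10.50), (10.50, 20.50), (20.50, 30.50), (30.50, 40.50)]
freqs = [32, 11, 5, 2]

midpoints = [(lo + hi) / 2 for lo, hi in intervals]   # m_i for each interval
n = sum(freqs)                                        # total number of accounts

# Each midpoint stands in for every value in its interval.
mean = sum(f * m for f, m in zip(freqs, midpoints)) / n
ss = sum(f * (m - mean) ** 2 for f, m in zip(freqs, midpoints))

sigma = math.sqrt(ss / n)        # if the accounts are the whole population
s = math.sqrt(ss / (n - 1))      # if the accounts are a sample

print(f"mean = {mean:.2f}, sigma = {sigma:.2f}, s = {s:.2f}")
```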
E) Quartiles and Interquartile Ranges
Quartiles divide a set of ordered data into four groups with equal numbers of values,
just as the median divides data into two equally sized groups. The three "dividing
points" are the first quartile (Q1), the median (sometimes called the second quartile or
Q2), and the third quartile (Q3).
Q1 and Q3 are the medians of the lower and upper halves of the data.
The interquartile range is Q3 - Q1, which is the range of the middle half of the data.
The larger the interquartile range, the greater the spread of the middle half of the data, so it provides a measure of spread.
The semi-interquartile range is one half of the interquartile range.
Both of these ranges indicate how closely the data are clustered around the median.
Example 3:
The following data represent 20 people's estimates of the size of a crowd
at a public gathering.
650   400  1000   550   500
625   600   575   700   750
500   900   600   650   700
700   450   575   750   800
Determine the median and the interquartile range.
Solution:
The number of data is even, so the median will be the mean of the two middle
data. In fact, because the number of data is a multiple of four, each of the
quartiles will also be the mean of two adjacent data.
Order the data from smallest to largest.
Order | Data
1 | 400
2 | 450
3 | 500
4 | 500
5 | 550
6 | 575
7 | 575
8 | 600
9 | 600
10 | 625
11 | 650
12 | 650
13 | 700
14 | 700
15 | 700
16 | 750
17 | 750
18 | 800
19 | 900
20 | 1000
Median = (average of 10th and 11th data) = (625 + 650)/2 = 637.5
First quartile, Q1 = (average of 5th and 6th data) = (550 + 575)/2 = 562.5
Third quartile, Q3 = (average of 15th and 16th data) = (700 + 750)/2 = 725
Interquartile range = Q3 - Q1 = 725 - 562.5 = 162.5
Semi-interquartile range = 162.5/2 = 81.25
The median of the estimates of the crowd is 637.5 with half of the
estimates within 81.25 of this.
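The quartile calculation for Example 3 can be reproduced in Python. This sketch uses the "medians of the lower and upper halves" definition given in these notes. Library quartile routines often use other conventions and may give slightly different values.

```python
import statistics

# Crowd-size estimates from Example 3
estimates = [650, 400, 1000, 550, 500, 625, 600, 575, 700, 750,
             500, 900, 600, 650, 700, 700, 450, 575, 750, 800]

ordered = sorted(estimates)
half = len(ordered) // 2                      # size of each half (middle value excluded if n is odd)

median = statistics.median(ordered)
q1 = statistics.median(ordered[:half])        # median of the lower half
q3 = statistics.median(ordered[-half:])       # median of the upper half
iqr = q3 - q1

print(median, q1, q3, iqr, iqr / 2)           # 637.5 562.5 725.0 162.5 81.25
```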
F) Percentiles
Percentiles are similar to quartiles except that percentiles divide the data into 100
intervals that have equal numbers of values. Thus, k percent of the data are less than
or equal to the kth percentile, Pk, and (100-k) percent are greater than or equal to Pk.
Example 4:
On a recent aptitude test, Carrie was rated in the 93rd percentile. If 1068 people
wrote the test, how many people had a lower score on the test than Carrie did?
Solution:
(0.93)(1068) = 993.24
There were 993 people who had a lower score than Carrie.
Example 5:
In a popular mathematics competition, only the contestants in the top 5
percentiles win Diplomas of Distinction. If there were 478 contestants, how
many Diplomas of Distinction would be awarded?
Solution:
(0.05)(478)=23.9
There were 24 Diplomas of Distinction awarded.
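The arithmetic in Examples 4 and 5 can be sketched in Python. The helper name `count_below` is an illustrative assumption. The rounding follows the worked solutions above, which round down in Example 4 and to the nearest whole number in Example 5.

```python
import math

def count_below(percentile, total):
    """Whole number of people scoring below a given percentile rank."""
    return math.floor(percentile / 100 * total)

# Example 4: 93rd percentile among 1068 people
print(count_below(93, 1068))    # 993

# Example 5: top 5 percentiles of 478 contestants, rounded to the nearest person
print(round(0.05 * 478))        # 24
```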
G) Z-scores
A z-score is the number of standard deviations that a datum is from the mean.
Calculate by dividing the deviation for a datum by the standard deviation.
For a Population
$z = \dfrac{x - \mu}{\sigma}$
For a Sample
$z = \dfrac{x - \bar{x}}{s}$
Note:
Variable values below the mean have negative z-scores, values above the mean have
positive z-scores, and values equal to the mean have a zero z-score.
Example 6:
Find the mean and standard deviation of the z-scores of the following set of data.
10  11  14  16  19  20
Solution:
Tabulate to compute the mean and standard deviation.
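Mean is μ = 15
Standard deviation is σ = 3.74 (dividing by N = 6, that is, treating the six values as a population)
The z-scores and their distribution can now be determined.
x | z = (x - μ)/σ | z²
10 | -1.34 | 1.79
11 | -1.07 | 1.14
14 | -0.27 | 0.07
16 | 0.27 | 0.07
19 | 1.07 | 1.14
20 | 1.34 | 1.79
Σz = 0, Σz² = 6.0
The mean of the z-scores is $\bar{z} = \dfrac{\sum z}{n} = 0$.
The standard deviation of the z-scores is $\sqrt{\dfrac{\sum z^2}{n}} = \sqrt{\dfrac{6.0}{6}} = 1$.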
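The result of Example 6 can be confirmed with a brief Python sketch. It assumes the population formulas (division by N), which is what reproduces the standard deviation of 3.74 quoted in the solution.

```python
import statistics

data = [10, 11, 14, 16, 19, 20]       # x values from the Example 6 table

mu = statistics.fmean(data)
sigma = statistics.pstdev(data)       # divides by N; about 3.74 for these data

z = [(x - mu) / sigma for x in data]  # z-score of each datum

z_mean = statistics.fmean(z)
z_sd = statistics.pstdev(z)

print([round(v, 2) for v in z])
print(round(z_mean, 10), round(z_sd, 10))
```

The same outcome holds for any data set standardized with its own mean and standard deviation: the z-scores have mean 0 and standard deviation 1.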
Follow Up:
page 148-149 #1-7, 9, 10