Measures of Dispersion: Mean Absolute Deviation and Standard Deviation
          Pulse Rate   Deviation   Absolute Deviation   Square of Deviation
          (x)          (x - µ)     |x - µ|              (x - µ)²
1         60           -6.75       6.75                 45.56
2         63           -3.75       3.75                 14.06
3         69           2.25        2.25                 5.06
4         54           -12.75      12.75                162.56
5         57           -9.75       9.75                 95.06
6         78           11.25       11.25                126.56
7         81           14.25       14.25                203.06
8         72           5.25        5.25                 27.56
Sum (∑)   534          0           66                   679.48
Mean (µ)  66.75        0.00        8.25                 84.94
Mean Absolute Deviation = 8.25
Variance = 84.94
Standard Deviation = 9.22
Symbol Key:
∑ (read as sigma, the capital letter) represents sum (add everything up)
µ (read as mu) represents the average
x represents any data value
n represents the total number of data values within the set
σ (read as sigma, the lowercase letter) represents standard deviation
σ² (read as sigma squared) represents variance
Use the data chart to help you answer the following questions.
1. The range of a data set represents the spread of all of the elements in the data set.
What is the range of the data set? 81 - 54 = 27. The range is 27: the highest value minus the lowest value.
What is a deviation?
Deviation is how far a value falls from the group's average.
How is deviation calculated?
Subtract the mean from the data value (x - µ). In this example, it is each pulse rate minus the set's average.
What do you notice about the average deviation value?
The mean of the deviations is 0. This will always occur because the mean is a balance point for all of the data. The total of the negative deviations below the mean is the same size as the total of the positive deviations above the mean, so they cancel out. (The number of values on each side does not have to match; only the totals do.)
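The balance-point idea can be checked directly on the pulse rates from the data chart. The short Python sketch below is only an illustration; the variable names are made up for this example and are not part of the worksheet.

```python
# Pulse rates from the data chart.
pulse_rates = [60, 63, 69, 54, 57, 78, 81, 72]

mean = sum(pulse_rates) / len(pulse_rates)      # 534 / 8 = 66.75
deviations = [x - mean for x in pulse_rates]    # x - µ for each value

# Total of the deviations below the mean and above the mean.
below = sum(d for d in deviations if d < 0)     # -33.0
above = sum(d for d in deviations if d > 0)     # 33.0

# The two totals cancel, so the average deviation is 0.
mean_deviation = sum(deviations) / len(deviations)
print(mean, below, above, mean_deviation)
```

The negative total (-33) and the positive total (33) are the same size, which is why the average deviation comes out to 0.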
2. The mean absolute deviation (MAD) of a data set represents the average distance of the data from the mean.
How do you calculate the mean absolute deviation (MAD)?
Subtract the mean from each data value.
Take the absolute value of each answer.
Find the average.
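The three steps above can be sketched in Python using the pulse rates from the data chart. The function name `mean_absolute_deviation` is a hypothetical helper written for this illustration.

```python
# Pulse rates from the data chart.
pulse_rates = [60, 63, 69, 54, 57, 78, 81, 72]

def mean_absolute_deviation(values):
    """Average distance of the values from their mean."""
    mean = sum(values) / len(values)
    # Steps 1 and 2: subtract the mean, then take the absolute value.
    absolute_deviations = [abs(x - mean) for x in values]
    # Step 3: find the average of the absolute deviations.
    return sum(absolute_deviations) / len(absolute_deviations)

mad = mean_absolute_deviation(pulse_rates)  # 66 / 8 = 8.25
print(mad)
```

The result matches the 8.25 shown in the chart.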
If the mean absolute deviation (MAD) is high, what does this tell you about the data set?
If the MAD is high, then the data is more dispersed (meaning it’s more spread out and possibly not very consistent)
If the mean absolute deviation (MAD) is low, what does this tell you about the data set?
If the MAD is low, then the data is closely clustered around the mean (meaning there is possibly more consistency in the data, so it is more reliable).
3. The variance (σ²) of a data set represents the average squared distance of the data from the mean.
How do you calculate the variance (σ²)?
Subtract the mean from each data value.
Square each answer.
Find the average.
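The variance steps above can be sketched the same way. This example divides by n (the number of data values), as the worksheet does; the function name `variance` is a hypothetical helper, and the cross-check uses `statistics.pvariance` from the Python standard library, which also divides by n.

```python
import math
import statistics

# Pulse rates from the data chart.
pulse_rates = [60, 63, 69, 54, 57, 78, 81, 72]

def variance(values):
    """Average of the squared deviations from the mean (divides by n)."""
    mean = sum(values) / len(values)
    squared_deviations = [(x - mean) ** 2 for x in values]
    return sum(squared_deviations) / len(squared_deviations)

var = variance(pulse_rates)
print(round(var, 2))  # 84.94, matching the chart

# Cross-check against the standard library's population variance.
print(math.isclose(var, statistics.pvariance(pulse_rates)))
```

The chart's sum of 679.48 comes from adding the rounded squares; without rounding, the squares add to 679.5, and either way the variance rounds to 84.94.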
5. The standard deviation (σ) of a data set represents roughly the typical distance of the data from the mean.
How do you calculate the standard deviation (σ)?
Take the square root of the variance
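As a quick check of this step, the sketch below takes the square root of the variance of the pulse rates. It assumes the population form of the variance (dividing by n), which is what the data chart uses; the variable names are illustrative only.

```python
import math

# Pulse rates from the data chart.
pulse_rates = [60, 63, 69, 54, 57, 78, 81, 72]

mean = sum(pulse_rates) / len(pulse_rates)
variance = sum((x - mean) ** 2 for x in pulse_rates) / len(pulse_rates)

# Standard deviation is the square root of the variance.
std_dev = math.sqrt(variance)
print(round(std_dev, 2))  # 9.22, matching the chart
```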
If the standard deviation (σ) is high, what does this tell you about the data?
If the standard deviation is high, then the data is more dispersed (meaning it’s more spread out and possibly not very consistent).
If the standard deviation (σ) is low, what does this tell you about the data?
If the standard deviation is low, then the data is closely clustered around the mean (meaning there is possibly more consistency in the data, so it is more reliable).
6. Outliers affect the value of the mean (µ). What effect do outliers have on:
The average deviation? Outliers don’t affect the average deviation because it will always be zero.
The mean absolute deviation? Outliers will increase the MAD because they cause the data to be more spread out, and the MAD tells how spread out the data is.
The standard deviation? Outliers will increase the standard deviation because they cause the data to be more spread out, and the standard deviation tells how spread out the data is.
Which is affected more by outliers: mean absolute deviation (MAD) or standard deviation (σ)? Explain.
Standard deviation is more affected by outliers because the deviation values are squared in the calculation, which magnifies large deviations much more than small ones.
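To see the difference, the sketch below adds one made-up outlier (a pulse rate of 120, which is not in the worksheet data) to the pulse rates and compares how much each measure grows. The helper names `mad` and `std_dev` are hypothetical, and the standard deviation divides by n as in the data chart.

```python
import math

# Pulse rates from the data chart, plus a hypothetical outlier of 120.
pulse_rates = [60, 63, 69, 54, 57, 78, 81, 72]
with_outlier = pulse_rates + [120]

def mad(values):
    """Mean absolute deviation."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)

def std_dev(values):
    """Standard deviation (divides by n)."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((x - mean) ** 2 for x in values) / len(values))

# How many times larger each measure becomes after adding the outlier.
mad_ratio = mad(with_outlier) / mad(pulse_rates)        # roughly 1.6
sd_ratio = std_dev(with_outlier) / std_dev(pulse_rates) # roughly 2.0
print(round(mad_ratio, 2), round(sd_ratio, 2))
```

With this one extra value, the MAD grows by a factor of about 1.6 while the standard deviation roughly doubles, because the outlier's large deviation is squared before it is averaged.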
1. Which histogram below matches the descriptive statistics of data set A, and which matches the descriptive statistics of data set B?
How do you know? ( hint: look at the spread of the data )
                            Data Set A   Data Set B
Mean                        45           45
Mean Absolute Deviation     9.1          16
Variance                    106.1        420.3
Standard Deviation          10.3         20.5
The first histogram matches Data Set A and the second matches Data Set B. The deviation values are higher in B than in A, which implies that the data in B is more spread out. The 2nd graph has a larger range, so its data is more spread out.
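The statistics given for the two data sets can also be checked against each other, since the standard deviation should be the square root of the variance. The sketch below does that check; the dictionary layout is just a convenient way to hold the numbers for this illustration.

```python
import math

# Variance and standard deviation as given for each data set.
given = {
    "A": {"variance": 106.1, "std_dev": 10.3},
    "B": {"variance": 420.3, "std_dev": 20.5},
}

# Square root of each variance, rounded to one decimal place.
computed = {name: round(math.sqrt(stats["variance"]), 1)
            for name, stats in given.items()}
print(computed)
```

Both square roots round to the standard deviations given (10.3 and 20.5), and B's is about twice A's, which agrees with B being the more spread-out histogram.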
2. At The Grill, steaks are cut into 12-ounce portions when they are ordered by a customer. For the past month, the manager has been trying to determine the cook who can most accurately portion out the meat. The manager collected the standard deviation of sample meat portions each week for each of his cooks.

Cook     Week 1   Week 2   Week 3   Week 4
Joe      0.2      0.2      0.2      0.15
Sarah    0.15     0.2      0.22     0.25
Holli    0.2      0.18     0.15     0.1
Jason    0.15     0.1      0.13     0.11
Ben      0.3      0.1      0.2      0.1

Compare and contrast the standard deviations for each cook. The manager needs to choose three of the cooks to cut portions of meat throughout the week. Which three should he choose? Why?
( hint: standard deviation is the average distance away from the mean )
Holli, Jason, and Ben have the lowest standard deviation values, which means their cuts of meat are more consistent with the preferred 12-oz portion.
A high deviation number means the person varies away from the average.
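One way to compare the cooks is to average each cook's four weekly standard deviations and pick the three lowest. Averaging the weeks is an assumption made for this sketch; the worksheet does not say how the weekly values should be combined.

```python
# Weekly standard deviations for each cook (Week 1 to Week 4).
weekly_std_devs = {
    "Joe":   [0.2, 0.2, 0.2, 0.15],
    "Sarah": [0.15, 0.2, 0.22, 0.25],
    "Holli": [0.2, 0.18, 0.15, 0.1],
    "Jason": [0.15, 0.1, 0.13, 0.11],
    "Ben":   [0.3, 0.1, 0.2, 0.1],
}

# Average standard deviation per cook (assumed comparison method).
averages = {cook: sum(values) / len(values)
            for cook, values in weekly_std_devs.items()}

# The three cooks with the lowest averages, most consistent first.
chosen = sorted(averages, key=averages.get)[:3]
print(chosen)
```

By this measure the three most consistent cooks are Jason, Holli, and Ben, the same three named in the answer. Holli's and Jason's values also trend downward over the month, while Sarah's rise every week.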