Box and Whisker Plots

WHY?
When you want to compare 2 or more sets of data, Box and Whisker Plots can be used
to easily show the differences between them.
HOW?
To create a Box and Whisker Plot, you only need to know how to calculate the median
of a data set.
- To calculate the median, simply arrange your data from lowest to highest, and
select the middle value.
- If you have an even number of observations, you will have two values in the
middle. Can you see why? If this is the case, then the median will just be the
average of these 2 numbers.
  - For example: if you have 50 observations, then the median will be the
  average of the 25th and 26th observations in your list.

  Position:  ... 23rd    24th    25th    26th    27th ...
  Height:        165cm   167cm   168cm   170cm   170cm

  Median = (168 + 170) / 2 = 169 cm

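The median rule described above can be sketched in a few lines of Python. This is only an illustration; the function name `median` and the short lists passed to it are chosen for the example, with the values borrowed from the five heights shown.

```python
def median(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)      # arrange from lowest to highest
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd number of observations: there is a single middle value.
        return ordered[middle]
    # Even number of observations: average the two middle values.
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median([165, 167, 168, 170, 170]))  # 168 (odd count, middle value)
print(median([168, 170]))                 # 169.0 (even count, average of the two)
```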
You now need to calculate the median of the lower half of the data, and the
upper half of the data (called lower and upper quartiles respectively). The
Interquartile Range (IQR) is simply the difference between the lower and
upper quartiles.
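As a rough sketch, the quartiles and IQR can be found by splitting the ordered data in half and taking the median of each half. The function name `quartiles` and the sample numbers are made up for illustration. The worksheet does not say what to do with an odd number of observations, so the code assumes the middle value is left out of both halves, which is one common convention.

```python
from statistics import median


def quartiles(values):
    """Return (lower quartile, median, upper quartile) using the median of each half."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    # Assumption: with an odd count, the middle value is excluded from both halves.
    upper_half = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower_half), median(ordered), median(upper_half)


# Hypothetical data, not taken from the worksheet
q1, med, q3 = quartiles([2, 4, 4, 5, 7, 8, 9, 11])
iqr = q3 - q1
print(q1, med, q3, iqr)  # 4.0 6.0 8.5 4.5
```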
The Box shows the middle half of the data between the upper and lower quartile with
the median marked as a solid line across the box.
The whiskers show the range of the data. To help identify any data that fall outside
the overall pattern, any observations more than 1.5 times the Interquartile Range below
the lower quartile or above the upper quartile are plotted individually. These are
called outliers.
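The 1.5 times IQR rule can be written out as a small Python sketch. The function names and the data are hypothetical. Ending each whisker at the most extreme value that is not an outlier is a common convention assumed here, not something the worksheet states.

```python
def outlier_limits(lower_quartile, upper_quartile):
    """Return the (lower, upper) limits given by the 1.5 x IQR rule."""
    iqr = upper_quartile - lower_quartile
    return lower_quartile - 1.5 * iqr, upper_quartile + 1.5 * iqr


def split_outliers(values, lower_quartile, upper_quartile):
    """Separate values into (values inside the limits, outliers)."""
    low, high = outlier_limits(lower_quartile, upper_quartile)
    inside = [v for v in values if low <= v <= high]
    outliers = [v for v in values if v < low or v > high]
    return inside, outliers


# Hypothetical data whose quartiles are 10 and 14, so the limits are 4 and 20
data = [3, 9, 11, 12, 13, 13, 15, 25]
inside, outliers = split_outliers(data, 10, 14)
print(outliers)                  # [3, 25]
print(min(inside), max(inside))  # 9 15 (assumed whisker ends)
```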
Below are 2 sets of data from the UK CensusAtSchool. Both are from a class of year
11 pupils asked to give their height to the nearest centimetre. Set F are the females
and set M are the males.
Set F (34 heights, cm)
156  172  170  163  166  172  156  174  164  173
174  170  175  180  164  165  172  160  167  177
173  157  168  173  177  150  158  174  165  170
170  168  173  177

Set M (30 heights, cm)
173  178  176  193  165  170  176  168  186  183
170  182  174  180  174  180  166  175  187  173
173  185  176  179  180  183  190  179  178  174

EXAMPLE
For Set F
1. Order the data from smallest to largest (34 observations)
150  156  156  157  158  160  163  164  164  165
165  166  167  168  168  170  170  170  170  172
172  172  173  173  173  173  174  174  174  175
177  177  177  180
2. Median = (170 + 170) / 2 = 170 cm
Lower quartile = 164 cm
Upper quartile = 173 cm
IQR = 173 - 164 = 9 cm
Range = 180 - 150 = 30 cm
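The figures for Set F can be checked with a short Python sketch. It uses the standard library's `statistics.median` and splits the 34 ordered heights into two halves of 17. The variable names are illustrative only.

```python
from statistics import median

# Set F heights in cm, as listed in the worksheet
set_f = [156, 172, 170, 163, 166, 172, 156, 174, 164, 173,
         174, 170, 175, 180, 164, 165, 172, 160, 167, 177,
         173, 157, 168, 173, 177, 150, 158, 174, 165, 170,
         170, 168, 173, 177]

ordered = sorted(set_f)
half = len(ordered) // 2   # 34 observations, so 17 in each half

summary = {
    "median": median(ordered),
    "lower quartile": median(ordered[:half]),
    "upper quartile": median(ordered[half:]),
    "range": ordered[-1] - ordered[0],
}
summary["IQR"] = summary["upper quartile"] - summary["lower quartile"]

for name, value in summary.items():
    print(f"{name}: {value} cm")
```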
[Figure: Boxplot of Female; height axis labelled 150, 160, 170, 180]
Use the figures given to check the outlier shown: using the IQR rule, the lower
quartile minus 1.5 times the IQR is 164 - 1.5 x 9 = 150.5 cm, so the height of
150 cm falls below this limit and is plotted as an outlier.
Now construct a Box and Whisker Plot for set M and compare it with set F. What
are your conclusions? ****
Note that set M has no outliers. Check for yourself that the lower and upper limits
for outliers are 159.5 cm and 195.5 cm respectively.
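One way to carry out this check is sketched below in Python. The quartiles 173 and 182 are the values the worksheet gives for set M. The variable names are illustrative.

```python
# Set M heights in cm, as listed in the worksheet
set_m = [173, 178, 176, 193, 165, 170, 176, 168, 186, 183,
         170, 182, 174, 180, 174, 180, 166, 175, 187, 173,
         173, 185, 176, 179, 180, 183, 190, 179, 178, 174]

# Quartiles for set M as given in the worksheet
lower_quartile, upper_quartile = 173, 182
iqr = upper_quartile - lower_quartile

# 1.5 x IQR rule: anything outside these limits counts as an outlier
lower_limit = lower_quartile - 1.5 * iqr
upper_limit = upper_quartile + 1.5 * iqr
outliers = [h for h in set_m if h < lower_limit or h > upper_limit]

print(lower_limit, upper_limit)  # 159.5 195.5
print(outliers)                  # []
```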
You may wish to record your class heights and compare these with the year 11 UK
students.
Additional thoughts
What do you think would happen to the median of the UK year 11 boys if the tallest boy
was actually 213cm? Do you think it would change? Obviously this height would be
considered an outlier using the IQR rule. This should tell you something about the
median, and whether it is affected by outliers. What about the mean? Does this
change?
Which would be the ‘better’ measure of centre? See if you can find out when you
should use the mean instead of the median, and when the 2 values will be the same.
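If you would like to explore these questions with a computer, the following Python sketch compares the mean and median of set M before and after changing the tallest height to 213 cm. It is just one way to experiment; the variable names are illustrative.

```python
from statistics import mean, median

# Set M heights in cm, as listed in the worksheet
set_m = [173, 178, 176, 193, 165, 170, 176, 168, 186, 183,
         170, 182, 174, 180, 174, 180, 166, 175, 187, 173,
         173, 185, 176, 179, 180, 183, 190, 179, 178, 174]

# The "what if" from the text: the tallest boy is 213 cm instead
tallest = max(set_m)
changed = [213 if h == tallest else h for h in set_m]

for label, heights in (("original", set_m), ("tallest = 213 cm", changed)):
    print(f"{label}: median = {median(heights)}, mean = {mean(heights):.1f}")
```

Try other values for the tallest height and watch which of the two measures moves.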
**** For Set M
Median = 177 cm
Lower quartile = 173 cm
Upper quartile = 182 cm
IQR = 182 - 173 = 9 cm
Range = 193 - 165 = 28 cm