Cumulative Frequency and Box Plots

Mr Barton’s Maths Notes
Stats and Probability
4. Cumulative Frequency and Box Plots
www.mrbartonmaths.com
Why do we bother with Statistical Diagrams?
• The answer to this question is similar to the one for: “why do we bother working out averages
and measures of spread?”.
• We live in a world jam-packed full of statistics, and if we were forced to look at all the facts
and figures in their raw, untreated form, not only would we probably not be able to make any
sense out of them, but there is also a very good chance our heads would explode.
• Statistical Diagrams, if they are done properly, present those figures in a clear, concise,
visually pleasing way, allowing us to make some sense out of the figures, summarise them, and
compare them to other sets of data.
1. What is Cumulative Frequency?
Cumulative is just a posh way of saying “add up as you go along”
Frequency is just a posh word for “total”
So… if you put them together, you get a very posh way of saying “add the totals up as you go
along”
Big Example
The table below shows the length of time a group of 40 Year 10 students spent
playing on the Nintendo Wii during a gloomy week in January. Draw a Cumulative
Frequency Curve, use it to estimate the Median and Inter-Quartile Range, and
construct a Box Plot.

Hours spent playing    Frequency
0 < h ≤ 1              2
1 < h ≤ 2              5
2 < h ≤ 3              10
3 < h ≤ 4              15
4 < h ≤ 6              5
6 < h ≤ 10             3
2. Adding a Cumulative Frequency Column
Before you can even start thinking about drawing a Cumulative Frequency Curve, you need to be
able to add a Cumulative Frequency column to your Frequency table.
Remember, Cumulative Frequency just means that you add up the frequencies as you go along,
so that is exactly what you do!
Hours spent playing    Frequency    Cumulative Freq
0 < h ≤ 1              2            2
1 < h ≤ 2              5            7
2 < h ≤ 3              10           17
3 < h ≤ 4              15           32
4 < h ≤ 6              5            37
6 < h ≤ 10             3            40
Check: This final entry should always equal the total frequency!
2 is the number of people who play for 1 hour or less
7 is the number of people who play for 2 hours or less (2 + 5)
17 is the number of people who play for 3 hours or less (2 + 5 + 10)
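If you fancy checking your column with a computer, here is a minimal Python sketch of the same "add up as you go along" idea. The variable names are just illustrative choices, and the frequencies are the ones from the table above.

```python
from itertools import accumulate

# Frequencies from the table, in group order (0 < h ≤ 1 first, 6 < h ≤ 10 last)
frequencies = [2, 5, 10, 15, 5, 3]

# accumulate() keeps a running total: 2, then 2 + 5, then 2 + 5 + 10, ...
cumulative = list(accumulate(frequencies))

print(cumulative)  # [2, 7, 17, 32, 37, 40]

# Check: the final entry should always equal the total frequency
assert cumulative[-1] == sum(frequencies)
```

The last line is the same check you do by hand: if the final running total is not 40, something has gone wrong in the adding up.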
3. Drawing the Cumulative Frequency Curve
Remember: we plot Cumulative Frequency (y axis) against the upper boundary of each group (x axis).
So… for group one it’s 1 on the x axis and 2 on the y, and for group two, it’s 2 on the x axis
and 7 on the y…
Things to notice about the Cumulative Frequency Curve:
1. When you have finished plotting the points, join them up with a smooth curve.
2. Notice the curve starts at (0, 0). This is because there is nobody playing less than 0
hours a week!
3. You must label your axes correctly, or you lose very easy marks!
[Graph: Cumulative Frequency curve. y axis: Cumulative Frequency; x axis: Time Spent Playing Wii (hours)]
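To see exactly which points get plotted, here is a short Python sketch that pairs each upper boundary with its cumulative frequency and adds the starting point (0, 0). The variable names are illustrative assumptions, not part of the original notes.

```python
from itertools import accumulate

# Upper boundary of each group and its frequency, taken from the table
upper_boundaries = [1, 2, 3, 4, 6, 10]
frequencies = [2, 5, 10, 15, 5, 3]

# The lowest possible value of the first group: nobody plays for less than 0 hours
lowest_boundary = 0

cumulative = list(accumulate(frequencies))

# Each point is (upper boundary, cumulative frequency), starting from (0, 0)
points = [(lowest_boundary, 0)] + list(zip(upper_boundaries, cumulative))

for hours, cum_freq in points:
    print(f"x = {hours}, y = {cum_freq}")
```

These are the crosses you would mark on your graph paper before joining them up with a smooth curve.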
4. Estimating the Median and Inter-Quartile Range
We have spent a while drawing our cumulative frequency curve, so we may as well use it.
Very quickly we can come up with estimates for the Median and the Inter-Quartile Range.
(a) Median
As you hopefully remember, the Median is the MIDDLE value.
To find it we:
1. Work out what is 50% of our total frequency (half way up
the y axis)
2. Draw a horizontal line across until it hits our curve
3. When it hits the curve, draw a vertical line down to the x
axis
4. The value on the x axis is our Median
(b) Inter-Quartile Range
For this we need to work out the upper quartile (UQ) and the lower quartile (LQ), and then
calculate: UQ - LQ
To find the Upper Quartile:
1. Work out what is 75% of our total frequency (three-quarters of the way up the y axis)
2. Draw a horizontal line across until it hits our curve
3. When it hits the curve, draw a vertical line down to the x axis
4. The value on the x axis is our Upper Quartile
The Lower Quartile is the same, but 25% (one-quarter) of the way up!
[Graph: Cumulative Frequency curve with lines drawn across and down for the Median, Upper Quartile and Lower Quartile. y axis: Cumulative Frequency; x axis: Time Spent Playing Wii (hours)]
Median:
50% of 40 = 20
Median = 3.2 hours
Upper Quartile:
75% of 40 = 30
UQ = 3.8 hours
Lower Quartile:
25% of 40 = 10
LQ = 2.4 hours
Inter-Quartile Range:
= UQ – LQ
= 3.8 – 2.4
= 1.4 hours
Remember: The Median is a form of average, and just like the
Range, the Inter-Quartile Range is a measure of consistency.
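The "across and down" method can be roughly imitated in Python. The sketch below is an approximation only: it assumes the plotted points are joined with straight lines rather than a smooth hand-drawn curve, so its answers will differ slightly from values read off the graph. The function name read_off is made up for this example.

```python
def read_off(points, target):
    """Go across at `target` on the y axis and return the x value where
    straight lines joining `points` reach that height."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= target <= y1 and y1 > y0:
            # Fraction of the way up this segment, applied to its width
            return x0 + (target - y0) / (y1 - y0) * (x1 - x0)
    raise ValueError("target is outside the range of the curve")

# (upper boundary, cumulative frequency) points, starting from (0, 0)
points = [(0, 0), (1, 2), (2, 7), (3, 17), (4, 32), (6, 37), (10, 40)]
total = points[-1][1]  # total frequency, 40

lq = read_off(points, 0.25 * total)      # 25% of the way up
median = read_off(points, 0.50 * total)  # 50% of the way up
uq = read_off(points, 0.75 * total)      # 75% of the way up
iqr = uq - lq

print(f"LQ = {lq:.1f}, Median = {median:.1f}, UQ = {uq:.1f}, IQR = {iqr:.1f}")
```

With straight lines this gives roughly LQ = 2.3, Median = 3.2 and UQ = 3.9, close to the 2.4, 3.2 and 3.8 read from the smooth curve. Both sets of numbers are only estimates, because the table does not tell us the exact times within each group.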
5. Drawing Box Plots
Box Plots are another way of representing all the same information that can be found on a
Cumulative Frequency graph.
Top Tip: if you have the chance, draw your box plot directly below your cumulative
frequency graph, using the same scale on the x axis, and you can just extend the vertical
lines downwards and save yourself a lot of time!
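A Box Plot marks five values, from left to right:
Lowest value, Lower Quartile, Median, Upper Quartile, Highest value
The box runs from the Lower Quartile to the Upper Quartile, so its width is the Inter-Quartile Range.
The whole plot runs from the Lowest value to the Highest value, so its width is the Range.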
Note: The minimum value is the lowest possible value of your first group, and the
maximum value is the highest possible value of your last group
[Graph: Cumulative Frequency against Time Spent Playing Wii (hours), with the Box Plot values marked]
Min Value = 0
LQ = 2.4
Median = 3.2
UQ = 3.8
Max Value = 10
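For a rough picture of what the finished Box Plot looks like, here is a small Python sketch that draws one using text characters. It uses the values read off in section 4 (LQ = 2.4, Median = 3.2, UQ = 3.8) with a minimum of 0 and a maximum of 10. The function text_box_plot and its chars_per_unit setting are invented for this illustration, and it assumes the five values are far enough apart to land in different columns.

```python
def text_box_plot(minimum, lq, median, uq, maximum, chars_per_unit=5):
    """Draw a one-line box plot using text characters."""
    def col(value):
        # Column position of a value, measured from the minimum
        return round((value - minimum) * chars_per_unit)

    line = [" "] * (col(maximum) + 1)

    # Whiskers: minimum to lower quartile, and upper quartile to maximum
    for i in range(0, col(lq)):
        line[i] = "-"
    for i in range(col(uq) + 1, col(maximum) + 1):
        line[i] = "-"

    # Box: lower quartile to upper quartile (the Inter-Quartile Range)
    for i in range(col(lq), col(uq) + 1):
        line[i] = "="

    # Mark the five key values
    line[col(minimum)] = "|"
    line[col(maximum)] = "|"
    line[col(lq)] = "["
    line[col(uq)] = "]"
    line[col(median)] = "|"
    return "".join(line)

plot = text_box_plot(0, 2.4, 3.2, 3.8, 10)
print(plot)
print("Range =", 10 - 0, "hours")
```

Notice how short the box is compared with the right-hand whisker: half of the students are squeezed into the narrow stretch between the quartiles, while a few play for much longer.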
Good luck with your revision!