Frequency Distribution - McGraw Hill Higher Education

Frequency Distributions
Describing Data
Graphic Presentations
Copyright © 2004
2003 by The McGraw-Hill Companies, Inc. All rights reserved.
When you have completed this chapter, you will be able to:
Organize raw data into a frequency distribution
Produce a histogram, a frequency polygon, and
a cumulative frequency polygon from
quantitative data
Develop and interpret a stem-and-leaf display
Present qualitative data using graphical
techniques such as a clustered bar chart, a
stacked bar chart, and a pie chart
Detect graphic deceptions and use a graph
to present data with clarity, precision,
and efficiency
A Frequency Distribution is a grouping of data into non-overlapping (mutually exclusive) classes, showing the number of observations in each category or class.
The range of categories includes all values in the data set (collectively exhaustive classes).
Class Midpoint or Class Mark:
A point that divides a class into two equal parts, i.e. the average of the upper and lower class limits.
Class frequency:
The number of observations in each class.
Class interval:
The class interval is obtained by subtracting the lower limit of a class from the lower limit of the next class, e.g. a class interval of 5 for classes with midpoints 12.5, 17.5, 22.5, 27.5, and 32.5.
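The two definitions above can be checked with a short sketch. The lower limits 10, 15, 20, 25, 30 are an assumption chosen to match the midpoints shown; the function name is illustrative only.

```python
# Assumed lower class limits (they reproduce the midpoints 12.5 ... 32.5).
lower_limits = [10, 15, 20, 25, 30]

# Class interval: lower limit of the next class minus lower limit of this class.
class_interval = lower_limits[1] - lower_limits[0]


def class_midpoint(lower, upper):
    """Average of the lower and upper class limits."""
    return (lower + upper) / 2


midpoints = [class_midpoint(lo, lo + class_interval) for lo in lower_limits]

print("Class interval:", class_interval)
print("Midpoints:", midpoints)
```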
Dr. Tillman is Dean of the School of Business.
He wishes to prepare a report showing the
number of hours per week students spend studying.
He selects a random sample of 30 students and
determines the number of hours
each student studied last week.
15.0, 23.7, 19.7, 15.4, 18.3, 23.0, 14.2, 20.8, 13.5, 20.7,
17.4, 18.6, 12.9, 20.3, 13.7, 21.4, 18.3, 29.8, 17.1, 18.9,
10.3, 26.1, 15.7, 14.0, 17.8, 33.8, 23.2, 12.9, 27.1, 16.6.
Organize the data into a frequency distribution.
There are five steps that can be used to construct a Frequency Distribution by hand:
1. Decide how many classes you wish to use.
2. Determine the class width.
3. Set up the individual class limits.
4. Tally the items into the classes.
5. Count the number of items in each class.
Decide how many classes you wish to use
Rule of Thumb:
For most data sets, you would want between 3 and 12 classes!
Use the 2 to the k rule.
Choose k so that 2 raised to the power of k is greater than the number of data points (n).
In this case, n = 30 students:
$2^5 = 32 > 30$, so use about k = 5 classes
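A minimal sketch of the 2 to the k rule, assuming "greater than" is meant strictly. The function name is hypothetical.

```python
def classes_by_2k_rule(n):
    """Smallest k such that 2**k is strictly greater than n."""
    k = 1
    while 2 ** k <= n:
        k += 1
    return k


n_students = 30
k = classes_by_2k_rule(n_students)
print("Suggested number of classes:", k)
```

The rule only gives a starting point; the rule of thumb on the number of classes still applies.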
Determine the class width
Generally, the class width should be the same size for all classes.
$\text{Class width} \ge \dfrac{\text{Max} - \text{Min}}{k}$
With k = 5: (33.8 – 10.3) / 5 = 4.7
Therefore, use a class size of 5 hours
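The same calculation as a sketch. Rounding the result up to the next whole number of hours is an assumption about what counts as a convenient class size.

```python
import math

# Hours studied last week by the 30 sampled students.
hours = [15.0, 23.7, 19.7, 15.4, 18.3, 23.0, 14.2, 20.8, 13.5, 20.7,
         17.4, 18.6, 12.9, 20.3, 13.7, 21.4, 18.3, 29.8, 17.1, 18.9,
         10.3, 26.1, 15.7, 14.0, 17.8, 33.8, 23.2, 12.9, 27.1, 16.6]

k = 5  # number of classes from the 2 to the k rule

# Minimum width needed so that k classes cover the whole range.
raw_width = (max(hours) - min(hours)) / k

# Assumption: round up to the next whole hour.
class_width = math.ceil(raw_width)

print("Raw width:", round(raw_width, 1))
print("Class width used:", class_width)
```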
Set up the individual class limits
Minimum Value is 10.3, therefore, classes should start at 10 hours.
Class Width is 5 hours, therefore, lower class limits will be: 10, 15, 20, etc.
Classes
10.0 – 14.9
15.0 – 19.9
20.0 – 24.9
25.0 – 29.9
30.0 – 34.9
or
Classes
10.0 to under 15
15.0 to under 20
20.0 to under 25
25.0 to under 30
30.0 to under 35
Tally the items into the classes
Find the class for each value and add a tally mark to it: for example, 14.2, 12.9, 12.9, 13.5, 14.0, and 13.7 fall in the class 10.0 to under 15,
…and so on with the remaining hours.
Count the number of items in each class
Hours Studying x     Frequency f
10.0 to under 15     7
15.0 to under 20     12
20.0 to under 25     7
25.0 to under 30     3
30.0 to under 35     1
Total                30
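The tally and count steps can be sketched as follows, using "lower limit to under upper limit" classes. The function name and its parameters are illustrative, not part of the original material.

```python
hours = [15.0, 23.7, 19.7, 15.4, 18.3, 23.0, 14.2, 20.8, 13.5, 20.7,
         17.4, 18.6, 12.9, 20.3, 13.7, 21.4, 18.3, 29.8, 17.1, 18.9,
         10.3, 26.1, 15.7, 14.0, 17.8, 33.8, 23.2, 12.9, 27.1, 16.6]


def frequency_distribution(data, start, width, n_classes):
    """Count the values falling in each class [lower, lower + width)."""
    counts = [0] * n_classes
    for x in data:
        for i in range(n_classes):
            lower = start + i * width
            if lower <= x < lower + width:
                counts[i] += 1
                break
    return counts


counts = frequency_distribution(hours, start=10, width=5, n_classes=5)
for i, f in enumerate(counts):
    lower = 10 + 5 * i
    print(f"{lower} to under {lower + 5}: {f}")
print("Total:", sum(counts))
```

Calling the same function with a different `start` changes which class each value lands in, so the resulting distribution depends on the limits chosen.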
Using different limits
…will give you a different distribution, e.g.
Hours Studying x       Frequency f
7.5 to under 12.5      1
12.5 to under 17.5     12
17.5 to under 22.5     10
22.5 to under 27.5     5
27.5 to under 32.5     1
32.5 to under 37.5     1
Total                  30
Construct a Frequency Distribution
Using Excel
Click on Frequency Distributions, then click on Quantitative.
[Screenshots: Excel menus and the frequency distribution dialog; the inputs shown are $A:$A, 5, and 10, followed by the resulting output.]
Relative Frequency Distribution
…shows the proportion (or percent) of observations in each class!
Hours Studying x     f      Relative f
10.0 to under 15     7      7/30 = 0.2333
15.0 to under 20     12     12/30 = 0.40
20.0 to under 25     7      7/30 = 0.2333
25.0 to under 30     3      3/30 = 0.10
30.0 to under 35     1      1/30 = 0.0333
Total                30     30/30 = 1
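A short sketch of the relative frequency calculation, starting from the class frequencies counted by hand. Rounding to four decimal places is an assumption made to match the table.

```python
classes = ["10.0 to under 15", "15.0 to under 20", "20.0 to under 25",
           "25.0 to under 30", "30.0 to under 35"]
counts = [7, 12, 7, 3, 1]

total = sum(counts)

# Relative frequency = class frequency / total number of observations.
relative = [round(f / total, 4) for f in counts]

for name, f, rel in zip(classes, counts, relative):
    print(f"{name}: {f}/{total} = {rel}")
```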
Using different limits
Hours Studying x       f      Relative f
7.5 to under 12.5      1      1/30 = 0.0333
12.5 to under 17.5     12     12/30 = 0.40
17.5 to under 22.5     10     10/30 = 0.3333
22.5 to under 27.5     5      5/30 = 0.1667
27.5 to under 32.5     1      1/30 = 0.0333
32.5 to under 37.5     1      1/30 = 0.0333
Total                  30     30/30 = 1
Stem-and-leaf Displays
A statistical technique for displaying a set of data.
Each numerical value is divided into two parts:
1. the leading digits become the stem and
2. the trailing digits become the leaf.
…an advantage of the stem-and-leaf
display over a frequency distribution is
that we retain the value of each observation!
Stem-and-leaf Displays
A student achieved the following
scores on the twelve accounting
quizzes this semester:
86, 79, 92, 84, 69, 88, 91,
83, 96, 78, 82, 85.
Construct a stem-and-leaf chart to
illustrate the results.
Stem-and-leaf Displays
86, 79, 92, 84, 69, 88,
91, 83, 96, 78, 82, 85.
First, find the lowest score (69); its leading digit, 6, is the first stem.
Now list the next scores with higher leading digits (78, 82, 91).
You should now have the following STEMS:
6
7
8
9
Stem-and-leaf Displays
86, 79, 92, 84, 69, 88,
91, 83, 96, 78, 82, 85.
Split each score into its stem and leaf. Now, list the remaining ‘leaf’ scores!
Stem  Leaf
6     9
7     8 9
8     2 3 4 5 6 8
9     1 2 6
All 12 scores are now shown.
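The construction can be sketched in a few lines, assuming two-digit scores so that the tens digit is the stem and the units digit is the leaf. The function name is illustrative.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group each value's trailing digit (leaf) under its leading digit (stem)."""
    stems = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # e.g. 86 -> stem 8, leaf 6
        stems[stem].append(leaf)
    return dict(stems)


scores = [86, 79, 92, 84, 69, 88, 91, 83, 96, 78, 82, 85]
display = stem_and_leaf(scores)

print("Stem  Leaf")
for stem in sorted(display):
    print(f"{stem}     {' '.join(str(leaf) for leaf in display[stem])}")
```

Because every leaf is kept, the original scores can be rebuilt from the display, which is the advantage over a frequency distribution.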
The grades on a statistics exam for a sample of 40 students are as follows:
Stem  Leaf
3     68
4     1278
5     0125589
6     01112578889
7     0025667
8     46889
9     0246
Alpha-Numeric Grading
A+ = 90%-100%
A  = 80%-89%
B+ = 75%-79%
B  = 70%-74%
C+ = 65%-69%
C  = 60%-64%
D  = 55%-59%
F  = 0%-54%
How many students earned an A on this test? 5
What is the most common letter grade earned? F
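Both answers can be verified by expanding the stem-and-leaf display back into the 40 grades and applying the grading scale. This is a sketch; the variable and function names are illustrative.

```python
from collections import Counter

# Stem-and-leaf display of the 40 exam grades (stem = tens digit).
stem_leaf = {3: "68", 4: "1278", 5: "0125589", 6: "01112578889",
             7: "0025667", 8: "46889", 9: "0246"}

grades = [10 * stem + int(leaf)
          for stem, leaves in stem_leaf.items() for leaf in leaves]

# Lower cutoff of each letter grade, highest first.
scale = [(90, "A+"), (80, "A"), (75, "B+"), (70, "B"),
         (65, "C+"), (60, "C"), (55, "D"), (0, "F")]


def letter(score):
    """Return the first letter grade whose cutoff the score reaches."""
    for cutoff, name in scale:
        if score >= cutoff:
            return name


letter_counts = Counter(letter(g) for g in grades)
print("Students with an A:", letter_counts["A"])
print("Most common grade:", letter_counts.most_common(1)[0])
```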
Graphic Presentation of a
Frequency Distribution
The three commonly used graphic forms are:
Histograms
Frequency Polygons or Line Charts
Cumulative Frequency Distributions
A Histogram is a graph in which the classes are marked on the horizontal axis and the class frequencies on the vertical axis.
The class frequencies are represented by the heights of the bars, and the bars are drawn adjacent to each other.
[Figure: sketch of a histogram; vertical axis: Frequency; horizontal axis: Class]
Graphic Presentation of a
Frequency Distribution
Histogram
Hours Studying x     f
10.0 to under 15     7
15.0 to under 20     12
20.0 to under 25     7
25.0 to under 30     3
30.0 to under 35     1
[Figure: histogram of the table above; vertical axis: frequency (0 to 14); horizontal axis: Hours spent studying (10 to 35)]
Graphic Presentation of a
Frequency Distribution
A frequency polygon consists of line segments connecting the points formed by the class midpoint and the class frequency.
A cumulative frequency distribution is used to determine how many or what proportion of the data values are below or above a certain value.
[Figures: a frequency polygon and a cumulative frequency polygon]
Making a Histogram in Excel
Using Excel
Click on DATA ANALYSIS, then click on HISTOGRAM.
The upper limits of the classes you have determined must now be entered from Column B (Excel calls these “bins”).
Complete INPUTTING of DATA.
To remove the Legend on the right side: right mouse click on it and click on Clear.
To reduce or remove the spaces between the bars: right mouse click on one of the bars and click on Format Data Series. Now, click on the Options tab, adjust the Gap width down to 0, and click on OK.
Edit the size of the histogram, titles, etc. as appropriate.
Note that the upper limit values are included in each class. This explains the difference between this Excel Frequency Distribution and the one we did by hand.
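The effect of including the upper limit in each class can be illustrated with a sketch. It assumes the bins entered are 15, 20, 25, 30, and 35 (the values actually typed into Column B are not shown here) and uses Python's `bisect` module rather than Excel itself.

```python
import bisect

hours = [15.0, 23.7, 19.7, 15.4, 18.3, 23.0, 14.2, 20.8, 13.5, 20.7,
         17.4, 18.6, 12.9, 20.3, 13.7, 21.4, 18.3, 29.8, 17.1, 18.9,
         10.3, 26.1, 15.7, 14.0, 17.8, 33.8, 23.2, 12.9, 27.1, 16.6]

bins = [15, 20, 25, 30, 35]  # assumed upper limits of the classes

# Upper limit included: a value equal to a bin counts in that bin.
upper_included = [0] * len(bins)
for x in hours:
    upper_included[bisect.bisect_left(bins, x)] += 1

# "To under" classes: a value equal to a limit counts in the next class.
by_hand = [0] * len(bins)
for x in hours:
    by_hand[bisect.bisect_right(bins, x)] += 1

print("Upper limit included:", upper_included)
print("To under (by hand):  ", by_hand)
```

Under these assumptions only the value 15.0 changes class, which is enough to shift one observation from the second class to the first.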
Frequency Polygon or Line Chart
for Hours Spent Studying
Hours Studying x     f
10.0 to under 15     7
15.0 to under 20     12
20.0 to under 25     7
25.0 to under 30     3
30.0 to under 35     1
[Figure: frequency polygon; vertical axis: frequency (0 to 14); horizontal axis: Hours spent studying (10 to 35)]
Notice that the class midpoints (the plotted points) aren’t as “user friendly” in this distribution choice.
Cumulative Frequency Distribution
For Hours Studying
Hours Studying x     f      Hours Studying     Cumulative f
10.0 to under 15     7      under 15           7
15.0 to under 20     12     under 20           19
20.0 to under 25     7      under 25           26
25.0 to under 30     3      under 30           29
30.0 to under 35     1      under 35           30
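The cumulative column is a running total of the class frequencies, which can be sketched with `itertools.accumulate` from the Python standard library.

```python
from itertools import accumulate

upper_limits = [15, 20, 25, 30, 35]
counts = [7, 12, 7, 3, 1]

# Running total: number of observations below each upper class limit.
cumulative = list(accumulate(counts))

for limit, cum in zip(upper_limits, cumulative):
    print(f"under {limit}: {cum}")
```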
Cumulative Frequency Distribution
For Hours Studying
Hours Studying     Cumulative f
under 15           7
under 20           19
under 25           26
under 30           29
under 35           30
[Figure: cumulative frequency polygon; vertical axis: cumulative frequency (0 to 35); horizontal axis: Hours spent studying (10 to 35)]
Notice that the limits are the plotted points.
Pie, Bar, and Line charts
… used primarily for Qualitative Data
Pie
…is useful for displaying a Relative Frequency Distribution.
A circle is divided proportionally to the relative frequency and portions of the circle are allocated for the different groups.
Pie
200 runners were asked to indicate their favourite type of running shoe.
Type       # of runners selecting
Nike       92
Adidas     49
Reebok     37
Asics      13
Other      9
Draw a pie chart based on this information.
Pie
Relative Frequency Distribution for the running shoes
Type       #       %
Nike       92      46.0
Adidas     49      24.5
Reebok     37      18.5
Asics      13      6.5
Other      9       4.5
Total      200     100
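A sketch of the calculation behind the pie chart: each slice's percentage, and the corresponding angle out of 360 degrees. The angle calculation is an addition for illustration; the slides themselves only show percentages.

```python
shoes = {"Nike": 92, "Adidas": 49, "Reebok": 37, "Asics": 13, "Other": 9}

total = sum(shoes.values())

# Percent of runners choosing each type, and the matching slice angle.
percent = {brand: round(100 * n / total, 1) for brand, n in shoes.items()}
angle = {brand: round(360 * n / total, 1) for brand, n in shoes.items()}

for brand in shoes:
    print(f"{brand}: {percent[brand]}% of the circle, {angle[brand]} degrees")
```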
Pie
[Figure: pie chart of favourite running shoe: Nike 46.0%, Adidas 24.5%, Reebok 18.5%, Asics 6.5%, Other 4.5%]
Using Excel, follow the steps in the Chart Wizard to construct a Pie Chart!
Bar (also known as a ‘column chart’)
…can be used to depict any of the levels of measurement (nominal, ordinal, interval, or ratio).
[Figures: examples of bar charts]
Bar
Use bar charts also when the order in which qualitative data are presented is meaningful.
Bar
How could we chart this data?
Using Excel we can produce this…
Other formats…
Bar
Employment Rate in Canadian Cities
Canadian City     Employment Rate (%)
Victoria          57.7
Vancouver         61.4
Edmonton          67.1
Winnipeg          66.7
Saskatoon         63.7
Regina            67.4
Thunder Bay       61.0
London            63.3
Kitchener         66.0
Hamilton          63.2
Toronto           65.1
Quebec            59.7
Sherbrooke        59.2
Montreal          60.4
Halifax           60.5
[Figure: bar chart of the table above; vertical axis: % employment (52 to 70)]
Bar
Did any of the previous Bar Charts adequately display all the information that was provided?
The following has been modified from data found by Statistics Canada.
Does it do an effective job of displaying the StatCan data?
Clustered Bar
Comparison of Internet Use in 2000 and 2001
[Figure: clustered bar chart; vertical axis: % of enterprises (0 to 100); categories: Manufacturing, Wholesale trade, Retail trade; series: % of enterprises that use the Internet 2000, % of enterprises that use the Internet 2001, % of enterprises with a Web site 2000, % of enterprises with a Web site 2001]
Data Source: Statistics Canada
Stacked Bar
Full-Time University Faculty By Gender,
Canada and Jurisdictions, 1987-88 and 1997-98
Rank                    Year        Number     % Female     % Male
Total                   1987-88     34,651     17           83
Total                   1997-98     33,925     25           75
Full Professor          1987-88     12,829     7            93
Full Professor          1997-98     13,910     13           87
Associate Professor     1987-88     12,650     17           83
Associate Professor     1997-98     12,095     28           72
Other                   1987-88     9,172      32           68
Other                   1997-98     7,817      44           56
[Figure: stacked bar chart "Canadian Full Time University Faculty"; vertical axis: % of Total (0 to 120); series: % males, % females; bars for 1987-88 and 1997-98]
Data Source: Statistics Canada
Make sure that your charts are not overly cluttered
Shapes of Histograms
There are four typical shape characteristics
Symmetry: …a balanced effect!
Both histograms are ‘balanced’ or ‘have symmetry’.
Skewness … occurs when the observations are graphed as being skewed or tilted more to one side of the centre of the observations than the other.
The skewness, if on the right side, is said to be ‘positive’.
The skewness, if on the left side, is said to be ‘negative’.
Modal Class
A modal class is the one with the largest number of observations.
This is a unimodal histogram.
Modal Class
This is a bimodal histogram.
[Figure: histogram with two modal classes, each labelled bimodal]
Population distributions are often bell shaped.
Drawing a histogram helps verify the shape of the population in question.
Line
Line charts are particularly useful when the trend over time is to be emphasized.
Examples …
[Figures: example line charts, labelled 3-D and In combination]
Time Plot
[Figure: time plot of Monthly Steel Production; vertical axis: 5.5 to 8.5; horizontal axis: Month, January 2000 through October 2002]
Line
Employment Rate in Canadian Cities
[Figure: line chart of the employment rate by city; vertical axis: % employment (52 to 70)]
Preparing a Line Chart for this type of data is not overly useful!
Line
Employment Rate in Canadian Cities
[Figure: combination chart of the employment rate by city; vertical axis: % employment (52 to 70)]
Is this combination any better for displaying the data?
Line
Frequency Polygon and Ogive
[Figures: a frequency polygon (vertical axis 0.0 to 0.3) and an ogive (vertical axis 0.0 to 1.0), each plotted against Sales (0 to 50)]
Test your learning…
www.mcgrawhill.ca/college/lind
Online Learning Centre
for quizzes
extra content
data sets
searchable glossary
access to Statistics Canada’s E-Stat data
…and much more!
This completes Chapter 2