Chapter 2
Organizing Data
Understanding Basic Statistics
Fifth Edition
By Brase and Brase
Prepared by Jon Booze
Frequency Tables
• A frequency table
– organizes quantitative data.
– partitions data into classes (intervals).
– shows how many data values are in each
class.
Test Score
Number of
Students
61-70
4
71-80
8
81-90
15
91-100
7
Copyright © Cengage Learning. All rights reserved.
2|2
Data Classes and Class Frequency
• Class: an interval of values.
– Example: 61  x  70
• Frequency: the number of data values that fall
within a class.
– “Five data fall within the class 61  x  70”.
• Relative Frequency: the proportion of data
values that fall within a class.
– “18% of the data fall within the class
61  x  70”.
Copyright © Cengage Learning. All rights reserved.
2|3
Structure of a Data Class
A “data class” is basically an interval on a number
line.
It has:
• A lower limit a and an
upper limit b.
• A width.
• A lower boundary and
an upper boundary
(integer data).
• A midpoint.
Copyright © Cengage Learning. All rights reserved.
2|4
Structure of a Data Class
A “data class” is basically an interval on a number
line.
If a = 60 and b = 69 for
integer data, what is
the value of the lower
boundary?
a). 60
b). 59.5
c). 9
d). 64.5
Copyright © Cengage Learning. All rights reserved.
2|5
Structure of a Data Class
A “data class” is basically an interval on a number
line.
If a = 60 and b = 69 for
integer data, what is
the value of the lower
boundary?
a). 60
b). 59.5
c). 9
d). 64.5
Copyright © Cengage Learning. All rights reserved.
2|6
Constructing Data Classes
• Find the class width.
Largest data value – smallest data value
–
Desired number of classes
– Increase the computed value to the next
higher whole number.
•
Find the class limits.
– The lower limit of the “leftmost” class is set
equal to the smallest value in the data set.
Copyright © Cengage Learning. All rights reserved.
2|7
Constructing Data Classes, cont’d
• Find the class boundaries (integer data).
– Subtract 0.5 from the lower class limit and
add 0.5 to the upper class limit.
For a certain data set, the minimum value is 25
and the maximum value is 58. If you wish to
partition the data into 5 classes, what would be
the class width?
a). 5
b). 6
Copyright © Cengage Learning. All rights reserved.
c). 7
d). 8
2|8
Constructing Data Classes, cont’d
• Find the class boundaries (integer data).
– Subtract 0.5 from the lower class limit and
add 0.5 to the upper class limit.
For a certain data set, the minimum value is 25
and the maximum value is 58. If you wish to
partition the data into 5 classes, what would be
the class width?
a). 5
b). 6
Copyright © Cengage Learning. All rights reserved.
c). 7
d). 8
2|9
Building a Frequency Table
• Find the class width, class limits, and class
boundaries of the data.
• Use Tally marks to count the data in each class.
• Record the frequencies (and relative
frequencies if desired) on the table.
Copyright © Cengage Learning. All rights reserved.
2 | 10
Histograms
• Histogram – graphical summary of a frequency
table.
• Uses bars to plot the data classes versus the
class frequencies.
Copyright © Cengage Learning. All rights reserved.
2 | 11
Making a Histogram
• Make a frequency table.
• Place class boundaries on horizontal axis.
Place frequencies on vertical axis.
• For each class, draw a bar with height equal to
the class frequency and width equal to the class
width plus 1.
Copyright © Cengage Learning. All rights reserved.
2 | 12
Making a Histogram
Copyright © Cengage Learning. All rights reserved.
2 | 13
Distribution Shapes
Symmetric
Uniform
Skewed Left
Skewed Right
Copyright © Cengage Learning. All rights reserved.
Bimodal
2 | 14
Critical Thinking
• A bimodal distribution shape might indicate that
the data are from two different populations.
• Outliers – data values that are very different
from other values in the data set.
• Outliers may indicate data recording errors.
Copyright © Cengage Learning. All rights reserved.
2 | 15
Exploratory Data Analysis
• EDA is the process of learning about a data set
by creating graphs.
• EDA specifically looks for patterns and trends in
the data.
• EDA also identifies extreme values.
Copyright © Cengage Learning. All rights reserved.
2 | 16
Graphical Displays…
• … represent the data.
• … induce the viewer to think about the
substance of the graphic.
• …should avoid distorting the message of the
data.
Copyright © Cengage Learning. All rights reserved.
2 | 17
Bar Graphs
• Used for qualitative or quantitative data.
• Can be vertical or horizontal.
• Bars are uniformly spaced and have equal
widths.
• Length/height of bars indicate counts or
percentages of the variable.
• “Good practice” requires including titles and
units and labeling axes.
Copyright © Cengage Learning. All rights reserved.
2 | 18
Bar Graphs
Example:
Copyright © Cengage Learning. All rights reserved.
2 | 19
Pareto Charts
• A bar chart with two specific features:
– Heights of bars represent frequencies.
– Bars are vertical and are ordered from tallest
to shortest.
Copyright © Cengage Learning. All rights reserved.
2 | 20
Circle Graphs/Pie Charts
• Used for qualitative data
• Wedges of the circle represent proportions of
the data that share a common characteristic.
• “Good practice” requires including a title and
either wedge labels or legend.
Copyright © Cengage Learning. All rights reserved.
2 | 21
Time-Series
• Shows data measurements in chronological
order.
• Data are plotted in order of occurrence at
regular intervals over a period of time.
Copyright © Cengage Learning. All rights reserved.
2 | 22
Critical Thinking –
which type of graph to use?
• Bar graphs are useful for quantitative or
qualitative data.
• Pareto charts identify the frequency in
decreasing order.
• Circle graphs display how a total is dispersed
into several categories.
• Time-series graphs display how data change
over time.
Copyright © Cengage Learning. All rights reserved.
2 | 23
Critical Thinking –
which type of graph to use?
What type of graph would be best for showing the
ice cream flavor preferences of a group of 100
children?
a). Histogram
b). Pareto graph
c). Time series graph d). Circle graph
Copyright © Cengage Learning. All rights reserved.
2 | 24
Critical Thinking –
which type of graph to use?
What type of graph would be best for showing the
ice cream flavor preferences of a group of 100
children?
a). Histogram
b). Pareto graph
c). Time series graph d). Circle graph
Copyright © Cengage Learning. All rights reserved.
2 | 25
Stem and Leaf Plots
•
•
Displays the distribution of the data while
maintaining the actual data values.
Each data value is split into a stem and a leaf.
Copyright © Cengage Learning. All rights reserved.
2 | 26
Stem and Leaf Plot Construction
Copyright © Cengage Learning. All rights reserved.
2 | 27
Critical Thinking
• By looking at the stem-andleaf display “sideways”, we
can see the distribution
shape of the data.
Copyright © Cengage Learning. All rights reserved.
2 | 28
Critical Thinking
• Large gaps between stems containing leaves,
especially at the top or bottom, suggest the
existence of outliers.
• Watch the outliers – are they data errors or
simply unusual data values?
Copyright © Cengage Learning. All rights reserved.
2 | 29