Stem and Leaf Plots

Displaying Data







Data: Categorical and Numerical
Dot Plots
Stem and Leaf Plots
Back-to-Back Stem and Leaf Plots
Grouped Frequency Tables
Histograms and Bar Graphs
Circle Graphs (Pie Charts)
Graphs are used to try to tell a story.
Data
Data are observations (such as measurements,
genders, survey responses) that have been
collected. Statisticians often collect data from
small portions of a large group in order to
determine information about the group. This
information is then used to make conjectures
about the entire group.
Describing data frequently involves reading
information from graphical displays, tables, lists,
and so on.
Data: Categorical and Numerical
Categorical data are data that represent
characteristics of objects or individuals in groups
(or categories), such as black or white, inside or
outside, male or female.
Numerical data are data collected on numerical
variables. For example, in grade school, students
may ask whether there is a difference in the
distance that girls and boys can jump. The
distance jumped is a numerical variable and the
collected data are numerical.
Dot Plots
A dot plot, or line plot, provides a quick and simple
way of organizing numerical data. Dot plots are
typically used when there is only one group of data
with fewer than 50 values.
Dot Plots
Suppose the 30 students in Abel’s class received
the following test scores:
Dot Plots
A dot plot for the class scores consists of a
horizontal number line on which each score is
denoted by a dot, or an x, above the corresponding
number-line value. The number of x’s above each
score indicates how many times each score
occurred.
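The counting behind a dot plot can be sketched in a few lines of Python. This is only an illustration: the function name dot_plot_rows and the sample scores are made up for the sketch (they are not the class data from the slide), and the plot is printed sideways, one row per score, rather than above a horizontal number line.

```python
from collections import Counter

def dot_plot_rows(scores):
    """Return one text row per distinct score: the score, then one x per occurrence."""
    counts = Counter(scores)
    return [f"{score:>3} | {'x' * counts[score]}" for score in sorted(counts)]

# Hypothetical scores, invented for illustration only.
sample_scores = [72, 82, 82, 72, 97, 82, 52, 98, 82]
for row in dot_plot_rows(sample_scores):
    print(row)
```

The number of x's in a row is the number of times that score occurred, which is exactly what the stacked marks in a dot plot show.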
Dot Plots
Callouts on the dot plot of the class scores:
Two students scored 72.
The score 52 is an outlier.
Four students scored 82.
A gap occurs between scores 88 and 97.
Scores 97 and 98 form a cluster.
Dot Plots
If a dot plot is constructed on grid paper, then
shading in the squares with x’s and adding a
vertical axis depicting the scale allows the
formation of a bar graph.
Stem and Leaf Plots
The stem and leaf plot is similar to the dot plot,
but the number line is usually vertical, and digits
are used rather than x’s.
9 | 7 represents 97.
Stem and Leaf Plots
In an ordered stem and leaf plot, the data are in
order from least to greatest on a given row.
Stem and Leaf Plots
Advantages of stem-and-leaf plots:
• They are easily created by hand.
• They do not become unmanageable when the volume of
data is large.
• No data values are lost.
Disadvantage of stem-and-leaf plots:
• We lose information: we may know that a data value
exists, but we cannot tell which individual it belongs to.
Stem and Leaf Plots
How to construct a stem-and-leaf plot:
1. Find the high and low values of the data.
2. Decide on the stems.
3. List the stems in a column from least to
greatest.
4. Use each piece of data to create leaves to the
right of the stems on the appropriate rows.
5. If the plot is to be ordered, list the leaves in
order from least to greatest.
6. Add a legend identifying the values
represented by the stems and leaves.
7. Add a title explaining what the graph is about.
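The construction steps can be sketched in Python as follows. This is a minimal sketch, assuming the data are non-negative whole numbers whose last digit serves as the leaf; the function names stem_and_leaf and render and the sample values are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

def stem_and_leaf(values, ordered=True):
    """Group whole numbers by stem (all digits but the last) and leaf (last digit)."""
    leaves = defaultdict(list)
    for value in values:
        leaves[value // 10].append(value % 10)
    # Steps 1-3: find the low and high values and list every stem between them.
    low_stem, high_stem = min(values) // 10, max(values) // 10
    plot = {}
    for stem in range(low_stem, high_stem + 1):
        row = leaves.get(stem, [])
        # Steps 4-5: place the leaves on their row, sorted if the plot is ordered.
        plot[stem] = sorted(row) if ordered else list(row)
    return plot

def render(values, title):
    """Return the ordered plot as text, with a title and a legend line."""
    plot = stem_and_leaf(values)
    lines = [title]
    for stem, row in plot.items():
        lines.append(f"{stem} | {' '.join(map(str, row))}".rstrip())
    example = max(values)
    # Step 6: the legend explains how to read a stem and a leaf together.
    lines.append(f"{example // 10} | {example % 10} represents {example}")
    return "\n".join(lines)

# Hypothetical data, invented for illustration only.
sample = [97, 52, 72, 78, 71, 88]
print(render(sample, "Hypothetical test scores"))
```

Stems with no data (here, 6) are still listed, so gaps in the data remain visible.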
Back-to-Back Stem-and-Leaf Plots
Back-to-back stem-and-leaf plots can be used to
compare two sets of related data. In this plot,
there is one stem and two sets of leaves, one to
the left and one to the right of the stem.
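As an illustration, a back-to-back plot can be produced with a short Python function. This is a sketch under assumptions: the function name back_to_back and the two small data sets are hypothetical, the values are assumed to be non-negative whole numbers, and the left-hand leaves are written so that the smallest leaf sits closest to the stem.

```python
def back_to_back(left_values, right_values):
    """Return text rows of the form 'left leaves | stem | right leaves'."""
    all_values = list(left_values) + list(right_values)
    stems = range(min(all_values) // 10, max(all_values) // 10 + 1)
    rows = []
    for stem in stems:
        # Left leaves read outward from the stem, so sort them in descending order.
        left = sorted((v % 10 for v in left_values if v // 10 == stem), reverse=True)
        right = sorted(v % 10 for v in right_values if v // 10 == stem)
        rows.append((" ".join(map(str, left)), stem, " ".join(map(str, right))))
    # Pad the left side so that the shared stem column lines up.
    width = max(len(left_text) for left_text, _, _ in rows)
    return [f"{left_text:>{width}} | {stem} | {right_text}".rstrip()
            for left_text, stem, right_text in rows]

# Hypothetical data, invented for illustration only.
group_a = [67, 63, 71]
group_b = [58, 60, 65]
for line in back_to_back(group_a, group_b):
    print(line)
```

Because both groups share one column of stems, the lengths of the rows on each side can be compared directly.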
Example
Group the presidents into two groups, George
Washington to Rutherford B. Hayes and James Garfield
to Ronald Reagan.
Example (continued)
a. Create a back-to-back stem and leaf plot of the
two groups and see if there appears to be a
difference in ages at death between the two
groups.
b. Which group of presidents seems to have lived
longer?
Example (continued)
Because the ages at death vary from 46 to 93, the
stems vary from 4 to 9. The first 19 presidents are
listed on the left and the remaining 19 on the right.
Example (continued)
The early presidents seem, on average, to have
lived longer because the ages at the high end,
especially in the 70s through 90s, come more
often from the early presidents.
The ages at the lower end come more often from
the later presidents.
For the stems in the 50s and 60s, the numbers of
leaves are about equal.
Stem and Leaf Plots
A stem and leaf plot shows how wide a range of
values the data cover, where the values are
concentrated, whether the data have any
symmetry, where gaps in the data are, and
whether any data points are decidedly different
from the rest of the data.
Frequency Tables
A frequency distribution table shows how many
data values fall within each range (class interval).
The data for the ages of the presidents at death
are summarized in the table.
Frequency Tables
Characteristics of Frequency Tables
• Each class interval has the same size.
• The size of each interval can be computed by
subtracting the lower endpoint from the higher
and adding 1, e.g., 49 − 40 + 1 = 10.
• We know how many data values occur within a
particular interval, but we do not know the
particular data values themselves.
• As the interval size increases, information is lost.
• Classes (intervals) should not overlap.
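A grouped frequency table with these characteristics can be sketched in Python. The function name frequency_table, its parameters start and width, and the sample ages are assumptions made for this illustration; they are not the presidents' data from the table.

```python
def frequency_table(values, start, width):
    """Count whole-number values in equal-size, non-overlapping classes.

    Classes are start..start+width-1, start+width..start+2*width-1, and so on,
    continuing until the largest value is covered.
    """
    top = max(values)
    table = []
    lower = start
    while lower <= top:
        upper = lower + width - 1   # so upper - lower + 1 == width
        count = sum(1 for v in values if lower <= v <= upper)
        table.append((lower, upper, count))
        lower += width              # next class starts right after this one ends
    return table

# Hypothetical ages, invented for illustration only.
sample_ages = [46, 49, 53, 67, 68, 90]
for lower, upper, count in frequency_table(sample_ages, start=40, width=10):
    print(f"{lower}-{upper}: {count}")
```

Each row records only a count, which is why the individual data values cannot be recovered from the table, and a larger width would hide even more detail.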
Histograms and Bar Graphs
A histogram is made up of adjoining rectangles, or
bars.
The bars are all the same width. The scale on the
vertical axis must be uniform.
Histograms and Bar Graphs
The death ages are shown on the horizontal axis,
and the numbers along the vertical axis give the
uniform scale for the frequency.
Frequencies are shown by the heights of vertical
bars, each having the same width.
Bar Graphs
A bar graph typically has spaces between the
bars and is used to depict categorical data.
The bars representing
Tom, Dick, Mary, Joy, and
Jane could be
placed in any order.
Histograms and Bar Graphs
A distinguishing feature between histograms and
bar graphs is that the bars of a bar graph can be
placed in any order, whereas the bars of a histogram
must follow the numerical order of the intervals.
Double-Bar Graphs
A double-bar graph can be used to compare two
sets of data side by side.
Circle Graphs (Pie Charts)
A circle graph, or pie chart, is used to represent
categorical data. It consists of a circular region
partitioned into disjoint sections, with each section
representing a part or percentage of the whole.
A circle graph shows how parts are related to the
whole.
Example
Construct a circle graph for the information in the
table, which is based on information taken from a
U.S. Bureau of the Census Report (2006).
Example (continued)
The entire circle represents the total of 299 million
people.
The measure of the central angle (an angle whose
vertex is at the center of the circle) of each sector
of the graph is proportional to the fraction or
percentage of the population the section
represents.
Example (continued)
For example, the measure of the angle for the
sector for the under-5 group corresponds to the
under-5 population divided by the 299 million total,
or approximately 7% of the circle.
Because the entire circle is 360°, that same fraction
of 360°, or about 24°, should represent the under-5 group.
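The proportional reasoning for the central angles can be sketched in Python. The function name sector_angles and the category counts below are hypothetical, made up for illustration; they are not the census figures from the table.

```python
def sector_angles(counts):
    """Map each category to its central angle in degrees (the angles sum to 360)."""
    total = sum(counts.values())
    return {category: 360 * count / total for category, count in counts.items()}

# Hypothetical counts, invented for illustration only.
sample_counts = {"A": 1, "B": 3, "C": 4}
for category, angle in sector_angles(sample_counts).items():
    print(f"{category}: {angle:.0f} degrees")
```

Note that a share rounded to 7% would give 0.07 × 360° = 25.2°, so the 24° quoted for the under-5 group presumably comes from the unrounded fraction.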
Example (continued)
The table shows the number of degrees for each
age group.
Example (continued)