Constructing Stem and Leaf Plots

TMTH 3360
NOTES ON COMMON GRAPHS AND CHARTS
To Describe Data, consider:
 Symmetry
 Skewness
 Unimodal or bimodal or uniform
 Extreme values
 Range of Values and mid-Range
 Most frequently occurring values
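Several of these descriptive measures are simple arithmetic. The following is a minimal Python sketch, assuming the 25 task times listed in the Data section of these notes; the variable names are illustrative only.

```python
from collections import Counter

# The 25 task times (minutes) from the Data section of these notes.
times = [110, 115, 115, 120, 120, 120, 120, 125, 125, 125,
         130, 130, 130, 130, 130, 135, 135, 135, 140, 140,
         140, 140, 145, 145, 150]

data_range = max(times) - min(times)        # spread between the extreme values
mid_range = (max(times) + min(times)) / 2   # halfway between the extreme values

# Most frequently occurring value(s); kept as a list because data can be bimodal.
counts = Counter(times)
top = max(counts.values())
modes = sorted(v for v, c in counts.items() if c == top)

print("range:", data_range)
print("mid-range:", mid_range)
print("mode(s):", modes)
```

A single mode that coincides with the mid-range, as happens here, is consistent with a symmetric, unimodal data set.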
In interpreting graphs, consider:
 Horizontal and vertical scales
 The center point - of particular importance in comparing two histograms
 The starting point of the vertical scale - does it start at 0? How could this affect the
interpretation of the data?
Pareto Diagram
 Pareto diagrams are a special kind of bar chart, usually used for qualitative data
 Vertical axis - frequency
 Horizontal axis - particular type, problem, classification
 Bars - placed left to right in decreasing order of importance
[Figure: Pareto diagram, "Color Preference of Customers". Vertical axis: N (0 to 12). Horizontal axis: Color (Red, Blue, Yellow, Green).]
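The ordering rule for the bars can be sketched in Python. The counts below are hypothetical, chosen only for illustration; they are not the bar heights of the handout's chart.

```python
# Hypothetical counts, for illustration only (not taken from the handout's chart).
counts = {"Yellow": 5, "Red": 11, "Green": 3, "Blue": 8}

# Pareto ordering: sort categories by frequency, largest first,
# so the bars run left to right in decreasing order.
pareto = sorted(counts.items(), key=lambda item: item[1], reverse=True)

for category, freq in pareto:
    # One '#' per observation gives a rough text version of each bar.
    print(f"{category:<7}{'#' * freq} {freq}")
```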
Goodson/ 3360gr
Data
The charts that follow use the following data, which are times in minutes.
Time
110  115  115  120  120
120  120  125  125  125
130  130  130  130  130
135  135  135  140  140
140  140  145  145  150
Dot plots
Dot plots are used for quantitative data. Each observation is represented as a dot and
placed over its number value on a number line.
[Figure: Dotplot of Time to Complete Task. Horizontal axis: Time (110 to 150).]
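A rough text version of a dot plot can be produced with a few lines of Python. This is a sketch only, assuming the 25 task times from the Data section; the column width of 4 characters is an arbitrary choice.

```python
from collections import Counter

# The 25 task times (minutes) from the Data section of these notes.
times = [110, 115, 115, 120, 120, 120, 120, 125, 125, 125,
         130, 130, 130, 130, 130, 135, 135, 135, 140, 140,
         140, 140, 145, 145, 150]

counts = Counter(times)
values = sorted(counts)
tallest = max(counts.values())

# Build the plot from the top row down: a dot is printed in a column
# whenever that value occurs at least `level` times.
rows = []
for level in range(tallest, 0, -1):
    rows.append("".join(("." if counts[v] >= level else " ").center(4)
                        for v in values))

# The last row is the number line itself.
rows.append("".join(str(v).center(4) for v in values))

dotplot = "\n".join(rows)
print(dotplot)
```

Each column holds one dot per observation, so the column heights show the same shape that a histogram of these data would.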
Constructing Frequency Distributions and Histograms
 Determine the number of classes - usually you will have from 5 to 20; it depends on
how many data values you have and the spread of the data.
 Determine the class width - Generally, divide the difference between the largest and
smallest values by the number of classes desired; round up.
 All the classes should be of equal width to make uniform comparisons of the class
frequencies.
 Write the class boundaries. The lowest class end point must be less than or equal to
the smallest data value (note that it does not have to equal the lowest value). The
uppermost class endpoint must be greater than the largest data value.
 Construct a table that includes each class and the corresponding frequencies or
relative frequencies.
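The steps above can be sketched in Python. The function name `frequency_distribution` is a hypothetical helper, not part of any package. The sketch assumes integer data and classes that include their lower endpoint and exclude their upper endpoint. The choice of 6 classes is arbitrary, so the result differs from a midpoint-based table such as the one Minitab produces.

```python
import math

# The 25 task times (minutes) from the Data section of these notes.
times = [110, 115, 115, 120, 120, 120, 120, 125, 125, 125,
         130, 130, 130, 130, 130, 135, 135, 135, 140, 140,
         140, 140, 145, 145, 150]

def frequency_distribution(data, num_classes):
    """Return (lower, upper, count) for equal-width classes [lower, upper)."""
    low, high = min(data), max(data)
    # Class width: range divided by the number of classes, rounded up.
    width = math.ceil((high - low) / num_classes)
    # The uppermost endpoint must be greater than the largest value;
    # widen by one unit if the division came out exact.
    if low + num_classes * width <= high:
        width += 1
    table = []
    for i in range(num_classes):
        lower = low + i * width
        upper = lower + width
        count = sum(1 for x in data if lower <= x < upper)
        table.append((lower, upper, count))
    return table

table = frequency_distribution(times, 6)
for lower, upper, count in table:
    rel = count / len(times)  # relative frequency
    print(f"{lower}-{upper}: {count}  ({rel:.2f})")
```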
Table 1
Frequency Distribution of Time
Time   Count
110    1
115    2
120    4
125    3
130    5
135    3
140    4
145    2
150    1
Note on Table 1
There are 9 classes. The class width is 5. The
frequency of the first class is 1; i.e. there is 1
value within the class which has a midpoint at
110. This distribution was constructed using
Minitab. If you are using Excel, the format is
different.
Examine the histogram for Table 1. It is constructed by marking the class boundaries on
the horizontal axis and drawing bars whose heights correspond to the frequency (or relative
frequency) of each class.
[Figure: Histogram of Time. Vertical axis: Frequency (0 to 5). Horizontal axis: Time (110 to 150).]
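As a rough illustration, the bars of this histogram can be drawn sideways as text. The sketch below copies the class midpoints and counts from Table 1; the list name `table1` is an assumption made for the example.

```python
# Class midpoints and counts copied from Table 1.
table1 = [(110, 1), (115, 2), (120, 4), (125, 3), (130, 5),
          (135, 3), (140, 4), (145, 2), (150, 1)]

# One row per class; the bar length equals the class frequency.
bars = [f"{mid:>3} | {'*' * count}" for mid, count in table1]
print("\n".join(bars))
```

Turned on its side, the text shows the same symmetric shape, with a single peak at 130.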
Constructing Stem and Leaf Plots
Create the stem
 Divide the range of the data into equal units to be used as the stem
 The first few digits in each number will be the stem.
 Your data should result in five to fifteen stems, depending on the value of the data.
 List the stem values in order in a vertical column
 Draw a vertical line to the right of the stem values; the leaves will be placed to the right
of this line.
Attach the leaves.
 Digits to the right of the stem form the leaves.
 Specifically, use the first digit to the right of the stem and drop the remaining digits.
 The leaves are ordered numerically on each branch.
 If the number of leaves in each stem row is too large, divide the stems into two groups,
the first corresponding to leaves beginning with 0 through 4 and the second with 5
through 9
Advantages
 Easy to construct
 Can find the median and quartiles
 Can read the numerical values from the graph
Note: it can be difficult to construct stem and leaf plots if there are many values and/or
many digits.
Stem-and-leaf of Time   N = 25
Leaf Unit = 1
11 | 0
11 | 55
12 | 0000
12 | 555
13 | 00000
13 | 555
14 | 0000
14 | 55
15 | 0
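The construction can be sketched in Python for the 25 task times. The function name `stem_and_leaf` is a hypothetical helper. The sketch assumes non-negative integer data with a leaf unit of 1, splits each stem into leaves 0 through 4 and 5 through 9, and omits any stem row that has no leaves.

```python
from collections import defaultdict

# The 25 task times (minutes) from the Data section of these notes.
times = [110, 115, 115, 120, 120, 120, 120, 125, 125, 125,
         130, 130, 130, 130, 130, 135, 135, 135, 140, 140,
         140, 140, 145, 145, 150]

def stem_and_leaf(data, split=True):
    """Return the rows of a stem-and-leaf display as strings."""
    rows = defaultdict(list)
    for x in sorted(data):           # sorting keeps leaves in numerical order
        stem, leaf = divmod(x, 10)   # leaf unit = 1: the last digit is the leaf
        half = 1 if (split and leaf >= 5) else 0  # 0: leaves 0-4, 1: leaves 5-9
        rows[(stem, half)].append(str(leaf))
    return [f"{stem} | {''.join(leaves)}"
            for (stem, half), leaves in sorted(rows.items())]

display = stem_and_leaf(times)
print("\n".join(display))
```

Because every leaf is kept, the original values can be read back from the display, which is the advantage noted in the list above.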
Constructing a Box Plot [Note: more details are on the box plot handout.]
1. Draw a number line showing the range of values of your data
2. Above the number line, locate the median and the lower and upper quartiles. [The
difference between the upper and lower quartiles is called the interquartile range
(IQR).]
3. The box extends over the number line from the lower to upper quartile, i.e. the sides of
the box are on lines through each of the quartile points.
4. A line is drawn through the median within the box.
5. Draw lines extending to the left and to the right of the box, ending at:
the smallest data point > Q(.25) - 1.5 IQR.
the largest data point < Q(.75) + 1.5 IQR.
6. Plot extreme points, i.e. any points beyond these limits, as individual points.
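The quantities needed for the box plot can be computed with a short Python sketch, assuming the 25 task times from the Data section. The lower limit is taken as the lower quartile minus 1.5 IQR and the upper limit as the upper quartile plus 1.5 IQR. Quartile conventions differ between software packages; the default method of `statistics.quantiles` is only one of them, so other tools may give slightly different quartiles for other data sets.

```python
import statistics

# The 25 task times (minutes) from the Data section of these notes.
times = [110, 115, 115, 120, 120, 120, 120, 125, 125, 125,
         130, 130, 130, 130, 130, 135, 135, 135, 140, 140,
         140, 140, 145, 145, 150]

# Lower quartile, median, upper quartile (default quantile method; an assumption).
q1, median, q3 = statistics.quantiles(times, n=4)
iqr = q3 - q1

lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr

# The whiskers end at the most extreme data points inside the limits.
whisker_low = min(x for x in times if x > lower_limit)
whisker_high = max(x for x in times if x < upper_limit)

# Anything on or beyond the limits is plotted as an individual point.
extremes = [x for x in times if x <= lower_limit or x >= upper_limit]

print("Q1, median, Q3:", q1, median, q3)
print("IQR:", iqr)
print("whiskers:", whisker_low, whisker_high)
print("extreme points:", extremes)
```

For these data the whiskers reach the smallest and largest observations, so no extreme points are plotted.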
Advantages of the Box plot
 The graph provides a summary display.
 There is no clutter.
 It highlights the important features: median, quartiles and extreme values
 Additional data does not complicate the graph.
Interpreting Box plots
 The box encloses the middle 50% of the data.
 If the data is symmetrical, the median will lie halfway between the extreme values.
 If the median is close to the left quartile and far from the right extreme, the data is
skewed right.
 If the median is close to the right quartile and far from the left extreme, the data is
skewed left.
 Two or more Box plots drawn on the same scale and side by side provide an effective
way of comparing samples.
[Figure: Boxplot of Time. Axis: Time (110 to 150).]