MATH 1342 BPS 5th ed. Notes on Graphing Stemplots
Send corrections / comments to Mary Parker at mparker@austincc.edu
last updated 08/10/09

Constructing a stemplot by hand, with rounding and splitting stems. Also, a discussion of how to handle outliers (see the last paragraph).

Example from Moore’s Basic Practice of Statistics, 4th edition, page 21.

People with diabetes must monitor and control their blood glucose level. The goal is to maintain “fasting plasma glucose” between about 90 and 130 milligrams per deciliter (mg/dl). The following are the fasting plasma glucose levels for 18 diabetics enrolled in a diabetes control class, five months after the end of the class. (These data are from a 1993 master’s thesis at Purdue University by Debora Arsenau, “Comparison of diet management instruction for patients with non-insulin dependent diabetes mellitus: learning activity package vs. group instruction.”)

141 158 112 153 134 95 96 78 148 172 200 271 103 172 359 145 147 255

Make a stemplot of these data and describe the main features of the distribution. (You will want to round and also split stems.) Are there outliers? How well is the group as a whole achieving the goal for controlling glucose levels?

Solution:

In this problem, the text tells you to round the data and to split stems. Before we actually do that, let’s discuss why. Suppose the text hadn’t said that and you started to make a stemplot of the data as they are. Since the last digit is the leaf and the digits before it are the stem, we’d need stems all the way from 7 to 35 (for the 70’s to the 350’s). That’s 29 stems, which is more classes than we would want for a graph of a dataset unless it had MANY observations. Generally speaking, it’s a good idea to have between 6 and 20 classes for a graph.

So first we round the data to the nearest 10, so that the last digits are all zeros, and then we can use the tens digit as the leaf of the stemplot.
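The rounding step and the stem counts discussed above can be checked with a short sketch. Python is used here only for illustration, and the helper name `round_to_nearest_10` is a hypothetical choice, not something from the text. The sketch assumes that values ending in 5 are rounded up (so 145 becomes 150); Python's built-in `round` rounds halves to the nearest even digit instead, so it is deliberately not used.

```python
# Fasting plasma glucose levels (mg/dl) from the example above.
glucose = [141, 158, 112, 153, 134, 95, 96, 78, 148,
           172, 200, 271, 103, 172, 359, 145, 147, 255]


def round_to_nearest_10(x):
    """Round a non-negative integer to the nearest 10, halves rounded up.

    Assumption: the handout rounds 145 up to 150. The built-in
    round(145, -1) would give 140 (round-half-to-even), so integer
    arithmetic is used instead.
    """
    return (x + 5) // 10 * 10


rounded = [round_to_nearest_10(x) for x in glucose]

# Without rounding, the ones digit is the leaf, so stems run from 7 to 35.
unrounded_stems = max(glucose) // 10 - min(glucose) // 10 + 1

# After rounding, the tens digit is the leaf, so stems run from 0 to 3.
rounded_stems = max(rounded) // 100 - min(rounded) // 100 + 1

print(rounded)
print(unrounded_stems, rounded_stems)  # prints: 29 4
```

The two counts show the trade-off: 29 stems is too many for 18 observations, while 4 is too few, which is why the stems are split next.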
Here are the rounded data:

140 160 110 150 130 100 100 80 150 170 200 270 100 170 360 150 150 260

Now, since the leaves are the tens digits, the stems are the hundreds digits. So there are only four stems:

0
1
2
3

Four stems aren’t enough classes for a good graph. We want between 6 and 20 classes. So if we split each stem in two, we’ll have a better stemplot. The first copy of each stem holds leaves 0 through 4 and the second copy holds leaves 5 through 9. Here are the stems:

0
0
1
1
2
2
3
3

Now, I’ll put the first four rounded observations (140, 160, 110, 150) onto this to indicate how to get started:

0 |
0 | etc.
1 | 4 1 etc.
1 | 6 5 etc.
2 | etc.
2 |
3 |
3 |

Shown are two versions of this stemplot, with the leaves in each row sorted. For the first we have (as the text suggests) rounded to the nearest 10. In the second version, we have trimmed the numbers (dropped the last digit). Trimming is sometimes used as an alternative to rounding when dealing with data electronically.

Rounded to the nearest 10:

0 |
0 | 8
1 | 0 0 0 1 3 4
1 | 5 5 5 5 6 7 7
2 | 0
2 | 6 7
3 |
3 | 6

Trimmed (last digit dropped):

0 |
0 | 7 9 9
1 | 0 1 3 4 4 4 4
1 | 5 5 7 7
2 | 0
2 | 5 7
3 |
3 | 5

Notice that both stemplots show the same basic shape. 359 mg/dl appears to be an outlier. The distribution seems to be slightly right-skewed (even if we ignore the outlier). Overall, glucose levels are not under control: only 4 of the 18 had levels in the desired range of 90 to 130 mg/dl.

Sometimes students think that, if there is an outlier, they should discard it. That is NOT necessarily true. If you can determine that it is clearly an error (in measurement, in recording the data, or in classifying the individuals into this group), then you should say so and delete the outlier from the data. If you can’t determine that it is an error, but believe there is some reason to be concerned about it, it is acceptable to show the results of the analysis both with and without the outlier and discuss both.
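For readers who want to check a hand-built stemplot, the construction above can be sketched in code. This is a minimal illustration, not part of the textbook solution: the function name `split_stemplot` is hypothetical, leaves are sorted within each row (an assumption about layout), and rounding sends values ending in 5 upward. It builds the rounded and trimmed versions, counts the readings in the 90 to 130 mg/dl target range, and redraws the rounded plot without the 359 reading, as one way to show results both with and without the outlier.

```python
# Fasting plasma glucose levels (mg/dl) from the example above.
glucose = [141, 158, 112, 153, 134, 95, 96, 78, 148,
           172, 200, 271, 103, 172, 359, 145, 147, 255]


def split_stemplot(tens):
    """Return the rows of a split-stem stemplot as a list of strings.

    Each input value is a count of tens (14 stands for 140): the stem is
    the hundreds digit and the leaf is the tens digit. Every stem appears
    twice, first for leaves 0-4 and then for leaves 5-9.
    """
    rows = {}
    for value in sorted(tens):
        stem, leaf = divmod(value, 10)
        rows.setdefault((stem, leaf >= 5), []).append(str(leaf))

    lines = []
    for stem in range(min(tens) // 10, max(tens) // 10 + 1):
        for upper_half in (False, True):
            leaves = " ".join(rows.get((stem, upper_half), []))
            lines.append(f"{stem} | {leaves}".rstrip())
    return lines


rounded = [(x + 5) // 10 for x in glucose]  # nearest 10, halves rounded up
trimmed = [x // 10 for x in glucose]        # last digit dropped

for title, tens in (("Rounded", rounded), ("Trimmed", trimmed)):
    print(title)
    print("\n".join(split_stemplot(tens)))
    print()

# Readings inside the target range of 90 to 130 mg/dl.
in_range = [x for x in glucose if 90 <= x <= 130]
print(len(in_range), "of", len(glucose), "within 90-130 mg/dl")
print()

# The same rounded plot with the suspected outlier (359) left out.
no_outlier = [(x + 5) // 10 for x in glucose if x != 359]
print("Rounded, without 359")
print("\n".join(split_stemplot(no_outlier)))
```

The printed rows can be compared against the two versions of the stemplot discussed above. Leaving out 359 removes both stem-3 rows, but the rest of the plot is unchanged, so the slight right skew is still visible.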