(2.3, 2.4) Stem and Leaf Plots

Stem and Leaf Plots: Can be used to show both the rank order and shape of a data set simultaneously.

To make a stem and leaf plot, the stem is all digits except for the last one and the leaves are the last digit, so for the number 112, the “2” would be a leaf and the “11” would be the stem. The stem goes on the left of the vertical line and the leaves are written after the corresponding stem.

Example: The following data represents the number of minutes students spend surfing the internet:

 7 20 24 25 25 28 28 30 32 35
42 43 44 45 46 47 48 48 50 51
72 75 77 78 79 83 87 88 90 91

Use the tens digit as the stem (think of 7 as 07, so all entries have a tens digit).

0|7
1|
2|0 4 5 5 8 8
3|0 2 5
4|2 3 4 5 6 7 8 8
5|0 1
6|
7|2 5 7 8 9
8|3 7 8
9|0 1

A split stemplot has more stems if you wish to stretch the data out. A partial split stemplot (by 5’s) could look like this:

0|7
1|
1|
2|0 4
2|5 5 8 8
3|0 2
3|5
4|2 3 4
4|5 6 7 8 8

If we had numbers like 28.3 we could also trim the data of unnecessary digits like the .3 and keep this number as 28. If we wanted to use the decimal as the leaf, most of the leaves would be “0” (because they don’t have decimal places) and the stems would have to be 07, 08, 09, 10, 11, … 88.

Practicing trimming data

0.2, 3, 7, 14, 22, 47.6, 47.8, 48.2, 60, 89.5, 108

0|0 3 7
1|4
2|2
3|
4|7 7 8
5|
6|0
7|
8|9
9|
10|8

When looked at horizontally (turned 90 degrees counter-clockwise), the stem and leaf plot looks like a histogram. However:

Benefits of Stem and Leaf:
• The stem and leaf plot is easier than a histogram to construct by hand.
• It provides more information than a histogram because the stem and leaf plot shows the actual data.

Histograms
Quantitative variables
Good for big data sets, especially if technology is available.
Uses a box to represent each data point.
[Figure: histogram; horizontal axis bins 20's through 80's, vertical axis 0 to 35]

Stemplots
Quantitative variables
Good for small data sets, convenient for back-of-the-envelope calculations.
Rarely found in scientific or lay publications.
Uses a digit to represent each data point.

0|7
1|
2|0 4 5 5 8 8
3|0 2 5
4|2 3 4 5 6 7 8 8
5|0 1
6|
7|2 5 7 8 9
8|3 7 8
9|0 1

Graphs:
• Bar graphs and pie charts (categorical variables)
• Histograms and stemplots (quantitative variables; good for checking for symmetry and skewness)
• Boxplots (quantitative variables; graphical display of the five-number summary, modified boxplots show outliers)

Describing distributions
• Shape (symmetric/skewed, unimodal/bimodal/multimodal)
• Center (mean or median)
• Spread (usually standard deviation/variance, or IQR from the five-number summary)
• Outliers
• If you have a symmetric distribution with no outliers, use the mean and standard deviation.
• If you have a skewed distribution and/or you have outliers, use the five-number summary instead.
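
The stem and leaf construction described in these notes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the values are non-negative, the leaf is the ones digit, and any decimal part is trimmed (not rounded), as in the trimming practice. The function name stem_and_leaf is made up for this sketch and is not part of the notes.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return the rows of a stemplot as strings like '2|0 4 5 5 8 8'.

    Assumes non-negative values; the stem is every digit except the
    last, and the leaf is the last digit.
    """
    # Trim decimal parts (28.3 -> 28) and put the data in rank order.
    trimmed = sorted(int(v) for v in values)

    # Group the leaves (ones digits) under their stems (remaining digits).
    leaves = defaultdict(list)
    for n in trimmed:
        stem, leaf = divmod(n, 10)
        leaves[stem].append(leaf)

    # List every stem from smallest to largest, including empty ones,
    # so that gaps in the data stay visible in the plot.
    rows = []
    for stem in range(trimmed[0] // 10, trimmed[-1] // 10 + 1):
        row = " ".join(str(d) for d in leaves[stem])
        rows.append(f"{stem}|{row}")
    return rows


# Minutes spent surfing the internet (the example data from the notes).
minutes = [7, 20, 24, 25, 25, 28, 28, 30, 32, 35,
           42, 43, 44, 45, 46, 47, 48, 48, 50, 51,
           72, 75, 77, 78, 79, 83, 87, 88, 90, 91]

for row in stem_and_leaf(minutes):
    print(row)
```

Running the same function on the trimming practice data (0.2, 3, 7, 14, 22, 47.6, 47.8, 48.2, 60, 89.5, 108) gives stems 0 through 10, with the decimal parts dropped before the leaves are assigned.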