Histograms & Stemplots for Quantitative Data

Describing Data using Summary Features of Quantitative Variables
Center: location in the middle of all the data
Unusual features: outliers, gaps, clusters
Spread: measure of variability, such as the range
Shape: distribution pattern (symmetric, skewed, uniform, bimodal, etc.)
CUSS in context!

Dotplot for Univariate Quantitative Data
Center: about -50
Unusual features: gap at -45
Spread: 48 degrees (-69 to -21)
Shape: trimodal, representing 3 seasons

Stemplot for Quantitative Data
Ages of Death of U.S. First Ladies
[Stemplot: stems 3 through 9 in the left column, leaves (a single digit each) to the right]
Key: 3 | 4 indicates 34 years old
Center: 65 years; Spread: 63 years; Shape: skewed left (towards lower numbers)

How to make a Stemplot (Stem and Leaf Plot)
1. Separate each observation into a stem (all but the last digit) and a leaf (the last digit). When there are a lot of digits to consider, rounding to the nearest hundred, thousand, etc. first is sometimes a good idea.
2. Write the stems in a vertical column in order from smallest to largest, and draw a vertical line at the right of the column.
3. Write each leaf in the row to the right of its stem, in increasing order.

Make a stemplot with the following data
Joey's first 14 quiz grades in a marking period were:
86 87 84 76 91 96 75 82 78 90 80 98 74 93

7 | 4568
8 | 02467
9 | 01368
Key: 7 | 4 is a score of 74
Center: about 85; Spread: 24; Shape: uniform

Split Stemplot
Age of 27 students randomly selected from Stat 303 at A&M
Stem is split for every 2 leaves: (0, 1), (2, 3), (4, 5), (6, 7), and (8, 9)

1 | 7
1 | 8, 9, 9, 9, 9, 9
2 | 0, 0, 0, 0, 1, 1, 1, 1, 1, 1
2 | 2, 2, 2, 3, 3
2 | 4, 5
2 |
2 | 8
3 | 0, 1

Split Stemplot
Age of 27 students randomly selected from Stat 303 at A&M
Stem is split for every 5 leaves: (0 thru 4) and (5 thru 9)

1 |
1 | 7, 8, 9, 9, 9, 9, 9
2 | 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 4
2 | 5, 8
3 | 0, 1
3 |

Back-to-back Stemplot
Number of home runs in a season
When comparing data, use comparative language! (higher, more than, etc.)

          Babe Ruth |   | Roger Maris
                    | 0 | 8
                    | 1 | 3, 4, 6
               5, 2 | 2 | 3, 6, 8
               5, 4 | 3 | 3, 9
9, 7, 6, 6, 6, 1, 1 | 4 |
            9, 4, 4 | 5 |
                  0 | 6 | 1

Compare Ruth & Maris: Who's Better?
Center? Babe Ruth is centered higher, at about 47, compared to Maris at about 23
Any unusual features? Maris has a possible outlier at 61
Spread? Maris has a larger spread, 53, compared to Babe's 38
Shape? Babe's is mound shaped and symmetrical, while Maris's is skewed right with the outlier

Histogram
[Histogram of the TX_betweenHoustonDallas data; horizontal axis: age (the variable being counted); vertical axis: Count (frequency count)]
Each bin counts the people in one age class. For example, one bin represents the number of people whose age is at least 20 but less than 25.
Centered at about 35
Skewed right
Spread of 90 years

Uniform Distribution from rolling a fair six-sided die 300 times
[Histogram; horizontal axis: Face of Fair Six-sided Die (1 to 6); vertical axis: Frequency; bar labels: 59, 54, 42, 54, 46, 45]

How to make Histograms
1. Divide the list of data into groups or classes of equal width (0-5, 5-10, 10-15, etc.)
2. Scale the horizontal axis using these classes
3. Count the number of individuals in each class
4. Scale the vertical axis using the counts
5. Draw bars representing the count for each class, so each bar has equal width

Histograms on the calculator
1. Enter the data into a List
2. Choose the histogram option in StatPlot
3. Choose the list you used for Xlist
4. Choose 1 for Freq, or a 2nd list if the data is stored in two lists (values in one, frequencies in another)
5. Zoom 9:ZoomStat will scale it for you, but check the Window to make sure you have reasonable values of min & max for both x (values) and y (frequency count). The Xscl will set the width of the bars.

Ch. 3 Test Results
Centered at about 80
No unusual features
Skewed left
Spread of about 60

Decimal point is 1 digit(s) to the right of the colon.
 4 : 4
 4 : 9
 5 : 13
 5 : 589
 6 : 01111444
 6 : 5666678
 7 : 00012222233344
 7 : 5666678888889999999
 8 : 00112223333334444
 8 : 56666668889999999
 9 : 012222222334444
 9 : 555667789
10 : 0

Make a histogram using Babe Ruth's data from the earlier slide
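
One way to start this exercise is to sketch the "How to make Histograms" steps in Python. The values below are read off the Babe Ruth side of the back-to-back stemplot. The function name histogram_counts, the class width of 10, and the starting point of 20 are illustrative assumptions, not choices made on the slides.

```python
# Babe Ruth's season home-run totals, read off the back-to-back stemplot.
ruth = [22, 25, 34, 35, 41, 41, 46, 46, 46, 47, 49, 54, 54, 59, 60]


def histogram_counts(data, start, width):
    """Count observations in equal-width classes [lower, lower + width).

    Assumes every value in data is at least `start` (hypothetical helper).
    """
    # Step 1: divide the range into classes of equal width.
    counts = {}
    lower = start
    while lower <= max(data):
        counts[lower] = 0
        lower += width
    # Step 3: count the individuals in each class.
    for x in data:
        lower = start + ((x - start) // width) * width
        counts[lower] += 1
    return counts


WIDTH = 10  # assumed class width
counts = histogram_counts(ruth, start=20, width=WIDTH)

# Steps 2, 4, 5: one text bar per class, with length equal to the count.
for lower, n in counts.items():
    print(f"{lower} to <{lower + WIDTH} | {'#' * n} ({n})")
```

With these assumed classes the tallest bar is the 40s class, which agrees with the mound-shaped, roughly symmetric description given for Ruth. A different class width would change the bar heights but should leave the overall shape similar.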
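
The stemplot steps described earlier (split each observation into a stem and a leaf, list the stems in order, then write the leaves in increasing order) can be checked the same way. This is a minimal sketch using Joey's 14 quiz grades. The function name stemplot is hypothetical, and the code assumes non-negative whole numbers with a one-digit leaf.

```python
import statistics
from collections import defaultdict

# Joey's first 14 quiz grades, as listed on the slide.
quiz = [86, 87, 84, 76, 91, 96, 75, 82, 78, 90, 80, 98, 74, 93]


def stemplot(data):
    """Return stemplot rows as strings (assumes non-negative integers)."""
    rows = defaultdict(list)
    # Sorting first means each stem's leaves are appended in increasing order.
    for x in sorted(data):
        stem, leaf = divmod(x, 10)  # stem: all but last digit; leaf: last digit
        rows[stem].append(leaf)
    lines = []
    # Include every stem from smallest to largest, even one with no leaves.
    for stem in range(min(rows), max(rows) + 1):
        leaves = "".join(str(leaf) for leaf in rows.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    return lines


for line in stemplot(quiz):
    print(line)

print("Median:", statistics.median(quiz))
print("Range:", max(quiz) - min(quiz))
```

Listing every stem in the range, including empty ones, keeps gaps visible, which matters when describing unusual features.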