Stat 101L: Lecture 4 Quantitative Data For a Statistics’ project, students weighed the contents of cans of cola. In 2000, 24 cans of cola were weighed (full and empty). The difference (full – empty) is the weight of the contents. The units are grams. 1 Quantitative Data Who? Cans of cola. What? Weight (g) of contents. 368, 351, 355, 367, 352, 369, 370, 369 370, 355, 354, 357, 366, 353, 373, 365 355, 356, 362, 354, 353, 378, 368, 349 2 Weight of Contents What can we say about the weight of contents of a can of cola? – Variation! – Smallest value? – Largest value? – Middle value? 3 1 Stat 101L: Lecture 4 Display of Data Stem-and-Leaf Display or Stem Plot – Orders the data and creates a display of the distribution of values. 4 Display of Data Histogram – A picture of the distribution of the data. – Collects values into bins. – Bins should be of equal width. – Different bin choices can yield different pictures. 5 Frequency Histogram Measurement 6 2 Stat 101L: Lecture 4 Constructing a Histogram Order data from smallest to largest using a stem and leaf display. Determine bins. – equal width – more data more bins 7 Weight of Contents W eight of Contents of Cans of Cola Frequency 15 10 5 0 3 30 340 350 360 37 0 38 0 390 W eight (gram s ) 8 Shape Symmetry – Mounded, flat Skew – Right, left Other – Multiple peaks, outliers 9 3 Stat 101L: Lecture 4 Symmetric, mounded in middle Histogram of Octane Rating 10 9 8 Frequency 7 6 5 4 3 2 1 0 86 87 88 89 90 91 92 93 94 95 96 Octane 10 Skew - Right pH of Pork Loins 80 70 Frequency 60 50 40 30 20 10 0 5.0 5.5 6.0 6.5 7.0 pH 11 Skew - Left Flexibility Index of Young Adult Men 20 Frequency 15 10 5 0 1 2 3 4 5 6 7 8 9 10 Flexibility Index 12 4 Stat 101L: Lecture 4 Multiple Peaks Size of Diamonds (carats) 15 Frequency 10 5 0 0.1 0.2 0.3 0.4 Size (carats) 13 Center A typical value. Summary of the whole batch of numbers. For symmetric distributions – easy. 14 Histogram of Octane Histogram of Octane Rating 10 9 8 Frequency 7 6 5 4 3 2 1 0 86 87 88 89 90 91 92 Octane 93 94 95 96 Center 15 5 Stat 101L: Lecture 4 Spread Variation matters. – Tightly clustered? – Spread out? – Low and high values? 16 Numerical Summaries Weights of contents of cans of cola. 34 9 35 12334455567 36 25678899 37 0038 17 Numerical Summaries What is a “typical” value? Look for the center of the distribution. What do we mean by “center”? 18 6 Stat 101L: Lecture 4 Measures of Center Sample Midrange –Average of the minimum and the maximum. (349+378)/2=363.5 grams –Greatly affected outliers. 19 Measures of Center Sample Median – A value that divides the data into a lower half and an upper half. – About half the data values are greater than the median about half are less than the median. 20 Sample Median 34 9 35 12334455567 36 25678899 37 0038 Median = (357+362)/2 = 359.5 grams 21 7