AP STATISTICS CHAPTER 4 REVIEW
created by Daniel Ho and Sachin Mehta
Betters 2nd period

DISPLAYING QUANTITATIVE DATA
Large groups of data are difficult to comprehend without summarization.
Visual aids such as pie charts and bar charts provide a solution, but they are for categorical data.
Thus, this chapter explores graphs that can effectively display quantitative data.

HISTOGRAM
data should be divided into equal-width groups, or bins
the bins and the count of data in each bin present the distribution
a histogram plots the count in each bin as the height of a bar
a relative frequency histogram displays the percentage of the data in each bin instead of the count
a percentile is the value at or below which a given percent of the data falls
quartiles mark every 25% of the data (the 25th, 50th, and 75th percentiles)
the interquartile range is the range of the middle half of the data (the third quartile minus the first quartile)
bars on a histogram touch
discrete: bars are centered over distinct values
continuous: bars cover an interval of values

STEM AND LEAF DISPLAYS
stem and leaf displays, like histograms, show the distribution, but they also present the individual data values
each value is cut into leading digits (stems) and a following digit (leaf)
the bins are labeled by the stems
each leaf can only be one digit

DOT PLOTS
a dot plot is a display in which each dot represents a case along an axis (the axis can be either vertical or horizontal)
shows both the distribution of the data and the individual cases in each bin

TIME PLOTS
although not a dot plot, a time plot shows data and trends over time (see above)

SHAPE, CENTER, AND SPREAD
the distribution of data is described by CUSS: center, unusual points, shape, and spread
humps in the graph are called modes, and the number of humps is described with prefixes such as unimodal, bimodal, etc.
a graph with no humps is called uniform
symmetry is important in describing data; a graph is symmetric if it can be mirrored across a vertical line
if the data taper off to one side, the distribution is skewed in the direction of the taper (the narrow side is called the tail)
unusual points to look out for are outliers, which stand far away from the rest of the distribution, and gaps

PROBLEM ELEVEN
11. Gasoline. In June 2004, 16 gas stations in Ithaca, NY, posted these prices for a gallon of regular gasoline.
2.029  2.119  2.259  2.049
2.079  2.089  2.079  2.039
2.069  2.269  2.099  2.129
2.169  2.189  2.039  2.079
a) Make a stem-and-leaf display of these gas prices. Use split stems; for example, use two 2.1 stems: one for prices between $2.10 and $2.149, the other for prices from $2.15 to $2.199.
b) Describe the shape, center, and spread of this distribution.
c) What unusual feature do you see?

PROBLEM ELEVEN ANSWER
a) Make a stem-and-leaf display of these gas prices. Use split stems; for example, use two 2.1 stems: one for prices between $2.10 and $2.149, the other for prices from $2.15 to $2.199.
Stem   Leaves
2.2    6 5
2.2
2.1    6 8
2.1    1 2
2.0    7 6 8 7 9 7
2.0    2 3 4 3
Key: 2.1|7 = $2.179
b) Describe the shape, center, and spread of this distribution.
The distribution of gas prices is skewed to the right, centered around $2.10 per gallon, with most stations charging between $2.05 and $2.13. The lowest and highest prices were $2.03 and $2.27.
c) What unusual feature do you see?
There is a gap; no stations charge between $2.19 and $2.25.

PROBLEM FIFTEEN
15. Home runs, again. Students were asked to make a histogram of the number of home runs hit by Mark McGwire from 1986 to 2001 (see Exercise 13). One student submitted the following display:
a) Comment on this graph.
b) Create your own histogram of the data.

PROBLEM FIFTEEN ANSWER
a) Comment on this graph.
This is not a histogram. The horizontal axis should split the number of home runs hit in each year into bins.
The vertical axis should show the number of years in each bin.
b) Create your own histogram of the data.
[Histogram: vertical axis Frequency (0 to 6); horizontal axis Home Runs (0 to 60 in steps of 10, plus "More")]
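
The split-stem display in the Problem Eleven answer can be double-checked with a short script. The following is a minimal Python sketch and is not part of the original review. The function names split_stem_rows and format_display are hypothetical, and the sketch assumes each leaf is the hundredths digit of the price with the final digit dropped, which matches the key 2.1|7 = $2.179. Leaves are sorted within each row, so they appear in a different order than in the hand-made display.

```python
from collections import defaultdict

# Gas prices from Problem Eleven (dollars per gallon).
prices = [2.029, 2.119, 2.259, 2.049, 2.079, 2.089, 2.079, 2.039,
          2.069, 2.269, 2.099, 2.129, 2.169, 2.189, 2.039, 2.079]


def split_stem_rows(values):
    """Group leaves by (stem, half): half 0 holds leaves 0-4, half 1 holds 5-9.

    Assumption: the stem is the price in tenths of a dollar and the leaf is
    the hundredths digit; the thousandths digit is dropped.
    """
    rows = defaultdict(list)
    for value in values:
        thousandths = round(value * 1000)   # 2.119 -> 2119 (avoids float error)
        hundredths = thousandths // 10      # drop the last digit: 2119 -> 211
        stem, leaf = divmod(hundredths, 10) # 211 -> stem 21, leaf 1
        rows[(stem, leaf // 5)].append(leaf)
    return rows


def format_display(rows):
    """Return the display as text lines, highest stem first, empty rows kept."""
    stems = [stem for stem, _ in rows]
    lines = []
    for stem in range(max(stems), min(stems) - 1, -1):
        for half in (1, 0):  # upper half (5-9) is printed above lower half (0-4)
            leaves = " ".join(str(d) for d in sorted(rows.get((stem, half), [])))
            lines.append(f"{stem / 10:.1f} | {leaves}".rstrip())
    return lines


if __name__ == "__main__":
    for line in format_display(split_stem_rows(prices)):
        print(line)
```

Keeping the empty lower 2.2 row in the output is deliberate: the gap described in part c) is only visible when stems with no leaves are still printed.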