Lesson 2.3

A box plot gives you an idea of the overall distribution of a data set, but in some cases you might want to see other information and details that a box plot doesn’t show. A histogram is a graphical representation of a data set, with columns to show how the data are distributed across different intervals of values. Histograms give vivid pictures of distribution features, such as clusters of values or gaps in the data.

The columns of a histogram are called bins and should not be confused with the bars of a bar graph. The bars of a bar graph indicate categories: how many data items either have the same value or share a characteristic. The bins of a histogram indicate how many numerical data values fall within each interval.

Shatevia took a random sample of 50 students who own MP3 players at her high school and asked how many songs they have stored. The two graphs were constructed from the data in the table.

a. What is the range of the data?
The range is 1013 - 765 = 248 songs.

b. What is the bin width of each graph?
Graph A has a bin width of 50 songs. Graph B has a bin width of 10 songs.

c. How can you know if the graph accounts for all 50 values?
The sum of all the bin frequencies is 50.

d. Why are the columns shorter in Graph B?
With smaller bin widths, fewer data values fall in each bin, so the columns are usually shorter.

e. Which graph is better at showing the overall shape of the distribution? What is that shape?
Graph A shows that the distribution is skewed left. This fact is harder to see with all the ups and downs in Graph B.

f. Which graph is better at showing the gaps and clusters in the data?
With more bins you can see gaps and clusters in the data. A dot plot is like a histogram with a very small bin width. Graph B is the better graph for seeing gaps and clusters.

g. What percentage of the players have fewer than 850 songs stored?
Add the bin frequencies for the bins below (to the left of) 850 songs. There are 10 data values, so 10 out of 50, or 20% of the sample, had fewer than 850 songs.

The percentile rank of a value is the percentage of data values that are below the given value. In the example, 850 songs has a percentile rank of 20 because this value is greater than 20% of the values in the sample.

The data used in this histogram have a mean of 34.05 and a standard deviation of 14.68. Add the bin frequencies to find that there are 40 data values in all.
Approximate the percentile rank of a value two standard deviations above the mean.
The value two standard deviations above the mean is 34.05 + 2 × 14.68, or 63.41. All of the data values in the ten bins up to the value of 60 are less than 63.41. Adding the bin frequencies up to 60 gives 37, so 37/40, or 92.5%, of the data lie below 63.41. So 63.41 is at approximately the 93rd percentile.

Approximately what percentage of the data values are within one standard deviation of the mean?
One standard deviation above the mean is 48.73, and one standard deviation below the mean is 19.37. This interval includes at least those values in the bins from 20 to 45. So approximately 25/40, or 62.5%, of the data lie within one standard deviation of the mean.

Teenagers require anywhere from 1800 to 3200 calories per day, depending on their growth rate and level of activity. The food you consume as part of your diet should include sufficient fiber, moderate levels of carbohydrates and fat, and as little sodium, saturated fat, and cholesterol as possible. The table shows the recommended amounts of carbohydrates and fiber and the maximum amounts of other nutrients in a healthy 2500-calorie diet.

So, how does fast food fit into a healthy diet? Examine the information about the nutritional content of fast-food sandwiches. With your group, study one of the nutritional components (total calories, total fat, saturated fat, cholesterol, sodium, or total carbohydrate). Use box plots, histograms, and the measures of central tendency and spread to compare the amount of that component in the sandwiches. You may want to divide your data so that you can make comparisons between different types of sandwiches or between restaurants. As you do your statistical analysis, discuss how these fast-food items would affect a healthy diet.
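
If your group wants to check its work with software, the calculations from this lesson (bin frequencies, percentile rank, the five-number summary behind a box plot, and the share of data within one standard deviation of the mean) can be sketched in a few lines of Python. This is only a rough sketch under stated assumptions: the calorie values are hypothetical placeholders, not numbers from the lesson's table; the function names are made up for illustration; quartiles are taken as the medians of the lower and upper halves of the data; and the standard deviation is the sample standard deviation. Your class may use different conventions.

```python
import statistics

# Hypothetical calorie counts for eight sandwiches (placeholder values,
# NOT taken from the lesson's table). Replace them with the real data.
calories = [300, 450, 520, 520, 610, 700, 750, 980]


def histogram_counts(values, bin_width):
    """Return {left_edge: frequency}; each bin covers [left, left + bin_width)."""
    # Align the first bin to a multiple of the bin width.
    start = (min(values) // bin_width) * bin_width
    counts = {}
    for v in values:
        left = start + ((v - start) // bin_width) * bin_width
        counts[left] = counts.get(left, 0) + 1
    return dict(sorted(counts.items()))


def percentile_rank(values, x):
    """Percentage of the data values that are below x."""
    below = sum(1 for v in values if v < x)
    return 100 * below / len(values)


def five_number_summary(values):
    """Minimum, Q1, median, Q3, maximum.

    Assumed convention: Q1 and Q3 are the medians of the lower and upper
    halves of the sorted data (the middle value is left out when the
    number of values is odd).
    """
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    upper = data[half + 1:] if len(data) % 2 else data[half:]
    return (data[0], statistics.median(lower), statistics.median(data),
            statistics.median(upper), data[-1])


def fraction_within_one_sd(values):
    """Fraction of the values within one standard deviation of the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (an assumption)
    inside = sum(1 for v in values if mean - sd <= v <= mean + sd)
    return inside / len(values)


# Compare a wide and a narrow bin width, as with Graph A and Graph B.
for width in (200, 100):
    counts = histogram_counts(calories, width)
    # The bin frequencies should add up to the number of data values.
    print(width, counts, "total:", sum(counts.values()))

print("five-number summary:", five_number_summary(calories))
print("percentile rank of 600:", percentile_rank(calories, 600))
print("fraction within one SD:", fraction_within_one_sd(calories))
```

With these placeholder values, the 100-wide bins leave the bin starting at 800 empty, a gap that the 200-wide bins hide, and the tallest column is shorter with the narrower bins. To compare restaurants or sandwich types, you could run the same functions on each subset of the data.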