Chapter 2
Organizing Data
Understanding Basic Statistics, Fifth Edition
By Brase and Brase
Prepared by Jon Booze

Frequency Tables
• A frequency table
  – organizes quantitative data.
  – partitions data into classes (intervals).
  – shows how many data values are in each class.

Test Score    Number of Students
61-70         4
71-80         8
81-90         15
91-100        7

Copyright © Cengage Learning. All rights reserved.

Data Classes and Class Frequency
• Class: an interval of values.
  – Example: 61 ≤ x ≤ 70
• Frequency: the number of data values that fall within a class.
  – “Five data fall within the class 61 ≤ x ≤ 70”.
• Relative Frequency: the proportion of data values that fall within a class.
  – “18% of the data fall within the class 61 ≤ x ≤ 70”.

Structure of a Data Class
A “data class” is basically an interval on a number line. It has:
• A lower limit a and an upper limit b.
• A width.
• A lower boundary and an upper boundary (integer data).
• A midpoint.

Structure of a Data Class
If a = 60 and b = 69 for integer data, what is the value of the lower boundary?
a). 60   b). 59.5   c). 9   d). 64.5

Constructing Data Classes
• Find the class width.
  – Compute (largest data value – smallest data value) / (desired number of classes).
  – Increase the computed value to the next higher whole number.
• Find the class limits.
  – The lower limit of the “leftmost” class is set equal to the smallest value in the data set.

Constructing Data Classes, cont’d
• Find the class boundaries (integer data).
  – Subtract 0.5 from the lower class limit and add 0.5 to the upper class limit.

For a certain data set, the minimum value is 25 and the maximum value is 58. If you wish to partition the data into 5 classes, what would be the class width?
a). 5   b). 6   c). 7   d). 8

Building a Frequency Table
• Find the class width, class limits, and class boundaries of the data.
• Use tally marks to count the data in each class.
• Record the frequencies (and relative frequencies if desired) on the table.

Histograms
• Histogram – graphical summary of a frequency table.
• Uses bars to plot the data classes versus the class frequencies.

Making a Histogram
• Make a frequency table.
• Place class boundaries on the horizontal axis and frequencies on the vertical axis.
• For each class, draw a bar with height equal to the class frequency and width equal to the class width, so that the bar runs from the lower class boundary to the upper class boundary.

[Figure: Making a Histogram]

Distribution Shapes
• Symmetric
• Uniform
• Skewed Left
• Skewed Right
• Bimodal

Critical Thinking
• A bimodal distribution shape might indicate that the data are from two different populations.
• Outliers – data values that are very different from other values in the data set.
• Outliers may indicate data recording errors.
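The class-construction and frequency-table steps described earlier can be sketched in Python. This is a minimal sketch, assuming integer data; the function names `class_width` and `frequency_table` and the sample values are hypothetical and do not come from the slides. It also assumes the computed width is raised to the next higher whole number even when the quotient is already whole, so that the largest value always falls inside the last class.

```python
def class_width(smallest, largest, num_classes):
    """(largest - smallest) / num_classes, raised to the next higher whole number."""
    # Integer division plus 1 always moves up, even if the quotient is whole (assumption).
    return (largest - smallest) // num_classes + 1


def frequency_table(data, num_classes):
    """Build classes for integer data and count the values in each class."""
    smallest, largest = min(data), max(data)
    width = class_width(smallest, largest, num_classes)
    n = len(data)
    table = []
    for i in range(num_classes):
        lower = smallest + i * width      # lower class limit
        upper = lower + width - 1         # upper class limit (integer data)
        freq = sum(1 for x in data if lower <= x <= upper)
        table.append({
            "limits": (lower, upper),
            "boundaries": (lower - 0.5, upper + 0.5),
            "midpoint": (lower + upper) / 2,
            "frequency": freq,
            "relative_frequency": freq / n,
        })
    return width, table


# Hypothetical sample with minimum 25 and maximum 58, split into 5 classes.
scores = [25, 30, 32, 38, 41, 45, 46, 50, 53, 58]
width, table = frequency_table(scores, 5)
print("class width:", width)
for row in table:
    lo, hi = row["limits"]
    print(f"{lo}-{hi}  boundaries={row['boundaries']}  "
          f"f={row['frequency']}  rel={row['relative_frequency']:.2f}")
```

In this sketch the two boundaries of each class differ by exactly the class width, which is why histogram bars drawn from boundary to boundary touch without gaps.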
Exploratory Data Analysis
• EDA is the process of learning about a data set by creating graphs.
• EDA specifically looks for patterns and trends in the data.
• EDA also identifies extreme values.

Graphical Displays…
• … represent the data.
• … induce the viewer to think about the substance of the graphic.
• … should avoid distorting the message of the data.

Bar Graphs
• Used for qualitative or quantitative data.
• Can be vertical or horizontal.
• Bars are uniformly spaced and have equal widths.
• Length/height of each bar indicates the count or percentage for the variable.
• “Good practice” requires including titles and units and labeling axes.

[Figure: Bar Graphs, example]

Pareto Charts
• A bar chart with two specific features:
  – Heights of bars represent frequencies.
  – Bars are vertical and are ordered from tallest to shortest.

Circle Graphs/Pie Charts
• Used for qualitative data.
• Wedges of the circle represent proportions of the data that share a common characteristic.
• “Good practice” requires including a title and either wedge labels or a legend.

Time-Series Graphs
• Show data measurements in chronological order.
• Data are plotted in order of occurrence at regular intervals over a period of time.

Critical Thinking – which type of graph to use?
• Bar graphs are useful for quantitative or qualitative data.
• Pareto charts show categories in order of decreasing frequency.
• Circle graphs display how a total is dispersed into several categories.
• Time-series graphs display how data change over time.
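The Pareto ordering and the circle-graph proportions described above can be illustrated with a short Python sketch. The function names `pareto_order` and `wedge_angles` and the category counts are hypothetical, chosen only for illustration; the text bars stand in for a real plot.

```python
def pareto_order(counts):
    """Return (category, frequency) pairs sorted from highest to lowest frequency."""
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


def wedge_angles(counts):
    """Return each category's share of a 360-degree circle."""
    total = sum(counts.values())
    return {category: 360 * freq / total for category, freq in counts.items()}


# Hypothetical category counts (not taken from the slides).
counts = {"B": 25, "D": 15, "A": 40, "C": 20}

# Pareto chart as text: tallest bar first.
for category, freq in pareto_order(counts):
    print(f"{category:<3}{'#' * freq} {freq}")

# Circle graph: the wedge angle is proportional to the category's share of the total.
for category, angle in wedge_angles(counts).items():
    print(f"{category}: {angle:.1f} degrees")
```

The same counts feed both displays: the Pareto chart emphasizes rank order, while the circle graph emphasizes each category's share of the whole.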
Critical Thinking – which type of graph to use?
What type of graph would be best for showing the ice cream flavor preferences of a group of 100 children?
a). Histogram   b). Pareto graph   c). Time series graph   d). Circle graph

Stem and Leaf Plots
• Displays the distribution of the data while maintaining the actual data values.
• Each data value is split into a stem and a leaf.

[Figure: Stem and Leaf Plot Construction]

Critical Thinking
• By looking at the stem-and-leaf display “sideways”, we can see the distribution shape of the data.
• Large gaps between stems containing leaves, especially at the top or bottom, suggest the existence of outliers.
• Watch the outliers – are they data errors or simply unusual data values?
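Splitting each value into a stem and a leaf, as described above, can be sketched in Python. This is a minimal sketch under stated assumptions: the data are non-empty, non-negative integers, the leaf is the last digit, and the stem is everything before it. The function name `stem_and_leaf` and the sample values are hypothetical.

```python
from collections import defaultdict


def stem_and_leaf(data):
    """Map each stem to its sorted list of leaves, keeping empty stems."""
    plot = defaultdict(list)
    for x in sorted(data):
        stem, leaf = divmod(x, 10)   # e.g. 34 -> stem 3, leaf 4
        plot[stem].append(leaf)
    # Keep stems with no leaves so that gaps in the data stay visible.
    return {s: plot.get(s, []) for s in range(min(plot), max(plot) + 1)}


# Hypothetical sample; 71 sits far from the rest of the values.
values = [12, 15, 21, 23, 23, 28, 30, 34, 71]
for stem, leaves in stem_and_leaf(values).items():
    print(f"{stem:>2} | {' '.join(str(leaf) for leaf in leaves)}")
```

In the printed display, the empty stems 4 through 6 form the kind of gap that suggests 71 may be an outlier worth checking.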