Chapter 2: Organizing and Summarizing Data

Section 2.1: Organizing Qualitative Data

Objectives: Students will be able to:
  Organize qualitative data in tables
  Construct bar graphs
  Construct pie charts

Vocabulary:
  Frequency Distribution – lists each category of data and the number of occurrences for each category of data
  Relative Frequency Distribution – lists each category of data and the proportion (or percentage) of observations within that category
  Bar Graph – a graph showing frequency or relative frequency on the y-axis and categories on the x-axis
  Pareto Chart – bar graph whose bars are drawn in decreasing order of frequency or relative frequency
  Side-by-Side Bar Graphs – bar graphs that compare data sets side by side; use relative frequency, since sample or population sizes may be different
  Pie Chart – circle divided into sectors; each sector represents a category of data; the area of each sector is proportional to the frequency of the category; all data must be represented

Key Concepts:
  Charts help to display data more visually with fewer words

[Figure: Different Views. Four displays of the same rehab data by body part: a frequency bar graph, a pie chart (Back 40%, Knee 17%, Shoulder 13%, Hip 7%, Wrist 7%, Hand 7%, Groin 3%, Neck 3%, Elbow 3%), a relative frequency bar graph, and a Pareto chart.]

The plot in the lower right corner is a Pareto chart. It is the same as the relative frequency chart, except that the categories are in relative frequency order (from largest to smallest) from left to right. This graph came from the Total Quality Management (TQM) era of the mid-to-late 1980s.

Example: Construct a frequency distribution and a histogram. Below are the net weights (in ounces) of 30 cans of peas from the Grumpy Blue Midget Company.
  16.2 15.7 16.4 15.4 16.4 15.8 16.0 15.2 15.7 16.6
  15.8 16.2 15.9 15.9 15.6 15.8 16.1 15.9 16.0 15.6
  16.3 16.8 15.9 16.3 16.9 15.6 16.0 16.8 16.0 16.3

Homework: pg 67 - 73: 6-8, 11, 13, 15, 22, 30

Section 2.2: Organizing Quantitative Data: The Popular Displays

Objectives: Students will be able to:
  Organize discrete data in tables
  Construct histograms of discrete data
  Organize continuous data in tables
  Construct histograms of continuous data
  Draw stem-and-leaf plots
  Draw dot plots
  Identify the shape of the distribution

Vocabulary:
  Histogram – bar graph of the frequency or relative frequency of each class
  Classes – categories of data
  Lower class limit – smallest value in the class
  Upper class limit – largest value in the class
  Class width – difference between consecutive lower class limits
  Open Ended – a class in which one of the limits is missing (or infinite)
  Stem-and-Leaf Plot – numerical graph of the data organized by place of the digits
  Split stems – divides the range of leaf digits for a stem in half
  Dot plots – like a histogram, but with dots representing the bars
  Data distribution – the shape of the data, as determined from the histogram

Key Concepts:
  Stem-and-leaf plots maintain the raw data, while histograms do not
  Stem-and-leaf plots are best used when the data sets are small
  The number of classes, k, to be constructed can be roughly approximated by $k = \sqrt{\text{number of observations}}$
  To determine the width of a class use $w = \frac{\text{maximum} - \text{minimum}}{k}$ and always round up to the same decimal units as the original data

[Figure: Frequency Distributions. Four distribution shapes: Uniform, Skewed Right (tail to the right), Normal (Bell-Shaped), Skewed Left (tail to the left).]

With each of the following data sets:
  a) Construct a histogram
  b) Construct a stem-and-leaf plot

Ex. 1 The ages (measured by last birthday) of the employees of Dewey, Cheatum and Howe are listed below.
  22 20 31 37 21 32 49 36 26 35 42 33
  42 45 30 47 28 49 31 38 39 28 39 48

Ex. 2 Below are times obtained from a mail-order company's shipping records concerning time from receipt of order to delivery (in days) for items from their catalogue.
  3 5 27 7 7 31 10 12 13 5 10 21
  14 22 6 12 23 8 6 14 3 2 8 10
  9 5 19 22 4 12 25 7 11 11 13 8

Homework: pg 87 - 96: 3, 6, 8, 9, 12, 14, 19, 28, 43

Section 2.3: Additional Displays of Quantitative Data

Objectives: Students will be able to:
  Construct frequency polygons
  Create cumulative frequency and relative frequency tables
  Construct frequency and relative frequency ogives
  Draw time-series graphs

Vocabulary:
  Class midpoint – found by adding consecutive lower class limits and dividing by 2
  Frequency polygon – plot a point above each class midpoint on a horizontal axis at a height equal to the frequency of the class; after the points are plotted, straight lines (line-plot) are drawn between consecutive points
  Cumulative Frequency Distribution – displays the aggregate frequency of the category
  Cumulative Relative Frequency Distribution – displays the proportion (or percentage) of observations less than or equal to the category for discrete data, and less than or equal to the upper class limit for continuous data
  Ogive – a line-plot graph that represents the cumulative frequency or cumulative relative frequency for the class
  Time-Series Plot – plots (line-plots) time along the horizontal axis and the corresponding variable value along the vertical axis

Key Concepts:
  Very similar to histograms
  Comparative stem-and-leaf plots (back to back)

1. Here are the numbers of home runs that Babe Ruth hit in his 15 years with the New York Yankees, 1920 to 1934.
  54 49 59 46 35 41 41 34 46 22 25 47 60 54 46
Babe Ruth's home run record for a single year was broken by another Yankee, Roger Maris, who hit 61 home runs in 1961. Here are Maris's home run totals for his 10 years in the American League.
  13 23 26 16 33 61 28 39 14 8
Compare these two distributions by constructing a back-to-back stemplot. Comment on the similarities and/or differences.

Homework: pg 101 - 104: 8, 11, 19

Section 2.4: Graphical Misrepresentations of Data

Objectives: Students will be able to:
  Describe what can make a graph misleading or deceptive

Vocabulary: None new

Key Concepts:
Common problems:
  1. Vertical axis
     a. Inconsistent vertical scaling
     b. Incorrect vertical scaling
     c. Vertical axis that doesn't start at zero
  2. Scaling in pictures different from that reflected by the data
  3. Class widths
     a. Values overlap
     b. Class widths different

Characteristics of Good Graphics:
  1. Label the graphic clearly and provide explanations if needed
  2. Avoid distortion. Don't lie about the data
  3. Avoid three dimensions. They look nice, but often distract the reader and result in misinterpretation
  4. Don't use more than one design in the same graphic
     a. Let the numbers speak for themselves
     b. Sometimes graphs use a different design in a portion of the graphic to call attention to it

[Figure: Comparing Two Graphs. The same example data (Value versus Category, categories 1 to 8) plotted twice, under the panel titles "Example Data" and "Propaganda Chart": once with a vertical axis running from 2.12 to 2.23, and once with a vertical axis running from 0 to 2.3.]

Homework: pg 107-109: 1, 2, 7, 8, 13, 14

Chapter 2: Review

Objectives: Students will be able to:
  Summarize the chapter
  Define the vocabulary used
  Complete all objectives
  Successfully answer any of the review exercises
  Use the technology to display graphs and plots of data

Vocabulary: None new

Homework: pg 111-117: 3, 5, 9, 11, 15, 19-21
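
Technology sketch: As one possible way to use technology for the class-count and class-width rules from Section 2.2, here is a minimal Python sketch applied to the pea-can weights from the Section 2.1 example. The function name frequency_table and its decimals parameter are made up for illustration and are not from the textbook. Two choices here are assumptions rather than rules stated in these notes: k is rounded up to a whole number, and the class width is widened by one unit if the maximum would otherwise fall outside the last class.

```python
import math

# Net weights (ounces) of the 30 cans of peas from the Section 2.1 example.
weights = [
    16.2, 15.7, 16.4, 15.4, 16.4, 15.8, 16.0, 15.2, 15.7, 16.6,
    15.8, 16.2, 15.9, 15.9, 15.6, 15.8, 16.1, 15.9, 16.0, 15.6,
    16.3, 16.8, 15.9, 16.3, 16.9, 15.6, 16.0, 16.8, 16.0, 16.3,
]


def frequency_table(data, decimals=1):
    """Return rows of (lower limit, upper limit, frequency, relative frequency).

    Hypothetical helper for illustration; `decimals` is the number of
    decimal places in the original data.
    """
    scale = 10 ** decimals
    # Work in whole units (tenths of an ounce here) to avoid float round-off.
    units = [round(x * scale) for x in data]
    n = len(units)

    k = math.ceil(math.sqrt(n))          # k = sqrt(n); rounding up is an assumption
    low, high = min(units), max(units)
    width = math.ceil((high - low) / k)  # w = (max - min) / k, rounded up
    if low + k * width <= high:          # assumption: widen so the maximum fits
        width += 1

    counts = [0] * k
    for u in units:
        counts[(u - low) // width] += 1

    rows = []
    for i, freq in enumerate(counts):
        lower = low + i * width          # lower class limit
        upper = lower + width - 1        # upper class limit
        rows.append((lower / scale, upper / scale, freq, freq / n))
    return rows


for lower, upper, freq, rel in frequency_table(weights):
    print(f"{lower:.1f} - {upper:.1f}  frequency {freq:2d}  relative {rel:.3f}")
```

For the pea data this gives six classes of width 0.3 ounces starting at 15.2. The frequencies can be drawn as histogram bars, and running totals of the frequency column give the cumulative frequency distribution described in Section 2.3.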