Describing Data: Frequency Distributions and Graphic Presentations

Copyright © 2004 2003 by The McGraw-Hill Companies, Inc. All rights reserved.

When you have completed this chapter, you will be able to:
1. Organize raw data into a frequency distribution.
2. Produce a histogram, a frequency polygon, and a cumulative frequency polygon from quantitative data.
3. Develop and interpret a stem-and-leaf display.
4. Present qualitative data using graphical techniques such as a clustered bar chart, a stacked bar chart, and a pie chart.
5. Detect graphic deceptions and use a graph to present data with clarity, precision, and efficiency.

Frequency Distribution
A frequency distribution is a grouping of data into non-overlapping (mutually exclusive) classes, showing the number of observations in each category or class. The range of categories includes all values in the data set (collectively exhaustive classes).

Class midpoint or class mark: a point that divides a class into two equal parts, i.e. the average of the upper and lower class limits.

Class frequency: the number of observations in each class.

Class interval: obtained by subtracting the lower limit of a class from the lower limit of the next class.

[Figure: example scale marked at 12.5, 17.5, 22.5, 27.5 and 32.5, with an interval of 5]

Example
Dr. Tillman is Dean of the School of Business. He wishes to prepare a report showing the number of hours per week students spend studying. He selects a random sample of 30 students and determines the number of hours each student studied last week:
15.0, 23.7, 19.7, 15.4, 18.3, 23.0, 14.2, 20.8, 13.5, 20.7,
17.4, 18.6, 12.9, 20.3, 13.7, 21.4, 18.3, 29.8, 17.1, 18.9,
10.3, 26.1, 15.7, 14.0, 17.8, 33.8, 23.2, 12.9, 27.1, 16.6.

Organize the data into a frequency distribution.

Frequency Distributions by Hand
There are five steps that can be used to construct a frequency distribution:
1. Decide how many classes you wish to use.
2. Determine the class width.
3. Set up the individual class limits.
4. Tally the items into the classes.
5. Count the number of items in each class.

Decide how many classes you wish to use
Rule of thumb: for most data sets, you would want between 3 and 12 classes. Use the "2 to the k" rule: choose k so that 2 raised to the power of k is greater than the number of data points, n. In this case n = 30 students and 2^5 = 32 > 30 (while 2^4 = 16 is too small), so use about k = 5 classes.

Determine the class width
Generally, the class width should be the same size for all classes:

    class width ≥ (Max − Min) / k

With k = 5, (33.8 − 10.3) / 5 = 4.7. Therefore, use a class size of 5 hours.

Set up the individual class limits
The minimum value is 10.3 and the class width is 5 hours; therefore, classes should start at 10 hours. The lower class limits will be 10, 15, 20, etc.
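The first three steps can be sketched in a few lines of Python. This is only an illustration of the arithmetic on the slides: the function name number_of_classes is hypothetical, and rounding the width up to a whole number and starting at a multiple of the width are assumptions that happen to reproduce the slide's choices (5 hours, starting at 10).

```python
import math

# Hours studied last week by the 30 sampled students (from the example).
hours = [15.0, 23.7, 19.7, 15.4, 18.3, 23.0, 14.2, 20.8, 13.5, 20.7,
         17.4, 18.6, 12.9, 20.3, 13.7, 21.4, 18.3, 29.8, 17.1, 18.9,
         10.3, 26.1, 15.7, 14.0, 17.8, 33.8, 23.2, 12.9, 27.1, 16.6]


def number_of_classes(n):
    """Return the smallest k with 2**k > n (the '2 to the k' rule)."""
    k = 1
    while 2 ** k <= n:
        k += 1
    return k


n = len(hours)
k = number_of_classes(n)                     # 2**5 = 32 > 30, so k = 5
raw_width = (max(hours) - min(hours)) / k    # (33.8 - 10.3) / 5, about 4.7

# Assumption: round the width up to a convenient whole number of hours.
width = math.ceil(raw_width)

# Assumption: start the first class at a multiple of the width at or
# below the minimum value.
first_limit = math.floor(min(hours) / width) * width
lower_limits = [first_limit + i * width for i in range(k)]

print(k, round(raw_width, 1), width, lower_limits)
```

Other rounding choices are equally defensible; the point is only that the class width must be at least (Max − Min) / k so that k classes cover every observation.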
The classes can be written as:

    10.0 – 14.9          10.0 to under 15
    15.0 – 19.9          15.0 to under 20
    20.0 – 24.9    or    20.0 to under 25
    25.0 – 29.9          25.0 to under 30
    30.0 – 34.9          30.0 to under 35

Tally the items into the classes
Find the class for each value and tally it. For example, 14.2, 12.9, 13.5, 14.0 and 13.7 all fall in the class 10.0 to under 15 …and so on with the remaining hours.

Count the number of items in each class

    Hours Studying (x)     Frequency (f)
    10.0 to under 15             7
    15.0 to under 20            12
    20.0 to under 25             7
    25.0 to under 30             3
    30.0 to under 35             1
    Total                       30

Using different limits
…will give you a different distribution, e.g.

    Hours Studying (x)     Frequency (f)
    7.5 to under 12.5            1
    12.5 to under 17.5          12
    17.5 to under 22.5          10
    22.5 to under 27.5           5
    27.5 to under 32.5           1
    32.5 to under 37.5           1
    Total                       30

Construct a Frequency Distribution Using Excel
[Screenshots omitted] Click on Frequency Distributions, then click on Quantitative. The input needs shown on the slide are $A:$A, 5 and 10; the 5 and 10 match the class width and the first lower limit chosen above.

Relative Frequency Distribution
…shows the percent of observations in each class!
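The tally-and-count steps, and the relative frequencies just defined, can be sketched as follows. The helper frequency_distribution is a hypothetical name, and the sketch assumes "lower limit to under upper limit" classes of equal width, as in the hand-worked example.

```python
# Hours studied last week by the 30 sampled students (from the example).
hours = [15.0, 23.7, 19.7, 15.4, 18.3, 23.0, 14.2, 20.8, 13.5, 20.7,
         17.4, 18.6, 12.9, 20.3, 13.7, 21.4, 18.3, 29.8, 17.1, 18.9,
         10.3, 26.1, 15.7, 14.0, 17.8, 33.8, 23.2, 12.9, 27.1, 16.6]


def frequency_distribution(data, lower_limits, width):
    """Count the observations in each class [low, low + width)."""
    counts = [0] * len(lower_limits)
    for x in data:
        for i, low in enumerate(lower_limits):
            if low <= x < low + width:
                counts[i] += 1
                break
    return counts


# Classes starting at 10 hours, as set up by hand.
counts = frequency_distribution(hours, [10, 15, 20, 25, 30], 5)

# The same data with different limits gives a different distribution.
shifted = frequency_distribution(
    hours, [7.5, 12.5, 17.5, 22.5, 27.5, 32.5], 5)

# Relative frequency = class frequency / total number of observations.
relative = [round(c / len(hours), 4) for c in counts]

print(counts)
print(shifted)
print(relative)
```

Both sets of counts sum to 30, which is a quick check that the classes are mutually exclusive and collectively exhaustive for this data set.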
    Hours Studying (x)     f      Relative f
    10.0 to under 15        7      7/30 = 0.2333
    15.0 to under 20       12     12/30 = 0.40
    20.0 to under 25        7      7/30 = 0.2333
    25.0 to under 30        3      3/30 = 0.10
    30.0 to under 35        1      1/30 = 0.0333
    Total                  30     30/30 = 1

Using different limits

    Hours Studying (x)     f      Relative f
    7.5 to under 12.5       1      1/30 = 0.0333
    12.5 to under 17.5     12     12/30 = 0.40
    17.5 to under 22.5     10     10/30 = 0.3333
    22.5 to under 27.5      5      5/30 = 0.1667
    27.5 to under 32.5      1      1/30 = 0.0333
    32.5 to under 37.5      1      1/30 = 0.0333
    Total                  30     30/30 = 1

Stem-and-leaf Displays
A statistical technique for displaying a set of data. Each numerical value is divided into two parts:
1. the leading digits become the stem, and
2. the trailing digits become the leaf.
An advantage of the stem-and-leaf display over a frequency distribution is that we retain the value of each observation!

Example
A student achieved the following scores on the twelve accounting quizzes this semester:
86, 79, 92, 84, 69, 88, 91, 83, 96, 78, 82, 85.
Construct a stem-and-leaf chart to illustrate the results.

First, find the lowest score (69); its leading digit, 6, is the first stem. Then find the lowest score for each of the following leading digits (78, 82 and 91). You should now have the stems 6, 7, 8 and 9, each with its first leaf:

    Stem   Leaf
    6      9
    7      8
    8      2
    9      1

Now, list the remaining "leaf" scores. With all 12 scores:

    Stem   Leaf
    6      9
    7      8 9
    8      2 3 4 5 6 8
    9      1 2 6
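The construction above can be sketched in Python. This is a minimal illustration that assumes non-negative two-digit whole-number scores, so that the tens digit is the stem and the units digit is the leaf; the function name stem_and_leaf is hypothetical.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group each value's units digit (leaf) under its tens digit (stem)."""
    leaves = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        leaves[stem].append(leaf)
    return dict(leaves)


# The twelve accounting quiz scores from the example.
scores = [86, 79, 92, 84, 69, 88, 91, 83, 96, 78, 82, 85]
display = stem_and_leaf(scores)

for stem in sorted(display):
    print(stem, "|", " ".join(str(leaf) for leaf in display[stem]))
```

Because every leaf is kept, the original scores can be read back from the display, which is the advantage over a frequency distribution noted above.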
The grades on a statistics exam for a sample of 40 students are as follows:

    Stem   Leaf
    3      6 8
    4      1 2 7 8
    5      0 1 2 5 5 8 9
    6      0 1 1 1 2 5 7 8 8 8 9
    7      0 0 2 5 6 6 7
    8      4 6 8 8 9
    9      0 2 4 6

Alpha-Numeric Grading

    A+ = 90%-100%     C+ = 65%-69%
    A  = 80%-89%      C  = 60%-64%
    B+ = 75%-79%      D  = 55%-59%
    B  = 70%-74%      F  = 0%-54%

How many students earned an A on this test? 5
What is the most common letter grade earned? F

Graphic Presentation of a Frequency Distribution
The three commonly used graphic forms are:
1. Histograms
2. Frequency Polygons or Line Charts
3. Cumulative Frequency Distributions

A histogram is a graph in which the classes are marked on the horizontal axis and the class frequencies on the vertical axis. The class frequencies are represented by the heights of the bars, and the bars are drawn adjacent to each other.

Histogram

    Hours Studying (x)     f
    10.0 to under 15        7
    15.0 to under 20       12
    20.0 to under 25        7
    25.0 to under 30        3
    30.0 to under 35        1

[Figure: histogram of frequency (axis 0 to 14) against hours spent studying (axis 10 to 35)]

A frequency polygon consists of line segments connecting the points formed by the class midpoint and the class frequency.

A cumulative frequency distribution is used to determine how many or what proportion of the data values are below or above a certain value.
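As a rough sketch of the two definitions above, the points plotted in a frequency polygon and in a cumulative frequency polygon can be computed from the hours-studying table. The variable names are illustrative, and the sketch only prepares the points; drawing them is left to whatever charting tool is in use.

```python
from itertools import accumulate

# Frequency distribution for hours studying (classes of width 5).
lower_limits = [10, 15, 20, 25, 30]
width = 5
frequencies = [7, 12, 7, 3, 1]

# Frequency polygon: plot each class frequency at the class midpoint,
# the average of the lower and upper class limits.
midpoints = [low + width / 2 for low in lower_limits]
polygon_points = list(zip(midpoints, frequencies))

# Cumulative frequency polygon: plot the running total at each upper limit,
# i.e. the number of observations "under" that limit.
upper_limits = [low + width for low in lower_limits]
cumulative = list(accumulate(frequencies))
ogive_points = list(zip(upper_limits, cumulative))

print(polygon_points)
print(ogive_points)
```

The last cumulative value equals the sample size (30), and dividing each cumulative value by 30 would give the proportion of students below each limit.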
[Figures: a frequency polygon (frequency axis 0 to 14) and a cumulative frequency polygon (cumulative frequency axis 0 to 35)]

Making a Histogram in Excel
[Screenshots omitted]

Click on DATA ANALYSIS, then click on HISTOGRAM.

The upper limits of the classes you have determined must now be entered from Column B (Excel calls these "bins"). Complete the inputting of data.

To remove the legend on the right side, right mouse click on it and click on Clear.

To remove the spaces between the bars, right mouse click on one of the bars and click on Format Data Series. Now click on the Options tab, adjust the Gap width down to 0, and click on OK.

Edit the size of the histogram, titles, etc. as appropriate.

Note that the upper limit values are included in each class; this explains the difference between this Excel frequency distribution and the one we did by hand.

Frequency Polygon or Line Chart for Hours Spent Studying

    Hours Studying (x)     f
    10.0 to under 15        7
    15.0 to under 20       12
    20.0 to under 25        7
    25.0 to under 30        3
    30.0 to under 35        1

[Figure: frequency polygon of frequency (axis 0 to 14) against hours spent studying (axis 10 to 35)]

Notice that the class midpoints (the plotted points) aren't as "user friendly" in this distribution choice.
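The Excel note above, that upper limit values are included in each class, can be illustrated with a short comparison. The bin values 15, 20, 25, 30 and 35 are an assumption (the slides do not list the bins that were entered), and the two counting functions are hypothetical helpers written for this sketch, not Excel's implementation.

```python
# Hours studied last week by the 30 sampled students (from the example).
hours = [15.0, 23.7, 19.7, 15.4, 18.3, 23.0, 14.2, 20.8, 13.5, 20.7,
         17.4, 18.6, 12.9, 20.3, 13.7, 21.4, 18.3, 29.8, 17.1, 18.9,
         10.3, 26.1, 15.7, 14.0, 17.8, 33.8, 23.2, 12.9, 27.1, 16.6]

# Assumed bin (upper limit) values; not given explicitly on the slides.
bins = [15, 20, 25, 30, 35]


def count_upper_inclusive(data, bins):
    """Bin-style counting: a value goes in the first bin with x <= bin."""
    counts = [0] * len(bins)
    for x in data:
        for i, upper in enumerate(bins):
            if x <= upper:
                counts[i] += 1
                break
    return counts


def count_lower_inclusive(data, bins, width):
    """Hand method: classes run from the lower limit to under the upper."""
    counts = [0] * len(bins)
    for x in data:
        for i, upper in enumerate(bins):
            if upper - width <= x < upper:
                counts[i] += 1
                break
    return counts


print(count_upper_inclusive(hours, bins))
print(count_lower_inclusive(hours, bins, 5))
```

Under this assumption the only observation that moves is 15.0, which sits exactly on a class boundary: the hand method puts it in "15.0 to under 20", while upper-inclusive bins put it in the first class.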
Cumulative Frequency Distribution for Hours Studying

    Hours Studying (x)     f      Hours Studying     Cumulative f
    10.0 to under 15        7     under 15                 7
    15.0 to under 20       12     under 20                19
    20.0 to under 25        7     under 25                26
    25.0 to under 30        3     under 30                29
    30.0 to under 35        1     under 35                30

[Figure: cumulative frequency polygon of cumulative frequency (axis 0 to 35) against hours spent studying (axis 10 to 35)]

Notice that the limits are the plotted points.

Pie, Bar and Line Charts
…used primarily for qualitative data.

Pie
…is useful for displaying a relative frequency distribution. A circle is divided proportionally to the relative frequency, and portions of the circle are allocated for the different groups.

Example: 200 runners were asked to indicate their favourite type of running shoe.

    Type       # of runners selecting
    Nike               92
    Adidas             49
    Reebok             37
    Asics              13
    Other               9

Draw a pie chart based on this information.

Relative Frequency Distribution for the running shoes

    Type         #        %
    Nike         92     46.0
    Adidas       49     24.5
    Reebok       37     18.5
    Asics        13      6.5
    Other         9      4.5
    Total       200    100

[Figure: pie chart with slices Nike 46.0%, Adidas 24.5%, Reebok 18.5%, Asics 6.5%, Other 4.5%]

Using Excel, follow the steps in the Chart Wizard to construct a pie chart!

Bar
…can be used to depict any of the levels of measurement (nominal, ordinal, interval, or ratio).
Examples of bar charts (also known as "column charts"), including a 3-D version.
[Figures omitted]

Use bar charts also when the order in which qualitative data are presented is meaningful.

How could we chart this data?

    Canadian City     Employment Rate (%)
    Victoria                57.7
    Vancouver               61.4
    Edmonton                67.1
    Winnipeg                66.7
    Saskatoon               63.7
    Regina                  67.4
    Thunder Bay             61.0
    London                  63.3
    Kitchener               66.0
    Hamilton                63.2
    Toronto                 65.1
    Quebec                  59.7
    Sherbrooke              59.2
    Montreal                60.4
    Halifax                 60.5

Using Excel we can produce this… Other formats…

[Figure: bar chart, "Employment Rate in Canadian Cities"; vertical axis: % employment, 52 to 70]

[Figure: bar chart, "Employment Rate in Canadian Cities - by Province"; vertical axis: % employment, 52 to 70]

Did any of the previous bar charts adequately display all the information that was provided?

The following chart has been modified from data published by Statistics Canada. Does it do an effective job of displaying the StatCan data?
Clustered Bar
[Figure: clustered bar chart, "Comparison of Internet Use in 2000 and 2001"; vertical axis: % of enterprises, 0 to 100; categories: Manufacturing, Wholesale trade, Retail trade; series: % of enterprises that use the Internet (2000 and 2001) and % of enterprises with a Web site (2000 and 2001). Data source: Statistics Canada]

Stacked Bar
Full-Time University Faculty by Gender, Canada and Jurisdictions, 1987-88 and 1997-98

    Rank                   Year       Number     % Female   % Male
    Total                  1987-88    34,651        17        83
                           1997-98    33,925        25        75
    Full Professor         1987-88    12,829         7        93
                           1997-98    13,910        13        87
    Associate Professor    1987-88    12,650        17        83
                           1997-98    12,095        28        72
    Other                  1987-88     9,172        32        68
                           1997-98     7,817        44        56

[Figure: stacked bar chart, "% of Total Canadian Full Time University Faculty"; stacks: % males and % females; bars for 1987-88 and 1997-98; vertical axis 0 to 120. Data source: Statistics Canada]

Make sure that your charts are not overly cluttered.

Shapes of Histograms
There are four typical shape characteristics.

Symmetry
…a balanced effect! Both example histograms are "balanced", or "have symmetry".

Skewness
…occurs when the observations are graphed as being skewed or tilted more to one side of the centre of the observations than the other. The skewness, if on the right side, is said to be "positive". The skewness, if on the left side, is said to be "negative".

Modal Class
A modal class is the one with the largest number of observations.
[Figure: this is a unimodal histogram]
[Figure: this is a bimodal histogram, with two modal classes]

Bell Shape
Population distributions are often bell shaped. Drawing a histogram helps verify the shape of the population in question.

Line
Line charts are particularly useful when the trend over time is to be emphasized.
Examples: a 3-D line chart, and a line chart in combination with another chart type.
[Figures omitted]

[Figure: time plot, "Monthly Steel Production"; horizontal axis: month, January 2000 to October 2002; vertical axis 5.5 to 8.5]

[Figure: line chart, "Employment Rate in Canadian Cities"; vertical axis: % employment, 52 to 70]
Preparing a line chart for this type of data is not overly useful!

[Figure: combination chart, "Employment Rate in Canadian Cities"; vertical axis: % employment, 52 to 70]
Is this combination any better for displaying the data?

Frequency Polygon and Ogive
[Figure: two panels plotted against Sales (0 to 50): a frequency polygon (vertical axis 0.0 to 0.3) and an ogive (vertical axis 0.0 to 1.0)]

Test your learning…
www.mcgrawhill.ca/college/lind
Online Learning Centre for quizzes, extra content, data sets, a searchable glossary, access to Statistics Canada's E-Stat data …and much more!

This completes Chapter 2.