Business Statistics: A Decision-Making Approach, 7th Edition
Chapter 2: Graphs, Charts, and Tables – Describing Your Data

Chapter Goals
After completing this chapter, you should be able to:
- Construct a frequency distribution both manually and with a computer
- Construct and interpret a histogram
- Create and interpret bar charts, pie charts, and stem-and-leaf diagrams
- Present and interpret data in line charts and scatter diagrams

Chapter Focus
- First hands-on practice using Excel
  - The practice exercises are simple, though not much fun
  - Try to become familiar with Excel
- Describe data using a frequency distribution and a relative frequency distribution
  - Discrete
  - Continuous
- Present data using a chart
  - Universal and popular way: Histogram

Variable (Data) Types
Variable (Data)
- Qualitative (Categorical)
- Quantitative (Numerical)
  1) Discrete
  2) Continuous

Detail View
Data
- Qualitative Data
  - Tabular Methods
    1) Frequency Distr.
    2) Relative/Percent Frequency Distr.
    3) Crosstabulation
  - Graphic Methods (Charts)
    1) Column
    2) Pie
- Quantitative Data
  - Tabular Methods
    1) Frequency Distr.
    2) Relative/Percent Frequency Distr.
    3) Cumulative Frequency Distr.
    4) Cumulative Relative/Percent Frequency Distr.
    5) Crosstabulation
  - Graphic Methods
    1) Dot Plot
    2) Histogram
    3) Ogive
    4) Stem & Leaf Display
    5) Scatter Diagram

Frequency Distribution (FD)
A frequency distribution is a tabulation of the values in a sample. Each entry in the table contains the frequency or count of the occurrences of values within a particular group or interval, and in this way the table summarizes the distribution of values in the sample.

Number of days read    Frequency
0                      44
1                      24
2                      18
3                      16
4                      20
5                      22
6                      26
7                      30
Total                  200

Why Use FD?
- A frequency distribution is a way to summarize data
- The distribution condenses the raw data into a more useful form...
  and allows for a quick visual interpretation of the data

Frequency Distribution: Discrete Data
- Discrete data: possible values are countable
- Example: An advertiser asks 200 customers how many days per week they read the daily newspaper.

Number of days read    Frequency
0                      44
1                      24
2                      18
3                      16
4                      20
5                      22
6                      26
7                      30
Total                  200

Relative Frequency
Relative Frequency: What proportion (%) is in each category?

Number of days read    Frequency    Relative Frequency
0                      44           .22
1                      24           .12
2                      18           .09
3                      16           .08
4                      20           .10
5                      22           .11
6                      26           .13
7                      30           .15
Total                  200          1.00

44 / 200 = .22
22% of the people in the sample report that they read the newspaper 0 days per week

Practice
Develop FD using Discrete Data
- Download the “SportShoes” Excel data file from the class website
- Make sure to download and SAVE the data file.
- See the note (ppt).

Frequency Distribution: Continuous Data
- Continuous data: uncountable; may take on any value in some interval
- Example: A manufacturer of insulation randomly selects 20 winter days and records the daily high temperature
  (Temperature is a continuous variable because it could be measured to any degree of precision desired, e.g., 98.58697 F)

24, 35, 17, 21, 24, 37, 26, 46, 58, 30, 32, 13, 12, 38, 41, 43, 44, 27, 53, 27

Grouping Data by Classes
- Sort raw data from low to high (easy using Excel):
  12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
- Find range: 58 (Max) - 12 (Min) = 46 (use for class width)
- Determine number of classes:
  - Rule of thumb: between 5 and 20
  - Calculation: choose the smallest k such that 2^k >= n (here n = 20)
  - In Excel: 2^4 = 16, which is less than 20, and 2^5 = 32, which is at least 20, so take k = 5
  - Thus, there should be 5 classes.
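
The first grouping steps above (sort, find the range, pick the number of classes with the 2^k >= n rule) can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the Excel practice; the function name `number_of_classes` is an assumption made for this example. The last line divides the range by the number of classes, which is why the range is noted as being used for the class width.

```python
# Daily high temperatures from the example above (20 winter days).
temps = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
         32, 13, 12, 38, 41, 43, 44, 27, 53, 27]


def number_of_classes(n):
    """Return the smallest k with 2**k >= n (hypothetical helper name)."""
    k = 1
    while 2 ** k < n:
        k += 1
    return k


data = sorted(temps)                  # sort raw data from low to high
data_range = data[-1] - data[0]       # largest value minus smallest value
k = number_of_classes(len(data))      # 2**4 = 16 < 20 <= 32 = 2**5
raw_width = data_range / k            # range divided by number of classes

print("sorted:", data)
print("range:", data_range)
print("classes:", k)
print("raw class width:", raw_width)
```

With the 20 temperatures, the sketch reports a range of 46 and 5 classes, matching the hand calculation.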
Grouping Data by Classes
- Compute class width:
  W = (Largest Value - Smallest Value) / Number of Classes = 46 / 5 = 9.2, then round up to 10
- Determine intervals: 10, 20, 30, 40, 50
  (Sometimes class midpoints are reported: 15, 25, 35, 45, 55 - if calculation result is 13.5)
- Construct frequency distribution: count number of values in each class

Frequency Distribution
Data from low to high:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

Class               Frequency    Relative Frequency
10 but under 20     3            .15
20 but under 30     6            .30
30 but under 40     5            .25
40 but under 50     4            .20
50 but under 60     2            .10
Total               20           1.00

(Lower limit of the first class is 10; 0 or 12 is also OK)

Histogram based on FD
Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

[Figure: Histogram of the temperature data. Vertical axis: Frequency (0 to 7). Horizontal axis: class midpoints and class endpoints. Bar heights: 3, 6, 5, 4, 2.]
No gaps between bars, since continuous data

Histogram
- The classes or intervals are shown on the horizontal axis
- Frequency is measured on the vertical axis
- Bars of the appropriate heights can be used to represent the number of observations within each class
- Such a graph is called a histogram

How Many Class Intervals?
Few (wide class intervals)
- may compress variation too much and yield a blocky distribution
- can obscure important patterns of variation
Many (narrow class intervals)
- may yield a very jagged distribution with gaps from empty classes
- can give a poor indication of how frequency varies across classes

[Figures: two histograms of Temperature (vertical axis: Frequency), one with many narrow classes and one with few wide classes.]
(X axis labels are upper class endpoints)

General Guidelines

Number of Data Points    Number of Classes
under 50                 5 - 7
50 - 100                 6 - 10
100 - 250                7 - 12
over 250                 10 - 20

Practice
Develop FD using continuous data
- Download the “Capital Credit Union” Excel file from the class website
- See the note (ppt).

Joint Frequency Distribution
- What does the credit card balance distribution look like for male versus female cardholders?
- Conventional way: develop a frequency distribution and a histogram for each gender separately
- Better way: combine the two variables (M/F) in a joint frequency distribution, which makes it much easier to compare two different variables
- See the next slide

Joint Frequency Distribution
[Figure: joint frequency distribution table]

Practice
- Develop JFD and relative JFD using the “Capital Credit Union” Excel file
- Then develop other types (i.e., charts, diagrams) using the “Bach, Lombard, & Wilson” Excel files
- See the note (ppt).
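
As a rough illustration of what a joint frequency distribution and its relative version contain, the sketch below cross-tabulates balance classes against gender in Python. The records are hypothetical values invented for this example (they are not taken from the “Capital Credit Union” file), and the class width of 500 and the helper name `balance_class` are likewise assumptions.

```python
from collections import Counter

# Hypothetical (gender, balance) records, invented purely for illustration.
records = [("M", 250), ("F", 480), ("M", 720), ("F", 150), ("M", 510),
           ("F", 930), ("M", 90), ("F", 610), ("M", 840), ("F", 300)]

WIDTH = 500  # assumed class width for the balance variable


def balance_class(balance):
    """Return the lower limit of the class that contains `balance`."""
    return (balance // WIDTH) * WIDTH


# Joint frequency: count of records in each (balance class, gender) cell.
joint = Counter((balance_class(b), g) for g, b in records)
n = len(records)

# Relative joint frequency: each cell count divided by the sample size.
relative = {cell: count / n for cell, count in joint.items()}

classes = sorted({c for c, _ in joint})
genders = sorted({g for _, g in joint})

print("Balance class".ljust(22) + "".join(g.rjust(6) for g in genders) + "Total".rjust(8))
for c in classes:
    row = [joint[(c, g)] for g in genders]
    label = f"{c} but under {c + WIDTH}"
    print(label.ljust(22) + "".join(str(v).rjust(6) for v in row) + str(sum(row)).rjust(8))
```

Each row of the printed table is one balance class and each column one gender, so the two groups can be compared in a single table instead of two separate frequency distributions.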
Ogives
- An ogive is a graph of the cumulative relative frequencies from a relative frequency distribution
- Ogives are sometimes shown in the same graph as a relative frequency histogram

Ogives (continued)
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
Add a cumulative relative frequency column:

Frequency Distribution
Class               Frequency    Relative Frequency    Cumulative Relative Frequency
10 but under 20     3            .15                   .15
20 but under 30     6            .30                   .45
30 but under 40     5            .25                   .70
40 but under 50     4            .20                   .90
50 but under 60     2            .10                   1.00
Total               20           1.00

Ogive Example
[Figure: Histogram with ogive. Left vertical axis: Frequency (0 to 7). Right vertical axis: Cumulative Frequency (%) (0 to 100). Horizontal axis: class midpoints and class endpoints.]

Ogives in Excel
- Excel will show the ogive graphically if the “Cumulative Percentage” option is selected in the Histogram dialog box

Other Graphical Presentation Tools
** Try the rest of them by yourself **
- Qualitative (Categorical) Data
  - Bar Chart
  - Pie Charts
- Quantitative (Numerical) Data
  - Stem and Leaf Diagram

Bar and Pie Charts
- Bar charts and pie charts are often used for qualitative (category) data
- Height of bar or size of pie slice shows the frequency or percentage for each category

Bar Chart Example 1
[Figure: horizontal bar chart "Investor's Portfolio". Categories: Savings, CD, Bonds, Stocks. Horizontal axis: Amount in $1000's (0 to 50).]
(Note that bar charts can also be displayed with vertical bars)

Bar Chart Example 2

Number of days read    Frequency
0                      44
1                      24
2                      18
3                      16
4                      20
5                      22
6                      26
7                      30
Total                  200

[Figure: vertical bar chart "Newspaper readership per week". Vertical axis: Frequency (0 to 50). Horizontal axis: Number of days newspaper is read per week (0 to 7).]

Pie Chart Example
Current Investment Portfolio

Investment Type    Amount (in thousands $)    Percentage
Stocks             46.5                       42.27
Bonds              32.0                       29.09
CD                 15.5                       14.09
Savings            16.0                       14.55
Total              110                        100

(Variables are Qualitative)

[Figure: pie chart with slices Stocks 42%, Bonds 29%, CD 14%, Savings 15%.]
Percentages are rounded to the nearest percent
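
The percentages in the pie chart example come from dividing each amount by the portfolio total. A minimal Python sketch of that arithmetic, using the amounts from the table (the variable names are assumptions for this example):

```python
# Amounts (in thousands of dollars) from the portfolio table above.
portfolio = {"Stocks": 46.5, "Bonds": 32.0, "CD": 15.5, "Savings": 16.0}

total = sum(portfolio.values())

# Percentage share of each category, to two decimals as in the table.
percentages = {name: round(100 * amount / total, 2)
               for name, amount in portfolio.items()}

# Slice labels rounded to the nearest percent, as on the pie chart.
labels = {name: round(100 * amount / total)
          for name, amount in portfolio.items()}

for name, amount in portfolio.items():
    print(f"{name:<8}{amount:>6.1f}{percentages[name]:>8.2f}{labels[name]:>5d}%")
```

For Stocks, for instance, 46.5 / 110 gives 42.27%, which appears as 42% on the chart.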
Tabulating and Graphing Multivariate Categorical Data
Investment in thousands of dollars

Investment Category    Investor A    Investor B    Investor C    Total
Stocks                 46.5          55            27.5          129
Bonds                  32.0          44            19.0          95
CD                     15.5          20            13.5          49
Savings                16.0          28            7.0           51
Total                  110.0         147           67.0          324

Tabulating and Graphing Multivariate Categorical Data (continued)
Side-by-side charts
[Figure: side-by-side horizontal bar chart "Comparing Investors". Categories: Savings, CD, Bonds, Stocks. Series: Investor A, Investor B, Investor C. Horizontal axis: 0 to 60.]

Side-by-Side Chart Example
Sales by quarter for three sales territories:

         1st Qtr    2nd Qtr    3rd Qtr    4th Qtr
East     20.4       27.4       59         20.4
West     30.6       38.6       34.6       31.6
North    45.9       46.9       45         43.9

[Figure: side-by-side column chart of sales by quarter. Series: East, West, North. Vertical axis: 0 to 60.]

Stem and Leaf Diagram
A simple way to see distribution details from quantitative data

METHOD
1. Separate the sorted data series into leading digits (the stem) and the trailing digits (the leaves)
2. List all stems in a column from low to high
3. For each stem, list all associated leaves

Example:
Data sorted from low to high:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
Here, use the 10’s digit for the stem unit:

                   Stem    Leaf
12 is shown as     1       2
35 is shown as     3       5

Example:
Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
Completed stem-and-leaf diagram:

Stem    Leaves
1       2 3 7
2       1 4 4 6 7 7
3       0 2 5 7 8
4       1 3 4 6
5       3 8

Using other stem units
Using the 100’s digit as the stem:
Round off the 10’s digit to form the leaves

                     Stem    Leaf
613 would become     6       1
776 would become     7       8
...
1224 becomes         12      2

Line Charts and Scatter Diagrams
- Line charts show values of one variable vs.
  time
  - Time is traditionally shown on the horizontal axis
- Scatter diagrams show points for bivariate data
  - One variable is measured on the vertical axis and the other variable is measured on the horizontal axis

Line Chart Example

Year    Inflation Rate
1985    3.56
1986    1.86
1987    3.65
1988    4.14
1989    4.82
1990    5.40
1991    4.21
1992    3.01
1993    2.99
1994    2.56
1995    2.83
1996    2.95
1997    2.29
1998    1.56
1999    2.21
2000    3.36
2001    2.85
2002    1.59
2003    2.27
2004    2.68
2005    3.39
2006    3.24

[Figure: line chart "U.S. Inflation Rate". Vertical axis: Inflation Rate (%) (0 to 6). Horizontal axis: Year (1984 to 2006).]

Scatter Diagram Example

Volume per day    Cost per day
23                125
26                140
29                146
33                160
38                167
42                170
50                188
55                195
60                200

[Figure: scatter diagram "Production Volume vs. Cost per Day". Vertical axis: Cost per Day (0 to 250). Horizontal axis: Volume per Day (0 to 70).]

Types of Relationships
Linear Relationships
[Figure: two scatter plots of Y against X showing linear patterns]

Types of Relationships (continued)
Curvilinear Relationships
[Figure: two scatter plots of Y against X showing curved patterns]

Types of Relationships (continued)
No Relationship
[Figure: two scatter plots of Y against X showing no pattern]

Chapter Summary
- Data in raw form are usually not easy to use for decision making
  - Some type of organization is needed: Table, Graph
- Techniques reviewed in this chapter:
  - Frequency Distributions, Histograms, and Ogives
  - Bar Charts and Pie Charts
  - Stem and Leaf Diagrams
  - Line Charts and Scatter Diagrams