Business Statistics:
A Decision-Making Approach
7th Edition
Chapter 2
Graphs, Charts, and Tables –
Describing Your Data
Chap 2-1
Chapter Goals
After completing this chapter, you should be
able to:

Construct a frequency distribution both manually and with
a computer

Construct and interpret a histogram

Create and interpret bar charts, pie charts, and stem-and-leaf diagrams

Present and interpret data in line charts and scatter
diagrams
Chap 2-2
Chapter Focus

First time trying practices using Excel
  The practices are simple, though not especially fun
  Try to become familiar with Excel

Describe data using frequency distributions and
relative frequency distributions
  Discrete
  Continuous

Present data using a chart
  Universal and popular way: Histogram
Chap 2-3
Variable (Data) Types
Variable (Data)
  Qualitative (Categorical)
  Quantitative (Numerical)
    1) Discrete
    2) Continuous
Chap 2-4
Detail View
Data
  Qualitative Data
    Tabular Methods
      1) Frequency Distr.
      2) Relative/Percent Frequency Distr.
      3) Crosstabulation
    Graphic Methods
      Charts:
      1) Column
      2) Pie
  Quantitative Data
    Tabular Methods
      1) Frequency Distr.
      2) Relative/Percent Frequency Distr.
      3) Cumulative Frequency Distr.
      4) Cumulative Relative/Percent Frequency Distr.
      5) Crosstabulation
    Graphic Methods
      1) Dot Plot
      2) Histogram
      3) Ogive
      4) Stem & Leaf Display
      5) Scatter Diagram
Chap 2-5
Frequency Distribution (FD)

A frequency distribution is a tabulation of the values in a data set.

Each entry in the table contains
the frequency, or count, of the
occurrences of values within a
particular group or interval.

In this way the table
summarizes the distribution of
values in the sample.
Number of days read   Frequency
0                     44
1                     24
2                     18
3                     16
4                     20
5                     22
6                     26
7                     30
Total                 200
Chap 2-6
Why Use FD?

A frequency distribution is a way to
summarize data

The distribution condenses the raw data
into a more useful form...

and allows for a quick visual interpretation
of the data
Chap 2-7
Frequency Distribution:
Discrete Data

Discrete data: possible values are countable
Example: An advertiser asks 200 customers how many days
per week they read the daily newspaper.

Number of days read   Frequency
0                     44
1                     24
2                     18
3                     16
4                     20
5                     22
6                     26
7                     30
Total                 200
Chap 2-8
Relative Frequency
Relative Frequency: What proportion (%) is in each category?
Number of days read   Frequency   Relative Frequency
0                     44          .22
1                     24          .12
2                     18          .09
3                     16          .08
4                     20          .10
5                     22          .11
6                     26          .13
7                     30          .15
Total                 200         1.00

Example: 44 / 200 = .22
22% of the people in the sample report that they read
the newspaper 0 days per week
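
The relative frequency calculation can also be sketched outside Excel. The short Python example below is only an illustration, not part of the class practice. It uses the frequencies from the newspaper example, and the variable names are assumptions made for this sketch.

```python
# Frequencies from the newspaper example:
# number of days read per week -> number of customers
frequency = {0: 44, 1: 24, 2: 18, 3: 16, 4: 20, 5: 22, 6: 26, 7: 30}

# Total number of observations (200 customers)
total = sum(frequency.values())

# Relative frequency = frequency of the category / total number of observations
relative_frequency = {days: count / total for days, count in frequency.items()}

print("Days  Frequency  Relative Frequency")
for days, count in frequency.items():
    print(f"{days:<5} {count:<10} {relative_frequency[days]:.2f}")
print(f"Total {total:<10} {sum(relative_frequency.values()):.2f}")
```

The relative frequencies always sum to 1.00 (up to rounding), which is a quick check that no category was missed.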
Chap 2-9
Practice

Develop FD using Discrete Data



Download the “SportShoes” Excel data file
from the class website
Make sure to download and SAVE the data
file.
See the note (ppt).
Chap 2-10
Frequency Distribution:
Continuous Data

Continuous data: possible values are uncountable and may take on any
value in some interval
Example: A manufacturer of insulation randomly selects
20 winter days and records the daily high temperature

(Temperature is a continuous variable because it could
be measured to any degree of precision desired, e.g., 98.58697 °F)
24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
32, 13, 12, 38, 41, 43, 44, 27, 53, 27
Chap 2-11
Grouping Data by Classes
Sort raw data from low to high (easy using Excel):
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

Find range: 58 (Max) – 12 (Min) = 46 (used to compute the class width)

Determine number of classes:

Rule of thumb: between 5 and 20

Calculation: choose the smallest k such that 2^k >= n (here n = 20)

2^4 = 16 is less than 20, but 2^5 = 32 is at least 20
(in Excel: =2^4 and =2^5). So take k = 5.

Thus, there should be 5 classes.
Chap 2-12
Grouping Data by Classes

Compute class width:

W = (Largest Value - Smallest Value) / Number of Classes

W = 46 / 5 = 9.2, then round up to 10

Determine intervals (lower class limits): 10, 20, 30, 40, 50

(Sometimes class midpoints are reported: 15, 25, 35, 45, 55 – if
calculation result is 13.5)

Construct frequency distribution

count number of values in each class
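
As a rough check of the steps above, the following Python sketch applies the 2^k >= n rule and the class width formula to the 20 temperature values. Rounding the width up to the next multiple of 5 is an assumption made here to reproduce the "round up to 10" step; the slides do not prescribe a specific rounding rule.

```python
import math

# Daily high temperatures from the insulation example (n = 20)
temps = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
         32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

n = len(temps)

# Number of classes: smallest k such that 2**k >= n
k = 1
while 2 ** k < n:
    k += 1

# Range = largest value - smallest value
data_range = max(temps) - min(temps)

# Raw class width W = range / number of classes
raw_width = data_range / k

# Assumption: round the raw width up to a convenient multiple of 5
width = math.ceil(raw_width / 5) * 5

print(f"n = {n}, classes k = {k}")
print(f"range = {data_range}, raw width = {raw_width:.1f}, class width = {width}")
```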
Chap 2-13
Frequency Distribution
Data from low to high:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
Frequency Distribution

Class              Frequency   Relative Frequency
10 but under 20    3           .15
20 but under 30    6           .30
30 but under 40    5           .25
40 but under 50    4           .20
50 but under 60    2           .10
Total              20          1.00

(For the lower limit of the first class, 0 or 12 is also OK)
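
The counting step can be sketched in Python as follows. This is a minimal illustration, assuming lower class limits of 10, 20, 30, 40, 50 and a class width of 10, with each class including its lower limit and excluding its upper limit.

```python
# Temperature data, sorted from low to high
temps = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
         32, 35, 37, 38, 41, 43, 44, 46, 53, 58]

lower_limits = [10, 20, 30, 40, 50]
width = 10

# Count the values that fall in each class: lower <= value < lower + width
counts = [sum(1 for t in temps if lower <= t < lower + width)
          for lower in lower_limits]

# Relative frequency of each class
relative = [count / len(temps) for count in counts]

print("Class              Frequency   Relative Frequency")
for lower, count, rel in zip(lower_limits, counts, relative):
    label = f"{lower} but under {lower + width}"
    print(f"{label:<18} {count:<11} {rel:.2f}")
print(f"{'Total':<18} {sum(counts):<11} {sum(relative):.2f}")
```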
Chap 2-14
Histogram based on FD
Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
[Figure: Histogram of the temperature data. Vertical axis: Frequency (0 to 7); horizontal axis: class midpoints and class endpoints (0 to 60). Bar heights by class: 3, 6, 5, 4, 2. No gaps between bars, since the data are continuous.]
Chap 2-15
Histogram




The classes or intervals are shown on the
horizontal axis
Frequency is measured on the vertical axis
Bars of the appropriate heights can be used
to represent the number of observations
within each class
Such a graph is called a histogram
Chap 2-16
How Many Class Intervals?
Many (Narrow class intervals)
  may yield a very jagged distribution with gaps from empty classes
  can give a poor indication of how frequency varies across classes

Few (Wide class intervals)
  may compress variation too much and yield a blocky distribution
  can obscure important patterns of variation

[Figure: two histograms of Temperature (vertical axis: Frequency), one with many narrow classes and one with few wide classes. X axis labels are upper class endpoints.]
Chap 2-17
General Guidelines

Number of Data Points   Number of Classes
under 50                5 - 7
50 – 100                6 - 10
100 – 250               7 - 12
over 250                10 - 20
Chap 2-18
Practice

Develop FD using continuous data


Download the “Capital Credit Union” Excel
file from the class website
See the note (ppt).
Chap 2-19
Joint Frequency Distribution

What does the credit card balance distribution
look like for male versus female cardholders?

Conventional way: develop a frequency distribution and a histogram
for each gender separately
Better way: combine the two variables (balance and gender, M/F) in a joint
frequency distribution, which makes the two groups much easier
to compare
See the next slide
Chap 2-20
Joint Frequency Distribution
Chap 2-21
Chap 2-22
Practice


Develop JFD and relative JFD using the
“Capital Credit Union” Excel file, and then
develop other types (e.g., charts, diagrams)
using the “Bach, Lombard, & Wilson” Excel files
See the note (ppt).
Chap 2-23
Ogives


An Ogive is a graph of the cumulative
relative frequencies from a relative
frequency distribution
Ogives are sometimes shown in the same
graph as a relative frequency histogram
Chap 2-24
Ogives
(continued)
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
Add a cumulative relative frequency column:
Frequency Distribution

Class              Frequency   Relative Frequency   Cumulative Relative Frequency
10 but under 20    3           .15                  .15
20 but under 30    6           .30                  .45
30 but under 40    5           .25                  .70
40 but under 50    4           .20                  .90
50 but under 60    2           .10                  1.00
Total              20          1.00
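
The cumulative relative frequency column is a running total of the relative frequencies. A minimal Python sketch, using the class frequencies from the table above, is shown below. Accumulating the counts first and dividing afterwards is a choice made here to avoid small floating-point drift; it is not the only way to do it.

```python
from itertools import accumulate

# Class frequencies for the classes 10-20, 20-30, 30-40, 40-50, 50-60
frequencies = [3, 6, 5, 4, 2]
total = sum(frequencies)

# Running total of the counts: 3, 9, 14, 18, 20
cumulative_counts = list(accumulate(frequencies))

# Cumulative relative frequency = cumulative count / total
cumulative_relative = [count / total for count in cumulative_counts]

for freq, cum in zip(frequencies, cumulative_relative):
    print(f"{freq:<4} {freq / total:<6.2f} {cum:.2f}")
```

The last value is always 1.00, because by the final class every observation has been counted.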
Chap 2-25
Ogive Example
[Figure: Histogram / Ogive of the temperature data. Left vertical axis: Frequency (0 to 7); right vertical axis: Cumulative Frequency (%) (0 to 100); horizontal axis: class midpoints and class endpoints (0 to 60).]
Chap 2-26
Ogives in Excel
Excel will show the Ogive
graphically if the
“Cumulative Percentage”
option is selected in the
Histogram dialog box
Chap 2-27
Other Graphical
Presentation Tools
** Try the rest of them by yourself **
Qualitative (Categorical) Data
  Bar Chart
  Pie Charts
Quantitative (Numerical) Data
  Stem and Leaf Diagram
Chap 2-28
Bar and Pie Charts

Bar charts and Pie charts are often used
for qualitative (category) data

Height of bar or size of pie slice shows the
frequency or percentage for each
category
Chap 2-29
Bar Chart Example 1
[Figure: horizontal bar chart "Investor's Portfolio". Categories: Savings, CD, Bonds, Stocks; horizontal axis: Amount in $1000's (0 to 50).]
(Note that bar charts can also be displayed with vertical bars)
Chap 2-30
Bar Chart Example 2
Number of days read   Frequency
0                     44
1                     24
2                     18
3                     16
4                     20
5                     22
6                     26
7                     30
Total                 200

[Figure: bar chart "Newspaper readership per week". Vertical axis: Frequency (0 to 50); horizontal axis: Number of days newspaper is read per week (0 to 7).]
Chap 2-31
Pie Chart Example
Current Investment Portfolio

Investment Type   Amount (in thousands $)   Percentage
Stocks            46.5                      42.27
Bonds             32.0                      29.09
CD                15.5                      14.09
Savings           16.0                      14.55
Total             110                       100

(Variables are Qualitative)

[Figure: pie chart with slices Stocks 42%, Bonds 29%, Savings 15%, CD 14%. Percentages are rounded to the nearest percent.]
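
The percentages behind the pie slices come from dividing each amount by the total. The following Python sketch is an illustration only; it uses the amounts from the table above, and the variable names are assumptions.

```python
# Investment amounts in thousands of dollars
amounts = {"Stocks": 46.5, "Bonds": 32.0, "CD": 15.5, "Savings": 16.0}

total = sum(amounts.values())

# Percentage of the portfolio in each investment type
percentages = {name: 100 * amount / total for name, amount in amounts.items()}

for name, pct in percentages.items():
    # Show two decimals (table) and the nearest whole percent (pie chart label)
    print(f"{name:<8} {pct:6.2f}%  ~ {round(pct)}%")
print(f"{'Total':<8} {sum(percentages.values()):6.2f}%")
```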
Chap 2-32
Tabulating and Graphing
Multivariate Categorical Data

Investment in thousands of dollars

Investment Category   Investor A   Investor B   Investor C   Total
Stocks                46.5         55           27.5         129
Bonds                 32.0         44           19.0         95
CD                    15.5         20           13.5         49
Savings               16.0         28           7.0          51
Total                 110.0        147          67.0         324
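
The row and column totals of such a table can be verified with a few lines of Python. This is a sketch for checking the arithmetic, assuming the three values in each list are ordered Investor A, Investor B, Investor C.

```python
# Investment in thousands of dollars, per category: [Investor A, Investor B, Investor C]
investments = {
    "Stocks":  [46.5, 55, 27.5],
    "Bonds":   [32.0, 44, 19.0],
    "CD":      [15.5, 20, 13.5],
    "Savings": [16.0, 28, 7.0],
}

# Row totals: one total per investment category
row_totals = {category: sum(values) for category, values in investments.items()}

# Column totals: one total per investor (zip transposes rows into columns)
column_totals = [sum(column) for column in zip(*investments.values())]

grand_total = sum(row_totals.values())

print(row_totals)
print(column_totals)
print(grand_total)
```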
Chap 2-33
Tabulating and Graphing
Multivariate Categorical Data
(continued)

Side by side charts
[Figure: side-by-side horizontal bar chart "Comparing Investors". Categories: Savings, CD, Bonds, Stocks; one bar each for Investor A, Investor B, and Investor C; horizontal axis from 0 to 60.]
Chap 2-34
Side-by-Side Chart Example

Sales by quarter for three sales territories:
         1st Qtr   2nd Qtr   3rd Qtr   4th Qtr
East     20.4      27.4      59        20.4
West     30.6      38.6      34.6      31.6
North    45.9      46.9      45        43.9

[Figure: side-by-side bar chart of sales by quarter (1st Qtr to 4th Qtr) for East, West, and North; vertical axis from 0 to 60.]
Chap 2-35
Stem and Leaf Diagram

A simple way to see distribution details from
quantitative data
METHOD
1. Separate the sorted data series into leading digits
   (the stem) and the trailing digits (the leaves)
2. List all stems in a column from low to high
3. For each stem, list all associated leaves
Chap 2-36
Example:
Data sorted from low to high:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

Here, use the 10’s digit for the stem unit:

                 Stem   Leaf
12 is shown as   1      2
35 is shown as   3      5
Chap 2-37
Example:
Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

Completed Stem-and-leaf diagram:

Stem   Leaves
1      2 3 7
2      1 4 4 6 7 7
3      0 2 5 7 8
4      1 3 4 6
5      3 8
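
The three-step method can be sketched in Python as shown below. This is only an illustration, using the temperature data as listed on the earlier slides and the 10's digit as the stem; `divmod(value, 10)` splits each value into its stem and leaf.

```python
from collections import defaultdict

# Temperature data as listed on the earlier slides
temps = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
         32, 35, 37, 38, 41, 43, 44, 46, 53, 58]

# Step 1: split each sorted value into a stem (10's digit) and a leaf (1's digit)
leaves_by_stem = defaultdict(list)
for value in sorted(temps):
    stem, leaf = divmod(value, 10)
    leaves_by_stem[stem].append(leaf)

# Steps 2 and 3: list the stems from low to high, each with its leaves
print("Stem  Leaves")
for stem in sorted(leaves_by_stem):
    leaves = " ".join(str(leaf) for leaf in leaves_by_stem[stem])
    print(f"{stem:<5} {leaves}")
```

The number of leaves on each stem equals the class frequency, so the diagram shows the same shape as the histogram while keeping the individual values visible.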
Chap 2-38
Using other stem units

Using the 100’s digit as the stem:

Round off the 10’s digit to form the leaves

                   Stem   Leaf
613 would become   6      1
776 would become   7      8
...
1224 becomes       12     2
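
A small Python sketch of this variant follows. It rounds each value to the nearest ten and then splits off the 100's digit as the stem. Rounding halves upward (for example, 615 to 620) is an assumption made here; the slide does not say how ties are handled.

```python
def stem_and_leaf_hundreds(value):
    """Return (stem, leaf) using the 100's digit as the stem unit."""
    # Round to the nearest ten, rounding halves up (assumption), in units of ten
    tens = (value + 5) // 10
    # Stem = number of hundreds, leaf = rounded 10's digit
    return divmod(tens, 10)

for value in [613, 776, 1224]:
    stem, leaf = stem_and_leaf_hundreds(value)
    print(f"{value} becomes stem {stem}, leaf {leaf}")
```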
Chap 2-39
Line Charts and
Scatter Diagrams

Line charts show values of one variable
vs. time


Time is traditionally shown on the horizontal axis
Scatter Diagrams show points for bivariate
data

one variable is measured on the vertical axis and
the other variable is measured on the horizontal
axis
Chap 2-40
Line Chart Example
Year   Inflation Rate
1985   3.56
1986   1.86
1987   3.65
1988   4.14
1989   4.82
1990   5.40
1991   4.21
1992   3.01
1993   2.99
1994   2.56
1995   2.83
1996   2.95
1997   2.29
1998   1.56
1999   2.21
2000   3.36
2001   2.85
2002   1.59
2003   2.27
2004   2.68
2005   3.39
2006   3.24

[Figure: line chart "U.S. Inflation Rate". Vertical axis: Inflation Rate (%) (0 to 6); horizontal axis: Year (1984 to 2006).]
Chap 2-41
Scatter Diagram Example
Production Volume vs. Cost per Day

Volume per day   Cost per day
23               125
26               140
29               146
33               160
38               167
42               170
50               188
55               195
60               200

[Figure: scatter diagram "Production Volume vs. Cost per Day". Vertical axis: Cost per Day (0 to 250); horizontal axis: Volume per Day (0 to 70).]
Chap 2-42
Types of Relationships

Linear Relationships
[Figure: two scatter plots of Y against X illustrating linear relationships]
Chap 2-43
Types of Relationships
(continued)

Curvilinear Relationships
[Figure: two scatter plots of Y against X illustrating curvilinear relationships]
Chap 2-44
Types of Relationships
(continued)

No Relationship
[Figure: two scatter plots of Y against X illustrating no relationship]
Chap 2-45
Chapter Summary

Data in raw form are usually not easy to use for
decision making. Some type of organization is
needed:
  Table
  Graph
Techniques reviewed in this chapter:
  Frequency Distributions, Histograms, and Ogives
  Bar Charts and Pie Charts
  Stem and Leaf Diagrams
  Line Charts and Scatter Diagrams
Chap 2-46