Busn210ch02

Slides by John Loucks, St. Edward’s University
Slide 1
Chapter 2, Part A
Descriptive Statistics:
Tabular and Graphical Presentations


Summarizing Categorical Data
Summarizing Quantitative Data
Slide 2
Summarizing Categorical Data






Frequency Distribution
Relative Frequency Distribution
Percent Frequency Distribution
Bar Chart
Pie Chart
Crosstabulation
Slide 3
Frequency Distribution
A frequency distribution is a tabular summary of
data showing the frequency (or number) of items
in each of several non-overlapping classes.
The objective is to provide insights about the data
that cannot be quickly obtained by looking only at
the original data.
Slide 4
Frequency Distribution
Example: Marada Inn
Guests staying at Marada Inn were asked to rate the
quality of their accommodations as being excellent,
above average, average, below average, or poor. The
ratings provided by a sample of 20 guests are:

Below Average
Above Average
Above Average
Average
Above Average
Average
Above Average
Average
Above Average
Below Average
Poor
Excellent
Above Average
Average
Above Average
Above Average
Below Average
Poor
Above Average
Average
Slide 5
Frequency Distribution

Example: Marada Inn
Rating           Frequency
Poor                 2
Below Average        3
Average              5
Above Average        9
Excellent            1
Total               20
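The tally above can also be reproduced outside Excel. The following is a minimal sketch in Python, not part of the original slides; the variable names are illustrative. It counts the 20 Marada Inn ratings with the standard library's Counter.

```python
from collections import Counter

# The 20 guest ratings from the Marada Inn sample, in the order listed.
ratings = [
    "Below Average", "Above Average", "Above Average", "Average",
    "Above Average", "Average", "Above Average", "Average",
    "Above Average", "Below Average", "Poor", "Excellent",
    "Above Average", "Average", "Above Average", "Above Average",
    "Below Average", "Poor", "Above Average", "Average",
]

# Classes in the order used in the frequency table (worst to best).
classes = ["Poor", "Below Average", "Average", "Above Average", "Excellent"]

counts = Counter(ratings)
frequency = {c: counts[c] for c in classes}

for rating, freq in frequency.items():
    print(f"{rating:<15}{freq:>3}")
print(f"{'Total':<15}{sum(frequency.values()):>3}")
```

Listing the classes explicitly keeps the rows in rating order rather than in order of first appearance.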
Slide 6
Using Excel’s COUNTIF Function
to Construct a Frequency Distribution

Excel Formula Worksheet
     A                B    C                D
1    Quality Rating        Quality Rating   Frequency
2    Above Average         Poor             =COUNTIF($A$2:$A$21,C2)
3    Below Average         Below Average    =COUNTIF($A$2:$A$21,C3)
4    Above Average         Average          =COUNTIF($A$2:$A$21,C4)
5    Average               Above Average    =COUNTIF($A$2:$A$21,C5)
6    Average               Excellent        =COUNTIF($A$2:$A$21,C6)
7    Above Average         Total            =SUM(D2:D6)
8    Above Average
Note: Rows 9-21 are not shown.
Slide 7
Using Excel’s COUNTIF Function
to Construct a Frequency Distribution

Excel Value Worksheet
     A                B    C                D
1    Quality Rating        Quality Rating   Frequency
2    Above Average         Poor             2
3    Below Average         Below Average    3
4    Above Average         Average          5
5    Average               Above Average    9
6    Average               Excellent        1
7    Above Average         Total            20
8    Above Average
Note: Rows 9-21 are not shown.
Slide 8
Relative Frequency Distribution
The relative frequency of a class is the fraction or
proportion of the total number of data items
belonging to the class.
A relative frequency distribution is a tabular
summary of a set of data showing the relative
frequency for each class.
Slide 9
Percent Frequency Distribution
The percent frequency of a class is the relative
frequency multiplied by 100.
A percent frequency distribution is a tabular
summary of a set of data showing the percent
frequency for each class.
Slide 10
Relative Frequency and
Percent Frequency Distributions

Example: Marada Inn
Rating           Relative Frequency   Percent Frequency
Poor                    .10                  10
Below Average           .15                  15
Average                 .25                  25
Above Average           .45                  45
Excellent               .05                   5
Total                  1.00                 100
Example calculations: relative frequency for Excellent = 1/20 = .05; percent frequency for Poor = .10(100) = 10.
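As a hedged illustration (not from the original slides), the same two columns can be derived from the frequency counts in a few lines of Python. The dictionary names are illustrative.

```python
# Frequencies from the Marada Inn frequency distribution.
frequency = {
    "Poor": 2,
    "Below Average": 3,
    "Average": 5,
    "Above Average": 9,
    "Excellent": 1,
}

total = sum(frequency.values())  # 20 guests

# Relative frequency = class frequency / total number of items.
relative = {c: f / total for c, f in frequency.items()}

# Percent frequency = relative frequency multiplied by 100.
percent = {c: 100 * r for c, r in relative.items()}

for c in frequency:
    print(f"{c:<15}{relative[c]:>6.2f}{percent[c]:>6.0f}")
```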
Slide 11
Using Excel to Construct Relative Frequency
and Percent Frequency Distributions

Excel Formula Worksheet
     C                D                         E                    F
1    Quality Rating   Frequency                 Relative Frequency   Percent Frequency
2    Poor             =COUNTIF($A$2:$A$21,C2)   =D2/$D$7             =E2*100
3    Below Average    =COUNTIF($A$2:$A$21,C3)   =D3/$D$7             =E3*100
4    Average          =COUNTIF($A$2:$A$21,C4)   =D4/$D$7             =E4*100
5    Above Average    =COUNTIF($A$2:$A$21,C5)   =D5/$D$7             =E5*100
6    Excellent        =COUNTIF($A$2:$A$21,C6)   =D6/$D$7             =E6*100
7    Total            =SUM(D2:D6)               =SUM(E2:E6)          =SUM(F2:F6)
8
Note: Columns A-B and rows 9-21 are not shown.
Slide 12
Using Excel to Construct Relative Frequency
and Percent Frequency Distributions

Excel Value Worksheet
     C                D           E                    F
1    Quality Rating   Frequency   Relative Frequency   Percent Frequency
2    Poor             2           0.10                 10
3    Below Average    3           0.15                 15
4    Average          5           0.25                 25
5    Above Average    9           0.45                 45
6    Excellent        1           0.05                 5
7    Total            20          1.00                 100
8
Note: Columns A-B and rows 9-21 are not shown.
Slide 13
Bar Chart (In Excel this is called a Column Chart)
 A bar chart is a graphical device for depicting
qualitative data.
 On one axis (usually the horizontal axis), we specify
the labels that are used for each of the classes.
 A frequency, relative frequency, or percent frequency
scale can be used for the other axis (usually the
vertical axis).
 Using a bar of fixed width drawn above each class
label, we extend the height appropriately.
 The bars are separated to emphasize the fact that each
class is a separate category.
Slide 14
Bar Chart (In Excel this is called a Column Chart)
[Figure: bar chart "Marada Inn Quality Ratings"; horizontal axis: Rating (Poor, Below Average, Average, Above Average, Excellent); vertical axis: Frequency, 1 to 10]
Slide 15
Using Excel’s Chart Tools
to Construct a Bar Chart
Step 1. Select cells C1:D6
Step 2. Click the Insert tab on the Ribbon
Step 3. In the Charts group, click Column
Step 4. When the list of column chart subtypes appears:
Go to the 2-D Column section
Click Clustered Column (the leftmost chart)
Step 5. In the Chart Layouts group, click the More button
(the downward pointing arrow with a line over it)
to display all the options
… continued
Slide 16
Using Excel’s Chart Tools
to Construct a Bar Chart
Step 6. Choose Layout 9
Step 7. Click the Chart Title and replace it with
Marada Inn Quality Ratings
Step 8. Click the Horizontal Axis (Category) Title and
replace it with Quality Rating
Step 9. Click the Vertical Axis (Value) Title and
replace it with Frequency
Step 10. Right click the Series 1 Legend Entry and choose
Delete from the list of options that appear
… continued
Slide 17
Using Excel’s Chart Tools
to Construct a Bar Chart
Step 11. Right click the vertical axis and choose
Format Axis from the options that appear
Step 12. When the Format Axis dialog box appears:
Go to the Axis Options section
Select Fixed for Major Unit and enter 2.0 in
the corresponding box
Click Close
Slide 18
Using Excel’s Chart Tools
to Construct a Bar Chart
[Figure: Excel column chart "Marada Inn Quality Ratings" placed over worksheet columns C-E, rows 9-21; horizontal axis: Quality Rating (Poor, Below Average, Average, Above Average, Excellent); vertical axis: Frequency, 0 to 10 in steps of 2]
Slide 19
Pie Chart
• The pie chart is a commonly used graphical device
for presenting relative frequency distributions for
qualitative data.
• First draw a circle; then use the relative frequencies
to subdivide the circle into sectors that correspond to
the relative frequency for each class.
• Since there are 360 degrees in a circle, a class with a
relative frequency of .25 would consume .25(360) = 90
degrees of the circle.
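The sector arithmetic can be sketched in Python as follows. This is an illustrative calculation only, using the Marada Inn relative frequencies from the earlier table; the variable names are assumptions.

```python
# Relative frequencies from the Marada Inn relative frequency distribution.
relative = {
    "Poor": 0.10,
    "Below Average": 0.15,
    "Average": 0.25,
    "Above Average": 0.45,
    "Excellent": 0.05,
}

# Each class takes a share of the 360 degrees equal to its relative frequency.
degrees = {c: r * 360 for c, r in relative.items()}

for c, d in degrees.items():
    print(f"{c:<15}{d:>6.1f} degrees")
```

The sectors sum to 360 degrees because the relative frequencies sum to 1.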
Slide 20
Pie Chart
[Figure: pie chart "Marada Inn Quality Ratings": Poor 10%, Below Average 15%, Average 25%, Above Average 45%, Excellent 5%]
Slide 21
Example: Marada Inn

Insights Gained from the Preceding Pie Chart
• One-half of the customers surveyed gave Marada
a quality rating of “above average” or “excellent”
(looking at the left side of the pie). This might
please the manager.
• For each customer who gave an “excellent” rating,
there were two customers who gave a “poor”
rating (looking at the top of the pie). This should
displease the manager.
Slide 22
Using Excel’s Chart Tools
to Construct a Pie Chart
Excel’s chart tools can be used to develop a pie chart for
the Marada quality rating data in much the same way we
developed the bar chart.
The major difference is that in step 3 we would choose
Pie in the Charts group.
Slide 23
Using Excel’s Chart Tools
to Construct a Pie Chart
[Figure: Excel pie chart "Marada Inn Quality Ratings" placed over worksheet columns C-E, rows 9-20: Poor 10%, Below Average 15%, Average 25%, Above Average 45%, Excellent 5%]
Slide 24
Excel’s PivotTable Report
and PivotChart Report
You have now seen how Excel’s COUNTIF function can
be used to develop a frequency distribution and Excel’s
Chart Tools can be used to create bar and pie charts.
But there is a more powerful set of Excel tools that can
be used for categorical data:
• PivotTable report
• PivotChart report
Slide 25
Summarizing Quantitative Data

Frequency Distribution
Relative Frequency and
Percent Frequency Distributions
Dot Plot
Histogram
Cumulative Distributions
Ogive

Stem-Leaf Display

Crosstabulation

Scatter Diagram





Slide 26
Frequency Distribution

Example: Hudson Auto Repair
The manager of Hudson Auto would like to gain a
better understanding of the cost of parts used in the
engine tune-ups performed in the shop. She examines
50 customer invoices for tune-ups. The costs of parts,
rounded to the nearest dollar, are listed on the next
slide.
Slide 27
Frequency Distribution

Example: Hudson Auto Repair
Sample of Parts Cost($) for 50 Tune-ups
91   71   104  85   62   78   69   74   97   82
93   72   62   88   98   57   89   68   68   101
75   66   97   83   79   52   75   105  68   105
99   79   77   71   79   80   75   65   69   69
97   72   80   67   62   62   76   109  74   73
Slide 28
Frequency Distribution

Guidelines for Selecting Number of Classes
• Use between 5 and 20 classes.
• Data sets with a larger number of elements
usually require a larger number of classes.
• Smaller data sets usually require fewer classes.
Slide 29
Frequency Distribution

Guidelines for Selecting Width of Classes
• Use classes of equal width.
• Approximate Class Width =
(Largest Data Value - Smallest Data Value) / Number of Classes
Slide 30
Frequency Distribution

Example: Hudson Auto Repair
If we choose six classes:
Approximate Class Width = (109 - 52)/6 = 9.5 ≈ 10
Parts Cost ($)   Frequency
50-59                2
60-69               13
70-79               16
80-89                7
90-99                7
100-109              5
Total               50
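The class-width guideline and the resulting frequency distribution can be checked with a short Python sketch. This is illustrative only and not part of the original slides. The starting value of 50 and the rounded width of 10 are taken from the classes shown above; the variable names are assumptions.

```python
# Parts costs ($) for the 50 Hudson Auto tune-ups.
costs = [
    91, 71, 104, 85, 62, 78, 69, 74, 97, 82,
    93, 72, 62, 88, 98, 57, 89, 68, 68, 101,
    75, 66, 97, 83, 79, 52, 75, 105, 68, 105,
    99, 79, 77, 71, 79, 80, 75, 65, 69, 69,
    97, 72, 80, 67, 62, 62, 76, 109, 74, 73,
]

num_classes = 6
approx_width = (max(costs) - min(costs)) / num_classes  # (109 - 52)/6

width = 10   # approximate width rounded up to a convenient value
start = 50   # lower limit of the first class, as in the table above

frequency = {}
for k in range(num_classes):
    lower = start + k * width
    upper = lower + width - 1
    # Count the costs that fall inside this class (limits inclusive).
    frequency[f"{lower}-{upper}"] = sum(lower <= c <= upper for c in costs)

print("Approximate class width:", approx_width)
for label, freq in frequency.items():
    print(f"{label:<10}{freq:>3}")
```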
Slide 31
Using Excel’s PivotTable Report
to Construct a Frequency Distribution
Step 1 Click the Insert tab on the Ribbon
Step 2 In the Tables group, click the icon above the
word PivotTable
Step 3 When the Create PivotTable dialog box appears:
Choose Select a table or range
Enter A1:A51 in the Table/Range box
Choose Existing Worksheet as the location
for the PivotTable
Enter C1 in the Location box
Click OK
… continued
Slide 32
Using Excel’s PivotTable Report
to Construct a Frequency Distribution
Step 4 In the PivotTable Field List, go to Choose Fields
to add to report:
Drag the Parts Cost field to the Row Labels area
Drag the Parts Cost field to the Values area
Step 5 Click on Sum of Parts Cost in the Values area
Step 6 Click Value Field Settings from the list of options
that appear
Step 7 When the Value Field Settings dialog box appears:
Under Summarize value field by, choose Count
Click OK
Slide 33
Using Excel’s PivotTable Report
to Construct a Frequency Distribution
To construct the frequency distribution, we must group
the rows containing parts costs.
Step 1 Right click any cell in the PivotTable report
containing a parts cost.
Step 2 Choose Group from the list of options that appears
Step 3 When the Grouping dialog box appears:
Enter 50 in the Starting at box
Enter 109 in the Ending at box
Enter 10 in the By box
Click OK
Slide 34
Using Excel’s PivotTable Report
to Construct a Frequency Distribution

Excel Value Worksheet
     A            B    C             D
1    Parts Cost        Parts Cost    Count of Parts Cost
2    91                50-59         2
3    71                60-69         13
4    104               70-79         16
5    85                80-89         7
6    62                90-99         7
7    78                100-109       5
8    69                Grand Total   50
Note: Rows 9-51 are not shown.
Slide 35
Relative Frequency and
Percent Frequency Distributions

Example: Hudson Auto Repair
Parts Cost ($)   Relative Frequency   Percent Frequency
50-59                  .04                   4
60-69                  .26                  26
70-79                  .32                  32
80-89                  .14                  14
90-99                  .14                  14
100-109                .10                  10
Total                 1.00                 100
Example calculations for the 50-59 class: 2/50 = .04; .04(100) = 4.
Slide 36
Relative Frequency and
Percent Frequency Distributions

Example: Hudson Auto Repair
Insights Gained from the % Frequency Distribution:
• Only 4% of the parts costs are in the $50-59 class.
• 30% of the parts costs are under $70.
• The greatest percentage (32% or almost one-third)
of the parts costs are in the $70-79 class.
• 10% of the parts costs are $100 or more.
Slide 37
Dot Plot



One of the simplest graphical summaries of data is a
dot plot.
A horizontal axis shows the range of data values.
Then each data value is represented by a dot placed
above the axis.
Slide 38
Dot Plot

Example: Hudson Auto Repair
[Figure: dot plot "Tune-up Parts Cost"; horizontal axis: Cost ($), 50 to 110]
Slide 39
Histogram
 Another common graphical presentation of
quantitative data is a histogram.
 The variable of interest is placed on the horizontal
axis.
 A rectangle is drawn above each class interval with
its height corresponding to the interval’s frequency,
relative frequency, or percent frequency.
 Unlike a bar graph, a histogram has no natural
separation between rectangles of adjacent classes.
Slide 40
Histogram
Example: Hudson Auto Repair
[Figure: histogram "Tune-up Parts Cost"; horizontal axis: Parts Cost ($), classes 50-59, 60-69, 70-79, 80-89, 90-99, 100-109; vertical axis: Frequency, 2 to 18]
Slide 41
Using Excel’s Chart Tools
to Construct a Histogram
Step 1. Select cells C2:D7
Step 2. Click the Insert tab on the Ribbon
Step 3. In the Charts group, click Column
Step 4. When the list of column chart subtypes appears:
Go to the 2-D Column section
Click Clustered Column (the leftmost chart)
Step 5. In the Chart Layouts group, click the More
button (the downward pointing arrow with
a line over it) to display all the options
… continued
Slide 42
Using Excel’s Chart Tools
to Construct a Histogram
Step 6. Choose Layout 8
Step 7. Select the Chart Title and replace it with
Tune-up Parts Cost
Step 8. Select the Horizontal (Category) Axis Title and
replace it with Parts Cost ($)
Step 9. Select the Vertical (Value) Axis Title and replace
it with Frequency
Slide 43
Using Excel’s Chart Tools
to Construct a Histogram
[Figure: Excel column chart "Tune-up Parts Cost" placed over worksheet columns C-E, rows 10-24; horizontal axis: Parts Cost ($), classes 50-59, 60-69, 70-79, 80-89, 90-99, 100-109; vertical axis: Frequency, 0 to 20 in steps of 5]
Slide 44
Histogram
Symmetric
• Left tail is the mirror image of the right tail
• Examples: heights and weights of people
[Figure: symmetric histogram; vertical axis: Relative Frequency, 0 to .35]
Slide 45
Histogram
Moderately Skewed Left
• A longer tail to the left
• Example: exam scores
[Figure: moderately left-skewed histogram; vertical axis: Relative Frequency, 0 to .35]
Slide 46
Histogram
Moderately Skewed Right
• A longer tail to the right
• Example: housing values
[Figure: moderately right-skewed histogram; vertical axis: Relative Frequency, 0 to .35]
Slide 47
Histogram
Highly Skewed Right
• A very long tail to the right
• Example: executive salaries
[Figure: highly right-skewed histogram; vertical axis: Relative Frequency, 0 to .35]
Slide 48
Cumulative Distributions
Cumulative frequency distribution – shows the
number of items with values less than or equal to the
upper limit of each class.
Cumulative relative frequency distribution – shows
the proportion of items with values less than or
equal to the upper limit of each class.
Cumulative percent frequency distribution – shows
the percentage of items with values less than or
equal to the upper limit of each class.
Slide 49
Cumulative Distributions

Hudson Auto Repair
            Cumulative   Cumulative Relative   Cumulative Percent
Cost ($)    Frequency    Frequency             Frequency
≤ 59             2            .04                    4
≤ 69            15            .30                   30
≤ 79            31            .62                   62
≤ 89            38            .76                   76
≤ 99            45            .90                   90
≤ 109           50           1.00                  100
Example calculations for the ≤ 69 row: 2 + 13 = 15; 15/50 = .30; .30(100) = 30.
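A running total is all that is needed to build these three columns. The sketch below is illustrative Python, not part of the original slides. It applies itertools.accumulate to the Hudson Auto class frequencies; the list names are assumptions.

```python
from itertools import accumulate

# Upper class limits and class frequencies from the Hudson Auto distribution.
upper_limits = [59, 69, 79, 89, 99, 109]
frequency = [2, 13, 16, 7, 7, 5]

# Running total of the frequencies gives the cumulative frequency.
cum_freq = list(accumulate(frequency))
n = cum_freq[-1]  # total number of observations

cum_rel = [cf / n for cf in cum_freq]   # cumulative relative frequency
cum_pct = [100 * r for r in cum_rel]    # cumulative percent frequency

for limit, cf, cr, cp in zip(upper_limits, cum_freq, cum_rel, cum_pct):
    print(f"<= {limit:<4}{cf:>4}{cr:>7.2f}{cp:>6.0f}")
```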
Slide 50
Ogive

An ogive is a graph of a cumulative distribution.

The data values are shown on the horizontal axis.

Shown on the vertical axis are the:
• cumulative frequencies, or
• cumulative relative frequencies, or
• cumulative percent frequencies

The frequency (one of the above) of each class is
plotted as a point.

The plotted points are connected by straight lines.
Slide 51
Ogive

Hudson Auto Repair
• Because the class limits for the parts-cost data are
50-59, 60-69, and so on, there appear to be one-unit
gaps from 59 to 60, 69 to 70, and so on.
• These gaps are eliminated by plotting points
halfway between the class limits.
• Thus, 59.5 is used for the 50-59 class, 69.5 is used
for the 60-69 class, and so on.
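The halfway points described above can be generated directly from the upper class limits. The following is a minimal sketch in Python, not from the original slides. It pairs each halfway point with the cumulative percent frequency from the Hudson Auto cumulative distribution; the names are illustrative.

```python
# Upper class limits and cumulative percent frequencies for Hudson Auto.
upper_limits = [59, 69, 79, 89, 99, 109]
cum_pct = [4, 30, 62, 76, 90, 100]

# Plot each point halfway between one class's upper limit and the next
# class's lower limit, i.e. at the upper limit plus 0.5.
ogive_points = [(limit + 0.5, pct) for limit, pct in zip(upper_limits, cum_pct)]

for x, y in ogive_points:
    print(f"({x}, {y})")
```

Connecting these points with straight lines gives the ogive.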
Slide 52
Ogive with
Cumulative Percent Frequencies
Example: Hudson Auto Repair
[Figure: ogive "Tune-up Parts Cost"; horizontal axis: Parts Cost ($), 50 to 110; vertical axis: Cumulative Percent Frequency, 20 to 100; labeled point (89.5, 76)]
Slide 53
Using Excel’s PivotChart Report
You have now seen how Excel’s PivotTable report can be
used to construct a frequency distribution for quantitative
data and how Excel’s Chart tools can be used to construct
the corresponding histogram.
However, Excel’s PivotChart report can be used to
develop a frequency distribution and a graphical display
at the same time.
Slide 54
Using Excel’s PivotChart Report
Step 1. Click the Insert tab on the Ribbon
Step 2. In the Tables group, click the word PivotTable
Step 3. Choose PivotChart from the options that appear
Step 4. When the Create PivotTable with PivotChart
dialog box appears:
Choose Select a table or range
Enter A1:A51 in the Table/Range box
Choose Existing Worksheet as the location for
the PivotTable and PivotChart
Enter C1 in the Location box
Click OK
… continued
Slide 55
Using Excel’s PivotChart Report
Step 5. In the PivotTable Field List, go to Choose Fields
to add to report
Drag the Parts Cost field to the Axis Fields
(Categories) area
Drag the Parts Cost field to the Values area
Step 6. Click Sum of Parts Cost in the Values area
Step 7. Click Value Field Settings from the list of options
that appear
Step 8. When the Value Field Settings dialog appears:
Under Summarize value field by, choose Count
Click OK
… continued
Slide 56
Using Excel’s PivotChart Report
Step 9. Right click cell C2 in the PivotTable report or any
other cell containing a parts cost
Step 10. Choose Group from the list of options
Step 11. When the Grouping dialog box appears:
Enter ___ in the Starting at box
Enter ___ in the Ending at box
Click OK
Step 12. Click inside the resulting PivotChart
Step 13. Click the Design tab on the Ribbon
… continued
Slide 57
Using Excel’s PivotChart Report
Step 14. In the Chart Layouts group, click the More
button (the downward pointing arrow with a
line over it) to display all the options
Step 15. Choose Layout 8
Step 16. Select the Chart Title and replace it with
Tune-up Parts Costs
Step 17. Select the Horizontal Axis (Category) Title and
replace it with Parts Cost ($)
Step 18. Select the Vertical (Value) Axis Title and replace
it with Frequency
Slide 58
End of Chapter 2, Part A
Slide 59
Chapter 2, Part B
Descriptive Statistics:
Tabular and Graphical Presentations

Exploratory Data Analysis: Stem-and-Leaf Display

Crosstabulations and Scatter Diagrams
Slide 60
Exploratory Data Analysis
 The techniques of exploratory data analysis consist of
simple arithmetic and easy-to-draw pictures that can
be used to summarize data quickly.
 One such technique is the stem-and-leaf display.
Slide 61
Stem-and-Leaf Display
 A stem-and-leaf display shows both the rank order
and shape of the distribution of the data.
 It is similar to a histogram on its side, but it has the
advantage of showing the actual data values.
 The first digits of each data item are arranged to the
left of a vertical line.
 To the right of the vertical line we record the last
digit for each item in rank order.
 Each line in the display is referred to as a stem.
 Each digit on a stem is a leaf.
Slide 62
Example: Hudson Auto Repair
The manager of Hudson Auto would like to gain a
better understanding of the cost of parts used in the
engine tune-ups performed in the shop. She examines
50 customer invoices for tune-ups. The costs of parts,
rounded to the nearest dollar, are listed on the next
slide.
Slide 63
Stem-and-Leaf Display
 Example: Hudson Auto Repair
Sample of Parts Cost ($) for 50 Tune-ups
91   71   104  85   62   78   69   74   97   82
93   72   62   88   98   57   89   68   68   101
75   66   97   83   79   52   75   105  68   105
99   79   77   71   79   80   75   65   69   69
97   72   80   67   62   62   76   109  74   73
Slide 64
Stem-and-Leaf Display
 Example: Hudson Auto Repair
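 5 | 2 7
 6 | 2 2 2 2 5 6 7 8 8 8 9 9 9
 7 | 1 1 2 2 3 4 4 5 5 5 6 7 8 9 9 9
 8 | 0 0 2 3 5 8 9
 9 | 1 3 7 7 7 8 9
10 | 1 4 5 5 9
(Callouts: each line of the display is a stem; each digit to the right of the vertical line is a leaf.)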
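The display above can be built mechanically by splitting each value into its leading digits and its last digit. Below is a hedged Python sketch, not part of the original slides, for the Hudson Auto data with a leaf unit of 1; the variable names are illustrative.

```python
from collections import defaultdict

# Parts costs ($) for the 50 Hudson Auto tune-ups.
costs = [
    91, 71, 104, 85, 62, 78, 69, 74, 97, 82,
    93, 72, 62, 88, 98, 57, 89, 68, 68, 101,
    75, 66, 97, 83, 79, 52, 75, 105, 68, 105,
    99, 79, 77, 71, 79, 80, 75, 65, 69, 69,
    97, 72, 80, 67, 62, 62, 76, 109, 74, 73,
]

stems = defaultdict(list)
for cost in sorted(costs):          # sorting puts the leaves in rank order
    stem, leaf = divmod(cost, 10)   # leading digit(s) and last digit
    stems[stem].append(leaf)

for stem in sorted(stems):
    leaves = " ".join(str(leaf) for leaf in stems[stem])
    print(f"{stem:>2} | {leaves}")
```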
Slide 65
Stretched Stem-and-Leaf Display
• If we believe the original stem-and-leaf display has
condensed the data too much, we can stretch the
display by using two stems for each leading digit(s).
• Whenever a stem value is stated twice, the first value
corresponds to leaf values of 0-4, and the second
value corresponds to leaf values of 5-9.
Slide 66
Stretched Stem-and-Leaf Display
 Example: Hudson Auto Repair
 5 | 2
 5 | 7
 6 | 2 2 2 2
 6 | 5 6 7 8 8 8 9 9 9
 7 | 1 1 2 2 3 4 4
 7 | 5 5 5 6 7 8 9 9 9
 8 | 0 0 2 3
 8 | 5 8 9
 9 | 1 3
 9 | 7 7 7 8 9
10 | 1 4
10 | 5 5 9
Slide 67
Stem-and-Leaf Display
 Leaf Units
• A single digit is used to define each leaf.
• In the preceding example, the leaf unit was 1.
• Leaf units may be 100, 10, 1, 0.1, and so on.
• Where the leaf unit is not shown, it is assumed
to equal 1.
Slide 68
Example: Leaf Unit = 0.1
If we have data with values such as
8.6   11.7   9.4   9.1   10.2   11.0   8.8
a stem-and-leaf display of these data will be
Leaf Unit = 0.1
 8 | 6 8
 9 | 1 4
10 | 2
11 | 0 7
Slide 69
Example: Leaf Unit = 10
If we have data with values such as
1806   1717   1974   1791   1682   1910   1838
a stem-and-leaf display of these data will be
Leaf Unit = 10
16 | 8
17 | 1 9
18 | 0 3
19 | 1 7
The 82 in 1682 is rounded down to 80 and is represented as an 8.
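The rounding-down rule for a leaf unit of 10 can be sketched in Python as follows. This is an illustrative helper, not part of the original slides. The function name stem_and_leaf is hypothetical, and the sketch assumes non-negative integer data and an integer leaf unit.

```python
def stem_and_leaf(values, leaf_unit):
    """Return {stem: [leaves]} for non-negative integers and an integer leaf unit.

    Digits smaller than the leaf unit are dropped (rounded down),
    as with the 82 in 1682 becoming a leaf of 8.
    """
    rows = {}
    for value in sorted(values):
        # Drop digits below the leaf unit, then split off the last digit.
        stem, leaf = divmod(value // leaf_unit, 10)
        rows.setdefault(stem, []).append(leaf)
    return rows


data = [1806, 1717, 1974, 1791, 1682, 1910, 1838]
display = stem_and_leaf(data, leaf_unit=10)

print("Leaf Unit = 10")
for stem, leaves in display.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```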
Slide 70
Crosstabulations and Scatter Diagrams
 Thus far we have focused on methods that are used
to summarize the data for one variable at a time.
 Often a manager is interested in tabular and
graphical methods that will help understand the
relationship between two variables.
 Crosstabulation and a scatter diagram are two
methods for summarizing the data for two variables
simultaneously.
Slide 71
Crosstabulation
• A crosstabulation is a tabular summary of data for
two variables and helps to reveal the relationship
between the two variables.
• Crosstabulation can be used when:
  • one variable is categorical and the other is
    quantitative,
  • both variables are categorical, or
  • both variables are quantitative.
• The left and top margin labels define the classes for
the two variables.
Slide 72
Crosstabulation
 Example: Finger Lakes Homes
The number of Finger Lakes homes sold for each
style and price for the past two years is shown below.
                          Home Style
Price Range    Colonial   Log   Split   A-Frame   Total
≤ $99,000         18        6     19       12       55
> $99,000         12       14     16        3       45
Total             30       20     35       15      100
Price range is the quantitative variable; home style is the categorical variable.
Slide 73
Crosstabulation
 Example: Finger Lakes Homes
Insights Gained from Preceding Crosstabulation
•
The greatest number of homes (19) in the sample
are a split-level style and priced at less than or
equal to $99,000.
• Only three homes in the sample are an A-Frame
style and priced at more than $99,000.
Slide 74
Crosstabulation
 Example: Finger Lakes Homes
                          Home Style
Price Range    Colonial   Log   Split   A-Frame   Total
≤ $99,000         18        6     19       12       55
> $99,000         12       14     16        3       45
Total             30       20     35       15      100
The Total column is the frequency distribution for the price range variable.
The Total row is the frequency distribution for the home style variable.
Slide 75
Using Excel’s PivotTable Report
to Create a Crosstabulation
 Excel Worksheet (showing partial data)
     A      B           C             D    E
1    Home   Price ($)   Style
2    1      >99K        Colonial
3    2      <=99K       Log
4    3      >99K        Log
5    4      <=99K       A-Frame
6    5      <=99K       Colonial
7    6      <=99K       Split-Level
8    7      >99K        A-Frame
9    8      >99K        Colonial
Note: Rows 10-101 are not shown.
Slide 76
Using Excel’s PivotTable Report
to Create a Crosstabulation
 Displaying the Initial PivotTable Field List and
PivotTable Report
Step 1 Click the Insert tab on the Ribbon
Step 2 In the Tables group, click the icon above the
word PivotTable
Step 3 When the Create PivotTable dialog box appears:
Choose Select a Table or Range
Enter A1:C101 in the Table/Range box
Choose New Worksheet as the location for
the PivotTable Report
Click OK
Slide 77
Using Excel’s PivotTable Report
to Create a Crosstabulation
 Setting Up the PivotTable Field List
Step 1 In the PivotTable Field List, go to Choose Fields
to add to report
Drag the Price ($) field to Row Labels area
Drag the Style field to Column Labels area
Drag the Home field to the Values area
Step 2 Click on Sum of Home in the Values area
Step 3 Click Value Field Settings from the list of options
Step 4 When the Value Field Settings dialog box appears:
Under Summarize value field by, choose Count
Click OK
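The counting that the PivotTable performs can be sketched in Python. This is an illustration only, not part of the original slides. It uses just the eight worksheet rows shown earlier (rows 10-101 are not shown), so the counts here cover those eight homes and will not match the full 100-home table.

```python
from collections import Counter

# (price range, style) for the eight homes visible in the worksheet.
records = [
    (">99K", "Colonial"),
    ("<=99K", "Log"),
    (">99K", "Log"),
    ("<=99K", "A-Frame"),
    ("<=99K", "Colonial"),
    ("<=99K", "Split-Level"),
    (">99K", "A-Frame"),
    (">99K", "Colonial"),
]

# Each crosstabulation cell is the number of records with that pair of values.
cell_counts = Counter(records)

prices = ["<=99K", ">99K"]
styles = ["Colonial", "Log", "Split-Level", "A-Frame"]

print(f"{'Price ($)':<10}" + "".join(f"{s:>12}" for s in styles))
for p in prices:
    print(f"{p:<10}" + "".join(f"{cell_counts[(p, s)]:>12}" for s in styles))
```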
Slide 78
Using Excel’s PivotTable Report
to Create a Crosstabulation
 Value Worksheet
     A               B          C     D             E         F             G
1    Count of Home   Style
2    Price ($)       Colonial   Log   Split-Level   A-Frame   Grand Total
3    <=99K           18         6     19            12        55
4    >99K            12         14    16            3         45
5    Grand Total     30         20    35            15        100
6
Slide 79
Crosstabulation: Row or Column Percentages
 Converting the entries in the table into row
percentages or column percentages can provide
additional insight about the relationship between
the two variables.
Slide 80
Crosstabulation: Row Percentages
 Example: Finger Lakes Homes
                          Home Style
Price Range    Colonial    Log     Split   A-Frame   Total
≤ $99,000       32.73     10.91    34.55    21.82     100
> $99,000       26.67     31.11    35.56     6.67     100
Note: row totals are actually 100.01 due to rounding.
(Colonial and > $99K)/(All > $99K) x 100 = (12/45) x 100 = 26.67
Slide 81
Crosstabulation: Column Percentages
 Example: Finger Lakes Homes
                          Home Style
Price Range    Colonial    Log     Split   A-Frame
≤ $99,000       60.00     30.00    54.29    80.00
> $99,000       40.00     70.00    45.71    20.00
Total             100       100      100      100
(Colonial and > $99K)/(All Colonial) x 100 = (12/30) x 100 = 40.00
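Both percentage tables follow from the Finger Lakes count table by dividing each cell by its row total or its column total. Here is a minimal Python sketch, not part of the original slides; the variable names are illustrative.

```python
# Counts from the Finger Lakes Homes crosstabulation.
styles = ["Colonial", "Log", "Split", "A-Frame"]
counts = {
    "<=99K": [18, 6, 19, 12],
    ">99K": [12, 14, 16, 3],
}

# Row percentages: each cell divided by its row (price range) total.
row_pct = {
    price: [round(100 * c / sum(row), 2) for c in row]
    for price, row in counts.items()
}

# Column percentages: each cell divided by its column (home style) total.
col_totals = [sum(col) for col in zip(*counts.values())]
col_pct = {
    price: [round(100 * c / t, 2) for c, t in zip(row, col_totals)]
    for price, row in counts.items()
}

print("Row percentages:", row_pct)
print("Column percentages:", col_pct)
```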
Slide 82
Crosstabulation: Simpson’s Paradox
• Data in two or more crosstabulations are often
aggregated to produce a summary crosstabulation.
• We must be careful in drawing conclusions about the
relationship between the two variables in the
aggregated crosstabulation.
• Simpson’s Paradox: In some cases the conclusions
based upon an aggregated crosstabulation can be
completely reversed if we look at the unaggregated
data. Before drawing conclusions about
relationships between two variables (for aggregated
data), you must investigate whether any hidden
variables could affect the results.
Slide 83
Scatter Diagram and Trendline
 A scatter diagram is a graphical presentation of the
relationship between two quantitative variables.
 One variable is shown on the horizontal axis and the
other variable is shown on the vertical axis.
 The general pattern of the plotted points suggests the
overall relationship between the variables.
 A trendline is an approximation of the relationship.
Slide 84
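Scatter Diagram
• A Positive Relationship
[Figure: scatter diagram with x on the horizontal axis and y on the vertical axis]
Slide 85
Scatter Diagram
• A Negative Relationship
[Figure: scatter diagram with x on the horizontal axis and y on the vertical axis]
Slide 86
Scatter Diagram
• No Apparent Relationship
[Figure: scatter diagram with x on the horizontal axis and y on the vertical axis]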
Slide 87
Scatter Diagram
 Example: Panthers Football Team
The Panthers football team is interested in
investigating the relationship, if any, between
interceptions made and points scored.
x = Number of Interceptions   y = Number of Points Scored
            1                             14
            3                             24
            2                             18
            1                             17
            3                             30
Slide 88
Scatter Diagram
[Figure: scatter diagram; horizontal axis (x): Number of Interceptions, 0 to 4; vertical axis (y): Number of Points Scored, 0 to 35]
Slide 89
Example: Panthers Football Team
 Insights Gained from the Preceding Scatter Diagram
• The scatter diagram indicates a positive relationship
between the number of interceptions and the
number of points scored.
• Higher points scored are associated with a higher
number of interceptions.
• The relationship is not perfect; not all plotted points in
the scatter diagram lie on a straight line.
Slide 90
Using Excel’s Chart Tools to Construct
a Scatter Diagram and Trendline
 Excel Worksheet (showing data)
     A                         B                         C
1    Number of Interceptions   Number of Points Scored
2    1                         14
3    3                         24
4    2                         18
5    1                         17
6    3                         30
7
Slide 91
Using Excel’s Chart Tools to
Construct a Scatter Diagram and Trendline
Step 1 Select cells A2:B6
Step 2 Click the Insert tab on the Excel Ribbon
Step 3 In the Charts group, click Scatter
Step 4 When the list of scatter diagram subtypes appears:
Click Scatter with only Markers
Step 5 In the Chart Layouts group, click Layout 1
Step 6 Select the Chart Title and replace it with Scatter
Diagram for the Panthers
Step 7 Select the Horizontal Axis (Value) Title and
replace it with Number of Interceptions
… continued
Slide 92
Using Excel’s Chart Tools to
Construct a Scatter Diagram and Trendline
Step 8 Select the Vertical (Value) Axis Title and replace
it with Number of Points Scored
Step 9 Right click Series 1 Legend Entry and click Delete
- - - - - - - - - - - - - - - - To Add a Trendline - - - - - - - - - - - - - - -
Step 10 Position the pointer over any data point in the
scatter diagram and right-click to display options
Step 11 Choose Add Trendline
Step 12 When the Format Trendline dialog box appears:
Select Trendline Options
Choose Linear from Trend/Regression Type list
Click Close
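A linear trendline is a least-squares line fitted to the plotted points. As a hedged illustration (not part of the original slides, and with illustrative variable names), the slope and intercept for the Panthers data can be computed by hand in Python:

```python
# Panthers data: x = interceptions, y = points scored.
x = [1, 3, 2, 1, 3]
y = [14, 24, 18, 17, 30]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares slope = sum of cross-deviations / sum of squared x-deviations.
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)

slope = sxy / sxx
intercept = mean_y - slope * mean_x

print(f"trendline: y = {intercept:.2f} + {slope:.2f}x")
```

The positive slope agrees with the positive relationship seen in the scatter diagram.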
Slide 93
Using Excel’s Chart Tools to
Construct a Scatter Diagram and Trendline
[Figure: Excel scatter diagram "Scatter Diagram for the Panthers" placed over worksheet columns A-C, rows 8-20; horizontal axis: Number of Interceptions, 0 to 4; vertical axis: Number of Points Scored, 0 to 35]
Slide 94
Tabular and Graphical Methods
Data
  Categorical Data
    Tabular Methods
      • Frequency Distribution
      • Rel. Freq. Dist.
      • Percent Freq. Distribution
      • Crosstabulation
    Graphical Methods
      • Bar Graph
      • Pie Chart
  Quantitative Data
    Tabular Methods
      • Frequency Distribution
      • Rel. Freq. Dist.
      • % Freq. Dist.
      • Cum. Freq. Dist.
      • Cum. Rel. Freq. Distribution
      • Cum. % Freq. Distribution
      • Crosstabulation
    Graphical Methods
      • Dot Plot
      • Histogram
      • Ogive
      • Stem-and-Leaf Display
      • Scatter Diagram
Slide 95
End of Chapter 2, Part B
Slide 96