Now

advertisement
CHAPTER 2
Descriptive Statistics:
Tabular and
Graphical Methods
19
20
CHAPTER 2
KEY CONCEPTSs
CONCEPT
EXAMPLES
EXERCISES
Summarizing Qualitative Data
Frequency Distribution
1
1
Relative Frequency Distribution
2
1
Percent Frequency Distribution
2
1
Bar Graph
3
2
Pie Chart
4
2
Frequency Distribution
5
3
Relative Frequency Distribution
5
3
Percent Frequency Distribution
5
3
Histogram
6
4
Cumulative Distributions
7
5
Ogive
8
5
9
6
Crosstabulation
10
7
Scatter Diagram
11
8
Summarizing Quantitative Data
Exploratory Data Analysis
Stem-and-Leaf Display
Relationship Between Two Variables
Excel Used
REVIEWs
Summarizing Qualitative Data
Frequency Distribution


A frequency distribution is a tabular summary of data showing the frequency (or
number) of items in each of several nonoverlapping classes.
The objective is to provide insights about the data that cannot be quickly obtained by
looking only at the original data.
Excel Function for Qualitative Frequency Distributions
TABULAR AND GRAPHICAL METHODS




21
Excel’s COUNTIF function can be used to construct a frequency distribution for
qualitative data.
The function has two arguments:
 First – the range of cell addresses containing the observations (qualitative data)
to be counted by class.
 Second – the label, or address of the cell containing the label, of the class to be
counted.
Note that each time the function is employed, it counts the frequency of only one
class – the class entered as the second argument in the function.
If, for example, you have four classes of data, you would employ the function four
times.
Relative Frequency Distribution


The relative frequency of a class is the fraction or proportion of the total number of data
items belonging to the class.
A relative frequency distribution is a tabular summary of a set of data showing the
relative frequency for each class.
Percent Frequency Distribution
 The percent frequency of a class is the relative frequency multiplied by 100.
 A percent frequency distribution is a tabular summary of a set of data showing the
percent frequency for each class.
Bar Graph





A bar graph is a graphical device for depicting qualitative data that have been
summarized in a frequency, relative frequency, or percent frequency distribution.
On the horizontal axis we specify the labels that are used for each of the classes.
A frequency, relative frequency, or percent frequency scale can be used for the vertical
axis.
Using a bar of fixed width drawn above each class label, we extend the height
appropriately.
The bars are separated to emphasize the fact that each class is a separate category.
Pie Chart



The pie chart is a commonly used graphical device for presenting relative frequency
distributions for qualitative data.
First draw a circle; then use the relative frequencies to subdivide the circle into sectors
that correspond to the relative frequency for each class.
Since there are 360 degrees in a circle, a class with a relative frequency of .25 would
consume .25(360) = 90 degrees of the circle.
Excel’s Tool for Bar Graphs and Pie Charts


Excel’s Chart Wizard can be used to construct bar graphs and pie charts.
After frequencies, relative frequencies, or percent frequencies have been computed,
these values can be graphically displayed using Chart Wizard.
Summarizing Quantitative Data
22
CHAPTER 2
Frequency Distribution




With quantitative data we have to be careful in defining the nonoverlapping classes to
be used in the frequency distribution.
The three steps necessary to define the classes are:
 Determine the number of nonoverlapping classes.
 Determine the width of each class.
 Determine the class limits.
The guidelines for selecting number of classes are:
 Use between 5 and 20 classes.
 Data sets with a larger number of elements usually require a larger number of
classes.
 Smaller data sets usually require fewer classes.
The guidelines for selecting the width of classes are:
 Use classes of equal width.
 Approximate Class Width =
Largest Data Value - Smallest Data Value
Number of Classes
Excel Function for Quantitative Frequency Distributions






Excel’s FREQUENCY function can be used to construct a frequency distribution for
quantitative data.
Unlike simple Excel functions (COUNTIF for example), the FREQUENCY function
can provide multiple values such as class frequencies.
A formula containing a function that can return multiple values is called an array
formula and must be entered in a special way.
The function has two arguments:
 First – the range of cell addresses containing the observations (quantitative data)
to be counted by class.
 Second – a list of the upper class limits (this tells Excel which frequency to put in
each cell within the range of the array formula)
After selecting the cells in which you want the frequencies to appear and typing in
the array formula containing the FREQUENCY function, press
CTRL+SHIFT+ENTER. The formula will be entered into each of the cells selected.
Regardless of the number of classes, you enter the array formula only once and no
copying/pasting is necessary.
Histogram




A common graphical presentation of quantitative data is a histogram.
The variable of interest is placed on the horizontal axis and the frequency, relative
frequency, or percent frequency is placed on the vertical axis.
A rectangle is drawn above each class interval with its height corresponding to the
interval’s frequency, relative frequency, or percent frequency.
Unlike a bar graph, a histogram has no natural separation between rectangles of
adjacent classes.
Excel’s Tool for Histograms

Excel’s Chart Wizard can be used to construct histograms.
TABULAR AND GRAPHICAL METHODS

23
After frequencies, relative frequencies, or percent frequencies have been computed,
these values can be displayed graphically using Chart Wizard.
Cumulative Distributions



The cumulative frequency distribution shows the number of items with values less than
or equal to the upper limit of each class.
The cumulative relative frequency distribution shows the proportion of items with values
less than or equal to the upper limit of each class.
The cumulative percent frequency distribution shows the percentage of items with
values less than or equal to the upper limit of each class.
Ogive





An ogive is a graph of a cumulative distribution.
The data values are shown on the horizontal axis.
Shown on the vertical axis are one of the following: cumulative frequencies, cumulative
relative frequencies, or cumulative percent frequencies.
The frequency (one of the above) of each class is plotted as a point.
Straight lines connect the plotted points.
Excel’s Tool for Ogives


Excel’s Chart Wizard can be used to construct ogives.
After cumulative frequencies have been computed, these values can be displayed
graphically as a form of scatter diagram using Chart Wizard.
Exploratory Data Analysis


The techniques of exploratory data analysis consist of simple arithmetic and easy-todraw pictures that can be used to summarize data quickly.
One such technique, for quantitative data, is the stem-and-leaf display.
Stem-and-Leaf Display






A stem-and-leaf display shows both the rank order and shape of the distribution of
the data.
It is similar to a histogram on its side, but it has the advantage of showing the actual
data values.
The first digit(s) of each data item are arranged to the left of a vertical line.
To the right of the vertical line we record the last digit for each item in rank order.
Each line in the display is referred to as a stem.
Each digit on a stem is a leaf.
Relationship Between Two Variables


Often a manager is interested in tabular and graphical methods that will help understand the
relationship between two variables.
Crosstabulation and a scatter diagram are two methods for summarizing the data for two
variables simultaneously.
Crosstabulation


A crosstabulation is a tabular summary of data for two variables.
The two variables might both be qualitative, both be quantitative, or be one of each.
24
CHAPTER 2


The classes for one variable are represented by the rows; the columns represent the
classes for the other variable.
Converting the entries in the table into row percentages or column percentages can
provide additional insight about the relationship between the variables.
Excel’s Tool for Crosstabulation


Excel’s PivotTable Report is a general tool for summarizing the data for two or more
variables simultaneously.
This tool can be used to construct a crosstabulation (an Excel PivotTable).
Scatter Diagram



A scatter diagram is a graphical presentation of the relationship between two
quantitative variables.
One variable is shown on the horizontal axis and the other variable is shown on the
vertical axis.
A positive relationship, negative relationship, or no relationship might be apparent to a
manager with the aid of a scatter diagram.
Excel’s Tool for Scatter Diagrams


Excel’s Chart Wizard can be used to construct scatter diagrams.
Once a scatter diagram has been developed, other Excel tools (covered in later
chapters) can be used, for example to fit a trend line to the plotted data.
EXAMPLESs
EXAMPLE 1
FREQUENCY DISTRIBUTION – QUALITATIVE DATA
Guests staying at Marada Inn were asked to rate the quality of their accommodations as
being excellent, above average, average, below average, or poor. The ratings provided by
a sample of 20 quests are shown below.
Below Average
Above Average
Below Average
Above Average
Average
Above Average
Average
Average
Above Average
Above Average
Poor
Above Average
Above Average
Below Average
Poor
Average
Provide a frequency distribution, using Excel, showing the number of occurrences of each
rating level in the sample.
SOLUTION 1
USING EXCEL’S COUNTIF FUNCTION FOR FREQUENCY DISTRIBUTIONS
TABULAR AND GRAPHICAL METHODS
25
Excel can be used to count the frequencies and construct a frequency distribution for
the quality ratings data.
Enter Data: The label “Rating Given” and the data for the 20-guest sample are entered into
cells A1:A21.
(Note: Misspelled data will not be counted as it should. Misspelling
includes typing any spaces before or after the phrases!)
Enter Functions and Formulas: Excel’s COUNTIF function can be used to count the
number of times each rating level appears in cells A2:A21. We first enter a label and the
rating levels in cells C1:C6, the label “Total” in cell C7, and the label “Frequency” in cell D1.
Then, we enter the following function into cell D2:
=COUNTIF($A$2:$A$21,C2)
To count the number of times the other rating levels appear in our data, we copy the above
formula into cells D3:D6. Finally, we total the frequency counts using the SUM function in
cell D7.
(Note: It is a good idea to total your frequency count as a partial check of the
accuracy of your data and function entries. We know from our sample size that
the total frequency should be 20.)
Formula Worksheet
1
2
3
4
5
6
7
8
20
21
A
Rating Given
Above Average
Below Average
Above Average
Average
Average
Above Average
Above Average
Excellent
Poor
B
C
Quality Rating
Poor
Below Average
Average
Above Average
Excellent
Total
D
Frequency
=COUNTIF($A$2:$A$21,C2)
=COUNTIF($A$2:$A$21,C3)
=COUNTIF($A$2:$A$21,C4)
=COUNTIF($A$2:$A$21,C5)
=COUNTIF($A$2:$A$21,C6)
=SUM(D2:D6)
Note: Rows 9-19 are hidden.
Value Worksheet
1
2
3
4
5
6
7
8
20
21
A
Rating Given
Above Average
Below Average
Above Average
Average
Average
Above Average
Above Average
Excellent
Poor
B
C
Quality Rating
Poor
Below Average
Average
Above Average
Excellent
Total
D
Frequency
2
3
5
9
1
20
26
CHAPTER 2
Note: Rows 9-19 are hidden.
EXAMPLE 2
RELATIVE AND PERCENT FREQUENCY DISTRIBUTIONS – QUALITATIVE DATA
Refer to the quality ratings data in Example 1. Construct, using Excel, a relative
frequency distribution and percent frequency distribution for the data.
SOLUTION 2
USING EXCEL
For a data set with n observations, the relative frequency of each class is computed as
follows:
Relative Frequency of a Class = (Frequency of the Class)/n
Extending the worksheet shown in the solution to Example 1, we can construct the relative
frequency and percent frequency distributions for the quality ratings data.
Enter Data: The data set is already entered. We simply enter the label “Relative
Frequency” in cell E1and “Percent Frequency” in cell F1.
Enter Functions and Formulas: Using the relative frequency formula above, we compute
the relative frequency for the Poor rating by entering the formula =D2/$D$7 in cell E2. We
then copy cell E2 to cells E3:E6.
To compute the percent frequency for the Poor rating we enter the formula =E2*100
into cell F2. We then copy cell F2 to cells F3:F6. Finally, we copy cell D7 to cells E7:F7 to
compute the total of the relative frequencies (1.00) and the total of the percent frequencies
(100).
Formula Worksheet
C
D
1 Quality Rating Frequency
2 Poor
=COUNTIF($A$2:$A$21,C2)
3 Below Average =COUNTIF($A$2:$A$21,C3)
4 Average
=COUNTIF($A$2:$A$21,C4)
5 Above Average =COUNTIF($A$2:$A$21,C5)
6 Excellent
=COUNTIF($A$2:$A$21,C6)
7
Total =SUM(D2:D6)
8
Note: Columns A and B are not shown.
E
Relative
Frequency
=D2/$D$7
=D3/$D$7
=D4/$D$7
=D5/$D$7
=D6/$D$7
=SUM(E2:E6)
F
Percent
Frequency
=E2*100
=E3*100
=E4*100
=E5*100
=E6*100
=SUM(F2:F6)
TABULAR AND GRAPHICAL METHODS
27
Value Worksheet
C
1
2
3
4
5
6
7
8
Quality Rating
Poor
Below Average
Average
Above Average
Excellent
Total
D
Frequency
2
3
5
9
1
20
E
Relative
Frequency
0.10
0.15
0.25
0.45
0.05
1.00
F
Percent
Frequency
10
15
25
45
5
100
Note: Columns A and B are not shown.
EXAMPLE 3
BAR GRAPH
Refer to the quality ratings data in Example 1. Display the frequencies (computed in
Example 1) graphically with a bar graph using Excel.
SOLUTION 3
USING EXCEL’S CHART WIZARD FOR BAR GRAPHS
Extending the worksheet shown in the solution to Example 1, we can construct the bar
graph using Excel’s Chart Wizard. A third task (in addition to Enter Data and Enter
Formulas/Functions) is now necessary: Apply Tools.
Enter Data: The data set was entered in Example 1.
Enter Functions and Formulas: The functions and formulas for the relative frequencies
we want to graph were entered in Example 1.
Apply Tools: The following steps describe how to use Excel’s Chart Wizard to construct a
bar graph using the frequency distribution appearing in cells C1:D6.
Step 1 Select cells C1:D6
Step 2 Select the Chart Wizard button
Step 3 When the Chart Wizard-Step 1 of 4-Chart Type dialog box appears:
Choose Column in the Chart type list
Choose Clustered Column from the Chart sub-type display
Select Next >
Step 4 When the Chart Wizard-Step 2 of 4-Chart Source Data dialog box appears
Select Next >
Step 5 When the Chart Wizard-Step 3 of 4-Chart Options dialog box appears:
28
CHAPTER 2
Select the Titles tab and then
Type Bar Graph of Quality Ratings in the Chart title box
Enter Quality Rating in the Value (X) axis box
Enter Frequency in the Value (Y) axis box
Select the Legend tab and then
Remove the check in the Show Legend box
Select Next >
Step 6 When the Chart Wizard-Step 4 of 4-Chart Location dialog box appears:
Specify the location for the new chart
Select Finish to display the bar graph
With practice and patience, you can alter the graph initially produced by Excel to look
like the one below or to suit your personal preferences. A right-click on almost any item in
the chart will bring up a menu of alteration options.
Bar Graph
C
D
E
F
Bar Graph of Quality Ratings
10
Frequency
9
10
11
12
13
14
15
16
17
18
19
20
21
22
8
6
4
2
0
Poor
Below
Average
Average
Above
Average
Excellent
Quality Rating
EXAMPLE 4
PIE CHART
Refer to the quality ratings data in Example 1. Display the percent frequencies
(computed in Example 2) graphically with a pie chart using Excel.
SOLUTION 4
USING EXCEL’S CHART WIZARD FOR PIE CHARTS
Excel’s Chart Wizard provides a general tool for constructing graphical displays such as
pie charts. Extending the worksheet shown in the solution to Example 2, we can construct
the pie chart. A third task (in addition to Enter Data and Enter Formulas/Functions) is
necessary: Apply Tools.
TABULAR AND GRAPHICAL METHODS
29
Enter Data: The data set was entered in Example 1.
Enter Functions and Formulas: The functions and formulas for the percent frequencies
we want to graph were entered in Example 2.
Apply Tools: The following steps describe how to use Excel’s Chart Wizard to construct a
pie chart using the percent frequency distribution appearing in cells C1:C6 and F1:F6.
Step 1 Select cells C1:C6 and F1:F6 (To select nonadjacent cells, select cells C1:C6
and then press the Control key while selecting cells F1:F6.)
Step 2 Select the Chart Wizard button
Step 3 When the Chart Wizard-Step 1 of 4-Chart Type dialog box appears:
Choose Pie in the Chart type list
Choose Pie from the Chart sub-type display
Select Next >
Step 4 When the Chart Wizard-Step 2 of 4-Chart Source Data dialog box appears
Select Next >
Step 5 When the Chart Wizard-Step 3 of 4-Chart Options dialog box appears:
Select the Titles tab and then
Type Customers’ Quality Ratings at Marada in the Chart title box
Select the Legend tab and then
Remove the check in the Show Legend box
Select the Data Labels tab and then
Select Show Label and percent
Select Show leader lines
Select Next >
Step 6 When the Chart Wizard-Step 4 of 4-Chart Location dialog box appears:
Specify the location for the new chart
Select Finish to display the pie chart
With practice and patience, you can alter the chart initially produced by Excel to look like the
one below or to suit your personal preferences. A right-click on almost any item in the chart
will bring up a menu of alteration options.
Pie Chart
C
9
10
11
12
13
14
15
16
17
18
19
20
21
22
D
E
F
Customers' Quality Ratings
at Marada Inn
Excellent
5%
Above
Average
45%
Poor
10%
Below
Average
15%
Average
25%
G
30
CHAPTER 2
EXAMPLE 5
FREQUENCY DISTRIBUTION – QUANTITATIVE DATA
The manager of Hudson Auto Repair would like to get a better picture of the distribution
of costs for new parts used in the engine tune-up jobs done in the garage. A sample of 50
customer invoices for tune-ups has been taken and the costs of parts, rounded to the
nearest dollar, are listed below.
91
71
104
85
62
78
69
74
97
82
93
72
62
88
98
57
89
68
68
101
75
66
97
83
79
52
75
105
68
105
99
79
77
71
79
80
75
65
69
69
73
72
80
67
62
62
76
109
74
73
Develop, using Excel, a frequency distribution for these cost data. Use your own judgment
to determine the number of classes and class width that provide a distribution that will be
meaningful and helpful to the manager.
SOLUTION 5
USING EXCEL’S FREQUENCY FUNCTION
Before we employ Excel we must define the nonoverlapping classes to be used in the
frequency distribution. In this example the data is in dollars. Most people will think in
increments of $5, $10, $20, and so on because these increments coincide with the
denominations of U.S. currency. We should consider using one of these increments as the
class width.
If we round up the largest data value (109) to 110 and we round down the smallest data
value (52) to 50, we have a range of 110 – 50 = 60 for the frequency distribution to span. If
we choose 10 as the class width, the result will be 60/10 = 6 classes, which looks
reasonable.
The data is in integer dollar (no cents) amounts. If we set the lower limit of the first
class at 50, the upper limit of the first class will be 59 (not 60). There are 10 dollar amounts
between 50 and 59 inclusively. The second class will have limits of 60 and 69, and so on.
Now we are ready to use Excel. Using Excel’s COUNTIF function to construct a
frequency distribution for quantitative data is cumbersome. Excel’s FREQUENCY function
is more appropriate here.
Enter Data: The label “Parts Cost” and the cost data from the 50 customer invoices are
entered into cells A1:A51.
Enter Functions and Formulas: Descriptive labels are entered into cells C1 and D1, and
the class limits 50-59, 60-69, and so on, are entered in a text format into cells C2:C7.
TABULAR AND GRAPHICAL METHODS
A
B
C
Parts
Parts
Cost
Cost
1
2
91
50-59
3
71
60-69
4
104
70-79
5
85
80-89
6
62
90-99
7
78
100-109
8
69
50
74
51
73
31
D
Frequency
Note: Rows 9-49 are hidden.
The FREQUENCY function is not a “simple” Excel function. FREQUENCY is capable of
providing multiple values and for this reason it is called an array formula. An array formula
must be entered in a special way.
Entering the Necessary Array Formula
Step 1 Select cells D2:D7 (where the frequencies will appear)
Step 2 Type the following formula:
=FREQUENCY(A2:A51,{59,69,79,89,99,109})
Step 3 Hold down the CTRL and SHIFT keys while pressing the ENTER key
(Array formula will be entered in cells D2:D7)
A
B
C
Parts
Parts
Cost
1 Cost
2
91
50-59
3
71
60-69
4
104
70-79
5
85
80-89
6
62
90-99
7
78
100-109
8
69
50
74
51
73
D
Frequency
2
13
17
7
6
5
Note: Rows 9-49 are hidden.
EXAMPLE 6
32
CHAPTER 2
HISTOGRAM
Refer to the auto parts cost data in Example 5. Display the frequency distribution
(constructed in Example 5) graphically with a histogram using Excel.
SOLUTION 6
USING EXCEL’S CHART WIZARD FOR HISTOGRAMS
Enter Data: The data set was entered in Example 5.
Enter Functions and Formulas: The functions and formulas for the frequencies we want
to graph were entered in Example 5.
Apply Tools: The following steps describe how to use Excel’s Chart Wizard to construct a
Histogram.
Step 1 Select cells C1:D7
Step 2 Select the Chart Wizard button
Step 3 When the Chart Wizard-Step 1 of 4-Chart Type dialog box appears:
Choose Column in the Chart type list
Choose Clustered Column from the Chart sub-type display
Select Next >
Step 4 When the Chart Wizard-Step 2 of 4-Chart Source Data dialog box appears
Select Next >
Step 5 When the Chart Wizard-Step 3 of 4-Chart Options dialog box appears:
Select the Titles tab and then
Type Histogram for Parts Cost Data in the Chart title box
Enter Parts Cost ($) in the Value (X) axis box
Enter Frequency in the Value (Y) axis box
Select the Legend tab and then
Remove the check in the Show Legend box
Select Next >
Step 6 When the Chart Wizard-Step 4 of 4-Chart Location dialog box appears:
Specify the location for the new chart
Select Finish to display the histogram
Initial Histogram
TABULAR AND GRAPHICAL METHODS
C
E
Histogram for Parts Cost Data
20
Frequency
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
D
15
10
5
0
50-59
60-69
70-79
80-89
90-99
Parts Cost ($)
Eliminating Gaps Between Rectangles
Step 1 Right click on any rectangle in the column chart
Step 2 Select the Format Data Series option
Step 3 When the Format Data Series Option dialog box appears:
Select the Options tab and then
Enter 0 in the Gap width box
Select OK
Finished Histogram
100-109
33
34
CHAPTER 2
C
D
E
Histogram for Parts Cost Data
20
Frequency
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
15
10
5
0
50-59
60-69
70-79
80-89
90-99
100-109
Parts Cost ($)
EXAMPLE 7
CUMULATIVE DISTRIBUTIONS
Refer to the auto parts cost data in Example 5. Develop a cumulative frequency
distribution and a cumulative percent frequency distribution for this data using Excel.
SOLUTION 7
USING EXCEL FOR CUMULATIVE DISTRIBUTIONS
Extending the worksheet shown in the solution to Example 5, we can construct the
cumulative frequency and cumulative percent frequency distributions for the cost data.
Enter Data: The data set was entered in Example 5.
Enter Functions and Formulas: The functions and formulas for the frequencies we need
were entered in Example 5. The first cumulative frequency (2) is simply the frequency for
the first class, so we enter the formula =D2 in cell E2. The second cumulative frequency
(15) is equal to the frequency of the second class (13) plus the prior cumulative frequency
(2), so we enter the formula =D3+E2 in cell E3. Now, we copy cell E3 to cells E4:E7 to
complete the cumulative distribution.
To compute the cumulative percent frequency distribution, we must convert each
cumulative frequency value to a cumulative relative frequency value and then multiply it by
100. We enter the formula =E2/$D$8*100 into cell F2 to compute the first cumulative
percent frequency. We then copy cell F2 to cells F3:F7 to complete our distribution.
Formula Worksheet
35
TABULAR AND GRAPHICAL METHODS
1
2
3
4
5
6
7
8
A
B
C
D
Parts
Parts
Cost
Cost
Frequency
91
$50-59
2
71
$60-69
13
104
$70-79
16
85
$80-89
7
62
$90-99
7
78
$100-109
5
69
Sum =SUM(D2:D7)
E
Cumulative
Frequency
=D2
=D3+E2
=D4+E3
=D5+E4
=D6+E5
=D7+E6
F
Cumulative
% Frequency
=E2/$D$8*100
=E3/$D$8*100
=E4/$D$8*100
=E5/$D$8*100
=E6/$D$8*100
=E7/$D$8*100
G
F
Cumulative
Frequency
2
15
31
38
45
50
G
Cumulative
% Frequency
4
30
62
76
90
100
G
Note: Rows 9-51 are not shown.
Value Worksheet
1
2
3
4
5
6
7
8
A
B
C
Parts
Parts
Cost
Cost
91
$50-59
71
$60-69
104
$70-79
85
$80-89
62
$90-99
78
$100-109
69
Sum
D
Frequency
2
13
16
7
7
5
50
Note: Rows 9-51 are not shown.
EXAMPLE 8
OGIVE
Refer to the auto parts cost data in Example 5. Construct an ogive, using Excel,
showing the cumulative percent frequency distribution for the data.
SOLUTION 8
USING EXCEL’S CHART WIZARD FOR OGIVES
We will extend the worksheet shown in the solution to Example 7 to develop the ogive.
We need to modify the worksheet in two ways before we can apply the Chart Wizard tool.
We will “insert” (figuratively) a new row between existing rows 1 and 2, and we will insert
(literally) a new column between existing columns C and D.
The reason for a “new” row is that we need a starting point for the ogive showing that no
36
CHAPTER 2
data values fall below the 50-59 class. In other words, we need to create a 40-49 class with
a cumulative percent frequency equal to 0. The reason for a new column is this: Because
the class limits are 50-59, 60-69, and so on, there appear to be one-unit gaps from 59 to 60,
69 to 70, and so on. These gaps are eliminated by plotting points, on our ogive, at the
midpoints of the gaps. Thus, 59.5 is used as the upper limit for the 50-59 class, 69.5 is
used for the 60-69 class, and so on. We will enter these new upper limits in the new
column.
Enter Data: The data set was entered in Example 5.
Enter Functions and Formulas: The formulas for the cumulative percent frequencies we
need were entered in Example 7. Now, in order to enter the new 40-49 class in the
worksheet we must free up a row. To do this, we drag cells C2:G8 down one row to cells
C3:G9. This frees up cells C2:G2 for the new 40-49 class information. Next, we will insert
a new column between existing columns C and D. To do this, select any cell in column D
and then go to the main menu, select Insert, and select Columns in the drop-down menu.
The worksheet will now look like this:
1
2
3
4
5
6
7
8
9
A
B
C
Parts
Parts
Cost
Cost
91
71
50-59
104
60-69
85
70-79
62
80-89
78
90-99
69
100-109
74
Sum
D
E
Frequency
2
13
16
7
7
5
50
F
Cumulative
Frequency
G
Cumulative
% Frequency
2
15
31
38
45
50
4
30
62
76
90
100
In the new column D, enter the label “Parts Cost Upper Limit” in cell D1 and enter the
new upper limits 49.5, 59.5, and so on, in cells D2:D8. In cell C2 enter, in text format, 4049. In cells E2:G2 enter zeros. Finally, drag the Sum label in cell C9 to cell D9.
The worksheet will now look like this:
A
B
C
D
E
F
G
Parts
Parts
Parts Cost
Cumulative
Cumulative
1 Cost
Cost
Upper Limit Frequency Frequency
% Frequency
2
91
40-49
49.5
0
0
0
3
71
50-59
59.5
2
2
4
4
104
60-69
69.5
13
15
30
5
85
70-79
79.5
16
31
62
6
62
80-89
89.5
7
38
76
7
78
90-99
99.5
7
45
90
8
69
100-109
109.5
5
50
100
9
74
Sum
50
Apply Tools: The following steps describe how to use Excel’s Chart Wizard to construct an
37
TABULAR AND GRAPHICAL METHODS
ogive.
Step 1 Select cells D2:D8 and G2:G8 (To select nonadjacent cells, select cells D2:D8
and then press the Control key while selecting cells G2:G8.)
Step 2 Select the Chart Wizard button
Step 3 When the Chart Wizard-Step 1 of 4-Chart Type dialog box appears:
Choose XY (Scatter) in the Chart type list
Choose Scatter with data points connected by lines from the
Chart sub-type display
Select Next >
Step 4 When the Chart Wizard-Step 2 of 4-Chart Source Data dialog box appears
Select Next >
Step 5 When the Chart Wizard-Step 3 of 4-Chart Options dialog box appears:
Select the Titles tab and then
Type Ogive for the Parts-Cost Data in the Chart title box
Enter Parts Cost ($) in the Value (X) axis box
Enter Cumulative Percent Frequency in the Value (Y) axis box
Select the Legend tab and then
Remove the check in the Show Legend box
Select Next >
Step 6 When the Chart Wizard-Step 4 of 4-Chart Location dialog box appears:
Specify the location for the new chart
Select Finish to display the ogive
Ogive
C
E
F
G
H
Ogive for the Parts-Cost Data
100
90
80
Cumulative Percent
Frequency
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
D
EXAMPLE 9
70
60
50
40
30
20
10
0
0
10
20
30
40
50
60
70
Parts Cost ($)
80
90
100
110
120
38
CHAPTER 2
STEM-AND-LEAF DISPLAY
Refer to the auto parts cost data in Example 5.
a) Develop a stem-and-leaf display showing both the rank order and shape of the data set
simultaneously.
b) Also, develop a stretched stem-and-leaf display using two stems for each leading
digit(s).
c) Which of the two displays is better in terms of revealing the natural grouping and
variation in the data?
SOLUTION 9
a) To develop a stem-and-leaf display, we first arrange the leading digits (all but the last
digit) of each data value to the left of a vertical line. To the right of the vertical line, we
record the last digit for each data value as we pass through the observations in the
order they are recorded. The last digit for each data value is placed on the line
corresponding to its first digit.
At this point, the display will look like this:
5
6
7
8
9
10
7
2
8
0
1
4
2
9
5
9
3
5
6
3
0
9
9
2
1
5
7
1
8
2
8
7
5
5 8 8 9 7 2 9 2
5 9 5 2 6 4 7 1 4 9 9 3
3 2
8
Next, we sort (in ascending order) the digits on each line. The result is the finished
stem-and-leaf display.
5
6
7
8
9
10
2
2
1
0
1
1
7
2
1
0
3
4
2
2
2
7
5
2
2
3
7
5
5
3
5
8
9
6 7 8 8 8 9 9 9
3 4 4 5 5 5 6 7 8 9 9 9
8 9
9
b) If we believe that our stem-and-leaf display has condensed the data too much, we can
stretch the display by using two stems for each leading digit(s). All data values ending
in 0, 1, 2, 3, and 4 are placed on one line, and all values ending in 5, 6, 7, 8, and 9 are
placed on a second line.
The resulting stretched stem-and-leaf display will look like this:
TABULAR AND GRAPHICAL METHODS
5
5
6
6
7
7
8
8
9
9
10
10
2
7
2
5
1
5
0
5
1
7
1
5
2
6
1
5
0
8
3
7
4
5
2
7
2
5
2
9
39
2
8 8 8 9 9 9
2 3 3 4 4
6 7 8 9 9 9
3
8 9
9
c) The stretched stem-and-leaf display in (b) does a better job of revealing the dispersion of
the data.
EXAMPLE 10
CROSSTABULATION
Ithaca Log Homes manufactures four styles of log houses that are sold in kits. The
number of log houses the company has sold, by price and style, for the past year is shown
below.
<99 Colonial
>100 A-Frame
>100 Colonial
<99 Ranch
>100 Split-Level
<99 Colonial
>100 Split-Level
<99 Colonial
<99 A-Frame
>100 Split-Level
>100 Ranch
>100 Split-Level
<99 Colonial
>100 Colonial
>100 Ranch
<99 A-Frame
<99 A-Frame
<99 Split-Level
<99 Split-Level
<99 Split-Level
>100 Split-Level
<99 A-Frame
<99 Split-Level
>100 Colonial
>100 Ranch
<99 Colonial
>100 Ranch
>100 Split-Level
<99 Ranch
>100 Split-Level
<99 A-Frame
>100 Split-Level
<99 Colonial
<99 Colonial
>100 Colonial
>100 Colonial
>100 Ranch
<99 Split-Level
<99 Split-Level
<99 Colonial
Prepare a crosstabulation for the variables price and style.
SOLUTION 10
40
CHAPTER 2
USING EXCEL’S PIVOTTABLE REPORT FOR CROSSTABULATION
On a new worksheet, enter the data in columns A, B, and C.
A
B
1 Home Price ($)
>100K
2
1
<99K
3
2
>100K
4
3
<99K
5
4
<99K
6
5
<99K
7
6
>100K
8
7
>100K
9
8
C
Style
Colonial
Ranch
Ranch
A-Frame
Colonial
Split-Level
A-Frame
Colonial
D
E
Note: Rows 10-41 are not shown.
Changing the Default Order for the PivotTable Report
Step 1 Select the Tools pull-down menu
Step 2 Choose Options
Step 3 When the Options dialog box appears:
Select the Custom lists tab
In the List entries: box, type <99K and press Enter, and type >100K
Select Add
Select OK
We are now ready to use the PivotTable Report to construct a crosstabulation.
Using the PivotTable Report
Step 1 Select the Data pull-down menu
Step 2 Choose the PivotTable and PivotChart Report
Step 3 When the PivotTable and PivotChart Wizard-Step 1 of 3 dialog box appears:
Choose Microsoft Excel list or database
Choose PivotTable
Select Next >
Step 4 When the PivotTable and PivotChart Wizard Step 2 of 3 dialog box appears:
Enter A1:C41 in the Range box
Select Next >
Step 5 When the PivotTable and PivotChart Wizard Step 3 of 3 dialog box appears:
Select New Worksheet
Click on the Layout button
TABULAR AND GRAPHICAL METHODS
41
When the PivotTable and PivotChart Wizard – Layout diagram appears:
Drag the Price ($) field button to the ROW section of the diagram
Drag the Style field button to the COLUMN section of the diagram
Drag the Home field button to the DATA section of the diagram
Double click the Sum of Home field button in the data section
When the PivotTable Field dialog box appears:
Choose Count under Summarized by:
Select OK
Select OK
When the PivotTable and PivotChart Wizard-Step 3 of 3 dialog box
reappears:
Select Finish >
1
2
3
4
5
6
E
F
G
H
Count of Home Style
Price ($)
Colonial Ranch Split-Level
<99K
8
2
8
>100K
5
5
6
Grand Total
13
7
14
I
J
A-Frame Grand Total
5
23
1
17
6
40
EXAMPLE 11
SCATTER DIAGRAM
The Panthers football team is interested in investigating the relationship, if any, between
interceptions made and points scored. The following data was collected for five recent
games.
x = Number of
y = Number of
Interceptions
Points Scored
1
14
3
24
2
18
1
17
3
27
Develop a scatter diagram to show the relationship between the two variables, number
of interceptions and number of points scored.
SOLUTION 11
USING EXCEL’S CHART WIZARD FOR SCATTER DIAGRAMS
42
CHAPTER 2
Enter Data: The appropriate labels and the data for the five football games are entered
into cells A1:B6.
1
2
3
4
5
6
7
A
Number of
Interceptions
1
3
2
1
3
B
Number of
Points Scored
14
24
18
17
27
C
Enter Functions and Formulas: No functions or formulas are needed.
Apply Tools: The following steps describe how to use Excel’s Chart Wizard to produce a
scatter diagram from the data in the worksheet.
Step 1 Select cells A1:B6
Step 2 Select the Chart Wizard
Step 3 When the Chart Wizard-Step 1 of 4-Chart Type dialog box appears:
Choose XY (Scatter) in the Chart type list
Choose Scatter from the Chart sub-type display
Select Next >
Step 4 When the Chart Wizard-Step 2 of 4-Chart Source Data dialog box appears
Select Next >
Step 5 When the Chart Wizard-Step 3 of 4-Chart Options dialog box appears:
Select the Titles tab and then
Delete Number of Points Scored in the Chart title box
Enter Number of Interceptions in the Value (X) axis box
Enter Number of Points Scored in the Value (Y) axis box
Select the Legend tab and then
Remove the check in the Show Legend box
Select Next >
Step 6 When the Chart Wizard-Step 4 of 4-Chart Location dialog box appears:
Specify the location for the new chart
Select Finish to display the scatter diagram
TABULAR AND GRAPHICAL METHODS
43
Finished Scatter Diagram
Scatter Diagram for the Panthers
30
25
Number of
Points Scored
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
D
C
B
A
20
15
10
5
0
0
3
2
1
4
Number of Interceptions
EXERCISESs
EXERCISE 1
FREQUENCY DISTRIBUTIONS – QUALITATIVE DATA
It is time for Roger Hall, manager of new car sales at the Maxwell Ford dealership, to
submit his order for new Mustang coupes. These cars will be parked in the lot, available for
immediate sale to buyers who are not special-ordering a car. One of the decisions he must
make is how many Mustangs of each color option he should order. The new color options
are very similar to the past year’s options.
Roger believes that the colors chosen by customers who special-order their cars best
reflect most customers’ true color preferences. For that reason, he has taken a random
sample of 40 special orders for Mustang coupes placed in the past year. The color
preferences found in the sample are listed below.
Blue
Black
Red
Green
Blue
Black
Red
White
Black
Red
Green
White
Blue
Red
Black
White
Blue
White
Black
White
Black
Blue
Red
Blue
Black
Red
Green
Red
Black
Red
Red
Red
Black
White
Black
White
Black
Black
Green
Blue
Prepare, using Excel, a frequency distribution, relative frequency distribution, and
percent frequency distribution for the data set.
44
CHAPTER 2
EXERCISE 2
BAR GRAPH AND PIE CHART
Refer to the data set in Exercise 1. Construct, using Excel, a bar graph showing the
frequency distribution of the car colors. Also construct, using Excel, a pie chart showing the
percent frequency distribution of the car colors.
EXERCISE 3
FREQUENCY DISTRIBUTIONS – QUANTITATIVE DATA
Missy Walters owns a mail-order business specializing in clothing, linens, and furniture
for children. She is considering offering her customers a discount on shipping charges for
furniture based on the dollar-amount of the furniture order. Before Missy decides the
discount policy, she needs a better understanding of the dollar-amount distribution of the
furniture orders she receives.
Missy had an assistant randomly select 50 recent orders that included furniture and
record the value, to the nearest dollar, of the furniture portion of each order. The data
collected is listed below.
136
211
194
277
231
281
162
242
348
154
226
212
368
173
166
123
241
258
409
214
178
182
323
264
311
445
290
196
237
141
231
434
183
490
159
389
167
209
222
362
196
246
198
472
189
175
338
212
248
260
Prepare, using Excel, a frequency distribution, relative frequency distribution, and
percent frequency distribution for the data set. Use your own judgment to determine the
number of classes and class width that provide a distribution that will be meaningful and
helpful to Missy in deciding the shipping discount policy.
EXERCISE 4
HISTOGRAM
Refer to the data set in Exercise 3. Construct, using Excel, a histogram showing the
percent frequency distribution of the furniture-order values in the sample.
EXERCISE 5
TABULAR AND GRAPHICAL METHODS
45
CUMULATIVE DISTRIBUTIONS AND OGIVE
Refer to the data set in Exercise 3. Develop a cumulative frequency distribution and a
cumulative percent frequency distribution for this data using Excel. Then construct, using
Excel, an ogive showing the cumulative percent frequency distribution.
EXERCISE 6
STEM-AND-LEAF DISPLAY
Refer to the furniture-order values data in Example 3. Develop a stem-and-leaf display
for the data set. (Employ the following tips to avoid having 38 stems and other
shortcomings.)
1) Set the leaf unit equal to 10. In other words, convert the data to 10’s of dollars. For
example, the data value 226 would be treated as 22. The last digit, 6, is ignored.
2) Try using a stretched stem-and-leaf display. That is, have two stems labeled 1, two
stems labeled 2, and so on. Values 10-14 are entered on the first stem labeled 1,
values 15-19 are entered on the second stem labeled 1, and so on.
EXERCISE 7
CROSSTABULATION
Tony Zamora, a real estate investor, has just moved to Clarksville and wants to learn
about the local real estate market. He wants to understand, for example, the relationship
between geographical segment of the city and selling price of a house, the relationship
between selling price and number of bedrooms, and so on.
Tony has randomly selected 25 house-for-sale listings from the Sunday newspaper and
collected the data listed below.
a) Use Excel to construct a crosstabulation for the variables city segment and number of
bedrooms.
b) Compute the row percentages for your crosstabulation in part (a).
c) Comment on any apparent relationship between the variables.
46
CHAPTER 2
Segment
of City
Northwest
South
Northeast
Northwest
West
South
West
West
Northwest
Northeast
Northwest
South
Northwest
West
South
South
Northeast
Northwest
West
Northeast
West
Northwest
South
Northeast
West
Selling
Price
($000)
290
95
170
375
350
125
310
275
340
215
295
190
385
430
185
175
190
330
405
170
365
280
135
205
260
House Size Number of
(00 sq. ft.) Bedrooms
21
4
11
2
19
3
38
5
24
4
10
2
31
4
25
3
27
5
22
4
20
4
24
4
36
5
32
5
14
3
18
4
19
4
29
4
33
5
23
4
34
5
25
4
17
3
21
4
26
4
Number of
Bathrooms
2
1
2
4
3
2
4
2
3
3
3
3
4
4
2
2
2
4
4
2
4
2
1
3
3
Garage
Size
(cars)
2
0
2
3
2
0
2
2
3
2
2
2
3
2
1
2
2
3
3
2
3
2
1
2
2
EXERCISE 8
SCATTER DIAGRAM
Refer to the real estate data in Exercise 7. Develop a scatter diagram to show the
relationship between the two variables size of house and number of bathrooms. Place the
variable number of bathrooms on the horizontal axis.
TABULAR AND GRAPHICAL METHODS
47
SELF-TESTs
TRUE/FALSE
1. The lines that connect the points plotted in an ogive cannot have a negative slope.
2. In a stem-and-leaf display, a single digit is used to define each leaf, while more than one
digit can be used to define each stem.
3. At least one of the variables in a crosstabulation must be a quantitative variable.
4. There should be no gaps between adjacent bars in a histogram.
5. For a bar graph to be worthwhile, the data being displayed must involve at least two
variables.
FILL-IN-THE-BLANK
1. The techniques of _____________________ consist of simple arithmetic and easy-to-draw
graphs that can be used to summarize data quickly.
2. A ______________________ is a tabular summary of data showing the number of items in
each of several nonoverlapping classes.
3. The last entry in a ____________________________ distribution is always 1.00.
4. In general practice, cumulative frequency distributions are appropriate for summarizing
_____________ data and not _______________ data.
5. Adjacent bars are touching in a bar graph and are not touching in a _____________.
MULTIPLE CHOICE
1. Which one of the following graphical methods is most appropriate for qualitative data?
a) ogive
b) scatter diagram
c) histogram
d) pie chart
2. A graphical method that will assist in the understanding of the relationship between two
variables is a
a) crosstabulation
b) scatter diagram
c) stem-and-leaf display
d) bar graph
3. A graphical method that can be used to show both the rank order and shape of a data set
48
CHAPTER 2
simultaneously is a
a) relative frequency distribution
b) pie chart
c) stem-and-leaf display
d) pivot table
4. The proper way to construct a stem-and-leaf display for the data set
{62,67,68,73,73,79,91,94,95,97} is to
a) exclude a stem labeled ‘8’
b) include a stem labeled ‘8’ and enter no leaves on the stem
c) include a stem labeled ‘(8)’ and enter no leaves on the stem
d) include a stem labeled ‘8’ and enter one leaf value of ‘0’ on the stem
5. The graphical method presented in the chapter for displaying cumulative frequencies is
a) an ogive
b) a stem-and-leaf display
c) a histogram
d) a bar graph
ANSWERSs
EXERCISES
1)
1
2
3
4
5
6
7
C
D
Color of Car
Black
Blue
Green
Red
White
Total
Frequency
12
7
4
10
7
40
2) Bar Graph - Frequencies
E
Relative
Frequency
0.300
0.175
0.100
0.250
0.175
1.000
F
Percent
Frequency
30.0
17.5
10.0
25.0
17.5
100.0
TABULAR AND GRAPHICAL METHODS
Car Color Frequencies
14
Frequency
12
10
8
6
4
2
0
Black
Blue
Green
Red
White
Car Color
Pie Chart – Percent Frequencies
Car Colors' Percent Frequencies
Black
30.0%
White
17.5%
Red
25.0%
Blue
17.5%
Green
10.0%
3)
1
2
3
4
5
6
7
8
9
4)
D
Parts
Cost
100-149
150-199
200-249
250-299
300-349
350-399
400-449
450-499
E
Frequency
3
15
14
6
4
3
3
2
F
Relative
Frequency
0.06
0.30
0.28
0.12
0.08
0.06
0.06
0.04
G
Cum. Rel.
Frequency
0.06
0.36
0.64
0.76
0.84
0.90
0.96
1.00
49
50
CHAPTER 2
Percent Frequencies of
Furniture-Order Values
35
Percent
Frequency
30
25
20
15
10
5
0
100-149 150-199 200-249 250-299 300-349 350-399
Value of Furniture Order ($)
5)
D
Parts
Cost
100-149
150-199
200-249
250-299
300-349
350-399
400-449
450-499
1
2
3
4
5
6
7
8
9
E
Frequency
3
15
14
6
4
3
3
2
F
Cumulative
Frequency
3
18
32
38
42
45
48
50
6)
1
1
2
2
3
3
4
4
2
5
0
5
1
6
0
7
3
5
1
6
2
6
3
9
4
6
1
6
3
8
4
6 6 7 7 7 8 8 8 9 9 9 9
1 1 2 2 3 3 3 4 4 4 4
7 8 9
4
G
Cumulative
% Frequency
6
36
64
76
84
90
96
100
TABULAR AND GRAPHICAL METHODS
51
7) a) Crosstabulation:
1
2
3
4
5
6
7
D
Count of Home
Segment of City
Northeast
Northwest
South
West
Grand Total
E
F
G
Number of Bedrooms
2
3
4
0
1
4
0
0
4
2
2
2
0
1
3
2
4
13
H
I
5
0
3
0
3
6
Grand Total
5
7
6
7
25
E
F
G
Number of Bedrooms
2
3
4
0.0
20.0
80.0
0.0
0.0
57.1
33.3
33.3
33.3
0.0
14.3
42.9
H
I
5
0.0
42.9
0.0
42.9
Grand Total
100.0
100.0
100.0
100.1
b) Row Percentages:
1
2
3
4
5
6
D
Percent of Home
Segment of City
Northeast
Northwest
South
West
c) We see that fewest bedrooms are associated with the South, and the most bedrooms are
associated with the West and particularly the Northwest.
8)
Scatter Diagram for the Real Estate Data
40
35
House Size
(00 sq.ft.)
30
25
20
15
10
5
0
0
1
2
3
Number of Bathrooms
TRUE/FALSE
4
5
52
1.
2.
3.
4.
5.
CHAPTER 2
True
True
False
True
False
FILL-IN-THE-BLANK
1.
2.
3.
4.
5.
exploratory data analysis
frequency distribution
cumulative relative frequency
quantitative, qualitative
histogram
MULTIPLE CHOICE
1.
2.
3.
4.
5.
d
b
c
b
a
Download