Chapter 12

Chi-Square and
Analysis of Variance
(ANOVA)
© The McGraw-Hill Companies, Inc., 2000
Outline

12-1 Introduction

12-2 Test for Goodness of Fit


12-3 Tests Using Contingency Tables
12-4 Analysis of Variance (ANOVA)
Objectives




Test a distribution for goodness of fit
using chi-square.
Test two variables for independence
using chi-square.
Test proportions for homogeneity using
chi-square.
Use the ANOVA technique to determine whether there is a
difference among three or more means.
12-2 Test for Goodness of Fit

When one is testing to see
whether a frequency
distribution fits a specific
pattern, the chi-square
goodness-of-fit test is used.
12-2 Test for Goodness of Fit Example

Suppose a market analyst wished to see
whether consumers have any preference
among five flavors of a new fruit soda. A
sample of 100 people provided the
following data:
Cherry   Strawberry   Orange   Lime   Grape
32       28           16       14     10
12-2 Test for Goodness of Fit Example



If there were no preference, one
would expect that each flavor would
be selected with equal frequency.
In this case, the equal frequency is
100/5 = 20.
That is, approximately 20 people
would select each flavor.
12-2 Test for Goodness of Fit Example



The frequencies obtained from the
sample are called observed
frequencies.
The frequencies obtained from
calculations are called expected
frequencies.
The table for the test is shown next.
12-2 Test for Goodness of Fit Example
Freq.      Cherry   Strawberry   Orange   Lime   Grape
Observed   32       28           16       14     10
Expected   20       20           20       20     20
12-2 Test for Goodness of Fit Example



The observed frequencies will almost
always differ from the expected
frequencies due to sampling error.
Question: Are these differences
significant, or are they due to chance?
The chi-square goodness-of-fit test will
enable one to answer this question.
12-2 Test for Goodness of Fit Example




The appropriate hypotheses for this
example are:
H0: Consumers show no preference for
flavors of the fruit soda.
H1: Consumers show a preference.
The degrees of freedom (d.f.) for this test equal the
number of categories minus 1.
12-2 Test for Goodness of Fit Formula
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
d.f. = number of categories – 1
O = observed frequency
E = expected frequency
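The formula above can be sketched in a few lines of Python. This is a minimal illustration rather than part of the original slides: the helper name `chi_square_statistic` is hypothetical, and the data are the soda-flavor counts from the example, with equal expected frequencies assumed under the null hypothesis.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Soda-flavor example: 100 people, 5 flavors, equal expected frequencies.
observed = [32, 28, 16, 14, 10]
expected = [sum(observed) / len(observed)] * len(observed)  # 100 / 5 = 20 each

chi_square = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1  # number of categories minus 1

print(f"chi-square = {chi_square:.1f}, d.f. = {degrees_of_freedom}")
```

The same helper works for unequal expected frequencies, as long as both lists are in the same category order.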
12-2 Test for Goodness of Fit Example


Is there enough evidence to reject the
claim that there is no preference in the
selection of fruit soda flavors? Let
α = 0.05.
Step 1: State the hypotheses and
identify the claim.
12-2 Test for Goodness of Fit Example



H0: Consumers show no preference for
flavors (claim).
H1: Consumers show a preference.
Step 2: Find the critical value. The d.f.
are 5 – 1 = 4 and α = 0.05. Hence, the
critical value = 9.488.
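Critical values such as 9.488 are normally read from a chi-square table. As a rough cross-check, the sketch below recovers the value numerically using only the Python standard library. The function names `chi2_sf` and `chi2_critical_value` are hypothetical, and the series-plus-bisection approach is one assumed way to do it, not the method used in the slides.

```python
import math


def chi2_sf(x, df):
    """Right-tail probability P(X > x) for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, half_x = df / 2.0, x / 2.0
    # Series for the regularized lower incomplete gamma function P(a, x/2).
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= half_x / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-half_x + a * math.log(half_x) - math.lgamma(a))
    return 1.0 - lower


def chi2_critical_value(alpha, df):
    """Find x with P(X > x) = alpha by bisection."""
    low, high = 0.0, float(df)
    while chi2_sf(high, df) > alpha:  # widen the bracket until it contains the answer
        high *= 2.0
    for _ in range(100):
        mid = (low + high) / 2.0
        if chi2_sf(mid, df) > alpha:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


critical = chi2_critical_value(alpha=0.05, df=4)
print(f"critical value = {critical:.3f}")
```

Bisection is slow but hard to get wrong, which suits a one-off check against a printed table.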
12-2 Test for Goodness of Fit Example


Step 3: Compute the test value.
$\chi^2 = (32 - 20)^2/20 + (28 - 20)^2/20 + \dots + (10 - 20)^2/20 = 18.0$
Step 4: Make the decision. The
decision is to reject the null hypothesis,
since 18.0 > 9.488.
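For readers who want to check the arithmetic in Step 3, the five terms can be written out in full using the observed counts (32, 28, 16, 14, 10) and the expected frequency of 20 for each flavor:

```latex
\chi^2 = \frac{(32-20)^2}{20} + \frac{(28-20)^2}{20} + \frac{(16-20)^2}{20}
       + \frac{(14-20)^2}{20} + \frac{(10-20)^2}{20}
       = 7.2 + 3.2 + 0.8 + 1.8 + 5.0 = 18.0
```

Cherry alone contributes 7.2 of the 18.0, so most of the departure from equal preference comes from the two most and least popular flavors.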
12-2 Test for Goodness of Fit Example

Step 5: Summarize the results. There is
enough evidence to reject the claim that
consumers show no preference for the
flavors.
12-2 Test for Goodness of Fit Example

[Figure: chi-square distribution curve with the critical value 9.488 marked.]
12-2 Test for Goodness of Fit Example

The advisor of an ecology club at a large
college believes that the group consists
of 10% freshmen, 20% sophomores, 40%
juniors, and 30% seniors. The
membership for the club this year
consisted of 14 freshmen, 19
sophomores, 51 juniors, and 16 seniors.
At α = 0.10, test the advisor’s conjecture.
12-2 Test for Goodness of Fit Example



Step 1: State the hypotheses and
identify the claim.
H0: The club consists of 10% freshmen,
20% sophomores, 40% juniors, and
30% seniors (claim)
H1: The distribution is not the same as
stated in the null hypothesis.
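Unlike the soda example, the expected frequencies here are not equal: each one is the total membership multiplied by the claimed proportion. A minimal sketch of that step follows; the variable names are illustrative and not taken from the slides.

```python
# Claimed proportions (null hypothesis) and this year's observed membership.
claimed_proportions = {"freshmen": 0.10, "sophomores": 0.20, "juniors": 0.40, "seniors": 0.30}
observed_counts = {"freshmen": 14, "sophomores": 19, "juniors": 51, "seniors": 16}

n = sum(observed_counts.values())  # total membership: 14 + 19 + 51 + 16

# Expected count for each class = total membership * claimed proportion.
expected_counts = {group: n * p for group, p in claimed_proportions.items()}

for group in observed_counts:
    print(f"{group:<11} observed = {observed_counts[group]:>3}  expected = {expected_counts[group]:.0f}")
```

Because the membership happens to total 100, the expected counts coincide with the percentages (10, 20, 40, 30).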
12-2 Test for Goodness of Fit Example


Step 2: Find the critical value. The d.f.
are 4 – 1 = 3 and α = 0.10. Hence, the
critical value = 6.251.
Step 3: Compute the test value.
$\chi^2 = (14 - 10)^2/10 + (19 - 20)^2/20 + \dots + (16 - 30)^2/30 = 11.208$
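Written out term by term, with expected counts of 10, 20, 40, and 30 (the claimed percentages applied to the 100 members), the test value in Step 3 is:

```latex
\chi^2 = \frac{(14-10)^2}{10} + \frac{(19-20)^2}{20} + \frac{(51-40)^2}{40} + \frac{(16-30)^2}{30}
       \approx 1.600 + 0.050 + 3.025 + 6.533 = 11.208
```

The seniors term is the largest, which suggests the shortfall of seniors is the main reason the observed membership departs from the advisor's percentages.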
12-2 Test for Goodness of Fit Example


Step 4: Make the decision. The
decision is to reject the null hypothesis,
since 11.208 > 6.251.
Step 5: Summarize the results. There is
enough evidence to reject the advisor’s
claim.
12-3 Tests Using Contingency Tables


When data can be tabulated as frequencies
in a table, several types of hypotheses
can be tested using the chi-square test.
Two such tests are the independence of
variables test and the homogeneity of
proportions test.
12-3 Tests Using Contingency Tables


The test of independence of variables is
used to determine whether two variables
are independent when a single sample is
selected.
The test of homogeneity of proportions is
used to determine whether the proportions
for a variable are equal when several
samples are selected from different
populations.
12-3 Test for Independence Example



Suppose a new postoperative procedure
is administered to a number of patients
in a large hospital.
Question: Do the doctors feel
differently about this procedure from
the nurses, or do they feel basically the
same way?
The data are shown on the next slide.
12-3 Test for Independence Example
Group     Prefer new procedure   Prefer old procedure   No preference
Nurses    100                    80                     20
Doctors   50                     120                    30
12-3 Test for Independence Example



The null and the alternative hypotheses
are as follows:
H0: The opinion about the procedure is
independent of the profession.
H1: The opinion about the procedure is
dependent on the profession.
12-3 Test for Independence Example


If the null hypothesis is not rejected, the
test means that both professions feel
basically the same way about the
procedure, and the differences are due to
chance.
If the null hypothesis is rejected, the test
means that one group feels differently
about the procedure from the other.
12-3 Test for Independence Example



Note: The rejection of the null hypothesis
does not mean that one group favors the
procedure and the other does not.
The test value is the $\chi^2$ value (same as the
goodness-of-fit test value).
The expected values are computed from:
(row sum)(column sum)/(grand total).
12-3 Test for Independence Example
[MINITAB output for the chi-square test of independence; image not reproduced.]
12-3 Test for Independence Example



From the MINITAB output, the
P-value is reported as 0. Hence, the null
hypothesis is rejected.
If the critical value approach is used, the
degrees of freedom for the chi-square
critical value will be (number of
columns –1)(number of rows – 1).
d.f. = (3 –1)(2 – 1) = 2.
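A short sketch of the degrees-of-freedom rule, together with a P-value check, is given below. It relies on two assumptions that are not in the slides: the test value of about 26.667, which follows from reading the table as Nurses 100 / 80 / 20 and Doctors 50 / 120 / 30, and the closed form P(X > x) = exp(-x/2), which holds only when d.f. = 2. The function names are hypothetical.

```python
import math


def degrees_of_freedom(n_rows, n_cols):
    """(number of rows - 1)(number of columns - 1) for a contingency table."""
    return (n_rows - 1) * (n_cols - 1)


def chi2_p_value_df2(chi_square):
    """Right-tail P-value for d.f. = 2 only, where P(X > x) = exp(-x / 2)."""
    return math.exp(-chi_square / 2.0)


df = degrees_of_freedom(n_rows=2, n_cols=3)

chi_square = 26.667  # assumed test value, derived from the assumed table reading
p_value = chi2_p_value_df2(chi_square)

print(f"d.f. = {df}")
print(f"P-value = {p_value:.2e}")
```

Under these assumptions the P-value is on the order of one in a million, so it would display as 0 at three decimal places, which is consistent with the output described above.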
12-3 Test for Homogeneity of
Proportions

Here, samples are selected from several
different populations and one is
interested in determining whether the
proportions of elements that have a
common characteristic are the same for
each population.
12-3 Test for Homogeneity of
Proportions


The sample sizes are specified in
advance, making either the row totals or
column totals in the contingency table
known before the samples are selected.
The hypotheses will be:
H0: p1 = p2 = … = pk
H1: At least one proportion is different
from the others.
12-3 Test for Homogeneity of
Proportions

The computations for this test
are the same as those for the test
of independence.
12-4 Analysis of Variance (ANOVA)

When an F test is used to test a
hypothesis concerning the
means of three or more
populations, the technique is
called analysis of variance
(ANOVA).
12-4 Assumptions for the F Test for
Comparing Three or More Means



The populations from which the
samples were obtained must be
normally or approximately normally
distributed.
The samples must be independent of
each other.
The variances of the populations must
be equal.
12-4 Analysis of Variance


Although means are being compared
in this F test, variances are used in the
test instead of the means.
Two different estimates of the
population variance are made.
12-4 Analysis of Variance


Between-group variance: an estimate of
the population variance computed from
the means of the groups, that is, from
the variation between the groups.
Within-group variance: an estimate of
the population variance computed from
all the data; it is not affected by
differences in the means.
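The two estimates are commonly written as shown below. These are the standard one-way ANOVA formulas, not formulas quoted from the slides, and the symbols are assumptions chosen for this sketch: k groups, group sizes n_i, group means \bar{X}_i, group variances s_i^2, and grand mean \bar{X}_{GM}.

```latex
s_B^2 = \frac{\sum_{i=1}^{k} n_i \left(\bar{X}_i - \bar{X}_{GM}\right)^2}{k - 1},
\qquad
s_W^2 = \frac{\sum_{i=1}^{k} (n_i - 1)\, s_i^2}{\sum_{i=1}^{k} (n_i - 1)},
\qquad
F = \frac{s_B^2}{s_W^2}
```

When the population means are equal, both quantities estimate the same variance and F should be close to 1; a large F suggests that the group means differ by more than chance would explain.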
12-4 Analysis of Variance



The following hypotheses should be
used when testing for the difference
between three or more means.
H0: $\mu_1 = \mu_2 = \dots = \mu_k$
H1: At least one mean is different from
the others.
12-4 Analysis of Variance



d.f.N. = k – 1, where k is the number of
groups.
d.f.D. = N – k, where N is the sum of
the sample sizes of the groups.
Note: The formulas for this test are
tedious to work through, so examples
will be done in MINITAB. See text for
formulas.
12-4 Analysis of Variance - Example


A marketing specialist wishes to see
whether there is a difference in the
average time a customer has to wait in a
checkout line in three large self-service
department stores. The times (in
minutes) are shown on the next slide.
Is there a significant difference in the
mean waiting times of customers for each
store at α = 0.05?
12-4 Analysis of Variance - Example
Store A   Store B   Store C
3         5         1
2         8         3
5         9         4
6         6         2
3         2         7
1         5         3
12-4 Analysis of Variance - Example


Step 1: State the hypotheses and
identify the claim.
H0: $\mu_1 = \mu_2 = \mu_3$
H1: At least one mean is different
from the others (claim).
12-4 Analysis of Variance - Example


Step 2: Find the critical value. Since
k = 3, N = 18, and α = 0.05, d.f.N. = k – 1
= 3 – 1 = 2, d.f.D. = N – k = 18 – 3 = 15.
The critical value is 3.68.
Step 3: Compute the test value. From
the MINITAB output, F = 2.70. (See
your text for computations).
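As a cross-check on the MINITAB value, the F statistic can be computed directly. The sketch below is a plain-Python illustration, not the MINITAB procedure. The function name `one_way_anova` is hypothetical, and the per-store waiting times are an assumed reading of the data slide (Store A: 3, 2, 5, 6, 3, 1; Store B: 5, 8, 9, 6, 2, 5; Store C: 1, 3, 4, 2, 7, 3).

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, d.f.N., d.f.D.) for a list of samples, one list per group."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Between-group variance: spread of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    df_numerator = k - 1

    # Within-group variance: spread of each value around its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_denominator = n_total - k

    f_value = (ss_between / df_numerator) / (ss_within / df_denominator)
    return f_value, df_numerator, df_denominator


# Waiting times in minutes; the assignment of values to stores is an
# assumed reading of the data slide.
store_a = [3, 2, 5, 6, 3, 1]
store_b = [5, 8, 9, 6, 2, 5]
store_c = [1, 3, 4, 2, 7, 3]

f_value, df_n, df_d = one_way_anova([store_a, store_b, store_c])
print(f"F = {f_value:.2f}, d.f.N. = {df_n}, d.f.D. = {df_d}")
```

With this reading of the data, the result rounds to the F = 2.70 reported from MINITAB, with 2 and 15 degrees of freedom.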
12-4 Analysis of Variance - Example


Step 4: Make a decision. Since
2.70 < 3.68, the decision is not to reject
the null hypothesis.
Step 5: Summarize the results. There
is not enough evidence to support the
claim that there is a difference among
the means. The ANOVA summary table
is given on the next slide.
12-4 Analysis of Variance - Example
[ANOVA summary table from the MINITAB output; image not reproduced.]