Chapter 14 - Inference for Distributions of Categorical Variables

Chapter 14
Inference for Distributions of Categorical
Variables: Chi-Square Procedures
AP Statistics
Hamilton/Mann
CHAPTER 14 SECTION 1
Test for Goodness of Fit
HW: 14.1, 14.3, 14.6, 14.8
Test for Goodness of Fit
• Suppose you open up a 1.69-ounce bag of M&M’s
Milk Chocolate Candies and discover that out of 56
total M&M’s in the bag, there are only 2 red
M&M’s.
• Knowing that 13% of all plain M&M’s made by the
M&M/Mars Company are red, and that in our
sample, the proportion of reds is $\hat{p} = 2/56 \approx 0.036$, you
feel cheated out of some reds.
• You could use the z test described in Chapter 12 to
test the hypotheses
$H_0: p = 0.13$ versus $H_a: p < 0.13$, where p is the
proportion of reds.
Test for Goodness of Fit
• You could then perform additional tests of
significance on each of the other colors.
• This, however, would be inefficient. More
important, it wouldn’t tell us how likely it is that the
six sample proportions differ from the values stated
by M&M/Mars Company as much as our sample
does.
• There is a single test that can be applied to see if
the observed sample distribution is significantly
different in some way from the hypothesized
population distribution.
• This test is called the chi-square (χ2) test for
goodness of fit.
Auto Accidents and Cell Phones
• Are you more likely to have a motor vehicle collision
when using a cell phone? A study of 699 drivers
who were using a cell phone when they were
involved in a collision examined this question.
These drivers made 26,798 cell phone calls during a
14-month period. Each of the 699 collisions was
classified in various ways. Here are the counts for
each day of the week:
Day      Sun   Mon   Tues   Wed   Thurs   Fri   Sat   Total
Number    20   133    126   159     136   113    12     699
Auto Accidents and Cell Phones
• We have a total of 699 accidents involving drivers
who were using a cell phone at the time of their
accident. Let’s explore the relationship between
these accidents and the day of the week. Are the
accidents equally likely to occur on any day of the
week?
• We can think of this table of counts as a one-way
table with seven cells, each with a count of the
number of accidents that occurred on the particular
day of the week.
Auto Accidents and Cell Phones
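• Our question is translated into the following
hypotheses:
H0: Accidents involving cell phone use are equally likely to occur on each of the seven days of the week.
Ha: The accidents are not equally likely on each of the seven days of the week.
• We can also write it in terms of population
proportions.
H0: $p_{\text{Sun}} = p_{\text{Mon}} = \dots = p_{\text{Sat}} = 1/7$
Ha: At least one of the proportions differs from 1/7.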
Goodness of Fit Test
• The idea of a goodness of fit test is this: we compare
the observed counts for our sample with the counts
that would be expected if the null hypothesis were
true.
• The more the observed counts differ from the
expected counts, the more evidence we have to
reject H0 and conclude some of the proportions must
be different from those we had in the null hypothesis.
• In general, the expected count for any category of a
categorical variable is obtained by multiplying the
hypothesized proportion for that category by the sample size.
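As a rough illustration of this rule, the sketch below multiplies hypothesized proportions by the sample size. It assumes the 699 collisions from the cell phone example and an equal share of 1/7 per day under H0; the variable names are illustrative only.

```python
# Expected count = hypothesized proportion x sample size.
# Assumption: under H0, collisions are equally likely on each of the 7 days.
days = ["Sun", "Mon", "Tues", "Wed", "Thurs", "Fri", "Sat"]
n = 699                                # total collisions in the study
hypothesized = [1 / 7] * len(days)     # equal proportions under H0

expected = [p * n for p in hypothesized]

for day, e in zip(days, expected):
    print(f"{day:<6}{e:8.3f}")
```

With unequal hypothesized proportions (as in the genetics example later in the chapter), only the `hypothesized` list would change.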
Auto Accidents and Cell Phones
• Before proceeding with a significance test, it’s
always a good idea to plot the data. In this case, a
bar graph allows you to compare the observed
number of accidents by day with the expected
numbers of accidents by day. The counts as well as
the percents are given in the table below.
Day          Count   Percent   Cumulative Percent
Sunday          20      2.86                 2.86
Monday         133      19.0                21.86
Tuesday        126      18.0                39.86
Wednesday      159      22.7                62.56
Thursday       136      19.5                82.06
Friday         113      16.2                98.26
Saturday        12      1.72                99.98
Total          699     99.98
Auto Accidents and Cell Phones
• Notice the percents did not add to 100% because of
rounding error.
• Now, we need to calculate the expected counts for
each day. Since there are 699 accidents and we are
assuming that the probability of an accident is the
same for each day, the expected number of
accidents for each day is given by $699 \times \frac{1}{7} \approx 99.857$.
• The Bar Graph comparing the expected and
observed counts is on the next page.
Auto Accidents and Cell Phones
• Anything jump out at you from the bar graph?
Auto Accidents and Cell Phones
• To determine whether the distribution of accidents is
uniform, we need a way to measure how well the observed
counts (O) fit the expected counts (E) under H0. The
procedure is to calculate the quantity $(O - E)^2 / E$
for each day and then add up these
terms. The sum is denoted X2 and is
called the chi-square statistic.
• So let’s figure out what the chi-square
statistic is for our example.
Auto Accidents and Cell Phones
Day          Observed (O)   Expected (E)   (O - E)^2 / E
Sunday                 20         99.857          63.863
Monday                133         99.857          11.000
Tuesday               126         99.857           6.844
Wednesday             159         99.857          35.029
Thursday              136         99.857          13.082
Friday                113         99.857           1.730
Saturday               12         99.857          77.299
Total                 699                        208.847

For example, for Sunday: $\frac{(O - E)^2}{E} = \frac{(20 - 99.857)^2}{99.857} = 63.863$.

$X^2 = \sum \frac{(O - E)^2}{E} = 208.847$
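The arithmetic above can be checked with a few lines of Python. This is only a sketch of the hand calculation, assuming equal expected counts under H0; the dictionary and variable names are illustrative.

```python
# Observed collisions by day, taken from the table in the slides.
observed = {"Sun": 20, "Mon": 133, "Tues": 126, "Wed": 159,
            "Thurs": 136, "Fri": 113, "Sat": 12}

n = sum(observed.values())        # 699 collisions in total
expected = n / len(observed)      # same expected count for every day under H0

# One (O - E)^2 / E term per day, then add them up.
components = {day: (o - expected) ** 2 / expected
              for day, o in observed.items()}
chi_square = sum(components.values())

for day, term in components.items():
    print(f"{day:<6}{term:8.3f}")
print(f"X2 = {chi_square:.3f}")
```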
Chi-Square Distribution
• The larger the difference between the observed and
expected values, the larger X2 will be, and the more
evidence there will be against H0.
• The chi-square family of distribution curves is used
to assess the evidence against H0 represented in the
value of X2. The specific member of the family is
determined by the degrees of freedom.
• A χ2 distribution with k degrees of freedom is
written χ2(k).
• Since we are working not only with counts, but also
with percents, six of the seven percents are free to vary,
but the seventh is not since it would be determined
by the first six. So there are 6 degrees of freedom.
Chi-Square Distribution
• Degrees of freedom is one less than the number of
cells in the one-way table (not including the total
column).
• Table D shows a typical chi-square curve with the
right-tail area shaded.
• The chi-square test statistic is a point on the
horizontal axis, and the area to the right under the
curve is the P-value of the test.
• This P-value is the probability of observing a value
of X2 at least as extreme as the one observed.
• The larger the value of the chi-square statistic, the
smaller the P-value and the more evidence you
have against the null hypothesis H0.
Auto Accidents and Cell Phones
• Looking at the chi-square table (Table D), the critical
value for a right-tail probability of 0.0005 with 6 degrees
of freedom is 24.10.
Since our X2 = 208.847 is more extreme than the
24.10 in the table, the probability of observing a
result as extreme as the one we observed, by
chance alone, is less than 0.05%.
• So, there is sufficient evidence to reject H0 and
conclude that these types of accidents are not
equally likely to occur on each of the seven days of
the week.
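Software can give the right-tail area directly instead of bracketing it with a table. The sketch below uses a closed-form expression for the chi-square right-tail probability that holds only for 6 degrees of freedom; the function name is made up for this illustration and is not from any library.

```python
import math

def chi2_right_tail_df6(x):
    """P(chi-square with 6 df >= x). Closed form valid only for df = 6."""
    y = x / 2.0
    return math.exp(-y) * (1.0 + y + y * y / 2.0)

p_at_critical = chi2_right_tail_df6(24.10)    # table value for tail area 0.0005
p_value = chi2_right_tail_df6(208.847)        # our observed statistic

print(f"Tail area beyond 24.10:   {p_at_critical:.5f}")
print(f"Tail area beyond 208.847: {p_value:.3e}")
```

The first number should land very close to 0.0005, matching the table entry, and the second is far smaller, which agrees with the conclusion that the P-value is less than 0.05%.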
Test for Goodness of Fit
• The chi-square test applied to the hypothesis that a
categorical variable has a specified distribution is
called the test for goodness of fit.
• The idea is that the test assesses whether the
observed counts “fit” the hypothesized distribution.
• The details are on the next slide!
Conditions for Goodness of Fit Test
• Notice that we don’t want to divide by zero, and
since we are working with counts, we require that
all expected counts be greater than 1.
• In checking conditions, remember that it is the
expected counts, not the observed counts, that are
important.
• Also notice that no more than 20% of the cells (1 out
of 5) can have expected counts less than 5.
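The two conditions can be written as a small check. This is a sketch of the rule of thumb exactly as stated on this slide; the function name and the two made-up lists of expected counts are hypothetical.

```python
def expected_counts_ok(expected):
    """Rule of thumb from the slide: every expected count above 1,
    and at most 20% of the expected counts below 5."""
    every_count_above_1 = all(e > 1 for e in expected)
    share_below_5 = sum(1 for e in expected if e < 5) / len(expected)
    return every_count_above_1 and share_below_5 <= 0.20

accident_expected = [699 / 7] * 7        # expected counts from the accident example
hypothetical_ok = [3, 6, 7, 9, 12]       # made-up: only 1 of 5 cells is below 5
hypothetical_bad = [0.8, 6, 7, 9, 12]    # made-up: one cell is not above 1

print(expected_counts_ok(accident_expected))
print(expected_counts_ok(hypothetical_ok))
print(expected_counts_ok(hypothetical_bad))
```

Note that the check is applied to expected counts only; observed counts never enter it.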
Properties of the Chi-Square Distribution
• As the degrees of freedom increase, the density
curve becomes less skewed and larger values for X2
become more likely.
• Table D gives critical values for chi-square
distributions. To get P-values for a chi-square test,
you can use Table D, computer software, or your
calculator.
Properties of the Chi-Square Distribution
• The chi-square density curves have the following
properties:
1. The total area under a chi-square curve is equal to 1.
2. Each chi-square curve (except when degrees of
freedom = 1) begins at 0 on the horizontal axis,
increases to a peak, and then approaches the
horizontal axis asymptotically from above.
3. Each chi-square curve is skewed to the right. As the
number of degrees of freedom increases, the curve
becomes more and more symmetrical and looks more
like a Normal curve.
Properties of the Chi-Square Distribution
One Application of the Chi-Square
Goodness of Fit Test
• This is often used in the field of genetics.
• Scientists want to investigate the genetic
characteristics of offspring that result from mating
parents with known genetic makeups.
• Scientists use rules about dominant and recessive
genes to predict the ratio of offspring that will fall in
each possible genetic category.
• Then the researchers mate the parents and classify
the resulting offspring.
• The chi-square goodness of fit test helps the
scientists assess the validity of their hypothesized
ratios.
Red-eyed Fruit Flies
• Scientists wish to mate two fruit flies having genetic
makeup RrCc, indicating that each parent has one
dominant gene (R) and one recessive gene (r) for eye color,
along with one dominant (C) and one recessive (c)
gene for wing type.
• R is red-eyed and C is straight-winged.
• Each offspring receives one gene for each of the
two traits from each parent.
• Let’s create the Punnett Square to show the
possible genotypes the offspring could end up with.
Red-eyed Fruit Flies
                          Parent 2 Passes On
Parent 1 Passes On     RC      Rc      rC      rc
RC                     RRCC    RRCc    RrCC    RrCc
Rc                     RRCc    RRcc    RrCc    Rrcc
rC                     RrCC    RrCc    rrCC    rrCc
rc                     RrCc    Rrcc    rrCc    rrcc
• Based on what we see in the Punnett Square:
− Probability of red-eyed and straight-winged is 9 out of
16.
− Probability of red-eyed and curly-winged is 3 out of 16.
− Probability of white-eyed and straight-winged is 3 out of
16.
− Probability of white-eyed and curly-winged is 1 out of
16.
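The 9 : 3 : 3 : 1 ratio can be confirmed by enumerating the 16 cells of the Punnett square. The sketch below assumes, as in the slides, that R (red eyes) and C (straight wings) are dominant; the helper function name is illustrative.

```python
from collections import Counter
from itertools import product

# Each RrCc parent passes on one of these four gene combinations.
gametes = ["RC", "Rc", "rC", "rc"]

def phenotype(g1, g2):
    """Visible traits of an offspring that receives g1 and g2."""
    eye = "red-eyed" if "R" in (g1[0], g2[0]) else "white-eyed"
    wing = "straight-winged" if "C" in (g1[1], g2[1]) else "curly-winged"
    return f"{eye}, {wing}"

# Count phenotypes over all 4 x 4 equally likely combinations.
counts = Counter(phenotype(g1, g2) for g1, g2 in product(gametes, repeat=2))

for kind, count in counts.items():
    print(f"{kind:<28}{count} out of 16")
```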
Red-eyed Fruit Flies
• To test their hypothesis about the distribution of
offspring, the biologists mate the fruit flies.
• Of 200 offspring, 99 had red eyes and straight
wings, 42 had red eyes and curly wings, 49 had
white eyes and straight wings, and 10 had white
eyes and curly wings.
• Do these results differ significantly from what the
biologists would have predicted?
• We are going to use the inference toolbox to carry
out the significance test.
Red-eyed Fruit Flies
• Step 1 – Hypotheses – The biologists are interested
in the proportion of offspring that fall into each
genetic category for the population of all fruit flies
that would result from crossing two parents with
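genetic makeup RrCc. The hypotheses would be:
H0: $p_{\text{red, straight}} = 9/16$, $p_{\text{red, curly}} = 3/16$, $p_{\text{white, straight}} = 3/16$, $p_{\text{white, curly}} = 1/16$
Ha: At least one of these proportions is incorrect.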
• Step 2 – Conditions – We must run a chi-square
goodness of fit test. We must check expected
counts. Since all are greater than 5, we can
proceed.
Red-eyed Fruit Flies
• Step 3 – Calculations
Type                            Observed   Expected   (O - E)^2 / E
Red-eyed, straight-winged             99      112.5          1.62
Red-eyed, curly-winged                42       37.5          0.54
White-eyed, straight-winged           49       37.5          3.5267
White-eyed, curly-winged              10       12.5          0.5
X2 = 6.1867
– We have 3 degrees of freedom since there are 4
categories. Our X2 value is between 5.32 and 6.25 which
correspond to P-values of 0.15 and 0.10. (On the
calculator we get a P-value of 0.1029.)
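The whole Step 3 calculation can be sketched in Python. The right-tail formula used here is a closed form that holds only for 3 degrees of freedom; the function name is invented for this example and is not a library call.

```python
import math

observed = [99, 42, 49, 10]      # offspring counts from the study
ratios = [9, 3, 3, 1]            # hypothesized 9 : 3 : 3 : 1 ratio

n = sum(observed)                                   # 200 offspring
expected = [n * r / sum(ratios) for r in ratios]    # 112.5, 37.5, 37.5, 12.5
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1                              # 4 categories -> 3 df

def chi2_right_tail_df3(x):
    """P(chi-square with 3 df >= x). Closed form valid only for df = 3."""
    y = x / 2.0
    return math.erfc(math.sqrt(y)) + 2.0 * math.sqrt(y / math.pi) * math.exp(-y)

p_value = chi2_right_tail_df3(chi_square)
print(f"X2 = {chi_square:.4f}, df = {df}, P-value = {p_value:.4f}")
```

The result should agree with the calculator's P-value of about 0.1029 quoted above.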
Red-eyed Fruit Flies
• Step 4 – Interpretation
• Since the P-value (0.1029) is fairly large, observed
differences as large as we saw would happen
by chance fairly regularly. Therefore, there is not
sufficient evidence to reject the null hypothesis, so we
have no reason to doubt the scientists’ predicted distribution.
• We can also do this on the calculator.
1. Enter observed values in list 1.
2. Calculate expected values and enter in list 2.
3. Click Stat and go to Tests and choose D: χ2GOF-Test.
4. Tell the calculator where the observed and expected
counts are, input your degrees of freedom and
calculate. Read the results!
Follow-Up Analysis
• In the chi-square test for goodness of fit, we test
the null hypothesis that a categorical variable has a
specified distribution.
• If we find significance, we can conclude that our
variable has a distribution different from the
specified one. In this case, it’s a good idea to
determine which categories of the variable provide
the greatest differences between observed and
expected counts.
• To do this, look at the individual terms $(O - E)^2 / E$
that are added together to produce the test statistic
X2.
Follow-Up Analysis
• For the fruit flies example, the third category
(white-eyed, straight-winged) contributed the most
(3.5267) to the X2 statistic.
• For the days of the week that accidents occurred
due to cell phone use, Saturday (77.299) and
Sunday (63.863) contributed the most to the X2
statistic.
• The term that contributes the most is known as the largest
component of the chi-square statistic. This is the component whose
observed percentage is very different from the hypothesized
value.
CHAPTER 14 SECTION 2
Inference for Two-Way Tables
HW: 14.28, 29, 31, 32
Inference for Two-Way Tables
• Two-sample z procedures of Chapter 13 allow us to
compare the proportion of successes in two groups
(either two populations or two treatments in an
experiment).
• What if we want to compare more than two
groups?
• Then we need a new statistical test.
• This new test begins by presenting the data as a
two-way table.
• Two-way tables have more general uses than
comparing the proportions of successes in several
groups.
Inference for Two-Way Tables
• As we saw in Section 2 of Chapter 4, two-way tables
can be used to describe relationships between any
two categorical variables.
• The same test that compares several proportions
also tests whether the row and column variables
are related in any two-way table.
• We will start with the problem of comparing several
proportions.
Background Music and Purchasing Wine
• Market researchers know that background music
can influence the mood and purchasing behavior of
customers. One study in a supermarket in Northern
Ireland compared three treatments: no music,
French accordion music, and Italian string music.
Under each condition, the researchers recorded the
number of bottles of French, Italian, and other wine
purchased. Here is a table with the data
                      Music
Wine        None   French   Italian   Total
French        30       39        30      99
Italian       11        1        19      31
Other         43       35        35     113
Total         84       75        84     243
Background Music and Purchasing Wine
• The conditional distributions of types of wine sold
for each kind of music (this means that each music
column must total 100%) are given in the table below.
                      Music
Wine        None    French   Italian   Total
French      35.7%   52%      35.7%     40.7%
Italian     13.1%   1.3%     22.6%     12.8%
Other       51.2%   46.7%    41.7%     46.5%
Total       100%    100%     100%      100%
• This compares the distributions (percents) of the
types of wines sold based on the music condition.
• Just like earlier, now we want to look at bar graphs
comparing the different distributions.
Background Music and Purchasing Wine
• Do you notice anything?
Background Music and Purchasing Wine
• There appears to be an association between music
and the type of wine purchased.
• When no music is played, other wine is purchased
most often and very little Italian wine is purchased.
• When French music is played, more French wine is
sold and very little Italian.
• When Italian music is played, the percent of Italian
wine purchased increases dramatically.
• What if we instead looked at the conditional
distributions of types of music for each kind of wine
(this means that each wine type must total 100%)?
Background Music and Purchasing Wine
• The conditional distributions of types of music for
each kind of wine are given below.
                      Music
Wine        None    French   Italian   Total
French      30.3%   39.4%    30.3%     100%
Italian     35.5%   3.2%     61.3%     100%
Other       38.1%   31%      31%       100%
Total       34.6%   30.9%    34.6%     100%
• This shows that Italian wine is the wine most
affected by music type. The other two types of
wine have percentages that stay fairly consistent.
• Let’s look at these bar graphs.
Background Music and Purchasing Wine
• So, what do we notice?
Background Music and Purchasing Wine
• It appears that more French wine is sold when
French music is playing.
• Similarly for Italian wine and Italian music.
• It also appears that French music has a dramatic
negative impact on sales of Italian wine.
Problem of Multiple Comparisons
• The researchers expected the type of music being
played would influence sales, so type of music is the
explanatory variable and type of wine purchased is
the response variable.
• In general, the clearest way to describe this type of
relationship is to compare the conditional
distributions of the response variable for each value
of the explanatory variable. This means that the
explanatory variable’s rows or columns must add up
to 100%.
• In our case, that means that the music columns
must add up to 100%.
Problem of Multiple Comparisons
• To compare the three population distributions, we
might use chi-square goodness of fit procedures
several times and test the following hypotheses:
– H0: the distribution of wine types for no music is the
same as the distribution of wine types for French music
– H0: the distribution of wine types for no music is the
same as the distribution of wine types for Italian music
– H0: the distribution of wine types for French music is the
same as the distribution of wine types for Italian music
Problem of Multiple Comparisons
• The weakness of doing three tests is that we get
three results, one for each test alone. That doesn’t
tell us how likely it is that the three sample
distributions are as different as these if the
corresponding population distributions are the
same.
• It may be that the sample distributions are
significantly different if we look at just two groups,
but not significantly different if we look at two
other groups.
• We can’t safely compare many parameters by doing
tests or confidence intervals for two parameters at
a time.
Problem of Multiple Comparisons
• The problem of how to do many comparisons at
once with some overall measure of confidence in all
our conclusions is common in statistics.
• This is the problem of multiple comparisons.
• Statistical methods for dealing with multiple
comparisons usually have two parts:
1. An overall test to see if there is good evidence of any
differences among the parameters that we want to
compare.
2. A detailed follow-up analysis to decide which of the
parameters differ and to estimate how large the
differences are.
Problem of Multiple Comparisons
• The test we use is one we are now familiar with –
the chi-square test – but in this new setting it will
be used to compare several population proportions.
• The follow-up analysis can be quite elaborate.
Two-Way Tables
• The first step in the overall test for comparing
several population proportions is to arrange the
data in a two-way table that gives the counts for
both successes and failures.
• A table with r rows and c columns is called an r x c
table.
• The table shows the relationship between two
categorical variables.
• For our example, the type of music is the
explanatory variable and the type of wine
purchased is the response variable.
Stating Hypotheses
• We observe a clear relationship between music
type and wine sales for the 243 bottles sold during
the study.
• The chi-square test assesses whether this observed
association is statistically significant, that is, too
strong to occur often just by chance.
• The market researchers changed the background
music and took three samples of wine sales in three
distinct environments.
• Each column in the table represents one of these
samples, and each row a wine type.
Stating Hypotheses
• This is an example of separate and independent
random samples from each of c populations.
• The c columns of the two-way table represent the
populations.
• There is a single categorical response variable, wine
type.
• The r rows of the table correspond to the values of
the response variable.
• The r x c table allows us to compare more than two
populations, more than two categories of response,
or both.
Stating Hypotheses
• In this setting, the null hypothesis becomes:
H0: The distribution of the response variable is the
same in all c populations.
• Because the response variable is categorical, its
distribution just consists of the proportions of its r
possible values or categories.
• The null hypothesis says that these population
proportions are the same in all c populations.
Background Music and Purchasing Wine
• In our study, the three populations are:
– Population 1 – bottles of wine sold when no music is
playing
– Population 2 – bottles of wine sold when French music is
playing
– Population 3 – bottles of wine sold when Italian music is
playing
• We have three independent samples of sizes 84, 75,
and 84, with a separate sample being taken from
each population.
Background Music and Purchasing Wine
• The null hypothesis for the chi-square test is
H0: The proportions of each wine type sold are the
same in all three populations.
• The parameters of the model are the proportions of
the three types of wine that would be sold in each
of the three environments. There are three
proportions (for French wine, Italian wine and other
wine) for each environment.
Computing Expected Cell Counts
• Here is the formula for expected cell counts under
the null hypothesis
H0: The proportions of each wine type sold are the
same in all three populations.
$\text{expected count} = \dfrac{\text{row total} \times \text{column total}}{\text{table total}}$
• Let’s look at our example to see if we can make
some sense out of this!
Background Music and Purchasing Wine
                      Music
Wine        None   French   Italian   Total
French        30       39        30      99
Italian       11        1        19      31
Other         43       35        35     113
Total         84       75        84     243
• Let’s find the expected count for the cell in row 1
(French wine) and column 1 (no music). The
proportion of no music among all subjects in the
study is $84/243 \approx 0.3457$.
• Think of this as p, the overall proportion of no
music. If H0 is true, we expect (except for random
variation) this same proportion of no music in all
three groups.
Background Music and Purchasing Wine
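• So the expected count of no music among the 99
subjects who purchased French wine is $99 \times \frac{84}{243} \approx 34.222$.
• Notice this would be $\frac{\text{row 1 total} \times \text{column 1 total}}{\text{table total}} = \frac{99 \times 84}{243}$.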
• We can now find the remaining expected counts in
the same way. Results are summarized in the table
below.
Expected Counts for Music and Wine
                        Music
Wine        None     French   Italian   Total
French      34.222   30.556   34.222     99.000
Italian     10.716    9.568   10.716     31.000
Other       39.062   34.877   39.062    113.001
Total       84.000   75.001   84.000    243.001
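The table of expected counts can be rebuilt from the observed table with the row total times column total over table total rule. The sketch below is one possible way to organize that calculation; the nested-dictionary layout is an assumption made for readability.

```python
# Observed bottles sold: observed[wine][music], from the slides.
observed = {
    "French":  {"None": 30, "French": 39, "Italian": 30},
    "Italian": {"None": 11, "French": 1,  "Italian": 19},
    "Other":   {"None": 43, "French": 35, "Italian": 35},
}
music_types = ["None", "French", "Italian"]

row_totals = {wine: sum(row.values()) for wine, row in observed.items()}
col_totals = {m: sum(observed[w][m] for w in observed) for m in music_types}
table_total = sum(row_totals.values())

# Expected count = row total x column total / table total.
expected = {
    wine: {m: row_totals[wine] * col_totals[m] / table_total for m in music_types}
    for wine in observed
}

for wine, row in expected.items():
    print(f"{wine:<9}" + "".join(f"{row[m]:9.3f}" for m in music_types))
```

Unlike the rounded table above, these unrounded expected counts add back exactly to the observed row and column totals.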
The Chi-Square Test for Homogeneity of Populations
• This is an extension of the X2 statistic for one-way
tables from Section 14.1.
• In that special case, the table has only one row (r = 1) or only one column (c = 1).
The X2 Statistic and Its P-value
• All of the expected counts were large, so we can
proceed with the chi-square test.
• Just as we did with the goodness of fit test, we must
compare the table of observed counts with the
expected counts using the X2 statistic.
• We must calculate $(O - E)^2 / E$ for each cell and then
add up the contributions from each of the nine
cells.
• This is much too tedious to show the entire process,
so we will use the calculator to make it much
simpler.
The X2 Statistic and Its P-value
• As in the test for goodness of fit, think of the chi-square
statistic as a measure of the distance of the
observed counts from the expected counts.
• Like any distance, it is always zero or positive, and it
is only zero when the observed counts are exactly
equal to the expected counts.
• Large values of X2 are evidence against H0 because
they say that the observed counts are far from what
we would expect if H0 were true.
The X2 Statistic and Its P-value
• Although the alternative hypothesis Ha is many-sided,
the chi-square test is one-sided because any
violation of H0 tends to produce a large value of X2.
• Small values of X2 are not evidence against H0.
• The same chi-square procedure we used to test
goodness of fit allows us to compare the
distribution of proportions in several populations,
provided that we take separate and independent
samples from each population.
Chi-Square Test
• The chi-square test, like the z procedure for
comparing two proportions, is an approximate
method that becomes more accurate as the counts
in the cells of the table get larger.
• Fortunately, the approximation is accurate for quite
modest counts. Here is a practical guideline.
• Now let’s run the inference procedure.
Background Music and Purchasing Wine
• Step 1: Populations and Parameters
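– We want to use X2 to compare the distributions of types
of wine selected for each type of music. Our hypotheses
are
H0: The distributions of wine types purchased are the same for all three types of music.
Ha: The distributions of wine types purchased are not all the same.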
• Step 2: Conditions
– To use the chi-square test for homogeneity of
populations:
• The data must come from independent SRSs from the
populations of interest. We are willing to treat the subjects in
the three groups as SRSs from their respective populations.
• All expected cell counts must be greater than 1, and no more
than 20% of them may be less than 5. We meet this condition as well.
Background Music and Purchasing Wine
• Step 3 – Calculations
– The test statistic is $X^2 = \sum \frac{(O - E)^2}{E} = 18.28$, where the sum is over all nine cells.
– Because there are r = 3 types of wine and c = 3 types of music, the
degrees of freedom are given by $(r - 1)(c - 1) = (3 - 1)(3 - 1) = 4$.
– Now we can look up the P-value in the table or on the calculator.
– The X2 value of 18.28 with df = 4 is located between 16.42 and
18.47, which correspond to P-values of 0.0025 and 0.001. So our
P-value is between 0.001 and 0.0025.
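For readers who want to check Step 3 without a calculator, here is a sketch that recomputes the statistic, the degrees of freedom, and the P-value from the observed table. The right-tail formula is a closed form that holds only for 4 degrees of freedom, and the list-of-lists layout is just one convenient choice.

```python
import math

# Rows: French, Italian, Other wine. Columns: no, French, Italian music.
table = [
    [30, 39, 30],
    [11, 1, 19],
    [43, 35, 35],
]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(table):
    for j, obs in enumerate(row):
        exp = row_totals[i] * col_totals[j] / total   # expected count for this cell
        chi_square += (obs - exp) ** 2 / exp

df = (len(table) - 1) * (len(table[0]) - 1)           # (r - 1)(c - 1)

# Right-tail area for a chi-square distribution; this form is valid only for df = 4.
y = chi_square / 2.0
p_value = math.exp(-y) * (1.0 + y)

print(f"X2 = {chi_square:.2f}, df = {df}, P-value = {p_value:.4f}")
```

The computed P-value should fall inside the interval found from the table, between 0.001 and 0.0025.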
Background Music and Purchasing Wine
• Step 4 – Interpretation
– Because the expected cell counts are all large, the P-value
associated with the test will be quite accurate.
There is strong evidence to reject H0 (X2 = 18.28, df = 4,
p < 0.0025) and conclude that the type of music being
played has a significant effect on sales.
Performing the Chi-Square Test with Technology
• Calculating expected counts and the chi-square
statistic by hand is a bit time-consuming.
• As usual, computer software saves time and always
gets the arithmetic right.
• Our calculator can also run the test for us.
• The book describes how to use the calculator to run
the test on p. 863 - 864.
• Let’s look at running the test now! We’ll do the
example we just finished!
Follow-up Analysis
• The chi-square test is the overall test for comparing
any number of population proportions.
• If the test allows us to reject the null hypothesis
that all proportions are equal, we then do a follow-up
analysis that examines the differences in detail.
• We don’t get into doing a follow-up analysis, but
you should look at the data to see what specific
effects they suggest.
Background Music and Purchasing Wine
• Follow-up Analysis
1. We can look at the row and column percents we created
earlier. They show that there are major differences in wine
type purchased for each type of music played. Italian wine
sells very poorly when French music is played. French wine
sells well across the board, but particularly well when
French music is played.
2. You can also compare the observed and expected counts.
3. Look at the contributions from each component of X2. The
largest components show which cells contribute the most
to X2. For our example, sales of Italian wine are much
below what is expected when French music is playing and
well above expectation when Italian music is playing. Let’s
look on our calculator.
Background Music and Purchasing Wine
• The test we ran confirms only that there is some
relationship. The percents we have compared
describe the nature of the relationship.
• The chi-square test itself does not tell us what
population our conclusion describes.
• If the study was done in one market on a Saturday,
the results may apply only to Saturday shoppers at
this market.
• So we have a limited Scope of Inference!
The Chi-Square Test and the Z Test
• We can use the chi-square test to compare any
number of proportions.
• If we are comparing r proportions and make the
columns of the table “success” and “failure,” the
counts form an r x 2 table. P-values come from the
chi-square distribution with r – 1 degrees of
freedom.
• If r = 2, we are comparing just two proportions. We
have two ways to do this: the z test from Section
13.2 and the chi-square test with 1 degree of
freedom for a 2 x 2 table.
The Chi-Square Test and the Z Test
• These two tests always agree.
• In fact, the chi-square statistic is just the square of
the z statistic, and the P-value for X2 is exactly the
same as the two-sided P-value for z.
• The recommendation would be to use a z test to
compare two proportions because it gives you the
option of a one-sided test and is related to a
confidence interval for p1 – p2.
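The agreement between the two tests is easy to verify numerically. The sketch below uses made-up counts (45 successes out of 100 versus 30 out of 100) and assumes the z test uses the pooled sample proportion in its standard error; none of these numbers come from the slides.

```python
import math

# Hypothetical data: successes and sample sizes for two groups.
successes = [45, 30]
sizes = [100, 100]

# Two-proportion z statistic with the pooled sample proportion.
p1, p2 = successes[0] / sizes[0], successes[1] / sizes[1]
pooled = sum(successes) / sum(sizes)
z = (p1 - p2) / math.sqrt(pooled * (1 - pooled) * (1 / sizes[0] + 1 / sizes[1]))

# Chi-square statistic for the 2 x 2 table of successes and failures.
table = [[s, n - s] for s, n in zip(successes, sizes)]
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
total = sum(row_totals)
chi_square = sum(
    (table[i][j] - row_totals[i] * col_totals[j] / total) ** 2
    / (row_totals[i] * col_totals[j] / total)
    for i in range(2) for j in range(2)
)

# Two-sided Normal P-value and chi-square (1 df) right-tail P-value.
p_z = math.erfc(abs(z) / math.sqrt(2))
p_chi = math.erfc(math.sqrt(chi_square / 2))

print(f"z^2 = {z ** 2:.4f}, X2 = {chi_square:.4f}")
print(f"two-sided z P-value = {p_z:.4f}, chi-square P-value = {p_chi:.4f}")
```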
The Chi-Square Test of Association/Independence
• Two-way tables can arise in several ways.
• The music and wine study is an experiment that
compared three music treatments using separate
and independent samples. Each group is a sample
from a separate population corresponding to a
separate treatment. The study fixes the size of each
sample in advance, and the data record which of
the three outcomes (type of wine purchased)
occurred for each category of the explanatory
variable (type of music). The null hypothesis takes
on the form of “equal proportions for each type of
wine among the three populations.”
• There is also another type of setting.
Franchises that Succeed
• Franchises, such as McDonald’s, establish contracts
with the entrepreneurs who run them. One clause
that these contracts could contain is a right to an
exclusive territory, meaning no other franchise will
open within a certain number of miles.
• The question we have is how does the presence of
an exclusive-territory clause in the contract relate to
the survival of the business?
• A study designed to answer this question collected
data from a sample of 170 new franchise firms.
Franchises that Succeed
• Two categorical variables were measured for each
firm.
1. The firm was classified as successful or not based on
whether it was still franchising as of a certain date.
2. The contract each firm offered to franchisees was
classified according to whether or not there was an
exclusive-territory clause.
• The count data are arranged in the two-way table
below.
Observed Number of Firms
              Exclusive Territory
Success     Yes     No    Total
Yes         108     15      123
No           34     13       47
Total       142     28      170
Franchises that Succeed
• The two categorical variables in this example are
“success,” with values of “Yes” and “No,” and
“exclusive territory,” with values of “Yes” and “No.”
• The objective is to compare franchises that have
exclusive territories with those that do not.
• We view “exclusive territory” as the explanatory
variable and make it the column variable.
• So “success” is the response variable and is the row
variable.
Franchises that Succeed
• This two way table does not compare several
populations.
• Instead, it arises by classifying observations from a
single population in two ways: by exclusive territory
and success.
• Both of these variables have two levels, so a statement
of the null hypothesis is
• H0: There is no association between exclusive territory
and success.
Observed Number of Firms
              Exclusive Territory
Success     Yes     No    Total
Yes         108     15      123
No           34     13       47
Total       142     28      170
Franchises that Succeed
• This setting is very different from a comparison of
several population proportions.
• Even so, we can apply a chi-square test.
• In general, a chi-square test tests the hypothesis,
“the row and column variables are not related to
each other” whenever this hypothesis makes sense
for a two-way table.
• Notice that here we have a single SRS from a single
population.
• When we tested for homogeneity, we had multiple
SRSs from several different populations.
• This is how you distinguish between a test for
homogeneity and association/independence.
The Chi-Square Test of Association/Independence
• We start our analysis by computing descriptive
statistics that summarize the observed relation
between the two categorical variables.
• We describe a relationship between categorical
variables by comparing percents.
• Once again, we always want to look at the effect
that the explanatory variable has on the response
variable. So the explanatory variable should always
add up to 100% when we compute the conditional
distributions.
Franchises that Succeed
Observed Number of Firms
              Exclusive Territory
Success     Yes     No    Total
Yes         108     15      123
No           34     13       47
Total       142     28      170

Conditional Distribution by Exclusive Territory
              Exclusive Territory
Success     Yes     No
Yes         76%     54%
No          24%     46%
Total       100%    100%
• The total row reminds us that every firm (whether
exclusive territory or not) was classified as
successful or not successful.
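As a rough check on the conditional distributions above, the column percents can be recomputed from the observed counts. This is a minimal sketch in plain Python; the variable names are illustrative and do not come from the slides.

```python
# Observed counts from the franchise table.
# Rows: success (yes, no). Columns: exclusive territory (yes, no).
observed = [[108, 15],
            [34, 13]]

# Column totals: number of firms at each level of the explanatory variable.
col_totals = [sum(col) for col in zip(*observed)]

# Conditional distribution of success within each column, as percents.
col_percents = [[100 * cell / total for cell, total in zip(row, col_totals)]
                for row in observed]

for label, row in zip(["Success: yes", "Success: no"], col_percents):
    print(label, [f"{p:.0f}%" for p in row])
```

Each column of percents sums to 100% because the explanatory variable (exclusive territory) defines the groups being compared.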
Franchises that Succeed
• The bar graph to the right compares
the percent of each type of firm that is
successful.
• What do you notice? Can we conclude
anything?
• Is there any need to compare the bar
graphs for the percents that are not
successful?
Franchises that Succeed
• To think about the problem another way:
– Among franchises that had exclusive territories, the
number of successful firms was about 3 times the
number of unsuccessful firms.
– Among franchises without exclusive territories, only
about half were successful.
• Therefore, it appears there is reason to believe that
firms with an exclusive territory are more likely to
be successful.
• Is this difference statistically significant, or did it just
happen by chance when there really is no association
between exclusive territory and success?
This is output from
CrunchIt! for our
exclusive territory
problem. What would
we conclude?
Chi-Square Test of Association/Independence
• The chi-square test of association/independence
assesses whether this observed association is
statistically significant.
• That is, is the exclusive territory-success
relationship in the sample sufficiently strong for us
to conclude that it is due to a relationship between
those two variables in the underlying population
and not merely to chance?
• Note that the test asks only whether there is
evidence of some relationship.
Chi-Square Test of Association/Independence
• To explore the direction or nature of the
relationship we must examine the column or row
percents.
• Note also that in using the chi-square test we are
acting as if the subjects were an SRS from a single
population of interest.
• If the franchises are a biased sample – for example,
if unsuccessful franchises are reluctant to
participate in the study – then conclusions about
the entire population of franchises are not justified.
Computing Expected Cell Counts
• The null hypothesis is that there is no relationship
between exclusive territory and success in the
population.
• The alternative is that these two variables are
related.
• If we assume that the null hypothesis is true, then
success and exclusive territory are independent.
• Therefore, we can find the expected cell counts
using the multiplication rule for independent
events.
Franchises that Succeed
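• What is the expected cell count for the cell
corresponding to the event “successful” and
“exclusive territory”?
• If success and exclusive territory are independent, then
$P(\text{success and exclusive}) = P(\text{success}) \times P(\text{exclusive}) = \frac{123}{170} \times \frac{142}{170}$.
• The expected count of “success and exclusive”
would then be the probability of “success and
exclusive” times the total number of franchises in
the sample: $170 \times \frac{123}{170} \times \frac{142}{170} = \frac{123 \times 142}{170} \approx 102.74$.
• This is the same formula we had before:
expected count = (row total × column total) / table total.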
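The multiplication-rule argument above can be sketched in a few lines of Python. This is only an illustration under the null hypothesis of independence; the variable names are assumptions and not part of the slides.

```python
# Observed counts. Rows: success (yes, no). Columns: exclusive territory (yes, no).
observed = [[108, 15],
            [34, 13]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Under independence, P(row and column) = P(row) * P(column), so
# expected count = n * (row total / n) * (column total / n)
#                = row total * column total / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

for row in expected:
    print([round(e, 2) for e in row])
```

The expected counts keep the same row and column totals as the observed table, which is a quick way to catch arithmetic slips.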
Franchises that Succeed
• So let’s complete the table of expected counts.
Observed Number of Firms
               Exclusive Territory
Success      Yes      No      Total
Yes          108      15      123
No            34      13       47
Total        142      28      170
Expected Number of Firms
               Exclusive Territory
Success      Yes       No       Total
Yes          102.74    20.26    123
No            39.26     7.74     47
Total        142       28       170
Performing the χ2 Test
• We are now ready to carry out the chi-square
procedure for the success and exclusive-territory
data, following the steps in the inference toolbox.
• Step 1: Hypotheses
– H0: Success and exclusive territory are independent.
– Ha: Success and exclusive territory are dependent.
• Or
– H0: There is no association between success and
exclusive territory.
– Ha: There is an association between success and
exclusive territory.
Performing the χ2 Test
• Step 2: Conditions
– To use the chi-square test of association/independence,
we must check that all expected cell counts are at least
one and that no more than 20% of expected cell counts
are less than 5. From the expected counts table we just
created, the smallest expected count is 7.74, so both
conditions are met.
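The expected-count condition can also be checked mechanically. The helper below is a hypothetical sketch (the name `expected_counts_ok` is not from the slides) that applies the two rules stated above to any table of expected counts.

```python
def expected_counts_ok(expected):
    """Return True if all expected counts are at least 1 and
    no more than 20% of them are less than 5."""
    cells = [e for row in expected for e in row]
    all_at_least_one = all(e >= 1 for e in cells)
    share_below_five = sum(e < 5 for e in cells) / len(cells)
    return all_at_least_one and share_below_five <= 0.20


# Expected counts for the franchise data, taken from the table above.
franchise_expected = [[102.74, 20.26],
                      [39.26, 7.74]]
print(expected_counts_ok(franchise_expected))
```

Note that in a 2 × 2 table a single cell below 5 is already 25% of the cells, so in practice all four expected counts must be at least 5.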
Performing the χ2 Test
• Step 3: Calculations
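– The test statistic is
$\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}} = \frac{(108 - 102.74)^2}{102.74} + \frac{(15 - 20.26)^2}{20.26} + \frac{(34 - 39.26)^2}{39.26} + \frac{(13 - 7.74)^2}{7.74} \approx 5.91$.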
– Since there are r = 2 categories of the row variable and
c = 2 categories of the column variable, we have
df = (r – 1)(c – 1) = (2 – 1)(2 – 1) = 1.
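For readers who want to reproduce the calculation without statistical software, here is a minimal sketch in plain Python. It sums (observed – expected)² / expected over the four cells. The p-value line relies on an identity that holds only for df = 1: the upper-tail chi-square probability equals erfc(√(x/2)). Variable names are illustrative.

```python
import math

# Observed counts. Rows: success (yes, no). Columns: exclusive territory (yes, no).
observed = [[108, 15],
            [34, 13]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Chi-square statistic: sum over all cells of (O - E)^2 / E.
chi_square = sum((o - e) ** 2 / e
                 for obs_row, exp_row in zip(observed, expected)
                 for o, e in zip(obs_row, exp_row))

# Degrees of freedom: (r - 1)(c - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Valid for df = 1 only: P(chi-square > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_square / 2))

print(f"chi-square = {chi_square:.2f}, df = {df}, p-value = {p_value:.4f}")
```

For tables with more than one degree of freedom the erfc shortcut does not apply; a chi-square table or statistical software is needed instead.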
Performing the χ2 Test
• Step 4: Interpretation
– Since our p-value (0.0150) is less than the standard
significance levels of 0.05 and 0.10, we reject H0.
Therefore, there is sufficient evidence to believe that
there is an association between success and exclusive
territory. Of course, this association does not show that
exclusive territory causes success.
Conclusion
• To distinguish between the two types of chi-square
tests for two-way tables, you must examine the
design of the study.
• In the test for association/independence, there is a
single sample from a single population. The
individuals in the sample are classified according to
two categorical variables.
• In the test for homogeneity of populations, there is
a sample from each of two or more populations.
Each individual is classified based on a single
categorical variable.
• The hypotheses differ based on the test.
                       Music
Wine       None     French   Italian  Total
French     35.7%    52%      35.7%    40.7%
Italian    13.1%    1.3%     22.6%    12.8%
Other      51.2%    46.7%    41.7%    46.5%
Total      100%     100%     100%     100%

                       Music
Wine       None     French   Italian  Total
French     30.3%    39.4%    30.3%    100%
Italian    35.5%    3.2%     61.3%    100%
Other      38.1%    31%      31%      100%
Total      34.6%    30.9%    34.6%    100%
Observed Counts for Music and Wine
                       Music
Wine       None     French   Italian  Total
French     30       39       30       99
Italian    11       1        19       31
Other      43       35       35       113
Total      84       75       84       243
Expected Counts for Music and Wine
                       Music
Wine       None      French    Italian   Total
French     34.222    30.556    34.222    99.000
Italian    10.716    9.568     10.716    31.000
Other      39.062    34.877    39.062    113.001
Total      84.000    75.001    84.000    243.001