Elementary Statistics 12e

11-3 Contingency Tables
In this section we consider contingency tables (or two-way
frequency tables), which include frequency counts for
categorical data arranged in a table with at least two rows and
at least two columns.
We present a method for testing the claim that the row and
column variables are independent of each other.
Basic Concepts of Testing for Independence
A contingency table (or two-way frequency table) is a table in
which frequencies correspond to two variables.
(One variable is used to categorize rows, and a second
variable is used to categorize columns.)
Contingency tables have at least two rows and at least two
columns.
Example
Below is a contingency table summarizing the results of foot
procedures as a success or failure based on different treatments.
Definition
Test of Independence
A test of independence tests the null hypothesis that in a
contingency table, the row and column variables are
independent.
Notation
O represents the observed frequency in a cell of a
contingency table.
E represents the expected frequency in a cell, found by
assuming that the row and column variables are
independent.
r represents the number of rows in a contingency table (not
including labels).
c represents the number of columns in a contingency table
(not including labels).
Hypotheses and Test Statistic
H0: The row and column variables are independent.
H1: The row and column variables are dependent.
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
$$E = \frac{(\text{row total})(\text{column total})}{(\text{grand total})}$$
O is the observed frequency in a cell and E is the expected frequency in a cell.
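The two formulas can be sketched in Python as follows. This is only an illustration, not part of the textbook: the function names `expected_frequencies` and `chi_square_statistic` and the small 2 x 2 table of counts are hypothetical, chosen to show the arithmetic.

```python
def expected_frequencies(observed):
    """Return E = (row total)(column total) / (grand total) for every cell."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_statistic(observed):
    """Return the sum of (O - E)^2 / E over all cells of the table."""
    expected = expected_frequencies(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Hypothetical counts (not from the textbook), used only to show the arithmetic.
toy_table = [[10, 20],
             [30, 40]]
toy_expected = expected_frequencies(toy_table)
toy_chi2 = chi_square_statistic(toy_table)
print(toy_expected)
print(round(toy_chi2, 3))
```

Each expected frequency depends only on the row, column, and grand totals, which is why the table's margins are computed first.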
Critical Values
1. Found in Table A-4 using
degrees of freedom = (r – 1)(c – 1)
r is the number of rows and c is the number of
columns
2. Tests of Independence are always right-tailed.
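When Table A-4 is not at hand, the same right-tailed critical value can be approximated numerically. The sketch below uses only the Python standard library; the function names are hypothetical, and the series expansion for the chi-square cumulative area combined with bisection is one possible approach, not the method used by the textbook. The 4-row, 2-column table is an illustrative choice.

```python
import math


def degrees_of_freedom(r, c):
    """Degrees of freedom for a table with r rows and c columns."""
    return (r - 1) * (c - 1)


def chi_square_cdf(x, df):
    """Area to the left of x under the chi-square curve (series expansion)."""
    if x <= 0:
        return 0.0
    a, y = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    # Add series terms until they no longer change the running total.
    while term > 1e-16 * total:
        term *= y / (a + n)
        total += term
        n += 1
    return total * math.exp(a * math.log(y) - y - math.lgamma(a))


def right_tail_critical_value(alpha, df):
    """Find x whose right-tail area is alpha, by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    while chi_square_cdf(hi, df) < 1.0 - alpha:
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi_square_cdf(mid, df) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


df = degrees_of_freedom(4, 2)  # illustrative table: 4 rows, 2 columns
critical_value = right_tail_critical_value(0.05, df)
print(df, round(critical_value, 3))
```

Because the test is right-tailed, the whole significance level sits in the upper tail, so the search targets a cumulative area of 1 - α.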
Expected Frequencies
Referring back to slide 3, the observed frequency is 54
successful surgeries.
The expected frequency is calculated using the first row
total of 66, the first column total of 182, and the grand
total of 253.
$$E = \frac{(\text{row total})(\text{column total})}{(\text{grand total})} = \frac{(66)(182)}{253} = 47.478$$
Example
Does it appear that the choice of treatment affects the
success of the treatment for the foot procedures?
Use a 0.05 level of significance to test the claim that
success is independent of treatment group.
Example - Continued
Requirement Check:
1. Based on the study, we will treat the subjects as
being randomly selected and randomly assigned to
the different treatment groups.
2. The results are expressed in frequency counts.
3. The expected frequencies are all over 5.
The requirements are all satisfied.
Example - Continued
The hypotheses are:
H0: Success is independent of the treatment.
H1: Success and the treatment are dependent.
The significance level is α = 0.05.
Example - Continued
We use a χ² distribution with this test statistic:
$$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(54 - 47.478)^2}{47.478} + \cdots + \frac{(5 - 6.174)^2}{6.174} = 58.393$$
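The full sum can be reproduced with a short Python sketch. The original table is not reproduced in this text, so the observed counts below are an assumption: they were reconstructed from the figures the slides do quote (row total 66, column total 182, grand total 253, the observed counts 54 and 5, and the four success rates). Treat them as a plausible reconstruction rather than the textbook's table.

```python
# Observed counts as [successes, failures] per treatment row.
# ASSUMPTION: reconstructed from the totals and success rates quoted in the
# slides, not copied from the original table.
observed = [
    [54, 12],
    [41, 51],
    [70, 3],
    [17, 5],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# E = (row total)(column total) / (grand total) for every cell.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Test statistic: sum of (O - E)^2 / E over all eight cells.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Requirement check: every expected frequency should be over 5.
smallest_expected = min(min(row) for row in expected)

print(round(chi_square, 3))
print(round(smallest_expected, 3))
```

With these reconstructed counts the margins match the quoted totals, and the first and last expected frequencies come out to the 47.478 and 6.174 shown in the formula.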
Example - Continued
Critical Value: The critical value of χ² = 7.815 is found in
Table A-4 with α = 0.05 and degrees of freedom of
(r – 1)(c – 1) = (4 – 1)(2 – 1) = 3.
Because the test statistic of 58.393 exceeds the critical
value of 7.815, it falls in the critical region, so
we reject the null hypothesis.
A graphic of the chi-square distribution is on the next
slide.
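As a cross-check on the decision, the right-tail areas can be computed directly. The sketch below relies on a closed-form expression that holds only for 3 degrees of freedom; the function name is hypothetical, and the slides themselves use the critical value method rather than this calculation.

```python
import math


def right_tail_area_df3(x):
    """Area to the right of x under the chi-square curve with 3 degrees of freedom."""
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))


alpha = 0.05
critical_value = 7.815   # from Table A-4 with 3 degrees of freedom
test_statistic = 58.393  # from the example

area_beyond_critical = right_tail_area_df3(critical_value)
area_beyond_statistic = right_tail_area_df3(test_statistic)
reject_null = test_statistic > critical_value

print(round(area_beyond_critical, 3))
print(area_beyond_statistic < alpha, reject_null)
```

The area beyond the critical value comes out close to α, and the area beyond the test statistic is far smaller, which agrees with rejecting the null hypothesis.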
Example - Continued
[Graphic: chi-square distribution]
Example - Continued
Interpretation:
It appears that success is dependent on the treatment.
Although this test does not tell us which treatment is
best, we can see that the success rates of 81.8%, 44.6%,
95.9%, and 77.3% suggest that the best treatment is to
use a non-weight-bearing cast for 6 weeks.
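The quoted success rates can be recomputed with a brief Python sketch. The counts are an assumption, reconstructed from the totals and rates quoted in the slides, and the labels for the second and fourth rows are placeholders because the original table is not reproduced in this text.

```python
# (successes, total patients) per treatment row.
# ASSUMPTION: counts reconstructed from the quoted totals and success rates;
# the labels for rows 2 and 4 are placeholders, not the textbook's names.
treatments = {
    "Surgery": (54, 66),
    "Treatment 2 (label not given)": (41, 92),
    "Non-weight-bearing cast for 6 weeks": (70, 73),
    "Treatment 4 (label not given)": (17, 22),
}

success_rates = {
    name: 100.0 * successes / total
    for name, (successes, total) in treatments.items()
}
best_treatment = max(success_rates, key=success_rates.get)

for name, rate in success_rates.items():
    print(f"{name}: {rate:.1f}%")
print("Highest success rate:", best_treatment)
```

Comparing the rates only describes the sample; the test of independence is what indicates that the differences are unlikely to be due to chance.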
Relationships Among Key Components in Test of Independence