Hypothesis Testing IV
(Chi Square)
Basic Logic
Chi Square is a test of significance based on bivariate tables.
We are looking for significant differences between the actual cell frequencies in a table (f_o) and those that would be expected by random chance (f_e).
The relationship of homicide rate and gun sales
                   Low homicide   High homicide   Totals
Low gun sales            8              5           13
High gun sales           4              8           12
Totals                  12             13           25
Tables
Notice the following about these tables:
1. The table must have a title.
2. The independent variable must go into the columns and, if percentaged, must be percentaged within columns.
3. Subtotals are called marginals.
4. N is reported at the intersection of the row and column marginals.
Tables
Title
Rows     Column 1            Column 2
Row 1    cell a              cell b              Row Marginal 1
Row 2    cell c              cell d              Row Marginal 2
         Column Marginal 1   Column Marginal 2   N
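As a minimal sketch of how the marginals and N follow from the four cells, the snippet below uses the observed counts from the homicide rate and gun sales table. The variable names are illustrative assumptions, not part of the original slides.

```python
# Observed counts: rows are gun sales (low, high),
# columns are homicide rate (low, high).
observed = [
    [8, 5],  # low gun sales
    [4, 8],  # high gun sales
]

# Row marginals: sum across each row.
row_marginals = [sum(row) for row in observed]

# Column marginals: sum down each column.
column_marginals = [sum(col) for col in zip(*observed)]

# N sits at the intersection of the row and column marginals.
n = sum(row_marginals)

print(row_marginals)     # [13, 12]
print(column_marginals)  # [12, 13]
print(n)                 # 25
```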
Example of Computation
Problem 11.2
Are the homicide rate and volume of gun sales related for a sample of 25 cities?
Example of Computation
The bivariate table shows the relationship between homicide rate (columns) and gun sales (rows). This 2x2 table has 4 cells.
                   Low homicide   High homicide   Totals
Low gun sales            8              5           13
High gun sales           4              8           12
Totals                  12             13           25
Example of Computation
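Use Formula 11.2 to find f_e.
For each cell, multiply its row marginal by its column marginal and divide by N.
For Problem 11.2:
Low gun sales, low homicide:    (13*12)/25 = 156/25 = 6.24
Low gun sales, high homicide:   (13*13)/25 = 169/25 = 6.76
High gun sales, low homicide:   (12*12)/25 = 144/25 = 5.76
High gun sales, high homicide:  (12*13)/25 = 156/25 = 6.24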
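The same arithmetic can be sketched in a few lines of Python. This is only an illustration of the row marginal times column marginal over N rule; the names `observed` and `expected` are assumptions made for the example.

```python
# Observed counts: rows are gun sales (low, high),
# columns are homicide rate (low, high).
observed = [
    [8, 5],
    [4, 8],
]

row_marginals = [sum(row) for row in observed]           # 13 and 12
column_marginals = [sum(col) for col in zip(*observed)]  # 12 and 13
n = sum(row_marginals)                                   # 25

# Expected frequency for each cell:
# (row marginal * column marginal) / N
expected = [
    [r * c / n for c in column_marginals]
    for r in row_marginals
]

print([[round(fe, 2) for fe in row] for row in expected])
# [[6.24, 6.76], [5.76, 6.24]]
```

Note that the expected frequencies have the same marginals as the observed frequencies, which is a quick check on the arithmetic.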
Example of Computation
Expected frequencies:
                   Low homicide   High homicide   Totals
Low gun sales          6.24           6.76          13
High gun sales         5.76           6.24          12
Totals                12             13             25
Example of Computation
A computational table helps organize the computations.
f_o     f_e      f_o - f_e    (f_o - f_e)²    (f_o - f_e)²/f_e
 8      6.24
 5      6.76
 4      5.76
 8      6.24
25     25
Example of Computation
Subtract each f_e from each f_o. The total of this column must be zero.
f_o     f_e      f_o - f_e    (f_o - f_e)²    (f_o - f_e)²/f_e
 8      6.24       1.76
 5      6.76      -1.76
 4      5.76      -1.76
 8      6.24       1.76
25     25          0
Example of Computation
Square each of these values.
f_o     f_e      f_o - f_e    (f_o - f_e)²    (f_o - f_e)²/f_e
 8      6.24       1.76          3.10
 5      6.76      -1.76          3.10
 4      5.76      -1.76          3.10
 8      6.24       1.76          3.10
25     25          0
Example of Computation
Divide each of the squared values by the f_e for that cell. The sum of this column is chi square.
f_o     f_e      f_o - f_e    (f_o - f_e)²    (f_o - f_e)²/f_e
 8      6.24       1.76          3.10             .50
 5      6.76      -1.76          3.10             .46
 4      5.76      -1.76          3.10             .54
 8      6.24       1.76          3.10             .50
25     25          0                          χ² = 2.00
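The computational table can be reproduced with a short Python sketch. The function name `chi_square` is an assumption for this example. Because the code does not round intermediate values, it returns about 1.99 rather than the 2.00 obtained by summing the rounded column entries.

```python
def chi_square(observed):
    """Return obtained chi square for a table of observed counts."""
    row_marginals = [sum(row) for row in observed]
    column_marginals = [sum(col) for col in zip(*observed)]
    n = sum(row_marginals)

    total = 0.0
    for i, row in enumerate(observed):
        for j, fo in enumerate(row):
            # Expected frequency for this cell.
            fe = row_marginals[i] * column_marginals[j] / n
            # Add this cell's (f_o - f_e)^2 / f_e to the running sum.
            total += (fo - fe) ** 2 / fe
    return total


# Rows: gun sales (low, high); columns: homicide rate (low, high).
observed = [[8, 5], [4, 8]]
chi2_obtained = chi_square(observed)
print(round(chi2_obtained, 2))  # 1.99
```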
Step 1 Make Assumptions and
Meet Test Requirements
Independent random samples
Level of measurement (LOM) is nominal
Note the minimal assumptions. In particular, note that no assumption is made about the shape of the population distribution. The chi square test is non-parametric.
Step 2 State the Null
Hypothesis
H_0: The variables are independent.
Another way to state the H_0, more consistent with previous tests:
H_0: f_o = f_e
Step 2 State the Null
Hypothesis
H_1: The variables are dependent.
Another way to state the H_1:
H_1: f_o ≠ f_e
Step 3 Select the Sampling Distribution and
Establish the Critical Region
Sampling Distribution = χ²
Alpha = .05
df = (r-1)(c-1) = (2-1)(2-1) = 1
χ² (critical) = 3.841
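The degrees of freedom and the critical value can be checked with a small sketch. It relies on the fact that a chi square variable with 1 degree of freedom is the square of a standard normal variable, so the shortcut below is valid only for df = 1; larger tables need a chi square table or a statistics library. The function names are assumptions made for this example.

```python
from statistics import NormalDist


def degrees_of_freedom(n_rows, n_cols):
    """df for a bivariate table: (r - 1)(c - 1)."""
    return (n_rows - 1) * (n_cols - 1)


def chi_square_critical_df1(alpha):
    """Critical chi square for df = 1 only.

    With 1 degree of freedom, chi square is the square of Z, so the
    critical value is the square of the two-tailed critical Z score.
    """
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    return z_critical ** 2


df = degrees_of_freedom(2, 2)
critical = chi_square_critical_df1(0.05)
print(df)                  # 1
print(round(critical, 3))  # 3.841
```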
Step 4 Calculate the Test Statistic
χ² (obtained) = 2.00
Step 5 Make a Decision and
Interpret the Results of the Test
χ² (critical) = 3.841
χ² (obtained) = 2.00
The test statistic is not in the Critical Region. Fail to reject the H_0.
There is no significant relationship between homicide rate and gun sales.
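The decision rule can be written as a simple comparison. The sketch below also reports an approximate p-value, which the slides do not use; it is added here only as an illustration and, like the critical value shortcut, the formula applies to df = 1 only. The function names are assumptions made for this example.

```python
import math


def reject_null(chi2_obtained, chi2_critical):
    """Reject H_0 only if the obtained value falls in the critical region."""
    return chi2_obtained > chi2_critical


def p_value_df1(chi2_obtained):
    """Upper-tail probability of chi square with df = 1 (not valid for other df)."""
    return math.erfc(math.sqrt(chi2_obtained / 2))


decision = reject_null(2.00, 3.841)
p = p_value_df1(2.00)
print(decision)     # False: fail to reject H_0
print(round(p, 3))  # 0.157, which is larger than alpha = .05
```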
Interpreting Chi Square
The chi square test tells us only whether the variables are independent or not.
It does not tell us the pattern or nature of the relationship.
To investigate the pattern, compute percentages within each column and compare across the columns.
Interpreting Chi Square
Cities low on homicide rate tended to be low in gun sales, and cities high in homicide rate tended to be high in gun sales.
As homicide rates increase, gun sales increase. This relationship is not significant. The apparent pattern may be sampling error.
                   Low homicide   High homicide   Totals
Low gun sales        8 (66.7%)      5 (38.5%)       13
High gun sales       4 (33.3%)      8 (61.5%)       12
Totals              12 (100%)      13 (100%)        25
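Column percentages are each cell divided by its column total. A minimal sketch, with illustrative variable names that are not part of the slides:

```python
# Rows: gun sales (low, high); columns: homicide rate (low, high).
observed = [
    [8, 5],
    [4, 8],
]

column_totals = [sum(col) for col in zip(*observed)]  # 12 and 13

# Percentage each cell within its own column.
column_percentages = [
    [round(100 * cell / total, 1) for cell, total in zip(row, column_totals)]
    for row in observed
]

print(column_percentages)
# [[66.7, 38.5], [33.3, 61.5]]
```

Reading across the top row, 66.7% of the low homicide cities but only 38.5% of the high homicide cities had low gun sales.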
The Limits of Chi Square
Like all tests of hypothesis, chi square is sensitive to sample size.
As N increases while the cell percentages stay the same, obtained chi square increases in proportion to N.
With large samples, trivial relationships may be significant.
Remember: significance is not the same thing as importance.
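To illustrate the effect of sample size, the sketch below multiplies every cell in the example table by 10, a hypothetical sample of 250 cities with exactly the same percentages. The pattern is unchanged, but obtained chi square is ten times larger and now exceeds the critical value of 3.841. The larger table is invented purely for illustration.

```python
def chi_square(observed):
    """Return obtained chi square for a table of observed counts."""
    row_marginals = [sum(row) for row in observed]
    column_marginals = [sum(col) for col in zip(*observed)]
    n = sum(row_marginals)
    return sum(
        (fo - row_marginals[i] * column_marginals[j] / n) ** 2
        / (row_marginals[i] * column_marginals[j] / n)
        for i, row in enumerate(observed)
        for j, fo in enumerate(row)
    )


small = [[8, 5], [4, 8]]                              # N = 25 (the example)
large = [[10 * cell for cell in row] for row in small]  # N = 250 (hypothetical)

chi2_small = chi_square(small)
chi2_large = chi_square(large)
print(round(chi2_small, 2))  # 1.99  -> not significant at alpha = .05
print(round(chi2_large, 2))  # 19.89 -> significant at alpha = .05
```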
Additional limits
If there are more than four categories in either variable, the use of chi square is questionable.
If one of the cells has an expected frequency (f_e) less than 5, the use of chi square is questionable. In our example one observed frequency is 4, but every expected frequency is above 5.