CHI-SQUARE TEST

Prepared by: Francis Joseph Campeña
• Goodness-of-Fit Test
• Test for Independence
• Test for Homogeneity / Test for Several Proportions
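Goodness-of-Fit Test
A test to determine whether a population has a specified theoretical distribution.
• Test Statistic:
$$\chi^2 = \sum_{i=1}^{k} \frac{(o_i - e_i)^2}{e_i}, \qquad v = k - 1$$
where $o_i$ and $e_i$ are the observed and expected frequencies of the $i$-th category.
• Critical Region: $\chi^2 > \chi^2_\alpha$, provided each expected frequency is at least 5.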
Example
Suppose that a die is tossed 120 times and
each outcome is recorded. We want to test
if the die is balanced.
FACE         1    2    3    4    5    6
OBSERVED    20   22   17   18   19   24
EXPECTED    20   20   20   20   20   20
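The statistic for the die table can be checked with a few lines of plain Python. This is only a sketch: the helper name `chi_square_statistic` is made up for illustration, and because the slide does not state a significance level for this example, the code assumes $\alpha = 0.05$, for which the tabulated critical value with 5 degrees of freedom is 11.070.

```python
# Goodness-of-fit statistic for the die example (pure Python, no libraries).
observed = [20, 22, 17, 18, 19, 24]   # counts for faces 1..6 from the table
n_tosses = sum(observed)              # 120 tosses in total
expected = [n_tosses / 6] * 6         # a balanced die gives 20 per face


def chi_square_statistic(obs, exp):
    """Sum of (o_i - e_i)^2 / e_i over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(obs, exp))


chi2 = chi_square_statistic(observed, expected)
df = len(observed) - 1                # v = k - 1 = 5

# Assumption: alpha = 0.05 (not stated on the slide for this example).
# 11.070 is the tabulated chi-square critical value for 5 degrees of freedom.
CRITICAL_005_DF5 = 11.070
reject = chi2 > CRITICAL_005_DF5

print(f"chi2 = {chi2:.2f}, df = {df}, reject H0: {reject}")
```

Under that assumed level the statistic (34/20 = 1.7) falls well below the critical value, so the data give no evidence that the die is unbalanced.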
EXAMPLE
A machine is supposed to mix peanuts,
hazelnuts, cashews and pecans in the
ratio 5:2:2:1. A can containing 500 of
these mixed nuts was found to have 269
peanuts, 112 hazelnuts, 74 cashews, and
45 pecans. At a 0.05 level of significance,
test the hypothesis that the machine is
mixing the nuts in the ratio 5:2:2:1.
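One way to work this example is to turn the ratio 5:2:2:1 into expected counts for a can of 500 nuts and then apply the goodness-of-fit statistic. The sketch below uses plain Python; the variable names are illustrative, and 7.815 is the tabulated critical value for $\alpha = 0.05$ with $v = 4 - 1 = 3$ degrees of freedom.

```python
# Goodness-of-fit test for the mixed-nuts example (pure Python).
ratio = {"peanuts": 5, "hazelnuts": 2, "cashews": 2, "pecans": 1}
observed = {"peanuts": 269, "hazelnuts": 112, "cashews": 74, "pecans": 45}

n = sum(observed.values())            # 500 nuts in the can
parts = sum(ratio.values())           # 5 + 2 + 2 + 1 = 10

# Expected count per kind under H0: n * (share of the ratio).
expected = {name: n * share / parts for name, share in ratio.items()}

chi2 = sum(
    (observed[name] - expected[name]) ** 2 / expected[name] for name in ratio
)
df = len(ratio) - 1                   # v = k - 1 = 3

# 7.815 is the tabulated chi-square critical value for alpha = 0.05, 3 df.
CRITICAL_005_DF3 = 7.815
reject = chi2 > CRITICAL_005_DF3

print(f"expected = {expected}")
print(f"chi2 = {chi2:.3f}, df = {df}, reject H0: {reject}")
```

The expected counts come out as 250, 100, 100 and 50, and the statistic is about 10.14, which exceeds 7.815, so under this calculation the hypothesis that the machine mixes in the ratio 5:2:2:1 would be rejected.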
Test for Independence
• The chi-square test can be used to test the hypothesis of independence of two variables of classification.
• The table of observed frequencies under this two-way classification is called a contingency table.
• The row and column totals are called marginal frequencies.
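• Expected frequency:
$$\text{expected freq.} = \frac{\text{column total} \times \text{row total}}{\text{grand total}}$$
• Test Statistic:
$$\chi^2 = \sum_{i=1}^{k} \frac{(o_i - e_i)^2}{e_i}, \qquad v = (r-1)(c-1)$$
where the sum runs over all $k = rc$ cells of a contingency table with $r$ rows and $c$ columns.
• Critical Region: $\chi^2 > \chi^2_\alpha$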
Example
Suppose that we wish to determine whether
the opinions of the voting residents of the
state of Illinois concerning a new tax reform
are independent of their levels of income.
                   INCOME LEVELS
TAX REFORM     Low   Medium   High   TOTAL
For            182      213    203     598
Against        154      138    110     402
TOTAL          336      351    313    1000
1. Ho: The opinion of the voting residents regarding the new tax reform is independent of their level of income.
   Ha: The opinion of the voting residents regarding the new tax reform is dependent on their level of income.
2. $\alpha = 0.05$
3. Critical Region: with $v = (2-1)(3-1) = 2$, $\chi^2 > \chi^2_{0.05} \Rightarrow \chi^2 > 5.991$
4. Test Statistic:
$$\chi^2 = \frac{(182-200.9)^2}{200.9} + \frac{(213-209.9)^2}{209.9} + \cdots + \frac{(110-125.8)^2}{125.8} = 7.85$$
5. Decision: Reject Ho, since $7.85 > 5.991$. The opinion of the voting residents regarding the new tax reform is not independent of their level of income.
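The steps above can be sketched in plain Python, computing every expected frequency from the marginal totals. The function names are hypothetical helpers, not part of any library. Because the code keeps the expected frequencies unrounded, it gives roughly 7.88 rather than the slide's 7.85, which uses expected frequencies rounded to one decimal; the decision is the same. The sketch also relies on the fact that, for exactly 2 degrees of freedom, the chi-square upper-tail probability is $e^{-x/2}$, so the critical value is $-2\ln\alpha$.

```python
import math

# Observed counts: rows = opinion (For, Against),
# columns = income level (Low, Medium, High).
observed = [
    [182, 213, 203],
    [154, 138, 110],
]


def expected_frequencies(table):
    """Expected count per cell = row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    return [[r * c / grand for c in col_totals] for r in row_totals]


def chi_square_statistic(table):
    """Sum of (o - e)^2 / e over every cell of the contingency table."""
    exp = expected_frequencies(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, exp)
        for o, e in zip(obs_row, exp_row)
    )


chi2 = chi_square_statistic(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)   # (r - 1)(c - 1) = 2

# Valid only for df = 2: upper-tail probability is exp(-x / 2),
# so the critical value at level alpha is -2 * ln(alpha).
alpha = 0.05
critical = -2 * math.log(alpha)                     # about 5.991
p_value = math.exp(-chi2 / 2)

print(f"chi2 = {chi2:.2f}, df = {df}, critical = {critical:.3f}")
print(f"p-value = {p_value:.4f}, reject H0: {chi2 > critical}")
```

For tables with other degrees of freedom the closed form does not apply, and the critical value would have to come from a chi-square table or a statistics library.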
Test for Homogeneity
• We can also use the chi-square test to determine whether the categories of the column variable are homogeneous with respect to each row, that is, whether each row category occurs in the same proportion in every column.
Example
Suppose we select 200 voters from LP, 150 from UNA,
and 150 from the Nationalist Party (NP), and record whether
they are for a proposed abortion law, against it,
or undecided.
                  Political Affiliation
Abortion Law      LP    UNA     NP    Total
For               82     70     62      214
Against           93     62     67      222
Undecided         25     18     21       64
TOTAL            200    150    150      500
Observed frequencies, with expected frequencies in parentheses:

                  Political Affiliation
Abortion Law      LP           UNA          NP           Total
For               82 (85.6)    70 (64.2)    62 (64.2)      214
Against           93 (88.8)    62 (66.6)    67 (66.6)      222
Undecided         25 (25.6)    18 (19.2)    21 (19.2)       64
TOTAL            200          150          150             500
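The expected frequencies in parentheses can be reproduced from the marginal totals, and the same table gives the homogeneity statistic. The sketch below is plain Python with illustrative variable names. The slides do not state a significance level or a conclusion for this example, so the comparison assumes $\alpha = 0.05$, for which the tabulated critical value with $v = (3-1)(3-1) = 4$ degrees of freedom is 9.488.

```python
# Rows = For, Against, Undecided; columns = LP, UNA, NP.
observed = [
    [82, 70, 62],
    [93, 62, 67],
    [25, 18, 21],
]

row_totals = [sum(row) for row in observed]          # 214, 222, 64
col_totals = [sum(col) for col in zip(*observed)]    # 200, 150, 150
grand_total = sum(row_totals)                        # 500

# Expected frequency = row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)    # (r - 1)(c - 1) = 4

# Assumption: alpha = 0.05 (not stated on the slide for this example).
# 9.488 is the tabulated chi-square critical value for 4 degrees of freedom.
CRITICAL_005_DF4 = 9.488
reject = chi2 > CRITICAL_005_DF4

for row in expected:
    print([round(e, 1) for e in row])
print(f"chi2 = {chi2:.2f}, df = {df}, reject H0: {reject}")
```

Under that assumed level the statistic (about 1.53) is far below 9.488, so this calculation gives no evidence against homogeneity across the three affiliations.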
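Test for Several Proportions
1. $H_0: \pi_1 = \pi_2 = \cdots = \pi_k$
   Ha: At least one pair of population proportions is not equal.
2. Level of significance: $\alpha$
3. Critical Region: $\chi^2 > \chi^2_\alpha$, with $v = k - 1$
4. Test Statistic:
$$\text{expected freq.} = \frac{\text{column total} \times \text{row total}}{\text{grand total}}$$
$$\chi^2 = \sum_{i} \frac{(o_i - e_i)^2}{e_i}$$
where the sum runs over all $2k$ cells of the $2 \times k$ table.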
EXAMPLE
In a shop study, a set of data was collected
to determine whether or not the proportion
of defectives produced was the same for
workers on the day, evening and night
shifts. Use 0.025 level of significance.
Shift:             DAY   EVENING   NIGHT
Defectives          45        55      70
Non-defectives     905       890     870
1. Ho: $\pi_1 = \pi_2 = \pi_3$
   Ha: At least one pair of work shifts has different proportions of defectives produced.
2. $\alpha = 0.025$
3. C.R./D.R.: with $v = 3 - 1 = 2$, $\chi^2 > \chi^2_{0.025} \Rightarrow \chi^2 > 7.378$
4. Test Statistic:
$$\chi^2 = \frac{(45-57)^2}{57} + \frac{(55-56.7)^2}{56.7} + \cdots + \frac{(870-883.7)^2}{883.7} = 6.29$$
5. Decision: We do not reject the null hypothesis, since $6.29 < 7.378$.
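The shift example can be sketched in plain Python using the pooled proportion of defectives, which is equivalent to the rule "column total × row total / grand total" for the expected frequencies. Variable names are illustrative. With unrounded expected frequencies the statistic comes out near 6.23 rather than the slide's 6.29, which uses expected frequencies rounded to one decimal; the decision is unchanged.

```python
# Counts per shift, taken from the table.
defectives = {"day": 45, "evening": 55, "night": 70}
non_defectives = {"day": 905, "evening": 890, "night": 870}

# Column totals (items inspected per shift) and pooled defective proportion.
shift_totals = {s: defectives[s] + non_defectives[s] for s in defectives}
pooled = sum(defectives.values()) / sum(shift_totals.values())   # 170 / 2835

chi2 = 0.0
for shift, n in shift_totals.items():
    # Under H0 every shift shares the pooled proportion of defectives.
    exp_defective = n * pooled
    exp_non_defective = n * (1 - pooled)
    chi2 += (defectives[shift] - exp_defective) ** 2 / exp_defective
    chi2 += (non_defectives[shift] - exp_non_defective) ** 2 / exp_non_defective

df = len(shift_totals) - 1            # v = k - 1 = 2

# 7.378 is the critical value used on the slide for alpha = 0.025, 2 df.
CRITICAL_0025_DF2 = 7.378
reject = chi2 > CRITICAL_0025_DF2

print(f"pooled proportion = {pooled:.4f}")
print(f"chi2 = {chi2:.2f}, df = {df}, reject H0: {reject}")
```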
EXERCISES