In Chapter 3 we introduced the idea of categorical data.
In Chapter 15 we explored probability rules and when events are independent.
In Chapter 26 we put these two ideas together to compare counts.
1
National Opinion Research Center’s General Social Survey
In 2006, a sample of 1928 adults in the U.S. was asked the question “When is premarital sex wrong?”
The participants were also asked with what religion they were affiliated.
2
Who?
–A sample of 1928 adults.
What?
–Attitude towards premarital sex.
–Religious affiliation.
3
When is premarital sex wrong?
–Categorical: Always Wrong, Almost Always Wrong, Sometimes Wrong, Not Wrong at All
4
What is your religious affiliation?
–Categorical: Catholic, Jewish, Protestant, None, Other
5
When is Premarital Sex Wrong?

Religion      Always Wrong   Almost Always Wrong   Sometimes Wrong   Not Wrong at All   Total
Catholic                83                    47               105                249     484
Jewish                   4                     2                 9                 20      35
Protestant             364                    97               190                341     992
None                    27                    12                52                219     310
Other                   28                     8                20                 51     107
Total                  506                   166               376                880    1928
6
When is Premarital Sex Wrong? (row percentages)

Religion      Always   Almost Always   Sometimes   Never   Total
Catholic       17.2%            9.7%       21.7%   51.4%    100%
Jewish         11.4%            5.7%       25.7%   57.2%    100%
Protestant     36.7%            9.8%       19.1%   34.4%    100%
None            8.7%            3.9%       16.8%   70.6%    100%
Other          26.2%            7.5%       18.7%   47.6%    100%
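The row percentages can be recomputed from the observed counts. The sketch below is one way to do it in plain Python; the names `observed` and `row_percentages` are illustrative and not part of the original slides.

```python
# Observed counts from the survey table (rows: religion; columns: attitude).
attitudes = ["Always", "Almost Always", "Sometimes", "Never"]
observed = {
    "Catholic":   [83, 47, 105, 249],
    "Jewish":     [4, 2, 9, 20],
    "Protestant": [364, 97, 190, 341],
    "None":       [27, 12, 52, 219],
    "Other":      [28, 8, 20, 51],
}


def row_percentages(counts):
    """Return each count as a percentage of its row total."""
    total = sum(counts)
    return [100.0 * c / total for c in counts]


row_pct = {religion: row_percentages(counts)
           for religion, counts in observed.items()}

# Print one line per religion, rounded to one decimal place.
for religion, pcts in row_pct.items():
    print(f"{religion:<11}" + "".join(f"{p:7.1f}%" for p in pcts))
```

A few printed values may differ from the table by 0.1 in the last digit, possibly because the table entries were adjusted so that each row sums to exactly 100%.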
7
8
People who have no religion or are Jewish are more likely to say premarital sex is Not Wrong at All.
Protestants are much more likely to say premarital sex is Always Wrong.
9
Are these differences statistically significant?
Or are these differences due to chance variation, so that religion and attitude towards premarital sex are independent?
10
If religion and attitude towards premarital sex are independent, then
Pr(A and B) = Pr(A)*Pr(B), where A is a religion category and B is an attitude category.
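As a rough illustration of the product rule, the sketch below compares the observed joint proportion for one cell with the product of its marginal proportions, using the Catholic and Always Wrong counts from the survey table. The variable names are illustrative only.

```python
# Counts taken from the survey table of observed counts.
n = 1928                   # total sample size
catholic_total = 484       # row total for Catholic
always_total = 506         # column total for Always Wrong
catholic_and_always = 83   # observed count in the Catholic / Always Wrong cell

pr_a = catholic_total / n               # Pr(A): respondent is Catholic
pr_b = always_total / n                 # Pr(B): respondent answers Always Wrong
pr_a_and_b = catholic_and_always / n    # observed joint proportion

product = pr_a * pr_b                   # what independence would predict

print(f"Observed Pr(A and B): {pr_a_and_b:.4f}")
print(f"Pr(A)*Pr(B):          {product:.4f}")
```

In this cell the observed proportion is smaller than the product. Whether a gap like this is more than chance variation is the question the test answers.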
11
If religion and attitude toward premarital sex are independent, we would expect to see n*Pr(A)*Pr(B) people in religion category A and attitude category B, where n is the total sample size.
12
Religion      Always   Almost Always   Sometimes   Never   Total
Catholic                                                     484
Jewish                                                        35
Protestant                                                   992
None                                                         310
Other                                                        107
Total            506             166         376     880    1928
13
Catholic and Always Wrong

$$E = 1928 \times \frac{484}{1928} \times \frac{506}{1928}$$

$$E = \frac{484 \times 506}{1928} = 127.0$$
14
Expected counts

Religion      Always   Almost Always   Sometimes   Never   Total
Catholic       127.0            41.7        94.4   220.9     484
Jewish           9.2             3.0         6.8    16.0      35
Protestant     260.3            85.4       193.5   452.8     992
None            81.4            26.7        60.4   141.5     310
Other           28.1             9.2        20.9    48.8     107
Total            506             166         376     880    1928
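The expected counts depend only on the row and column totals. A minimal sketch, assuming plain Python lists for the margins (the names `row_totals`, `col_totals` and `expected` are illustrative):

```python
religions = ["Catholic", "Jewish", "Protestant", "None", "Other"]
attitudes = ["Always", "Almost Always", "Sometimes", "Never"]
row_totals = [484, 35, 992, 310, 107]   # one total per religion
col_totals = [506, 166, 376, 880]       # one total per attitude
n = sum(row_totals)                     # total sample size

# Expected count = n * (row total / n) * (column total / n)
#                = row total * column total / n
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Print one line per religion, rounded to one decimal place.
for name, row in zip(religions, expected):
    print(f"{name:<11}" + "".join(f"{e:8.1f}" for e in row))
```

Rounded to one decimal place, an entry can differ from the printed table by 0.1.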
15
Take the difference between the observed and expected counts in a cell.
Square the difference.
Divide by the expected count.
Sum up over all the cells.
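The four steps above can be sketched in a few lines of Python. The function name `chi_square_statistic` is an illustrative choice; the counts are the observed counts from the survey table.

```python
# Observed counts: rows are religions, columns are attitudes
# (Always, Almost Always, Sometimes, Not Wrong at All).
observed = [
    [83, 47, 105, 249],    # Catholic
    [4, 2, 9, 20],         # Jewish
    [364, 97, 190, 341],   # Protestant
    [27, 12, 52, 219],     # None
    [28, 8, 20, 51],       # Other
]


def chi_square_statistic(table):
    """Sum (observed - expected)^2 / expected over every cell of the table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n   # expected count for the cell
            diff = o - e                            # observed minus expected
            total += diff ** 2 / e                  # square, divide, accumulate
    return total


chi2 = chi_square_statistic(observed)
print(f"{chi2:.2f}")
```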
16
$$\chi^2 = \sum \frac{(O - E)^2}{E}, \qquad df = (r - 1)(c - 1)$$
17
Catholic and Always

$$\frac{(O - E)^2}{E} = \frac{(83 - 127.0)^2}{127.0} = \frac{(-44.0)^2}{127.0} = 15.24$$
18
$H_0$: Religion and attitude towards premarital sex are independent.
$H_A$: Religion and attitude towards premarital sex are not independent.
$\chi^2 = 184.51$, $df = (5-1)(4-1) = 12$
P-value < 0.0001
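To see how small the P-value is, the upper-tail probability can be computed directly. The sketch below uses a closed-form shortcut that holds only when the degrees of freedom are even (as they are here, df = 12). It is a stand-in for a statistics package, not how JMP necessarily computes the value, and `chi2_sf_even_df` is a hypothetical helper name.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability Pr(X >= x) for a chi-square variable with even df.

    Uses the closed form exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!,
    which is valid only when df is an even positive integer.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this shortcut requires an even, positive df")
    half = x / 2.0
    term = 1.0    # the i = 0 term of the sum
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i    # builds (x/2)^i / i! from the previous term
        total += term
    return math.exp(-half) * total


rows, cols = 5, 4               # 5 religion categories, 4 attitude categories
df = (rows - 1) * (cols - 1)    # degrees of freedom
chi2 = 184.51                   # test statistic reported above
p_value = chi2_sf_even_df(chi2, df)

print(df, p_value < 0.0001)
```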
19
Because the P-value is so small, we reject the null hypothesis.
Religion and attitude towards premarital sex are not independent.
20
Look at the cells with the largest contributions to the test statistic.
None and Always Wrong has far fewer people than expected, and None and Not Wrong at All has far more people than expected.
21
Look at the cells with the largest contributions to the test statistic.
Protestant and Always Wrong has far more people than expected, and Protestant and Not Wrong at All has far fewer people than expected.
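The cells with the largest contributions can be found by computing (O - E)^2 / E for every cell and sorting. A minimal sketch, with illustrative variable names and the observed counts from the survey table:

```python
religions = ["Catholic", "Jewish", "Protestant", "None", "Other"]
attitudes = ["Always Wrong", "Almost Always Wrong",
             "Sometimes Wrong", "Not Wrong at All"]
observed = [
    [83, 47, 105, 249],    # Catholic
    [4, 2, 9, 20],         # Jewish
    [364, 97, 190, 341],   # Protestant
    [27, 12, 52, 219],     # None
    [28, 8, 20, 51],       # Other
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# One tuple per cell: (contribution, religion, attitude, observed, expected).
contributions = []
for i, religion in enumerate(religions):
    for j, attitude in enumerate(attitudes):
        o = observed[i][j]
        e = row_totals[i] * col_totals[j] / n
        contributions.append(((o - e) ** 2 / e, religion, attitude, o, e))

# Largest contributions first.
contributions.sort(reverse=True)
top_four = contributions[:4]

for value, religion, attitude, o, e in top_four:
    direction = "more" if o > e else "fewer"
    print(f"{religion} / {attitude}: contribution {value:.2f}, "
          f"{direction} than expected ({o} observed vs {e:.1f} expected)")
```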
22
Protestants are much more likely to say Always Wrong and much less likely to say Not Wrong at All.
People with no religion are much more likely to say Not Wrong at All and much less likely to say Always Wrong.
23
Religion        Attitude              Count
1 Catholic      1 Always Wrong           83
2 Jewish        1 Always Wrong            4
3 Protestant    1 Always Wrong          364
4 None          1 Always Wrong           27
5 Other         1 Always Wrong           28
...
5 Other         4 Not Wrong at All       51
24
Fit Y by X
Y, Response: Attitude
X, Factor: Religion
Freq: Count
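Outside JMP, the same layout (one row per cell, with a frequency column) can be tallied into a two-way table. The sketch below uses plain Python; the `records` list is filled in from the observed counts table, and the names are illustrative.

```python
from collections import defaultdict

# (religion, attitude, count) records, one per cell of the two-way table.
records = [
    ("Catholic", "Always Wrong", 83), ("Jewish", "Always Wrong", 4),
    ("Protestant", "Always Wrong", 364), ("None", "Always Wrong", 27),
    ("Other", "Always Wrong", 28),
    ("Catholic", "Almost Always Wrong", 47), ("Jewish", "Almost Always Wrong", 2),
    ("Protestant", "Almost Always Wrong", 97), ("None", "Almost Always Wrong", 12),
    ("Other", "Almost Always Wrong", 8),
    ("Catholic", "Sometimes Wrong", 105), ("Jewish", "Sometimes Wrong", 9),
    ("Protestant", "Sometimes Wrong", 190), ("None", "Sometimes Wrong", 52),
    ("Other", "Sometimes Wrong", 20),
    ("Catholic", "Not Wrong at All", 249), ("Jewish", "Not Wrong at All", 20),
    ("Protestant", "Not Wrong at All", 341), ("None", "Not Wrong at All", 219),
    ("Other", "Not Wrong at All", 51),
]

# The count acts like the Freq role: each record stands for `count` respondents.
table = defaultdict(dict)
for religion, attitude, count in records:
    table[religion][attitude] = table[religion].get(attitude, 0) + count

n = sum(count for _, _, count in records)
print(n)
```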
25
Test               ChiSquare   Prob>ChiSq
Likelihood Ratio     193.959      <.0001*
Pearson              184.510      <.0001*
26
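Both statistics in the JMP output can be recomputed by hand as a check. The sketch below assumes the Likelihood Ratio row uses the usual definition, 2 * sum of O * ln(O / E); that is an assumption about JMP, not something stated on the slides.

```python
import math

# Observed counts: rows are religions, columns are attitudes.
observed = [
    [83, 47, 105, 249],    # Catholic
    [4, 2, 9, 20],         # Jewish
    [364, 97, 190, 341],   # Protestant
    [27, 12, 52, 219],     # None
    [28, 8, 20, 51],       # Other
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

pearson = 0.0
likelihood_ratio = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n
        pearson += (o - e) ** 2 / e
        # Every observed count is positive here, so the logarithm is defined.
        likelihood_ratio += 2.0 * o * math.log(o / e)

print(f"Likelihood Ratio: {likelihood_ratio:.3f}")
print(f"Pearson:          {pearson:.3f}")
```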
Remember that JMP only does the calculations for you (Step 3). You have to provide all the other steps in the test of hypothesis.
27