Chapter 11.pptx

Chapter 11
Chi-Square and no ANOVA Tests
Slide set to accompany "Statistics Using Technology" by Kathryn Kozak (Slides by David H Straayer) is
licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Based on a work at
http://www.tacomacc.edu/home/dstraayer/published/Statistics/Book/StatisticsUsingTechnology112314b.pdf.
Section 11.1: Chi-Square Test for Independence
• Two categorical variables
• Are they independent, or are they related?
• If they are independent, there cannot be any
causal relationship.
Chi-Square test for independence
1. Hypotheses and α
H0: the two variables are independent
H1: the two variables are dependent
2. Assumptions: random sample, expected
frequencies all ≥ 5
3. Test statistic and p-value
The test statistic, χ², can be used to calculate
a p-value
4. Conclusion. As always, if p-value < α, reject
the null hypothesis in favor of the alternate.
There is sufficient evidence to support the
alternate hypothesis that the categorical
variables are dependent.
Otherwise, there is not enough evidence to
support the alternate hypothesis at the
stated level α.
5. Interpretation: what does this conclusion
imply in the context of the problem?
The test statistic χ²
• This is a single number that captures the
central question: how far is the observed data
away from what we’d expect if the null
hypothesis is true?
• It is just the sum of the squares of the
differences between observed and expected counts
(but normalized to become a unitless number).
OK, but what is that expectation?
• Let’s work this discussion around an example.
                 Breast Feeding Timelines
Autism         None   Less than   2 to 6   More than   Row
                      2 months    months   6 months    Total
Yes            241    198         164      215         818
No             20     25          27       44          116
Column Total   261    223         191      259         934
• Focus on one cell, and ask “what should we expect if
Autism is independent of breastfeeding?”
Focus on “None” “Yes” cell
• In this sample 818 out of 934 babies had
autism (Gee, that seems like a pretty high
percentage to me!). This works out to 87.6%.
• If there were no relationship between
breastfeeding and autism, that percentage of
the 261 babies with “none” breastfeeding
would be expected to be autistic. 87.6% of
(times) 261 is 228.6 (don’t get freaked out by
the fractional baby).
On the other hand…
• What percentage of babies had “none”
breastfeeding? It’s 261/934 ≈ 27.9%
• Out of those 818 babies with autism, we’d
expect that percentage of them to have had
“none” breastfeeding: (261/934)*818, that same 228.6.
• For each cell, the expected count is:
$$\text{expected} = \frac{(\text{row total}) \times (\text{column total})}{\text{grand total}}$$
Now we have a matrix of expectations
Expected Counts
                 Breast Feeding Timelines
Autism         None    Less than   2 to 6   More than   Row
                       2 months    months   6 months    Total
Yes            228.6   195.3       167.3    226.8       818
No             32.4    27.7        23.7     32.2        116
Column Total   261     223         191      259         934
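The expected counts can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the variable names (observed, row_totals, col_totals, expected) are illustrative and do not come from the slides.

```python
# Observed counts from the autism / breastfeeding example.
# Rows: autism yes, autism no.
# Columns: none, less than 2 months, 2 to 6 months, more than 6 months.
observed = [
    [241, 198, 164, 215],
    [20, 25, 27, 44],
]

row_totals = [sum(row) for row in observed]          # one total per row
col_totals = [sum(col) for col in zip(*observed)]    # one total per column
grand_total = sum(row_totals)

# expected count = (row total) * (column total) / (grand total)
expected = [
    [r * c / grand_total for c in col_totals]
    for r in row_totals
]

for row in expected:
    print([round(e, 1) for e in row])
```

Rounded to one decimal place, the two printed rows should agree with the expected-count table.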
How “far” are these two matrices apart?
• In each cell, subtract to calculate a distance.
• Since we’re going to want to add them up to
get a grand total distance, we need to square
each difference first, to make all the numbers
positive, just as we have done before in things
like the standard deviation.
• But we divide each square by the expected
number first, before we add them up. This
makes the resulting sum unit-less.
χ² formula
• $$\chi^2 = \sum_{i,j} \frac{(\text{observed}_{ij} - \text{expected}_{ij})^2}{\text{expected}_{ij}}$$
• For this data, χ² ≈ 11.22
• That might mean something to Dilbert, but what
does it tell our pointy-haired boss (and us) about
how unusual it would be to get these numbers, if
our null hypothesis is true?
• That’s where a distribution function comes in
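To check the χ² value for the autism table, the sum can be computed directly. The sketch below is an illustration in plain Python, not code from the slides; the function names expected_counts and chi_square_statistic are made up for this example.

```python
# Observed counts: rows are autism yes / no, columns are the four
# breastfeeding categories from the example.
observed = [
    [241, 198, 164, 215],
    [20, 25, 27, 44],
]

def expected_counts(table):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi_square_statistic(table):
    """Sum of (observed - expected)^2 / expected over every cell."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

chi_sq = chi_square_statistic(observed)
print(round(chi_sq, 2))  # about 11.22
```

Most of the total comes from the two "No" cells at the ends of the table, where the expected counts are small, so the same difference counts for more.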
Getting to the p-value
• The area of a right tail of a distribution
function is the p-value.
• What is the distribution function? The χ²
distribution function.
• Actually, like the t-distribution, there is a
whole family of χ² functions, one for each
degree of freedom. d.f. is defined as
(row count - 1)*(column count - 1)
Shape of χ² distribution
The right-tail on the TI
• χ²cdf(11.2166,1E99,3) ≈ 0.01061
• That’s more than 1%, which is a reasonable α
for this problem. So we’d have to fail to reject
the null hypothesis.
• “We don’t have enough evidence to meet our
stated level of evidence to conclude that
breastfeeding frequency is linked to autism.”
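The right-tail area can also be checked without a calculator. For 3 degrees of freedom the χ² right tail has a closed form in terms of the complementary error function, which Python provides as math.erfc. The sketch below assumes that closed form (it is valid only for 3 degrees of freedom) and uses an illustrative function name; it is not how the TI computes the value.

```python
import math

def chi_square_right_tail_3df(x):
    """Right-tail area of the chi-square distribution with 3 d.f.

    Closed form for 3 degrees of freedom only:
    erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2)
    """
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

rows, columns = 2, 4                 # autism yes/no by four breastfeeding groups
df = (rows - 1) * (columns - 1)      # degrees of freedom
chi_sq = 11.2166                     # test statistic from the example
alpha = 0.01                         # significance level used on the slide

p_value = chi_square_right_tail_3df(chi_sq)
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(df, round(p_value, 5), decision)
```

The p-value comes out just above 0.01, so at α = 0.01 the decision is to fail to reject the null hypothesis, matching the calculator result.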
Find an easier way using technology
Put the observed
data in here
Give permission to put
expected values here
Section 11.2: Chi-Square Goodness of Fit
• Sometimes, we want to test a single
categorical variable to see if its observed values
match theoretical values.
• For example, a casino wants to make sure
roulette wheels and dice are “fair”. As long as
any departures from “fairness” are very much
smaller than the “house edge”, the gambling
equipment is safe to use.
FITTEST on the TI
• Some TI84 firmware has a χ²GOF-Test
• The TI83s don’t have it.
• But it’s not all that hard to calculate a χ²
statistic, and it’s even easier with a little
program.
• Put your observed data in L2. If you have
expected probabilities, put them in L1. If they
are all equal, the program will do that for you.
FITTEST, continued
• The program will put expected counts in L3,
and the terms that you add up to calculate χ²
in L4.
• The program reports the χ² statistic, and the
corresponding p-value.
• Most of the program is user-interface code,
but the “guts” of the program is shown on the
next slide.
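The same calculation can be sketched in Python. This is not the TI program itself, only an illustration of the steps described: expected counts (the L3 role), the individual terms (the L4 role), their sum, and a right-tail p-value. The function names and the die-roll counts are hypothetical, and the degrees of freedom are assumed to be the number of categories minus 1.

```python
import math

def chi_square_right_tail(x, df):
    """Right-tail area of the chi-square distribution, whole-number df >= 1."""
    half = x / 2
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum of (x/2)^k / k! for k = 0 .. df/2 - 1
        term = math.exp(-half)
        total = term
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return total
    # Odd df: erfc(sqrt(x/2)) plus terms (x/2)^(k - 1/2) * exp(-x/2) / Gamma(k + 1/2)
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(half) * math.exp(-half) / math.gamma(1.5)
    for k in range(1, (df - 1) // 2 + 1):
        total += term
        term *= half / (k + 0.5)
    return total

def fit_test(observed, probabilities=None):
    """Goodness-of-fit sketch: returns expected counts, terms, statistic, p-value."""
    n = sum(observed)
    if probabilities is None:
        # No probabilities supplied: treat all categories as equally likely.
        probabilities = [1 / len(observed)] * len(observed)
    expected = [n * p for p in probabilities]                      # like L3
    terms = [(o - e) ** 2 / e for o, e in zip(observed, expected)]  # like L4
    chi_sq = sum(terms)
    df = len(observed) - 1   # assumption: categories minus 1
    return expected, terms, chi_sq, chi_square_right_tail(chi_sq, df)

# Hypothetical counts for 60 rolls of a die (made up for illustration).
expected, terms, chi_sq, p_value = fit_test([8, 12, 9, 11, 10, 10])
print(round(chi_sq, 2), round(p_value, 3))
```

Passing a list of probabilities as the second argument corresponds to putting expected probabilities in L1; leaving it out corresponds to letting the program assume they are all equal.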
Babies Birthdays
• http://www.dartmouth.edu/~chance/teaching_aids/data/birthday.txt
Day           Sunday   Monday   Tuesday   Wednesday   Thursday   Friday   Saturday
50 random     7        11       8         9           4          5        6
300 random    33       39       54        43          45         43       43
1300 random   149      206      207       183         178        203      174
• Do those weekend numbers seem a little small?
• Do babies consult calendars before coming out?
• Let’s assume babies don’t, and the chances of
being born on all days are the same: 1/7
50 random babies selected
50 babies data interpreted
• 2 = 4.8, p = 0.559
• This means there is a 56% chance of getting
data like this if babies are equally likely to be
born on any weekday.
• “There is insufficient evidence that baby’s
birth days are not as likely on any weekday.
300 or 1300 random babies
• 300 baby data: χ² = 5.62, p = 0.467
Still not much evidence. This could happen by
chance, easily.
• 1300 baby data: χ² = 14.6, p = 0.0234
“With 1300 randomly-selected babies, we
have evidence (p=0.0234) that babies’ birth
days were not uniformly distributed among
the days of the week.”
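The three results can be reproduced in a few lines of Python. With seven days there are 6 degrees of freedom, and for 6 degrees of freedom the χ² right tail has a simple closed form. This is a sketch under that assumption; the dictionary and function names are illustrative. The counts are listed Sunday through Saturday, although the order does not affect the statistic when every day has the same expected count.

```python
import math

# Birth counts by day of the week for the three random samples.
births = {
    "50 random": [7, 11, 8, 9, 4, 5, 6],
    "300 random": [33, 39, 54, 43, 45, 43, 43],
    "1300 random": [149, 206, 207, 183, 178, 203, 174],
}

def right_tail_6df(x):
    """Right-tail area of chi-square with 6 d.f. (closed form for 6 d.f. only)."""
    half = x / 2
    return math.exp(-half) * (1 + half + half ** 2 / 2)

results = {}
for label, counts in births.items():
    expected = sum(counts) / len(counts)   # equal chance for each day: n * 1/7
    chi_sq = sum((o - expected) ** 2 / expected for o in counts)
    results[label] = (chi_sq, right_tail_6df(chi_sq))
    print(label, round(chi_sq, 2), round(results[label][1], 4))
```

Only the 1300-baby sample gives a p-value below 0.05, which illustrates how the same modest weekend dip becomes detectable as the sample grows.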