Chi-square Goodness-of-Fit Worksheet
Name _________________________
1) Pi: we know the approximation 3.14, but did you know that π is an irrational
number that goes on forever without any repeating pattern? Did you know that the
sequence 123456789 first appears at the 523,551,502nd digit of pi? However, there are
no 0s and only one 7 in the first 20 decimal places of pi. Does that pattern persist, or do the digits show up
with equal frequency? The table shows the number of times each digit appears in the first million digits. Test
the hypothesis that the digits 0 through 9 are uniformly distributed in the decimal representation of π.
Digit    Count      Expected count
0        99,959     100,000
1        99,758     100,000
2        100,026    100,000
3        100,229    100,000
4        100,230    100,000
5        100,359    100,000
6        99,548     100,000
7        99,800     100,000
8        99,985     100,000
9        100,106    100,000
Assumptions:

Representative sample of digits of pi

All expected counts are greater than 5
Chi-square goodness-of-fit test: χ² = Σ (obs − exp)² / exp
Ho: The digits 0-9 are uniformly distributed in the
first million digits of pi.
Ha: At least one proportion is different.
χ² = Σ (obs − exp)² / exp = 5.509 with 9 df
p-value = χ²cdf(5.509, ∞, 9) = .7879
α = .05
Since p-value > α, we fail to reject the null hypothesis. There is not sufficient
evidence to suggest that the digits 0 – 9 are not uniformly distributed.
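The calculation in problem 1 can be checked with a short script. This is a minimal sketch in plain Python, not part of the original worksheet. The helper names chi_square_stat and chi2_upper_tail are made up for illustration, and chi2_upper_tail is assumed to play the role of the calculator's χ²cdf(x, ∞, df) for whole-number df.

```python
import math

def chi_square_stat(observed, expected):
    """Sum of (obs - exp)^2 / exp over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chi2_upper_tail(x, df):
    """P(X >= x) for a chi-square variable with a positive integer df.

    Builds the regularized upper incomplete gamma function Q(df/2, x/2)
    from Q(1/2, z) = erfc(sqrt(z)) or Q(1, z) = exp(-z) using the
    recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / gamma(a + 1).
    """
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q

# Counts for the digits 0 through 9, copied from the table in problem 1.
pi_counts = [99959, 99758, 100026, 100229, 100230,
             100359, 99548, 99800, 99985, 100106]
# Under the null hypothesis every digit is equally likely.
pi_expected = [sum(pi_counts) / len(pi_counts)] * len(pi_counts)

pi_stat = chi_square_stat(pi_counts, pi_expected)
pi_df = len(pi_counts) - 1
pi_p = chi2_upper_tail(pi_stat, pi_df)
print(f"chi-square = {pi_stat:.3f}, df = {pi_df}, p-value = {pi_p:.4f}")
```

The printed values should agree with the worksheet's 5.509, 9 df, and .7879 up to rounding.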
2) Find the p-value for the given chi-square test statistic and degrees of freedom. Give the decision that you
would make at a significance level of α = .01.
a) χ² = 7.5, df = 2     p-value = .024      fail to reject
b) χ² = 13.0, df = 6    p-value = .043      fail to reject
c) χ² = 18.0, df = 9    p-value = .035      fail to reject
d) χ² = 21.3, df = 4    p-value = .000276   reject
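The four p-values in problem 2 and the decisions at α = .01 can be reproduced as follows. This is a sketch in plain Python under the assumption that df is a whole number. The function name chi2_upper_tail is hypothetical and stands in for the calculator's χ²cdf(x, ∞, df).

```python
import math

def chi2_upper_tail(x, df):
    """P(X >= x) for a chi-square variable with a positive integer df."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    # Start from a known closed form, then step the shape up by 1 each pass.
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)                # exact for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))     # exact for df = 1
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q

ALPHA = 0.01
# (test statistic, degrees of freedom) for parts a) through d)
cases = {"a": (7.5, 2), "b": (13.0, 6), "c": (18.0, 9), "d": (21.3, 4)}

decisions = {}
for label, (stat, df) in cases.items():
    p = chi2_upper_tail(stat, df)
    verdict = "reject" if p < ALPHA else "fail to reject"
    decisions[label] = (p, verdict)
    print(f"{label}) chi-square = {stat}, df = {df}: p-value = {p:.6f}, {verdict}")
```

Only part d) has a p-value below .01, so it is the only case where the null hypothesis is rejected at this level.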
3) An article about the California lottery gave the following information on the age distribution of adults in
California: 35% are between 18 and 34 years old, 51% are between 35 and 64 years old, and 14% are 65 years
old or older. The article also gave information on the age distribution of those who purchase lottery tickets
as recorded below.
Age of purchaser    Frequency    (Expected)
18-34               36           (70)
35-64               130          (102)
65 and older        34           (28)
Suppose that the data resulted from a random sample of 200 lottery ticket
purchasers. Based on these sample data, is it reasonable to conclude that one or
more of these three age groups buy a disproportionate share of lottery tickets?
Assumptions:
Random sample of lottery ticket purchasers
All expected counts are greater than 5 (in parentheses)
Ho: The age distribution of lottery ticket purchasers matches the age distribution of California adults.
Ha: One or more of the age groups buy a disproportionate share of lottery tickets.
Chi-square goodness-of-fit test
χ² = Σ (obs − exp)² / exp = 25.486 with 2 df
p-value = χ²cdf(25.486, ∞, 2) = .0000029
α = .05
Since p-value < α, we reject the null hypothesis. There is sufficient evidence to
suggest that one or more age groups buy a disproportionate share of lottery tickets.
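The expected counts in parentheses come from multiplying the sample size by each population proportion. A minimal Python sketch of that step and of the resulting test is given here, with illustrative variable names. It relies on the fact that for exactly 2 degrees of freedom the upper-tail chi-square probability is exp(−χ²/2), so this shortcut does not carry over to other df.

```python
import math

# Age distribution of California adults (the null hypothesis proportions).
population_share = {"18-34": 0.35, "35-64": 0.51, "65 and older": 0.14}
# Observed ticket purchasers in the sample of 200.
observed = {"18-34": 36, "35-64": 130, "65 and older": 34}

n = sum(observed.values())
# Expected count = sample size * hypothesized proportion.
expected = {group: n * share for group, share in population_share.items()}

stat = sum((observed[g] - expected[g]) ** 2 / expected[g] for g in observed)
df = len(observed) - 1

# Closed form for the upper tail, valid only when df == 2.
p_value = math.exp(-stat / 2)

for g in observed:
    print(f"{g}: observed {observed[g]}, expected {expected[g]:.0f}")
print(f"chi-square = {stat:.3f}, df = {df}, p-value = {p_value:.7f}")
```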
4) According to Census Bureau data, in 1998 the California population consisted of 50.7% whites, 6.6% blacks, 30.6%
Hispanics, 10.8% Asians, and 1.3% other ethnic groups. Suppose that a random sample of 1000 students graduating
from California colleges and universities in 1998 resulted in the following data on ethnic group.
Ethnic Group    Number in Sample    (Expected)
White           679                 (507)
Black           51                  (66)
Hispanic        77                  (306)
Asian           190                 (108)
Other           3                   (13)
Do these data provide evidence that the proportion of students graduating
from colleges and universities in California for these ethnic group categories
differs from the respective proportions in the population for California? Test
the appropriate hypotheses using α = .01.
Assumptions:
Random sample of students
All expected counts are greater than 5 (in parentheses)
Ho: The proportion of students of different ethnic groups graduating from college in
California is proportionate to the population ethnic groups.
Ha: One or more of the proportions is different.
Chi-square goodness-of-fit test
2 
(obs  exp)2
 303.09 with 4 df
exp
p-value = x2cdf(303.09, ∞, 4) = 0
α = .01
Since p-value < α, we reject the null hypothesis. There is sufficient evidence to
conclude that one or more of the proportions of college graduates in California
differs from the ethnic group population proportions.
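To see which groups drive such a large statistic, it can help to list each category's contribution (obs − exp)² / exp separately. This is a small illustrative Python sketch, not part of the original answer key, and the variable names are made up.

```python
# 1998 California population proportions (the null hypothesis).
population_share = {"White": 0.507, "Black": 0.066, "Hispanic": 0.306,
                    "Asian": 0.108, "Other": 0.013}
# Observed graduates in the sample of 1000.
observed = {"White": 679, "Black": 51, "Hispanic": 77, "Asian": 190, "Other": 3}

n = sum(observed.values())
expected = {g: n * share for g, share in population_share.items()}

# Each category's share of the chi-square statistic.
contributions = {g: (observed[g] - expected[g]) ** 2 / expected[g]
                 for g in observed}
total = sum(contributions.values())
largest = max(contributions, key=contributions.get)

for g, c in contributions.items():
    print(f"{g}: observed {observed[g]}, expected {expected[g]:.0f}, contribution {c:.3f}")
print(f"chi-square = {total:.2f} with {len(observed) - 1} df")
print(f"largest contribution: {largest}")
```

The contributions add up to the test statistic, and the Hispanic category accounts for more than half of it.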
5) Criminologists have long debated whether there is a relationship between weather and violent crime. The author of
the article “Is There a Season for Homicide?” classified 1361 homicides according to season resulting in the following
data:
Season    Homicides    (Expected)
Winter    328          (340.25)
Spring    334          (340.25)
Summer    372          (340.25)
Fall      327          (340.25)
Do these data support the theory that the homicide
rate is not the same over the four seasons? Test the
relevant hypotheses using a significance level of .05.
Assumptions:
Representative sample of homicides
All expected counts are greater than 5 (in parentheses)
Ho: Homicides are distributed evenly throughout the seasons.
Ha: Homicides are not distributed evenly throughout the seasons.
Chi-square goodness-of-fit test
2 
(obs  exp)2
 4.03 with 3 df
exp
p-value = x2cdf(4.03, ∞, 3) = .2577
α = .05
Since p-value > α, we fail to reject the null hypothesis. There is not sufficient
evidence to conclude that homicides are not distributed evenly throughout the
seasons.