Applied Statistics: Hypothesis Testing & ANOVA Tutorial

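MATH2411 Applied Statistics
Tutorial Notes 5

Define the number $\alpha := P(\text{Type I error}) = P(\text{reject } H_0 \mid H_0 \text{ is true})$; $\alpha$ is called the Significance Level of the test.
Define the number $\beta := P(\text{Type II error}) = P(\text{not reject } H_0 \mid H_0 \text{ is false})$; $1 - \beta$ is called the Power of the test.

Summary of two-sample tests (sample size $n$ for $X$, sample size $m$ for $Y$):

1. Null Hypothesis: $H_0: \mu_X - \mu_Y = d_0$
   Condition: $\sigma_X, \sigma_Y$ known
   Test Statistic: $z_0 = \dfrac{\bar{x} - \bar{y} - d_0}{\sqrt{\dfrac{\sigma_X^2}{n} + \dfrac{\sigma_Y^2}{m}}}$
   Alternative Hypothesis and Rejection Criteria:
     $H_1: \mu_X - \mu_Y \neq d_0$: reject $H_0$ if $|z_0| > z_{\alpha/2}$
     $H_1: \mu_X - \mu_Y > d_0$: reject $H_0$ if $z_0 > z_\alpha$
     $H_1: \mu_X - \mu_Y < d_0$: reject $H_0$ if $z_0 < -z_\alpha$

2. Null Hypothesis: $H_0: \mu_X - \mu_Y = d_0$
   Condition: $\sigma_X, \sigma_Y$ unknown, $\sigma_X = \sigma_Y$
   Test Statistic: $t_0 = \dfrac{\bar{x} - \bar{y} - d_0}{s_p \sqrt{\dfrac{1}{n} + \dfrac{1}{m}}}$
   Alternative Hypothesis and Rejection Criteria:
     $H_1: \mu_X - \mu_Y \neq d_0$: reject $H_0$ if $|t_0| > t_{n+m-2,\,\alpha/2}$
     $H_1: \mu_X - \mu_Y > d_0$: reject $H_0$ if $t_0 > t_{n+m-2,\,\alpha}$
     $H_1: \mu_X - \mu_Y < d_0$: reject $H_0$ if $t_0 < -t_{n+m-2,\,\alpha}$

3. Null Hypothesis: $H_0: \mu_X - \mu_Y = d_0$
   Condition: $\sigma_X, \sigma_Y$ unknown, $\sigma_X \neq \sigma_Y$
   Test Statistic: $t_0 = \dfrac{\bar{x} - \bar{y} - d_0}{\sqrt{\dfrac{s_X^2}{n} + \dfrac{s_Y^2}{m}}}$
   Alternative Hypothesis and Rejection Criteria:
     $H_1: \mu_X - \mu_Y \neq d_0$: reject $H_0$ if $|t_0| > t_{k,\,\alpha/2}$
     $H_1: \mu_X - \mu_Y > d_0$: reject $H_0$ if $t_0 > t_{k,\,\alpha}$
     $H_1: \mu_X - \mu_Y < d_0$: reject $H_0$ if $t_0 < -t_{k,\,\alpha}$

4. Null Hypothesis: $H_0: \dfrac{\sigma_X^2}{\sigma_Y^2} = r_0$
   Condition: $\mu_X, \mu_Y$ unknown
   Test Statistic: $f_0 = \dfrac{s_X^2}{r_0 \, s_Y^2}$
   Alternative Hypothesis and Rejection Criteria:
     $H_1: \dfrac{\sigma_X^2}{\sigma_Y^2} > r_0$: reject $H_0$ if $f_0 > f_\alpha(n-1, m-1)$
     $H_1: \dfrac{\sigma_X^2}{\sigma_Y^2} \neq r_0$: reject $H_0$ if $f_0 > f_{\alpha/2}(n-1, m-1)$ or $f_0 < \dfrac{1}{f_{\alpha/2}(m-1, n-1)}$
     $H_1: \dfrac{\sigma_X^2}{\sigma_Y^2} < r_0$: reject $H_0$ if $f_0 < \dfrac{1}{f_\alpha(m-1, n-1)}$

where:
$s_p^2 = \dfrac{(n-1) s_X^2 + (m-1) s_Y^2}{n + m - 2}$ is called the pooled sample variance,
and $k = \dfrac{\left( \dfrac{s_X^2}{n} + \dfrac{s_Y^2}{m} \right)^2}{\dfrac{1}{n-1} \left( \dfrac{s_X^2}{n} \right)^2 + \dfrac{1}{m-1} \left( \dfrac{s_Y^2}{m} \right)^2}$.

Warm-up (Distribution of Sample Mean)
Check the distribution table and fill in the blanks:

       Distribution               $\alpha$    Critical value
(a)    $Z \sim N(0, 1)$           0.025       $z_{0.025} =$
(b)    $T \sim t_7$               0.025       $t_{7,\,0.025} =$
(c)    $X^2 \sim \chi^2_9$        0.025       $\chi^2_{9,\,0.025} =$
(d)    $F \sim F_{5,7}$           0.05        $f_{0.05}(5, 7) =$
(e)    $Z \sim N(0, 1)$           0.005       $z_{0.005} =$
(f)    $T \sim t_{22}$            0.005       $t_{22,\,0.005} =$
(g)    $X^2 \sim \chi^2_{23}$     0.005       $\chi^2_{23,\,0.005} =$
(h)    $F \sim F_{7,5}$           0.05        $f_{0.05}(7, 5) =$
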
Example 1 (Test for equality of population variances and means)
The following are the burning times (in minutes) of candles of two different brands.
Assume that all samples are randomly drawn and that the burning times of both
brands are normally distributed.

Sample burning time
Brand X    63 82 81 68 57 64 56 72 63 83
Brand Y    59 66 75 82 73 74 59 82 65 82

(a) For $\alpha = 0.1$, test whether the two populations have the same variance.
(b) Based on your result in (a), for $\alpha = 0.1$, test whether the mean burning times of the
two brands are equal.
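
The test statistics for this example can be sketched in Python as a check on hand calculations. This is a minimal sketch under stated assumptions, not the official solution: it takes r0 = 1 and d0 = 0 as the hypothesised values, and it computes the pooled-variance t statistic on the assumption that part (a) does not reject equal variances (if it does, the unequal-variance statistic applies instead). The critical values are left to be read from the distribution tables.

```python
from math import sqrt
from statistics import mean, variance

brand_x = [63, 82, 81, 68, 57, 64, 56, 72, 63, 83]
brand_y = [59, 66, 75, 82, 73, 74, 59, 82, 65, 82]

n, m = len(brand_x), len(brand_y)
s2_x = variance(brand_x)  # sample variance, divisor n - 1
s2_y = variance(brand_y)  # sample variance, divisor m - 1

# (a) F statistic for H0: sigma_X^2 / sigma_Y^2 = r0 (assumed r0 = 1)
r0 = 1
f0 = s2_x / (r0 * s2_y)

# (b) pooled t statistic for H0: mu_X - mu_Y = d0 (assumed d0 = 0),
#     valid only if (a) does not reject equal variances
d0 = 0
sp2 = ((n - 1) * s2_x + (m - 1) * s2_y) / (n + m - 2)
t0 = (mean(brand_x) - mean(brand_y) - d0) / (sqrt(sp2) * sqrt(1 / n + 1 / m))

print(f"f0 = {f0:.4f}  (compare with f_0.05(9, 9) and 1 / f_0.05(9, 9))")
print(f"sp^2 = {sp2:.4f}, t0 = {t0:.4f}  (compare |t0| with t_(18, 0.05))")
```

Both tests are two-sided at level 0.1, so each tail uses 0.05 when the table is consulted.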
Exercise 1 (Test for difference between population means)
To test whether HKUST professors' average monthly salary is $4,000 higher than that of
professors from other institutions, a random sample of 50 HKUST professors is drawn,
and their average monthly salary is $81,750. A random sample of 200 professors from
other institutions is also drawn, and their average monthly salary is $77,500. Test the
hypothesis at the 0.05 level of significance, assuming that the monthly salaries of both
HKUST and non-HKUST professors follow normal distributions with the same population
standard deviation of $5,000.
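
Since the population standard deviation is known, the z statistic from the summary applies. The sketch below is one way to check the arithmetic; it assumes a two-sided alternative (the difference in means is not equal to 4000), which is a reading of the exercise rather than something it states, and it uses `NormalDist` from the Python standard library in place of the normal table.

```python
from math import sqrt
from statistics import NormalDist

x_bar, n = 81_750, 50    # HKUST sample mean and sample size
y_bar, m = 77_500, 200   # non-HKUST sample mean and sample size
sigma = 5_000            # common known population standard deviation
d0 = 4_000               # hypothesised difference mu_X - mu_Y
alpha = 0.05

# z0 = (x_bar - y_bar - d0) / sqrt(sigma_X^2 / n + sigma_Y^2 / m)
z0 = (x_bar - y_bar - d0) / sqrt(sigma**2 / n + sigma**2 / m)

# Assumption: two-sided test, so the critical value is z_(alpha/2)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
reject = abs(z0) > z_crit

print(f"z0 = {z0:.4f}, z_crit = {z_crit:.4f}, reject H0: {reject}")
```

For a one-sided alternative, the same code works with `NormalDist().inv_cdf(1 - alpha)` and the comparison `z0 > z_crit`.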
Example 2 (2012 Spring Final Exam)
A recent article in the British journal The Lancet reports that babies who were fed
mother's milk tended to have a higher IQ than formula-fed babies. Suppose that two
groups of babies are compared, one group fed mother's milk and the other group
fed formula milk powder. The IQ scores are listed below:

IQ Score
Mother Fed     121 105 111 119 108 101 110 107  98  89
Formula Fed    101  90 131 106 112 103  86 117 113  87

Assume IQ scores are normally distributed with population mean $\mu_X$ and population
variance $\sigma_X^2$ for mother-fed babies, and population mean $\mu_Y$ and population variance
$\sigma_Y^2$ for formula-fed babies. Assume that $\sigma_X^2 \neq \sigma_Y^2$.
(a) Construct a 95% confidence interval for the difference between the IQ mean scores,
$\mu_X - \mu_Y$.
(b) Hence or otherwise, at a 0.05 level of significance, test $H_0: \mu_X = \mu_Y + 25$ against
$H_1: \mu_X \neq \mu_Y + 25$.
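
With unequal unknown variances, the t statistic with k degrees of freedom from the summary applies. The following is a rough sketch, not the exam's model answer. Several things in it are assumptions: the helper name `welch`, the reading of the IQ table as ten scores per group (with 89 and 87 as the last score of each row), the choice to round k down to 15 before using the t table, and the table value 2.131 for the 0.025 upper point of t with 15 degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance

def welch(x, y, d0=0.0):
    """Return (t0, k, se) for H0: mu_X - mu_Y = d0 with unequal variances."""
    n, m = len(x), len(y)
    vx, vy = variance(x) / n, variance(y) / m   # s_X^2 / n and s_Y^2 / m
    se = sqrt(vx + vy)
    k = (vx + vy) ** 2 / (vx**2 / (n - 1) + vy**2 / (m - 1))
    t0 = (mean(x) - mean(y) - d0) / se
    return t0, k, se

# Assumed reading of the IQ table: ten scores per group
mother = [121, 105, 111, 119, 108, 101, 110, 107, 98, 89]
formula = [101, 90, 131, 106, 112, 103, 86, 117, 113, 87]

t0, k, se = welch(mother, formula, d0=25)

# (a) 95% confidence interval: (x_bar - y_bar) +/- t_crit * se
t_crit = 2.131  # assumed table value t_(15, 0.025), with k rounded down to 15
diff = mean(mother) - mean(formula)
ci_low, ci_high = diff - t_crit * se, diff + t_crit * se

print(f"k = {k:.3f}, se = {se:.4f}")
print(f"95% CI for mu_X - mu_Y: ({ci_low:.3f}, {ci_high:.3f})")
print(f"(b) t0 = {t0:.4f}, reject H0 if |t0| > {t_crit}")
```

Part (b) can also be answered from the interval alone: H0 is rejected at the 0.05 level exactly when 25 lies outside the 95% interval.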
A brief summary of course materials:
1. Error Sum of Squares, $SSE = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2$
   Treatment Sum of Squares, $SS_{treat} = \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{x}_{all})^2$
2. $MSE = \dfrac{SSE}{n - k}$ and $MS_{treat} = \dfrac{SS_{treat}}{k - 1}$
3. $F = \dfrac{MS_{treat}}{MSE}$, and if $F > f_\alpha(k - 1, n - k)$,
   then the hypothesis $\mu_1 = \mu_2 = \cdots = \mu_k$ is rejected at significance level $\alpha$.
Example 3
To study whether exam performance is affected by background sound, 12 student volunteers
from the MATH 2411 class are randomly assigned to 3 exam rooms to complete the
same standardized test in statistics; each exam room has 4 students.
Rock music is played in Room X, light music is played in Room Y, and there is no
special background sound in Room Z.
The test scores of the 12 students are shown in the following table.

                    Student 1   Student 2   Student 3   Student 4
Group 1, Room X        50          55          45          40
Group 2, Room Y        75          65          60          60
Group 3, Room Z        65          50          65          70

(a) Write down the sample size $n =$
(b) Write down the number of groups $k =$
(c) Write down the number of sample points in each group.
    $n_1 =$      , $n_2 =$      , $\cdots$ , $n_k =$
(d) Calculate the sample mean score of each group, namely $M_1 = \bar{X}$, $M_2 = \bar{Y}$ and
    $M_3 = \bar{Z}$.
(e) Calculate the Sum of Squares (SS) of each group, namely $SS_X$, $SS_Y$ and $SS_Z$.
(f) Calculate $SSE$, the Error Sum of Squares, where $SSE = SS_X + SS_Y + SS_Z$.
(g) Calculate the mean score of all the 12 students, call it $M$.
(h) Calculate the Treatment Sum of Squares ($SS_{treat}$),
    where $SS_{treat} = \sum_{i=1}^{k} n_i (M_i - M)^2$.
(i) For $\alpha = 0.05$, fill in the following table:

Source      Degrees of freedom   Sum of Squares    Mean Sum of Squares                         F-Value
Treatment   $k - 1 =$            $SS_{treat} =$    $MS_{treat} = \dfrac{SS_{treat}}{k-1} =$    $F = \dfrac{MS_{treat}}{MSE} =$
Error       $n - k =$            $SSE =$           $MSE = \dfrac{SSE}{n-k} =$

$f_\alpha(k - 1, n - k) =$

Hence test the Null Hypothesis $H_0: \mu_X = \mu_Y = \mu_Z$ at significance level $\alpha = 0.05$.
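
Steps (a) to (i) can be followed one by one in a short script, which may help when checking hand calculations. This is only a sketch of the arithmetic; the variable names are illustrative, and the critical value 4.26 for the 0.05 upper point of F with (2, 9) degrees of freedom is an assumed table value that should be confirmed against the course table.

```python
from statistics import mean

rooms = {
    "X": [50, 55, 45, 40],
    "Y": [75, 65, 60, 60],
    "Z": [65, 50, 65, 70],
}

n = sum(len(scores) for scores in rooms.values())        # (a) sample size
k = len(rooms)                                           # (b) number of groups
sizes = {room: len(s) for room, s in rooms.items()}      # (c) group sizes
means = {room: mean(s) for room, s in rooms.items()}     # (d) group means
ss = {room: sum((x - means[room]) ** 2 for x in s)       # (e) group sums of squares
      for room, s in rooms.items()}
sse = sum(ss.values())                                   # (f) SSE
grand_mean = sum(sum(s) for s in rooms.values()) / n     # (g) overall mean M
ss_treat = sum(sizes[r] * (means[r] - grand_mean) ** 2   # (h) SStreat
               for r in rooms)

# (i) ANOVA table entries
ms_treat = ss_treat / (k - 1)
mse = sse / (n - k)
f_value = ms_treat / mse
f_crit = 4.26  # assumed table value f_0.05(2, 9)

print(f"Treatment: df = {k - 1}, SS = {ss_treat:.4f}, MS = {ms_treat:.4f}")
print(f"Error:     df = {n - k}, SS = {sse:.4f}, MS = {mse:.4f}")
print(f"F = {f_value:.4f}, reject H0: {f_value > f_crit}")
```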
Exercise 2
A study of young children's mental arithmetic ability is being conducted on some
native English-speaking, Chinese-speaking, and Italian-speaking pupils, all 8 years
old. The list below summarizes the total number of single-digit multiplication questions
each of them can answer in a unit period of time.

Group 1, English Speaking     3   6   7   4
Group 2, Chinese Speaking    10  12  11  14   8   6
Group 3, Italian Speaking     8   3   2   5

Conduct an ANOVA test at significance level $\alpha = 0.05$ to see whether, on average, the
pupils from different language groups have the same level of mental arithmetic ability.

$n =$      , $k =$      ;
$n_{eng} =$      , $n_{chi} =$      , $n_{ita} =$
$\bar{x}_{eng} =$
$\bar{x}_{chi} =$
$\bar{x}_{ita} =$
$\bar{x}_{all} =$
$SSE =$
$SS_{treat} =$
$MSE = \dfrac{SSE}{n - k} =$
$MS_{treat} = \dfrac{SS_{treat}}{k - 1} =$
$F = \dfrac{MS_{treat}}{MSE} =$
$f_\alpha(k - 1, n - k) =$
$\therefore$ the conclusion is:
that we reject $H_0$ at significance level $\alpha = 0.05$.
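
Because the three groups here have different sizes, it can be convenient to wrap the formulas from the course summary in a reusable function. The sketch below is one possible arrangement; the function name `one_way_anova` and the returned dictionary keys are illustrative choices, and the group lists assume the data table is read row by row (4 English, 6 Chinese and 4 Italian pupils). The critical value is left to the F table.

```python
from statistics import mean

def one_way_anova(groups):
    """Compute the one-way ANOVA quantities for a list of samples."""
    n = sum(len(g) for g in groups)            # total sample size
    k = len(groups)                            # number of groups
    group_means = [mean(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n
    # SSE: squared deviations of each point from its own group mean
    sse = sum((x - gm) ** 2 for g, gm in zip(groups, group_means) for x in g)
    # SStreat: group size times squared deviation of group mean from grand mean
    ss_treat = sum(len(g) * (gm - grand_mean) ** 2
                   for g, gm in zip(groups, group_means))
    mse = sse / (n - k)
    ms_treat = ss_treat / (k - 1)
    return {"n": n, "k": k, "SSE": sse, "SStreat": ss_treat,
            "MSE": mse, "MStreat": ms_treat, "F": ms_treat / mse}

english = [3, 6, 7, 4]
chinese = [10, 12, 11, 14, 8, 6]   # assumed: six pupils in this group
italian = [8, 3, 2, 5]

result = one_way_anova([english, chinese, italian])
for name, value in result.items():
    print(f"{name} = {value:.4f}" if isinstance(value, float) else f"{name} = {value}")
```

The resulting F value is compared with the table value for (k - 1, n - k) degrees of freedom, which is (2, 11) under the reading of the data assumed here.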
Example 4
The following table shows the number of subjects with credits for MATH major students in
the 2012-2013 Fall Semester.
(a) Fill in the given table.

No. of subjects   Frequency   Frequency   Credits no.     Credits no.
with credits      (Boy)       (Girl)      × boy freq.     × girl freq.
0                 17           4
1                 22           8
2                 12          12
3                  6          15
4                  4          22
5                 16          17
6                 11          10
7                  7           5
8                  5           2
Total

(b) Conduct an ANOVA test at significance level $\alpha = 0.01$ to see if boys and girls have the
same average number of subjects with credits.

$SS_{treat} =$
$\therefore MS_{treat} =$
$SSE = SS_{boy} + SS_{girl}$
so $\therefore MSE =$
$\therefore F =$
while $f_\alpha(k - 1, n - k) =$
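
When the data arrive as a frequency table, every sum in the ANOVA formulas is weighted by the frequency of each value. The sketch below illustrates that bookkeeping; the helper `freq_summary` is a hypothetical name introduced here, and the assignment of the two frequency columns to boys and girls follows the order in which the columns appear in the table. The critical value for (1, n - 2) degrees of freedom at level 0.01 is left to the F table.

```python
credit_counts = list(range(9))                  # 0, 1, ..., 8 subjects with credits
boy_freq = [17, 22, 12, 6, 4, 16, 11, 7, 5]
girl_freq = [4, 8, 12, 15, 22, 17, 10, 5, 2]

def freq_summary(values, freqs):
    """Return (count, weighted total, mean, sum of squares) of grouped data."""
    count = sum(freqs)
    total = sum(v * f for v, f in zip(values, freqs))   # column "credits no. x freq."
    avg = total / count
    ss = sum(f * (v - avg) ** 2 for v, f in zip(values, freqs))
    return count, total, avg, ss

n_boy, total_boy, mean_boy, ss_boy = freq_summary(credit_counts, boy_freq)
n_girl, total_girl, mean_girl, ss_girl = freq_summary(credit_counts, girl_freq)

n, k = n_boy + n_girl, 2
grand_mean = (total_boy + total_girl) / n

ss_treat = n_boy * (mean_boy - grand_mean) ** 2 + n_girl * (mean_girl - grand_mean) ** 2
sse = ss_boy + ss_girl
ms_treat = ss_treat / (k - 1)
mse = sse / (n - k)
f_value = ms_treat / mse

print(f"totals: boys {total_boy} over {n_boy}, girls {total_girl} over {n_girl}")
print(f"SStreat = {ss_treat:.4f}, SSE = {sse:.4f}")
print(f"MStreat = {ms_treat:.4f}, MSE = {mse:.4f}, F = {f_value:.4f}")
```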
(Answers will be available at http://ihome.ust.hk/~makittylee)