Lecture 20

Stat 200
Wednesday March 7th

9 Hypothesis Testing
Statistical hypothesis testing is a formal means of choosing between two distributions on the basis of a particular statistic or random variable generated from one of them.
9.1 Neyman-Pearson Paradigm

Null hypothesis: H0
Alternative hypothesis: HA (or H1)
Hypotheses come in two types: simple ones, where the hypothesis completely specifies the distribution, and composite ones, where it does not. Here is an example where both are composite:
Xi ∼ Poisson with unknown parameter
Xi is not Poisson
Simple hypotheses test one value of the parameter against another, the form of the distribution remaining fixed.
9.2 Two Types of Error: the burnt toast and the smoke detector

                           Reality
                    H0 true        H0 false
Test says:
  accept H0         Good           Type II error
  reject H0         Type I error   Good
Setting up tests
1. Define the null hypothesis H0 (devil's advocate).
2. Define the alternative HA (one-sided / two-sided).
3. Find the test statistic.
4. Decide on the type I error probability α that you are willing to take.
5. Compute the probability, under the null hypothesis, of data at least as extreme as those observed: the P-value.
6. Compare the P-value to α; if it is smaller, reject H0.
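The six steps above can be sketched for a one-sided binomial test (a minimal sketch; the function name and the observed count of 9 successes in 10 trials are illustrative choices, not from the lecture):

```python
from math import comb

def binom_pmf(k, n, p):
    """P(Y = k) for Y ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def one_sided_binomial_test(observed, n, p0=0.5, alpha=0.05):
    """Steps 1-6 for H0: p = p0 vs HA: p > p0, with test statistic Y = success count."""
    # Step 5: P-value = P(Y >= observed | H0)
    p_value = sum(binom_pmf(k, n, p0) for k in range(observed, n + 1))
    # Step 6: reject H0 when the P-value is below alpha
    return p_value, p_value < alpha

p_value, reject = one_sided_binomial_test(observed=9, n=10)
print(p_value, reject)  # P(Y >= 9 | p = .5) = 11/1024 ≈ 0.0107 < .05, so reject
```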
9.2.1 Example

ESP experiment: guess the color of 52 cards with replacement.
H0: T ∼ B(p = 1/2, n = 10)
H1: T ∼ B(p, n = 10), p > 1/2
Binomial probabilities P(T = t) for n = 10:

 p\t    0      1      2      3      4      5      6      7      8      9     10
 .7   .0000  .0001  .0014  .0090  .0368  .1029  .2001  .2668  .2335  .1211  .0282
 .6   .0001  .0016  .0106  .0425  .1115  .2007  .2508  .2150  .1209  .0403  .0060
 .5   .0010  .0098  .0439  .1172  .2051  .2461  .2051  .1172  .0439  .0098  .0010

Rejection region R = {8, 9, 10}:
α = P(X > 7) = P(R) = 0.0439 + 0.0098 + 0.0010 = 0.0547

Rejection region R = {7, 8, 9, 10}:
α = P(X > 6) = P(R) = 0.1172 + 0.0439 + 0.0098 + 0.0010 = 0.1719
In the setup, we have to choose α first. Then we can compute what the power will be under various values of p:

P(X > 7 | p = 0.6) = 0.1673
P(X > 7 | p = 0.7) = 0.3828

As p approaches 1, the power approaches one. As p approaches 0.5, the power becomes equal to α.
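Using the rejection region R = {8, 9, 10} above, the power P(T ≥ 8 | p) can be tabulated directly from the binomial tail (a minimal sketch; the grid of p values is my own choice):

```python
from math import comb

def power(p, n=10, cutoff=8):
    """Power of the test with rejection region {cutoff, ..., n}: P(T >= cutoff | p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(cutoff, n + 1))

for p in (0.5, 0.6, 0.7, 0.9, 0.99):
    print(f"p = {p}: power = {power(p):.4f}")
# At p = 0.5 the power equals alpha = 0.0547; it increases toward 1 as p -> 1.
```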
9.3 Bayesian Version

We can give a probability of H0 being true given the data; this can actually be more meaningful than the P-value.
Consider testing

H0: θ ≤ θ0 against H1: θ > θ0;

under the Bayesian model we can calculate P(θ ≤ θ0 | data), the integral of the posterior pdf over (−∞, θ0]. Of course this poses a problem if the pdf is continuous and one of the hypotheses is simple: a point null then carries prior probability zero unless we give it a point mass.
Example: Ten sets of twins, randomized; in each pair one twin is given treatment A, the other treatment B.

p = P(trt A better than trt B)

H0: p = 1/2
HA: p ≠ 1/2

We observe Y, the number of cases where A is better, Y ∼ B(10, p).

Assign a positive prior point probability to the null hypothesis: P(p = 1/2) = γ. Suppose we take a symmetric prior pdf for p ≠ 1/2,

g(p | p ≠ 1/2) = 6p(1 − p),   0 < p < 1.
If the experiment results in 9 wins for A and one for B, the likelihood given this data is

L(p) = P(Y = 9 | p) = 10 p^9 (1 − p).

We need the unconditional probability of Y = 9:

P(Y = 9) = γ × P(Y = 9 | p = .5) + (1 − γ) × ∫₀¹ P(Y = 9 | p) g(p) dp
         = γ × 10(.5)^10 + (1 − γ) × 10 ∫₀¹ 6 p^10 (1 − p)^2 dp
         = 10 { γ/1024 + (1 − γ)/143 }

By Bayes' rule,

P(H0 | data) = 143γ / ( 143γ + 1024(1 − γ) )

When γ = 0.5, P(no trt effect | data) = 0.123 after nine successes; if the prior is γ = .1, the posterior probability of H0 becomes .015.
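A quick numerical check of this posterior (a minimal sketch; the function name and the use of math.gamma to evaluate the beta integral are my own):

```python
from math import comb, gamma

def posterior_null(g0, y=9, n=10):
    """P(H0 | Y = y): prior point mass g0 on p = 1/2, density 6p(1-p) over the alternative."""
    like_null = comb(n, y) * 0.5**n                  # P(Y = 9 | p = .5) = 10/1024
    # ∫0^1 C(n,y) p^y (1-p)^(n-y) · 6p(1-p) dp = 6·C(n,y)·Γ(y+2)Γ(n-y+2)/Γ(n+4)
    # (for y = 9, n = 10 this is 60·B(11,3) = 10/143)
    like_alt = 6 * comb(n, y) * gamma(y + 2) * gamma(n - y + 2) / gamma(n + 4)
    return g0 * like_null / (g0 * like_null + (1 - g0) * like_alt)

print(round(posterior_null(0.5), 3))  # 0.123
print(round(posterior_null(0.1), 3))  # 0.015
```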
9.4 Optimal Tests for Simple Hypotheses

Null hypothesis H0: f = f0
Alternative hypothesis HA: f = f1

We want to find a rejection region R such that the errors of both types are as small as possible:

∫_R f0(x) dx = α   and   1 − β = ∫_R f1(x) dx
Neyman-Pearson Lemma

Theorem 9.1 For testing f0(x) against f1(x), a critical region of the form

Λ(x) = f0(x)/f1(x) < K,

where K is a constant, has the greatest power (smallest β) in the class of tests with the same α.

Proof. Let S be any other rejection region. For any integrable f, the common part R ∩ S cancels:

∫_R f − ∫_S f = ∫_{R∩S^c} f − ∫_{S∩R^c} f

Applying this with f = f1,

β_S − β_R = ∫_R f1 − ∫_S f1 = ∫_{R∩S^c} f1 − ∫_{S∩R^c} f1

Since in R we have f1 ≥ f0/K, and in R^c we have −f1 ≥ −f0/K:

β_S − β_R ≥ (1/K) ( ∫_{R∩S^c} f0 − ∫_{S∩R^c} f0 )
          = (1/K) ( ∫_R f0 − ∫_S f0 ) = (1/K)(α_R − α_S)

So if α_S ≤ α_R, then β_S ≥ β_R.
Example: two simple hypotheses about a Bernoulli parameter, based on Y ∼ B(5, p):

H0: p = 0.5 against H1: p = 0.25

 y   f0(y)   f1(y)   Λ(y)
 0   .0312   .2373    .13
 1   .1562   .3955    .40
 2   .3125   .2637    1.2
 3   .3125   .0879    3.6
 4   .1562   .0146   10.6
 5   .0312   .0010   31.2

 R          α       β      Power
 {}         0       1      0
 {0}        0.031   .763   0.237
 {0,1}      0.187   .367   0.633
 {0,1,2}    0.500   .103   0.897
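A sketch reproducing the table above from the two binomial pmfs; note the exact values round slightly differently from the table's truncated entries (e.g. α({0,1}) = 0.1875 and Λ(5) = 32 exactly, since the table's .0312/.0010 = 31.2 uses rounded probabilities):

```python
from math import comb

n = 5
f0 = [comb(n, y) * 0.5**n for y in range(n + 1)]                    # Binomial(5, .50) pmf
f1 = [comb(n, y) * 0.25**y * 0.75**(n - y) for y in range(n + 1)]   # Binomial(5, .25) pmf

# Likelihood ratio Lambda(y) = f0(y)/f1(y)
for y in range(n + 1):
    print(y, round(f0[y], 4), round(f1[y], 4), round(f0[y] / f1[y], 2))

# Error sizes for the nested rejection regions {}, {0}, {0,1}, {0,1,2}
for cutoff in range(4):
    R = list(range(cutoff))
    alpha = sum(f0[y] for y in R)      # size under H0
    power = sum(f1[y] for y in R)      # power under H1 (beta = 1 - power)
    print(R, round(alpha, 3), round(1 - power, 3), round(power, 3))
```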
9.4.1 Neyman-Pearson Test for N(0, θ)

Given a random sample from X, the likelihood is

L(θ) = θ^(−n/2) exp( −(1/(2θ)) Σ Xi² )

For the NP likelihood ratio test, the rejection region for testing θ1 against θ2 is of the form

Λ = (θ1/θ2)^(−n/2) exp( −( 1/(2θ1) − 1/(2θ2) ) Σ Xi² ) < K′

If θ1 < θ2, this inequality holds for a given K′ if and only if, for some K, Σ Xi² > K.

The distribution of Σ Xi²/θ is χ²_n, and we can thus find the error sizes in this case:

α = P( Σ Xi² > K | θ1 ) = 1 − F_{χ²_n}(K/θ1)
β = P( Σ Xi² < K | θ2 ) = F_{χ²_n}(K/θ2)
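These error sizes can be checked by Monte Carlo (a simulation sketch, not from the lecture; θ1 = 1, θ2 = 2, n = 10 are illustrative choices, and K = 18.307 is the 95% quantile of χ²₁₀, so the true α is 0.05):

```python
import random

random.seed(0)

def error_sizes(theta1, theta2, K, n=10, sims=100_000):
    """Estimate alpha = P(sum Xi^2 > K | theta1) and beta = P(sum Xi^2 < K | theta2)."""
    def tail(theta):
        # Fraction of simulated samples from N(0, theta) whose sum of squares exceeds K
        hits = 0
        for _ in range(sims):
            s = sum(random.gauss(0, theta**0.5)**2 for _ in range(n))
            hits += (s > K)
        return hits / sims
    alpha = tail(theta1)         # reject H0 when theta = theta1 (H0 true)
    beta = 1 - tail(theta2)      # accept H0 when theta = theta2 (H0 false)
    return alpha, beta

alpha, beta = error_sizes(theta1=1.0, theta2=2.0, K=18.307)
print(alpha, beta)   # alpha near 0.05; beta near 0.48
```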
9.5 Composite Hypotheses

Typically one or both hypotheses are composite. Suppose HA is composite; a test that is most powerful for every simple hypothesis contained in HA is called uniformly most powerful (UMP).
Example:
Suppose we test H0: p = 0.5 based on Y, the number of 1's in a random sample of size n = 5. We saw that rejecting H0 for Y < K is most powerful against p = 0.25 in the class of tests with α = 0.187. But the same is true if we were to use an arbitrary p < 0.5. The likelihood ratio is:
Λ = ( .5^Y .5^(5−Y) ) / ( p^Y (1 − p)^(5−Y) ) ∝ (1/p − 1)^Y,

an increasing function of Y when p < 1/2. So Λ is less than a constant when Y is less than a constant. Thus Y < K is most powerful against every p < .5; it is UMP against HA: p < .5.
Note:
The rejection region {0, 1} for Y is UMP for p = 0.5 against p < 0.5, in the class of tests with α = 0.187. The power function is

π(p) = P(Y = 0 or 1 | p) = (1 − p)^5 + 5p(1 − p)^4 = (1 − p)^4 (1 + 4p)

This is a decreasing function of p on (0, 1), so its largest value on p ≥ .5 is at p = 0.5. So α* = .187 = π(.5) = max of the α's over the composite null p ≥ .5.

Since {Y ≤ 1} has the greatest power at any alternative p < .5 among tests with α ≤ .187, it has the greatest power at any alternative among tests with α* = .187; it is UMP for p ≥ .5 against p < .5.
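A quick check of the power function π(p) = (1 − p)^4 (1 + 4p) (a minimal sketch; the grid of p values is my own choice):

```python
def power_fn(p):
    """pi(p) = P(Y = 0 or 1 | p) for Y ~ Binomial(5, p): power of the region {Y <= 1}."""
    return (1 - p)**4 * (1 + 4 * p)

values = [power_fn(p) for p in (0.1, 0.25, 0.5, 0.75, 0.9)]
print([round(v, 4) for v in values])
assert all(a > b for a, b in zip(values, values[1:]))  # decreasing in p
print(power_fn(0.5))  # 0.1875, the size alpha* over the composite null p >= .5
```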