ELE530: Neyman-Pearson and Composite Hypothesis Testing

Fernando Pérez-Cruz
Princeton University and University Carlos III in Madrid
February 16th, 2009
Theory of Detection and Estimation
Neyman-Pearson
Composite Hypothesis Testing
I Simple hypothesis testing:
• Binary: decide whether y = 0 or y = 1, given an observation.
• Multiple: decide whether y = 1, y = 2, . . . , y = M .
I Composite binary hypothesis testing:
• Decide whether y = c or y ≠ c.
• Decide whether y ≥ c or y < c.
• Generally, decide whether y ∈ Y0 or y ∉ Y0.
I Composite multiple hypothesis testing:
• Decide whether y ∈ Y0, y ∈ Y1, y ∈ Y2, . . . and $\bigcup_{i=1}^{M} Y_i = Y$.
Binary Hypothesis Testing
I Null hypothesis: H0 : y ∈ Y0
I Alternative hypothesis: H1 : y ∉ Y0
I Define the rejection region: X1 = {x | T(x) > c}
I We reject the null hypothesis if x ∈ X1.
I We do not reject the null hypothesis if x ∉ X1.

          | Retain Null                  | Reject Null
H0 True   | √                            | Type I Error (False Alarm)
H1 True   | Type II Error (Misdetection) | √
Definitions
I Power of a Test:
β(y) = P (x ∈ X1|y)
It measures the probability of rejecting the null hypothesis for a
given y and rejection region.
I The size of a test:
$$\alpha = \sup_{y \in Y_0} \beta(y)$$
It measures the maximum probability of rejecting the null hypothesis when it is true (type I error).
I A test has level α if its size is less than or equal to α.
Example
I We are given n observations from N(y, σ²) with known σ.
I We want to know whether y ≤ 0 (H0) or y > 0 (H1).
I We are given the test statistic $T(x) = \frac{1}{n}\sum_{i=1}^{n} x_i$.
I We want to select c to get a 0.05-level test. How do we do it?
• Rejection region: X1 = {x | T(x) > c}
• The power function:
$$\beta(y) = P(T(x) > c) = P\left(\frac{\sqrt{n}\,(T(x) - y)}{\sigma} > \frac{\sqrt{n}\,(c - y)}{\sigma}\right) = P\left(Z > \frac{\sqrt{n}\,(c - y)}{\sigma}\right)$$
$$= 1 - P\left(Z \le \frac{\sqrt{n}\,(c - y)}{\sigma}\right) = 1 - \Phi\left(\frac{\sqrt{n}\,(c - y)}{\sigma}\right)$$
where Z ∼ N(0, 1) and Φ(·) is the cdf of a standard Gaussian.
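One possible way to answer the question above: β(y) increases with y, so its supremum over y ≤ 0 is reached at y = 0, and setting β(0) = α gives c = σ Φ⁻¹(1 − α) / √n. The Python sketch below uses only the standard library; the values n = 10 and σ = 1 and the function names are assumptions made for illustration, not values stated on the slide.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # Z ~ N(0, 1)

def power(y, c, n, sigma):
    """Power function beta(y) = 1 - Phi(sqrt(n) * (c - y) / sigma)."""
    return 1.0 - STD_NORMAL.cdf(sqrt(n) * (c - y) / sigma)

def threshold(alpha, n, sigma):
    """Solve beta(0) = alpha for c (y = 0 is the worst case inside H0: y <= 0)."""
    return sigma * STD_NORMAL.inv_cdf(1.0 - alpha) / sqrt(n)

# Assumed illustrative values (not given on the slide).
n, sigma = 10, 1.0
c = threshold(0.05, n, sigma)
size = power(0.0, c, n, sigma)

print(f"c = {c:.3f}, size = {size:.3f}")
for y in (-0.5, 0.0, 0.5, 1.0):
    print(f"beta({y:+.1f}) = {power(y, c, n, sigma):.3f}")
```

With these assumed values the threshold comes out at c ≈ 0.52.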
I Power of our test for different levels, with $X_1 = \{x \mid \frac{1}{n}\sum_i x_i > c\}$.
[Figure: two panels showing the power function β(y) versus y, for y between −1 and 2. Left panel legend: c = 0, c = 0.52, c = 0.815, with annotated levels α = 0.5, α = 0.05, α = 0.005. Right panel legend: n = 5, n = 10, n = 25.]
Most Powerful Test
I For T(x), we find the c that achieves the desired level α.
I α is the maximum probability of false alarm for any y ∈ Y0.
I The minimum value of β(y) for y ∉ Y0 is the minimum probability of detection, which corresponds to the maximum probability of misdetection.
I Most Powerful Test: test with the highest power for a given level:
• Maximize the probability of detection for a given false alarm.
$$\max_{\delta} P_d(\delta) \quad \text{subject to} \quad P_{fa}(\delta) \le \alpha$$
• In terms of the conditional risk R(δ|y), the worst-case detection probability is 1 − sup_{y∉Y0} R(δ|y), so this is equivalent to
$$\min_{\delta} \sup_{y \notin Y_0} R(\delta|y) \quad \text{subject to} \quad \sup_{y \in Y_0} R(\delta|y) \le \alpha$$
Some tests
I There are many different tests.
I It is difficult to prove that a test is the most powerful test.
I A test is proposed and its power is compared with that of other tests.
I Some tests of interest:
• Wald test: verifies if y = y0.
• Permutation test: verifies if p(x) and q(x) are the same.
• Likelihood ratio test: verifies if y ∈ Y0 versus y ∉ Y0.
• Neyman-Pearson test: verifies if y = y0 versus y = y1.
Neyman-Pearson test
I Decide between the hypotheses y = y0 and y = y1 given x = {x1, . . . , xn}.
I Neyman-Pearson test:
$$T(x) = \frac{L(y = y_1)}{L(y = y_0)} = \frac{\prod_{i=1}^{n} p(x_i \mid y = y_1)}{\prod_{i=1}^{n} p(x_i \mid y = y_0)}$$
I Neyman-Pearson rejection region:
X1 = {x | T(x) > k}
where k is such that β(y0) = α.
I Neyman-Pearson lemma:
• Most powerful test.
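As an illustration of the statistic T(x) and the choice of k, here is a small Python sketch. The two hypotheses N(0, 1) and N(1, 1), the sample values and the function name are assumptions chosen for the example. For this particular pair, log T(x) = Σ xi − n/2, so T(x) > k is a threshold on the sample mean and k can be written in closed form.

```python
from math import exp, prod, sqrt
from statistics import NormalDist

def likelihood_ratio(xs, p1, p0):
    """T(x) = prod_i p1(x_i) / prod_i p0(x_i) for i.i.d. observations."""
    return prod(p1(x) for x in xs) / prod(p0(x) for x in xs)

# Assumed simple hypotheses (illustrative only).
h0 = NormalDist(mu=0.0, sigma=1.0)  # y = y0
h1 = NormalDist(mu=1.0, sigma=1.0)  # y = y1

xs = [0.3, 1.2, -0.4, 0.9]          # hypothetical sample
n = len(xs)
T = likelihood_ratio(xs, h1.pdf, h0.pdf)

# Under H0 the sample mean is N(0, 1/n); pick the mean threshold c with size alpha,
# then map it to the likelihood-ratio threshold via log k = n * (c - 1/2).
alpha = 0.05
c = NormalDist().inv_cdf(1.0 - alpha) / sqrt(n)
k = exp(n * (c - 0.5))

reject = T > k
print(f"T(x) = {T:.3f}, k = {k:.3f}, reject H0: {reject}")
```

Working with log T(x) instead of T(x) is usually preferable for larger n, since the products of densities can underflow.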
Example I
I Null hypothesis: $p(x \mid y = 0) = \frac{\exp(-(x+1)^2/2)}{\sqrt{2\pi}}$
I Alternative hypothesis: $p(x \mid y = 1) = \frac{\exp(-x^2/8)}{\sqrt{8\pi}}$
I Rejection region:
$$X_1 = \left\{ x \;\middle|\; \frac{1}{2}\exp\left(-\frac{x^2}{8} + \frac{(x+1)^2}{2}\right) > k \right\}$$
[Figure: left panel, log L(x) versus x (from −6 to 6) together with the threshold log τ; right panel, the densities p(x|y=0) and p(x|y=1) versus x.]
[Figures for k = 0, k = 1 and k = −0.5. Each slide shows log L(x) versus x with the threshold log τ, the densities p(x|y=0) and p(x|y=1) with the areas labelled Pfa and Pmiss, and the resulting point in the plane R(δ|y=1) versus R(δ|y=0).]
Receiver Operating Characteristic
I Repeat for every value of k:
[Figure: two ROC panels. Left: R(δ|y=1) versus R(δ|y=0). Right: detection rate versus false alarm.]
I What does the red circle represent?
I And the green?
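The sweep over k can be sketched numerically for Example I. The code below treats the threshold as a threshold on log L(x) (reading the slide values k = 0, 1, −0.5 this way is an assumption), uses the fact that log L(x) = (3/8)x² + x + 1/2 − log 2 is a convex quadratic, and returns one (Pfa, Pd) pair per threshold. The function name and the threshold grid are hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist

p0 = NormalDist(mu=-1.0, sigma=1.0)  # null:        exp(-(x+1)^2/2) / sqrt(2*pi)
p1 = NormalDist(mu=0.0, sigma=2.0)   # alternative: exp(-x^2/8) / sqrt(8*pi)

def operating_point(log_k):
    """Return (P_fa, P_d) for the rejection region {x : log L(x) > log_k}."""
    # log L(x) = -log 2 - x^2/8 + (x+1)^2/2 = a*x^2 + b*x + c0
    a, b, c0 = 3.0 / 8.0, 1.0, 0.5 - log(2.0)
    disc = b * b - 4.0 * a * (c0 - log_k)
    if disc <= 0.0:
        return 1.0, 1.0  # the threshold is below the minimum of log L: always reject
    lo = (-b - sqrt(disc)) / (2.0 * a)
    hi = (-b + sqrt(disc)) / (2.0 * a)
    # The quadratic is convex, so the rejection region is x < lo or x > hi.
    p_fa = p0.cdf(lo) + 1.0 - p0.cdf(hi)
    p_d = p1.cdf(lo) + 1.0 - p1.cdf(hi)
    return p_fa, p_d

# Sweep the log-threshold over an assumed grid to trace the ROC.
roc = [operating_point(t / 10.0) for t in range(-20, 51)]
for log_k in (-0.5, 0.0, 1.0):
    p_fa, p_d = operating_point(log_k)
    print(f"log k = {log_k:+.1f}: Pfa = {p_fa:.3f}, Pd = {p_d:.3f}")
```

Each entry of `roc` is one point of the curve; in the notation of the figure axes, R(δ|y=0) is Pfa and R(δ|y=1) is the misdetection probability 1 − Pd.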
Example II: Biased Coin
I Null Hypothesis: fair coin.
I Alternative Hypothesis: biased coin p(H) = 0.8.
I We toss the coin three times.
I We want to find the most powerful test with level 0.2.
I What do we do?
• Define a rejection set.
• Compute the power function β(y).
• Obtain the test level.
• Obtain the test power.
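The steps above can be sketched as follows. This is a minimal illustration, not necessarily the construction used in the lecture: it ranks the eight outcomes by likelihood ratio, grows the rejection set while the size stays within the level, and then randomizes on the next outcome to use the remaining level. All variable names are hypothetical.

```python
from itertools import product

def prob(seq, p_heads):
    """Probability of one particular sequence of tosses."""
    heads = seq.count("H")
    return p_heads ** heads * (1.0 - p_heads) ** (len(seq) - heads)

P_NULL, P_ALT, LEVEL = 0.5, 0.8, 0.2
outcomes = ["".join(s) for s in product("HT", repeat=3)]

# Rank outcomes by likelihood ratio p(x | biased) / p(x | fair), largest first.
ranked = sorted(outcomes, key=lambda s: prob(s, P_ALT) / prob(s, P_NULL), reverse=True)

# Hard (deterministic) rule: add outcomes while the size does not exceed the level.
rejection, size, power = [], 0.0, 0.0
for s in ranked:
    if size + prob(s, P_NULL) > LEVEL:
        break
    rejection.append(s)
    size += prob(s, P_NULL)
    power += prob(s, P_ALT)

# Randomized rule: reject the next-ranked outcome with probability gamma_rand
# so that the size equals the level exactly.
next_outcome = ranked[len(rejection)]
gamma_rand = (LEVEL - size) / prob(next_outcome, P_NULL)
power_rand = power + gamma_rand * prob(next_outcome, P_ALT)

print(rejection, size, power)
print(next_outcome, gamma_rand, power_rand)
```

The hard rule cannot reach size 0.2 because each outcome has probability 1/8 under the fair coin, which is why the randomized rule gains power.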
Example II
I Observation set: x ∈ {HHH, . . . , TTT}
I Rejection set: X1 = { }
[Figure: R(δ|y=1) versus R(δ|y=0), both axes from 0 to 1.]
I Hard decision rules.
[Figure: R(δ|y=1) versus R(δ|y=0).]
I Randomized decision rules.
[Figure: R(δ|y=1) versus R(δ|y=0).]
I ROC: Receiver Operating Characteristic
[Figure: detection rate versus false alarm.]
Risk Composite Hypothesis Testing
I Simple hypothesis testing:
$$R(\delta) = \int L(\delta(x), y)\, p(x, y)\, dx\, dy$$
I Composite hypothesis testing:
$$R(\delta) = \int L(\delta(x), h(y))\, p(x, y)\, dx\, dy$$
I Binary composite hypothesis testing:
$$R(\delta) = L_{00} \int_{X_0, Y_0} p(x, y)\, dx\, dy + L_{10} \int_{X_1, Y_0} p(x, y)\, dx\, dy + L_{01} \int_{X_0, Y_1} p(x, y)\, dx\, dy + L_{11} \int_{X_1, Y_1} p(x, y)\, dx\, dy$$
Example
I We are given a coin and we want to know whether it is biased towards heads.
• H0: π > 0.5
• H1: π ≤ 0.5
where π = p(H).
I We are given 5 coin tosses: X = {HHHHH, . . . , TTTTT}.
I We know a priori that π is Beta distributed:
$$p(\pi) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)}\, \pi^{\alpha - 1} (1 - \pi)^{\beta - 1}$$
with α = 2 and β = 3.5.
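I Risk for δ(x) = H0:
$$R(\delta(x) = H_0) = L_{00} \int_{Y_0} p(x, y)\, dy + L_{01} \int_{Y_1} p(x, y)\, dy$$
I Risk for δ(x) = H1:
$$R(\delta(x) = H_1) = L_{11} \int_{Y_1} p(x, y)\, dy + L_{10} \int_{Y_0} p(x, y)\, dy$$
I where, with y = π and N_H the number of heads in x,
$$p(x, y) = \binom{5}{N_H} \pi^{N_H} (1 - \pi)^{5 - N_H}\, \frac{\Gamma(5.5)}{\Gamma(2)\Gamma(3.5)}\, \pi (1 - \pi)^{2.5}$$
I How would you choose X1, if L00 = L11 = 0 and L01 = L10 = 1?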
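One way to explore this question numerically: with L00 = L11 = 0 and L01 = L10 = 1, the risk of deciding H1 is the integral of p(x, y) over Y0 and the risk of deciding H0 is the integral over Y1, so x is placed in X1 when the first integral is the smaller one. The sketch below evaluates both integrals with a midpoint rule, which is an implementation choice made here; the function names are hypothetical.

```python
from math import comb, gamma

def joint(n_heads, pi, n=5, a=2.0, b=3.5):
    """p(x, pi): binomial likelihood of n_heads heads times the Beta(a, b) prior."""
    prior = gamma(a + b) / (gamma(a) * gamma(b)) * pi ** (a - 1) * (1 - pi) ** (b - 1)
    return comb(n, n_heads) * pi ** n_heads * (1 - pi) ** (n - n_heads) * prior

def integrate(f, lo, hi, steps=20000):
    """Midpoint rule; avoids evaluating f exactly at the endpoints."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

decisions, total_mass = {}, 0.0
for n_heads in range(6):
    mass_y0 = integrate(lambda p: joint(n_heads, p), 0.5, 1.0)  # H0: pi > 0.5
    mass_y1 = integrate(lambda p: joint(n_heads, p), 0.0, 0.5)  # H1: pi <= 0.5
    # Risk of deciding H1 is mass_y0, risk of deciding H0 is mass_y1.
    decisions[n_heads] = "H1" if mass_y0 < mass_y1 else "H0"
    total_mass += mass_y0 + mass_y1
    print(n_heads, round(mass_y0, 4), round(mass_y1, 4), decisions[n_heads])
```

The rejection region X1 is then the set of outcomes whose number of heads maps to "H1" in `decisions`.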
Uniformly Most Powerful Test
I Is there a Neyman-Pearson test for composite hypothesis testing? No.
I Uniformly Most Powerful Test:
• For y = y0 (H0) versus y ∈ Y1 (H1).
• It exists if the rejection region X1 that satisfies R(δ|y = y0) ≤ α is independent of y, i.e. the same for every y ∈ Y1.
• Example: test if the mean of a Gaussian is µ0 (H0) or greater (H1).
X1 = {x | x > σΦ⁻¹(1 − α) + µ0}
Independent of y. Why?
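A sketch of one possible answer, assuming a single observation x ∼ N(µ, σ²) with known σ (the single-observation setting is read off the rejection region above). For any alternative µ1 > µ0 the log-likelihood ratio is

$$\log \frac{p(x \mid \mu_1)}{p(x \mid \mu_0)} = \frac{\mu_1 - \mu_0}{\sigma^2}\, x - \frac{\mu_1^2 - \mu_0^2}{2\sigma^2}$$

which is increasing in x, so every Neyman-Pearson rejection region has the form {x | x > c}. The constant c is fixed by the size constraint, which involves only µ0:

$$P(x > c \mid \mu_0) = 1 - \Phi\left(\frac{c - \mu_0}{\sigma}\right) = \alpha \quad\Longrightarrow\quad c = \sigma \Phi^{-1}(1 - \alpha) + \mu_0$$

The same region is therefore most powerful against every µ1 > µ0.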
Locally Most Powerful Test
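I UMP tests usually do not exist.
I For y = y0 and y ∈ Y1.
I We can apply the locally most powerful test.
I Idea: compare y0 with y tending towards y0.
• $\beta(y) = \beta(y_0) + (y - y_0) \left.\frac{\partial \beta(y)}{\partial y}\right|_{y = y_0} + O\big((y - y_0)^2\big)$.
• $\beta(y) = \alpha + (y - y_0) \left.\frac{\partial \beta(y)}{\partial y}\right|_{y = y_0} + O\big((y - y_0)^2\big)$.
• We want a test that maximizes $\left.\frac{\partial \beta(y)}{\partial y}\right|_{y = y_0}$.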
Generalized Likelihood Ratio Test
I Neyman-Pearson test:
$$T(x) = \frac{L(y = y_1)}{L(y = y_0)} = \frac{\prod_{i=1}^{n} p(x_i \mid y = y_1)}{\prod_{i=1}^{n} p(x_i \mid y = y_0)}$$
I GLR test:
$$T(x) = \frac{\sup_{y \in Y_1} L(y)}{\sup_{y \in Y_0} L(y)} = \frac{\sup_{y \in Y_1} \prod_{i=1}^{n} p(x_i \mid y)}{\sup_{y \in Y_0} \prod_{i=1}^{n} p(x_i \mid y)}$$
I GLR rejection region:
X1 = {x | T(x) > k}
where k is such that α = sup_{y∈Y0} β(y).
I This test is not necessarily the most powerful.
Example
I Repeat the previous example.
I For x = {HHHHH}:
• p(x|π) = π⁵
• sup_{y∈Y1} L(y) = 0.5⁵ = 0.03125
• sup_{y∈Y0} L(y) = 1⁵ = 1
I For x = {THHHH}, . . . , {HHHHT}:
• p(x|π) = 5π⁴(1 − π)
• sup_{y∈Y1} L(y) = 5 × 0.5⁵ = 0.15625
• sup_{y∈Y0} L(y) = 5 × 0.8⁴ × 0.2 = 0.4096
I For x = {TTHHH}, . . . , {HHHTT}:
• p(x|π) = 10π³(1 − π)²
• sup_{y∈Y1} L(y) = 10 × 0.5⁵ = 0.3125
• sup_{y∈Y0} L(y) = 10 × 0.6³ × 0.4² = 0.3456
I For x = {TTTHH}, . . . , {HHTTT}:
• p(x|π) = 10π²(1 − π)³
• sup_{y∈Y1} L(y) = 10 × 0.4² × 0.6³ = 0.3456
• sup_{y∈Y0} L(y) = 10 × 0.5⁵ = 0.3125
I For x = {TTTTH}, . . . , {HTTTT}:
• p(x|π) = 5π(1 − π)⁴
• sup_{y∈Y1} L(y) = 5 × 0.2 × 0.8⁴ = 0.4096
• sup_{y∈Y0} L(y) = 5 × 0.5⁵ = 0.15625
I For x = {TTTTT}:
• p(x|π) = (1 − π)⁵
• sup_{y∈Y1} L(y) = 1⁵ = 1
• sup_{y∈Y0} L(y) = 0.5⁵ = 0.03125
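The suprema above can be reproduced with a short Python sketch. It relies on the binomial likelihood being unimodal in π with its maximum at N_H/5, so the constrained supremum is obtained by clipping that value to the closure of each hypothesis set (for the open set π > 0.5 the supremum is approached at the boundary). The function names are hypothetical.

```python
from math import comb

def likelihood(n_heads, pi, n=5):
    """Probability of observing n_heads heads in n tosses when p(H) = pi."""
    return comb(n, n_heads) * pi ** n_heads * (1 - pi) ** (n - n_heads)

def sup_likelihood(n_heads, lo, hi, n=5):
    """Supremum of the likelihood over pi in [lo, hi]: clip the MLE n_heads / n."""
    pi_hat = min(max(n_heads / n, lo), hi)
    return likelihood(n_heads, pi_hat, n)

glr = {}
for n_heads in range(5, -1, -1):
    sup_y0 = sup_likelihood(n_heads, 0.5, 1.0)  # H0: pi > 0.5 (closure)
    sup_y1 = sup_likelihood(n_heads, 0.0, 0.5)  # H1: pi <= 0.5
    glr[n_heads] = sup_y1 / sup_y0              # GLR statistic T(x)
    print(n_heads, round(sup_y1, 5), round(sup_y0, 5), round(glr[n_heads], 5))
```

The statistic T(x) grows as the number of heads decreases, so a rejection region {x | T(x) > k} collects the outcomes with the fewest heads first.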