Other Test Constructions: Likelihood Ratio & Bayes Tests

Side-Note: So far we have seen a few approaches for creating tests, such as
• Neyman-Pearson Lemma ("most powerful" tests of $H_0: \theta = \theta_0$ vs $H_1: \theta = \theta_1$)
• Two methods for "uniformly most powerful" (UMP) tests
– Method I: Based on the Neyman-Pearson Lemma (may work for $\mathbb{R}^p$-valued parameters $\theta$ and tests of $H_0: \theta \in \Theta_0 \subset \mathbb{R}^p$ vs $H_1: \theta \notin \Theta_0$)
– Method II: Monotone Likelihood Ratio (may work for real-valued $\theta \in \mathbb{R}$ and tests "$H_0: \theta \le \theta_0$ vs $H_1: \theta > \theta_0$" or "$H_0: \theta \ge \theta_0$ vs $H_1: \theta < \theta_0$")
The tests above are often rooted in comparing likelihoods to make a testing decision (e.g., the Neyman-Pearson Lemma). We next consider a very general testing procedure based on the ratio of two likelihoods.
A.) Likelihood Ratio Tests
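Definition: Let $f(\mathbf{x}|\theta)$, $\theta \in \Theta \subset \mathbb{R}^p$, be the joint pdf/pmf of $\mathbf{X} = (X_1, \ldots, X_n)$ (the parameter $\theta$ can be vector-valued) and let $\Theta_0$ be a nonempty proper subset of $\Theta$. Then, the likelihood ratio statistic (LRS) for testing $H_0: \theta \in \Theta_0 \subset \mathbb{R}^p$ vs $H_1: \theta \notin \Theta_0$ is defined as
$$\lambda(\mathbf{x}) = \frac{\max_{\theta \in \Theta_0} f(\mathbf{x}|\theta)}{\max_{\theta \in \Theta} f(\mathbf{x}|\theta)}.$$
Note that if $\hat{\theta}$ denotes the MLE of $\theta$ over the entire $\Theta$ and $\tilde{\theta}$ denotes the maximizer of $f(\mathbf{x}|\theta)$ over $\theta \in \Theta_0$, then we may write
$$\lambda(\mathbf{x}) = \frac{f(\mathbf{x}|\tilde{\theta})}{f(\mathbf{x}|\hat{\theta})}.$$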
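As a quick illustration of the definition (a worked sketch under an assumed model, not one of the examples in these notes), suppose $X_1, \ldots, X_n$ are iid $N(\theta, 1)$ and $\Theta_0 = \{\theta_0\}$. Then $\tilde{\theta} = \theta_0$, $\hat{\theta} = \bar{x}$, the normalizing constants cancel, and the identity $\sum (x_i - \theta_0)^2 = \sum (x_i - \bar{x})^2 + n(\bar{x} - \theta_0)^2$ gives:

```latex
\lambda(\mathbf{x})
  = \frac{\exp\left\{-\tfrac{1}{2}\sum_{i=1}^n (x_i - \theta_0)^2\right\}}
         {\exp\left\{-\tfrac{1}{2}\sum_{i=1}^n (x_i - \bar{x})^2\right\}}
  = \exp\left\{-\tfrac{n}{2}\,(\bar{x} - \theta_0)^2\right\}
```

In this sketch $0 < \lambda(\mathbf{x}) \le 1$, and small values of $\lambda(\mathbf{x})$ correspond to large values of $|\bar{x} - \theta_0|$, i.e., to data that disagree with $H_0$.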
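Definition: A size $\alpha$ likelihood ratio test (LRT) for testing $H_0: \theta \in \Theta_0 \subset \mathbb{R}^p$ vs $H_1: \theta \notin \Theta_0$ is defined as
$$\varphi(\mathbf{x}) = \begin{cases} 1 & \text{if } \lambda(\mathbf{x}) < k \\ \gamma & \text{if } \lambda(\mathbf{x}) = k \\ 0 & \text{if } \lambda(\mathbf{x}) > k \end{cases}$$
where $\gamma \in [0, 1]$ and $0 \le k \le 1$ are constants determined by $\max_{\theta \in \Theta_0} E_\theta\, \varphi(\mathbf{X}) = \alpha$.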
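Example: Let $X_1, \ldots, X_n$ be iid Gamma($\alpha = 3$, $\theta$), $\theta > 0$ (here $\alpha = 3$ is the shape parameter, not the test size). Find a size $\alpha$ LRT for $H_0: \theta = \theta_0$ vs $H_1: \theta \ne \theta_0$.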
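A minimal Python sketch of the statistic for this example, assuming $\theta$ is the scale parameter (the notes do not state the parameterization). Under that assumption $\hat{\theta} = \bar{x}/3$ and $\log \lambda(\mathbf{x}) = 3n[\log(\hat{\theta}/\theta_0) + 1 - \hat{\theta}/\theta_0]$. The function names `gamma_lrs` and `calibrate_k` are hypothetical, and the cutoff $k$ is approximated by simulation under $H_0$ rather than derived exactly.

```python
import math
import random

SHAPE = 3  # known shape parameter from the example


def gamma_lrs(xs, theta0):
    """Likelihood ratio statistic for H0: theta = theta0 with Gamma(3, theta) data.

    Assumes theta is a SCALE parameter, so the unrestricted MLE is mean(xs) / 3.
    """
    n = len(xs)
    theta_hat = sum(xs) / (SHAPE * n)  # unrestricted MLE
    ratio = theta_hat / theta0
    # log lambda = 3n * [log(ratio) + 1 - ratio], which is always <= 0
    log_lam = SHAPE * n * (math.log(ratio) + 1.0 - ratio)
    return math.exp(log_lam)


def calibrate_k(n, theta0, alpha, reps=20000, seed=0):
    """Monte Carlo estimate of k such that P_theta0(lambda < k) is about alpha."""
    rng = random.Random(seed)
    lams = sorted(
        gamma_lrs([rng.gammavariate(SHAPE, theta0) for _ in range(n)], theta0)
        for _ in range(reps)
    )
    return lams[int(alpha * reps)]
```

Because $\lambda(\mathbf{X})$ is continuous in this model, $P(\lambda(\mathbf{X}) = k) = 0$ and the randomization constant $\gamma$ plays no role; the test simply rejects when `gamma_lrs(xs, theta0) < k`.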
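Example: Let $X_1, \ldots, X_n$ be iid Exponential($\theta$, $\nu$), $\theta > 0$, $\nu \in \mathbb{R}$, with common pdf
$$f(x|\theta, \nu) = \begin{cases} \frac{1}{\theta} e^{-(x - \nu)/\theta} & \text{if } x \ge \nu \\ 0 & \text{otherwise} \end{cases}$$
Find a size $\alpha$ LRT for $H_0: \nu = \nu_0$ vs $H_1: \nu \ne \nu_0$ (where $\nu_0 \in \mathbb{R}$ is fixed).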
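One way to sketch the statistic for this example in Python: maximizing the likelihood gives $\hat{\nu} = x_{(1)}$ (the sample minimum) and $\hat{\theta} = \bar{x} - x_{(1)}$ over the full parameter space, and $\tilde{\theta} = \bar{x} - \nu_0$ under $H_0$ provided $x_{(1)} \ge \nu_0$. Each maximized likelihood has the form $\theta^{-n} e^{-n}$, so $\lambda(\mathbf{x}) = [(\bar{x} - x_{(1)})/(\bar{x} - \nu_0)]^n$ when $x_{(1)} \ge \nu_0$ and $\lambda(\mathbf{x}) = 0$ otherwise. The function name `shifted_exp_lrs` is hypothetical, and the cutoff $k$ is left unspecified here.

```python
def shifted_exp_lrs(xs, nu0):
    """Likelihood ratio statistic for H0: nu = nu0 in the two-parameter exponential model."""
    n = len(xs)
    x_min = min(xs)
    x_bar = sum(xs) / n
    if x_min < nu0:
        # An observation below nu0 has zero likelihood under H0.
        return 0.0
    theta_hat = x_bar - x_min    # unrestricted MLE of theta (with nu_hat = x_min)
    theta_tilde = x_bar - nu0    # MLE of theta when nu is fixed at nu0
    if theta_hat == 0.0:
        raise ValueError("degenerate sample: all observations are equal")
    return (theta_hat / theta_tilde) ** n


# Small illustration: mean 2, minimum 1, so lambda = (1/2)^3 when nu0 = 0.
print(shifted_exp_lrs([1.0, 2.0, 3.0], nu0=0.0))
```

The statistic equals 1 when $\nu_0$ coincides with the sample minimum and shrinks toward 0 as $\nu_0$ moves further below it.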
B.) Large Sample Properties of LRT Tests (for calibration)
The following result describes the asymptotic distribution of the likelihood ratio statistic (under appropriate regularity conditions) and may be used to calibrate an LRT in a simple fashion when the sample size is sufficiently large.
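Theorem: Let $\mathbf{X}_1, \mathbf{X}_2, \ldots$ be iid random vectors with common pdf/pmf $f(\mathbf{x}|\theta)$, $\theta \in \Theta \subset \mathbb{R}^p$ (the parameter $\theta$ can be vector-valued). Let $\lambda_n(\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_n)$ denote the likelihood ratio statistic based on $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_n$ for testing $H_0: \theta \in \Theta_0 \subset \mathbb{R}^p$ vs $H_1: \theta \notin \Theta_0$, where $\Theta_0$ has the form
$$\Theta_0 = \left\{ \theta = (\theta_1, \ldots, \theta_p) \in \Theta : \theta_1 = \theta_1^0, \ldots, \theta_r = \theta_r^0 \right\}$$
for some hypothesized values $\theta_1^0, \ldots, \theta_r^0$ of the first $r \le p$ parameters. Then, under the Cramér-Rao type regularity conditions, it holds that: if $H_0$ is true,
$$-2 \log \lambda_n(\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_n) \xrightarrow{d} \chi^2_r \quad \text{as } n \to \infty.$$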
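Remark: The above limiting distribution suggests the following testing procedure based on the $(1 - \alpha)$-quantile of a $\chi^2_r$ distribution, denoted as $\chi^2_{1-\alpha}(r)$, for which
$$P\left(\chi^2_r \le \chi^2_{1-\alpha}(r)\right) = 1 - \alpha, \qquad P\left(\chi^2_r > \chi^2_{1-\alpha}(r)\right) = \alpha.$$
Namely, for $n$ large (say, $n \ge 30$ observations),
$$\varphi(\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_n) = \begin{cases} 1 & \text{if } -2 \log \lambda_n(\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_n) > \chi^2_{1-\alpha}(r) \\ 0 & \text{otherwise} \end{cases}$$
is an approximate size $\alpha$ LRT for testing "$H_0: \theta_1 = \theta_1^0, \ldots, \theta_r = \theta_r^0$" vs "$H_1: \theta_i \ne \theta_i^0$ for some $1 \le i \le r$."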
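The procedure in this remark can be sketched in Python for the Gamma(3, $\theta$) example of Section A, where $p = r = 1$. The sketch assumes $\theta$ is the scale parameter, in which case $-2 \log \lambda_n = 6n[\hat{\theta}/\theta_0 - 1 - \log(\hat{\theta}/\theta_0)]$ with $\hat{\theta} = \bar{x}/3$. To stay within the standard library it uses the fact that a $\chi^2_1$ quantile is the square of a standard normal quantile; the function names are hypothetical, and the simulation at the end is only a rough check of the size.

```python
import math
import random
from statistics import NormalDist


def chi2_quantile_1df(p):
    """p-quantile of chi-square(1): the square of a standard normal quantile."""
    return NormalDist().inv_cdf((1.0 + p) / 2.0) ** 2


def neg2_log_lambda(xs, theta0, shape=3):
    """-2 log(lambda_n) for H0: theta = theta0 with Gamma(shape, scale=theta) data."""
    n = len(xs)
    ratio = sum(xs) / (shape * n * theta0)  # theta_hat / theta0
    return 2.0 * shape * n * (ratio - 1.0 - math.log(ratio))


def asymptotic_lrt(xs, theta0, alpha=0.05):
    """Return 1 (reject H0) or 0, using the chi-square(1) cutoff since r = 1."""
    return int(neg2_log_lambda(xs, theta0) > chi2_quantile_1df(1.0 - alpha))


# Rough check of the size under H0 by simulation (seeded for reproducibility).
rng = random.Random(1)
theta0, n, reps = 2.0, 30, 4000
rejections = sum(
    asymptotic_lrt([rng.gammavariate(3, theta0) for _ in range(n)], theta0)
    for _ in range(reps)
)
print(f"empirical size: {rejections / reps:.3f}")
```

For $r > 1$ the same structure applies with a $\chi^2_r$ quantile in place of `chi2_quantile_1df`, which would typically come from a statistics library.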
Example: Let $\mathbf{X}_1, \mathbf{X}_2, \ldots$ be iid $N_2(\mu, A)$ random vectors, where $\mu = (\mu_1, \mu_2) \in \mathbb{R}^2$ and $A$ is a known $2 \times 2$ positive definite matrix. Find a size $\alpha$ LRT for testing $H_0: 2\mu_1 + 3\mu_2 = 0$ vs $H_1: 2\mu_1 + 3\mu_2 \ne 0$.
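A possible sketch for this example, based on $n$ observations with sample mean $\bar{\mathbf{x}}$. Writing $c = (2, 3)'$, the unrestricted MLE is $\hat{\mu} = \bar{\mathbf{x}}$, and minimizing $(\bar{\mathbf{x}} - \mu)' A^{-1} (\bar{\mathbf{x}} - \mu)$ subject to $c'\mu = 0$ gives $-2 \log \lambda_n = n (c'\bar{\mathbf{x}})^2 / (c' A c)$. Since $c'\bar{\mathbf{X}} \sim N(0, c'Ac/n)$ under $H_0$, this statistic is exactly $\chi^2_1$ in this model, so the chi-square cutoff with $r = 1$ applies. The function names below are hypothetical.

```python
from statistics import NormalDist


def neg2_log_lambda_linear(xbar, A, n, c=(2.0, 3.0)):
    """-2 log(lambda_n) for H0: c'mu = 0 with N_2(mu, A) data and known A.

    xbar: sample mean (two numbers); A: 2x2 covariance as nested sequences.
    """
    c_xbar = c[0] * xbar[0] + c[1] * xbar[1]               # c' xbar
    c_A_c = (c[0] * (A[0][0] * c[0] + A[0][1] * c[1])
             + c[1] * (A[1][0] * c[0] + A[1][1] * c[1]))   # c' A c
    return n * c_xbar ** 2 / c_A_c


def lrt_linear(xbar, A, n, alpha=0.05):
    """Return 1 (reject H0) or 0 using the chi-square(1) cutoff."""
    cutoff = NormalDist().inv_cdf(1.0 - alpha / 2.0) ** 2  # chi-square(1) quantile
    return int(neg2_log_lambda_linear(xbar, A, n) > cutoff)


identity = [[1.0, 0.0], [0.0, 1.0]]
print(lrt_linear((1.0, 1.0), identity, n=10))   # statistic is 10 * 25 / 13
print(lrt_linear((3.0, -2.0), identity, n=10))  # 2*3 + 3*(-2) = 0, statistic is 0
```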
C.) Bayes Tests
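Let $X_1, \ldots, X_n$ have joint pdf/pmf $f(\mathbf{x}|\theta)$, $\theta \in \Theta \subset \mathbb{R}^p$, and suppose we want to test $H_0: \theta \in \Theta_0 \subset \mathbb{R}^p$ vs $H_1: \theta \notin \Theta_0$. Let
• $\pi(\theta)$ be a prior pdf
• $P(\theta \in \Theta_0 | \mathbf{x}) = \int_{\Theta_0} f_{\theta|\mathbf{x}}(\theta)\, d\theta$ be the posterior probability that $\theta \in \Theta_0$
• $P(\theta \notin \Theta_0 | \mathbf{x}) = \int_{\Theta \setminus \Theta_0} f_{\theta|\mathbf{x}}(\theta)\, d\theta$ be the posterior probability that $\theta \notin \Theta_0$
• Note that $P(\theta \in \Theta_0 | \mathbf{x}) + P(\theta \notin \Theta_0 | \mathbf{x}) = 1$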
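Then, a Bayes test for testing $H_0: \theta \in \Theta_0$ vs $H_1: \theta \notin \Theta_0$ is given by
$$\varphi(\mathbf{x}) = \begin{cases} 1 & \text{if } P(\theta \notin \Theta_0 | \mathbf{x}) \ge P(\theta \in \Theta_0 | \mathbf{x}) \\ 0 & \text{otherwise} \end{cases} \;=\; \begin{cases} 1 & \text{if } P(\theta \notin \Theta_0 | \mathbf{x}) \ge 1/2 \\ 0 & \text{otherwise} \end{cases}$$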
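Discussion: The Bayes test follows from minimizing the Bayes risk $BR_{\varphi_1}$ of a simple test $\varphi_1(\mathbf{x})$ (i.e., a test with $\varphi_1(\mathbf{x}) \in \{0, 1\}$ for any $\mathbf{x}$).
• Consider a loss function $L(\theta, a) = I_{\{\theta \in \Theta_0\}} I_{\{a = 1\}} + I_{\{\theta \notin \Theta_0\}} I_{\{a = 0\}}$, where "$a$" may assume two values: $a = 1$ means "reject $H_0$" and $a = 0$ means "don't reject $H_0$". So, the loss is $L(\theta, a) = 0$ for a correct decision and $L(\theta, a) = 1$ for an incorrect decision:
$$L(\theta, a) = \begin{cases} 0 & \text{if } (\theta \in \Theta_0 \text{ and } a = 0) \text{ or } (\theta \notin \Theta_0 \text{ and } a = 1) \\ 1 & \text{otherwise} \end{cases}$$
• The risk function of a simple test $\varphi_1(\mathbf{x})$ is
$$R_{\varphi_1}(\theta) = E_\theta L(\theta, \varphi_1(\mathbf{X})) = I_{\{\theta \in \Theta_0\}} \underbrace{P_\theta(\varphi_1(\mathbf{X}) = 1)}_{\text{prob. of Type I error}} + I_{\{\theta \notin \Theta_0\}} \underbrace{P_\theta(\varphi_1(\mathbf{X}) = 0)}_{\text{prob. of Type II error}}$$
• The Bayes risk of $\varphi_1(\mathbf{x})$ w.r.t. $\pi(\theta)$ is the expected risk when $\theta$ is drawn from the prior, $BR_{\varphi_1} = E_{\pi}[R_{\varphi_1}(\theta)] = \int_\Theta R_{\varphi_1}(\theta) \pi(\theta)\, d\theta$, and the Bayes test $\varphi(\mathbf{x})$ minimizes $BR_{\varphi_1}$ over all simple tests $\varphi_1(\mathbf{x})$.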
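• Alternatively, we can find the Bayes test by minimizing the posterior risk of a simple test $\varphi_1(\mathbf{x})$ for each fixed $\mathbf{x}$, where the posterior risk is
$$\begin{aligned} E_{\theta|\mathbf{x}} L(\theta, \varphi_1(\mathbf{x})) &= \int_\Theta \left( I_{\{\theta \in \Theta_0\}} I_{\{\varphi_1(\mathbf{x}) = 1\}} + I_{\{\theta \notin \Theta_0\}} I_{\{\varphi_1(\mathbf{x}) = 0\}} \right) f_{\theta|\mathbf{x}}(\theta)\, d\theta \\ &= I_{\{\varphi_1(\mathbf{x}) = 1\}} \int_{\Theta_0} f_{\theta|\mathbf{x}}(\theta)\, d\theta + I_{\{\varphi_1(\mathbf{x}) = 0\}} \int_{\Theta \setminus \Theta_0} f_{\theta|\mathbf{x}}(\theta)\, d\theta \\ &= I_{\{\varphi_1(\mathbf{x}) = 1\}} P(\theta \in \Theta_0 | \mathbf{x}) + I_{\{\varphi_1(\mathbf{x}) = 0\}} P(\theta \notin \Theta_0 | \mathbf{x}). \end{aligned}$$
For each fixed $\mathbf{x}$, we choose the value $\varphi_1(\mathbf{x}) = 1$ or $0$ of the test to minimize the posterior risk; that is, for each fixed $\mathbf{x}$, we should pick $\varphi_1(\mathbf{x}) = 1$ if $P(\theta \notin \Theta_0 | \mathbf{x}) \ge P(\theta \in \Theta_0 | \mathbf{x})$ and pick $\varphi_1(\mathbf{x}) = 0$ if $P(\theta \notin \Theta_0 | \mathbf{x}) < P(\theta \in \Theta_0 | \mathbf{x})$. Note this is the same decision rule as the Bayes test $\varphi(\mathbf{x})$ above.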
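The posterior-risk argument can be illustrated with a small Python sketch. For simplicity it assumes a posterior supported on finitely many values of $\theta$ (the notes work with a pdf, so this discrete setup is an assumption made only to keep the arithmetic transparent). Under the 0-1 loss, the posterior risk of rejecting is $P(\theta \in \Theta_0 | \mathbf{x})$ and the posterior risk of not rejecting is $P(\theta \notin \Theta_0 | \mathbf{x})$. The function name `bayes_decision` and the example numbers are hypothetical.

```python
def bayes_decision(posterior, in_null):
    """Pick the action (1 = reject H0, 0 = don't reject) with smaller posterior risk.

    posterior: dict mapping theta -> posterior probability (values sum to 1)
    in_null:   function returning True when theta belongs to Theta_0
    """
    p_null = sum(p for theta, p in posterior.items() if in_null(theta))
    p_alt = 1.0 - p_null
    # Posterior risk of each action under 0-1 loss.
    risk = {1: p_null, 0: p_alt}
    # Ties go to rejection, matching the ">=" in the Bayes test.
    action = 1 if risk[1] <= risk[0] else 0
    return action, risk


# Example posterior on three values of theta, with H0: theta <= 0.5.
posterior = {0.2: 0.1, 0.5: 0.3, 0.8: 0.6}
action, risk = bayes_decision(posterior, lambda theta: theta <= 0.5)
print(action, risk)
```

Here rejecting has posterior risk of about 0.4 and not rejecting has posterior risk of about 0.6, so the sketch rejects $H_0$, in agreement with the rule $P(\theta \notin \Theta_0 | \mathbf{x}) \ge 1/2$.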
Example: Let $X_1, \ldots, X_n$ be iid $N(\theta, 1)$, $\theta \in \mathbb{R}$. Find the Bayes test for $H_0: \theta \le \theta_0$ vs $H_1: \theta > \theta_0$ under the $N(\mu, \tau^2)$ prior for $\theta$, where $\mu, \tau^2, \theta_0$ are fixed.
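A minimal sketch for this example, using the standard normal-normal conjugacy result: the posterior of $\theta$ given $\mathbf{x}$ is normal with precision $n + 1/\tau^2$ and mean $(n\bar{x} + \mu/\tau^2)/(n + 1/\tau^2)$. The Bayes test then rejects when $P(\theta > \theta_0 | \mathbf{x}) \ge 1/2$, which for a normal posterior is equivalent to the posterior mean being at least $\theta_0$. The function name `bayes_test_normal` is hypothetical.

```python
import math
from statistics import NormalDist


def bayes_test_normal(xs, theta0, mu, tau2):
    """Bayes test of H0: theta <= theta0 for N(theta, 1) data and a N(mu, tau2) prior.

    Returns (decision, P(theta > theta0 | x)), where decision 1 means reject H0.
    """
    n = len(xs)
    precision = n + 1.0 / tau2                     # posterior precision
    post_mean = (sum(xs) + mu / tau2) / precision  # posterior mean
    post_sd = math.sqrt(1.0 / precision)           # posterior standard deviation
    # Posterior probability of the alternative under N(post_mean, post_sd^2).
    p_alt = 1.0 - NormalDist(post_mean, post_sd).cdf(theta0)
    return int(p_alt >= 0.5), p_alt


decision, p_alt = bayes_test_normal([1.0, 1.0, 1.0, 1.0], theta0=0.0, mu=0.0, tau2=1.0)
print(decision, round(p_alt, 3))
```

With these illustrative inputs the posterior mean is 0.8, which exceeds $\theta_0 = 0$, so the test rejects $H_0$.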