10-705: Intermediate Statistics                                        Fall 2012

                              Homework 4 Solutions

Lecturer: Larry Wasserman                          TA: Wanjie Wang, Haijie Gu
Problem 1

The joint pdf of X_1, ..., X_n is

f(x_1, ..., x_n | θ) = ∏_{i=1}^n e^{iθ − x_i} I_{[iθ, ∞)}(x_i)
                     = e^{θ Σ_i i} e^{−Σ_i x_i} I(x_i ≥ iθ for all i)
                     = e^{−Σ_i x_i} · e^{θ n(n+1)/2} I(min_i(x_i/i) ≥ θ)
                     = h(x) · g(T(x)|θ),

where T(x) = min_i(x_i/i). Therefore, by the Factorization Theorem (C&B Theorem 6.2.6),
T = min_i(X_i/i) is a sufficient statistic for θ.
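As a quick numerical illustration (added here, not part of the original solution), the numpy sketch below simulates from f(x_i|θ) = e^{iθ − x_i} on [iθ, ∞), equivalently X_i = iθ + E_i with E_i ~ Exp(1), and evaluates the sufficient statistic T = min_i(X_i/i). The value of θ, the sample size, and the seed are illustrative.

import numpy as np

rng = np.random.default_rng(0)
theta = 2.0                      # illustrative true parameter
n = 10
idx = np.arange(1, n + 1)

# X_i = i*theta + E_i with E_i ~ Exp(1) has density e^{i*theta - x} on [i*theta, infinity)
x = idx * theta + rng.exponential(1.0, size=n)

T = np.min(x / idx)              # the sufficient statistic from the factorization
print(T)                         # always >= theta, and approaches theta as n grows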
Problem 2. C&B 6.5 (20 points)

Similarly to the previous problem, we can write out the joint pdf to identify a sufficient statistic:

f(x_1, x_2, ..., x_n | θ) = ∏_{i=1}^n (1/(2iθ)) I(−i(θ − 1) ≤ x_i ≤ i(θ + 1))
                          = (1/n!) (1/(2θ))^n I(min_i(x_i/i) ≥ −(θ − 1)) I(max_i(x_i/i) ≤ θ + 1)
                          = h(x) · g(T(x)|θ),

where T(x) = (min_i(x_i/i), max_i(x_i/i)). Therefore, (min_i(X_i/i), max_i(X_i/i)) is a
two-dimensional sufficient statistic for θ.
Problem 3. C&B 6.8 (20 points)

The likelihood ratio of two samples {X_i} and {Y_i} is

R(x, y; θ) = p(y; θ) / p(x; θ) = [ ∏_{i=1}^n I(θ ≤ Y_i ≤ θ + 1) ] / [ ∏_{i=1}^n I(θ ≤ X_i ≤ θ + 1) ].

R(x, y; θ) does not depend on θ if and only if min(X) = min(Y) and max(X) = max(Y).
Therefore the minimal sufficient statistic is T(X) = (min(X), max(X)).

Problem 4

The joint pdf of X_1, X_2, ..., X_n is

f(x|θ) = ∏_{i=1}^n f(x_i − θ) = ∏_{i=1}^n f(x_{(i)} − θ),

where x_{(i)} is the i-th smallest observation (order statistic). Therefore, by the Factorization
Theorem, the order statistics T(X_1, X_2, ..., X_n) = (X_{(1)}, X_{(2)}, ..., X_{(n)}) are sufficient
for θ. The ratio f(x|θ)/f(y|θ) is independent of θ if and only if T(x) = T(y), so by C&B
Theorem 6.2.13 the order statistics are minimal sufficient. In particular, since f can be any
pdf, no further reduction is possible. The question about further reduction is a little vague
here, so we are going to ignore it when grading.

Problem 5. C&B 7.1 (10 points)

One observation is taken on a discrete random variable X with pmf f(x|θ), where θ ∈ {1, 2, 3}:

x     f(x|1)   f(x|2)   f(x|3)
0      1/3      1/4       0
1      1/3      1/4       0
2       0       1/4      1/4
3      1/6      1/4      1/2
4      1/6       0       1/4

Find the MLE of θ.

Solution
1. x = 0: the likelihood is L(θ) = (1/3) I(θ = 1) + (1/4) I(θ = 2) + 0 · I(θ = 3)
   = (1/3) I(θ = 1) + (1/4) I(θ = 2), therefore the MLE is θ̂ = 1.
2. x = 1: L(θ) = (1/3) I(θ = 1) + (1/4) I(θ = 2), θ̂ = 1.
3. x = 2: L(θ) = (1/4) I(θ = 2) + (1/4) I(θ = 3), θ̂ = 2 or θ̂ = 3.
4. x = 3: L(θ) = (1/6) I(θ = 1) + (1/4) I(θ = 2) + (1/2) I(θ = 3), θ̂ = 3.
5. x = 4: L(θ) = (1/6) I(θ = 1) + (1/4) I(θ = 3), θ̂ = 3.

Finally,

θ̂ = 1         if X = 0, 1;
θ̂ = 2 or 3    if X = 2;
θ̂ = 3         if X = 3, 4.
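A short numpy check of this table lookup (an added illustration, not part of the graded solution): enumerate f(x|θ) for each x and report every maximizing θ, so ties are visible.

import numpy as np

# pmf table f(x | theta), rows indexed by x = 0,...,4, columns by theta = 1, 2, 3
f = np.array([[1/3, 1/4, 0.0],
              [1/3, 1/4, 0.0],
              [0.0, 1/4, 1/4],
              [1/6, 1/4, 1/2],
              [1/6, 0.0, 1/4]])
thetas = np.array([1, 2, 3])

for x in range(5):
    best = thetas[f[x] == f[x].max()]        # all maximizers, so ties are reported
    print("x =", x, "MLE(s) =", best)        # x = 2 gives the tie between 2 and 3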
Problem 6. C&B 7.6 (10 points)

Let X_1, ..., X_n be a random sample from the pdf

f(x|θ) = θ x^{−2},   0 < θ ≤ x < ∞.

(a) What is a sufficient statistic for θ?
(b) Find the MLE of θ.
(c) Find the method of moments estimator of θ.

Solution
(a) The joint likelihood is

L(θ) = ∏_{i=1}^n (θ / x_i²) I(θ ≤ x_i) = (θ^n / ∏_{i=1}^n x_i²) I(θ ≤ min_i x_i).

Take h(x) = ∏_{i=1}^n 1/x_i² and g(θ, T(x)) = θ^n I(θ ≤ min_i x_i); then by the Factorization
Theorem the sufficient statistic is T = min_i X_i.

(b) Since θ^n / ∏_{i=1}^n x_i² is strictly increasing in θ, and I(θ ≤ min_i x_i) puts an upper
cutoff beyond which L(θ) vanishes, the MLE is θ̂_MLE = min_i X_i.

(c) E|X_i| = E[X_i] = ∫_θ^∞ x f(x|θ) dx = ∫_θ^∞ (θ/x) dx diverges, thus we cannot get an
expression for any moment; therefore the method of moments estimator does not exist.
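As an added sanity check (not part of the original solution), one can simulate from f(x|θ) = θ x^{−2} by inverse-CDF sampling, since F(x) = 1 − θ/x gives X = θ/U with U ~ Uniform(0, 1), and confirm that min_i X_i sits just above θ while the sample mean is unstable. numpy is assumed and the constants are illustrative.

import numpy as np

rng = np.random.default_rng(1)
theta, n = 3.0, 500                          # illustrative values

# F(x) = 1 - theta/x for x >= theta, so X = theta / U with U ~ Uniform(0, 1)
x = theta / rng.uniform(size=n)

print("MLE min_i X_i :", x.min())            # slightly above theta
print("sample mean   :", x.mean())           # unstable: E|X| is infinite, so no MoM estimator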
Problem 7. C&B 7.7 (10 points)

Let X_1, ..., X_n be iid with one of two pdfs. If θ = 0, then

f(x|θ) = 1 if 0 < x < 1,  and 0 otherwise,

while if θ = 1, then

f(x|θ) = 1/(2√x) if 0 < x < 1,  and 0 otherwise.

Find the MLE of θ.

Solution

L(0|x) = 1 for 0 < x_i < 1, while

L(1|x) = ∏_{i=1}^n 1/(2√x_i) for 0 < x_i < 1.

Then θ̂_MLE = 0 when 1 ≥ ∏_{i=1}^n [1/(2√x_i)] I(x_i ∈ (0, 1)), and θ̂_MLE = 1 otherwise.
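A small added sketch of this decision rule in numpy (the helper name mle_theta and the test data are illustrative): compare log L(1|x) with log L(0|x) = 0 to avoid underflow for large n.

import numpy as np

def mle_theta(x):
    """MLE for C&B 7.7: theta-hat = 0 unless L(1|x) = prod 1/(2 sqrt(x_i)) exceeds L(0|x) = 1."""
    x = np.asarray(x, dtype=float)
    log_l1 = np.sum(-np.log(2.0) - 0.5 * np.log(x))   # log L(1|x); log L(0|x) = 0
    return 1 if log_l1 > 0.0 else 0

rng = np.random.default_rng(2)
print(mle_theta(rng.uniform(size=20)))        # data drawn from the theta = 0 model
print(mle_theta(rng.uniform(size=20) ** 2))   # U^2 has density 1/(2 sqrt(x)): the theta = 1 model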
Problem 8. C&B 7.12 (a,b) (10 points)

Let X_1, ..., X_n be a random sample from a population with pmf

P_θ(X = x) = θ^x (1 − θ)^{1−x},   x = 0 or 1,   0 ≤ θ ≤ 1/2.

(a) Find the method of moments estimator and the MLE of θ.
(b) Find the mean squared errors of each of the estimators.
Solution

(a) EX = θ, therefore θ̂_MOME = X̄. The likelihood is

L(θ|x) = θ^{n x̄} (1 − θ)^{n − n x̄},   x_i = 0 or 1,   0 ≤ θ ≤ 1/2.

θ̂_MLE maximizes L(θ|x), or equivalently log L(θ|x) (see Example 7.2.7 in C&B):

∂/∂θ log L(θ|x) = ∂/∂θ [ n x̄ log θ + (n − n x̄) log(1 − θ) ] = n x̄ (1/θ) − (n − n x̄) (1/(1 − θ)).

The derivative is 0 when n x̄ (1 − θ) = (n − n x̄) θ, i.e. when θ = x̄. Since the derivative is
positive for θ < x̄ and the parameter space is 0 ≤ θ ≤ 1/2, the MLE is θ̂_MLE = min(X̄, 1/2),
as in Example 7.2.7.
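The following added numpy sketch computes both estimators on simulated Bernoulli data and estimates their mean squared errors by Monte Carlo, which is the comparison part (b) asks about; the sample size, true θ, and replication count are illustrative.

import numpy as np

rng = np.random.default_rng(3)
n, theta, reps = 10, 0.3, 100_000            # illustrative values (theta <= 1/2)

x = rng.binomial(1, theta, size=(reps, n))
xbar = x.mean(axis=1)

mom = xbar                                   # method of moments estimator
mle = np.minimum(xbar, 0.5)                  # MLE restricted to 0 <= theta <= 1/2

print("MSE(MOM) ~", np.mean((mom - theta) ** 2))
print("MSE(MLE) ~", np.mean((mle - theta) ** 2))   # never exceeds MSE(MOM) when theta <= 1/2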
C&B 7.14
For a sample (Z_1, W_1), ..., (Z_n, W_n), where Z = min(X, Y) and W indicates whether the
minimum came from X, the joint density is

f(z, w|λ, µ) = (1/µ) e^{−z(1/λ + 1/µ)}  for z > 0, w = 0,
f(z, w|λ, µ) = (1/λ) e^{−z(1/λ + 1/µ)}  for z > 0, w = 1.

Hence the likelihood is

L(λ, µ) = ∏_{i=1}^n f(z_i, w_i|λ, µ) = λ^{−Σ_i w_i} µ^{−(n − Σ_i w_i)} exp{ −(Σ_i z_i)(1/λ + 1/µ) },

so that

log L(λ, µ) = −(Σ_i w_i) log λ − (n − Σ_i w_i) log µ − (Σ_i z_i)(1/λ + 1/µ).

Taking derivatives with respect to λ and µ and setting them to zero gives

−(Σ_i w_i)/λ + (Σ_i z_i)/λ² = 0,    −(n − Σ_i w_i)/µ + (Σ_i z_i)/µ² = 0.

Therefore the MLE (assuming 0 < Σ_i w_i < n) is

λ̂ = (Σ_i z_i) / (Σ_i w_i),    µ̂ = (Σ_i z_i) / (n − Σ_i w_i).
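As an added numerical check, simulate (X_i, Y_i) as independent exponentials with means λ and µ, form (Z_i, W_i) = (min(X_i, Y_i), I(X_i ≤ Y_i)), and evaluate the MLEs above; numpy is assumed and the constants are illustrative.

import numpy as np

rng = np.random.default_rng(4)
lam, mu, n = 2.0, 5.0, 10_000                # illustrative true means

x = rng.exponential(lam, size=n)             # X_i with mean lambda
y = rng.exponential(mu, size=n)              # Y_i with mean mu
z = np.minimum(x, y)                         # Z_i = min(X_i, Y_i)
w = (x <= y).astype(int)                     # W_i = 1 when the minimum is X_i

lam_hat = z.sum() / w.sum()                  # sum z_i / sum w_i
mu_hat = z.sum() / (n - w.sum())             # sum z_i / (n - sum w_i)
print(lam_hat, mu_hat)                       # both close to (2.0, 5.0)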
Problem 9

(a) The posterior of µ under the prior N(a, b²) is

µ | X ∼ N( (n b² X̄ + σ² a)/(σ² + n b²),  σ² b²/(σ² + n b²) ).

Under squared error loss, the Bayes estimator is just the posterior mean. Hence,

µ̂ = (n b²/(σ² + n b²)) X̄ + (σ²/(σ² + n b²)) a.                               (1)

Let η = σ²/(σ² + n b²); we can rewrite Eq. (1) as

µ̂ = (1 − η) X̄ + η a.                                                         (2)

(b) The risk is

R(µ, µ̂) = MSE(µ̂) = Bias²(µ̂) + Var(µ̂)
        = (η a + ((1 − η) − 1) µ)² + (1 − η)² σ²/n
        = (η(a − µ))² + (1 − η)² σ²/n.

(c) The sup of the risk is infinity: by making µ arbitrarily far from a, the risk grows without bound.

(d) The Bayes risk is

B(π, µ̂) = ∫ R(µ, µ̂) π(µ) dµ
        = ∫ [ (1 − η)² σ²/n + η² (µ − a)² ] π(µ) dµ
        = (1 − η)² σ²/n + η² ∫ (µ − a)² π(µ) dµ
        = (1 − η)² σ²/n + η² b²                         (since π(µ) = N(a, b²))
        = σ²/n − 2η σ²/n + η² (b² + σ²/n)
        = σ²/n − 2η σ²/n + (σ²/(σ² + n b²))² (σ² + n b²)/n
        = (1 − η) σ²/n = σ² b²/(σ² + n b²).
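An added Monte Carlo sketch (numpy assumed, constants illustrative) that checks the Bayes-risk formula σ²b²/(σ² + nb²) by drawing µ from the prior, drawing X̄ given µ, and averaging the squared loss of the posterior mean.

import numpy as np

rng = np.random.default_rng(5)
a, b, sigma, n, reps = 1.0, 2.0, 1.5, 20, 200_000    # illustrative constants
eta = sigma**2 / (sigma**2 + n * b**2)

mu = rng.normal(a, b, size=reps)                     # mu ~ N(a, b^2)
xbar = rng.normal(mu, sigma / np.sqrt(n))            # X-bar | mu ~ N(mu, sigma^2 / n)
mu_hat = (1 - eta) * xbar + eta * a                  # the Bayes estimator (posterior mean)

print("Monte Carlo Bayes risk:", np.mean((mu_hat - mu) ** 2))
print("closed form           :", sigma**2 * b**2 / (sigma**2 + n * b**2))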
Problem 10. (10 points)

In class, we found the minimax estimator for the Bernoulli. Here, you will fill in the details.
Let X_1, ..., X_n ∼ Bernoulli(p), and let L(p, p̂) = (p − p̂)².

(a) Let p̂ be the Bayes estimator using a Beta(α, β) prior. Find the Bayes estimator.
(b) Compute the risk function.
(c) Compute the Bayes risk.
(d) Find the α and β that make the risk constant, and hence find the minimax estimator.

Solution

(a) The Bayes estimator under squared error loss L(p, p̂) = (p − p̂)² is the posterior mean.
X_i ∼ iid Bernoulli(p) and p ∼ Beta(α, β) are conjugate, so the posterior is
p | X ∼ Beta(α + Σ_i X_i, β + n − Σ_i X_i). Therefore the Bayes estimator is
p̂ = (α + Σ_i X_i)/(α + β + n).
(b) The risk function of p̂ is

R(p, p̂) = E_p[L(p, p̂)] = MSE(p̂) = (E[p̂] − p)² + V[p̂]
        = ( (α + np)/(α + β + n) − p )² + n p(1 − p)/(α + β + n)²
        = [ (α(1 − p) − βp)² + n p(1 − p) ] / (α + β + n)².
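An added numpy sketch that checks this risk formula at a fixed p by Monte Carlo; α, β, n, p, and the replication count are illustrative.

import numpy as np

rng = np.random.default_rng(6)
alpha, beta, n, p, reps = 2.0, 3.0, 15, 0.4, 200_000   # illustrative constants

s = rng.binomial(n, p, size=reps)                       # sum of the X_i
p_hat = (alpha + s) / (alpha + beta + n)                # Bayes estimator from part (a)

mc = np.mean((p_hat - p) ** 2)
closed = ((alpha * (1 - p) - beta * p) ** 2 + n * p * (1 - p)) / (alpha + beta + n) ** 2
print(mc, closed)                                       # agree up to Monte Carlo error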
(c) The Bayes risk of p̂ is

B(π, p̂) = ∫ R(p, p̂) π(p) dp
        = 1/(α + β + n)² ∫ [ (α + β)² (p − α/(α + β))² + np − np² ] π(p) dp
        = 1/(α + β + n)² [ (α + β)² αβ/((α + β)²(α + β + 1)) + nα/(α + β)
                           − n( αβ/((α + β)²(α + β + 1)) + α²/(α + β)² ) ]
        = 1/(α + β + n)² [ αβ/(α + β + 1) + nα/(α + β) − nα(α + 1)/((α + β)(α + β + 1)) ]
        = 1/(α + β + n)² [ αβ/(α + β + 1) + nαβ/((α + β)(α + β + 1)) ]
        = αβ / ( (α + β)(α + β + 1)(α + β + n) ).
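An added numerical check of the Bayes-risk formula: integrate R(p, p̂) against the Beta(α, β) prior density on a grid and compare with the closed form (numpy assumed, constants illustrative).

import numpy as np
from math import gamma

alpha, beta, n = 2.0, 3.0, 15                # illustrative constants

p = np.linspace(1e-6, 1 - 1e-6, 200_001)
prior = p**(alpha - 1) * (1 - p)**(beta - 1) * gamma(alpha + beta) / (gamma(alpha) * gamma(beta))
risk = ((alpha * (1 - p) - beta * p)**2 + n * p * (1 - p)) / (alpha + beta + n)**2

bayes_risk = np.sum(risk * prior) * (p[1] - p[0])        # simple Riemann sum over the prior
closed = alpha * beta / ((alpha + beta) * (alpha + beta + 1) * (alpha + beta + n))
print(bayes_risk, closed)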
(d) The risk

R(p, p̂) = [ (α(1 − p) − βp)² + n p(1 − p) ] / (α + β + n)²
        = 1/(α + β + n)² { p²[(α + β)² − n] + p[n − 2α(α + β)] + α² }

is a second-order polynomial in p. To make it constant, set

(α + β)² − n = 0  and  n − 2α(α + β) = 0,   which gives   α = √n/2,  β = √n/2.

Thus p̂_m = (α + Σ_i X_i)/(α + β + n) = (√n/2 + Σ_i X_i)/(√n + n) is the minimax estimator.
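An added numpy sketch confirming that with α = β = √n/2 the risk no longer depends on p, and that the constant value is 1/(4(1 + √n)²); n and the grid of p values are illustrative.

import numpy as np

n = 25
alpha = beta = np.sqrt(n) / 2                # the constant-risk choice

p = np.linspace(0.01, 0.99, 9)
risk = ((alpha * (1 - p) - beta * p)**2 + n * p * (1 - p)) / (alpha + beta + n)**2
print(risk)                                  # the same value at every p
print(1 / (4 * (1 + np.sqrt(n))**2))         # that constant equals 1/(4 (1 + sqrt(n))^2)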