IEOR 165 Homework 3 Due March 19, 2015

Question 1. Use the one-sample Kolmogorov-Smirnov test at significance level 0.05 to test
whether the following observations come from a standard uniform distribution.
0.276 0.612 0.19 0.452 0.966 0.89 0.483 0.682
Discuss the advantages and limitations of the Kolmogorov-Smirnov test.
Question 2. Let X1, X2, ..., Xn be i.i.d. random variables, each with the same cumulative
distribution function FX(x) = P(Xi < x). Let Xmax = max{X1, X2, ..., Xn}. What is the
cdf of Xmax?
Question 3. Suppose that X1, X2, ..., Xn form a random sample from a uniform distribution
on the interval (0, θ), and that the following hypotheses are to be tested:
H0 : θ ≥ 2
H1 : θ < 2
Let Xmax = max{X1, X2, ..., Xn}, and consider a test whose rejection region contains all the
outcomes for which Xmax ≤ 1.5.
a. Determine the power function of the test.
b. Determine the size of the test.
Question 4. Let Ai denote categorical variables. Draw the simplicial complex associated with each null hypothesis below.
a. A1, A2 and A3 are pairwise dependent but not jointly dependent.
b. A1, A2, A3 and A4 are jointly dependent, and A5 is independent of A1, A2, A3 and A4.
c. A1 and A2 are dependent and are independent of A3, A4 and A5. A3, A4 and A5 are jointly
dependent.
d. A1, A2, A3, A4 and A5 are independent.
Question 5.
a. Let X1, ..., Xn be i.i.d. with density
\[ P_\theta(X = x) = \theta^{x} (1 - \theta)^{1 - x} \]
for x = 0, 1 and 0 ≤ θ ≤ 1/2.
Find the MLE of θ.
b. Let Yi = aXi + ϵi, where ϵi ∼ Uniform(0, θ). Find the MLE of θ.
Question 6. Suppose the data are i.i.d. with Xi ∼ N(µ, σ²), where σ² = 1. Consider the
hypotheses
H0 : µ = 2
H1 : µ = 1
Assuming the desired power of the test is 0.90 and the significance level is 0.05, use a Monte Carlo
algorithm to determine the threshold k.
Solution 1.
Here F(x) = x is the standard uniform cdf and F̂ is the empirical cdf, with F̂(X0) = 0.

Sorted Xi   F(Xi)   F̂(Xi)   F̂(Xi−1)   |F̂(Xi) − F(Xi)|   |F̂(Xi−1) − F(Xi)|
0.19        0.19    0.125   0         0.065             0.19
0.276       0.276   0.25    0.125     0.026             0.151
0.452       0.452   0.375   0.25      0.077             0.202
0.482       0.482   0.5     0.375     0.018             0.107
0.612       0.612   0.625   0.5       0.013             0.112
0.682       0.682   0.75    0.625     0.068             0.057
0.89        0.89    0.875   0.75      0.015             0.14
0.966       0.966   1.00    0.875     0.034             0.091
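Dmax = 0.202 (the largest of the absolute differences)
Critical value for α = 0.05 and n = 8: D0.05,8 = 0.457
Since Dmax < D0.05,8, H0 cannot be rejected at the 0.05 significance level.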
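The KS statistic above can be reproduced with a few lines of Python. This is a minimal sketch, not part of the original solution: the function name `ks_statistic_uniform` is hypothetical, the data are the eight observations listed in Question 1, and the critical value 0.457 is simply the tabulated number quoted in the solution.

```python
def ks_statistic_uniform(sample):
    """One-sample KS statistic against the Uniform(0, 1) cdf F(x) = x."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        above = abs(i / n - x)        # |F_hat(X_i) - F(X_i)|
        below = abs((i - 1) / n - x)  # |F_hat(X_{i-1}) - F(X_i)|
        d = max(d, above, below)
    return d


# Observations from Question 1.
data = [0.276, 0.612, 0.19, 0.452, 0.966, 0.89, 0.483, 0.682]
d_max = ks_statistic_uniform(data)

# Tabulated critical value for alpha = 0.05, n = 8, as quoted in the solution.
critical = 0.457
reject = d_max > critical
print(round(d_max, 3), reject)
```

Both one-sided differences are checked at every order statistic because the empirical cdf jumps there, so the largest gap can occur just before or just after the jump.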
Advantages
Can be used with very small samples
Requires no choice of class intervals, which removes that source of subjectivity
Limitations
Only applies to continuous distributions
Assumes the hypothesized distribution is fully specified (all parameters known)
More sensitive near the center of the distribution than at the tails
Solution 2.
\begin{align*}
F_{X_{\max}}(x) &= P(X_{\max} < x) \\
&= P(X_1 < x, X_2 < x, \ldots, X_n < x) \\
&= P(X_1 < x)\, P(X_2 < x) \cdots P(X_n < x) \quad \text{(by independence)} \\
&= F_X(x)^n
\end{align*}
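As a sanity check on the result FX(x)^n, the following sketch simulates the maximum of n independent Uniform(0, 1) draws and compares the empirical probability with the formula. The uniform distribution, the values n = 5 and x = 0.8, the trial count and the seed are all illustrative assumptions, and `empirical_cdf_of_max` is a hypothetical helper name.

```python
import random


def empirical_cdf_of_max(n, x, trials=20000, seed=0):
    """Estimate P(max of n Uniform(0,1) draws < x) by simulation."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        if max(rng.random() for _ in range(n)) < x:
            hits += 1
    return hits / trials


n, x = 5, 0.8
simulated = empirical_cdf_of_max(n, x)
exact = x ** n  # F_X(x)^n with F_X(x) = x on (0, 1)
print(simulated, exact)
```

The two numbers should agree up to Monte Carlo error, which shrinks as the number of trials grows.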
Solution 3.
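a.
\begin{align*}
\beta(\theta) &= P_\theta(\{X_1, X_2, \ldots, X_n\} \in R) \\
&= P_\theta(\max\{X_1, X_2, \ldots, X_n\} \le 1.5) \\
&= P_\theta(X \le 1.5)^n, \quad \text{where } X \sim \mathrm{Unif}(0, \theta) \\
&= \begin{cases} 1 & \text{if } \theta \le 1.5 \\ \left(\dfrac{1.5}{\theta}\right)^n & \text{if } \theta > 1.5 \end{cases}
\end{align*}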
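b.
\[
\alpha = \sup_{\theta \in H_0} \beta(\theta)
= \sup_{\theta \ge 2} \left(\frac{1.5}{\theta}\right)^n
= \left(\frac{1.5}{2}\right)^n
= \left(\frac{3}{4}\right)^n
\]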
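The power function and size derived in parts a and b can be written as two small functions. This is an illustrative sketch; the names `power` and `size` and the keyword arguments are hypothetical, with the cutoff 1.5 and the boundary θ = 2 taken from Question 3.

```python
def power(theta, n, cutoff=1.5):
    """P_theta(Xmax <= cutoff) for n i.i.d. Uniform(0, theta) observations."""
    if theta <= 0:
        raise ValueError("theta must be positive")
    if theta <= cutoff:
        return 1.0
    return (cutoff / theta) ** n


def size(n, theta0=2.0, cutoff=1.5):
    """Supremum of the power over theta >= theta0.

    The power is non-increasing in theta, so the supremum is attained
    at the boundary theta = theta0.
    """
    return power(theta0, n, cutoff)


print(size(2), power(3.0, 2))
```

For example, with n = 2 the size is (3/4)^2 = 0.5625, so this rejection region gives a small size only for larger samples.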
Solution 4.
a.
b.
c.
d.
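One way to describe the four complexes without a drawing is to list their maximal faces. The sketch below assumes the usual convention that each vertex is a variable and a face on a set of variables means that joint dependence among them is allowed under the null hypothesis; the maximal-face lists are a reading of the statements in Question 4 under that assumption, and the helper `faces` is hypothetical.

```python
from itertools import combinations


def faces(maximal_faces):
    """All nonempty faces generated by the given maximal faces."""
    result = set()
    for facet in maximal_faces:
        for k in range(1, len(facet) + 1):
            result.update(combinations(sorted(facet), k))
    return result


# Maximal faces for parts a-d, with variable Ai written as the integer i.
complexes = {
    "a": [(1, 2), (1, 3), (2, 3)],         # hollow triangle: edges only
    "b": [(1, 2, 3, 4), (5,)],             # solid tetrahedron plus an isolated vertex
    "c": [(1, 2), (3, 4, 5)],              # an edge and a separate filled triangle
    "d": [(1,), (2,), (3,), (4,), (5,)],   # five isolated vertices
}

for part, facets in complexes.items():
    print(part, sorted(faces(facets), key=lambda f: (len(f), f)))
```

Under this reading, part a differs from a filled triangle only in that the face (1, 2, 3) is absent.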
Solution 5.
a. The likelihood function is
\[
L_n(\theta) = \prod_{i=1}^{n} P_\theta(X = X_i)
= \theta^{\sum_{i=1}^{n} X_i} (1 - \theta)^{\,n - \sum_{i=1}^{n} X_i}
\]
The log-likelihood function is
\[
\ell_n(\theta) = \left( \sum_{i=1}^{n} X_i \right) \log \theta
+ \left( n - \sum_{i=1}^{n} X_i \right) \log(1 - \theta)
\]
Taking the first derivative and setting it to 0, we have
\[
\frac{d\ell_n}{d\theta}(\theta)
= \frac{\sum_{i=1}^{n} X_i}{\theta} - \frac{n - \sum_{i=1}^{n} X_i}{1 - \theta} = 0
\;\Rightarrow\; \theta = \bar{X}_n
\]
Also,
\[
\frac{d^2\ell_n}{d\theta^2}(\theta)
= -\frac{\sum_{i=1}^{n} X_i}{\theta^2} - \frac{n - \sum_{i=1}^{n} X_i}{(1 - \theta)^2} < 0
\]
so ℓn(θ) has a global maximum at θ = X̄n.
However, since 0 ≤ θ ≤ 1/2, ℓn(θ) attains this unconstrained maximum only when 0 ≤
X̄n ≤ 1/2.
We know that X̄n ≥ 0. When X̄n > 1/2, for every θ in (0, 1/2],
\[
\frac{d\ell_n}{d\theta}(\theta)
= \frac{\sum_{i=1}^{n} X_i}{\theta} - \frac{n - \sum_{i=1}^{n} X_i}{1 - \theta}
= \frac{\sum_{i=1}^{n} X_i - n\theta}{\theta(1 - \theta)}
= \frac{n(\bar{X}_n - \theta)}{\theta(1 - \theta)} > 0
\]
So ℓn(θ) is increasing in θ on the allowed interval and attains its maximum at θ =
1/2. Therefore, the MLE of θ is θ̂ = min{X̄n, 1/2}.
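The closed form min{X̄n, 1/2} can be checked numerically against a brute-force search over the restricted parameter range. This is a minimal sketch under stated assumptions: the function names and the two small data sets are made up for illustration, and the grid excludes θ = 0 to avoid log(0).

```python
import math


def log_likelihood(theta, xs):
    """Bernoulli log-likelihood for 0 < theta < 1."""
    s = sum(xs)
    n = len(xs)
    return s * math.log(theta) + (n - s) * math.log(1 - theta)


def mle_restricted(xs):
    """Closed-form MLE under the constraint 0 <= theta <= 1/2."""
    return min(sum(xs) / len(xs), 0.5)


def grid_argmax(xs, steps=1000):
    """Brute-force maximizer of the log-likelihood over a grid on (0, 1/2]."""
    grid = [0.5 * k / steps for k in range(1, steps + 1)]
    return max(grid, key=lambda t: log_likelihood(t, xs))


high = [1, 1, 0, 1]  # sample mean 0.75 > 1/2, so the constraint binds
low = [0, 0, 1, 0]   # sample mean 0.25, inside the allowed range
print(mle_restricted(high), grid_argmax(high))
print(mle_restricted(low), grid_argmax(low))
```

When the sample mean exceeds 1/2 the grid search ends at the boundary, which matches the monotonicity argument above.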
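b. The likelihood function is
\[
L_n(\theta) = \left( \frac{1}{\theta} \right)^{n}
I\!\left( \min_i (Y_i - aX_i) \ge 0 \right)
I\!\left( \max_i (Y_i - aX_i) \le \theta \right)
\]
The likelihood is zero for θ < max(Yi − aXi) and decreasing in θ beyond that point, so it is maximized at the smallest admissible θ. Therefore, the MLE of θ is θ̂ = max(Yi − aXi).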
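The estimator in part b is simply the largest residual Yi − aXi. The sketch below assumes that the coefficient a and the Xi are known, which the formula implicitly requires; the function name and the toy numbers are hypothetical.

```python
def mle_uniform_noise(ys, xs, a):
    """MLE of theta when Y_i = a * X_i + eps_i with eps_i ~ Uniform(0, theta)."""
    residuals = [y - a * x for y, x in zip(ys, xs)]
    if min(residuals) < 0:
        # The indicator I(min(Y_i - a X_i) >= 0) is zero in this case.
        raise ValueError("a residual is negative, so the likelihood is zero for every theta")
    return max(residuals)


# Toy data with a = 2: the residuals are 0.5, 0.25 and 0.75.
xs = [1.0, 2.0, 3.0]
ys = [2.5, 4.25, 6.75]
theta_hat = mle_uniform_noise(ys, xs, a=2.0)
print(theta_hat)
```

Because the estimate is a sample maximum, it can never exceed the true θ and therefore tends to underestimate it in small samples.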
Solution 6.
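alpha=0.05;       % desired size (significance level)
beta=0.9;         % desired power
M=100;            % Monte Carlo repetitions per (sample size, threshold) pair
gamma=zeros(1,M); % rejection indicators when H1 is true
delta=zeros(1,M); % rejection indicators when H0 is true
H1=1;             % mean under the alternative hypothesis
H0=2;             % mean under the null hypothesis
sigma=1;          % known standard deviation
N=2:50;           % candidate sample sizes
c=0.5:0.5:20;     % candidate thresholds for the likelihood ratio
a=zeros(1,length(c)); % estimated size for each threshold
b=zeros(1,length(c)); % estimated power for each threshold
sample_size=0;    % stays 0 if no pair meets both targets
threshold=0;
for j=1:length(N)
    for k=1:length(c)
        for s=1:M
            X=normrnd(H0,sigma,[1 N(j)]); % sample drawn under H0
            Y=normrnd(H1,sigma,[1 N(j)]); % sample drawn under H1
            % likelihood ratio L(H1)/L(H0) with sigma = 1; reject H0 when it exceeds c(k)
            LX=exp(-sum((X-H1).^2)/2+sum((X-H0).^2)/2);
            LY=exp(-sum((Y-H1).^2)/2+sum((Y-H0).^2)/2);
            delta(s)=LX>c(k);
            gamma(s)=LY>c(k);
        end
        a(k)=mean(delta); % estimated probability of type I error
        b(k)=mean(gamma); % estimated power
    end
    a(a>alpha)=-1; % mark thresholds whose estimated size exceeds alpha
    [maxim, lambda]=max(a); % admissible threshold with the largest size
    if maxim>=0 && b(lambda)>=beta
        sample_size=N(j);
        threshold=c(lambda);
        break
    end
end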
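Because the data are normal with known variance, the Monte Carlo search can be cross-checked in closed form. The sketch below is not the Monte Carlo method the question asks for: it uses the fact that the likelihood ratio L(H1)/L(H0) equals exp(n(µ0 − µ1)((µ0 + µ1)/2 − X̄)/σ²), so rejecting for a large ratio is the same as rejecting for a small sample mean. The function name and the search bound `n_max` are assumptions.

```python
import math
from statistics import NormalDist


def smallest_n_and_threshold(mu0=2.0, mu1=1.0, sigma=1.0,
                             alpha=0.05, power=0.90, n_max=50):
    """Smallest n whose size-alpha likelihood ratio test reaches the target power."""
    std = NormalDist()
    z_alpha = std.inv_cdf(1 - alpha)
    for n in range(1, n_max + 1):
        se = sigma / math.sqrt(n)
        t = mu0 - z_alpha * se               # reject H0 when the sample mean is below t
        achieved = std.cdf((t - mu1) / se)   # power at mu = mu1
        if achieved >= power:
            # Translate the cutoff on the mean into a cutoff on the likelihood ratio.
            log_c = n * (mu0 - mu1) * ((mu0 + mu1) / 2 - t) / sigma ** 2
            return n, t, math.exp(log_c), achieved
    return None


result = smallest_n_and_threshold()
print(result)
```

With only M = 100 repetitions per grid point, the simulated size and power are noisy, so the sample size and threshold returned by the Monte Carlo search can differ from this closed-form value from run to run.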