LINKÖPINGS UNIVERSITET
Institutionen för datavetenskap
Statistik, ANd
732A23 THEORY OF STATISTICS, 6 CDTS
Master’s program in Statistics and Data Mining
Spring semester 2011
Examples of tasks that may appear in the exam
Task 1
Assume a sample of observations x = (x1 , . . . , xn ) comes from the Beta distribution with
p.d.f.
$$f(x; a, b) = \frac{1}{B(a, b)} \, x^{a-1} (1 - x)^{b-1}, \quad 0 < x < 1,$$
where $B(a, b) = \int_0^1 y^{a-1} (1 - y)^{b-1} \, dy$ is the Beta function.
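As a quick sanity check of the density above, the following minimal sketch evaluates it numerically. It relies on the standard identity B(a, b) = Γ(a)Γ(b)/Γ(a + b), which is not stated in the task text; the function names and the midpoint integration helper are illustrative assumptions.

```python
import math

def beta_function(a, b):
    # Standard identity B(a, b) = Gamma(a) * Gamma(b) / Gamma(a + b)
    # (an assumption here; the task defines B(a, b) by the integral).
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def beta_pdf(x, a, b):
    # Density f(x; a, b) on 0 < x < 1, zero elsewhere.
    if not 0.0 < x < 1.0:
        return 0.0
    return x ** (a - 1) * (1 - x) ** (b - 1) / beta_function(a, b)

def integrate(func, lo, hi, steps=10_000):
    # Plain midpoint rule; good enough for a sanity check.
    h = (hi - lo) / steps
    return h * sum(func(lo + (i + 0.5) * h) for i in range(steps))

# The density should integrate to (approximately) one.
total = integrate(lambda x: beta_pdf(x, 2.5, 2.0), 0.0, 1.0)
print(round(total, 4))
```

The parameter values 2.5 and 2.0 are only example inputs.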
(a) Write down minimal sufficient statistics for the two-dimensional parameter (a, b).
Assume a > 1 and b = 2.
(b) Find the equation that gives the maximum likelihood estimator of a.
(c) Assume the maximum likelihood estimate for a sample of 20 values is $\hat{a}_{ML} = 2.5$.
Find an approximate 95% confidence interval for a that can be calculated from sample
values.
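One possible route to such an interval is a Wald-type interval based on the Fisher information. The sketch below assumes that, for b = 2, B(a, 2) = 1/(a(a + 1)), so the log-likelihood is n log a + n log(a + 1) + (a − 1) Σ log x_i + Σ log(1 − x_i) and the information is n(1/a² + 1/(a + 1)²). The function names and the normal quantile 1.959964 are assumptions of this sketch, not part of the task.

```python
import math

def fisher_information(a, n):
    # Minus the second derivative of the log-likelihood in a (b = 2 assumed);
    # it does not depend on the data, so observed and expected information agree.
    return n * (1.0 / a ** 2 + 1.0 / (a + 1.0) ** 2)

def wald_interval(a_hat, n, z=1.959964):
    # Approximate 95% interval: a_hat +/- z / sqrt(I(a_hat)).
    se = 1.0 / math.sqrt(fisher_information(a_hat, n))
    return a_hat - z * se, a_hat + z * se

lower, upper = wald_interval(a_hat=2.5, n=20)
print(round(lower, 2), round(upper, 2))
```

Other constructions (for example a likelihood-ratio based interval) are equally legitimate; the Wald form is used here only because it needs nothing beyond the estimate and the sample size.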
Now, suppose that we are not able to observe values in a sample above 0.9. In a sample of
10 observations 7 observations are 0.7, 0.6, 0.5, 0.8, 0.7, 0.5, 0.5. The remaining 3 are all
above 0.9. (a > 1, b = 2 is still assumed).
(d) Find the equation that gives the maximum likelihood estimate of a.
Now assume b = 1 and that the first 7 values above constitute the available sample.
(e) Find the numerical value of the method-of-moments estimate of a.
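For the method-of-moments part with b = 1, a minimal sketch follows. It assumes the standard Beta mean E[X] = a/(a + b), which for b = 1 gives a/(a + 1); equating this to the sample mean and solving yields a = x̄/(1 − x̄). The function name is illustrative.

```python
# The seven observed values listed in Task 1.
sample = [0.7, 0.6, 0.5, 0.8, 0.7, 0.5, 0.5]

def mom_estimate_a(values):
    # Solve mean = a / (a + 1) for a (valid only when 0 < mean < 1).
    mean = sum(values) / len(values)
    return mean / (1.0 - mean)

a_hat = mom_estimate_a(sample)
print(round(a_hat, 3))
```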
Task 2
Suppose x1 , x2 , . . . , xn form a random sample from N (2, σ²).
(a) Show that the sample variance $s^2 = (n - 1)^{-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$ does not attain the
Cramér-Rao lower bound for n < ∞, but when n → ∞, the bound is attained.
(b) For what value of c does the point estimator $\hat{\sigma}^2 = c \sum_{i=1}^{n} (x_i - \bar{x})^2$ have the smallest
mean square error?
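The mean square error in the last part can be explored numerically. The sketch below assumes the standard result that Σ(x_i − x̄)²/σ² follows a chi-square distribution with n − 1 degrees of freedom (mean n − 1, variance 2(n − 1)), so that MSE/σ⁴ = 2c²(n − 1) + (c(n − 1) − 1)². The sample size n = 10 and the grid are hypothetical choices made only for illustration.

```python
def relative_mse(c, n):
    # MSE of c * sum((x_i - xbar)^2) divided by sigma^4.
    df = n - 1
    variance = 2.0 * df * c ** 2   # Var(c * sigma^2 * chi2_df) / sigma^4
    bias = c * df - 1.0            # (E[estimator] - sigma^2) / sigma^2
    return variance + bias ** 2

n = 10  # hypothetical sample size
grid = [i / 10000 for i in range(1, 3001)]
best_c = min(grid, key=lambda c: relative_mse(c, n))

# Compare the grid minimiser with two natural candidates.
print(best_c, 1 / (n + 1), 1 / (n - 1))
```

Setting the derivative of the expression above to zero gives the same minimiser analytically; the grid search is only a cross-check.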
Task 3
Let x = (x1 , . . . , xn ) be a random sample from the Poisson distribution with p.m.f.
$$f(x; \lambda) = \frac{\lambda^{x}}{x!} \, e^{-\lambda}, \quad x = 0, 1, \ldots; \ \lambda > 0$$
(a) Find the form of the best test for H0 : λ = λ0 vs. H1 : λ = λ1 , where λ1 < λ0 .
(b) If the size of the test is supposed to be 5%, give the equation for finding the critical
region of the best test when λ0 = 2 and λ1 = 1.
(c) Is there a UMP test for H0 : λ = λ0 vs. H1 : λ ≠ λ0 ? Motivate your answer.
(d) Find the MLRT for H0 : λ ≤ 1 vs. H1 : λ > 1 for the sample x=(2, 4, 3). Use an
asymptotic approximation and state whether the test is significant at a size of 5%.
(e) Find the Wald test for H0 : λ = 1 vs. H1 : λ ≠ 1.
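The computations behind the Poisson tests in Task 3 can be sketched as follows. The sketch assumes that T = Σ x_i is Poisson(nλ), that a test against a smaller alternative rejects for small T, and that the likelihood-ratio and Wald statistics are compared with the chi-square(1) 95% quantile 3.841. The sample size n = 10 in the critical-value call is hypothetical (the task leaves n unspecified), and the Wald part reuses the sample (2, 4, 3) as an assumption.

```python
import math

def poisson_cdf(k, mean):
    # P(T <= k) for T ~ Poisson(mean).
    return sum(math.exp(-mean) * mean ** j / math.factorial(j) for j in range(k + 1))

def lower_critical_value(n, lam0, alpha=0.05):
    # Largest k with P(T <= k) <= alpha under lambda = lam0; -1 if no such k.
    k = -1
    while poisson_cdf(k + 1, n * lam0) <= alpha:
        k += 1
    return k

def lr_statistic(sample, lam0):
    # -2 log likelihood ratio for H0: lambda <= lam0 vs H1: lambda > lam0.
    n, total = len(sample), sum(sample)
    lam_hat = total / n
    if lam_hat <= lam0:
        return 0.0  # the unrestricted MLE already lies in H0
    return 2.0 * (total * math.log(lam_hat / lam0) - n * (lam_hat - lam0))

def wald_statistic(sample, lam0):
    # (lam_hat - lam0)^2 / (lam_hat / n); requires lam_hat > 0.
    n = len(sample)
    lam_hat = sum(sample) / n
    return (lam_hat - lam0) ** 2 / (lam_hat / n)

k = lower_critical_value(n=10, lam0=2.0)  # hypothetical n
lr = lr_statistic([2, 4, 3], lam0=1.0)
wald = wald_statistic([2, 4, 3], lam0=1.0)
print(k, round(lr, 3), round(wald, 3), lr > 3.841, wald > 3.841)
```

Because the Poisson distribution is discrete, the critical value generally gives a size below 5% rather than exactly 5%.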
Task 4
Let x=(3.5, 3.9, 2.8) be a random sample from a N (µ, σ = 2)-distribution. Assume a prior
density for µ is N (3, σ = 2)
(a) Find the Bayes estimator of µ under absolute error loss. Calculate its value.
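The posterior for a normal mean with a normal prior can be sketched as below. The sketch reads "σ = 2" as a standard deviation in both the likelihood and the prior, and uses the standard facts that precisions add and that the Bayes estimator under absolute error loss is the posterior median. The function name is illustrative.

```python
from statistics import NormalDist

def normal_posterior(sample, sigma, prior_mean, prior_sd):
    # Conjugate update for a normal mean with known sigma.
    n = len(sample)
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n / sigma ** 2
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean
                            + data_precision * (sum(sample) / n))
    return NormalDist(post_mean, post_var ** 0.5)

posterior = normal_posterior([3.5, 3.9, 2.8], sigma=2.0, prior_mean=3.0, prior_sd=2.0)
bayes_estimate = posterior.median  # equals the posterior mean for a normal
print(round(bayes_estimate, 3), round(posterior.stdev, 3))
```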
Suppose now that the cost of taking an observation is 3 units of cost. If the observation is
below 3 we are paid back 3 units, otherwise we’ll have to pay another 3 units. Let δ1 be
the decision to take another observation and δ2 the decision to take no more observations.
(b) Which of δ1 and δ2 is the minimax procedure? Motivate your answer.
(c) Find expressions for the Bayes risk for each of δ1 and δ2 .
Task 5
Suppose we are about to draw conclusions about an unknown proportion π of individuals
with a certain property in a large population. Our prior information about π can
be expressed with a Beta distribution with parameters 2 and 5, i.e. the prior density is
(1/B(2, 5)) π(1 − π)^4 (see Task 1 for further information about the Beta function). In a
sample of 50 individuals from the population, 20 individuals have the specific property.
(a) Find a 90% equal-tailed credible interval for π.
(b) Determine the posterior odds of H0 : π = 0.4 vs. H1 : π ≠ 0.4 when the prior odds are
1 against 10.
Suppose now that we wish to extend our sample with one observation.
(c) Find an expression for the predictive p.m.f. of this new observation conditioned on the
sample already obtained.
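The conjugate update and the resulting predictive p.m.f. can be sketched as follows. The sketch assumes the standard Beta-binomial result: a Beta(α, β) prior combined with s successes in m trials gives a Beta(α + s, β + m − s) posterior, and the probability that the next observation has the property is the posterior mean. Function names are illustrative.

```python
from fractions import Fraction

def beta_posterior(prior_a, prior_b, successes, trials):
    # Beta prior + binomial data -> Beta posterior parameters.
    return prior_a + successes, prior_b + trials - successes

def predictive_pmf(y, post_a, post_b):
    # P(next observation = y | data) for y in {0, 1}; zero otherwise.
    p_one = Fraction(post_a, post_a + post_b)
    if y == 1:
        return p_one
    if y == 0:
        return 1 - p_one
    return Fraction(0)

post_a, post_b = beta_posterior(2, 5, successes=20, trials=50)
print(post_a, post_b, predictive_pmf(1, post_a, post_b), predictive_pmf(0, post_a, post_b))
```

Exact fractions are used so that the two predictive probabilities visibly sum to one.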
Task 6
An experiment is conducted in which two brands of tomatoes should be compared. In each
of 15 jars a seed from each brand is planted. Then the number of hours until a sprout (first
sign of plant) is detectable is counted for each of the seeds. This is referred to as the time
of growth. The following result is obtained:
Jar
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
Number of hours
Brand 1 Brand 2
46
55
52
49
61
71
46
53
54
58
49
49
50
55
56
55
47
49
58
64
67
55
49
51
43
56
60
60
(a) Use a sign test to decide whether Brand 2 has a median time of growth more than
5 hours longer than that of Brand 1.
(b) Use a Wilcoxon signed-rank test to make the analogous decision as in (a).
(c) Discuss under what circumstances you probably would be convinced that a paired
t-test would have higher efficiency than the sign test for the decision making in a).
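The mechanics of a one-sided sign test with a shifted null value can be sketched as below. The differences used are hypothetical illustration values, not the jar data from the table; the function name and the convention of dropping observations equal to the shift are assumptions of this sketch.

```python
from math import comb

def sign_test_greater(differences, shift=0.0):
    # One-sided sign test of H1: median(difference) > shift.
    # Observations equal to the shift are discarded (a common convention).
    centered = [d - shift for d in differences if d != shift]
    n = len(centered)
    positives = sum(1 for d in centered if d > 0)
    # Exact binomial tail P(X >= positives) with X ~ Bin(n, 1/2).
    p_value = sum(comb(n, k) for k in range(positives, n + 1)) / 2 ** n
    return positives, n, p_value

# Hypothetical paired differences (Brand 2 minus Brand 1), for illustration only.
hypothetical_differences = [9, 7, 12, 3, 6, 8, 5, 10, -2, 11]
positives, n_used, p_value = sign_test_greater(hypothetical_differences, shift=5)
print(positives, n_used, p_value)
```

A signed-rank version would additionally rank the absolute values of the shifted differences, which uses more of the information in the data than the signs alone.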