
LINKÖPINGS UNIVERSITET
Institutionen för datavetenskap
Statistik, ANd

732A23 THEORY OF STATISTICS, 6 CDTS
Master’s program in Statistics and Data Mining
Spring semester 2011

Examples of tasks that may appear in exam

Task 1

Assume a sample of observations $x = (x_1, \ldots, x_n)$ comes from the Beta distribution with p.d.f.

$$f(x; a, b) = \frac{1}{B(a, b)}\, x^{a-1} (1 - x)^{b-1}, \quad 0 < x < 1,$$

where $B(a, b) = \int_0^1 y^{a-1} (1 - y)^{b-1}\, dy$ is the Beta function.

(a) Write down minimal sufficient statistics for the two-dimensional parameter $(a, b)$.

Assume $a > 1$ and $b = 2$.

(b) Find the equation that gives the maximum likelihood estimator of $a$.

(c) Assume the maximum likelihood estimate for a sample of 20 values is $\hat{a}_{ML} = 2.5$. Find an approximate 95% confidence interval for $a$ that can be calculated from sample values.

Now suppose that we are not able to observe sample values above 0.9. In a sample of 10 observations, 7 observations are 0.7, 0.6, 0.5, 0.8, 0.7, 0.5, 0.5. The remaining 3 are all above 0.9 ($a > 1$, $b = 2$ is still assumed).

(d) Find the equation that gives the maximum likelihood estimate of $a$.

Now assume $b = 1$ and that the first 7 values above constitute the available sample.

(e) Find the numerical value of the method-of-moments estimate of $a$.

Task 2

Suppose $x_1, x_2, \ldots, x_n$ form a random sample from $N(2, \sigma^2)$.

(a) Show that the sample variance $s^2 = (n - 1)^{-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$ does not attain the Cramér-Rao lower bound for $n < \infty$, but that the bound is attained when $n \to \infty$.

(b) For what value of $c$ does the point estimator $\hat{\sigma}^2 = c \sum_{i=1}^{n} (x_i - \bar{x})^2$ have the smallest mean square error?

Task 3

Let $x = (x_1, \ldots, x_n)$ be a random sample from the Poisson distribution with p.m.f.

$$f(x; \lambda) = \frac{\lambda^x}{x!}\, e^{-\lambda}, \quad x = 0, 1, \ldots;\ \lambda > 0.$$

(a) Find the form of the best test for $H_0: \lambda = \lambda_0$ vs. $H_1: \lambda = \lambda_1$, where $\lambda_1 < \lambda_0$.

(b) If the size of the test is supposed to be 5%, give the equation for finding the critical region of the best test when $\lambda_0 = 2$ and $\lambda_1 = 1$.
(c) Is there a UMP test for $H_0: \lambda = \lambda_0$ vs. $H_1: \lambda \neq \lambda_0$? Motivate your answer.

(d) Find the MLRT for $H_0: \lambda \leq 1$ vs. $H_1: \lambda > 1$ for the sample $x = (2, 4, 3)$. Use an asymptotic approximation and state whether the test is significant at a size of 5%.

(e) Find the Wald test for $H_0: \lambda = 1$ vs. $H_1: \lambda \neq 1$.

Task 4

Let $x = (3.5, 3.9, 2.8)$ be a random sample from a $N(\mu, \sigma = 2)$-distribution. Assume the prior density for $\mu$ is $N(3, \sigma = 2)$.

(a) Find the Bayes estimator of $\mu$ under absolute error loss. Calculate its value.

Suppose now that the cost of taking an observation is 3 units of cost. If the observation is below 3 we are paid back 3 units, otherwise we will have to pay another 3 units. Let $\delta_1$ be the decision to take another observation and $\delta_2$ the decision to take no more observations.

(b) Which of $\delta_1$ and $\delta_2$ is the minimax procedure? Motivate your answer.

(c) Find expressions for the Bayes risk for each of $\delta_1$ and $\delta_2$.

Task 5

Suppose we are about to draw conclusions about an unknown proportion $\pi$ of individuals with a certain property in a large population. Our prior information about $\pi$ can be expressed with a Beta distribution with parameters 2 and 5, i.e. the prior density is $\frac{1}{B(2, 5)}\, \pi (1 - \pi)^4$ (see Task 1 for further information about the Beta function). In a sample of 50 individuals from the population, 20 individuals have the specific property.

(a) Find a 90% equal-tailed credible interval for $\pi$.

(b) Determine the posterior odds of $H_0: \pi = 0.4$ vs. $H_1: \pi \neq 0.4$ when the prior odds are 1 against 10.

Suppose now that we wish to extend our sample with one observation.

(c) Find an expression for the predictive p.m.f. of this new observation conditioned on the sample already obtained.

Task 6

An experiment is conducted in which two brands of tomatoes are to be compared. In each of 15 jars a seed from each brand is planted. Then the number of hours until a sprout (first sign of plant) is detectable is counted for each of the seeds.
This is referred to as the time of growth. The following result is obtained:

Jar   1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Number of hours   Brand 1   Brand 2
46 55 52 49 61 71 46 53 54 58 49 49 50 55 56 55 47 49 58 64 67 55 49 51 43 56 60 60

(a) Use a sign test to decide whether Brand 2 has a median time of growth more than 5 hours longer than that of Brand 1.

(b) Use a Wilcoxon signed-rank test to make the decision analogous to that in (a).

(c) Discuss under what circumstances you would probably be convinced that a paired t-test would have higher efficiency than the sign test for the decision making in (a).
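
As an illustration of the kind of calculation Task 6 (a) asks for, the following is a minimal sketch of a one-sided sign test with a shifted null hypothesis, written in Python using only the standard library. The function name `sign_test_greater`, its parameters and the small data set are assumptions made for this sketch; the data are made up and are not the values from the table above. The sketch tests $H_0$: median of (Brand 2 − Brand 1) $\leq$ shift against $H_1$: median $>$ shift, drops differences that equal the shift exactly, and uses the Binomial($m$, 0.5) distribution of the number of positive signs under $H_0$.

```python
from math import comb


def sign_test_greater(x, y, shift=0.0):
    """One-sided sign test for paired data.

    Tests H0: median(y - x) <= shift against H1: median(y - x) > shift.
    Returns (number of positive signs, number of non-tied pairs, p-value).
    """
    # Shifted paired differences; a positive value supports H1.
    diffs = [yi - xi - shift for xi, yi in zip(x, y)]
    # Differences exactly equal to the shift carry no sign and are dropped.
    nonzero = [d for d in diffs if d != 0]
    m = len(nonzero)
    s = sum(1 for d in nonzero if d > 0)
    # Under H0 (boundary case) s ~ Binomial(m, 0.5); p-value = P(S >= s).
    p_value = sum(comb(m, k) for k in range(s, m + 1)) / 2 ** m
    return s, m, p_value


# Hypothetical illustration data (NOT the exam data above).
brand1 = [40, 42, 45, 50, 38, 47]
brand2 = [48, 49, 50, 58, 41, 55]

s, m, p_value = sign_test_greater(brand1, brand2, shift=5)
print(s, m, p_value)
```

With these made-up values one pair has a difference of exactly 5 and is dropped, leaving 4 positive signs out of 5 pairs and a p-value of 6/32 = 0.1875, so $H_0$ would not be rejected at the 5% level. Dropping ties is one common convention; other treatments of ties are possible.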