LINKÖPINGS UNIVERSITET
Institutionen för datavetenskap
Statistik, ANd

732A36 THEORY OF STATISTICS, 6 CDTS
Master’s program in Statistics and Data Mining
Fall semester 2011

Written exam
Jan 17, 2012

Rules:

Hours: From 08.00 until 12.00

Means allowed: The textbook “Garthwaite PH, Jolliffe IT, Jones B. (2002). Statistical Inference. 2nd ed. Oxford University Press.”, laptop, calculator.

In your solutions, provide details of your derivations. Do not only provide the answer.

Grades: The grading on this exam uses the whole ECTS scale, i.e. the grades may range from F to A. Grades are given on the basis of the total impression from the solutions of the tasks; no specific points or intervals of points are used.

Good luck!

Task 1

Assume a sample of ten observations $x = (x_1, \ldots, x_{10})$ from the Gamma distribution with p.d.f.

$$f(x; \lambda) = \frac{x^3 e^{-x/\lambda}}{\lambda^4 \cdot 3!}, \quad x > 0$$

Assume further that the sample mean is $\bar{x} = 8$.

(a) Find the moment estimate of $\lambda$.

(b) Is there a lower bound on the variance of the moment estimator? If so, estimate its value.

Task 2

Assume we have a sample $x = (x_1, \ldots, x_n)$ from a distribution with p.m.f.

$$f(x; \theta) = e^{x \ln \theta - \theta - \ln x!}, \quad x = 0, 1, 2, \ldots$$

(a) Assume $n = 50$ and $\bar{x} = 3$. Find an approximate 95% confidence interval for $\theta$.

(b) Assume now that $n = 4$ and that $x_1 = 4$, $x_2 = 3$ and $x_3 = 3$, but the only thing we know about $x_4$ is that it is less than 2. Find the maximum-likelihood estimate of $\theta$.

Task 3

Let $x = (x_1, \ldots, x_n)$ be a random sample from the Beta distribution with p.d.f.

$$f(x; a, 2) = \frac{x^{a-1}(1 - x)}{B(a, 2)}, \quad 0 < x < 1,$$

where

$$B(a, b) = \frac{\Gamma(a) \cdot \Gamma(b)}{\Gamma(a + b)}.$$

(a) Find the form of the best test for $H_0: a = a_0$ vs. $H_1: a = a_1$, where $a_1 < a_0$.

(b) Is the test UMP? Motivate your answer.

(c) Assume $n = 100$ and use a normal approximation (the Central Limit Theorem) for the distribution of the test statistic to find the test (i.e. calculate the critical limit) when $a_0 = 3$, $a_1 = 2$ and the significance level $\alpha$ is 5%.
[Hints: $x^k \ln x \to 0$ as $x \to 0$ for $k \ge 1$, and remember that $\Gamma(k) = (k - 1)!$ when $k$ is a positive integer. Further, it can be shown that for $a \ge 2$, $E\left((\ln X)^2\right) = \frac{1}{B(a, 2)} \cdot \left( \frac{1}{a^3} - \frac{1}{(a + 1)^2} \right)$.]

Task 4

Assume an experiment where we make two independent tosses of a coin. We are interested in the probability $\pi$ of obtaining “heads” in a single toss.

(a) What is the minimax point estimator of $\pi$ with quadratic loss function?

(b) Assume we’ll have to pay the amount of 1 SEK if the difference between our estimate and the true value of $\pi$ is at least 0.25. With this loss function, what is the Bayes estimator of $\pi$ when $x = 2$ and when the prior of $\pi$ is the one that makes the Bayes estimator equivalent to the minimax estimator under quadratic loss?

Task 5

In a forensic set-up we are investigating a questioned signature on a will. There is one person suspected of having written this signature without being its owner, i.e. suspected of having made a forgery. But we cannot initially exclude the possibility that the true owner actually wrote the signature. Therefore we study samples of handwriting known to originate from the two persons. We think that the characteristics of the writing are such that they would rarely occur if the true owner was the writer. In terms of probabilities, it is our belief that they would not occur more often than in one in one hundred cases. On the other hand, if the suspect was the writer, we believe these characteristics would be seen in approximately every second case.

(a) If our prior odds for the suspect being the writer are nine to one on, i.e. the prior probability that the suspect is the writer is 0.90, what are the posterior odds upon consideration of the findings, i.e. the study of the characteristics?

(b) Now, assume that besides the true owner there is also a third person that could have written the signature, i.e. an alternative forger.
For that person, investigation of handwriting samples indicates that his writing would show the characteristics found in about one in ten cases. Further, it may be assumed that if we were to choose between the owner of the signature and the third person, the owner has a conditional prior probability of being the writer (conditional on the writer being one of these two) that is twice as large as the conditional prior probability of the third person being the writer. If our prior odds are nine to one on for the suspect (the first person) being the writer against the alternative that one of the owner and the third person is the writer, what are the posterior odds taking all data and prior probabilities into account?

Task 6

Ten independent experiments are made to compare the response times of two competing search engines. The data are the following:

Response time [ms], experiment
Search engine     1     2     3     4     5     6     7     8     9    10
A              14.1  15.4  22.7  10.0   4.4  17.8  11.5  11.9  25.1   7.7
B              15.6  16.0  22.5  10.5   4.3  18.2  12.4  11.9  26.1   8.5

(a) Use a sign test to judge whether engine B generally has longer response times than engine A. Compute the P-value of the test.

(b) Use a Wilcoxon rank-sum test to judge the same thing as in (a). Compute the P-value of the test.
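As a rough illustration of the arithmetic behind Task 6(a), a one-sided sign test on the paired response times could be sketched in Python as follows. The function name `sign_test_one_sided` and the convention of dropping tied pairs before counting signs are assumptions made for this sketch; the exam text does not prescribe how ties should be handled.

```python
from math import comb

# Response times [ms] from the table in Task 6, experiments 1..10.
engine_a = [14.1, 15.4, 22.7, 10.0, 4.4, 17.8, 11.5, 11.9, 25.1, 7.7]
engine_b = [15.6, 16.0, 22.5, 10.5, 4.3, 18.2, 12.4, 11.9, 26.1, 8.5]


def sign_test_one_sided(a, b):
    """One-sided sign test of H0: P(B > A) = 1/2 against H1: P(B > A) > 1/2.

    Assumption for this sketch: tied pairs (b == a) are discarded.
    Returns (number of pairs with b > a, number with b < a, P-value).
    """
    plus = sum(y > x for x, y in zip(a, b))
    minus = sum(y < x for x, y in zip(a, b))
    n = plus + minus  # effective sample size after dropping ties
    # Under H0 the number of plus signs is Binomial(n, 1/2);
    # the P-value is the upper tail probability P(X >= plus).
    p_value = sum(comb(n, k) for k in range(plus, n + 1)) / 2 ** n
    return plus, minus, p_value


if __name__ == "__main__":
    plus, minus, p_value = sign_test_one_sided(engine_a, engine_b)
    print(f"B > A in {plus} pairs, B < A in {minus} pairs, P-value = {p_value:.4f}")
```

The comparison is done directly on the pairs rather than on computed differences, so that the tied pair in experiment 8 is recognised without relying on floating-point subtraction.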