LINKÖPINGS UNIVERSITET
Institutionen för datavetenskap
Statistik, ANd

732A36 THEORY OF STATISTICS, 6 CDTS
Master's program in Statistics and Data Mining
Fall semester 2012
Written exam

Suggested solutions to written exam Jan 17, 2012

Task 1

(a) The mean of the Gamma distribution is

\[
E(X) = \int_0^\infty x \cdot \frac{x^3}{3!\,\lambda^4}\, e^{-x/\lambda}\, dx
     = 4\lambda \int_0^\infty \frac{x^4}{4!\,\lambda^5}\, e^{-x/\lambda}\, dx
     = 4\lambda \cdot 1 = 4\lambda ,
\]

since the latter integral integrates a new Gamma density from zero to infinity. The moment estimator of $\lambda$ satisfies the equation $E(X) = \bar{x}$, which gives the moment estimate

\[
\hat{\lambda}_{MM} = \frac{\bar{x}}{4} = 2 .
\]

(b) Since the Gamma distribution satisfies the usual regularity conditions, the Cramér–Rao inequality applies, and hence the answer is yes. The Fisher information is $I(\lambda) = -E\!\left[\frac{d^2 l(\lambda;x)}{d\lambda^2}\right]$, where

\[
l(\lambda;x) = \ln \prod_{i=1}^{n} \frac{x_i^3}{3!\,\lambda^4}\, e^{-x_i/\lambda}
             = 3\sum_{i=1}^{n}\ln x_i - 4n\ln\lambda - n\ln 3! - \frac{1}{\lambda}\sum_{i=1}^{n} x_i .
\]

Thus,

\[
\frac{dl(\lambda;x)}{d\lambda} = -\frac{4n}{\lambda} + \frac{1}{\lambda^2}\sum_{i=1}^{n} x_i
\qquad\text{and}\qquad
\frac{d^2 l(\lambda;x)}{d\lambda^2} = \frac{4n}{\lambda^2} - \frac{2}{\lambda^3}\sum_{i=1}^{n} x_i ,
\]

giving

\[
I(\lambda) = -\frac{4n}{\lambda^2} + \frac{2}{\lambda^3}\sum_{i=1}^{n} E(X_i)
           = -\frac{4n}{\lambda^2} + \frac{2}{\lambda^3}\cdot n\cdot 4\lambda = \frac{4n}{\lambda^2},
\]

and the lower bound is

\[
I^{-1}(\lambda) = \frac{\lambda^2}{4n} = \frac{\lambda^2}{40} .
\]

An estimate of this lower bound is found by replacing $\lambda$ in $I^{-1}(\lambda)$ by its moment estimate from (a), giving

\[
\widehat{I^{-1}(\lambda)} = \frac{(\bar{x}/4)^2}{40} = \frac{1}{10} .
\]

Task 2

\[
f(x;\theta) = e^{x\ln\theta - \theta - \ln x!} = \frac{\theta^x}{x!}\, e^{-\theta} .
\]

Hence, it is the Poisson distribution.

(a) Use the result that the MLE of $\theta$ has an asymptotic normal distribution with mean $\theta$ and variance $I^{-1}(\theta)$. The log-likelihood is

\[
l(\theta;x) = \sum_{i=1}^{n} x_i \ln\theta - n\theta - \sum_{i=1}^{n}\ln x_i! ,
\]

which is easily derived from the first expression of the density above. Using the exponential family representation with the natural parameterization $\phi = \ln\theta$, the MLE of $\phi$ is found by solving

\[
\sum_{i=1}^{n} x_i = E\!\left(\sum_{i=1}^{n} X_i\right) = n\theta = n e^{\phi},
\]

giving $\hat{\theta}_{MLE} = \bar{x}$ (by the invariance property of MLEs). Now,

\[
\frac{dl(\theta;x)}{d\theta} = \frac{\sum_{i=1}^{n} x_i}{\theta} - n
\qquad\text{and}\qquad
\frac{d^2 l(\theta;x)}{d\theta^2} = -\frac{\sum_{i=1}^{n} x_i}{\theta^2},
\]

which gives

\[
I(\theta) = \frac{\sum_{i=1}^{n} E(X_i)}{\theta^2} = \frac{n\theta}{\theta^2} = \frac{n}{\theta} .
\]

Thus, approximately, $\hat{\theta}_{MLE} \sim N\!\left(\theta,\ \frac{\theta}{n}\right)$. A 95% approximate confidence interval for $\theta$ can be found by solving for $\theta$ in

\[
-1.96 < \frac{\hat{\theta}_{MLE} - \theta}{\sqrt{\theta/n}} < 1.96 .
\]

However, it is simpler, and still good enough, to use the interval

\[
\hat{\theta}_{MLE} \pm 1.96\cdot\frac{\sqrt{\hat{\theta}_{MLE}}}{\sqrt{n}}
 = 3 \pm 1.96\cdot\frac{\sqrt{3}}{\sqrt{50}} \approx 3 \pm 0.5 = (2.5,\ 3.5) .
\]

(b) The likelihood can be written

\[
L(\theta;x) = \prod_{i=1}^{3}\frac{\theta^{x_i}}{x_i!}\,e^{-\theta}\cdot\Pr(X<2)
 \propto \theta^{4+3+3}\, e^{-3\theta}\cdot\left(e^{-\theta} + \theta e^{-\theta}\right)
 = \theta^{10}\, e^{-4\theta}(1+\theta),
\]

and the log-likelihood is

\[
l(\theta;x) = \text{constant} + 10\ln\theta - 4\theta + \ln(1+\theta) .
\]

Hence,

\[
\frac{dl(\theta;x)}{d\theta} = \frac{10}{\theta} - 4 + \frac{1}{1+\theta} .
\]

The score equation $\frac{dl(\theta;x)}{d\theta} = 0$ gives upon simplification

\[
\theta^2 - \frac{7}{4}\theta - \frac{10}{4} = 0
\]

with the solution

\[
\theta = \frac{7 + \sqrt{209}}{8},
\]

since $\theta$ is known to be $> 0$. The second derivative

\[
\frac{d^2 l(\theta;x)}{d\theta^2} = -\frac{10}{\theta^2} - \frac{1}{(1+\theta)^2}
\]

is negative for all $\theta\,(>0)$. Hence,

\[
\hat{\theta}_{MLE} = \frac{7 + \sqrt{209}}{8} \approx 2.68 .
\]
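The root of the score equation can also be verified numerically. The sketch below is not part of the original solution; it is a minimal check in plain Python using only the standard library, with illustrative function names. It locates the zero of the score function by bisection and compares it with the closed-form root.

```python
import math

def score(theta):
    # Score function from Task 2(b): derivative of 10*ln(theta) - 4*theta + ln(1 + theta)
    return 10.0 / theta - 4.0 + 1.0 / (1.0 + theta)

def bisect(f, lo, hi, tol=1e-12):
    # Plain bisection; assumes f changes sign on [lo, hi]
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

numeric_root = bisect(score, 0.1, 10.0)
closed_form = (7.0 + math.sqrt(209.0)) / 8.0
print(numeric_root, closed_form)  # both are approx. 2.6821
```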
Task 3

(a) The likelihood function is

\[
L(a;x) = \prod_{i=1}^{n}\frac{x_i^{a-1}(1-x_i)}{B(a,2)}
 = \frac{\left(\prod_{i=1}^{n} x_i\right)^{a-1}\prod_{i=1}^{n}(1-x_i)}{\left(B(a,2)\right)^{n}} .
\]

The best test satisfies

\[
\frac{L(a_1;x)}{L(a_0;x)} \ge A .
\]

Taking natural logarithms gives

\[
l(a_1;x) - l(a_0;x) \ge \ln A = B
\]
\[
\Rightarrow\ (a_1-1)\sum_{i=1}^{n}\ln x_i + \sum_{i=1}^{n}\ln(1-x_i) - n\ln B(a_1,2)
 - (a_0-1)\sum_{i=1}^{n}\ln x_i - \sum_{i=1}^{n}\ln(1-x_i) + n\ln B(a_0,2) \ge B
\]
\[
\Rightarrow\ (a_1-a_0)\sum_{i=1}^{n}\ln x_i \ge B + n\left(\ln B(a_1,2) - \ln B(a_0,2)\right) .
\]

Since $a_1 - a_0 < 0$ we get

\[
\sum_{i=1}^{n}\ln x_i \le C
\]

as the best test.

(b) No, it is not UMP, since the inequality above changes direction if $a_1 > a_0$.

(c) The Central Limit Theorem can be used on $\sum \ln X_i$ provided we know $E(\ln X_i \mid H_0)$ and $\mathrm{Var}(\ln X_i \mid H_0)$. Integration by parts gives

\[
E(\ln X_i) = \int_0^1 (\ln x)\,\frac{x^{a-1}(1-x)}{B(a,2)}\,dx
 = \frac{1}{B(a,2)}\int_0^1\left((\ln x)\,x^{a-1} - (\ln x)\,x^{a}\right)dx
\]
\[
 = \frac{1}{B(a,2)}\left\{\left[(\ln x)\frac{x^{a}}{a}\right]_0^1 - \int_0^1\frac{x^{a-1}}{a}\,dx
   - \left[(\ln x)\frac{x^{a+1}}{a+1}\right]_0^1 + \int_0^1\frac{x^{a}}{a+1}\,dx\right\}
 = \frac{1}{B(a,2)}\left(\frac{1}{(a+1)^2} - \frac{1}{a^2}\right),
\]

where the boundary terms vanish since $x^{a}\ln x \to 0$ as $x \to 0$. Now,

\[
B(a,2) = \frac{\Gamma(a)\,\Gamma(2)}{\Gamma(a+2)} .
\]

Under $H_0$ we have $a = a_0 = 3$, which is an integer, so

\[
B(3,2) = \frac{\Gamma(3)\,\Gamma(2)}{\Gamma(5)} = \frac{2!\cdot 1!}{4!} = \frac{1}{12}
\]

and

\[
E(\ln X_i \mid H_0) = 12\cdot\left(\frac{1}{4^2} - \frac{1}{3^2}\right) = -\frac{7}{12} \approx -0.583 .
\]

Further, two integrations by parts (the boundary terms again vanish, since $x^{a}(\ln x)^2 \to 0$ as $x \to 0$) give

\[
E\!\left((\ln X_i)^2\right) = \int_0^1 (\ln x)^2\,\frac{x^{a-1}(1-x)}{B(a,2)}\,dx
 = \frac{1}{B(a,2)}\int_0^1\left((\ln x)^2 x^{a-1} - (\ln x)^2 x^{a}\right)dx
 = \frac{1}{B(a,2)}\left(\frac{2}{a^3} - \frac{2}{(a+1)^3}\right),
\]

and this gives

\[
E\!\left((\ln X_i)^2 \mid H_0\right) = 12\cdot\left(\frac{2}{3^3} - \frac{2}{4^3}\right) = \frac{37}{72}
\]

and

\[
\mathrm{Var}(\ln X_i \mid H_0) = \frac{37}{72} - \left(-\frac{7}{12}\right)^2 = \frac{25}{144} \approx 0.174 .
\]

The Central Limit Theorem now gives that, if $H_0$ is true, approximately

\[
\sum_{i=1}^{50}\ln X_i \sim N\!\left(50\cdot\left(-\frac{7}{12}\right);\ 50\cdot\frac{25}{144}\right) = N(-29.17;\ 8.68)
\]
\[
\Rightarrow\ \Pr\!\left(\sum_{i=1}^{50}\ln X_i \le C\right) \approx \Phi\!\left(\frac{C + 29.17}{\sqrt{8.68}}\right).
\]

Hence, with $\alpha = 5\%$ and $z_{0.05} = 1.6449$ we get

\[
C \approx -29.17 - 1.6449\cdot\sqrt{8.68} \approx -34.0 .
\]
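The two moments and the critical value above can be checked numerically. The sketch below is not part of the original solution; it is plain Python using only the standard library, with illustrative names. It approximates the expectations under the Beta(3, 2) density with a midpoint rule and recomputes $C$ for $\alpha = 0.05$.

```python
import math

def beta32_pdf(x):
    # Beta(a = 3, b = 2) density: x^2 (1 - x) / B(3, 2), where B(3, 2) = 1/12
    return 12.0 * x ** 2 * (1.0 - x)

def expect(g, steps=200000):
    # Midpoint-rule approximation of E[g(X)] for X ~ Beta(3, 2)
    h = 1.0 / steps
    return sum(g((k + 0.5) * h) * beta32_pdf((k + 0.5) * h) * h for k in range(steps))

m1 = expect(math.log)                    # approx. -7/12  = -0.5833
m2 = expect(lambda x: math.log(x) ** 2)  # approx.  37/72 =  0.5139
var = m2 - m1 ** 2                       # approx. 25/144 =  0.1736

n, z05 = 50, 1.6449
C = n * m1 - z05 * math.sqrt(n * var)    # approx. -34.0
print(m1, var, C)
```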
Task 4

Repeated coin tossing is a binomial experiment. Thus the likelihood function for the experimental data in this case is

\[
L(\pi;x) = \binom{2}{x}\pi^{x}(1-\pi)^{2-x} .
\]

(a) The minimax estimator coincides with the Bayes estimator when the risk function is constant. The conjugate prior to the binomial likelihood is the beta distribution with parameters $\alpha$ and $\beta$. The Bayes estimator under quadratic loss is the posterior mean, i.e.

\[
\hat{\pi}_B = \frac{\alpha + x}{\alpha + \beta + n} = \frac{\alpha + x}{\alpha + \beta + 2} .
\]

For this estimator the risk function is

\[
R(\pi;\hat{\pi}) = E_{X\mid\pi}(\hat{\pi} - \pi)^2 = \mathrm{Var}(\hat{\pi}) + \left(\mathrm{Bias}(\hat{\pi})\right)^2
 = \mathrm{Var}\!\left(\frac{\alpha + X}{\alpha+\beta+2}\right) + \left(E\!\left(\frac{\alpha + X}{\alpha+\beta+2}\right) - \pi\right)^2 .
\]

Using $\mathrm{Var}(X) = 2\pi(1-\pi)$ and $E(X) = 2\pi$, this becomes

\[
R(\pi;\hat{\pi}) = \frac{2\pi(1-\pi)}{(\alpha+\beta+2)^2} + \left(\frac{\alpha + 2\pi}{\alpha+\beta+2} - \pi\right)^2
 = \frac{\left[\alpha + 2\pi - \pi(\alpha+\beta+2)\right]^2 + 2\pi(1-\pi)}{(\alpha+\beta+2)^2}
 = \frac{\left[\alpha - \pi(\alpha+\beta)\right]^2 + 2\pi(1-\pi)}{(\alpha+\beta+2)^2} .
\]

For this risk function to be independent of $\pi$ we require the coefficients of $\pi$ and $\pi^2$ in the numerator to be zero. This gives

\[
(\alpha+\beta)^2 = 2 \qquad\text{and}\qquad 2\alpha(\alpha+\beta) = 2,
\]

which is satisfied by $\alpha = \beta = \frac{\sqrt{2}}{2}$. (The whole derivation for a general $n$ is made in the textbook on page 121, unfortunately with a small error stating that the common value should be $\sqrt{n/2}$; the correct value is $\sqrt{n}/2$.)

Thus, with these values of $\alpha$ and $\beta$ the Bayes estimator, and hence also the minimax estimator, becomes

\[
\hat{\pi} = \frac{\sqrt{2}/2 + x}{\sqrt{2} + 2} .
\]

(b) This loss function is a zero-one loss function:

\[
L_S(\pi,\hat{\pi}) =
\begin{cases}
0, & |\hat{\pi} - \pi| < 0.25 \\
1, & |\hat{\pi} - \pi| \ge 0.25
\end{cases}
\]

With zero-one loss the Bayes estimator is the mode of the posterior distribution. With a beta$(\alpha,\beta)$ prior the posterior is beta$(\alpha + x,\ \beta + n - x)$, and when both posterior parameters exceed one the mode is

\[
\hat{\pi}_B = \frac{\alpha + x - 1}{\alpha + \beta + n - 2} = \frac{\alpha + x - 1}{\alpha + \beta}
\qquad (n = 2).
\]

The prior to be used here is the one that was derived in (a), i.e. a beta with $\alpha = \beta = \frac{\sqrt{2}}{2}$. With $x = 2$ the posterior is beta$(\sqrt{2}/2 + 2,\ \sqrt{2}/2)$; its second parameter is smaller than one, so the interior-mode formula above does not apply (it would give $(\sqrt{2}/2 + 1)/\sqrt{2} \approx 1.21$, outside the parameter space). The posterior density is instead increasing all the way up to $\pi = 1$, so the posterior mode, and hence the Bayes estimate, is

\[
\hat{\pi}_B = 1 .
\]

Task 5

(a) We would like to test

$H_0$: The suspect is the writer of the signature

against

$H_1$: The owner of the signature is the writer of the signature.

The study of the samples of handwriting gives us (approximate) likelihoods of the two hypotheses:

\[
L(H_0;x) = \Pr(\text{Characteristics}\mid H_0) = 1/2, \qquad
L(H_1;x) = \Pr(\text{Characteristics}\mid H_1) = 1/100 .
\]

Since the two hypotheses are both simple, the Bayes factor is

\[
B = \frac{1/2}{1/100} = 50 .
\]

Now, since the prior odds for $H_0$ are nine to one on ($Q = 9$), the posterior odds are

\[
Q^{*} = B\cdot Q = 50\cdot 9 = 450,
\]

or 450 to 1 on.

(b) In this case we test $H_0$ above against the composite hypothesis

$H_2$: One of the owner and the third person is the writer of the signature.

The likelihood for the third person being the writer is (analogously to the previous likelihoods)

\[
\Pr(\text{Characteristics}\mid\text{Third person}) = 1/10 .
\]

Further, we also have that

\[
\Pr(\text{Owner}\mid H_2) = 2\cdot\Pr(\text{Third person}\mid H_2)
\ \Rightarrow\ \Pr(\text{Owner}\mid H_2) = 2/3 \ \text{and}\ \Pr(\text{Third person}\mid H_2) = 1/3 .
\]

The Bayes factor then becomes

\[
B = \frac{\Pr(\text{Characteristics}\mid H_0)}
         {\Pr(\text{Characteristics}\mid\text{Owner})\cdot(2/3) + \Pr(\text{Characteristics}\mid\text{Third person})\cdot(1/3)}
  = \frac{1/2}{(1/100)\cdot(2/3) + (1/10)\cdot(1/3)} = 12.5,
\]

and the posterior odds become

\[
Q^{*} = 12.5\cdot 9 = 112.5 .
\]

Task 6

(a) Compute the difference in response time between engine A and engine B:

pair            1      2      3      4      5      6      7      8      9     10
resp. time A   14.1   15.4   22.7   10.0    4.4   17.8   11.5   11.9   25.1    7.7
resp. time B   15.6   16.0   22.5   10.5    4.3   18.2   12.4   11.9   26.1    8.5
difference     -1.5   -0.6    0.2   -0.5    0.1   -0.4   -0.9    0.0   -1.0   -0.8

Discard the pair with equal response times. Among the remaining nine, seven differences are negative. Under the null hypothesis of generally equally long response times (median difference = 0), the number of negative differences, $X$, follows a Bin$(9,\ 0.5)$ distribution. Hence, the $P$-value is

\[
\Pr(X \ge 7) = \Pr(X \le 2) = \left[\binom{9}{0} + \binom{9}{1} + \binom{9}{2}\right]\cdot 0.5^{9} \approx 0.09 .
\]

(b) Rank the absolute differences from (a) (the zero difference is discarded) and compute the rank sum of the absolute differences originating in negative differences:

pair            1      2      3      4      5      6      7      8      9     10
difference     -1.5   -0.6    0.2   -0.5    0.1   -0.4   -0.9    0.0   -1.0   -0.8
abs. diff.      1.5    0.6    0.2    0.5    0.1    0.4    0.9     -     1.0    0.8
rank             9      5      2      4      1      3      7      -      8      6

The rank sum of the originally negative differences becomes

\[
W = 9 + 5 + 4 + 3 + 7 + 8 + 6 = 42 .
\]

Under the assumption that the response times are generally equally long for the two engines, and with $n = 9$, approximately

\[
W \sim N\!\left(\frac{n(n+1)}{4};\ \frac{n(n+1)(2n+1)}{24}\right) = N(22.5;\ 71.25) .
\]

Hence, the $P$-value is approximately

\[
\Pr(W \ge 42) \approx 1 - \Phi\!\left(\frac{42 - 22.5}{\sqrt{1710/24}}\right) \approx 1 - \Phi(2.31) \approx 0.01 .
\]
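Both tests in Task 6 can be reproduced numerically. The following sketch is not part of the original solution; it is plain Python using only the standard library, with illustrative variable names. It recomputes the exact sign-test $P$-value and the Wilcoxon signed-rank statistic with its normal approximation.

```python
import math

resp_A = [14.1, 15.4, 22.7, 10.0, 4.4, 17.8, 11.5, 11.9, 25.1, 7.7]
resp_B = [15.6, 16.0, 22.5, 10.5, 4.3, 18.2, 12.4, 11.9, 26.1, 8.5]
diffs = [round(a - b, 1) for a, b in zip(resp_A, resp_B)]
nonzero = [d for d in diffs if d != 0]           # discard the tied pair
n = len(nonzero)                                 # 9

# (a) Sign test: number of negative differences under Bin(9, 0.5)
neg = sum(d < 0 for d in nonzero)                # 7
p_sign = sum(math.comb(n, k) for k in range(n - neg + 1)) * 0.5 ** n
print(neg, round(p_sign, 3))                     # 7, approx. 0.090

# (b) Wilcoxon signed-rank test, normal approximation
order = sorted(range(n), key=lambda i: abs(nonzero[i]))
ranks = [0] * n
for r, i in enumerate(order, start=1):
    ranks[i] = r
W = sum(r for r, d in zip(ranks, nonzero) if d < 0)   # 42
mean = n * (n + 1) / 4                                # 22.5
var = n * (n + 1) * (2 * n + 1) / 24                  # 71.25
z = (W - mean) / math.sqrt(var)
p_wilcoxon = 0.5 * math.erfc(z / math.sqrt(2))        # upper tail, approx. 0.01
print(W, round(z, 2), round(p_wilcoxon, 3))
```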