4769 Mark Scheme June 2006 Q1 (i) L= 1 σ 1 2π e − ln L = const − (W1 − μ ) 2 2σ 12 1 2σ 2 1 ⋅ 1 σ 2 2π e (W1 − μ ) 2 − − (W2 − μ ) 2 2σ 22 1 2σ 2 2 (W2 − μ ) 2 2 d ln L 2 = (W2 − μ ) (W1 − μ ) + dμ 2σ 22 2σ 12 = 0 ⇒ σ 22W1 − σ 22 μ + σ 12W2 − σ 12 μ = 0 σ 22W1 + σ 12W2 σ 12 + σ 22 Check this is a maximum. ⇒ μˆ = E.g. (ii) (iii) d ln L 1 1 =− 2 − 2 <0 2 σ1 σ 2 dμ M1 A1 M1 A1 A1 Differentiate w.r.t. μ. A1 BEWARE PRINTED ANSWER. M1 11 A1 M1 σ 22 μ + σ 12 μ =μ σ 12 + σ 22 ∴ unbiased. E( μˆ ) = 2 A1 2 ⎛ ⎞ 1 ⎟ ⋅ (σ 24σ 12 + σ 14σ 22 ) Var(μˆ ) = ⎜⎜ 2 2 ⎟ + σ σ 1 ⎠ ⎝ 1 B1 B1 σ 12σ 22 σ 12 + σ 22 First factor. Second factor. Simplification not required at this point. 2 T = 12 (W1 + W2 ) B1 Var(T) = 14 (σ 12 + σ 22 ) Relative efficiency (y) = (v) Product form. Two Normal terms. Fully correct. 2 = (iv) M1 M1 A1 Var( μˆ ) Var(T ) = σ 24σ 12 + σ 14σ 22 4 ⋅ 2 2 2 2 (σ 1 + σ 2 ) σ 1 + σ 22 = 4σ 12σ 22 (σ 12 + σ 22 ) 2 E.g. consider σ 12 + σ 22 − 2σ 1σ 2 = (σ 1 − σ 2 ) 2 ≥ 0 ∴ Denominator ≥ numerator, ∴ fraction ≤ 1 [Both μ̂ and T are unbiased,] μ̂ has smaller variance than T and is therefore better. M1 M1 Any attempt to compare variances. If correct. A1 A1 BEWARE PRINTED ANSWER. 5 M1 E1 E1 E1 4 24 4769 Q2 Mark Scheme f ( x) = λk +1 x k e − λx k! Given: (i) ∫ ∞ 0 [ x > 0 (λ > 0, k integer ≥ 0)] , u m e -u du = m! M1 M X (θ ) = E[eθx ] =∫ ∞ λk +1 0 k! x k e −( λ −θ ) x dx M1 Put (λ – θ)x = u λk +1 = k!(λ − θ ) k +1 ⎛ λ ⎞ =⎜ ⎟ ⎝ λ −θ ⎠ (ii) June 2006 ∫ ∞ 0 u k e − u du M1 A1 A1 A1 A1 k +1 Y = X1 + X2 + … + Xn By convolution theorem:- mgf of Y is {MX(θ)}n nk + n ⎛ λ ⎞ i.e. ⎜ ⎟ ⎝ λ −θ ⎠ μ = M ′(0) B1 M ′(θ ) = λ nk + n (− nk − n)(λ − θ ) − nk − n −1 (−1) M1 A1 ∴μ = nk + n λ σ = M ′′(0) − μ 2 A1 M ′′(θ ) = (nk + n)λnk + n (−nk − n − 1)(λ − θ ) − nk − n − 2 (−1) M1 ∴ M ′′(0) = (nk + n)(nk + n + 1) / λ 2 A1 For obtaining this expression after substitution. Take out constants. (Dep on subst.) Apply “given”: integral = k! (Dep on subst.) BEWARE PRINTED ANSWER. 7 2 ∴σ 2 = = (iii) (nk + n)(nk + n + 1) λ 2 − (nk + n) 2 λ2 nk + n 8 A1 λ2 [Note that MY(t) is of the same functional form as MX(t) with k + 1 replaced by nk + n, i.e. k replaced by nk + n –1. This must also be true of the pdf.] Pdf of Y is λnk + n (nk + n − 1)! × y nk + n −1 × e −λy [for y > 0] (iv) M1 λ = 1, k = 2, n = 5, 0·9165 Use of N(15, 15) B1 B1 B1 One mark for each factor of the expression. Mark for third factor shown here depends on at least one of the other two earned. M1 M1 Mean. ft (ii). Variance. ft (ii). Exact P(Y > 10) = 3 4769 Mark Scheme ⎛ ⎞ 10 − 15 P(this > 10) = P⎜⎜ N(0, 1) > = −1 ⋅ 291⎟⎟ ⎝ 15 ⎠ = 0·9017 Reasonably good agreement – CLT working for only small n. June 2006 A1 c.a.o. A1 E2 c.a.o. (E1, E1) [Or other sensible comments.] 6 24 4769 Mark Scheme June 2006 Q3 (i) x = 36.48 s = 9.6307 s 2 = 92.7507 y = 45.5 s = 14.8129 s 2 = 219.4218 Assumptions: Normality of both populations equal variances H0 : μA = μB H1 : μA ≠ μB Where μA, μB are the population means. 9 × 92.7507 + 11 × 219.4218 20 834.756 + 24136.64 = = 162.4198 20 36 . 48 − 45 . 5 Test statistic is 1 1 162 . 4198 + 10 12 −9.02 = = −1.653 5.4568 B1 B1 B1 B1 B1 If all correct. [No marks for use of sn which are 9.1365 and 14.1823 respectively.] Do NOT accept X = Y or similar. Pooled s 2 = (ii) B1 M1 A1 Refer to t20. Double tailed 5% point is 2·086. Not significant. No evidence that population mean times differ. M1 A1 A1 A1 Assumption: Normality of underlying population of differences. H0 : μD = 0 H1 : μD > 0 Where μD is the population mean of “before – after” differences. B1 Differences are 6.4, 4.4, 3.9, -1.0, 5.6, 8.8, -1.8, 12.1 ( x = 4.8 = (12.7444)2 B1 B1 No ft from here if wrong. No ft from here if wrong. ft only c’s test statistic. ft only c’s test statistic. Do NOT accept D = 0 or similar. The “direction” of D must be CLEAR. Allow μA = μB etc. M1 s = 4.6393) [A1 can be awarded here if NOT awarded in part (i)]. Use of sn (=4.3396) is NOT acceptable, even in a denominator of Test statistic is sn n −1 4 .8 − 0 4 . 6393 8 =2.92(64) (iii) 12 M1 A1 Refer to t7. Single tailed 5% point is 1.895. Significant. Seems mean is lowered. M1 A1 A1 A1 No ft from here if wrong. No ft from here if wrong. ft only c’s test statistic. ft only c’s test statistic. 10 The paired comparison in part (ii) eliminates the variability between workers. E2 (E1, E1) 2 24 4769 Mark Scheme June 2006 Q4 (i) B1 Latin square. Layout such as: I II III IV V Surf -aces (ii) (iii) Locations 2 3 4 B C D C D E D E A E A B A B C 1 A B C D E 5 E A B C D B1 B1 Xij = μ + αi + eij B1 μ = population grand mean for whole experiment. B1 B1 αi = population mean amount by which the ith treatment differs from μ. B1 eij are experimental errors ~ ind B1 B1 B1 B1 N(0, σ2). (letters = paints) Correct rows and columns. A correct arrangement of letters. SC. For a description instead of an example allow max 1 out of 2. B1 Allow “uncorrelated”. Mean. Variance. Totals are: 322, 351, 307, 355, 291 (each from sample of size 5) Grand total: 1626 “Correction factor” CF = 1626 2 = 105755.04 25 Total SS = 106838 – CF = 1082.96 Between paints SS = 322 2 2912 + ... + − CF 5 5 = 106368 – CF =612.96 Residual SS (by subtraction) = 1082.96 – 612.96 = 470.00 Source of variation Between paints SS df MS 612.96 4 Residual Total 470.00 1082.96 20 24 153.2 4 23.5 MS ratio = 153.24 = 6.52 23.5 3 M1 M1 A1 For correct methods for any two SS. If each calculated SS is correct. B1 B1 M1 Degrees of freedom “between paints”. Degrees of freedom “residual”. MS column. M1 A1 Independent of previous M1. Dep only on this M1. 9 4769 Mark Scheme June 2006 Refer to F4, 20 M1 Upper 5% point is 2.87 Significant. A1 A1 Seems performances of paints are not all the same. A1 No ft if wrong. But allow ft of wrong d.o.f. above. No ft if wrong. ft only c’s test statistic and d.o.f.’s. ft only c’s test statistic and d.o.f.’s. 12 24 4769 1) (i) (ii) Mark Scheme f (x) = 1 θ E[ X ] = June 2007 0≤x ≤ θ θ B1 2 E[2 X ] = 2E[ X ] = 2E[ X ] = θ ∴ unbiased 2 .3 ∑ x = 2.3 ∴x= = 0.46 ∴ 2 x = 0.92 5 But we know θ ≥ 1 ∴ estimator can give nonsense answers, Write-down, or by symmetry, or by integration. M1 A1 E1 4 B1 E1 E2 (E1, E1) i.e. essentially useless (iii) Y = max{Xi}, g(y) = ny 4 n −1 0≤y≤ θ θn MSE (kY) = E[(kY − θ ) 2 ] = E[k 2Y 2 − 2kθ Y + θ 2 ] = k 2 E[Y 2 ] − 2kθ E[Y ] + θ 2 M1 1 M1 dMSE = dk 2kE[Y 2 ] − 2θ E[Y ] = 0 θ E[Y ] for k = E[Y 2 ] M1 A1 d 2 MSE = 2E[Y 2 ] > 0 ∴ this is a minimum 2 dk θ ny n n θ n+1 nθ dy = = E[Y ] = ∫ n n θ n +1 n +1 θ 0 θ E[Y 2 ] = ∫ 0 ny n+1 θn dy = ∴ minimising k = θ BEWARE PRINTED ANSWER M1 M1 A1 M1 A1 n θ n+ 2 nθ 2 = θn n+2 n+2 nθ n + 2 n+2 = 2 n + 1 nθ n +1 M1 A1 12 (iv) With this k, kY is always greater than the sample maximum So it does not suffer from the disadvantage in part (ii) 89 E2 (E1 E1) E2 (E1 E1) 4 4769 2(i) (ii) (iii) (iv) Mark Scheme n ⎛n⎞ G (t ) = E[t X ] = ∑ ⎜⎜ ⎟⎟( pt ) x (1 − p ) n− x x =0 ⎝ x ⎠ n = [(1 − p ) + pt ] M1 2 ∴ M Z (θ ) = e ⎛ − ⎜ qe ⎜ ⎝ (v) pθ npq + pe M Z (θ ) = (q − terms in n −3 2 ⎛ ⎜ q + pe ⎜ ⎝ 1− p θ npq ⎞ ⎟ ⎟ ⎠ 1 1 1 M1 e θ2 θ2 2n 6 1 B1 For BOTH 1 1 M1 1 θ npq n ⎞ ⎟ = ⎟ ⎠ 1 1 n 1 BEWARE PRINTED ANSWER 5 qpθ qp 2θ 2 + + npq 2npq , n −2 , ………….. + M1 For expansion of exponential terms M1 For indication that these can be neglected as n → ∞ . Use of result given in question pqθ pq 2θ 2 p+ + + "") n = npq 2npq (1 + 4 1 M Z (θ ) = ebθ M X (aθ ) np θ q Available as B2 for write-down or as 1+1 for algebra 1 = (q + pt ) n μ = G ′(1) G′(t ) = np(q + pt ) n−1 G ′(1) = np × 1 = np σ 2 = G′′(1) + μ − μ 2 G′′(t ) = n(n − 1) p 2 (q + pt ) n−2 G' ' (1) = n(n − 1) p 2 ∴σ 2 = n 2 p 2 − np 2 + np − n 2 p 2 = −np 2 + np = npq X −μ Z= Mean 0, Variance 1 σ M (θ ) = G (eθ ) = (q + peθ ) n Z = aX + b with: 1 1 μ np a= = and b = − = − σ σ q npq − June 2007 + "") n → 1 2 1 90 4 4769 (vi) Mark Scheme N(0,1) Because e June 2007 1 θ2 is the mgf of N(0,1) E1 and the relationship between distributions and their mgfs is unique E1 2 (vii) “Unstandardising”, N( μ , σ 2 ) ie N ( np , npq ) 91 3 1 Parameters need to be given. 1 4769 3(i) Mark Scheme H 0 : μ A = μB H1 : μ A ≠ μ B 1 Where μ A , μ B are the population means 1 June 2007 Do NOT allow X = Y or similar Accept absence of “population” if correct notation μ is used. Hypotheses stated verbally must include the word “population”. Test statistic 26.4 − 25.38 = 2.45 1.40 + 7 5 1.02 = 1.285 0.63 = 0.7937 1.645 × No FT if wrong No FT if wrong 10 B1 M1 (– 0.2856, 2.3256) H 0 is accepted if –1.96< test statistic < 1.96 A1 cao M1 i.e. if − 1.96 < x − y < 1.96 M1 0.7937 i.e. if − 1.556 < x − y < 1.556 A1 M1 In fact, X − Y ~ N(2,0.7937 ) So we want P(−1.556 < N(2,0.7937 2 ) < 1.556) = 2 Zero out of 4 if not N(0,1) 4 SC1 Same wrong test can get M1,M1,A0. SC2 Use of 1.645 gets 2 out of 3. BEWARE PRINTED ANSWER M1 1.556 − 2 ⎞ ⎛ − 1.556 − 2 P⎜ < N (0,1) < ⎟= 0.7937 ⎠ ⎝ 0.7937 P(−4.48 < N(0,1) < −0.5594) = 0.2879 (iv) Denominator two separate terms correct M1 0.7937 = 1.02 ± 1.3056 = (iii) M1 M1 1 1 1 1 CI ( for μ A – μ B ) is 1.02 ± Numerator A1 Refer to N(0,1) Double-tailed 5% point is 1.96 Not significant No evidence that the population means differ (ii) M1 M1 A1 cao Wilcoxon would give protection if assumption of Normality is wrong. Wilcoxon could not really be applied if underlying variances are indeed different. Wilcoxon would be less powerful (worse Type II error behaviour) with such small samples if Normality is correct. 92 Standardising 7 E1 E1 E1 3 4769 4 (i) (ii) (iii) Mark Scheme There might be some consistent source of plotto-plot variation that has inflated the residual and which the design has failed to cater for. Variation between the fertilisers should be compared with experimental error. E1 If the residual is inflated so that it measures more than experimental error, the comparison of between - fertilisers variation with it is less likely to reach significance. Randomised blocks C B A D E . . . . . . E2 E1 – Some reference to extra variation. E1 – Some indication of a reason. 2 E2 1 (E1, E1) 3 E1 Blocks (strips) clearly correctly oriented w.r.t. fertiliser gradient. E1 All fertilisers appear in a block. E1 Different (random) arrangements in the blocks. 2 (1, E1) 4 Totals are: 95.0 123.2 86.8 130.2 67.4 (each from sample of size 4) Grand total 502.6 2 “Correction factor” CF = 502.6 = 12630.338 20 Total SS = 13610.22 – CF = 979.882 SPECIAL CASE: Latin Square (iv) June 2007 4 Between fertilisers SS = M1 95.0 2 67.4 2 – CF = + ... + 4 4 M1 For correct method for any two A1 If each calculated SS is correct 13308.07 – CF = 677.732 Residual SS (by subtraction) = 979.882 – 677.732=302.15 Source of variation SS df Between fertiliser Residual 677.732 302.15 4 15 Total 979.882 19 MS 169.433 20.143 MS Ratio 8.41 Refer to F4, 15 -upper 5% point is 3.06 Significant - seems effects of fertilisers are not all the same (vii) Independent N (0, σ2 [constant]) 93 M1 M1 1,A1 1 1 1 1 1 1 1 1 No FT if wrong No FT if wrong 12 3 4769 Mark Scheme June 2008 4769 Statistics 4 Q1 (i) L= x e −θ θ x1 e −θ θ xn ⎡ e −nθ θ ∑ i ⎤ … ⎢= ⎥ x1! xn ! ⎣⎢ x1! x2 ! xn !⎦⎥ ln L = const − n θ + ∑x i ln θ product form A1 fully correct M1 A1 d ln L ∑ xi = 0 = −n + dθ θ x ∑ i (= x ) ⇒ θˆ = n M1 A1 A1 Check this is a maximum e.g. M1 M1 d 2 ln L ∑ xi =− 2 <0 2 θ dθ A1 (ii) λ = P( X = 0) = e −θ B1 (iii) We have R ~ B(n, e −θ ) , M1 so E ( R) = ne −θ −θ B1 R n M1 ~ ∴ E(λ ) = e −θ A1 A1 ~ 9 1 B1 −θ Var( R) = ne (1 − e ) λ= CAO i.e. unbiased ~ e −θ (1 − e −θ ) Var(λ ) = n A1 BEWARE PRINTED ANSWER 7 77 4769 (iv) Mark Scheme June 2008 ~ Relative efficiency of λ wrt ML est Var(ML Est) ~ Var(λ ) = = θe Eg:- −2θ n ⋅ M1 M1 n e − θ (1 − e − θ ) Expression is θ+ = θ θ2 2! any attempt to compare variances if correct θ A1 e θ −1 BEWARE PRINTED ANSWER M1 +… always < 1 E1 and this is ≈ 1 if θ is small ≈ 0 if θ is large E1 E1 Allow statement that θ θ e −1 → 0 as θ → ∞ 7 78 4769 Q2 (i) Mark Scheme B1 P( X = x) = q x −1 p Pgf G (t ) = E (t X ) = June 2008 ∞ ∑ pt x q x −1 M1 x =1 = pt (1 + qt + q 2 t 2 + …) A1 = pt (1 − qt ) −1 A1 μ = G ' (1) σ 2 = G ' ' (1) + μ − μ 2 M1 G ' (t ) = pt (−1)(1 − qt ) −2 (−q ) + p (1 − qt ) −1 = pqt (1 − qt ) − 2 + p (1 − qt ) −1 ∴ G ' (1) = pq (1 − q) −2 + p (1 − q) −1 = FT into pgf only BEWARE PRINTED ANSWER [consideration of |qt| < 1 not required] for attempt to find G'(t) and/or G"(t) A1 q 1 +1 = p p A1 BEWARE PRINTED ANSWER G ' ' (t ) = pqt (−2)(1 − qt ) −3 (−q ) + pq (1 − qt ) −2 + p (−1)(1 − qt ) − 2 (− q ) A1 ∴ G' ' (1) = 2 pq 2 (1 − q ) −3 + pq (1 − q ) −2 + pq(1 − q ) −2 = 2 q 2 2q + p2 p A1 2q 2 2 q 1 1 2q 2 + 2 pq + p − 1 + + − 2 = 2 p p p p p2 q q = 2 (2q + 2 p − 1) = 2 p p ∴σ 2 = M1 For inserting their values A1 BEWARE PRINTED ANSWER 11 79 4769 (ii) Mark Scheme X1=number of trials to first success X2= “ “ “ “ “ next “ . . . . Xn= “ “ “ “ “ nth “ ∴ Y=X1+X2+…..Xn = total no of trials to the nth success ∴ pgf of Y = (pgf of X ) n = p nt n (1 − qt ) − n (iii) (iv) μ Y = nμ X = n p σ Y2 = nσ X2 = nq p2 1 N(candidate’s μ Y , candidate’s σ Y2 ) Y = no of tickets to be sold ~ random variable as in (ii) with n = 140 and p = 0.8 ~ Approx N ( 140 = 175, 140 × 0.2 = 43.75 ) (0.8)2 1 P(Y ≥ 160) ≈ P(N(175,43.75) > 159 ) 2 5 1 E1 1 Do not award if cty corr absent or wrong, but FT if 160 used → -2.268, 0.9884 A1 A1 For any sensible discussion in context (eg groups of passengers ⇒ not indep.) (ii) 1 1 M1 = P(N(0,1)>-2.343) = 0.9905 (i) E1 E1 1 0.8 Q3 June 2008 CAO E1 E1 7 X = amount of salt ~ N( μ [750],σ 2 [20 2 ] ) Sample of n=9 Type I error: rejecting null hypothesis … … when it is true. B1 B1 Allow B1 for P(rej H0 when true) Type II error: accepting null hypothesis … … when it is false. B1 B1 Allow B1 for P(acc H0 when false) OC: P (accepting null hypothesis … … as a function of the parameter under investigation) Reject if x < 735 or x > 765 20 2 α = P ( X < 735 or X > 765 | X ~ N(750, )) 9 (735 − 750)3 = P( Z < = −2.25 20 (765 - 750)3 or Z > = 2.25) 20 B1 B1 [ P(type II error | the true value of the parameter) scores B1+B1] M1 Might be implicit A1 A1 = 2(1-0.9878) = 2 × 0.0122 = 0.0244 A1 This is the probability of rejecting good output and unnecessarily re-calibrating the machine – seems small [but not very small?] E1 E1 80 6 CAO Accept any sensible comments 6 4769 (iii) (iv) Mark Scheme June 2008 M1 Accept if 735 < x < 765, and now μ = 725 . 2 β = P(735 < X < 765 | X ~ N(725, 20 9 )) = P(1.5 < Z< 6) = 1 – 0.9332 = 0.0668 A1 A1 A1 This is the probability of accepting output and carrying on when in fact μ has slipped to 725 – small[-ish?] E1 E1 OC = P( 735 < X < 765 | X ~ N( μ , 20 M1 =Ф ⎛ ( 765 − μ ) 3 ⎞ – ⎟ ⎜ 20 ⎠ ⎝ Ф 2 9 )) might be implicit CAO If upper limit 765 not considered, maximum 2 of these 4 marks. If Ф(6) not considered, maximum 3 out of 4. accept sensible comments 6 ⎛ ( 735 − μ ) 3 ⎞ ⎜ ⎟ 20 ⎝ ⎠ ″ Ф − Ф″ μ=720:Ф(6.75) – Ф(2.25)=1– 0.9878 =0.0122 730: 5.25 0.75 =1– 0.7734 =0.2266 740: 3.75 −0.75 =1– (1–0.7734)= 0.7734 M1 A1 both correct 1 if any two correct 750: similarly or by write-down from part (ii) [ FT ] : 0.9756 1 760, 770, 780 by symmetry [FT]: 0.7734, 0.2266, 0.0122 1 6 Q4 (i) xij = μ + α i + eij μ = population … .. grand mean for whole experiment α i = population … μ .. mean by which i th treatment differs from 1 3 eij are experimental errors… ~ ind N (0, σ 2 ) (ii) Totals are 240, 246, 254, 264, 196 each from sample of size 5 Grand total 936 “Correction factor” CF = 1 1 1 1 1 936 2 = 43804.8 20 Total SS = 44544 - CF = 739.2 81 Allow “uncorrelated” 1 for ind N; 1 for 0; 1 for σ2. 9 4769 Mark Scheme June 2008 M1 M1 For correct methods for any two, A1 if each calculated SS is correct. Between contractors SS = 240 2 196 2 +…+ − CF = 44209.6 – CF = 404.8 5 5 Residual SS ( by subtraction) = 739.2 – 404.8 = 334.4 M1 Source of Variation SS df MS MS ratio M1 1 Between Contractors 404.8 3 Residual 334.4 16 134 .93 6.456 A1 20.9 CAO Total 739.2 1 19 1 NO FT IF WRONG Upper 5% point is 3.24 1 NO FT IF WRONG Significant 1 Seems performances of contractors are not all the same 1 Refer to F3,16 12 (iii) Randomised blocks B1 Description E1 E1 82 Take the subject areas as “blocks”, ensure each contractor is used at least once in each block 3 4769 Mark Scheme June 2009 4769 Statistics 4 June 2009 Q1 (i) Follow-through all intermediate results in this question, unless obvious nonsense. P(X ≥ 2) = 1 – θ – θ (1 – θ) = (1 – θ)2 [o.e.] L = [θ] =θ (ii) n0 n0 + n1 [θ(1 – θ)] n 1 [(1-θ)2] (1 – θ) n − n0 − n1 2 n − 2 n0 − n1 ln L = (n0 + n1 ) ln θ + (2n − 2n0 − n1 ) ln (1 – θ) d ln L dθ n0 + n1 = θ =0 M1 A1 M1 A1 A1 Product form Fully correct BEWARE PRINTED ANSWER 5 M1 A1 M1 – 2n − 2n0 − n1 1−θ A1 M1 ⇒ (1 - θˆ ) (n 0 + n 1 ) = θˆ (2n − 2n0 − n1 ) ⇒ θˆ = (iii) ∞ E(X) = A1 n0 + n1 2n − n0 ∑ xθ (1 − θ ) 6 M1 x x=0 = θ {0 + (1 – θ) + 2(1 – θ) 2 + 3(1 − θ ) 3 + …} = A2 1−θ θ Divisible, for algebra; e.g. by “GP of GPs” BEWARE PRINTED ANSWER So could sensibly use (method of moments) ~ 1−θ θ given by ~ = X ~ 1 ⇒θ = 1+ X ~ M1 θ A1 To use this, we need to know the exact numbers of faults for components with “two or more”. (iv) BEWARE PRINTED ANSWER E1 6 B1 14 = 0 ⋅ 14 100 1 ~ = 0·8772 θ = 1 + 0.14 x= B1 Also, from expression given in question, ~ 0 ⋅ 8772 2 (1 − 0 ⋅ 8772) Var(θ ) = 100 B1 = 0·000945 CI is given by 0·8772 ± 1·96 x (0·817, 0·937) 0 ⋅ 000945 = 80 M1 B1 M1 A1 For 0.8772 For 1.96 For 0 ⋅ 000945 7 4769 Mark Scheme June 2009 Q2 (i) Mgf of Z = E (e tZ ) = ∞ 1 −∞ 2π ∫ Complete the square e tz − z2 2 M1 dz M1 A1 A1 1 1 z2 tz − = − (z − t) 2 + t 2 2 2 2 = e t2 2 ∞ 1 −∞ 2π ∫ e − ( z −t ) 2 2 dt = e t2 2 M1 M1 M1 Pdf of N(t,1) ∴∫=1 (ii) A1 Y has mgf M Y (t ) M1 1 1 1 Mgf of aY + b is E[e t ( aY +b ) ] = e bt E[e ( at )Y ] = e bt M Y (at ) (iii) Z= ∴M (iv) X −μ σ X , so X = σZ + μ ( t ) = e μ t .e (σ t ) 2 2 =e μt + μ+ σ σ t 2 2 2 For final answer e For factor e bt For factor E[e ( at )Y ] For final answer M1 1 For factor e μt 1 1 For factor e 2 For final answer M1 A1 For E[(e X ) k ] A1 For M X (k ) t2 2 8 4 (σt ) 2 4 For E (e kX ) M1 A1 2 E (W 2 ) = M X (2) = e 2 μ + 2σ For taking out factor e For use of pdf of N(t,1) For ∫ pdf = 1 2 2 W = eX E (W k ) = E[(e X ) k ] = E (e kX ) = M X (k ) ∴ E (W ) = M X (1) = e t2 2 2 2 2 2 ∴ Var (W ) = e 2 μ + 2σ − e 2 μ +σ [ = e 2 μ +σ (e σ − 1)] 81 M1 A1 A1 8 4769 Mark Scheme June 2009 Q3 (i) x = 126 ⋅ 2 s = 8·7002 s 2 = 75 ⋅ 693 y = 133 ⋅ 9 s = 10·4760 s 2 = 109 ⋅ 746 A1 H0 : μ A = μB 1 H0 : μ A ≠ μB Where μ A , μ B are the population means. A1 if all correct. [No mark for use of s n , which are 8·2537 and 9·6989 respectively.] Do not accept X = Y or similar. 1 Pooled s 2 9 × 75 ⋅ 693 + 6 × 109 ⋅ 476 681 ⋅ 24 + 658 ⋅ 48 = 15 15 = 89 ⋅ 3146 = B1 [√ = 9·4506] Test statistic is 126 ⋅ 2 − 133 ⋅ 9 89 ⋅ 3146 (ii) 1 1 + 10 7 =− 7⋅7 = − 1⋅ 653 4 ⋅ 6573 M1 A1 Refer to t15 1 No FT if wrong Double-tailed 10% point is 1·753 Not significant No evidence that population mean concentrations differ. There may be consistent differences between days (days of week, types of rubbish, ambient conditions,…) which should be allowed for. 1 1 1 No FT if wrong Assumption: Normality of population of differences. Differences are 7·4 -1·2 11·1 5·5 6·2 3·7 −0·3 1·8 3·6 [ d = 4·2, s = 3·862 ( s 2 =14·915)] Use of s n (= 3 ⋅ 641) is not acceptable, even in a n −1 ] 4⋅2 − 0 10 E1 E1 1 M1 A1 Can be awarded here if NOT awarded in part (i) denominator of s n / Test statistic is 3 ⋅ 862 / 9 M1 A1 = 3·26 1 1 1 1 Refer to t 8 Double-tailed 5% point is 2·306 Significant Seems population means differ No FT if wrong No FT if wrong 10 82 4769 (iii) Mark Scheme June 2009 B1 B1 1 1 Wilcoxon rank sum test Wilcoxon signed rank test H0: medianA = medianB H1: medianA ≠ medianB [Or more formal statements] 4 Q4 (i) Description must be in context. If no context given, mark according to scheme and then give half-marks, rounded down. Clear description of “rows”. And “columns” (ii) As extraneous factors to be taken account of in the design, with “treatments” to be compared. Need same numbers of each Clear contrast with situations for completely randomised design and randomised trends. eij ~ ind N (0, σ2) α i is population mean effect by which ith treatment differs from overall mean (iii) Source of Variation E1 E1 E1 E1 E1 E1 E1 E1 E1 1 1 1 1 1 9 Allow uncorrelated For 0 For σ2 5 SS df MS MS ratio Between Treatments 92·30 4 23·075 5·034 Residual 68·76 15 4·584 Total 161·06 19 1 1 1 1 1 1 1 Refer to F4,15 1 No FT if wrong Upper 1% point is 4·89 Significant, seems treatments are not all the same 1 1 No FT if wrong 83 10 4769 Mark Scheme June 2010 Question 1 f ( x) = (i) xe − x / λ E( X ) = = = 1 λ2 1 λ2 { ( x > 0) λ2 1 λ 2 ∞ x 2 e − x / λ dx 0 −λ x 2 e − x / λ + λ.2 xe− x / λ dx 0 0 {[0 − 0]} + 2λ.1 = 2λ . ( 1 2 E( X ) = E( X ) (ii) } ∞ ∞ M1 for integral for E(X) M1 for attempt to integrate by parts ∴ E λˆ [ = For second term: M1 for use of integral of pdf or for integr'g by parts again A1 X ) ] =λ ∴ λˆ is unbiased. E1 [7] M1 E( X 2 ) = M1 for use of E(X2) By parts M1 ( ) = 1 λ2 ∞ 0 x3e− x / λ dx −λ x e λ { 1 3 −x/λ 2 1 λ2 ∞ } ∞ + 3λ x 2 e − x / λ dx 0 0 {[0 − 0]} + 3λ E ( X ) = 6λ 2 . M1 for use of E(X) A1 for 6λ2 ∴ Var ( X ) = E ( X 2 ) − {E ( X )} = 6λ 2 − 4λ 2 = 2λ 2 . 2 λ ∴ Var λˆ = . 2n Variance of λ̂ becomes very small as n increases. ( ) A1 2 It is unbiased and so becomes increasingly concentrated at the correct value λ. (iv) A1 1 1 Var ( X ) Var λˆ = Var ( X ) = 4 4 n = (iii) M1 ( ) E λ = ( 18 + 14 + 18 ) 2λ = λ . ∴ λ is unbiased. ( ) A1 E1 E1 ( ) M1 A1 λ2 / 6 8 ∴ relative efficiency of λ to λ̂ is = . 2 9 3λ /16 M1 any comparison of variances M1 correct comparison A1 for 8/9 So λ̂ is preferred. [Note. This M1M1A1 is allowable in full as FT if everything is plausible.] E1 (FT from above) 1 [2] E λ : B1; "unbiased": E1 Var λ = ( 641 + 161 + 641 ) 2λ 2 = 163 λ 2 . ) / Var( λ̂ ), award 1 out of 2 Special case. If done as Var( λ for the second M1 and the A1 in the scheme. [7] [8] 4769 Mark Scheme June 2010 Question 2 (i) e− λ ( λt ) G (t ) = E (t ) = x! x =0 ∞ X = e − λ eλt = eλ (t −1) [A1] (ii) x λ 2t 2 + ... [A1] [M1] = e 1 + λ t + 2! −λ [Allow omission of previous A1 step and write-down of this for A2 provided opening M1 has been earned (NB answer is given)] G ' ( t ) = λ eλ ( t −1) Mean = G'(1) (iii) (iv) Z= X −μ σ : [A1] G'' ( t ) = λ 2 eλ (t −1) [M1] Variance = G''(1) + mean – mean2 ∴ variance = λ 2 + λ − λ 2 = λ G'(1) = λ [M1] [3] G''(1) = λ2 [A1] [A1] [5] mean 0 [B1] variance 1 [B1] [2] ( ) Mgf of X is M (θ ) = G eθ = e ( ) λ eθ −1 [B1] Linear transformation result is M aX +b (θ ) = ebθ M X ( aθ ) [B2 if fully correct, any equivalent form. Allow B1 if either factor correct.] Use with a = M Z (θ ) = e 1 σ − λθ e [A1] (v) Consider = θ2 2 1 = λ eθ / and b = − λ λ eθ / λ −1 = e λ eθ / [A1] λ − μ =− λ σ λ [M1] − θ −1 λ [A1] [NB answer is given] θ θ θ2 θ3 θ − 1 = λ 1 + + + + ... − − 1 3/ 2 λ λ λ 2!λ 3!λ + terms in λ −1/ 2 , λ −1 , λ −3/ 2 ,... [A1] → θ2 2 [7] [M1] as λ → ∞ [M1] [some explanation required] ∴ M Z (θ ) → eθ 2 /2 as λ → ∞ [A1] [answer given] [4] (vi) θ2 /2 e is the mgf of N(0, 1) [E1] , and the relationship between distributions and their mgfs is unique [E1] . "Unstandardising", X tends to N(μ, σ 2) i.e. N(λ, λ) [B1, parameters must be given]. 2 [3] 4769 Mark Scheme June 2010 Question 3 (i) H 0 is accepted if –1.96 < value of test statistic < 1.96 i.e. if −1.96 < ( x1 − x2 ) − ( 0 ) 2 2 1.2 1.4 + 8 10 < 1.96 i.e. if −1.96 × 0.6132 < x1 − x2 < 1.96 × 0.6132 i.e. if −1.20(18) < x1 − x2 < 1.20(18) Note. Use of μ 1 – μ 2 instead of x1 − x2 can score M1 B1 M0 M1 A0 A0. (ii) M1 double inequality B1 1.96 M1 num of test statistic M1 den of test statistic A1 Special case. Allow 1 out of 2 of the A1 marks if 1.645 used provided all 3 M marks have been earned. [6] B1 M1 FT if wrong [FT can's acceptance region if reasonable] E1 so H 0 is rejected. CI for μ 1 – μ 2 : r A1 x1 − x2 = 1.4 which is outside the acceptance region r 1.4 ± (2.576 × 0.6132), i.e. 1.4 ± 1.5796, i.e. ( –0.18 [–0.1796] , 2.97[96] ) M1 for 1.4 B1 for 2.576 M1 for 0.6132 A1 cao for interval [7] (iii) Wilcoxon rank sum test (or Mann-Whitney form of test) Ranks are: First Second 14 13 10 8 6 11 2 12 3 1 4 7 5 9 M1 M1 A1 W = 14 + 13 + 10 + 8 + 6 + 11 = 62 [or 8 + 8 + 7 + 7 + 6 + 5 = 41 if M-W used] Refer to W 6,8 [or MW 6,8 ] tables. B1 M1 Lower 2½% critical point is 29 [or 8 if M-W used]. Combined ranking Correct [allow up to 2 errors; FT provided M1 earned] No FT if wrong A1 Special case 1. If M1 for W 6,8 has not been awarded (likely to be because rank sum 43 has been used, which should be referred to W 8.6 ), the next two M1 marks can be earned but nothing beyond them. Consideration of upper 2½% point is also needed. Eg: by using symmetry about mean of ( 1 2 × 6 × 8 ) + ( 12 × 6 × 7 ) = 45, critical point is 61. [For M-W: mean is 12 × 6 × 8 = 24, hence critical point is 40.] Result is significant. Seems (population) medians may not be assumed equal. M1 M1 for any correct method A1 if 61 correct E1, E1 Special case 2 (does not apply if Special Case 1 has been invoked). These 2 marks may be earned even if only 1 or 2 of the preceding 3 have been earned. [11] 3 4769 Mark Scheme June 2010 Question 4 (i) Randomised blocks B1 Eg:WEST D A C B C B A D D C A B EAST Plots in strips (blocks) correctly aligned w.r.t. fertility trend. Each letter occurs at least once in each block in a random arrangement. M1 E1 M1 E1 [5] (ii) μ = population [B1] grand mean for whole experiment [B1] α i = population [B1] mean amount by which the ith treatment differs from μ [B1] e ij ~ ind N [B1, accept "uncorrelated"] (0 [B1] , σ 2 [B1] ) (ii) Totals are 62.7 65.6 69.0 67.8 Grand total 265.1 4 marks, as shown 3 marks, as shown [7] all from samples of size 5 "Correction factor" CF = 265.12/20 = 3513.9005 M1 for attempt to Total SS = 3524.31 – CF = 10.4095 2 2 2 2 form three sums of squares. 62.7 65.6 69.0 67.8 Between varieties SS = – CF + + + 5 5 5 5 M1 for correct = 3518.498 – CF = 4.5975 method for any two. Residual SS (by subtraction) = 10.4095 – 4.5975 = 5.8120 Source of variation Between varieties Residual Total SS df 4.5975 3 [B1] 5.8120 16 [B1] 10.4095 19 MS [M1] 1.5325 0.36325 MS ratio [M1] 4.22 [A1 cao] A1 if each calculated SS is correct. 5 marks within the table, as shown Refer MS ratio to F 3,16 . M1 No FT if wrong Upper 5% point is 3.24. Significant. Seems the mean yields of the varieties are not all the same. A1 E1 E1 No FT if wrong [12] 4 4769 Mark Scheme June 2011 4769 June 2011 Qu 1 f ( x) = (i) L= 2 1 e − x / 2θ 2πθ [N(0, θ )] 2 2 2 1 1 1 e − x1 / 2θ . e − x2 / 2θ ..... e − xn / 2θ 2πθ 2πθ 2πθ = ( 2πθ ) n 1 ln L = − ln ( 2πθ ) − 2 2θ −n / 2 e x −Σxi 2 / 2θ d ln L = 0 dθ n 1 = 2 ˆ 2θ 2θˆ x n M1 A1 x M1 A1 2 i 2 A1 i Check this is a maximum. Eg: M1 d 2 ln L n 1 1 = . 2 − 3 xi 2 2 dθ 2 θ θ A1 which, for θ = θˆ , is for ln L fully correct M1 for differentiating A1, A1 for each term 2 i gives 1 i.e. θˆ = 2 x product form fully correct Note. This A1 mark and the next five A1 marks depend on all preceding M marks having been earned. i d ln L n 1 1 = − . + 2 dθ 2 θ 2θ M1 A1 n n n − 2 = − 2 < 0. 2 ˆ ˆ 2θ 2θˆ θ A1 for expression involving θˆ A1 for showing < 0 [14] (ii) First consider E(X 2) = Var(X) + {E(X)}2 = θ + 0 M1 A1 1 ∴ E θˆ = (θ + θ + ... + θ ) = θ n A1 i.e. θˆ is unbiased. A1 () [4] (iii) () Here θˆ = 10 and Est Var θˆ = 2 × 102/100 = 2 Approximate confidence interval is given by 10 ± 1.96 2 = 10 ± 2.77 , i.e. it is (7.23, 12.77). 1 B1, B1 M1 centred at 10 B1 1.96 M1 Use of √2 A1 c.a.o. Final interval [6] 4769 Mark Scheme June 2011 4769 June 2011 Qu 2 (i) f ( x ) = 12 e − x / 2 n=2 M (θ ) = E ( eθ X ) = − x 1 −θ 1 e (2 ) = 1 2 − ( 2 − θ ) ∞ 1 0 2 e − x ( 12 −θ ) ∞ [A1] = 0 dx 1 2 1 2 −θ A1 [A1] = (1 − 2θ ) −1 [A1] Any equivalent form A1, A1, A1 for each expression, as shown, beware printed answer f ( x ) = 14 xe − x / 2 n=4 M (θ ) = ∞ xe 1 0 4 − x ( 12 −θ ) M1 for attempt to integrate this by parts dx ∞ − x ( 1 −θ ) − x ( 12 −θ ) ∞ e 1 xe 2 = 1 d x [A1] [A1] − 1 0 − 4 − ( 2 − θ ) ( 2 −θ ) 0 = 1 1 −1 .2 (1 − 2θ ) [A1] [ 0 − 0] [A1] + 1 4 2 −θ = 1 2 A1, A1 for each component, as shown A1, A1 for each component, as shown 1 −1 −2 1 2 1 2 − θ = − θ ( ) ( ) 1 2 (1 − 2θ ) A1 for final answer, beware printed answer [10] (ii) Mean = M'(0) M ' (θ ) = −2 ( − n2 ) (1 − 2θ ) − n2 −1 = n (1 − 2θ ) − n2 −1 ∴ mean = n M1 A1 A1 Variance = M''(0) – {M'(0)}2 M '' (θ ) = n ( − n2 − 1) ( −2 )(1 − 2θ ) − n2 − 2 = n ( n + 2 )(1 − 2θ ) − n2 − 2 M1 A1 ∴ M''(0) = n(n + 2) A1 ∴ variance = n(n + 2) – n2 = 2n A1 [Note. This part of the question may also be done by expanding the mgf.] Solution continued on next page 2 [7] 4769 Mark Scheme June 2011 4769 June 2011 Qu 2 continued (iii) By convolution theorem, { M W (θ ) = (1 − 2θ ) M1 − 12 } k = (1 − 2θ ) −k / 2 . B1 This is the mgf of χ 2k , M1 so (by uniqueness of mgfs) W ~ χ 2k . B1 [4] (iv) W ~ χ 2100 has mean 100, variance 200. Can regard W as the sum of a large "random sample" of χ 21 variates. 118.5 − 100 ∴ P ( χ 2100 < 118.5 ) ≈ P N(0,1) < = 1.308 200 = 0.9045. M1 for use of N(0,1) A1 c.a.o. for 1.308 A1 c.a.o. [3] 3 4769 Mark Scheme June 2011 4769 June 2011 Qu 3 8 separate B1 marks for components of answer, as shown (i) Type I error: rejecting null hypothesis [B1] when it is true [B1] Allow B1 out of 2 for P(...) Type II error: accepting null hypothesis [B1] when it is false [B1] Allow B1 out of 2 for P(...) OC: P(accepting null hypothesis [B1] as a function of the parameter under investigation [B1]) P(Type II error | the true value of the parameter) scores B1+B1 Power: P(rejecting null hypothesis [B1] as a function of the parameter under investigation [B1]) P(Type I error | the true value of the parameter) scores B1+B1. "1 – OC" as definition scores zero. [8] (ii) X ~ N(μ, 25) H 0 : μ = 94 ( H 1 : μ > 94 ( ) We require 0.02 = P reject H 0 μ = 94 = P X > c μ = 94 c − 94 = P ( N ( 94, 25 / n ) > c ) = P N ( 0,1) > 5/ n ∴ c − 94 = 2.054 5/ n ( M1 for first expression M1 for standardising ) c − 97 = P ( N ( 97, 25 / n ) > c ) = P N ( 0,1) > 5/ n c − 97 = − 1.645 5/ n ∴ we have c = 94 + M1 B1 for 2.054 We also require 0.95 = P reject H 0 μ = 97 ∴ ) M1 for first expression M1 for standardising B1 for –1.645 10.27 8.225 and c = 97 − n n Attempt to solve; c = 95.666 [allow 95.7 or awrt] √n = 6.165, n = 38.01 Take n as "next integer up" from candidate's value M1 two equations A1 both correct (FT any previous errors) M1 A1 c.a.o. A1 c.a.o. A1 [13] (iii) Power function: step function from 0 with step marked at 94 to height marked as 1 G1 G1 G1 Zero out of 3 if step is wrong way round. [3] 4 4769 Mark Scheme June 2011 4769 June 2011 Qu 4 (a) Each E2 in this part is available as E2, E1, E0. (i) Description of situation where randomised blocks would be suitable, ie one extraneous factor (eg stream down one side of a field). E2 Explanation of why RB is suitable (the design allows the extraneous factor to be "taken out "separately). E2 Explanation of why LS is not appropriate (eg: there is only one extraneous factor; LS would be unnecessarily complicated; not enough degrees of freedom would remain for a sensible estimate of experimental error). E2 Description of situation where Latin square would be suitable, ie two extraneous factors (and all with same number of levels) (eg streams down two sides of a field). E2 Explanation of why LS is suitable (the design allows the extraneous factors to be "taken out "separately). E2 Explanation of why RB is not appropriate (RB cannot cope with two extraneous factors). E2 (ii) [12] (b) Totals are 56.5 57.4 60.6 82.3 from samples of sizes 4 3 5 4 Grand total 256.8 "Correction factor" CF = 256.82/16 = 4121.64 M1 for attempt to Total SS = 4471.92 – CF = 350.28 2 Between treatments SS = 2 2 2 56.5 57.4 60.6 82.3 – CF + + + 4 3 5 4 = 4324.1103 – CF = 202.47 Residual SS (by subtraction) = 350.28 – 202.47 = 147.81 Source of variation Between treatments Residual Total SS 202.47 147.81 350.28 df 3 [B1] 12 [B1] 15 MS [M1] 67.49 12.3175 MS ratio [M1] 5.47(92) [A1 cao] Refer MS ratio to F 3,12 . Upper 5% point is 3.49. Significant. Seems the effects of the treatments are not all the same. form three sums of squares. M1 for correct method for any two. A1 if each calculated SS is correct. 5 marks within the table, as shown M1 A1 E1 E1 No FT if wrong No FT if wrong [12] 5 4769 Question (i) 1 Mark Scheme Marks M1 use of cdfs Answer P X x FB ( x). FG ( x). 1 2 ie cdf of X is F x 1 2 FB ( x) FG ( x) ie (by differentiating) pdf of X is f x 12 f B ( x) f G ( x) 1 1 (ii) (iii) E X 1 2 2 B G 1 2 x f ( x)dx 2 2 B 2 B2 G Answer given [3] M1 [answer given; needs some indication of method] M1 ( x)dx M1 2 G 2 A1 [1] M1 x f ( x)dx x f B G 1 2 2 Use of "E X 2 2 2 " 1 2 Var( X ) E( X 2 ) E( X ) A1 M1 2 2 12 B 2 12 G 2 14 B 2 14 G 2 12 B G = 2 14 B G A1 A1 2 Answer given [7] 1 (iv) [Central Limit Theorem] Approx dist of X is N 1 2 B 12 G , 1 2n B1 B1 B1 1 (v) 2 14 B G 2 B1 X X Var X E X 2 and Var X 2n X st 1 2 B st G 1 2 either B st B4 [4] M1M1 2 n B1 G 1 4 2 n Guidance A1 1 2 xf ( x)dx xf ( x)dx E X 1 2 June 2012 B1 2 n [4] 5 4 marks as shown 4769 Mark Scheme Question (vi) 1 Marks E1 Answer 1 E ( X ) E ( X st ) ( B G ) E ( X ) 2 ie they are unbiased. June 2012 E1 M1 Clearly Var X Var X st , X st is the more efficient. Guidance for any attempt to compare variances Candidates are not required to note that the variances are equal in the case μB μG . for deduction that Var X Var X st [FT c's variances] E1 More efficient [5] 2 (i) Mean of X = 3.5 (immediate by symmetry) E X 2 16 1 4 ... 36 91 6 B1 M1 A1 2 91 7 35 Var X 6 2 12 Answer given [3] 2 (ii) G t E t X t1. 16 t 2 . 16 ... t 6 . 16 1 6 t t 2 ... t 6 M1 t 1 t 6 A1 6 1 t Answer given [2] 2 (iii) [P(N = 0) = ½, P(N = 1) = (½)(½),] P(N = r) = (½)r.(½) B1 [1] 2 (iv) H t E t N t 0 . 12 t1. 14 t 2 . 81 ... M1 A1 1 1 2 t t 1 2 2 t 1 2 6 M1 answer given; must be convincing Answer given 4769 Question (v) 2 Mark Scheme Marks M1 for use of 1st derivative A1 Answer Mean = H'(1), variance = H''(1) + mean – mean2. H ' t 1 2 t 1 2 t 3 3 H '' t 2 2 t 1 2 2 t 2 2 mean = 1 variance = 2 + 1 – 1 = 2 2 (vi) K(t) = H{G(t)} = {2 – G(t)}–1 12 1 t t 1 t 1 t t 2 ... t 5 t 1 t 6 2 6 1 t 6 1 t 1 June 2012 1 1 12 t t 2 t 3 ... t 6 2 6 1 6 12 t t ... t 6 M1 for use of 2nd derivative A1 [4] M1 M1 M1 For variance A1 Answer given Guidance inserting G(t) use of hint given [4] 2 (vii) K ' t 6 12 t t 2 ... t 1 2t 3t 6 2 4t 3 5t 4 6t 5 M1 reasonable attempt to differentiate K(t) M1 reasonable attempt at 2nd derivative M1 for use of derivatives mean = K '(1) 6(12 – 6)–2(21) = 21/6 = 7/2 A1 Substitution shown K''(1) = (12 6–3 212) + (6 6–2 70) = (49/2) + (70/6) variance = A1 49 70 7 49 294 140 42 147 329 2 6 2 4 12 12 A1 K'' t 12 12 t t 2 ... t 6 6 12 t t 2 ... t 6 2 3 2 1 2t 3t 2 6t 12t 2 2 4t 3 5t 4 6t 5 20t 3 30t 4 2 [6] 7 Ft c’s K’(1) and/or K’’(1) provided variance positive Exact. 4769 Mark Scheme Question 2 (viii) June 2012 Marks Answer Guidance We have: X 7 / 2 M1 for correct use of candidate's values for means and variances A1 answer honestly obtained (common denominator shown). A0 if different from (vii) X 2 35 /12 N 1 N2 2 Q 2 329 /12 Inserting in the quoted formula gives 7 2 35 294 35 329 2 1 12 12 2 12 3 (i) as required. H0: population medians are equal [2] B1 H1: population median for A < population median for B B1 Wilcoxon rank sum test (or Mann-Whitney form of test) Ranks are: A 1 2 4 5 9 11 B 3 6 7 8 10 12 13 14 M1 A1 W = 1 + 2 + 4 + 5 + 9 + 11 = 32 [or 0 + 0 + 1 + 1 + 4 + 5 = 11 if M-W used] Refer to W6,8 [or MW6,8] tables Lower 5% critical point is 31 [or 10 if M-W used] Result is not significant [Note: "population" must be explicit] 1) Explicit statement re shapes of distributions. (eg that they are the same shape) is not required. 2) More formal statements of hypotheses gain both marks [eg cdfs are F(x) and F(x – Δ), H0 is Δ = 0 etc).] Combined ranking Correct [allow up to 2 errors; FT provided M1 earned] B1 M1 A1 A1 Seems median yields may be assumed equal A1 [9] 8 No FT if wrong No FT if wrong 4769 Mark Scheme Question (ii) 3 June 2012 Marks Guidance B1 B1 "population" must be explicit, either in words or by notation B1 For all. Use of sn scores B0 Answer H0: population means are equal H1: population mean for A < population mean for B For A: x 11.4, sn 12 1.912 [ sn 1 1.38275] y 12.575, sn 12 1.051[ sn 1 1.025] (5 1.912) (7 1.051) 16.915 Pooled s2 = 1.4096 12 12 For B: M1 for any reasonable attempt at pooling (but not if sn2 used) A1 If correct M1 A1 Ft if incorrect Refer to t12 Lower single-tailed 5% critical point is –1.782 M1 A1 No FT if wrong No FT if wrong must compare –1.83 with –1.782 unless it is clear and explicit that absolute values are being used Significant Seems mean yield for A is less than that for B A1 A1 [11] E1 E1 E1 E1 [4] B1 Test statistic = 11.4 12.575 1.175 1.83(25) 0.6412 1.4096 16 18 3 (iii) t test is "more sensitive" if Normality is correct. Non-rejection of Normality supports t. But Wilcoxon is more reliable if not Normal – and we do not have proof of Normality. 4 (i) Latin square 4 4 layout, with rows clearly representing casting techniques and columns clearly representing chemical compositions [or vice versa] Labels each appearing exactly once in each row and in each column representing manufacturers. 9 B2,1,0 B2 if completely correct, B1 if one point omitted B1 for correct structure B1 [5] within the square 4769 Question (ii) 4 Mark Scheme Answer Totals are 440.6 453.5 458.2 459.4 all from samples of size 4 Grand total 1811.7 "Correction factor" CF = 1811.72/16 = 205141.06 Total SS = 205202.57 – CF = 61.5144 Between manufacturers SS = June 2012 Marks Guidance M1 for attempt to form three sums of squares. = 205196.55 – CF = 55.4969 Residual SS (by subtraction) = 61.5144 – 55.4969 = 6.0175 M1 A1 for correct method for any two if each calculated SS is correct. Source of variation SS df Between treatments 55.4969 3 Residual 6.0175 12 Total 61.5144 15 B1 B1 M1 M1 A1 Between treatments df Residual df Method for calculating either mean square MS ratio cao at least 3sf, condone up to 6 sf M1 A1 No FT if wrong 2 2 2 2 440.6 453.5 458.2 459.4 – CF 4 4 4 4 MS 18.4989 0.5014(6) MS ratio 36.89 Refer MS ratio to F3,12. quotation of an extreme point (not just the 5%point); eg 1% point, 5.95, or 0.1% point, 10.80. Must be some reference to very highly significant and very strong/overwhelming evidence that "the manufacturers are not all the same”. 4 (iii) A1 A1 [12] B2 yij i eij = population … …grand mean for whole exp't i = population mean amount by which the ith treatment differs from B1 B1 B1 B1 if any two RHS terms correct “Population”; award here or for i eij is experimental error – does not need to be stated explicitly here, is subsumed in error assumptions below. eij ~ ind N [*] ( accept "uncorrelated") (0 [*] , 2 [*] ) B2 [7] 10 if all three * components correct, B1 if any two correct. 4769 Mark Scheme Question 1 Answer (i) P ( X= x= ) L= Marks −θ e θ x! e − θ θ x1 e − θ θ xn e − nθ θ Σxi = . ... . x1 ! xn ! x1 ! x2 !... xn ! M1 A1 M1 for general product form. A1 (a.e.f.) for answer. ln L = – nθ + Σx i ln θ – Σln x i ! M1 A1 M1 is for taking logs (base e). Allow (±) constant instead of last term. Σx d ln L =− n + i dθ θ = 0 for ML Est θ̂ . M1 A1 M1 for differentiating, A1 for answer. M1 A1 [10] M1 P ( X= 0= ) e . ˆ (iii) M1 −θ ML Est of e − θ = e − θ , 1 Any or all of the four M1 marks down to here can be awarded in part (iv) if not awarded here. A1 Confirmation that this is a maximum: Σx d 2 ln L = − 2i < 0 . 2 θ dθ (ii) Guidance x i.e. θˆ x . = ∴ nθˆ Σxi , = 1 June 2013 M1 A1 i.e. the estimate is e − x . We have estimate of P( X= 0)= e = 0.0067 , [3] M1 so we might reasonably expect around 1000e −5 ≈ 6.7 cases of zero in a sample of size 1000 E1 – finding no such cases seems suspicious. E1 −5 [3] 6 M1 for "invariance property", not necessarily named. = = and x 5. Sensible use of n 1000 4769 Mark Scheme Question 1 (iv) Answer Marks X has Poisson distribution "scaled up" so that , therefore , ∞ ∞ eθ− θ x eθ− θ x =k ∑ − e − θ =k {1 − e − θ } . where 1 =k ∑ x! = x 1= x 0 x! 1 ∴ k = −θ . 1− e (e [for x = 1, 2, ... ]. − 1) x1 ! x2 !... xn ! ( 1− )Σ−ln x ! θ M1 M1 for sum from 0 to ∞ minus value for x = 0. Beware printed answer. A1 i neθ d ln L Σxi , = − θ θ e −1 dθ and on setting this equal to zero we get that θ̂ satisfies Σxi θeθ = = x. θ e −1 n ∴ (i) M1 for this idea, however expressed. A1 n ∴ ln = LΣ lnxθ n−ln e i 2 M1 A1 θ Σxi θ Guidance M1 1 e−θ θ x θx ∴P( X = =θ x ) = −θ . 1− e x! ( e − 1) x! ∴L= June 2013 A1 A1 Beware printed answer. [8] B1 B1 B1 (A) μ = 0. (B) E(X2) = 8/3. (C) Var(X) = 8/3. [3] 2 (ii) Mgf of X is M X (θ) = E(eθX) 1 1 1 = e −2θ . + e0 . + e 2θ . 3 3 3 1 = (1 + e 2θ + e −2θ ) . 3 M1 A1 [2] 7 Any equivalent form. 4769 2 Mark Scheme Question Answer Marks (iii) General results: [Convolution theorem] Mgf of sum of independent random variables = product of their mgfs. [Linear transformation result] M aX + b ( θ ) = ebθ M X ( aθ ) . B1 M1 M1 1 M ΣX ( θ ) = (1 + e 2θ + e −2θ ) 3 Guidance B1 for explicit mention of "independent". Note M1 mark required for f.t. to A marks below. Allow implicit b = 0. Note M1 mark required for f.t. to A marks below. n θ θ 1 2 −2 M X ( θ )= 1 + e n + e n 3 Z= June 2013 A1 n A1 X μn n nX 3 − = σ σ 2 2 B1 2 3n 2 3n 1 − θ θ M Z ( θ ) = 1 + e n 2 2 + e n 2 2 3 Might be implicit in what follows. n A1 n θ 3 θ 3 1 − = 1 + e 2 n + e 2 n . 3 A1 [8] 8 Beware printed answer. 4769 Mark Scheme Question 2 (iv) Answer Marks 1 (1 3 θ 3 3θ 2 +1+ + + terms in n −3/ 2 , n −2 , ... 2n 2n.2! MZ (θ ) = { θ 3 3θ 2 + + terms in n −3/ 2 , n −2 , ...) }n 2 .2! n 2n cancel neglect +1− 1 3θ 2 ≈ 3+ 2n 3 June 2013 Guidance M1 M1 for reasonable attempt to expand exponentials. A1 A1 for this line. A1 A1 for this line. M1 M1 M1 for cancelling first order terms, may be implicit. M1 for neglecting higher order terms in n −1 , MUST be explicit. n A1 n θ2 = 1+ . 2n A1 Beware printed answer. [7] 2 (v) Limit is e θ2 / 2 . This is mgf of N(0, 1). M1 M1 B1 A1 [4] B1 Mgfs are unique (even in this limiting process). So (approximately) distribution of Z is N(0, 1). 3 (i) Type I error: Type II error: OC: Power: rejecting null hypothesis when it is true B1 accepting null hypothesis B1 when it is false B1 P(accepting null hypothesis B1 as a function of the parameter under investigation) B1 P(rejecting null hypothesis B1 as a function of the parameter under investigation) B1 [8] 9 M1 for limit, M1 for recognising mgf. Bracketed phrase is not needed to earn the mark. Bracketed word is not needed to earn the mark. Allow B1 out of 2 for P(...). Allow B1 out of 2 for P(...). P(Type II error | the true value of the parameter) scores B1+B1. P(Type I error | the true value of the parameter) scores B1+B1. 4769 Mark Scheme Question Answer Marks 3 (ii) Correct "spike" shape, or point shown. Location of spike or point at θ 0 correct and correctly labelled. Height of spike correct and correctly labelled as 1. 3 (iii) X ~ N(μ, 9). n = 25. H 0 : μ = 0. H 1 : μ ≠ 0. Accept H 0 if –1 < x < 1. G1 G1 G1 [3] P(Type I error) = P X > 1 9 X ~ N 0, 25 M1 M1 1− 0 =P N(0,1) > = 2 × 0.0478 = 0.0956. 3/ 5 P(Type II error when μ = 0.5) 9 = P −1 < X < 1 X ~ N 0.5, 25 0.5 −1.5 < N(0,1) < P ( 2.5 < N(0,1) < 0.8333) = P =− 3/ 5 3/ 5 = 0.7976 – 0.0062 = 0.7914. 3 (iv) Either or both marks might be implicit in what follows. M1 M1 As for the two M1 marks for the Type I error. M1 Standardising with correct end-points. [9] G1 G1 G1 G1 [4] 10 M1 for use of X > 1 , M1 for distribution of X . Accept a.w.r.t. 0.096. E1 E1 Correct shape – must be symmetrical, and reasonable approximation to Normal pdf. "Centre" at 0 and clearly labelled. Height at 0 distinctly less than 1 by about 10% (ft candidate's P(Type I error)). Some indication that height at 0.5 is about 0.8 (ft candidate's P(Type II error)). Guidance A1 A1 This is high. We are trying to detect only a small departure from H 0 , the "error" (σ2) being comparatively large. June 2013 4769 4 Mark Scheme June 2013 Question Answer Marks Guidance (i) Randomisation is mainly to guard against possible sources of bias, which may be due to subjective allocation of treatments to units or unsuspected. B1 B1 B1 Award up to B3 marks. These include B1 for idea of "sources of bias" and another B1 for either idea of " unsuspected" or “subjective”. Replication enables an estimate of experimental error to be made. B1 B1 B1 Award up to B3 marks. These include B1 for some concept of replication addressing experimental error and another B1 for the explicit idea that it enables this error to be estimated. The third B1 should be awarded for the general quality of the response, especially the lack of extraneous material in it Alternative suggestions offered by candidates may be rewarded provided they are statistically sound. Suggestions that are limited to particular designs (eg for the advantages of randomisation within a randomised blocks design) may be rewarded if statistically sound, but for at most 2 of the B3 marks in each set. 4 (ii) μ is the population mean for the entire experiment. [6] B1 B1 α i is the population amount by which the mean for the ith treatment differs from μ. B1 B1 e ij ~ ind N (0, σ2) B1 B1 B1 [7] 11 B1 for explicit mention of "population", B1 for idea of mean for whole experiment. B1 for explicit mention of "population", B1 for idea of difference of means. For "ind N"; allow "uncorrelated". For mean 0. For variance σ2 [i.e. that the variance is constant]. 4769 Mark Scheme Question 4 (iii) Answer Marks Guidance B1 B1 No need for definition of k as the number of treatments, and accept simply α 1 = α 2 =... with no explicit upper end to the sequence. Accept hypotheses stated verbally provided it is clear that population parameters (means) are being referred to. B1 for each d.f. (4 and 15). Null hypothesis: α 1 = α 2 = ... = α k ( = 0 ) Alternative hypothesis: not all α i are equal Note that alternative hypothesis is NOT that all the α i are different – B0 for alternative hypothesis if this is stated. B1 Source of variation Between fertilisers Residual Total Sum of squares 219.2 d.f. 304.5 523.7 15 19 4 Mean squares 54.8 June 2013 B1 MS ratio M1 For method of mean squares. 2.7 M1 For method of mean square ratio. A1 A1 c.a.o. for 2.7 (2.6695). 20.3 f.t. from here provided all M marks earned. Refer 2.7 to F 4,15 . M1 Upper 5% point is 3.06. Not significant. Seems mean effects of fertilisers are all the same. A1 A1 E1 [11] 12 No f.t. if wrong but allow M1 for F with candidate’s df if both positive and totalling 19. cao. No f.t. if wrong (or if not quoted). Verbal conclusion in context, and not "too assertive".