MS/234/5/SS99

UNIVERSITY OF SURREY©

B.Sc. Honours Courses in Mathematical Studies

Level 2 Examination

Module MS234 Statistical Methods

Time allowed - 2 hours
Spring Semester 1999

Attempt THREE Questions

If any candidate attempts more than THREE questions, only the best THREE solutions will be taken into account.

1.
(a) Explain briefly what a Latin Square experimental design accomplishes, and how to select one at random.

(b) In the table below, A, B and C denote three applied treatments, and the numbers are the observed values of the response variable, all set out in a Latin square design. Compute the appropriate sums of squares, set them out in an Anova table, and interpret the results.

    A  3    B  9    C 18
    B 15    C 15    A  9
    C 21    A  9    B 15

2.
(a) A one-way analysis of covariance model is written as

    $y_{ij} = \mu + \alpha_i + \beta_i x_{ij} + e_{ij}$   $(i = 1, \ldots, t;\ j = 1, \ldots, r)$.

State the usual assumptions for the error terms $e_{ij}$. Show, without going into detailed derivation, how an unbiased estimator of the error variance $\sigma^2$ can be obtained using the 'fitted values'.

(b) What constraint on the parameters determines parallel regressions in the model defined in (a)? What constraint determines coincident regressions? How many independent parameters, apart from $\sigma^2$, are required to define the full, unconstrained model in (a), and how many in the two constrained versions, parallel and coincident?

(c) The output y of a photo-electric cell was measured. There were three types of cell and a covariate x, representing the age of the cell. Data analysis on a sample of size 15 produced the following residual sums of squares, or deviances. Those with x have been obtained under a 'parallel-regressions' model. Draw up an Anova table that will allow testing of the effects of type and x, both individually and after fitting the other. Interpret the results.

    Effects fitted    Deviance
    mean              51.07
    type               8.72
    x                 42.11
    type + x           6.04

3.
(a) A thin protective coating is applied to a metal plate. Plate specimens are then stressed and their coating is observed to crack or not. Let $p_x$ denote the probability of cracking at thickness x. Write down the form of a suitable linear logit model for $p_x$. Explain the idea of an underlying tolerance distribution in this context, and its median, the ED50.

(b) Suppose that one has data in which $r_i$ specimens are observed to crack out of $n_i$ tested at thickness $x_i$, for $i = 1, \ldots, m$. (An example with m=4 is given in (c) below.) Write down the log-likelihood function $l_n(\theta)$ in terms of the corresponding cracking probabilities $p_i$ $(i = 1, \ldots, m)$, where $\theta$ denotes the parameters of the model. Show that the associated score function can be written in the form

    $U_n(\theta) = \sum_{i=1}^{m} \frac{r_i - n_i p_i}{p_i (1 - p_i)} \, p_i'$,

where $p_i'$ denotes $dp_i/d\theta$. Explain, without algebraic derivations, how these functions, together with the information function, can be used to obtain estimates of the parameters and their standard errors.

(c) In the following table the data are of the form described in (b) with m=4. The model was fitted by maximum likelihood and the resulting estimates for the logit-model parameters, intercept $\alpha$ and slope $\beta$, were 1.67 and -7.52. The corresponding estimate of their covariance matrix had elements $c_{\alpha\alpha} = 0.45$, $c_{\alpha\beta} = -1.33$ and $c_{\beta\beta} = 4.32$. Derive an approximate 95% confidence interval for $\beta$ based on these results, and a similar one for $\alpha + 0.4\beta$, the log-odds ratio of $p_x$ at thickness x=0.4.

    Thickness $x_i$                    0.1   0.2   0.3   0.5
    No. specimens cracked $r_i$          7     8     4     2
    No. specimens tested $n_i$          10    15    10    20

4.
(a) Write down the general form of the density of an exponential family model, identifying the natural and nuisance parameters.

(b) A random sample $(y_1, \ldots, y_n)$ is obtained from the discrete geometric distribution which has probability function $f(y) = p^{y-1} (1 - p)$ $(y = 1, 2, 3, \ldots)$, where $0 < p < 1$.
Write down the log-likelihood function, $l_n(p)$, and show that the score function can be expressed as

    $U_n(p) = n p^{-1} (\bar{y} - 1) - n (1 - p)^{-1}$.

Also, derive the sample information, $V_n(p)$. Given that $E(y) = (1 - p)^{-1}$ and $\mathrm{var}(y) = p (1 - p)^{-2}$ for this distribution, calculate $E(U_n)$, $E(V_n)$ and $\mathrm{var}(U_n)$. How do these results accord with general theory? Derive the maximum likelihood estimator of p and its approximate variance.

5.
(a) Samples of sand are collected at many points on an island. According to a certain theory, types 1, 2 and 3 of the sand should occur in the ratios 1 : 2 : 5. The observed frequencies were 35 samples of type 1, 86 of type 2, and 149 of type 3. Perform a chi-square test of the theory.

(b) A random sample of hospital patients is recruited and a dietary change is introduced. The patients are classified into three groups and an assessment is made of improvement or otherwise in a certain condition. Use the following table of observed frequencies to perform a chi-square test for equality of underlying improvement proportions.

    group 1 group 2 group 3 not improved improved 32 48 44 62 24 40

(c) Perform a two-sample (Wilcoxon) test for equality of medians for the following data of breakage strengths on two materials.

    material 1: 3.1 2.5 2.8 1.7 2.3 1.9 2.9 2.7 2.3 2.2 1.9 2.4 2.0
    material 2: 2.8 3.7 3.4 3.8 3.0 2.9 3.2 3.4 2.8 3.5 3.0

You may assume the formulae $\mu_j = n_j (n_1 + n_2 + 1) / 2$, $\sigma_j^2 = n_1 n_2 (n_1 + n_2 + 1) / 12$ for the rank sums.

INTERNAL EXAMINER: PROFESSOR M J CROWDER
EXTERNAL EXAMINER: PROFESSOR B JONES
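
Note on Question 5(a): the arithmetic of the chi-square goodness-of-fit test can be checked with a short script. The following is a minimal Python sketch and is not part of the original paper; the variable names are illustrative assumptions, and it uses the fact that for 2 degrees of freedom the chi-square upper-tail probability equals exp(-x/2).

```python
import math

# Observed sand-type counts from Question 5(a).
observed = [35, 86, 149]
# Hypothesised ratios 1 : 2 : 5 from the question.
ratios = [1, 2, 5]

total = sum(observed)
# Expected counts under the theory: total split in proportion to the ratios.
expected = [total * r / sum(ratios) for r in ratios]

# Pearson chi-square statistic: sum over categories of (O - E)^2 / E.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Degrees of freedom: categories minus one (no parameters are estimated).
df = len(observed) - 1

# Upper-tail probability; exp(-x/2) is exact only because df == 2 here.
p_value = math.exp(-chi_sq / 2)

print("expected counts:", expected)
print("chi-square statistic:", round(chi_sq, 3), "on", df, "df")
print("p-value:", round(p_value, 4))
```

The statistic is then compared with the upper 5% point of the chi-square distribution on 2 degrees of freedom (5.99) in the usual way.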