SMEX99A - University of Surrey

MS/234/5/SS99
UNIVERSITY OF SURREY©
B. Sc. Honours Courses in Mathematical Studies
Level 2 Examination
Module MS234 Statistical Methods
Time allowed - 2 hours
Spring Semester 1999
Attempt THREE Questions
If any candidate attempts more than THREE questions, only the best THREE
solutions will be taken into account
1.
(a) Explain briefly what a Latin Square experimental design accomplishes, and
how to select one at random.
(b) In the table below A, B and C denote three applied treatments, and the
numbers are the observed values of the response variable, all set out in a Latin
square design. Compute the appropriate sums of squares, set them out in an
Anova table, and interpret the results.
A 3 B 9 C 18
B 15 C 15 A 9
C 21 A 9 B 15
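The arithmetic behind the sums of squares in (b) can be sketched as follows. This is only an illustrative calculation, not a model answer: it assumes the usual additive Latin-square decomposition (rows, columns, treatments, residual), and the names SQUARE and latin_square_anova are introduced here for illustration.

```python
# Latin square from Question 1(b): one (treatment, response) pair per cell.
SQUARE = [
    [("A", 3), ("B", 9), ("C", 18)],
    [("B", 15), ("C", 15), ("A", 9)],
    [("C", 21), ("A", 9), ("B", 15)],
]


def latin_square_anova(square):
    """Return (sums of squares, degrees of freedom) for a t x t Latin square."""
    t = len(square)
    values = [y for row in square for _, y in row]
    correction = sum(values) ** 2 / (t * t)  # (grand total)^2 / N

    def ss_from_totals(totals):
        # Every row, column and treatment total is a sum of t observations.
        return sum(total ** 2 for total in totals) / t - correction

    row_totals = [sum(y for _, y in row) for row in square]
    col_totals = [sum(row[j][1] for row in square) for j in range(t)]
    trt_totals = {}
    for row in square:
        for label, y in row:
            trt_totals[label] = trt_totals.get(label, 0) + y

    ss = {
        "rows": ss_from_totals(row_totals),
        "columns": ss_from_totals(col_totals),
        "treatments": ss_from_totals(trt_totals.values()),
        "total": sum(y ** 2 for y in values) - correction,
    }
    # The residual is what remains after removing the three systematic sources.
    ss["residual"] = ss["total"] - ss["rows"] - ss["columns"] - ss["treatments"]
    df = {
        "rows": t - 1,
        "columns": t - 1,
        "treatments": t - 1,
        "residual": (t - 1) * (t - 2),
        "total": t * t - 1,
    }
    return ss, df


ss, df = latin_square_anova(SQUARE)
for source in ("rows", "columns", "treatments", "residual", "total"):
    mean_square = ss[source] / df[source]
    print(f"{source:<11}{df[source]:>3}{ss[source]:>9.2f}{mean_square:>9.2f}")
```

The treatment F-ratio is the treatment mean square divided by the residual mean square, to be referred to an F distribution on (2, 2) degrees of freedom.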
2.
(a) A one-way analysis of covariance model is written as
$y_{ij} = \mu + \alpha_i + \beta_i x_{ij} + \varepsilon_{ij}$
(i=1,...,t; j=1,...,r).
State the usual assumptions for the error terms $\varepsilon_{ij}$. Show, without going into
detailed derivation, how an unbiased estimator of the error variance $\sigma^2$ can be
obtained using the ‘fitted values’.
(b) What constraint on the parameters determines parallel regressions in the
model defined in (a)? What constraint determines coincident regressions? How
many independent parameters, apart from $\sigma^2$, are required to define the full,
unconstrained model in (a), and how many in the two constrained versions,
parallel and coincident?
(c) The output y of a photo-electric cell was measured. There were three
types of cell and a covariate x, representing the age of the cell. Data analysis
on a sample of size 15 produced the following residual sums of squares, or
deviances. Those with x have been obtained under a ‘parallel-regressions’
model. Draw up an Anova table that will allow testing of the effects of type
and x, both individually and after fitting the other. Interpret the results.
Effects fitted    deviance
mean              51.07
type              8.72
x                 42.11
type + x          6.04
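The differences of deviances needed for the Anova table in (c) can be sketched as below. This is a minimal illustration under stated assumptions: residual degrees of freedom are taken as n minus the number of fitted parameters (n=15, three types, one common slope), and every F-ratio uses the residual mean square of the 'type + x' model as its denominator, which is one common convention rather than the only possible one. The names deviance, resid_df and extra_ss are introduced here for illustration.

```python
# Residual deviances from Question 2(c), keyed by the effects fitted.
deviance = {"mean": 51.07, "type": 8.72, "x": 42.11, "type + x": 6.04}

n, t = 15, 3  # sample size and number of cell types
# Residual df = n - (number of fitted parameters), parallel-regressions models.
resid_df = {"mean": n - 1, "type": n - t, "x": n - 2, "type + x": n - t - 1}

# Assumed error term: residual mean square of the largest model fitted.
error_ms = deviance["type + x"] / resid_df["type + x"]


def extra_ss(reduced, full):
    """Sum of squares, df and F-ratio for the terms in `full` but not `reduced`."""
    ss = deviance[reduced] - deviance[full]
    df = resid_df[reduced] - resid_df[full]
    return ss, df, (ss / df) / error_ms


comparisons = {
    "type (ignoring x)": ("mean", "type"),
    "x (ignoring type)": ("mean", "x"),
    "type (after x)": ("x", "type + x"),
    "x (after type)": ("type", "type + x"),
}
for label, (reduced, full) in comparisons.items():
    ss, df, f = extra_ss(reduced, full)
    print(f"{label:<20}{df:>3}{ss:>8.2f}{f:>8.2f}")
print(f"{'residual':<20}{resid_df['type + x']:>3}{deviance['type + x']:>8.2f}")
```

Each F-ratio is referred to an F distribution with the listed numerator degrees of freedom and 11 denominator degrees of freedom.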
3.
(a) A thin protective coating is applied to a metal plate. Plate specimens are
then stressed and their coating is observed to crack or not. Let $p_x$ denote the
probability of cracking at thickness x. Write down the form of a suitable linear
logit model for $p_x$. Explain the idea of an underlying tolerance distribution in
this context, and its median, the ED50.
(b) Suppose that one has data in which $r_i$ specimens are observed to crack
out of $n_i$ tested at thickness $x_i$, for i=1,...,m. (An example with m=4 is given
in (c) below.) Write down the log-likelihood function $l_n(\theta)$ in terms of the
corresponding cracking probabilities $p_i$ (i=1,...,m), where $\theta$ denotes the
parameters of the model. Show that the associated score function can be
written in the form
$U_n(\theta) = \sum_{i=1}^{m} [(r_i - n_i p_i) / \{ p_i (1 - p_i) \}] \, p_i'$,
where $p_i'$ denotes $dp_i/d\theta$. Explain, without algebraic derivations, how these
functions, together with the information function, can be used to obtain
estimates of the parameters and their standard errors.
(c) In the following table the data are of the form described in (b) with m=4.
The model was fitted by maximum likelihood and the resulting estimates for
the logit-model parameters, intercept a and slope b, were 1.67 and –7.52.
The corresponding estimate of their covariance matrix had elements
$c_{aa} = 0.45$, $c_{ab} = -1.33$ and $c_{bb} = 4.32$. Derive an approximate 95% confidence
interval for b based on these results, and a similar one for a+0.4b, the
log-odds-ratio of $p_x$ at thickness x=0.4.
Thickness $x_i$                    0.1   0.2   0.3   0.5
No. specimens cracked $r_i$          7     8     4     2
No. specimens tested $n_i$          10    15    10    20
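The interval calculations asked for in (c) follow the usual large-sample (Wald) recipe: estimate plus or minus 1.96 standard errors, where the variance of a linear combination is obtained from the estimated covariance matrix. A minimal sketch, using only the estimates quoted in (c); the function name wald_interval is hypothetical and the normal multiplier 1.96 is an assumption of the approximation.

```python
import math

# Estimates quoted in Question 3(c): intercept a, slope b, and their covariances.
ESTIMATES = [1.67, -7.52]
COVARIANCE = [
    [0.45, -1.33],
    [-1.33, 4.32],
]


def wald_interval(coeffs, estimates, covariance, z=1.96):
    """Approximate 95% interval for sum_k coeffs[k] * parameter[k]."""
    point = sum(c * e for c, e in zip(coeffs, estimates))
    # var(c'theta) = sum_i sum_j c_i c_j cov_ij
    variance = sum(
        ci * cj * covariance[i][j]
        for i, ci in enumerate(coeffs)
        for j, cj in enumerate(coeffs)
    )
    half_width = z * math.sqrt(variance)
    return point - half_width, point + half_width


slope_ci = wald_interval([0.0, 1.0], ESTIMATES, COVARIANCE)   # interval for b
logit_ci = wald_interval([1.0, 0.4], ESTIMATES, COVARIANCE)   # interval for a + 0.4b
print("b:        (%.3f, %.3f)" % slope_ci)
print("a + 0.4b: (%.3f, %.3f)" % logit_ci)
```

Note that the covariance term matters: ignoring $c_{ab}$ would give a much wider interval for a+0.4b.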
4.
(a) Write down the general form of the density of an exponential family
model, identifying the natural and nuisance parameters.
(b) A random sample $(y_1, \dots, y_n)$ is obtained from the discrete geometric
distribution which has probability function
$f(y) = p^{y-1} (1-p)$
(y=1,2,3,...),
where 0<p<1. Write down the log-likelihood function, $l_n(p)$, and show that
the score function can be expressed as
$U_n(p) = n p^{-1} (\bar{y} - 1) - n (1-p)^{-1}$.
Also, derive the sample information, $V_n(p)$. Given that $E(y) = (1-p)^{-1}$ and
$\mathrm{var}(y) = p (1-p)^{-2}$ for this distribution, calculate $E(U_n)$, $E(V_n)$ and $\mathrm{var}(U_n)$.
How do these results accord with general theory? Derive the maximum
likelihood estimator of p and its approximate variance.
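The stated form of the score function can be checked numerically by comparing it with a finite-difference derivative of the log-likelihood. The sketch below does only that check; the sample SAMPLE is hypothetical (it does not come from the paper), and the evaluation point p0 and step h are arbitrary choices.

```python
import math


def log_likelihood(p, ys):
    """Geometric log-likelihood: sum of (y - 1) log p + log(1 - p)."""
    return sum((y - 1) * math.log(p) + math.log(1 - p) for y in ys)


def score(p, ys):
    """Score in the form quoted in (b): n (ybar - 1) / p - n / (1 - p)."""
    n = len(ys)
    ybar = sum(ys) / n
    return n * (ybar - 1) / p - n / (1 - p)


SAMPLE = [1, 3, 2, 5, 1, 4]  # hypothetical data, for illustration only
p0, h = 0.4, 1e-6

# Central finite difference of the log-likelihood at p0.
numeric = (log_likelihood(p0 + h, SAMPLE) - log_likelihood(p0 - h, SAMPLE)) / (2 * h)
analytic = score(p0, SAMPLE)
print(analytic, numeric)
```

The two printed values should agree to several decimal places if the algebraic form of the score is right.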
5.
(a) Samples of sand are collected at many points on an island. According to a
certain theory types 1, 2 and 3 of the sand should occur in the ratios 1 : 2 : 5.
The observed frequencies were 35 samples of type 1, 86 of type 2, and 149
of type 3. Perform a chi-square test of the theory.
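The goodness-of-fit statistic in (a) can be sketched as follows, assuming the standard Pearson form with expected frequencies proportional to the hypothesised ratios. The function name chi_square_gof is introduced for illustration; the p-value line relies on the fact that the chi-square upper tail has the closed form exp(-x/2) when there are exactly 2 degrees of freedom.

```python
import math


def chi_square_gof(observed, ratios):
    """Pearson chi-square statistic for frequencies against fixed ratios."""
    n = sum(observed)
    expected = [n * r / sum(ratios) for r in ratios]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, expected


# Observed counts of sand types 1, 2, 3 and the theoretical ratios 1 : 2 : 5.
statistic, expected = chi_square_gof([35, 86, 149], [1, 2, 5])
df = 3 - 1  # categories minus one; no parameters are estimated
p_value = math.exp(-statistic / 2)  # valid only because df == 2

print("expected:", expected)
print("chi-square = %.3f on %d df, p = %.4f" % (statistic, df, p_value))
```

The statistic is compared with the upper 5% point of chi-square on 2 degrees of freedom, 5.991.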
(b) A random sample of hospital patients is recruited and a dietary change is
introduced. The patients are classified into three groups and an assessment is
made of improvement or otherwise in a certain condition. Use the following
table of observed frequencies to perform a chi-square test for equality of
underlying improvement proportions.
                group 1   group 2   group 3
not improved       32        44        24
improved           48        62        40
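A sketch of the chi-square calculation for (b) is given below. It assumes the table is read group by group, so that the not-improved counts are 32, 44, 24 and the improved counts are 48, 62, 40; if the table is laid out differently, the TABLE constant must be changed accordingly. The function name chi_square_homogeneity is introduced for illustration.

```python
def chi_square_homogeneity(table):
    """Pearson chi-square for a two-way table of counts (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under equal improvement proportions in all groups.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


# Assumed layout: rows are (not improved, improved), columns are groups 1 to 3.
TABLE = [
    [32, 44, 24],
    [48, 62, 40],
]
statistic, df = chi_square_homogeneity(TABLE)
print("chi-square = %.4f on %d df" % (statistic, df))
```

The statistic is compared with the upper 5% point of chi-square on 2 degrees of freedom, 5.991.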
(c) Perform a two-sample (Wilcoxon) test for equality of medians for the
following data of breakage strengths on two materials.
material 1: 3.1 2.5 2.8 1.7 2.3 1.9 2.9 2.7 2.3 2.2 1.9 2.4 2.0
material 2: 2.8 3.7 3.4 3.8 3.0 2.9 3.2 3.4 2.8 3.5 3.0
You may assume the formulae $\mu_j = n_j (n_1+n_2+1) / 2$, $\sigma_j^2 = n_1 n_2 (n_1+n_2+1) / 12$
for the rank sums.
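The rank-sum calculation for (c) can be sketched as follows. Assumptions: tied observations receive the average of the ranks they occupy (midranks), the variance formula quoted in the question is used without any correction for ties, and no continuity correction is applied. The helper name midranks is introduced for illustration.

```python
import math

MATERIAL_1 = [3.1, 2.5, 2.8, 1.7, 2.3, 1.9, 2.9, 2.7, 2.3, 2.2, 1.9, 2.4, 2.0]
MATERIAL_2 = [2.8, 3.7, 3.4, 3.8, 3.0, 2.9, 3.2, 3.4, 2.8, 3.5, 3.0]


def midranks(values):
    """Ranks 1..N in the order of `values`, with ties given their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average = (start + end) / 2 + 1  # mean of ranks start+1, ..., end+1
        for k in range(start, end + 1):
            ranks[order[k]] = average
        start = end + 1
    return ranks


n1, n2 = len(MATERIAL_1), len(MATERIAL_2)
ranks = midranks(MATERIAL_1 + MATERIAL_2)
rank_sum_1 = sum(ranks[:n1])  # rank sum for material 1

mean_1 = n1 * (n1 + n2 + 1) / 2        # formula quoted in the question
variance = n1 * n2 * (n1 + n2 + 1) / 12
z = (rank_sum_1 - mean_1) / math.sqrt(variance)

print("rank sum (material 1) =", rank_sum_1)
print("mean = %.1f, variance = %.2f, z = %.3f" % (mean_1, variance, z))
```

The value of z is referred to the standard normal distribution for a two-sided test.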
INTERNAL EXAMINER: PROFESSOR M J CROWDER
EXTERNAL EXAMINER: PROFESSOR B JONES