STAT 511                    FINAL EXAM                    Spring 1999

NAME ____________

Instructions: This is a closed book exam. No books or notes are allowed. We have not left enough space to write your answers on this exam. Please write your answers on separate sheets of paper. Be sure to write your name on each sheet of paper that you submit.

1. A pharmacologist modeled the responsiveness of patients to a drug using the following model

   $$Y_j = \beta_0 - \frac{\beta_0}{1 + (X_j/\beta_1)^{\beta_2}} + \varepsilon_j$$

   where

      $X_j$ is the dosage level of the drug,
      $Y_j$ is the observed responsiveness, expressed as a percent of the maximum possible responsiveness,
      $\varepsilon_j$ denotes independent and identically distributed random errors with $E(\varepsilon_j) = 0$ and $Var(\varepsilon_j) = \sigma^2$ for all $j = 1, 2, \ldots, n$,

   and $\beta_0 > 0$, $\beta_1 > 0$, and $\beta_2 > 0$. Data were obtained from 19 patients at the dosage levels shown in the following table.

      Patient     Dosage Level     Observed Responsiveness
        (j)          (X_j)                 (Y_j)
         1            1.0                   0.5
         2            2.0                   2.3
         3            3.0                   3.4
         4            3.5                  11.5
         5            3.5                  10.9
         6            4.0                  24.0
         7            4.0                  25.3
         8            4.5                  39.6
         9            4.5                  37.9
        10            5.0                  54.7
        11            5.0                  56.8
        12            5.5                  70.8
        13            5.5                  68.4
        14            6.0                  82.1
        15            6.0                  80.6
        16            6.5                  87.2
        17            7.0                  92.8
        18            8.0                  94.2
        19            9.0                  94.4

   Least squares estimates of the parameters $\beta_0$, $\beta_1$, and $\beta_2$ were obtained from the nls( ) function in S-PLUS. S-PLUS code and some of the results are shown on pages 2 and 3.

   (a) Give an interpretation of the parameters $\beta_0$ and $\beta_1$. What do these parameters represent?

   (b) The least squares estimates of $\beta_0$, $\beta_1$, and $\beta_2$ are shown in the results listed on page 2, along with values for standard errors. Describe how the standard errors are computed. Define any notation you introduce in your answer.

   (c) How is the deviance value listed at the top of page 3 computed? Give a formula.

   (d) Suppose the model is correct. Use the least squares estimates of the parameters to estimate the dosage level of the drug at which the mean responsiveness is 80% of the maximum. Show how to compute a standard error for your estimate; a numerical value for the standard error is not needed.
   (e) A plot of the least squares estimate of the curve defined by our model is included in the results from the S-PLUS code. This plot suggests that the model may not be entirely correct. Describe how you would perform a test of the fit of this model.

2. Let $t_n$ denote the 20% trimmed mean computed from a sample of n values, $Y_1, Y_2, \ldots, Y_n$. A 20% trimmed mean is evaluated by ordering the data from smallest to largest, deleting the largest 20% of the observed values and the smallest 20% of the observed values, and computing the average of the remaining values. Explain how you would use the bootstrap to obtain a standard error of a 20% trimmed mean for a random sample of n = 100 observations.

3. Females of a certain species of freshwater turtle lay their eggs in the sand and immediately return to the water. The eggs are kept warm by the sun warming the sand. The sex of the young that hatch from the eggs is affected by the temperature at which the eggs are incubated. To study this phenomenon, a sample of 120 eggs was obtained from eggs deposited in the sand by turtles in New Mexico. The eggs were randomly divided into 6 groups, with 20 eggs in each group. Each group of eggs was incubated in a laboratory at a different temperature until the eggs hatched. Then the numbers of males and females were counted. The results are shown below.

      Incubation Temperature (°C)     Number of Females     Number of Males
               27.0                          3                    17
               27.5                          5                    15
               28.0                          4                    16
               28.5                         12                     8
               29.0                         17                     3
               29.5                         18                     2

   (a) What are the basic features of a generalized linear model?

   (b) Let $Y_j$ denote the number of males that emerge from the $n_j = 20$ eggs incubated at the j-th temperature. Assume that $Y_j \sim Bin(n_j, \pi_j)$, where

       $$\log(-\log(1 - \pi_j)) = \beta_0 + \beta_1 X_j, \quad j = 1, \ldots, 6,$$

       and $X_j$ denotes the j-th temperature level. Maximum likelihood estimates were computed with the glm( ) function in S-PLUS.
       The estimates and their standard errors are as follows:

                               Estimate     Standard Error
          $\hat{\beta}_0$       -49.74           8.95
          $\hat{\beta}_1$         1.76           0.32

       The formula for the residual deviance is

       $$2 \sum_{j=1}^{6} \left[ Y_j \log\left( \frac{Y_j}{n_j \hat{\pi}_j} \right) + (n_j - Y_j) \log\left( \frac{n_j - Y_j}{n_j - n_j \hat{\pi}_j} \right) \right]$$

       where $\hat{\pi}_j$ is the maximum likelihood estimator of $\pi_j$ for the model in part (b). This statistic can be used to test a hypothesis. Clearly state the null and alternative hypotheses, and state what you know about the distribution of this test statistic.

   (c) An independent sample of 120 eggs was obtained from turtles of the same species living in Illinois. The researchers want to know if the relationship between sex of the hatchlings and incubation temperature is the same for turtles in New Mexico and Illinois. Describe how this hypothesis could be tested.

4. Three rabbits were used in an experiment to examine the effectiveness of the drug MDL in controlling blood pressure. First, each rabbit was exposed to a stimulant (PBG) and the increase in blood pressure was recorded as a percentage of the blood pressure measured before PBG was administered. After waiting for two weeks, to allow any effect of the first exposure to PBG to wear off, each rabbit was treated with MDL and once again exposed to the same level of PBG. The resulting increase in blood pressure was recorded. The goal is to determine if treatment with MDL is effective in suppressing the effect of the stimulant (PBG) on blood pressure. The observations are as follows:

      Rabbit     Control (Not Treated with MDL)     Treated with MDL
        1                 $Y_{11}$                      $Y_{12}$
        2                 $Y_{21}$                      $Y_{22}$
        3                 $Y_{31}$                      $Y_{32}$

   Here, the rabbits are considered to be a sample from a larger population of rabbits used in the experiment, and the following model was proposed:

   $$Y_{ij} = \mu_j + \eta_i + \varepsilon_{ij}$$

   where

      $\eta_i \sim NID(0, \sigma_\eta^2)$, $i = 1, 2, 3$,
      $\varepsilon_{ij} \sim NID(0, \sigma_\varepsilon^2)$, $i = 1, 2, 3$ and $j = 1, 2$,

   and any $\eta_i$ is independent of any $\varepsilon_{ij}$.
   (a) Show how to write the model in the form

       $$\mathbf{Y} = X\boldsymbol{\beta} + Z\mathbf{u} + \boldsymbol{\varepsilon}$$

       where $\boldsymbol{\beta}$ is a vector of non-random parameters, $\mathbf{u}$ is a vector of random effects, and $\boldsymbol{\varepsilon}$ is a vector of random errors. Report formulas for covariance matrices for $\mathbf{u}$ and $\boldsymbol{\varepsilon}$.

   (b) Suppose the values of the variance components, $\sigma_\eta^2$ and $\sigma_\varepsilon^2$, are known. Give a formula for the best linear unbiased estimator for $\boldsymbol{\beta}$. Report what you know about the distributional properties of this estimator.

   (c) Show how to construct a set of "error contrasts" to use in REML estimation of the variance components. What is the motivation for using REML estimates of variance components?

   (d) Suppose the REML estimates for $\sigma_\eta^2$ and $\sigma_\varepsilon^2$ are inserted into your formula for the estimator for $\boldsymbol{\beta}$ from part (b). Report what you know about the distributional properties of the resulting estimator.

Exam Score ________          Course Grade ______
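
Note on Problem 2: the definition of the 20% trimmed mean given there (order the data, delete the smallest 20% and the largest 20% of the values, average the rest) can be illustrated with a minimal Python sketch. Python is used here only for illustration, since the exam itself refers to S-PLUS. The function name trimmed_mean, the integer percent argument, and the choice to round the number of deleted values down when 20% of n is not a whole number are all assumptions made for this sketch and are not specified by the exam.

```python
def trimmed_mean(values, percent=20):
    """Trimmed mean: drop the smallest and largest `percent`% of the
    ordered values, then average what remains.

    Hypothetical helper for illustration; not part of the exam.
    """
    if not 0 <= percent < 50:
        raise ValueError("percent must be in the range [0, 50)")
    ordered = sorted(values)
    n = len(ordered)
    # Assumption: when percent% of n is not an integer, round down.
    k = n * percent // 100
    kept = ordered[k:n - k]
    return sum(kept) / len(kept)


# Made-up sample with n = 100, the sample size mentioned in Problem 2.
sample = list(range(1, 101))
t_n = trimmed_mean(sample)  # averages the 60 middle values, 21 through 80
```

With n = 100 the function deletes 20 values from each end and averages the remaining 60.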