Spring 2009 Sample Final Exam Questions

The final exam consists of 8 short answer questions (5 points each) and 3 long answer questions (20 points each). The long answer questions may have several parts relating to the same problem. The questions below are typical and may form parts of both types of questions on the final exam.

Problem 1. Derive 3 different algorithms to sample from the beta density f(x) = 6x(1 − x) on (0, 1).

Problem 2. Obtain a rejection algorithm to generate a random variable X from the distribution with density

    f(x) = (1/2) x^2 exp(−x),   x > 0,

using the exponential density with mean λ to construct an envelope. Find the value of λ that minimizes the expected number of iterations of the algorithm required to return a single value of X.

Problem 3. Consider the Monte Carlo evaluation of

    ∫_2^∞ (1/2) e^{−(1/2)x} dx,

i.e., Prob(X > 2), where X ∼ Exponential with mean 2. Use the first 2 terms of the Taylor series expansion of e^{−(1/2)x} to derive a control variate technique for evaluating this integral by Monte Carlo.

Problem 4. Consider the Monte Carlo evaluation of ∫_0^1 √(1 − x^2) dx. Show how to use the density h(x) ∝ 1 − (1/2)x^2 on (0, 1) to derive an importance sampling technique to do this. Explain the reasons for the efficiency gains possible.

Problem 5. Let x1, x2, ..., xn be a random sample of size n. The quantity √n (x̄ − µ)/s, where x̄ = sample mean and s^2 = Σ(xi − x̄)^2/(n − 1), is known to have a t-distribution with (n − 1) degrees of freedom under certain conditions. Suppose that you want to study the robustness of the t-statistic using a Monte Carlo experiment. Suppose, in particular, one is interested in computing the Type I error rates under various sample distributions. Answer the following questions concerning the planning of this experiment.

1. What value (or values) of n would you use?
2. From what distribution(s) would you generate sample(s) of size n? What methods would you use to do this?
3.
To study the distribution of the statistic, you will need to generate a sample from the sampling distribution of t = (x̄ − µ0)/(s/√n). How would you do this? How large a MC sample would you take? (What is your µ0?)
4. Once you have the sample of t values, describe the things that you would do to reach the aims of your study.
5. How would you incorporate an efficient MC method in your computations?

Problem 6. Given the model

    yi = α xi/(β + xi) + εi

and the data (yi, xi) = (1.4, 1), (2, 2), (2.3, 3), derive the iterative formulas for obtaining the least squares estimate of (α, β) using

1. the Newton-Raphson algorithm, and
2. the Gauss-Newton algorithm,

and compute one iteration in each case. Use α(0) = 2, β(0) = 2.

Problem 7. Describe the Metropolis algorithm to generate samples from the Laplace density g(x) = (1/2) e^{−|x|}, −∞ < x < ∞. Use N(x, 2) as the candidate generating density (CGD).

Problem 8. Let X be an n × p matrix, n > p, such that rank(X) = r < p. Householder transformations, with possible column permutations, can be used to decompose X as X = QRP, where Q and P are orthogonal and

    R = [ R1  R2 ]
        [  0   0 ]

where R1 is r × r upper triangular and nonsingular. Describe how to compute X†, the Moore-Penrose generalized inverse of X, given the matrices Q, R and P.

Problem 9. Consider the integral

    ∫_0^∞ x log(1 + x) e^{−x/2} dx.

(a) Describe two methods for applying the crude Monte Carlo method to estimate this integral.
(b) Describe carefully three, possibly more efficient, Monte Carlo methods you would use to estimate this integral.
(c) How would you evaluate their performance?

Problem 10. Suppose it is required to estimate θ, where

    θ = ∫_0^1 exp(x^2) dx.

Show that generating a Uniform random number u and then using the estimator exp(u^2)(1 + exp(1 − 2u))/2 is better than generating two Uniforms u1 and u2 and using [exp(u1^2) + exp(u2^2)]/2.

Problem 11. Suppose that Y is a normal random variable with mean 1 and variance 1, and suppose that, conditional on Y = y, X has a normal distribution with mean y and variance 4.
It is required to estimate θ = P[X > 1] using Monte Carlo simulation.
(i) Describe the crude MC estimator.
(ii) Show how conditional expectation can be used to obtain an improved estimator.
(iii) Show how the estimator in (ii) can be further improved by using antithetic sampling.
(iv) Show how the estimator in (ii) can be further improved by using a control variate.

Problem 12. Design a Monte Carlo experiment to estimate σ^2 = E{X^2} when X has the density proportional to q(x) = exp{−|x|^3/3}, using importance sampling with standardized weights. Give arguments supporting your choice of weights.

Problem 13. The Cholesky decomposition has been applied to the normal equations (X̃′X̃, X̃′ỹ), where the data have been corrected for the mean. Compute the least squares estimate of β = (β1, β2)′, the residual sum of squares for fitting the model y = β0 + β1 x1 + β2 x2 + ε, and the matrix (X̃′X̃)^{−1}, using the results:

    (X̃′X̃ | X̃′ỹ) = [  36    6  −12 ]
                     [   6   26   −7 ]
                     [ −12   −7   20 ]

    Cholesky gives → [ 6  1  −2  ]
                     [ 0  5  −1  ]
                     [ 0  0  √15 ]

Problem 14. Let y = (y1, ..., yn) be observed from a population such that √qi (yi − µ)/σ has a N(0, 1) distribution conditional on qi, where q = (q1, ..., qn) is an i.i.d. sample from a distribution with some density h(·) free of unknown parameters. Consider the joint density of (y, q) as the complete-data density. This density can be shown to be from an exponential family with the sufficient statistics Σ yi qi, Σ yi^2 qi, and Σ qi. The complete-data (i.e., if q is observed) maximum likelihood estimates of µ and σ^2 are

    µ̂ = Σ_{i=1}^n yi qi / Σ_{i=1}^n qi,
    σ̂^2 = Σ_{i=1}^n (yi − µ̂)^2 qi / n.

When q is not observed, we may apply the EM algorithm to obtain m.l.e.'s of µ and σ^2 based on the observed data y. Formulate the E-step and M-step computations needed using the facts given above. (To derive the conditional expectations needed, you need the conditional distribution of q given y.)

Problem 15.
White cell counts (xi) and time to death in weeks from initial diagnosis (yi) were recorded for 17 leukemia patients. A linear hazard model specifies that yi ∼ Exponential with mean θi, where θi^{−1} = α + β xi. (The data and some statistics are given.)
(i) Write the log-likelihood function of (α, β).
(ii) Derive the score vector and the observed information matrix.
(iii) Perform a single step of the method of steepest descent starting with α(0) = 1.0, β(0) = 0.0.
(iv) Perform a single step of the method of scoring starting with α(0) = 1.0, β(0) = 0.0.

Problem 16. Iterative methods for numerical optimization of the function

    f(x) = log(x)/(1 + x)

may be based on methods for finding roots of the first derivative of f(x). Describe or derive an algorithm for doing this using
(i) a bracketing method,
(ii) a fixed-point method,
(iii) the Newton-Raphson method,
(iv) the secant method.
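As an illustration of the algorithm Problem 2 asks for, here is a minimal sketch (not a model answer). For the Gamma-type target f(x) = (1/2) x^2 e^{−x}, an exponential envelope with mean λ gives envelope constant M(λ) = sup f/g; minimizing M over λ yields λ = 3 with M = (27/2) e^{−2} ≈ 1.827, attained at x = 3. The function name and sample size below are arbitrary choices.

```python
import math
import random

def sample_gamma3(rng=random):
    """Rejection sampling for f(x) = (1/2) x^2 e^{-x} (a Gamma(3,1) density)
    using an exponential envelope with the optimal mean lambda = 3."""
    lam = 3.0
    M = 13.5 * math.exp(-2)                 # sup f/g = (27/2) e^{-2}, at x = 3
    while True:
        x = rng.expovariate(1.0 / lam)      # candidate from Exp(mean 3); arg is the rate
        g = math.exp(-x / lam) / lam        # envelope density at x
        f = 0.5 * x * x * math.exp(-x)      # target density at x
        if rng.random() <= f / (M * g):     # accept with probability f/(M g)
            return x

random.seed(1)
xs = [sample_gamma3() for _ in range(20000)]
mean = sum(xs) / len(xs)                    # Gamma(3,1) has mean 3
```

On average each returned value costs M ≈ 1.83 candidate draws, which is what the choice λ = 3 minimizes.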
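For Problem 7, a random-walk Metropolis chain might be sketched as follows. Since the N(x, 2) candidate generating density is symmetric in x and y, the acceptance ratio reduces to g(y)/g(x) = exp(|x| − |y|). Reading "N(x, 2)" as variance 2 is an assumption here (it could also mean standard deviation 2); the chain length and starting point are arbitrary.

```python
import math
import random

def metropolis_laplace(n, sd=math.sqrt(2.0), rng=random):
    """Random-walk Metropolis targeting the Laplace density g(x) = (1/2) e^{-|x|},
    with a N(x, 2) candidate (variance 2 assumed, so sd = sqrt(2))."""
    x = 0.0
    out = []
    for _ in range(n):
        y = rng.gauss(x, sd)                         # symmetric candidate draw
        # Metropolis ratio g(y)/g(x) = exp(|x| - |y|); normalizing constants cancel
        if rng.random() <= math.exp(abs(x) - abs(y)):
            x = y                                    # accept; otherwise keep x
        out.append(x)
    return out

random.seed(2)
chain = metropolis_laplace(50000)
m = sum(chain) / len(chain)                          # Laplace mean is 0
v = sum(xi * xi for xi in chain) / len(chain)        # Laplace(1) variance is 2
```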
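The comparison in Problem 10 can be checked numerically. Note that exp(u^2)(1 + exp(1 − 2u))/2 is exactly [exp(u^2) + exp((1 − u)^2)]/2, the antithetic pair (u, 1 − u), so the claim is that antithetic sampling beats two independent uniforms at the same cost. A quick empirical check (sample size arbitrary):

```python
import random
from math import exp

def variance(vals):
    """Sample variance of a list of floats."""
    m = sum(vals) / len(vals)
    return sum((v - m) ** 2 for v in vals) / (len(vals) - 1)

random.seed(3)
n = 100000
anti, indep = [], []
for _ in range(n):
    u = random.random()
    # antithetic estimator: exp(u^2)(1 + exp(1-2u))/2 = [exp(u^2) + exp((1-u)^2)]/2
    anti.append(exp(u * u) * (1.0 + exp(1.0 - 2.0 * u)) / 2.0)
    u1, u2 = random.random(), random.random()
    indep.append((exp(u1 * u1) + exp(u2 * u2)) / 2.0)

mean_anti = sum(anti) / n          # both estimators are unbiased for the integral
v_anti = variance(anti)
v_indep = variance(indep)
```

Since exp(x^2) is monotone on (0, 1), the pair (exp(u^2), exp((1 − u)^2)) is negatively correlated, which is why v_anti comes out well below v_indep.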
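The computations in Problem 13 can be organized as in the sketch below: factor the augmented cross-product matrix, back-solve the leading block for β̂, and read the residual sum of squares off the last diagonal entry squared. The helper names are ad hoc; the numbers are those given in the problem.

```python
import math

# Augmented cross-product matrix from Problem 13 (mean-corrected data)
A = [[36.0,  6.0, -12.0],
     [ 6.0, 26.0,  -7.0],
     [-12.0, -7.0, 20.0]]

def cholesky_upper(A):
    """Return the upper triangular R with R'R = A."""
    n = len(A)
    R = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            s = A[i][j] - sum(R[k][i] * R[k][j] for k in range(i))
            R[i][j] = math.sqrt(s) if i == j else s / R[i][i]
    return R

R = cholesky_upper(A)        # reproduces [[6, 1, -2], [0, 5, -1], [0, 0, sqrt(15)]]

# Back-solve R1 beta = theta, where R1 is the leading 2x2 block of R
# and theta is the top of the last column.
b2 = R[1][2] / R[1][1]
b1 = (R[0][2] - R[0][1] * b2) / R[0][0]
beta = (b1, b2)              # least squares estimate (-0.3, -0.2)
rss = R[2][2] ** 2           # residual sum of squares = (sqrt(15))^2 = 15
```

The remaining quantity, (X̃′X̃)^{−1}, follows from the same factor as R1^{−1}(R1^{−1})′.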
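For part (iii) of Problem 16, a Newton-Raphson sketch: f′(x) = [(1 + x)/x − log x]/(1 + x)^2 vanishes exactly when g(x) = log x − 1 − 1/x = 0, so one can apply Newton's method to g with g′(x) = 1/x + 1/x^2. The starting value 3.0 and iteration cap are arbitrary choices, not part of the problem.

```python
import math

def newton_stationary_point():
    """Newton-Raphson for the stationarity condition of f(x) = log(x)/(1+x):
    f'(x) = 0  <=>  g(x) = log(x) - 1 - 1/x = 0."""
    g  = lambda x: math.log(x) - 1.0 - 1.0 / x
    gp = lambda x: 1.0 / x + 1.0 / (x * x)       # g'(x)
    x = 3.0                                       # starting value (assumption)
    for _ in range(50):
        step = g(x) / gp(x)
        x -= step                                 # Newton update x <- x - g/g'
        if abs(step) < 1e-12:
            break
    return x

xstar = newton_stationary_point()                 # maximizer of log(x)/(1+x)
```

The same root condition g(x) = 0 also supplies the bracket (g changes sign on [3, 4]) and a natural fixed-point form x = exp(1 + 1/x) for parts (i) and (ii).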