Studienzentrum Gerzensee
Doctoral Program in Economics
Econometrics 2014
Week 1 Exercises

1. Let $X$ and $Y$ have joint density $f_{X,Y}(x,y) = x + y$ for $0 < x < 1$ and $0 < y < 1$, and $f_{X,Y}(x,y) = 0$ elsewhere.
(a) What is the density of $X$?
(b) Find $E(Y \mid X = 3/4)$.
(c) Find $\mathrm{Var}(Y \mid X = x)$ for $0 < x < 1$.
(d) Let $Z = 3 + X^2$. What is the probability density of $Z$?

2. Let $f_X(x) = 2x$ for $0 < x < 1$ and $f_X(x) = 0$ elsewhere.
(a) Find $E(1/X)$.
(b) Let $Y = 1/X$.
(i) Find $f_Y(y)$.
(ii) Find $E(Y)$.

3. Suppose that $X$ is distributed with CDF $F(x)$ and density $f(x)$. Show that $E(X) = \int_0^1 F^{-1}(t)\,dt$.

4. $X$ has probability density $f(x) = 1/4$ for $-1 < x < 3$ and $f(x) = 0$ elsewhere. Let $Y = X^2$.
(a) What is the CDF of $Y$?
(b) What is the probability density function of $Y$?

5. Suppose $X$ has probability density function (pdf) $f(x) = 1/x^2$ for $x \geq 1$, and $f(x) = 0$ elsewhere.
(a) Compute $P(X \geq 5)$.
(b) Derive the CDF of $X$.
(c) Show that the mean of $X$ does not exist.
(d) Let $Y = 1/X$. Derive the probability density function of $Y$.
(e) Compute $E(Y)$.

6. Suppose that $X$ is a discrete random variable with the following distribution:

   $x$          1      2      3
   $P(X = x)$   0.25   0.50   0.25

The distribution of $Y \mid X = x$ is Bernoulli with $P(Y = 1 \mid X = x) = (1 + x)^{-1}$.
(a) Compute the mean of $X$.
(b) Find the variance of $X$.
(c) Suppose $X = 2$.
(c.i) What is your best guess of the value of $Y$?
(c.ii) Explain what you mean by "best".
(d) What is the distribution of $X$ conditional on $Y = 1$?

7. (a) $X$ is a Bernoulli random variable with $P(X = 1) = p$. What is the moment generating function for $X$?
(b) $Y = \sum_{i=1}^n X_i$, where the $X_i$ are iid Bernoulli with $P(X_i = 1) = p$. What is the moment generating function for $Y$?
(c) Use your answer in (b) to compute $E(Y)$, $E(Y^2)$ and $E(Y^3)$ for $n > 3$.

8. Let $M(t)$ denote the mgf of $X$, and let $S(t) = \ln(M(t))$. Let $S'(0)$ denote the first derivative of $S(t)$ evaluated at $t = 0$, and $S''(0)$ the second derivative.
(a) Show that $S'(0) = E(X)$.
(b) Show that $S''(0) = \mathrm{var}(X)$.

9. Suppose that $f_{X,Y}(x,y) = 1/4$ for $1 < x < 3$ and $1 < y < 3$, and $f_{X,Y}(x,y) = 0$ elsewhere.
(a) Derive the marginal densities of $X$ and $Y$.
(b) Find $E(Y \mid X = 2.3)$.
(c) Find $P[(X < 2.5) \cap (Y < 1.5)]$.
(d) Let $W = X + Y$. Derive the moment generating function of $W$.

10. Show that if $X$ is $N(0,1)$, then $E(g'(X) - X g(X)) = 0$ for any differentiable function $g$ whose first derivative is integrable. (Hint: use integration by parts and the fact that, because $g(x)$ is integrable, $\lim_{x\to\infty} g(x)f(x) = \lim_{x\to-\infty} g(x)f(x) = 0$, where $f(x)$ is the standard normal density.) Use this result to show $E(X^{i+1}) = i\,E(X^{i-1})$.

11. $X$ and $Y$ are random variables. The marginal distribution of $X$ is $U[0,1]$, that is, $X$ is uniformly distributed on $[0,1]$. The conditional distribution of $Y \mid X = x$ is $U[x, 1+x]$ for $0 \leq x \leq 1$.
(a) Derive the mean of $X$.
(b) Derive the variance of $X$.
(c) Derive the joint probability density function of $X$ and $Y$.
(d) Derive the mean of $Y$.

12. Suppose that a random variable $X$ is distributed $N(6, 2)$. Let $\varepsilon$ be another random variable that is distributed $N(0, 3)$. Suppose that $X$ and $\varepsilon$ are statistically independent. Let the random variable $Y$ be related to $X$ and $\varepsilon$ by $Y = 4 + 6X + \varepsilon$.
(a) Prove that the joint distribution of $(X, Y)$ is normal. Include in your answer the value of the mean vector and covariance matrix of the distribution.
(b) Suppose that you knew that $X = 10$. Find the optimal (minimum mean square error) forecast of $Y$.

13. $Y$ and $X$ are two random variables with $(Y \mid X = x) \sim N(0, x^2)$ and $X \sim U(1, 2)$.
(a) What is the joint density of $Y$ and $X$?
(b) Compute $E(Y^4)$. (Hint: if $Q \sim N(0, \sigma^2)$, then $E(Q^4) = 3\sigma^4$.)
(c) What is the density of $W = Y/X$? (Hint: what is the density of $W$ conditional on $X = x$?)
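The moment calculations in Problem 12 are easy to cross-check by simulation. A minimal Python sketch, assuming the second argument of $N(\cdot,\cdot)$ throughout these exercises denotes a variance:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Problem 12 setup: X ~ N(6, 2), eps ~ N(0, 3), independent; the second
# parameter is read as a variance, so the normal draws use sqrt() scales.
x = rng.normal(6.0, np.sqrt(2.0), n)
eps = rng.normal(0.0, np.sqrt(3.0), n)
y = 4.0 + 6.0 * x + eps

# Implied moments: E(Y) = 4 + 6*6 = 40, var(Y) = 36*2 + 3 = 75, cov(X,Y) = 6*2 = 12
print("sample means:", x.mean(), y.mean())   # close to (6, 40)
print("sample cov  :\n", np.cov(x, y))       # close to [[2, 12], [12, 75]]

# The MMSE forecast of Y given X = 10 is E(Y | X = 10) = 4 + 6*10 = 64.
print("E(Y | X=10) :", 4.0 + 6.0 * 10.0)
```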
14. Let $X \sim N(0,1)$ and suppose that $Y = X$ if $0.5 < X < 1$, and $Y = 0$ otherwise.
(a) Find the distribution of $Y$.
(b) Find $E(Y)$.
(c) Find $\mathrm{Var}(Y)$.

15. Suppose that a random variable $X$ has density function $f$ and distribution function $F$, where $F$ is strictly increasing. Let $Z = F(X)$ (so that $Z$ is a random variable constructed by evaluating $F$ at the random point $X$). Show that $Z$ is distributed $U(0,1)$.

16. Let $f(x) = (1/2)^x$, $x = 1, 2, 3, \ldots$. Find $E(X)$.

17. Let $f(x_1, x_2) = 2x_1$ for $0 < x_1 < 1$, $0 < x_2 < 1$, zero elsewhere, be the density of the vector $(X_1, X_2)$.
(a) Find the marginal densities of $X_1$ and $X_2$.
(b) Are the two random variables independent? Explain.
(c) Find the density of $X_1$ given $X_2$.
(d) Find $E(X_1)$ and $E(X_2)$.
(e) Find $P[X_1 < 0.5]$.
(f) Find $P[X_2 < 0.5]$.
(g) Find $P[(X_1 < 0.5) \cap (X_2 < 0.5)]$.
(h) Find $P[(X_1 < 0.5) \cup (X_2 < 0.5)]$.

18. Let $f(x) = 1$ for $0 < x < 1$ and zero elsewhere be the density of $X$. Let $Y = X^{1/2}$.
(a) Find the density of $Y$.
(b) Find the mean of $Y$.
(c) Find the variance of $Y$.

19. Suppose that $X$ and $Y$ are independent random variables, $g$ and $h$ are two functions, and $E(g(X))$ and $E(h(Y))$ exist. Show that $E(g(X)h(Y)) = E(g(X))E(h(Y))$.

20. The following table gives the joint probability distribution between employment status and college graduation among those either employed or looking for work (unemployed) in the working-age U.S. population, based on the 1990 U.S. Census.

Joint Distribution of Employment Status and College Graduation in the U.S. Population Aged 25-64, 1990

                            Unemployed (Y=0)   Employed (Y=1)   Total
Non College Grads (X=0)          0.045             0.709        0.754
College Grads (X=1)              0.005             0.241        0.246
Total                            0.050             0.950        1.000

(a) Compute $E(Y)$.
(b) The unemployment rate is the fraction of the labor force that is unemployed. Show that the unemployment rate is given by $1 - E(Y)$.
(c) Calculate $E(Y \mid X = 1)$ and $E(Y \mid X = 0)$.
(d) Calculate the unemployment rate for (i) college grads and (ii) non-college grads.
(e) A randomly selected member of this population reports being unemployed. What is the probability that this worker is a college graduate? A non-college graduate?
(f) Are educational achievement and employment status independent? Explain.

21. Suppose that $X$ has density $f(x)$, and let $\mu = E(X)$ and $m$ be the median of $X$, with $E(X^2) < \infty$. (The median solves $F(m) = 0.5$.)
(a) Show that $a = \mu$ solves $\min_a E[(X - a)^2]$.
(b) Show that $a = m$ solves $\min_a E[\,|X - a|\,]$.

22. Suppose that $Y$ and $X$ are two random variables; assume the existence of any moments needed below.
(a) Find the values of $\alpha_0$ and $\alpha_1$ that minimize $MSE(\alpha_0, \alpha_1) = E[(Y - \alpha_0 - \alpha_1 X)^2]$.
(b) Find the values of $\alpha_0$, $\alpha_1$ and $\alpha_2$ that minimize $MSE(\alpha_0, \alpha_1, \alpha_2) = E[(Y - \alpha_0 - \alpha_1 X - \alpha_2 X^2)^2]$.

23. The density of $Y$ is $e^y$ for $-\infty < y < 0$ and zero elsewhere.
(a) Compute $P(Y \geq -1)$.
(b) Find the value of $c$ that minimizes $E|Y - c|$.

24. Suppose $X = (X_1\ X_2\ \cdots\ X_n)'$.
(a) Show that $E(X) = \mu_X = (E(X_1)\ E(X_2)\ \cdots\ E(X_n))'$ is the mean vector.
(b) Let $\sigma_{ij} = E\{(X_i - \mu_i)(X_j - \mu_j)\}$. Show that $\sigma_{ij} = \sigma_{ji}$.
(c) Show that if $X_i$ and $X_j$ are independent, then $\sigma_{ij} = 0$.

25. Suppose $X_i$, $i = 1, \ldots, n$, are iid random variables and let $Y = \sum_{i=1}^n X_i$.
(a) Show that $M_Y(t) = \prod_{i=1}^n M_{X_i}(t)$.
(b) When the $X_i$ are Bernoulli, $Y$ is binomial. Use your results for the MGF to compute the mean and variance of $Y$.
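Problem 15 (the probability integral transform) can be illustrated numerically. The sketch below uses an exponential CDF as one concrete choice of strictly increasing $F$; that choice is purely illustrative:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.exponential(1.0, 100_000)   # X with CDF F(x) = 1 - exp(-x)
z = 1.0 - np.exp(-x)                # Z = F(X)

# If Problem 15 is right, Z should look U(0,1): check quantiles and a KS test.
print(np.quantile(z, [0.1, 0.5, 0.9]))   # close to [0.1, 0.5, 0.9]
print(stats.kstest(z, "uniform"))        # large p-value expected
```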
26. Suppose $Y$ is a Poisson random variable with $E(Y) = 1$. Let $X = Y$ if $Y \leq 2$, and $X = 2$ if $Y > 2$.
(a) Derive the CDF of $X$.
(b) Derive the mean and variance of $X$.

27. Suppose that $W \sim \chi^2_3$ and that the distribution of $X$ given $W = w$ is $U[0, w]$ (that is, uniform on 0 to $w$).
(a) Find $E(X)$.
(b) Find $\mathrm{Var}(X)$.
(c) Suppose $W = 4$. What is the value of the minimum mean square error forecast of $X$?

28. Complete the following calculations.
(a) Suppose $X \sim N(5, 100)$. Find $P(-2 < X < 7)$.
(b) Suppose $X \sim N(50, 4)$. Find $P(X \geq 48)$.
(c) Suppose $X \sim \chi^2_4$. Find $P(X \geq 9.49)$.
(d) Suppose $X \sim \chi^2_6$. Find $P(1.64 \leq X \leq 12.59)$.
(e) Suppose $X \sim \chi^2_6$. Find $P(X \leq 3.48)$.
(f) Suppose $X \sim t_9$. Find $P(1.38 \leq X \leq 2.822)$.
(g) Suppose $X \sim N(5, 100)$. Find $P(-1 < X < 6)$.
(h) Suppose $X \sim N(46, 2)$. Find $P(X \geq 48)$.
(i) Suppose $X \sim \chi^2_6$. Find $P(1.24 \leq X \leq 12.59)$.
(j) Suppose $X \sim F_{3,5}$. Find $P(X > 7.76)$.
(k) Suppose $X \sim t_{10}$. Find $P(X \leq 2.228)$.

29. If $X \sim N(1, 4)$, find $P(1 < X < 9)$.

30. Find the 90th percentile of the $N(10, 30)$ distribution.

31. Suppose $X \sim N(\mu, \sigma^2)$ with $P(X < 89) = .90$ and $P(X < 94) = .95$. Find $\mu$ and $\sigma^2$.

32. Suppose $X \sim N(\mu, \sigma^2)$. Show that $E(|X - \mu|) = \sigma\sqrt{2/\pi}$.

33. $X \sim N(1, 4)$ and $Y = e^X$.
(a) What is the density of $Y$?
(b) Compute $E(Y)$. (Hint: what is the MGF of $X$?)

34. $X \sim N(0,1)$ and $Y \mid X$ is $N(X, 1)$.
(a) Find (i) $E(Y)$, (ii) $\mathrm{Var}(Y)$, (iii) $\mathrm{Cov}(X, Y)$.
(b) Prove that the joint distribution of $X$ and $Y$ is normal.

35. $X$, $Y$, $U$ are independent random variables: $X \sim N(5, 4)$, $Y \sim \chi^2_1$, and $U$ is distributed Bernoulli with $P(U = 1) = 0.3$. Let $W = UX + (1 - U)Y$. Find the mean and variance of $W$.

36. Let $Y$ be the number of successes in $n$ independent trials with $p = 0.25$. Find the smallest value of $n$ such that $P(Y \geq 1) \geq 0.7$.

37. Suppose that the $3 \times 1$ vector $X$ is distributed $N(0, I_3)$, and let $V$ be a $2 \times 3$ nonrandom matrix that satisfies $VV' = I_2$. Let $Y = VX$.
(a) Show that $Y'Y \sim \chi^2_2$.
(b) Find $P(Y'Y > 6)$.

38. Let $l$ denote a $p \times 1$ vector of 1's, let $I_p$ denote the identity matrix of order $p$, suppose the $p \times 1$ vector $Y$ is distributed $Y \sim N(0, \sigma^2 I_p)$ with $\sigma^2 = 4$, and let $M = I_p - (1/p)ll'$.
(a) Show that $M$ is idempotent.
(b) Compute $E(Y'MY)$.
(c) Compute $\mathrm{var}(Y'MY)$.

39. Suppose that $X$ and $Y$ are two random variables with a joint normal distribution, and suppose $\mathrm{var}(X) = \mathrm{var}(Y)$. Let $U = X + Y$ and $V = X - Y$. Prove that $U$ and $V$ are independent.

40. Let $X \sim N_p(\mu, \Sigma)$. Also let $X = (X_1', X_2')'$, $\mu = (\mu_1', \mu_2')'$, and
$\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}$
be the partitions of $X$, $\mu$ and $\Sigma$ such that $X_1$ and $\mu_1$ are $k$-dimensional and $\Sigma_{11}$ is a $k \times k$ matrix. Show that $\Sigma_{12} = 0$ implies that $X_1$ and $X_2$ are independent.

41. Suppose $X_i$ are i.i.d. $N(\mu, \sigma^2)$ for $i = 1, \ldots, n$. Let $\bar{X} = \frac{1}{n}\sum_{i=1}^n X_i$ and $s^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar{X})^2$. Show each of the following. (Hint: let $l$ denote an $n \times 1$ vector of 1's. Note that $\bar{X} = (l'l)^{-1}l'X$, where $X = (X_1\ X_2\ \cdots\ X_n)'$, and $\sum_{i=1}^n (X_i - \bar{X})^2 = X'[I - l(l'l)^{-1}l']X$.)
(a) $\dfrac{\bar{X} - \mu}{\sqrt{\sigma^2/n}} \sim N(0, 1)$.
(b) $(n-1)s^2/\sigma^2 \sim \chi^2_{n-1}$.
(c) $\bar{X}$ and $(n-1)s^2/\sigma^2$ are independent.
(d) $\dfrac{(\bar{X} - \mu)/\sqrt{\sigma^2/n}}{\sqrt{[(n-1)s^2/\sigma^2]/(n-1)}} = \dfrac{\bar{X} - \mu}{\sqrt{s^2/n}} \sim t_{n-1}$.

42. Suppose $Y_i \sim$ i.i.d. $N(10, 4)$, and let $s^2 = (n-1)^{-1}\sum_{i=1}^n (Y_i - \bar{Y})^2$. Compute $E(\bar{Y} \mid s^2 = 5)$.

43. $X$ and $Y$ are random variables. The marginal distribution of $X$ is $\chi^2_1$. The conditional distribution of $Y \mid X = x$ is $N(x, x)$.
(a) Derive the mean and variance of $Y$.
(b) Suppose that you are told that $X < 1$. Derive the minimum mean square error forecast of $Y$.
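Problem 38(a)-(c) has the closed forms $E(Y'MY) = \sigma^2(p-1)$ and $\mathrm{var}(Y'MY) = 2\sigma^4(p-1)$, since $Y'MY/\sigma^2 \sim \chi^2_{p-1}$. A small simulation sketch, with illustrative values $p = 5$ and the problem's $\sigma^2 = 4$, checking both the idempotency and the moments:

```python
import numpy as np

rng = np.random.default_rng(2)
p, sigma2, reps = 5, 4.0, 200_000

l = np.ones((p, 1))
M = np.eye(p) - (1.0 / p) * (l @ l.T)
print("idempotent:", np.allclose(M @ M, M))    # part (a)

Y = rng.normal(0.0, np.sqrt(sigma2), (reps, p))
q = np.einsum("ij,jk,ik->i", Y, M, Y)          # Y'MY for each draw

# Y'MY/sigma2 ~ chi^2_(p-1): E = sigma2*(p-1) = 16, var = 2*sigma2^2*(p-1) = 128
print("E(Y'MY)  :", q.mean())
print("var(Y'MY):", q.var())
```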
44. $X$ and $Y$ have a bivariate normal distribution with parameters $\mu_X = 5$, $\mu_Y = 10$, $\sigma^2_X = 9$, $\sigma^2_Y = 4$, and $\rho_{XY} = 0.9$, where $\rho_{XY}$ is the correlation between $X$ and $Y$. Find the shortest interval of the real line, say $C$, such that $P(Y \in C \mid X = 7) = 0.90$.

45. Suppose that $X$ and $Y$ are two random vectors that satisfy
$E\begin{bmatrix} X \\ Y \end{bmatrix} = \begin{bmatrix} \mu_X \\ \mu_Y \end{bmatrix}$ and $\mathrm{var}\begin{bmatrix} X \\ Y \end{bmatrix} = \begin{bmatrix} \Sigma_{XX} & \Sigma_{XY} \\ \Sigma_{XY}' & \Sigma_{YY} \end{bmatrix}$,
where $X$ is $k \times 1$ and $Y$ is a scalar. Let $\hat{Y} = a + bX$.
(a) Derive the values of the scalar $a$ and the $1 \times k$ matrix $b$ that minimize $E[(Y - \hat{Y})^2]$.
(b) Find an expression for the minimized value of $E[(Y - \hat{Y})^2]$.

46. Suppose that $X_i$ are i.i.d. $N(0,1)$ for $i = 1, \ldots, n$, and let $s^2 = n^{-1}\sum_{i=1}^n X_i^2$.
(a) Show that $s^2 \xrightarrow{a.s.} 1$.
(b) Show that $s^2 \xrightarrow{p} 1$.
(c) Show that $s^2 \xrightarrow{ms} 1$.
(d) Suppose $n = 15$; find $P(s^2 \leq .73)$ using an exact calculation.
(e) Now let $\bar{s}^2 = (n-1)^{-1}\sum_{i=1}^n (X_i - \bar{X}_n)^2$, where $\bar{X}_n = n^{-1}\sum_{i=1}^n X_i$.
(i) Show $E(\bar{s}^2) = 1$.
(ii) Show $\bar{s}^2 \xrightarrow{p} 1$.

47. Suppose $X_t = \varepsilon_t + \varepsilon_{t-1}$, where the $\varepsilon_t$ are i.i.d. with mean 0 and variance 1.
(a) Find an upper bound for $P(|X_t| \geq 3)$.
(b) Show that $\frac{1}{n}\sum_{t=1}^n X_t \xrightarrow{p} 0$.
(c) Show that $\frac{1}{\sqrt{n}}\sum_{t=1}^n X_t \xrightarrow{d} N(0, V)$, and derive an expression for $V$.

48. Let $X$ have a Poisson distribution with $E(X) = 80$.
(a) Use Chebyshev's inequality to find a lower bound for $P(70 \leq X \leq 90)$.
(b) Is the answer the same for $P(70 < X < 90)$?

49. Suppose $X_i$, $i = 1, \ldots, n$, are a sequence of iid random variables with $E(X_i) = 0$ and $\mathrm{var}(X_i) = \sigma^2 < \infty$. Let $Y_i = iX_i$ and $\bar{Y} = n^{-2}\sum_{i=1}^n Y_i$. Prove $\bar{Y} \xrightarrow{p} 0$. (Hint: can you show mean square convergence?)

50. $X_i \sim U[0,1]$ for $i = 1, \ldots, n$. Let $Y = \sum_{i=1}^n X_i$. Suppose that $n = 100$. Use an appropriate approximation to estimate the value of $P(Y \leq 52)$.

51. Suppose $X_i \sim$ i.i.d. $N(0, \sigma^2)$ for $i = 1, \ldots, n$ and let
$Y = \dfrac{X_1}{\sqrt{\sum_{i=2}^n X_i^2/(n-1)}}$.
(a) Prove that $Y \sim t_{n-1}$.
(b) Prove that $Y \xrightarrow{d} Z$, where $Z \sim N(0,1)$ (as $n \to \infty$).

52. Let $Y \sim \chi^2_n$, let $X_n = n^{-1/2}Y - n^{1/2}$, and let $W_n = n^{-1}Y$.
(a) Prove that $X_n \xrightarrow{d} X \sim N(0, 2)$.
(b) Prove that $W_n \xrightarrow{p} 1$.
(c) Suppose that $n = 30$. Calculate $P(Y > 43.8)$.
(d) Use the result in (a) to approximate the answer in (c).

53. Let $\bar{X}_1$ and $\bar{X}_2$ denote the sample means from two independent randomly drawn samples of Bernoulli random variables. Both samples are of size $n$, and the Bernoulli population has $P(X = 1) = 0.5$. Use an appropriate approximation to determine how large $n$ must be so that $P(|\bar{X}_1 - \bar{X}_2| < .01) = .95$.

54. Suppose that $X_i \sim$ i.i.d. $N(0, \Sigma)$ for $i = 1, \ldots, n$, where $X_i$ is a $p \times 1$ vector. Let $\alpha$ denote a non-zero $p \times 1$ vector and let
$Y_n = \dfrac{\alpha' X_n X_n' \alpha}{(n-1)^{-1}\sum_{i=1}^{n-1} \alpha' X_i X_i' \alpha}$.
(a) Prove that $Y_n \sim F_{1,n-1}$. (Hint: recognize that $\alpha' X_n$ is a scalar.)
(b) Sketch a proof that $Y_n \xrightarrow{d} Y$, where $Y \sim \chi^2_1$ (as $n \to \infty$).

55. Let $Y \sim N_p(0, \lambda I_p)$, where $\lambda$ is a positive scalar. Let $Q$ denote a $p \times k$ nonstochastic matrix with full column rank, $P_Q = Q(Q'Q)^{-1}Q'$, and $M_Q = I - P_Q$. Finally, let $X_1 = Y'P_QY$ and $X_2 = Y'M_QY$. (A few useful matrix facts: (i) if $A$ and $B$ are two matrices with dimensions such that $AB$ and $BA$ are defined, then $\mathrm{tr}(AB) = \mathrm{tr}(BA)$; (ii) if $Q$ is an idempotent matrix, then $\mathrm{rank}(Q) = \mathrm{tr}(Q)$.)
(a) Prove that $E(X_2) = (p - k)\lambda$.
(b) Let $W = (X_1/X_2)((p-k)/k)$. Show that $W \sim F_{k,p-k}$.
(c) Suppose that $p = 30$ and $k = 15$. Find $P(W > 2.4)$.
(d) Suppose that $k = [p/2]$, where $[z]$ denotes the greatest integer less than or equal to $z$. Show that as $p \to \infty$, $\sqrt{p}(W - 1) \xrightarrow{d} R \sim N(0, V)$, and derive an expression for $V$.
(e) Use your answer in (d) to approximate the answer in (c).
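For Problem 50, the natural approximation is the CLT: $Y$ is approximately $N(n/2, n/12)$, so $P(Y \leq 52) \approx \Phi\big((52 - 50)/\sqrt{100/12}\big) \approx 0.76$. A short sketch comparing the approximation with a direct simulation:

```python
import numpy as np
from scipy import stats

n = 100
mu, var = n * 0.5, n / 12.0   # sum of n U[0,1] draws: mean n/2, variance n/12

# CLT approximation to P(Y <= 52)
print("CLT approx :", stats.norm.cdf((52 - mu) / np.sqrt(var)))   # ~ 0.756

# Monte Carlo check
rng = np.random.default_rng(3)
y = rng.random((100_000, n)).sum(axis=1)
print("Monte Carlo:", (y <= 52).mean())
```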
56. Suppose that $X_i$ is i.i.d. $N(\mu, 1)$ for $i = 1, \ldots, n$. The risk function for an estimator $\hat{\mu}$ is $R(\hat{\mu}, \mu) = E[(\hat{\mu} - \mu)^2]$.
(a) Let $\hat{\mu}_1 = \bar{X}$. Derive $R(\hat{\mu}_1, \mu)$.
(b) Let $\hat{\mu}_2 = \frac{1}{2}\bar{X}$. Derive $R(\hat{\mu}_2, \mu)$.
(c) Which estimator is better?
(d) (More challenging) Let $\hat{\mu}_3 = a\hat{\mu}_1 + (1 - a)\hat{\mu}_2$, where $a = 1$ if $\hat{\mu}_1^2 > \frac{1}{n}$ and $a = 0$ otherwise. Derive $R(\hat{\mu}_3, \mu)$. Is this estimator better than $\hat{\mu}_1$? Is it better than $\hat{\mu}_2$?

57. Assume that $Y_i \sim$ iid $N(\mu, 1)$ for $i = 1, \ldots, n$. Let $\hat{\mu} = 1(|\bar{Y}| > c)\bar{Y}$, where $c$ is a positive constant. (The notation $1(|\bar{Y}| > c)$ means that $1(|\bar{Y}| > c) = 1$ if $|\bar{Y}| > c$, and is equal to 0 otherwise.)
(a) Derive $E(\hat{\mu})$.
(b) Discuss the consistency of $\hat{\mu}$ when (i) $|\mu| > c$, (ii) $|\mu| < c$, and (iii) $|\mu| = c$.

58. Suppose $X$ is a Bernoulli random variable with $P(X = 1) = p$. Suppose that you have a sample of i.i.d. random variables from this distribution, $X_1, X_2, \ldots, X_n$. Let $p_0$ denote the true, but unknown, value of $p$.
(a) Write the log-likelihood function for $p$.
(b) Compute the score function for $p$ and show that the score has expected value 0 when evaluated at $p_0$.
(c) Compute the information $I$.
(d) Show that the maximum likelihood estimator is given by $\hat{p}_{MLE} = \bar{X}$.
(e) Show that $\hat{p}_{MLE}$ is unbiased.
(f) Show that $\sqrt{n}(\hat{p}_{MLE} - p_0) \xrightarrow{d} N(0, V)$, and derive an expression for $V$.
(g) Propose a consistent estimator of $V$.

59. Let $Y_i$, $i = 1, \ldots, n$, denote an i.i.d. sample of Bernoulli random variables with $P(Y_i = 1) = p$. I am interested in a parameter $\theta$ given by $\theta = e^p$.
(a) Construct the likelihood function for $\theta$.
(b) Show that $\hat{\theta}_{MLE} = e^{\bar{Y}}$.
(c) Show that $\hat{\theta}_{MLE}$ is consistent.
(d) Show that $\sqrt{n}(\hat{\theta}_{MLE} - \theta) \xrightarrow{d} N(0, V)$, and derive an expression for $V$.

60. Suppose $Y_i \mid \mu \sim$ i.i.d. $N(\mu, 1)$, $i = 1, \ldots, n$. Suppose that you know $\mu \leq 5$. Let $\hat{\mu}_{MLE}$ denote the value of $\mu$ that maximizes the likelihood subject to the constraint $\mu \leq 5$, and let $\bar{Y}$ denote the sample mean of the $Y_i$.
(a) Show that $\hat{\mu}_{MLE} = a(\bar{Y}) \cdot 5 + (1 - a(\bar{Y}))\bar{Y}$, where $a(\bar{Y}) = 1$ for $\bar{Y} \geq 5$ and $a(\bar{Y}) = 0$ otherwise.
(b) Suppose that $\mu = 3$. (i) Show that $a(\bar{Y}) \xrightarrow{p} 0$. (ii) Show that $\hat{\mu}_{MLE} \xrightarrow{p} \mu$. (iii) Show that $\sqrt{n}(\hat{\mu}_{MLE} - \mu) \xrightarrow{d} N(0, 1)$.
(c) Suppose that $\mu = 5$. Show that $\hat{\mu}_{MLE} \xrightarrow{p} \mu$.

61. $Y_i \sim$ i.i.d. $N(\mu, 1)$ for $i = 1, 3, 5, \ldots, n-1$; $Y_i \sim$ i.i.d. $N(\mu, 4)$ for $i = 2, 4, \ldots, n$, where $n$ is even; and $Y_i$ and $Y_j$ are independent for $i$ odd and $j$ even.
(a) Write the log-likelihood function.
(b) Derive the MLE of $\mu$.
(c) Is the MLE preferred to $\bar{Y}$ as an estimator of $\mu$? Explain. (Discuss the properties of the estimators. Are the estimators unbiased? Consistent? Asymptotically normal? Which estimator has lower quadratic risk?)

62. $Y_i \sim$ i.i.d. $N(\mu_Y, 1)$ and $X_i \sim$ i.i.d. $N(\mu_X, 1)$ for $i = 1, \ldots, n$. Suppose $X_i$ and $Y_j$ are independent for all $i$ and $j$. A researcher is interested in $\theta = \mu_Y \mu_X$. Let $\hat{\theta}$ denote the MLE of $\theta$.
(a) Show that $\hat{\theta} = \bar{X}\bar{Y}$.
(b) Suppose that $\mu_Y = 0$ and $\mu_X = 5$. Show that $\sqrt{n}\,\hat{\theta} \xrightarrow{d} W \sim N(0, V)$ and derive an expression for $V$.
(c) Suppose that $\mu_Y = 1$ and $\mu_X = 5$. Show that $\sqrt{n}(\hat{\theta} - 5) \xrightarrow{d} W \sim N(0, V)$ and derive an expression for $V$.

63. Suppose that $y_i$ is i.i.d. $(0, \sigma^2)$ for $i = 1, \ldots, n$ with $E(y_i^4) = \kappa$. Let $\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^n y_i^2$.
(a) Show that $\sqrt{n}(\hat{\sigma}^2 - \sigma^2) \xrightarrow{d} N(0, V_{\hat{\sigma}^2})$ and derive an expression for $V_{\hat{\sigma}^2}$.
(b) Suppose that $n = 100$, $\hat{\sigma}^2 = 2.1$ and $\frac{1}{n}\sum_{i=1}^n y_i^4 = 16$. Construct a 95% confidence interval for $\sigma$, where $\sigma$ is the standard deviation of $y$.
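One way to carry out the interval in Problem 63(b) is the delta method: given $\sqrt{n}(\hat{\sigma}^2 - \sigma^2) \xrightarrow{d} N(0, \kappa - \sigma^4)$, the map $\sigma = \sqrt{\sigma^2}$ has derivative $1/(2\sigma)$, so $se(\hat{\sigma}) \approx se(\hat{\sigma}^2)/(2\hat{\sigma})$. A sketch of that arithmetic with the numbers given in the problem:

```python
import numpy as np
from scipy import stats

n, sigma2_hat, kappa_hat = 100, 2.1, 16.0   # given in Problem 63(b)

# Asymptotic variance of sigma2_hat: V = kappa - sigma^4 (Problem 63(a))
V_hat = kappa_hat - sigma2_hat**2
se_sigma2 = np.sqrt(V_hat / n)

# Delta method for sigma = sqrt(sigma^2): d(sigma)/d(sigma^2) = 1/(2*sigma)
sigma_hat = np.sqrt(sigma2_hat)
se_sigma = se_sigma2 / (2.0 * sigma_hat)

z = stats.norm.ppf(0.975)
print("95% CI for sigma:", (sigma_hat - z * se_sigma, sigma_hat + z * se_sigma))
```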
64. Suppose that $X_i \sim$ i.i.d. $N(0, \sigma^2)$, $i = 1, \ldots, n$.
(a) Construct the MLE of $\sigma^2$.
(b) Prove that the estimator is unbiased.
(c) Compute the variance of the estimator.
(d) Prove that $\sqrt{n}(\hat{\sigma}^2_{MLE} - \sigma^2) \xrightarrow{d} N(0, V)$. Derive an expression for $V$.
(e) Suppose $\hat{\sigma}^2_{MLE} = 4$ and $n = 100$. Test $H_0: \sigma^2 = 3$ versus $H_a: \sigma^2 \neq 3$ using a Wald test with a size of 10%.
(f) Suppose $\hat{\sigma}^2_{MLE} = 4$ and $n = 100$. Construct a 95% confidence interval for $\sigma^2$.
(g) Suppose $\hat{\sigma}^2_{MLE} = 4$ and $n = 100$. Test $H_0: \sigma^2 = 3$ versus $H_a: \sigma^2 = 4$ using a test of your choosing. Set the size of the test at 5%. Justify your choice of test.
(h) Suppose that the random variables were iid but non-normal. Would the confidence interval that you constructed in (f) have the correct coverage probability (i.e., would it contain the true value of $\sigma^2$ 95% of the time)? How could you modify the procedure to make it robust to the assumption of normality (i.e., so that it provides the correct coverage probability even if the data were drawn from a non-normal distribution)?
(i) Construct the MLE of $\sigma$.
(j) Is the MLE of $\sigma$ unbiased? Explain.
(k) Prove that $\sqrt{n}(\hat{\sigma}_{MLE} - \sigma) \xrightarrow{d} N(0, W)$. Derive an expression for $W$.
(l) Suppose $\hat{\sigma}_{MLE} = 3$ and $n = 200$. Construct a 90% confidence interval for $\sigma$.

65. $Y_i$, $i = 1, \ldots, n$, are iid Bernoulli random variables with $P(Y_i = 1) = p$.
(a) Show that $\bar{Y}(1 - \bar{Y}) \xrightarrow{p} \mathrm{var}(Y)$, where $\mathrm{var}(Y)$ denotes the variance of $Y$.
(b) I am interested in $H_0: p = 0.5$ vs. $H_a: p > 0.5$. The sample size is $n = 100$, and I decide to reject the null hypothesis if $\sum_{i=1}^n Y_i > 55$. Use the central limit theorem to derive the approximate size of the test.

66. $X_i$, $i = 1, \ldots, n$, are distributed i.i.d. Bernoulli with $P(X_i = 1) = p$. Let $\bar{X}$ denote the sample mean.
(a) Show that $\bar{X}$ is an unbiased estimator of the mean of $X$.
(b) Let $\hat{\sigma}^2 = \bar{X}(1 - \bar{X})$ denote an estimator of $\sigma^2$, the variance of $X$.
(b.i) Show that $\hat{\sigma}^2$ is a biased estimator of $\sigma^2$.
(b.ii) Show that $\hat{\sigma}^2$ is a consistent estimator of $\sigma^2$.
(b.iii) Show that $\sqrt{n}(\hat{\sigma}^2 - \sigma^2) \xrightarrow{d} N(0, V)$ and derive an expression for $V$.
(c) Suppose $n = 100$ and $\bar{X} = 0.4$.
(c.i) Test $H_0: p = 0.35$ versus $H_a: p > 0.35$ using a test with size of 5%.
(c.ii) Construct a 95% confidence interval for $\sigma^2$.

67. Suppose $X_i$, $i = 1, \ldots, n$, are a sequence of independent Bernoulli random variables with $P(X = 1) = p$. Based on a sample of size $n$, a researcher decides to test the null hypothesis $H_0: p = 0.45$ using the following rule: reject $H_0$ if $\sum_{i=1}^n X_i \geq n/2$; otherwise, do not reject $H_0$.
(a) When $n = 2$, compute the exact size of this test.
(b) Write down an expression for the exact size when $n = 50$. Use the central limit theorem to get an approximate value of the size.
(c) Prove that $\lim_{n\to\infty} P(\sum_{i=1}^n X_i \geq n/2) = 0$ when $p = 0.45$.

68. $X_i \sim$ iid $N(0, \sigma^2)$ for $i = 1, \ldots, 10$. A researcher is interested in the hypothesis $H_0: \sigma^2 = 1$ versus $H_a: \sigma^2 > 1$. He decides to reject the null if $\frac{1}{10}\sum_{i=1}^{10} X_i^2 > 1.83$.
(a) What is the size of his test?
(b) Suppose that the alternative is true and that $\sigma^2 = 1.14$. What is the power of his test?
(c) Is the researcher using the best test? Explain.

69. $X_i$ are i.i.d. $N(0, \sigma^2)$. Let $s^2 = \frac{1}{n}\sum_{i=1}^n X_i^2$. I am interested in the competing hypotheses $H_0: \sigma^2 = 1$ versus $H_a: \sigma^2 > 1$. Suppose $n = 15$ and I decide to reject the null if $s^2 > 5/3$.
(a) What is the size of the test?
(b) Suppose that the true value of $\sigma^2$ is 1.5. What is the power of the test?
(c) Is this the most powerful test? Explain.

70. $X \sim N(0, \sigma^2)$. You are interested in the competing hypotheses $H_0: \sigma^2 = 1$ vs. $H_a: \sigma^2 = 2$. Based on a sample of size 1, you reject the null when $X^2 > 3.84$.
(a) Compute the size of the test.
(b) Compute the power of the test.
(c) Prove that this is the most powerful test.
(d) Let $\hat{\sigma}^2 = X^2$. (i) Show that $\hat{\sigma}^2$ is unbiased. (ii) Show that $\hat{\sigma}^2$ is the minimum variance unbiased estimator of $\sigma^2$.
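The size and power in Problem 68 reduce to chi-square tail probabilities: under $H_0$, $\sum X_i^2 \sim \chi^2_{10}$, while under $\sigma^2 = 1.14$, $\sum X_i^2 / 1.14 \sim \chi^2_{10}$. A two-line check with scipy:

```python
from scipy import stats

# Problem 68: reject when (1/10) * sum X_i^2 > 1.83, i.e. sum X_i^2 > 18.3
crit = 18.3

# (a) Size: under H0, sum X_i^2 ~ chi^2_10
print("size :", stats.chi2.sf(crit, df=10))          # ~ 0.05

# (b) Power at sigma^2 = 1.14: sum X_i^2 / 1.14 ~ chi^2_10
print("power:", stats.chi2.sf(crit / 1.14, df=10))   # ~ 0.10
```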
71. A coin is to be tossed 4 times, and the outcomes are used to test the null hypothesis $H_0: p = 1/2$ vs. $H_a: p > 1/2$, where $p$ denotes the probability that a "tails" appears. ("Tails" is one surface of the coin, and "heads" is the other surface.) I decide to reject the null hypothesis if "tails" appears in all four of the tosses.
(a) What is the size of this test?
(b) Sketch the power function of this test for values of $p > 1/2$.
(c) Is this the best test to use? Explain.

72. $Y_i$, $i = 1, \ldots, 100$, are i.i.d. Poisson random variables with a mean of $m$. I am interested in the competing hypotheses $H_0: m = 4$ versus $H_a: m > 4$. I decide to reject the null if $\bar{Y} > 4.2$. Use an appropriate large-sample approximation to compute the size of my test.

73. A researcher has 10 i.i.d. draws from a $N(0, \sigma^2)$ distribution, $(X_1, X_2, \ldots, X_{10})$. He is interested in the competing hypotheses $H_0: \sigma^2 = 1$ vs. $H_a: \sigma^2 = 4$. He plans to test these hypotheses using the test statistic $\tilde{\sigma}^2 = \frac{1}{10}\sum_{i=1}^{10} X_i^2$. The procedure is to reject $H_0$ in favor of $H_a$ if $\tilde{\sigma}^2 > 1.83$; otherwise $H_0$ will not be rejected.
(a) Determine the size of the test.
(b) Determine the power of the test.
(c) Is the researcher using the most powerful test given the size from (a)? If so, prove it. If not, provide a more powerful test.

74. Suppose that $X_i$ is i.i.d. with mean $\mu$, variance $\sigma^2$ and finite fourth moment. A random sample of 100 observations is collected, and you are given the following summary statistics: $\bar{X} = 1.23$ and $s^2 = 4.33$ (where $s^2 = \frac{1}{n-1}\sum_{i=1}^{100} (X_i - \bar{X})^2$ is the sample variance).
(a) Construct an approximate 95% confidence interval for $\mu$. Why is this an "approximate" confidence interval?
(b) Let $\alpha = \mu^2$. Construct an approximate 95% confidence interval for $\alpha$.

75. A researcher is interested in computing the likelihood ratio statistic
$LR = \dfrac{f(y, \theta_1)}{f(y, \theta_2)}$
but finds the calculation difficult because the density function $f(\cdot)$ is very complicated. However, it is easy to compute the joint density $g(y, x, \theta)$, where $x$ denotes another random variable. Unfortunately, the researcher has data on $y$ but not on $x$. Show that the researcher can compute $LR$ using the formula
$LR = \dfrac{f(y, \theta_1)}{f(y, \theta_2)} = E_{\theta_2}\!\left[\dfrac{g(y, x, \theta_1)}{g(y, x, \theta_2)} \,\Big|\, y\right]$,
where the notation $E_{\theta_2}[\,\cdot \mid y]$ means that the expectation should be taken using the distribution conditional on $y$ and using the parameter value $\theta_2$.

76. Show that the LR test for the normal mean is consistent. (Write the alternative as $H_a: \mu \neq \mu_0$, and show that the power of the test approaches 1 as $n \to \infty$ for any fixed value of $\mu \neq \mu_0$.)

77. Suppose $X_i \sim$ i.i.d. Bernoulli with parameter $p$, for $i = 1, \ldots, n$. I am interested in testing $H_0: p = 0.5$ versus $H_a: p > 0.5$. Let $\bar{X}$ denote the sample mean of the $X_i$.
(a) A researcher decides to reject the null if $\bar{X} > 0.5(1 + n^{-1/2})$. Derive the size of the test as $n \to \infty$.
(b) Suppose $p = 0.51$. What is the power of this test as $n \to \infty$?
(c) The researcher asks you if she should use another test instead of the test in (a). What advice do you have to offer her?

78. Consider two random variables $X$ and $Y$. The distribution of $X$ is uniform, $X \sim U[0.5, 1.0]$, and the distribution of $Y$ conditional on $X$ is normal, $Y \mid X \sim N(0, \lambda X)$.
(a) Provide an expression for the joint density of $X$ and $Y$, say $f_{X,Y}(x, y)$.
(b) Prove that $Z = Y/\sqrt{X}$ is distributed $Z \sim N(0, \lambda)$.
(c) You have a sample $\{X_i, Y_i\}_{i=1}^N$. Derive the log-likelihood function $L(\lambda \mid \{X_i, Y_i\}_{i=1}^N)$.
(d) Construct the MLE of $\lambda$.
(e) Suppose that $N = 10$ and $\sum_{i=1}^{10} Y_i^2/X_i = 4.0$. Construct a 95% confidence interval for $\lambda$.
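For Problem 72, the CLT approximation gives size $\approx P(Z > (4.2 - 4)/\sqrt{4/100}) = P(Z > 1) \approx 0.16$; since a sum of iid Poissons is itself Poisson, the exact size is also available as a check. A sketch of both calculations:

```python
import numpy as np
from scipy import stats

n, m0, cutoff = 100, 4.0, 4.2

# CLT: Ybar is approx N(m0, m0/n) under H0 (Poisson variance equals the mean)
z = (cutoff - m0) / np.sqrt(m0 / n)
print("approx size:", stats.norm.sf(z))               # ~ 0.159

# Exact check: sum of 100 iid Poisson(4) is Poisson(400); reject iff sum > 420
print("exact size :", stats.poisson.sf(420, n * m0))  # ~ 0.15
```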
79. Suppose $Y_i \sim$ i.i.d. with density $f(y, \theta)$ for $i = 1, \ldots, n$, where $\theta$ is the $2 \times 1$ vector $\theta = (\theta_1\ \theta_2)'$. Let $\theta_0 = (\theta_{1,0}\ \theta_{2,0})'$ denote the true value of $\theta$. Let $s_i(\theta)$ denote the score for the $i$'th observation, let $\mathrm{var}[s_i(\theta_0)] = I(\theta_0)$ denote the $2 \times 2$ information matrix, and let $S_n(\theta) = \sum_{i=1}^n s_i(\theta)$ denote the score for the observations $\{Y_i\}_{i=1}^n$. Suppose that the problem is "regular" in the sense that $n^{-1/2}S_n(\theta_0) \xrightarrow{d} N(0, I(\theta_0))$, $\sqrt{n}(\hat{\theta} - \theta_0) \xrightarrow{d} N(0, I(\theta_0)^{-1})$, and so forth, where $\hat{\theta}$ is the MLE of $\theta$.
Now suppose the likelihood is maximized not over $\theta$, but only over $\theta_2$, with $\theta_1$ taking on its true value $\theta_1 = \theta_{1,0}$. That is, let $\tilde{\theta}_2$ solve $\max_{\theta_2} L_n(\theta_{1,0}, \theta_2)$, where $L_n(\theta_1, \theta_2)$ is the log-likelihood.
(a) Show that $\sqrt{n}(\tilde{\theta}_2 - \theta_{2,0}) \xrightarrow{d} N(0, V)$ and derive an expression for $V$.
(b) Show that $\frac{1}{\sqrt{n}}\frac{\partial L_n(\theta_{1,0}, \tilde{\theta}_2)}{\partial \theta_1} \xrightarrow{d} N(0, \Omega)$ and derive an expression for $\Omega$. (Note: $\frac{\partial L_n(\theta_{1,0}, \tilde{\theta}_2)}{\partial \theta_1}$ is the first element of $S_n(\theta_{1,0}, \tilde{\theta}_2)$.)

80. Suppose that two random variables $X$ and $Y$ have joint density $f(x, y)$ and marginal densities $g(x)$ and $h(y)$; suppose that the distribution of $Y \mid X$ is $N(X, \sigma^2)$. Denote this conditional density by $f_{Y|X}(y \mid x)$.
(a) Show that $E(X \mid Y = y) = \dfrac{\int x f_{Y|X}(y \mid x) g(x)\,dx}{h(y)}$.
(b) Show that $\dfrac{\partial f_{Y|X}(y \mid x)}{\partial y} = \dfrac{1}{\sigma^2}(x - y) f_{Y|X}(y \mid x)$, and solve this for $x f_{Y|X}(y \mid x)$.
(c) Show that $\dfrac{\partial h(y)}{\partial y} = \int \dfrac{\partial f_{Y|X}(y \mid x)}{\partial y}\, g(x)\,dx$.
(d) Using your results from (a)-(c), show that $E(X \mid Y = y) = y + \sigma^2 S(y)$, where $S(y) = \partial \ln(h(y))/\partial y = \dfrac{\partial h(y)/\partial y}{h(y)}$.
Often this formula provides a simple method for computing $E(X \mid Y = y)$ when $f_{Y|X}(y \mid x)$ is $N(x, \sigma^2)$. (In the literature, this is called a "simple Bayes formula" for the conditional expectation.)

81. Suppose that a test statistic $T_n$ for testing $H_0: \theta = 0$ vs. $H_a: \theta \neq 0$, constructed from a sample of size $n$, is known to have a certain asymptotic null distribution: $T_n \xrightarrow{d,\,H_0} X \sim F$, where $X$ is a random variable with distribution function $F(x)$. Suppose that the distribution $F(x)$ depends on an unknown nuisance parameter $\rho$; represent this dependence by $F(x, \rho)$. Let $x(\alpha, \rho)$ solve $F(x(\alpha, \rho), \rho) = 1 - \alpha$, and suppose that $x(\alpha, \rho)$ is continuous in $\rho$. Let $\rho_0$ denote the true value of $\rho$ and let $\hat{\rho}$ denote a consistent estimator of $\rho$. Prove that the critical region defined by $T_n > x(\alpha, \hat{\rho})$ produces a test with asymptotic size equal to $\alpha$. (Such a test is said to be asymptotically similar.)

82. $X_i \sim$ i.i.d. $U[0, \theta]$ for $i = 1, \ldots, n$, and let $\bar{X} = \frac{1}{n}\sum_{i=1}^n X_i$.
(a) Show that $2\bar{X} \xrightarrow{p} \theta$.
(b) Show that $\sqrt{n}(2\bar{X} - \theta) \xrightarrow{d} N(0, V)$ and derive an expression for $V$.
(c) I am interested in the competing hypotheses $H_0: \theta = 1$ versus $H_a: \theta = 0.8$, and I decide to reject $H_0$ if $\bar{X} \leq 0.45$.
(c1) Using an appropriate approximation, compute the size of the test.
(c2) Using an appropriate approximation, compute the power of the test.
(c3) Is this the best test to use?

83. Suppose that $Y_t$ is generated by a "Moving Average-1" ($MA(1)$) process, $Y_t = \mu + \varepsilon_t + \theta\varepsilon_{t-1}$, where $\mu$ is a constant and $\varepsilon_t \sim$ iid$(0, \sigma^2)$ for all $t$. Let $\bar{Y}_n = n^{-1}\sum_{t=1}^n Y_t$. Prove that $\sqrt{n}(\bar{Y}_n - \mu) \xrightarrow{d} N(0, \sigma^2(1 + \theta)^2)$.
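The "simple Bayes formula" of Problem 80(d) is easy to see numerically when the prior $g$ is itself normal, an assumption made here purely for illustration, since then $h$ and $S(y)$ are available in closed form:

```python
import numpy as np

rng = np.random.default_rng(8)
tau2, sigma2, n = 2.0, 1.0, 2_000_000

# Illustrative prior g: X ~ N(0, tau2); conditional: Y | X ~ N(X, sigma2)
x = rng.normal(0.0, np.sqrt(tau2), n)
y = x + rng.normal(0.0, np.sqrt(sigma2), n)

# Marginal h is N(0, tau2 + sigma2), so S(y) = d log h / dy = -y / (tau2 + sigma2)
y0 = 1.5
tweedie = y0 + sigma2 * (-y0 / (tau2 + sigma2))   # y + sigma^2 * S(y)

# Direct Monte Carlo estimate of E(X | Y near y0) for comparison
window = np.abs(y - y0) < 0.02
print("simple Bayes formula:", tweedie)           # = y0*tau2/(tau2+sigma2) = 1.0
print("Monte Carlo         :", x[window].mean())
```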
84. A sample of size 30 from an iid $N(\mu, \sigma^2)$ population yields $\sum_{i=1}^{30} x_i = 50$ and $\sum_{i=1}^{30} x_i^2 = 120$.
(a) Calculate the MLEs of $\mu$ and $\sigma^2$.
(b) Suppose that you knew that $\mu = 2$. Calculate the MLE of $\sigma^2$.
(c) Construct the LR, LM and Wald tests of $H_0: \mu = 2$ vs. $H_a: \mu \neq 2$ using a 5% critical value.
(d) Construct the LR, LM and Wald tests of $H_0: \mu = 2, \sigma^2 = 1$ vs. $H_a: \mu \neq 2, \sigma^2 \neq 1$, using a 10% critical value.
(e) Construct a joint 95% approximate confidence set for $\mu$ and $\sigma^2$. Plot the resulting confidence ellipse.
(f) Calculate the MLE of $\theta = \mu/\sigma^2$. Calculate a 95% confidence interval for $\theta$.

85. An economist is flying to a conference where he will present a new research paper. While looking over his presentation he realizes that he failed to calculate the "p-value" for the key t-statistic in his paper. Unfortunately he doesn't have any statistical tables with him, but he does have a laptop computer and a program that can generate iid $N(0,1)$ random variables. He decides to estimate the p-value by simulating t-distributed random variables, constructed by transforming the standard normals produced by his random number generator. Specifically, he does the following.
Let $t$ denote the numerical value of the economist's t-statistic. The p-value that he wants to estimate is $P = \Pr(t_{df} > t)$, where $t_{df}$ is a t-distributed random variable with degrees of freedom equal to $df$. For his problem, the correct value is $df = 5$. The economist generates 100 draws from the $t_5$ distribution and estimates $P$ by the sample fraction of the draws that are larger than $t$. That is, letting $x_i$ denote the $i$'th draw from the $t_5$ distribution, and $q_i$ denote the indicator variable
$q_i = 1$ if $x_i > t$, and $q_i = 0$ otherwise,
$P$ is estimated as $\hat{P} = \frac{1}{100}\sum_{i=1}^{100} q_i$.
(a) If you had a random number generator that only generated iid $N(0,1)$ random variables, how would you generate a draw from the $t_5$ distribution?
(b) Prove that the $q_i$ are distributed as iid Bernoulli random variables with "success" probability equal to $P$.
(c) Compute the mean and variance of $\hat{P}$ as a function of $P$.
(d) Prove that $\hat{P}$ is a consistent estimator of $P$ (as the number of simulated values of $q_i \to \infty$).
(e) Suppose $\hat{P} = 0.04$. Construct a 90% confidence interval for $P$. (Hint: since the number of simulations is large ($N = 100$), you can use the central limit theorem.)
(f) Write the likelihood function for $P$ as a function of the sample $q_i$'s.
(g) Show that $\hat{P}$ is the MLE.
(h) The economist thinks that he remembers looking up the p-value for the statistic and seems to remember that the correct value was .03. How would you construct the most powerful test of the null hypothesis that $P = .03$ versus the alternative that $P = .04$?

86. Suppose $X_i$, $i = 1, \ldots, n$, are distributed i.i.d. $U[0, \theta]$ (that is, uniformly over 0 to $\theta$).
(a) What is the MLE of $\theta$?
(b) Derive the CDF of $\hat{\theta}_{MLE}$.
(c) Prove that $\hat{\theta}_{MLE} \xrightarrow{p} \theta_0$, where $\theta_0$ denotes the true value of $\theta$.
(d) I am interested in testing $H_0: \theta = 1$ vs. $H_a: \theta > 1$, and do this by using a critical region defined by $\max_{1 \leq i \leq n}\{X_i\} > c$. Suppose that the sample size is $n = 100$.
(i) Find the value of $c$ so that the test has a size of 5%.
(ii) Using this value of $c$, find the power of the test when $\theta = 1.02$.

87. $Y_i \sim$ i.i.d. with mean $\mu$ and variance $\sigma^2$ for $i = 1, \ldots, n$. A sample of size $n = 100$ yields $\frac{1}{100}\sum_{i=1}^{100} Y_i = 5$ and $\frac{1}{100}\sum_{i=1}^{100} Y_i^2 = 50$. Construct an approximate 95% confidence interval for $\mu$. (Justify the steps in your approximation.)
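Problem 85(a) has a standard answer: with six independent standard normals, $Z_0/\sqrt{(Z_1^2 + \cdots + Z_5^2)/5}$ is a $t_5$ draw. A sketch of the economist's whole procedure; the observed statistic $t = 2.0$ is a hypothetical value, not given in the problem:

```python
import numpy as np

rng = np.random.default_rng(9)
t_stat, df, reps = 2.0, 5, 100   # t_stat: hypothetical observed t-statistic

# Problem 85(a): a t_df draw from normals only: Z0 / sqrt((Z1^2+...+Zdf^2)/df)
z = rng.standard_normal((reps, df + 1))
t_draws = z[:, 0] / np.sqrt((z[:, 1:] ** 2).sum(axis=1) / df)

# P-hat = fraction of the 100 simulated t_5 draws exceeding the statistic
q = (t_draws > t_stat).astype(int)
print("P-hat:", q.mean())   # noisy estimate of P(t_5 > 2.0), true value ~ 0.051
```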
88. $Y_i$ are iid Bernoulli random variables with $P(Y_i = 1) = p$. I am interested in $H_0: p = 1/2$ vs. $H_a: p > 1/2$. The sample size is $n = 100$, and I decide to reject the null hypothesis if $\sum_{i=1}^n Y_i > 55$.
(a) Use the central limit theorem to derive the approximate size of the test.
(b) Suppose $p = 0.60$. Use the central limit theorem to derive the power of the test.

89. $Y \sim N(\mu, \sigma^2)$ and $X$ is a binary random variable that is equal to 1 if $Y > 2$ and equal to 0 otherwise. Let $p = P(X = 1)$. You have $n$ i.i.d. observations on $Y_i$; suppose the sample mean is $\bar{Y} = 4$ and the sample variance is $n^{-1}\sum_{i=1}^n (Y_i - \bar{Y})^2 = 9$.
(a) Construct an estimate of $p$.
(b) Explain how you would construct a 95% confidence interval for $p$.
(c) Is the estimator that you used in (a) consistent? Explain why or why not.

90. $Y_i \sim$ i.i.d. $N(0, \sigma^2)$, $i = 1, \ldots, n$. Show that $S = \sum_{i=1}^n Y_i^2$ is a sufficient statistic for $\sigma^2$.

91. Suppose $Y_i \mid \mu \sim$ iid $N(\mu, 1)$, $i = 1, \ldots, n$, with prior $\mu \sim N(2, 1)$, and let $L(\hat{\mu}, \mu) = (\hat{\mu} - \mu)^2$. Suppose $n = 1$.
(a) What is the MLE of $\mu$?
(b) Derive the Bayes risk of the MLE.
(c) Derive the posterior for $\mu$.
(d) Derive the Bayes estimator of $\mu$.
(e) Derive the Bayes risk of the Bayes estimator.
(f) Suppose $Y_1 = 1.5$.
(f.i) Construct a (frequentist) 95% confidence set for $\mu$.
(f.ii) Construct a (Bayes) 95% credible set for $\mu$.

92. Suppose $Y_i \sim$ i.i.d. $N(\mu, 1)$. Let $\bar{Y}$ denote the sample mean. Suppose loss is quadratic, i.e. $L(\hat{\mu}, \mu) = (\hat{\mu} - \mu)^2$. Show that $\frac{1}{2}\bar{Y}$ is admissible.

93. Suppose $Y_1$ and $Y_2$ are iid with mean $\mu$ and variance $\sigma^2$. Let $\hat{\mu} = 0.25Y_1 + 0.75Y_2$. Suppose that risk is mean squared error, so that $R(\hat{\mu}, \mu) = E[(\hat{\mu} - \mu)^2]$.
(a) Compute $R(\hat{\mu}, \mu)$.
(b) Show that $\hat{\mu}$ is inadmissible.

94. Assume that $Y_i \mid \mu \sim$ iid $N(\mu, 1)$ for $i = 1, \ldots, n$. Suppose that $\mu \sim g$ for some prior density $g$. Show that the posterior distribution of $\mu$ depends on $\{Y_i\}_{i=1}^n$ only through $\bar{Y}$.

95. Assume that $Y_i \mid \mu \sim$ iid $N(\mu, 1)$ for $i = 1, \ldots, n$. Suppose that $\mu \sim N(0, \lambda)$ with probability $p$, and $\mu = 0$ with probability $1 - p$. Derive $E(\mu \mid \bar{Y})$.

96. $Y_i$ is i.i.d. $N(\mu, \sigma^2)$ for $i = 1, \ldots, n$. Let $\bar{Y}$ denote the sample mean, and consider $\hat{\mu} = \bar{Y} + 2$ as an estimator of $\mu$. Suppose that the loss function is $L(\hat{\mu}, \mu) = (\hat{\mu} - \mu)^2$. Show that $\hat{\mu}$ is an inadmissible estimator of $\mu$.
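Problem 91 is the textbook normal-normal update: with one observation and a $N(2, 1)$ prior, the posterior is $N((2 + Y_1)/2,\ 1/2)$. A sketch of parts (c), (d), (f.i) and (f.ii) at $Y_1 = 1.5$:

```python
import numpy as np
from scipy import stats

# Problem 91 with n = 1: Y | mu ~ N(mu, 1), prior mu ~ N(2, 1), observed Y1 = 1.5
y1, prior_mean, prior_var, like_var = 1.5, 2.0, 1.0, 1.0

# Normal-normal update: precisions add, posterior mean is precision-weighted
post_var = 1.0 / (1.0 / prior_var + 1.0 / like_var)              # = 0.5
post_mean = post_var * (prior_mean / prior_var + y1 / like_var)  # = 1.75

z = stats.norm.ppf(0.975)
print("posterior (mean, var):", post_mean, post_var)
print("95% confidence set   :", (y1 - z, y1 + z))        # centered at the MLE
print("95% credible set     :", (post_mean - z * np.sqrt(post_var),
                                 post_mean + z * np.sqrt(post_var)))
```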