Stats 851.3, Stats 443.3
Final Examination, April 2012
Take-Home Questions

I. Let C denote the (k+1) × (k+1) matrix

    C = \begin{pmatrix} A & \mathbf{1} \\ \mathbf{1}' & 0 \end{pmatrix}

where \mathbf{1} denotes the k × 1 vector of 1's and A is a symmetric k × k matrix of rank k − 1 such that \mathbf{1}'A = \mathbf{0}'. Find C⁻¹.

II. Find the inverse of the triangular matrix

    A = \begin{pmatrix} I & J & J \\ 0 & I & J \\ 0 & 0 & I \end{pmatrix}

where I denotes the n × n identity matrix, J denotes the n × n matrix of 1's, and 0 denotes the n × n matrix of 0's.

III. Consider the three p-variate Normal populations N(μ₁, Σ), N(μ₂, Σ) and N(μ₃, Σ). Also consider the two linear discriminant functions

    W_{12} = x' \Sigma^{-1} (\mu_1 - \mu_2) - \tfrac{1}{2} (\mu_1 + \mu_2)' \Sigma^{-1} (\mu_1 - \mu_2)
    W_{13} = x' \Sigma^{-1} (\mu_1 - \mu_3) - \tfrac{1}{2} (\mu_1 + \mu_3)' \Sigma^{-1} (\mu_1 - \mu_3)

(a) What is the joint distribution of W₁₂ and W₁₃?
(b) Suppose x is classified as coming from population 1 if W₁₂ ≥ 0 and W₁₃ ≥ 0. If μ₁ ≠ μ₂ and x really came from population 3, what is the probability of misclassifying x into population 1? Find an expression for this probability.

IV. Let x = (x₁, x₂, x₃)' and y = (y₁, y₂, y₃)' denote two independent trivariate Normal random vectors with respective mean vectors μ_x, μ_y and covariance matrices Σ_x, Σ_y. Find the distribution of
(a) x + y and x − y
(b) x − y and y
(c) x̄
Now let A and B denote given matrices of conformable dimensions. Determine the distribution of
(d) Ax
(e) Bx and Ax
(f) By and Ax
(g) Bx

V. In the following study the researcher was interested in how the dependent variable Y is related to X1, X2, and X3. The data are given below:

    X1   19.1  19.5  11.1   7.6  17.7  14.6  22.3  15.3
    X2   13.5  12.4  12.7   7.7  18.2  15.3  14.2  11.2
    X3    9.4  11.7  11.0   8.8   7.6  10.4  12.3   8.7
    Y    36.5  34.6  25.4  24.1  27.6  25.6  38.3  30.8

A regression analysis was performed with every possible subset of the independent variables. In each case the total sum of squares is

    SS_{Total} = \sum_{i=1}^{8} (y_i - \bar{y})^2 = 212.97875.

The estimated regression coefficients and the proportion of variance explained, R², for each regression are given in the following table:

    Variables in equation   Fitted equation                                      R²
    No variables            Ŷ = 30.4                                             0.0000000
    X1                      Ŷ = 14.4 + 1.007 X1                                  0.7678094
    X2                      Ŷ = 26.6 + 0.284 X2                                  0.024821
    X3                      Ŷ = 15.1 + 1.532 X3                                  0.2049671
    X1, X2                  Ŷ = 21.3 + 1.365 X1 − 0.964 X2                       0.9569737
    X1, X3                  Ŷ = 10.2 + 0.942 X1 + 0.517 X3                       0.7880561
    X2, X3                  Ŷ = 10.5 + 0.324 X2 + 1.562 X3                       0.2373178
    X1, X2, X3              Ŷ = 21.7 + 1.372 X1 − 0.972 X2 − 0.035 X3            0.9570531

(a) Use R² to determine the "best" equation for predicting Y from X1, X2, and X3.
(b) Repeat (a) using s², the mean square for error.
(c) Repeat (a) using Mallows' statistic

    C_k = \frac{RSS_k}{s^2_{complete}} - [n - 2(k+1)],

where RSS_k is the residual sum of squares with k variables in the equation and s²_complete is the residual mean square with all variables in the equation.
(d) Use stepwise regression to find the best equation, using a critical value of 4.0 for both F-to-enter and F-to-remove.
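The subset summaries in Question V can be reproduced directly from the data. The sketch below is a minimal Python illustration (assuming NumPy is available; the helper fit_subset and the printed layout are illustrative, not part of the question) that fits every subset of X1, X2, X3 by least squares and reports R², the error mean square s², and Mallows' C_k exactly as defined in part (c).

    import itertools
    import numpy as np

    # Data from Question V (n = 8 observations)
    X1 = np.array([19.1, 19.5, 11.1,  7.6, 17.7, 14.6, 22.3, 15.3])
    X2 = np.array([13.5, 12.4, 12.7,  7.7, 18.2, 15.3, 14.2, 11.2])
    X3 = np.array([ 9.4, 11.7, 11.0,  8.8,  7.6, 10.4, 12.3,  8.7])
    Y  = np.array([36.5, 34.6, 25.4, 24.1, 27.6, 25.6, 38.3, 30.8])

    predictors = {"X1": X1, "X2": X2, "X3": X3}
    n = len(Y)
    ss_total = np.sum((Y - Y.mean()) ** 2)        # SS_Total = 212.97875

    def fit_subset(names):
        """Least-squares fit of Y on an intercept plus the named predictors;
        returns the residual sum of squares."""
        X = np.column_stack([np.ones(n)] + [predictors[v] for v in names])
        beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
        return np.sum((Y - X @ beta) ** 2)

    # Residual mean square of the complete model, used in Mallows' C_k
    rss_complete = fit_subset(("X1", "X2", "X3"))
    s2_complete = rss_complete / (n - 4)

    print(f"{'variables':<14}{'R^2':>10}{'s^2':>10}{'C_k':>8}")
    for k in range(4):
        for names in itertools.combinations(("X1", "X2", "X3"), k):
            rss = fit_subset(names)
            r2 = 1.0 - rss / ss_total                    # proportion of variance
            s2 = rss / (n - k - 1)                       # mean square for error
            ck = rss / s2_complete - (n - 2 * (k + 1))   # Mallows' statistic
            print(f"{', '.join(names) or 'none':<14}{r2:>10.4f}{s2:>10.3f}{ck:>8.2f}")

Part (d) can then be carried out by adding or deleting one variable at a time, entering a variable only when its partial F statistic exceeds 4.0 and removing one whenever its partial F statistic falls below 4.0.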
VI. Let k₁² and k₂² be positive constants and let X be an n × p matrix of rank p ≤ n. Prove that

    V = k_1^2 X(X'X)^{-1}X' + k_2^2 [I - X(X'X)^{-1}X']

is positive definite. Show that

    V^{-1} = \frac{1}{k_1^2} X(X'X)^{-1}X' + \frac{1}{k_2^2} [I - X(X'X)^{-1}X'].

VII. If the p × 1 vector y is distributed N(μ, V), show that Q = (y − μ)'V⁻¹(y − μ) is distributed as χ²(p).

VIII. If y is distributed N(0, I), find E[ y'(I − X(X'X)⁻¹X')y ].

IX. The data given in Table 3.6 represent the voltage drop in the battery of a guided missile motor during its time of flight.
(a) Fit a polynomial regression estimator to these data, using automated methods to select the degree of the polynomial. Through examination of the estimated coefficients, diagnostics, etc., determine any adjustments that should be made to the estimator.

Table 3.6: Voltage Drop Data

    t        y        t        y        t        y
    0        8.33     0.0244   8.23     0.0488   7.17
    0.0732   7.14     0.0976   7.31     0.122    7.6
    0.1463   7.94     0.1707   8.3      0.1951   8.76
    0.2195   8.71     0.2439   9.71     0.2683  10.26
    0.2927  10.91     0.3171  11.67     0.3415  11.76
    0.3659  12.81     0.3902  13.3      0.4146  13.88
    0.439   14.59     0.4634  14.05     0.4878  14.48
    0.5122  14.92     0.5366  14.37     0.561   14.63
    0.5854  15.18     0.6098  14.51     0.6341  14.34
    0.6585  13.81     0.6829  13.79     0.7073  13.05
    0.7317  13.04     0.7561  12.6      0.7805  12.05
    0.8049  11.15     0.8293  11.15     0.8537  10.14
    0.878   10.08     0.9024   9.78     0.9268   9.8
    0.9512   9.95     0.9756   9.51

X. Consider the following data for 15 subjects with two predictors. The dependent variable, MARK, is a subject's total score on an examination. The first predictor, COMP, is the subject's score on a so-called compulsory paper, and the other predictor, CERTIF, is the subject's score on a previous exam. Fit a model for predicting y (MARK) from x1 (COMP) and x2 (CERTIF).

    Candidate  MARK  COMP  CERTIF     Candidate  MARK  COMP  CERTIF
    1          476   111   68         9          645   117   59
    2          457    92   46         10         556    94   97
    3          540    90   50         11         634   130   57
    4          551   107   59         12         637   118   51
    5          575    98   50         13         390    91   44
    6          698   150   66         14         562   118   61
    7          545   118   54         15         560   109   66
    8          574   110   51
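Question X can likewise be checked numerically. The sketch below is a minimal illustration (again Python with NumPy; the variable names are assumptions for illustration) that fits MARK on COMP and CERTIF by ordinary least squares through the normal equations and reports the coefficient estimates, their standard errors, and R².

    import numpy as np

    # Data from Question X (15 candidates)
    mark   = np.array([476, 457, 540, 551, 575, 698, 545, 574,
                       645, 556, 634, 637, 390, 562, 560], dtype=float)
    comp   = np.array([111,  92,  90, 107,  98, 150, 118, 110,
                       117,  94, 130, 118,  91, 118, 109], dtype=float)
    certif = np.array([ 68,  46,  50,  59,  50,  66,  54,  51,
                        59,  97,  57,  51,  44,  61,  66], dtype=float)

    # Design matrix with an intercept: MARK = b0 + b1*COMP + b2*CERTIF + error
    X = np.column_stack([np.ones(len(mark)), comp, certif])
    y = mark

    # Ordinary least squares via the normal equations (X'X) b = X'y
    xtx_inv = np.linalg.inv(X.T @ X)
    b = xtx_inv @ X.T @ y

    # Residual mean square, standard errors of the coefficients, and R^2
    resid = y - X @ b
    n, p = X.shape
    s2 = resid @ resid / (n - p)
    se = np.sqrt(s2 * np.diag(xtx_inv))
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

    for name, est, s in zip(("intercept", "COMP", "CERTIF"), b, se):
        print(f"{name:>9}: estimate = {est:8.3f}   std. error = {s:7.3f}")
    print(f"R^2 = {r2:.4f}")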