Take-Home Part of Final Exam

Stats 851.3, Stats 443.3
Final Examination
April, 2012
Take Home Questions
I. Let C denote the (k+1) × (k+1) matrix
$$C = \begin{pmatrix} A & \mathbf{1} \\ \mathbf{1}' & 0 \end{pmatrix}$$
where $\mathbf{1}$ denotes the k × 1 vector of 1's and A is a symmetric k × k matrix of rank k − 1 such that $\mathbf{1}'A = \mathbf{0}'$.
Find $C^{-1}$.
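For a numerical check of a conjectured form of $C^{-1}$, one concrete A satisfying the hypotheses can be used. The centering matrix $A = I - J/k$ (an illustrative choice only, not part of the problem) is symmetric, has rank k − 1, and satisfies $\mathbf{1}'A = \mathbf{0}'$:

```python
import numpy as np

k = 5
ones = np.ones((k, 1))
# One concrete A satisfying the hypotheses: the centering matrix
# A = I - J/k is symmetric, has rank k-1, and 1'A = 0'.
A = np.eye(k) - np.ones((k, k)) / k

# Assemble the (k+1) x (k+1) partitioned matrix C.
C = np.block([[A, ones],
              [ones.T, np.zeros((1, 1))]])

print("rank of A:", np.linalg.matrix_rank(A))            # k - 1
print("1'A = 0':", np.allclose(ones.T @ A, 0))
C_inv = np.linalg.inv(C)                                  # C is nonsingular here
print("C C^{-1} = I:", np.allclose(C @ C_inv, np.eye(k + 1)))
```

A conjectured closed form for $C^{-1}$ can be assembled with np.block in the same way and compared to C_inv with np.allclose.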
II. Find the inverse of the triangular matrix A:
$$A = \begin{pmatrix} I & J & J \\ 0 & I & J \\ 0 & 0 & I \end{pmatrix}$$
where I denotes the n × n identity matrix, J denotes the n × n matrix of 1's, and 0 denotes the n × n matrix of 0's.
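As in Problem I, a conjectured block form for $A^{-1}$ can be checked numerically for a small n (the value of n below is an arbitrary illustration):

```python
import numpy as np

n = 4
I = np.eye(n)
J = np.ones((n, n))
Z = np.zeros((n, n))

# The block upper-triangular matrix from Problem II.
A = np.block([[I, J, J],
              [Z, I, J],
              [Z, Z, I]])

A_inv = np.linalg.inv(A)
print(np.allclose(A @ A_inv, np.eye(3 * n)))  # True: A is nonsingular

# A conjectured block form for A^{-1} can be built with np.block in the same
# way and compared against A_inv using np.allclose.
```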
III. Consider the three p-variate Normal populations $N_p(\mu_1, \Sigma)$, $N_p(\mu_2, \Sigma)$ and $N_p(\mu_3, \Sigma)$. Also consider the two linear discriminant functions
$$W_{12} = x'\Sigma^{-1}(\mu_1 - \mu_2) - \tfrac{1}{2}(\mu_1 + \mu_2)'\Sigma^{-1}(\mu_1 - \mu_2)$$
$$W_{13} = x'\Sigma^{-1}(\mu_1 - \mu_3) - \tfrac{1}{2}(\mu_1 + \mu_3)'\Sigma^{-1}(\mu_1 - \mu_3)$$
(a) What is the joint distribution of $W_{12}$ and $W_{13}$?
(b) Suppose x is classified as coming from population 1 if $W_{12} \ge 0$ and $W_{13} \ge 0$. If $\mu_1 \ne \mu_2$ and x really came from population 3, what is the probability of misclassifying x into population 1? Find an expression for this probability.
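For intuition about part (b), the probability can be approximated by simulation once specific parameters are assumed. The values of $\mu_1$, $\mu_2$, $\mu_3$ and $\Sigma$ below are illustrative assumptions only, since the problem is stated for general parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameter values only; the problem is stated for general mu_i, Sigma.
mu1, mu2, mu3 = np.array([1.0, 0.0]), np.array([-1.0, 0.0]), np.array([0.0, 1.5])
Sigma = np.array([[1.0, 0.3], [0.3, 1.0]])
Sigma_inv = np.linalg.inv(Sigma)

def W(x, mu_a, mu_b):
    # W_ab = x' S^-1 (mu_a - mu_b) - 0.5 (mu_a + mu_b)' S^-1 (mu_a - mu_b)
    d = Sigma_inv @ (mu_a - mu_b)
    return x @ d - 0.5 * (mu_a + mu_b) @ d

# Draw x from population 3 and estimate P(classified into population 1).
x = rng.multivariate_normal(mu3, Sigma, size=200_000)
p_hat = np.mean((W(x, mu1, mu2) >= 0) & (W(x, mu1, mu3) >= 0))
print("estimated misclassification probability:", p_hat)
```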
IV. Let $x = (x_1, x_2, x_3)'$ and $y = (y_1, y_2, y_3)'$ denote two independent trivariate Normal random vectors with respective parameters
$$\mu_1 = \begin{pmatrix} 2 \\ 2 \\ 2 \end{pmatrix}, \quad \Sigma_1 = \begin{pmatrix} 3 & 2 & 1 \\ 2 & 4 & 1 \\ 1 & 1 & 2 \end{pmatrix} \quad \text{and} \quad \mu_2 = \begin{pmatrix} 3 \\ 4 \\ 2 \end{pmatrix}, \quad \Sigma_2 = \begin{pmatrix} 4 & 2 & 0 \\ 2 & 4 & 2 \\ 0 & 2 & 2 \end{pmatrix},$$
so that $x \sim N_3(\mu_1, \Sigma_1)$ and $y \sim N_3(\mu_2, \Sigma_2)$.
Find the distribution of
a. $x + y$
b. $\begin{pmatrix} x + y \\ x - y \end{pmatrix}$
c. $\begin{pmatrix} y \\ x \end{pmatrix}$
1 1 1 
2 1 1


Let A  
 , B  1 1 0 
0
1

1


0 1 1
Determine the distribution of
d. $Ax$
e. $Bx$
f. $\begin{pmatrix} Ax \\ By \end{pmatrix}$
g. $\begin{pmatrix} Ax \\ Bx \end{pmatrix}$
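All parts of IV follow from the fact that a linear transform $Lz + c$ of a Normal vector $z \sim N(\mu, \Sigma)$ is again Normal with mean $L\mu + c$ and covariance $L\Sigma L'$, together with the stacking of independent Normal vectors. As a numerical aid only (not the required derivation), the sketch below evaluates the mean and covariance for parts d and g using the parameters given above:

```python
import numpy as np

mu1 = np.array([2.0, 2.0, 2.0])
Sigma1 = np.array([[3.0, 2.0, 1.0],
                   [2.0, 4.0, 1.0],
                   [1.0, 1.0, 2.0]])

A = np.array([[2.0, 1.0, 1.0],
              [0.0, 1.0, 1.0]])
B = np.array([[1.0, 1.0, 1.0],
              [1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])

# d. Ax ~ N(A mu1, A Sigma1 A')
print("E[Ax] =", A @ mu1)
print("Cov(Ax) =\n", A @ Sigma1 @ A.T)

# g. (Ax, Bx)' ~ N(L mu1, L Sigma1 L') with L = [A; B] stacked
L = np.vstack([A, B])
print("E[(Ax, Bx)'] =", L @ mu1)
print("Cov((Ax, Bx)') =\n", L @ Sigma1 @ L.T)
```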
V. In the following study the researcher was interested in how the dependent variable Y was
related to X1, X2, and X3. The data is given below:
  X1     X2     X3      Y
 19.1   13.5    9.4   36.5
 19.5   12.4   11.7   34.6
 11.1   12.7   11.0   25.4
  7.6    7.7    8.8   24.1
 17.7   18.2    7.6   27.6
 14.6   15.3   10.4   25.6
 22.3   14.2   12.3   38.3
 15.3   11.2    8.7   30.8
Regression analysis was performed with every possible subset of the independent variables. In each case the total sum of squares is
$$SS_{\mathrm{Total}} = \sum_{i=1}^{8} (y_i - \bar{y})^2 = 212.97875$$
In addition, the proportion of variance R² for each regression analysis is given in the following table:

Variables in Equation   Estimates of Regression Coefficients                  R²
No variables            β₀ (30.4)                                             0.0000000
X1                      β₀ (14.4), β₁ (1.007)                                 0.7678094
X2                      β₀ (26.6), β₂ (0.284)                                 0.024821
X3                      β₀ (15.1), β₃ (1.532)                                 0.2049671
X1, X2                  β₀ (21.3), β₁ (1.365), β₂ (-0.964)                    0.9569737
X1, X3                  β₀ (10.2), β₁ (0.942), β₃ (0.517)                     0.7880561
X2, X3                  β₀ (10.5), β₂ (0.324), β₃ (1.562)                     0.2373178
X1, X2, X3              β₀ (21.7), β₁ (1.372), β₂ (-0.972), β₃ (-0.035)       0.9570531
a) Use R² to determine the "best" equation for predicting Y from X1, X2, X3.
b) Repeat a) using s², the mean square for error.
c) Repeat a) using Mallows' statistic $C_k$.
Note: Mallows' statistic is
$$C_k = \frac{RSS_k}{s^2_{\mathrm{complete}}} - [n - 2(k+1)]$$
where $RSS_k$ is the residual sum of squares with k variables in the equation and $s^2_{\mathrm{complete}}$ is the residual mean square with all variables in the equation.
d) Use stepwise regression to find the best equation, using critical values of 4.0 for F to enter and F to remove.
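The quantities needed in a) through c) can be reproduced from the data above. The sketch below is one possible implementation using plain numpy least squares: it fits every subset of {X1, X2, X3} and reports R², s², and Mallows' C_k for each.

```python
from itertools import combinations
import numpy as np

X = np.array([[19.1, 13.5,  9.4], [19.5, 12.4, 11.7], [11.1, 12.7, 11.0],
              [ 7.6,  7.7,  8.8], [17.7, 18.2,  7.6], [14.6, 15.3, 10.4],
              [22.3, 14.2, 12.3], [15.3, 11.2,  8.7]])
y = np.array([36.5, 34.6, 25.4, 24.1, 27.6, 25.6, 38.3, 30.8])
n = len(y)
ss_total = np.sum((y - y.mean()) ** 2)

def fit(cols):
    # Least-squares fit of y on an intercept plus the selected columns of X.
    Z = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    rss = np.sum((y - Z @ beta) ** 2)
    return beta, rss

# Residual mean square for the full model, used in Mallows' C_k.
_, rss_full = fit((0, 1, 2))
s2_complete = rss_full / (n - 4)

for k in range(4):
    for cols in combinations(range(3), k):
        beta, rss = fit(cols)
        r2 = 1 - rss / ss_total
        s2 = rss / (n - k - 1)
        ck = rss / s2_complete - (n - 2 * (k + 1))
        names = ", ".join(f"X{j+1}" for j in cols) or "No variables"
        print(f"{names:12s} R2={r2:.7f}  s2={s2:.4f}  Ck={ck:.3f}")
```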
VI. Let $k_1^2$ and $k_2^2$ be positive constants and let X be an n × p matrix of rank p ≤ n.
Prove that
$$V = k_1^2\, X(X'X)^{-1}X' + k_2^2\, [I - X(X'X)^{-1}X']$$
is positive definite.
Show that
$$V^{-1} = \frac{1}{k_1^2}\, X(X'X)^{-1}X' + \frac{1}{k_2^2}\, [I - X(X'X)^{-1}X'].$$
VII. If y is distributed $N(\mu, V)$, show that $Q = (y - \mu)'V^{-1}(y - \mu)$ is distributed as $\chi^2(p)$.

VIII. If y is distributed $N(0, I)$, find $E\left[y'\,(I - X(X'X)^{-1}X')\,y\right]$.
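The claims in VI, VII and VIII can be explored numerically before the proofs are written. The sketch below uses an arbitrary random X and arbitrary positive constants and parameters (illustrative assumptions only, not part of the problems): it verifies the positive definiteness and the stated inverse in VI, compares simulated values of Q from VII against a chi-square reference, and estimates the expectation in VIII by Monte Carlo.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# VI: V = k1^2 H + k2^2 (I - H) with H the projection onto the column space of X.
n, p = 8, 3
X = rng.standard_normal((n, p))
H = X @ np.linalg.inv(X.T @ X) @ X.T
k1sq, k2sq = 2.5, 0.7                      # arbitrary positive constants
V = k1sq * H + k2sq * (np.eye(n) - H)
print("V positive definite:", np.all(np.linalg.eigvalsh(V) > 0))
V_inv = (1 / k1sq) * H + (1 / k2sq) * (np.eye(n) - H)
print("stated inverse correct:", np.allclose(V @ V_inv, np.eye(n)))

# VII: Q = (y - mu)' V^{-1} (y - mu) for y ~ N(mu, V), compared with chi^2(n).
mu = rng.standard_normal(n)
y = rng.multivariate_normal(mu, V, size=100_000)
Q = np.einsum("ij,jk,ik->i", y - mu, np.linalg.inv(V), y - mu)
print("KS distance from chi^2:", stats.kstest(Q, "chi2", args=(n,)).statistic)

# VIII: Monte Carlo estimate of E[y'(I - H)y] for y ~ N(0, I).
z = rng.standard_normal((200_000, n))
print("estimated expectation:", np.mean(np.einsum("ij,jk,ik->i", z, np.eye(n) - H, z)))
```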
IX. The data given in Table 3.6 represents the voltage drop in the battery of a guided
missile motor during its time of flight.
(a) Fit a polynomial regression estimator to this data using automated methods to
select the degree of the polynomial. Through examination of estimated
coefficients, diagnostics, etc., determine any adjustments which should be made to
the estimator.
Table 3.6: Voltage Drop Data

   t        y        t        y        t        y
0.0000    8.33    0.0244    8.23    0.0488    7.17
0.0732    7.14    0.0976    7.31    0.1220    7.60
0.1463    7.94    0.1707    8.30    0.1951    8.76
0.2195    8.71    0.2439    9.71    0.2683   10.26
0.2927   10.91    0.3171   11.67    0.3415   11.76
0.3659   12.81    0.3902   13.30    0.4146   13.88
0.4390   14.59    0.4634   14.05    0.4878   14.48
0.5122   14.92    0.5366   14.37    0.5610   14.63
0.5854   15.18    0.6098   14.51    0.6341   14.34
0.6585   13.81    0.6829   13.79    0.7073   13.05
0.7317   13.04    0.7561   12.60    0.7805   12.05
0.8049   11.15    0.8293   11.15    0.8537   10.14
0.8780   10.08    0.9024    9.78    0.9268    9.80
0.9512    9.95    0.9756    9.51
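One possible workflow for part (a), sketched with plain numpy: fit polynomials of increasing degree by least squares and score them with an information criterion (AIC is used here as one automated choice) before examining the coefficients and residual diagnostics of the selected fit.

```python
import numpy as np

# Voltage drop data from Table 3.6, listed in time order.
t = np.array([0.0000, 0.0244, 0.0488, 0.0732, 0.0976, 0.1220, 0.1463, 0.1707,
              0.1951, 0.2195, 0.2439, 0.2683, 0.2927, 0.3171, 0.3415, 0.3659,
              0.3902, 0.4146, 0.4390, 0.4634, 0.4878, 0.5122, 0.5366, 0.5610,
              0.5854, 0.6098, 0.6341, 0.6585, 0.6829, 0.7073, 0.7317, 0.7561,
              0.7805, 0.8049, 0.8293, 0.8537, 0.8780, 0.9024, 0.9268, 0.9512,
              0.9756])
y = np.array([8.33, 8.23, 7.17, 7.14, 7.31, 7.60, 7.94, 8.30, 8.76, 8.71,
              9.71, 10.26, 10.91, 11.67, 11.76, 12.81, 13.30, 13.88, 14.59,
              14.05, 14.48, 14.92, 14.37, 14.63, 15.18, 14.51, 14.34, 13.81,
              13.79, 13.05, 13.04, 12.60, 12.05, 11.15, 11.15, 10.14, 10.08,
              9.78, 9.80, 9.95, 9.51])

n = len(y)
# Fit polynomials of increasing degree and score each with AIC.
for d in range(1, 9):
    coefs = np.polyfit(t, y, d)
    rss = np.sum((y - np.polyval(coefs, t)) ** 2)
    aic = n * np.log(rss / n) + 2 * (d + 1)
    print(f"degree {d}: RSS = {rss:8.3f}   AIC = {aic:8.2f}")
# Examine the coefficients and residuals of the selected degree, then consider
# adjustments such as dropping high-order terms or trying an alternative estimator.
```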
X. Consider the following data for 15 subjects with two predictors. The dependent variable, MARK, is the total score for a subject on an examination. The first predictor, COMP, is the score for the subject on a so-called compulsory paper. The other predictor, CERTIF, is the score for the subject on a previous exam. Fit a model for predicting y (MARK) from x1 (COMP) and x2 (CERTIF).
Candidate   MARK   COMP   CERTIF     Candidate   MARK   COMP   CERTIF
    1        476    111     68           9        645    117     59
    2        457     92     46          10        556     94     97
    3        540     90     50          11        634    130     57
    4        551    107     59          12        637    118     51
    5        575     98     50          13        390     91     44
    6        698    150     66          14        562    118     61
    7        545    118     54          15        560    109     66
    8        574    110     51
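A minimal sketch of the requested fit using numpy least squares on the 15 candidates above (any regression routine would serve equally well):

```python
import numpy as np

mark   = np.array([476, 457, 540, 551, 575, 698, 545, 574,
                   645, 556, 634, 637, 390, 562, 560], dtype=float)
comp   = np.array([111,  92,  90, 107,  98, 150, 118, 110,
                   117,  94, 130, 118,  91, 118, 109], dtype=float)
certif = np.array([ 68,  46,  50,  59,  50,  66,  54,  51,
                     59,  97,  57,  51,  44,  61,  66], dtype=float)

# Design matrix with an intercept, x1 = COMP, x2 = CERTIF.
Z = np.column_stack([np.ones_like(mark), comp, certif])
beta, *_ = np.linalg.lstsq(Z, mark, rcond=None)

fitted = Z @ beta
r2 = 1 - np.sum((mark - fitted) ** 2) / np.sum((mark - mark.mean()) ** 2)
print("b0, b1, b2 =", beta)
print("R^2 =", r2)
```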