Regression Analysis

3. Linear Models

Example 3.1: Yield of a chemical process
    Yield (%)   Temperature (°F)   Time (hr)
       Y              X1              X2
      77             160               1
      82             165               3
      84             165               2
      89             170               1
      94             175               2
Unified approach
- regression analysis
- analysis of variance
- balanced factorial experiments

Extensions
- analysis of unbalanced experiments
- models with correlated responses (split plot designs, repeated measures)
- models with fixed and random components (mixed models)
A multiple linear regression model:

    Y_i = β0 + β1 X1i + β2 X2i + ε_i,    i = 1, 2, 3, 4, 5
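A minimal numerical sketch of this fit (using Python/numpy, which the notes do not assume; the data are from the table above):

```python
import numpy as np

# Data from Example 3.1: yield, temperature, time.
y  = np.array([77.0, 82.0, 84.0, 89.0, 94.0])
x1 = np.array([160.0, 165.0, 165.0, 170.0, 175.0])
x2 = np.array([1.0, 3.0, 2.0, 1.0, 2.0])

# Model matrix with an intercept column: Y_i = b0 + b1*X1i + b2*X2i + e_i.
X = np.column_stack([np.ones(5), x1, x2])

# Ordinary least squares fit.
b, *_ = np.linalg.lstsq(X, y, rcond=None)
yhat = X @ b
```

The residuals from a least squares fit are orthogonal to every column of X, which is a quick way to sanity-check the computation.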
Matrix formulation:

\[
\begin{bmatrix} Y_1\\ Y_2\\ Y_3\\ Y_4\\ Y_5 \end{bmatrix}
=
\begin{bmatrix}
\beta_0 + \beta_1 X_{11} + \beta_2 X_{21} + \epsilon_1\\
\beta_0 + \beta_1 X_{12} + \beta_2 X_{22} + \epsilon_2\\
\beta_0 + \beta_1 X_{13} + \beta_2 X_{23} + \epsilon_3\\
\beta_0 + \beta_1 X_{14} + \beta_2 X_{24} + \epsilon_4\\
\beta_0 + \beta_1 X_{15} + \beta_2 X_{25} + \epsilon_5
\end{bmatrix}
=
\begin{bmatrix}
\beta_0 + \beta_1 X_{11} + \beta_2 X_{21}\\
\beta_0 + \beta_1 X_{12} + \beta_2 X_{22}\\
\beta_0 + \beta_1 X_{13} + \beta_2 X_{23}\\
\beta_0 + \beta_1 X_{14} + \beta_2 X_{24}\\
\beta_0 + \beta_1 X_{15} + \beta_2 X_{25}
\end{bmatrix}
+
\begin{bmatrix} \epsilon_1\\ \epsilon_2\\ \epsilon_3\\ \epsilon_4\\ \epsilon_5 \end{bmatrix}
\]

\[
=
\begin{bmatrix}
1 & X_{11} & X_{21}\\
1 & X_{12} & X_{22}\\
1 & X_{13} & X_{23}\\
1 & X_{14} & X_{24}\\
1 & X_{15} & X_{25}
\end{bmatrix}
\begin{bmatrix} \beta_0\\ \beta_1\\ \beta_2 \end{bmatrix}
+
\begin{bmatrix} \epsilon_1\\ \epsilon_2\\ \epsilon_3\\ \epsilon_4\\ \epsilon_5 \end{bmatrix}
\]

Analysis of Variance (ANOVA)
Source of      d.f.   Sums of                                   Mean
Variation             Squares                                   Squares
Model           2     SS_model = Σ_{i=1}^{5} (Ŷ_i − Ȳ.)²        (1/2) SS_model
Error           2     SS_error = Σ_{i=1}^{5} (Y_i − Ŷ_i)²       (1/2) SS_error
C. total        4     Σ_{i=1}^{5} (Y_i − Ȳ.)²

where

    Ȳ.  = (1/n) Σ_{i=1}^{n} Y_i
    Ŷ_i = b0 + b1 X1i + b2 X2i
    n   = total number of observations
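The sums of squares above can be checked numerically; a short numpy sketch (not part of the notes) for the Example 3.1 data:

```python
import numpy as np

# Example 3.1 data: yield, with temperature and time as regressors.
y = np.array([77.0, 82.0, 84.0, 89.0, 94.0])
X = np.column_stack([np.ones(5),
                     [160, 165, 165, 170, 175],
                     [1, 3, 2, 1, 2]]).astype(float)

b, *_ = np.linalg.lstsq(X, y, rcond=None)
yhat = X @ b
ybar = y.mean()

ss_model = np.sum((yhat - ybar) ** 2)   # 2 d.f.
ss_error = np.sum((y - yhat) ** 2)      # 2 d.f.
ss_total = np.sum((y - ybar) ** 2)      # 4 d.f.

ms_model = ss_model / 2
ms_error = ss_error / 2
```

With an intercept in the model, the model and error sums of squares add to the corrected total.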
Example 3.2. Blood coagulation times (in seconds) for blood samples from six
different rats. Each rat was fed one of three diets.

    Diet 1      Diet 2      Diet 3
    Y11 = 62    Y21 = 71    Y31 = 72
    Y12 = 60                Y32 = 68
                            Y33 = 67

Here Y_ij is the observed time for the j-th rat fed the i-th diet, and μ_i is
the mean time for rats given the i-th diet.
"Means" model:

    Y_ij = μ_i + ε_ij

where ε_ij is a random error. You can express this model as

\[
\begin{bmatrix} Y_{11}\\ Y_{12}\\ Y_{21}\\ Y_{31}\\ Y_{32}\\ Y_{33} \end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0\\
1 & 0 & 0\\
0 & 1 & 0\\
0 & 0 & 1\\
0 & 0 & 1\\
0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} \mu_1\\ \mu_2\\ \mu_3 \end{bmatrix}
+
\begin{bmatrix} \epsilon_{11}\\ \epsilon_{12}\\ \epsilon_{21}\\ \epsilon_{31}\\ \epsilon_{32}\\ \epsilon_{33} \end{bmatrix}
\]

i.e., Y = Xμ + ε. Assuming that E(ε_ij) = 0 for all (i, j), this is a linear
model with E(ε) = 0, so that

    E(Y) = Xμ    and    Var(Y) = Var(ε)
An "effects" model:

    Y_ij = μ + α_i + ε_ij

This can be expressed as

\[
\begin{bmatrix} Y_{11}\\ Y_{12}\\ Y_{21}\\ Y_{31}\\ Y_{32}\\ Y_{33} \end{bmatrix}
=
\begin{bmatrix}
1 & 1 & 0 & 0\\
1 & 1 & 0 & 0\\
1 & 0 & 1 & 0\\
1 & 0 & 0 & 1\\
1 & 0 & 0 & 1\\
1 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} \mu\\ \alpha_1\\ \alpha_2\\ \alpha_3 \end{bmatrix}
+
\begin{bmatrix} \epsilon_{11}\\ \epsilon_{12}\\ \epsilon_{21}\\ \epsilon_{31}\\ \epsilon_{32}\\ \epsilon_{33} \end{bmatrix}
\]

or

    Y = Xβ + ε.
This is a linear model with

    E(Y) = Xβ,    where βᵀ = (μ, α1, α2, α3)
    and εᵀ = (ε11, ε12, ε21, ε31, ε32, ε33).
You could add the assumptions
- independent errors
- homogeneous variance, i.e., Var(ε_ij) = σ²

to obtain a linear model Y = Xβ + ε with

    E(Y) = Xβ    and    Var(Y) = Var(ε) = σ² I
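The effects model simply adds a column of ones to the means-model X, which makes its four columns linearly dependent; a quick numpy sketch (numpy assumed) of the two model matrices for Example 3.2:

```python
import numpy as np

# "Means" model matrix for Example 3.2: one indicator column per diet.
X_means = np.array([[1, 0, 0],
                    [1, 0, 0],
                    [0, 1, 0],
                    [0, 0, 1],
                    [0, 0, 1],
                    [0, 0, 1]], dtype=float)

# "Effects" model matrix: an overall-mean column plus the same indicators.
X_eff = np.column_stack([np.ones(6), X_means])

rank_means = np.linalg.matrix_rank(X_means)  # 3: full column rank
rank_eff   = np.linalg.matrix_rank(X_eff)    # 3 < 4 columns: X'X is singular
```

The rank deficiency of the effects-model X is exactly why generalized inverses appear later in this chapter.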
Analysis of Variance (ANOVA)

Source of    d.f.                 Sums of                                          Mean
Variation                         Squares                                          Squares
Diets        3 − 1 = 2            SS_diets = Σ_{i=1}^{3} n_i (Ȳ_i. − Ȳ..)²         (1/2) SS_diets
Error        Σ_{i=1}^{3} (n_i − 1) = 3   SS_error = Σ_{i=1}^{3} Σ_{j=1}^{n_i} (Y_ij − Ȳ_i.)²   (1/3) SS_error
C. total     n. − 1 = 5           Σ_{i=1}^{3} Σ_{j=1}^{n_i} (Y_ij − Ȳ..)²

where

    n_i  = number of rats fed the i-th diet
    Ȳ_i. = (1/n_i) Σ_{j=1}^{n_i} Y_ij
    Ȳ..  = (1/n.) Σ_{i=1}^{3} Σ_{j=1}^{n_i} Y_ij
    n.   = Σ_{i=1}^{3} n_i = total number of observations
Example 3.3. A 2×2 factorial experiment.

Experimental units: 8 plots with 5 trees per plot.
Factor 1: Variety (A or B)
Factor 2: Fungicide use (new or old)
Response: Percentage of apples with spots
    Variety   Fungicide use   Percentage of apples with spots
    A         new             Y111 =  4.6
    A         new             Y112 =  7.4
    A         old             Y121 = 18.3
    A         old             Y122 = 15.7
    B         new             Y211 =  9.8
    B         new             Y212 = 14.2
    B         old             Y221 = 21.1
    B         old             Y222 = 18.9
Linear model:

    Y_ijk = μ + α_i + β_j + δ_ij + ε_ijk

where Y_ijk is the percent of apples with spots, α_i is the variety effect
(i = 1, 2), β_j is the fungicide use effect (j = 1, 2), δ_ij is the
interaction effect, and ε_ijk is a random error.
Analysis of Variance (ANOVA)

Source of Variation       d.f.               Sums of Squares
Varieties                 2 − 1 = 1          4 Σ_{i=1}^{2} (Ȳ_i.. − Ȳ...)²
Fungicide use             2 − 1 = 1          4 Σ_{j=1}^{2} (Ȳ_.j. − Ȳ...)²
Variety × Fung. use       (2 − 1)(2 − 1) = 1 2 Σ_{i=1}^{2} Σ_{j=1}^{2} (Ȳ_ij. − Ȳ_i.. − Ȳ_.j. + Ȳ...)²
  interaction
Error                     4(2 − 1) = 4       Σ_i Σ_j Σ_k (Y_ijk − Ȳ_ij.)²
Corrected total           8 − 1 = 7          Σ_i Σ_j Σ_k (Y_ijk − Ȳ...)²

Here we use 9 parameters,

    βᵀ = ( μ  α1  α2  β1  β2  δ11  δ12  δ21  δ22 ),

to represent the 4 response means,

    E(Y_ijk) = μ_ij,    i = 1, 2  and  j = 1, 2,

corresponding to the 4 combinations of levels of the two factors.
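A numerical sketch of this ANOVA for the Example 3.3 data (numpy assumed; the multipliers 4, 4, and 2 are the numbers of observations per variety, per fungicide level, and per cell):

```python
import numpy as np

# Example 3.3 data: y[i, j, k] = k-th plot for variety i, fungicide use j.
y = np.array([[[ 4.6,  7.4],    # variety A: new, then old
               [18.3, 15.7]],
              [[ 9.8, 14.2],    # variety B: new, then old
               [21.1, 18.9]]])

gm = y.mean()              # grand mean        Y-bar_...
vm = y.mean(axis=(1, 2))   # variety means     Y-bar_i..
fm = y.mean(axis=(0, 2))   # fungicide means   Y-bar_.j.
cm = y.mean(axis=2)        # cell means        Y-bar_ij.

ss_var  = 4 * np.sum((vm - gm) ** 2)
ss_fung = 4 * np.sum((fm - gm) ** 2)
ss_int  = 2 * np.sum((cm - vm[:, None] - fm[None, :] + gm) ** 2)
ss_err  = np.sum((y - cm[:, :, None]) ** 2)
ss_tot  = np.sum((y - gm) ** 2)
```

In a balanced factorial the four component sums of squares add to the corrected total.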
"Means" model:

    Y_ijk = μ_ij + ε_ijk

where μ_ij = E(Y_ijk) is the mean percentage of apples with spots for the
(i, j) combination of variety and fungicide use. Write this model in the form
Y = Xμ + ε:

\[
\begin{bmatrix} Y_{111}\\ Y_{112}\\ Y_{121}\\ Y_{122}\\ Y_{211}\\ Y_{212}\\ Y_{221}\\ Y_{222} \end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0 & 0\\
1 & 0 & 0 & 0\\
0 & 1 & 0 & 0\\
0 & 1 & 0 & 0\\
0 & 0 & 1 & 0\\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1\\
0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} \mu_{11}\\ \mu_{12}\\ \mu_{21}\\ \mu_{22} \end{bmatrix}
+
\begin{bmatrix} \epsilon_{111}\\ \epsilon_{112}\\ \epsilon_{121}\\ \epsilon_{122}\\ \epsilon_{211}\\ \epsilon_{212}\\ \epsilon_{221}\\ \epsilon_{222} \end{bmatrix}
\]
This linear model can be written in the form Y = Xβ + ε, that is,

\[
\begin{bmatrix} Y_{111}\\ Y_{112}\\ Y_{121}\\ Y_{122}\\ Y_{211}\\ Y_{212}\\ Y_{221}\\ Y_{222} \end{bmatrix}
=
\begin{bmatrix}
1 & 1 & 0 & 1 & 0 & 1 & 0 & 0 & 0\\
1 & 1 & 0 & 1 & 0 & 1 & 0 & 0 & 0\\
1 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 0\\
1 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 0\\
1 & 0 & 1 & 1 & 0 & 0 & 0 & 1 & 0\\
1 & 0 & 1 & 1 & 0 & 0 & 0 & 1 & 0\\
1 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 1\\
1 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} \mu\\ \alpha_1\\ \alpha_2\\ \beta_1\\ \beta_2\\ \delta_{11}\\ \delta_{12}\\ \delta_{21}\\ \delta_{22} \end{bmatrix}
+
\begin{bmatrix} \epsilon_{111}\\ \epsilon_{112}\\ \epsilon_{121}\\ \epsilon_{122}\\ \epsilon_{211}\\ \epsilon_{212}\\ \epsilon_{221}\\ \epsilon_{222} \end{bmatrix}
\]
In general, Y = Xβ + ε is

\[
\underbrace{\begin{bmatrix} Y_1\\ Y_2\\ \vdots\\ Y_n \end{bmatrix}}_{\substack{\text{observed}\\ \text{responses}}}
=
\underbrace{\begin{bmatrix}
X_{11} & X_{21} & \cdots & X_{k1}\\
X_{12} & X_{22} & \cdots & X_{k2}\\
\vdots & \vdots &        & \vdots\\
X_{1n} & X_{2n} & \cdots & X_{kn}
\end{bmatrix}}_{\substack{\text{the elements of } X \text{ are known}\\ \text{(non-random) values}}}
\begin{bmatrix} \beta_1\\ \beta_2\\ \vdots\\ \beta_k \end{bmatrix}
+
\underbrace{\begin{bmatrix} \epsilon_1\\ \epsilon_2\\ \vdots\\ \epsilon_n \end{bmatrix}}_{\substack{\text{random errors}\\ \text{are not observed}}}
\]
General Linear Model

Any linear model can be written as Y = Xβ + ε, where
Y = (Y1, Y2, ..., Yn)ᵀ is a random vector with

(1) E(Y) = Xβ for some known n×k matrix X of constants and unknown k×1
    parameter vector β.

(2) Complete the model by specifying a probability distribution for the
    possible values of Y or ε.
Sometimes we will only specify the covariance matrix Var(Y).

Gauss-Markov Model

Defn 3.1: The linear model

    Y = Xβ + ε

is a Gauss-Markov model if

    Var(Y) = Var(ε) = σ² I

for an unknown constant σ².

Since ε = Y − Xβ = Y − E(Y), we have

    E(ε) = 0    and    Var(ε) = Var(Y).

Notation: Y ~ (Xβ, σ² I), read "Y is distributed with mean E(Y) = Xβ and
variance Var(Y) = σ² I." The distribution of Y is not completely specified.
Normal Theory Gauss-Markov Model

Defn 3.2: A normal theory Gauss-Markov model is a Gauss-Markov model in which
Y (or ε) has a multivariate normal distribution:

    Y ~ N(Xβ, σ² I)

i.e., Y has a multivariate normal distribution with mean E(Y) = Xβ and
variance Var(Y) = σ² I.

The additional assumption of a normal distribution is
(1) not needed for some estimation results
(2) useful in creating
    - confidence intervals
    - tests of hypotheses
Objectives

(i) Develop estimation procedures
    - Estimate β?
    - Estimate E(Y) = Xβ
    - Estimable functions of β
(ii) Quantify uncertainty in estimates
    - variances, standard deviations
    - distributions
    - confidence intervals
(iii) Analysis of Variance (ANOVA)
(iv) Tests of hypotheses
    - distributions of quadratic forms
    - F-tests
    - power
(v) Sample size determination
Least Squares Estimation

For the linear model with E(Y) = Xβ we have

\[
\begin{bmatrix} Y_1\\ Y_2\\ \vdots\\ Y_n \end{bmatrix}
=
\begin{bmatrix}
X_{11} & X_{21} & \cdots & X_{k1}\\
X_{12} & X_{22} & \cdots & X_{k2}\\
\vdots & \vdots &        & \vdots\\
X_{1n} & X_{2n} & \cdots & X_{kn}
\end{bmatrix}
\begin{bmatrix} \beta_1\\ \vdots\\ \beta_k \end{bmatrix}
+
\begin{bmatrix} \epsilon_1\\ \vdots\\ \epsilon_n \end{bmatrix}
\]

and

    Y_i = β1 X1i + β2 X2i + ... + βk Xki + ε_i = X_iᵀ β + ε_i

where X_iᵀ = (X1i, X2i, ..., Xki) is the i-th row of the model matrix X.

OLS Estimator

Defn 3.3: For a linear model with E(Y) = Xβ, any vector b that minimizes the
sum of squared residuals,

    Q(b) = Σ_{i=1}^{n} (Y_i − X_iᵀ b)² = (Y − Xb)ᵀ (Y − Xb),

is an ordinary least squares (OLS) estimator for β.
OLS Estimating Equations

For j = 1, 2, ..., k, solve

    0 = ∂Q(b)/∂b_j = −2 Σ_{i=1}^{n} (Y_i − X_iᵀ b) X_ji

These equations are expressed in matrix form as

    0 = Xᵀ(Y − Xb) = XᵀY − XᵀXb

or

    XᵀXb = XᵀY.

These are called the "normal" equations.

If X_{n×k} has full column rank, i.e., rank(X) = k, then
(i) XᵀX is non-singular
(ii) (XᵀX)⁻¹ exists and is unique

Consequently,

    (XᵀX)⁻¹(XᵀX)b = (XᵀX)⁻¹XᵀY

and

    b = (XᵀX)⁻¹XᵀY

is the unique solution to the normal equations.
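For a full-column-rank X the normal equations can be solved directly; a small numpy sketch using the Example 3.1 model matrix (numpy assumed):

```python
import numpy as np

# Model matrix from Example 3.1: intercept, temperature, time.
X = np.column_stack([np.ones(5),
                     [160, 165, 165, 170, 175],
                     [1, 3, 2, 1, 2]]).astype(float)
y = np.array([77.0, 82.0, 84.0, 89.0, 94.0])

# rank(X) = k = 3, so X'X is nonsingular and the solution is unique.
b = np.linalg.solve(X.T @ X, X.T @ y)
```

In practice one solves the linear system rather than forming (XᵀX)⁻¹ explicitly, which is what `np.linalg.solve` does here.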
Generalized Inverse

If rank(X) < k, then
(i) there are infinitely many solutions to the normal equations
(ii) if (XᵀX)⁻ is a generalized inverse of XᵀX, then

        b = (XᵀX)⁻ XᵀY

    is a solution of the normal equations.

Defn 3.4: For a given m×n matrix A, any n×m matrix G that satisfies

    AGA = A

is a generalized inverse of A.

Comments
- We will often use A⁻ to denote a generalized inverse of A.
- There may be infinitely many generalized inverses.
- If A is an m×m nonsingular matrix, then G = A⁻¹ is the unique generalized
  inverse for A.
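The defining property AGA = A is easy to check numerically. As a sketch (numpy assumed), `np.linalg.pinv` returns one particular generalized inverse — the Moore-Penrose inverse defined later in this section — of the singular XᵀX from the Example 3.2 effects model:

```python
import numpy as np

# Effects-model matrix for Example 3.2: overall mean plus diet indicators.
X = np.column_stack([np.ones(6),
                     [1, 1, 0, 0, 0, 0],
                     [0, 0, 1, 0, 0, 0],
                     [0, 0, 0, 1, 1, 1]]).astype(float)
A = X.T @ X                 # 4x4 but rank 3: no ordinary inverse exists

G = np.linalg.pinv(A)       # one particular generalized inverse of A
```

Any matrix G with AGA = A would do; the Moore-Penrose inverse is just a convenient, uniquely defined choice.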
Example 3.5.

\[
A = \begin{bmatrix} 16 & -6 & 10\\ -6 & 21 & 15\\ 10 & 15 & 25 \end{bmatrix}
\qquad \text{with } \operatorname{rank}(A) = 2
\]

A generalized inverse is

\[
G = \begin{bmatrix} \tfrac{1}{20} & 0 & 0\\ 0 & \tfrac{1}{30} & 0\\ 0 & 0 & \tfrac{1}{50} \end{bmatrix}
\]

Check that AGA = A.

Another generalized inverse is

\[
G = \begin{bmatrix} \tfrac{21}{300} & \tfrac{6}{300} & 0\\ \tfrac{6}{300} & \tfrac{16}{300} & 0\\ 0 & 0 & 0 \end{bmatrix}
\]

Example 3.2. "means" model:

\[
\begin{bmatrix} Y_{11}\\ Y_{12}\\ Y_{21}\\ Y_{31}\\ Y_{32}\\ Y_{33} \end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0\\
1 & 0 & 0\\
0 & 1 & 0\\
0 & 0 & 1\\
0 & 0 & 1\\
0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} \mu_1\\ \mu_2\\ \mu_3 \end{bmatrix}
+
\begin{bmatrix} \epsilon_{11}\\ \epsilon_{12}\\ \epsilon_{21}\\ \epsilon_{31}\\ \epsilon_{32}\\ \epsilon_{33} \end{bmatrix}
\]

For this model

\[
X^T X = \begin{bmatrix} n_1 & 0 & 0\\ 0 & n_2 & 0\\ 0 & 0 & n_3 \end{bmatrix},
\qquad
X^T Y = \begin{bmatrix} Y_{11} + Y_{12}\\ Y_{21}\\ Y_{31} + Y_{32} + Y_{33} \end{bmatrix}
= \begin{bmatrix} Y_{1\cdot}\\ Y_{2\cdot}\\ Y_{3\cdot} \end{bmatrix}
\]
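Both generalized inverses in Example 3.5 can be verified directly (a numpy sketch, not part of the notes):

```python
import numpy as np

# The matrix from Example 3.5 and the two generalized inverses given there.
A = np.array([[16.0, -6.0, 10.0],
              [-6.0, 21.0, 15.0],
              [10.0, 15.0, 25.0]])

G1 = np.diag([1 / 20, 1 / 30, 1 / 50])

G2 = np.array([[21.0,  6.0, 0.0],
               [ 6.0, 16.0, 0.0],
               [ 0.0,  0.0, 0.0]]) / 300.0
```

Both satisfy the defining property AGA = A even though they are quite different matrices, illustrating that generalized inverses are not unique.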
and the unique OLS estimator for μ = (μ1, μ2, μ3)ᵀ is

\[
\mathbf{b} = (X^TX)^{-1}X^TY
= \begin{bmatrix} n_1^{-1} & 0 & 0\\ 0 & n_2^{-1} & 0\\ 0 & 0 & n_3^{-1} \end{bmatrix}
\begin{bmatrix} Y_{1\cdot}\\ Y_{2\cdot}\\ Y_{3\cdot} \end{bmatrix}
= \begin{bmatrix} \bar{Y}_{1\cdot}\\ \bar{Y}_{2\cdot}\\ \bar{Y}_{3\cdot} \end{bmatrix}
\]

Example 3.2. "effects" model:

\[
\begin{bmatrix} Y_{11}\\ Y_{12}\\ Y_{21}\\ Y_{31}\\ Y_{32}\\ Y_{33} \end{bmatrix}
=
\begin{bmatrix}
1 & 1 & 0 & 0\\
1 & 1 & 0 & 0\\
1 & 0 & 1 & 0\\
1 & 0 & 0 & 1\\
1 & 0 & 0 & 1\\
1 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} \mu\\ \alpha_1\\ \alpha_2\\ \alpha_3 \end{bmatrix}
+
\begin{bmatrix} \epsilon_{11}\\ \epsilon_{12}\\ \epsilon_{21}\\ \epsilon_{31}\\ \epsilon_{32}\\ \epsilon_{33} \end{bmatrix}
\]

Here

\[
X^TX = \begin{bmatrix}
n_\cdot & n_1 & n_2 & n_3\\
n_1 & n_1 & 0 & 0\\
n_2 & 0 & n_2 & 0\\
n_3 & 0 & 0 & n_3
\end{bmatrix},
\qquad
X^TY = \begin{bmatrix} Y_{\cdot\cdot}\\ Y_{1\cdot}\\ Y_{2\cdot}\\ Y_{3\cdot} \end{bmatrix}
\]
Solution A: One generalized inverse for XᵀX is

\[
(X^TX)^- = \begin{bmatrix}
0 & 0 & 0 & 0\\
0 & n_1^{-1} & 0 & 0\\
0 & 0 & n_2^{-1} & 0\\
0 & 0 & 0 & n_3^{-1}
\end{bmatrix}
\]

and a solution to the normal equations is

\[
\mathbf{b} = (X^TX)^- X^TY
= \begin{bmatrix}
0 & 0 & 0 & 0\\
0 & n_1^{-1} & 0 & 0\\
0 & 0 & n_2^{-1} & 0\\
0 & 0 & 0 & n_3^{-1}
\end{bmatrix}
\begin{bmatrix} Y_{\cdot\cdot}\\ Y_{1\cdot}\\ Y_{2\cdot}\\ Y_{3\cdot} \end{bmatrix}
= \begin{bmatrix} 0\\ \bar{Y}_{1\cdot}\\ \bar{Y}_{2\cdot}\\ \bar{Y}_{3\cdot} \end{bmatrix}
\]

Solution B: Another generalized inverse for XᵀX is obtained as follows. Let

\[
A = \begin{bmatrix} n_\cdot & n_1 & n_2\\ n_1 & n_1 & 0\\ n_2 & 0 & n_2 \end{bmatrix}
\]

and use result 1.4(ii) to compute

\[
A^{-1} = \frac{1}{n_1 n_2 n_3}
\begin{bmatrix}
n_1 n_2 & -n_1 n_2 & -n_1 n_2\\
-n_1 n_2 & n_2(n_1+n_3) & n_1 n_2\\
-n_1 n_2 & n_1 n_2 & n_1(n_2+n_3)
\end{bmatrix}
= \frac{1}{n_3}
\begin{bmatrix}
1 & -1 & -1\\
-1 & \frac{n_1+n_3}{n_1} & 1\\
-1 & 1 & \frac{n_2+n_3}{n_2}
\end{bmatrix}
\]

Then

\[
(X^TX)^- = \begin{bmatrix} A^{-1} & \mathbf{0}\\ \mathbf{0} & 0 \end{bmatrix}
\]

and a solution to the normal equations is

\[
\mathbf{b} = (X^TX)^- X^TY
= \frac{1}{n_3}
\begin{bmatrix}
Y_{\cdot\cdot} - Y_{1\cdot} - Y_{2\cdot}\\[2pt]
-Y_{\cdot\cdot} + \frac{n_1+n_3}{n_1}\,Y_{1\cdot} + Y_{2\cdot}\\[2pt]
-Y_{\cdot\cdot} + Y_{1\cdot} + \frac{n_2+n_3}{n_2}\,Y_{2\cdot}\\[2pt]
0
\end{bmatrix}
= \begin{bmatrix} \bar{Y}_{3\cdot}\\ \bar{Y}_{1\cdot} - \bar{Y}_{3\cdot}\\ \bar{Y}_{2\cdot} - \bar{Y}_{3\cdot}\\ 0 \end{bmatrix}
\]

This is the OLS estimator for β reported by PROC GLM in the SAS package, but
it is not the only possible solution to the normal equations.
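A numpy sketch comparing two of these solutions for the Example 3.2 coagulation data (the Moore-Penrose inverse and a Solution-A style block inverse are assumptions of this sketch, not a unique recipe): the solution vectors differ, but the fitted values Xb agree.

```python
import numpy as np

# Effects-model matrix and coagulation times from Example 3.2.
X = np.column_stack([np.ones(6),
                     [1, 1, 0, 0, 0, 0],
                     [0, 0, 1, 0, 0, 0],
                     [0, 0, 0, 1, 1, 1]]).astype(float)
y = np.array([62.0, 60.0, 71.0, 72.0, 68.0, 67.0])
A = X.T @ X

# Two different generalized inverses of X'X:
G_mp = np.linalg.pinv(A)                      # Moore-Penrose inverse
G_a  = np.zeros((4, 4))                       # "Solution A" style:
G_a[1:, 1:] = np.diag([1 / 2, 1.0, 1 / 3])    # invert the diagonal block

b_mp = G_mp @ X.T @ y                         # one solution
b_a  = G_a  @ X.T @ y                         # another: (0, 61, 71, 69)
```

Different generalized inverses give different solutions b, yet the estimate of E(Y) = Xβ is identical — a preview of the invariance result proved below.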
Solution C: Another generalized inverse for XᵀX is

\[
(X^TX)^- = \frac{1}{n_1 n_2 n_3}
\begin{bmatrix}
n_2 n_3 & 0 & -n_2 n_3 & -n_2 n_3\\
0 & 0 & 0 & 0\\
-n_2 n_3 & 0 & n_3(n_1+n_2) & n_2 n_3\\
-n_2 n_3 & 0 & n_2 n_3 & n_2(n_1+n_3)
\end{bmatrix}
\]

The corresponding solution to the normal equations is

\[
\mathbf{b} = (X^TX)^- X^TY
= \begin{bmatrix} \bar{Y}_{1\cdot}\\ 0\\ \bar{Y}_{2\cdot} - \bar{Y}_{1\cdot}\\ \bar{Y}_{3\cdot} - \bar{Y}_{1\cdot} \end{bmatrix}
\]

Solution D: Another generalized inverse for XᵀX is

\[
(X^TX)^- =
\begin{bmatrix}
n_\cdot^{-1} & 0 & 0 & 0\\
-n_\cdot^{-1} & n_1^{-1} & 0 & 0\\
-n_\cdot^{-1} & 0 & n_2^{-1} & 0\\
-n_\cdot^{-1} & 0 & 0 & n_3^{-1}
\end{bmatrix}
\]

The corresponding solution to the normal equations is

\[
\mathbf{b} = (X^TX)^- X^TY
=
\begin{bmatrix}
n_\cdot^{-1} & 0 & 0 & 0\\
-n_\cdot^{-1} & n_1^{-1} & 0 & 0\\
-n_\cdot^{-1} & 0 & n_2^{-1} & 0\\
-n_\cdot^{-1} & 0 & 0 & n_3^{-1}
\end{bmatrix}
\begin{bmatrix} Y_{\cdot\cdot}\\ Y_{1\cdot}\\ Y_{2\cdot}\\ Y_{3\cdot} \end{bmatrix}
=
\begin{bmatrix} \bar{Y}_{\cdot\cdot}\\ \bar{Y}_{1\cdot} - \bar{Y}_{\cdot\cdot}\\ \bar{Y}_{2\cdot} - \bar{Y}_{\cdot\cdot}\\ \bar{Y}_{3\cdot} - \bar{Y}_{\cdot\cdot} \end{bmatrix}
\]

Evaluating Generalized Inverses

Algorithm 3.1: For an m×n matrix A with r = rank(A):

(i) Find any r×r nonsingular submatrix of A. Call this matrix W.
(ii) Invert and transpose W, i.e., compute (W⁻¹)ᵀ.
(iii) Replace each element of W in A with the corresponding element of (W⁻¹)ᵀ.
(iv) Replace all other elements in A with zeros.
(v) Transpose the resulting matrix to obtain G, a generalized inverse for A.
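Algorithm 3.1 is mechanical enough to code directly; a sketch (the helper name `ginverse` and the row/column index arguments are this sketch's own conventions):

```python
import numpy as np

def ginverse(A, rows, cols):
    """Algorithm 3.1: generalized inverse of A built from a nonsingular
    r x r submatrix W = A[rows, cols], where r = rank(A)."""
    A = np.asarray(A, dtype=float)
    W = A[np.ix_(rows, cols)]
    B = np.zeros_like(A)                        # A-shaped matrix of zeros
    B[np.ix_(rows, cols)] = np.linalg.inv(W).T  # drop (W^-1)^T into W's slots
    return B.T                                  # transpose: G is n x m

# The matrix from Example 3.6 (below); rows 2-3, columns 2-3 give W = [[1,5],[1,3]].
A = np.array([[4.0, 1.0, 2.0,  0.0],
              [1.0, 1.0, 5.0, 15.0],
              [3.0, 1.0, 3.0,  5.0]])
G  = ginverse(A, rows=[1, 2], cols=[1, 2])
G2 = ginverse(A, rows=[0, 2], cols=[0, 3])   # a different submatrix choice
```

Any nonsingular r×r submatrix works; different choices of W produce different generalized inverses.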
Example 3.6.

\[
A = \begin{bmatrix} 4 & 1 & 2 & 0\\ 1 & 1 & 5 & 15\\ 3 & 1 & 3 & 5 \end{bmatrix}
\qquad \text{with } \operatorname{rank}(A) = 2
\]

Take

\[
W = \begin{bmatrix} 1 & 5\\ 1 & 3 \end{bmatrix},
\qquad
(W^{-1})^T = \begin{bmatrix} -\tfrac{3}{2} & \tfrac{1}{2}\\ \tfrac{5}{2} & -\tfrac{1}{2} \end{bmatrix}
\]

so

\[
G = \begin{bmatrix}
0 & 0 & 0 & 0\\
0 & -\tfrac{3}{2} & \tfrac{1}{2} & 0\\
0 & \tfrac{5}{2} & -\tfrac{1}{2} & 0
\end{bmatrix}^T
= \begin{bmatrix}
0 & 0 & 0\\
0 & -\tfrac{3}{2} & \tfrac{5}{2}\\
0 & \tfrac{1}{2} & -\tfrac{1}{2}\\
0 & 0 & 0
\end{bmatrix}
\]

Check: AGA = A.
Another solution: take

\[
W = \begin{bmatrix} 4 & 0\\ 3 & 5 \end{bmatrix},
\qquad
(W^{-1})^T = \begin{bmatrix} \tfrac{5}{20} & -\tfrac{3}{20}\\ 0 & \tfrac{4}{20} \end{bmatrix}
\]

so

\[
G = \begin{bmatrix}
\tfrac{5}{20} & 0 & 0 & -\tfrac{3}{20}\\
0 & 0 & 0 & 0\\
0 & 0 & 0 & \tfrac{4}{20}
\end{bmatrix}^T
= \begin{bmatrix}
\tfrac{5}{20} & 0 & 0\\
0 & 0 & 0\\
0 & 0 & 0\\
-\tfrac{3}{20} & 0 & \tfrac{4}{20}
\end{bmatrix}
\]

Check: AGA = A.

Algorithm 3.2

For any m×n matrix A with rank(A) = r,

(i) compute a singular value decomposition of A (see result 1.14) to obtain

\[
PAQ = \begin{bmatrix} D_{r\times r} & 0\\ 0 & 0 \end{bmatrix}
\]

where P is an m×m orthogonal matrix, Q is an n×n orthogonal matrix, and D is
an r×r matrix of singular values.

(ii) Then

\[
G = Q \begin{bmatrix} D^{-1} & F_1\\ F_2 & F_3 \end{bmatrix} P
\]

is a generalized inverse for A for any choice of F1, F2, F3.

Proof: Check if AGA = A. Since P and Q are orthogonal, QᵀQ is the n×n
identity matrix and PPᵀ is the m×m identity matrix, so

\[
AGA = P^T \begin{bmatrix} D & 0\\ 0 & 0 \end{bmatrix} Q^T \,
      Q \begin{bmatrix} D^{-1} & F_1\\ F_2 & F_3 \end{bmatrix} P \,
      P^T \begin{bmatrix} D & 0\\ 0 & 0 \end{bmatrix} Q^T
\]
\[
= P^T \begin{bmatrix} D & 0\\ 0 & 0 \end{bmatrix}
      \begin{bmatrix} D^{-1} & F_1\\ F_2 & F_3 \end{bmatrix}
      \begin{bmatrix} D & 0\\ 0 & 0 \end{bmatrix} Q^T
= P^T \begin{bmatrix} I & DF_1\\ 0 & 0 \end{bmatrix}
      \begin{bmatrix} D & 0\\ 0 & 0 \end{bmatrix} Q^T
= P^T \begin{bmatrix} D & 0\\ 0 & 0 \end{bmatrix} Q^T
= A
\]
Moore-Penrose Inverse

Defn 3.5: For any matrix A there is a unique matrix M, called the
Moore-Penrose inverse, that satisfies

(i) AMA = A
(ii) MAM = M
(iii) AM is symmetric
(iv) MA is symmetric

Result 3.1:

\[
M = Q \begin{bmatrix} D^{-1} & 0\\ 0 & 0 \end{bmatrix} P
\]

is the Moore-Penrose inverse of A, where

\[
PAQ = \begin{bmatrix} D & 0\\ 0 & 0 \end{bmatrix}
\]

is a singular value decomposition of A.

Properties of generalized inverses of XᵀX

Normal equations: (XᵀX)b = XᵀY
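A numpy sketch of Result 3.1: build M from the SVD of the Example 3.6 matrix and check the four Moore-Penrose conditions (the tolerance used for the numerical rank is this sketch's own choice):

```python
import numpy as np

# The matrix from Example 3.6.
A = np.array([[4.0, 1.0, 2.0,  0.0],
              [1.0, 1.0, 5.0, 15.0],
              [3.0, 1.0, 3.0,  5.0]])

# Singular value decomposition: A = U diag(s) Vt.
U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-10 * s[0]))   # numerical rank (here, 2)

# M = Q [D^-1 0; 0 0] P, keeping only the r nonzero singular values.
M = Vt[:r].T @ np.diag(1 / s[:r]) @ U[:, :r].T
```

This is exactly how `np.linalg.pinv` computes the Moore-Penrose inverse, so the two constructions agree.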
Result 3.3: If G is a generalized inverse of XᵀX, then

(i) Gᵀ is a generalized inverse of XᵀX.
(ii) XGXᵀX = X, i.e., GXᵀ is a generalized inverse of X.
(iii) XGXᵀ is invariant with respect to the choice of G.
(iv) XGXᵀ is symmetric.
Proof:

(i) Since G is a generalized inverse of XᵀX,

    (XᵀX)G(XᵀX) = XᵀX.

Taking the transpose of both sides,

    XᵀX = [(XᵀX)G(XᵀX)]ᵀ = (XᵀX)ᵀ Gᵀ (XᵀX)ᵀ.

But (XᵀX)ᵀ = Xᵀ(Xᵀ)ᵀ = XᵀX, hence

    (XᵀX)Gᵀ(XᵀX) = XᵀX.
(ii) From (i), (XᵀX)Gᵀ(XᵀX) = XᵀX. Call B = (XᵀX)Gᵀ, so that BXᵀX = XᵀX.
Then

    0 = BXᵀX − XᵀX
      = (BXᵀX − XᵀX)(Bᵀ − I)
      = BXᵀXBᵀ − XᵀXBᵀ − BXᵀX + XᵀX
      = (BXᵀ − Xᵀ)(BXᵀ − Xᵀ)ᵀ

Hence, 0 = BXᵀ − Xᵀ

    ⇒ BXᵀ = Xᵀ
    ⇒ XᵀXGᵀXᵀ = Xᵀ

Taking the transpose,

    X = (XᵀXGᵀXᵀ)ᵀ = XGXᵀX.

Hence, GXᵀ is a generalized inverse for X.
(iii) Suppose F and G are generalized inverses for XᵀX. Then, from (ii),

    XGXᵀX = X    and    XFXᵀX = X.

It follows that

    0 = X − X
      = XGXᵀX − XFXᵀX
      = (XGXᵀX − XFXᵀX)(GᵀXᵀ − FᵀXᵀ)
      = (XGXᵀ − XFXᵀ) X (GᵀXᵀ − FᵀXᵀ)
      = (XGXᵀ − XFXᵀ)(XGᵀXᵀ − XFᵀXᵀ)
      = (XGXᵀ − XFXᵀ)(XGXᵀ − XFXᵀ)ᵀ

Since the (i,i) diagonal element of the result of multiplying a matrix by its
transpose is the sum of the squared entries in the i-th row of the matrix, the
diagonal elements of the product are all zero only if all entries are zero in
every row of the matrix. Consequently,

    XGXᵀ − XFXᵀ = 0.

(iv) For any generalized inverse G,

    T = GXᵀXGᵀ

is a symmetric generalized inverse. Then XTXᵀ is symmetric and, from (iii),

    XGXᵀ = XTXᵀ.
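Result 3.3 (iii) and (iv) can be illustrated numerically; a sketch for the Example 3.2 effects-model X, using two quite different generalized inverses (the block-diagonal one is the Solution-A style inverse from earlier):

```python
import numpy as np

# Effects-model matrix for Example 3.2.
X = np.column_stack([np.ones(6),
                     [1, 1, 0, 0, 0, 0],
                     [0, 0, 1, 0, 0, 0],
                     [0, 0, 0, 1, 1, 1]]).astype(float)
A = X.T @ X

# Two different generalized inverses G and F of X'X.
G = np.linalg.pinv(A)                       # Moore-Penrose inverse
F = np.zeros((4, 4))
F[1:, 1:] = np.diag([1 / 2, 1.0, 1 / 3])    # Solution-A style block inverse

PG = X @ G @ X.T
PF = X @ F @ X.T
```

Even though G and F differ, XGXᵀ and XFXᵀ coincide, and the common matrix is symmetric — exactly properties (iii) and (iv).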
Estimation of the Mean Vector E(Y) = Xβ

For any solution to the normal equations, say

    b = (XᵀX)⁻ XᵀY,

the OLS estimator for E(Y) = Xβ is

    Ŷ = Xb = X(XᵀX)⁻ XᵀY = P_X Y.

The matrix

    P_X = X(XᵀX)⁻ Xᵀ

is called an "orthogonal projection matrix". Ŷ = P_X Y is the projection of Y
onto the space spanned by the columns of X.

Result 3.4: Properties of a projection matrix P_X = X(XᵀX)⁻ Xᵀ

(i) P_X is invariant to the choice of (XᵀX)⁻. For any solution
    b = (XᵀX)⁻ XᵀY to the normal equations,

        Ŷ = Xb = P_X Y

    is the same. (from Result 3.3 (iii))
(ii) P_X is symmetric. (from Result 3.3 (iv))

(iii) P_X is idempotent (P_X P_X = P_X).

(iv) P_X X = X. (from Result 3.3 (ii))

(v) Partition X as X = [X1 | X2 | ... | Xk]; then P_X Xj = Xj.

Residuals:

    e_i = Y_i − Ŷ_i,    i = 1, ..., n

The vector of residuals is

    e = Y − Ŷ = Y − Xb = Y − P_X Y = (I − P_X) Y.

Comment: I − P_X is a projection matrix that projects Y onto the space
orthogonal to the space spanned by the columns of X.
Result 3.5: Properties of I − P_X

(i) I − P_X is symmetric.

(ii) I − P_X is idempotent:

        (I − P_X)(I − P_X) = I − P_X

(iii)   (I − P_X) P_X = P_X − P_X P_X = P_X − P_X = 0

(iv)    (I − P_X) X = X − P_X X = X − X = 0

(v) Partition X as [X1 | X2 | ... | Xk]; then (I − P_X) Xj = 0.

(vi) Residuals are invariant with respect to the choice of (XᵀX)⁻, so

        e ≡ Y − Xb = (I − P_X) Y

    is the same for any solution b = (XᵀX)⁻ XᵀY to the normal equations.
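The projection properties above are easy to verify numerically; a closing numpy sketch using the Example 3.1 model matrix:

```python
import numpy as np

# Model matrix and response from Example 3.1.
X = np.column_stack([np.ones(5),
                     [160, 165, 165, 170, 175],
                     [1, 3, 2, 1, 2]]).astype(float)
y = np.array([77.0, 82.0, 84.0, 89.0, 94.0])

P = X @ np.linalg.pinv(X.T @ X) @ X.T   # orthogonal projection onto C(X)
M = np.eye(5) - P                       # projection onto the orthogonal complement

yhat = P @ y   # fitted values
e    = M @ y   # residuals
```

The checks below correspond to Result 3.4 (ii)-(iv) and Result 3.5 (iv): P_X is symmetric and idempotent, P_X X = X, (I − P_X)X = 0, and the residuals are orthogonal to the columns of X.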