3. Linear Models

Example 3.1 (Regression Analysis): Yield of a chemical process

  Yield (%)   Temperature (°F)   Time (hr)
      Y             X_1            X_2
     77             160             1
     82             165             3
     84             165             2
     89             170             1
     94             175             2

Unified approach:
  - regression analysis
  - analysis of variance
  - balanced factorial experiments

Extensions:
  - analysis of unbalanced experiments
  - models with correlated responses (split plot designs, repeated measures)
  - models with fixed and random components (mixed models)

Linear regression model:

  Y_i = β_0 + β_1 X_{1i} + β_2 X_{2i} + ε_i,   i = 1, 2, 3, 4, 5

Matrix formulation:

  [ Y_1 ]   [ 1  X_{11}  X_{21} ]             [ ε_1 ]
  [ Y_2 ]   [ 1  X_{12}  X_{22} ]   [ β_0 ]   [ ε_2 ]
  [ Y_3 ] = [ 1  X_{13}  X_{23} ]   [ β_1 ] + [ ε_3 ]
  [ Y_4 ]   [ 1  X_{14}  X_{24} ]   [ β_2 ]   [ ε_4 ]
  [ Y_5 ]   [ 1  X_{15}  X_{25} ]             [ ε_5 ]

i.e. Y = Xβ + ε, where the i-th row of the display is Y_i = β_0 + β_1 X_{1i} + β_2 X_{2i} + ε_i.

Analysis of Variance (ANOVA):

  Source of    d.f.   Sums of Squares                   Mean Squares
  Variation
  Model         2     Σ_{i=1}^5 (Ŷ_i − Ȳ)²             SS_model / 2
  Error         2     Σ_{i=1}^5 (Y_i − Ŷ_i)²           SS_error / 2
  C. total      4     Σ_{i=1}^5 (Y_i − Ȳ)²

where

  Ȳ = (1/n) Σ_{i=1}^n Y_i,   Ŷ_i = b_0 + b_1 X_{1i} + b_2 X_{2i},

and n = total number of observations.

Example 3.2: Blood coagulation times (in seconds) for blood samples from six different rats. Each rat was fed one of three diets.
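As a numeric check of Example 3.1, the OLS fit and the ANOVA decomposition can be sketched in NumPy (the variable names here are ours, not from the notes):

```python
import numpy as np

# Data from Example 3.1: yield (%), temperature (deg F), time (hr)
Y = np.array([77.0, 82.0, 84.0, 89.0, 94.0])
X = np.column_stack([np.ones(5),
                     [160, 165, 165, 170, 175],   # X1: temperature
                     [1, 3, 2, 1, 2]])            # X2: time

# OLS: solve the normal equations X'X b = X'Y (X has full column rank here)
b = np.linalg.solve(X.T @ X, X.T @ Y)
Y_hat = X @ b

# ANOVA decomposition about the mean
Y_bar = Y.mean()
ss_model = np.sum((Y_hat - Y_bar) ** 2)   # 2 d.f.
ss_error = np.sum((Y - Y_hat) ** 2)       # 2 d.f.
ss_total = np.sum((Y - Y_bar) ** 2)       # 4 d.f.
```

Because the model contains an intercept, SS_model + SS_error = SS_total.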
  Diet 1        Diet 2        Diet 3
  Y_11 = 62     Y_21 = 71     Y_31 = 72
  Y_12 = 60                   Y_32 = 68
                              Y_33 = 67

"Means" model:

  Y_ij = μ_i + ε_ij

where Y_ij is the observed time for the j-th rat fed the i-th diet, μ_i is the mean time for rats given the i-th diet, and ε_ij is a random error with E(ε_ij) = 0.

You can express this model as

  [ Y_11 ]   [ 1 0 0 ]             [ ε_11 ]
  [ Y_12 ]   [ 1 0 0 ]   [ μ_1 ]   [ ε_12 ]
  [ Y_21 ] = [ 0 1 0 ]   [ μ_2 ] + [ ε_21 ]
  [ Y_31 ]   [ 0 0 1 ]   [ μ_3 ]   [ ε_31 ]
  [ Y_32 ]   [ 0 0 1 ]             [ ε_32 ]
  [ Y_33 ]   [ 0 0 1 ]             [ ε_33 ]
     Y           X          μ         ε

Assuming that E(ε_ij) = 0 for all (i, j), this is a linear model with E(Y) = Xμ.

An "effects" model:

  Y_ij = μ + τ_i + ε_ij

This can be expressed as

  [ Y_11 ]   [ 1 1 0 0 ]             [ ε_11 ]
  [ Y_12 ]   [ 1 1 0 0 ]   [ μ   ]   [ ε_12 ]
  [ Y_21 ] = [ 1 0 1 0 ]   [ τ_1 ] + [ ε_21 ]
  [ Y_31 ]   [ 1 0 0 1 ]   [ τ_2 ]   [ ε_31 ]
  [ Y_32 ]   [ 1 0 0 1 ]   [ τ_3 ]   [ ε_32 ]
  [ Y_33 ]   [ 1 0 0 1 ]             [ ε_33 ]

or Y = Xβ + ε. This is a linear model with E(Y) = Xβ.

You could add the assumptions of
  - independent errors
  - homogeneous variance, i.e., Var(ε_ij) = σ²
to obtain a linear model Y = Xβ + ε with

  E(Y) = Xβ   and   Var(Y) = Var(ε) = σ² I.

Analysis of Variance (ANOVA):

  Source of   d.f.                Sums of Squares                              Mean Squares
  Variation
  Diets       3 − 1 = 2           Σ_{i=1}^3 n_i (Ȳ_{i·} − Ȳ_{··})²            SS_diets / 2
  Error       Σ (n_i − 1) = 3     Σ_{i=1}^3 Σ_{j=1}^{n_i} (Y_ij − Ȳ_{i·})²    SS_error / 3
  C. total    n_· − 1 = 5         Σ_{i=1}^3 Σ_{j=1}^{n_i} (Y_ij − Ȳ_{··})²

where

  n_i = number of rats fed the i-th diet
  Ȳ_{i·} = (1/n_i) Σ_{j=1}^{n_i} Y_ij
  Ȳ_{··} = (1/n_·) Σ_{i=1}^3 Σ_{j=1}^{n_i} Y_ij
  n_· = Σ_{i=1}^3 n_i = total number of observations

Example 3.3: A 2 × 2 factorial experiment. Experimental units: 8 plots with 5 trees per plot.
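The one-way ANOVA of Example 3.2 can be verified directly; a NumPy sketch (names are ours):

```python
import numpy as np

# Blood coagulation times from Example 3.2, grouped by diet
diets = [np.array([62.0, 60.0]),        # diet 1 (n1 = 2)
         np.array([71.0]),              # diet 2 (n2 = 1)
         np.array([72.0, 68.0, 67.0])]  # diet 3 (n3 = 3)

all_y = np.concatenate(diets)
grand_mean = all_y.mean()               # Y-bar-dot-dot

# One-way ANOVA sums of squares
ss_diets = sum(len(y) * (y.mean() - grand_mean) ** 2 for y in diets)  # 2 d.f.
ss_error = sum(((y - y.mean()) ** 2).sum() for y in diets)            # 3 d.f.
ss_total = ((all_y - grand_mean) ** 2).sum()                          # 5 d.f.
```

The between-diet and within-diet pieces add up to the corrected total, as the table asserts.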
Factor 1: Variety (A or B)
Factor 2: Fungicide use (new or old)
Response: Percentage of apples with spots

  Variety   Fungicide use   Percentage of apples with spots
     A          new            Y_111 = 4.6
     A          new            Y_112 = 7.4
     A          old            Y_121 = 18.3
     A          old            Y_122 = 15.7
     B          new            Y_211 = 9.8
     B          new            Y_212 = 14.2
     B          old            Y_221 = 21.1
     B          old            Y_222 = 18.9

Linear model:

  Y_ijk  =  μ  +  α_i  +  γ_j  +  δ_ij  +  ε_ijk

with Y_ijk the percent spots, μ the overall mean, α_i the variety effects (i = 1, 2), γ_j the fungicide-use effects (j = 1, 2), δ_ij the interaction, and ε_ijk the random error.

Analysis of Variance (ANOVA):

  Source of Variation          d.f.             Sums of Squares
  Varieties                    2 − 1 = 1        4 Σ_{i=1}^2 (Ȳ_{i··} − Ȳ_{···})²
  Fungicide use                2 − 1 = 1        4 Σ_{j=1}^2 (Ȳ_{·j·} − Ȳ_{···})²
  Variety × Fung. use          (2−1)(2−1) = 1   2 Σ_i Σ_j (Ȳ_{ij·} − Ȳ_{i··} − Ȳ_{·j·} + Ȳ_{···})²
    interaction
  Error                        4(2 − 1) = 4     Σ_i Σ_j Σ_k (Y_ijk − Ȳ_{ij·})²
  Corrected total              8 − 1 = 7        Σ_i Σ_j Σ_k (Y_ijk − Ȳ_{···})²

Here we use 9 parameters to represent the 4 response means,

  E(Y_ijk) = μ_ij,   i = 1, 2  and  j = 1, 2,

corresponding to the 4 combinations of levels of the two factors.
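For the balanced 2 × 2 factorial of Example 3.3, the sums of squares in the table above can be computed and checked numerically (an illustrative NumPy sketch; the array layout is our choice):

```python
import numpy as np

# Example 3.3 data: Y[i, j, k] = percent spotted apples,
# i = variety (A, B), j = fungicide (new, old), k = replicate
Y = np.array([[[4.6, 7.4], [18.3, 15.7]],
              [[9.8, 14.2], [21.1, 18.9]]])

m = Y.mean()                 # grand mean, Y-bar-dot-dot-dot
mi = Y.mean(axis=(1, 2))     # variety means
mj = Y.mean(axis=(0, 2))     # fungicide-use means
mij = Y.mean(axis=2)         # cell means

ss_variety = 4 * ((mi - m) ** 2).sum()                              # 1 d.f.
ss_fung = 4 * ((mj - m) ** 2).sum()                                 # 1 d.f.
ss_inter = 2 * ((mij - mi[:, None] - mj[None, :] + m) ** 2).sum()   # 1 d.f.
ss_error = ((Y - mij[:, :, None]) ** 2).sum()                       # 4 d.f.
ss_total = ((Y - m) ** 2).sum()                                     # 7 d.f.
```

Because the design is balanced, the four component sums of squares add exactly to the corrected total.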
  β^T = ( μ  α_1  α_2  γ_1  γ_2  δ_11  δ_12  δ_21  δ_22 )

The effects model can be written in the form Y = Xβ + ε:

  [ Y_111 ]   [ 1 1 0 1 0 1 0 0 0 ]   [ μ    ]   [ ε_111 ]
  [ Y_112 ]   [ 1 1 0 1 0 1 0 0 0 ]   [ α_1  ]   [ ε_112 ]
  [ Y_121 ]   [ 1 1 0 0 1 0 1 0 0 ]   [ α_2  ]   [ ε_121 ]
  [ Y_122 ] = [ 1 1 0 0 1 0 1 0 0 ]   [ γ_1  ] + [ ε_122 ]
  [ Y_211 ]   [ 1 0 1 1 0 0 0 1 0 ]   [ γ_2  ]   [ ε_211 ]
  [ Y_212 ]   [ 1 0 1 1 0 0 0 1 0 ]   [ δ_11 ]   [ ε_212 ]
  [ Y_221 ]   [ 1 0 1 0 1 0 0 0 1 ]   [ δ_12 ]   [ ε_221 ]
  [ Y_222 ]   [ 1 0 1 0 1 0 0 0 1 ]   [ δ_21 ]   [ ε_222 ]
                                      [ δ_22 ]

"Means" model:

  Y_ijk = μ_ij + ε_ijk

where μ_ij = E(Y_ijk) is the mean percentage of apples with spots at the (i, j) combination of levels. This linear model can be written in the form Y = Xμ + ε, that is,

  [ Y_111 ]   [ 1 0 0 0 ]              [ ε_111 ]
  [ Y_112 ]   [ 1 0 0 0 ]   [ μ_11 ]   [ ε_112 ]
  [ Y_121 ]   [ 0 1 0 0 ]   [ μ_12 ]   [ ε_121 ]
  [ Y_122 ] = [ 0 1 0 0 ]   [ μ_21 ] + [ ε_122 ]
  [ Y_211 ]   [ 0 0 1 0 ]   [ μ_22 ]   [ ε_211 ]
  [ Y_212 ]   [ 0 0 1 0 ]              [ ε_212 ]
  [ Y_221 ]   [ 0 0 0 1 ]              [ ε_221 ]
  [ Y_222 ]   [ 0 0 0 1 ]              [ ε_222 ]
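The two design matrices for Example 3.3 can be built and compared in code. This sketch (our own construction, not from the notes) shows that the means-model X has full column rank 4, while the 9-column effects-model X spans the same column space and so has rank 4 as well:

```python
import numpy as np

# Means-model design for the 2x2 factorial: one indicator column per cell mean,
# observations ordered Y111, Y112, Y121, Y122, Y211, Y212, Y221, Y222
cells = [(i, j) for i in range(2) for j in range(2)]
X_means = np.zeros((8, 4))
for row in range(8):
    i, j = row // 4, (row // 2) % 2          # variety, fungicide for this obs.
    X_means[row, cells.index((i, j))] = 1.0

# Effects-model design: intercept, variety, fungicide, interaction indicators
X_eff = np.column_stack([
    np.ones(8),
    X_means[:, 0] + X_means[:, 1],           # variety A indicator (alpha_1)
    X_means[:, 2] + X_means[:, 3],           # variety B indicator (alpha_2)
    X_means[:, 0] + X_means[:, 2],           # fungicide 'new' indicator (gamma_1)
    X_means[:, 1] + X_means[:, 3],           # fungicide 'old' indicator (gamma_2)
    X_means,                                 # the four interaction indicators
])
```

Every column of the effects-model matrix is a sum of cell indicators, which is why 9 parameters only describe 4 distinct means.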
General Linear Model

Any linear model can be written as Y = Xβ + ε:

  [ Y_1 ]   [ X_11  X_21  ...  X_k1 ]   [ β_1 ]   [ ε_1 ]
  [ Y_2 ] = [ X_12  X_22  ...  X_k2 ]   [ β_2 ] + [ ε_2 ]
  [  :  ]   [  :     :          :   ]   [  :  ]   [  :  ]
  [ Y_n ]   [ X_1n  X_2n  ...  X_kn ]   [ β_k ]   [ ε_n ]

  - Y is the random vector of observed responses;
  - the elements of X are known (non-random) values;
  - the random errors ε_1, ..., ε_n are not observed.

(1) E(Y) = Xβ for some known n × k matrix X of constants and an unknown k × 1 parameter vector β.

(2) Complete the model by specifying a probability distribution for the possible values of Y (or ε). Sometimes we will only specify the covariance matrix Var(Y).

Gauss-Markov Model

Defn 3.1: The linear model Y = Xβ + ε is a Gauss-Markov model if

  E(Y) = Xβ   and   Var(Y) = Var(ε) = σ² I

for an unknown constant σ². Since ε = Y − Xβ = Y − E(Y), we have

  E(ε) = 0   and   Var(ε) = Var(Y) = σ² I.

Notation:

  Y ~ ( Xβ , σ² I )

where "~" is read "is distributed as", with mean E(Y) = Xβ and variance Var(Y) = σ² I. The distribution of Y is not completely specified.

Normal Theory Gauss-Markov Model

Defn 3.2: A normal theory Gauss-Markov model is a Gauss-Markov model in which Y (or ε) has a multivariate normal distribution:

  Y ~ N( Xβ , σ² I ).

The additional assumption of a normal distribution is
(1) not needed for some estimation results,
(2) useful in creating confidence intervals and tests of hypotheses.

Objectives

(i) Develop estimation procedures:
  - Estimate β?
  - Estimate E(Y) = Xβ
  - Estimable functions of β
(ii) Quantify uncertainty in estimates:
  - variances, standard deviations
  - distributions
  - confidence intervals

(iii) Analysis of Variance (ANOVA)

(iv) Tests of hypotheses:
  - distributions of quadratic forms
  - F-tests
  - power

(v) Sample size determination

Least Squares Estimation

For the linear model with E(Y) = Xβ and Var(Y) = Σ, we have

  Y_i = β_1 X_{1i} + β_2 X_{2i} + ... + β_k X_{ki} + ε_i = X_i^T β + ε_i,

where X_i^T = ( X_{1i}  X_{2i}  ...  X_{ki} ) is the i-th row of the model matrix X.

OLS Estimator

Defn 3.3: For a linear model with E(Y) = Xβ, any vector b that minimizes the sum of squared residuals

  Q(b) = Σ_{i=1}^n (Y_i − X_i^T b)² = (Y − Xb)^T (Y − Xb)

is an ordinary least squares (OLS) estimator for β.

OLS Estimating Equations

For j = 1, 2, ..., k, solve

  0 = ∂Q(b)/∂b_j = −2 Σ_{i=1}^n (Y_i − X_i^T b) X_{ji}.

These equations are expressed in matrix form as

  0 = X^T (Y − Xb) = X^T Y − X^T X b,   or   X^T X b = X^T Y.

These are called the "normal" equations.

If X (n × k) has full column rank, i.e., rank(X) = k, then
(i) X^T X is non-singular,
(ii) (X^T X)^{-1} exists and is unique.
Consequently, (X^T X)^{-1}(X^T X) b = (X^T X)^{-1} X^T Y, and

  b = (X^T X)^{-1} X^T Y

is the unique solution to the normal equations.

Generalized Inverse

If rank(X) < k, then
(i) there are infinitely many solutions to the normal equations;
(ii) if (X^T X)^− is a generalized inverse of X^T X, then b = (X^T X)^− X^T Y is a solution of the normal equations.

Defn 3.4: For a given m × n matrix A, any n × m matrix G that satisfies

  A G A = A

is a generalized inverse of A.

Comments:
  - We will often use A^− to denote a generalized inverse of A.
  - There may be infinitely many generalized inverses.
  - If A is an m × m nonsingular matrix, then G = A^{-1} is the unique generalized inverse for A.

Example 3.5. Consider the matrix

  A = [ 16   6  10 ]
      [  6  21  15 ]
      [ 10  15  13 ]
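Before turning to the rank-deficient case, the full-rank normal equations can be illustrated with a small NumPy sketch (the data here are made up for illustration): when rank(X) = k, b = (X^T X)^{-1} X^T Y is unique and the residual Y − Xb is orthogonal to every column of X, while duplicating a column makes X^T X singular.

```python
import numpy as np

rng = np.random.default_rng(0)

# Full column rank: the normal equations have a unique solution
X = np.column_stack([np.ones(6), rng.normal(size=6)])
Y = rng.normal(size=6)
b = np.linalg.solve(X.T @ X, X.T @ Y)   # b = (X'X)^{-1} X'Y

# Rank-deficient model matrix (effects-model style): X'X is singular
X2 = np.column_stack([X, X[:, 1]])      # third column duplicates the second
```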
with rank(A) = 2. A generalized inverse is

  G = (1/300) [ 21  −6  0 ]
              [ −6  16  0 ]
              [  0   0  0 ]

obtained by inverting the nonsingular 2 × 2 submatrix [ 16 6 ; 6 21 ] (determinant 300) and padding with zeros. Check that A G A = A. Generalized inverses are not unique; Algorithm 3.1 below shows how others can be constructed.

Example 3.2 (continued): "means" model

  [ Y_11 ]   [ 1 0 0 ]             [ ε_11 ]
  [ Y_12 ]   [ 1 0 0 ]   [ μ_1 ]   [ ε_12 ]
  [ Y_21 ] = [ 0 1 0 ]   [ μ_2 ] + [ ε_21 ]
  [ Y_31 ]   [ 0 0 1 ]   [ μ_3 ]   [ ε_31 ]
  [ Y_32 ]   [ 0 0 1 ]             [ ε_32 ]
  [ Y_33 ]   [ 0 0 1 ]             [ ε_33 ]

For this model

  X^T X = [ n_1  0   0  ]        X^T Y = [ Y_11 + Y_12        ]   [ Y_{1·} ]
          [ 0   n_2  0  ]                [ Y_21               ] = [ Y_{2·} ]
          [ 0    0  n_3 ]                [ Y_31 + Y_32 + Y_33 ]   [ Y_{3·} ]

and the unique OLS estimator for μ = (μ_1, μ_2, μ_3)^T is

  b = (X^T X)^{-1} X^T Y = [ Y_{1·}/n_1 ]   [ Ȳ_{1·} ]
                           [ Y_{2·}/n_2 ] = [ Ȳ_{2·} ]
                           [ Y_{3·}/n_3 ]   [ Ȳ_{3·} ]

Example 3.2 (continued): "effects" model

  [ Y_11 ]   [ 1 1 0 0 ]             [ ε_11 ]
  [ Y_12 ]   [ 1 1 0 0 ]   [ μ   ]   [ ε_12 ]
  [ Y_21 ] = [ 1 0 1 0 ]   [ τ_1 ] + [ ε_21 ]
  [ Y_31 ]   [ 1 0 0 1 ]   [ τ_2 ]   [ ε_31 ]
  [ Y_32 ]   [ 1 0 0 1 ]   [ τ_3 ]   [ ε_32 ]
  [ Y_33 ]   [ 1 0 0 1 ]             [ ε_33 ]

Here

  X^T X = [ n_·  n_1  n_2  n_3 ]        X^T Y = [ Y_{··} ]
          [ n_1  n_1   0    0  ]                [ Y_{1·} ]
          [ n_2   0   n_2   0  ]                [ Y_{2·} ]
          [ n_3   0    0   n_3 ]                [ Y_{3·} ]

X^T X is singular (its first column is the sum of the other three), so the normal equations have many solutions.

Solution A: One generalized inverse for X^T X is

  (X^T X)^− = [ 0    0      0      0    ]
              [ 0  1/n_1    0      0    ]
              [ 0    0    1/n_2    0    ]
              [ 0    0      0    1/n_3  ]

and a solution to the normal equations is

  b = (X^T X)^− X^T Y
    = ( 0,  Ȳ_{1·},  Ȳ_{2·},  Ȳ_{3·} )^T.

Solution B: Another generalized inverse for X^T X is obtained by inverting the upper-left 3 × 3 block. Let

  A = [ n_·  n_1  n_2 ]
      [ n_1  n_1   0  ]
      [ n_2   0   n_2 ]

and use Result 1.4(ii) to compute

  A^{-1} = (1/|A|) [ |A_11|  ... ] = (1/(n_1 n_2 n_3)) [  n_1 n_2         −n_1 n_2          −n_1 n_2       ]
                                                       [ −n_1 n_2      n_2 (n_1 + n_3)       n_1 n_2      ]
                                                       [ −n_1 n_2         n_1 n_2         n_1 (n_2 + n_3) ]

         = (1/n_3) [  1          −1                 −1           ]
                   [ −1   (n_1 + n_3)/n_1            1           ]
                   [ −1           1          (n_2 + n_3)/n_2     ]

Then

  (X^T X)^− = [ A^{-1}  0 ]
              [  0^T    0 ]

and a solution to the normal equations is

  b = (X^T X)^− X^T Y
    = (1/n_3) [  Y_{··} − Y_{1·} − Y_{2·}                    ]   [ Ȳ_{3·}          ]
              [ −Y_{··} + ((n_1+n_3)/n_1) Y_{1·} + Y_{2·}    ] = [ Ȳ_{1·} − Ȳ_{3·} ]
              [ −Y_{··} + Y_{1·} + ((n_2+n_3)/n_2) Y_{2·}    ]   [ Ȳ_{2·} − Ȳ_{3·} ]
              [  0                                           ]   [ 0               ]

This is the OLS estimator for β = (μ, τ_1, τ_2, τ_3)^T reported by PROC GLM in the SAS package, but it is not the only possible solution to the normal equations.
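The non-uniqueness just described can be checked numerically for the Example 3.2 data. In this NumPy sketch, Solution A's generalized inverse and the Moore-Penrose inverse (from np.linalg.pinv) give different solutions b of the normal equations, yet identical fitted values Xb:

```python
import numpy as np

# Effects-model design matrix for Example 3.2: columns (mu, tau1, tau2, tau3)
X = np.array([[1, 1, 0, 0],
              [1, 1, 0, 0],
              [1, 0, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 1],
              [1, 0, 0, 1]], dtype=float)
Y = np.array([62.0, 60.0, 71.0, 72.0, 68.0, 67.0])
XtX, XtY = X.T @ X, X.T @ Y

# Solution A: zero out the mu row/column, invert diag(n1, n2, n3)
G_a = np.zeros((4, 4))
G_a[1:, 1:] = np.diag([1 / 2, 1 / 1, 1 / 3])
b_a = G_a @ XtY                      # (0, Ybar1, Ybar2, Ybar3)

# A different generalized inverse: the Moore-Penrose inverse of X'X
b_b = np.linalg.pinv(XtX) @ XtY      # the minimum-norm solution
```

Both b_a and b_b satisfy X^T X b = X^T Y; only quantities of the form Xb are invariant.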
Solution C: Another generalized inverse for X^T X is

  (X^T X)^− = (1/(n_1 n_2 n_3)) [  n_2 n_3   0     −n_2 n_3          −n_2 n_3       ]
                                [    0       0        0                 0           ]
                                [ −n_2 n_3   0   n_3 (n_1 + n_2)      n_2 n_3       ]
                                [ −n_2 n_3   0      n_2 n_3       n_2 (n_1 + n_3)   ]

obtained by inverting the block over (μ, τ_2, τ_3) and setting the τ_1 row and column to zero. The corresponding solution to the normal equations is

  b = (X^T X)^− X^T Y = ( Ȳ_{1·},  0,  Ȳ_{2·} − Ȳ_{1·},  Ȳ_{3·} − Ȳ_{1·} )^T.

Solution D: Another generalized inverse for X^T X is

  (X^T X)^− = [  1/n_·    0      0      0    ]
              [ −1/n_·  1/n_1    0      0    ]
              [ −1/n_·    0    1/n_2    0    ]
              [ −1/n_·    0      0    1/n_3  ]

The corresponding solution to the normal equations is

  b = (X^T X)^− X^T Y = ( Ȳ_{··},  Ȳ_{1·} − Ȳ_{··},  Ȳ_{2·} − Ȳ_{··},  Ȳ_{3·} − Ȳ_{··} )^T.

Evaluating Generalized Inverses

Algorithm 3.1:
(i) Find any r × r nonsingular submatrix of A, where r = rank(A). Call this matrix W.
(ii) Invert and transpose W, i.e., compute (W^{-1})^T.
(iii) Replace each element of W in A with the corresponding element of (W^{-1})^T.
(iv) Replace all other elements in A with zeros.
(v) Transpose the resulting matrix to obtain G, a generalized inverse for A.

Example 3.6.

  A = [ 1  5  15 ]    with rank(A) = 2.
      [ 1  3   5 ]

Take W = [ 1  5 ]  (columns 1 and 2),  so  (W^{-1})^T = [ −3/2    1/2 ]
         [ 1  3 ]                                       [  5/2   −1/2 ]

Replacing the elements of W in A with those of (W^{-1})^T, setting the remaining entries to zero, and transposing gives

  G = [ −3/2    5/2 ]
      [  1/2   −1/2 ]
      [   0      0  ]

Check: A G A = A.
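The steps of Algorithm 3.1 can be sketched in NumPy as follows; the function name `g_inverse` and the row/column-list interface are our own, and the matrix is the A of Example 3.6:

```python
import numpy as np

def g_inverse(A, rows, cols):
    """Algorithm 3.1: generalized inverse of A built from the nonsingular
    r x r submatrix W = A[rows][:, cols], where r = rank(A)."""
    A = np.asarray(A, dtype=float)
    W = A[np.ix_(rows, cols)]                    # step (i): pick W
    B = np.zeros_like(A)                         # step (iv): zeros elsewhere
    B[np.ix_(rows, cols)] = np.linalg.inv(W).T   # steps (ii)-(iii): embed (W^-1)^T
    return B.T                                   # step (v): transpose

A = np.array([[1.0, 5.0, 15.0],
              [1.0, 3.0, 5.0]])                  # rank(A) = 2

G = g_inverse(A, [0, 1], [0, 1])                 # W from columns 1 and 2
```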
Another solution: taking instead

  W = [ 5  15 ]  (columns 2 and 3 of A),  det(W) = −20,   (W^{-1})^T = [ −5/20    3/20 ]
      [ 3   5 ]                                                        [ 15/20   −5/20 ]

gives the generalized inverse

  G = [   0       0   ]
      [ −5/20   15/20 ]
      [  3/20   −5/20 ]

Again, check that A G A = A.

Algorithm 3.2: For any m × n matrix A with rank(A) = r,
(i) compute a singular value decomposition of A (see Result 1.14) to obtain

  P A Q = [ D_{r×r}  0 ]
          [   0      0 ]

where P is an m × m orthogonal matrix, Q is an n × n orthogonal matrix, and D is an r × r diagonal matrix of nonzero singular values. Then

(ii) G = Q [ D^{-1}  F_1 ] P
           [  F_2    F_3 ]

is a generalized inverse for A, for any choice of F_1, F_2, F_3.

Proof: Check that A G A = A. Since P and Q are orthogonal (P P^T = I_{m×m}, Q Q^T = I_{n×n}), we have A = P^T [ D 0 ; 0 0 ] Q^T, and

  A G A = P^T [ D 0 ] Q^T Q [ D^{-1} F_1 ] P P^T [ D 0 ] Q^T
              [ 0 0 ]       [  F_2   F_3 ]       [ 0 0 ]

        = P^T [ D 0 ] [ D^{-1} F_1 ] [ D 0 ] Q^T
              [ 0 0 ] [  F_2   F_3 ] [ 0 0 ]

        = P^T [ I  D F_1 ] [ D 0 ] Q^T
              [ 0    0   ] [ 0 0 ]

        = P^T [ D 0 ] Q^T = A.
              [ 0 0 ]

Moore-Penrose Inverse

Defn 3.5: For any matrix A there is a unique matrix M, called the Moore-Penrose inverse, that satisfies
(i) A M A = A
(ii) M A M = M
(iii) A M is symmetric
(iv) M A is symmetric

Result 3.1:

  M = Q [ D^{-1}  0 ] P
        [   0     0 ]

is the Moore-Penrose inverse of A, where P A Q = [ D 0 ; 0 0 ] is a singular value decomposition of A.

Properties of generalized inverses of X^T X

Normal equations: (X^T X) b = X^T Y.

Result 3.3: If G is a generalized inverse of X^T X, then
(i) G^T is a generalized inverse of X^T X.
(ii) X G X^T X = X, i.e., G X^T is a generalized inverse of X.
(iii) X G X^T is invariant with respect to the choice of G.
(iv) X G X^T is symmetric.

Proof:

(i) Since G is a generalized inverse of X^T X,

  (X^T X) G (X^T X) = X^T X.

Taking the transpose of both sides,

  [ (X^T X) G (X^T X) ]^T = (X^T X)^T G^T (X^T X)^T = X^T X.

But (X^T X)^T = X^T (X^T)^T = X^T X, hence

  (X^T X) G^T (X^T X) = X^T X.

(ii) From (i), (X^T X) G^T (X^T X) = X^T X. Call (X^T X) G^T "B", so that B X^T X = X^T X, i.e., B X^T X − X^T X = 0. Then

  0 = (B X^T X − X^T X)(B^T − I)
    = B X^T X B^T − X^T X B^T − B X^T X + X^T X
    = (B X^T − X^T)(B X^T − X^T)^T
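Algorithm 3.2 and Result 3.1 can be checked numerically. With NumPy's SVD written as A = U diag(s) V^T, the slides' matrices correspond to P = U^T and Q = V; zero F-blocks give the Moore-Penrose inverse (compared below against np.linalg.pinv with an explicit rcond cutoff), while arbitrary F-blocks still give a generalized inverse. The matrix A here is our own rank-2 example:

```python
import numpy as np

# A 4x3 matrix whose third column is the sum of the first two: rank(A) = 2
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [2.0, 0.0, 2.0]])

U, s, Vt = np.linalg.svd(A)                # full SVD: A = U diag(s) Vt
r = int(np.sum(s > 1e-10 * s[0]))          # numerical rank, here r = 2

# G = Q [[D^-1, F1], [F2, F3]] P, with P = U^T and Q = Vt^T
mid = np.zeros((A.shape[1], A.shape[0]))
mid[:r, :r] = np.diag(1.0 / s[:r])
M = Vt.T @ mid @ U.T                       # F1 = F2 = F3 = 0: Moore-Penrose

rng = np.random.default_rng(1)
mid_F = rng.normal(size=mid.shape)
mid_F[:r, :r] = np.diag(1.0 / s[:r])       # keep the D^-1 block, random F blocks
G = Vt.T @ mid_F @ U.T                     # another generalized inverse
```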
Hence B X^T − X^T = 0, so B X^T = X^T, i.e.,

  (X^T X) G^T X^T = X^T.

Taking the transpose,

  X = [ (X^T X) G^T X^T ]^T = X G X^T X.

Hence G X^T is a generalized inverse for X.

(iii) Suppose F and G are generalized inverses for X^T X. Then, from (ii),

  X G X^T X = X   and   X F X^T X = X.

It follows that

  0 = X − X = X G X^T X − X F X^T X,

and therefore

  0 = (X G X^T X − X F X^T X)(G^T X^T − F^T X^T)
    = (X G X^T − X F X^T) X (G^T X^T − F^T X^T)
    = (X G X^T − X F X^T)(X G^T X^T − X F^T X^T)
    = (X G X^T − X F X^T)(X G X^T − X F X^T)^T.

Since the (i, i) diagonal element of the result of multiplying a matrix by its transpose is the sum of the squared entries in the i-th row of the matrix, the diagonal elements of the product are all zero only if all entries are zero in every row of the matrix. Consequently,

  X G X^T − X F X^T = 0.

(iv) For any generalized inverse G, T = G X^T X G^T is a symmetric generalized inverse. Then X T X^T is symmetric and, from (iii),

  X G X^T = X T X^T.

Estimation of the Mean Vector E(Y) = Xβ

For any solution to the normal equations, say b = (X^T X)^− X^T Y, the OLS estimator for E(Y) = Xβ is

  Ŷ = X b = X (X^T X)^− X^T Y = P_X Y.

The matrix

  P_X = X (X^T X)^− X^T

is called an "orthogonal projection matrix": Ŷ = P_X Y is the projection of Y onto the space spanned by the columns of X.

Residuals:

  e_i = Y_i − Ŷ_i,   i = 1, ..., n.

The vector of residuals is

  e = Y − Ŷ = Y − X b = Y − P_X Y = (I − P_X) Y.

Comment: I − P_X is a projection matrix that projects Y onto the space orthogonal to the space spanned by the columns of X.

Result 3.4: Properties of a projection matrix P_X = X (X^T X)^− X^T:
(i) P_X is invariant to the choice of (X^T X)^−: for any solution b = (X^T X)^− X^T Y to the normal equations, Ŷ = X b = P_X Y is the same. (from Result 3.3 (iii))
(ii) P_X is symmetric. (from Result 3.3 (iv))
(iii) P_X is idempotent (P_X P_X = P_X).
(iv) P_X X = X. (from Result 3.3 (ii))
(v) Partition X as X = [ X_1 | X_2 | ... | X_k ];
then P_X X_j = X_j.

Result 3.5: Properties of I − P_X:
(i) I − P_X is symmetric.
(ii) I − P_X is idempotent:

  (I − P_X)(I − P_X) = I − P_X.

(iii) (I − P_X) P_X = P_X − P_X P_X = P_X − P_X = 0.
(iv) (I − P_X) X = X − P_X X = X − X = 0.
(v) Partition X as [ X_1 | X_2 | ... | X_k ]; then (I − P_X) X_j = 0.
(vi) Residuals are invariant with respect to the choice of (X^T X)^−, so

  e = Y − X b = (I − P_X) Y

is the same for any solution b = (X^T X)^− X^T Y to the normal equations.
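Results 3.4 and 3.5 can be verified numerically. This NumPy sketch uses the rank-deficient effects-model X from Example 3.2; np.linalg.pinv supplies one generalized inverse of X^T X, and Solution A's diagonal matrix supplies another, yet both produce the same P_X:

```python
import numpy as np

# Effects-model X from Example 3.2: 4 columns, rank 3
X = np.array([[1, 1, 0, 0],
              [1, 1, 0, 0],
              [1, 0, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 1],
              [1, 0, 0, 1]], dtype=float)

# P_X built from two different generalized inverses of X'X
P1 = X @ np.linalg.pinv(X.T @ X) @ X.T    # Moore-Penrose (X'X)^-
G_a = np.diag([0.0, 1 / 2, 1.0, 1 / 3])   # Solution A's (X'X)^-
P2 = X @ G_a @ X.T

Q = np.eye(6) - P1                        # I - P_X

Y = np.array([62.0, 60.0, 71.0, 72.0, 68.0, 67.0])
e = Q @ Y                                 # residual vector (I - P_X) Y
```

The assertions below mirror the listed properties: invariance, symmetry, idempotence, P_X X = X, (I − P_X) X = 0, (I − P_X) P_X = 0, and residuals orthogonal to the columns of X.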