Topic 11: Matrix Approach to Linear Regression

Outline
• Linear Regression in Matrix Form

The Model in Scalar Form
• Yi = β0 + β1Xi + ei
• The ei are independent, Normally distributed random variables with mean 0 and variance σ²
• Consider writing out all n observations:
Y1 = β0 + β1X1 + e1
Y2 = β0 + β1X2 + e2
⋮
Yn = β0 + β1Xn + en

The Model in Matrix Form
\[
\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}
=
\begin{bmatrix} \beta_0 + \beta_1 X_1 \\ \beta_0 + \beta_1 X_2 \\ \vdots \\ \beta_0 + \beta_1 X_n \end{bmatrix}
+
\begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix}
\]

The Model in Matrix Form II
\[
\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}
=
\begin{bmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{bmatrix}
\begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix}
+
\begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix}
\]

The Design Matrix
\[
X_{n \times 2} =
\begin{bmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{bmatrix}
\]

Vector of Parameters
\[
\beta_{2 \times 1} = \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix}
\]

Vector of Error Terms
\[
\varepsilon_{n \times 1} = \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix}
\]

Vector of Responses
\[
Y_{n \times 1} = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}
\]

Simple Linear Regression in Matrix Form
\[
Y_{n \times 1} = X_{n \times 2}\,\beta_{2 \times 1} + \varepsilon_{n \times 1}
\]

Variance-Covariance Matrix
\[
\sigma^2\{Y\} =
\begin{bmatrix}
\sigma^2\{Y_1\} & \sigma\{Y_1, Y_2\} & \cdots & \sigma\{Y_1, Y_n\} \\
\sigma\{Y_2, Y_1\} & \sigma^2\{Y_2\} & \cdots & \sigma\{Y_2, Y_n\} \\
\vdots & \vdots & \ddots & \vdots \\
\sigma\{Y_n, Y_1\} & \cdots & \sigma\{Y_n, Y_{n-1}\} & \sigma^2\{Y_n\}
\end{bmatrix}
\]
Main diagonal values are the variances; off-diagonal values are the covariances.

Covariance Matrix of ε
\[
\sigma^2\{\varepsilon\} = \mathrm{Cov}\begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix} = \sigma^2 I_{n \times n}
\]
Independent errors mean that the covariance of any two error terms is zero. Common variance implies the main diagonal values are all equal.

Covariance Matrix of Y
\[
\sigma^2\{Y\} = \mathrm{Cov}\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix} = \sigma^2 I_{n \times n}
\]

Distributional Assumptions in Matrix Form
• ε ~ N(0, σ²I)
• I is an n × n identity matrix
• Ones in the diagonal elements specify that the variance of each ei is 1 times σ²
• Zeros in the off-diagonal elements specify that the covariance between different ei is zero
• This implies that the correlations are zero

Least Squares
• We want to minimize (Y − Xβ)'(Y − Xβ)
• We take the derivative with respect to the (vector) β
• This is like a quadratic function
• Recall the function we minimized using the scalar form

Least Squares, continued
• By the chain rule, the derivative is 2 times the derivative of (Y − Xβ) with respect to β (which is −X'), times (Y − Xβ)
• In other words, −2X'(Y − Xβ)
• We set this equal to 0 (a vector of zeros)
• So, −2X'(Y − Xβ) = 0
• Or, X'Y = X'Xβ (the normal equations)

Normal Equations
• X'Y = (X'X)β
• Solving for β gives the least squares solution b = (b0, b1)'
• b = (X'X)⁻¹X'Y
• See KNNL p. 199 for details
• This same matrix approach works for multiple regression! (A numerical sketch is given at the end of this topic.)

Fitted Values
\[
\hat{Y} =
\begin{bmatrix} \hat{Y}_1 \\ \hat{Y}_2 \\ \vdots \\ \hat{Y}_n \end{bmatrix}
=
\begin{bmatrix} b_0 + b_1 X_1 \\ b_0 + b_1 X_2 \\ \vdots \\ b_0 + b_1 X_n \end{bmatrix}
=
\begin{bmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{bmatrix}
\begin{bmatrix} b_0 \\ b_1 \end{bmatrix}
= Xb
\]

Hat Matrix
\[
\hat{Y} = Xb, \qquad
\hat{Y} = X(X'X)^{-1}X'Y, \qquad
\hat{Y} = HY \;\text{ where }\; H = X(X'X)^{-1}X'
\]
We'll use this matrix when assessing diagnostics in multiple regression.

Estimated Covariance Matrix of b
• The estimator b is a linear combination of the elements of Y
• These estimates are Normal if Y is Normal
• These estimates will be approximately Normal in general

A Useful Multivariate Theorem
• U ~ N(μ, Σ), a multivariate Normal vector
• V = c + DU, a linear transformation of U, where c is a vector and D is a matrix
• Then V ~ N(c + Dμ, DΣD')

Application to b
• b = (X'X)⁻¹X'Y = ((X'X)⁻¹X')Y
• Since Y ~ N(Xβ, σ²I), the vector b is Normally distributed with mean (X'X)⁻¹X'Xβ = β and covariance σ²((X'X)⁻¹X') I ((X'X)⁻¹X')' = σ²(X'X)⁻¹

Background Reading
• We will use this framework to do multiple regression, where we have more than one explanatory variable
• Adding another explanatory variable amounts to adding another column to the design matrix
• See Chapter 6
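
Appendix: Applying the Multivariate Theorem to b

Spelling out the "Application to b" slide, take U = Y, c = 0, and D = (X'X)⁻¹X' in the theorem above. A short worked derivation (added here for completeness; it is not one of the original slides) gives the mean and covariance of b:
\[
E\{b\} = D\,E\{Y\} = (X'X)^{-1}X'X\beta = \beta
\]
\[
\sigma^2\{b\} = D\,\sigma^2\{Y\}\,D' = (X'X)^{-1}X'(\sigma^2 I)\,X(X'X)^{-1} = \sigma^2 (X'X)^{-1}
\]
The last step uses the fact that (X'X)⁻¹ is symmetric, so it equals its own transpose.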
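
Appendix: Numerical Sketch

As a concrete check on the matrix formulas above (normal equations, hat matrix, fitted values, and σ²(X'X)⁻¹), here is a minimal sketch in Python with NumPy. The simulated data, variable names, and the choice of σ = 2 are illustrative assumptions, not taken from KNNL.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data for illustration (assumed values, not from KNNL)
n = 50
x = rng.uniform(0, 10, size=n)
e = rng.normal(0, 2.0, size=n)        # errors with sigma = 2
y = 3.0 + 1.5 * x + e                 # true beta0 = 3, beta1 = 1.5

# Design matrix X: a column of ones and the predictor
X = np.column_stack([np.ones(n), x])  # n x 2

# Normal equations: X'Y = (X'X) b  =>  b = (X'X)^{-1} X'Y
XtX = X.T @ X
XtY = X.T @ y
b = np.linalg.solve(XtX, XtY)         # solve rather than form the inverse
print("b0, b1 =", b)

# Hat matrix H = X (X'X)^{-1} X' and fitted values Y_hat = H Y = X b
H = X @ np.linalg.solve(XtX, X.T)
y_hat = H @ y
assert np.allclose(y_hat, X @ b)      # H Y equals X b

# Estimated covariance matrix of b: s^2 (X'X)^{-1},
# where s^2 = SSE / (n - 2) estimates sigma^2
resid = y - y_hat
s2 = resid @ resid / (n - 2)
cov_b = s2 * np.linalg.inv(XtX)
print("estimated covariance of b:\n", cov_b)
```

Using np.linalg.solve instead of explicitly forming (X'X)⁻¹ is the usual numerical choice; the explicit inverse appears only at the end to mirror the σ²(X'X)⁻¹ formula from the slides.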