The K-Variable Linear Model
Walter Sosa-Escudero
February 3, 2012

Outline: The Model · Model in matrix form · Goodness of fit and inference

Motivation

Consider the following natural generalization:

    Yi = β1 + β2 X2i + … + βK XKi + ui ,   i = 1, …, n

This is the K-variable linear model. Why K? Because it is as if the first variable were X1i = 1 for every i.

Example

    Yi = β1 + β2 X2i + β3 X3i + ui

Yi is consumption of family i, X2i is income of family i, and X3i is wealth of family i.

The classical assumptions

1 Linearity: Yi = β1 + β2 X2i + … + βK XKi + ui , i = 1, …, n.
2 Non-random X: Xki , k = 1, …, K, are taken as deterministic, non-random variables.
3 Zero mean: E(ui) = 0, i = 1, …, n.
4 Homoskedasticity: V(ui) = σ² , i = 1, …, n.
5 No serial correlation: E(ui uj) = 0, i ≠ j.
6 No multicollinearity: none of the explanatory variables can be expressed as an exact linear combination of the others.

Interpretations

As before, taking expectations,

    E(Yi) = β1 + β2 X2i + … + βK XKi

hence

    ∂E(Yi)/∂Xki = βk

Coefficients are partial derivatives: βk measures the effect of Xki on E(Yi) holding the remaining explanatory variables fixed. Regression acts as control and replaces experimentation. A common mistake is to interpret the coefficients as total derivatives. Example: parents' education.

Least squares estimation

Let β̂k , k = 1, …, K, be the OLS estimators, and define

    Ŷi ≡ β̂1 + β̂2 X2i + … + β̂K XKi ,    ei ≡ Yi − Ŷi

Then the OLS estimators β̂1 , …, β̂K are the solutions to

    min Σᵢ eᵢ²

with respect to β̂1 , …, β̂K.

The model in matrix notation

Define the following vectors and matrices:

    Y = (y1 , y2 , …, yn)′   (n × 1)
    β = (β1 , β2 , …, βK)′   (K × 1)
    u = (u1 , u2 , …, un)′   (n × 1)

        ⎡ x11  x21  …  xK1 ⎤
        ⎢ x12  x22  …  xK2 ⎥
    X = ⎢  ⋮    ⋮       ⋮  ⎥   (n × K)
        ⎣ x1n  x2n  …  xKn ⎦

The linear model

    Yi = β1 + β2 X2i + … + βK XKi + ui ,   i = 1, …, n

is actually a system of n equations:

    Y1 = β1 + β2 X21 + … + βK XK1 + u1
    Y2 = β1 + β2 X22 + … + βK XK2 + u2
     ⋮
    Yn = β1 + β2 X2n + … + βK XKn + un

Stacking them, the linear model can be written as

    Y = Xβ + u

This is the linear model in matrix form.
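To see the equivalence concretely, here is a minimal numerical sketch in Python with NumPy (the sample size, coefficient values, and simulated data are hypothetical, chosen only for illustration): the matrix product Xβ + u reproduces the n equations written out one by one.

    import numpy as np

    n, K = 5, 3
    rng = np.random.default_rng(0)

    # Design matrix: the first column is the constant X1i = 1 for every i
    X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
    beta = np.array([1.0, 0.5, -2.0])   # hypothetical coefficients
    u = rng.normal(size=n)              # disturbances

    # All n equations at once: Y = X beta + u
    Y = X @ beta + u

    # The same thing, equation by equation
    Y_loop = np.array([X[i] @ beta + u[i] for i in range(n)])
    assert np.allclose(Y, Y_loop)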
Basic results on matrices and random vectors

Before we proceed, we need to establish some results involving matrices and vectors.

Let A be an m × n matrix, viewed as n column vectors or as m row vectors. The column rank of A is defined as the maximum number of linearly independent columns; similarly, the row rank is the maximum number of linearly independent rows.

The row rank is equal to the column rank, so we will talk, in general, about the rank of a matrix A, and denote it ρ(A).

Let A be a square (m × m) matrix. A is non-singular if |A| ≠ 0. In that case there exists a unique non-singular matrix A⁻¹, called the inverse of A, such that

    A A⁻¹ = A⁻¹ A = Im

For a square m × m matrix A:

    If ρ(A) = m ⇒ |A| ≠ 0
    If ρ(A) < m ⇒ |A| = 0

For an n × K matrix X with ρ(X) = K (full column rank):

    ρ(X) = ρ(X′X) = K

This result guarantees the existence of (X′X)⁻¹ based on the rank of X.

Let Y = (Y1 , …, YK)′ be a vector of K random variables. Its expectation is the vector of expectations:

    E(Y) = μ = (E(Y1), E(Y2), …, E(YK))′

Its variance is the matrix with elements E(Yi − μi)(Yj − μj), i.e., variances on the diagonal and covariances off it:

    V(Y) = E[(Y − μ)(Y − μ)′]

          ⎡ V(Y1)        Cov(Y1, Y2)  …  Cov(Y1, YK) ⎤
        = ⎢ Cov(Y2, Y1)  V(Y2)        …  Cov(Y2, YK) ⎥
          ⎢      ⋮            ⋮       ⋱       ⋮      ⎥
          ⎣ Cov(YK, Y1)  Cov(YK, Y2)  …  V(YK)       ⎦

The variance of a vector is called its variance-covariance matrix, a K × K matrix. If V(Y) = Σ and c is a K × 1 vector, then V(c′Y) = c′V(Y)c = c′Σc.

Classical assumptions in matrix form

1 Linearity: Y = Xβ + u.
2 Non-random X: X is a deterministic matrix.
3 Zero mean: E(u) = 0.
4 Homoskedasticity and no serial correlation: V(u) = σ² In.
5 No multicollinearity: ρ(X) = K.

OLS estimator in matrix form

It can be shown (we'll do it later) that the OLS estimator can be expressed as:

    β̂ = (X′X)⁻¹ X′Y

Properties

1 Linearity: β̂ = AY for some matrix A (here A = (X′X)⁻¹X′).
2 Unbiasedness: E(β̂) = β.
3 Variance: V(β̂) = σ² (X′X)⁻¹.
4 Gauss-Markov Theorem: under all the classical assumptions, β̂ has the minimum variance in the class of all linear and unbiased estimators.

Proof of unbiasedness

Unbiasedness: E(β̂) = β.

    β̂ = (X′X)⁻¹ X′Y
       = (X′X)⁻¹ X′(Xβ + u)
       = β + (X′X)⁻¹ X′u

    E(β̂) = β + E[(X′X)⁻¹ X′u]
          = β + (X′X)⁻¹ X′ E[u]
          = β                     (since E(u) = 0)

How does heteroskedasticity affect unbiasedness? Which assumptions do we use, and which ones don't we?

Goodness of fit

It can be shown that the decomposition of squared errors still holds for the K-variable model, that is,

    Σ (Yi − Ȳ)² = Σ (Ŷi − Ȳ)² + Σ eᵢ²

so our old R² provides a goodness-of-fit measure:

    R² ≡ ESS/TSS = 1 − RSS/TSS

Comments and properties of R²

    R² ≡ ESS/TSS = 1 − RSS/TSS

0 ≤ R² ≤ 1 (as before).
β̂ maximizes R² (why?).
R² is non-decreasing in the number of explanatory variables, K (why?).
Use and abuse of R².
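The estimator and the goodness-of-fit decomposition are easy to compute directly. Below is a minimal sketch in Python with NumPy on simulated data (the coefficient values and sample size are hypothetical, used only to illustrate the formulas above):

    import numpy as np

    rng = np.random.default_rng(42)
    n, K = 100, 3

    # Simulated design with a constant column and hypothetical true coefficients
    X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
    beta_true = np.array([1.0, 0.5, -2.0])
    Y = X @ beta_true + rng.normal(size=n)

    # OLS estimator: beta_hat = (X'X)^{-1} X'Y
    beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)

    # Fitted values, residuals, and the sums-of-squares decomposition
    Y_hat = X @ beta_hat
    e = Y - Y_hat
    TSS = np.sum((Y - Y.mean()) ** 2)
    ESS = np.sum((Y_hat - Y.mean()) ** 2)
    RSS = np.sum(e ** 2)

    R2 = ESS / TSS                      # equivalently, 1 - RSS/TSS
    assert np.isclose(TSS, ESS + RSS)   # decomposition holds with an intercept
    print(beta_hat, R2)

Using np.linalg.solve on X′X avoids forming the inverse explicitly, which is numerically preferable to np.linalg.inv(X.T @ X) @ X.T @ Y while computing the same β̂.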
Simple hypotheses

Assumption (normality): ui ∼ N(0, σ²).

Then, as before, under all the classical assumptions and when H0: βk = β0 holds,

    t ≡ (β̂k − β0) / √V̂(β̂k)  ∼  t(n − K)

where V̂(β̂k) is the k-th diagonal element of S²(X′X)⁻¹ and S² = Σ eᵢ²/(n − K). So standard t tests are implemented as in the two-variable case.

Linear combinations of coefficients

Consider the following hypothesis:

    H0: a1 βj + a2 βi = r

where a1, a2 and r are numbers. The test statistic for this hypothesis is

    t = (a1 β̂j + a2 β̂i − r) / √V̂(a1 β̂j + a2 β̂i)
      = (a1 β̂j + a2 β̂i − r) / √( a1² V̂(β̂j) + a2² V̂(β̂i) + 2 a1 a2 Ĉov(β̂j, β̂i) )

Global significance

Consider the hypothesis

    H0: β2 = β3 = ⋯ = βK = 0

against

    HA: β2 ≠ 0 ∨ β3 ≠ 0 ∨ ⋯ ∨ βK ≠ 0

Under the null, none of the explanatory variables accounts for Y; under the alternative, at least one variable is relevant. The test statistic for this case is given by

    F = [ESS/(K − 1)] / [RSS/(n − K)]

which has the F(K − 1, n − K) distribution under H0. Idea: reject if F is too large.

Alternatively, note that dividing numerator and denominator by TSS,

    F = [ESS/(K − 1)] / [RSS/(n − K)] = [R²/(K − 1)] / [(1 − R²)/(n − K)]

so the F test is checking whether R² is significantly different from zero.
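As a closing illustration, here is a minimal sketch of the global F test computed from R², in Python with SciPy (the values of n, K, and R² are hypothetical, chosen only to show the mechanics):

    from scipy import stats

    # Hypothetical values for illustration
    n, K = 100, 4     # sample size and number of regressors (incl. the constant)
    R2 = 0.25         # R-squared of the fitted regression

    # F statistic for H0: beta_2 = ... = beta_K = 0
    F = (R2 / (K - 1)) / ((1 - R2) / (n - K))

    # Reject H0 at the 5% level if F exceeds the F(K-1, n-K) critical value
    crit = stats.f.ppf(0.95, K - 1, n - K)
    p_value = stats.f.sf(F, K - 1, n - K)
    print(F, crit, p_value, F > crit)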