Introduction to the Gauss-Markov Linear Model

© Copyright 2012 Dan Nettleton (Iowa State University), Statistics 511

Random Vectors

$$y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}$$

is a random vector if and only if each element of $y$ is a random variable (i.e., $y_i$ is a random variable $\forall\ i = 1, \ldots, n$).

The mean of the random vector $y$ is

$$E(y) = \begin{bmatrix} E(y_1) \\ E(y_2) \\ \vdots \\ E(y_n) \end{bmatrix}.$$

The variance of the random vector $y$ is the matrix whose $i,j$th element is $\mathrm{Cov}(y_i, y_j) = E(y_i y_j) - E(y_i)E(y_j)$.

Example: Variance of a Random Vector

For example, the variance of $y = [y_1, y_2, y_3]'$ is

$$\mathrm{Var}(y) = \begin{bmatrix} \mathrm{Cov}(y_1, y_1) & \mathrm{Cov}(y_1, y_2) & \mathrm{Cov}(y_1, y_3) \\ \mathrm{Cov}(y_2, y_1) & \mathrm{Cov}(y_2, y_2) & \mathrm{Cov}(y_2, y_3) \\ \mathrm{Cov}(y_3, y_1) & \mathrm{Cov}(y_3, y_2) & \mathrm{Cov}(y_3, y_3) \end{bmatrix} = \begin{bmatrix} \mathrm{Var}(y_1) & \mathrm{Cov}(y_1, y_2) & \mathrm{Cov}(y_1, y_3) \\ \mathrm{Cov}(y_2, y_1) & \mathrm{Var}(y_2) & \mathrm{Cov}(y_2, y_3) \\ \mathrm{Cov}(y_3, y_1) & \mathrm{Cov}(y_3, y_2) & \mathrm{Var}(y_3) \end{bmatrix}.$$

The Gauss-Markov Linear Model

$$y = X\beta + \varepsilon$$

$y$ is an $n \times 1$ random vector of responses.

$X$ is an $n \times p$ matrix of constants with columns corresponding to explanatory variables. $X$ is sometimes referred to as the design matrix.

$\beta$ is an unknown parameter vector in $\mathbb{R}^p$.

$\varepsilon$ is an $n \times 1$ random vector of errors.

$E(\varepsilon) = 0$ and $\mathrm{Var}(\varepsilon) = \sigma^2 I$, where $\sigma^2$ is an unknown parameter in $\mathbb{R}^+$.

The Gauss-Markov Linear Model

Note that the model is not completely specified because the distribution of $y$ is not completely specified.

$$y = X\beta + \varepsilon,\quad E(\varepsilon) = 0,\quad \mathrm{Var}(\varepsilon) = \sigma^2 I$$
$$\Longrightarrow\quad E(y) = X\beta,\quad \mathrm{Var}(y) = \sigma^2 I$$
$$\Longrightarrow\quad y \sim (X\beta, \sigma^2 I)$$

"$y$ has a distribution with mean $X\beta$ and variance $\sigma^2 I$."

The Normal Theory Gauss-Markov Linear Model

We often add an assumption of multivariate normality to the Gauss-Markov linear model: $\varepsilon \sim N(0, \sigma^2 I)$.

The assumption $\varepsilon \sim N(0, \sigma^2 I)$ is equivalent to $\varepsilon_1, \ldots, \varepsilon_n \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)$.
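A small simulation can make the normal theory model concrete. The sketch below assumes NumPy and uses hypothetical values for $X$, $\beta$ and $\sigma^2$ (none of them come from the lecture). It draws many error vectors with i.i.d. $N(0, \sigma^2)$ entries, forms $y = X\beta + \varepsilon$, and compares the sample mean and sample covariance of $y$ with $X\beta$ and $\sigma^2 I$.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is reproducible

# Hypothetical design matrix, parameter vector and error variance
# (illustration only; not taken from the lecture).
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
beta = np.array([2.0, 0.5])
sigma2 = 4.0

n = X.shape[0]
n_draws = 200_000

# Each row of eps is one draw of the error vector:
# n independent N(0, sigma2) entries, i.e. eps ~ N(0, sigma2 * I).
eps = rng.normal(loc=0.0, scale=np.sqrt(sigma2), size=(n_draws, n))

# y = X beta + eps, one simulated response vector per row.
Y = X @ beta + eps

sample_mean = Y.mean(axis=0)          # should be close to X beta
sample_cov = np.cov(Y, rowvar=False)  # should be close to sigma2 * I

print("X beta      :", X @ beta)
print("sample mean :", np.round(sample_mean, 2))
print("sample cov  :")
print(np.round(sample_cov, 2))
```

The off-diagonal entries of the sample covariance should be near zero and the diagonal entries near $\sigma^2$, which is what $\mathrm{Var}(y) = \sigma^2 I$ says.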
The assumption $\varepsilon \sim N(0, \sigma^2 I) \Longrightarrow y \sim N(X\beta, \sigma^2 I)$, i.e., $y_1, \ldots, y_n$ are independent normal random variables, $\mathrm{Var}(y_i) = \sigma^2\ \forall\ i = 1, \ldots, n$, and $E(y_i) = x_{(i)}'\beta$ (where $x_{(i)}'$ is the $i$th row of $X$) $\forall\ i = 1, \ldots, n$.

Goal of Analysis

$$y = X\beta + \varepsilon$$

The analysis often focuses on answering questions about certain linear functions of $\beta$ of the form $C\beta$ for a specified matrix $C$.

The normality assumption is useful for constructing confidence intervals and performing tests concerning $C\beta$.

Example 1

Researchers harvested five randomly selected ears of corn from a field. For $i = 1, \ldots, 5$; let $y_i$ denote the weight in grams of the $i$th ear.

$$y_1, \ldots, y_5 \overset{\text{i.i.d.}}{\sim} N(\mu, \sigma^2)$$

$$y_i = \mu + \varepsilon_i,\quad i = 1, \ldots, 5;\qquad \varepsilon_1, \ldots, \varepsilon_5 \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)$$

$$\begin{aligned} y_1 &= \mu + \varepsilon_1 \\ y_2 &= \mu + \varepsilon_2 \\ y_3 &= \mu + \varepsilon_3 \\ y_4 &= \mu + \varepsilon_4 \\ y_5 &= \mu + \varepsilon_5 \end{aligned}\qquad \varepsilon_1, \ldots, \varepsilon_5 \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)$$

Example 1 (continued)

$$\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \end{bmatrix} = \begin{bmatrix} \mu \\ \mu \\ \mu \\ \mu \\ \mu \end{bmatrix} + \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \varepsilon_4 \\ \varepsilon_5 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} [\mu] + \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \varepsilon_4 \\ \varepsilon_5 \end{bmatrix},\qquad \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \varepsilon_4 \\ \varepsilon_5 \end{bmatrix} \sim N(0, \sigma^2 I)$$

$$y = X\beta + \varepsilon,\qquad \varepsilon \sim N(0, \sigma^2 I)$$

$$C\beta = [1][\mu] = \mu$$

Example 2

Researchers randomly assigned eight experimental units to two treatments and measured a response of interest. For $i = 1, 2$; let $y_{i1}, y_{i2}, y_{i3}, y_{i4}$ denote the responses of the experimental units in the $i$th treatment group.
$$y_{11}, y_{12}, y_{13}, y_{14} \overset{\text{i.i.d.}}{\sim} N(\mu_1, \sigma^2)\quad \text{independent of}\quad y_{21}, y_{22}, y_{23}, y_{24} \overset{\text{i.i.d.}}{\sim} N(\mu_2, \sigma^2)$$

$$y_{ij} = \mu_i + \varepsilon_{ij},\quad i = 1, 2;\ j = 1, \ldots, 4;\qquad \varepsilon_{11}, \varepsilon_{12}, \varepsilon_{13}, \varepsilon_{14}, \varepsilon_{21}, \varepsilon_{22}, \varepsilon_{23}, \varepsilon_{24} \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)$$

Example 2 (continued)

$$\begin{aligned} y_{11} &= \mu_1 + \varepsilon_{11} \\ y_{12} &= \mu_1 + \varepsilon_{12} \\ y_{13} &= \mu_1 + \varepsilon_{13} \\ y_{14} &= \mu_1 + \varepsilon_{14} \\ y_{21} &= \mu_2 + \varepsilon_{21} \\ y_{22} &= \mu_2 + \varepsilon_{22} \\ y_{23} &= \mu_2 + \varepsilon_{23} \\ y_{24} &= \mu_2 + \varepsilon_{24} \end{aligned}\qquad \varepsilon_{11}, \ldots, \varepsilon_{24} \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)$$

Example 2 (continued)

$$\begin{bmatrix} y_{11} \\ y_{12} \\ y_{13} \\ y_{14} \\ y_{21} \\ y_{22} \\ y_{23} \\ y_{24} \end{bmatrix} = \begin{bmatrix} \mu_1 \\ \mu_1 \\ \mu_1 \\ \mu_1 \\ \mu_2 \\ \mu_2 \\ \mu_2 \\ \mu_2 \end{bmatrix} + \begin{bmatrix} \varepsilon_{11} \\ \varepsilon_{12} \\ \varepsilon_{13} \\ \varepsilon_{14} \\ \varepsilon_{21} \\ \varepsilon_{22} \\ \varepsilon_{23} \\ \varepsilon_{24} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 1 & 0 \\ 1 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 1 \\ 0 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix} + \begin{bmatrix} \varepsilon_{11} \\ \varepsilon_{12} \\ \varepsilon_{13} \\ \varepsilon_{14} \\ \varepsilon_{21} \\ \varepsilon_{22} \\ \varepsilon_{23} \\ \varepsilon_{24} \end{bmatrix},\qquad \begin{bmatrix} \varepsilon_{11} \\ \vdots \\ \varepsilon_{24} \end{bmatrix} \sim N(0, \sigma^2 I)$$

$$y = X\beta + \varepsilon,\qquad \varepsilon \sim N(0, \sigma^2 I)$$

$$C\beta = [1, -1] \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix} = \mu_1 - \mu_2$$

Example 3

Suppose eight fertilizer amounts denoted $x_1, \ldots, x_8$ were randomly assigned to eight field plots. For $i = 1, \ldots, 8$; let $y_i$ denote the yield of the plot that received fertilizer amount $x_i$.

$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i,\quad i = 1, \ldots, 8;\qquad \varepsilon_1, \ldots, \varepsilon_8 \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)$$

$$\begin{aligned} y_1 &= \beta_0 + \beta_1 x_1 + \varepsilon_1 \\ y_2 &= \beta_0 + \beta_1 x_2 + \varepsilon_2 \\ y_3 &= \beta_0 + \beta_1 x_3 + \varepsilon_3 \\ y_4 &= \beta_0 + \beta_1 x_4 + \varepsilon_4 \\ y_5 &= \beta_0 + \beta_1 x_5 + \varepsilon_5 \\ y_6 &= \beta_0 + \beta_1 x_6 + \varepsilon_6 \\ y_7 &= \beta_0 + \beta_1 x_7 + \varepsilon_7 \\ y_8 &= \beta_0 + \beta_1 x_8 + \varepsilon_8 \end{aligned}$$

Example 3 (continued)

$$\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \\ y_6 \\ y_7 \\ y_8 \end{bmatrix} = \begin{bmatrix} \beta_0 + \beta_1 x_1 \\ \beta_0 + \beta_1 x_2 \\ \beta_0 + \beta_1 x_3 \\ \beta_0 + \beta_1 x_4 \\ \beta_0 + \beta_1 x_5 \\ \beta_0 + \beta_1 x_6 \\ \beta_0 + \beta_1 x_7 \\ \beta_0 + \beta_1 x_8 \end{bmatrix} + \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \varepsilon_4 \\ \varepsilon_5 \\ \varepsilon_6 \\ \varepsilon_7 \\ \varepsilon_8 \end{bmatrix} = \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ 1 & x_3 \\ 1 & x_4 \\ 1 & x_5 \\ 1 & x_6 \\ 1 & x_7 \\ 1 & x_8 \end{bmatrix} \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix} + \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \varepsilon_4 \\ \varepsilon_5 \\ \varepsilon_6 \\ \varepsilon_7 \\ \varepsilon_8 \end{bmatrix},\qquad \begin{bmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_8 \end{bmatrix} \sim N(0, \sigma^2 I)$$

$$y = X\beta + \varepsilon,\qquad \varepsilon \sim N(0, \sigma^2 I)$$

$$C\beta = [0, 1] \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix} = \beta_1$$

Example 4

Eight hogs were randomly assigned to two diets and two inoculations such that two hogs received each combination of diet and inoculation.

This experiment involves two factors: diet and inoculation. In this case, each factor has two levels (denoted here generically as 1 and 2).

A combination of one level from each factor forms a treatment. In this case, we have four treatments:

$$\begin{array}{l|cccc} \text{Treatment} & 1 & 2 & 3 & 4 \\ \hline \text{Diet} & 1 & 1 & 2 & 2 \\ \text{Inoculation} & 1 & 2 & 1 & 2 \end{array}$$

Example 4 (continued)

For $i = 1, 2$; $j = 1, 2$; and $k = 1, 2$; let $y_{ijk}$ denote the average daily gain of the $k$th hog that received diet $i$ and inoculation $j$.

$$y_{ijk} = \mu + \varepsilon_{ijk},\quad i = 1, 2;\ j = 1, 2;\ k = 1, 2;\qquad \varepsilon_{111}, \varepsilon_{112}, \varepsilon_{121}, \varepsilon_{122}, \varepsilon_{211}, \varepsilon_{212}, \varepsilon_{221}, \varepsilon_{222} \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)$$

Under this model, neither diet nor inoculation affects average daily gain.
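Examples 1 through 3 each come down to choosing $X$, $\beta$ and $C$, and the intercept-only model just stated uses the same kind of $X$ as Example 1 (a single column of ones, here with eight rows). The following sketch, assuming NumPy, builds the three design matrices and evaluates $X\beta$ and $C\beta$. All numeric values (the fertilizer amounts `x` and the entries of `beta1`, `beta2`, `beta3`) are hypothetical and chosen only for illustration.

```python
import numpy as np

# Example 1: five ears of corn with a common mean mu.
X1 = np.ones((5, 1))
C1 = np.array([[1.0]])

# Example 2: two treatment groups with four units each.
# kron(I_2, 1_4) stacks a block of [1, 0] rows over a block of [0, 1] rows.
X2 = np.kron(np.eye(2), np.ones((4, 1)))
C2 = np.array([[1.0, -1.0]])

# Example 3: simple linear regression on fertilizer amount.
# The amounts below are hypothetical; the lecture leaves x_1, ..., x_8 unspecified.
x = np.array([0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0])
X3 = np.column_stack([np.ones_like(x), x])
C3 = np.array([[0.0, 1.0]])

# Hypothetical parameter values (in practice these are unknown).
beta1 = np.array([250.0])      # mu
beta2 = np.array([10.0, 7.5])  # mu_1, mu_2
beta3 = np.array([2.0, 0.8])   # beta_0, beta_1

examples = [
    ("Example 1", X1, C1, beta1),
    ("Example 2", X2, C2, beta2),
    ("Example 3", X3, C3, beta3),
]

for name, X, C, beta in examples:
    print(name)
    print("  X shape:", X.shape)
    print("  E(y) = X beta:", X @ beta)
    print("  C beta:", C @ beta)
```

In each case $C\beta$ picks out the quantity of interest: $\mu$, $\mu_1 - \mu_2$, and the slope $\beta_1$, respectively.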
Example 4 (continued)

$$y_{ijk} = \mu + \alpha_i + \varepsilon_{ijk},\quad i = 1, 2;\ j = 1, 2;\ k = 1, 2;\qquad \varepsilon_{111}, \varepsilon_{112}, \ldots, \varepsilon_{222} \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)$$

Under this model, only diet affects average daily gain.

Example 4 (continued)

$$y_{ijk} = \mu + \beta_j + \varepsilon_{ijk},\quad i = 1, 2;\ j = 1, 2;\ k = 1, 2;\qquad \varepsilon_{111}, \varepsilon_{112}, \ldots, \varepsilon_{222} \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)$$

Under this model, only inoculation affects average daily gain.

Example 4 (continued)

$$y_{ijk} = \mu + \alpha_i + \beta_j + \varepsilon_{ijk},\quad i = 1, 2;\ j = 1, 2;\ k = 1, 2;\qquad \varepsilon_{111}, \varepsilon_{112}, \ldots, \varepsilon_{222} \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)$$

Under this model, factors diet and inoculation affect the mean average daily gain in an additive manner. There is no interaction between the factors diet and inoculation.

$$\begin{array}{l|cc|c} \text{diet} & \text{inoculation 1} & \text{inoculation 2} & \text{inoculation difference} \\ \hline 1 & \mu + \alpha_1 + \beta_1 & \mu + \alpha_1 + \beta_2 & \beta_1 - \beta_2 \\ 2 & \mu + \alpha_2 + \beta_1 & \mu + \alpha_2 + \beta_2 & \beta_1 - \beta_2 \\ \hline \text{diet difference} & \alpha_1 - \alpha_2 & \alpha_1 - \alpha_2 & \end{array}$$

Example 4 (continued)

$$y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \varepsilon_{ijk},\quad i = 1, 2;\ j = 1, 2;\ k = 1, 2;\qquad \varepsilon_{111}, \varepsilon_{112}, \ldots, \varepsilon_{222} \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)$$

Under this model, there is one mean for each combination of diet and inoculation. Those four means are free to take any four values with no restrictions.
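$$\begin{array}{l|cc|c} \text{diet} & \text{inoculation 1} & \text{inoculation 2} & \Delta\text{inoculation} \\ \hline 1 & \mu + \alpha_1 + \beta_1 + \gamma_{11} & \mu + \alpha_1 + \beta_2 + \gamma_{12} & \beta_1 - \beta_2 + \gamma_{11} - \gamma_{12} \\ 2 & \mu + \alpha_2 + \beta_1 + \gamma_{21} & \mu + \alpha_2 + \beta_2 + \gamma_{22} & \beta_1 - \beta_2 + \gamma_{21} - \gamma_{22} \\ \hline \Delta\text{diet} & \alpha_1 - \alpha_2 + \gamma_{11} - \gamma_{21} & \alpha_1 - \alpha_2 + \gamma_{12} - \gamma_{22} & \end{array}$$

Example 4 (continued)

An equivalent model is the so-called cell means model:

$$y_{ijk} = \mu_{ij} + \varepsilon_{ijk},\quad i = 1, 2;\ j = 1, 2;\ k = 1, 2;\qquad \varepsilon_{111}, \varepsilon_{112}, \ldots, \varepsilon_{222} \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)$$

$$\begin{array}{l|cc|c} \text{diet} & \text{inoculation 1} & \text{inoculation 2} & \Delta\text{inoculation} \\ \hline 1 & \mu_{11} & \mu_{12} & \mu_{11} - \mu_{12} \\ 2 & \mu_{21} & \mu_{22} & \mu_{21} - \mu_{22} \\ \hline \Delta\text{diet} & \mu_{11} - \mu_{21} & \mu_{12} - \mu_{22} & \end{array}$$

Example 4 (continued)

$$y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \varepsilon_{ijk},\quad i = 1, 2;\ j = 1, 2;\ k = 1, 2$$

$$\begin{aligned} y_{111} &= \mu + \alpha_1 + \beta_1 + \gamma_{11} + \varepsilon_{111} \\ y_{112} &= \mu + \alpha_1 + \beta_1 + \gamma_{11} + \varepsilon_{112} \\ y_{121} &= \mu + \alpha_1 + \beta_2 + \gamma_{12} + \varepsilon_{121} \\ y_{122} &= \mu + \alpha_1 + \beta_2 + \gamma_{12} + \varepsilon_{122} \\ y_{211} &= \mu + \alpha_2 + \beta_1 + \gamma_{21} + \varepsilon_{211} \\ y_{212} &= \mu + \alpha_2 + \beta_1 + \gamma_{21} + \varepsilon_{212} \\ y_{221} &= \mu + \alpha_2 + \beta_2 + \gamma_{22} + \varepsilon_{221} \\ y_{222} &= \mu + \alpha_2 + \beta_2 + \gamma_{22} + \varepsilon_{222} \end{aligned}\qquad \varepsilon_{111}, \varepsilon_{112}, \ldots, \varepsilon_{222} \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)$$

Example 4 (continued)

$$\begin{bmatrix} y_{111} \\ y_{112} \\ y_{121} \\ y_{122} \\ y_{211} \\ y_{212} \\ y_{221} \\ y_{222} \end{bmatrix} = \begin{bmatrix} \mu + \alpha_1 + \beta_1 + \gamma_{11} \\ \mu + \alpha_1 + \beta_1 + \gamma_{11} \\ \mu + \alpha_1 + \beta_2 + \gamma_{12} \\ \mu + \alpha_1 + \beta_2 + \gamma_{12} \\ \mu + \alpha_2 + \beta_1 + \gamma_{21} \\ \mu + \alpha_2 + \beta_1 + \gamma_{21} \\ \mu + \alpha_2 + \beta_2 + \gamma_{22} \\ \mu + \alpha_2 + \beta_2 + \gamma_{22} \end{bmatrix} + \begin{bmatrix} \varepsilon_{111} \\ \varepsilon_{112} \\ \varepsilon_{121} \\ \varepsilon_{122} \\ \varepsilon_{211} \\ \varepsilon_{212} \\ \varepsilon_{221} \\ \varepsilon_{222} \end{bmatrix},\qquad \begin{bmatrix} \varepsilon_{111} \\ \vdots \\ \varepsilon_{222} \end{bmatrix} \sim N(0, \sigma^2 I)$$

Example 4 (continued)

$$\begin{bmatrix} y_{111} \\ y_{112} \\ y_{121} \\ y_{122} \\ y_{211} \\ y_{212} \\ y_{221} \\ y_{222} \end{bmatrix} = \begin{bmatrix} 1 & 1 & 0 & 1 & 0 & 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 1 & 0 & 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 \\ 1 & 0 & 1 & 1 & 0 & 0 & 0 & 1 & 0 \\ 1 & 0 & 1 & 1 & 0 & 0 & 0 & 1 & 0 \\ 1 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 \\ 1 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \mu \\ \alpha_1 \\ \alpha_2 \\ \beta_1 \\ \beta_2 \\ \gamma_{11} \\ \gamma_{12} \\ \gamma_{21} \\ \gamma_{22} \end{bmatrix} + \begin{bmatrix} \varepsilon_{111} \\ \varepsilon_{112} \\ \varepsilon_{121} \\ \varepsilon_{122} \\ \varepsilon_{211} \\ \varepsilon_{212} \\ \varepsilon_{221} \\ \varepsilon_{222} \end{bmatrix}$$

$$y = X\beta + \varepsilon,\qquad \varepsilon \sim N(0, \sigma^2 I)$$

Example 4 (continued)

$$\beta = [\mu, \alpha_1, \alpha_2, \beta_1, \beta_2, \gamma_{11}, \gamma_{12}, \gamma_{21}, \gamma_{22}]'$$

$$\begin{array}{l|cc} \text{diet} & \text{inoculation 1} & \text{inoculation 2} \\ \hline 1 & \mu + \alpha_1 + \beta_1 + \gamma_{11} & \mu + \alpha_1 + \beta_2 + \gamma_{12} \\ 2 & \mu + \alpha_2 + \beta_1 + \gamma_{21} & \mu + \alpha_2 + \beta_2 + \gamma_{22} \\ \hline \Delta\text{diet} & \alpha_1 - \alpha_2 + \gamma_{11} - \gamma_{21} & \alpha_1 - \alpha_2 + \gamma_{12} - \gamma_{22} \end{array}$$

Is the difference between diet means for inoculation 1 the same as the difference between diet means for inoculation 2?

$$C\beta = [0, 0, 0, 0, 0, 1, -1, -1, 1]\beta = \gamma_{11} - \gamma_{12} - \gamma_{21} + \gamma_{22} = 0?$$

This question asks if there is interaction between the factors diet and inoculation.

Example 4 (continued)

The inoculation differences for the same table of treatment means are

$$\Delta\text{inoculation}:\quad \beta_1 - \beta_2 + \gamma_{11} - \gamma_{12}\ \ (\text{diet 1}),\qquad \beta_1 - \beta_2 + \gamma_{21} - \gamma_{22}\ \ (\text{diet 2}).$$

Is the difference between inoculation means for diet 1 the same as the difference between inoculation means for diet 2?

$$C\beta = [0, 0, 0, 0, 0, 1, -1, -1, 1]\beta = \gamma_{11} - \gamma_{12} - \gamma_{21} + \gamma_{22} = 0?$$

This question also asks if there is interaction between the factors diet and inoculation.

Example 4 (continued)

Averaging the treatment means over inoculations gives the diet means

$$\text{Diet Means}:\quad \mu + \alpha_1 + \bar{\beta}_{\cdot} + \bar{\gamma}_{1\cdot}\ \ (\text{diet 1}),\qquad \mu + \alpha_2 + \bar{\beta}_{\cdot} + \bar{\gamma}_{2\cdot}\ \ (\text{diet 2}).$$

Is the average over inoculation means for diet 1 different from the average over inoculation means for diet 2?

$$C\beta = [0, 1, -1, 0, 0, .5, .5, -.5, -.5]\beta = \alpha_1 - \alpha_2 + \bar{\gamma}_{1\cdot} - \bar{\gamma}_{2\cdot} = 0?$$

This question asks about the main effect of the factor diet.

Example 4 (continued)

Averaging the treatment means over diets gives the inoculation means

$$\text{Inoculation Means}:\quad \mu + \bar{\alpha}_{\cdot} + \beta_1 + \bar{\gamma}_{\cdot 1}\ \ (\text{inoculation 1}),\qquad \mu + \bar{\alpha}_{\cdot} + \beta_2 + \bar{\gamma}_{\cdot 2}\ \ (\text{inoculation 2}).$$

Is the average over diet means for inoculation 1 different from the average over diet means for inoculation 2?

$$C\beta = [0, 0, 0, 1, -1, .5, -.5, .5, -.5]\beta = \beta_1 - \beta_2 + \bar{\gamma}_{\cdot 1} - \bar{\gamma}_{\cdot 2} = 0?$$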
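This question asks about the main effect of the factor inoculation.

Example 4 (continued)

$$\beta = [\mu, \alpha_1, \alpha_2, \beta_1, \beta_2, \gamma_{11}, \gamma_{12}, \gamma_{21}, \gamma_{22}]'$$

$$\begin{array}{l|cc} \text{diet} & \text{inoculation 1} & \text{inoculation 2} \\ \hline 1 & \mu + \alpha_1 + \beta_1 + \gamma_{11} & \mu + \alpha_1 + \beta_2 + \gamma_{12} \\ 2 & \mu + \alpha_2 + \beta_1 + \gamma_{21} & \mu + \alpha_2 + \beta_2 + \gamma_{22} \\ \hline \Delta\text{diet} & \alpha_1 - \alpha_2 + \gamma_{11} - \gamma_{21} & \end{array}$$

Is there a difference between the diet means for inoculation 1?

$$C\beta = [0, 1, -1, 0, 0, 1, 0, -1, 0]\beta = \alpha_1 - \alpha_2 + \gamma_{11} - \gamma_{21} = 0?$$

This question asks about the simple effect of the factor diet for the first level of the factor inoculation.

Example 4 (continued)

$$\begin{array}{l|cc|c} \text{diet} & \text{inoculation 1} & \text{inoculation 2} & \Delta\text{inoculation} \\ \hline 1 & \mu + \alpha_1 + \beta_1 + \gamma_{11} & \mu + \alpha_1 + \beta_2 + \gamma_{12} & \beta_1 - \beta_2 + \gamma_{11} - \gamma_{12} \\ 2 & \mu + \alpha_2 + \beta_1 + \gamma_{21} & \mu + \alpha_2 + \beta_2 + \gamma_{22} & \beta_1 - \beta_2 + \gamma_{21} - \gamma_{22} \\ \hline \Delta\text{diet} & \alpha_1 - \alpha_2 + \gamma_{11} - \gamma_{21} & \alpha_1 - \alpha_2 + \gamma_{12} - \gamma_{22} & \end{array}$$

Are all four treatment means identical?

$$C\beta = \begin{bmatrix} 0 & 0 & 0 & 1 & -1 & 1 & -1 & 0 & 0 \\ 0 & 0 & 0 & 1 & -1 & 0 & 0 & 1 & -1 \\ 0 & 1 & -1 & 0 & 0 & 1 & 0 & -1 & 0 \end{bmatrix} \beta = \begin{bmatrix} \beta_1 - \beta_2 + \gamma_{11} - \gamma_{12} \\ \beta_1 - \beta_2 + \gamma_{21} - \gamma_{22} \\ \alpha_1 - \alpha_2 + \gamma_{11} - \gamma_{21} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}?$$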
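The contrast matrices of Example 4 can be checked numerically against the table of treatment means. The sketch below, assuming NumPy, rebuilds the $8 \times 9$ design matrix from the factor levels, computes the four cell means $\mu_{ij}$ from a hypothetical $\beta$ (the numeric values are illustrative only), and confirms that each $C\beta$ equals the corresponding comparison of cell means.

```python
import numpy as np

# Factor levels of the eight hogs, in the order y111, y112, y121, ..., y222.
diet = np.array([1, 1, 1, 1, 2, 2, 2, 2])
inoc = np.array([1, 1, 2, 2, 1, 1, 2, 2])

def indicator(condition):
    """Convert a boolean array into a 0/1 column of floats."""
    return condition.astype(float)

# Columns correspond to mu, alpha1, alpha2, beta1, beta2, g11, g12, g21, g22.
X = np.column_stack([
    np.ones(8),
    indicator(diet == 1), indicator(diet == 2),
    indicator(inoc == 1), indicator(inoc == 2),
    indicator((diet == 1) & (inoc == 1)),
    indicator((diet == 1) & (inoc == 2)),
    indicator((diet == 2) & (inoc == 1)),
    indicator((diet == 2) & (inoc == 2)),
])

# Hypothetical parameter values (illustration only).
beta = np.array([1.0, 0.5, -0.5, 0.2, -0.2, 0.3, 0.0, -0.1, 0.4])

# Every second row of X beta is one cell mean; rows = diet, columns = inoculation.
m = (X @ beta)[::2].reshape(2, 2)

C_interaction = np.array([[0, 0, 0, 0, 0, 1, -1, -1, 1]], dtype=float)
C_diet_main = np.array([[0, 1, -1, 0, 0, .5, .5, -.5, -.5]])
C_inoc_main = np.array([[0, 0, 0, 1, -1, .5, -.5, .5, -.5]])
C_diet_simple = np.array([[0, 1, -1, 0, 0, 1, 0, -1, 0]], dtype=float)
C_all_equal = np.array([
    [0, 0, 0, 1, -1, 1, -1, 0, 0],
    [0, 0, 0, 1, -1, 0, 0, 1, -1],
    [0, 1, -1, 0, 0, 1, 0, -1, 0],
], dtype=float)

# Each pair below should agree: C beta versus the same comparison of cell means.
checks = {
    "interaction": (C_interaction @ beta,
                    [m[0, 0] - m[0, 1] - m[1, 0] + m[1, 1]]),
    "diet main effect": (C_diet_main @ beta,
                         [m[0].mean() - m[1].mean()]),
    "inoculation main effect": (C_inoc_main @ beta,
                                [m[:, 0].mean() - m[:, 1].mean()]),
    "diet simple effect (inoc 1)": (C_diet_simple @ beta,
                                    [m[0, 0] - m[1, 0]]),
    "all means equal": (C_all_equal @ beta,
                        [m[0, 0] - m[0, 1], m[1, 0] - m[1, 1], m[0, 0] - m[1, 0]]),
}

for name, (c_beta, from_means) in checks.items():
    print(name, np.round(c_beta, 3), np.allclose(c_beta, from_means))
```

Because each $C\beta$ can be rewritten in terms of the $\mu_{ij}$ alone, the same questions can be posed under the cell means model without reference to $\mu$, $\alpha_i$, $\beta_j$ or $\gamma_{ij}$ individually.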