Estimable Functions of β: An Example customer 1 2 3 4 movie 1 2 3 4 1 ? ? 3 5 ? ? 3 3 1 ? yij = customer i’s rating of movie j yij = µ + ci + mj + ij c Copyright 2010 (Iowa State University) Can we guess ratings for customer/movie combinations not in the dataset? Which movie is best? Statistics 511 1 / 11 The Linear Model in Vector/Matrix Form 4 1 3 5 3 3 1 = 1 1 1 1 1 1 1 y= c Copyright 2010 (Iowa State University) 1 1 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 1 0 0 X 0 0 0 0 0 1 1 1 0 0 0 0 1 0 0 1 1 0 0 0 1 0 0 0 1 1 0 0 β µ c1 c2 c3 c4 m1 m2 m3 + + 11 12 22 23 33 41 42 Statistics 511 2 / 11 Can We Estimate the Means that Underly the Missing Table Entries? Xβ = µ + c1 + m1 µ + c1 + m2 µ + c2 + m2 µ + c2 + m3 µ + c3 + m3 µ + c4 + m1 µ + c4 + m2 Can we estimate µ + c1 + m3 ? µ + c2 + m1 ? µ + c3 + m1 ? µ + c3 + m2 ? µ + c4 + m3 ? m1 − m2 is estimable because [1, −1, 0, 0, 0.0.0] Xβ = m1 − m2 . c Copyright 2010 (Iowa State University) Statistics 511 3 / 11 Can We Estimate the Means that Underly the Missing Table Entries? Xβ = µ + c1 + m1 µ + c1 + m2 µ + c2 + m2 µ + c2 + m3 µ + c3 + m3 µ + c4 + m1 µ + c4 + m2 Can we estimate µ + c1 + m3 ? µ + c2 + m1 ? µ + c3 + m1 ? µ + c3 + m2 ? µ + c4 + m3 ? Likewise, m2 − m3 is estimable because [0, 0, 1, −1, 0, 0, 0] Xβ = m2 − m3 . c Copyright 2010 (Iowa State University) Statistics 511 4 / 11 We can estimate all pairwise differences between movie effects. Because m1 − m2 and m2 − m3 are estimable, we can also estimate (m1 − m2 ) + (m2 − m3 ) = m1 − m3 . This follows because any linear combination of estimable functions is also estimable. c Copyright 2010 (Iowa State University) Statistics 511 5 / 11 We can estimate the mean underlying the rating for any combination of customer and movie. It follows that any linear combination of the form µ + ci + mj can be estimated ∀ i = 1, 2, 3, 4 and j = 1, 2, 3 because µ + ci + mj = (µ + ci + mj0 ) + (mj − mj0 ) ∀ i = 1, 2, 3, 4 and j, j0 = 1, 2, 3. c Copyright 2010 (Iowa State University) Statistics 511 6 / 11 Movie lsmeans If our goal is to compare movies to see which is most highly rated, we can accomplish that by estimating the pairwise differences between movie effects. However, if we want to retain information about the mean rating rather than the difference between mean ratings, it is natural to consider estimating the average (across all customers) of the mean rating for each movie. c Copyright 2010 (Iowa State University) Statistics 511 7 / 11 Movie lsmeans This average for the jth movie is 4 4 1X 1X (µ + ci + mj ) = µ + ci + mj = µ + c̄· + mj . 4 i=1 4 i=1 This average is estimable for each movie in our example because it is a linear combination of estimable functions. SAS calls estimates of µ + c̄· + mj ∀ j = 1, 2, 3 movie lsmeans, where ls presumably denotes least squares to indicate that these estimates are based on OLS estimators. c Copyright 2010 (Iowa State University) Statistics 511 8 / 11 Suppose we consider a different model. customer 1 2 3 4 movie 1 2 3 4 1 ? ? 3 5 ? ? 3 3 1 ? yij = customer i’s rating of movie j yij = µij + ij c Copyright 2010 (Iowa State University) Can we guess ratings for customer/movie combinations not in the dataset? Which movie is best? Statistics 511 9 / 11 The Linear Model in Vector/Matrix Form 4 1 3 5 3 3 1 = 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 c Copyright 2010 (Iowa State University) 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 µ11 µ12 µ13 µ21 µ22 µ23 µ31 µ32 µ33 µ41 µ42 µ43 + 11 12 22 23 33 41 42 Statistics 511 10 / 11 Can we estimate the means that underly the missing table entries? No. Xβ = µ11 µ12 µ22 µ23 µ33 µ41 µ42 Can we estimate µ13 ? µ21 ? µ31 ? µ32 ? µ43 ? None of the means underlying missing table entries are estimable under this cell-means model. c Copyright 2010 (Iowa State University) Statistics 511 11 / 11