Document 10639743

advertisement
Best Linear Unbiased Prediction
(BLUP) of Random Effects in the
Normal Linear Mixed Effects Model
c
Copyright 2012
(Iowa State University)
Statistics 511
1 / 26
C. R. Henderson
Born April 1, 1911, in Coin, Iowa, in Page County
same county of birth as Jay Lush
Page County Farm Bureau Picnic (12 and under,
14 and under, 16 and under changed to 10-12,
13-14, 15-16)
Dean H.H. Kildee visited in 1929 and convinced
Henderson to come to Iowa State College.
c
Copyright 2012
(Iowa State University)
Statistics 511
2 / 26
C. R. Henderson
ISC track 4 x 220 indoor world record
1933 ISC Field House indoor 440 record of 51.7
(stood for 30 years)
Outdoor 440 record of 48.6 when world record
was 47.4
MS in nutrition from ISC
1942 U.S. Army Sanitary Corps. Nutrition
research for troops.
c
Copyright 2012
(Iowa State University)
Statistics 511
3 / 26
C. R. Henderson
Returned to ISU after the war for Ph.D. with Jay
Lush in animal breeding.
Professor at Cornell until 1976
Know for “Henderson’s Mixed Model Equations”
and use of BLUP in animal breeding.
Elected member of the National Academy of
Sciences
c
Copyright 2012
(Iowa State University)
Statistics 511
4 / 26
Henderson’s Ph.D. Students Included
Shayle Searle (who taught Henderson matrix
algebra)
David Harville (professor emeritus, Department of
Statistics, ISU, linear models expert)
c
Copyright 2012
(Iowa State University)
Statistics 511
5 / 26
Henderson’s Advice to Beginning Scientists
Study methods of your predecessors.
Work hard.
Do not fear to try new ideas.
Discuss your ideas with others freely.
Be quick to admit errors. Progress comes by
correcting mistakes.
Always be optimistic. Nature is benign.
Enjoy your scientific work. It can be a great joy.
c
Copyright 2012
(Iowa State University)
Statistics 511
6 / 26
Sources
L. D. Van Vleck (1998). Charles Roy Henderson,
1911-1989: a brief biography. Journal of Animal
Science. 76, 2959-2961.
L. D. Van Vleck (1991). C. R. Henderson: Farm
Boy, Athlete, and Scientist. Journal of Dairy
Science. 74, 4082-4096.
c
Copyright 2012
(Iowa State University)
Statistics 511
7 / 26
A problem that reportedly sparked Henderson’s
interest in BLUP
We present here a variation of the original problem 23
on page 164 of Mood, A. M. (1950), Introduction to
the Theory of Statistics, New York: McGraw-Hill.
c
Copyright 2012
(Iowa State University)
Statistics 511
8 / 26
Suppose intelligence quotients (IQs) for a
population of students are normally distributed
with a mean µ and variance σu2 .
Suppose an IQ test was given to an i.i.d. sample
of such students.
Suppose that, given the IQ of a student, the test
score for that student is normally distributed with a
mean equal to the student’s IQ and a variance σe2
and is independent of the test score of any other
student.
c
Copyright 2012
(Iowa State University)
Statistics 511
9 / 26
Suppose it is known that σu2 /σe2 = 9.
If the sample mean of the students’ test scores
was 100, what is the best prediction of the IQ of a
student who scored 130 on the test?
c
Copyright 2012
(Iowa State University)
Statistics 511
10 / 26
Consider our linear mixed effects model
y = Xβ + Zu + e,
where
u
e
∼N
0
0
G 0
,
.
0 R
Given data y, what is our best guess for the
unobserved vector u?
c
Copyright 2012
(Iowa State University)
Statistics 511
11 / 26
Because u is a random vector rather than a fixed
parameter, we talk about predicting u rather than
estimating u.
We seek a Best Linear Unbiased Predictor
(BLUP) for u, which we will denote by û.
c
Copyright 2012
(Iowa State University)
Statistics 511
12 / 26
To be a BLUP, we require...
1
û to be a linear function of y,
2
û to be unbiased for u so that E(û − u) = 0, and
3
Var(û − u) to be no “larger” than the Var(v − u),
where v is any other linear and unbiased predictor.
c
Copyright 2012
(Iowa State University)
Statistics 511
13 / 26
In 611, we prove that the BLUP for u is the BLUE
of E(u|y).
What does E(u|y) look like?
We will use the following result about conditional
distributions for multivariate normal vectors.
c
Copyright 2012
(Iowa State University)
Statistics 511
14 / 26
Suppose
w1
w2
∼N
where Σ ≡
Σ11
Σ21
Σ11 Σ12
,
Σ21 Σ22
Σ12
is positive definite.
Σ22
µ1
µ2
Then the conditional distribution of w2 given w1 is
−1
(w2 |w1 ) ∼ N(µ2 +Σ21 Σ−1
11 (w1 −µ1 ), Σ22 −Σ21 Σ11 Σ12 ).
c
Copyright 2012
(Iowa State University)
Statistics 511
15 / 26
Now note that
y
Xβ
Z I
u
=
+
u
0
I 0
e
Thus,
0
y
Xβ
Z I
G 0
Z I
∼N
,
u
0
I 0
0 R
I 0
Xβ
ZGZ0 + R
ZG
d
.
= N
,
0
GZ0
G
c
Copyright 2012
(Iowa State University)
Statistics 511
16 / 26
Thus, E(u|y) = 0 + GZ0 (ZGZ0 + R)−1 (y − Xβ)
= GZ0 Σ−1 (y − Xβ),
and the BLUP of u is
GZ0 Σ−1 (y − Xβ̂ Σ ) = GZ0 Σ−1 (y − X(X0 Σ−1 X)− X0 Σ−1 y)
= GZ0 Σ−1 (I − X(X0 Σ−1 X)− X0 Σ−1 )y.
c
Copyright 2012
(Iowa State University)
Statistics 511
17 / 26
For the usual case in which
G and Σ = ZGZ0 + R
are unknown, we replace the matrices by estimates
and approximate the BLUP of u by
−1
ĜZ0 Σ̂ (y − Xβ̂ Σ̂ ).
c
Copyright 2012
(Iowa State University)
Statistics 511
18 / 26
Often we wish to make predictions of quantities
like Cβ + Du for some estimable Cβ.
The BLUP of such a quantity is Cβ̂ Σ + Dû, the
BLUE of Cβ plus D times the BLUP of u.
c
Copyright 2012
(Iowa State University)
Statistics 511
19 / 26
Suppose intelligence quotients (IQs) for a
population of students are normally distributed
with a mean µ and variance σu2 .
Suppose an IQ test was given to an i.i.d. sample
of such students.
Suppose that, given the IQ of a student, the test
score for that student is normally distributed with a
mean equal to the student’s IQ and a variance σe2
and is independent of the test score of any other
student.
c
Copyright 2012
(Iowa State University)
Statistics 511
20 / 26
Suppose it is known that σu2 /σe2 = 9.
If the sample mean of the students’ test scores
was 100, what is the best prediction of the IQ of a
student who scored 130 on the test?
c
Copyright 2012
(Iowa State University)
Statistics 511
21 / 26
i.i.d.
Suppose u1 , . . . , un , ∼ N(0, σu2 ) independent of
i.i.d.
e1 , . . . , en , ∼ N(0, σe2 ).
If we let µ + ui denote the IQ of student
i (i = 1, . . . , n), then the IQs of the students are
N(µ, σu2 ) as in the statement of the problem.
If we let yi = µ + ui + ei denote the test score of
student i (i = 1, . . . , n), then
(yi |µ + ui ) ∼ N(µ + ui , σe2 ) as in the problem
statement.
c
Copyright 2012
(Iowa State University)
Statistics 511
22 / 26
We have y = Xβ + Zu + e, where
X = 1, β = µ, Z = I, G = σu2 I, R = σe2 I, and
Σ = ZGZ0 + R = (σu2 + σe2 )I.
Thus,
β̂ Σ = (X0 Σ−1 X)− X0 Σ−1 y = (10 1)− 10 y = ȳ·
and
0
−1
GZ Σ
c
Copyright 2012
(Iowa State University)
σu2
= 2
I.
σu + σe2
Statistics 511
23 / 26
Thus, the BLUP for u is
σu2
û = GZ Σ (y − Xβ̂ Σ ) = 2
(y − 1ȳ· ).
σu + σe2
0
−1
The ith element of this vector is
σu2
(yi − ȳ· ).
ûi = 2
σu + σe2
Thus, the BLUP for µ + ui (the IQ of student i) is
σu2
σu2
σe2
µ̂ + ûi = ȳ· + 2
(yi − ȳ· ) = 2
yi + 2
ȳ·
σu + σe2
σu + σe2
σu + σe2
c
Copyright 2012
(Iowa State University)
Statistics 511
24 / 26
Note that the BLUP is a convex combination of the
individual score and the overall mean score.
σu2
σe2
yi + 2
ȳ·
σu2 + σe2
σu + σe2
c
Copyright 2012
(Iowa State University)
Statistics 511
25 / 26
Because
σu2
σe2
is assumed to be 9, the weights are
σu2
=
σu2 + σe2
and
σu2
σe2
σu2
σe2
+1
=
9
= 0.9
9+1
σe2
= 0.1.
σu2 + σe2
We would predict the IQ of a student who scored 130
on the test to be 0.9(130) + 0.1(100) = 127.
c
Copyright 2012
(Iowa State University)
Statistics 511
26 / 26
Download