4. Normal Theory Inference

[Figure: density of a normal distribution with mean = 0 and variance = 1; $f(y)$ plotted against $y$ for $-3 \le y \le 3$.]

Defn 4.1: A random variable $Y$ with density function
$$f(y) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{1}{2}\left(\frac{y-\mu}{\sigma}\right)^2}$$
is said to have a normal (Gaussian) distribution with $E(Y) = \mu$ and $Var(Y) = \sigma^2$. We will use the notation $Y \sim N(\mu, \sigma^2)$.

Suppose $Z$ has a normal distribution with $E(Z) = 0$ and $Var(Z) = 1$, i.e., $Z \sim N(0, 1)$; then $Z$ is said to have a standard normal distribution.
Defn 4.2: Suppose $Z = (Z_1, \ldots, Z_m)^T$ is a random vector whose elements are independently distributed standard normal random variables. For any $m \times n$ matrix $A$, we say that
$$Y = \mu + A^T Z$$
has a multivariate normal distribution with mean vector
$$E(Y) = E(\mu + A^T Z) = \mu + A^T E(Z) = \mu + A^T 0 = \mu$$
and variance-covariance matrix
$$Var(Y) = A^T Var(Z)\, A = A^T A = \Sigma.$$
We will use the notation $Y \sim N(\mu, \Sigma)$.

When $\Sigma$ is positive definite, the joint density function is
$$f(y) = \frac{1}{(2\pi)^{n/2} |\Sigma|^{1/2}}\, e^{-\frac{1}{2}(y-\mu)^T \Sigma^{-1} (y-\mu)}.$$
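The construction in Defn 4.2 can be sketched numerically. Here $A$ is taken as the transpose of the Cholesky factor of a chosen $\Sigma$, so that $A^T A = \Sigma$; the 2-dimensional $\mu$ and $\Sigma$ are illustrative values, not from the notes (which use S-Plus rather than Python):

```python
import numpy as np

# Target mean vector and covariance matrix (illustrative values).
mu = np.array([1.0, -2.0])
Sigma = np.array([[4.0, 1.0],
                  [1.0, 3.0]])

# Cholesky gives Sigma = L L^T; taking A = L^T satisfies A^T A = Sigma.
A = np.linalg.cholesky(Sigma).T
assert np.allclose(A.T @ A, Sigma)

# Y = mu + A^T Z with Z standard normal (Defn 4.2).
rng = np.random.default_rng(0)
Z = rng.standard_normal((2, 100_000))
Y = mu[:, None] + A.T @ Z

# Sample mean and covariance should approximate mu and Sigma.
print(Y.mean(axis=1))   # close to [1, -2]
print(np.cov(Y))        # close to Sigma
```

Any $A$ with $A^T A = \Sigma$ works in Defn 4.2; the Cholesky factor is just a convenient choice.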
The multivariate normal distribution has many useful properties.

Result 4.1: Normality is preserved under linear transformations. If $Y \sim N(\mu, \Sigma)$, then
$$W = c + BY \sim N(c + B\mu,\; B\Sigma B^T)$$
for any non-random $c$ and $B$.

Proof: By Defn 4.2, $Y = \mu + A^T Z$ where $A^T A = \Sigma$. Then,
$$W = c + BY = c + B(\mu + A^T Z) = (c + B\mu) + BA^T Z,$$
which satisfies Defn 4.2 with
$$Var(W) = BA^T A B^T = B\Sigma B^T.$$
Result 4.2: Suppose
$$Y = \begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix} \sim N\left( \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix},\; \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix} \right),$$
then $Y_1 \sim N(\mu_1, \Sigma_{11})$.
Proof: Note that $Y_1 = [\,I \;\; 0\,]\, Y$ and apply Result 4.1.

Note: This result applies to any subset of the elements of $Y$ because you can move that subset to the top of the vector by multiplying $Y$ by an appropriate matrix of zeros and ones.

Example 4.1: Suppose
$$Y = \begin{bmatrix} Y_1 \\ Y_2 \\ Y_3 \end{bmatrix} \sim N\left( \begin{bmatrix} 1 \\ -3 \\ 2 \end{bmatrix},\; \begin{bmatrix} 4 & 1 & 1 \\ 1 & 3 & 3 \\ 1 & 3 & 9 \end{bmatrix} \right).$$
Then
$$Y_1 = [\,1\;\;0\;\;0\,]\,Y \sim N(1, 4), \qquad Y_2 = [\,0\;\;1\;\;0\,]\,Y \sim N(-3, 3), \qquad Y_3 = [\,0\;\;0\;\;1\,]\,Y \sim N(2, 9),$$
and for the subset $(Y_1, Y_3)$,
$$\begin{bmatrix} Y_1 \\ Y_3 \end{bmatrix} = \underbrace{\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}}_{\text{call this matrix } B} Y \;\sim\; N\left( \underbrace{\begin{bmatrix} 1 \\ 2 \end{bmatrix}}_{B\mu},\; \underbrace{\begin{bmatrix} 4 & 1 \\ 1 & 9 \end{bmatrix}}_{B\Sigma B^T} \right).$$
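Result 4.1 makes checking these marginal and subset distributions mechanical: compute $B\mu$ and $B\Sigma B^T$. A small NumPy sketch (not part of the original S-Plus notes) for Example 4.1:

```python
import numpy as np

# Mean vector and covariance matrix from Example 4.1.
mu = np.array([1.0, -3.0, 2.0])
Sigma = np.array([[4.0, 1.0, 1.0],
                  [1.0, 3.0, 3.0],
                  [1.0, 3.0, 9.0]])

# B extracts the subset (Y1, Y3).
B = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])

# By Result 4.1, (Y1, Y3)^T ~ N(B mu, B Sigma B^T).
assert np.allclose(B @ mu, [1, 2])
assert np.allclose(B @ Sigma @ B.T, [[4, 1], [1, 9]])
```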
Comment: If $Y_1 \sim N(\mu_1, \sigma_1^2)$ and $Y_2 \sim N(\mu_2, \sigma_2^2)$, it is not always true that
$$Y = \begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix}$$
has a normal distribution.
Result 4.3: If $Y_1$ and $Y_2$ are independent random vectors such that
$$Y_1 \sim N(\mu_1, \Sigma_1) \quad \text{and} \quad Y_2 \sim N(\mu_2, \Sigma_2),$$
then
$$Y = \begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix} \sim N\left( \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix},\; \begin{bmatrix} \Sigma_1 & 0 \\ 0 & \Sigma_2 \end{bmatrix} \right).$$

Proof: Since $Y_1 \sim N(\mu_1, \Sigma_1)$, we have from Defn 4.2 that $Y_1 = \mu_1 + A_1^T Z_1$, where $A_1^T A_1 = \Sigma_1$ and the elements of $Z_1$ are independent standard normal random variables. A similar result, $Y_2 = \mu_2 + A_2^T Z_2$, is true for $Y_2$. Since $Y_1$ and $Y_2$ are independent, it follows that $Z_1$ and $Z_2$ are independent. Then
$$Y = \begin{bmatrix} \mu_1 + A_1^T Z_1 \\ \mu_2 + A_2^T Z_2 \end{bmatrix} = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix} + \begin{bmatrix} A_1^T & 0 \\ 0 & A_2^T \end{bmatrix} \begin{bmatrix} Z_1 \\ Z_2 \end{bmatrix}$$
satisfies Defn 4.2.

Alternatively, you could prove Result 4.3 by showing that the product of the characteristic functions for $Y_1$ and $Y_2$ is a characteristic function for a multivariate normal distribution. If $\Sigma_1$ and $\Sigma_2$ are both non-singular, you could prove Result 4.3 by showing that the product of the density functions for $Y_1$ and $Y_2$ is a density function for the specified multivariate normal distribution.

Result 4.4: If $Y = (Y_1^T, \ldots, Y_k^T)^T$ is a random vector with a multivariate normal distribution, then $Y_1, Y_2, \ldots, Y_k$ are independent if and only if
$$Cov(Y_i, Y_j) = 0 \quad \text{for all } i \neq j.$$

Comments:

(i) If $Y_i$ is independent of $Y_j$, then $Cov(Y_i, Y_j) = 0$.

(ii) When $Y = (Y_1, \ldots, Y_n)^T$ has a multivariate normal distribution, $Y_i$ uncorrelated with $Y_j$ implies $Y_i$ is independent of $Y_j$. This is usually not true for other distributions.
Result 4.5: If
$$\begin{bmatrix} Y \\ X \end{bmatrix} \sim N\left( \begin{bmatrix} \mu_Y \\ \mu_X \end{bmatrix},\; \begin{bmatrix} \Sigma_{YY} & \Sigma_{YX} \\ \Sigma_{XY} & \Sigma_{XX} \end{bmatrix} \right)$$
with a positive definite covariance matrix, the conditional distribution of $Y$ given the value of $X$ is a normal distribution with mean vector
$$E(Y \,|\, X) = \mu_Y + \Sigma_{YX} \Sigma_{XX}^{-1} (X - \mu_X)$$
and positive definite covariance matrix
$$V(Y \,|\, X) = \Sigma_{YY} - \Sigma_{YX} \Sigma_{XX}^{-1} \Sigma_{XY}$$
(note that the conditional covariance matrix does not depend on the value of $X$).

Quadratic forms: $Y^T A Y$
- Sums of squares in ANOVA
- Chi-square tests
- F-tests
- Estimation of variances

Some useful information about the distribution of quadratic forms is summarized in the following results.
Result 4.6a: If $Y = (Y_1, \ldots, Y_n)^T$ is a random vector with $E(Y) = \mu$ and $Var(Y) = \Sigma$, and $A$ is an $n \times n$ non-random matrix, then
$$E(Y^T A Y) = \mu^T A \mu + tr(A\Sigma).$$

Result 4.6b: If $Y \sim N(\mu, \Sigma)$ and $A$ is a symmetric matrix, then
$$Var(Y^T A Y) = 4\mu^T A \Sigma A \mu + 2\, tr(A\Sigma A\Sigma).$$

Proof: (a) Note that the definition of a covariance matrix implies that $Var(Y) = E(YY^T) - \mu\mu^T$, where $\mu = E(Y)$. Then,
$$\begin{aligned}
E(Y^T A Y) &= E(tr(Y^T A Y)) \\
&= E(tr(A Y Y^T)) \\
&= tr(E(A Y Y^T)) \\
&= tr(A\, E(Y Y^T)) \\
&= tr(A[\,Var(Y) + \mu\mu^T\,]) \\
&= tr(A\Sigma + A\mu\mu^T) \\
&= tr(A\Sigma) + tr(A\mu\mu^T) \\
&= tr(A\Sigma) + tr(\mu^T A \mu) \\
&= tr(A\Sigma) + \mu^T A \mu.
\end{aligned}$$
(b) See Searle (1971, page 57).
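Result 4.6a holds for any distribution with the stated first two moments, not just the normal, so it can be checked exactly on a small discrete distribution. The support points and the matrix $A$ below are illustrative choices, not from the notes:

```python
import numpy as np

# A two-coordinate random vector Y taking four equally likely values
# (illustrative numbers).
support = [np.array(v, dtype=float) for v in [(0, 1), (2, -1), (1, 3), (-2, 0)]]
p = 0.25
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])   # A need not be symmetric for Result 4.6a

# Exact moments of the discrete distribution.
mu = sum(p * y for y in support)
Sigma = sum(p * np.outer(y - mu, y - mu) for y in support)

# Left side: E(Y^T A Y) computed directly over the support.
lhs = sum(p * (y @ A @ y) for y in support)

# Right side: mu^T A mu + tr(A Sigma).
rhs = mu @ A @ mu + np.trace(A @ Sigma)
assert np.isclose(lhs, rhs)
```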
Example 4.2: Consider a Gauss-Markov model with
$$E(Y) = X\beta \quad \text{and} \quad Var(Y) = \sigma^2 I.$$
Let $b = (X^T X)^{-} X^T Y$ be any solution to the normal equations. Since $E(Y) = X\beta$ is estimable, the unique OLS estimator is
$$\hat{Y} = X b = X (X^T X)^{-} X^T Y = P_X Y.$$
The residual vector is
$$e = Y - \hat{Y} = (I - P_X) Y,$$
and the sum of squared residuals, also called the error sum of squares, is
$$\begin{aligned}
SSE = \sum_{i=1}^n (Y_i - \hat{Y}_i)^2 = \sum_{i=1}^n e_i^2 &= e^T e \\
&= [(I - P_X)Y]^T (I - P_X)Y \\
&= Y^T (I - P_X)^T (I - P_X) Y \\
&= Y^T (I - P_X)(I - P_X) Y \\
&= Y^T (I - P_X) Y.
\end{aligned}$$
From Result 4.6,
$$\begin{aligned}
E(SSE) &= E(Y^T (I - P_X) Y) \\
&= \beta^T X^T (I - P_X) X \beta + tr((I - P_X)\,\sigma^2 I) \\
&= 0 + \sigma^2\, tr(I - P_X) \\
&= \sigma^2 [\,tr(I) - tr(P_X)\,] \\
&= \sigma^2 [\,n - rank(P_X)\,] \\
&= \sigma^2 [\,n - rank(X)\,].
\end{aligned}$$
Consequently,
$$\hat{\sigma}^2 = \frac{SSE}{n - rank(X)}$$
is an unbiased estimator for $\sigma^2$ (provided that $rank(X) < n$).
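The pieces of this computation are easy to verify numerically. A NumPy sketch on illustrative toy data (the design matrix has a redundant column, so $rank(X)$ is less than the number of columns; nothing here comes from the original notes):

```python
import numpy as np

# Toy design matrix with a redundant column (col 1 = col 2 + col 3).
X = np.array([[1.0, 1.0, 0.0],
              [1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [1.0, 0.0, 1.0],
              [1.0, 0.0, 1.0]])
n = X.shape[0]
Y = np.array([2.1, 1.9, 3.2, 2.8, 3.0])

# P_X = X (X^T X)^- X^T, using a generalized inverse.
P_X = X @ np.linalg.pinv(X.T @ X) @ X.T

# tr(I - P_X) equals n - rank(X), the SSE degrees of freedom.
k = np.linalg.matrix_rank(X)
assert np.isclose(np.trace(np.eye(n) - P_X), n - k)

# sigma^2-hat = SSE / (n - rank(X)).
e = (np.eye(n) - P_X) @ Y
SSE = e @ e
sigma2_hat = SSE / (n - k)
print(SSE, sigma2_hat)
```

Here $P_X$ projects onto the two group-indicator columns, so the fitted values are group means and $SSE$ has $n - 2 = 3$ degrees of freedom.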
Chi-square Distributions

Defn 4.3: Let $Z = (Z_1, \ldots, Z_n)^T \sim N(0, I)$, i.e., the elements of $Z$ are $n$ independent standard normal random variables. The distribution of
$$W = Z^T Z = \sum_{i=1}^n Z_i^2$$
is called the central chi-square distribution with $n$ degrees of freedom. We will use the notation $W \sim \chi^2_{(n)}$.
Moments: If $W \sim \chi^2_{(n)}$, then
$$E(W) = n \quad \text{and} \quad Var(W) = 2n.$$

[Figure: central chi-square densities for 1, 3, 5, and 7 df; $f(x; df)$ plotted against $x$ for $0 \le x \le 10$.]
# This code is stored in: chiden.ssc

#================================================#
# chisq.density.plot()                           #
#------------------------------------------------#
# Input : degrees of freedom; it can be a vector.#
#         (e.g.) chisq.density.plot(c(1,3,5,7))  #
#         creates curves for df = 1,3,5, and 7   #
# Output: density plot of chi-square distribution#
#================================================#
chisq.density.plot <- function(df)
{
  x <- seq(.001, 10, , 50)
  # Create the x,y axis and title
  plot(c(0, 10), c(0, .5), type="n",
       xlab="x", ylab="f(x;df)",
       main="Central Chi-Square Densities")
  for(i in 1:length(df)) {
    lty.num <- 3*i - 2                    # specify the line types
    f.x <- dchisq(x, df[i])               # calculate density
    lines(x, f.x, type="l", lty=lty.num)  # draw the curve
  }
  # The following lines are for legends
  legend(x = rep(5, length(df)),
         y = rep(.45, length(df)),
         legend = paste(as.character(df), "df"),
         lty = seq(1, by=3, length=length(df)),
         bty = "n")
}

This function can be executed with the following commands. The source( ) function enters the entire file into the command window and executes all of the commands in the file.

source("chiden.ssc")
par(fin=c(7,8), cex=1.2, mex=1.5, lwd=4)
chisq.density.plot(c(1,3,5,7))
Defn 4.4: Let $Y = (Y_1, \ldots, Y_n)^T \sim N(\mu, I)$, i.e., the elements of $Y$ are independent normal random variables with $Y_i \sim N(\mu_i, 1)$. The distribution of the random variable
$$W = Y^T Y = \sum_{i=1}^n Y_i^2$$
is called a noncentral chi-square distribution with $n$ degrees of freedom and noncentrality parameter
$$\delta^2 = \mu^T \mu = \sum_{i=1}^n \mu_i^2.$$
We will use the notation $W \sim \chi^2_n(\delta^2)$.

Moments: If $W \sim \chi^2_n(\delta^2)$, then
$$E(W) = n + \delta^2 \quad \text{and} \quad Var(W) = 2n + 4\delta^2.$$
Defn 4.5: If $W_1 \sim \chi^2_{n_1}$ and $W_2 \sim \chi^2_{n_2}$, and $W_1$ and $W_2$ are independent, then the distribution of
$$F = \frac{W_1/n_1}{W_2/n_2}$$
is called the central F distribution with $n_1$ and $n_2$ degrees of freedom. We will use the notation $F \sim F_{n_1, n_2}$.

Central moments:
$$E(F) = \frac{n_2}{n_2 - 2} \quad \text{for } n_2 > 2,$$
$$Var(F) = \frac{2 n_2^2 (n_1 + n_2 - 2)}{n_1 (n_2 - 2)^2 (n_2 - 4)} \quad \text{for } n_2 > 4.$$
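These moment formulas can be sanity-checked against a modern library. The notes use S-Plus; SciPy is used here purely as an independent reference, with the df choices $n_1 = 5$, $n_2 = 20$ picked for illustration:

```python
from scipy import stats

# Central F moment formulas from Defn 4.5, checked against scipy.stats.f.
n1, n2 = 5, 20
mean_formula = n2 / (n2 - 2)
var_formula = 2 * n2**2 * (n1 + n2 - 2) / (n1 * (n2 - 2)**2 * (n2 - 4))

mean_scipy, var_scipy = stats.f.stats(n1, n2, moments="mv")
assert abs(mean_formula - mean_scipy) < 1e-9
assert abs(var_formula - var_scipy) < 1e-9
print(mean_formula, var_formula)
```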
# This code is stored in the file fden.ssc

#=================================================#
# f.density.plot()                                #
#-------------------------------------------------#
# Input : degrees of freedom; it can be a vector. #
#         (e.g.) f.density.plot(c(1,10,10,50))    #
#         creates a plot with density curves      #
#         for (1,10) df and (10,50) df            #
# Output: density plots of the central F          #
#         distribution.                           #
#=================================================#
f.density.plot <- function(n)
{
  x <- seq(.001, 4, , 50)
  # draw x,y axis and title
  plot(c(0, 4), c(0, 1.4), type="n",
       xlab="x", ylab="f(x;n)",
       main="Densities for Central F Distributions")
  # the length of df should be even.
  legend.txt <- NULL
  d.f <- matrix(n, ncol=2, byrow=T); r <- dim(d.f)[1]
  for(i in 1:r) {
    lty.num <- 3*i - 2                    # specify the line types
    f.x <- df(x, d.f[i,1], d.f[i,2])      # calculate density
    lines(x, f.x, type="l", lty=lty.num)  # draw a curve
    legend.txt <- c(legend.txt,
      paste("(", d.f[i,1], ",", d.f[i,2], ")", sep=""))
  }
  # The following lines insert legends on plots
  # using a motif graphics window.
  legend(x = rep(1.9, r), y = rep(1.1, r), cex=1.0,
         legend = paste(legend.txt, "df"),
         lty = seq(1, by=3, length=r),
         bty = "n")
}

This function can be executed with the following commands.

par(fin=c(6,7), cex=1.0, mex=1.3, lwd=4)
source("fden.ssc")
f.density.plot(c(1,10,10,50,10,10,10,4))

[Figure: densities for central F distributions with (1,10), (10,50), (10,10), and (10,4) df; $f(x; n)$ plotted against $x$ for $0 \le x \le 4$.]

Defn 4.6: If $W_1 \sim \chi^2_{n_1}(\delta_1^2)$ and $W_2 \sim \chi^2_{n_2}$, and $W_1$ and $W_2$ are independent, then the distribution of
$$F = \frac{W_1/n_1}{W_2/n_2}$$
is called a noncentral F distribution with $n_1$ and $n_2$ degrees of freedom and noncentrality parameter $\delta_1^2$. We will use the notation $F \sim F_{n_1, n_2}(\delta_1^2)$.
Moments:
$$E(F) = \frac{n_2 (n_1 + \delta_1^2)}{n_1 (n_2 - 2)} \quad \text{for } n_2 > 2,$$
$$Var(F) = \frac{2 n_2^2 \left[ (n_1 + \delta_1^2)^2 + (n_2 - 2)(n_1 + 2\delta_1^2) \right]}{n_1^2 (n_2 - 2)^2 (n_2 - 4)} \quad \text{for } n_2 > 4.$$

[Figure: central and noncentral F densities with (5, 20) df and noncentrality parameter = 1.5; $f(x; v, w)$ plotted against $x$ for $0 \le x \le 5$.]
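The mean formula can be checked by simulation. NumPy's noncentral F generator takes a noncentrality argument that is assumed here to match $\delta_1^2$ (the sum-of-squares-of-means parameterization used in Defn 4.4); the df and noncentrality values mirror the figure above:

```python
import numpy as np

# Monte Carlo check of the noncentral F mean for (5, 20) df, delta^2 = 1.5.
n1, n2, delta2 = 5, 20, 1.5
mean_formula = n2 * (n1 + delta2) / (n1 * (n2 - 2))

rng = np.random.default_rng(42)
draws = rng.noncentral_f(n1, n2, delta2, size=500_000)
print(draws.mean(), mean_formula)   # the two should agree to about 2 decimals
assert abs(draws.mean() - mean_formula) < 0.05
```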
# This code is stored in the file: fdennc.ssc

#================================================#
# dnf: density of the noncentral F               #
#------------------------------------------------#
# Input : x     can be a scalar or a vector      #
#         v     df for numerator                 #
#         w     df for denominator               #
#         delta non-centrality parameter         #
#         (e.g.) dnf(x,5,20,1.5) when x is a     #
#                scalar,                         #
#                sapply(x,dnf,5,20,1.5) when x   #
#                is a vector                     #
# Output: evaluate density curve of the          #
#         noncentral F distribution              #
#================================================#
dnf <- function(x, v, w, delta)
{
  sum <- 1
  term <- 1
  p <- ((delta*v*x)/(w+v*x))
  nt <- 100
  for (j in 1:nt) {
    term <- term*p*(v+w+2*(j-1))/((v+2*(j-1))*j)
    sum <- sum + term
  }
  dnf.x <- exp(-delta)*sum*df(x,v,w)
  return(dnf.x)
}

[Figure: central and noncentral F densities with (5, 20) df and noncentrality parameter = 3; $f(x; v, w)$ plotted against $x$ for $0 \le x \le 5$.]
# dnf.slow is aimed to show vectorized
# calculations and use of a loop avoidance
# function (sapply). Vectorized calculations
# operate on entire vectors rather than on
# individual components in sequence.
# (V&R on p.103-108)
dnf.slow <- function(x, v, w, delta)
{
  prod.seq <- function(a,b) prod(seq(b, b+2*(a-1), 2))
  j <- 1:100
  p <- ((delta*v*x)/(w+v*x))
  numer <- sapply(j, prod.seq, v+w, simplify=T)
  denom <- gamma(j+1)*sapply(j, prod.seq, v, simplify=T)
  k <- 1 + sum( p^j * numer / denom )
  f.x <- k*exp(-delta)*df(x,v,w)
  return(f.x)
}

The following commands can be applied to obtain a single density value:

dnf(0.5,5,20,1.5)
dnf.slow(0.5,5,20,1.5)

The following commands are used to evaluate the noncentral F density for a vector of values:

x <- seq(1,10,.1)
f.x1 <- sapply(x,dnf,5,20,1.5)
f.x2 <- sapply(x,dnf.slow,5,20,1.5)

You will notice that the performance of dnf is better than that of dnf.slow. The results should be the same. In this case using a loop is better than using vectorized calculations, but it is usually more efficient to use vectorized computations.
#=============================================#
# n.f.density.plot()                          #
#---------------------------------------------#
# Input : v,w   degrees of freedom            #
#         delta non-centrality parameter      #
#         (e.g.) n.f.density.plot(5,20,1.5)   #
# Output: Two F density curves on one plot    #
#         (central and noncentral).           #
# Note  : You must define the dnf( ) function #
#         before applying this function.      #
#=============================================#
n.f.density.plot <- function(v, w, delta)
{
  x <- seq(.001, 5, length=60)
  cf.x <- df(x, v, w)
  nf.x <- sapply(x, dnf, v, w, delta)
  # For the main title
  main1.txt <- "Central and Noncentral F Densities \n with"
  df.txt <- paste(paste("(", paste(paste(v, ",", sep=""),
                  w, sep=""), sep=""), ")", sep="")
  main2.txt <- paste(df.txt,
                  "df and noncentrality parameter =")
  main2.txt <- paste(main2.txt, delta)
  main.txt <- paste(main1.txt, main2.txt)
  # create the axes, lines, and legends
  plot(c(0,5), c(0,0.8), type="n", xlab="x",
       ylab="f(x;v,w)")
  mtext(main.txt, side=3, line=2.2)
  lines(x, cf.x, type="l", lty=1)
  lines(x, nf.x, type="l", lty=3)
  legend(x=1.6, y=0.64, legend="Central F ", cex=0.9,
         lty=1, bty="n")
  legend(x=1.6, y=0.56, legend="Noncentral F", cex=0.9,
         lty=3, bty="n")
}

This function is executed with the following commands.

par(fin=c(6.5,7), cex=1.2, mex=1.5, lwd=4,
    mar=c(5,4,5,2))
source("fdennc.ssc")
n.f.density.plot(5,20,1.5)
n.f.density.plot(5,20,3.0)
Sums of squares in ANOVA tables are quadratic forms
$$Y^T A Y,$$
where $A$ is a non-negative definite symmetric matrix (usually a projection matrix).

Reminder: If $Y_1, Y_2, \ldots, Y_k$ are independent random vectors, then
$$f_1(Y_1), f_2(Y_2), \ldots, f_k(Y_k)$$
are distributed independently. Here $f_i(Y_i)$ indicates that $f_i(\cdot)$ is a function only of $Y_i$ and not a function of any other $Y_j$, $j \neq i$. These could be either real valued or vector valued functions.
To develop F-tests we need to identify conditions under which
- $Y^T A Y$ has a central (or noncentral) chi-square distribution, and
- $Y^T A_i Y$ and $Y^T A_j Y$ are independent.

Result 4.7: Let $A$ be an $n \times n$ symmetric matrix with $rank(A) = k$, and let $Y = (Y_1, \ldots, Y_n)^T \sim N(\mu, \Sigma)$, where $\Sigma$ is an $n \times n$ symmetric positive definite matrix. If $A\Sigma$ is idempotent, then
$$Y^T A Y \sim \chi^2_k(\mu^T A \mu).$$
In addition, if $A\mu = 0$, then $Y^T A Y \sim \chi^2_k$.

Proof: We will show that the definition of a noncentral chi-square random variable (Defn 4.4) is satisfied by showing that $Y^T A Y = Z^T Z$ for a normal random vector $Z = (Z_1, \ldots, Z_k)^T$ with $Var(Z) = I_{k \times k}$.

Step 1: Since $A\Sigma$ is idempotent we have $A\Sigma = A\Sigma A\Sigma$.

Step 2: Since $\Sigma$ is positive definite, $\Sigma^{-1}$ exists and we have
$$A = A\Sigma\Sigma^{-1} = A\Sigma A\Sigma\Sigma^{-1} = A\Sigma A = A^T \Sigma A.$$

Step 3: For any vector $x = (x_1, \ldots, x_n)^T$ we have
$$x^T A x = x^T A^T \Sigma A x \ge 0$$
because $\Sigma$ is positive definite. Hence, $A$ is non-negative definite and symmetric.

Step 4: From the spectral decomposition of $A$ (Result 1.12) we have
$$A = \sum_{j=1}^k \lambda_j v_j v_j^T = V D V^T,$$
where $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_k > 0$ are the positive eigenvalues of $A$,
$$D = \begin{bmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_k \end{bmatrix},$$
and the columns of $V = [\,v_1 \,|\, v_2 \,|\, \cdots \,|\, v_k\,]$ are the eigenvectors corresponding to the positive eigenvalues of $A$. The other $n - k$ eigenvalues of $A$ are zero because $rank(A) = k$.

Step 5: Define $B = V D^{-1/2}$. Since $V^T V = I$, we have
$$B^T A B = D^{-1/2} V^T V D V^T V D^{-1/2} = D^{-1/2} D D^{-1/2} = I_{k \times k}.$$
Then, since $A = A^T \Sigma A$, we have
$$I = B^T A B = B^T A^T \Sigma A B.$$

Step 6: Define $Z = B^T A Y$. Then
$$Var(Z) = B^T A^T \Sigma A B = I_{k \times k}$$
and
$$Z \sim N(B^T A \mu,\; I).$$

Step 7: From Defn 4.4,
$$Z^T Z \sim \chi^2_k(\delta^2),$$
where
$$\delta^2 = (B^T A \mu)^T (B^T A \mu) = \mu^T A^T B B^T A \mu = \mu^T A \mu.$$
Finally,
$$Z^T Z = (B^T A Y)^T (B^T A Y) = Y^T A^T B B^T A Y = Y^T A Y$$
because
$$A^T B B^T A = A B B^T A = V D V^T V D^{-1/2} D^{-1/2} V^T V D V^T = V D D^{-1} D V^T = V D V^T = A.$$
Example 4.3: For the Gauss-Markov model with
$$E(Y) = X\beta \quad \text{and} \quad Var(Y) = \sigma^2 I,$$
include the assumption that
$$Y = (Y_1, \ldots, Y_n)^T \sim N(X\beta,\; \sigma^2 I).$$
For any solution $b = (X^T X)^{-} X^T Y$ to the normal equations, the OLS estimator for $X\beta$ is
$$\hat{Y} = X b = X (X^T X)^{-} X^T Y = P_X Y$$
and the residual vector is $e = Y - \hat{Y} = (I - P_X) Y$. The sum of squared residuals is
$$SSE = \sum_{i=1}^n e_i^2 = e^T e = Y^T (I - P_X) Y.$$

Use Result 4.7 to obtain the distribution of
$$\frac{SSE}{\sigma^2} = Y^T \left[ \frac{1}{\sigma^2} (I - P_X) \right] Y.$$
Here
$$\mu = E(Y) = X\beta, \qquad \Sigma = Var(Y) = \sigma^2 I \text{ is p.d.}, \qquad A = \frac{1}{\sigma^2}(I - P_X) \text{ is symmetric}.$$
Note that
$$A\Sigma = \frac{1}{\sigma^2}(I - P_X)\,\sigma^2 I = I - P_X$$
is idempotent, and
$$A\mu = \frac{1}{\sigma^2}(I - P_X) X\beta = 0.$$
Then
$$\frac{SSE}{\sigma^2} \sim \chi^2_{n-k},$$
where
$$rank(I - P_X) = n - rank(X) = n - k.$$
We could also express this as $SSE \sim \sigma^2 \chi^2_{n-k}$.

Now consider the "uncorrected" model sum of squares
$$\sum_{i=1}^n \hat{Y}_i^2 = \hat{Y}^T \hat{Y} = (P_X Y)^T P_X Y = Y^T P_X^T P_X Y = Y^T P_X Y.$$
Use Result 4.7 to show
$$\frac{1}{\sigma^2} \sum_{i=1}^n \hat{Y}_i^2 = Y^T \left( \frac{1}{\sigma^2} P_X \right) Y \sim \chi^2_k(\delta^2),$$
where $k = rank(X)$, $A\Sigma = P_X$, and
$$\delta^2 = (X\beta)^T \left( \frac{1}{\sigma^2} P_X \right) (X\beta) = \frac{1}{\sigma^2} \beta^T X^T (P_X X) \beta = \frac{1}{\sigma^2} \beta^T X^T X \beta,$$
since $P_X X = X$.
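The hypotheses of Result 4.7 used in this example can be verified numerically for a concrete design matrix. An illustrative NumPy sketch (the matrix, $\sigma^2$, and $\beta$ values are made up for the check):

```python
import numpy as np

# Check Result 4.7's hypotheses for A = (1/sigma^2)(I - P_X), Sigma = sigma^2 I.
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
n = X.shape[0]
sigma2 = 2.5
P_X = X @ np.linalg.pinv(X.T @ X) @ X.T

A = (np.eye(n) - P_X) / sigma2
Sigma = sigma2 * np.eye(n)

# A Sigma = I - P_X is idempotent ...
ASigma = A @ Sigma
assert np.allclose(ASigma @ ASigma, ASigma)

# ... and A mu = A X beta = 0 for any beta, so SSE/sigma^2 ~ chi^2_{n-k}.
beta = np.array([0.7, -1.2])
assert np.allclose(A @ (X @ beta), 0.0)
print(n - np.linalg.matrix_rank(X))   # degrees of freedom n - k
```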
The next result addresses the independence of several quadratic forms.

Result 4.8: Let $Y = (Y_1, \ldots, Y_n)^T \sim N(\mu, \Sigma)$ and let $A_1, A_2, \ldots, A_p$ be $n \times n$ symmetric matrices. If
$$A_i \Sigma A_j = 0 \quad \text{for all } i \neq j,$$
then
$$Y^T A_1 Y,\; Y^T A_2 Y,\; \ldots,\; Y^T A_p Y$$
are independent random variables.

Proof: From Result 4.1,
$$\begin{bmatrix} A_1 Y \\ \vdots \\ A_p Y \end{bmatrix} = \begin{bmatrix} A_1 \\ \vdots \\ A_p \end{bmatrix} Y$$
has a multivariate normal distribution, and for $i \neq j$,
$$Cov(A_i Y, A_j Y) = A_i \Sigma A_j^T = 0.$$
It follows from Result 4.4 that $A_1 Y, A_2 Y, \ldots, A_p Y$ are independent random vectors. Since
$$Y^T A_i Y = Y^T A_i A_i^{-} A_i Y = Y^T A_i^T A_i^{-} A_i Y = (A_i Y)^T A_i^{-} (A_i Y)$$
is a function of $A_i Y$ only, it follows that $Y^T A_1 Y, \ldots, Y^T A_p Y$ are independent random variables.
Example 4.4: Continuing Example 4.3, show that the "uncorrected" model sum of squares
$$\sum_{i=1}^n \hat{Y}_i^2 = Y^T P_X Y$$
and the sum of squared residuals
$$\sum_{i=1}^n (Y_i - \hat{Y}_i)^2 = Y^T (I - P_X) Y$$
are independently distributed for the "normal theory" Gauss-Markov model where $Y \sim N(X\beta, \sigma^2 I)$.

Use Result 4.8 with $A_1 = P_X$ and $A_2 = I - P_X$. Note that
$$A_2 \Sigma A_1 = (I - P_X)(\sigma^2 I) P_X = \sigma^2 (I - P_X) P_X = \sigma^2 (P_X - P_X P_X) = \sigma^2 (P_X - P_X) = 0.$$
Consequently,
$$\frac{1}{\sigma^2} \sum_{i=1}^n \hat{Y}_i^2 \quad \text{and} \quad \frac{1}{\sigma^2} \sum_{i=1}^n (Y_i - \hat{Y}_i)^2$$
are independently distributed.
In Example 4.3 we showed that
$$\frac{1}{\sigma^2} \sum_{i=1}^n \hat{Y}_i^2 \sim \chi^2_k\!\left( \frac{1}{\sigma^2} \beta^T X^T X \beta \right) \quad \text{and} \quad \frac{1}{\sigma^2} \sum_{i=1}^n (Y_i - \hat{Y}_i)^2 \sim \chi^2_{n-k},$$
where $k = rank(X)$. By Defn 4.6,
$$F = \frac{\frac{1}{k\sigma^2} \sum_{i=1}^n \hat{Y}_i^2}{\frac{1}{(n-k)\sigma^2} \sum_{i=1}^n (Y_i - \hat{Y}_i)^2} = \frac{\frac{1}{k} \sum_{i=1}^n \hat{Y}_i^2}{\frac{1}{n-k} \sum_{i=1}^n (Y_i - \hat{Y}_i)^2} = \frac{\text{uncorrected model mean square}}{\text{residual mean square}} \;\sim\; F_{k,\,n-k}\!\left( \frac{1}{\sigma^2} \beta^T X^T X \beta \right).$$
This reduces to a central F distribution with $(k, n-k)$ d.f. when $X\beta = 0$.
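The F ratio above is easy to compute on concrete data. A worked sketch on illustrative numbers (the data, design, and the use of SciPy's central F tail probability are additions, not part of the notes):

```python
import numpy as np
from scipy import stats

# Worked version of the F ratio on illustrative data.
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
Y = np.array([1.8, 2.1, 3.9, 4.2, 6.1])
n = len(Y)
k = np.linalg.matrix_rank(X)

P_X = X @ np.linalg.pinv(X.T @ X) @ X.T
model_ss = Y @ P_X @ Y                    # uncorrected model sum of squares
resid_ss = Y @ (np.eye(n) - P_X) @ Y      # sum of squared residuals

F = (model_ss / k) / (resid_ss / (n - k))
p_value = stats.f.sf(F, k, n - k)         # test of H0: X beta = 0
print(F, p_value)
```

Note that the two sums of squares partition the uncorrected total: model_ss + resid_ss equals $Y^T Y$.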
Use
$$F = \frac{\frac{1}{k} \sum_{i=1}^n \hat{Y}_i^2}{\frac{1}{n-k} \sum_{i=1}^n (Y_i - \hat{Y}_i)^2}$$
to test the null hypothesis
$$H_0: E(Y) = X\beta = 0$$
against the alternative
$$H_A: E(Y) = X\beta \neq 0.$$

Comments:

(i) The null hypothesis corresponds to the condition under which $F$ has a central F distribution (the noncentrality parameter is zero). In this example,
$$\delta^2 = \frac{1}{\sigma^2} (X\beta)^T (X\beta) = 0 \quad \text{if and only if} \quad X\beta = 0.$$

(ii) If $k = rank(X)$ equals the number of columns in $X$, then $H_0: X\beta = 0$ is equivalent to $H_0: \beta = 0$.
(iii) If $k = rank(X)$ is less than the number of columns in $X$, then $X\beta = 0$ for some $\beta \neq 0$, and $H_0: X\beta = 0$ is not equivalent to $H_0: \beta = 0$.

Example 4.4 is a simple illustration of a typical partition of an uncorrected total sum of squares:
$$\sum_{i=1}^n Y_i^2 = Y^T Y = Y^T [\,(I - P_X) + P_X\,] Y = Y^T \underbrace{(I - P_X)}_{\text{call this } A_2} Y + Y^T \underbrace{P_X}_{\text{call this } A_1} Y = \sum_{i=1}^n (Y_i - \hat{Y}_i)^2 + \sum_{i=1}^n \hat{Y}_i^2,$$
with d.f. $= rank(A_2)$ and d.f. $= rank(A_1)$, respectively.
More generally, an uncorrected total sum of squares can be partitioned as
$$\sum_{i=1}^n Y_i^2 = Y^T Y = Y^T A_1 Y + Y^T A_2 Y + \cdots + Y^T A_k Y$$
using orthogonal projection matrices
$$A_1 + A_2 + \cdots + A_k = I_{n \times n},$$
where $A_i A_j = 0$ for any $i \neq j$. Since we are dealing with orthogonal projection matrices we also have
$$A_i^T = A_i \quad \text{(symmetry)}, \qquad A_i A_i = A_i \quad \text{(idempotent matrices)},$$
and
$$rank(A_1) + rank(A_2) + \cdots + rank(A_k) = n.$$
Result 4.9 (Cochran's Theorem): Let $Y = (Y_1, \ldots, Y_n)^T \sim N(\mu, \sigma^2 I)$ and let $A_1, A_2, \ldots, A_k$ be $n \times n$ symmetric matrices with
$$I = A_1 + A_2 + \cdots + A_k$$
and
$$n = r_1 + r_2 + \cdots + r_k,$$
where $r_i = rank(A_i)$. Then, for $i = 1, 2, \ldots, k$,
$$\frac{1}{\sigma^2} Y^T A_i Y \sim \chi^2_{r_i}\!\left( \frac{1}{\sigma^2} \mu^T A_i \mu \right)$$
and $Y^T A_1 Y, Y^T A_2 Y, \ldots, Y^T A_k Y$ are distributed independently.

Proof: This result follows directly from Result 4.7, Result 4.8 and the following Result 4.10.
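A Cochran-style partition can be illustrated numerically with the classic one-way ANOVA decomposition $I = P_1 + (P_X - P_1) + (I - P_X)$, where $P_1$ projects onto the constant vector and $P_X$ onto the group-indicator columns. The group layout below is an illustrative choice:

```python
import numpy as np

# Cochran-style partition I = A1 + A2 + A3 for a one-way layout.
g = np.repeat([0, 1, 2], [3, 3, 4])              # three groups of sizes 3, 3, 4
n = len(g)
X = (g[:, None] == np.arange(3)).astype(float)   # group-indicator design
one = np.ones((n, 1))

P1 = one @ one.T / n                             # grand-mean projection
PX = X @ np.linalg.pinv(X.T @ X) @ X.T           # group-means projection
A = [P1, PX - P1, np.eye(n) - PX]                # the partition

assert np.allclose(sum(A), np.eye(n))            # A1 + A2 + A3 = I
for i in range(3):
    assert np.allclose(A[i] @ A[i], A[i])        # each Ai idempotent
    for j in range(3):
        if i != j:
            assert np.allclose(A[i] @ A[j], 0.0) # Ai Aj = 0 for i != j
ranks = [np.linalg.matrix_rank(Ai) for Ai in A]
print(ranks, sum(ranks))                         # ranks sum to n
```

Here the ranks are 1, 2, and $n - 3$, matching the usual mean, between-group, and within-group degrees of freedom.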
Result 4.10: Let $A_1, A_2, \ldots, A_k$ be $n \times n$ symmetric matrices such that
$$A_1 + A_2 + \cdots + A_k = I.$$
Then the following statements are equivalent:

(i) $A_i A_j = 0$ for any $i \neq j$
(ii) $A_i A_i = A_i$ for all $i = 1, \ldots, k$
(iii) $rank(A_1) + \cdots + rank(A_k) = n$

Proof: First show that (i) $\Rightarrow$ (ii). Since $I = A_i + \sum_{j \neq i} A_j$, we have
$$A_i A_i = A_i \left( I - \sum_{j \neq i} A_j \right) = A_i - \sum_{j \neq i} A_i A_j = A_i.$$

Now show that (ii) $\Rightarrow$ (iii). Since an idempotent matrix has eigenvalues that are either 0 or 1, and the number of non-zero eigenvalues is the rank of the matrix, (ii) implies that $tr(A_i) = rank(A_i)$. Then,
$$n = tr(I) = tr(A_1 + A_2 + \cdots + A_k) = tr(A_1) + tr(A_2) + \cdots + tr(A_k) = rank(A_1) + rank(A_2) + \cdots + rank(A_k).$$
Finally, show that (iii) $\Rightarrow$ (i). Let $r_i = rank(A_i)$. Since $A_i$ is symmetric, we can apply the spectral decomposition (Result 1.12) to write $A_i$ as
$$A_i = U_i \Lambda_i U_i^T,$$
where $\Lambda_i$ is an $r_i \times r_i$ diagonal matrix containing the non-zero eigenvalues of $A_i$ and
$$U_i = [\, u_{1i} \,|\, u_{2i} \,|\, \cdots \,|\, u_{r_i,i} \,]$$
is an $n \times r_i$ matrix whose columns are the eigenvectors corresponding to the non-zero eigenvalues of $A_i$. Then
$$I = A_1 + A_2 + \cdots + A_k = U_1 \Lambda_1 U_1^T + \cdots + U_k \Lambda_k U_k^T = [\,U_1 | \cdots | U_k\,] \begin{bmatrix} \Lambda_1 & & \\ & \ddots & \\ & & \Lambda_k \end{bmatrix} \begin{bmatrix} U_1^T \\ \vdots \\ U_k^T \end{bmatrix} = U \Lambda U^T,$$
where $U = [\,U_1 | \cdots | U_k\,]$ and $\Lambda$ is the block diagonal matrix with blocks $\Lambda_1, \ldots, \Lambda_k$.

Since $rank(A_1) + \cdots + rank(A_k) = n$ and $rank(A_i)$ is the number of columns in $U_i$, $U = [\,U_1 | \cdots | U_k\,]$ is an $n \times n$ matrix. Furthermore, $rank(U) = n$ because the identity matrix on the left side of the equal sign has rank $n$. Then, $U^T U$ is an $n \times n$ matrix of full rank and $(U^T U)^{-1}$ exists, and
$$I = U \Lambda U^T \;\Rightarrow\; U^T U = U^T U \Lambda U^T U \;\Rightarrow\; (U^T U)^{-1} U^T U = \Lambda U^T U \;\Rightarrow\; I = \Lambda U^T U.$$
It follows that $U^T U = \Lambda^{-1}$ is block diagonal, so
$$U_i^T U_j = 0 \quad \text{(a matrix of zeros) for any } i \neq j.$$
Consequently,
$$A_i A_j = U_i \Lambda_i (U_i^T U_j) \Lambda_j U_j^T = 0 \quad \text{for any } i \neq j.$$
References:

Johnson, R. A. and Wichern, D. W. (1998) Applied Multivariate Statistical Analysis, 4th edition, Prentice Hall (Chapter 4).

Anderson, T. W. (1984) An Introduction to Multivariate Statistical Analysis, 2nd edition, Wiley, New York.

Johnson, N. L., Kotz, S. and Balakrishnan, N. (1994, 95) Distributions in Statistics: Continuous Univariate Distributions, Vols. 1 and 2, 2nd edition, Wiley, New York.

Searle, S. R. (1987) Linear Models for Unbalanced Data, Wiley, New York (Chapters 7 and 8).

Rencher, A. C. (2000) Linear Models in Statistics, Wiley, New York (Chapters 4 and 5).