Distributions of Quadratic Forms c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 1 / 31 Under the Normal Theory GMM (NTGMM), y = Xβ + ε, where ε ∼ N(0, σ 2 I). By Result 5.3, the NTGMM =⇒ y ∼ N(Xβ, σ 2 I). c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 2 / 31 Mean of y determined by β through Xβ. Variance of y determined by σ 2 . y = PX y + (I − PX )y c Copyright 2012 Dan Nettleton (Iowa State University) PX y ∈ C(X), (I − PX )y ∈ C(X)⊥ . Statistics 611 3 / 31 We use ŷ = PX y to estimate Xβ (PX y = Xβ̂) We use ε̂ = (I − PX )y to estimate σ 2 . σ̂ 2 = c Copyright 2012 Dan Nettleton (Iowa State University) ε̂0 ε̂ n−rank(X) . Statistics 611 4 / 31 Also, recall that y0 y = ŷ0 ŷ + ε̂0 ε̂ SSTO = SSR + SSE. Under the NTGMM, what can we say about the distribution of these sums of squares? c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 5 / 31 Lemma 5.1: A p × p symmetric matrix A is idempotent with rank s iff ∃ a p × s matrix G with orthogonal columns such that c Copyright 2012 Dan Nettleton (Iowa State University) A = GG0 . Statistics 611 6 / 31 Proof of Lemma 5.1: (=⇒) By the Spectral Decomposition Theorem, A = QΛQ0 = p X λi qi q0i , i=1 where Q = [q1 , . . . , qp ], c Copyright 2012 Dan Nettleton (Iowa State University) Q0 Q = I, Λ = diag(λ1 , . . . , λp ). Statistics 611 7 / 31 Because A is idempotent, λi ∈ {0, 1} ∀ i = 1, . . . , p. Because rank(A) = s, ∃ exactly s of λ1 , . . . , λp equal to 1 and p − s of λ1 , . . . , λp equal to 0. Let i1 , . . . , is be 3 λi1 = · · · = λis = 1. Let G = [qi1 , . . . , qis ]. Then A= = p X i=1 s X λi qi q0i = s X λij qij q0ij j=1 qij q0ij = GG0 and G0 G = I. j=1 c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 8 / 31 (⇐=) If A = GG0 , where G is a p × s matrix with orthonormal columns. Then rank(GG0 ) = rank(G0 ) = rank(G) = rank(G0 G) = rank(s×s I) = s. Thus, rank(A) = s. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 9 / 31 Furthermore, A0 = (GG0 )0 = GG0 = A and AA = (GG0 )(GG0 ) = G(G0 G)G0 = GIG0 = GG0 = A. ∴ A is also symmetric and idempotent. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 10 / 31 Result 5.14: Let X ∼ N(µ, p×p I ) and letp×p A be a symmetric matrix. Then A is idempotent with rank s =⇒ X0 AX ∼ χ2s (µ0 Aµ/2). c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 11 / 31 Proof of Result 5.14: By Lemma 5.1, ∃ G 3 A = GG0 and G0 G =s×s I. Then G0 X ∼ N(G0 µ, G0 IG = G0 G =s×s I ). Thus, by Result 5.9, (G0 X)0 (G0 X) ∼ χ2s ((G0 µ)0 G0 µ/2). (G0 X)0 (G0 X) = X0 GG0 X = X0 AX (G0 µ)0 G0 µ = µ0 GG0 µ = µ0 Aµ. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 12 / 31 Result 5.15: Suppose X ∼ N(µ, Σ), with p×p Σ of rank p. Suppose A is p × p and symmetric. Then AΣ is idempotent of rank s =⇒ X0 AX ∼ χ2s (µ0 Aµ/2). c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 13 / 31 Proof of Result 5.15: Let W = Σ−1/2 X. Then W ∼ N(Σ−1/2 µ, Σ−1/2 ΣΣ−1/2 = I). Let B = Σ1/2 AΣ1/2 . Then B is symmetric by symmetry of Σ1/2 and A. Furthermore, rank(B) = rank(Σ1/2 AΣ1/2 ) = rank(A) = rank(AΣ) = s. ∵ Σ1/2 and Σ are full-rank. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 14 / 31 Finally, note that B is idempotent: AΣAΣ = AΣ ⇐⇒ Σ1/2 AΣAΣ = Σ1/2 AΣ ⇐⇒ Σ1/2 AΣAΣΣ−1/2 = Σ1/2 AΣΣ−1/2 ⇐⇒ Σ1/2 AΣAΣ1/2 = Σ1/2 AΣ1/2 ⇐⇒ Σ1/2 AΣ1/2 Σ1/2 AΣ1/2 = Σ1/2 AΣ1/2 ⇐⇒ BB = B. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 15 / 31 Thus, by Result 5.14, W 0 BW ∼ χ2s ((Σ−1/2 µ)0 B(Σ−1/2 µ)/2). Now note W 0 BW = X0 Σ−1/2 Σ1/2 AΣ1/2 Σ−1/2 X = X0 AX and likewise (Σ−1/2 µ)0 B(Σ−1/2 µ) = µ0 Aµ. ∴ X0 AX ∼ χ2s (µ0 Aµ/2). c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 16 / 31 Find the distribution of SSE. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 17 / 31 SSE = ε̂0 ε̂ = y0 (I − PX )y I − PX y. = σ 2 y0 σ2 Let A= I − PX σ2 Then AΣ = and Σ = σ 2 I = Var(y). I − PX 2 σ I = I − PX . σ2 Thus, AΣ is idempotent and rank(AΣ) = n − rank(X). c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 18 / 31 By Result 5.15, 0 y I − PX σ2 y∼ χ2n−rank(X) 1 0 0 βX 2 I − PX σ2 Xβ . ∵ (I − PX )X = X − PX X = X − X = 0, we have y 0 I − PX σ2 y ∼ χ2n−rank(X) . Thus, SSE ∼ σ 2 χ2n−rank(X) , which is a scaled Chi-Square distribution. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 19 / 31 Similarly, SSR ∼ χ2rank(X) σ2 c Copyright 2012 Dan Nettleton (Iowa State University) 1 0 0 β X Xβ/σ 2 . 2 Statistics 611 20 / 31 Result 5.16: Suppose X ∼ N(µ, Σ) and A is symmetric with rank s. Then BΣA = 0 =⇒ BX and X0 AX are independent. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 21 / 31 By the SDT, we have A = QΛQ0 , p×p where Q is square with orthonormal columns q1 , . . . , qp and Λ = diag(λ1 , . . . , λp ) with exactly s of λ1 , . . . , λp not equal to zero. Because QΛQ0 = p X λi qi q0i , i=1 we can without loss of generality (WLOG) assume λ1 , . . . , λs 6= 0. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 22 / 31 Thus, A= s X λi qi q0i = Q1 Λ1 Q01 , i=1 where Q1 = [q1 , . . . , qs ], Q01 Q1 =s×s I Λ1 = diag(λ1 , . . . , λs ) c Copyright 2012 Dan Nettleton (Iowa State University) and Λ−1 1 = diag 1 1 ,..., λ1 λs . Statistics 611 23 / 31 " Now consider BX Q01 X # " = " # B Q01 # B Q01 X. Then " X∼N B Q01 # ! µ, V , where " V= # B Q01 " = c Copyright 2012 Dan Nettleton (Iowa State University) h Σ B0 BΣB0 Q1 i BΣQ1 Q01 ΣB0 Q01 ΣQ1 # . Statistics 611 24 / 31 BΣA = 0 =⇒ BΣQ1 Λ1 Q01 = 0 =⇒ BΣQ1 Λ1 Q01 Q1 = 0Q1 =⇒ BΣQ1 Λ1 = 0 −1 =⇒ BΣQ1 Λ1 Λ−1 1 = 0Λ1 =⇒ BΣQ1 = 0 =⇒ BX and Q01 X independent by Result 5.4. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 25 / 31 Now BX and Q01 X are independent, =⇒ BX and (Q01 X)0 Λ1 Q01 X independent =⇒ BX and X0 Q1 Λ1 Q01 X independent =⇒ BX and X0 AX independent. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 26 / 31 Corollary 5.4: Suppose X ∼ N(µ, Σ), A symmetric with rank r, B symmetric with rank s. Then BΣA = 0 =⇒ X0 AX and X0 BX are independent. Proof: HW problem. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 27 / 31 Find the distribution of SSR/r , SSE/(n − r) c Copyright 2012 Dan Nettleton (Iowa State University) where r = rank(X). Statistics 611 28 / 31 We have seen that SSR ∼ χ2r σ2 1 0 0 β X Xβ/σ 2 2 and SSE ∼ χ2n−r . σ2 c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 29 / 31 Thus, SSR/r = SSE/(n − r) will have the distribution Fr,n−r SSR /r σ2 SSE /(n − σ2 1 0 0 2 β X Xβ/σ 2 r) if SSR and SSE independent. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 30 / 31 SSR = y0 PX y, SSE = y0 (I − PX )y (I − PX )(σ 2 I)PX = σ 2 (I − PX )PX = σ 2 (PX − PX PX ) = σ 2 (PX − PX ) = 0. By Corollary 5.4, SSR and SSE are independent. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 31 / 31