The Multivariate Normal Distribution
Copyright © 2012 Dan Nettleton (Iowa State University), Statistics 611

The Moment Generating Function (MGF) of a random vector $X$ is given by
\[
M_X(t) = E(e^{t'X}),
\]
provided there exists $h > 0$ such that $E(e^{t'X})$ exists for all $t = (t_1, \ldots, t_n)'$ with $t_i \in (-h, h)$ for all $i = 1, \ldots, n$.

Result 5.1: If the MGFs of two random vectors $X_1$ and $X_2$ exist in an open rectangle $R$ that includes the origin, then the cumulative distribution functions (CDFs) of $X_1$ and $X_2$ are identical iff $M_{X_1}(t) = M_{X_2}(t)$ for all $t \in R$.

A random variable $Z$ with MGF
\[
M_Z(t) = E(e^{tZ}) = e^{t^2/2}
\]
is said to have a standard normal distribution.

Show that a random variable $Z$ with density
\[
f_Z(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}
\]
has a standard normal distribution.

\begin{align*}
E(e^{tZ}) &= \int_{-\infty}^{\infty} e^{tz} \frac{1}{\sqrt{2\pi}} e^{-z^2/2}\,dz
= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}(z^2 - 2tz)}\,dz \\
&= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}(z^2 - 2tz + t^2 - t^2)}\,dz
= e^{t^2/2} \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}(z - t)^2}\,dz \\
&= e^{t^2/2}.
\end{align*}

Suppose $Z$ has a standard normal distribution. Then
\[
E(Z) = \left.\frac{\partial M_Z(t)}{\partial t}\right|_{t=0}
= \left.\frac{\partial e^{t^2/2}}{\partial t}\right|_{t=0}
= \left. e^{t^2/2}\, t \,\right|_{t=0} = 0,
\]
\[
E(Z^2) = \left.\frac{\partial^2 M_Z(t)}{\partial t^2}\right|_{t=0}
= \left. \left( e^{t^2/2} + t^2 e^{t^2/2} \right) \right|_{t=0} = 1.
\]

Thus, $E(Z) = 0$ and $\mathrm{Var}(Z) = 1$.

If $Z$ is standard normal, then $Y = \mu + \sigma Z$ has mean
\[
E(Y) = E(\mu + \sigma Z) = \mu + \sigma E(Z) = \mu
\]
and variance
\[
\mathrm{Var}(Y) = \mathrm{Var}(\mu + \sigma Z) = \mathrm{Var}(\sigma Z) = \sigma^2 \mathrm{Var}(Z) = \sigma^2.
\]

Furthermore, the MGF of $Y$ is
\[
M_Y(t) = E(e^{tY}) = E(e^{t(\mu + \sigma Z)}) = e^{t\mu} E(e^{t\sigma Z})
= e^{t\mu} M_Z(t\sigma) = e^{t\mu} e^{t^2\sigma^2/2} = e^{t\mu + t^2\sigma^2/2}.
\]
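The MGF computation above can be checked numerically. The following sketch (an illustration added here, not part of the original slides; sample size and seed are arbitrary) compares the sample analogue of $E(e^{tZ})$ with $e^{t^2/2}$ and verifies that $Y = \mu + \sigma Z$ has approximately mean $\mu$ and variance $\sigma^2$.

```python
# Monte Carlo check of the standard normal MGF and of Y = mu + sigma*Z.
# (Hypothetical illustration; the slides derive these results analytically.)
import numpy as np

rng = np.random.default_rng(0)
z = rng.standard_normal(1_000_000)       # draws from N(0, 1)

t = 0.5
mgf_empirical = np.exp(t * z).mean()     # sample estimate of E(e^{tZ})
mgf_exact = np.exp(t**2 / 2)             # e^{t^2/2} from the derivation
print(mgf_empirical, mgf_exact)          # the two should be close

mu, sigma = 3.0, 2.0
y = mu + sigma * z
print(y.mean(), y.var())                 # approximately mu and sigma^2
```

With a million draws the Monte Carlo error is small enough that both comparisons agree to about two decimal places.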
A random variable $Y$ with MGF
\[
M_Y(t) = e^{t\mu + t^2\sigma^2/2}
\]
is said to have a normal distribution with mean $\mu$ and variance $\sigma^2$. We denote the distribution of $Y$ by $N(\mu, \sigma^2)$.

If $Y \sim N(\mu, \sigma^2)$, then
\[
P(Y \le y) = P(\mu + \sigma Z \le y) = P\!\left(Z \le \frac{y - \mu}{\sigma}\right).
\]
Thus, the density of $Y$ is
\[
\frac{\partial P(Y \le y)}{\partial y}
= \frac{\partial P\!\left(Z \le \frac{y-\mu}{\sigma}\right)}{\partial y}
= \frac{1}{\sigma}\, f_Z\!\left(\frac{y-\mu}{\sigma}\right)
= \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{1}{2}\left(\frac{y-\mu}{\sigma}\right)^2}.
\]
That is,
\[
f_Y(y) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{1}{2}\left(\frac{y-\mu}{\sigma}\right)^2}
= \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(y-\mu)^2}{2\sigma^2}}.
\]

Suppose $Z_1, \ldots, Z_p \overset{\text{i.i.d.}}{\sim} N(0, 1)$. Then
\[
Z = \begin{bmatrix} Z_1 \\ \vdots \\ Z_p \end{bmatrix}
\]
is said to have a standard multivariate normal distribution, with
\[
E(Z) = 0, \qquad \mathrm{Var}(Z) = I.
\]

Find the Moment Generating Function of a standard multivariate normal random vector $Z$ ($p \times 1$).

\begin{align*}
E(e^{t'Z}) &= E\!\left(e^{\sum_{i=1}^p t_i Z_i}\right)
= E\!\left(\prod_{i=1}^p e^{t_i Z_i}\right)
= \prod_{i=1}^p E(e^{t_i Z_i}) \\
&= \prod_{i=1}^p M_{Z_i}(t_i)
= \prod_{i=1}^p e^{t_i^2/2}
= e^{\sum_{i=1}^p t_i^2/2}
= e^{t't/2}.
\end{align*}

A $p$-dimensional random vector $Y$ has the Multivariate Normal Distribution with mean $\mu$ and variance $\Sigma$ ($Y \sim N(\mu, \Sigma)$) iff the MGF of $Y$ is
\[
M_Y(t) = e^{t'\mu + t'\Sigma t/2}.
\]

Suppose $Z \sim N(0, I)$. Show that $Y = \mu + AZ$ has a multivariate normal distribution.

If $Z$ is standard multivariate normal, then $Y = \mu + AZ$ has mean
\[
E(Y) = E(\mu + AZ) = \mu + A\,E(Z) = \mu
\]
and variance
\[
\mathrm{Var}(Y) = \mathrm{Var}(\mu + AZ) = \mathrm{Var}(AZ) = A\,\mathrm{Var}(Z)\,A' = AA'.
\]
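A simulation can illustrate the mean and variance of $Y = \mu + AZ$. The following sketch (a hypothetical example added here, not from the slides; the particular $\mu$ and $A$ are arbitrary) draws many $Z \sim N(0, I)$ vectors, applies the transform, and compares the sample mean and sample covariance with $\mu$ and $AA'$.

```python
# Simulation check that Y = mu + A Z has mean mu and variance A A'.
# (Hypothetical illustration; mu and A are arbitrary choices.)
import numpy as np

rng = np.random.default_rng(1)
mu = np.array([1.0, -2.0])
A = np.array([[2.0, 0.0],
              [1.0, 3.0]])

Z = rng.standard_normal((2, 500_000))   # each column is an N(0, I) draw
Y = mu[:, None] + A @ Z                 # Y = mu + A Z, column by column

print(Y.mean(axis=1))                   # approximately mu
print(np.cov(Y))                        # approximately A @ A.T
```

Here $AA' = \begin{bmatrix} 4 & 2 \\ 2 & 10 \end{bmatrix}$, and the sample covariance matches it to within Monte Carlo error.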
Furthermore,
\[
M_Y(t) = E(e^{t'Y}) = E(e^{t'(\mu + AZ)}) = e^{t'\mu} E(e^{t'AZ})
= e^{t'\mu} M_Z(A't) = e^{t'\mu + t'AA't/2}.
\]

Note that if $A$ is $q \times p$ with $\mathrm{rank}(A) < q$, then $\mathrm{Var}(Y) = \mathrm{Var}(\mu + AZ) = AA'$ will be singular. In this case, the support of the $q \times 1$ random vector $Y$ will lie within a $\mathrm{rank}(A)$ ($< q$)-dimensional vector space.

Give a specific example of a singular multivariate normal distribution ($Y \sim N(\mu, \Sigma)$, $\Sigma$ singular).

Suppose $Z \sim N(0, 1)$ and
\[
A = \begin{bmatrix} 1 \\ 1 \end{bmatrix},
\]
so that $A$ is $2 \times 1$ with $\mathrm{rank}(A) = 1 < 2$. Let $Y = AZ$. Then
\[
Y \sim N\!\left(0,\; AA' = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}\right).
\]
$Y$ lies in the 1-dimensional vector space
\[
C = \{\, Y = (Y_1, Y_2)' \in \mathbb{R}^2 : Y_1 = Y_2 \,\}
\]
with probability 1.

Result 5.3: If $X \sim N(\mu, \Sigma)$ and $Y = a + BX$, then $Y \sim N(a + B\mu, B\Sigma B')$.

Proof of Result 5.3:
\begin{align*}
E(e^{t'Y}) &= E(e^{t'a + t'BX}) = e^{t'a} E(e^{t'BX}) = e^{t'a} M_X(B't) \\
&= e^{t'a} e^{t'B\mu + t'B\Sigma B't/2} = e^{t'(a + B\mu) + t'B\Sigma B't/2}.
\end{align*}
Thus, $Y \sim N(a + B\mu, B\Sigma B')$.

Corollary 5.1: If $X$ ($p \times 1$) is multivariate normal (MVN), then the joint distribution of any subvector of $X$ is MVN.

Proof of Corollary 5.1: Suppose $\{i_1, \ldots, i_q\} \subseteq \{1, \ldots, p\}$. Then
\[
\begin{bmatrix} X_{i_1} \\ \vdots \\ X_{i_q} \end{bmatrix}
= \begin{bmatrix} e_{i_1}' \\ \vdots \\ e_{i_q}' \end{bmatrix} X,
\]
where $e_i'$ denotes the $i$th row of the $p \times p$ identity matrix. Thus
\[
\begin{bmatrix} e_{i_1}' \\ \vdots \\ e_{i_q}' \end{bmatrix} X \sim \text{MVN}
\]
by Result 5.3.
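The singular example above is easy to verify numerically. This sketch (an illustration added here, not from the slides) simulates $Y = AZ$ with $A = (1, 1)'$ and confirms that every draw satisfies $Y_1 = Y_2$ and that $AA'$ is singular.

```python
# Numerical check of the singular MVN example: A = (1, 1)', Y = A Z.
# (Hypothetical illustration of the slide's example.)
import numpy as np

rng = np.random.default_rng(2)
A = np.array([[1.0],
              [1.0]])                   # 2 x 1, rank(A) = 1 < 2

Z = rng.standard_normal((1, 10_000))    # Z ~ N(0, 1)
Y = A @ Z                               # Y = A Z

print(np.allclose(Y[0], Y[1]))          # True: Y1 = Y2 with probability 1
Sigma = A @ A.T                         # [[1, 1], [1, 1]]
print(np.linalg.det(Sigma))             # determinant 0: Sigma is singular
print(np.linalg.matrix_rank(Sigma))     # rank 1
```

Every simulated point falls on the line $Y_1 = Y_2$, matching the statement that the support lies in a rank$(A)$-dimensional subspace.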
Corollary 5.2: If $X \sim N(\mu, \Sigma)$ ($p \times 1$) and $\Sigma$ is nonsingular, then

(a) there exists a nonsingular matrix $A$ such that $\Sigma = AA'$,

(b) $A^{-1}(X - \mu) \sim N(0, I)$, and

(c) the probability density function of $X$ is
\[
f_X(t) = (2\pi)^{-p/2} |\Sigma|^{-1/2} e^{-\frac{1}{2}(t - \mu)'\Sigma^{-1}(t - \mu)}.
\]

Proof of Corollary 5.2:

(a) We can take $A = \Sigma^{1/2}$ because $\Sigma$ is symmetric and positive definite. Because $\Sigma$ is positive definite, $(\Sigma^{1/2})^{-1} = \Sigma^{-1/2}$ exists.

(b) By Result 5.3,
\[
A^{-1}(X - \mu) \sim N\!\left(A^{-1}\mu - A^{-1}\mu,\; A^{-1}\Sigma(A^{-1})'\right),
\]
with $A^{-1}\mu - A^{-1}\mu = 0$ and $A^{-1}\Sigma(A^{-1})' = \Sigma^{-1/2}\Sigma\Sigma^{-1/2} = I$.

(c) Homework problem. You may wish to use the multivariate change-of-variables result on page 185 of Casella and Berger.

Result 5.2: Suppose the MGF of $X_i$ is $M_{X_i}(t_i)$ for all $i = 1, \ldots, p$. Let $X = [X_1', X_2', \ldots, X_p']'$ and $t = [t_1', t_2', \ldots, t_p']'$. Suppose $X$ has MGF $M_X(t)$. Then $X_1, \ldots, X_p$ are mutually independent iff
\[
M_X(t) = \prod_{i=1}^p M_{X_i}(t_i)
\]
for all $t$ in an open rectangle that includes $0$.

Result 5.4: Suppose $X \sim N(\mu, \Sigma)$ and we partition
\[
X = \begin{bmatrix} X_1 \\ \vdots \\ X_p \end{bmatrix}, \quad
\mu = \begin{bmatrix} \mu_1 \\ \vdots \\ \mu_p \end{bmatrix}, \quad
\Sigma = \begin{bmatrix} \Sigma_{11} & \cdots & \Sigma_{1p} \\ \vdots & \ddots & \vdots \\ \Sigma_{p1} & \cdots & \Sigma_{pp} \end{bmatrix}.
\]
Then $X_1, \ldots, X_p$ are mutually independent iff $\Sigma_{ij} = 0$ for all $i \ne j$.

Proof of Result 5.4:

($\Longrightarrow$) If $X_1, \ldots, X_p$ are mutually independent, then for all $i \ne j$,
\begin{align*}
\mathrm{Cov}(X_i, X_j) &= E[(X_i - \mu_i)(X_j - \mu_j)']
= [E(X_i - \mu_i)][E(X_j - \mu_j)'] \\
&= [\mu_i - \mu_i][(\mu_j - \mu_j)'] = 0.
\end{align*}
Thus, $\Sigma_{ij} = 0$ for all $i \ne j$.

($\Longleftarrow$) Partition $t$ as $[t_1', \ldots, t_p']'$. Then
\[
t'\Sigma t = \sum_{i=1}^p \sum_{j=1}^p t_i' \Sigma_{ij} t_j.
\]
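The density in Corollary 5.2(c) and the independence criterion of Result 5.4 can be checked together in a small numerical sketch (a hypothetical example added here, not from the slides; the particular $\mu$, $\Sigma$, and evaluation point are arbitrary). When $\Sigma$ is diagonal, so that $\Sigma_{ij} = 0$ for all $i \ne j$, the joint density should factor into a product of univariate normal densities.

```python
# Evaluate the MVN density of Corollary 5.2(c) directly and verify that a
# diagonal Sigma makes it factor into univariate N(mu_i, sigma_i^2) densities.
# (Hypothetical illustration; mu, Sigma, and t are arbitrary.)
import numpy as np

def mvn_pdf(t, mu, Sigma):
    """f_X(t) = (2 pi)^{-p/2} |Sigma|^{-1/2} exp(-(t-mu)' Sigma^{-1} (t-mu)/2)."""
    p = len(mu)
    d = t - mu
    quad = d @ np.linalg.solve(Sigma, d)      # (t - mu)' Sigma^{-1} (t - mu)
    return (2 * np.pi) ** (-p / 2) * np.linalg.det(Sigma) ** (-0.5) * np.exp(-quad / 2)

def norm_pdf(y, mu, var):
    """Univariate N(mu, var) density."""
    return np.exp(-(y - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

mu = np.array([1.0, -2.0, 0.5])
Sigma = np.diag([2.0, 1.0, 4.0])              # Sigma_ij = 0 for all i != j
t = np.array([0.0, -1.5, 1.0])

joint = mvn_pdf(t, mu, Sigma)
product = np.prod([norm_pdf(t[i], mu[i], Sigma[i, i]) for i in range(3)])
print(joint, product)                         # the two values agree
```

The agreement of the two values mirrors the MGF factorization used in the ($\Longleftarrow$) direction of the proof of Result 5.4.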
If $\Sigma_{ij} = 0$ for all $i \ne j$, then
\[
t'\Sigma t = \sum_{i=1}^p t_i' \Sigma_{ii} t_i.
\]
Thus,
\[
M_X(t) = e^{t'\mu + t'\Sigma t/2}
= e^{\sum_{i=1}^p (t_i'\mu_i + t_i'\Sigma_{ii} t_i/2)}
= \prod_{i=1}^p e^{t_i'\mu_i + t_i'\Sigma_{ii} t_i/2}
= \prod_{i=1}^p M_{X_i}(t_i).
\]
By Result 5.2, $X_1, \ldots, X_p$ are mutually independent.

Corollary 5.3: Suppose $X \sim N(\mu, \Sigma)$, $Y_1 = a_1 + B_1 X$, and $Y_2 = a_2 + B_2 X$. Then $Y_1$ and $Y_2$ are independent iff $B_1 \Sigma B_2' = 0$.

Proof of Corollary 5.3: Let
\[
Y = \begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix}
= \begin{bmatrix} a_1 \\ a_2 \end{bmatrix}
+ \begin{bmatrix} B_1 \\ B_2 \end{bmatrix} X.
\]
Then $Y$ is MVN with
\[
\mathrm{Var}(Y)
= \begin{bmatrix} B_1 \\ B_2 \end{bmatrix} \Sigma \begin{bmatrix} B_1' & B_2' \end{bmatrix}
= \begin{bmatrix} B_1 \Sigma B_1' & B_1 \Sigma B_2' \\ B_2 \Sigma B_1' & B_2 \Sigma B_2' \end{bmatrix}.
\]
By Result 5.4, $Y_1$ and $Y_2$ are independent $\iff B_1 \Sigma B_2' = 0$.
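Corollary 5.3 can be illustrated with a classic special case (a hypothetical example added here, not from the slides): with $\Sigma = I$, $B_1 = [1\;\; 1]$, and $B_2 = [1\; -1]$, we have $B_1 \Sigma B_2' = 0$, so $Y_1 = X_1 + X_2$ and $Y_2 = X_1 - X_2$ are independent, and their sample covariance should be near zero.

```python
# Check of Corollary 5.3: B1 Sigma B2' = 0 implies Y1 and Y2 independent.
# (Hypothetical illustration; Sigma, B1, B2 are arbitrary choices.)
import numpy as np

rng = np.random.default_rng(3)
Sigma = np.eye(2)
B1 = np.array([[1.0, 1.0]])
B2 = np.array([[1.0, -1.0]])

print(B1 @ Sigma @ B2.T)                 # [[0.]]: the independence condition

X = rng.standard_normal((2, 500_000))    # X ~ N(0, I), so Var(X) = Sigma
Y1 = (B1 @ X).ravel()                    # Y1 = X1 + X2
Y2 = (B2 @ X).ravel()                    # Y2 = X1 - X2
print(np.cov(Y1, Y2)[0, 1])              # sample covariance near 0
```

Since $Y_1$ and $Y_2$ are jointly normal, zero covariance here is equivalent to full independence, which is exactly the content of Result 5.4 applied to the stacked vector $Y$.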