Moment-generating Function of the Multivariate Normal Distribution

• If X ∼ Np(µ, Σ), then the moment-generating function is given by

$$m_X(t) \equiv \mathrm{E}\{\exp(t'X)\} = \exp\left(t'\mu + \tfrac{1}{2}\, t'\Sigma t\right).$$

More Features of the Multivariate Normal Distribution

• If X ∼ Np(µ, Σ), then a linear combination a′X is distributed as N1(a′µ, a′Σa).

• If X ∼ Np(µ, Σ), then a set of q linear combinations A(q×p) X(p×1) is distributed as Nq(Aµ, AΣA′).

• The above may be proved using moment-generating functions.

• Example: Let X ∼ N3(µ, Σ) and let

$$AX = \begin{bmatrix} X_1 - X_2 \\ X_2 - X_3 \end{bmatrix} = \begin{bmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \end{bmatrix} \begin{bmatrix} X_1 \\ X_2 \\ X_3 \end{bmatrix}.$$

Example (cont'd)

• The mean of AX is:

$$\mathrm{E}(AX) = A\mu = \begin{bmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \end{bmatrix} \begin{bmatrix} \mu_1 \\ \mu_2 \\ \mu_3 \end{bmatrix} = \begin{bmatrix} \mu_1 - \mu_2 \\ \mu_2 - \mu_3 \end{bmatrix}.$$

• The variance of AX is

$$A\Sigma A' = \begin{bmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \end{bmatrix} \begin{bmatrix} \sigma_{11} & \sigma_{12} & \sigma_{13} \\ \sigma_{21} & \sigma_{22} & \sigma_{23} \\ \sigma_{31} & \sigma_{32} & \sigma_{33} \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -1 & 1 \\ 0 & -1 \end{bmatrix} = \begin{bmatrix} \sigma_{11} - 2\sigma_{12} + \sigma_{22} & \sigma_{12} + \sigma_{23} - \sigma_{22} - \sigma_{13} \\ \sigma_{12} + \sigma_{23} - \sigma_{22} - \sigma_{13} & \sigma_{22} - 2\sigma_{23} + \sigma_{33} \end{bmatrix}.$$

More Features of the Multivariate Normal Distribution

• If X ∼ Np(µ, Σ), then any subset of X also has a normal distribution. If we partition X(p×1) into two subvectors X1 of dimension q × 1 and X2 of dimension (p − q) × 1, then

X1 ∼ Nq(µ1, Σ11) and X2 ∼ N(p−q)(µ2, Σ22),

where µ1 is q × 1, Σ11 is q × q, µ2 is (p − q) × 1, and Σ22 is (p − q) × (p − q).

• The off-diagonal block Σ12 = Σ21′ contains the covariances between the elements of X1 and X2.

• If X1 and X2 are independent, then Σ12 = Σ21′ = 0.

More Features of the Multivariate Normal Distribution

• If X1 ∼ Nq(µ1, Σ11) and X2 ∼ N(p−q)(µ2, Σ22), and X1 and X2 are independent, then

$$\begin{bmatrix} X_1 \\ X_2 \end{bmatrix} \sim N_p\left( \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix},\; \begin{bmatrix} \Sigma_{11} & 0 \\ 0 & \Sigma_{22} \end{bmatrix} \right).$$

• Caution: marginal normality alone is not enough. When X1 and X2 are each normal but not independent, their joint distribution is not necessarily multivariate normal.

Construction of Other Multivariate Distributions

• Start with any pair of marginal cdf's F1(x1) and F2(x2).

• The cdf of a bivariate distribution with those marginal distributions is

$$F(x_1, x_2) = [F_1(x_1)]^{\alpha_{11}/(\alpha_{11}+\alpha_{12})}\, [F_2(x_2)]^{\alpha_{22}/(\alpha_{22}+\alpha_{12})} \times \left[ F_1(x_1)^{-1/(\alpha_{11}+\alpha_{12})} + F_2(x_2)^{-1/(\alpha_{22}+\alpha_{12})} - 1 \right]^{-\alpha_{12}}.$$

• Use F1(x1) = Φ((x1 − µ1)/σ1) and F2(x2) = Φ((x2 − µ2)/σ2) to obtain a bivariate distribution with normal marginal distributions.

Construction of Other Multivariate Distributions

• This can be extended to higher-dimensional distributions.

• References:
  – Johnson and Cook (1986), Technometrics.
  – Koehler and Symanowski (1995), Journal of Multivariate Analysis.

Conditional Distributions

• Let X ∼ Np(µ, Σ) and consider the partition into X1, X2 with |Σ22| > 0. The conditional distribution of X1 given X2 = x2 is Nq(µ_{1|2}, Σ_{1|2}), where

$$\mu_{1|2} = \mu_1 + \Sigma_{12}\Sigma_{22}^{-1}(x_2 - \mu_2), \qquad \Sigma_{1|2} = \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}.$$

• Note two things:
  1. The conditional mean is a linear function of the value x2.
  2. The conditional variance does not depend on x2.

Example: Bivariate Conditional Density

• The conditional density of X1 given that X2 = x2 is

$$f(x_1 \mid x_2) = \frac{f(x_1, x_2)}{f(x_2)},$$

where f(x2) is the marginal density of X2. From before, we know that

$$\mathrm{E}(X_1 \mid X_2 = x_2) = \mu_1 + \frac{\sigma_{12}}{\sigma_{22}}(x_2 - \mu_2), \qquad \mathrm{var}(X_1 \mid X_2 = x_2) = \sigma_{11} - \frac{\sigma_{12}^2}{\sigma_{22}}.$$

• To show that this is the case, just plug the expression for f(x1, x2) into the numerator above and f(x2) into the denominator and do the algebra. Or see page 162 in the textbook.
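The partitioned formulas for µ_{1|2} and Σ_{1|2} translate directly into code. The following is a minimal sketch, assuming NumPy is available; the helper name conditional_mvn and the illustrative numbers are my own, not from the notes:

```python
import numpy as np

def conditional_mvn(mu, Sigma, idx1, idx2, x2):
    """Parameters of the conditional distribution of X[idx1]
    given X[idx2] = x2, for X ~ N_p(mu, Sigma)."""
    mu = np.asarray(mu, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    mu1, mu2 = mu[idx1], mu[idx2]
    S11 = Sigma[np.ix_(idx1, idx1)]
    S12 = Sigma[np.ix_(idx1, idx2)]
    S22 = Sigma[np.ix_(idx2, idx2)]
    # mu_{1|2} = mu1 + Sigma12 Sigma22^{-1} (x2 - mu2); solve rather than invert.
    mu_cond = mu1 + S12 @ np.linalg.solve(S22, x2 - mu2)
    # Sigma_{1|2} = Sigma11 - Sigma12 Sigma22^{-1} Sigma21
    Sigma_cond = S11 - S12 @ np.linalg.solve(S22, S12.T)
    return mu_cond, Sigma_cond

# Bivariate check against the scalar formulas above (illustrative numbers):
# E(X1 | X2 = x2) = mu1 + (sigma12/sigma22)(x2 - mu2)
# var(X1 | X2 = x2) = sigma11 - sigma12^2/sigma22
m, v = conditional_mvn(mu=[1.0, 2.0],
                       Sigma=[[4.0, 1.0], [1.0, 3.0]],
                       idx1=[0], idx2=[1], x2=[2.5])
print(m, v)   # [1.1667]  [[3.6667]], i.e., 1 + (1/3)(0.5) and 4 - 1/3
```

Solving the linear system with Σ22 instead of forming Σ22⁻¹ explicitly is the standard numerically stable choice; the algebra is identical to the displayed formulas.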
Conditional Distributions (cont'd)

• For the MVN distribution, all conditional distributions are also normal, and the conditional mean of each component of X1 = (X1, ..., Xq)′ given (X_{q+1}, ..., X_p)′ = (x_{q+1}, ..., x_p)′ is of the form

µj + βj,q+1(x_{q+1} − µ_{q+1}) + βj,q+2(x_{q+2} − µ_{q+2}) + ... + βj,p(xp − µp), for j = 1, ..., q,

where

$$\begin{bmatrix} \beta_{1,q+1} & \beta_{1,q+2} & \cdots & \beta_{1,p} \\ \beta_{2,q+1} & \beta_{2,q+2} & \cdots & \beta_{2,p} \\ \vdots & \vdots & \ddots & \vdots \\ \beta_{q,q+1} & \beta_{q,q+2} & \cdots & \beta_{q,p} \end{bmatrix} = \Sigma_{12}\Sigma_{22}^{-1}.$$

• The conditional covariance matrix Σ11 − Σ12Σ22⁻¹Σ21 does not depend on the value of the conditioning variables.

Example

$$\begin{bmatrix} X_1 \\ X_2 \\ X_3 \\ X_4 \end{bmatrix} \sim N_4\left( \begin{bmatrix} 1 \\ 2 \\ 0 \\ 3 \end{bmatrix},\; \begin{bmatrix} 4 & 0 & 1 & 3 \\ 0 & 4 & 1 & 1 \\ 1 & 1 & 3 & 1 \\ 3 & 1 & 1 & 9 \end{bmatrix} \right)$$

Find the conditional distribution of (X1, X3)′ given X2 = x2 and X4 = x4. First reorder the variables:

$$\begin{bmatrix} X_1 \\ X_3 \\ X_2 \\ X_4 \end{bmatrix} \sim N_4\left( \begin{bmatrix} 1 \\ 0 \\ 2 \\ 3 \end{bmatrix},\; \begin{bmatrix} 4 & 1 & 0 & 3 \\ 1 & 3 & 1 & 1 \\ 0 & 1 & 4 & 1 \\ 3 & 1 & 1 & 9 \end{bmatrix} \right)$$

Example (cont'd)

The conditional distribution is bivariate normal with mean vector

$$\begin{bmatrix} 1 \\ 0 \end{bmatrix} + \begin{bmatrix} 0 & 3 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 4 & 1 \\ 1 & 9 \end{bmatrix}^{-1} \begin{bmatrix} x_2 - 2 \\ x_4 - 3 \end{bmatrix} = \frac{1}{35}\begin{bmatrix} 5 - 3x_2 + 12x_4 \\ -25 + 8x_2 + 3x_4 \end{bmatrix}$$

and covariance matrix

$$\begin{bmatrix} 4 & 1 \\ 1 & 3 \end{bmatrix} - \begin{bmatrix} 0 & 3 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 4 & 1 \\ 1 & 9 \end{bmatrix}^{-1} \begin{bmatrix} 0 & 1 \\ 3 & 1 \end{bmatrix} = \frac{1}{35}\begin{bmatrix} 104 & 26 \\ 26 & 94 \end{bmatrix}.$$

Linear Combinations of Independent Normal Random Vectors

For j = 1, 2, ..., n, suppose Xj ∼ Np(µj, Σj) and the vectors are independent. Suppose c1, c2, ..., cn are scalar constants and b is a p × 1 vector of constants. Then

$$Y = b + \sum_{j=1}^{n} c_j X_j \sim N_p\left( b + \sum_{j=1}^{n} c_j \mu_j,\; \sum_{j=1}^{n} c_j^2 \Sigma_j \right).$$

Proof: see Result 4.8 in the text.

Linear Combinations of Independent Normal Random Vectors

For j = 1, 2, ..., n, suppose Xj ∼ Np(µj, Σj) and the vectors are independent. Suppose C1, C2, ..., Cn are r × p matrices of constants and b is an r × 1 vector. Then

$$Y = b + \sum_{j=1}^{n} C_j X_j \sim N_r\left( b + \sum_{j=1}^{n} C_j \mu_j,\; \sum_{j=1}^{n} C_j \Sigma_j C_j' \right).$$

Probability Content of Ellipsoids

• Let X ∼ Np(µ, Σ), with |Σ| > 0. Then:
  1. (X − µ)′Σ⁻¹(X − µ) is distributed as χ²_p.
  2. Probability 1 − α is assigned to the ellipsoid {x : (x − µ)′Σ⁻¹(x − µ) ≤ χ²_p(α)}, where χ²_p(α) is the (1 − α) × 100th percentile of the central chi-square distribution with p degrees of freedom.

Probability Content - Proof

• We know that the central chi-square distribution with p degrees of freedom is the distribution of the sum of the squares of p independent standard normal random variables, i.e., Z1² + Z2² + ... + Zp² with Zj ∼ N(0, 1).

• We need to show that the quadratic form (X − µ)′Σ⁻¹(X − µ) is the sum of p independent squared standard normal random variables. We use the spectral decomposition of the inverse covariance matrix.

• The spectral decomposition of Σ⁻¹ is

$$\Sigma^{-1} = \sum_{j=1}^{p} \lambda_j^{-1} e_j e_j'.$$

Probability Content - Proof (cont'd)

• Then we write (X − µ)′Σ⁻¹(X − µ) as

$$(X - \mu)'\left( \sum_{j=1}^{p} \frac{1}{\lambda_j} e_j e_j' \right)(X - \mu) = \sum_{j=1}^{p} \frac{1}{\lambda_j} (X - \mu)' e_j e_j' (X - \mu) = \sum_{j=1}^{p} \frac{1}{\lambda_j} \left( e_j'(X - \mu) \right)^2 = \sum_{j=1}^{p} \left[ \lambda_j^{-1/2} e_j'(X - \mu) \right]^2 = \sum_{j=1}^{p} Z_j^2.$$

Probability Content - Proof (cont'd)

• We can write Z = A(X − µ), with Z = [Z1, Z2, ..., Zp]′ and A a p × p matrix whose jth row is λj^{−1/2} ej′.

• Since X − µ ∼ Np(0, Σ), it follows that Z ∼ Np(0, AΣA′).

• But using the spectral decomposition of Σ, we find that AΣA′ = I, which means that the elements of Z are independent standard normal random variables.

• Then Σ_{j=1}^p Zj² is distributed as χ²_p.
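As a numerical sanity check on the ellipsoid result, one can simulate draws from Np(µ, Σ) and compare the empirical coverage of the χ²_p(α) ellipsoid with 1 − α. This is a minimal sketch, assuming NumPy and SciPy are available; the µ and Σ are taken from the four-variable example above, while the sample size and seed are arbitrary:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Mean vector and covariance matrix from the four-variable example above.
mu = np.array([1.0, 2.0, 0.0, 3.0])
Sigma = np.array([[4.0, 0.0, 1.0, 3.0],
                  [0.0, 4.0, 1.0, 1.0],
                  [1.0, 1.0, 3.0, 1.0],
                  [3.0, 1.0, 1.0, 9.0]])
p = len(mu)
alpha = 0.05

# Draw from N_p(mu, Sigma) and compute the squared statistical distance
# (x - mu)' Sigma^{-1} (x - mu) for each draw.
X = rng.multivariate_normal(mu, Sigma, size=100_000)
D = X - mu
d2 = np.einsum('ij,ij->i', D @ np.linalg.inv(Sigma), D)

# chi^2_p(alpha): the (1 - alpha) x 100th percentile of chi-square with p df.
cutoff = stats.chi2.ppf(1.0 - alpha, df=p)

print(np.mean(d2 <= cutoff))   # empirical coverage, close to 1 - alpha = 0.95
```

The distances d2 are draws from χ²_p, so the printed fraction should match 1 − α up to Monte Carlo error, which is exactly the probability content claimed for the ellipsoid.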