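Random Vectors and Matrices
• The elements of a random matrix or vector $\mathbf{X}$ are jointly distributed random variables.
• The mean $E(\mathbf{X})$ of a random matrix or vector $\mathbf{X}$ is the matrix or vector of elementwise means: $[E(\mathbf{X})]_{i,j} = E(X_{i,j})$.
• The covariance matrix $\boldsymbol{\Sigma}$ of a random vector $\mathbf{X}$ is the matrix of pairwise covariances: $[\operatorname{Cov}(\mathbf{X})]_{i,k} = \sigma_{i,k} = \operatorname{Cov}(X_i, X_k)$.

Properties of Expected Value
• Addition: $E(\mathbf{X} + \mathbf{Y}) = E(\mathbf{X}) + E(\mathbf{Y})$, which generalizes the scalar property $E(X + Y) = E(X) + E(Y)$.
• Multiplication by constants: $E(\mathbf{A}\mathbf{X}\mathbf{B}) = \mathbf{A}\,E(\mathbf{X})\,\mathbf{B}$ for constant matrices $\mathbf{A}$ and $\mathbf{B}$, which generalizes the scalar property $E(cX) = cE(X)$.

Covariance Matrix of a Random Vector
• Defined by $[\operatorname{Cov}(\mathbf{X})]_{i,k} = \sigma_{i,k} = \operatorname{Cov}(X_i, X_k)$.
• Also satisfies $\operatorname{Cov}(\mathbf{X}) = E\{[\mathbf{X} - E(\mathbf{X})][\mathbf{X} - E(\mathbf{X})]'\}$, which generalizes the scalar definition $\operatorname{var}(X) = E\{[X - E(X)]^2\}$.

Correlation Matrix of a Random Vector
• Pairwise correlations:
\[
\rho_{i,k} = \operatorname{Corr}(X_i, X_k) = \frac{\sigma_{i,k}}{\sqrt{\sigma_{i,i}}\,\sqrt{\sigma_{k,k}}}.
\]
• The correlation matrix $\boldsymbol{\rho}$ is defined by $(\boldsymbol{\rho})_{i,k} = \rho_{i,k}$.
• The standard deviation matrix is the diagonal matrix
\[
\mathbf{V}^{1/2} =
\begin{pmatrix}
\sqrt{\sigma_{1,1}} & 0 & \cdots & 0 \\
0 & \sqrt{\sigma_{2,2}} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \sqrt{\sigma_{p,p}}
\end{pmatrix}.
\]
• Then $\mathbf{V}^{1/2}\boldsymbol{\rho}\,\mathbf{V}^{1/2} = \boldsymbol{\Sigma}$, and $\boldsymbol{\rho} = \mathbf{V}^{-1/2}\boldsymbol{\Sigma}\,\mathbf{V}^{-1/2}$.

Note that multiplying a matrix $\mathbf{A}$ by a diagonal matrix $\mathbf{D}$:
• on the left multiplies each row of $\mathbf{A}$ by the corresponding diagonal entry of $\mathbf{D}$;
• on the right multiplies each column of $\mathbf{A}$ by the corresponding diagonal entry of $\mathbf{D}$.

Linear Combinations
• One way of analyzing multivariate data is to reduce it to univariate form by taking a linear combination of the variables.
• If the multivariate data is modeled as a random vector $\mathbf{X}$ and $\mathbf{c}$ is a vector of weights, the corresponding linear combination is $\mathbf{c}'\mathbf{X}$.
• Moments:
\[
\text{mean} = E(\mathbf{c}'\mathbf{X}) = \mathbf{c}'E(\mathbf{X}) = \mathbf{c}'\boldsymbol{\mu},
\qquad
\text{variance} = \operatorname{Var}(\mathbf{c}'\mathbf{X}) = \mathbf{c}'\operatorname{Cov}(\mathbf{X})\,\mathbf{c} = \mathbf{c}'\boldsymbol{\Sigma}\mathbf{c}.
\]
• More generally, we might work with $q$ linear combinations, typically with $q < p$.
• Write $\mathbf{C}$ for the $q \times p$ matrix whose rows contain the coefficients of the linear combinations, and $\mathbf{Z} = \mathbf{C}\mathbf{X}$ for the $q$-vector of their values.
• Then
\[
\boldsymbol{\mu}_Z = E(\mathbf{Z}) = E(\mathbf{C}\mathbf{X}) = \mathbf{C}\,E(\mathbf{X}) = \mathbf{C}\boldsymbol{\mu}_X,
\qquad
\boldsymbol{\Sigma}_Z = \operatorname{Cov}(\mathbf{Z}) = \operatorname{Cov}(\mathbf{C}\mathbf{X}) = \mathbf{C}\operatorname{Cov}(\mathbf{X})\,\mathbf{C}' = \mathbf{C}\boldsymbol{\Sigma}_X\mathbf{C}'.
\]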
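The correlation-matrix identity and the moment formulas for linear combinations can be checked numerically. The following is a minimal sketch using NumPy; the covariance matrix `Sigma`, mean vector `mu`, weight vector `c`, and coefficient matrix `C` are made-up illustrative values, not taken from the slides.

```python
import numpy as np

# Hypothetical mean vector and covariance matrix for p = 3 variables
# (illustrative values only; Sigma is symmetric positive definite).
mu = np.array([1.0, 2.0, 3.0])
Sigma = np.array([[4.0, 1.0, 2.0],
                  [1.0, 9.0, -3.0],
                  [2.0, -3.0, 25.0]])

# Standard deviation matrix V^{1/2} = diag(sqrt(sigma_11), ..., sqrt(sigma_pp))
# and its inverse V^{-1/2}.
std = np.sqrt(np.diag(Sigma))
V_half = np.diag(std)
V_half_inv = np.diag(1.0 / std)

# Correlation matrix: rho = V^{-1/2} Sigma V^{-1/2}.
# Left-multiplying scales rows, right-multiplying scales columns.
rho = V_half_inv @ Sigma @ V_half_inv

# One linear combination c'X: mean c'mu and variance c'Sigma c.
c = np.array([1.0, -1.0, 0.5])
mean_cX = c @ mu
var_cX = c @ Sigma @ c

# q = 2 linear combinations Z = CX: mean C mu and covariance C Sigma C'.
C = np.array([[1.0, 1.0, 1.0],
              [1.0, -1.0, 0.0]])
mu_Z = C @ mu
Sigma_Z = C @ Sigma @ C.T

print(rho)
print(mean_cX, var_cX)
print(mu_Z)
print(Sigma_Z)
```

Because `V_half` is diagonal, the same correlation matrix could also be obtained by dividing `Sigma` elementwise by `np.outer(std, std)`; the matrix-product form is used here only to mirror the formula in the slides.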
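
Matrix Inequalities
• Cauchy-Schwarz Inequality:
\[
(\mathbf{b}'\mathbf{d})^2 \le (\mathbf{b}'\mathbf{b})(\mathbf{d}'\mathbf{d}),
\]
with equality if and only if $\mathbf{b} = c\,\mathbf{d}$ (or $\mathbf{d} = c\,\mathbf{b}$) for some constant $c$.
• Maximization Lemma: for a positive definite matrix $\mathbf{B}$ and a given vector $\mathbf{d}$,
\[
\max_{\mathbf{x} \ne \mathbf{0}} \frac{(\mathbf{x}'\mathbf{d})^2}{\mathbf{x}'\mathbf{B}\mathbf{x}} = \mathbf{d}'\mathbf{B}^{-1}\mathbf{d},
\]
with the maximum attained when $\mathbf{x} = c\,\mathbf{B}^{-1}\mathbf{d}$ for any $c \ne 0$.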
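Both inequalities are easy to sanity-check numerically. The sketch below assumes NumPy and uses made-up vectors `b`, `d` and a made-up positive definite matrix `B`; the helper `ratio` is a hypothetical name introduced only for this illustration. The random search does not prove the lemma, it only shows that no sampled direction exceeds the bound.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative vectors (not from the slides).
b = np.array([1.0, 2.0, -1.0])
d = np.array([3.0, 0.0, 4.0])

# Cauchy-Schwarz: (b'd)^2 <= (b'b)(d'd).
lhs = (b @ d) ** 2
rhs = (b @ b) * (d @ d)
print(lhs, "<=", rhs)

# Illustrative symmetric positive definite matrix B.
B = np.array([[4.0, 1.0, 2.0],
              [1.0, 9.0, -3.0],
              [2.0, -3.0, 25.0]])


def ratio(x):
    """Objective of the maximization lemma: (x'd)^2 / (x'Bx)."""
    return (x @ d) ** 2 / (x @ B @ x)


# Maximizer direction x = B^{-1} d, computed by solving B x = d
# rather than forming the inverse explicitly.
x_star = np.linalg.solve(B, d)
bound = d @ x_star  # d' B^{-1} d

# The bound is attained at x_star and at any nonzero multiple of it.
print(ratio(x_star), ratio(-2.5 * x_star), bound)

# Randomly sampled directions should never exceed the bound.
best_random = max(ratio(x) for x in rng.standard_normal((1000, 3)))
print(best_random, "<=", bound)
```

Taking $\mathbf{B} = \mathbf{I}$ in the lemma recovers the Cauchy-Schwarz inequality, so the second check is a generalization of the first.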