Preliminaries

© Copyright 2016 Dan Nettleton (Iowa State University), Statistics 510

Notation for Scalars, Vectors, and Matrices

Lowercase letters $\Longrightarrow$ scalars: $x$, $c$, $\sigma$.
Boldface, lowercase letters $\Longrightarrow$ vectors: $\mathbf{x}$, $\mathbf{y}$, $\boldsymbol{\beta}$.
Boldface, uppercase letters $\Longrightarrow$ matrices: $\mathbf{A}$, $\mathbf{X}$, $\boldsymbol{\Sigma}$.

Transpose

In STAT 510, a vector is a matrix with one column. The transpose of any matrix $\mathbf{A}$ is written as $\mathbf{A}'$. Thus, $\mathbf{x}$ is a column vector and $\mathbf{x}'$ is a row vector.

Matrix Multiplication

Suppose
\[
\underset{m \times n}{\mathbf{A}} = [a_{ij}] =
\begin{bmatrix} \mathbf{a}_{(1)}' \\ \vdots \\ \mathbf{a}_{(m)}' \end{bmatrix}
= [\mathbf{a}_1, \ldots, \mathbf{a}_n]
\]
and
\[
\underset{n \times k}{\mathbf{B}} = [b_{ij}] =
\begin{bmatrix} \mathbf{b}_{(1)}' \\ \vdots \\ \mathbf{b}_{(n)}' \end{bmatrix}
= [\mathbf{b}_1, \ldots, \mathbf{b}_k].
\]
Then
\[
\underset{m \times n}{\mathbf{A}} \; \underset{n \times k}{\mathbf{B}} = \underset{m \times k}{\mathbf{C}}
= \Big[ c_{ij} = \sum_{l=1}^{n} a_{il} b_{lj} \Big]
= \big[ c_{ij} = \mathbf{a}_{(i)}' \mathbf{b}_j \big]
= [\mathbf{A}\mathbf{b}_1, \ldots, \mathbf{A}\mathbf{b}_k]
= \begin{bmatrix} \mathbf{a}_{(1)}' \mathbf{B} \\ \vdots \\ \mathbf{a}_{(m)}' \mathbf{B} \end{bmatrix}
= \sum_{l=1}^{n} \mathbf{a}_l \mathbf{b}_{(l)}'.
\]

Random Vectors

A random vector is a vector whose components are random variables.
\[
\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
\]

Expected Value of a Random Vector

The expected value, or mean, of a random vector $\mathbf{y}$ is the vector of expected values of the components of $\mathbf{y}$.
\[
\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
\Longrightarrow
E(\mathbf{y}) = \begin{bmatrix} E(y_1) \\ E(y_2) \\ \vdots \\ E(y_n) \end{bmatrix}
\]

Variance of a Random Vector

The variance of a random vector $\mathbf{y} = [y_1, y_2, \ldots, y_n]'$ is the matrix whose $(i, j)$th element is $\mathrm{Cov}(y_i, y_j)$ ($i, j \in \{1, \ldots, n\}$).
\[
\mathrm{Var}(\mathbf{y}) =
\begin{bmatrix}
\mathrm{Cov}(y_1, y_1) & \mathrm{Cov}(y_1, y_2) & \cdots & \mathrm{Cov}(y_1, y_n) \\
\mathrm{Cov}(y_2, y_1) & \mathrm{Cov}(y_2, y_2) & \cdots & \mathrm{Cov}(y_2, y_n) \\
\vdots & \vdots & \ddots & \vdots \\
\mathrm{Cov}(y_n, y_1) & \mathrm{Cov}(y_n, y_2) & \cdots & \mathrm{Cov}(y_n, y_n)
\end{bmatrix}
\]

Variance of a Random Vector

The covariance of a random variable with itself is the variance of that random variable.
Thus,
\[
\mathrm{Var}(\mathbf{y}) =
\begin{bmatrix}
\mathrm{Var}(y_1) & \mathrm{Cov}(y_1, y_2) & \cdots & \mathrm{Cov}(y_1, y_n) \\
\mathrm{Cov}(y_2, y_1) & \mathrm{Var}(y_2) & \cdots & \mathrm{Cov}(y_2, y_n) \\
\vdots & \vdots & \ddots & \vdots \\
\mathrm{Cov}(y_n, y_1) & \mathrm{Cov}(y_n, y_2) & \cdots & \mathrm{Var}(y_n)
\end{bmatrix}.
\]

Covariance Between Two Random Vectors

The covariance between random vectors $\mathbf{u} = [u_1, \ldots, u_m]'$ and $\mathbf{v} = [v_1, \ldots, v_n]'$ is the matrix whose $(i, j)$th element is $\mathrm{Cov}(u_i, v_j)$ ($i \in \{1, \ldots, m\}$, $j \in \{1, \ldots, n\}$).
\[
\mathrm{Cov}(\mathbf{u}, \mathbf{v}) =
\begin{bmatrix}
\mathrm{Cov}(u_1, v_1) & \mathrm{Cov}(u_1, v_2) & \cdots & \mathrm{Cov}(u_1, v_n) \\
\mathrm{Cov}(u_2, v_1) & \mathrm{Cov}(u_2, v_2) & \cdots & \mathrm{Cov}(u_2, v_n) \\
\vdots & \vdots & & \vdots \\
\mathrm{Cov}(u_m, v_1) & \mathrm{Cov}(u_m, v_2) & \cdots & \mathrm{Cov}(u_m, v_n)
\end{bmatrix}
= E(\mathbf{u}\mathbf{v}') - E(\mathbf{u})E(\mathbf{v}').
\]

Linear Transformation of a Random Vector

If $\mathbf{y}$ is an $n \times 1$ random vector, $\mathbf{A}$ is an $m \times n$ matrix of constants, and $\mathbf{b}$ is an $m \times 1$ vector of constants, then $\mathbf{A}\mathbf{y} + \mathbf{b}$ is a linear transformation of the random vector $\mathbf{y}$.

Mean and Variance of a Linear Transformation

\[
E(\mathbf{A}\mathbf{y} + \mathbf{b}) = \mathbf{A}E(\mathbf{y}) + \mathbf{b}
\]
\[
\mathrm{Var}(\mathbf{A}\mathbf{y} + \mathbf{b}) = \mathbf{A}\,\mathrm{Var}(\mathbf{y})\,\mathbf{A}'
\]

Standard Multivariate Normal Distributions

If $z_1, \ldots, z_n \overset{iid}{\sim} N(0, 1)$, then
\[
\mathbf{z} = \begin{bmatrix} z_1 \\ \vdots \\ z_n \end{bmatrix}
\]
has a standard multivariate normal distribution: $\mathbf{z} \sim N(\mathbf{0}, \mathbf{I})$.

Multivariate Normal Distributions

Suppose $\mathbf{z}$ is an $n \times 1$ standard multivariate normal random vector, i.e., $\mathbf{z} \sim N(\mathbf{0}, \mathbf{I}_{n \times n})$. Suppose $\mathbf{A}$ is an $m \times n$ matrix of constants and $\boldsymbol{\mu}$ is an $m \times 1$ vector of constants. Then $\mathbf{A}\mathbf{z} + \boldsymbol{\mu}$ has a multivariate normal distribution with mean $\boldsymbol{\mu}$ and variance $\mathbf{A}\mathbf{A}'$:
\[
\mathbf{z} \sim N(\mathbf{0}, \mathbf{I}) \Longrightarrow \mathbf{A}\mathbf{z} + \boldsymbol{\mu} \sim N(\boldsymbol{\mu}, \mathbf{A}\mathbf{A}').
\]
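The construction Az + µ can be checked empirically. The following is a minimal sketch, not part of the original slides: it draws standard normal vectors with Python's standard library, forms Az + µ, and compares the sample mean and sample variance matrix with µ and AA'. The particular A and µ, the number of draws, and all function names are illustrative assumptions.

```python
import random

def matvec(A, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def a_a_transpose(A):
    """Return A A' for a matrix A given as a list of rows."""
    return [[sum(p * q for p, q in zip(ri, rj)) for rj in A] for ri in A]

def simulate(A, mu, draws, seed=0):
    """Draw y = A z + mu with z ~ N(0, I); n is the number of columns of A."""
    rng = random.Random(seed)
    n = len(A[0])
    samples = []
    for _ in range(draws):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        samples.append([v + m for v, m in zip(matvec(A, z), mu)])
    return samples

def sample_mean(samples):
    """Componentwise average of the simulated vectors."""
    d = len(samples)
    return [sum(col) / d for col in zip(*samples)]

def sample_cov(samples):
    """Sample variance matrix with the usual (draws - 1) divisor."""
    d = len(samples)
    mean = sample_mean(samples)
    centered = [[v - m for v, m in zip(s, mean)] for s in samples]
    k = len(mean)
    return [[sum(c[i] * c[j] for c in centered) / (d - 1) for j in range(k)]
            for i in range(k)]

# Illustrative (assumed) choices: m = 3, n = 2.
A = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
mu = [1.0, -2.0, 0.5]

samples = simulate(A, mu, draws=20000)
print("sample mean:", sample_mean(samples))
print("sample variance:", sample_cov(samples))
print("A A':", a_a_transpose(A))
```

Because A here is 3 × 2, the variance AA' has rank 2 and is singular, so this example is non-negative definite but not positive definite.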
Multivariate Normal Distributions

If $\boldsymbol{\mu}$ is an $m \times 1$ vector of constants and $\boldsymbol{\Sigma}$ is an $m \times m$ symmetric, non-negative definite (NND) matrix of rank $n$, then $N(\boldsymbol{\mu}, \boldsymbol{\Sigma})$ signifies the multivariate normal distribution with mean $\boldsymbol{\mu}$ and variance $\boldsymbol{\Sigma}$.

If $\mathbf{y} \sim N(\boldsymbol{\mu}, \boldsymbol{\Sigma})$, then $\mathbf{y} \overset{d}{=} \mathbf{A}\mathbf{z} + \boldsymbol{\mu}$, where $\mathbf{z} \sim N(\mathbf{0}, \mathbf{I}_{n \times n})$ and $\mathbf{A}$ is an $m \times n$ matrix of rank $n$ such that $\mathbf{A}\mathbf{A}' = \boldsymbol{\Sigma}$.

Non-Central Chi-Square Distributions

If $\mathbf{y} \sim N(\boldsymbol{\mu}, \mathbf{I}_{n \times n})$, then
\[
w \equiv \mathbf{y}'\mathbf{y} = \sum_{i=1}^{n} y_i^2
\]
has a non-central chi-square distribution with $n$ degrees of freedom and non-centrality parameter $\boldsymbol{\mu}'\boldsymbol{\mu}/2$:
\[
w \sim \chi^2_n(\boldsymbol{\mu}'\boldsymbol{\mu}/2).
\]
(Some define the non-centrality parameter as $\boldsymbol{\mu}'\boldsymbol{\mu}$ rather than $\boldsymbol{\mu}'\boldsymbol{\mu}/2$.)

Central Chi-Square Distributions

If $\mathbf{z} \sim N(\mathbf{0}, \mathbf{I}_{n \times n})$, then
\[
w \equiv \mathbf{z}'\mathbf{z} = \sum_{i=1}^{n} z_i^2
\]
has a central chi-square distribution with $n$ degrees of freedom: $w \sim \chi^2_n$.

A central chi-square distribution is a non-central chi-square distribution with non-centrality parameter 0: $w \sim \chi^2_n(0)$.

Positive and Non-Negative Definite Quadratic Forms

$\mathbf{x}'\mathbf{A}\mathbf{x}$ is known as a quadratic form.

We say that an $n \times n$ matrix $\mathbf{A}$ is positive definite (PD) iff $\mathbf{A}$ is symmetric (i.e., $\mathbf{A} = \mathbf{A}'$), and $\mathbf{x}'\mathbf{A}\mathbf{x} > 0$ for all $\mathbf{x} \in \mathbb{R}^n \setminus \{\mathbf{0}\}$.

We say that an $n \times n$ matrix $\mathbf{A}$ is non-negative definite (NND) iff $\mathbf{A}$ is symmetric (i.e., $\mathbf{A} = \mathbf{A}'$), and $\mathbf{x}'\mathbf{A}\mathbf{x} \geq 0$ for all $\mathbf{x} \in \mathbb{R}^n$.

Important Distributional Result about Quadratic Forms

Suppose $\boldsymbol{\Sigma}$ is an $n \times n$ positive definite matrix. Suppose $\mathbf{A}$ is an $n \times n$ symmetric matrix of rank $m$ such that $\mathbf{A}\boldsymbol{\Sigma}$ is idempotent (i.e., $\mathbf{A}\boldsymbol{\Sigma}\mathbf{A}\boldsymbol{\Sigma} = \mathbf{A}\boldsymbol{\Sigma}$). Then
\[
\mathbf{y} \sim N(\boldsymbol{\mu}, \boldsymbol{\Sigma}) \Longrightarrow \mathbf{y}'\mathbf{A}\mathbf{y} \sim \chi^2_m(\boldsymbol{\mu}'\mathbf{A}\boldsymbol{\mu}/2).
\]
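As an illustration of the quadratic-form result, consider a familiar special case that is an assumption of this sketch rather than something stated on the slides: Σ = σ²I and A = (I − J/n)/σ², where J is the n × n matrix of ones, so that y'Ay is the sum of squared deviations from the sample mean divided by σ². The code below checks that AΣ is idempotent, obtains the rank m as the trace of AΣ (an idempotent matrix has rank equal to its trace, and rank(A) = rank(AΣ) because Σ is invertible), and computes the non-centrality parameter µ'Aµ/2. All function names are hypothetical.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def is_idempotent(M, tol=1e-12):
    """Check M M = M entrywise, up to a small numerical tolerance."""
    MM = matmul(M, M)
    size = len(M)
    return all(abs(MM[i][j] - M[i][j]) <= tol
               for i in range(size) for j in range(size))

def centering_example(n, sigma2, mu):
    """Return (A Sigma, m, theta) for Sigma = sigma2 * I and A = (I - J/n) / sigma2."""
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    Sigma = [[sigma2 * v for v in row] for row in identity]
    A = [[(identity[i][j] - 1.0 / n) / sigma2 for j in range(n)]
         for i in range(n)]
    A_Sigma = matmul(A, Sigma)
    # Rank of A via the trace of the idempotent matrix A Sigma.
    m = round(sum(A_Sigma[i][i] for i in range(n)))
    # Non-centrality parameter mu' A mu / 2.
    A_mu = [sum(a * v for a, v in zip(row, mu)) for row in A]
    theta = sum(u * v for u, v in zip(mu, A_mu)) / 2.0
    return A_Sigma, m, theta

A_Sigma, m, theta = centering_example(n=4, sigma2=2.0, mu=[1.0, 2.0, 3.0, 4.0])
print("A Sigma idempotent:", is_idempotent(A_Sigma))
print("degrees of freedom m:", m)
print("non-centrality parameter:", theta)
```

With these inputs the degrees of freedom are n − 1 = 3. If every component of µ were equal, µ'Aµ would be 0 and the distribution would reduce to the central chi-square case.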
Mean and Variance of Chi-Square Distributions

If $w \sim \chi^2_m(\theta)$, then
\[
E(w) = m + 2\theta \quad \text{and} \quad \mathrm{Var}(w) = 2m + 8\theta.
\]

Non-Central t Distributions

Suppose $y \sim N(\delta, 1)$. Suppose $w \sim \chi^2_m$. Suppose $y$ and $w$ are independent.

Then $y / \sqrt{w/m}$ has a non-central $t$ distribution with $m$ degrees of freedom and non-centrality parameter $\delta$:
\[
\frac{y}{\sqrt{w/m}} \sim t_m(\delta).
\]

Central t Distributions

Suppose $z \sim N(0, 1)$. Suppose $w \sim \chi^2_m$. Suppose $z$ and $w$ are independent.

Then $z / \sqrt{w/m}$ has a central $t$ distribution with $m$ degrees of freedom:
\[
\frac{z}{\sqrt{w/m}} \sim t_m.
\]
The distribution $t_m$ is the same as $t_m(0)$.

Non-Central F Distributions

Suppose $w_1 \sim \chi^2_{m_1}(\theta)$. Suppose $w_2 \sim \chi^2_{m_2}$. Suppose $w_1$ and $w_2$ are independent.

Then $(w_1/m_1)/(w_2/m_2)$ has a non-central $F$ distribution with $m_1$ numerator degrees of freedom, $m_2$ denominator degrees of freedom, and non-centrality parameter $\theta$:
\[
\frac{w_1/m_1}{w_2/m_2} \sim F_{m_1, m_2}(\theta).
\]

Central F Distributions

Suppose $w_1 \sim \chi^2_{m_1}$. Suppose $w_2 \sim \chi^2_{m_2}$. Suppose $w_1$ and $w_2$ are independent.

Then $(w_1/m_1)/(w_2/m_2)$ has a central $F$ distribution with $m_1$ numerator degrees of freedom and $m_2$ denominator degrees of freedom:
\[
\frac{w_1/m_1}{w_2/m_2} \sim F_{m_1, m_2}
\]
(which is the same as the $F_{m_1, m_2}(0)$ distribution).

Relationship between t and F Distributions

If $u \sim t_m(\delta)$, then $u^2 \sim F_{1, m}(\delta^2/2)$.

Some Independence ($\perp$) Results

Suppose $\mathbf{y} \sim N(\boldsymbol{\mu}, \boldsymbol{\Sigma})$, where $\boldsymbol{\Sigma}$ is an $n \times n$ PD matrix.

If $\mathbf{A}_1$ is an $n_1 \times n$ matrix of constants and $\mathbf{A}_2$ is an $n_2 \times n$ matrix of constants, then
\[
\mathbf{A}_1 \boldsymbol{\Sigma} \mathbf{A}_2' = \mathbf{0} \Longrightarrow \mathbf{A}_1 \mathbf{y} \perp \mathbf{A}_2 \mathbf{y}.
\]
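If $\mathbf{A}_1$ is an $n_1 \times n$ matrix of constants and $\mathbf{A}_2$ is an $n \times n$ symmetric matrix of constants, then
\[
\mathbf{A}_1 \boldsymbol{\Sigma} \mathbf{A}_2 = \mathbf{0} \Longrightarrow \mathbf{A}_1 \mathbf{y} \perp \mathbf{y}'\mathbf{A}_2 \mathbf{y}.
\]

If $\mathbf{A}_1$ and $\mathbf{A}_2$ are $n \times n$ symmetric matrices of constants, then
\[
\mathbf{A}_1 \boldsymbol{\Sigma} \mathbf{A}_2 = \mathbf{0} \Longrightarrow \mathbf{y}'\mathbf{A}_1 \mathbf{y} \perp \mathbf{y}'\mathbf{A}_2 \mathbf{y}.
\]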
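The condition in the second result (a linear form independent of a quadratic form) can be verified numerically for a standard special case. The sketch below assumes Σ = σ²I, takes A1 = (1/n)1' so that A1 y is the sample mean, and takes A2 = I − J/n so that y'A2 y is the sum of squared deviations from the sample mean. These choices and the function names are illustrative assumptions, not part of the slides.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def is_zero(M, tol=1e-12):
    """Check that every entry of M is zero up to a numerical tolerance."""
    return all(abs(v) <= tol for row in M for v in row)

def mean_vs_deviations_product(n, sigma2):
    """Return A1 Sigma A2 for A1 = (1/n) 1', Sigma = sigma2 * I, A2 = I - J/n."""
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    Sigma = [[sigma2 * v for v in row] for row in identity]
    A1 = [[1.0 / n] * n]                      # 1 x n: A1 y is the sample mean
    A2 = [[identity[i][j] - 1.0 / n for j in range(n)]
          for i in range(n)]                  # symmetric centering matrix
    return matmul(matmul(A1, Sigma), A2)

product = mean_vs_deviations_product(n=4, sigma2=2.0)
print("A1 Sigma A2:", product)
print("condition A1 Sigma A2 = 0 holds:", is_zero(product))
```

Since the product is the zero matrix, the result implies that, under these assumptions, the sample mean is independent of the sum of squared deviations.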