Lecture 1. Matrix Algebra and Statistics

Matrix Algebra (brief review)

• Definition of a matrix and a vector (column vector, row vector). An m × n matrix A is a rectangular array of scalars:

      A = (a_ij)_{m×n} = [ a_11  a_12  ...  a_1n ]
                         [ a_21  a_22  ...  a_2n ]
                         [  ...   ...  ...   ... ]
                         [ a_m1  a_m2  ...  a_mn ]

• Transpose: A'.

• Addition: A + B.

• Multiplication: AB.

• Trace. The trace of a square matrix A = (a_ij)_{n×n} is the sum of its diagonal elements:

      tr(A) = Σ_{i=1}^{n} a_ii.

  A useful property of the trace: tr(AB) = tr(BA).

• Eigenvalues and eigenvectors. For a square matrix A, if we can find a scalar λ and a non-zero vector x such that

      Ax = λx,

  then λ is called an eigenvalue of A, and x is called a corresponding eigenvector.

• A square matrix A = (a_ij)_{n×n} has n eigenvalues λ_1, ..., λ_n and n corresponding eigenvectors. The trace, rank and determinant of A are given by

      tr(A) = Σ_{i=1}^{n} λ_i,
      rank(A) = the number of nonzero eigenvalues,
      |A| = Π_{i=1}^{n} λ_i.

• Identity matrix.

      I = [ 1  0  ...  0 ]
          [ 0  1  ...  0 ]
          [ ... ... ... ..]
          [ 0  0  ...  1 ]_{n×n}

  If the identity matrix is of order n, we often write I_n to emphasize this.

• Orthogonal matrix. A square matrix A_{n×n} is said to be orthogonal if

      AA' = A'A = I_n.

• Inverse of a matrix. A square matrix A_{n×n} is said to be invertible or non-singular if we can find a matrix B such that AB = BA = I_n; then B is called the inverse of A, denoted A^{-1}.

• For a 2 × 2 matrix

      A = [ a  b ]
          [ c  d ],

  the determinant of A is |A| = ad − bc. If |A| ≠ 0, we have

      A^{-1} = 1/(ad − bc) [  d  −b ]
                           [ −c   a ].

• Symmetric matrix. A matrix A is said to be symmetric if A' = A. If A is a symmetric matrix, all its eigenvalues are real numbers.

• Spectral Decomposition Theorem. Any symmetric matrix A can be written as

      A = PΛP',

  where Λ is a diagonal matrix of the eigenvalues of A, and P is an orthogonal matrix whose columns are standardized eigenvectors, PP' = P'P = I.

• Idempotent matrix. A square matrix A is said to be idempotent if AA = A.
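The eigenvalue identities, the spectral decomposition, and the trace property tr(AB) = tr(BA) above can all be checked numerically. The following NumPy sketch (the matrix entries are arbitrary illustrative values, not from the notes) verifies each claim for a small symmetric matrix:

```python
import numpy as np

# An arbitrary symmetric matrix for illustration
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

# Spectral decomposition: A = P Λ P' with orthogonal P
lam, P = np.linalg.eigh(A)   # eigenvalues (ascending) and eigenvector columns
Lam = np.diag(lam)

print(np.allclose(P @ Lam @ P.T, A))             # A = P Λ P'         -> True
print(np.allclose(P @ P.T, np.eye(2)))           # P P' = I           -> True
print(np.isclose(np.trace(A), lam.sum()))        # tr(A) = Σ λ_i      -> True
print(np.isclose(np.linalg.det(A), lam.prod()))  # |A| = Π λ_i        -> True

# tr(AB) = tr(BA) for an arbitrary (non-symmetric) B
B = np.array([[0.0, 1.0],
              [4.0, 2.0]])
print(np.isclose(np.trace(A @ B), np.trace(B @ A)))  # -> True
```

Here `np.linalg.eigh` is used rather than `np.linalg.eig` because A is symmetric, which guarantees real eigenvalues and an orthogonal eigenvector matrix P.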
  An idempotent matrix has the following important properties:

  • tr(A) = rank(A).
  • rank(A) = n ⇐⇒ A = I_n.
  • If A is symmetric and idempotent, then A is positive semi-definite.

• Definite matrices.

  • A matrix A is said to be positive definite if x'Ax > 0 for all nonzero vectors x, or equivalently if all its eigenvalues are positive.
  • A matrix A is said to be positive semi-definite if x'Ax ≥ 0 for all nonzero vectors x, or equivalently if all its eigenvalues are non-negative.
  • A matrix A is said to be negative definite if x'Ax < 0 for all nonzero vectors x, or equivalently if all its eigenvalues are negative.

Statistics

• Basic concepts. Population, sample, independence, random variable, mean, variance, confidence intervals, hypothesis testing (Type I error, Type II error, p-value, test statistic, significance level).

• Normal distribution. If y ∼ N(µ, σ^2), let

      z = (y − µ)/σ.

  Then z has the standard normal distribution: z ∼ N(0, 1).

• Chi-square (χ^2) distribution. Let z_1, ..., z_k be independent standard normal random variables, i.e., z_1, ..., z_k ∼ NID(0, 1). Then the random variable

      x = z_1^2 + ... + z_k^2

  follows the chi-square distribution with k degrees of freedom.

• t distribution. If z and χ^2_k are independent standard normal and chi-square random variables, respectively, the random variable

      t_k = z / sqrt(χ^2_k / k)

  follows the t distribution with k degrees of freedom, denoted t_k.

• F distribution. If χ^2_u and χ^2_v are two independent chi-square random variables with u and v degrees of freedom, respectively, then the ratio

      F_{u,v} = (χ^2_u / u) / (χ^2_v / v)

  follows the F distribution with u numerator and v denominator degrees of freedom.

Often used matrices and vectors

• n × 1 vectors:

      y = [ y_1 ]      1 = [ 1 ]      µ = [ µ ]
          [ y_2 ]          [ 1 ]          [ µ ]
          [ ... ]          [...]          [...]
          [ y_n ],         [ 1 ],         [ µ ].

• n × n matrices:

      I_n = [ 1  0  ...  0 ]      J_n = [ 1  1  ...  1 ]
            [ 0  1  ...  0 ]            [ 1  1  ...  1 ]
            [ ... ... ... ..]           [ ... ... ... ..]
            [ 0  0  ...  1 ],           [ 1  1  ...  1 ],

      V = Var(y) = [ var(y_1)       cov(y_1, y_2)  ...  cov(y_1, y_n) ]
                   [ cov(y_2, y_1)  var(y_2)       ...  cov(y_2, y_n) ]
                   [    ...             ...        ...      ...       ]
                   [ cov(y_n, y_1)  cov(y_n, y_2)  ...  var(y_n)      ].

Linear Model Theorems

Theorem 1: If y ∼ N(µ, V) and l = By + b, then l ∼ N(Bµ + b, BVB').

Theorem 2: If y ∼ N(0, V) and q = y'Ay, then q ∼ χ^2_r with r = rank(A) if and only if AV is idempotent.

Theorem 3: If y ∼ N(µ, V), l = By and q = y'Ay, then q and l are independent ⇐⇒ BVA = 0.

Theorem 4: For a quadratic form q = y'Ay, where E(y) = µ and Var(y) = V, we have E(q) = tr(AV) + µ'Aµ.
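Theorem 4 lends itself to a quick numerical sanity check. The sketch below (µ, V, and A are arbitrary illustrative choices, not from the notes) computes tr(AV) + µ'Aµ directly and compares it with a seeded Monte Carlo estimate of E(y'Ay):

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary illustrative choices: mean mu, positive definite V, and A
mu = np.array([1.0, 2.0])
V = np.array([[2.0, 0.5],
              [0.5, 1.0]])
A = np.array([[1.0, 0.0],
              [0.0, 3.0]])

# Theorem 4: E(y'Ay) = tr(AV) + mu'A mu  (= 5 + 13 = 18 here)
expected = np.trace(A @ V) + mu @ A @ mu

# Monte Carlo estimate: draw y ~ N(mu, V) and average the quadratic form
y = rng.multivariate_normal(mu, V, size=200_000)
q = np.einsum("ij,jk,ik->i", y, A, y)   # q_i = y_i' A y_i for each draw
print(expected, q.mean())               # the sample mean is close to 18
```

This is only an illustration of the theorem, not a proof; the `einsum` call evaluates the quadratic form for all simulated draws at once instead of looping.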