Database Mining CSCI 4390/6390
Lecture 3: Linear Algebra, Convex Problems
Wei Liu, IBM T. J. Watson Research Center
Sep 5, 2014

Announcements
Course website: http://www.ee.columbia.edu/~wliu/Database_Mining_Course.html
Email the TA (lih13@rpi.edu), who will add you to the mailing list and Google group.
I hope that you have solid mathematical and programming skills, particularly in linear algebra, calculus, probability, statistics, and algorithms.

Overview: Linear algebra
Vector & matrix; linear independence & rank; linear space and subspace; linear systems; eigenvalue & eigenvector

Overview: Convex problems
Convex function and convex set; local and global optima; convex quadratic forms; least squares

Linear Algebra
Vector & matrix; linear independence & rank; linear space and subspace; linear systems; eigenvalue & eigenvector

Vector
A vector $x = [x_1, \dots, x_d]^\top$ represents a specific data object. Each entry $x_i$ is an attribute (also called a dimension or feature), and $d$ is the dimensionality, i.e., the total number of dimensions.

Matrix
A matrix represents a set of data objects. Each of its rows is a row vector and each of its columns is a column vector.

Tensor
A tensor is a multi-order array; e.g., a cube is a third-order array.
Tensor decomposition = matrix × matrix × matrix (outer product).

Matrix Operations
For size-compatible matrices, generally $AB \neq BA$.
For size-compatible matrices, the following hold: $(AB)C = A(BC)$, $A(B + C) = AB + AC$, and $(AB)^\top = B^\top A^\top$.
For a square matrix $A$, if $AB = BA = I$, then $A$ is invertible and $B = A^{-1}$.
For invertible matrices $A$ and $B$ of the same size, $(AB)^{-1} = B^{-1} A^{-1}$.

Matrix Operations (continued)
If a square matrix $A$ has orthonormal column vectors (equivalently, orthonormal row vectors), then $A$ is an orthogonal matrix.
For any orthogonal matrix $A$, we have $A^\top A = A A^\top = I$, i.e., $A^{-1} = A^\top$.
A square matrix $A$ is symmetric if and only if $A = A^\top$.
The Frobenius norm of any matrix $A$ is $\|A\|_F = \sqrt{\sum_{i,j} A_{ij}^2} = \sqrt{\mathrm{tr}(A^\top A)}$.

Linear Dependence vs. Linear Independence
For a set of vectors $x_1, \dots, x_n$:
if there exist coefficients $\alpha_1, \dots, \alpha_n$, not all zero, such that $\sum_{i=1}^{n} \alpha_i x_i = 0$, then $x_1, \dots, x_n$ are linearly dependent;
if $\sum_{i=1}^{n} \alpha_i x_i = 0$ holds only when all coefficients are zero, then $x_1, \dots, x_n$ are linearly independent.

Linear Dependence vs. Linear Independence
[Figure: vectors drawn on x, y, z axes, one panel showing a linearly dependent set and one showing a linearly independent set.]

Rank of Matrices
For a matrix $A \in \mathbb{R}^{m \times n}$:
Rank(A) = the largest number of linearly independent column vectors = the largest number of linearly independent row vectors.
Rank(A) <= min(m, n).
Rank(A) = n means A has full column rank; Rank(A) = m means A has full row rank.

Linear Space
A linear space F is a set of vectors that satisfies:
1) 0 (possibly virtual) is in F;
2) for any two vectors a and b in F, a + b is in F;
3) for any vector c in F, $\alpha c$ is in F for any constant $\alpha$.
If any vector a in a linear space F can be represented as a linear combination of linearly independent vectors $b_1, \dots, b_d$, such vectors are called a basis of F. An orthonormal basis satisfies $b_i^\top b_j = 1$ if $i = j$ and $b_i^\top b_j = 0$ otherwise. Here d is the dimensionality of the linear space F.

Inner-Product Space
An inner-product operator < , > defined over F satisfies:
1) for any two vectors a and b in F, <a,b> = <b,a>;
2) for any two vectors a and b in F and any constant $\alpha$, $\langle \alpha a, b \rangle = \alpha \langle a, b \rangle$;
3) for any three vectors a, b, and c in F, $\langle a + b, c \rangle = \langle a, c \rangle + \langle b, c \rangle$;
4) for any vector a in F, $\langle a, a \rangle \geq 0$; the vector's length (L2 norm) is $\|a\| = \sqrt{\langle a, a \rangle}$;
5) <a,a> = 0 if and only if a = 0.
A linear space with a defined inner product is an inner-product space.

Linear Subspace
A linear subspace S is a subset of the linear space F that satisfies:
1) 0 (possibly virtual) is in S;
2) for any two vectors a and b in S, a + b is in S;
3) for any vector c in S, $\alpha c$ is in S for any constant $\alpha$.
For example, any line or plane passing through the origin is a subspace of $\mathbb{R}^3$.
Any set of vectors $x_1, \dots, x_k$ in a linear space F spans a subspace S: $S = \mathrm{span}\{x_1, \dots, x_k\} = \{\sum_{i=1}^{k} \alpha_i x_i : \alpha_i \in \mathbb{R}\}$.
The dimensionality of S is $\mathrm{Rank}([x_1, \dots, x_k])$, the largest number of linearly independent vectors among $x_1, \dots, x_k$.

Equally-Constrained Linear System
Solve Ax = b, where $A \in \mathbb{R}^{m \times n}$, constant vector $b \in \mathbb{R}^m$, and variable vector $x \in \mathbb{R}^n$.
When m = n, if Rank(A) = n, then the linear system has a unique solution $x = A^{-1} b$.
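As a rough illustration of the equally-constrained case, the sketch below solves a small square system Ax = b by Gaussian elimination with partial pivoting, using plain Python lists. The function name `solve_square`, the tolerance `tol`, and the 2×2 example are assumptions made for illustration and do not come from the lecture. In practice a library routine such as `numpy.linalg.solve` would normally be used instead.

```python
def solve_square(A, b, tol=1e-12):
    """Solve Ax = b for a square A with Rank(A) = n (hypothetical helper)."""
    n = len(A)
    # Build the augmented matrix [A | b]; copying keeps the inputs unchanged.
    M = [[float(v) for v in row] + [float(bi)] for row, bi in zip(A, b)]
    for col in range(n):
        # Partial pivoting: choose the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < tol:
            # No usable pivot means Rank(A) < n, so there is no unique solution.
            raise ValueError("Rank(A) < n: no unique solution")
        M[col], M[pivot] = M[pivot], M[col]
        # Eliminate the entries below the pivot.
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


# Example: 2*x1 + x2 = 3 and x1 + 3*x2 = 5.
A = [[2.0, 1.0], [1.0, 3.0]]
b = [3.0, 5.0]
x = solve_square(A, b)  # approximately [0.8, 1.4]
```

When Rank(A) < n the elimination runs out of nonzero pivots, which is why the sketch raises an error rather than returning a vector.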
Over-Constrained Linear System
Solve Ax = b, where $A \in \mathbb{R}^{m \times n}$, constant vector $b \in \mathbb{R}^m$, and variable vector $x \in \mathbb{R}^n$.
When m > n, if Rank(A) = n, then the linear system has a solution if and only if b lies in the column space of A, i.e., Rank([A, b]) = n. In that case the solution is also unique: $x = (A^\top A)^{-1} A^\top b$.

Under-Constrained Linear System
Solve Ax = b, where $A \in \mathbb{R}^{m \times n}$, constant vector $b \in \mathbb{R}^m$, and variable vector $x \in \mathbb{R}^n$.
When m < n, if Rank(A) = m, then the linear system has infinitely many solutions, whatever b is.

Eigenvalue and Eigenvector
For a square matrix A, if $Ax = \lambda x$ holds for some $x \neq 0$, then $\lambda$ is an eigenvalue of A, and x is an eigenvector of A.
We usually hope to obtain real eigenvalues, and use normalized eigenvectors, i.e., $\|x\| = 1$.
All eigenvalues can be solved from the equation $\det(A - \lambda I) = 0$.
Any real symmetric matrix has all real eigenvalues.
In Matlab, [V,E] = eig(A) (full eigenvalues/eigenvectors) or [V,E] = eigs(A,r) (top-r eigenvalues/eigenvectors, i.e., the r eigenvalues of largest magnitude).

Eigenvalue and Eigenvector (continued)
If a symmetric matrix A is invertible, then A has no zero eigenvalue.
The full eigen-decomposition of a symmetric matrix $A \in \mathbb{R}^{n \times n}$ is $A = V E V^\top$, where $E = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$ and $V = [v_1, \dots, v_n]$ includes normalized eigenvectors, $V^\top V = V V^\top = I$.
A symmetric matrix A is positive definite if and only if $x^\top A x > 0$ for all $x \neq 0$, equivalently all eigenvalues of A are positive.
A symmetric matrix A is positive semidefinite (note that A may be singular) if and only if $x^\top A x \geq 0$ for all x, equivalently all eigenvalues of A are nonnegative.

Convex Problems
Convex function and convex set = convex problem
Local and global optima => any local optimum is a global optimum
Convex quadratic forms => simplest convex problem
Least squares => regularized least squares has a unique optimum
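To make the eigenvalue slides more concrete, here is a minimal sketch of power iteration, one simple way to approximate the top eigenvalue and a normalized eigenvector of a symmetric matrix. It is not presented as the method behind Matlab's eig or eigs. The function name `power_iteration`, the iteration count, the starting vector, and the 2×2 example are all assumptions for illustration. The sketch also assumes that the largest-magnitude eigenvalue is unique and that the starting vector is not orthogonal to its eigenvector.

```python
import math


def power_iteration(A, num_iters=200):
    """Approximate the dominant eigenpair of a symmetric matrix A (hypothetical helper)."""
    n = len(A)
    # Assumed starting vector; it must not be orthogonal to the top eigenvector.
    x = [1.0] + [0.0] * (n - 1)
    for _ in range(num_iters):
        # Multiply by A, then rescale so that the L2 norm of x is 1.
        y = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    # For a unit-norm x, the Rayleigh quotient x^T A x estimates the eigenvalue.
    Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    lam = sum(x[i] * Ax[i] for i in range(n))
    return lam, x


# Example: this symmetric matrix has eigenvalues 3 and 1.
A = [[2.0, 1.0], [1.0, 2.0]]
lam, v = power_iteration(A)  # lam is close to 3, v is close to [0.7071, 0.7071]
```

Both eigenvalues of this example matrix are positive, so by the criterion on the slide it is positive definite.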