Ordinary Least-Squares Ordinary Least-Squares Outline • Linear regression • Geometry of least-squares • Discussion of the Gauss-Markov theorem Ordinary Least-Squares One-dimensional regression b a Ordinary Least-Squares One-dimensional regression Find a line that represent the ”best” linear relationship: b b ax a Ordinary Least-Squares One-dimensional regression • Problem: the data does not go through a line b ei bi ai x bi ai x a Ordinary Least-Squares One-dimensional regression • Problem: the data does not go through a line b ei bi ai x • Find the line that minimizes the sum: bi ai x (b a x) i i a Ordinary Least-Squares i 2 One-dimensional regression • Problem: the data does not go through a line b ei bi ai x • Find the line that minimizes the sum: bi ai x (b a x) i i i a • We are looking for minimizes xˆ that e( x) (bi ai x) i Ordinary Least-Squares 2 2 Matrix notation Using the following notations a1 a : an Ordinary Least-Squares and b1 b : bn Matrix notation Using the following notations a1 b1 and a : b : an bn we can rewrite the error function using linear algebra as: e( x) (bi ai x) 2 i (b xa )T (b xa ) e( x) b xa Ordinary Least-Squares 2 Matrix notation Using the following notations a1 b1 and a : b : an bn we can rewrite the error function using linear algebra as: e( x) (bi ai x) 2 i (b xa )T (b xa ) e( x) b xa Ordinary Least-Squares 2 Multidimentional linear regression Using a model with m parameters b a1 x1 ... am xm a j x j j Ordinary Least-Squares Multidimentional linear regression Using a model with m parameters b a1 x1 ... am xm a j x j j b a1 Ordinary Least-Squares a2 Multidimentional linear regression Using a model with m parameters b a1 x1 ... am xm a j x j j b a1 Ordinary Least-Squares a2 Multidimentional linear regression Using a model with m parameters b a1 x1 ... am xm a j x j j and n measurements n m i 1 j 1 e( x ) (bi ai , j x j ) b ai , j x j j 1 m b Ax Ordinary Least-Squares 2 2 2 Multidimentional linear regression Using a model with m parameters b a1 x1 ... am xm a j x j j and n measurements n m i 1 j 1 e( x ) (bi ai , j x j ) b ai , j x j j 1 m e( x ) b Ax Ordinary Least-Squares 2 2 2 b Ax b1 a1,1 .. a1,m x1 b Ax : : : . : bn an ,1 .. an ,m xm Ordinary Least-Squares b Ax b1 a1,1 .. a1,m x1 b Ax : : : . : bn an ,1 .. an ,m xm b1 (a1,1 x1 ... a1,m xm ) : bn (an ,1 x1 ... an ,m xm ) Ordinary Least-Squares b Ax parameter 1 b1 a1,1 .. a1,m x1 b Ax : : : . : bn an ,1 .. an ,m xn b1 (a1,1 x1 ... a1,m xm ) : bn (an ,1 x1 ... an ,m xm ) Ordinary Least-Squares b Ax parameter 1 b1 a1,1 .. a1,m x1 b Ax : : : . : bn an ,1 .. an ,m xn b1 (a1,1 x1 ... a1,m xm ) : bn (an ,1 x1 ... an ,m xm ) Ordinary Least-Squares measurement n Minimizing e(x ) x min minimizes e(x) if Ordinary Least-Squares Minimizing e(x ) x min minimizes e(x) if e(x ) x min Ordinary Least-Squares Minimizing e(x ) e(x) is flat at x min x min minimizes e(x) if e(x ) x min Ordinary Least-Squares Minimizing e(x ) e(x) is flat at x min e( x min ) 0 x min minimizes e(x) if e(x ) x min Ordinary Least-Squares Minimizing e(x ) e(x) is flat at x min e( x min ) 0 x min minimizes e(x) if e(x) does not go down around x min e(x ) x min Ordinary Least-Squares Minimizing e(x ) e(x) is flat at x min e( x min ) 0 x min minimizes e(x) if e(x) does not go down around x min H e ( x min ) is positive semi - definite e(x ) x min Ordinary Least-Squares Positive semi-definite A is positive semi - definite x T Ax 0 , for all x In 1-D Ordinary Least-Squares In 2-D Minimizing e(x ) 1 T x H e ( xˆ ) x 2 e(x ) xˆ Ordinary Least-Squares Minimizing e( x ) b Ax 2 1 T e( x ) x H e ( xˆ ) x 2 xˆ Ordinary Least-Squares Minimizing e( x ) b Ax 2 T ˆ A Ax A b T xˆ minimizes e(x) if 2 A A is positive T semi - definite Ordinary Least-Squares Minimizing e( x ) b Ax 2 T ˆ A Ax A b T xˆ minimizes e(x) if 2 A A is positive T semi - definite Always true Ordinary Least-Squares Minimizing e( x ) b Ax 2 T ˆ A Ax A b T The normal equation xˆ minimizes e(x) if 2 A A is positive T semi - definite Always true Ordinary Least-Squares Geometric interpretation Ordinary Least-Squares Geometric interpretation • b is a vector in Rn b Ordinary Least-Squares Geometric interpretation • b is a vector in Rn • The columns of A define a vector space range(A) b a1 a2 Ordinary Least-Squares Geometric interpretation • b is a vector in Rn • The columns of A define a vector space range(A) • Ax is an arbitrary vector in range(A) b a1 x1a1 x2a 2 Ax a2 Ordinary Least-Squares Geometric interpretation • b is a vector in Rn • The columns of A define a vector space range(A) • Ax is an arbitrary vector in range(A) b b Ax a1 x1a1 x2a 2 Ax a2 Ordinary Least-Squares Geometric interpretation • Axˆ is the orthogonal projection of b onto range(A) AT b Axˆ 0 AT Axˆ AT b b b Axˆ a1 xˆ1a1 xˆ2a 2 Axˆ a2 Ordinary Least-Squares The normal equation: AT Axˆ AT b Ordinary Least-Squares The normal equation: AT Axˆ AT b • Existence: Ordinary Least-Squares AT Axˆ AT b has always a solution The normal equation: AT Axˆ AT b • Existence: AT Axˆ AT b has always a solution • Uniqueness: the solution is unique if the columns of A are linearly independent Ordinary Least-Squares The normal equation: AT Axˆ AT b • Existence: AT Axˆ AT b has always a solution • Uniqueness: the solution is unique if the columns of A are linearly independent b Axˆ Ordinary Least-Squares a2 a1 Under-constrained problem b a1 a2 Ordinary Least-Squares Under-constrained problem b a1 a2 Ordinary Least-Squares Under-constrained problem b a1 a2 Ordinary Least-Squares Under-constrained problem b • Poorly selected data • One or more of the parameters are redundant a1 a2 Ordinary Least-Squares Under-constrained problem b • Poorly selected data • One or more of the parameters are redundant a1 Add constraints AT Ax AT b with minx x Ordinary Least-Squares a2 How good is the least-squares criteria? • Optimality: the Gauss-Markov theorem Ordinary Least-Squares How good is the least-squares criteria? • Optimality: the Gauss-Markov theorem Let bi and x j be two sets of random variables and define: ei bi ai ,1 x1 ... ai ,m xm Ordinary Least-Squares How good is the least-squares criteria? • Optimality: the Gauss-Markov theorem Let bi and x j be two sets of random variables and define: ei bi ai ,1 x1 ... ai ,m xm If A1 : ai,j are not random variables, A2 : E(ei ) 0, for all i, A3: var(ei ) , for all i, A4 : cov(ei ,e j ) 0, for all i and j , Ordinary Least-Squares How good is the least-squares criteria? • Optimality: the Gauss-Markov theorem Let bi and x j be two sets of random variables and define: ei bi ai ,1 x1 ... ai ,m xm If A1 : ai,j are not random variables, A2 : E(ei ) 0, for all i, A3: var(ei ) , for all i, A4 : cov(ei , e j ) 0, for all i and j , 2 ˆ Then x arg min x ei is the best unbiased linear estimator Ordinary Least-Squares b ei no errors in ai Ordinary Least-Squares a b b ei no errors in ai Ordinary Least-Squares ei a errors in ai a b homogeneous errors Ordinary Least-Squares a b b homogeneous errors Ordinary Least-Squares a non-homogeneous errors a b no outliers Ordinary Least-Squares a outliers b b no outliers Ordinary Least-Squares a outliers a