Linear Regression
$Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$

Marco Lattuada
Swiss Federal Institute of Technology - ETH
Institut für Chemie und Bioingenieurwissenschaften
ETH Hönggerberg/ HCI F135 – Zürich (Switzerland)
E-mail: lattuada@chem.ethz.ch
http://www.morbidelli-group.ethz.ch/education/index

Regression Analysis
• Aim: To determine the extent to which a response (dependent) variable is related to a set of explanatory (independent) variables.
$Y = f(x_1, x_2, \ldots, x_N)$, where $Y$ is the response and $x_1, \ldots, x_N$ are the observations.
• Example: James David Forbes (Edinburgh 1809-1868), Professor in glaciology. He measured the boiling point of water and the atmospheric pressure at 17 different locations in the Swiss Alps (Jungfrau) and in Scotland, with the aim of using the boiling temperature of water to estimate altitude.
Relation between $T_b$ and $\log P_{atm}$: $T_b = \beta_0 + \beta_1 \log P_{atm} = \beta_0 + \beta_1 x$

Marco Lattuada – Statistical and Numerical Methods for Chemical Engineers Simple Linear Regressions – Page # 2

Regression Model
Input data: vectors $x$ and $Y$, where:
• $x_i$ → i-th observation
• $Y_i$ → i-th response, or measurement
Model: $Y = \beta_0 + \beta_1 x + \varepsilon$ or $Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$, where $\varepsilon_i$ is the measurement error.
Fundamental assumption: errors are mutually independent and normally distributed with mean zero and variance $\sigma^2$: $\varepsilon_i \sim N(0, \sigma^2)$
Output data:
• $\hat{\beta}_0, \hat{\beta}_1$ → estimated values of $\beta_0$ and $\beta_1$

Residuals
$\varepsilon_i \sim N(0, \sigma^2)$
$E[Y_i] = E[\beta_0 + \beta_1 x_i + \varepsilon_i] = \beta_0 + \beta_1 x_i = \mu_i$
$\mathrm{var}[Y_i] = \mathrm{var}[\beta_0 + \beta_1 x_i + \varepsilon_i] = \mathrm{var}[\varepsilon_i] = \sigma^2$
[Figure: regression line $E[Y] = \beta_0 + \beta_1 x_i$ with residuals $\varepsilon_i$; each measurement is distributed as $Y_i \sim N(\mu_i, \sigma^2)$]

Estimation of the Parameters
Least Squares Method:
$S(\beta_0, \beta_1) = \sum_{i=1}^{N_{obs}} (Y_i - \beta_0 - \beta_1 x_i)^2$
The objective function $S$ measures the closeness between the regression line and the observations. The parameter estimates are the values of $\beta_0$ and $\beta_1$ that minimize $S$.
Minimum of S:
$\frac{\partial S}{\partial \beta_0} = 0, \quad \frac{\partial S}{\partial \beta_1} = 0$
Solving these two conditions gives:
$\hat{\beta}_1 = \frac{\sum_i (x_i - \bar{x})(Y_i - \bar{Y})}{\sum_i (x_i - \bar{x})^2}$
$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{x}$

Example

Example: Parameter Estimation
• Averages
• Estimation of $\beta_0$ and $\beta_1$
• Fitted line: $\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 x$

Example: Matlab Regression Routine
$X = \begin{pmatrix} 1 & x_1 \\ \vdots & \vdots \\ 1 & x_{N_{obs}} \end{pmatrix}, \quad Y = \begin{pmatrix} Y_1 \\ \vdots \\ Y_{N_{obs}} \end{pmatrix}$
$\alpha$ = parameter setting the confidence interval (a $100(1-\alpha)\%$ interval)

Residuals
[Figure: residual plot with one point marked as an outlier]
$\sum_{i=1}^{N_{obs}} \varepsilon_i = 0$
$\sum_{i=1}^{N_{obs}} \varepsilon_i x_i = 0$

Removal of the Outlier

Analysis of Variance (ANOVA)
Total Sum of Squares:
$SSTO = \sum_{i=1}^{N_{obs}} (Y_i - \bar{Y})^2$
Sum of Squares due to Regression:
$SSR = \sum_{i=1}^{N_{obs}} (\hat{Y}_i - \bar{Y})^2$
Sum of Squares due to Error:
$SSE = \sum_{i=1}^{N_{obs}} (\hat{Y}_i - Y_i)^2 = \sum_{i=1}^{N_{obs}} \varepsilon_i^2$
Coefficient of Determination:
$R^2 = \frac{SSR}{SSTO} = 1 - \frac{SSE}{SSTO}$
• $R^2 = 1$ → $\varepsilon_i = 0$ for all i
• $R^2 = 0$ → the regression does not explain the variation of Y

Regression Analysis with Matlab
• Regression routine
• Confidence interval

Alessandro Butté – Statistical and Numerical Methods for Chemical Engineers Simple Linear Regressions – Page # 12

Regression Analysis with Matlab
• Residuals
• Confidence interval for the residuals

Multiple Linear Regression
Approximate model: $\hat{Y} = x^T \hat{\beta}$
$Y = X\hat{\beta} + \varepsilon$:
$\begin{pmatrix} Y_1 \\ \vdots \\ Y_n \end{pmatrix} = \begin{pmatrix} 1 & x_{1,1} & \cdots & x_{1,p-1} \\ \vdots & \vdots & & \vdots \\ 1 & x_{n,1} & \cdots & x_{n,p-1} \end{pmatrix} \begin{pmatrix} \hat{\beta}_0 \\ \vdots \\ \hat{\beta}_{p-1} \end{pmatrix} + \begin{pmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_n \end{pmatrix}$
Residuals: $\varepsilon = Y - \hat{Y}$
Least Squares: $\min \|\varepsilon\|^2 = \min \|Y - \hat{Y}\|^2 \;\Rightarrow\; X^T X \hat{\beta} = X^T Y$
Here $\|\varepsilon\|^2$ is the Sum of Square Residuals (SSR). This is the quantity called SSE on the ANOVA slide, not the sum of squares due to regression.
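The simple linear regression formulas from the slides (the least-squares estimates, the two residual identities, and the ANOVA sums of squares) can be checked numerically with a short script. The sketch below is only an illustration in plain Python, not the Matlab routine shown in the lecture. The function names fit_simple_linear_regression and anova_summary are hypothetical, and the five data points are made up for the example; they are not Forbes' measurements.

```python
# Minimal sketch of simple linear regression by least squares.
# Assumptions: function names are hypothetical; the data are invented
# for illustration and are NOT Forbes' boiling-point measurements.


def fit_simple_linear_regression(x, y):
    """Return (b0_hat, b1_hat) minimising S = sum_i (y_i - b0 - b1*x_i)^2."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # b1_hat = sum (x_i - x_bar)(y_i - y_bar) / sum (x_i - x_bar)^2
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1_hat = s_xy / s_xx
    # b0_hat = y_bar - b1_hat * x_bar
    b0_hat = y_bar - b1_hat * x_bar
    return b0_hat, b1_hat


def anova_summary(x, y, b0_hat, b1_hat):
    """Return residuals, SSTO, SSR, SSE and R^2 for the fitted line."""
    n = len(y)
    y_bar = sum(y) / n
    y_hat = [b0_hat + b1_hat * xi for xi in x]
    residuals = [yi - yhi for yi, yhi in zip(y, y_hat)]
    ssto = sum((yi - y_bar) ** 2 for yi in y)      # total sum of squares
    ssr = sum((yhi - y_bar) ** 2 for yhi in y_hat)  # due to regression
    sse = sum(e ** 2 for e in residuals)            # due to error
    r2 = 1.0 - sse / ssto
    return residuals, ssto, ssr, sse, r2


# Made-up observations (x_i) and responses (Y_i).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0_hat, b1_hat = fit_simple_linear_regression(x, y)
residuals, ssto, ssr, sse, r2 = anova_summary(x, y, b0_hat, b1_hat)

print(f"b0_hat = {b0_hat:.4f}, b1_hat = {b1_hat:.4f}")
print(f"sum of residuals       = {sum(residuals):.2e}")
print(f"sum of residuals * x_i = {sum(e * xi for e, xi in zip(residuals, x)):.2e}")
print(f"SSTO = {ssto:.4f}, SSR = {ssr:.4f}, SSE = {sse:.4f}, R^2 = {r2:.4f}")
```

Up to floating-point rounding, the two residual sums come out as zero and SSTO equals SSR + SSE, which is why the two expressions for R^2 on the ANOVA slide agree.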