Lab 8: MATLAB – Curve Fitting – Least Squares Regression
➢ PREPARED BY ENG. SARAH ZEIDAN
➢ SUBJECT: NUMERICAL ANALYSIS LAB – MATH 284L
➢ DUE ON: THURSDAY, 31 OCT. 2024

31 OCT. 2024 MATH 284L – Eng. Sarah Zeidan

Outline
➢ Curve Fitting – Linear Regression
    ➢ Recall
    ➢ Example 1
        ➢ By-hand/Excel Solution
        ➢ MATLAB Solutions – Linear Model
            ➢ Method 1 – Basic Code: Steps, Note + Code
            ➢ Method 2 – Using polyfit() function: Steps, Code
            ➢ Method 3 – Using Curve Fitting Apps/Tool: Steps, Notations, Notes
    ➢ Extra Practice
➢ Assignment Five
    ➢ Exercise 1
    ➢ Exercise 2
➢ Reference & Useful Links

Curve Fitting – Linear Regression – Recall
Equations & Formulas Used
➢ Line Equation: $y_i = a_0 + a_1 x_i$
➢ Normal Equations (used to compute $a_0$ and $a_1$), where $n$ is the number of data points:
    $\sum_{i=1}^{n} y_i = n a_0 + a_1 \sum_{i=1}^{n} x_i$
    $\sum_{i=1}^{n} y_i x_i = a_0 \sum_{i=1}^{n} x_i + a_1 \sum_{i=1}^{n} x_i^2$
➢ Arithmetic Mean: $\bar{y} = \frac{\sum_{i=1}^{n} y_i}{n}$
➢ The total sum of the squares of the residuals about the mean: $S_t = \sum_{i=1}^{n} (y_i - \bar{y})^2$
➢ The Standard Error of the Estimate ($S_{y/x}$), where $S_r$ is the sum of the squares of the residuals about the regression line:
    $S_{y/x} = \sqrt{\frac{S_r}{n-2}}$, $S_r = \sum_{i=1}^{n} [y_i - (a_0 + a_1 x_i)]^2$
➢ The Correlation Coefficient ($r$), expressed as a percentage: $r = \sqrt{\frac{S_t - S_r}{S_t}} \times 100$ → $r = 100$ for a perfect fit.

Curve Fitting – Linear Regression – Example 1
➢ Use least-squares regression to fit a straight line to the given data.
➢ Along with the slope and intercept, compute the standard error of the estimate and the correlation coefficient. Plot the data and the regression line.
➢ You can find your instructor’s PDF solution on Moodle, “Sheet 1 – Solution”.

x    0     2     4      6      8     10
y    4.5   8.2   11.5   15.8   21    23.5

Curve Fitting – Linear Regression – Example 1 – By Hand/Excel Solution (1)
Columns x and y are given, x×y and x² are computed, and the last three columns are used for the error calculations. The last column is the model value $y_i = 4.24761 + 1.96714 x_i$.

       x     y      x×y     x²    S_r       S_t         y_i (model)
       0     4.5    0       0     0.06370   91.83964    4.24761
       2     8.2    16.4    4     0.00033   34.61322    8.18189
       4     11.5   46      16    0.37967   6.67344     12.11617
       6     15.8   94.8    36    0.06273   2.94706     16.05045
       8     21     168     64    1.03077   47.84074    19.98473
       10    23.5   235     100   0.17557   88.67424    23.91901
Sum    30    84.5   560.2   220   1.71277   272.58834   -

Curve Fitting – Linear Regression – Example 1 – By Hand/Excel Solution (2)
➢ Using the data above, the normal equations for the given set of data are:
    $84.5 = 6 a_0 + 30 a_1$
    $560.2 = 30 a_0 + 220 a_1$
  Solving these equations gives $a_0$ (intercept) $= 4.24761$ and $a_1$ (slope) $= 1.96714$.
➢ Thus, the best fit is: $y_i = 4.24761 + 1.96714 x_i$
➢ The Mean: $\bar{y} = 14.0833$
➢ The Standard Error of the Estimate: $S_{y/x} = 0.65436$
➢ The Correlation Coefficient: $r = 99.6853\,\% \approx 99.69\,\%$

Curve Fitting – Linear Regression – Example 1 – MATLAB Solutions – Linear Model – Method 1 (Basic Code) – Steps
➢ Steps Followed:
1. Define x and y as the given set of data, and n as the length of x.
2. Compute x*y and x squared, then the sum of x, of y, of x*y, and of x squared.
3. Solve the normal equations to get the intercept $a_0$ and the slope $a_1$. This can be done as follows:
    1. Write the square coefficient matrix and the constant column vector, then use either:
    2. The left-division (backslash \) operator, or
    3. The inv() function.
4. Write the best fit (the regression line) as $y_{model} = a_0 + a_1 x_i$.
5. Plot the original data set along with the regression line obtained in step 4.
6. Compute the standard error of the estimate ($S_{y/x}$) and the correlation coefficient ($r$).

Curve Fitting – Linear Regression – Example 1 – MATLAB Solutions – Linear Model – Method 1 (Basic Code) – NOTE
➢ Solving Linear Equations:
➢ In matrix notation, a system of simultaneous linear equations is written as $Ax = B$, where there are as many equations as unknowns.
➢ $A$ is a given square matrix of order n, $B$ is a given column vector of n components, and $x$ is an unknown column vector of n components. In linear algebra, the solution for $x$ is $x = A^{-1}B$, where $A^{-1}$ is the inverse of $A$.
➢ In our example, the normal equations are the system of equations to be solved (for a):
    $\begin{cases} \mathrm{sum\_y} = n\,a_0 + a_1\,\mathrm{sum\_x} \\ \mathrm{sum\_x\_y} = a_0\,\mathrm{sum\_x} + a_1\,\mathrm{sum\_x\_squared} \end{cases} \rightarrow \begin{cases} 84.5 = 6 a_0 + 30 a_1 \\ 560.2 = 30 a_0 + 220 a_1 \end{cases}$
➢ The coefficient matrix and the constant vector are:
    $A = \begin{bmatrix} n & \mathrm{sum\_x} \\ \mathrm{sum\_x} & \mathrm{sum\_x\_squared} \end{bmatrix}$, $B = \begin{bmatrix} \mathrm{sum\_y} \\ \mathrm{sum\_x\_y} \end{bmatrix}$
➢ The equations are solved for $a_0$ and $a_1$. This can be done in MATLAB in two ways:
1) Using the backslash (\) operator (matrix left division), which solves the system of linear equations in a numerically reliable way based on the well-known process of Gaussian elimination.
    ➢ Define A and B, then put a = A\B → a will be a column vector with $a_0$ = a(1) and $a_1$ = a(2).
2) Using the matrix inverse function, inv().
    ➢ Define A and B, then put a = inv(A)*B → a will be a column vector with $a_0$ = a(1) and $a_1$ = a(2). The matrix A MUST be a square matrix.

Curve Fitting – Linear Regression – Example 1 – MATLAB Solutions – Linear Model – Method 1 (Basic Code) – Code
Check the full commented code in the attached notepad file named “Lab_8_Curve_Fitting_Method_1”.

% Step 1: Define x and y, and n as the length of x
x = 0:2:10;
y = [4.5 8.2 11.5 15.8 21 23.5];
n = length(x);

% Step 2: Compute the required summations
x_y = x.*y;
x_squared = x.^2;
sum_x = sum(x);
sum_y = sum(y);
sum_x_y = sum(x_y);
sum_x_squared = sum(x_squared);

% Step 3: Solve the normal equations
B = [sum_y; sum_x_y];
A = [n sum_x; sum_x sum_x_squared];
a = A\B
% a = inv(A)*B

% Set a0 (intercept) and a1 (slope)
a0 = a(1);
a1 = a(2);

% Step 4: Write your model as a0 + a1*x
ymodel = a0 + a1*x;

% Step 5: Plot your original data and your regression line (model)
plot(x,y,'bd',x,ymodel,'mo-')
xlabel('x')
ylabel('y and ymodel')
legend('y','ymodel')
title('The Best Fit')
grid on

% Step 6: Compute the standard error of the estimate
% and the correlation coefficient
% the mean:
y_mean = mean(y);
s_t = sum((y-y_mean).^2)
s_r = sum((y-ymodel).^2)
s_y_x = sqrt(s_r/(n-2))
r_squared = (s_t - s_r)/s_t
r = sqrt(r_squared)*100

Curve Fitting – Linear Regression – Example 1 – MATLAB Solutions – Linear Model – Method 2 (Using polyfit() Function) – Steps
➢ Steps Followed:
1. Define x and y as the vectors containing the coordinates of the given data points, and m as the degree of the polynomial to fit.
2. Use the polyfit() function to get the intercept $a_0$ and the slope $a_1$. Syntax: p = polyfit(x,y,m), which fits a polynomial of degree m to the data.
3. Write the best fit (the regression line) as $y_{model} = a_0 + a_1 x_i$.
4. Plot the original data set along with the regression line obtained in step 3.
5. Compute the standard error of the estimate ($S_{y/x}$) and the correlation coefficient ($r$).

From the MATLAB help for polyfit: P = polyfit(X,Y,N) finds the coefficients of a polynomial P(X) of degree N that fits the data Y best in a least-squares sense. P is a row vector of length N+1 containing the polynomial coefficients in descending powers, P(1)*X^N + P(2)*X^(N-1) +...+ P(N)*X + P(N+1).

Curve Fitting – Linear Regression – Example 1 – MATLAB Solutions – Linear Model – Method 2 (Using polyfit() Function) – Code
Check the full commented code in the attached notepad file named “Lab_8_Curve_Fitting_Method_2”.

% Step 1: Define x and y, n as the number of data points,
% and m as the degree of the polynomial
x = 0:2:10;
y = [4.5 8.2 11.5 15.8 21 23.5];
n = length(x);
m = 1;

% Step 2: Use the polyfit() function
% The degree of the polynomial is 1,
% NOTE: here p(1) is the slope and p(2) is the intercept
p = polyfit(x,y,m);
a1 = p(1)
a0 = p(2)

% Step 3: Write your model as a0 + a1*x
ymodel = a0 + a1*x;

% Step 4: Plot your original data and your regression line (model)
plot(x,y,'bd',x,ymodel,'mo-')
xlabel('x')
ylabel('y and ymodel')
legend('y','ymodel')
title('The Best Fit')
grid on

% Step 5: Compute the standard error of the estimate
% and the correlation coefficient
% the mean:
y_mean = mean(y);
s_t = sum((y-y_mean).^2)
s_r = sum((y-ymodel).^2)
s_y_x = sqrt(s_r/(n-2))
r_squared = (s_t - s_r)/s_t
r = sqrt(r_squared)*100

Curve Fitting – Linear Regression – Example 1 – MATLAB Solutions – Linear Model – Method 3 (Using Curve Fitting APPS/Tool) – Steps
➢ Steps Followed:
1. Define x and y (make sure they appear in your workspace) as the vectors containing the coordinates of the given data points.
2. Go to the “APPS” tab and choose “Curve Fitting” from the “APPS” section.
3. Select the already-defined x and y as the data, choose a polynomial fit, and set its degree to 1. Done!

[Figure: Curve Fitting app window. Inputs: fit name, your data, ymodel. Outputs: coefficients, error calculations, plot.]

Curve Fitting – Linear Regression – Example 1 – MATLAB Solutions – Linear Model – Method 3 (Using Curve Fitting APPS/Tool) – Notations
➢ Goodness of Fit Notations:
    ➢ The Sum of Squares due to Error (SSE), also called the summed squares of residuals, $S_r$.
    ➢ R-squared, also called the coefficient of determination, $r^2$.
    ➢ (Not required) Adjusted R-squared is generally the best indicator of the fit quality when you compare two models in specific cases.
    ➢ Root Mean Squared Error (RMSE), also called the standard error of the regression (or of the estimate), $S_{y/x}$.
➢ For more details, please refer to: https://www.mathworks.com/help/curvefit/evaluating-goodness-of-fit.html

Curve Fitting – Linear Regression – Example 1 – MATLAB Solutions – Linear Model – Method 3 (Using Curve Fitting APPS/Tool) – Notes
➢ After you get your results from the Curve Fitting app, you can save them.
➢ You can also ask MATLAB to generate the code behind its work:
    ➢ Click on “File” in the Curve Fitting Tool window and choose “Generate Code”.
    ➢ A new script named “Untitled”, containing the code, will be created in your “Editor”.
➢ You can also click on “Help – Curve Fitting Tool Help” in the Curve Fitting Tool window for more guidance on how to use the Curve Fitting app.
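The goodness-of-fit names listed in the Notations can also be read from the command line. The following is a minimal sketch, assuming the Curve Fitting Toolbox is installed. It is not the script produced by “Generate Code”, whose contents are not shown in these slides. It fits the Example 1 data with the toolbox function fit and the library model 'poly1', then maps SSE, R-square, and RMSE onto $S_r$, $r^2$, and $S_{y/x}$. The variable names are chosen here only to match Methods 1 and 2.

```matlab
% Command-line sketch of the linear fit (assumes the Curve Fitting Toolbox)
x = (0:2:10).';                       % fit() expects column vectors
y = [4.5 8.2 11.5 15.8 21 23.5].';

% 'poly1' is the library model f(x) = p1*x + p2
[f, gof] = fit(x, y, 'poly1');

a1 = f.p1;                            % slope
a0 = f.p2;                            % intercept

% Map the goodness-of-fit fields onto the lab's notation
s_r       = gof.sse;                  % SSE      <-> S_r
r_squared = gof.rsquare;              % R-square <-> r^2
s_y_x     = gof.rmse;                 % RMSE     <-> S_y/x
r         = sqrt(r_squared)*100;      % correlation coefficient in percent

fprintf('a0 = %.5f, a1 = %.5f\n', a0, a1);
fprintf('S_r = %.5f, S_y/x = %.5f, r = %.4f %%\n', s_r, s_y_x, r);
```

The values printed should agree with the by-hand solution of Example 1 up to rounding, which gives a quick way to confirm that the app, the generated code, and Methods 1 and 2 describe the same fit.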
➢ Clicking “Help – Curve Fitting Tool Help” opens the documentation for “Interactive Curve and Surface Fitting”.

[Figure: Curve Fitting Tool window with callouts “To save, generate code, etc.” and “To get the documentation”.]

Curve Fitting – Linear Regression – Extra Practice
➢ You can try the rest of your Sheet 1 exercises.
➢ Also, here are some sets of data that you can test your code on (line equation).

Set 1
x    1     2     3     4     5     6     7
y    0.5   2.5   2.0   4.0   3.5   6.0   5.5

Set 2
x    0     4     8     16    24    30
y    12    18    20    26    26    22

Assignment Five – Exercise 1
➢ For the data shown in the table, write a code that will:
1. Find the equation of the straight line that best fits the tabulated data (the equation of the straight line is given by y = a0 + a1*x).
2. Find the mean, standard deviation, and standard error of the estimate.
3. Find the correlation coefficient, then the coefficient of determination.

x    1    2    3    4    5
y    2    4    5    4    5

Assignment Five – Exercise 2
➢ Write a suitable code that fits the given tabulated data to the model: $y = a e^{bx}$

x    2     3       4       5       6
y    144   172.8   207.4   248.8   298.5

Reference & Helpful Links
1. Lecture Sheet 1, Numerical Analysis Course, MATH 284, Curve Fitting, questions sent by the course instructors (Tripoli) and old solutions by the lab instructor.
2. D. Houcque, Introduction to MATLAB for Engineering Students, Northwestern University, 2005.
3. Polynomial Curve Fitting: https://www.mathworks.com/help/matlab/math/polynomial-curvefitting.html
4. Lecture Notes, Numerical Analysis Lab, MATH 284, Curve Fitting, sent by the course instructors (Tripoli & Debbieh).
5. For more about the backslash (matrix left division): https://www.mathworks.com/help/matlab/ref/mldivide.html
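As a closing cross-check of Example 1, the minimal sketch below (not the attached Lab_8 files) solves the normal equations with the backslash operator, repeats the fit with polyfit, and evaluates the line with polyval, a standard MATLAB function that the slides do not use. The name coef_diff is an illustrative assumption; it measures how closely the two routes agree.

```matlab
% Cross-check of Example 1 using the data from the lab slides
x = 0:2:10;
y = [4.5 8.2 11.5 15.8 21 23.5];
n = numel(x);

% Route 1: normal equations solved with the backslash operator
A = [n sum(x); sum(x) sum(x.^2)];
B = [sum(y); sum(x.*y)];
a = A\B;                  % a(1) = intercept a0, a(2) = slope a1

% Route 2: polyfit returns coefficients in descending powers
p = polyfit(x, y, 1);     % p(1) = slope a1, p(2) = intercept a0

% Evaluate the fitted line with polyval instead of writing a0 + a1*x
ymodel = polyval(p, x);

% Error measures, using the definitions from the Recall slide
s_t   = sum((y - mean(y)).^2);
s_r   = sum((y - ymodel).^2);
s_y_x = sqrt(s_r/(n - 2));
r     = sqrt((s_t - s_r)/s_t)*100;

% Largest disagreement between the two routes, in the order [a0 a1]
coef_diff = max(abs(a.' - [p(2) p(1)]));

fprintf('a0 = %.5f, a1 = %.5f\n', p(2), p(1));
fprintf('S_y/x = %.5f, r = %.4f %%\n', s_y_x, r);
fprintf('max coefficient difference = %.2e\n', coef_diff);
```

Replacing x and y with Set 1 or Set 2 from the Extra Practice slide reuses the same script unchanged, since n is taken from the data.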