Engr/Math/Physics 25
Chp6 MATLAB Regression
Bruce Mayer, PE
Registered Electrical & Mechanical Engineer
BMayer@ChabotCollege.edu
Engineering/Math/Physics 25: Computational Methods • Bruce Mayer, PE • BMayer@ChabotCollege.edu • ENGR-25_Plot_Model-4.ppt

Learning Goals cont
Use Regression Analysis as quantified by the “Least Squares” Method
• Calculate
– Sum-of-Squared Errors (SSE or J); the individual Errors being Squared are Called “Residuals”
– “Best Fit” CoEfficients ($m_0$ and $b_0$)
– Sum-of-Squares About the Mean (SSM or S)
– CoEfficient of Determination ($r^2$)
• Scale Data if Needed
– Creates more Meaningful Spacing

Learning Goals cont
Build Math Models for Physical Data using “nth” Degree Polynomials
Use MATLAB’s “Basic Fitting” Utility to find Math models for Plotted Data

Scatter on Plots in XY-Plane
A scatter plot usually shows how an EXPLANATORY, or independent, variable affects a RESPONSE, or Dependent Variable
Shown Below is a Conceptual Scatter plot that could Relate the RESPONSE to some EXCITATION
[Figure: conceptual scatter plot]
Sometimes the SHAPE of the scatter reveals a relationship

Linear Fit by Guessing
The previous plot looks sort of Linear
We could use a Ruler to draw a y = mx+b line thru the data
But
• which Line is BETTER?
• and WHY?

Least Squares Curve Fitting
In a Previous Example polyfit(x,y,1) returned the Values of m & b
polyfit, as do most other curve fitters, uses the “Least Squares” Criterion
• How does polyfit Make these Calcs?
• How Good is the fitted Line Compared to the Data?
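As a minimal sketch of the polyfit(x,y,1) call mentioned above, the lines below fit a straight line to a small data set. The x and y values are made up for illustration (an assumption, not course data), and the variable names m, b, and yFit are chosen here for readability.

```matlab
% Illustrative data only (assumed for this sketch, not from the course)
x = [1 2 3 4 5];
y = [2.1 3.9 6.2 7.8 10.1];

% Degree-1 least-squares fit: p(1) is the slope, p(2) is the intercept
p = polyfit(x, y, 1);
m = p(1);
b = p(2);

% Evaluate the fitted line at the data x-values
yFit = polyval(p, x);
```

Plotting x against y with markers and x against yFit as a line gives the ruler-drawn picture described on the guessing slide, except that the line is now chosen by the Least Squares criterion rather than by eye.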
Least Squares
[Figure: a data point $(x_k, y_k)$ and the fitted line, showing the Best Guess-y $y_L = m x_k + b$, the Best Guess-x $x_L = (y_k - b)/m$, and the distances $\delta x$, $\delta y$ and $h$]
To make a Good Fit, MINIMIZE the |GUESS − data| distance by one of
$$\delta x = \frac{y_k - b}{m} - x_k$$
$$\delta y = (m x_k + b) - y_k$$
$$h = \frac{\delta x \, \delta y}{\sqrt{\delta x^2 + \delta y^2}}$$

Least Squares cont
MATLAB polyfit Minimizes the VERTICAL distances; i.e.:
$$J = \sum_{k=1}^{n} (\delta y_k)^2 = \sum_{k=1}^{n} (m x_k + b - y_k)^2$$
Note that The Function J contains two Variables; m & b
Recall from MTH1 that to MINIMIZE a Function set the 1st (partial) Derivative(s) equal to Zero

Least Squares cont
To Minimize J, take
$$\frac{\partial J}{\partial m} = \frac{\partial}{\partial m} \sum_{k=1}^{n} (m x_k + b - y_k)^2 = 0$$
$$\frac{\partial J}{\partial b} = \frac{\partial}{\partial b} \sum_{k=1}^{n} (m x_k + b - y_k)^2 = 0$$
The Two Partials yield Two LINEAR Eqns in m & b
The two eqns can be solved EXACTLY for m & b
Remember, at this point m & b are UNKNOWN
The Book on pg 271 gives a good example

   x    y
   0    2
   5    6
  10   11

Least Squares cont
In This Case
$$J = (0m + b - 2)^2 + (5m + b - 6)^2 + (10m + b - 11)^2$$
Taking ∂J/∂m = 0, and ∂J/∂b = 0 yields
$$250m + 30b = 280$$
$$30m + 6b = 38$$
Solving these Eqns for m & b yields
• m = 9/10
• b = 11/6
This produces the best fit Line
$$y = \frac{9}{10} x + \frac{11}{6}$$

Goodness of Fit
The Distance from The Best-Fit Line to the Actual Data Point is called the RESIDUAL
For the Vertical Distance the Residual is just δy
$$\delta y_k = (m x_k + b) - y_k$$
If the Sum of the SQUARED Residuals were ZERO, then the Line would Fit Perfectly
Thus J, after finding m & b, is an Indication of the Goodness of Fit
$$J = \sum_{k=1}^{n} (\delta y_k)^2 = \sum_{k=1}^{n} (m x_k + b - y_k)^2$$
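The three-point example above can be checked numerically. The sketch below writes the two linear equations in matrix form and solves them with MATLAB's backslash operator, then compares with polyfit. The matrix layout and the names A, rhs, and mb are choices made for this sketch; the equations are the slide's 250m + 30b = 280 and 30m + 6b = 38, each divided by 2.

```matlab
% Data from the three-point example
x = [0 5 10];
y = [2 6 11];
n = numel(x);

% Setting dJ/dm = 0 and dJ/db = 0 gives two linear equations:
%   m*sum(x.^2) + b*sum(x) = sum(x.*y)
%   m*sum(x)    + b*n      = sum(y)
A   = [sum(x.^2), sum(x); sum(x), n];
rhs = [sum(x.*y); sum(y)];

mb = A \ rhs;      % solve the 2x2 linear system
m  = mb(1);        % slope
b  = mb(2);        % intercept

% Sum of squared vertical residuals at the best-fit m & b
J = sum((m*x + b - y).^2);

% polyfit should return the same coefficients as [m b]
p = polyfit(x, y, 1);
```

Here m and b come out as 9/10 and 11/6, matching the hand solution, and J evaluates to 1/6.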
Goodness of Fit cont
Now J is an indication of Fit, but we Might want to SCALE it relative to the MAGNITUDE of the Data
• For example consider
– DataSet1 with x&y values in the MILLIONS
– DataSet2 with x&y values in the single digits
• In this case we would expect J1 >> J2
To remove the effect of Absolute Magnitude, Scale J against the Data Set mean; e.g.
• mean1 = 730 000
• mean2 = 4.91

Goodness of Fit cont
The Mean-Scaling Quantity is the Actual-Data Relative to the Actual-Mean
$$S = \sum_{k=1}^{n} (y_k - \bar{y})^2$$
As before the Squaring Ensures that all Terms in the sum are POSITIVE
Finally the Scaled Fit-Metric, “r-squared”
$$r^2 = 1 - \frac{J}{S} = 1 - \frac{\sum_{k=1}^{n} (m x_k + b - y_k)^2}{\sum_{k=1}^{n} (y_k - \bar{y})^2}$$

r2 = Coeff of Determination
The r2 Value is Also Called the COEFFICIENT OF DETERMINATION
$$r^2 = 1 - \frac{J}{S}$$
• J ≡ Sum of Squared Residuals (errors)
– May be Zero or Positive
• S ≡ Data-to-Mean Scaling Factor
– Always Positive if >1 Data-Pt and data not “perfectly Horizontal”
If J = 0, then there is NO Distance Between the calculated Line and Data
Thus if J = 0, then r2 = 1; so r2 = 1 (or 100%) indicates a PERFECT FIT

Meaning of r2
The COEFFICIENT OF DETERMINATION
$$r^2 = 1 - \frac{\sum_{k=1}^{n} (m x_k + b - y_k)^2}{\sum_{k=1}^{n} (y_k - \bar{y})^2}$$
Has This Meaning
The coefficient of determination tells you what proportion of the variation between the data points is explained or accounted for by the best line fitted to the points. It indicates how close the points are to the line.
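To make the J, S, and r2 definitions concrete, the sketch below evaluates them for the three-point example from the least-squares slides (x = 0, 5, 10 and y = 2, 6, 11). The variable names are chosen for this sketch.

```matlab
% Three-point example data
x = [0 5 10];
y = [2 6 11];

% Best-fit line by least squares
p = polyfit(x, y, 1);

% J: sum of squared vertical residuals about the fitted line
J = sum((polyval(p, x) - y).^2);

% S: sum of squares about the mean of y
S = sum((y - mean(y)).^2);

% Coefficient of determination
r2 = 1 - J/S;
```

For this data J = 1/6 and S = 122/3, so r2 = 243/244, or about 99.6%: the line accounts for nearly all of the variation in y.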
Norm of Residuals
MATLAB uses the Norm of Residuals as a Measure of Goodness of Fit
The Norm of Residuals, NR, is simply the SqRt of J:
$$N_R = \sqrt{J} = \sqrt{\sum_{k=1}^{n} (y_L - y_k)^2} = \sqrt{\sum_{k=1}^{n} (m x_k + b - y_k)^2}$$
Thus r2 in Terms of NR:
$$r^2 = 1 - \frac{N_R^2}{S}$$
As the FIT Approaches Perfection J→0 so:
$$N_R \to 0 \quad \text{and} \quad r^2 \to 1$$

NR vs r2
$$r^2 = 1 - \frac{N_R^2}{S} = 1 - \frac{J}{S}$$
Notice that r2 is a RELATIVE Measure → it’s NORMALIZED to the WORST CASE Value of J which is S
• Thus r2 can be expressed as a PERCENTAGE withOUT Units: r2 → %
NR is an ABSOLUTE measure that Technically Requires that it be stated with SAME UNITS as the dependent variable, y
• NR → Sec, m, F, Teslas, V, etc.

Data Scaling
Data-Scaling is a SubTopic of DIMENSIONAL ANALYSIS (DA)
• DA is Covered in 3rd Yr ME/CE Courses
– I Learned it in a Fluid Mechanics Course
For Our Purposes we will cover only SCALING
Sometimes we Collect Data with a SMALL Variation Relative to the Magnitude of the MEAN
• Leads to a SENSITIVE Analysis; e.g. This Data is Noisy During Analysis

     x      y
  8974   7313
  8971   7309
  8969   7310

Data Scaling - Normalization
The Significance of ANY Data Set Can be Improved by Normalizing
Normalize ≡ Scale Data such that the Values run:
• 0 → 1
• 0% → 100%
Steps to Normalization
1. Find the MAX & MIN values in the Data Set; e.g.,
• zmax & zmin
2. Calculate the Data Range, RD
• RD = (zmax – zmin)
3. Calc the Individual Data Differences Relative to the MIN
• Δzk = zk - zmin

Data Scaling – Normalization cont
4. Finally, Scale the Δzk relative to RD
• Ψk = Δzk /RD
5. Scale the corresponding “y” values in the Same Manner to produce say, Φk
6. Plot Φk vs Ψk on x & y scales that Run from 0→1

Example – Do Frogs Croak More on WARM Nites?

  Temperature (ºF)   Croaks/Hr
        88.6           20.0
        71.6           16.0
        93.3           19.8
        84.3           18.4
        80.6           17.1
        75.2           15.5
        69.7           14.7
        82.0           17.1
        69.4           15.4
        83.3           16.2
        78.6           15.0
        82.6           17.2
        80.6           16.0
        83.5           17.0
        76.3           14.1

Normalization Example
Normalize
• T → Θ
• CPH → Ω
$$\Theta_k = \frac{T_k - T_{min}}{T_{max} - T_{min}}$$
$$\Omega_k = \frac{CPH_k - CPH_{min}}{CPH_{max} - CPH_{min}}$$
Now Compare Plots
• CPH vs T
• Ω vs Θ

Plots Compared
[Figure: two panels, both titled “Frog Croaking in the Evening - 2045hrs”. T-CPH Plot: Croaks Per Hour (CPH) vs Temp (°F). Ω-Θ Plot: Omega (Normalized CPH) vs Theta (Normalized Temp).]
• The Θ-Ω Plot Fully Utilizes Both Axes

Basic Fitting
Use MATLAB’s AutoMatic Fitting Utility to Find The Best Line for the Frog Croaking Data
[Figure: “Frog Croaking in the Evening - 2045hrs”, Omega vs Theta]
SEE: Demo_Frog_Croak_BasicFit_1110.m

All Done for Today
Croaking Frog
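The normalization steps and the NR-to-r2 relation can be tried together on the frog data. The sketch below is one possible way to do it, not necessarily how the referenced demo file works: it normalizes both variables, fits a line with polyfit, and reads the norm of residuals from the normr field of polyfit's second output. The name fitInfo is chosen for this sketch.

```matlab
% Frog data from the example table, sorted by temperature
T   = [69.4 69.7 71.6 75.2 76.3 78.6 80.6 80.6 82 82.6 83.3 83.5 84.3 88.6 93.3];
CPH = [15.4 14.7 16 15.5 14.1 15 17.1 16 17.1 17.2 16.2 17 18.4 20 19.8];

% Normalize each variable so that it runs 0 -> 1
Theta = (T - min(T)) / (max(T) - min(T));
Omega = (CPH - min(CPH)) / (max(CPH) - min(CPH));

% Linear fit; the second output is a structure that includes normr
[p, fitInfo] = polyfit(Theta, Omega, 1);
NR = fitInfo.normr;                    % norm of residuals, NR = sqrt(J)

% Sum of squares about the mean, then r2 in terms of NR
S  = sum((Omega - mean(Omega)).^2);
r2 = 1 - NR^2 / S;
```

The coefficients in p should round to 0.8737 and 0.0429, the values the deck reports for this fit. Because Θ and Ω are unitless, NR is unitless here as well; on the raw data it would carry the units of CPH.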
Engr/Math/Physics 25
Appendix
$$f(x) = 2x^3 - 7x^2 + 9x - 6$$
Bruce Mayer, PE
Licensed Electrical & Mechanical Engineer
BMayer@ChabotCollege.edu

Linear Regression Tutorial
Minimize Height Error δy
See File ENGR25_Linear_Regression_Tutorial_1309.pptx

Altitude of Right Triangle
[Figure: right triangle with legs x and y, hypotenuse L, and altitude h drawn to the hypotenuse]
The Area of RIGHT Triangle
$$A = \tfrac{1}{2} x y$$
The Area of an ARBITRARY Triangle
$$A = \tfrac{1}{2} L h$$
By Pythagoras for Rt-Triangle
$$L = \sqrt{x^2 + y^2}$$

Altitude of Right Triangle cont
Equating the A=½·Base·Hgt noting that Base1 = x & Base2 = L
$$\tfrac{1}{2} x y = \tfrac{1}{2} L h = \tfrac{1}{2} \sqrt{x^2 + y^2}\; h$$
Solving for h
$$h = \frac{x y}{L} = \frac{x y}{\sqrt{x^2 + y^2}}$$

Normalized Plot
>> T = [69.4, 69.7, 71.6, 75.2, 76.3, 78.6, 80.6, 80.6, 82, 82.6, 83.3, 83.5, 84.3, 88.6, 93.3];
>> CPH = [15.4, 14.7, 16, 15.5, 14.1, 15, 17.1, 16, 17.1, 17.2, 16.2, 17, 18.4, 20, 19.8];
>> % Max, Min, and Range of each variable
>> Tmax = max(T);
>> Tmin = min(T);
>> CPHmax = max(CPH);
>> CPHmin = min(CPH);
>> Rtemp = Tmax - Tmin;
>> Rcroak = CPHmax - CPHmin;
>> % Differences relative to the Min, then scale by the Range
>> DelT = T - Tmin;
>> DelCPH = CPH - CPHmin;
>> Theta = DelT/Rtemp;
>> Omega = DelCPH/Rcroak;
>> % Raw plot, then Normalized plot
>> plot(T, CPH), grid
>> plot(Theta,Omega), grid

Start Basic Fitting Interface 1
FIRST → Plot the DATA
Start Basic Fitting Interface 2
[Screenshot callouts:]
• Goodness of Fit; smaller is Better
• Expand Dialog Box

Start Basic Fitting Interface 3
Result
[Figure: “Frog Croaking in the Evening - 2045hrs”, Omega vs Theta, showing the Croak Data and the linear Fit y = 0.8737*x + 0.04286]
Chk by polyfit
>> p = polyfit(Theta,Omega,1)
p =
    0.8737    0.0429

Caveat

Greek Letters in Plots
[Figure: “Frog Croaking Frequency”, showing the Croak Data and the Linear Fit Ω = 0.8737Θ + 0.04286]

Plot “Discoverables”
% "Discoverable" Functions Displayed
% Bruce Mayer, PE • ENGR25 • 15Jul09
%
x = linspace(-5, 5);
ye = exp(x);
ypp = x.^2;
ypm = x.^(-2);
% plot all 3 on a single graph
plot(x,ye, x,ypp, x,ypm),grid,legend('ye', 'ypp', 'ypm')
disp('Showing MultiGraph Plot - Hit ANY KEY to continue')
pause
%
% Plot Side-by-Side
subplot(1,3,1)
plot(x,ye), grid
subplot(1,3,2)
plot(x,ypp), grid
subplot(1,3,3)
plot(x,ypm), grid
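Returning to the Greek-letter labels on the frog plot: the deck does not show how they were produced, so the sketch below is one assumed approach. It relies on MATLAB's default TeX interpreter for text, where \Theta and \Omega render as Greek capitals. The line plotted is the fit reported on the Basic Fitting result slide; the label strings and the text position are choices made for this sketch.

```matlab
% Fitted line reported by Basic Fitting / polyfit in the deck
Theta = linspace(0, 1, 11);
Omega = 0.8737*Theta + 0.04286;

% TeX-style markup; the default 'tex' Interpreter renders the Greek letters
xLabelText = '\Theta (Normalized Temp)';
yLabelText = '\Omega (Normalized CPH)';
eqnText    = '\Omega = 0.8737\Theta + 0.04286';

plot(Theta, Omega, '-'), grid
xlabel(xLabelText), ylabel(yLabelText)
title('Frog Croaking Frequency')
text(0.1, 0.8, eqnText)     % equation annotation placed inside the axes
```

If the Interpreter property of a label has been changed to 'none', the backslash sequences appear literally instead of as Greek letters.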