EGR 105 Foundations of Engineering I Excel Part III Curve-Fitting, Regression Section 8 Fall 2013 Excel Part II Topics • • • • • Data Analysis Concepts Regression Methods Example Function Discovery Regression Tools in Excel Homework Assignment Analysis of x-y Data dependent • Independent versus dependent variables y = f(x) y independent x Common Types of Plots Example: Y=3X2 350 300 Cartesian 250 1000 log-log : log y-log x Y 200 150 100 logY 100 50 10 0 0 2 4 6 8 10 X 350 1 Note! 300 1 logX Semi-log : log x 250 10 Y 200 y = 3x2 150 log(y) = log(3) + 2*log(x) 100 Straight Line on log-log Plot! 50 0 1 10 logX What About Other Values? • Often have a limited set of data • What if you want to know… – Prediction of what occurred before data – Prediction of what will occur after data • Many real applications of this… – Discuss this in a little while Finding Other Values • Interpolation – Data between known points – Need assume variation between points – May be easier to do for closer points data points Finding Other Values • Extrapolation (requires assumptions) – Data beyond the measured range – Forecasting (looking ahead) – Hindcasting (looking behind) • Examples (apply equations or models) – Sales – Ocean waves – Stock market – The weather – etc. Stock Market Forecasting – can require complex model(s) Finding Other Values • Regression – curve fitting of data – Simple representation of data – Understand workings of system • Elements of system behavior are important – How do they affect the overall system? – How important is each one? • Can represent these in model(s) – Useful for prediction Excel Part III Topics • • • • • Data Analysis Concepts Regression Methods Example Function Discovery Regression Tools in Excel Homework Assignment Something Must Be In There…Somewhere…. Curve-Fitting - Regression • Useful for noisy or uncertain data – n pairs of data (xi , yi) • Choose a functional form y = f(x) • polynomial • exponential • etc. and evaluate parameters for a “close” fit What Does “Close” Mean? • Want a consistent rule to determine • Common is the least squares fit (SSE): y (x3,y3) e3 (x1,y1) (x2,y2) (x4,y4) ei = yi – f(xi), i =1,2,…,n n SSE = å ei i =1 2 x SSE Quality of the Fit: R = 1 SST 2 n SSE = å ei 2 y i =1 n SST = å ( yi - y ) 2 i =1 Notes: y is the average y value 0 R2 1 -closer to 1 is a “better” fit y=y x Coefficient of Determination SSE R = 1SST 2 • R2 = 1.0 – All of the data can be explained by the fit • R2 = 0.0 – None of the data can be explained by the curve fit (Note: R2 = is sometimes reported as a %) Caution!!! • A good fit statistically may not be the correct fit • Must always consider the physical phenomenon you are attempting to “model” • Does the fit to the data describe reality? Linear Regression • Functional choice y = m x + b slope intercept • Squared errors sum to SSE = å (yi - m xi - b ) 2 i • Set m and b derivatives to zero ¶ SSE =0 ¶m ¶ SSE =0 ¶b Further Regression Possibilities: • Could force intercept: y=mx+c • Other two parameter ( a and b ) fits: – Logarithmic: – Exponential: – Power function: y = a ln x + b y = a e bx y=axb • Other polynomials with more parameters: – Parabola: – Higher order: y = a x2 + bx + c y = a xk + bxk-1 + … Excel Part III Topics • • • • • Data Analysis Concepts Regression Methods Example Function Discovery Regression Tools in Excel Homework Assignment Example Function Discovery (How to find the “best” relationship) • Look for straight lines on log axes: – linear on semilog x y = a ln x – linear on semilog y y = a e bx – linear on log log y = a x b • No rule for 2nd or higher order polynomial fits Excel Part III Topics • • • • • Data Analysis Concepts Regression Methods Example Function Discovery Regression Tools in Excel Homework Assignment Excel’s Regression Tool • Highlight your chart • On chart menu, select “add trendline” • Choose type: – Linear, log, polynomial, exponential, power • Set options: – – – – Forecast = extrapolation Select y intercept (use zero only if it applies) Show R2 value on chart Show equation of fit on chart Linear & Quartic Curve Fit Example 7 y = 0.996x R² = 0.986 6 5 Y 4 3 2 1 X 0 0 1 2 3 4 5 6 7 7 6 y = 0.037x4 - 0.523x3 + 2.518x2 - 3.878x + 3.133 R² = 0.997 5 4 Y Better fit but does it make sense with expected behavior? 3 2 1 0 0 1 2 3 4 5 6 7 X Example Applications • Look at some curve fitting examples – Examine previous EGR 105 projects • Pendulum • Elastic bungee cord Previous EGR 105 Project • Discover how a pendulum’s timing is impacted by the – length of the string? – mass of the bob? 1. Take experimental data • Use string, weights, rulers, and watches 2. Analyze data and “discover” relationships Experimental Setup: Length Mass One Team’s Results length (inches) time (sec) 121.5 114.0 105.0 97.0 85.0 79.0 67.5 58.5 50.0 43.0 13.0 13.73 3.5 3.4 3.3 3.1 2.9 2.8 2.6 2.4 2.3 2.1 1.2 mass (grams) 27.47 41.20 54.94 3.5 3.5 3.5 3.4 3.4 3.4 3.3 3.3 3.3 3.1 3.1 3.1 2.9 2.9 2.9 2.8 2.8 2.8 2.6 2.6 2.6 2.4 2.4 2.4 2.3 2.3 2.3 2.1 2.1 2.1 1.2 1.2 1.2 Mass appears to have no impact, but length does To determine the effect of length, first plot the data 4.0 time (seconds) 3.5 3.0 2.5 2.0 1.5 1.0 0.5 0.0 0.0 20.0 40.0 60.0 80.0 length (inches) 100.0 120.0 140.0 Try a linear fit 4.0 time (seconds) 3.5 3.0 2.5 2.0 1.5 y = 0.02x + 1.1692 2 R = 0.9776 1.0 0.5 0.0 0.0 20.0 40.0 60.0 80.0 length (inches) 100.0 120.0 140.0 Force a zero intercept (why?) time (seconds) 4.5 4.0 3.5 3.0 2.5 2.0 1.5 y = 0.0332x 2 R = 0.4832 1.0 0.5 0.0 0.0 20.0 40.0 60.0 80.0 length (inches) 100.0 120.0 140.0 Try a quadratic polynomial fit 4.0 time (seconds) 3.5 3.0 2.5 2.0 1.5 2 y = -0.0002x + 0.0551x 2 R = 0.9117 1.0 0.5 0.0 0.0 20.0 40.0 60.0 80.0 length (inches) 100.0 120.0 140.0 Try a logarithmic fit 4.0 time (seconds) 3.5 3.0 2.5 2.0 1.5 y = 1.0349Ln(x) - 1.6506 2 R = 0.9609 1.0 0.5 0.0 0.0 20.0 40.0 60.0 80.0 length (inches) 100.0 120.0 140.0 Try a power function fit 4.0 time (seconds) 3.5 3.0 2.5 2.0 1.5 0.4774 y = 0.3504x 2 R = 0.9989 1.0 0.5 0.0 0.0 20.0 40.0 60.0 80.0 length (inches) 100.0 120.0 140.0 On log-log axes, nice straight line time (seconds) 10.0 b 1.0 1.0 10.0 100.0 1000.0 length (inches) Power Law Relation: t = al b Þ log(t ) = log(a) + b log(l ) Question? • Which one was the best fit here? • Explain why One More Example • Another EGR 105 project • Elastic bungee cord models – Stretching of an elastic cord • Here we have two models to consider – Linear elastic (Hooke’s Law) – Non-linear elastic (Cubic model) Elastic Bungee Cord Models Determined by Curve Fitting the Data • Linear Model (Hooke’s Law): F ( s) = ks • Nonlinear Cubic Model: F ( s) = k1s + k2 s 2 + k3 s 3 Force (lb) eaxial = Collected Data Cubic Fit Better and it Makes Sense with the Physics Linear Fit Elongation l - lo = = axial strain Original Length lo Homework Assignment #5 • See Handout (Excel Part 3) – Analysis of stress-strain data – Plotting of data – Determine equation for best fit to data • Regression analysis – Linear elastic model – Cubic polynomial model • Discussion of results Remember to email submit using EGR105_5 in Subject Line!