Chp6 MATLAB Regression Engr/Math/Physics 25 Bruce Mayer, PE

Engr/Math/Physics 25
Chp6 MATLAB
Regression
Bruce Mayer, PE
Registered Electrical & Mechanical Engineer
BMayer@ChabotCollege.edu
Engineering/Math/Physics 25: Computational Methods
1
Bruce Mayer, PE
BMayer@ChabotCollege.edu • ENGR-25_Plot_Model-4.ppt
Learning Goals cont
 Use Regression Analysis as quantified
by the “Least Squares” Method
• Calculate
– Sum-of-Squared Errors (SSE or J)
 The Errors (before Squaring) are Called “Residuals”
– “Best Fit” CoEfficients (𝑚0 and 𝑏0 )
– Sum-of-Squares About the Mean (SSM or S)
– CoEfficient of Determination (r²)
• Scale Data if Needed
– Creates more Meaningful Spacing
Learning Goals cont
 Build Math Models for Physical Data
using “nth” Degree Polynomials
 Use MATLAB’s “Basic Fitting” Utility to
find Math models for Plotted Data
Scatter on Plots in XY-Plane
 A scatter plot
usually shows how
an EXPLANATORY,
or independent,
variable affects a
RESPONSE, or
Dependent Variable
 Shown Below is a
Conceptual Scatter
plot that could
Relate the
RESPONSE to
some
EXCITATION
 Sometimes the
SHAPE of the
scatter reveals a
relationship
Linear Fit by Guessing
 The previous plot
looks sort of Linear
 We could use a
Ruler to draw a
y = mx+b
line thru the data
 But
• which Line is
BETTER?
• and WHY?
Least Squares Curve Fitting
 In a Previous
Example
polyfit(x,y,1)
returned the Values
of m & b
 polyfit, as do most
other curve fitters,
uses the “Least
Squares” Criterion
• How does polyfit
Make these Calcs?
• How Good is the
fitted Line Compared
to the Data?
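The slides do not show polyfit's internals, so the following is only a sketch of what "least squares" means in practice. It fits a line with polyfit and then solves the same problem as an overdetermined linear system with the backslash operator. The data values and the variable names (A, mb) are assumptions made for illustration.

```matlab
% Illustrative data (assumed for this sketch, not taken from the slides)
x = [1 2 3 4 5];
y = [2.1 3.9 6.2 7.8 10.1];

% Straight-line fit with polyfit: p(1) is the slope m, p(2) the intercept b
p = polyfit(x, y, 1);
m = p(1);
b = p(2);

% Same least-squares problem posed as the overdetermined system
% A*[m; b] ~ y and solved with the backslash operator
A  = [x(:), ones(numel(x), 1)];
mb = A \ y(:);

fprintf('polyfit:   m = %.4f, b = %.4f\n', m, b);
fprintf('backslash: m = %.4f, b = %.4f\n', mb(1), mb(2));
```

Both routes should print the same slope and intercept, which suggests that polyfit(x,y,1) is solving a least-squares problem of this form.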
Least Squares
 For a data point (x_k, y_k) and a candidate line y = mx + b, the line gives two guesses:
• Best Guess-y: $y_L = m x_k + b$
• Best Guess-x: $x_L = \dfrac{y_k - b}{m}$
 To make a Good Fit,
MINIMIZE the
|GUESS − data|
distance by one of
• Vertical distance: $\delta y = (m x_k + b) - y_k$
• Horizontal distance: $\delta x = \dfrac{y_k - b}{m} - x_k$
• Perpendicular distance: $h = \dfrac{\delta x \, \delta y}{\sqrt{(\delta x)^2 + (\delta y)^2}}$
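As a rough numerical illustration of the three candidate distances, the sketch below evaluates δy, δx, and h for a single point. The line and the data point are borrowed from the worked example later in the deck; the variable names and the cross-check formula are assumptions added here.

```matlab
% Assumed inputs: line and data point borrowed from the deck's worked example
m  = 9/10;   b  = 11/6;    % candidate line y = m*x + b
xk = 5;      yk = 6;       % one data point (x_k, y_k)

dy = (m*xk + b) - yk;      % vertical gap:   Best Guess-y minus data
dx = (yk - b)/m - xk;      % horizontal gap: Best Guess-x minus data

% Perpendicular distance as the altitude of the right triangle with legs dx, dy
h = abs(dx*dy) / sqrt(dx^2 + dy^2);

% Cross-check with the standard point-to-line distance formula
hCheck = abs(m*xk - yk + b) / sqrt(m^2 + 1);

fprintf('dy = %.4f, dx = %.4f, h = %.4f (check: %.4f)\n', dy, dx, h, hCheck);
```

The absolute value is used because dx and dy can have opposite signs, while h is a length.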
Least Squares cont
 MATLAB polyfit
Minimizes the
VERTICAL
distances; i.e.:
$$J = \sum_{k=1}^{n} (\delta y_k)^2 = \sum_{k=1}^{n} (m x_k + b - y_k)^2$$
 Note that The
Function J contains
two Variables; m & b
 Recall from MTH1
that to MINIMIZE a
Function set the
1st (partial)
Derivative(s) equal
to Zero
Least Squares cont
 To Minimize J, take
$$\frac{\partial J}{\partial m} = \frac{\partial}{\partial m}\left[\sum_{k=1}^{n} (m x_k + b - y_k)^2\right] = 0$$
$$\frac{\partial J}{\partial b} = \frac{\partial}{\partial b}\left[\sum_{k=1}^{n} (m x_k + b - y_k)^2\right] = 0$$
 The Two Partials
yield Two LINEAR
Eqns in m & b
 The two eqns can
be solved EXACTLY
for m & b
 Remember, at this
point m & b are
UNKNOWN
 the Book
on pg 271
gives a
good
example

x     y
0     2
5     6
10    11
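Carrying out the two partial derivatives explicitly gives the pair of linear equations mentioned above. This is the standard chain-rule expansion, written with the same symbols as the slide; it is a sketch of the intermediate step, not a quotation from the book.

```latex
\frac{\partial J}{\partial m}
  = 2\sum_{k=1}^{n} x_k\,(m x_k + b - y_k) = 0
  \;\;\Longrightarrow\;\;
  m\sum_{k=1}^{n} x_k^2 + b\sum_{k=1}^{n} x_k = \sum_{k=1}^{n} x_k y_k

\frac{\partial J}{\partial b}
  = 2\sum_{k=1}^{n} (m x_k + b - y_k) = 0
  \;\;\Longrightarrow\;\;
  m\sum_{k=1}^{n} x_k + n\,b = \sum_{k=1}^{n} y_k
```

For the three tabulated points the sums are Σx² = 125, Σx = 15, Σxy = 140, Σy = 19, and n = 3, so the equations become 125m + 15b = 140 and 15m + 3b = 19.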
Least Squares cont

x     y
0     2
5     6
10    11

 In This Case
$$J = (0m + b - 2)^2 + (5m + b - 6)^2 + (10m + b - 11)^2$$
 Taking ∂J/∂m = 0,
and ∂J/∂b = 0 yields
$$250m + 30b = 280$$
$$30m + 6b = 38$$
 Solving these
Eqns for m & b
yields
• m = 9/10
• b = 11/6
 This produces the
best fit Line
$$y = \frac{9}{10}x + \frac{11}{6}$$
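The two linear equations can be checked numerically. The sketch below solves them with the backslash operator and, as an independent check, fits the same three points with polyfit. The variable names are illustrative choices.

```matlab
% Coefficient matrix and right-hand side of the two linear equations
A   = [250 30;
        30  6];
rhs = [280; 38];
mb  = A \ rhs;             % mb(1) = m, mb(2) = b

% Independent check: fit the same three data points with polyfit
x = [0 5 10];
y = [2 6 11];
p = polyfit(x, y, 1);      % p(1) = slope, p(2) = intercept

fprintf('Linear solve: m = %.4f, b = %.4f\n', mb(1), mb(2));
fprintf('polyfit:      m = %.4f, b = %.4f\n', p(1), p(2));
```

Both lines of output should agree with m = 9/10 and b = 11/6.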
Goodness of Fit
 The Distance from
The Best-Fit Line to
the Actual Data
Point is called the
RESIDUAL
 For the Vertical
Distance the
Residual is just δy
 If the Sum of the
SQUARED Residuals were
ZERO, then the Line
would Fit Perfectly
 Thus J, after
finding m & b, is an
Indication of the
Goodness
of Fit
$$J = \sum_{k=1}^{n} (\delta y_k)^2, \qquad \delta y_k = (m x_k + b) - y_k$$
$$J = \sum_{k=1}^{n} (m x_k + b - y_k)^2$$
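A short sketch of the residual and J calculation for the three-point example, assuming the best-fit values m = 9/10 and b = 11/6 found earlier. The variable name res is an illustrative choice.

```matlab
x = [0 5 10];
y = [2 6 11];
m = 9/10;
b = 11/6;

res = (m*x + b) - y;       % vertical residuals, one per data point
J   = sum(res.^2);         % sum of squared residuals

fprintf('residuals: %s\n', mat2str(res, 4));
fprintf('sum of residuals = %.2e, J = %.4f\n', sum(res), J);
```

The plain (unsquared) residuals of a least-squares line with an intercept cancel to roughly zero even when the fit is imperfect, which is one reason J sums the squares.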
Goodness of Fit cont
 Now J is an
indication of Fit, but
we Might want to
SCALE it relative to
the MAGNITUDE of
the Data
• For example
consider
– DataSet1 with x&y
values in the
MILLIONS
– DataSet2 with x&y
values in the single
digits
• In this case we
would expect
J1 >> J2
 To remove the effect
of Absolute
Magnitude, Scale J
against the Data Set
mean; e.g.,
• mean1 = 730 000
• mean2 = 4.91
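To see the magnitude effect numerically, the sketch below fits the same three-point data set twice: once as-is and once multiplied by an assumed factor of one million. It then compares J with J divided by the Sum-of-Squares About the Mean (S). The scale factor and the helper names sse and ssm are assumptions for illustration.

```matlab
% Small data set (values in the single digits and low teens)
x2 = [0 5 10];
y2 = [2 6 11];

% Assumed scale factor that pushes the same data into the millions
k  = 1e6;
x1 = k*x2;
y1 = k*y2;

% Anonymous helpers (names are illustrative)
sse = @(x, y) sum((polyval(polyfit(x, y, 1), x) - y).^2);   % J
ssm = @(y) sum((y - mean(y)).^2);                           % S

J1 = sse(x1, y1);    J2 = sse(x2, y2);
S1 = ssm(y1);        S2 = ssm(y2);

fprintf('J1 = %.3g,  J2 = %.3g\n', J1, J2);
fprintf('J1/S1 = %.6f,  J2/S2 = %.6f\n', J1/S1, J2/S2);
```

J1 comes out many orders of magnitude larger than J2 even though the two fits are equally good, while the ratio J/S is essentially the same for both.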
Goodness of Fit cont
 The Mean-Scaling
Quantity is the
Actual-Data Relative
to the Actual-Mean
$$S = \sum_{k=1}^{n} (y_k - \bar{y})^2$$
 As before the
Squaring Ensures
that all Terms in the
sum are POSITIVE
 Finally the
Scaled Fit-Metric,
“r-squared”
$$r^2 = 1 - \frac{J}{S}$$
$$r^2 = 1 - \frac{\sum_{k=1}^{n} (m x_k + b - y_k)^2}{\sum_{k=1}^{n} (y_k - \bar{y})^2}$$
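A minimal sketch of the S and r² calculation, again using the three-point example and its best-fit line (m = 9/10, b = 11/6). Variable names are illustrative.

```matlab
x = [0 5 10];
y = [2 6 11];
m = 9/10;
b = 11/6;

J  = sum((m*x + b - y).^2);      % sum of squared residuals
S  = sum((y - mean(y)).^2);      % sum of squares about the mean
r2 = 1 - J/S;                    % coefficient of determination

fprintf('J = %.4f, S = %.4f, r^2 = %.4f\n', J, S, r2);
```

For this data J = 1/6 and S = 122/3, so r² = 1 − 1/244, or about 0.9959.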
r² = Coeff of Determination
 The r² Value is Also Called the
COEFFICIENT OF DETERMINATION
$$r^2 = 1 - \frac{J}{S}$$
• J ≡ Sum of Squared Residuals (errors)
– May be Zero or Positive
• S ≡ Data-to-Mean Scaling Factor
– Always Positive if >1 Data-Pt and
data not “perfectly Horizontal”
 If J = 0, then there is NO Distance Between
the calculated Line and Data
 Thus if J = 0, then r² = 1; so r² = 1 (or 100%)
indicates a PERFECT FIT
Meaning of r²
 The COEFFICIENT OF DETERMINATION
$$r^2 = 1 - \frac{\sum_{k=1}^{n} (m x_k + b - y_k)^2}{\sum_{k=1}^{n} (y_k - \bar{y})^2}$$
 Has This Meaning
The coefficient of determination tells you what
proportion of the variation between the data
points is explained or accounted for by the
best line fitted to the points. It indicates how
close the points are to the line.
Norm of Residuals
 MATLAB uses the
Norm of Residuals
as a Measure of
Goodness of Fit
 The Norm of
Residuals, NR, is
simply the SqRt of J:
$$N_R = \sqrt{J} = \sqrt{\sum_{k=1}^{n} (y_L - y_k)^2} = \sqrt{\sum_{k=1}^{n} (m x_k + b - y_k)^2}$$
 Thus r² in Terms of
NR:
$$r^2 = 1 - \frac{N_R^2}{S}$$
 As a Measure of
Goodness of Fit, as
the FIT Approaches
Perfection J→0 so:
$$N_R \to 0, \qquad r^2 \to 1$$
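polyfit can return a second output structure whose normr field holds the norm of the residuals. The sketch below reads that value for the three-point example and converts it to r². The name fitInfo is an illustrative choice.

```matlab
x = [0 5 10];
y = [2 6 11];

% Second output is a structure with details of the fit
[p, fitInfo] = polyfit(x, y, 1);

NR = fitInfo.normr;              % norm of the residuals, sqrt(J)
S  = sum((y - mean(y)).^2);      % sum of squares about the mean
r2 = 1 - NR^2/S;

fprintf('NR = %.4f, NR^2 = %.4f, r^2 = %.4f\n', NR, NR^2, r2);
```

Here NR² should reproduce J = 1/6 from the hand calculation.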
NR vs r²
$$r^2 = 1 - \frac{N_R^2}{S} = 1 - \frac{J}{S}$$
 Notice that r² is a RELATIVE Measure
→ it’s NORMALIZED to the WORST
CASE Value of J which is S
• Thus r² can be expressed as a
PERCENTAGE withOUT Units: r² → %
 NR is an ABSOLUTE measure that
Technically Requires that it be stated
with SAME UNITS as the dependent
variable, y: NR → Sec, m, F, Teslas, V, etc.
Data Scaling
 Data-Scaling is a
SubTopic of
DIMENSIONAL
ANALYSIS (DA)
• DA is Covered in 3rd
Yr ME/CE Courses
– I Learned it in a Fluid
Mechanics Course
 For Our Purposes
we will cover only
SCALING
 Sometimes we
Collect Data with a
SMALL Variation
Relative to the
Magnitude of the
MEAN
• Leads to a
SENSITIVE
Analysis; e.g.
This Data is
Noisy During
Analysis

x       y
8974    7313
8971    7309
8969    7310
Data Scaling - Normalization
 The Significance of
ANY Data Set Can
be Improved by
Normalizing
 Normalize ≡ Scale
Data such that the
Values run:
• 0 → 1
• 0% → 100%
 Steps to
Normalization
1. Find the MAX &
MIN values in the
Data Set; e.g.,
• zmax & zmin
2. Calculate the Data
Range, RD
• RD = (zmax – zmin)
3. Calc the Individual
Data Differences
Relative to the MIN
• Δzk = zk - zmin
Data Scaling – Normalization cont
4. Finally, Scale the
Δzk relative to RD
•
Ψk = Δzk /RD
5. Scale the
corresponding “y”
values in the Same
Manner to produce
say, Φk
6. Plot Φk vs Ψk on
x & y scales that
Run from 0→1
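The normalization steps can be sketched in a few lines. The example below applies them to the small-variation x and y values shown on the Data Scaling slide. The variable names (RDz, Psi, Phi, and so on) are illustrative, and the sketch assumes the data range is not zero.

```matlab
z = [8974 8971 8969];      % "x" values from the small-variation example
w = [7313 7309 7310];      % corresponding "y" values

% Steps 1-2: extremes and data range
zmin = min(z);   zmax = max(z);   RDz = zmax - zmin;
wmin = min(w);   wmax = max(w);   RDw = wmax - wmin;

% Steps 3-4: differences relative to the minimum, scaled by the range
Psi = (z - zmin) / RDz;    % normalized x, runs 0 -> 1
Phi = (w - wmin) / RDw;    % normalized y, runs 0 -> 1

% Step 6: plot on axes that run from 0 to 1
plot(Psi, Phi, 'o'), grid
axis([0 1 0 1])
xlabel('Psi (normalized x)'), ylabel('Phi (normalized y)')
```

If every value in a data set were identical, the range would be zero and the division would fail, so a real script would want to guard against that case.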
 Example – Do Frogs Croak More on WARM Nites?

Temperature (ºF)    Croaks/Hr
88.6                20.0
71.6                16.0
93.3                19.8
84.3                18.4
80.6                17.1
75.2                15.5
69.7                14.7
82.0                17.1
69.4                15.4
83.3                16.2
78.6                15.0
82.6                17.2
80.6                16.0
83.5                17.0
76.3                14.1
Normalization Example
 Normalize
• T → Θ
• CPH → Ω
$$\Theta_k = \frac{T_k - T_{min}}{T_{max} - T_{min}}$$
$$\Omega_k = \frac{CPH_k - CPH_{min}}{CPH_{max} - CPH_{min}}$$
 Now Compare Plots
• CPH vs T
• Ω vs Θ
Plots Compared
 T-CPH Plot
[Figure: “Frog Croaking in the Evening - 2045hrs”; Croaks Per Hour (CPH) vs Temp (°F)]
 Ω-Θ Plot
[Figure: “Frog Croaking in the Evening - 2045hrs”; Omega (Normalized CPH) vs Theta (Normalized Temp)]
• The Θ-Ω Plot Fully Utilizes Both Axes
Basic Fitting
 Use MATLAB’s AutoMatic Fitting Utility to
Find The Best Line for the Frog Croaking Data
[Figure: “Frog Croaking in the Evening - 2045hrs”; Omega vs Theta]
SEE: Demo_Frog_Croak_BasicFit_1110.m
All Done for Today
Croaking
Frog
Engr/Math/Physics 25
Appendix
$$f(x) = 2x^3 - 7x^2 + 9x - 6$$
Bruce Mayer, PE
Licensed Electrical & Mechanical Engineer
BMayer@ChabotCollege.edu
Linear Regression Tutorial
 Minimize Height
Error δy
 See File ENGR25_Linear_Regression_Tutorial_1309.pptx
Altitude of Right Triangle
 The Area of RIGHT Triangle
$$A = \tfrac{1}{2}\,\delta x \cdot \delta y$$
 The Area of an ARBITRARY
Triangle
$$A = \tfrac{1}{2}\,L \cdot h$$
 By Pythagoras for
Rt-Triangle
$$L = \sqrt{(\delta x)^2 + (\delta y)^2}$$
[Figure: right triangle with legs δx and δy, hypotenuse L, and altitude h drawn to the hypotenuse]
Altitude of Right Triangle cont
 Equating the A=½·Base·Hgt noting that
Base1 = δx & Base2 = L
$$\tfrac{1}{2}\,\delta x \cdot \delta y = \tfrac{1}{2}\,h\,\sqrt{(\delta x)^2 + (\delta y)^2}$$
 Solving for h
$$h = \frac{\delta x \, \delta y}{\sqrt{(\delta x)^2 + (\delta y)^2}}$$
Normalized Plot
>> % Frog data, sorted by temperature; "..." continues a row onto the next line
>> T = [69.4, 69.7, 71.6, 75.2, 76.3, 78.6, 80.6, ...
        80.6, 82, 82.6, 83.3, 83.5, 84.3, 88.6, 93.3];
>> CPH = [15.4, 14.7, 16, 15.5, 14.1, 15, 17.1, ...
          16, 17.1, 17.2, 16.2, 17, 18.4, 20, 19.8];
>> % Steps 1-2: extremes and data ranges
>> Tmax = max(T);
>> Tmin = min(T);
>> CPHmax = max(CPH);
>> CPHmin = min(CPH);
>> Rtemp = Tmax - Tmin;
>> Rcroak = CPHmax - CPHmin;
>> % Steps 3-4: differences from the minimum, scaled by the range
>> DelT = T - Tmin;
>> DelCPH = CPH - CPHmin;
>> Theta = DelT/Rtemp;
>> Omega = DelCPH/Rcroak;
>> plot(T, CPH), grid
>> figure   % open a second window so the first plot is not overwritten
>> plot(Theta,Omega), grid
Start Basic Fitting Interface 1
 FIRST →
Plot the
DATA
Start Basic Fitting Interface 2
Goodness of
Fit; smaller is
Better
Expand Dialog Box
Start Basic Fitting Interface 3
 Result
[Figure: “Frog Croaking in the Evening - 2045hrs”; Omega vs Theta; legend: Croak Data, linear Fit; fitted line y = 0.8737*x + 0.04286]
 Chk by
polyfit
>> p = polyfit(Theta,Omega,1)
p =
    0.8737    0.0429
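The polyfit check can be rerun as a self-contained script. The sketch below rebuilds Theta and Omega from the frog data, refits the line, and also computes r² from the norm of residuals. The r² step and the name fitInfo are additions for illustration and are not part of the original slide.

```matlab
% Frog data from the normalization example
T   = [69.4, 69.7, 71.6, 75.2, 76.3, 78.6, 80.6, ...
       80.6, 82, 82.6, 83.3, 83.5, 84.3, 88.6, 93.3];
CPH = [15.4, 14.7, 16, 15.5, 14.1, 15, 17.1, ...
       16, 17.1, 17.2, 16.2, 17, 18.4, 20, 19.8];

% Normalize both variables to the range 0 -> 1
Theta = (T - min(T)) / (max(T) - min(T));
Omega = (CPH - min(CPH)) / (max(CPH) - min(CPH));

% Linear fit; the second output carries the norm of residuals
[p, fitInfo] = polyfit(Theta, Omega, 1);

% Coefficient of determination from NR and S
S  = sum((Omega - mean(Omega)).^2);
r2 = 1 - fitInfo.normr^2 / S;

fprintf('slope = %.4f, intercept = %.4f\n', p(1), p(2));
fprintf('norm of residuals = %.4f, r^2 = %.4f\n', fitInfo.normr, r2);
```

The printed slope and intercept should match the Basic Fitting result of 0.8737 and 0.0429 to the displayed precision.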
Caveat
Greek Letters in Plots
[Figure: “Frog Croaking Frequency”; Ω vs Θ; legend: Croak Data, Linear Fit; annotation Ω = 0.8737Θ + 0.04286]
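One way to get Greek letters into a plot is MATLAB's default TeX markup in text strings, where a backslash followed by the letter name renders as the symbol. The sketch below is an assumption about how such a figure could be produced, not the author's script. The Theta values are placeholders, and only the fitted line is drawn.

```matlab
Theta = linspace(0, 1, 11);            % placeholder abscissa values (assumed)
Omega = 0.8737*Theta + 0.04286;        % fitted line from the Basic Fitting result

plot(Theta, Omega, '-'), grid
xlabel('\Theta')                       % TeX markup renders as the Greek letter
ylabel('\Omega')
title('Frog Croaking Frequency')
text(0.1, 0.8, '\Omega = 0.8737\Theta + 0.04286')   % annotation at (0.1, 0.8)
legend('Linear Fit', 'Location', 'southeast')
```

The same markup works in title, legend, and text strings as long as the text interpreter is left at its default 'tex' setting.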
Plot “Discoverables”
% "Discoverable" Functions Displayed
% Bruce Mayer, PE • ENGR25 • 15Jul09
%
x = linspace(-5, 5);
ye = exp(x);
ypp = x.^2;
ypm = x.^(-2);
% plot all 3 on a single graph
plot(x,ye, x,ypp, x,ypm),grid,legend('ye', 'ypp', 'ypm')
disp('Showing MultiGraph Plot - Hit ANY KEY to continue')
pause
%
% Plot Side-by-Side
subplot(1,3,1)
plot(x,ye), grid
subplot(1,3,2)
plot(x,ypp), grid
subplot(1,3,3)
plot(x,ypm), grid