Course : S0262 - Analisis Numerik
Year : 2010

Curve Fitting
Lecture 10
Material Outline
• Curve Fitting
  – Least square fit
  – Quantification of error
  – Coefficient of determination
  – Coefficient of correlation
CURVE FITTING
• In curve fitting, n pairs of numbers (x1, y1), (x2, y2), …, (xn, yn) are given. These pairs typically come from observations or field measurements of a certain quantity.
• The objective: to find a function that inter-relates the pairs of numbers, f(xj) ≈ yj. In other words, if the function is plotted, the resulting graph will best fit the pairs of numbers.
CURVE FITTING
• One method that can be used to find the fitting function for n pairs of observed values is to minimize the discrepancy between the n pairs of observations and the curve.
• Minimizing this discrepancy is known as Least Squares Regression.
• Least Squares Regression includes:
  • Linear Regression
  • Polynomial Regression
LINEAR REGRESSION
• In linear regression, n pairs of observations or field measurements are fitted to a straight line. A straight line can be written as:

  $$y = a_0 + a_1 x + E$$

  in which
  a0 : intercept,
  a1 : slope (gradient),
  E  : error (discrepancy) between a data point and the chosen linear model.
• The above equation can be rewritten as:

  $$E = y - a_0 - a_1 x$$

  From this equation, it can be seen that the error E is the difference between the true value y and the approximate value a0 + a1 x.
LINEAR REGRESSION

  $$E = y - a_0 - a_1 x$$

There are several criteria to find the best fit:
1. Minimize the sum of the residuals (errors), $\sum E_i$
2. Minimize the sum of the absolute values of the residuals, $\sum |E_i|$
3. Minimize the sum of the squared residuals, $\sum E_i^2$

Of these three criteria, the best is to minimize the sum of the squared residuals. One advantage of this criterion is that the resulting line is unique for each set of n pairs of data (see the sketch below). This approach is known as the Least Squares Fit.
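To see why criterion 1 fails to single out one line, note that any line passing through the centroid (x̄, ȳ) makes the signed residuals cancel exactly. A minimal Python sketch of this; the data here is invented purely for illustration:

```python
# Criterion 1 (sum of signed residuals) does not pick a unique line:
# every line through the centroid (mean_x, mean_y) gives a zero sum.
xs = [1.0, 2.0, 3.0, 4.0]          # illustrative data (not from the slides)
ys = [1.1, 1.9, 3.2, 3.8]

mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)

for a1 in (0.0, 0.5, 1.0, 2.0):    # several different slopes
    a0 = mean_y - a1 * mean_x      # force the line through the centroid
    residual_sum = sum(y - a0 - a1 * x for x, y in zip(xs, ys))
    print(f"a1={a1:4.1f}  a0={a0:6.2f}  sum of residuals = {residual_sum:+.2e}")
# All residual sums are (numerically) zero, so criterion 1 cannot
# distinguish between these lines; the squared-residual criterion can.
```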
Least Square Fit
• The coefficients a0 and a1 in the previous equation are determined by minimizing the sum of squared errors (residuals):

  $$S_r = \sum_{i=1}^{n} E_i^2 = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2$$

• To minimize, set the partial derivatives to zero (calculus):

  $$\frac{\partial S_r}{\partial a_0} = -2 \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i) = 0$$

  $$\frac{\partial S_r}{\partial a_1} = -2 \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)\, x_i = 0$$
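Expanding the two derivative conditions and collecting the unknowns gives the "normal equations", a 2-by-2 linear system in a0 and a1. This intermediate step is implicit in the slides:

```latex
% Normal equations: expand each derivative condition and collect a0, a1.
\begin{aligned}
n\,a_0 + \Big(\sum_{i=1}^{n} x_i\Big)\,a_1 &= \sum_{i=1}^{n} y_i,\\
\Big(\sum_{i=1}^{n} x_i\Big)\,a_0 + \Big(\sum_{i=1}^{n} x_i^2\Big)\,a_1 &= \sum_{i=1}^{n} x_i y_i.
\end{aligned}
```

Solving this system for a1 and a0 yields the closed-form expressions on the next slide.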
Least Square Fit
• From the previous equations, a1 and a0 can be written as:

  $$a_1 = \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}$$

  $$a_0 = \bar{y} - a_1 \bar{x}; \qquad \bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i; \quad \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$
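A direct Python transcription of these closed-form formulas; the helper name linear_fit is mine, not from the slides:

```python
def linear_fit(xs, ys):
    """Least-squares line y = a0 + a1*x from the closed-form formulas."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    a1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a0 = sum_y / n - a1 * sum_x / n   # a0 = y_bar - a1 * x_bar
    return a0, a1
```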
QUANTIFICATION OF ERROR OF LINEAR REGRESSION
• The standard deviation between the prediction model and the data distribution (the standard error of the estimate) can be quantified using the following formula:

  $$S_{y/x} = \sqrt{\frac{\sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2}{n - 2}}$$
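A short Python sketch of this formula, reusing the fitted a0 and a1; the function name standard_error is mine:

```python
import math

def standard_error(xs, ys, a0, a1):
    """Standard error of the estimate, S_y/x, for the line y = a0 + a1*x."""
    s_r = sum((y - a0 - a1 * x) ** 2 for x, y in zip(xs, ys))  # sum of squared residuals
    return math.sqrt(s_r / (len(xs) - 2))                      # n - 2 in the denominator
```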
QUANTIFICATION OF ERROR OF LINEAR REGRESSION
• In addition to the sum of the squares of the residuals, $S_r$, there is the sum of the squares around the mean value, $S_t = \sum (y_i - \bar{y})^2$. The difference between $S_t$ and $S_r$ quantifies the improvement, or error reduction, obtained by using linear regression rather than the average value. Two coefficients quantify this improvement: the coefficient of determination ($r^2$) and the correlation coefficient ($r$). These two coefficients quantify the goodness of fit of the linear regression.
QUANTIFICATION OF ERROR OF LINEAR REGRESSION
• Coefficient of determination:

  $$r^2 = \frac{\sum (y_i - \bar{y})^2 - \sum (y_i - a_0 - a_1 x_i)^2}{\sum (y_i - \bar{y})^2} = \frac{S_t - S_r}{S_t}$$

• Correlation coefficient r, with 0 ≤ |r| ≤ 1:
  |r| = 1 → perfect fit
  r = 0 → no improvement (St = Sr)
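A minimal Python sketch of the coefficient of determination; the helper name r_squared is mine:

```python
def r_squared(xs, ys, a0, a1):
    """Coefficient of determination r^2 = (S_t - S_r) / S_t."""
    y_bar = sum(ys) / len(ys)
    s_t = sum((y - y_bar) ** 2 for y in ys)                    # spread around the mean
    s_r = sum((y - a0 - a1 * x) ** 2 for x, y in zip(xs, ys))  # spread around the fitted line
    return (s_t - s_r) / s_t
```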
Least Square Fit
• Example: Find the linear regression line to fit the following data and estimate the standard deviation.

  i  | 1   | 2   | 3   | 4   | 5   | 6   | 7
  xi | 1   | 2   | 3   | 4   | 5   | 6   | 7
  yi | 0.5 | 2.5 | 2.0 | 4.0 | 3.5 | 6.0 | 5.5
• Answer:

  i | xi | yi  | xi·yi | xi² | yi - a0 - a1·xi
  1 | 1  | 0.5 | …     | …   | …
  2 | 2  | 2.5 | …     | …   | …
  3 | 3  | 2.0 | …     | …   | …
  4 | 4  | 4.0 | …     | …   | …
  5 | 5  | 3.5 | …     | …   | …
  6 | 6  | 6.0 | …     | …   | …
  7 | 7  | 5.5 | …     | …   | …
  Σ | =… | =…  | =…    | =…  | =…
• Answer (cont.): after completing the previous table,

  n = 7;  Σxi = 28;  Σyi = 24;  Σxi·yi = 119.5;  Σxi² = 140

  $$\bar{x} = \frac{28}{7} = 4; \qquad \bar{y} = \frac{24}{7} \approx 3.4286$$

  $$\Rightarrow a_1 = \frac{7 \cdot 119.5 - 28 \cdot 24}{7 \cdot 140 - 28^2} = \frac{164.5}{196} \approx 0.8393; \qquad a_0 = \bar{y} - a_1 \bar{x} \approx 0.0714$$
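These sums can be checked numerically with the helpers sketched earlier (linear_fit, standard_error, and r_squared are my names, not the slides'):

```python
xs = [1, 2, 3, 4, 5, 6, 7]
ys = [0.5, 2.5, 2.0, 4.0, 3.5, 6.0, 5.5]

a0, a1 = linear_fit(xs, ys)
print(a0, a1)                          # ~0.0714, ~0.8393
print(standard_error(xs, ys, a0, a1))  # ~0.7735
print(r_squared(xs, ys, a0, a1))       # ~0.868
```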
POLYNOMIAL REGRESSION
• For most cases the linear regression just discussed is appropriate to fit the data distribution. For some cases, however, it is not. For these cases, polynomial functions can be used as an alternative.
• A polynomial function can be written as:

  $$y = a_0 + a_1 x + a_2 x^2 + \cdots + a_m x^m$$

• As before, the sum of the squares of the errors can be written as:

  $$S_r = \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_i - a_2 x_i^2 - \cdots - a_m x_i^m \right)^2$$
POLYNOMIAL REGRESSION
• In the polynomial function given before, there are m+1 unknown quantities: a0, a1, …, am. These quantities are determined by minimizing the sum of the squares of the errors, Sr, as follows:

  $$\frac{\partial S_r}{\partial a_0} = 0; \quad \frac{\partial S_r}{\partial a_1} = 0; \quad \frac{\partial S_r}{\partial a_2} = 0; \quad \ldots; \quad \frac{\partial S_r}{\partial a_m} = 0$$

• From the above m+1 equations, the parameters a0, a1, …, am can be determined (a sketch follows below).
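A sketch of this procedure in Python, building and solving the m+1 normal equations directly; poly_fit is my name and the example data is invented for illustration:

```python
import numpy as np

def poly_fit(xs, ys, m):
    """Least-squares polynomial of degree m via the m+1 normal equations.

    Setting each dSr/da_k = 0 gives the linear system A a = b with
    A[k][j] = sum_i x_i^(k+j) and b[k] = sum_i y_i * x_i^k.
    """
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    A = np.array([[np.sum(xs ** (k + j)) for j in range(m + 1)]
                  for k in range(m + 1)])
    b = np.array([np.sum(ys * xs ** k) for k in range(m + 1)])
    return np.linalg.solve(A, b)      # coefficients a0, a1, ..., am

# Example: fit a quadratic to illustrative data.
xs = [0, 1, 2, 3, 4, 5]
ys = [2.1, 7.7, 13.6, 27.2, 40.9, 61.1]
print(poly_fit(xs, ys, 2))
# Cross-check against NumPy's built-in fit (polyfit returns the
# coefficients highest power first, so reverse for comparison).
print(np.polyfit(xs, ys, 2)[::-1])
```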