Curve and Surface Fitting Using the Method
of Least Squares – Chapters 4 and 9
Consider a quantitative, continuous, non-random variable $X$. Each value of $X$ identifies a
continuous-type population. Think of the values of $X$ as indexing populations of values $y$.
Each $y$ population has a mean $\mu_X$, and the plot of the population means $\mu_X$ against the $X$ values gives a
continuous curve $f(X)$. We want to estimate $f(X)$. Call the estimate $\hat{f}(X)$, so that for any
$X_0$ we can estimate the mean of the population indexed by $X_0$ as $\hat{f}(X_0)$.
Mathematically, we describe each element $y$ of a $y$ population as the population mean plus
a number $\delta$:
$$y = \mu_X + \delta = f(X) + \delta.$$
A random sample of size $n$ from the population indexed by $X_i$, denoted $y_{i1}, y_{i2}, y_{i3}, \ldots, y_{in}$, is
described as
$$y_{ij} = f(X_i) + \varepsilon_{ij} \quad \text{for each } j = 1, 2, \ldots, n.$$
Here the $\varepsilon_{ij}$ are independent, identically distributed random variables having mean zero. We will
assume that all populations have the same variance $\sigma^2$.
To estimate $f(X)$ we use random samples from some number $r$ of the $y$ populations. Say these
populations are indexed by $X_1, X_2, \ldots, X_r$ and the sample sizes are $n_1, n_2, n_3, \ldots, n_r$. Describe
the set of sample values as
$$y_{ij} = f(X_i) + \varepsilon_{ij}; \quad i = 1, 2, \ldots, r; \quad j = 1, 2, \ldots, n_i,$$
and $\varepsilon_{ij} \sim \text{iid}\,(0, \sigma^2)$.
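To make the sampling model concrete, here is a minimal Python sketch that generates data of this form. The function name `simulate_samples`, the example curve, and the choice of normally distributed errors are assumptions made for illustration; the model above only requires errors with mean zero and a common variance.

```python
import random

def simulate_samples(f, xs, sizes, sigma, seed=0):
    """Draw y_ij = f(X_i) + eps_ij for each population indexed by X_i.

    The eps_ij are drawn independently from a normal distribution with
    mean 0 and standard deviation sigma. Normality is an assumption of
    this sketch, not of the model in the text.
    Returns a list of (X_i, [y_i1, ..., y_in_i]) pairs.
    """
    rng = random.Random(seed)
    return [(x, [f(x) + rng.gauss(0.0, sigma) for _ in range(n)])
            for x, n in zip(xs, sizes)]

# Hypothetical example: mean curve f(X) = 2 + 0.5 X, r = 4 populations,
# sample sizes n_1, ..., n_4 = 3, 2, 4, 3.
data = simulate_samples(lambda x: 2.0 + 0.5 * x,
                        xs=[1.0, 2.0, 3.0, 4.0],
                        sizes=[3, 2, 4, 3],
                        sigma=0.3)
```

Storing the data as one `(X_i, list of y values)` pair per population mirrors the double index in $y_{ij}$ and allows unequal sample sizes.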
Given the sample data, how do we use it to obtain $\hat{f}(X)$? We will consider the Method of
Least Squares to do this. The method is described as follows. Pick a general form of
approximating function (linear, quadratic, cubic, etc.), and determine the specific function
$\hat{f}(X)$ having this form, so that the following sum of squares is a minimum:
$$S = \sum_{i=1}^{r} \sum_{j=1}^{n_i} \left( y_{ij} - \hat{f}(X_i) \right)^2 \qquad (1)$$
Then $\hat{f}(X)$, for any $X$ value, is the least squares estimate of the population mean $\mu_X$ for the
population indexed by that $X$. The differences $y_{ij} - \hat{f}(X_i)$ are called residuals. Your textbook
uses the notation $\hat{y}_i$ to denote $\hat{f}(X_i)$, and calls $\hat{y}_i$ the predicted or fitted value at $X_i$.
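The criterion in (1) and the residuals can be computed directly for any candidate function. The sketch below uses made-up data and two hypothetical candidate lines; the names `residuals` and `sum_of_squares` are illustrative only. The candidate with the smaller $S$ fits the sample better in the least squares sense.

```python
def residuals(data, f_hat):
    """Return the residuals y_ij - f_hat(X_i), grouped by population."""
    return [[y - f_hat(x) for y in ys] for x, ys in data]

def sum_of_squares(data, f_hat):
    """Criterion (1): sum over i and j of (y_ij - f_hat(X_i))**2."""
    return sum(e * e for group in residuals(data, f_hat) for e in group)

# Hypothetical data: one (X_i, [y_i1, ..., y_in_i]) pair per population.
data = [(1.0, [2.0, 3.0]), (2.0, [3.0, 5.0]), (3.0, [6.0])]

S_a = sum_of_squares(data, lambda x: 1.0 + 1.5 * x)  # sloped candidate
S_b = sum_of_squares(data, lambda x: 3.0)            # flat candidate
print(S_a, S_b)  # prints 2.75 14.0
```

Least squares does not compare a handful of candidates like this; it finds the member of the chosen family that makes $S$ as small as possible.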
How does one decide which general form of approximating function to use? One way is to plot
the sample data and see what form is suggested. Let us consider some example situations.
Example 1: Straight line
Suppose the general form selected is a straight line function $f(X) = \beta_0 + \beta_1 X$. Then we must
pick $b_0 = \hat{\beta}_0$ and $b_1 = \hat{\beta}_1$ to obtain $\hat{f}(X) = b_0 + b_1 X$. To do this we minimize
$$S = \sum_{i=1}^{r} \sum_{j=1}^{n_i} \left( y_{ij} - (\beta_0 + \beta_1 X_i) \right)^2$$
with respect to $\beta_0$ and $\beta_1$ by taking the two partial derivatives $\partial S / \partial \beta_0$ and $\partial S / \partial \beta_1$, setting
each to zero, and solving the simultaneous system of two (linear) equations.
These equations are called the Normal Equations.
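As a worked step, the two Normal Equations for the straight line can be written out explicitly. This is a sketch obtained by differentiating $S$ term by term; $n = \sum_{i=1}^{r} n_i$ denotes the total sample size, and $b_0$, $b_1$ are the values of $\beta_0$, $\beta_1$ at which both derivatives vanish.

```latex
% Setting the two partial derivatives to zero
\begin{aligned}
\frac{\partial S}{\partial \beta_0}
  &= -2 \sum_{i=1}^{r} \sum_{j=1}^{n_i} \left( y_{ij} - \beta_0 - \beta_1 X_i \right) = 0, \\
\frac{\partial S}{\partial \beta_1}
  &= -2 \sum_{i=1}^{r} \sum_{j=1}^{n_i} X_i \left( y_{ij} - \beta_0 - \beta_1 X_i \right) = 0.
\end{aligned}

% Rearranged as two linear equations in b_0 and b_1
\begin{aligned}
n\, b_0 + b_1 \sum_{i=1}^{r} n_i X_i
  &= \sum_{i=1}^{r} \sum_{j=1}^{n_i} y_{ij}, \\
b_0 \sum_{i=1}^{r} n_i X_i + b_1 \sum_{i=1}^{r} n_i X_i^2
  &= \sum_{i=1}^{r} X_i \sum_{j=1}^{n_i} y_{ij}.
\end{aligned}
```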
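The solution gives $b_1$ as
$$b_1 = \frac{\sum_{i=1}^{r} \sum_{j=1}^{n_i} (y_{ij} - \bar{y})(X_i - \bar{X})}{\sum_{i=1}^{r} n_i (X_i - \bar{X})^2},$$
where
$$\bar{X} = \frac{1}{n} \sum_{i=1}^{r} n_i X_i, \qquad \bar{y} = \frac{1}{n} \sum_{i=1}^{r} \sum_{j=1}^{n_i} y_{ij}, \qquad n = \sum_{i=1}^{r} n_i.$$
Then $b_0$ is found as $b_0 = \bar{y} - b_1 \bar{X}$. The least squares fit is
$$\hat{f}(X) = b_0 + b_1 X$$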
and at any $X$, say $X_i$, the residuals for sample values from that population are
$$y_{ij} - (b_0 + b_1 X_i)$$
and predicted values are
$$\hat{y}_i = b_0 + b_1 X_i.$$
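The closed-form expressions for $b_1$ and $b_0$ translate almost line for line into code. The following is a minimal sketch on made-up data; the function name `fit_line` and the data values are assumptions for illustration.

```python
def fit_line(data):
    """Least squares straight line for grouped data.

    data is a list of (X_i, [y_i1, ..., y_in_i]) pairs.
    Returns (b0, b1) from the closed-form solution of the normal equations.
    """
    n = sum(len(ys) for _, ys in data)                 # n = sum of n_i
    x_bar = sum(len(ys) * x for x, ys in data) / n     # weighted mean of X_i
    y_bar = sum(y for _, ys in data for y in ys) / n   # mean of all y_ij
    sxy = sum((y - y_bar) * (x - x_bar) for x, ys in data for y in ys)
    sxx = sum(len(ys) * (x - x_bar) ** 2 for x, ys in data)
    if sxx == 0:
        raise ValueError("need at least two distinct X values")
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Hypothetical data: one (X_i, [y_i1, ..., y_in_i]) pair per population.
data = [(1.0, [2.0, 3.0]), (2.0, [3.0, 5.0]), (3.0, [6.0])]
b0, b1 = fit_line(data)

fitted = [b0 + b1 * x for x, _ in data]                       # y-hat_i
resid = [[y - (b0 + b1 * x) for y in ys] for x, ys in data]   # residuals
```

For this data set the estimates work out to $b_0 = 5/7$ and $b_1 = 12/7$, and the residuals sum to zero, as they must for any least squares fit that includes an intercept.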
Example 2: Quadratic curve
The general form of function selected for this example is
$$f(X) = \beta_0 + \beta_1 X + \beta_2 X^2$$
and the method of least squares will give the estimating function
$$\hat{f}(X) = b_0 + b_1 X + b_2 X^2,$$
selected by minimizing
$$S = \sum_{i=1}^{r} \sum_{j=1}^{n_i} \left( y_{ij} - (\beta_0 + \beta_1 X_i + \beta_2 X_i^2) \right)^2.$$
The Normal Equations are three linear equations in three unknowns. The predicted values are
$$\hat{y}_i = b_0 + b_1 X_i + b_2 X_i^2$$
and residuals are
$$y_{ij} - (b_0 + b_1 X_i + b_2 X_i^2).$$
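For reference, the three Normal Equations of the quadratic case can be written out by setting $\partial S / \partial \beta_0$, $\partial S / \partial \beta_1$ and $\partial S / \partial \beta_2$ to zero. In this sketch, $n = \sum_{i=1}^{r} n_i$ is the total sample size and $T_i = \sum_{j=1}^{n_i} y_{ij}$ is a shorthand introduced here for the sample total at $X_i$; it is not notation from the text.

```latex
\begin{aligned}
n\, b_0 + b_1 \sum_{i=1}^{r} n_i X_i + b_2 \sum_{i=1}^{r} n_i X_i^2
  &= \sum_{i=1}^{r} T_i, \\
b_0 \sum_{i=1}^{r} n_i X_i + b_1 \sum_{i=1}^{r} n_i X_i^2 + b_2 \sum_{i=1}^{r} n_i X_i^3
  &= \sum_{i=1}^{r} X_i T_i, \\
b_0 \sum_{i=1}^{r} n_i X_i^2 + b_1 \sum_{i=1}^{r} n_i X_i^3 + b_2 \sum_{i=1}^{r} n_i X_i^4
  &= \sum_{i=1}^{r} X_i^2 T_i.
\end{aligned}
```

Every coefficient is a sum computed from the data, so the system is linear in $b_0$, $b_1$ and $b_2$ even though the fitted curve is not linear in $X$.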
Example 3: The general linear regression case
The general form of linear regression function is
$$f(X) = \beta_0 + \beta_1 g_1(X) + \beta_2 g_2(X) + \cdots + \beta_k g_k(X).$$
The function is linear in the unknown $\beta$'s, which ensures that the normal equations will be a
simultaneous system of linear equations. This system has a unique solution provided the functions
$1, g_1, \ldots, g_k$ are linearly independent over the sampled $X$ values. The functions $g_1(X), \ldots, g_k(X)$ can be
any functions of $X$. Examples are:
$$f(X) = \beta_0 + \beta_1 X + \beta_2 X^2 + \beta_3 X^3$$
$$f(X) = \beta_0 + \beta_1 \ln(X) + \beta_2 \sin(X)$$
$$f(X) = \beta_0 + \beta_1 X^4 + \beta_2 X^8.$$
Your textbook only considers the most frequently used polynomial form of $f(X)$.
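The general case can be sketched in a few lines of Python: build the normal equations from the chosen functions $g_1, \ldots, g_k$ and solve the resulting linear system. The names `fit_general` and `solve_linear_system`, the singularity tolerance, and the example data are assumptions for illustration. Solving the normal equations directly keeps the sketch close to the text; numerical libraries commonly use other factorizations for better accuracy.

```python
import math

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:  # ad hoc tolerance (assumption)
            raise ValueError("normal equations are singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, m + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * m
    for r in range(m - 1, -1, -1):  # back substitution
        s = sum(aug[r][c] * x[c] for c in range(r + 1, m))
        x[r] = (aug[r][m] - s) / aug[r][r]
    return x

def fit_general(data, basis):
    """Least squares coefficients [b_0, b_1, ..., b_k] for
    f(X) = b_0 + b_1 g_1(X) + ... + b_k g_k(X).

    data  : list of (X_i, [y_i1, ..., y_in_i]) pairs
    basis : list of the functions g_1, ..., g_k
    """
    funcs = [lambda x: 1.0] + list(basis)  # constant function carries b_0
    p = len(funcs)
    a = [[0.0] * p for _ in range(p)]      # normal-equation matrix
    rhs = [0.0] * p                        # right-hand side
    for x, ys in data:
        g = [fn(x) for fn in funcs]
        total = sum(ys)
        for u in range(p):
            rhs[u] += g[u] * total
            for v in range(p):
                a[u][v] += len(ys) * g[u] * g[v]
    return solve_linear_system(a, rhs)

# Hypothetical noise-free data lying on 1 + 2 ln(X) - 3 sin(X).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
data = [(x, [1.0 + 2.0 * math.log(x) - 3.0 * math.sin(x)]) for x in xs]
coeffs = fit_general(data, [math.log, math.sin])  # close to [1, 2, -3]
```

Passing `[lambda x: x, lambda x: x ** 2]` as `basis` gives the quadratic fit of Example 2, and `[lambda x: x]` gives the straight line of Example 1.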
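[Figure: plot of $y$ against $X$ showing the fitted curve $\hat{f}(X)$, with the sampled values $X_1, X_2, X_3, X_4, X_5$ marked on the $X$ axis.]
[Figure: the same axes and fitted curve $\hat{f}(X)$, with one residual $y - \hat{y}$ labeled "A Residual".]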