Fit to a quadratic curve:

Appendix V: Quadratic Fit
Mathematica Fit:
One could generate an estimate of the uncertainty in the fit parameters by varying one of them until
the fit quality noticeably degrades. Repeat for the other parameters. Note that this is not an approved
method to generate such estimates.
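The rough procedure described above might be sketched in Python as follows. This is only an illustration: the names sse and scan_parameter, the step size, the factor-of-two threshold for "noticeably degrades", and the toy data are all assumptions made for the example, and, as noted, this is not a statistically approved way to estimate uncertainties.

```python
def sse(params, points):
    """Sum of squared vertical residuals for y = a*x^2 + b*x + c."""
    a, b, c = params
    return sum((a * x * x + b * x + c - y) ** 2 for x, y in points)


def scan_parameter(params, index, points, step, tolerance=2.0, max_steps=100000):
    """Increase params[index] in increments of `step` until the sum of squared
    residuals exceeds `tolerance` times its starting value.

    Returns the offset at which that happens, or None if it never does
    within max_steps. The threshold is an arbitrary choice (assumption).
    """
    best = sse(params, points)
    trial = list(params)
    for n in range(1, max_steps + 1):
        trial[index] = params[index] + n * step
        if sse(trial, points) > tolerance * best:
            return n * step
    return None


# Made-up toy data scattered around y = x^2, with (1, 0, 0) taken as the
# already-fitted parameters (both are assumptions for illustration only).
toy_points = [(-1, 1.1), (0, -0.1), (1, 0.9), (2, 4.1)]
fitted = (1.0, 0.0, 0.0)
for i, name in enumerate("abc"):
    print(name, scan_parameter(fitted, i, toy_points, step=0.001))
```

Each printed offset is a crude indication of how far that parameter can move before the fit is clearly worse; the result depends strongly on the chosen threshold.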
This handout begins with an algorithm for fitting a second order curve to a data set.
After that, the procedure to find the quadratic fit using Excel is presented.
Fit to a quadratic curve:
Least Squares Regression for Quadratic Curve Fitting
Date: 02/27/2008 at 14:56:07
From: Rodo
Subject: Curve fitting
I have the following table of values
 x     y
31     0
27    -1
23    -3
19    -5
15    -7
11   -10
 7   -15
 3   -25
I would like to find a function to interpolate all integer values
between 0 and 31 in x. I drew the above values and I got what looks
like a square root curve. So I was thinking that maybe it's OK to
interpolate the other settings from a fitted equation like P =
a*sqrt(b*PA_LEVEL)+c.
However, I was playing with the values and the equation and I don't
find how to calculate a,b,c values. How can I calculate them?
Using 3 values from the table like (x,y)=(31,0),(19,-5),(3,-25) I was
trying to solve a,b,c using the equation P = a*sqrt(b*PA_LEVEL)+c.
--------------------------------------------------------------------------------
Date: 02/27/2008 at 18:18:51
From: Doctor Vogler
Subject: Re: Curve fitting
Hi Rodo,
Thanks for writing to Dr. Math. Actually, you only need one of a and
b, because you can just move the "a" inside the square root, or pull
the "b" out. It is more usual to invert the equation and solve for a,
b, and c in
y = a*x^2 + b*x + c.
The simplest kind of fitting is least-squares regression.
Least-squares linear regression is very common, and least-squares
quadratic regression is not very different. It gives a good
approximation, and it has the very nice property that you can solve
the equations once and then use these formulas for a, b, and c.
See also
Wikipedia: Least Squares
http://en.wikipedia.org/wiki/Least_squares
and
Wikipedia: Linear Least Squares
http://en.wikipedia.org/wiki/Linear_least_squares
(Note that Wikipedia is using "linear" in the sense that a, b, and c
are linear, whereas I was using "linear" and "quadratic" in the sense
that x is linear or quadratic. Both of these cases are linear in the
Wikipedia sense. Nonlinear in the Wikipedia sense would be something
like y = a*cos(b*x), because the parameter b is inside the cosine.)
The idea is to choose the parabola that minimizes the sum of the
squares of the vertical distances between your data points and your
parabola. That is, you have n = 8 points (x_i, y_i) for i=1 to i=8,
and you want the sum
sum_{i=1}^{n} (a*x_i^2 + b*x_i + c - y_i)^2
to be as small as possible. (We square so that points below don't
cancel points above. We use the square instead of the absolute value
because it gives a nicer solution. This has the effect--which might
be good or bad, depending on your problem--of making it more important
to get faraway points reasonably close than to get nearby points right
on.)
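As a rough illustration of the quantity being minimized, the following Python sketch evaluates the sum for a trial parabola using the table from the question. The function name sum_squared_residuals and the two trial parabolas are illustrative choices, not part of the original discussion.

```python
def sum_squared_residuals(a, b, c, points):
    """Sum of (a*x^2 + b*x + c - y)^2 over all (x, y) points."""
    return sum((a * x**2 + b * x + c - y) ** 2 for x, y in points)


# Data from the table in the question.
points = [(31, 0), (27, -1), (23, -3), (19, -5),
          (15, -7), (11, -10), (7, -15), (3, -25)]

# Two arbitrary trial parabolas (here with a = 0, i.e. straight lines);
# whichever gives the smaller sum is the better fit in the least-squares sense.
print(sum_squared_residuals(0.0, 1.0, -25.0, points))
print(sum_squared_residuals(0.0, 0.8, -25.0, points))
```

Least-squares regression finds the a, b, and c that make this number as small as possible, rather than comparing guesses one at a time.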
Now we pull out our calculus tricks, and we solve as follows:
(1) first multiply out the square
(2) split the sum into lots of smaller sums
(3) take derivatives with respect to the unknowns a, b, c
(4) set these equal to zero and solve for a, b, and c
The result is a, b, and c given by formulas involving sums of values
from your data points. Importantly, the form of the formulas does not
depend on the particular data, so you can derive the formulas once and
then just plug in the
numbers when you have them. So let's solve the general (linear)
least-squares quadratic regression problem.
(1) Multiply out the square
(a*x_i^2 + b*x_i + c - y_i)^2 =
a^2*x_i^4 + b^2*x_i^2 + c^2 + y_i^2 + 2ab*x_i^3 +
2ac*x_i^2 + 2bc*x_i - 2a*x_i^2*y_i - 2b*x_i*y_i - 2c*y_i
(2) Split the sum
sum_{i=1}^{n} (a*x_i^2 + b*x_i + c - y_i)^2 =
    a^2 * sum_{i=1}^{n} x_i^4 + (b^2 + 2ac) * sum_{i=1}^{n} x_i^2 + c^2 * n
    + sum_{i=1}^{n} y_i^2 + 2ab * sum_{i=1}^{n} x_i^3 + 2bc * sum_{i=1}^{n} x_i
    - 2a * sum_{i=1}^{n} x_i^2*y_i - 2b * sum_{i=1}^{n} x_i*y_i - 2c * sum_{i=1}^{n} y_i
So now we'll use the notation Sjk to mean the sum of x_i^j*y_i^k.
(Note that S00 = n, the number of data points you have.) Therefore,
we can write the sum as
a^2*S40 + (b^2 + 2ac)*S20 + c^2*S00 + S02 + 2ab*S30
+ 2bc*S10 - 2a*S21 - 2b*S11 - 2c*S01.
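The Sjk notation maps directly onto a small helper. The sketch below, with the hypothetical name power_sum, computes a few of these sums for the eight points in the question.

```python
def power_sum(points, j, k):
    """Return Sjk, the sum of x_i^j * y_i^k over all data points."""
    return sum(x**j * y**k for x, y in points)


# Data from the table in the question.
points = [(31, 0), (27, -1), (23, -3), (19, -5),
          (15, -7), (11, -10), (7, -15), (3, -25)]

print(power_sum(points, 0, 0))  # S00 = n = 8
print(power_sum(points, 1, 0))  # S10 = 136
print(power_sum(points, 2, 0))  # S20 = 2984
print(power_sum(points, 0, 1))  # S01 = -66
print(power_sum(points, 1, 1))  # S11 = -586
```

Python evaluates 0**0 as 1, so S00 counts the point with y = 0 correctly.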
(3) Take derivatives
The local minimum for this function is going to be where the
derivatives with respect to a, b, and c (treating the data points and
therefore the sums Sjk as constants) are all zero. The derivatives are:
(with respect to a)
2a*S40 + 2c*S20 + 2b*S30 - 2*S21
(with respect to b)
2b*S20 + 2a*S30 + 2c*S10 - 2*S11
(with respect to c)
2a*S20 + 2c*S00 + 2b*S10 - 2*S01
(4) Solve
Now we solve the system of simultaneous equations
2a*S40 + 2c*S20 + 2b*S30 - 2*S21 = 0
2b*S20 + 2a*S30 + 2c*S10 - 2*S11 = 0
2a*S20 + 2c*S00 + 2b*S10 - 2*S01 = 0
which we can also write in matrix notation (after dividing by 2 for
simplification) as
[ S40  S30  S20 ] [ a ]   [ S21 ]
[ S30  S20  S10 ] [ b ] = [ S11 ]
[ S20  S10  S00 ] [ c ]   [ S01 ]
Now we can use Cramer's Rule to give a, b, and c as formulas in these
Sjk values. They all have the same denominator:
a = (S01*S10*S30 - S11*S00*S30 - S01*S20^2
+ S11*S10*S20 + S21*S00*S20 - S21*S10^2)
/(S00*S20*S40 - S10^2*S40 - S00*S30^2 + 2*S10*S20*S30 - S20^3)
b = (S11*S00*S40 - S01*S10*S40 + S01*S20*S30
- S21*S00*S30 - S11*S20^2 + S21*S10*S20)
/(S00*S20*S40 - S10^2*S40 - S00*S30^2 + 2*S10*S20*S30 - S20^3)
c = (S01*S20*S40 - S11*S10*S40 - S01*S30^2
+ S11*S20*S30 + S21*S10*S30 - S21*S20^2)
/(S00*S20*S40 - S10^2*S40 - S00*S30^2 + 2*S10*S20*S30 - S20^3)
Now all you have to do is take your data (your eight points) and
evaluate the various sums
Sj0 = sum_{i=1}^{n} x_i^j
for j = 0 through 4, and
Sj1 = sum_{i=1}^{n} x_i^j*y_i
for j = 0 through 2. Then you substitute into the formulas for a, b,
and c, and you are done!
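The whole recipe (compute the sums, then substitute into the Cramer's Rule formulas) can be sketched in Python as follows. The function name quadratic_fit and the use of exact fractions are choices made for this example, not part of the original answer.

```python
from fractions import Fraction


def quadratic_fit(points):
    """Least-squares fit of y = a*x^2 + b*x + c using the Sjk formulas."""
    pts = [(Fraction(x), Fraction(y)) for x, y in points]

    def S(j, k):
        # Sjk = sum of x_i^j * y_i^k over the data points.
        return sum(x**j * y**k for x, y in pts)

    S00, S10, S20, S30, S40 = (S(j, 0) for j in range(5))
    S01, S11, S21 = (S(j, 1) for j in range(3))

    # Common denominator: the determinant of the 3x3 coefficient matrix.
    denom = (S00*S20*S40 - S10**2*S40 - S00*S30**2
             + 2*S10*S20*S30 - S20**3)
    if denom == 0:
        raise ValueError("need at least three distinct x values")

    a = (S01*S10*S30 - S11*S00*S30 - S01*S20**2
         + S11*S10*S20 + S21*S00*S20 - S21*S10**2) / denom
    b = (S11*S00*S40 - S01*S10*S40 + S01*S20*S30
         - S21*S00*S30 - S11*S20**2 + S21*S10*S20) / denom
    c = (S01*S20*S40 - S11*S10*S40 - S01*S30**2
         + S11*S20*S30 + S21*S10*S30 - S21*S20**2) / denom
    return a, b, c


# Data from the table in the question.
points = [(31, 0), (27, -1), (23, -3), (19, -5),
          (15, -7), (11, -10), (7, -15), (3, -25)]
a, b, c = quadratic_fit(points)
print(float(a), float(b), float(c))
```

Exact fractions avoid round-off in the large sums such as S40; for routine work with floating-point data, a library least-squares routine is the more usual choice.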
If you have any questions about this or need more help, please write
back and show me what you have been able to do, and I will try to
offer further suggestions.
- Doctor Vogler, The Math Forum
http://mathforum.org/dr.math/
Using Excel to Solve Simultaneous Linear Equations
In this article we present two methods of solving simultaneous linear
equations using Excel. The first method uses Excel Solver (which is an
add-in optimizer). The second method makes use of Excel's built-in
matrix functions.
We demonstrate these two methods by solving the following equations
for u, v, w, x and y:
u + v + w + x + y = 5.5
u + 2v + w - 0.5x + 2y = 22.5
2v + 2w - x - y = 30
2u - w + 0.75x + 0.5y = -11
u + 0.25v + w - x = 17.5
Using Excel Solver to Solve Simultaneous Linear Equations
To use Excel Solver, you'll need to set up a worksheet like the one
below:
Fig. 1 - Specimen Worksheet Before Running Excel Solver
Note that each of the five equations is entered as a formula in
separate cells that are directly below each other. This arrangement
makes it easy to set up Excel Solver because we can select all
formulas using a single range. We do the same with the unknowns (u,
v, w, x and y) and the constants on the right hand side of each
equation. That is, we've arranged them in a vertical block of cells.
You might want to enter an initial guess for u, v, w, x and y. We
haven't done so here (the cells are blank) and often you won't need to.
If Excel Solver can't find a solution, it might be necessary to enter a
good guess for the unknowns.
Now start Solver (by clicking on Tools then Solver...). Clear the "Set
Target Cell" edit box and in the "By Changing Cells" edit box, enter a
range for the solution. The Solver Parameters dialog box should now
look something like this:
Fig. 2 - Solver Parameters Dialog Box
Next set the constraints by clicking on the Add button. When the Add
Constraints dialog box opens, fill it out as shown below (adjusted if
necessary for the way you've set up your spreadsheet):
Fig. 3 - Add Constraint Dialog Box
This ensures that Excel Solver is constrained to find a solution that
matches the constants on the right hand side of the simultaneous
linear equations. Now click on OK to return to the Solver Parameters
dialog box:
Fig. 4 - Completed Solver Parameters Dialog Box
Lastly click on the Solve button. You should see the following:
Fig. 5 - Solver Results Dialog Box
Click on the OK button to keep the solution. The spreadsheet should
now look like this:
Fig. 6 - Specimen Worksheet After Running Excel Solver
As you can see, Excel Solver has updated the spreadsheet and
replaced the contents of F3:F7 with values of u, v, w, x and y that
solve the simultaneous linear equations.
Using Matrix Functions to Solve Simultaneous Linear Equations
As an alternative to using Excel Solver, you can use the matrix
functions mmult and minverse.
When using mmult and minverse we need only use the coefficients of
each equation. (Coefficients are the numbers on the left hand side of
each equation. As an example, for the second equation, the
coefficients are 1, 2, 1, -0.5 and 2.) With this in mind, set up a
spreadsheet where the coefficients for each equation are entered into
consecutive rows. Also, enter the constants from the right hand side of
each equation into a vertical block of cells. You'll end up with a
spreadsheet like this:
Fig. 7 - Specimen Worksheet Before Using Matrix Functions
Now select an empty area on the worksheet. Make sure the area is
exactly five rows high and one column wide. Click on the formula bar
and enter "=mmult(minverse(A3:E7),G3:G7)" as shown below. (You
may need to change the ranges to be consistent with the layout of
your spreadsheet.)
Fig. 8 - Excel Formula Bar
When you've finished typing in the formula, don't press Enter. Press
Ctrl + Shift + Enter instead. (That is, press the Ctrl, Shift and Enter
keys together.) This enters the formula as an array which is the same
size as the solution of the simultaneous linear equations. You should
see the solution of the simultaneous linear equations in the cells you
selected:
Fig. 9 - Specimen Worksheet After Using Matrix Functions
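For readers without Excel, the same five equations can be solved with a short Python sketch. Note that this uses Gauss-Jordan elimination with exact fractions rather than an explicit matrix inverse as in the mmult/minverse formula; the function name solve_linear_system is an assumption made for this example.

```python
from fractions import Fraction


def solve_linear_system(A, b):
    """Solve A x = b by Gauss-Jordan elimination with exact rational arithmetic."""
    n = len(A)
    # Augmented matrix [A | b], converted to exact fractions.
    M = [[Fraction(entry) for entry in row] + [Fraction(rhs)]
         for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        # Scale the pivot row so the pivot entry becomes 1.
        pivot_value = M[col][col]
        M[col] = [entry / pivot_value for entry in M[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [er - factor * ec for er, ec in zip(M[r], M[col])]
    return [row[n] for row in M]


# Coefficients and right-hand sides of the five equations, in the order u, v, w, x, y.
A = [[1, 1,    1,  1,    1],
     [1, 2,    1, -0.5,  2],
     [0, 2,    2, -1,   -1],
     [2, 0,   -1,  0.75, 0.5],
     [1, 0.25, 1, -1,    0]]
b = [5.5, 22.5, 30, -11, 17.5]

u, v, w, x, y = (float(s) for s in solve_linear_system(A, b))
print(u, v, w, x, y)  # 1.0 4.0 7.5 -8.0 1.0
```

Substituting u = 1, v = 4, w = 7.5, x = -8, y = 1 back into each of the five equations reproduces the right-hand sides.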
Matrix functions are slightly more flexible than Excel Solver. If you
change a coefficient or constant, the solution is updated immediately.
If you use Excel Solver and make a change, you'll have to re-run
Solver to calculate a new solution. A disadvantage of using matrix
functions is that they restrict you to solving linear equations. This is
where using Excel Solver can be an advantage; you may be able to
solve nonlinear equations as well as linear ones.
Download the spreadsheet for this tutorial. (Note: you'll need Excel 95
or later to use this spreadsheet.)