Environmental Data Analysis with MatLab
2nd Edition
Lecture 6:
The Principle of Least Squares
SYLLABUS
Lecture 01: Using MatLab
Lecture 02: Looking At Data
Lecture 03: Probability and Measurement Error
Lecture 04: Multivariate Distributions
Lecture 05: Linear Models
Lecture 06: The Principle of Least Squares
Lecture 07: Prior Information
Lecture 08: Solving Generalized Least Squares Problems
Lecture 09: Fourier Series
Lecture 10: Complex Fourier Series
Lecture 11: Lessons Learned from the Fourier Transform
Lecture 12: Power Spectra
Lecture 13: Filter Theory
Lecture 14: Applications of Filters
Lecture 15: Factor Analysis
Lecture 16: Orthogonal Functions
Lecture 17: Covariance and Autocorrelation
Lecture 18: Cross-correlation
Lecture 19: Smoothing, Correlation and Spectra
Lecture 20: Coherence; Tapering and Spectral Analysis
Lecture 21: Interpolation
Lecture 22: Linear Approximations and Non Linear Least Squares
Lecture 23: Adaptable Approximations with Neural Networks
Lecture 24: Hypothesis Testing
Lecture 25: Hypothesis Testing continued; F-Tests
Lecture 26: Confidence Limits of Spectra, Bootstraps
Goals of the lecture
estimate model parameters using the
principle of least-squares
part 1
the least squares estimation of model parameters
and their covariance
the prediction error
the mismatch between observed and predicted data motivates us to define an error vector, e, with elements ei = diobs − dipre, or e = dobs − Gm
prediction error in straight line case
[Figure: plot of linedata01.txt, data d versus the auxiliary variable x, showing the observed data diobs, the predicted straight line dipre, and an individual error ei]
total error
a single number summarizing the error is the sum of squares of the individual errors:
E = eTe = Σi ei2
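In MatLab, assuming a column vector dobs of observations, a data kernel G, and trial model parameters m (variable names hypothetical), this is simply:
e = dobs - G*m;    % error vector: observed minus predicted data
E = e'*e;          % total error: sum of squared errors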
the principle of least-squares
choose the estimate mest of the model parameters that minimizes the total error E
least-squares and probability
suppose that each observation has a Normal p.d.f. with mean d̄i and variance σd2:
p(di) ∝ exp{ −(di − d̄i)2 / (2σd2) }
for uncorrelated data the joint p.d.f. is just the product of
the individual p.d.f.'s:
p(d) ∝ exp{ −(1/(2σd2)) Σi (di − d̄i)2 }
the sum of squares in the exponent resembles the least-squares formula for E, which suggests a link between probability and least-squares
now assume that Gm predicts the mean of d
with Gm substituted for d̄, the joint p.d.f. becomes
p(d) ∝ exp{ −(d − Gm)T(d − Gm) / (2σd2) } = exp{ −E(m) / (2σd2) }
so minimizing E(m) is equivalent to maximizing p(d)
the principle of least-squares
determines the m
that makes the observations
“most probable”
in the sense of maximizing p(dobs)
the principle of least-squares
determines the model parameters
that make the observations
“most probable”
(provided that the data are Normal)
this is
the principle of maximum likelihood
a formula for mest
at the point of minimum error, E,
∂E / ∂mi = 0
so solve this equation for mest
Result: mest = [GTG]-1GTd
where the result comes from
start with the formula for the total error:
E = eTe = Σi ( di − Σj Gij mj )2
so, using the chain rule,
∂E/∂mk = 2 Σi ( di − Σj Gij mj ) ( −Σj Gij ∂mj/∂mk )
the derivative ∂mj/∂mk is unity when k=j and zero when k≠j, since the m's are independent
so just delete the sum over j and replace j with k, which gives
∂E/∂mk = −2 Σi Gik ( di − Σj Gij mj ) = 0
in matrix form, GTG mest = GTd, so mest = [GTG]-1GTd
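As a quick numerical check (a sketch only, using hypothetical synthetic data), the closed-form result agrees with MatLab's backslash operator:
% synthetic straight-line data with a little noise
N = 50;
x = linspace(-5,5,N)';
G = [ones(N,1), x];
dobs = G*[2; 0.5] + 0.1*randn(N,1);
mest1 = inv(G'*G)*(G'*dobs);   % the least-squares formula
mest2 = (G'*G)\(G'*dobs);      % same result via backslash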
covariance of mest
mest is a linear function of d of the form mest = M d
so Cm = M Cd MT, with M=[GTG]-1GT
assume Cd uncorrelated with uniform variance, σd2
then Cm = σd2 [GTG]-1
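A small MatLab sketch (with a hypothetical kernel G) confirming that the general rule Cm = M Cd MT reduces to this form when Cd = σd2 I:
N = 5;
G = [ones(N,1), (1:N)'];    % small example data kernel
sd2 = 0.25;                 % uniform data variance
Cd = sd2*eye(N);            % uncorrelated data covariance
Mop = inv(G'*G)*G';         % mest = Mop*d
Cm1 = Mop*Cd*Mop';          % general rule Cm = M*Cd*M'
Cm2 = sd2*inv(G'*G);        % simplified form; agrees with Cm1 to rounding error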
two methods of estimating the variance
of the data
prior estimate: use knowledge of
measurement technique
the ruler has 1mm tic marks, so σd≈½mm
posterior estimate: use the prediction error
σd2 ≈ E / (N−M)
posterior estimates are overestimates
when the model is poor
reduce N by M since an M-parameter model can exactly fit M data
confidence intervals for the estimated
model parameters
(assuming uncorrelated data of equal variance)
so
σmi = √[Cm]ii
and
mi = miest ± 2σmi (95% confidence)
MatLab script for least squares solution
mest = (G'*G)\(G'*d);    % least-squares estimate of the model parameters
Cm = sd2 * inv(G'*G);    % their covariance, given the data variance sd2
sm = sqrt(diag(Cm));     % standard error of each model parameter
part 2
exemplary least squares problems
Example 1: the mean of data
the model is di = m1, so G is a column vector of N ones; the constant m1 will turn out to be the mean
m1est = [GTG]-1GTd = (1/N) Σi di, the usual formula for the mean
Cm = σd2 [GTG]-1 = σd2 / N, so the variance decreases with the number of data
combining the formula for the mean and the formula for the covariance into confidence limits:
m1est = d̄ ± 2σd/√N (95% confidence)
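A minimal MatLab sketch of this example, assuming a column vector d of observations and a known data variance sd2 (names hypothetical):
N = length(d);
G = ones(N,1);               % data kernel for the constant model
mest = (G'*G)\(G'*d);        % equals mean(d)
Cm = sd2*inv(G'*G);          % equals sd2/N
mlow95 = mest - 2*sqrt(Cm);  % 95% confidence limits
mhigh95 = mest + 2*sqrt(Cm);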
Example 2: fitting a straight line
the model is di = m1 + m2 xi, with intercept m1 and slope m2, so G has a column of ones and a column of the xi's
[GTG]-1 = ( 1 / ( N Σi xi2 − (Σi xi)2 ) ) [ Σi xi2, −Σi xi ; −Σi xi, N ]
(uses the rule for inverting a 2×2 matrix)
the intercept and slope are uncorrelated when the mean of x is zero, since the off-diagonal element Σi xi then vanishes
keep in mind that none of this algebraic
manipulation is needed if we just compute
using MatLab
Generic MatLab script
for least-squares problems
mest = (G'*G)\(G'*dobs);       % least-squares estimate of the model parameters
dpre = G*mest;                 % predicted data
e = dobs-dpre;                 % individual errors
E = e'*e;                      % total error
sigmad2 = E / (N-M);           % posterior estimate of the data variance
covm = sigmad2 * inv(G'*G);    % covariance of the model parameters
sigmam = sqrt(diag(covm));     % their standard errors
mlow95 = mest - 2*sigmam;      % 95% confidence limits
mhigh95 = mest + 2*sigmam;
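For instance, a sketch of how Example 2 (the straight line) plugs into this generic script, assuming column vectors x and dobs of length N (names hypothetical):
N = length(x);
M = 2;
G = [ones(N,1), x];    % column of ones (intercept) and column of x's (slope)
% run the generic script above; mest(1) is the intercept, mest(2) the slope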
Example 3: modeling the long-term trend and annual cycle in the Black Rock Forest temperature data
[Figure: three panels versus time t, in days (0 to 5000): the observed temperature d(t)obs, the predicted temperature d(t)pre, and the error e(t), each in deg C and plotted from -40 to 40]
the model: a long-term trend plus an annual cycle
d(t) = m1 + m2 t + m3 cos(2πt/Ty) + m4 sin(2πt/Ty), with period Ty = 365.25 days
MatLab script to create the data kernel
Ty=365.25;                % period of the annual cycle, in days
G=zeros(N,4);
G(:,1)=1;                 % constant offset
G(:,2)=t;                 % long-term trend
G(:,3)=cos(2*pi*t/Ty);    % annual cycle
G(:,4)=sin(2*pi*t/Ty);
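A sketch of how this kernel feeds the generic script above, assuming column vectors t and dobs of daily temperatures (names hypothetical); the trend m2 is in deg C per day, so it is scaled by Ty to quote deg C per year:
mest = (G'*G)\(G'*dobs);                 % fit all four parameters
e = dobs - G*mest;                       % prediction error
sigmad2post = (e'*e)/(N-4);              % posterior data variance
sigmad2prior = 0.01^2;                   % prior data variance (thermometer accuracy)
Cmprior = sigmad2prior*inv(G'*G);
Cmpost = sigmad2post*inv(G'*G);
slope = Ty*mest(2);                      % long-term slope, deg C / yr
dslopeprior = 2*Ty*sqrt(Cmprior(2,2));   % 95% limits based on prior variance
dslopepost = 2*Ty*sqrt(Cmpost(2,2));     % 95% limits based on posterior variance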
prior estimate of σd, based on the accuracy of the thermometer:
σd = 0.01 deg C
posterior estimate of σd, based on the error of the fit:
σd = 5.60 deg C
a huge difference, since the model does not include the diurnal cycle or shorter-term weather patterns
long-term slope
95% confidence limits based on prior variance
m2 = -0.03 ± 0.00002 deg C / yr
95% confidence limits based on posterior variance
m2 = -0.03 ± 0.00460 deg C / yr
in both cases, the cooling trend is significant, in the
sense that the confidence intervals do not include
zero or positive slopes.
However, the fit to the data is poor, so the results should be
used with caution. More effort needs to be put
into developing a better model.
part 3
covariance and the shape of the error surface
solutions within the region of low error are
almost as good as mest
[Figure: the error surface E(m) as a function of m1 and m2, with its minimum at mest = (m1est, m2est); an elongated low-error region corresponds to a large range of acceptable m1 and a small range of acceptable m2, and a one-dimensional cross-section shows E versus mi with its minimum at miest]
near the minimum the error is shaped like a
parabola. The curvature of the parabola
controls the width of the region of low error
near the minimum, the Taylor
series for the error is:
E(m) ≈ E(mest) + ½ (m − mest)T [ ∂2E/∂mi∂mj ] (m − mest)
the matrix of second derivatives is the curvature of
the error surface
starting with the formula for the error,
E = (d − Gm)T(d − Gm)
we compute its 2nd derivative:
∂2E/∂mi∂mj = 2 [GTG]ij
but Cm = σd2 [GTG]-1
so
Cm = 2 σd2 [ ∂2E/∂m2 ]-1
the covariance of the model parameters is controlled by the curvature of the error surface
the covariance of the least squares solution
is expressed
in the shape of the error surface
[Figure: two sketches of E(m) versus mi near the minimum at miest: a broad, gently curved minimum corresponds to large variance, a narrow, sharply curved minimum to small variance]
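A short MatLab sketch, assuming a simple straight-line kernel, showing this relation between curvature and covariance numerically:
N = 30;
x = linspace(0,10,N)';
G = [ones(N,1), x];            % hypothetical two-parameter kernel
sigmad2 = 0.5;                 % assumed data variance
curv = 2*(G'*G);               % curvature (second derivative) of E(m)
Cm = sigmad2*inv(G'*G);        % covariance of the least-squares solution
Cmcheck = 2*sigmad2*inv(curv); % identical to Cm: sharp curvature means small variance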