22S6 - Numerical and data analysis techniques Mike Peardon Hilary Term 2012

22S6 - Numerical and data analysis techniques
Mike Peardon
School of Mathematics
Trinity College Dublin
Hilary Term 2012
Mike Peardon (TCD)
22S6 - Data analysis
Hilary Term 2012
1 / 12
Modelling statistical (Monte Carlo) data
Often, we carry out experiments to test a hypothesis.
Since the result is a stochastic variable, the hypothesis
can never be proved or disproved.
We need a way to assign a probability that the hypothesis is
false. One place to begin: the χ² statistic.
Suppose we have n measurements Ȳ_i, i = 1..n, each with
standard deviation σ_i. We also have a model which
predicts each measurement, giving y_i.
The χ² statistic
$$ \chi^2 = \sum_{i=1}^{n} \frac{(\bar{Y}_i - y_i)^2}{\sigma_i^2} $$
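The definition above translates almost directly into code. The following is a minimal sketch, not part of the lecture: the function name `chi_squared` and the sample numbers are assumptions chosen for illustration.

```python
def chi_squared(measured, predicted, sigma):
    """Return sum_i (Ybar_i - y_i)^2 / sigma_i^2.

    measured  : the measurements Ybar_i
    predicted : the model predictions y_i
    sigma     : the standard deviations sigma_i
    """
    if not (len(measured) == len(predicted) == len(sigma)):
        raise ValueError("all three sequences must have the same length")
    return sum(((Y - y) / s) ** 2 for Y, y, s in zip(measured, predicted, sigma))


# Illustrative numbers only (not taken from the lecture).
print(chi_squared([1.0, 2.0], [1.0, 2.0], [0.1, 0.1]))  # perfect agreement: 0.0
print(chi_squared([1.1, 1.8], [1.0, 2.0], [0.1, 0.1]))  # deviations of 1 and 2 sigma
```

Each term is the squared deviation measured in units of its own standard deviation, so precise measurements are weighted more heavily than imprecise ones.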
Goodness of fit
χ² ≥ 0, and χ² = 0 implies Ȳ_i = y_i for all i = 1..n (i.e. the
model and the data agree perfectly).
Bigger values of χ² imply the model is less likely to be
true.
Note that χ² is itself a stochastic variable.
Rule of thumb:
χ² ≈ n for a good model
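The rule of thumb can be checked with a small Monte Carlo experiment. This sketch assumes Gaussian fluctuations about a model that is exactly correct; the model values, the common σ of 0.5, the seed and the function names are all hypothetical choices, not from the lecture.

```python
import random


def chi_squared(measured, predicted, sigma):
    """Sum of squared deviations, each in units of its standard deviation."""
    return sum(((Y - y) / s) ** 2 for Y, y, s in zip(measured, predicted, sigma))


def mean_chi_squared(n, trials, seed=0):
    """Average chi^2 over many simulated data sets drawn from a correct model."""
    rng = random.Random(seed)                  # fixed seed for reproducibility
    predicted = [float(i) for i in range(n)]   # hypothetical "true" model values
    sigma = [0.5] * n                          # hypothetical measurement errors
    total = 0.0
    for _ in range(trials):
        # Each measurement is Gaussian about the model value with width sigma_i.
        measured = [rng.gauss(y, s) for y, s in zip(predicted, sigma)]
        total += chi_squared(measured, predicted, sigma)
    return total / trials


print(mean_chi_squared(n=10, trials=5000))  # should come out close to n = 10
```

Individual data sets scatter around this average, which is why χ² ≈ n is a rule of thumb rather than an exact condition.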
Models with unknown parameters - fitting
The model may depend on parameters α_p, p = 1 ... m.
Now χ² is a function of these parameters: χ²(α).
If the parameters are not known a priori, the "best fit"
model is described by the set of parameters α* that
minimise χ²(α), so
$$ \left. \frac{\partial \chi^2(\alpha)}{\partial \alpha_p} \right|_{\alpha^*} = 0 $$
For linear models,
$$ y_i = \sum_{p=1}^{m} \alpha_p q_i^p , $$
finding α* is equivalent to solving a linear system.
For more general models, finding minima of χ² can be a
challenge...
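To see why the linear case reduces to a linear system, one can substitute the linear model into χ² and differentiate. The sketch below treats the superscript on q as an index rather than a power, and introduces the symbols A_{pr}, b_p and the summation index r purely for illustration.

```latex
\frac{\partial \chi^2}{\partial \alpha_p}
  = -2 \sum_{i=1}^{n} \frac{q_i^p}{\sigma_i^2}
      \left( \bar{Y}_i - \sum_{r=1}^{m} \alpha_r q_i^r \right) = 0
\quad\Longrightarrow\quad
\sum_{r=1}^{m} A_{pr}\, \alpha_r^* = b_p ,
\qquad
A_{pr} = \sum_{i=1}^{n} \frac{q_i^p q_i^r}{\sigma_i^2} ,
\quad
b_p = \sum_{i=1}^{n} \frac{q_i^p \bar{Y}_i}{\sigma_i^2} .
```

This is a set of m linear equations in the m unknowns α*_r, so any standard linear solver can be used.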
Example - one parameter fit
Fit a straight line through the origin
Consider the following measured data Y_i ± σ_i, i = 1..5, for
inputs x_i:

i     x_i    Y_i     σ_i
1     0.1    0.25    0.05
2     0.5    0.90    0.10
3     0.7    1.20    0.05
4     0.9    1.70    0.10
5     1.0    2.20    0.20

Fit this to a straight line through the origin, so our model is
y(x) = αx
with α an unknown parameter we want to determine.
Result: α = 1.8097 and χ² = 8.0.
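The quoted result can be reproduced in a few lines. For y = αx, setting dχ²/dα = 0 gives α* = b/A with A = Σ x_i²/σ_i² and b = Σ x_i Y_i/σ_i². The sketch below uses the data from the table; the function name `fit_through_origin` is an assumption for illustration.

```python
# Data from the one-parameter example.
x     = [0.1, 0.5, 0.7, 0.9, 1.0]
Y     = [0.25, 0.90, 1.20, 1.70, 2.20]
sigma = [0.05, 0.10, 0.05, 0.10, 0.20]


def fit_through_origin(x, Y, sigma):
    """Minimise chi^2 for the model y = alpha * x; return (alpha, chi2)."""
    A = sum(xi * xi / s ** 2 for xi, s in zip(x, sigma))
    b = sum(xi * Yi / s ** 2 for xi, Yi, s in zip(x, Y, sigma))
    alpha = b / A  # solution of d(chi^2)/d(alpha) = 0
    chi2 = sum(((Yi - alpha * xi) / s) ** 2 for xi, Yi, s in zip(x, Y, sigma))
    return alpha, chi2


alpha, chi2 = fit_through_origin(x, Y, sigma)
print(f"alpha = {alpha:.4f}, chi2 = {chi2:.1f}")  # alpha = 1.8097, chi2 = 8.0
```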
Example - one parameter fit (2)
[Figure: y plotted against x for the one-parameter fit.]
Models with unknown parameters - fitting (2)
Example: fitting data to a straight line
Suppose for a set of inputs x_i, i = 1..n, we measure outputs
Ȳ_i ± σ_i.
If Y is modelled by a simple straight-line function,
y_i = α_1 + α_2 x_i, what values of {α_1, α_2} minimise χ²?
χ²(α_1, α_2) is given by
$$ \chi^2(\alpha_1, \alpha_2) = \sum_{i=1}^{n} \frac{(\bar{Y}_i - \alpha_1 - \alpha_2 x_i)^2}{\sigma_i^2} $$
The minimum is at
$$ \alpha_1^* = \frac{A_{22} b_1 - A_{12} b_2}{A_{11} A_{22} - A_{12}^2} , \qquad
   \alpha_2^* = \frac{A_{11} b_2 - A_{12} b_1}{A_{11} A_{22} - A_{12}^2} $$
Models with unknown parameters - fitting (3)
Example: fitting data to a straight line
$$ A_{11} = \sum_{i=1}^{n} \frac{1}{\sigma_i^2} , \qquad
   A_{12} = \sum_{i=1}^{n} \frac{x_i}{\sigma_i^2} , \qquad
   A_{22} = \sum_{i=1}^{n} \frac{x_i^2}{\sigma_i^2} $$
$$ b_1 = \sum_{i=1}^{n} \frac{\bar{Y}_i}{\sigma_i^2} , \qquad
   b_2 = \sum_{i=1}^{n} \frac{x_i \bar{Y}_i}{\sigma_i^2} $$
The best-fit parameters α*_1 and α*_2 are themselves stochastic
variables, and so have a probability distribution.
A range of likely values must be given; the width is
approximated by
$$ \sigma^{\alpha}_1 = \sqrt{\frac{A_{22}}{A_{11} A_{22} - A_{12}^2}} , \qquad
   \sigma^{\alpha}_2 = \sqrt{\frac{A_{11}}{A_{11} A_{22} - A_{12}^2}} $$
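The closed-form expressions for the straight-line fit can be evaluated directly. The sketch below applies them to the five data points of the one-parameter example; the function name `fit_straight_line` and its return layout are assumptions for illustration.

```python
import math


def fit_straight_line(x, Y, sigma):
    """Weighted straight-line fit y = a1 + a2*x.

    Returns (a1, a2, err1, err2, chi2) using the closed-form
    expressions in terms of A11, A12, A22, b1 and b2.
    """
    w = [1.0 / s ** 2 for s in sigma]                 # weights 1/sigma_i^2
    A11 = sum(w)
    A12 = sum(wi * xi for wi, xi in zip(w, x))
    A22 = sum(wi * xi * xi for wi, xi in zip(w, x))
    b1 = sum(wi * Yi for wi, Yi in zip(w, Y))
    b2 = sum(wi * xi * Yi for wi, xi, Yi in zip(w, x, Y))
    det = A11 * A22 - A12 ** 2
    if det == 0:
        raise ValueError("degenerate system: need at least two distinct x values")
    a1 = (A22 * b1 - A12 * b2) / det
    a2 = (A11 * b2 - A12 * b1) / det
    err1 = math.sqrt(A22 / det)                       # width of a1
    err2 = math.sqrt(A11 / det)                       # width of a2
    chi2 = sum(wi * (Yi - a1 - a2 * xi) ** 2 for wi, xi, Yi in zip(w, x, Y))
    return a1, a2, err1, err2, chi2


# Data from the one-parameter example.
x     = [0.1, 0.5, 0.7, 0.9, 1.0]
Y     = [0.25, 0.90, 1.20, 1.70, 2.20]
sigma = [0.05, 0.10, 0.05, 0.10, 0.20]

a1, a2, err1, err2, chi2 = fit_straight_line(x, Y, sigma)
print(f"a1 = {a1:.3f} +/- {err1:.3f}")
print(f"a2 = {a2:.3f} +/- {err2:.3f}")
print(f"chi2 = {chi2:.1f}")  # chi2 = 7.1
```

The determinant check guards against the case where all x_i coincide, for which the intercept and slope cannot be separated.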
Example - two parameter fit (2)
[Figure: y plotted against x for the two-parameter fit.]
Now χ² goes down from 8.0 to 7.1.
Example - try both fits again ...
[Figure: y plotted against x, showing both fits.]
Now χ² is 357 for the y = αx model but still 7.1 for the
y = α_1 + α_2 x model. The first model should be ruled out.
Uncertainty propagates
The best-fit parameter(s) α* have been determined from
statistical data, so we must quote an uncertainty. How
precisely have they been determined?
α* is a function of the statistical data, Ȳ. A statistical
fluctuation in Ȳ of dȲ would result in a fluctuation in α* of
$$ \frac{d\alpha^*}{d\bar{Y}} \, d\bar{Y} . $$
All the measured Ȳ values fluctuate, but if they are
independent the fluctuations only add in quadrature, so:
Error in the best-fit parameters:
$$ \sigma_{\alpha^*}^2 = \sum_{i=1}^{n} \left( \frac{d\alpha^*}{d\bar{Y}_i} \right)^2 \sigma_i^2 $$
where the sum runs over the n measurements.
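The propagation formula can also be applied numerically when no closed form for the derivatives is at hand. The sketch below estimates each dα*/dȲ_i by a central finite difference and adds the contributions in quadrature, using the one-parameter fit and its data as the test case. The names `best_fit_alpha` and `propagated_error` and the step size are assumptions for illustration.

```python
import math


def best_fit_alpha(x, Y, sigma):
    """Best-fit slope for y = alpha*x: alpha* = b/A."""
    A = sum(xi * xi / s ** 2 for xi, s in zip(x, sigma))
    b = sum(xi * Yi / s ** 2 for xi, Yi, s in zip(x, Y, sigma))
    return b / A


def propagated_error(fit, x, Y, sigma, step=1e-6):
    """Add (d alpha*/dY_i * sigma_i)^2 over all measurements, then take the root.

    Assumes the measurements are independent. Derivatives are estimated
    by central finite differences with the given step.
    """
    variance = 0.0
    for i, s in enumerate(sigma):
        up = list(Y)
        down = list(Y)
        up[i] += step
        down[i] -= step
        derivative = (fit(x, up, sigma) - fit(x, down, sigma)) / (2.0 * step)
        variance += (derivative * s) ** 2
    return math.sqrt(variance)


# Data from the one-parameter example.
x     = [0.1, 0.5, 0.7, 0.9, 1.0]
Y     = [0.25, 0.90, 1.20, 1.70, 2.20]
sigma = [0.05, 0.10, 0.05, 0.10, 0.20]

err = propagated_error(best_fit_alpha, x, Y, sigma)
print(f"alpha = {best_fit_alpha(x, Y, sigma):.2f} +/- {err:.2f}")
```

Because α* is linear in the Ȳ_i here, the finite difference is accurate up to rounding; for a non-linear fit the step size would need more care.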
Uncertainty propagates (2)
Back to our example:
One-parameter fit
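We found α* = b/A with
$$ A = \sum_{i=1}^{n} \frac{x_i^2}{\sigma_i^2} \quad \text{and} \quad
   b = \sum_{i=1}^{n} \frac{x_i \bar{Y}_i}{\sigma_i^2} $$
So
$$ \frac{d\alpha^*}{d\bar{Y}_i} = \frac{1}{A} \frac{db}{d\bar{Y}_i} $$
since A is fixed. We get
$$ \sigma_{\alpha^*}^2 = \frac{1}{A^2} \sum_{i=1}^{n} \left( \frac{x_i}{\sigma_i^2} \right)^2 \sigma_i^2 = \frac{1}{A} $$
Back to our first example: we quote α* = 1.81 ± 0.05.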
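For reference, the arithmetic behind the quoted numbers can be written out term by term from the five data points of the first example (each term is x_i²/σ_i² or x_i Ȳ_i/σ_i² for i = 1..5):

```latex
A = 4 + 25 + 196 + 81 + 25 = 331 , \qquad
b = 10 + 45 + 336 + 153 + 55 = 599 ,
\qquad
\alpha^* = \frac{b}{A} = \frac{599}{331} \approx 1.8097 , \qquad
\sigma_{\alpha^*} = \frac{1}{\sqrt{A}} = \frac{1}{\sqrt{331}} \approx 0.055 .
```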