22S6 - Numerical and data analysis techniques
Mike Peardon
School of Mathematics, Trinity College Dublin
Hilary Term 2012

Mike Peardon (TCD), 22S6 - Data analysis, Hilary Term 2012


Modelling statistical (Monte Carlo) data

Often, we carry out experiments to test a hypothesis.
Since the result is a stochastic variable, the hypothesis can never be proved or disproved.
We need a way to assign a probability that the hypothesis is false.
One place to begin: the $\chi^2$ statistic.
Suppose we have $n$ measurements, $\bar{Y}_i$, $i = 1..n$, each with standard deviation $\sigma_i$. We also have a model which predicts each measurement, giving $y_i$.

The $\chi^2$ statistic:

$$\chi^2 = \sum_{i=1}^{n} \frac{(\bar{Y}_i - y_i)^2}{\sigma_i^2}$$


Goodness of fit

$\chi^2 \ge 0$, and $\chi^2 = 0$ implies $\bar{Y}_i = y_i$ for all $i = 1..n$ (i.e. the model and the data agree perfectly).
Bigger values of $\chi^2$ imply the model is less likely to be true.
Note that $\chi^2$ is itself a stochastic variable.

Rule of thumb: $\chi^2 \approx n$ for a good model.


Models with unknown parameters - fitting

The model may depend on parameters $\alpha_p$, $p = 1 \dots m$.
Now $\chi^2$ is a function of these parameters: $\chi^2(\alpha)$.
If the parameters are not known a priori, the "best fit" model is described by the set of parameters $\alpha^*$ that minimise $\chi^2(\alpha)$, so

$$\left. \frac{\partial \chi^2(\alpha)}{\partial \alpha_p} \right|_{\alpha^*} = 0$$

For linear models, $y_i = \sum_{p=1}^{m} \alpha_p q_i^p$, finding $\alpha^*$ is equivalent to solving a linear system.
For more general models, finding minima of $\chi^2$ can be a challenge...


Example - one parameter fit

Fit a straight line through the origin:
Consider the following measured data $Y_i \pm \sigma_i$, $i = 1..5$, for inputs $x_i$:

    i          1      2      3      4      5
    x_i        0.1    0.5    0.7    0.9    1.0
    Y_i        0.25   0.90   1.20   1.70   2.20
    sigma_i    0.05   0.10   0.05   0.10   0.20

Fit this to a straight line through the origin, so our model is $y(x) = \alpha x$, with $\alpha$ an unknown parameter we want to determine.
Result: $\alpha = 1.8097$ and $\chi^2 = 8.0$.
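
The one-parameter result can be checked numerically. The following is a minimal sketch in plain Python, not code from the lecture: the function names `chi_squared` and `fit_through_origin` are illustrative assumptions. It uses the closed form $\alpha^* = b/A$, with $A = \sum_i x_i^2/\sigma_i^2$ and $b = \sum_i x_i Y_i/\sigma_i^2$, which follows from setting $d\chi^2/d\alpha = 0$ for the model $y(x) = \alpha x$.

```python
# Data from the example table: inputs x_i, measurements Y_i, uncertainties sigma_i.
x = [0.1, 0.5, 0.7, 0.9, 1.0]
Y = [0.25, 0.90, 1.20, 1.70, 2.20]
sigma = [0.05, 0.10, 0.05, 0.10, 0.20]


def chi_squared(alpha, x, Y, sigma):
    """Return chi^2 for the model y(x) = alpha * x (hypothetical helper)."""
    return sum(((Yi - alpha * xi) / si) ** 2 for xi, Yi, si in zip(x, Y, sigma))


def fit_through_origin(x, Y, sigma):
    """Return the alpha that minimises chi^2 for y = alpha * x.

    Setting d(chi^2)/d(alpha) = 0 gives alpha = b / A.
    """
    A = sum(xi ** 2 / si ** 2 for xi, si in zip(x, sigma))
    b = sum(xi * Yi / si ** 2 for xi, Yi, si in zip(x, Y, sigma))
    return b / A


alpha_best = fit_through_origin(x, Y, sigma)
chi2_best = chi_squared(alpha_best, x, Y, sigma)
print(f"alpha = {alpha_best:.4f}, chi^2 = {chi2_best:.1f}")
```

With the table above, this should reproduce the quoted $\alpha = 1.8097$ and $\chi^2 = 8.0$ to the precision shown.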


Example - one parameter fit (2)

[Figure: plot of y against x for the one-parameter fit]


Models with unknown parameters - fitting (2)

Example: fitting data to a straight line
Suppose for a set of inputs $x_i$, $i = 1..n$, we measure outputs $\bar{Y}_i \pm \sigma_i$. If $Y$ is modelled by a simple straight-line function, $y_i = \alpha_1 + \alpha_2 x_i$, what values of $\{\alpha_1, \alpha_2\}$ minimise $\chi^2$?
$\chi^2(\alpha_1, \alpha_2)$ is given by

$$\chi^2(\alpha_1, \alpha_2) = \sum_{i=1}^{n} \frac{(\bar{Y}_i - \alpha_1 - \alpha_2 x_i)^2}{\sigma_i^2}$$

The minimum is at

$$\alpha_1^* = \frac{A_{22} b_1 - A_{12} b_2}{A_{11} A_{22} - A_{12}^2}, \qquad \alpha_2^* = \frac{A_{11} b_2 - A_{12} b_1}{A_{11} A_{22} - A_{12}^2}$$


Models with unknown parameters - fitting (3)

Example: fitting data to a straight line

$$A_{11} = \sum_{i=1}^{n} \frac{1}{\sigma_i^2}, \qquad A_{12} = \sum_{i=1}^{n} \frac{x_i}{\sigma_i^2}, \qquad A_{22} = \sum_{i=1}^{n} \frac{x_i^2}{\sigma_i^2}$$

$$b_1 = \sum_{i=1}^{n} \frac{\bar{Y}_i}{\sigma_i^2}, \qquad b_2 = \sum_{i=1}^{n} \frac{x_i \bar{Y}_i}{\sigma_i^2}$$

The best-fit parameters $\alpha_{1,2}^*$ are themselves stochastic variables, and so have a probabilistic distribution.
A range of likely values must be given; the width is approximated by

$$\sigma_1^{\alpha} = \sqrt{\frac{A_{22}}{A_{11} A_{22} - A_{12}^2}}, \qquad \sigma_2^{\alpha} = \sqrt{\frac{A_{11}}{A_{11} A_{22} - A_{12}^2}}$$


Example - two parameter fit (2)

[Figure: plot of y against x for the two-parameter fit]

Now $\chi^2$ goes down from 8.0 to 7.1.


Example - try both fits again ...

[Figure: plot of y against x showing both fits]

Now $\chi^2$ is 357 for the $y = \alpha x$ model but still 7.1 for the $y = \alpha_1 + \alpha_2 x$ model.
The first model should be ruled out.


Uncertainty propagates

The best-fit parameter(s) $\alpha^*$ have been determined from statistical data, so we must quote an uncertainty. How precisely have they been determined?
$\alpha^*$ is a function of the statistical data, $\bar{Y}$.
A statistical fluctuation in $\bar{Y}$ of $d\bar{Y}$ would result in a fluctuation in $\alpha^*$ of $\frac{d\alpha^*}{d\bar{Y}} \, d\bar{Y}$.
All the measured $Y$ values fluctuate, but if they are independent, the fluctuations only add in quadrature, so:

Error in the best-fit parameters:

$$\sigma_{\alpha^*}^2 = \sum_{i=1}^{n} \left( \frac{d\alpha^*}{d\bar{Y}_i} \right)^2 \sigma_i^2$$


Uncertainty propagates (2)

Back to our example: one-parameter fit
We found $\alpha^* = b/A$ with

$$A = \sum_{i=1}^{n} \frac{x_i^2}{\sigma_i^2} \qquad \text{and} \qquad b = \sum_{i=1}^{n} \frac{x_i \bar{Y}_i}{\sigma_i^2}$$

So, since $A$ is fixed,

$$\frac{d\alpha^*}{d\bar{Y}_i} = \frac{1}{A} \frac{db}{d\bar{Y}_i} = \frac{1}{A} \frac{x_i}{\sigma_i^2}$$

We get

$$\sigma_{\alpha^*}^2 = \frac{1}{A^2} \sum_{i=1}^{n} \left( \frac{x_i}{\sigma_i^2} \right)^2 \sigma_i^2 = \frac{1}{A}$$

Back to our first example: we quote $\alpha^* = 1.81 \pm 0.05$.
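
The closed-form expressions for the straight-line fit and for the parameter uncertainties can be evaluated directly. The sketch below is an illustration in plain Python rather than code from the lecture: the function names `straight_line_fit` and `origin_fit_error` are assumptions, and the data are taken to be the five points from the one-parameter example table.

```python
import math

# Data assumed to be the five points from the one-parameter example table.
x = [0.1, 0.5, 0.7, 0.9, 1.0]
Y = [0.25, 0.90, 1.20, 1.70, 2.20]
sigma = [0.05, 0.10, 0.05, 0.10, 0.20]


def straight_line_fit(x, Y, sigma):
    """Fit y = alpha1 + alpha2 * x by minimising chi^2 (hypothetical helper).

    Returns (alpha1, alpha2, err1, err2, chi2), using the sums
    A11, A12, A22, b1, b2 defined in the slides.
    """
    w = [1.0 / s ** 2 for s in sigma]  # weights 1 / sigma_i^2
    A11 = sum(w)
    A12 = sum(wi * xi for wi, xi in zip(w, x))
    A22 = sum(wi * xi ** 2 for wi, xi in zip(w, x))
    b1 = sum(wi * Yi for wi, Yi in zip(w, Y))
    b2 = sum(wi * xi * Yi for wi, xi, Yi in zip(w, x, Y))

    det = A11 * A22 - A12 ** 2
    alpha1 = (A22 * b1 - A12 * b2) / det
    alpha2 = (A11 * b2 - A12 * b1) / det

    # Widths of the best-fit parameter distributions.
    err1 = math.sqrt(A22 / det)
    err2 = math.sqrt(A11 / det)

    chi2 = sum(wi * (Yi - alpha1 - alpha2 * xi) ** 2
               for wi, xi, Yi in zip(w, x, Y))
    return alpha1, alpha2, err1, err2, chi2


def origin_fit_error(x, sigma):
    """Uncertainty on alpha for y = alpha * x: sigma_alpha = 1 / sqrt(A)."""
    A = sum(xi ** 2 / si ** 2 for xi, si in zip(x, sigma))
    return 1.0 / math.sqrt(A)


alpha1, alpha2, err1, err2, chi2 = straight_line_fit(x, Y, sigma)
print(f"alpha1 = {alpha1:.3f} +/- {err1:.3f}")
print(f"alpha2 = {alpha2:.3f} +/- {err2:.3f}")
print(f"chi^2  = {chi2:.1f}")
print(f"one-parameter sigma_alpha = {origin_fit_error(x, sigma):.3f}")
```

For these data the two-parameter $\chi^2$ comes out close to the 7.1 quoted above, and $1/\sqrt{A}$ is about 0.055, in line with the quoted $\alpha^* = 1.81 \pm 0.05$.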