Physics 2660: Fundamentals of Scientific Computing Lecture 12
Notes
•  Only 3 weeks left in the semester, 3 lectures left including this one:
–  19 April
–  26 April
–  3 May
•  Labs: two labs left
–  21 April
–  28 April
•  Upcoming homeworks:
–  HW11 due Monday 25 April at midnight
–  HW12 due Saturday 30 April at 6pm
–  HW13 due Wednesday 4 May at midnight
–  HW14 due Wednesday 4 May at midnight
•  Solutions to all labs and homeworks are in the process of being posted … sorry for the recent delays
Notes
•  Final exam is coming:
–  Take-home projects
•  3 or 4 problems
–  Like a more involved, longer multipart homework assignment
–  Assigned last week of semester on Tuesday 3 May
–  Due Thursday 12 May:
•  electronic copies by 9:00am
•  hard copies must be submitted Thursday 12 May between 08:00-10:00 in room 022-C, our computer lab
Notes
•  Office hours reminder:
–  My office hours are in Room 022-C (our computer lab) from 3:30-5pm on Tuesdays or by appointment
•  Today they will start a little late!
•  3:45 or so
–  TA office hours, also in Room 022-C
•  Mondays 5-8pm
•  Tuesdays 5-8pm
Review and Today’s Outline
•  Last time:
–  Three probability distributions and the Gaussian Limit
–  Experimental Uncertainties
–  Comparing two models
•  Today: some powerful ideas!
–  Monte Carlo methods
–  Comparing two models
–  Tuning a model/theory to best match the data
–  Searching
–  Sorting
Comparing Data to a Prediction
Comparing Data to Some Prediction
•  This is science at its best!
0.  Prediction
1.  Observation
2.  Comparison
3.  Conclusion
4.  Refine Prediction
5.  Repeat as necessary
•  The comparison step is crucial to how we arrive at a refined picture of how the world works.
–  That's our mission as scientists, no?
•  Great news: there are numerical methods one can use to do this quantitatively, which makes them perfect for executing in computer programs!
How good is this theory?
Suggestions for a simple model?
Question: How well does this model fit the data?
Which theory is better?
How to arbitrate between these two?
Calculation of Chi2
Comparison of Models: Chi2 Values
Chi2 Distribution
•  So the probability of having a measurement with χ2 > N can be determined from this χ2 distribution
•  For one data point (one degree of freedom), this distribution is the same as the one obtained from the integral of the Gaussian distribution
•  Why use χ2 rather than the Gaussian integral directly?
–  We can calculate the χ2 for more than one data point and easily combine into a single figure of merit
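
A minimal sketch of the two ideas above, assuming the standard definition of χ2 (each point contributes its squared deviation from the prediction, measured in units of its uncertainty) and Gaussian uncertainties. The function names and the numbers are hypothetical, chosen only for illustration.

```python
import math

def chi2_term(measured, predicted, sigma):
    """Chi2 contribution of one data point: squared deviation in units of sigma."""
    return ((measured - predicted) / sigma) ** 2

def chi2_total(measured, predicted, sigmas):
    """Combine the per-point contributions into a single figure of merit."""
    return sum(chi2_term(m, p, s) for m, p, s in zip(measured, predicted, sigmas))

def prob_exceed_one_dof(chi2):
    """P(chi2 > value) for ONE data point, i.e. the two-sided Gaussian tail."""
    return math.erfc(math.sqrt(chi2 / 2.0))

# Hypothetical measurements, predictions and uncertainties.
data = [10.2, 11.9, 14.1]
model = [10.0, 12.0, 14.0]
errors = [0.2, 0.1, 0.1]

one_point = chi2_term(data[0], model[0], errors[0])   # a 1-sigma deviation
all_points = chi2_total(data, model, errors)          # three 1-sigma deviations
p_one = prob_exceed_one_dof(one_point)                # about 32% for 1 sigma

print(one_point, all_points, p_one)
```

The one-point probability is just the familiar Gaussian statement that a measurement lands more than 1σ from the prediction about 32% of the time; the summed value is what allows many points to be judged at once.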
More Data Points: More Degrees of Freedom
Degrees of Freedom
•  When comparing a theory to some data, as we are doing here, each compared prediction from the model is called a degree of freedom of the comparison
–  comparing 1 data point = 1 degree of freedom
–  comparing 5 data points = 5 degrees of freedom
–  comparing N data points = N degrees of freedom
•  The χ2 distribution changes as one considers a comparison with more degrees of freedom
Many Degrees of Freedom
For a large number of degrees of freedom k, the most probable value of χ2 is approximately equal to k (the mean is exactly k; the peak of the distribution sits at k − 2).
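
As a rough numerical check of this statement, the sketch below evaluates the standard χ2 probability density on a grid and locates its peak and mean. The helper names, the grid step and the choice k = 50 are assumptions made for illustration.

```python
import math

def chi2_pdf(x, k):
    """Chi2 probability density for k degrees of freedom (x > 0)."""
    half_k = k / 2.0
    log_pdf = ((half_k - 1.0) * math.log(x) - x / 2.0
               - half_k * math.log(2.0) - math.lgamma(half_k))
    return math.exp(log_pdf)

def peak_and_mean(k, step=0.01, x_max_factor=4.0):
    """Scan a grid from step up to about x_max_factor * k; return (peak, mean)."""
    xs = [step * i for i in range(1, int(x_max_factor * k / step))]
    densities = [chi2_pdf(x, k) for x in xs]
    peak = xs[max(range(len(xs)), key=densities.__getitem__)]
    mean = sum(x * d for x, d in zip(xs, densities)) * step  # simple Riemann sum
    return peak, mean

peak_50, mean_50 = peak_and_mean(50)
print(peak_50, mean_50)
```

For k = 50 the peak and the mean differ by only 2 units out of 50, which is why "about k" is a good rule of thumb once k is large.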
Reduced χ2
Probabilities for Reduced χ2
As a rough rule of thumb, a reduced chi2 of ~1.0 indicates good agreement between samples, given their uncertainties
Probability of being consistent?
If this theory were an accurate representation of our data…
Probabilities for Reduced χ2
Notes:
•  Too LARGE a reduced chi2 implies poor agreement between theory and data.
•  Too SMALL a reduced chi2 implies one could be OVERFITTING the data: the agreement should still be impacted by the uncertainty on each point.
•  A reduced chi2 ~= 1.0 indicates theory and data are in accord within uncertainties, i.e., the measurements fall sometimes high (50%) and sometimes low (50%) relative to the prediction.
Usefulness of χ2
Summary so far…
•  Compare some data to a model, account for uncertainties
•  Calculate reduced χ2
•  If good agreement
–  should see different points sometimes high/low
–  if k large, reduced χ2 ~ 1.0
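
The summary can be sketched in a few lines, assuming Gaussian uncertainties. The probability of exceeding a given χ2 for k degrees of freedom is computed here from a series expansion of the regularized incomplete gamma function; `chi2_survival`, `reduced_chi2` and the example numbers are hypothetical names and values, not part of the course material.

```python
import math

def chi2_survival(chi2, k):
    """P(chi2 > value) for k degrees of freedom.

    Uses the series for the regularized lower incomplete gamma function
    P(a, x) with a = k/2 and x = chi2/2, then returns 1 - P(a, x).
    """
    a, x = k / 2.0, chi2 / 2.0
    if x <= 0.0:
        return 1.0
    term = 1.0 / a
    total = term
    for n in range(1, 10000):
        term *= x / (a + n)
        total += term
        if term < total * 1e-15:
            break
    lower = total * math.exp(-x + a * math.log(x) - math.lgamma(a))
    return min(1.0, max(0.0, 1.0 - lower))

def reduced_chi2(chi2, k):
    """Chi2 per degree of freedom."""
    return chi2 / k

# Hypothetical comparison: chi2 = 12.3 from 10 compared data points.
chi2_value, k = 12.3, 10
print(reduced_chi2(chi2_value, k), chi2_survival(chi2_value, k))
```

A reduced χ2 near 1 together with a probability that is neither very small nor very close to 1 is the signature of good agreement described above.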
Tuning a Model to Best Match Some Data
Tuning a Model
Can we figure out which model – which values of a and b – the data most favors?
Probability of Some Observation
Probability of Multiple Observations
The probability of the collection of data – 3 observations – is just the product of the three individual probabilities
The probability of the collection of data – k observations – is just the product of the k individual probabilities
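
To see why this product leads to χ2, the sketch below assumes each observation is independent and Gaussian-distributed around the model prediction. Under that assumption the product of the individual probability densities equals a constant times exp(−χ2/2), so the most probable model is the one with the smallest χ2. All names and numbers are illustrative.

```python
import math

def gaussian_density(y, mu, sigma):
    """Gaussian probability density of observing y given prediction mu."""
    return math.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def joint_probability(ys, mus, sigmas):
    """Product of the individual probabilities (independent observations)."""
    product = 1.0
    for y, mu, s in zip(ys, mus, sigmas):
        product *= gaussian_density(y, mu, s)
    return product

def chi2(ys, mus, sigmas):
    """Sum of squared deviations in units of the uncertainties."""
    return sum(((y - mu) / s) ** 2 for y, mu, s in zip(ys, mus, sigmas))

# Hypothetical observations, predictions and uncertainties.
ys = [4.8, 7.3, 8.9]
mus = [5.0, 7.0, 9.0]
sigmas = [0.3, 0.2, 0.4]

joint = joint_probability(ys, mus, sigmas)
normalization = 1.0
for s in sigmas:
    normalization /= s * math.sqrt(2.0 * math.pi)
from_chi2 = normalization * math.exp(-0.5 * chi2(ys, mus, sigmas))

print(joint, from_chi2)  # the two numbers agree
```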
P: The χ2 Likelihood Function
Minimizing the χ2
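
A minimal sketch of the minimization, assuming the model is a straight line y = a·x + b (the form of the model with parameters a and b is an assumption here). For a line, setting the derivatives of χ2 with respect to a and b to zero gives a closed-form solution, the usual weighted least-squares result. The data values are hypothetical.

```python
def fit_line(xs, ys, sigmas):
    """Return (a, b) minimizing chi2 for the model y = a*x + b."""
    weights = [1.0 / s ** 2 for s in sigmas]
    s_w = sum(weights)
    s_x = sum(w * x for w, x in zip(weights, xs))
    s_y = sum(w * y for w, y in zip(weights, ys))
    s_xx = sum(w * x * x for w, x in zip(weights, xs))
    s_xy = sum(w * x * y for w, x, y in zip(weights, xs, ys))
    delta = s_w * s_xx - s_x ** 2
    a = (s_w * s_xy - s_x * s_y) / delta
    b = (s_xx * s_y - s_x * s_xy) / delta
    return a, b

def chi2_line(a, b, xs, ys, sigmas):
    """Chi2 of the data with respect to the line y = a*x + b."""
    return sum(((y - (a * x + b)) / s) ** 2 for x, y, s in zip(xs, ys, sigmas))

# Hypothetical data with equal uncertainties.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.1]
sigmas = [0.2] * 5

a_best, b_best = fit_line(xs, ys, sigmas)
chi2_best = chi2_line(a_best, b_best, xs, ys, sigmas)
print(a_best, b_best, chi2_best)
```

Moving either parameter away from the returned values can only increase χ2, which is one quick way to sanity-check a fit.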
More Powerful Application: An Arbitrary Theory
Fitting with Gnuplot
$P(x; a, b, c) = \frac{1}{\sqrt{2\pi c^2}} \, e^{-\frac{(ax-b)^2}{2c^2}}$
Assessing the Quality of a Fit
Trivial case
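Consequence: If the number of fit parameters is greater than or equal to the number of data points, the fit has no remaining degrees of freedom (number of points minus number of parameters ≤ 0), and the reduced χ2 is undefined.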
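
The trivial case can be illustrated with two data points and a two-parameter straight line: the line passes exactly through both points, so χ2 is zero and nothing is left over to test the model. The sketch assumes the number of degrees of freedom of a fit is the number of points minus the number of fitted parameters; function names and values are hypothetical.

```python
def degrees_of_freedom(n_points, n_params):
    """Degrees of freedom left after fitting n_params parameters."""
    return n_points - n_params

def reduced_chi2(chi2, n_points, n_params):
    """Chi2 per degree of freedom; refuses to divide when none are left."""
    ndf = degrees_of_freedom(n_points, n_params)
    if ndf <= 0:
        raise ValueError("no degrees of freedom left: reduced chi2 is undefined")
    return chi2 / ndf

# Two hypothetical points and the line through them (2 parameters).
(x1, y1), (x2, y2) = (1.0, 2.0), (3.0, 8.0)
sigma = 0.5
slope = (y2 - y1) / (x2 - x1)
intercept = y1 - slope * x1
chi2_value = sum(((y - (slope * x + intercept)) / sigma) ** 2
                 for x, y in [(x1, y1), (x2, y2)])

print(chi2_value, degrees_of_freedom(2, 2))  # perfect "agreement", zero dof
```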
$P(x; a, b, c) = \frac{1}{\sqrt{2\pi c^2}} \, e^{-\frac{(ax-b)^2}{2c^2}}$
So there is a 90% probability that, if the data were consistent with the model (here a Gaussian-like function with 3 parameters), the data would have a higher chi2 value. Too good to be true? Why are the points so close to the model? Did the fit procedure cheat in some way? Are the uncertainties overestimated?
Fitting is Done EVERYWHERE
Curve Fitting
Deviations from the Model
The Pull Distribution
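
A short sketch, assuming the conventional definition of a pull: the deviation of each measurement from the model prediction divided by the uncertainty on that measurement. If the model describes the data and the uncertainties are estimated correctly, the pulls should scatter around 0 with a width of about 1. Names and numbers below are illustrative only.

```python
import math

def pulls(measured, predicted, sigmas):
    """Deviation of each point from the model in units of its uncertainty."""
    return [(m - p) / s for m, p, s in zip(measured, predicted, sigmas)]

def mean_and_width(values):
    """Mean and RMS width of a list of values."""
    n = len(values)
    mean = sum(values) / n
    width = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return mean, width

# Hypothetical data, model predictions and uncertainties.
data = [5.3, 6.8, 9.4, 10.7, 13.2]
model = [5.0, 7.0, 9.0, 11.0, 13.0]
errors = [0.3, 0.3, 0.4, 0.3, 0.2]

pull_values = pulls(data, model, errors)
print(pull_values, mean_and_width(pull_values))
```

A mean far from 0 hints at a bias in the prediction; a width far from 1 hints at misestimated uncertainties.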
Bias – Is the Prediction In Accord with the Data?
Clusters of Data Above/Below
More Testing of Compatibility
Cumulative Distribution Function
Example: PDF and CDF
Empirical Distribution Function
•  The ECDF is made from “unbinned” data
–  not from a binned histogram
–  use raw measured values
•  Do this by:
1.  say you have N values, xi
2.  sort the N values in order of increasing value
3.  plot each of the N values with xi on the x-axis and i/N on the y-axis
•  Now, compare model's CDF and the data's ECDF…
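
The recipe above can be sketched as follows. The comparison step shown here takes the largest vertical distance between the ECDF and the model CDF, which is the quantity used by the Kolmogorov-Smirnov test; that choice, the Gaussian model CDF and the sample values are assumptions for illustration.

```python
import math

def ecdf_points(values):
    """Sort the raw (unbinned) values and pair the i-th one with i/N."""
    ordered = sorted(values)
    n = len(ordered)
    return [(x, (i + 1) / n) for i, x in enumerate(ordered)]

def gaussian_cdf(x, mu=0.0, sigma=1.0):
    """CDF of a Gaussian model with mean mu and width sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def max_cdf_distance(values, model_cdf):
    """Largest vertical gap between the ECDF staircase and the model CDF."""
    ordered = sorted(values)
    n = len(ordered)
    distance = 0.0
    for i, x in enumerate(ordered):
        c = model_cdf(x)
        # The ECDF jumps from i/N to (i+1)/N at x, so check both sides of the step.
        distance = max(distance, abs((i + 1) / n - c), abs(i / n - c))
    return distance

# Hypothetical unbinned measurements compared to a unit Gaussian model.
sample = [-1.2, -0.4, 0.1, 0.3, 0.9, 1.6]
print(ecdf_points(sample))
print(max_cdf_distance(sample, gaussian_cdf))
```

A small maximum distance means the data's ECDF tracks the model's CDF closely; converting that distance into a probability requires the distribution of the test statistic, which is not covered in this sketch.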
Testing Compatibility