Statistical Problems in Particle Physics Louis Lyons Oxford IPAM, November 2004 1 HOW WE MAKE PROGRESS Read Statistics books Kendal + Stuart Papers, internal notes Feldman-Cousins, Orear,…….. Experiment Statistics Committees BaBar, CDF Books by Particle Physicists Eadie, Brandt, Frodeson, Lyons, Barlow, Cowan, Roe,… PHYSTAT series of Conferences 2 PHYSTAT • History of Conferences • Overview of PHYSTAT 2003 • Specific Items •Bayes and Frequentism •Goodness of Fit •Systematics •Signal Significance •At the pit-face • Where are we now ? 3 HISTORY Where CERN Fermilab Durham SLAC When Jan 2000 March 2000 March 2002 Sept 2003 Issues Limits Limits Wider range of topics Wider range of topics Physicists Particles Particles Particles +3 astrophysicists +3 astrophysicists Particles + Astro + Cosmo Statisticians 3 3 2 Many 4 Future PHYSTAT05 Oxford, Sept 12th – 15th 2005 Information from l.lyons@physics.ox.ac.uk Limited to 120 participants Committee: Peter Clifford, David Cox, Jerry Friedman Eric Feigelson, Pedro Ferriera, Tom Loredo, Jeff Scargle, Joe Silk 5 Issues •Bayes versus Frequentism •Limits, Significance, Future Experiments •Blind Analyses •Likelihood and Goodness of Fit •Multivariate Analysis •Unfolding •At the pit-face •Systematics and Frequentism 6 Talks at PHYSTAT 2003 2 Introductory Talks 8 Invited talks by Statisticians 8 Invited talks by Physicists 47 Contributed talks Panel Discussion Underlying much of the discussion: Bayes and Frequentism 7 Invited Talks by Statisticians Brad Efron Bayesian, Frequentists & Physicists Persi Diaconis Bayes Jerry Friedman Machine Learning Chris Genovese Multiple Tests Nancy Reid Likelihood and Nuisance Parameters Philip Stark Inference with physical constraints David VanDyk Markov chain Monte Carlo John Rice Conference Summary 8 Invited Talks by Physicists Eric Feigelson Statistical issues for Astroparticles Roger Barlow Statistical issues in Particle Physics Frank Porter BaBar Seth Digel GLAST Ben Wandelt WMAP Bob Nichol Data mining Fred James Teaching Frequentism and Bayes Pekka Sinervo Systematic Errors Harrison Prosper Multivariate Analysis Daniel Stump Partons 9 Bayes versus Frequentism Old controversy Bayes 1763 Frequentism 1937 Both analyse data (x) statement about parameters ( ) e.g. Prob ( l u ) = 90% but very different interpretation Both use Prob (x; ) 10 Bayesian P( B; A) x P( A) P( A; B) P( B) Bayes Theorem P( param; data ) P(data; param) x P(param) posterior likelihood prior Problems: P(param) True or False “Degree of belief” Prior What functional form? Flat? Which variable? Unimportant when “data overshadows prior” 11 Important for limits P (Data;Theory) P (Theory;Data) HIGGS SEARCH at CERN Is data consistent with Standard Model? or with Standard Model + Higgs? End of Sept 2000 Data not very consistent with S.M. Prob (Data ; S.M.) < 1% valid frequentist statement Turned by the press into: Prob (S.M. ; Data) < 1% and therefore Prob (Higgs ; Data) > 99% i.e. “It is almost certain that the Higgs has been seen” 12 P (Data;Theory) P (Theory;Data) Theory = male or female Data = pregnant or not pregnant P (pregnant ; female) ~ 3% but P (female ; pregnant) >>>3% 13 l Frequentist u l at 90% confidence and u known, but random unknown, but fixed Probability statement about Bayesian and l u l and u known, and fixed unknown, and random Probability/credible statement about 14 Bayesian versus Frequentism Bayesian Basis of method Bayes Theorem --> Posterior probability distribution Frequentist Uses pdf for data, for fixed parameters Meaning of Degree of belief probability Prob of Yes parameters? Frequentist defintion Needs prior? Yes No Choice of interval? Data considered Likelihood principle? Yes Yes (except F+C) Only data you have ….+ more extreme Yes No Anathema 15 Bayesian versus Frequentism Bayesian Ensemble of experiment No Final statement Posterior probability distribution Unphysical/ empty ranges Excluded by prior Systematics Integrate over prior Coverage Decision making Unimportant Yes (uses cost function) Frequentist Yes (but often not explicit) Parameter values Data is likely Can occur Extend dimensionality of frequentist construction Built-in Not useful 16 Bayesianism versus Frequentism “Bayesians address the question everyone is interested in, by using assumptions no-one believes” “Frequentists use impeccable logic to deal with an issue of no interest to anyone” 17 Goodness of Fit Basic problem: 2 very general applicability, but • Requires binning, with ni> 5…..20 events per bin. Prohibitive with sparse data in several dimensions. • Not sensitive to signs of deviations K-S and related tests overcome these, but work in 1-D So, need something else. 18 Goodness of Fit Talks Zech Energy test Heinrich Yabsley & Kinoshita Lmax ? Raja Narsky What do we really know? Pia Software Toolkit for Data Analysis Ribon Blobel ……………….. Comments on 2 minimisation 19 Goodness of Fit Gunter Zech “Multivariate 2-sample test based on logarithmic distance function” See also: Aslan & Zech, Durham Conf., “Comparison of different goodness of fit tests” R.B. D’Agostino & M.A. Stephens, “Goodness of fit techniques”, Dekker (1986) 20 Likelihood & Goodness of Fit Joel Heinrich CDF note #5639 Faulty Logic: Parameters determined by maximising L So larger So larger Lmax Lmax is better implies better fit of data to hypothesis Monte Carlo dist of Lmax for ensemble of expts 21 Lmax not very useful e.g. Lifetime dist Fit for i.e. function only of t Therefore any data with the same t same so Lmax Lmax not useful for testing distribution (Distribution of Lmax due simply to different t in samples) 22 SYSTEMATICS For example Nevents LA b we need to know these, Observed Physics parameter probably from other measurements (and/or theory) N N for statistical errors Shift Central Value Bayesian Uncertainties error in Some are arguably statistical errors LA LA 0 LA b b0 b Frequentist Mixed 23 Shift Nuisance Parameters Nevents LA b Simplest Method Evaluate 0 using LA 0 and b0 Move nuisance parameters (one at a time) by their errors LA & b If nuisance parameters are uncorrelated, combine these contributions in quadrature total systematic 24 Bayesian p ; N p N; Without systematics prior With systematics p , LA, b; N p N ; , LA, b , LA, b ~ 1 2 LA 3 b Then integrate over LA and b p ; N p , LA, b; N dLA db 25 p ; N p , LA, b; N dLA db If 1 = constant and 2 LA = truncated Gaussian TROUBLE! Upper limit on from p ; N d Significance from likelihood ratio for 0 and max 26 Frequentist Full Method Imagine just 2 parameters and LA and 2 measurements N and M Physics Nuisance Do Neyman construction in 4-D Use observed N and M, to give Confidence Region for LA and LA 68% 27 Then project onto axis This results in OVERCOVERAGE Aim to get better shaped region, by suitable choice of ordering rule Example: Profile likelihood ordering LN0 , M 0 ; , LAbest LN0 , M 0 ; best , LAbest 28 Full frequentist method hard to apply in several dimensions Used in 3 parameters For example: Neutrino oscillations (CHOOZ) sin 2 , m 2 2 Normalisation of data Use approximate frequentist methods that reduce dimensions to just physics parameters e.g. Profile pdf i.e. pdf profile N ; pdf N , M 0 ; , LAbest Contrast Bayes marginalisation Distinguish “profile ordering” 29 Properties being studied by Giovanni Punzi Talks at FNAL CONFIDENCE LIMITS WORKSHOP (March 2000) by: Gary Feldman Wolfgang Rolk p-ph/0005187 version 2 Acceptance uncertainty worse than Background uncertainty Limit of C.L. as 0 C.L. for 0 Need to check Coverage 30 Method: Mixed Frequentist - Bayesian Bayesian for nuisance parameters and Frequentist to extract range Philosophical/aesthetic problems? Highland and Cousins NIM A320 (1992) 331 (Motivation was paradoxical behaviour of Poisson limit when LA not known exactly) 31 Systematics & Nuisance Parameters Sinervo Invited Talk (cf Barlow at Durham) Barlow Asymmetric Errors Dubois-Felsmann Theoretical errors, for BaBar CKM Cranmer Nuisance Param in Hypothesis Testing Higgs search at LHC with uncertain bgd Rolke Profile method see also: talk at FNAL Workshop and Feldman at FNAL (N.B. Acceptance uncertainty worse than bgd uncertainty) Demortier Berger and Boos method 32 Systematics: Tests Do test (e.g. does result depend on day of week?) Barlow: Are you (a) estimating effect, or (b) just checking? • If (a), correct and add error • If (b), ignore if OK, worry if not OK BUT: 1) Quantify OK 2) What if still not OK after worrying? My solution: 2 2 Contribution to systematics’ variance is a b even if negative! 33 Barlow: Asymmetric Errors e.g. 5 4 2 Either statistical or systematic How to combine errors ( Combine upper errors in quadrature is clearly wrong) How to calculate 2 How to combine results 34 Significance Significance = S / B ? Potential Problems: •Uncertainty in B •Non-Gaussian behavior of Poisson •Number of bins in histogram, no. of other histograms [FDR] •Choice of cuts (Blind analyses •Choice of bins Roodman and Knuteson) For future experiments • Optimising S / B could give S =0.1, B = 10-6 35 Talks on Significance Genovese Multiple Tests Linnemann Comparing Measures of Significance Rolke How to claim a discovery Shawhan Detecting a weak signal Terranova Scan statistics Quayle Higgs at LHC Punzi Sensitivity of future searches Bityukov Future exclusion/discovery limits 36 Multivariate Analysis Friedman Machine learning Prosper Experimental review Cranmer A statistical view Loudin Comparing multi-dimensional distributions Roe Reducing the number of variables (Cf. Towers at Durham) Hill Optimising limits via Bayes posterior ratio Etc. 37 From the Pit-face Roger Barlow William Quayle etc. From Durham: Chris Parkes Bruce Yabsley Asymmetric errors Higgs search at LHC Combining W masses and TGCs Belle measurements 38 Blind Analyses Potential problem: Experimenters’ bias Original suggestion? Luis Alvarez concerning Fairbank’s ‘discovery’ of quarks Aaron Roodman’s talk Methods of blinding: • Keep signal region box closed • Add random numbers to data • Keep Monte Carlo parameters blind • Use part of data to define procedure Don’t modify result after unblinding, unless………. Select between different analyses in pre-defined way See also Bruce Knuteson: QUAERO, SLEUTH, Optimal binning 39 Where are we? • Things that we learn from ourselves – Having to present our statistical analyses • Learn from each other – Likelihood not pdf for parameter • Don’t integrate L – – – – – – Conf int not Prob(true value in interval; data) Bayes’ theorem needs prior 2 Flat prior in m or in m are different Max prob density is metric dependent Prob (Data;Theory) not same as Prob(Theory;Data) Difference of Frequentist and Bayes (and other) intervals wrt Coverage – Max Like not usually suitable for Goodness of Fit 40 Where are we? •Learn from Statisticians –Update of Current Statistical Techniques –Bayes: Sensitivity to prior –Multivariate analysis –Neural nets –Kernel methods –Support vector machines –Boosting decision trees –Hypothesis Testing : False discovery rate –Goodness of Fit : Friedman at Panel Discussion –Nuisance Parameters : Several suggestions 41 Conclusions Very useful physicists/statisticians interaction e.g. Upper Limit on Poisson parameter when: observe n events background, acceptance have some uncertainty For programs, transparencies, papers, etc. see: http://www-conf.slac.stanford.edu/phystat2003 Workshops: Software, Goodness of Fit, Multivariate methods,… Mini-Workshop: Variety of local issues Future: PHYSTAT05 in Oxford, Sept 12th – 15th, 2005 Suggestions to: l.lyons@physics.ox.ac.uk 42