The Seven Pillars of Statistical Wisdom
Stephen M. Stigler
London School of Hygiene and Tropical Medicine, January 29, 2015

The Seven Pillars of Statistical Wisdom
A reference to T. E. Lawrence? [of Arabia, whose 1927 memoir was titled “Seven Pillars of Wisdom”]
No, rather to Lawrence’s own source: Proverbs IX:1
“Wisdom hath built her house, she hath hewn out her seven pillars.”
Wisdom’s house was built to welcome those seeking understanding.

Jorge Luis Borges’s “Chinese Encyclopedia” (1942)
A list divides all animals into one of 14 categories:
• Those that belong to the emperor
• Embalmed ones
• Those that are trained
• Suckling pigs
• Mermaids
• Fabulous ones
• Stray dogs
• Those that are included in this classification
• Those that tremble as if they were mad
• Innumerable ones
• Those drawn with a very fine camel hair brush
• Et cetera
• Those that have just broken the flower vase
• Those that, at a distance, resemble flies

The Seven Pillars

#1 Aggregation (Sums, Means)
The Combination of Observations: A truly radical Idea!
Information is gained by discarding information (discarding individual identification)
Ex. The Mean

The Arithmetic Mean, Henry Gellibrand 1635

Sumerian 2x3 table, c. 3000 BC, with marginal totals

   9   12   16 |  37
  13   19   15 |  47
  -------------+----
  22   31   31 |  84

Determining the Lawful Measuring Rod, Kobel 1522

More Examples:
Least Squares (Legendre 1805, Gauss 1809)
Index Numbers (Jevons 1863)
Kernel Estimate of a Density (Peirce, 1873)
“Nonparametric” smoothers

12 Mathematicians

#2 Information (Root n Rule)
The Ancient Greek Paradox “Sorites” (Eubulides of Miletus, circa 350 BC)
The Problem of the Heap: How do you define a Heap of Sand?
One grain is not a heap.
If you add a grain to a pile of sand that is not a heap, how then can one grain make it a heap?

The Greek Physician Galen (circa 200 AD):
One medical case does not prove a treatment works.
If you add a single case to an unconvincing record of cases, how then can that one case convince?
How can we measure the accumulation of Information?

Abraham De Moivre 1730 observed that the inflection points of the symmetric Binomial “curve” for n trials are at a distance $\pm \frac{1}{2}\sqrt{n}$ from the center.

Pierre Simon Laplace 1810 stated a Central Limit Theorem that generalized this (but with a typo! The square root sign was missing!)

G. B. Airy (1861) and Charles S. Peirce (1879) extended the idea to more complex settings (variance component models).

#3 Likelihood (MLEs, Testing)
The calibration of evidence on a probability scale.
P-values. Likelihood-based inference, MLEs

John Arbuthnot, 1711. 82 years with more Male births than Female.

The recently discovered title page of Bayes’s 1763 Essay: A Method of Calculating The Exact Probability of All Conclusions founded on Induction.

#4 Intercomparison (Percentiles, t-Tests)
Comparisons based upon differences among the data themselves.
Simplest examples:
1) Percentiles (differences compared to IQR).
2) Student’s t-test (distance between means compared to estimated SD of difference).

More complicated examples:
1) ANOVA with structured data (blocks, split plots, additive models, etc).
2) The Bootstrap

#5 Regression (Multivariate Inference and Bayes Theorem)
Tall parents on average produce somewhat shorter children than themselves; tall children on average have somewhat shorter parents than themselves.

Two marginal distributions, Two conditional expectations; Bayes Theorem!

[Figure with labels A and B] Galton, 1885: Two regression lines (two lines of conditional expectation).

#6 Design (ANOVA, Design-Based Inference)
Examples:
Book of Daniel (Old Testament)
Avicenna (or Ibn Sina, circa 1000 AD)
Sample surveys

Random assignment in pairs (Peirce 1880s)
“Induction: Reasoning from a sample taken at random to the whole lot sampled”

W. S. Jevons 1873: “One of the most requisite precautions in experimentation is to vary only one circumstance at a time, and to maintain all other circumstances rigidly unchanged.”

R. A. Fisher 1926: “No aphorism is more frequently repeated in connection with field trials, than that we must ask Nature few questions, or, ideally, one question, at a time. … this view is wholly mistaken.”

Random Latin Square designs (Fisher 1930s)
Spherical symmetry approximated by random design, validating tests!
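One way to read the design slides above: if treatment was assigned at random within each pair, then under the hypothesis of no effect every pattern of swapped labels was equally likely, so the observed mean difference can be judged against all sign flips of the paired differences. The following is a minimal Python sketch of such a paired randomization test, not code from the talk; the function name and the numbers are hypothetical.

```python
from itertools import product


def paired_randomization_pvalue(differences):
    """Exact two-sided p-value for the mean paired difference,
    using every possible sign flip as the reference set."""
    n = len(differences)
    observed = abs(sum(differences)) / n
    extreme = 0
    total = 0
    # Each pair's label could have been swapped: enumerate all 2**n patterns.
    for signs in product((1, -1), repeat=n):
        stat = abs(sum(s * d for s, d in zip(signs, differences))) / n
        # Small tolerance guards against floating-point ties.
        if stat >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total


# Hypothetical treated-minus-control differences for six pairs.
diffs = [12, 8, 15, 3, 9, 11]
print(paired_randomization_pvalue(diffs))
```

With six pairs there are only 64 sign patterns, so the smallest attainable two-sided p-value is 2/64. The reference distribution here comes from the random assignment itself rather than from an assumed error model.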
#7 Residuals (Models and Model Comparison)
John Herschel, 1831: “Complicated phenomena…may be simplified by subducting the effect of known causes, … leaving … a residual phenomenon to be explained.” “It is by this process…that science … is chiefly promoted.”

John Stuart Mill, 1843, in A System of Logic, called this “The Method of Residues”: “Of all the methods of investigating laws of nature, this is the most fertile in unexpected results.”

In statistics:
Residual plots as diagnostics
Comparison of nested models
Testing significance of regression coefficients
Parametric modelling
Partial Likelihood (Cox models)

#1 Aggregation (Sums, Means)
#2 Information (Root n Rule)
#3 Likelihood (MLEs, Testing)
#4 Intercomparison (Percentiles, t-Tests)
#5 Regression (Multivariate Inference and Bayes Theorem)
#6 Design (ANOVA, Design-Based Inference)
#7 Residuals (Models and Model Comparison)

#1 Targeted reduction/compression of data
#2 Diminishing value of more data
#3 Putting a probability measure to inferences
#4 Doing this based upon internal data variation
#5 Different perspectives give different answers
#6 The Essential role of planning
#7 How to explore in nested families of models

Revolutionary ideas:
#1 Gain information by suppressing information
W. S.
Jevons replying to critics: “Were a complete explanation of each fluctuation thus necessary, not only would all inquiry into this subject be hopeless, but the whole of the statistical and social sciences, so far as they depend upon numerical facts, would have to be abandoned.”

More Revolutionary Ideas:
#2 All data equally good but added data less valuable
#3 Probability as a measure for inference
#4 Measure accuracy with no exterior standard
#5 Euclid proportionality wrong in all interesting science
#6, #7 Major change to scientific experimentation

Features or bugs?
#1 Treat people as mere statistics
#2 Is “Big Data” good based on size alone?
#3, #4 Significance tests: Problem or solution?
#5, #6, #7 Criticism of regression models, fitting, selection

E. B. Wilson 1927: “It is largely because of lack of knowledge of what statistics is that the person untrained in it trusts himself with a tool quite as dangerous as any he may pick out from the whole armamentarium of scientific methodology.”

These Seven Pillars are not mathematics and are not Computer Science.
They do centrally constitute the important core ideas underlying the Science of Statistics.
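The second pillar's claim that added data is less valuable can be made concrete with the standard error of a mean, sigma / sqrt(n), for independent observations sharing a common standard deviation sigma. The sketch below is an illustration under that assumption, not material from the talk; `sigma`, the sample sizes, and the function names are hypothetical.

```python
import math


def standard_error(sigma, n):
    """Standard error of the mean of n independent observations with SD sigma."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return sigma / math.sqrt(n)


def gain_from_one_more(sigma, n):
    """Reduction in standard error from adding one observation to a record of n."""
    return standard_error(sigma, n) - standard_error(sigma, n + 1)


sigma = 10.0  # assumed population SD (hypothetical value)

# Quadrupling the sample size only halves the standard error.
for n in (25, 100, 400):
    print(n, standard_error(sigma, n))

# The value of a single extra observation shrinks as the record grows.
for n in (1, 10, 100, 1000):
    print(n, gain_from_one_more(sigma, n))
```

This is one way to answer the Sorites-style question about a single added case: each new observation does add information, but by a steadily shrinking amount.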
The Seven Pillars of Statistical Wisdom
Proverbs IX:1
“Wisdom has built her house, she has hewn her seven pillars.”
“Wisdom has built her house, the seven have set its foundations.”
Refers to the Seven Sages of the seven ancient (before the great flood) cities of Mesopotamia

Three Local Pillars of Wisdom
Major Greenwood, Austin Bradford Hill, Peter Armitage

Seven (+) Sages….