StatisticalInferenceBiblio.pdf © 2013, Timothy G. Gregoire, Yale University http://environment.yale.edu/profile/gregoire/bibliographies Last revised: May 2013 Statistical Inference Bibliography 1920-Present . 1. Misc. 2 On Bias and Randomness. 2. Misc. 3 Lopsided reasoning. 3. Pearson, K. (1920) “The Fundamental Problem in Practical Statistics.” Biometrika, 13(1): 116. 4. Edgeworth, F.Y. (1921) “Molecular Statistics.” Journal of the Royal Statistical Society, 84(1): 71-89. 5. Fisher, R. A. (1922) “On the Mathematical Foundations of Theoretical Statistics.” Philosophical Transactions of the Royal Society of London, Series A, Containing Papers of a Mathematical or Physical Character, 222: 309-268. 6. Neyman, J. and E. S. Pearson. (1928) “On the Use and Interpretation of Certain Test Criteria for Purposes of Statistical Inference: Part I.” Biometrika, 20A(1/2): 175-240. 7. Fisher, R. A. (1933) “The Concepts of Inverse Probability and Fiducial Probability Referring to Unknown Parameters.” Proceedings of the Royal Society of London, Series A, Containing Papers of Mathematical and Physical Character, 139(838): 343-348. 8. Buchanan-Wollaston, H. J. (1935) “Statistical Tets”, Nature v136: 182-183. 9. Fisher, R. A. (1935) “The Logic of Inductive Inference.” Journal of the Royal Statistical Society, 98(1): 39-82. 10. Fisher, R. A. (1936) “Uncertain inference.” Proceedings of the American Academy of Arts and Sciences, 71: 245-258. 11. Neyman, J. (1937). Outline of a theory of statistical estimation based on the classical theory of probability. Philosophical Transactions of the Royal Society of London, Series A. 236: 333-380. 12. Berkson, J. (1942) “Tests of Significance Considered as Evidence.” Journal of the American Statistical Association, 37(219): 325-335. 13. Berkson, J. (1942) “Tests of Significance Considered as Evidence.” Reprinted in International Journal of Epidemiology (from 1942 JASA article) 32:687-691. StatisticalInferenceBiblio.pdf © 2013 Timothy G. Gregoire, Yale University 14. Barnard, G. A. (1949) “Statistical Inference.” Journal of the Royal Statistical Society, Series B (Methodological), 11(2): 115-149. 15. Fisher, R. (1955) “Statistical Methods and Scientific Induction.” Journal of the Royal Statistical Society, Series B (Methodological), 17(1): 69-78. 16. Pearson, E. S. (1955) “Statistical Concepts in their Relation to Reality.” Journal of the Royal Statistical Society, Series B (Methodological),17(2): 204-207. 17. Yates, F. (1955) “Discussion on the Paper by Dr. Box and Dr. Anderson.” Statistical Inference, Robustness, and Modeling Strategy, JRSS-B, 17(1): 31. 18. Barlett, M.S. (1956) Comment on Sir Ronald Fisher’s Paper: “On a Test of significance in Pearson’s Biometrika Tables (No. 11)”. Journal of the Royal Statistical Society Series B 18(2): 295 – 296. 19. Fisher, R. (1956) On a Test of significance in Pearson’s Biometrika Tables (No. 11). Journal of the Royal Statistical Society Series B 18(1): 56 – 60. 20. Neyman, J. (1956) Note on an Article by Sir Ronald Fisher. Journal of the Royal Statistical Society Series B 18(2): 288 – 294. 21. Welch, B.L. (1956) Note on some criticisms made by Sir Ronald Fisher. Journal of the Royal Statistical Society Series B 18(2): 297 – 302. 22. Lindley, D. V. (1957). A statistical paradox. Biometrika 44(1/2) 187-192. 23. Cox, D. R. (1958) “Some Problems Connected with Statistical Inference.” Annals of Mathematical Statistics, 29(2): 357-372. 24. Good, I. J. (1958) “Significance Tests in Parallel and In Series.” Journal of the American Statistical Association, 53: 799-813. 25. Eysenck, H. J. (1960) “The Concept of Statistical Significance and the Controversy about One-tailed Tests”, Psychological Review 67(4) 269-271. 26. Natrella, M. G. (1960) “The Relation Between Confidence Intervals and Tests of Significance.” The American Statistician, 14: 20-22 & back cover. 27. Rozeboom, W. W. (1960) “The Fallacy of the Null-Hypothesis Significance Test.” Psychological Bulletin, 57(5): 416-428. 28. Neyman, J. (1961) “Silver Jubilee of My Dispute with Fisher.” Journal of the Operations Research Society of Japan, 3(4): 145-154. 2 StatisticalInferenceBiblio.pdf © 2013 Timothy G. Gregoire, Yale University 29. Pratt, J. W. (1961) “Testing Statistical Hypotheses.” Journal of the American Statistical Association, 56(293): 163-167. 30. Barnard, G. A., G. M. Jenkins, & C. B. Winsten. (1962) “Likelihood Inference and Time Series.” Journal of the Royal Statistical Society, Series A (General), 125(3): 321-372. 31. Birnbaum, A. (1962) “On the Foundations of Statistical Inference.” Journal of the American Statistical Association, 57(298): 269-306. 32. Pearson, E. S. (1962) “Some Thoughts on Statistical Inference.” Annals of Mathematical Statistics, 33(2): 394-403. 33. Fraser, D. A. S. (1963) “On the Sufficiency and Likelihood Principles.” Journal of the American Statistical Association, 58(303): 641-647. 34. Kendall, M. G. (1963) “Ronald Aylmer Fisher, 1890-1962.” Biometrika, 50(1/2):1-15. 35. Platt, J. R. (1964) “Strong Inference.” Science, 146(3642): 347-353. 36. Dempster, A. P. and M. Schatzoff. (1965) “Expected Significance Level as a Sensitivity Index for Test Statistics.” Journal of the American Statistical Association, 60(310): 420-436. 37. Pratt, J. W. (1965) “Bayesian interpretation of standard inference statements.” Journal of the Royal Statistical Society 27(2) 169-203 38. Cornfield, J. (1966) “Sequential Trials, Sequential Analysis and the Likelihood Principle.” The American Statistician, 20: 18-23. 39. Cutler, S. J., et al. (1966) “The Role of Hypothesis Testing in Clinical Trials.” Journal of Chronic Disease, 19: 857-882. 40. Selvin, H. C. and Stuart, A. (1966) “Data-dredging Procedures in Survey Analysis.” The American Statistician, 20:20-23. 41. Royall, R. (1968). “An old approach to finite population sampling theory.” Journal of the American Statistical Association 63: 1269-1279. 42. Seeger, P. (1968) “A Note on a Method for the Analysis of Significances en masse” Technometrics 10(3): 586-593. 43. Edwards, A. W. F. (1969) “Statistical Methods in Scientific Inference.” Nature, 222(June): 1233-1237. 44. Tukey, J. W. (1969) Analyzing Data: Sanctification or Detective Work? American Psychologist 83-91. 3 StatisticalInferenceBiblio.pdf © 2013 Timothy G. Gregoire, Yale University 45. Edwards, A. W. F. (1970) “Likelihood.” Nature, 227(July): 92. 46. Durbin, J. (1970) “On Birnbaum’s Theorem on the Relation Between Sufficiency, Conditionality and Likelihood.” Journal of the American Statistical Association, 65(329): 395-398. 47. Leamer, E. E. (1974) “False Models and Post-Data Model Construction.” Journal of the American Statistical Association, 69(345): 122-131. 48. Spielman, S. 1974. Philosophy of Science: The Logic of Tests of Significance. 49. Kempthorne, O. (1975) “Inference from Experiments and Randomization.” In A Survey of Statistical Design and Linear Models, J. N. Srivastava, ed., North-Holland Publishing Company. Pages 303-331. 50. Robinson, G. K. (1975). Some counterexamples to the theory of confidence intervals. Biometrika 62(1) 155-161. 51. Joshi, V. M. (1976). A note on Birmbaum’s theory of the likelihood principle. Journal of the American Statistical Association. 71: 345-346. 52. Cox, D. R. (1977) “The Role of Significance Tests.” Scandinavian Journal of Statistics, 4: 49-70. 53. Guttman, L. (1977) “What is Not What in Statistics.” The Statistician, 26(2): 81-107. 54. Robinson, G. K. (1977). Conservative statistical inference. Journal of the Royal Statistical Society, Series B. 39: 381-386. 55. Carver, R. P. (1978) “The Case Against Statistical Significance Testing.” Harvard Educational Review, 48(3): 378-398. 56. Eberhardt L.L. (1978) “Appraising Variability in Population Studies”. Journal of Wild Life Management, 42(2): 207-238. 57. Good, I. J. (1980) “The diminishing significance of a p-value as the sample size decreases.” Journal of Statistical Computation & Simulation, 11: 307-313. 58. Dolby, G. R. (1982) “The Role of Statistics in the Methodology of the Life Sciences.” Biometrics, 38: 1069-1083. 59. Good, I. J. (1982) “Standardized tail-area probabilities.” Journal of Statistical Computation and Simulation, 16: 65-75. 60. Schweder, T. and E. Spjøtvoll. (1982). “Plots of P-values to Evaluate Many Tests Simultaneously.” Biometrics, 69(3): 493-502. 4 StatisticalInferenceBiblio.pdf © 2013 Timothy G. Gregoire, Yale University 61. Leamer, E. E. (1983) “Let’s Take the Con out of Econometrics.” The American Economic Review, 73(1): 31-43. 62. Leamer, E. and H. Leonard. (1983) “Reporting the Fragility of Regression Estimates.” The Review of Economics and Statistics, 65(2): 306-317. 63. Good, I. J. (1984) “How Should Tail-Area Probabilities be Standardized for Sample Size in Unpaired Comparisons?” C191 in Journal of Statistical Computation and Simulation, 19: 174. 64. Thompson, W. A. Jr. (1985). Optimal significance procedures for simple hypotheses. Biometrika 72(1) 230-232. 65. Berger, J. O. (1986) “Are P-Values Reasonable Measures of Accuracy?” In Pacific Statistical Congress, I. S. Francis et al., eds., Elsevier Science Publishers, the Netherlands. Pages 21-27. 66. Cox, D. R. (1986) “Some General Aspects of the Theory of Statistics.” International Statistical Review, 54(2): 117-126. 67. Fleiss, J. L. (1986) “Significance Tests Have a Role in Epidemiologic Research: Reactions to A. M. Walker.” American Journal of Public Health, 76(5): 559-560. 68. Fleiss, J. L. (1986) “Letters to the Editor: Confidence Intervals vs Significance Tests: Quantitative Interpretation.” American Journal of Public Health, 76(5): 587. 69. Hall, P. and B. Selinger. (1986) “Statistical Significance: Balancing Evidence Against Doubt.” Australian Journal of Statistics, 28(3): 354-370. 70. Johnstone, D. J. (1986). Tests of significance in theory and practice. The Statistician 35, 491504. 71. Royall, R. M. (1986) “The Effect of Sample Size on the Meaning of Significance Tests.” The American Statistician, 40(4): 313-315. Also: Bailey, K. R. (1987) “Comment on Royall (1986).” The American Statistician, 41(3): 245-246. 72. Walker, A. M. (1986) “Reporting the Results of Epidemiologic Studies.” American Journal of Public Health, 76(5): 556-558. 73. Warren, W.G. (1986) “On the presentation of statistical analysis: reason or ritual”. Canadian Journal of Research 16: 1185 – 1191. 74. Berger, J. O. and M. Delampady. (1987) “Testing Precise Hypotheses.” Statistical Science, 2(3): 317-352. 5 StatisticalInferenceBiblio.pdf © 2013 Timothy G. Gregoire, Yale University 75. Berger, J. O. and T. Sellke. (1987) “Testing a Point Null Hypothesis: The Irreconcilability of P Values and Evidence.” Journal of the American Statistical Association, 82(397): 112139. 76. Casella, G. and R. L. Berger. (1987) “Reconciling Bayesian and Frequentist Evidence in the One-Sided Testing Problem.” Journal of the American Statistical Association, 82(397): 106135. 77. Hill, B. M. (1987) “The validity of the likelihood principle.” The American Statistician 41(2) 95-100. 78. Poole, C. (1987) “Beyond the Confidence Interval.” American Journal of Public Health, 77(2): 195-199. 79. Thompson, W. D. (1987) “Statistical Criteria in the Interpretation of Epidemiologic Data.” American Journal of Public Health, 77(2): 191-194. 80. Berger, J. O. and D. A. Berry. (1988) “Statistical Analysis and the Illusion of Objectivity.” American Scientist, 76(2): 159-165. 81. Goodman, S. N. and R. Royall. (1988) “Evidence and Scientific Research.” American Journal of Public Health, 78(12): 1568-1574. 82. Schweder, T. (1988) “A Significance Version of the Basic Neyman-Pearson Theory for Scientific Hypothesis Testing.” Scandinavian Journal of Statistics, 15: 225-242. 83. Sorić, B. (1989) “Statistical “Discoveries” and Effect-Size Estimation.” Journal of the American Statistical Association, 84(406): 608-610. 84. Vuong, Q. H. (1989) “Likelihood Ratio Tests for Model Selection and Non-Nested Hypotheses.” Econometrica, 57(2): 307-333. 85. Anscombe, F. J. (1990) “The Summarizing of Clinical Experiments by Significance Levels.” Statistics in Medicine, 9: 703-708. 86. Barnard, G. A. (1990) “Must Clinical Trials Be Large? The Interpretation of P-Values and the Combination of Test Results.” Statistics in Medicine, 9: 601-614. 87. Begg, C. B. (1990) “On Inferences from Wei’s Biased Coin Design for Clinical Trials.” Biometrika, 77(3): 467-84. 88. Cohen, J. (1990) “Things I Have Learned (So Far).” American Psychologist, 45(12): 13041312. 89. Peterman, R. M. (1990) “The Importance of Reporting Statistical Power: The Forest Decline and Acidic Deposition Example.” Ecology, 71(5): 2024-2027. 6 StatisticalInferenceBiblio.pdf © 2013 Timothy G. Gregoire, Yale University 90. Rice, W. R. (1990) “A Consensus Combined P-Value Test and the Family-Wide Significance of Component Tests.” Biometrics, 46: 303-308. 91. Salsburg, D. (1990) “Hypothesis Versus Significance Testing for Controlled Clinical Trials: A Dialogue.” Statistics in Medicine, 9: 201-211. 92. Besag, J. and P. Clifford. (1991) “Sequential Monte Carlo p-values.” Biometrika, 78(2): 301-304. 93. Yoccoz, N. G. (1991) “Commentary: Use, Overuse, and Misuse of Significance Tests in Evolutionary Biology and Ecology.” Bulletin of the Ecological Society of America, 72(2): 106-111. 94. Faraway, J. J. (1992) “On the Cost of Data Analysis” ???? 1(3) 213-229. 95. Goodman, S. N. (1992) “A Comment on Replication, P-Values and Evidence.” Statistics in Medicine, 11: 875-879. 96. Wright, S. P. (1992) “Adjusted P-Values for Simultaneous Inference.” Biometrics, 48: 1005-1013. 97. Freeman, P. R. (1993) “The Role of P-Values in Analysing Trial Results.” Statistics in Medicine, 12: 1443-1452. 98. Hurlbert, H. and White, M. D. (1993) “Experiments with freshwater invertebrate zooplanktivores: Quality of statistical analysis” Bulletin of Marine Science, 53(1) 128-153. 99. Lee, Y. J. and H. Quan. (1993) “P-Values After Repeated Significance Testing: A Simple Approximation Method.” Statistics in Medicine, 12: 675-684. 100. Lehmann, E. L. (1993) “The Fisher, Neyman-Pearson Theories of Testing Hypotheses: One Theory or Two?” Journal of the American Statistical Association, 88(424): 1242-1249. 101. McBride, G., J. C. Loftis, and N. C. Adkins. (1993) “What Do Significance Tests Really Tell Us About the Environment?” Environmental Management, 17(4): 423-432. 102. Wang, C. (1993) Sense and Nonsense of Statistical Inference: Controversy, Misuse, and Subtelty. Marcel Dekker, New York. 103. Goodman, S. N. and J. A. Berlin. (1994) “The Use of Predicted Confidence Intervals When Planning Experiments and the Misuse of Power When Interpreting Results.” Annals of Internal Medicine, 121(3): 200-206. 104. Inman, H. F. (1994). Karl Pearson and R. A. Fisher on Statistical Tests: A 1935 Exchange from Nature. The American Statistician 48(1) 2-11. 7 StatisticalInferenceBiblio.pdf © 2013 Timothy G. Gregoire, Yale University 105. Chatfield, C. (1995) Model Uncertainty, Data Mining, and Statistical Inference. Journal of the Royal Statistical Society, Series A. 158: 419-466. 106. Edwards, A. W. F. (1995) “XVIIIth Fisher Memorial Lecture Delivered at the Natural History Museum, London, on Thursday, 20th October, 1994, Fiducial Inference and the Fundamental Theorem of Natural Selection.” Biometrics, 51(3): 799-809. 107. Goutis, C. and G. Casella. (1995) “Frequentist Post-Data Inference.” International Statistical Review, 63(3): 325-344. 108. Keuzenkamp, H. A. and J. R. Magnus. (1995) “On Tests and Significance in Econometrics.” Journal of Econometrics, 67: 5-24. 109. Keuzenkamp, H. A. and M. McAleer. (1995) “Simplicity, Scientific Inference, and Econometric Modelling.” The Economic Journal, 105: 1-21. 110. Sagan, C (1995) The Demon-Haunted World: Science as a Candle in the Dark. (see page 113 for the quote “absence of evidence is not evidence of absence.” 111. Tsou, T. and R. M. Royall. (1995) “Robust Likelihoods.” Journal of the American Statistical Association, 90(429): 316-320. 112. Dollinger, M., E. Kulinskaya & R.G. Staudte et al. (1996) “When is a p-Value a Good Measure of Evidence?” In Robust Statistics, Data Analysis, and Computer Intensive Methods, (H. Rieder, ed.), No. 109 in Lecture Notes in Statistics pp. 119-134. 113. Mislevy, R. J. (1996) “Evidence and Inference in Educational Assessment.” CSE Technical Report 414, National Center for Research on Evaluation, Standards, and Student Testing (CRESST), Graduate School of Education and Information Studies, The Regents of the University of California, Los Angeles. 114. Nester, M. R. (1996) “An Applied Statistician’s Creed.” Applied Statistics, 45(4): 401410. 115. Schervish, M. J. (1996) “P Values: What They Are and What They Are Not.” The American Statistician, 50(3): 203-206. 116. Bower, B. (1997) “Psychology’s Statistical Status Quo Draws Fire.” Science News, 151: 356-357. 117. Hayes, J. P. and R. J. Steidl. (1997) “Statistical Power Analysis and Amphibian Population Trends.” Conservation Biology, 11(1): 273-275. 118. Hung, H. M. J., et al. (1997) “The Behavior of the P-Value When the Alternative Hypothesis is True.” Biometrics, 53: 11-22. 8 StatisticalInferenceBiblio.pdf © 2013 Timothy G. Gregoire, Yale University 119. Pruzek, R. M. (1997) “An Introduction to Bayesian Inference and its Applications.” In What if There Were No Significance Tests?, L. L. Harlow, et al., eds. Lawrence Erlbaum Associates, Publishers: Mahwah, New Jersey, & London, pages 287-318. 120. Rindskopf, D. M. (1997) “Testing “Small,” Not Null, Hypotheses: Classical and Bayesian Approaches.” In What if There Were No Significance Tests?, L. L. Harlow, et al., eds. Lawrence Erlbaum Associates, Publishers: Mahwah, New Jersey, & London, pages 319-332. 121. Royall, R. (1997) Statistical Evidence: A Likelihood Paradigm. Chapman&Hall/CRC. 122. Thomas, L. (1997) “Retrospective Power Analysis.” Conservation Biology, 11(1): 276280. 123. Tukey, J. (1997) More Honest Foundations for Data Analysis. Journal of Statistical Planning and Inference 57: 21-29. 124. Cherry, S. (1998) “Statistical Tests in Publications of The Wildlife Society.” Wildlife Society Bulletin, 26(4): 947-953. 125. Efron, B. (1998) “R. A. Fisher in the 21st Century: Invited Paper Presented at the 1996 R. A. Fisher Lecture.” Statistical Science, 13(2): 95-122. 126. Gerard, P. D., et al. (1998) “Limits of Retrospective Power Analysis.” Journal of Wildlife Management, 62(2): 801-807. 127. Shen, W. and T.A. Louis. (1998) “Triple-goal estimates in two-stage hierarchical models.” Jorunal of the Royal Statistical Society B, 60(2): 455-471. 128. Thompson, J. R. (1998) “Invited Commentary: Re: “Multiple Comparisons and Related Issues in the Interpretation of Epidemiologic Data.” American Journal of Epidemiology, 147(9): 801-806. 129. Vieland, V. J. and S. E. Hodge. (1998) “Book Reviews: Statistical Evidence: A Likelihood Paradigm, by Richard Royall.” American Journal of Human Genetics, 63: 283289. 130. Zumbo, B. D. and A. M. Hubley. (1998) “A note on misconceptions concerning prospective and retrospective power.” The Statistician, 47(2): 385-388. 131. Donahue, R. M. J. (1999) “A Note on Information Seldom Reported Via the P Value.” The American Statistician, 53(4): 303-306. 132. Goodman, S. N. (1999) “Toward Evidence-Based Medical Statistics. 1: The P Value Fallacy.” Annals of Internal Medicine, 130(12): 995-1004. 9 StatisticalInferenceBiblio.pdf © 2013 Timothy G. Gregoire, Yale University 133. Goodman, S. N. (1999) “Toward Evidence-Based Medical Statistics. 1: The Bayes Factor.” Annals of Internal Medicine, 130(12): 1005-1013. 134. Johnson, D. H. (1999) “The Insignificance of Statistical Significance Testing.” Journal of Wildlife Mangement, 63(3): 763-772. 135. Lindsey, J. K. (1999) “Some Statistical Heresies.” The Statistician, 48(1): 1-40. 136. Perlman, M. D. and L. Wu. (1999) “The Emperor’s New Tests.” Statistical Science, 14(4): 355-381. 137. Sackrowitz, H. and E. Samuel-Cahn. (1999) “P Values as Random Variables—Expected P Values.” The American Statistician, 53(4): 326-331. 138. Stockmarr, A. (1999) “Likelihood Ratios for Evaluating DNA Evidence When the Suspect is Found Through a Database Search.” Biometrics, 55: 671-677. 139. Bayarri, M. J. and J. O. Berger. (2000) “P Values for Composite Null Models.” Journal of the American Statistical Association, 95(452): 1127-1172. 140. Robinson, A. (2000) “Slides from A. Robinson’s talk on A Jaundiced View of Hypothesis and Significance Testing.” University of Idaho. 141. Royall, R. (2000) “On the Probability of Observing Misleading Statistical Evidence.” Journal of the American Statistical Association, 95(451): 760-773. 142. Barker, L., H. Rolka, D. Rolka, C. Brown. (2001) “Equivalence Testing for Binomial Random Variables: Which Test to Use?” The American Statistician, 55(4): 279. 143. Dennis, B. (2001) “Statistics and the Scientific Method in Ecology.” Draft for The Nature of Scientific Evidence, M. L. Taper and S. R. Lele, eds., The University of Chicago Press. 33 pages. 144. Gregoire, T. G. (2001) “Biometry in the 21st Century: Whither Statistical Inference?” Keynote address presented at The Conference on Forest Biometry and Information Science (http://cms1.gre.ac.uk/conferences/iufro/proceedings/), 25-29 June 2001, The University of Greenwich, London, U.K. 145. Hoenig, J. M. and D. M. Heisey. (2001) “The Abuse of Power: The Pervasive Fallacy of Power Calculations for Data Analysis.” The American Statistician, 55(1): 1-6. 146. Lenth, R. V. (2001) “Some Practical Guidelines for Effective Sample Size Determination.” The American Statistician, 55(3): 187-193. 10 StatisticalInferenceBiblio.pdf © 2013 Timothy G. Gregoire, Yale University 147. Pace, M. L. (2001) “Prediction and the Aquatic Sciences.” Canadian Journal of Fisheries and Aquatic Sciences, 58: 63-72. 148. Salsburg, D. (2001) The Lady Tasting Tea: How Statistics Revolutionized Science in the Twentieth Century. A. W. H. Freeman: New York. 149. Schenker, N. and J. F. Gentleman. (2001) “On Judging the Significance of Differences by Examining the Overlap Between Confidence Intervals.” The American Statistician, 55(3): 182-186. 150. Schnute, J. T. and L. J. Richards. (2001) “Use and Abuse of Fishery Models.” Canadian Journal of Fisheries and Aquatic Science, 58: 10-17. 151. Schweder, T. and N. L. Hjort. (2001) “Confidence and Likelihood.” Statistical Research Report, Department of Mathematics, University of Oslo. [ISBN: 82-553-1278-1] 152. Sellke, T., M. J. Bayarri, and J. O. Berger. (2001) “Calibration of p values for testing precise null hypotheses.” The American Statistician 55(1) 62-71. 153. Senn, S. (2001) “Statistical Issues in Bioequivalance.” Statistics in Medicine, 20: 27852799. 154. Stern, J. A. C. and G. D. Smith. (2001) “Sifting the Evidence—What’s Wrong With Significance Tests?” British Medical Journal, 322: 226-231. 155. Berger, J. O. (2002) “Could Fisher, Jeffreys, and Neyman Have Agreed on Testing?” Paper based on the Fisher lecture, given at the 2001 Joint Statistical Meetings by the author, Duke University. 156. Bird, C.D. (2002) “Confidence intervals for effect sizes in analysis of variance”. Educational and Psychological measurement 62: 197- 226. 157. Blume, J. D. (2002) “Likelihood methods for measuring statistical evidence.” Statistics in Medicine 21: 2563-2599. 158. Farrant, T. (2002) “To p or not to p.” Royal Statistical Society News, 29(10): 21. 159. Goodman, S. N. (2002) “Author’s Reply.” Statistics in Medicine, 21: 2445-2447. 160. Johnson, D. H. (2002) The Role of Hypothesis Testing in Wildlife Science. Journal of Wildlife Management 66(2) 272-276. 161. Knapp, T. R. (2002) “Some reflections on significance testing” Journal of Modern Applied Statistical Methods 1(2) 240-242. 11 StatisticalInferenceBiblio.pdf © 2013 Timothy G. Gregoire, Yale University 162. Robinson, D. H. and Wainer, H. (2002) On the Past and Future of Null Hypothesis Significance Testing. Journal of Wildlife Management 66(2) 263-271. 163. Senn, S. (2002) “Letter to the Editor: A comment on replication, p-values and evidence.” Statistics in Medicine, 21: 2437-2444. 164. Sterne, J. A. (2002) Teaching hypothesis tests – time for significant change? Statistics in Medicine 21:985-994. 165. Berger, J. O. (2003) Could Fisher, Jeffreys, and Neyman have agreed on testing? Statistical Science 18(1)1-12. 166. Browner, W. S. (2003) “The reliability of P values”. Science, 301, 167-168. 167. Dass, S. C. and J. O. Berger. (2003) “Unified Conditional Frequentist and Bayesian Testing of Composite Hypotheses.” Scandinavian Journal of Statistics, 30: 193-210. 168. Edwards, A.W.F. (2003) “R.A. Fisher—twice Professor of Genetics: London and Cambridge, or ‘A fairly well-known geneticists’.” The Statistician, 52(3): 311-318. 169. Fisher, R. A. (2003) Note on Dr. Berkson’s criticism of tests of significance. International Journal of Epidemiology 32:692. 170. Goodman, S. (2003) “Commentary: the p-value, devalued” International Journal of Epidemiology 32:699-702. 171. Green, P. J. (2003) “Notes on the life and work of R.A. Fisher.” The Statistician, 52(3): 299-301. 172. Healy, M. J. R. (2003) “R. A. Fisher the statistician.” The Statistician, 52(3): 303-310. 173. Hubbard, R. and M. J. Bayarri. (2003) “Confusion Over Measure of Evidence (p’s) Versus Errors (’s) in Classical Statistical Testing.” The American Statistician, 57(3): 171178. 174. Lele, S. (2003). Various work and correspondence. Includes: Subhash, L. and A. Das. (2003) “Elicited data and incorporation of expert opinion for statistical inference in spatial studies.” Mathematical Geology, 32(4): 465-467. 175. Onwuegbuzie, A.J. and J.R. Levin. (2003) “Without supporting statistical evidence, where would reported measures of substantive importance lead? To no good effect.” Journal of Modern Applied Statistical Methods, 2(1): 133-151. 176. Royall, R. & T-S Tsou. (2003) Interpreting statistical evidence by using imperfect models: robust adjusted likelihood functions. Journal of the Royal Statistical Society B, 65(2) 391404. 12 StatisticalInferenceBiblio.pdf © 2013 Timothy G. Gregoire, Yale University 177. Senn, S. (2003) “Foreword: A blue plaque for Fisher.” The Statistician, 52(3) 297-298. 178. Smith, G. D. (2003) “Uncertainty and significance” International Journal of Epidemiology, 32:683. 179. Stone, M. (2003) “Commentary: Worthwhile Polemic or Transatlantic storm-in-a-teacup?” International Journal of Epidemiology, 32:693-694. 180. Bickel, D.R. 2004. Degree of differential gene expression: detecting biologically significant expression differences and estimating their magnitudes. Bioinformatics 20(5): 683 – 688 181. Garcia V.L. (2004) Escaping the Bonferroni iron claw in ecological studies. Journal of Ecology 657-663. 182. Stefano, J.D. (2004) “A confidence interval approach to data analysis”. Forest Ecology and Management 187: 173 – 183. 183. Taper, M. L. & Lele, S. R. (2004) The Nature of Scientific Evidence: Statistical, Philosophical, and Empirical Considerations. University of Chicago Press. 184. Christensen, R. (2005) Testing Fisher, Neyman, Pearson, and Bayes. The American Statistician, 59(2)121-126. 185. Stephens, P.A. and Buskirk, S.W. (2005) Information theory and hypothesis testing: a call for pluralism. Journal of Applied Ecology 42: 4-12. 186. Korn, E. L. & B. Freidlin. (2006) The likelihood as statistical evidence in multiple comparisons in clinical trials: no free lunch. Biometrical Journal 3:346-355. 187. Lenhard, J. (2006) Models and Statistical Inference: The Controversy between Fisher and Neyman-Pierson. British Journal Philosophy of Science 57:69-91. 188. Moerkerke, B., Goetghebeur, E., De Riek, J., and Roldan-Ruiz, I. 2006. Significance and impotence: towards a balanced view of the null and the alternative hypotheses in marker selection for plant breeding. J .R. Statist. Soc. 169: 61 - 79 189. Thompson, B. (2006). Critique of p-values. International Statistical Review. 74(1)1-14. 190. Yuan, Y. (2007). Bayesian meta-analysis of highly-cited controlled clinical trials based on test statistics. (IBC meeting in Montreal, 2006??). 191. Boyles, R. A. (2008). The role of likelihood in interval estimation. The American Statistician. 62(10) 22-26. 13 StatisticalInferenceBiblio.pdf © 2013 Timothy G. Gregoire, Yale University 192. Cox, D. R. (2009) Randomization in the Design of Experiments. International Statistical Review 77(3) 415-429. 193. Hurlbert, S. H. (2009) The Ancient Black Art and Trandisciplinary Extent of Pseudoreplication. Journal of Comparative Psychology, 123(4) 434-443. 194. Hurlbert, S.H. and Lombard, C.M. (2009) Final collapse of the Neyman-Pearson decision theoretic framework and rise of the neoFisherian. Annales of Zoologici Fennici 46: 311 – 349. . 195. Browne, R. H. (2010) The t-test p value and its relationship to the effect size and P(X>Y). The American Statistician 64(1)30-33 (Correction, TAS, 64(2) 195) 196. Micceri, T. (2010) The Unicorn, the normal curve, and other improbably creatures. Psychological Bulletin 105(1) 156-166. 197. Zuo, Y. (2010) “Is the t confidence interval ± t∞ (n - 1)s/√n optimal?”. The American Statistician 64(2): 170 – 173. 198. Boos, D.D. and Stefanski L.A. (2011) p-Value Precision and Reproducibility. The American Statistician 65 (4): 213-221. 199. Hubbard, R. (2011) The widespread misinterpretation of p-values as error probabilities. Journal of Applied Statistics 38 (11): 2617-2626. 200. Kass, R.E. (2011) Statistical Inference: The Big Picture. Institute of Mathematical Statistics 26(1): 1-9. 201. Lewis, F., Butler, A. and Gilbert, L. (2011) A unified approach to model selection using the likelihood ratio test. Methods in Ecology and Evolution 2: 155 – 162. 202. Picquelle, S.J. and Mier, K.L. (2011) A Practice guide to statistical methods for comparing means from two-stage sampling. Fisheries Research 107: 1-13. 203. Wild, C.J., Pfannkuch, M. and Regan, M. (2011) Towards more accessible conceptions of statistical inference. Journal of Royal Statistical Society 174(2): 247 – 295. 204. Hurlbert, S.H. (2012) Pseudofactorialism, response structures and collective responsibility. Austral Ecology: 1-18. 205. Hurlbert, S.H. and Lombard, C.M. (2012) Lopsided Reasoning on Lopsided Tests and Multiple Comparisons. Australian & New Zealand Journal of Statistics 54(1): 23 – 42. 206. Hurlbert, S.H. (2013) Affirmation of the classical terminology for experimental design via a critique of Casella’s Statistical design. Agronomy Journal 105(2): 412 – 418. 14