. 1. Misc. 2 On Bias and Randomness.

advertisement
StatisticalInferenceBiblio.pdf
© 2013, Timothy G. Gregoire, Yale University
http://environment.yale.edu/profile/gregoire/bibliographies
Last revised: December 2014
Statistical Inference Bibliography
1920-Present
.
1. Misc. 2 On Bias and Randomness.
2. Misc. 3 Lopsided reasoning.
3. Pearson, K. (1920) “The Fundamental Problem in Practical Statistics.” Biometrika, 13(1): 116.
4. Edgeworth, F.Y. (1921) “Molecular Statistics.” Journal of the Royal Statistical Society,
84(1): 71-89.
5. Fisher, R. A. (1922) “On the Mathematical Foundations of Theoretical Statistics.”
Philosophical Transactions of the Royal Society of London, Series A, Containing Papers of a
Mathematical or Physical Character, 222: 309-268.
6. Neyman, J. and E. S. Pearson. (1928) “On the Use and Interpretation of Certain Test
Criteria for Purposes of Statistical Inference: Part I.” Biometrika, 20A(1/2): 175-240.
7. Fisher, R. A. (1933) “The Concepts of Inverse Probability and Fiducial Probability
Referring to Unknown Parameters.” Proceedings of the Royal Society of London, Series A,
Containing Papers of Mathematical and Physical Character, 139(838): 343-348.
8. Buchanan-Wollaston, H. J. (1935) “Statistical Tests”, Nature v136: 182-183.
9. Fisher, R. A. (1935) “The Logic of Inductive Inference.” Journal of the Royal Statistical
Society, 98(1): 39-82.
10. Fisher, R. A. (1936) “Uncertain inference.” Proceedings of the American Academy of Arts
and Sciences, 71: 245-258.
11. Neyman, J. (1937). Outline of a theory of statistical estimation based on the classical theory
of probability. Philosophical Transactions of the Royal Society of London, Series A. 236:
333-380.
12. Berkson, J. (1942) “Tests of Significance Considered as Evidence.” Journal of the American Statistical Association, 37(219): 325-335.
13. Berkson, J. (1942) “Tests of Significance Considered as Evidence.” Reprinted in International Journal of Epidemiology (from 1942 JASA article) 32:687-691.
StatisticalInferenceBiblio.pdf
© 2013 Timothy G. Gregoire, Yale University
14. Barnard, G. A. (1949) “Statistical Inference.” Journal of the Royal Statistical Society,
Series B (Methodological), 11(2): 115-149.
15. Fisher, R. (1955) “Statistical Methods and Scientific Induction.” Journal of the Royal
Statistical Society, Series B (Methodological), 17(1): 69-78.
16. Pearson, E. S. (1955) “Statistical Concepts in their Relation to Reality.” Journal of the
Royal Statistical Society, Series B (Methodological),17(2): 204-207.
17. Yates, F. (1955) “Discussion on the Paper by Dr. Box and Dr. Anderson.” Statistical
Inference, Robustness, and Modeling Strategy, JRSS-B, 17(1): 31.
18. Barlett, M.S. (1956) Comment on Sir Ronald Fisher’s Paper: “On a Test of significance in
Pearson’s Biometrika Tables (No. 11)”. Journal of the Royal Statistical Society Series B
18(2): 295 – 296.
19. Fisher, R. (1956) On a Test of significance in Pearson’s Biometrika Tables (No. 11). Journal
of the Royal Statistical Society Series B 18(1): 56 – 60.
20. Neyman, J. (1956) Note on an Article by Sir Ronald Fisher. Journal of the Royal Statistical
Society Series B 18(2): 288 – 294.
21. Welch, B.L. (1956) Note on some criticisms made by Sir Ronald Fisher. Journal of the Royal
Statistical Society Series B 18(2): 297 – 302.
22. Lindley, D. V. (1957). A statistical paradox. Biometrika 44(1/2) 187-192.
23. Cox, D. R. (1958) “Some Problems Connected with Statistical Inference.” Annals of
Mathematical Statistics, 29(2): 357-372.
24. Good, I. J. (1958) “Significance Tests in Parallel and In Series.” Journal of the American
Statistical Association, 53: 799-813.
25. Eysenck, H. J. (1960) “The Concept of Statistical Significance and the Controversy about
One-tailed Tests”, Psychological Review 67(4) 269-271.
26. Natrella, M. G. (1960) “The Relation Between Confidence Intervals and Tests of Significance.” The American Statistician, 14: 20-22 & back cover.
27. Rozeboom, W. W. (1960) “The Fallacy of the Null-Hypothesis Significance Test.” Psychological Bulletin, 57(5): 416-428.
28. Neyman, J. (1961) “Silver Jubilee of My Dispute with Fisher.” Journal of the Operations
Research Society of Japan, 3(4): 145-154.
2
StatisticalInferenceBiblio.pdf
© 2013 Timothy G. Gregoire, Yale University
29. Pratt, J. W. (1961) “Testing Statistical Hypotheses.” Journal of the American Statistical
Association, 56(293): 163-167.
30. Barnard, G. A., G. M. Jenkins, & C. B. Winsten. (1962) “Likelihood Inference and Time
Series.” Journal of the Royal Statistical Society, Series A (General), 125(3): 321-372.
31. Birnbaum, A. (1962) “On the Foundations of Statistical Inference.” Journal of the American Statistical Association, 57(298): 269-306.
32. Pearson, E. S. (1962) “Some Thoughts on Statistical Inference.” Annals of Mathematical
Statistics, 33(2): 394-403.
33. Fraser, D. A. S. (1963) “On the Sufficiency and Likelihood Principles.” Journal of the
American Statistical Association, 58(303): 641-647.
34. Kendall, M. G. (1963) “Ronald Aylmer Fisher, 1890-1962.” Biometrika, 50(1/2):1-15.
35. Platt, J. R. (1964) “Strong Inference.” Science, 146(3642): 347-353.
36. Dempster, A. P. and M. Schatzoff. (1965) “Expected Significance Level as a Sensitivity
Index for Test Statistics.” Journal of the American Statistical Association, 60(310): 420-436.
37. Pratt, J. W. (1965) “Bayesian interpretation of standard inference statements.” Journal of the
Royal Statistical Society 27(2) 169-203
38. Cornfield, J. (1966) “Sequential Trials, Sequential Analysis and the Likelihood Principle.”
The American Statistician, 20: 18-23.
39. Cutler, S. J., et al. (1966) “The Role of Hypothesis Testing in Clinical Trials.” Journal of
Chronic Disease, 19: 857-882.
40. Selvin, H. C. and Stuart, A. (1966) “Data-dredging Procedures in Survey Analysis.” The
American Statistician, 20:20-23.
41. Royall, R. (1968). “An old approach to finite population sampling theory.” Journal of the
American Statistical Association 63: 1269-1279.
42. Seeger, P. (1968) “A Note on a Method for the Analysis of Significances en masse”
Technometrics 10(3): 586-593.
43. Edwards, A. W. F. (1969) “Statistical Methods in Scientific Inference.” Nature, 222(June):
1233-1237.
44. Tukey, J. W. (1969) Analyzing Data: Sanctification or Detective Work? American Psychologist 83-91.
3
StatisticalInferenceBiblio.pdf
© 2013 Timothy G. Gregoire, Yale University
45. Edwards, A. W. F. (1970) “Likelihood.” Nature, 227(July): 92.
46. Durbin, J. (1970) “On Birnbaum’s Theorem on the Relation Between Sufficiency,
Conditionality and Likelihood.” Journal of the American Statistical Association, 65(329):
395-398.
47. Kyburg, Jr. H.E. (1971) “Probability and informative inference”. Proceedings of the
symposium on the Foundations of Statistical Inference prepared under the auspices of the
Rene Descartes Foundation and held at the Department of Statistics, University of Waterloo,
Ontario, Canada, from March 31 to April 9, 1970: 82 – 107.
48. Neyman, J. (1971) “Foundations of Behavioristic statistics”. Proceedings of the symposium
on the Foundations of Statistical Inference prepared under the auspices of the Rene Descartes
Foundation and held at the Department of Statistics, University of Waterloo, Ontario,
Canada, from March 31 to April 9, 1970:1-19.
49. Rao, C.R. (1971) “Some aspects of statistical inference in problems of sampling from finite
populations”. Proceedings of the symposium on the Foundations of Statistical Inference
prepared under the auspices of the Rene Descartes Foundation and held at the Department of
Statistics, University of Waterloo, Ontario, Canada, from March 31 to April 9, 1970: 117 –
202.
50. Leamer, E. E. (1974) “False Models and Post-Data Model Construction.” Journal of the
American Statistical Association, 69(345): 122-131.
51. Spielman, S. 1974. Philosophy of Science: The Logic of Tests of Significance.
52. Kempthorne, O. (1975) “Inference from Experiments and Randomization.” In A Survey of
Statistical Design and Linear Models, J. N. Srivastava, ed., North-Holland Publishing
Company. Pages 303-331.
53. Robinson, G. K. (1975). Some counterexamples to the theory of confidence intervals.
Biometrika 62(1) 155-161.
54. Joshi, V. M. (1976). A note on Birmbaum’s theory of the likelihood principle. Journal of the
American Statistical Association. 71: 345-346.
55. Cox, D. R. (1977) “The Role of Significance Tests.” Scandinavian Journal of Statistics, 4:
49-70.
56. Guttman, L. (1977) “What is Not What in Statistics.” The Statistician, 26(2): 81-107.
57. Kiefer, J. (1977) “The Foundations of Statistics – Are there any?” Synthese 36: 161 – 176.
58. Giere, R.N. (1977) “Allan Birnbaum’s conception of statistical evidence”. Synthese 36: 5-13.
4
StatisticalInferenceBiblio.pdf
© 2013 Timothy G. Gregoire, Yale University
59. Neyman, J. (1977) “Frequentist probability and frequentist statistics”. Synthese 36: 97 – 131
60. Robinson, G. K. (1977). Conservative statistical inference. Journal of the Royal Statistical
Society, Series B. 39: 381-386.
61. Smith, C. A. B. (1977) “The analogy between decision and inference”. Synthese 36: 71 – 85.
62. Carver, R. P. (1978) “The Case Against Statistical Significance Testing.” Harvard
Educational Review, 48(3): 378-398.
63. Eberhardt L.L. (1978) “Appraising Variability in Population Studies”. Journal of Wild Life
Management, 42(2): 207-238.
64. Good, I. J. (1980) “The diminishing significance of a p-value as the sample size decreases.”
Journal of Statistical Computation & Simulation, 11: 307-313.
65. Dolby, G. R. (1982) “The Role of Statistics in the Methodology of the Life Sciences.”
Biometrics, 38: 1069-1083.
66. Good, I. J. (1982) “Standardized tail-area probabilities.” Journal of Statistical Computation
and Simulation, 16: 65-75.
67. Schweder, T. and E. Spjøtvoll. (1982). “Plots of P-values to Evaluate Many Tests
Simultaneously.” Biometrics, 69(3): 493-502.
68. Leamer, E. E. (1983) “Let’s Take the Con out of Econometrics.” The American Economic
Review, 73(1): 31-43.
69. Leamer, E. and H. Leonard. (1983) “Reporting the Fragility of Regression Estimates.” The
Review of Economics and Statistics, 65(2): 306-317.
70. Good, I. J. (1984) “How Should Tail-Area Probabilities be Standardized for Sample Size in
Unpaired Comparisons?” C191 in Journal of Statistical Computation and Simulation, 19:
174.
71. Mayo, D.G. (1985) “Behavioristic, Evidentialist and learning models of statistical testing”.
Philosophy of Science 52: 493 – 516.
72. Salburg, D.S. (1985) “The religion of statistics as practiced in Medical journals”. The
American Statistician 39(3): 220 – 223.
73. Thompson, W. A. Jr. (1985). Optimal significance procedures for simple hypotheses.
Biometrika 72(1) 230-232.
5
StatisticalInferenceBiblio.pdf
© 2013 Timothy G. Gregoire, Yale University
74. Berger, J. O. (1986) “Are P-Values Reasonable Measures of Accuracy?” In Pacific
Statistical Congress, I. S. Francis et al., eds., Elsevier Science Publishers, the Netherlands.
Pages 21-27.
75. Cox, D. R. (1986) “Some General Aspects of the Theory of Statistics.” International
Statistical Review, 54(2): 117-126.
76. Fleiss, J. L. (1986) “Significance Tests Have a Role in Epidemiologic Research: Reactions
to A. M. Walker.” American Journal of Public Health, 76(5): 559-560.
77. Fleiss, J. L. (1986) “Letters to the Editor: Confidence Intervals vs Significance Tests:
Quantitative Interpretation.” American Journal of Public Health, 76(5): 587.
78. Hall, P. and B. Selinger. (1986) “Statistical Significance: Balancing Evidence Against
Doubt.” Australian Journal of Statistics, 28(3): 354-370.
79. Johnstone, D. J. (1986). Tests of significance in theory and practice. The Statistician 35, 491504.
80. Royall, R. M. (1986) “The Effect of Sample Size on the Meaning of Significance Tests.”
The American Statistician, 40(4): 313-315. Also: Bailey, K. R. (1987) “Comment on
Royall (1986).” The American Statistician, 41(3): 245-246.
81. Walker, A. M. (1986) “Reporting the Results of Epidemiologic Studies.” American Journal
of Public Health, 76(5): 556-558.
82. Warren, W.G. (1986) “On the presentation of statistical analysis: reason or ritual”. Canadian
Journal of Research 16: 1185 – 1191.
83. Berger, J. O. and M. Delampady. (1987) “Testing Precise Hypotheses.” Statistical Science,
2(3): 317-352.
84. Berger, J. O. and T. Sellke. (1987) “Testing a Point Null Hypothesis: The Irreconcilability
of P Values and Evidence.” Journal of the American Statistical Association, 82(397): 112139.
85. Casella, G. and R. L. Berger. (1987) “Reconciling Bayesian and Frequentist Evidence in the
One-Sided Testing Problem.” Journal of the American Statistical Association, 82(397): 106135.
86. Chow, S.L. (1987) “Science, Ecological validity and experimentation”. Journal for the
Theory of Social Behaviour 17: 181 – 194.
87. Hill, B. M. (1987) “The validity of the likelihood principle.” The American Statistician 41(2)
95-100.
6
StatisticalInferenceBiblio.pdf
© 2013 Timothy G. Gregoire, Yale University
88. Poole, C. (1987) “Beyond the Confidence Interval.” American Journal of Public Health,
77(2): 195-199.
89. Thompson, W. D. (1987) “Statistical Criteria in the Interpretation of Epidemiologic Data.”
American Journal of Public Health, 77(2): 191-194.
90. Berger, J. O. and D. A. Berry. (1988) “Statistical Analysis and the Illusion of Objectivity.”
American Scientist, 76(2): 159-165.
91. Goodman, S. N. and R. Royall. (1988) “Evidence and Scientific Research.” American
Journal of Public Health, 78(12): 1568-1574.
92. Schweder, T. (1988) “A Significance Version of the Basic Neyman-Pearson Theory for
Scientific Hypothesis Testing.” Scandinavian Journal of Statistics, 15: 225-242.
93. Sorić, B. (1989) “Statistical “Discoveries” and Effect-Size Estimation.” Journal of the
American Statistical Association, 84(406): 608-610.
94. Vuong, Q. H. (1989) “Likelihood Ratio Tests for Model Selection and Non-Nested
Hypotheses.” Econometrica, 57(2): 307-333.
95. Anscombe, F. J. (1990) “The Summarizing of Clinical Experiments by Significance
Levels.” Statistics in Medicine, 9: 703-708.
96. Barnard, G. A. (1990) “Must Clinical Trials Be Large? The Interpretation of P-Values and
the Combination of Test Results.” Statistics in Medicine, 9: 601-614.
97. Begg, C. B. (1990) “On Inferences from Wei’s Biased Coin Design for Clinical Trials.”
Biometrika, 77(3): 467-84.
98. Cohen, J. (1990) “Things I Have Learned (So Far).” American Psychologist, 45(12): 13041312.
99. Peterman, R. M. (1990) “The Importance of Reporting Statistical Power: The Forest
Decline and Acidic Deposition Example.” Ecology, 71(5): 2024-2027.
100. Rice, W. R. (1990) “A Consensus Combined P-Value Test and the Family-Wide
Significance of Component Tests.” Biometrics, 46: 303-308.
101. Salsburg, D. (1990) “Hypothesis Versus Significance Testing for Controlled Clinical
Trials: A Dialogue.” Statistics in Medicine, 9: 201-211.
102. Besag, J. and P. Clifford. (1991) “Sequential Monte Carlo p-values.” Biometrika, 78(2):
301-304.
7
StatisticalInferenceBiblio.pdf
© 2013 Timothy G. Gregoire, Yale University
103. Yoccoz, N. G. (1991) “Commentary: Use, Overuse, and Misuse of Significance Tests in
Evolutionary Biology and Ecology.” Bulletin of the Ecological Society of America, 72(2):
106-111.
104.
Faraway, J. J. (1992) “On the Cost of Data Analysis” ???? 1(3) 213-229.
105. Goodman, S. N. (1992) “A Comment on Replication, P-Values and Evidence.”
Statistics in Medicine, 11: 875-879.
106. Wright, S. P. (1992) “Adjusted P-Values for Simultaneous Inference.” Biometrics, 48:
1005-1013.
107. Freeman, P. R. (1993) “The Role of P-Values in Analysing Trial Results.” Statistics in
Medicine, 12: 1443-1452.
108. Hurlbert, H. and White, M. D. (1993) “Experiments with freshwater invertebrate
zooplanktivores: Quality of statistical analysis” Bulletin of Marine Science, 53(1) 128-153.
109. Lee, Y. J. and H. Quan. (1993) “P-Values After Repeated Significance Testing: A
Simple Approximation Method.” Statistics in Medicine, 12: 675-684.
110. Lehmann, E. L. (1993) “The Fisher, Neyman-Pearson Theories of Testing Hypotheses:
One Theory or Two?” Journal of the American Statistical Association, 88(424): 1242-1249.
111. McBride, G., J. C. Loftis, and N. C. Adkins. (1993) “What Do Significance Tests Really
Tell Us About the Environment?” Environmental Management, 17(4): 423-432.
112. Wang, C. (1993) Sense and Nonsense of Statistical Inference: Controversy, Misuse, and
Subtlety. Marcel Dekker, New York.
113.
Cohen, J. (1994) “The Earth is round (p < .05)”. American Psychologist 49: 997 – 1003.
114. Goodman, S. N. and J. A. Berlin. (1994) “The Use of Predicted Confidence Intervals
When Planning Experiments and the Misuse of Power When Interpreting Results.” Annals of
Internal Medicine, 121(3): 200-206.
115. Inman, H. F. (1994). Karl Pearson and R. A. Fisher on Statistical Tests: A 1935
Exchange from Nature. The American Statistician 48(1) 2-11.
116. Chatfield, C. (1995) Model Uncertainty, Data Mining, and Statistical Inference. Journal
of the Royal Statistical Society, Series A. 158: 419-466.
117. Edwards, A. W. F. (1995) “XVIIIth Fisher Memorial Lecture Delivered at the Natural
History Museum, London, on Thursday, 20th October, 1994, Fiducial Inference and the
Fundamental Theorem of Natural Selection.” Biometrics, 51(3): 799-809.
8
StatisticalInferenceBiblio.pdf
© 2013 Timothy G. Gregoire, Yale University
118. Goutis, C. and G. Casella. (1995) “Frequentist Post-Data Inference.” International
Statistical Review, 63(3): 325-344.
119. Johnson, D. H. (1995) “Statistical sirens: the allure of nonparametrics”. Ecology 76(6)
1998 – 2000.
120. Keuzenkamp, H. A. and J. R. Magnus. (1995) “On Tests and Significance in
Econometrics.” Journal of Econometrics, 67: 5-24.
121. Keuzenkamp, H. A. and M. McAleer. (1995) “Simplicity, Scientific Inference, and
Econometric Modelling.” The Economic Journal, 105: 1-21.
122. Sagan, C (1995) The Demon-Haunted World: Science as a Candle in the Dark. (see page
113 for the quote “absence of evidence is not evidence of absence.”
123. Smith, S. M. (1995) “Distribution-free and robust statistical methods: viable alternatives
to parametric statistics”. Ecology 76(6) 1997 – 1998.
124. Tsou, T. and R. M. Royall. (1995) “Robust Likelihoods.” Journal of the American
Statistical Association, 90(429): 316-320.
125. Dollinger, M., E. Kulinskaya & R.G. Staudte et al. (1996) “When is a p-Value a Good
Measure of Evidence?” In Robust Statistics, Data Analysis, and Computer Intensive
Methods, (H. Rieder, ed.), No. 109 in Lecture Notes in Statistics pp. 119-134.
126. Mislevy, R. J. (1996) “Evidence and Inference in Educational Assessment.” CSE
Technical Report 414, National Center for Research on Evaluation, Standards, and Student
Testing (CRESST), Graduate School of Education and Information Studies, The Regents of
the University of California, Los Angeles.
127. Nester, M. R. (1996) “An Applied Statistician’s Creed.” Applied Statistics, 45(4): 401410.
128. Schervish, M. J. (1996) “P Values: What They Are and What They Are Not.” The
American Statistician, 50(3): 203-206.
129. Bower, B. (1997) “Psychology’s Statistical Status Quo Draws Fire.” Science News,
151: 356-357.
130. Hayes, J. P. and R. J. Steidl. (1997) “Statistical Power Analysis and Amphibian
Population Trends.” Conservation Biology, 11(1): 273-275.
131. Hung, H. M. J., et al. (1997) “The Behavior of the P-Value When the Alternative
Hypothesis is True.” Biometrics, 53: 11-22.
9
StatisticalInferenceBiblio.pdf
© 2013 Timothy G. Gregoire, Yale University
132. Pruzek, R. M. (1997) “An Introduction to Bayesian Inference and its Applications.” In
What if There Were No Significance Tests?, L. L. Harlow, et al., eds. Lawrence Erlbaum
Associates, Publishers: Mahwah, New Jersey, & London, pages 287-318.
133. Rindskopf, D. M. (1997) “Testing “Small,” Not Null, Hypotheses: Classical and
Bayesian Approaches.” In What if There Were No Significance Tests?, L. L. Harlow, et al.,
eds. Lawrence Erlbaum Associates, Publishers: Mahwah, New Jersey, & London, pages
319-332.
134.
Royall, R. (1997) Statistical Evidence: A Likelihood Paradigm. Chapman&Hall/CRC.
135. Thomas, L. (1997) “Retrospective Power Analysis.” Conservation Biology, 11(1): 276280.
136. Tukey, J. (1997) More Honest Foundations for Data Analysis. Journal of Statistical
Planning and Inference 57: 21-29.
137. Cherry, S. (1998) “Statistical Tests in Publications of The Wildlife Society.” Wildlife
Society Bulletin, 26(4): 947-953.
138. Efron, B. (1998) “R. A. Fisher in the 21st Century: Invited Paper Presented at the 1996
R. A. Fisher Lecture.” Statistical Science, 13(2): 95-122.
139. Gerard, P. D., et al. (1998) “Limits of Retrospective Power Analysis.” Journal of
Wildlife Management, 62(2): 801-807.
140. Shen, W. and T.A. Louis. (1998) “Triple-goal estimates in two-stage hierarchical
models.” Jorunal of the Royal Statistical Society B, 60(2): 455-471.
141. Thompson, J. R. (1998) “Invited Commentary: Re: “Multiple Comparisons and Related
Issues in the Interpretation of Epidemiologic Data.” American Journal of Epidemiology,
147(9): 801-806.
142. Vieland, V. J. and S. E. Hodge. (1998) “Book Reviews: Statistical Evidence: A
Likelihood Paradigm, by Richard Royall.” American Journal of Human Genetics, 63: 283289.
143. Zumbo, B. D. and A. M. Hubley. (1998) “A note on misconceptions concerning
prospective and retrospective power.” The Statistician, 47(2): 385-388.
144. Donahue, R. M. J. (1999) “A Note on Information Seldom Reported Via the P Value.”
The American Statistician, 53(4): 303-306.
145. Goodman, S. N. (1999) “Toward Evidence-Based Medical Statistics. 1: The P Value
Fallacy.” Annals of Internal Medicine, 130(12): 995-1004.
10
StatisticalInferenceBiblio.pdf
© 2013 Timothy G. Gregoire, Yale University
146. Goodman, S. N. (1999) “Toward Evidence-Based Medical Statistics. 1: The Bayes
Factor.” Annals of Internal Medicine, 130(12): 1005-1013.
147. Johnson, D. H. (1999) “The Insignificance of Statistical Significance Testing.” Journal of
Wildlife Management, 63(3): 763-772.
148. Lindsey, J. K. (1999) “Some Statistical Heresies.” The Statistician, 48(1): 1-40.
149. Perlman, M. D. and L. Wu. (1999) “The Emperor’s New Tests.” Statistical Science,
14(4): 355-381.
150. Sackrowitz, H. and E. Samuel-Cahn. (1999) “P Values as Random Variables—Expected
P Values.” The American Statistician, 53(4): 326-331.
151. Stockmarr, A. (1999) “Likelihood Ratios for Evaluating DNA Evidence When the
Suspect is Found Through a Database Search.” Biometrics, 55: 671-677.
152. Bayarri, M. J. and J. O. Berger. (2000) “P Values for Composite Null Models.” Journal
of the American Statistical Association, 95(452): 1127-1172.
153. Robinson, A. (2000) “Slides from A. Robinson’s talk on A Jaundiced View of Hypothesis
and Significance Testing.” University of Idaho.
154. Royall, R. (2000) “On the Probability of Observing Misleading Statistical Evidence.”
Journal of the American Statistical Association, 95(451): 760-773.
155. Barker, L., H. Rolka, D. Rolka, C. Brown. (2001) “Equivalence Testing for Binomial
Random Variables: Which Test to Use?” The American Statistician, 55(4): 279.
156. Dennis, B. (2001) “Statistics and the Scientific Method in Ecology.” Draft for The Nature
of Scientific Evidence, M. L. Taper and S. R. Lele, eds., The University of Chicago Press. 33
pages.
157. Gregoire, T. G. (2001) “Biometry in the 21st Century: Whither Statistical Inference?”
Keynote address presented at The Conference on Forest Biometry and Information Science
(http://cms1.gre.ac.uk/conferences/iufro/proceedings/), 25-29 June 2001, The University of
Greenwich, London, U.K.
158. Hoenig, J. M. and D. M. Heisey. (2001) “The Abuse of Power: The Pervasive Fallacy of
Power Calculations for Data Analysis.” The American Statistician, 55(1): 1-6.
159. Lenth, R. V. (2001) “Some Practical Guidelines for Effective Sample Size
Determination.” The American Statistician, 55(3): 187-193.
160. Pace, M. L. (2001) “Prediction and the Aquatic Sciences.” Canadian Journal of Fisheries
and Aquatic Sciences, 58: 63-72.
11
StatisticalInferenceBiblio.pdf
© 2013 Timothy G. Gregoire, Yale University
161. Salsburg, D. (2001) The Lady Tasting Tea: How Statistics Revolutionized Science in the
Twentieth Century. A. W. H. Freeman: New York.
162. Schenker, N. and J. F. Gentleman. (2001) “On Judging the Significance of Differences by
Examining the Overlap Between Confidence Intervals.” The American Statistician, 55(3):
182-186.
163. Schnute, J. T. and L. J. Richards. (2001) “Use and Abuse of Fishery Models.” Canadian
Journal of Fisheries and Aquatic Science, 58: 10-17.
164. Schweder, T. and N. L. Hjort. (2001) “Confidence and Likelihood.” Statistical Research
Report, Department of Mathematics, University of Oslo. [ISBN: 82-553-1278-1]
165. Sellke, T., M. J. Bayarri, and J. O. Berger. (2001) “Calibration of p values for testing
precise null hypotheses.” The American Statistician 55(1) 62-71.
166. Senn, S. (2001) “Statistical Issues in Bioequivalance.” Statistics in Medicine, 20: 27852799.
167. Sterne, J. A. C. and G. D. Smith. (2001) “Sifting the Evidence—What’s Wrong With
Significance Tests?” British Medical Journal, 322: 226-231.
168. Tryon, W.W. (2001) “Evaluating statistical difference, equivalence and indeterminacy
using inferential confidence intervals: An integrated alternative method of conducting Null
Hypothesis statistical tests”. Psychological Methods 6(4): 371 – 386.
169. Berger, J. O. (2002) “Could Fisher, Jeffreys, and Neyman Have Agreed on Testing?”
Paper based on the Fisher lecture, given at the 2001 Joint Statistical Meetings by the author,
Duke University.
170. Bird, C.D. (2002) “Confidence intervals for effect sizes in analysis of variance”.
Educational and Psychological measurement 62: 197- 226.
171. Blume, J. D. (2002) “Likelihood methods for measuring statistical evidence.” Statistics in
Medicine 21: 2563-2599.
172. Farrant, T. (2002) “To p or not to p.” Royal Statistical Society News, 29(10): 21.
173. Goodman, S. N. (2002) “Author’s Reply.” Statistics in Medicine, 21: 2445-2447.
174. Johnson, D. H. (2002) The Role of Hypothesis Testing in Wildlife Science. Journal of
Wildlife Management 66(2) 272-276.
175. Knapp, T. R. (2002) “Some reflections on significance testing” Journal of Modern Applied
Statistical Methods 1(2) 240-242.
12
StatisticalInferenceBiblio.pdf
© 2013 Timothy G. Gregoire, Yale University
176. Robinson, D. H. and Wainer, H. (2002) On the Past and Future of Null Hypothesis
Significance Testing. Journal of Wildlife Management 66(2) 263-271.
177. Senn, S. (2002) “Letter to the Editor: A comment on replication, p-values and evidence.”
Statistics in Medicine, 21: 2437-2444.
178. Sterne, J. A. (2002) Teaching hypothesis tests – time for significant change? Statistics in
Medicine 21:985-994.
179. Berger, J. O. (2003) Could Fisher, Jeffreys, and Neyman have agreed on testing? Statistical
Science 18(1)1-12.
180. Browner, W. S. (2003) “The reliability of P values”. Science, 301, 167-168.
181. Dass, S. C. and J. O. Berger. (2003) “Unified Conditional Frequentist and Bayesian
Testing of Composite Hypotheses.” Scandinavian Journal of Statistics, 30: 193-210.
182. Edwards, A.W.F. (2003) “R.A. Fisher—twice Professor of Genetics: London and
Cambridge, or ‘A fairly well-known geneticists’.” The Statistician, 52(3): 311-318.
183. Fisher, R. A. (2003) Note on Dr. Berkson’s criticism of tests of significance. International
Journal of Epidemiology 32:692.
184. Goodman, S. (2003) “Commentary: the p-value, devalued” International Journal of
Epidemiology 32:699-702.
185. Green, P. J. (2003) “Notes on the life and work of R.A. Fisher.” The Statistician, 52(3):
299-301.
186. Gregoire, T. G. (2003) “Why was Fisher so mad with Neyman?” Presentation at Western
Mensurationist Meeting, 14 July 2003, Victoria, BC, Canada.
187. Healy, M. J. R. (2003) “R. A. Fisher the statistician.” The Statistician, 52(3): 303-310.
188. Hubbard, R. and M. J. Bayarri. (2003) “Confusion Over Measure of Evidence (p’s)
Versus Errors (’s) in Classical Statistical Testing.” The American Statistician, 57(3): 171178.
189. Lele, S. (2003). Various work and correspondence. Includes: Subhash, L. and A. Das.
(2003) “Elicited data and incorporation of expert opinion for statistical inference in spatial
studies.” Mathematical Geology, 32(4): 465-467.
190. Onwuegbuzie, A.J. and J.R. Levin. (2003) “Without supporting statistical evidence, where
would reported measures of substantive importance lead? To no good effect.” Journal of
Modern Applied Statistical Methods, 2(1): 133-151.
13
StatisticalInferenceBiblio.pdf
© 2013 Timothy G. Gregoire, Yale University
191. Royall, R. & T-S Tsou. (2003) Interpreting statistical evidence by using imperfect models:
robust adjusted likelihood functions. Journal of the Royal Statistical Society B, 65(2) 391404.
192. Senn, S. (2003) “Foreword: A blue plaque for Fisher.” The Statistician, 52(3) 297-298.
193. Smith, G. D. (2003) “Uncertainty and significance” International Journal of Epidemiology,
32:683.
194. Stone, M. (2003) “Commentary: Worthwhile Polemic or Transatlantic storm-in-a-teacup?”
International Journal of Epidemiology, 32:693-694.
195. Bickel, D.R. 2004. Degree of differential gene expression: detecting biologically
significant expression differences and estimating their magnitudes. Bioinformatics 20(5): 683
– 688
196. Garcia V.L. (2004) Escaping the Bonferroni iron claw in ecological studies. Journal of
Ecology 657-663.
197. Stefano, J.D. (2004) “A confidence interval approach to data analysis”. Forest Ecology and
Management 187: 173 – 183.
198. Taper, M. L. & Lele, S. R. (2004) The Nature of Scientific Evidence: Statistical,
Philosophical, and Empirical Considerations. University of Chicago Press.
199. Christensen, R. (2005) Testing Fisher, Neyman, Pearson, and Bayes. The American
Statistician, 59(2)121-126.
200. Stephens, P.A. and Buskirk, S.W. (2005) Information theory and hypothesis testing: a call
for pluralism. Journal of Applied Ecology 42: 4-12.
201. Korn, E. L. & B. Freidlin. (2006) The likelihood as statistical evidence in multiple
comparisons in clinical trials: no free lunch. Biometrical Journal 3:346-355.
202. Lenhard, J. (2006) Models and Statistical Inference: The Controversy between Fisher and
Neyman-Pierson. British Journal Philosophy of Science 57:69-91.
203. Moerkerke, B., Goetghebeur, E., De Riek, J., and Roldan-Ruiz, I. 2006. Significance and
impotence: towards a balanced view of the null and the alternative hypotheses in marker
selection for plant breeding. J .R. Statist. Soc. 169: 61 - 79
204. Thompson, B. (2006). Critique of p-values. International Statistical Review. 74(1)1-14.
205. Yuan, Y. (2007). Bayesian meta-analysis of highly-cited controlled clinical trials based on
test statistics. (IBC meeting in Montreal, 2006??).
14
StatisticalInferenceBiblio.pdf
© 2013 Timothy G. Gregoire, Yale University
206. Boyles, R. A. (2008). The role of likelihood in interval estimation. The American
Statistician. 62(10) 22-26.
207. Spanos, A. (2008) “Review of Stephen T. Ziliak and Deirdre N. McCloskey;s The cult of
statistical significance: how the standard error costs us jobs, justice, and lives. Amm Arbor
(MI): The University of Michigan Press, 2008, XXII+ 322 pp”. Earsmus Journal for
Philosophy and Economics 1(1): 154 – 164.
208. Cox, D. R. (2009) Randomization in the Design of Experiments. International Statistical
Review 77(3) 415-429.
209. Hurlbert, S. H. (2009) The Ancient Black Art and Trandisciplinary Extent of
Pseudoreplication. Journal of Comparative Psychology, 123(4) 434-443.
210. Hurlbert, S.H. and Lombard, C.M. (2009) Final collapse of the Neyman-Pearson decision
theoretic framework and rise of the neoFisherian. Annales of Zoologici Fennici 46: 311 –
349.
211. Reichardt, C.S. and Gollob, H.F. (2009) “Justifying the use and increasing the power of a t
Test for a randomized experiment with a convenience sample”. Psychological Methods 4:
117 – 128.
212. Breadford, M., Stricland, M. & Keiser, A. (2010) “Statistics and soils” Course notes.
213. Browne, R. H. (2010) The t-test p value and its relationship to the effect size and P(X>Y).
The American Statistician 64(1)30-33 (Correction, TAS, 64(2) 195)
214. Micceri, T. (2010) The Unicorn, the normal curve, and other improbably creatures.
Psychological Bulletin 105(1) 156-166.
215. Zuo, Y. (2010) “Is the t confidence interval ± t∞ (n - 1)s/√n optimal?”. The American
Statistician 64(2): 170 – 173.
216. Boos, D.D. and Stefanski L.A. (2011) p-Value Precision and Reproducibility. The
American Statistician 65 (4): 213-221.
217. Hubbard, R. (2011) The widespread misinterpretation of p-values as error probabilities.
Journal of Applied Statistics 38 (11): 2617-2626.
218. Kass, R.E. (2011) Statistical Inference: The Big Picture. Institute of Mathematical Statistics
26(1): 1-9.
219. Lewis, F., Butler, A. and Gilbert, L. (2011) A unified approach to model selection using the
likelihood ratio test. Methods in Ecology and Evolution 2: 155 – 162.
15
StatisticalInferenceBiblio.pdf
© 2013 Timothy G. Gregoire, Yale University
220. Picquelle, S.J. and Mier, K.L. (2011) A Practice guide to statistical methods for comparing
means from two-stage sampling. Fisheries Research 107: 1-13.
221. Wild, C.J., Pfannkuch, M. and Regan, M. (2011) Towards more accessible conceptions of
statistical inference. Journal of Royal Statistical Society 174(2): 247 – 295.
222. Friston, K. (2012) “Ten ironic rules for non-statistical reviewers”. NeuroImage 61: 1300 –
1310.
223. Hurlbert, S.H. (2012) Pseudofactorialism, response structures and collective responsibility.
Austral Ecology: 1-18.
224. Hurlbert, S.H. and Lombard, C.M. (2012) Lopsided Reasoning on Lopsided Tests and
Multiple Comparisons. Australian & New Zealand Journal of Statistics 54(1): 23 – 42.
225. Bacchetti, P. (2013) “Small sample size is not the real problem”. Nature Reviews:
Neuroscience 14:585 doi: 10.1038/nrn3475-c3.
226. Button, K. S., Ioannidis, J. P. A., Mokrysz, C., Nosek, B. A. Flint, J., Robinson, E. S. J. &
Munafo. (2013) “Power failure: why small sample size undermines the reliability of
neuroscience”. Nature Review Neuroscience doi 10:1038/nrm3475.
227. Cox, D. R. (2013) “A return to an old paper: ‘Tests of separate families of hypotheses’”
Journal of the Royal Statistical Society, Series B. 75(2) 207 – 215.
228. Hurlbert, S.H. (2013) Affirmation of the classical terminology for experimental design via
a critique of Casella’s Statistical design. Agronomy Journal 105(2): 412 – 418.
229. Johnson, V. E. (2013) “Revised standards for statistical evidence”. Proceedings of the
National Academies of Science 110(48) 19313 – 19317. doi 10.1073/pnas.1313476110.
230. Lin, W. (2013) “Agnostic notes on regression adjustments to experimental data:
Reexamining Freedman’s critique”. The Annals of Applied Statistics 7(1): 295 – 318.
231. Colquhoun, D. (2014) “An investigation of the false discovery rate and the
misinterpretation of p-values”. Royal Society Open Science doi. 10.1098/rsos.140216.
232. Foster, C. (2014) “Confidence trick: the interpretation of confidence intervals”. Canadian
Journal of Science, Mathematics, and Technology Education 14(10 23 – 34. doi
10.1080/14926156.2014.874615
233. Low-Décarie, E., Chivers, C. & Granados, M. (2014) “Rising complexity and falling
explanatory power in ecology”. Frontiers in Ecology and Environment 12(7) 412 – 418.
16
Download