The Gold Standard Debate
Cornell Evaluation Network
September 10, 2007
William Trochim

What is the “gold standard”?

A brief history of the Randomized Experiment
  o R.A. Fisher (Fisher, 1925)
      The Rothamsted Experimental Station
      The Lady Tasting Tea (Salsburg, 2001), pp. 47-48

The branch to education, psychology and evaluation
  o McCall’s (1923) How to Experiment in Education predates Fisher by two years and is in the tradition of Thorndike. McCall did not include random assignment (he emphasized Latin Squares designs and comparison groups), but the classic Campbell and Stanley monograph (Campbell & Stanley, 1963) opens with his work.
  o Campbell & Stanley describe an era of disillusionment with experimentation in education (pp. 2-3).
  o Campbell & Stanley and Cook & Campbell (Cook & Campbell, 1979) became the new experimental orthodoxy.

The branch to clinical medicine
  o Austin Bradford Hill and the “first” clinical trial, on Streptomycin (Stevenson, 1998) for pulmonary tuberculosis, in the 1940s (Hart, 1999)
      Hill was clearly influenced by Fisher (Armitage, 2003) (see http://ije.oxfordjournals.org/cgi/content/full/32/6/925).
  o The Salk Vaccine Trials
      The supremacy of the RCT was established by the trial of the Salk polio vaccine, which involved an elaborate double-blind test (ie neither investigators nor patients knew who was actually receiving the vaccine) on nearly two million US children. The report published in 1955 unequivocally affirmed that the vaccine was safe and effective. These successes, and the failures of the test procedures used for thalidomide, led to the 1962 amendments to the US Food and Drug Act, institutionalising the view that 'randomised, placebo-controlled, double-blind trials are the appropriate means, indeed almost the only scientific means, to establish the efficacy of a treatment' (David Healy, The antidepressant era, Cambridge, US: Harvard University Press 1997).
      (Stevenson, 1998)
      The Kefauver-Harris Drug Amendments 1962
  o For a good history of clinical trials in medicine, see (Stevenson, 1998) (http://www.chemsoc.org/chembytes/ezine/1998/stevenson.htm) and (T. Chen, 2003).
  o The phased clinical trial process (this graph shows the four-phase clinical trial process within the broader basic and pre-clinical trial model) (from Stevenson: http://www.chemsoc.org/chembytes/ezine/images/1998/stevenson_fig1_lge.doc)

Clinical Trials, including a description of the phases
  o Wikipedia: Clinical Trials
  o Wikipedia: Randomized Controlled Trials

What percent of drugs pass each phase?
  o http://www.centerwatch.com/patient/backgrnd.html

Challenges to the Experimental Orthodoxy in the 1980s and 1990s
  o Cronbach (Cronbach, 1982) and the argument for generalizability (external validity)
  o Theory-Driven Evaluation (H. Chen & Rossi, 1983, 1990)
  o The qualitative-quantitative debate (Guba & Lincoln, 1989; Patton, 1980)

The rise of meta-analysis from 1977 to 1990
  o In education and psychology, Smith and Glass (Smith & Glass, 1977)
  o In clinical medicine (Dickersin, Scherer, & Lefebvre, 1994; Hasselblad, 1998)

Evidence-Based Medicine and Evidence-Based Practice
  o The Cochrane Collaboration (http://www.cochrane.org/index.htm) (Chalmers & Haynes, 1994)
  o The Community Guide (http://www.thecommunityguide.org/)
  o The Campbell Collaboration (http://www.campbellcollaboration.org/) (Davies & Boruch, 2001); http://www.bmj.com/cgi/content/full/323/7308/294

The current controversy in evaluation
  o The Department of Education regulations
      Original posting, Nov 4, 2003 (U.S. Department of Education, 2003); http://frwebgate.access.gpo.gov/cgibin/getpage.cgi?position=all&page=62445&dbname=2003_register
      Revised posting, Jan 25, 2005 (U.S. Department of Education, 2005); http://frwebgate.access.gpo.gov/cgibin/getpage.cgi?dbname=2005_register&position=all&page=3585
  o The reaction in AEA (American Evaluation Association, 2003); http://www.eval.org/doestatement.htm
  o Current developments
      Spreading the “gold standard”
      The recent ND volume (Julnes & Rog, 2007)
      The emerging new balance

References

American Evaluation Association. (2003). Scientifically Based Evaluation Methods [Electronic Version], from http://www.eval.org/doestatement.htm
Armitage, P. (2003). Fisher, Bradford Hill, and randomization. International Journal of Epidemiology, 32, 925-928.
Campbell, D. T., & Stanley, J. C. (1963). Experimental and Quasi-Experimental Designs for Research. Chicago: Rand McNally.
Chalmers, I., & Haynes, B. (1994). Systematic Reviews - Reporting, Updating, and Correcting Systematic Reviews of the Effects of Health-Care. British Medical Journal, 309(6958), 862-865.
Chen, H., & Rossi, P. (1983). Evaluating with sense: The theory-driven approach. Evaluation Review, 7, 283-302.
Chen, H., & Rossi, P. (1990). Theory-Driven Evaluations. Thousand Oaks, CA: Sage.
Chen, T. (2003). History of Statistical Thinking in Medicine. In Y. Lu & J. Fang (Eds.), Advanced Medical Statistics (pp. 3-19). Singapore: World Scientific Publishing.
Cook, T. D., & Campbell, D. T. (1979). Quasi-Experimentation: Design and Analysis for Field Settings. Boston: Houghton Mifflin Company.
Cronbach, L. J. (1982). Designing Evaluations of Educational and Social Programs. San Francisco: Jossey-Bass.
Davies, P., & Boruch, R. (2001). The Campbell Collaboration. BMJ, 323, 294-295.
Dickersin, K., Scherer, R., & Lefebvre, C. (1994). Systematic Reviews - Identifying Relevant Studies for Systematic Reviews. British Medical Journal, 309(6964), 1286-1291.
Fisher, R. A. (1925). Statistical Methods for Research Workers. Edinburgh: Oliver and Boyd.
Guba, E. G., & Lincoln, Y. S. (1989). Fourth Generation Evaluation. Newbury Park, CA: Sage.
Hart, P. (1999). A change in scientific approach: from alternation to randomised allocation in clinical trials in the 1940s. BMJ, 319(7209), 572-573.
Hasselblad, V. (1998). Meta-analysis of multitreatment studies. Medical Decision Making, 18(1), 37-43.
Julnes, G., & Rog, D. (2007). Informing Federal Policies on Evaluation Methodology: Building the Evidence Base for Method Choice in Government Sponsored Evaluation. New Directions for Evaluation, 113.
Patton, M. Q. (1980). Qualitative Evaluation and Research Methods. Newbury Park, CA: Sage.
Salsburg, D. (2001). The Lady Tasting Tea: How Statistics Revolutionized Science in the Twentieth Century. New York: W.H. Freeman.
Smith, M. L., & Glass, G. V. (1977). Meta-Analysis of Psychotherapy Outcome Studies. American Psychologist, 32(9), 752-760.
Stevenson, R. (1998). Gold standard for drugs. Chembytes e-zine, 9(September).
U.S. Department of Education. (2003). Scientifically Based Evaluation Methods. Federal Register (Nov. 4, 2003), 62445-62447.
U.S. Department of Education. (2005). Scientifically Based Evaluation Methods. Federal Register (Jan. 25, 2005), 3585-3589.