Research Integrity, The Importance of Data Acquisition and Management
Ralph H. Hruban, M.D.
Monday, February 13, 2011

Conflict of Interest
• I receive royalty payments from Myriad Genetics for the PalB2 invention

I think what happened is that you are betting on football, and what’s after football is basketball, and then the NCAA tournament. The next thing that follows is betting on baseball… I wish I could take it all back.
Pete Rose

Aristotle
“We become just by performing just actions, temperate by performing temperate actions, brave by performing brave actions”
Nicomachean Ethics
http://www.gap-system.org/~history/PictDisplay/Aristotle.htm

Henry L. Mencken
“Science, at bottom, is really anti-intellectual. It always distrusts pure reason, and demands the production of objective fact.”
http://www.toptenz.net

“The [pirate] code is more what you’d call ‘guidelines’ than actual rules”
Barbossa, Pirates of the Caribbean
Screenrant.com

PHS 42 C.F.R. 93
The PHS regulation (42 C.F.R. 93) defines research misconduct as fabrication, falsification, or plagiarism in proposing, performing, or reviewing research, or in reporting research results.
(a) Fabrication is making up data or results and recording or reporting them.
(b) Falsification is manipulating research materials, equipment, or processes, or changing or omitting data or results such that the research is not accurately represented in the research record.
(c) Plagiarism is the appropriation of another person's ideas, processes, results, or words without giving appropriate credit.
(d) Research misconduct does not include honest error or differences of opinion.

Data Integrity
1. How common is data fraud?
2. Fraud harms patients, the institution and the investigator
3. Examples of inappropriately published data
4. How can you prevent fraud in your own lab?
Fraud is More Common Than You Think

Fraud in High School
ORI is pleased to have high school students, Michael Moorin and Tyler Smith, present at the Quest for Research Excellence 2011 Conference. Moorin and Smith made headlines in the news media, such as The Washington Post, when they found over 60% of high school students reported that they had falsified or fabricated the data in their science fair projects
http://ori.hhs.gov/blog/category/researchmisconduct/

Web Sites
http://retractionwatch.wordpress.com/ and http://abnormalscienceblog.wordpress.com/

Yet another young scientist starting postgrad
Desires their CV to be better, a tad.
Such a wonderful gel!
I must publish in Cell!
The controls I can fix on my iPad.
Retractionwatch.wordpress.com/

How Common is Misconduct?
• Meta-analysis of surveys of scientific misconduct
• 2% of scientists admitted to have fabricated, falsified or modified data or results at least once
Fanelli PLoS One 2009; 4:e5738

Many More Were Aware of Misconduct by Others!
Fanelli PLoS One 2009; 4:e5738

Nature 478, 26-28 (2011)

Data Integrity
1. How common is data fraud?
2. Fraud harms patients, the institution and the investigator
3. Examples of inappropriately published data
4. How can you prevent fraud in your own lab?

Fraud Harms Patients
• Analyzed 180 retracted articles that involved human subjects or “freshly derived human material,” along with 851 published studies citing that research
• The retracted papers were cited over 5,000 times
• According to Steen, 6,573 patients received treatment in studies eventually retracted because of fraud. One study alone, published in 2001, included 2,161 women being treated for postpartum bleeding
• The downstream studies included more than 400,000 subjects, with 70,501 receiving treatment
R. Grant Steen, Journal of Medical Ethics, 2011

Deception at Duke
Scott Pelley reports on a Duke University oncologist whose supervisor says he manipulated the data in his study of a breakthrough cancer therapy
http://www.cbsnews.com/8301-18560_162-57376073/deception-at-duke

Fraud Harms the Institution and the Investigator
“But the research at Duke turned out to be wrong. Its gene-based tests proved worthless, and the research behind them was discredited. Ms. Jacobs died a few months after treatment, and her husband and other patients’ relatives have retained lawyers.”
Anil Potti, MD
Gina Kolata, on Anil Potti, New York Times, July 7, 2011

Potti Scandal
The defendants named in the suits are:
• Duke University
• Duke University Health System, Inc.
• Private Diagnostics Clinic PLLC
• Joseph Nevins, PhD
• Anil Potti, MD
• Michael Cuff, MD
• Sally Kornbluth, MD
• John M. Harrelson, MD
• Cancer Diagnostics, Inc.

Research Misconduct Harms Patients, the Investigator and the Institution

Data Integrity
1. How common is data fraud?
2. Fraud harms patients, the institution and the investigator
3. Examples of inappropriately published data
4. How can you prevent fraud in your own lab?
http://www.pioneerinstitute.org

Examples
• Fabricated data
• Falsified data
• Selective reporting of data
• Image manipulation

Example 1: Trial of 3 Drugs - Actual Results
[Chart: Drug 1, Drug 2, Drug 3]

Trial of 3 Drugs - Results Reported
[Chart: Drug 1, Drug 2, Drug 3 at Time 1 to Time 4]

Data Fabrication
Fabrication is making up data or results and recording or reporting them

Jon Sudbø - Fabrication
• Medical researcher at the Radium Hospital, Oslo, Norway
• 2005 article in the Lancet suggested that Ibuprofen reduces oral cancer in smokers
The Lancet, 366 (9494): 1359–1366; http://www.vg.no/nyheter/innenriks/artikkel.php

Jon Sudbø - Fabrication
• Suspicion aroused because the data were supposedly from a cancer patient database which had not yet opened
• Of the 908 subjects in the Lancet study 250 had the same date of birth
• Sudbø later acknowledged that he used fictional data in at least two more papers, published in the New England Journal of Medicine and Journal of Clinical Oncology
http://en.m.wikipedia.org/wiki/Jon_Sudb%C3%B8#cite_note-2

Jon Sudbø - Fabrication
• Independent commission investigated and also criticized the co-authors of Sudbø's papers
• Dr. Atle Klovning, a leading European authority, said that Sudbø's co-authors had probably not lived up to their responsibilities according to the rules of authorship
• You think they would have noticed the database wasn’t open yet!
http://en.m.wikipedia.org/wiki/Jon_Sudb%C3%B8#cite_note-2

International Committee of Medical Journal Editors
• An “author” is generally considered to be someone who has made substantive intellectual contributions to a published study…. An author must take responsibility for at least one component of the work, should be able to identify who is responsible for each other component, and should ideally be confident in their co-authors’ ability and integrity.
• When a large, multicenter group has conducted the work, the group should identify the individuals who accept direct responsibility for the manuscript
http://www.icmje.org/ethical_1author.html

Example 2: Trial of 3 Drugs - Actual Results
[Chart: Drug 1, Drug 2, Drug 3 at Time 1 to Time 4]

Trial of 3 Drugs - Results Reported
[Chart: Drug 1, Drug 2, Drug 3 at Time 1 to Time 4]

Data Falsification
Falsification is manipulating research materials, equipment, or processes, or changing or omitting data or results such that the research is not accurately represented in the research record

Suggested that microarray data from cell lines (NCI 60) could be used to define drug response signatures, and these signatures could in turn be used to guide therapy
The results of this study had therapeutic implications
Nature Medicine, 2006

Woodward and Bernstein of Bioinformatics
Keith Baggerly
Time.com
Kevin Coombes
Forensic bioinformatics
http://videolectures.net/cancerbioinformatics2010_baggerly_irrh/
http://www.cbsnews.com/8301-18560_162-57376073/deception-at-duke/

Baggerly and Coombes Investigate
[Figure labels: Potti, Baggerly, Reported Genes]
http://www.jhsph.edu/cct/videos/

Potti’s paper suffers from a “frameshift mutation”
(Off-by-one error for all of the genes caused by an extra column)
[Figure labels: Potti, Baggerly]

Potti et al submit erratum with updated gene lists
NOT the end of the saga…
http://www.jhsph.edu/cct/videos

Sensitive and Resistant Switched for some Drugs
http://www.jhsph.edu/cct/videos/

Keith Baggerly and Coombes
Letter to the Editor (Nature Medicine) - one page letter, 149 pages of Supplementary Data
November 2007
http://www.jhsph.edu/cct/videos

For cisplatin, U133A arrays were used for training.
ERCC1, ERCC4 and DNA repair genes are identified as “important”
Journal of Clinical Oncology, 2007
http://www.jhsph.edu/cct/videos

Four Genes Didn’t Match
The four that couldn’t be matched were the genes that were touted to be functionally important
http://www.jhsph.edu/cct/videos

Based directly on the Potti and Nevins publications, despite concerns raised by Baggerly and Coombes, Duke Initiates Three Clinical Trials in 2007
• Adjuvant Cisplatin With Either Genomic-Guided Vinorelbine or Pemetrexed for Early Stage Non-Small-Cell Lung Cancer (TOP0703)
• Study Using a Genomic Predictor of Platinum Resistance to Guide Therapy in Stage IIIB/IV Non-Small Cell Lung Cancer (TOP0602)
• Phase II Study Evaluating The Safety And Response To Neoadjuvant Dasatinib In Early Stage Non-Small Cell Lung Cancer (TOP0706)

O, what a tangled web we weave;
When first we practice to deceive!
Sir Walter Scott

July 16, 2010

Retraction Watch
November 2010

Improving Validation Practices in “Omics” Research
• Routine replication, public data and protocol availability, funding incentives, reproducibility rewards or penalties, and targeted repeatability checks
Ioannidis, et al., Science December 2 2011: Vol. 334: 1230-1232

Example 3: Trial of 3 Drugs - Actual Results
Results Statistically Significant
[Chart: Drug 1, Drug 2, Drug 3 at Time 1 to Time 4]

Trial of 3 Drugs - Reported Results (Results still Significant)
[Chart: Drug 1, Drug 2, Drug 3 at Time 1 to Time 4]

It is Still Data Falsification
Falsification is manipulating research materials, equipment, or processes, or changing or omitting data or results such that the research is not accurately represented in the research record
(No qualifier here that falsification is ok so long as the results were originally statistically significant)

Example 4: Trial of 3 Drugs - Actual Results: Results Statistically Significant
[Chart: Drug 1, Drug 2, Drug 3 at Time 1 to Time 4]

The PI Tells the Post-Doc
“These two data points seem off. I would expect there to be a greater difference”

Trial of 3 Drugs - Reported Results (Results still Significant)
[Chart: Drug 1, Drug 2, Drug 3 at Time 1 to Time 4]

It is found out later that the Post-Doc changed the data in question
Does the P.I. have any responsibility for what happened?

Dipak K. Das
• The University of Connecticut report alleges Dr. Das “defunded” the work of a student in his lab because she did not produce results that he wanted
• The investigation of Dr. Das’s work began in January 2009, two weeks after the university received an anonymous allegation about research irregularities in his laboratory
NY Times, January 11, 2012 and retractionwatch.wordpress.com

Allegations of misconduct often come from a whistle blower inside the group, such as a postdoc or graduate student who does not agree with the PI's tendencies of glossing over data or blatant misconduct

We All Have a Responsibility to Maintain Integrity
What should we do when we “suspect” another PI is falsifying data?
What if the other PI is a competitor?
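
A simple screen on primary data
The Sudbø example earlier in this talk (250 of 908 subjects sharing one date of birth) suggests a quick check that anyone with access to the primary records could run. The sketch below is only an illustration, not the method the investigators used: the function names and the 5% threshold (max_share) are assumptions made for this example, and a high share is a reason to ask questions, not proof of fabrication.

```python
from collections import Counter
from datetime import date, timedelta


def most_common_birth_date(birth_dates):
    """Return (date, count, share) for the most frequent birth date."""
    if not birth_dates:
        raise ValueError("no birth dates supplied")
    counts = Counter(birth_dates)
    top_date, top_count = counts.most_common(1)[0]
    return top_date, top_count, top_count / len(birth_dates)


def flag_shared_birth_dates(birth_dates, max_share=0.05):
    """Flag a data set when one birth date exceeds max_share of records.

    max_share is a hypothetical threshold chosen for illustration only.
    """
    _, _, share = most_common_birth_date(birth_dates)
    return share > max_share


# Made-up records shaped like the Sudbø case: 908 subjects,
# 250 of whom share a single date of birth.
shared = [date(1950, 1, 1)] * 250
distinct = [date(1940, 1, 1) + timedelta(days=i) for i in range(658)]
records = shared + distinct

top_date, top_count, top_share = most_common_birth_date(records)
suspicious = flag_shared_birth_dates(records)
```

With these made-up records roughly 28% of subjects share one birth date, far above anything expected by chance, so the data set would be flagged for a closer look at the source database.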
Example 5: 5-Month Trial of 3 Drugs - Actual Results of a 5-Month Design
[Chart: Drug 1, Drug 2, Drug 3 at Time 1 to Time 5]

Trial of 3 Drugs - Results Reported
[Chart: Drug 1, Drug 2, Drug 3 at Time 1 to Time 4]

Could be Falsification
Falsification is manipulating research materials, equipment, or processes, or changing or omitting data or results such that the research is not accurately represented in the research record

OK Only if Clearly Documented in the Paper
“…not accurately represented in the research record”

Example 6: Results (4 Day Expt., but Technician ran the Experiment too long)
[Chart: Drug 1, Drug 2, Drug 3 at Time 1 to Time 5]

Trial of 3 Drugs - Results Reported
[Chart: Drug 1, Drug 2, Drug 3 at Time 1 to Time 4]

Probably OK if the study design was shorter, but it ought to get you thinking!

The most exciting phrase to hear in science, the one that heralds new discoveries, is not ‘Eureka,’ but ‘That’s funny…’
Isaac Asimov

Example 7: Trial of 3 Drugs - Results First Run
[Chart: Drug 1, Drug 2, Drug 3 at Time 1 to Time 4]

Results Second Run
[Chart: Drug 1, Drug 2, Drug 3 at Time 1 to Time 5]

Results Third Run
[Chart: Drug 1, Drug 2, Drug 3 at Time 1 to Time 4]

Reported (The Third Run)
[Chart: Drug 1, Drug 2, Drug 3 at Time 1 to Time 4]

Likely Falsification
Falsification is manipulating research materials, equipment, or processes, or changing or omitting data or results such that the research is not accurately represented in the research record
“…not accurately represented in the research record”

If there were genuine reasons the first two runs didn’t work you ought to document why, fix them and repeat the study a 4th time!
Example 8: Mouse Model
• Genetically engineered mouse model suggests that protein X promotes metastases
• Scientist shares an antibody to human protein X with the collaborating pathologist studying human disease
• The antibody doesn’t label any human metastases, in fact, it is only expressed in nonmetastatic lesions
http://www.buzzfeed.com

Mouse Model
• Manuscript published reads “Protein X Promotes Metastases”
• Is this selective reporting of data?

Mouse Model
• Manuscript published reads “Protein X Promotes Metastases in a Mouse Model”
• Is this selective reporting of data?

Mouse Model
• What was suggested in the paper?
• Was the discussion always focused on mouse models or did it stray into suggesting that protein X is important in humans?

Example 9: A New Drug to Cure Depression
• The P.I. develops a new drug to treat depression
• It works on 100 of 103 patients
• The investigators go back and review the charts on the three patients on whom the drug didn’t work and on re-review it is clear that the 3 patients have manic depressive illness
• The drug is reported to be effective in 100% of patients with depression

Dangers of Re-Review of Selected Data
…such that the research is not accurately represented in the research record

Example 10: PowerPoint Presentation Within Hopkins
Falsified data are presented at a meeting within the Hopkins community. The data are not published.
Is this research misconduct?
Yes, it is research misconduct even if the data are not published

Image Manipulation
Falsification is manipulating research materials, equipment, or processes, or changing or omitting data or results such that the research is not accurately represented in the research record

It’s so easy to add and subtract with Photoshop
M. Rossner and K. Yamada, JCB, 2004

Rubber Stamp to “Clean” the Background
M. Rossner and K. Yamada, JCB, 2004

“Your X-ray showed a broken rib, but we fixed it in Photoshop”
http://www.glasbergen.com

Image Manipulations
M. Rossner and K. Yamada, JCB, 2004

Woo-Suk Hwang
Science 2005
http://news.naver.com/main/read.nhn

Woo-Suk Hwang
Time, Dec. 15, 2005

Reusing Images - Potti
Augustine et al., 2009, Clin Can Res, 15:502-10, Fig 4A. Temozolomide, NCI-60.
Hsu et al., 2007, J Clin Oncol, 25:4350-7, Fig 1A. Cisplatin, Gyorffy cell lines.
http://videolectures.net/keith_baggerly/

Altering Images
If you misrepresent your data, you are deceiving your colleagues, who expect and assume basic scientific honesty - that is, that each image you present is an accurate representation of what you actually observed. In addition, an image usually carries information beyond the specific point being made.
M. Rossner and K. Yamada, JCB, 2004

Altering Images
Data must be reported directly, not through a filter based on what you think they “should” illustrate to your audience. For every adjustment that you make to a digital image, it is important to ask yourself, “Is the image that results from this adjustment still an accurate representation of the original data?” If the answer to this question is “no,” your actions may be construed as misconduct.
M. Rossner and K. Yamada, JCB, 2004

Image Manipulation
Falsification is manipulating research materials, equipment, or processes, or changing or omitting data or results such that the research is not accurately represented in the research record

Data Integrity
1. Fraud in the history of science
2. How common is data fraud?
3. Fraud harms patients, the institution and the investigator
4. Examples of inappropriately published data
5. How can you prevent fraud in your own lab?
http://randysright.files.wordpress.com

Prevention!!
• Establish a culture of honesty above all in your lab
• Inform and educate
• Screen - Periodically ask to see lab books
• Detect problems by working closely with primary data

Establish a Culture of Honesty
• “We need this difference to be significant or I won’t get my grant”
• “These data points don’t fit the results I expected”
These small things can add up and can quickly become the norm

Establish a Culture of Honesty
vs.
From day one: “All that matters to me is that the results you present are 100% honest”

Inform and Educate
• Dedicate some journal clubs or lab group meetings to educating those under you on the importance of academic integrity
• Encourage members of your lab to attend lectures such as this one!

Even so we have to screen for problems!

We are not going to detect fraud if we only look at PowerPoint presentations of finished results
[Picture of a PowerPoint presentation]

We need to carefully review and question primary data

Henry L. Mencken
“Conscience is the inner voice that warns us somebody may be looking”

If it is Too Good to be True
• Blind the samples and ask the person to rerun the experiment
• Have someone else in the lab rerun the experiment
http://econsultancy.com/

Tools for detecting misconduct
• Anti-plagiarism software (eTBLAST, CrossCheck, Turnitin)
• Screening images (PhotoShop). Pioneered by J Cell Biology. See M. Rossner and K. Yamada, JCB 2004; 166:11-15 - found 1% unacceptable manipulation
• Data Review (digit preference)
Liz Wager, Council of Scientific Editors

Conclusions
Preventing damage would save careers from ruin
Everyone has a responsibility to promote a culture in which research misconduct does not happen
Harold C. Sox, Annals of Internal Medicine

If you are the first or last author on a paper
You are responsible:
1. For making sure all of the other authors have read and approved the manuscript
2. For everything in the manuscript - make sure the images included are correct, that the text isn’t copied from somewhere else, that the data weren’t manipulated, that you have appropriate IRB protocols, and that the protocols were followed

Take Home Message #1
We need to be aware that at a place like Johns Hopkins people may feel enormous pressures

Take Home Message #2
Science should be our “touchstone”

The currency of science is the peer-reviewed and peer-accepted manuscript that is backed by a gold standard of scientific integrity and scrupulous honesty. Anything that tarnishes this gold standard threatens to devalue the worth of scientific currency. Ultimately, society itself suffers because scientific advancement prepares the way for social progress
Curt Civin
Editor-in-Chief, Stem Cells
“Cloned Photomicrographs, not cloned cells”

Panel Discussion
• Ralph Hruban, M.D.
• Bob Bollinger, M.D., M.P.H.
• Curt Civin, M.D.
• Anirban Maitra, M.B.B.S.
• Sheila Garrity, J.D., M.P.H., M.B.A. - Moderator
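
Supplement: a digit preference check
The “Data Review (digit preference)” item in the tools list can be illustrated with a short sketch. It assumes the measurements are whole numbers whose final digit should be roughly uniform (for example, readings recorded to the nearest unit), which will not hold for every kind of data. The function names are hypothetical, and the comparison uses the standard chi-square critical value for 9 degrees of freedom at the 0.05 level. A large statistic is a prompt to go back to the lab books, not evidence of misconduct by itself.

```python
from collections import Counter

# Chi-square critical value for 9 degrees of freedom at alpha = 0.05.
CHI2_CRITICAL_DF9_P05 = 16.919


def last_digit_counts(values):
    """Count how often each final digit 0-9 occurs among integer readings."""
    counts = Counter(abs(int(v)) % 10 for v in values)
    return [counts.get(digit, 0) for digit in range(10)]


def digit_preference_chi2(values):
    """Chi-square statistic against a uniform spread of final digits."""
    observed = last_digit_counts(values)
    total = sum(observed)
    if total == 0:
        raise ValueError("no values supplied")
    expected = total / 10
    return sum((count - expected) ** 2 / expected for count in observed)


def shows_digit_preference(values):
    """True when final digits deviate from uniform at the 0.05 level."""
    return digit_preference_chi2(values) > CHI2_CRITICAL_DF9_P05


# Made-up readings: one set with evenly spread final digits,
# one where every value ends in 0 or 5.
even_readings = list(range(100, 200))
rounded_readings = [120] * 50 + [125] * 50

even_stat = digit_preference_chi2(even_readings)
rounded_stat = digit_preference_chi2(rounded_readings)
```

In the made-up data the evenly spread readings give a statistic of 0, while the readings that all end in 0 or 5 give 400, well past the critical value. Honest rounding by an instrument or a technician can produce the same pattern, so the result only tells you where to look in the primary data.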