MUSC Postdoctoral Retreat on the Responsible Conduct of Research
“Data Management, Data Selection, and Reporting Research”
Ed Krug
BE101 876-2404 krugel@musc.edu
12/11/09

What does “Data Management” involve?
1) Physical record keeping
• lab notebook
• instrument printouts
• images
• computer analysis
• e-mail
• lab meeting discussion
• probes and cells
2) Analysis of primary data
• data selection
• statistical analysis
• calculations
• normalization
3) Presentation of results
• graphs
• tables
• pictures
• figures
4) Publication
• posters
• seminars
• manuscripts
• funding requests

Why is data management important?
• Mandated by funding agency
• Maintain focus on experimental objective
• Document observations
• Troubleshooting
• Organization of thoughts
• May analyze data differently in the future
• Communication with mentor

NIH Data Sharing Policy
“The NIH expects and supports the timely release and sharing of final research data from NIH-supported studies for use by other researchers.” NOT-OD-03-032

“Definition of Research Data: Recorded factual material commonly accepted in the scientific community as necessary to validate research findings.
It does not include preliminary analyses; drafts of scientific papers; plans for future research; peer reviews; communications with colleagues; physical objects (e.g., laboratory samples, audio or video tapes); trade secrets; commercial information; materials necessary to be held confidential by a researcher until publication in a peer-reviewed journal; information that is protected under the law (e.g., intellectual property); personnel and medical files and similar files, the disclosure of which would constitute an unwarranted invasion of personal privacy; or information that could be used to identify a particular person in a research study.”
http://grants1.nih.gov/grants/policy/data_sharing/index.htm

Essential Characteristics of a Research Notebook
• Bound pages
• Number every page - index in front
• Table of Contents
• Abbreviations list
• Relevant sections - title, purpose, materials & methods, results, conclusions, notes
• Purpose and M&M written before starting the experiment
• Includes reagent details, e.g. mfr, lot number, location, etc.
• Cross-referenced to computer files, images, etc.
• Cross-referenced with collaborator’s notebook
Notebooks are university property! They stay with the PI for 3 years past acceptance of the final financial statement (6 years from end of grant to be safe).

Example Data Record
• Date and Title
• Purpose
• Rationale
• Materials and Methods
• Data
• Calculations
• Interpretation of the Data
• Conclusions
• Future experiments
• Notes (follow-up discussion with colleagues or mentor – date separately)
I strongly suggest a listing of all notebook and page numbers corresponding to any data used for a poster, manuscript or funding application! Give a copy of this inventory to your mentor and any collaborators.
Studies show a very low reproducibility for articles published in scientific journals
• The biotech company Amgen had a team of about 100 scientists trying to reproduce the findings of 53 “landmark” articles in cancer research published by reputable labs in top journals. Only 6 of the 53 studies (about 11%) were reproduced. [Nature 483:531-33 (2012)]
• Scientists at the pharmaceutical company Bayer examined 67 target-validation projects in oncology, women’s health, and cardiovascular medicine. Published results were reproduced in only 14 of the 67 projects (about 21%). [Nature Reviews Drug Discovery 10:712 (2011)]
However, lack of reproducibility does not necessarily mean results are false … details are critical!
http://blog.jove.com/2012/05/03/studies-show-only-10-of-published-science-articles-arereproducible-what-is-happening

According to the NIH … (RFA-GM-15-006)
The lack of reproducibility is not due primarily to intentional fabrication or falsification of data. Rather, in many cases there is a lack of awareness of or adherence to sufficiently high standards in the planning and execution of scientific experiments, and in transparency in the reporting of science. Examples include inherently weak experimental designs and over-interpretation of statistically marginal differences and variability of materials, differing brands or even lots of reagents, differing or drifting strains of organisms and cells in culture, or other variables that have not been adequately controlled. A study of the published literature has found large numbers of reports where basic information such as the numbers of animals, definition of control groups, randomization, blinding, and so forth cannot be determined and/or assumed to be adequate.
Graduate students were often significantly dependent on the mentor or the mentor's lab for the training received, and postdoctoral fellows were primarily dependent on the mentor or mentor's lab at all institutions.
Rather than being learned in prescribed curricula, training in good laboratory practices that influence data reproducibility appears to be largely passed down from generation to generation of working scientists, with substantial variation from laboratory to laboratory.

… and researchers agree!
Neuron 84: 572-582 (2014)

Suggested Best Practices to Enhance Rigor and Reproducibility
Neuron 84: 572-582 (2014)

Organize your computer drive using informative grouping

Use a consistent file naming strategy:
• Electronic data and samples: <notebook ID>-<page #>-<sample ID>-<analysis parameters> (e.g. EK01-105-12a-40X-areaC)
• Files: <your last name>-<keyword> (e.g. Krug R01 renewal draft 092412) – append your initials to revisions (e.g. Krug R01 renewal draft 092412-jd)
Describe the naming strategy on the front page of each notebook – keep an electronic version on your desktop

Image Manipulation Issues

“Data may be excluded from the experimental results only if you have a sound reason to do so!” Mother Nature
Indicate the justification in your notebook for any observations excluded from an experiment.

Examples of misleading presentation of data
• Origin of graph not zero
• Number of animals vs. number of determinations
• Non-standard normalization method
• “typical results”
• Not showing the entire gel
• “cleaning up” the data

The Journal of Cell Biology:
• 25% of accepted papers have at least 1 figure with undocumented manipulation
• revokes the acceptance of about 1% of its papers due to inappropriate image manipulation
If you misrepresent your data you are deceiving people – as well as precluding alternative interpretation of the data in the future.
Rossner, M. (2006). The Scientist 20:24-25.

[Figure panels: as photographed vs. brightness/contrast adjustment; as photographed vs. erasing]
Brightness and contrast adjustments must be applied equally across all images being compared
Rossner, M. (2006). The Scientist 20:24-25.
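To make "applied equally across all images being compared" concrete, here is a minimal Python sketch. It assumes 8-bit grayscale images stored as plain lists of pixel rows. The function names (`adjust_linear`, `adjust_all`, `clipped_fraction`) and the gain/offset model are illustrative assumptions, not part of the cited guidelines or of any imaging package. The point is only that one set of settings is used for every image in the comparison, and that the fraction of pixels pushed to pure black or white is reported as a crude warning sign of lost information.

```python
from typing import List

# Assumption: an image is an 8-bit grayscale grid, rows of pixel values 0-255.
Image = List[List[int]]


def adjust_linear(image: Image, gain: float, offset: float) -> Image:
    """Apply new = gain * old + offset to EVERY pixel, clipped to 0-255."""
    return [
        [min(255, max(0, round(gain * px + offset))) for px in row]
        for row in image
    ]


def adjust_all(images: List[Image], gain: float, offset: float) -> List[Image]:
    """Apply the same gain and offset to every image being compared."""
    return [adjust_linear(img, gain, offset) for img in images]


def clipped_fraction(image: Image) -> float:
    """Fraction of pixels at pure black (0) or pure white (255)."""
    total = sum(len(row) for row in image)
    clipped = sum(1 for row in image for px in row if px in (0, 255))
    return clipped / total if total else 0.0


# Hypothetical control and treated images, adjusted with identical settings.
control = [[0, 100], [200, 50]]
treated = [[10, 20], [30, 40]]
adj_control, adj_treated = adjust_all([control, treated], gain=2.0, offset=10)
```

Whatever settings are used (here gain 2.0, offset 10) belong in the notebook next to the file names of the original, unprocessed images.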
How much can you adjust brightness & contrast?
Rossner, M. and Yamada, K. (2004). J. Cell Biol. 166:11-15.

Data that look too good to be true …
Rossner, M. and Yamada, K. (2004). J. Cell Biol. 166:11-15.

All data have signatures – even gels

“Innovative approaches” to data presentation are not allowed!

What CAN I do?
• List all image acquisition and processing tools and software
• Document key image-gathering settings and manipulations in the Supplemental materials
• Clearly demarcate borders between images collected at different times
• Avoid use of touch-up tools or deliberately obscuring parts of an image
• Processing is acceptable only if applied across the entire image
• Be prepared to deliver the original, unprocessed images to the editor
Nature, instructions to authors (2006)

http://research.unl.edu/researchresponsibility/responsible-conduct-of-research/
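Returning to the file naming strategy for electronic data and samples (<notebook ID>-<page #>-<sample ID>-<analysis parameters>, e.g. EK01-105-12a-40X-areaC): the Python sketch below shows one way such names could be built and parsed consistently. The helper names (`build_name`, `parse_name`, `DataFileName`) are hypothetical. So are the rules that each field contains only letters and digits and that every hyphen-separated part after the sample ID is an analysis parameter. Those rules are assumptions made for illustration, not something the slides prescribe.

```python
import re
from typing import NamedTuple, Tuple


class DataFileName(NamedTuple):
    notebook_id: str           # e.g. "EK01"
    page: int                  # notebook page number, e.g. 105
    sample_id: str             # e.g. "12a"
    analysis: Tuple[str, ...]  # e.g. ("40X", "areaC")


# Assumption: fields are alphanumeric; analysis parameters are optional.
_PATTERN = re.compile(
    r"^(?P<notebook>[A-Za-z0-9]+)-(?P<page>\d+)-(?P<sample>[A-Za-z0-9]+)"
    r"(?:-(?P<analysis>.+))?$"
)


def build_name(notebook_id: str, page: int, sample_id: str, *analysis: str) -> str:
    """Join the fields with hyphens, rejecting empty or hyphenated fields."""
    parts = [notebook_id, str(page), sample_id, *analysis]
    for part in parts:
        if not part or "-" in part:
            raise ValueError(f"invalid field for a file name: {part!r}")
    return "-".join(parts)


def parse_name(name: str) -> DataFileName:
    """Split a name back into its fields so it can be traced to a notebook page."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"name does not follow the convention: {name!r}")
    raw = match.group("analysis")
    analysis = tuple(raw.split("-")) if raw else ()
    return DataFileName(
        match.group("notebook"), int(match.group("page")),
        match.group("sample"), analysis,
    )


example = build_name("EK01", 105, "12a", "40X", "areaC")
parsed = parse_name(example)
```

Parsing a name back into notebook ID and page number is what makes the suggested inventory of notebook pages behind each poster, manuscript, or funding application easy to assemble.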