Data Management & Record Keeping

MUSC Postdoctoral Retreat on the
Responsible Conduct of Research
“Data Management, Data Selection,
and Reporting Research”
Ed Krug
BE101
876-2404
krugel@musc.edu
12/11/09
What does “Data Management” involve?
1) Physical record keeping
• lab notebook
• instrument printouts
• images
• computer analysis
• e-mail
• lab meeting discussion
• probes and cells
2) Analysis of primary data
• data selection
• statistical analysis
• calculations
• normalization
3) Presentation of results
• graphs
• tables
• pictures
• figures
4) Publication
• posters
• seminars
• manuscripts
• funding requests
Why is data management important?
• Mandated by funding agency
• Maintain focus on experimental objective
• Document observations
• Troubleshooting
• Organization of thoughts
• May analyze data differently in the future
• Communication with mentor
NIH Data Sharing Policy
“The NIH expects and supports the timely release and
sharing of final research data from NIH-supported studies
for use by other researchers.” NOT-OD-03-032
“Definition of Research Data: Recorded factual material commonly
accepted in the scientific community as necessary to validate research
findings. It does not include preliminary analyses; drafts of scientific
papers; plans for future research; peer reviews; communications with
colleagues; physical objects (e.g., laboratory samples, audio or video
tapes); trade secrets; commercial information; materials necessary to
be held confidential by a researcher until publication in a peer-reviewed
journal; information that is protected under the law (e.g., intellectual
property); personnel and medical files and similar files, the disclosure of
which would constitute an unwarranted invasion of personal privacy; or
information that could be used to identify a particular person in a
research study.”
http://grants1.nih.gov/grants/policy/data_sharing/index.htm
Essential Characteristics of a Research Notebook
• Bound pages
• Number every page - index in front
• Table of Contents
• Abbreviations list
• Relevant sections - title, purpose, materials & methods, results, conclusions, notes
• Purpose and M&M written before starting the experiment
• Includes reagent details, e.g. mfr, lot number, location, etc.
• Cross-referenced to computer files, images, etc.
• Cross-referenced with collaborator’s notebook
Notebooks are university property!
They stay with the PI for 3 years past acceptance of the final financial
statement (6 years from the end of the grant, to be safe)
Example Data Record
• Date and Title
• Purpose
• Rationale
• Materials and Methods
• Data
• Calculations
• Interpretation of the Data
• Conclusions
• Future experiments
• Notes (follow up discussion with colleagues or mentor – date separately)
I strongly suggest a listing of all notebook and
page numbers corresponding to any data used
for a poster, manuscript or funding application!
Give a copy of this inventory to your mentor and
any collaborators.
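One way to keep such an inventory is as a small structured list that can be printed and handed to a mentor. The sketch below is only an illustration: the class name `SourceRecord`, the function `format_inventory`, and the figure labels and notebook IDs in the usage note are hypothetical, not part of any required format.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class SourceRecord:
    """One item in a poster/manuscript/grant and where its data live."""
    item: str                 # e.g. a figure or table label (hypothetical)
    notebook: str             # notebook ID
    pages: Tuple[int, ...]    # notebook page numbers holding the data


def format_inventory(title: str, records: Iterable[SourceRecord]) -> str:
    """Return a plain-text inventory, sorted by item and then notebook."""
    lines = [f"Data inventory for: {title}"]
    for rec in sorted(records, key=lambda r: (r.item, r.notebook)):
        pages = ", ".join(str(p) for p in rec.pages)
        lines.append(f"{rec.item}: notebook {rec.notebook}, pages {pages}")
    return "\n".join(lines)
```

For example, `format_inventory("Poster draft", [SourceRecord("Fig. 1A", "EK01", (105,))])` produces a two-line listing that can be printed and pasted into the notebook itself.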
Studies show a very low reproducibility for
articles published in scientific journals
• The biotech company Amgen had a team of about 100
scientists trying to reproduce the findings of 53 “landmark”
articles in cancer research published by reputable labs in
top journals. Only 6 of the 53 studies were reproduced.
[Nature 483:531-33 (2012)]
• Scientists at the pharmaceutical company Bayer examined
67 target-validation projects in oncology, women’s health,
and cardiovascular medicine. Published results were
reproduced in only 14 out of 67 projects. [Nature Reviews
Drug Discovery 10:712 (2011)]
However, lack of reproducibility does not necessarily
mean results are false … details are critical!
http://blog.jove.com/2012/05/03/studies-show-only-10-of-published-science-articles-arereproducible-what-is-happening
According to the NIH … (RFA-GM-15-006)
The lack of reproducibility is not due primarily to intentional fabrication or falsification
of data. Rather, in many cases there is a lack of awareness or adherence to sufficiently
high standards in the planning and execution of scientific experiments, and in
transparency in the reporting of science. Examples include inherently weak
experimental designs and over-interpretation of statistically marginal differences and
variability of materials, differing brands or even lots of reagents, differing or drifting
strains of organisms and cells in culture, or other variables that have not been
adequately controlled.
A study of the published literature has found large numbers of reports where basic
information such as the numbers of animals, definition of control groups,
randomization, blinding, and so forth cannot be determined and/or assumed to be
adequate.
Graduate students were often significantly dependent on the mentor or the mentor's
lab for the training received, and postdoctoral fellows were primarily dependent on
the mentor or mentor's lab at all institutions. Rather than being learned in prescribed
curricula, training in good laboratory practices that influence data reproducibility
appears to be largely passed down from generation to generation of working
scientists, with substantial variation from laboratory to laboratory.
… and researchers agree!
Neuron 84: 572-582 (2014)
Suggested Best Practices to Enhance Rigor and Reproducibility
Organize your computer drive using informative grouping
[Figure: two folder layouts compared]
Use a consistent file naming strategy:
• Electronic data and samples
<notebook ID>-<page #>-<sample ID>-<analysis parameters>
(e.g. EK01-105-12a-40X-areaC)
• Files
<your last name>-<keyword>
(e.g. Krug R01 renewal draft 092412)
– append your initials to revisions
(e.g. Krug R01 renewal draft 092412-jd)
Describe the naming strategy on the front page of each notebook
– keep an electronic version on your desktop
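The data-and-sample naming pattern above can be built and checked by a small script, which helps keep names consistent across a lab. This is a minimal sketch, assuming the first three fields never contain a hyphen and that everything after the third hyphen is analysis parameters; the function names `build_data_name` and `parse_data_name` are hypothetical.

```python
import re
from typing import NamedTuple

# <notebook ID>-<page #>-<sample ID>-<analysis parameters>
# Assumption: notebook ID and sample ID contain no hyphens; the page is
# numeric; the remainder of the name is the analysis parameters.
NAME_PATTERN = re.compile(
    r"^(?P<notebook>[^-]+)-(?P<page>\d+)-(?P<sample>[^-]+)-(?P<params>.+)$"
)


class DataName(NamedTuple):
    notebook: str
    page: int
    sample: str
    params: str


def build_data_name(notebook: str, page: int, sample: str, *params: str) -> str:
    """Join the fields with hyphens, rejecting fields that would be ambiguous."""
    if "-" in notebook or "-" in sample:
        raise ValueError("notebook ID and sample ID must not contain '-'")
    if not params:
        raise ValueError("at least one analysis parameter is required")
    return "-".join([notebook, str(page), sample, *params])


def parse_data_name(name: str) -> DataName:
    """Split a name back into its fields, or raise if it does not fit."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"name does not follow the convention: {name!r}")
    return DataName(match["notebook"], int(match["page"]),
                    match["sample"], match["params"])
```

With the slide's own example, `build_data_name("EK01", 105, "12a", "40X", "areaC")` gives `EK01-105-12a-40X-areaC`, and `parse_data_name` recovers notebook `EK01` and page `105` from that name.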
Image Manipulation Issues
“Data may be excluded from the
experimental results only if you have
a sound reason to do so!”
Mother Nature
Indicate the justification in your notebook for any
observations excluded from an experiment
Examples of misleading
presentation of data
• Origin of graph not zero
• Number of animals vs. number of
determinations
• Non-standard normalization method
• “typical results”
• Not showing the entire gel
• “cleaning up” the data
The Journal of Cell Biology:
• 25% of accepted papers have at least 1 figure
with undocumented manipulation
• revokes the acceptance of about 1% of its
papers due to inappropriate image manipulation
If you misrepresent your data you are deceiving
people – as well as precluding alternative
interpretation of the data in the future.
Rossner, M. (2006). The Scientist 20:24-25.
[Figure panels: as photographed vs. brightness/contrast adjustment; as photographed vs. erasing]
Brightness and contrast adjustments must be applied
equally across all images being compared
Rossner, M. (2006). The Scientist 20:24-25.
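The rule of equal adjustment can be illustrated with a short script that applies one linear brightness/contrast setting to every pixel of every image in a comparison. This is a minimal sketch, assuming 8-bit grayscale images stored as nested lists; the names `adjust_linear` and `adjust_all` and the `gain`/`offset` parameters are hypothetical, and real work would use the imaging software's documented tools and record the settings.

```python
from typing import Dict, List

Image = List[List[int]]  # rows of pixel intensities (assumed 8-bit grayscale)


def adjust_linear(image: Image, gain: float, offset: float,
                  max_value: int = 255) -> Image:
    """Apply the same linear adjustment to every pixel of one image.

    gain scales contrast and offset shifts brightness; results are clipped
    to the range 0..max_value. The original image is left unmodified.
    """
    return [
        [min(max_value, max(0, round(gain * px + offset))) for px in row]
        for row in image
    ]


def adjust_all(images: Dict[str, Image], gain: float, offset: float) -> Dict[str, Image]:
    """Apply one (gain, offset) setting to all images being compared."""
    return {name: adjust_linear(img, gain, offset) for name, img in images.items()}
```

Because `adjust_all` takes a single setting for the whole set, a control and a treated image cannot receive different adjustments by accident. Note that clipping discards information, so the unprocessed originals should always be kept.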
How much can you adjust brightness & contrast?
Rossner, M. and Yamada, K. (2004). J. Cell Biol. 166:11-15.
Data that looks too good to be true …
Rossner, M. and Yamada, K. (2004). J. Cell Biol. 166:11-15.
All data have signatures – even gels
“Innovative approaches” to data presentation
are not allowed!
What CAN I do?
• List all image acquisition and processing tools
and software
• Document key image-gathering settings and
manipulations in the supplemental materials
• Clearly demarcate borders between images
collected at different times
• Avoid use of touch up tools or deliberately
obscuring parts of an image
• Processing is acceptable only if applied across
the entire image
• Be prepared to deliver the original, unprocessed
images to the editor
Nature, instructions to authors (2006)
http://research.unl.edu/researchresponsibility/responsible-conduct-of-research/