The BioCADDIE / FORCE11
Data Citation Pilot
Tim Clark, Ph.D.
Harvard Medical School & Massachusetts General Hospital
Maryann Martone, Ph.D.,
University of California at San Diego
October 13, 2015
© 2015 FORCE11.org
Background
• BD2K Aim #1: “To facilitate broad use of biomedical digital assets by making them discoverable, accessible and citable.” (NIH 2015) 1
• Data that are robustly archived and directly cited in journal articles can provide powerful input to BioCADDIE content and operations.
• Significant work has already been done on data citation and provides a foundation on which to proceed.
• Several top-tier publishers are planning to implement this approach but need assistance.
• This pilot will organize a coordination activity and provide communication across participating groups.
Background Documents
• CODATA & National Academies Reports (2012-2013) 2, 3
• Joint Declaration of Data Citation Principles / JDDCP (2014) 4
• Collins & Tabak (2014) on NIH reproducibility initiatives 5
• JDDCP Implementation Guidelines (Starr et al. 2015) 6
• ELIXIR/BD2K & BioCADDIE/FORCE11 Workshops (Jan 2015) 7
• BioCADDIE supplement for Data Citation Implementation Pilot, subcontract to FORCE11 through UCSD (Oct 2015)
Objectives
• Provide coordination & guidance for early adopters of data citation: publishers, repositories and ID / metadata services.
• Help establish one or more benchmark implementations by important early adopters across key use cases.
• Focus on archiving and citing primary research data.
• Coordinate with CODATA’s international workshops on data citation, complementary to the focused early adopter pilot.
• Publish several peer-reviewed articles and a final report.
• Provide report on lessons learned to the community.
DCIP Executive Committee
• Tim Clark, Harvard Medical School.
• Carole Goble, U of Manchester, ELIXIR Deputy Director for the UK.
• Jeff Grethe, UC San Diego, BioCADDIE Executive Committee.
• Simon Hodson, Executive Director, CODATA.
• Maryann Martone, UC San Diego & Hypothes.is.
• Jo McEntyre, EMBL/EBI, European PubMed Central.
• Joan Starr, California Digital Library.
Agreed Participants to Date
• Publishers
  • Elsevier, PLoS, Biomed Central, eLife, F1000, GigaScience
• Repositories
  • Dryad, Figshare, PDB, European PMC (EMBL/EBI)
  • Columbia University Library, Harvard Dataverse
• Metadata & ID
  • California Digital Library, DataCite, CrossRef, ORCID
• Standards, Academic & Scholarly Organizations
  • JATS Standing Committee, CODATA, ELIXIR
  • BioCADDIE
Approach
• Publishers
  • use JATS 1.1d2/3 schema for documents
  • common data citation workflows & core metadata
• Repositories
  • use JDDCP implementation guidelines
• Authors
  • provide authors a common FAQ web page
  • assist publisher operations group in supporting authors
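As an illustration of what "core metadata" in a JATS document might look like, the following is a minimal sketch, not the pilot's actual workflow. It assumes a dataset is described by creators, year, title, repository and DOI, and renders these as a JATS-style data citation. The function name `data_citation`, the metadata keys, and the example record (including the placeholder DOI) are hypothetical; the element and attribute names reflect one reading of the JATS 1.1 drafts and should be checked against the JATS tag library before use.

```python
import xml.etree.ElementTree as ET


def data_citation(meta):
    """Build a JATS-style <element-citation> for a dataset (illustrative only)."""
    # publication-type="data" marks the reference as a data citation.
    cite = ET.Element("element-citation", {"publication-type": "data"})

    # Dataset creators, listed as authors.
    group = ET.SubElement(cite, "person-group", {"person-group-type": "author"})
    for surname, given in meta["creators"]:
        name = ET.SubElement(group, "name")
        ET.SubElement(name, "surname").text = surname
        ET.SubElement(name, "given-names").text = given

    ET.SubElement(cite, "year").text = str(meta["year"])
    # Dataset title (assumed element name: data-title).
    ET.SubElement(cite, "data-title").text = meta["title"]
    # The repository holding the data is recorded as the source.
    ET.SubElement(cite, "source").text = meta["repository"]
    # Persistent identifier for the archived dataset.
    ET.SubElement(cite, "pub-id", {"pub-id-type": "doi"}).text = meta["doi"]
    return cite


# Hypothetical record; the DOI is a placeholder, not a real identifier.
example = {
    "creators": [("Doe", "J")],
    "year": 2015,
    "title": "Example dataset (hypothetical)",
    "repository": "Example Repository",
    "doi": "10.1234/example",
}

xml_text = ET.tostring(data_citation(example), encoding="unicode")
print(xml_text)
```

Keeping the metadata in one plain record and generating the markup from it is one way publishers and repositories could share the same core fields while each renders them in its own format.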
Proposed Deliverables
1. Principles and Entailments for Direct Scientific Data Citation on the Web - peer
reviewed archival version of the JDDCP with background material
2. Five Steps to Citing Research Data - summary JDDCP implementation guidelines
3. Citing Data with the Journal Article Tag Suite - detailed guidance for using the 1.1d2
and 1.1d3 NISO JATS revisions for data citation in publishing
4. Data Citation FAQ - dynamic feature on the FORCE11 site
5. Data Citation Ask the Experts - social media feature on FORCE11 website
6. Ongoing input to CODATA Global Workshops on Data Citation
7. Ongoing implementation guidance and coordination across Pilot stakeholders
8. Citing Data in Action: Experiences and lessons learned from the BioCADDIE Data
Citation Implementation Pilot - 1 year report on the pilot
BioCADDIE Integration
• All metadata and identifiers available for harvesting
• Ensure that archived content is indexable
• Coordination with BioCADDIE use case & activities
To Sum Up
• Data Citation Implementation Pilot over 1 year
• Organize “early adopter” stakeholders
• Provide expert feedback, authoritative guidance
• Enable stakeholders to implement successfully
• Publish guidance and lessons learned
• Strong BioCADDIE support and integration
References
1. NIH: About BD2K. National Institutes of Health, 2015. Accessed October 12, 2015.
[https://datascience.nih.gov/bd2k/about].
2. Uhlir P: For Attribution - Developing Data Attribution and Citation Practices and
Standards: Summary of an International Workshop. The National Academies
Press; 2012: 220 [http://www.nap.edu/catalog.php?record_id=13564].
3. CODATA/ICSTI Task Force on Data Citation: Out of cite, out of mind: The Current State
of Practice, Policy and Technology for Data Citation. Data Science Journal 2013, 12:1-75.
4. Data Citation Synthesis Group: Joint Declaration of Data Citation Principles. Edited by
Martone M. San Diego CA: Future of Research Communication and e-Scholarship
(FORCE11); 2014 [https://www.force11.org/datacitation].
5. Collins FS, Tabak LA: Policy: NIH plans to enhance reproducibility. Nature 2014,
505(7485):612.
6. Starr J, Castro E, Crosas M, Dumontier M, Downs RR, Duerr R, Haak LL, Haendel M,
Herman I, Hodson S, Hourclé J, Kratz JE, Lin J, Nielsen LH, Nurnberger A, Proell S, Rauber A,
Sacchi S, Smith A, Taylor M, Clark T: Achieving human and machine accessibility of
cited data in scholarly publications. PeerJ Computer Science 2015, 1:e1.