ANDS Webinar, 5 June 2014

Natasha Simons
Senior Data Management Specialist, Australian National Data Service
Located at: Griffith University, Brisbane, Australia
http://orcid.org/0000-0003-0635-1998
Tw: @n_simons

• Established in 1971 and opened in 1975
• Now has five South-East Queensland campuses
• Around 43,000 students and 4,300 staff
• 26 schools and departments in four academic groups:
  • Arts, Education and Law
  • Business
  • Health
  • Science, Environment, Engineering and Technology
Image credit: Danny Munnerley, http://www.flickr.com/photos/munnerley/6381877583/

32 research centres and institutes
Priority areas:
• Water science
• Drug discovery and infectious diseases
• Asian politics, security and development
• Climate change adaptation
• Criminology and crime prevention
• Music, the arts and the Asia Pacific
• Sustainable tourism
• Chronic disease prevention
• Physical sciences
• Environmental sciences
• Nursing Education
Image credit: Anne Ruthmann, http://www.flickr.com/photos/annemarlow/8392238157/

• Strong commitment from University leaders to improving data management
• Staff resources: operational and project related
• Successful in seeking funding from ANDS and NeCTAR to build national and local infrastructure
• Strong emphasis on seeking internal funds and working with researchers on grants for funds to develop, enhance and support institutional tools
• Policy frameworks and service models for data management support under discussion

There are really only two things you need before you start on a data citation journey:
1. Some research data collections at your institution that have open, embargoed or mediated access.
2. A publicly available metadata record that describes each of these collections and provides access to them.
Image credit: http://www.peregrineadventures.com/blog/13/02/2012/great-packing-debate

At Griffith, we have:
1. Research Data Repository - http://equella.rcs.griffith.edu.au/research/logon.do
2. Research Hub (metadata store/researcher profile system) - http://researchhub.griffith.edu.au

On your journey, you may also need:
1. Management support: Malcolm Wolski, Director, eResearch Services & Scholarly Application Development, Division of Information Services, Griffith University
2. Technical support: Arve Solland, Senior Developer, eResearch Services & Scholarly Application Development, Division of Information Services, Griffith University

2011
• August: ‘PIDs for data’ options paper, recommended DOIs
• August: ANDS launched Cite My Data service pilot
• September to December: signed up; developed m-2-m scripts, minted DOIs
2012
• c. May: put ‘Cite this collection’ feature in Griffith Research Hub
• October: commenced data citation project
2013
• May: concluded data citation project
• September: produced DOI guidelines; developed roadmap

Griffith needed a persistent identifier that would:
• Fill gaps in persistent identifiers for scholarly works
• Replace long and incomprehensible URLs for metadata
• Signal long-term management of our research data collections
• Contribute to the semantic vision for data in the Research Hub
• Later: provide a foundation for data citation
We chose DOIs to meet our needs because they:
• Are a global persistent identifier, already used for many scholarly publications
• Can be assigned to research data, theses, grey literature and even software code
• Improve visibility of, and access to, research data
• Give us responsibility for managing persistent access to our data collections
• Won’t break when IR software is re-indexed (as handles sometimes do)
Later, because they:
• Facilitate data citation
• Greatly assist tracking the impact of data sets through collection of metrics and altmetrics based on the DOI

The ANDS Cite My Data service provided:
• Partnership with an international DOI registration agency: DataCite
• Minting DOIs for metadata records about open, mediated or embargoed research data, theses, grey literature (even software code)
• Machine-to-machine (m-2-m) workflow
• Easily achieved kernel metadata
• Trial in a safe test environment
• High level documentation for the m-2-m provided by ANDS
• High level information on data citation on the ANDS website
• Free!

And so we became the first guinea pigs of the Cite My Data service…
1. Sign agreement to use the service
2. ANDS gives you an institutional id
3. Prepare your m-2-m script (includes required metadata for each DOI: title, creator, publisher, publication year, identifier)
4. Execute script against the Cite My Data service
5. Cite My Data service returns DOIs
6. Store DOIs in own system
7. Create citation element
8. Make citation element available in RIF-CS feed for ANDS harvester
DOI scripts: https://github.com/gu-eresearch/ANDSDOIScripts

• What are the criteria for assigning a DOI to a research data collection?
• At what level of granularity should a DOI be applied?
• Should the DOI link to the landing page or the actual data? Which landing page?
• What if the data is changed, e.g. updated? Should a new DOI be issued?
• Should researchers be able to mint the DOI or should we mint it for them?
• How are DOIs assigned if the research data is the result of a collaboration between various institutions?
• What happens to the DOIs we have minted if ANDS closes shop?
• Can you cite data without a DOI?

D-Lib article: Implementing DOIs for Research Data
http://dx.doi.org/10.1045/may2012-simons

We found answers to our questions and wrote them up in guidelines:
• Digital Object Identifiers (DOIs): Introduction and Management Guide. Available for download from the ANDS website: http://ands.org.au/cite-data/griffith_doi_guidelines-4.pdf
• ANDS DOI FAQs: http://ands.org.au/cite-data/doi_q_and_a.html
• Documented our experiences in the Gold Standard Project @ Griffith blog: http://ands-gold-griffith.blogspot.com.au/

• Established a blog - http://data-citation-griffith.blogspot.com.au/
• Spoke with librarians about citation practices in different disciplines
• Included data citation as part of standard consultations with a group in Health and an individual in environmental economics

Notifications workflows
• Investigated Dryad automated notifications workflows
• Modified their depositor notification
• Manually emailed collection owners of new collections
• Notifications added to technical requirements for data deposit

Reviewed existing information and workflows
• Griffith policies and procedures
• Academic style guides
• Training materials and guides
• Included data citation in new Best practice guidelines for researchers: managing research data and primary materials

Disciplines; Citation practices; Style guides; Publishing protocols; Target audiences; Types of research output; Usage of metrics; Age and career stage; Attitudes to open access; Motivations; Technical know-how
Image credit: Taki Steve, http://www.flickr.com/photos/13519089@N03/1380483002/

Find ‘hooks’ in the researchers’ workflows:
• e.g. point of data deposit
• e.g. final report on funded research
• e.g. through data planning
The long term goal should be to get in early: improve the training and supporting artefacts (style guides, bibliographic management software) that introduce new students and researchers to the principles of citation
Image credit: Todd Lappin, http://www.flickr.com/photos/telstar/433029904/

A depositor shouldn’t have to know what a DOI is or where it comes from, or be asked to decide whether they want one
Minting DOIs should be done automatically for collections that meet the rules defined by the ‘publisher’ of the deposited data (in this case, Griffith University) and the DOI registration agency
Image credit: Taki Steve, http://www.flickr.com/photos/13519089@N03/1380483002/

Be honest about the evidence base: they’re researchers, so they will ask!
Be honest about the lack of rewards within the current system and have empathy: researchers know better than us what they do and don’t get rewarded for

Culture of data citation (diagram)
Elements: Publisher policies; Funder mandates; Style guides; Tools e.g. EndNote, Zotero; Altmetrics; Bibliometrics; Research quality exercises; Identifier registration agencies; Information and training; Data repositories; Institutional procedures
Diagram labels:
• Collective action is needed for change in these areas
• We’re investigating these now and in the near future
• We mostly know what we are doing with these

D-Lib article: Growing Institutional Support for Data Citation: Results of a Partnership between Griffith University and the Australian National Data Service
http://dx.doi.org/10.1045/november2013-simons
What Griffith University are doing to establish a culture of data citation: https://www.youtube.com/watch?v=jDsD5cbIeZU

But we didn’t conquer the world…
On the ‘to do’ list:
• Embedding DOIs into automated data collection workflows
• Minting DOIs for grey literature: theses, reports, discussion papers etc.
• Improving links between research publications and underlying data
• Reviewing DOI guidelines, rules and workflows periodically
• Embedding types of metadata, such as COinS, into the landing pages to assist import into citation tools

Easy lessons learnt:
• Do what you can with what you have available
• Technical minting and maintaining of DOIs is relatively easy
• The Cite My Data service is straightforward
• Getting the citation element is also relatively easy
• There are a lot of materials available now on DOIs (infrastructure) and on data citation (researchers), so don’t reinvent the wheel
• You could decide to set up an administrator interface for minting and maintaining the DOIs (e.g. the way TERN have done this). This would run over the top of the m-2-m scripts.

Hard lessons learnt:
• Establishing workflows for DOIs and data citation is not easy if you don’t know when researchers are going to publish their data and if data publication is not routine
• Data citation is not (yet) common practice, but there is a large international community supporting data citation as a principle and encouraging the practice
• There is a growing body of evidence of a positive link between open data and citation counts
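
Sketch: from required metadata to a citation element
The m-2-m steps earlier in this talk (prepare the required metadata, store the returned DOI, create the citation element) can be illustrated with a minimal Python sketch. It is not the Griffith script (see the GitHub link above for that) and it does not call the Cite My Data service. The function names, the dictionary keys, the citation layout (creator, year, title, publisher, identifier) and the example DOI are all assumptions made for illustration.

```python
# Hypothetical sketch: names, record keys and citation layout are assumptions.
# It does not contact the ANDS Cite My Data service or mint anything.

# Metadata the slides list as required for each DOI; the identifier
# (the DOI itself) is passed in separately once it has been returned.
REQUIRED_FIELDS = ("title", "creator", "publisher", "publication_year")

# Resolver form used in the links elsewhere in these slides.
RESOLVER = "http://dx.doi.org/"


def missing_metadata(record):
    """Return the required fields that are absent or blank in a record."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = record.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


def build_citation(record, doi):
    """Build a plain-text citation for a collection that has a DOI."""
    missing = missing_metadata(record)
    if missing:
        raise ValueError("missing required metadata: " + ", ".join(missing))
    if not doi or not doi.strip():
        raise ValueError("a DOI (the identifier) is required")
    return "{creator} ({year}): {title}. {publisher}. {link}".format(
        creator=record["creator"],
        year=record["publication_year"],
        title=record["title"],
        publisher=record["publisher"],
        link=RESOLVER + doi.strip(),
    )


if __name__ == "__main__":
    # Made-up record and DOI, for demonstration only.
    example = {
        "title": "Example collection",
        "creator": "Researcher, A.",
        "publisher": "Griffith University",
        "publication_year": 2014,
    }
    print(build_citation(example, "10.5555/example"))
```

Checking the metadata before building the citation mirrors the point made earlier that depositors should not have to think about DOIs: a record that fails the check can be sent back for completion, and nobody has to decide by hand whether it qualifies.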