Getting to grips with Research Data Management
19th March 2015
Isabel Chadwick, Research Data Management Librarian
rdm-project@open.ac.uk

Overview of the workshop
• What is Research Data Management?
• Sharing data
• Working with data
• Planning for data
• Useful resources
• Questions?

What is Research Data Management?

“Research data management concerns the organisation of data, from its entry to the research cycle through to the dissemination and archiving of valuable results. It aims to ensure reliable verification of results, and permits new and innovative research built on existing information.”
Digital Curation Centre (2011) Making the Case for Research Data Management
http://www.dcc.ac.uk/sites/default/files/documents/publications/Making%20the%20case.pdf

Discussion
• Describe your research
• What type of data do you, or will you, create or use?

UK Data Archive Data Lifecycle model
Creating data
• Design research
• Plan data management
• Plan consent for sharing
• Locate existing data
• Collect data (experiment, observe, measure, simulate)
• Capture and create metadata
Processing data
• Enter data, digitise, transcribe, translate
• Check, validate, clean data
• Anonymise data
• Describe data
• Manage and store data
Analysing data
• Interpret data
• Derive data
• Produce research outputs
• Author publications
• Prepare data for preservation
Preserving data
• Migrate data to best format
• Migrate data to suitable medium
• Back-up and store data
• Create metadata and documentation
• Archive data
Giving access to data
• Distribute data
• Share data
• Control access
• Establish copyright
• Promote data
Re-using data
• Follow-up research
• New research
• Undertake research reviews
• Scrutinise findings
• Teach and learn
http://www.data-archive.ac.uk/create-manage/life-cycle

Data often have a longer lifespan than the research project that creates them. You may continue to work on data after funding has ceased; follow-up projects may analyse or add to the data; data may be re-used by other researchers.

Why spend time and effort on this?
• So you can work efficiently and effectively
  – Save time and reduce frustration
  – Highlight patterns or connections that might otherwise be missed
• Because your data is precious
• To enable data re-use and sharing
• To meet funders’ and institutional requirements

What does the OU expect?
“Research data must be managed to the highest standards throughout their life-cycle in order to support excellence in research practice. In keeping with OU principles of openness, it is expected that research data will be open and accessible to other researchers, as soon as appropriate and verifiable, subject to the application of appropriate safeguards relating to the sensitivity of the data and legal requirements.”
OU Principles of Research Data Management, April 2013
http://intranet.open.ac.uk/research-school/strategy-infogovernance/docs/CoPamendedJuly2013mergedwithappendix-forintranet.pdf

What do funders expect?
“Publicly funded research data are a public good, produced in the public interest, which should be made openly available with as few restrictions as possible in a timely and responsible manner that does not harm intellectual property.”
RCUK Common Principles on Research Data Policy, 2011
http://www.rcuk.ac.uk/research/datapolicy/

Sharing data

Benefits of sharing data

Benefits of sharing data (2)

Benefits of sharing data (3)

What do you need to share?
• Raw data
• Derived data
• Data underpinning publications
• Code
• Methods
What are research data in your context? What would others need to understand your research?

Barriers to sharing data: discussion
Discuss barriers to sharing your research data. These could be:
• Ethical
• Legal
• Professional
Can these barriers be overcome?

How can I share my data?
OU Data Catalogue in ORO
Data access statements
Funders’ repository services
• UK Data Service ReShare
• NERC data centres
Online data sharing services
• Figshare
• Zenodo
• CKAN DataHub
Directories
• re3data
• DataBib

Working with data

“Start as you mean to go on”
Data Sharing
The end point of all projects should involve making the data publicly available. Many data will be deposited in national archives which have regulations for files and metadata. Thinking about the requirements at the beginning of the project will limit the transformations needed at the end of the project.
https://www.youtube.com/watch?v=YQNadL5t8hg

Filing systems
Filing is more than saving files; it’s making sure you can find them later in your project.
• Naming
• Directory Structure
• File Types
• Versioning
All these help to keep your data safe and accessible.

Naming conventions
Decide on a file naming convention at the start of your project.
Useful file names:
• are consistent
• are meaningful to you and your colleagues
• allow you to find the file easily
Agree on the following elements of a file name:
• Vocabulary
• Punctuation
• Dates (YYYY-MM-DD)
• Order
• Numbers
• Version information
Ideally you should be able to tell what’s in a file before opening it.

File formats
Prefer formats that are:
• Unencrypted
• Uncompressed
• Non-proprietary and not patent-encumbered
• Open, documented standard
• Standard representation (ASCII, Unicode)

Type | Recommended | Avoid for data sharing
Tabular data | CSV, TSV, SPSS portable | Excel
Text | Plain text, HTML, RTF; PDF/A only if layout matters | Word
Media | Container: MP4, Ogg; Codec: Theora, Dirac, FLAC | Quicktime, H264
Images | TIFF, JPEG2000, PNG | GIF, JPG
Structured data | XML, RDF | RDBMS

Further examples: http://www.data-archive.ac.uk/create-manage/format/formats-table

Metadata
• Metadata is additional information that is required to make sense of your files – it’s data about data.
• This is not a new idea; consider your music or film collection.
• Think: title, authors, release date, producers, directors, etc.
• Maybe the artwork, the studio, or format
Image by Wilfried Joh: https://www.flickr.com/photos/wilfriedjoh/11494134233 (CC BY-NC-ND 2.0)

Metadata (2)
What contextual details are needed?
• Who is in this picture?
• When was it taken?
• Where are they?
• Who took this photo?
• How was this picture taken?

Backing up
Remember the 3-2-1 rule: 3 copies, 2 formats, 1 off-site

Sensitive data

Sensitive data (2)
Managing sensitive data
• If possible, collect the necessary data without using personally identifying information
• De-identify your data upon collection or as soon as possible thereafter
• Avoid transmitting unencrypted personal data electronically
• Consider whether you need to keep original collection instruments (recordings, surveys etc.) once they have been transcribed and quality assured

Storage and Security: Discussion
• Discuss the data security issues raised by the scenarios.
• What practical measures could have been taken to reduce risks to security?

Planning for data

Data Management Plans are useful whenever you are creating data to:
• Make informed decisions to anticipate and avoid problems
• Avoid duplication, data loss and security breaches
• Develop procedures early on for consistency
• Ensure data are accurate, complete, reliable and secure
• Save time and effort – make your life easier!

DMPOnline
A web-based tool to help you write DMPs according to different requirements. DCC, funder and OU guidance.
https://dmponline.dcc.ac.uk

Tips
• Keep it simple, short and specific
• Seek advice: consult and collaborate
• Base plans on available skills and support
• Make sure implementation is feasible
• Justify any resources or restrictions needed

Useful links
• VRE module: http://www.open.ac.uk/students/research/content/activites/researchdata-management
• The OU Research Data Management intranet site: http://intranet6.open.ac.uk/library/main/supporting-ou-research/research-datamanagement
• Digital Curation Centre: http://www.dcc.ac.uk/
• DMPOnline: https://dmponline.dcc.ac.uk/
• UK Data Archive: http://www.data-archive.ac.uk/
• MANTRA: http://datalib.edina.ac.uk/mantra/
• The Orb: http://open.ac.uk/blogs/the_orb

Questions?

Image credits
Unless otherwise stated, all images are by Jørgen Stamp at http://www.digitalbevaring.dk
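
Worked example: a file naming convention
The naming conventions slide under "Working with data" asks for an agreed vocabulary, punctuation, date format (YYYY-MM-DD), element order and version information. The Python sketch below shows one way such a convention might be written down and checked. The pattern project_description_YYYY-MM-DD_vNN.ext and the helper names slug, build_filename and is_conforming are assumptions made for illustration; they are not an OU, DCC or UK Data Archive standard, and a real project would agree its own elements.

```python
import re
from datetime import date

# Hypothetical convention (an assumption for illustration only):
#   <project>_<description>_<YYYY-MM-DD>_v<NN>.<ext>
# Words inside an element are joined with hyphens; elements with underscores.
FILENAME_PATTERN = re.compile(
    r"^[a-z0-9]+(-[a-z0-9]+)*_"   # project
    r"[a-z0-9]+(-[a-z0-9]+)*_"    # description
    r"\d{4}-\d{2}-\d{2}_"         # date as YYYY-MM-DD
    r"v\d{2}\.[a-z0-9]+$"         # two-digit version and file extension
)


def slug(text: str) -> str:
    """Lowercase the text and join its alphanumeric words with hyphens."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if not words:
        raise ValueError(f"no usable characters in {text!r}")
    return "-".join(words)


def build_filename(project: str, description: str, day: date,
                   version: int, ext: str) -> str:
    """Assemble a file name in the fixed element order of the convention."""
    if not 0 <= version <= 99:
        raise ValueError("version must fit in two digits")
    clean_ext = ext.lower().lstrip(".")
    return (f"{slug(project)}_{slug(description)}_"
            f"{day.isoformat()}_v{version:02d}.{clean_ext}")


def is_conforming(name: str) -> bool:
    """Return True if an existing file name follows the convention."""
    return FILENAME_PATTERN.match(name) is not None


if __name__ == "__main__":
    name = build_filename("Soil Survey", "Interview transcripts",
                          date(2015, 3, 19), 2, "csv")
    print(name)  # soil-survey_interview-transcripts_2015-03-19_v02.csv
    print(is_conforming("Final version (2).xlsx"))  # False
```

Putting the date in YYYY-MM-DD form and zero-padding the version number means that sorting files by name also sorts them by date and version, which supports the aim of knowing what is in a file before opening it.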