MANAGING & SHARING RESEARCH DATA
LOUISE CORTI
UK DATA SERVICE
UNIVERSITY OF ESSEX
University of Sussex, Why share your research data
14 February 2013

OVERVIEW FOR TODAY
• Why share data
• Data management planning
• Looking after your data
• Formatting and organising your data
• Storage, security, transfer and file sharing
• Copyright of data

UK DATA ARCHIVE

UK DATA SERVICE
• UK Data Archive has over forty years' experience in selecting, ingesting, curating and providing access to social science data
• huge experience of supporting researchers and data creators in social science and related disciplines
• has managed data sharing for the ESRC Data Policy since 1995, including the Rural Economy and Land Use Programme (2004-2012)
• our best practice approaches to making data shareable are based on:
  • challenges researchers face in sharing data
  • handling research data, both quantitative and qualitative
  • highly skilled staff comprising researchers, technical and information specialists
www.data-archive.ac.uk/create-manage

OUR MANAGING AND SHARING DATA RESOURCES
Managing and sharing guidance:
• sections
• references
• training programme
www.data-archive.ac.uk/create-manage
www.data-archive.ac.uk/media/2894/managingsharing.pdf
Training resources:
• presentations
• exercises and discussions / answers
www.data-archive.ac.uk/create-manage/training-resources

BENEFITS OF MANAGING AND SHARING YOUR DATA
• Data created from research are valuable resources that can be used and re-used for future scientific and educational purposes.
• Sharing data:
  • facilitates new scientific inquiry
  • avoids duplicate data collection
  • provides rich real-life resources for education and training
  • credits the data creator with intellectual effort
  • can help to support evidence of impact
  • is something to be proud of!

DATA LIFECYCLE & DATA MANAGEMENT PLANNING
A data management and sharing plan helps researchers consider, when research is being designed and planned, how data will be managed during the research process and shared afterwards with the wider research community.
Areas of coverage:
• Data management planning: why and how, and the research lifecycle
• Data management checklist
• Roles and responsibilities
• Costing data management

WHY DATA MANAGEMENT PLANNING?
• Research funders require planning for data management and data sharing, e.g. UK Research Councils:
  • which data
  • how to manage them
  • how to share, preserve and curate them
  • rights to access, use,….
  • roles & responsibilities
• Research benefits:
  • think about what to do with research data, how to collect them, how to look after them
  • keep track of research data (e.g. staff leaving)
  • identify support, resources, services needed
  • plan storage, short & long-term
  • plan security, ethical aspects
  • be prepared for data requests (FoI, funder)

COMPLETING A DMP
• Funder templates for DMPs:
  • ESRC DMP requirements in data policy and DMP guidance
  • MRC DMP guidance and template
  • AHRC technical plan requirements
• DCC’s DMPonline tool
• Good to share best practice in your own departments, e.g. what does an ideal DMP look like?
UK Data Archive data management checklist

KEY DATA MANAGEMENT INTERVENTION POINTS
[Figure labels; source: Green and Gutmann, 2007]
• Data formats, data migration
• Sign off consent form
• Agree data & metadata templates
• Licensing, terms and conditions for sharing, formal documentation
• Shared data sharing protocols

ROLES & RESPONSIBILITIES
Assign roles and responsibilities for data management; do not presume them.
Who?
• PI
• Research staff / students - collecting, creating, processing, analysing data
• External contractors - data collection, collation, processing, e.g. transcribers
• Support staff - managing, administering research
• Local Information Services - data storage, security, back-up services
• External/institutional data centres / archives - data sharing

COSTING
• Cost data management and sharing into research
• Identify resources needed to make research data shareable beyond the primary research team, over and above planned standard research procedures and practices
• Resources = people, equipment, infrastructure, tools to manage, document, organise, store and provide access to data
• Early planning can reduce costs
• See our data management costing tool

FORMATTING YOUR DATA
• Using standard and interchangeable or open lossless data formats ensures long-term usability of data
• High quality data are well organised, structured, named and versioned
• Authenticity of master files must be identified

• File formats and conversions
• Organising files and folders
• File naming
• Version control and authenticity
• Transcription

CAN YOU UNDERSTAND/USE THESE DATA?
SrvMthdDraft.doc
SrvMthdFinal.doc
SrvMthdLastOne.doc
SrvMthdRealVersion.doc

FILE FORMATS
• Choice of software format for digital data depends on:
  • planned data analyses
  • software availability/cost
  • hardware used, e.g. audio capture
  • discipline-specific standards and customs
• Digital data are endangered by obsolescence of software/hardware
• Best formats for long-term preservation are standard, interchangeable and open formats, e.g. tab-delimited, comma-delimited (CSV), ASCII, RTF, PDF/A, OpenDocument format, SPSS portable, XML
  See UK Data Archive optimal file formats for various data types
• Beware of errors or losses of data when converting formats!
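The warning about losses during conversion can be checked mechanically for delimited text files. The following is a minimal sketch in Python, assuming UTF-8 encoded files; the function names (read_table, convert_tab_to_csv, conversion_is_lossless) are illustrative and are not part of any UK Data Archive tool. It converts a tab-delimited file to CSV and then confirms that every row and cell value survived.

```python
import csv


def read_table(path, delimiter):
    """Read a delimited text file into a list of rows (lists of strings)."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.reader(handle, delimiter=delimiter))


def convert_tab_to_csv(source, target):
    """Convert a tab-delimited file to a comma-delimited (CSV) file."""
    rows = read_table(source, delimiter="\t")
    with open(target, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(rows)


def conversion_is_lossless(source, target):
    """Return True if both files contain the same rows and cell values."""
    return read_table(source, "\t") == read_table(target, ",")
```

A typical use would be to call convert_tab_to_csv("survey.tab", "survey.csv") and then check conversion_is_lossless("survey.tab", "survey.csv") before discarding anything (both file names are hypothetical). A check like this only compares cell values; annotation carried by formatting, such as colour highlighting in a spreadsheet, is not stored as text and has to be recorded separately.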

EXAMPLE: FORMAT CONVERSION
[Screenshot labels]
• MS Excel (.XLSX) format using colour highlighting for annotation
• Format change: loss of annotation
• Tab-delimited text format, and loss of colour annotation

ORGANISING DATA
• Plan in advance how best to organise data
• Use a logical structure and ensure collaborators understand it
Examples
• hierarchical structure of files, grouped in folders, e.g. audio, transcripts and annotated transcripts
• survey data - spreadsheet, SPSS, relational database
• interview transcripts - individual well-named files

FILE NAMING
• file name - principal identifier of file
• use logical naming, i.e. easy to identify, locate, retrieve, access
• naming provides organisation, context & consistency
• name elements: version number, date, content description, creator name
Best practice
• name independent of location
• brief & relevant
• no special characters, dots or spaces
• for separation use underscores _
• versioning via filename: ordinal and decimal version numbers
• use names to classify broad types of files
• avoid very long file names

VERSION CONTROL
Keep track of different copies or versions of data files
Which method:
• single site vs. across locations
• different versions to be stored vs. files to be synchronised
Best practice:
• unique identifiers for files (use a naming convention to keep track)
• record file status/versions
• record relationships between files, e.g. data file and documentation; similar data files
• keep track of file locations, e.g. laptop vs. PC
Various tools are available for versioning and synchronising files

EXAMPLE: VERSION CONTROL TABLE
[Example table not reproduced in this text version]

AUDIO TRANSCRIPTION
• adopt a uniform layout throughout the research project
• ensure compatibility with import features of Computer Assisted Qualitative Data Analysis Software (CAQDAS)
• role of transcription varies by discipline. What to transcribe?
  • verbal and non-verbal?
  • turn-taking?
  • ‘interruptions’
• who does it: researcher or service? Rules are needed
• implications of technologies: video, multiple camera, screen capture, webcams

STORING YOUR DATA
• Looking after research data for the longer term and protecting them from unwanted loss requires having good strategies in place for:
  • backing-up
  • transmission
  • secure storage
  • disposal

BACKING-UP DATA
• Why do back-ups? Risk of loss and change - would your data survive a disaster?
• Protect against: software failure, hardware failure, malicious attack, natural disasters
• Back-ups are additional copies that can be used to restore originals
• It’s not backed-up unless backed-up with a strategy

DIGITAL BACK-UP STRATEGY
Consider
• what’s backed-up? - all, some, just the bits you change?
• where? - original copy, external local and remote copies
• what media? - CD, DVD, external hard drive, tape, etc.
• how often? - assess frequency and automate the process
• for how long is it kept? Do data retention policies apply? Know your personal/institutional back-up strategy
• verify and recover - never assume, regularly test a restore
Backing-up need not be expensive
• 1 TB external drives are around £50, with back-up software
Consider non-digital storage too!

ENCRYPTION AND SECURITY
Always encrypt personal or sensitive data
• when moving data files, e.g. interview transcripts
• when storing files, e.g. shared areas, mobile devices
Free, easy-to-use software is available to
• encrypt hard drives, partitions, files and folders
• encrypt portable storage devices such as USB flash drives
• e.g. SafeHouse, TrueCrypt, AxCrypt
Protect data from unauthorised access, use, change, disclosure and destruction
• control access to all computers and devices
• control physical access to buildings, rooms, cabinets
• restrict access to sensitive materials, e.g. consent forms
Proper disposal of equipment and media
• even reformatting the hard drive is not sufficient

FILE SHARING & COLLABORATIVE ENVIRONMENTS
Sharing data between researchers and teams
• too often email attachments
• YouSendIt, Dropbox: consider whether appropriate, as services can be hosted outside the EU (DPA for personal data); e.g. encrypt
• Virtual Research Environments
  • MS SharePoint
  • Sakai
• file transfer protocol (ftp)
• physical media
• Essex ZendTo

OPTIONS FOR SHARING CONFIDENTIAL DATA
Researchers to consider
• obtaining informed consent, also for data sharing and preservation / curation
• protecting identities, e.g. anonymisation, not collecting personal data
• restricting / regulating access where needed (all or part of data), e.g. by group, use, time period. UK Data Archive has a gradation of access controls
• securely storing personal or sensitive data
Consider jointly and in dialogue with participants
Plan early in research

COPYRIGHT AND DATA SHARING
• Copyright permissions must be sought and granted prior to data sharing / archiving
• Clearing copyright: reach agreement with the copyright holder
• Repositories only publish data; they hold no copyright
• Copyright holders give permission to data archives to preserve data and make them accessible to users
• For secondary use, copyright clearance is needed before data can be reproduced
• Exception: fair dealing for non-commercial research, private study, teaching, quotations, criticism or review; author and source must then be cited

AN ARCHIVED RESEARCH COLLECTION
• Foot and Mouth in Cumbria dataset - qualitative study
• http://beta-discovery.dataservice.ac.uk/Home/DataCatalogueTab/5407
• High-quality data that are hard to replicate or collect
• Added value from reports, methods and links to publications

CONTACT
UK DATA ARCHIVE
UNIVERSITY OF ESSEX
WIVENHOE PARK
COLCHESTER
ESSEX CO4 3SQ
T: +44 (0)1206 872001
E: datasharing@data-archive.ac.uk
W: www.data-archive.ac.uk