Support for e-Research: Filling the Library Skills Gap
e-Research and the Data Librarian

Stuart Macdonald, Edinburgh University Data Library / EDINA National Data Centre
Luis Martinez, London School of Economics Data Library

E-Science Institute, University of Edinburgh, June 2007

• What are data?
• Where do you get it from?
• Data support services
• Developments in data storage, dissemination and analysis
• e-Research definition and examples
• DISC-UK DataShare

What are Data?
Some definitions:
• a collection of observations or other information related to a particular question, problem, experiment or place
• information, most commonly in the form of a series of binary digits, stored on a physical storage medium for manipulation by a computer program
• information in numerical form that can be digitally transmitted or processed
• a representation of facts, concepts, or instructions in a formalized manner suitable for communication, interpretation, or processing by humans or by automated means

Data Types
Social Sciences: micro data; aggregated data; geospatial data; financial data; qualitative data; in addition to commercial or private data (bank transactions, Tesco customer purchase records, government administrative records, CCTV footage)
‘Hard Science’: astronomical and meteorological observations; climate modelling; crystallography; gene sequence data; clinical and epidemiological records; mass spectrometry; satellite or archaeological images and aerial photography; polar orbit tracking data; chemical, structural and mechanical engineering data; remote sensing, …
Associated concerns:
• ethics (confidentiality/disclosure)
• scale (time/storage)
• proprietary formats
• copyright and legal
issues
• long-term preservation

‘increase the democratisation of knowledge’
More data will be created in the next five years than has been collected in the whole of human history. Properly managed, this data will form major resource for Australian researchers.
*Department of Education, Science and Training (2007) "Backing Australia's Ability - An Ongoing Commitment" – url: http://backingaus.innovation.gov.au/info_booklet/on_commit.htm
Researchers, government institutions, non-profit organizations, schools, commercial organizations, and individual citizens all need the widest possible access to data from all sources to explore, experiment, test, create new knowledge and new products, and, ultimately, to increase understanding of our world.
*Harlan Onsrud and James Campbell, Department of Spatial Information Science and Engineering, University of Maine [2006] – “Big Opportunities in Access to ‘Small Science’ Data”

Research Council-funded Data Centres
• EDINA, MIMAS (JISC/ESRC)
• UK Data Archive, ESDS (JISC/ESRC)
• Arts and Humanities Data Service (AHRC/JISC)
• NGDC - National Geoscience Data Centre (NERC)
• BADC - British Atmospheric Data Centre (NERC)
• AEDC - Antarctic Environmental Data Centre (NERC)
• NEODC - NERC Earth Observation Data Centre (NERC)
• BODC - British Oceanographic Data Centre (NERC)
• NEBC - NERC Environmental Bioinformatics Centre (NERC)
• UK Cluster Data Centre (Particle Physics and Astronomy Research Council)
• UK Stem Cell Bank (MRC)
• UK DNA Banking Network (MRC)
• Brain Tissue Bank (MRC)
• UKIDC - UK Infrared Space Observatory Data Centre (STFC)
• UKSSDC - UK Solar System Data Centre (STFC)
• Chemical Database
Service (STFC)

Other Sources
National Statistical Agencies:
• Office for National Statistics (ONS) - http://www.statistics.gov.uk/
• General Register Office for Scotland (GROS) - http://www.gro-scotland.gov.uk/
• Northern Ireland Statistics and Research Agency (NISRA) - http://www.nisra.gov.uk/
• Statistics for Wales - http://new.wales.gov.uk/topics/statistics/
• Eurostat - http://epp.eurostat.ec.europa.eu/portal/
Free Resources:
• Non-Governmental Organisations
• Government websites (national/local)
• Independent Research Organisations
• Charitable Organisations
• Media Organisations
Data Discovery Tools:
• Intute - http://www.intute.ac.uk/
• Go-Geo! - http://www.gogeo.ac.uk/

Data Support Services
Institutions provide support for data services in different ways:
• Data Libraries
• University Libraries
• Computing Centres
• Research Offices
• Academic Departments
Data Libraries go beyond local support of national data centres & statistical agencies:
• Act as a ‘repository’ of data
• Reference service
• Train users to access and handle data resources

UK Data Libraries
• Edinburgh University Data Library – first such service in the UK, 1983
• Oxford University Data Library – 1988
• London School of Economics Data Library – 1997
• RLab Data Service – 1999, providing support to LSE’s research laboratory
Other institutions with ‘Social Statistics’ libraries:
• University of Southampton
• Strathclyde University
DISC-UK (Data Information Specialist Committee – UK)
• Foster understanding between data users and providers
• Raise awareness of the value of data support in Universities
• Share information and resources among local data support staff
• URL: http://www.disc-uk.org/

Web 2.0 – lateral thinking in a linear world?
Blogs and wikis – WordPress, Blogger
Social bookmarking – del.icio.us
Media-sharing services – YouTube, Flickr, Scribd
Social networking systems – MySpace, Elgg
Collaborative editing tools – Google Docs and Spreadsheets, Gliffy
Syndication technologies – RSS
Mashups:
Numeric Data:
Swivel - http://swivel.com/
Many Eyes - http://services.alphaworks.ibm.com/manyeyes/home
Data360 - http://www.data360.co.uk/
Spatial Data:
BackOfMyHand – http://www.backofmyhand.com
Map Builder – http://www.mapbuilder.net
Maptrot – http://www.maptrot.com
Click2Map – http://www.click2map.com
Blockrocker – http://www.blockrocker.com

Institutional Repositories
UK Repository Projects:
• StORe – Source-to-Output Repositories
• GRADE – Geospatial Repository for Academic Deposit and Extraction
• R4L – Repository for the Laboratory
• SPECTRa – Submission, Preservation & Exposure of Chemistry Teaching & Research data
• CLADDIER – Citation, Location And Deposition in Discipline & Institutional Repositories
Issues for further development:
• Interoperability - Dublin Core, OAI versus domain-specific XML schemas
• Embedding - repository seen as part of the organisational workflow
• Redefining repository - as a suite of methodological and technological processes that facilitate the research lifecycle
• Web 2.0 tools for collaboration - across and within department / institution / discipline
• Clarity on data citation & persistent identifiers
• Data rights - open access v restricted access v user-defined access
Domain-Specific Repositories:
• ArXiv.org – physics, maths, computer science
• Blue Obelisk Data Repository – chemoinformatics
• PubMedCentral – biomedical and life sciences

• eScience, e-Social Science, e-Research and cyberinfrastructure
• “E-Research extends e-Science’s remit to all sciences referring to the use of distributed resources across multiple domains to do science or further research with the following key features: collaborative, multidisciplinary, use of GRID technologies and vast amounts of data” (CURL Workshop, 2005)

Examples
GRIDPP
• Large Hadron Collider
• GRID Prototype to analyze data
AstroGRID
• UK contribution to Virtual Observatory
CQeSS
• Develop and support quantitative E-Social Science
MiMeG
• Tools and techniques to analyse audio-visual qualitative data

Seamless Access to Multiple Datasets (SAMD)
MIMAS as major contributor
ESRC and DTI funded
Solving a problem of the UK academic Social Science community

DISC-UK DATASHARE PROJECT
JISC Repository and Preservation Programme
March 2007 to March 2009
DISC-UK members
• EDINA (lead)
• University of Edinburgh
• London School of Economics
• University of Oxford
• University of Southampton
Purpose
“provide exemplars for a range of approaches and policies in which to embed the deposit and stewardship of
datasets in institutional repositories”

DATASHARE Motivation
Growing presence of IRs
StORe Social Science Report
• 70% of survey respondents producing quantitative questionnaire data
• Vast majority of researchers not depositing data

Deliverables
Enhancements to partners’ IRs
Exemplars of the process of setting up an institutional data repository service
Documentation and open source code for adapting repository software for handling datasets
Technical watch on e-Research, VREs and Web 2.0 developments
Papers, presentations and online dissemination of collected knowledge

Issues
Management: storage, curation, policies
Legal: access rights, confidentiality and creating public use files
Technical: standards to describe, transport and communicate
Cultural and political: do people want to share data? Central vs. distributed. Self-archiving vs. assisted deposit.

Thank you