bwFDM Communities
A Research Data Management Initiative in the State of Baden-Wuerttemberg
Karlheinz Pappenberger
London, 44th LIBER Annual Conference, 25/06/2015
University of Konstanz, Germany

Overview
1) The e-science initiative in Baden-Wuerttemberg
2) bwFDM communities: a research data management project
3) bwFDM communities: results
4) Concluding remarks

1) The e-science initiative in Baden-Wuerttemberg

9 universities in Baden-Wuerttemberg in numbers

University      Academic staff    Students
(all)                   30,300     176,000
Heidelberg*              5,500      31,500
Tübingen*                4,450      27,200
Freiburg                 6,800      24,700
Stuttgart                3,350      24,600
Karlsruhe                6,000      24,500
Mannheim                 1,000      12,300
Konstanz*                1,250      11,800
Hohenheim                  950       9,900
Ulm                      1,050       9,500

* = member of the group of 11 German universities with nationally funded institutional strategies within the German top-level research initiative

The e-science initiative in Baden-Wuerttemberg (I)
The concept paper
− E-Science – Science in a New Environment. Further development of the scientific infrastructure in Baden-Wuerttemberg.
− Published 29th July 2014, 120 pages (only available in German)
− Edited by the Ministry of Science, Research and the Arts
− 3.7 million euro for working out action plans
5 defined action areas
− Licensing electronic information media
− Digitisation
− Open access
− Research data management
− Virtual research environments

The e-science initiative in Baden-Wuerttemberg (II)
The concept paper – chapter 4: research data management
− Written by a working group of 10 members, led by the ministry and with representatives from
  − Computing centres
  − Data centres
  − Libraries
  − Researcher communities
− Recommendations
  − Further development of technical “bw”-infrastructure
  − Establishing research data management within teaching and curricula
  − Countrywide research data repository
  − Data life cycle labs
  − Coordination of policies, legal issues and standardisation questions

2) bwFDM communities: a federal research data management project
bwFDM = Baden-Wuerttemberg Forschungsdatenmanagement (Baden-Wuerttemberg research data management)
as part of several existing and planned IT-driven “bw”-tools:
− bwHPC: high-performance computing
− bwSync&Share: joint sharing and synchronisation of data
− bwFileStorage: additional countrywide storage capacity
− bwIDM: identity management
− bwMS: licensing of Microsoft software
− bwCloud*: federal virtualisation of server and IT-services
− bwDataArchiv*: long term preservation of data
− bwDataDiss*: research data of dissertations
− ….
* = project status

bwFDM communities: the quest for effective research support
Background: the data landscape
− Data life cycle
− Complex and large scale data
− Relevant for a growing number of science disciplines
− Intelligent data management and analysis creates more high-quality research and publications

bwFDM communities – the project
Funding
− Funder: Ministry of Science, Research and the Arts Baden-Wuerttemberg
− Funding period: 2014/01 – 2015/06
− Budget: 1 million euro
− Staff: 9 full time key accounters (one at each university), 1 project coordinator
− Project lead: Steinbuch Computing Centre, Karlsruhe Institute of Technology
− Main project partners: computing centres at the 9 universities
− Associated: libraries

“What infrastructure and services are needed to make the region a global leading area in research and development?”
Improving scientific research in Baden-Wuerttemberg in context of the e-science strategy
Main question & project goal of bwFDM: how to do this? Executing a comprehensive survey
− Detailed recommendations for concrete steps
− Definition of future projects and tasks

bwfdm.scc.kit.edu/english

bwFDM: the research communities and the survey
The observed research data landscape in Baden-Wuerttemberg
− 3000 different research groups
  − Applied sciences
  − Life sciences incl. medicine
  − Social sciences
  − Humanities
− 700 interviews with researchers (approx.
  1 h per research group)
  “How are you working?” “What are your needs?”
− 2550 user stories extracted from the interviews

Survey: 30 open & closed questions / example:

bwFDM: milestones of the survey
− List of research groups and contact persons (from Feb 2014 onwards)
− Guideline for interviews
− Interviewing researchers and transforming the results into a machine readable format (Feb 2014 to Nov 2014)
− Extracting “user stories” / statements and collecting them in a database (Nov 2014 to Feb 2015)
− Compiling thematic areas and building working groups on them (from Feb 2015 onwards)
− Grouping user stories in story maps (from Feb 2015 onwards)
− Report / presentation of results and recommendations to the ministry (17th July 2015, summary in English sharing information)

3) bwFDM communities: results

a) General requirements and policy framework
(1) Legal issues: intellectual property rights, copyright and data protection
Advice and expertise in IPR and data protection / publishing data
− Information centre / contact person
Fewer legal restrictions in IPR and data protection
− When using data from others
Stronger property rights
− For own collected and processed data
IT infrastructure that fulfills data protection requirements
− When sharing data (within a project group)
− When archiving data

a) General requirements and policy framework
(2) Information services
56 % of the interviewed groups don't feel well-informed about RDM
Advice and expertise in RDM
− Newsletter (40 %)
− Information platform (40 %)
− Information centre / contact person (35 %)
− Training / tutorials (30 %)
− RDM guideline / RDM policy (8 %)
− RDM teaching courses (3 %)
− General information
  requests in RDM (20 %)

a) General requirements and policy framework
(3) Scientific culture on data
50 % are satisfied with the availability of data, 50 % are not
55 % have data that might be of interest to others but don't share
Incentives
Status quo: limited exchange of data in scientific communities
− Time
− Personal risk / career
To be solved within the scientific community and the funding organizations

b) Technical framework
(1) Standards and formats
− Software
− Types of files, exchange formats

c) Data collection and data sharing
(1) Access to commercial and governmental data / data of NGOs
(2) Digitization
− Articulated mainly in humanities
− Using digitized material / active digitization of material
(3) Scientific cooperation:
− Management and exchange of data unsatisfactory at the moment:
  50 % email / USB-stick
  20 % dropbox
  18 % server
  12 % other (e.g.
  bwSync&Share)
− Virtual research environments

d) IT infrastructure / IT support
Storage
− Very often named
− Efficient access (speed, simplicity)
− Archiving facilities: 10 years +
Computing power / high performance computing needs
Hardware
− Special requirements
− Easier and more flexible purchasing
− Money
Software / software tools
− Access to specialist software
IT support
− Support for using IT infrastructure
− Data processing support
− [more IT staff on the research group level]

e) Preservation
(1) Documentation of projects and data
Project documentation / documentation of data
− RDM plan and guidelines
− RDM information centre
− Accompanying consulting by an RDM expert
− Support for data curation
− Research information system for documentation
Metadata
− Metadata standards
− Professional staff for data enrichment with metadata / automation
Stronger property rights
− For own collected and processed data
IT infrastructure that fulfills data protection requirements
− When sharing data (within a project group)
− When archiving data

e) Preservation
(2) Data repositories
Articulated demand
− Central / structured / curated
Both disciplinary and interdisciplinary
Definable access rules
Visualization of data
Finding data / using data of others
− General search engine for data
− Access to high-quality data
(3) Archiving

Results f) – i)
f) Licensing (campus)
− Software
g) Funding / financial issues
− More money at the research group level
h) Open Science / Open Data / Open Access
− Open source software
− Data curation
− Trust
i) Reservations about RDM
− Efficiency, special needs, bureaucracy, time pressure

4) Concluding remarks
Political dimension
− e-science-initiative… recognized importance of RDM in Baden-Wuerttemberg; bwFDM survey may form the nucleus for a broad RDM infrastructure deployment in BW
− But keep in mind: state BW strategy versus international cross-linked research
Researcher dimension
− bwFDM bares the existing gap between research and infrastructure
− bwFDM offers a huge and unique dataset of various researcher interests
− Certainly not only representative of researchers in Baden-Wuerttemberg
Infrastructure dimension
− IT driven project, but: fraction of answers covers less technical needs than expected and much more heterogeneous than expected
− Several players: computing centres, data centres, libraries, …
− A lot of distributed expertise in infrastructure
− Important: cooperation
  − Within an institution and also on a state/national/international level
  − Not all issues can be solved on a local infrastructural level

Thank you very much!
Karlheinz Pappenberger
Subject Librarian for economics and statistics
Specialist for research data management
University of Konstanz - KIM
karlheinz.pappenberger@uni-konstanz.de
University of Konstanz, Germany