Institutional Research Data Management: ARL libraries SPEC Survey Results

David Fearon
Data Management Services, Johns Hopkins University Sheridan Libraries

Andrew Sallans
Center for Open Science (formerly at the University of Virginia Library)

CNI Fall 2013 Membership Meeting
Dec 9, 2013, Washington, DC

ARL SPEC Survey: Research Data Management Services
ARL SPEC Kit 334 (July 2013)
• Johns Hopkins Sheridan Libraries, Data Management Services
• University of Virginia Library, Data Management Consultant Group
Available for download at ARL.org

Survey origins
• Built upon the ARL E-Science Working Group survey:
  – "E-Science and Data Support Services: A Study of ARL Member Institutions" (Soehner, Steeves, & Ward, 2010)

Research Data Management Services: expanding research lifecycle support
[Chart: research support services offered (locating data sources, GIS, statistical software, data visualization, data analysis, programming), split into "Service offered 1–3 yrs" and "Service offered 3+ yrs"]
• Research proposal stage services:
  – data management plans
• Dissemination & preservation stage services:
  – data repositories and archiving

Survey themes & interests
• Research data management
  – JHU: archiving services
• Resource requirements for sustaining services
  – UVA: staffing and training
  – Technical & administrative needs & challenges

Key finding: RDM Service Offering
73 academic libraries responded
• (59% of 125 ARL members)
• Offer research support services (broadly defined): 73 (100%)
• Offer data management services: 54
• Planning to offer DMS: 17 (23%)
[Chart axis labels: 68%, 84%, 100%]

Start of RDM Services
[Chart: number of responses (N) by year RDM services were initiated (1996 - 2013), with bars for <2006 and each year from 2006 through 2012; the NSF DMP requirement (Jan 2011) is marked on the chart]

Key Finding: Motivators
Question: What are some key variables in the institutional environment driving these new services?
Common reasons:
• Responding to grant funder requirements
• Library-led initiatives toward supporting research
Less common reasons:
• Administration/researchers calling for data management support by library
• Responding to formal institutional data policies

Key finding: RDM Service Offering
Number of libraries offering each service:
Data management planning
• Online DMP resources: 47
• DMP consulting: 48
• DMP training: 33
Data management support
• Other data management training: 23
• Research metadata support: 42
• Data citation support: 38
Data sharing & archiving
• Data sharing & access support: 22
• Data archiving by library: 40

Data management planning
• Online DMP resources: 87% (N = 47)
  – Types of online resources charted: DMP Tool, links to resources, customized guidance (chart values as extracted: 23, 12, 24, 29; 75%, N = 41)
• DMP consulting: 89% (N = 48)
• DMP training: 61% (N = 33)
• Libraries tracking DMP support: N = 25

Key Finding: Modest DMP service demand
[Chart: total DMP support contacts in the last 2 years, for the 25 libraries tracking their consulting. Bins for total DMP sessions (range 0 - 96): 0-5, 6-10, 11-20, 21-40, 41-60, 61-100. Bar values as extracted: 9, 5, 4, 3, 3, 1]

Data Archiving Services
• Funders are promoting data sharing through repositories
• For libraries, this may require more staffing/resources beyond reference services
• Archiving: online access to data, facilitated by preservation

Data Archiving Services
• Assistance locating data repositories: 96%
• Direct assistance w/ depositing data: 48%
• Library hosts a research data archive: 74%

Data Archiving Services
• Institutional Repository (IR) w/datasets: 75% (30)
• Data-specific repository: 13% (5)
• Digital Repositories: 13% (5)

Data Archiving Infrastructure
Primary platform choice (top 5), shown in two columns: Inst. Repository w/ Data and Data-specific Repository
Platforms named: DSpace, Fedora, Dataverse, Chronopolis, BePress Digital Commons, HubZero (customized), Hydra, Drupal, DataConservancy, Custom repository

Funding Data Archiving
[Chart: funding sources (internal budgets, grants, charge researcher); values as extracted: 24%, 14%, 84%]

Archive Usage
No. of researchers w/ deposits
          IR's w/data   Data Archives
Min       1             2
Max       400           100
Median    10            11

Total size of archived deposits
          IR's w/data   Data Archives
Min       9 GB          3 GB
Max       19 TB         2 TB
Median    10.5 GB       516 GB

Deposit Sources & Support
[Chart: sources of deposited data (publications, dissertations/theses, research projects, prior projects, other) for IR's w/data and Data Archives; values as extracted: 30, 30, 2, 29, 5, 22, 5, 3, 5, 1]
[Chart: method of depositing data (library deposits for researcher, researchers self-deposit) for IR's w/data and Data Archives; values as extracted: 30, 23, 5, 3]

Staffing of RDM Services
• Organizational models of RDMS
• Key skills and training for positions

Staffing: Organization
Structure for RDM Services
• Staff from 2 or more library departments: 51%
• Staff from library & other units in inst.: 17%
• Single library position: 15%
• Single library department: 11%
• Other structure: 6%

Number & Type of Positions
[Chart: Institutes' Number of Positions Providing RDMS. Number of institutes by total positions within institute (1 through 6); bar values as extracted: 23, 9, 8, 4, 2, 7]
• Single positions & groups of 6 are common
• Most are permanent positions (90%), but RDM roles are less than 50% for the majority of positions.
[Chart: Position's % of Time Spent on RDMS. % of positions by % of time (0-25, 26-50, 51-75, 76-100); values as extracted: 61.3, 20.8, 14.6, 3.3]

Staffing Roles & Job Titles
Frequency of Word/Phrases in Titles (n=231)
• Subject Librarian or Liaison: 50
• Digital: 38
• Data Librarian: 18
• Metadata: 17
• Data Services: 13
• GIS or Geospatial: 12
• Curation: 11
• Research Data: 11
• Repository: 10
• Data Management: 9
• Systems: 9

Key findings: Skills and Training
Ranked as Important Skills
1. Subject domain expertise: 75%
2. Digital/data curation expertise: 60%
3. IT experience: 59%
Background for current positions (n=228)
• MLS/MLIS: 75%
• Data curation emphasis: 6%
• Masters in another domain specialty: 27%
• PhD in another domain specialty: 13%

Key Finding: Assessing service effectiveness
• Most self-assessment of RDM service effectiveness is informal, ad-hoc
  – Survey inconclusive on which services and models are most effective, top outreach strategies, etc.
• Is faculty/researcher demand sustaining these programs once started? (too early to say)
• Challenges for implementing and sustaining services

Key Finding: Challenges
Theme: no. of responses (% w/ theme)
• Collaboration campus-wide: 18 (37%)
• Funding: 17 (35%)
• Faculty Engagement: 15 (31%)
• Technology Infrastructure: 13 (27%)
• Limited Staffing: 12 (24%)
• Marketing Services: 12 (24%)
• Staff Training: 11 (22%)
• Scoping services: 9 (18%)
• Institutional commitment: 7 (14%)
• Faculty education on need: 5 (10%)
• Evaluating demand: 4 (8%)
• Other: 3 (6%)
• Scaling service expansion: 3 (6%)
• Funding Agency ambiguity: 2 (4%)

Limitations: Distribution
• Distribution through ARL SPEC Kit network may not have reached all data services staff
• Distribution method may have missed representation of non-library services

Limitations: Estimations
• Poor estimation of actual time invested in RDM services
• Poor estimation of actual volume of data being archived or planned

Limitations: Terminology
• Some terms do not yet seem to have precise common meaning
• Variation in interpretation may mean some of the data needs further exploration

Limitations: Broader Analysis
• Much data, little time
• We especially hoped to merge our data with other available organizational data for broader comparison
*** Future research project opportunity! ***

Lesson 1: Collaboration Seems Key
• Libraries need to collaborate across the institution to support RDM
• Developing these collaborations is seen as one of the biggest challenges

Lesson 2: Real Costs Exist
• Necessary skills may require hiring new staff with different skills or retraining
• New skills may cost more
• Archiving infrastructure, storage, and curation will incur real cost

Lesson 3: Build More Engagement
• Poor engagement may lead to a lack of awareness, low perceived value, and resistance to sharing
• Trickle-down effect from empty mandates, i.e., DMP requirements that aren’t reviewed seriously

Lesson 4: Grow Services
• Despite the challenges, many respondents see RDM services as an appropriate service for libraries
• What comes will involve a balance of institutional and funder policy, technical skills of staff, and financial capabilities
• Planned services w/in 2yrs:
  – Online DMP resources: 63%
  – Research data archiving: 54%
  – RDM topic training: 46%
• Plans for staffing:
  – Adding 1 or more positions: 44%
  – Adding RDM role to existing staff: 44%
  – No staff changes planned: 34%
• Plans for RDM funding:
  – Expecting a funding increase: 66%
  – Staying the same: 33%
  – Decrease: 2%
• Source:
  – Not yet determined: 52%
  – Regular library budget: 36%
  – External grant funding: 26%
  – Special project budget: 16%

Lesson 5: There Is No Single Path
• We interpret the data to suggest merit in many models in different settings
• Cross-institutional collaboration and offering of services seem to be one of the viable models

Credits
Our full team:
• David Fearon, Johns Hopkins University
• Betsy Gunia, Johns Hopkins University
• Sherry Lake, University of Virginia
• Barbara Pralle, Johns Hopkins University
• Andrew Sallans, Center for Open Science
With thanks to Lee Ann George, ARL’s SPEC Kit editor
And ARL’s E-Science Working Group
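
Note on reading the percentages: the slides do not always state the denominator behind a percentage. The minimal Python sketch below re-derives a few of the reported figures from the counts on the slides. The denominators are assumptions, not something the slides state: 73 (all respondents) for the planning figure, 54 (libraries offering data management services) for the DMP services, and 40 (libraries archiving data) for the repository split. The names `pct` and `CHECKS` are illustrative only.

```python
# Re-derive reported percentages from the counts shown on the slides.
# The denominators (73, 54, 40) are assumptions inferred from the slides,
# not figures the slides explicitly attach to each percentage.

def pct(count, total):
    """Return count/total as a percentage rounded to the nearest whole number."""
    return round(100 * count / total)

# (label, count, assumed denominator, percentage reported on the slide)
CHECKS = [
    ("Planning to offer DMS", 17, 73, 23),
    ("Online DMP resources", 47, 54, 87),
    ("DMP consulting", 48, 54, 89),
    ("DMP training", 33, 54, 61),
    ("IR w/datasets", 30, 40, 75),
]

# Collect any rows where the derived value disagrees with the slide.
mismatches = [
    (label, pct(count, total), reported)
    for label, count, total, reported in CHECKS
    if pct(count, total) != reported
]

for label, count, total, reported in CHECKS:
    print(f"{label}: {count}/{total} = {pct(count, total)}% (slide: {reported}%)")
```

Under these assumed denominators the derived values agree with the slide figures for the rows listed; figures not listed here were not checked.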