Faculty/AP Un-retreat – January 7, 2014
Data Stewardship and Research Data Services in the Library
Summary from Discussion Groups 1 & 2:
Challenges, Goals, and Recommended Strategies and Actions to Take
Submitted by Tim Cole based on his notes augmented with additional notes provided by Jennifer Hain Teper and Carissa Phillips
The key challenges in this domain as identified by the discussion groups are:
1. To raise the level of data and data-curation literacy within the Library and among Campus faculty, staff & students who generate data and/or rely on the Library for curation of research resources.
While the Library is adept at curating research outcomes published in traditional forms, we collectively are not yet adept at curating and providing access to research data and outcomes in other forms. We will need to become adept at this, and we can anticipate that we will need to do so across the entire Library, hence the need to raise the level of data curation literacy Library-wide. This challenge encompasses:
• Raising awareness of ways that copyright & intellectual property rights associated with data bear on data access, conditions of use and long-term retention and curation;
• Finding effective & efficient ways to market the Library's data curation services and reach out to researchers to explain what we do and why it may be important to them.
In articulating this challenge it was observed that the scope of the new Research Data Services initiative needs to be better defined. There was a consensus, especially in the first discussion, that the function of RDS and what constitutes data curation literacy remain a bit elusive. It is hard to raise awareness about an amorphous need or to market incompletely described services. That is part of what makes this a challenge.
2. To identify & prioritize near-term & long-term Research Data Services and to clearly define the Library's role & scope in regard to providing these services as part of our day-to-day operations and through our collaboration in the RDS initiative.
In particular we are challenged to identify groups & institutional partners, especially those beyond the Campus, with whom we will collaborate in providing services.
3. To make data more readily discoverable, accessible & useful both in its native domain (e.g., making civil engineering data discoverable to civil engineers) and in other domains (e.g., making civil engineering data discoverable and useful to environmental scientists). While libraries can build on long traditions of making bibliographic resources discoverable, we have much less knowledge about or experience with making collections of data discoverable. This challenge encompasses:
• Defining and developing new approaches to and expertise with metadata for data, e.g., figuring out how best to describe data as different from how we describe books;
• Sustaining the data we curate in useful and manageable formats.
4. To learn how to comply with federal agency data curation mandates & requirements levied by others involved in scholarly communication -- e.g., Campus, publishers, etc. (It was noted that libraries need a voice in the definition of these mandates & requirements.)
For each Challenge, ensuing discussion generated a goal (a shared goal for Challenges #3 and #4), strategies, and a few specific recommendations for near-term or long-term actions that the Library should take.
Resource requirements suggested to help meet each challenge are extremely preliminary, likely incomplete, and assume allocations already planned for RDS.
Goal 1 (addressing Challenge #1) -- Make subject specialists and bibliographers in the Library aware of Research Data Services as these come online & equip subject specialists and bibliographers to use these services and to help their users use these services.
In discussing this goal there was not consensus as to the appropriate granularity. Should we expect that every subject specialist will become familiar with and adept at using a full range of data services? Or would it be better to achieve penetration at the level of broad disciplines or domains, with select subject specialists in these domains taking the lead on LibGuides, DMP tool development, fluency with other centrally supported services, etc.? Or perhaps a mixed approach, with all librarians coming to understand what an ORCID is, but only select librarians being expert in what a DMP is and how to help researchers customize it for a domain? Or perhaps domain data service expertise is best achieved in collaboration with Colleges -- e.g., with the College hiring the specialist, who then works in the appropriate departmental library.
The second discussion session also noted the importance of collaboration among subject specialists and between subject specialists and central resources, e.g., the RDS Office. We need to be clear as to what the expectations for subject specialists are versus what will be taken on by RDS. Who identifies subject-specific repositories and helps researchers deposit data in them? How do RDS and subject specialists share the job of customizing DMPs? How do RDS and subject specialists collaborate on data service LibGuides?
Some suggested that the goal should be that subject specialists have a baseline comfort with explaining basic data literacy to faculty/users. We have a very mixed level of comfort right now. Subject specialists need more than just an understanding of the services; they need a basic understanding of the role of research data in THEIR field.
But again there was not consensus as to whether the bulk of engagement (and so the comfort level) should be at the level of each subject specialist, at a higher level (e.g., divisional or equivalent), or at a central level (e.g., the RDS Office).
In terms of metrics it was noted that we have a baseline survey already that gives a sense of current level of competencies with and knowledge about data services.
• We need to update this data at regular intervals to measure progress.
• We also need to develop a list of expectations / core competencies, probably tailored by domain/discipline, & possibly assuming multiple levels of expected competency.
Strategies to achieve Goal 1:
• Create & disseminate brochures, flyers etc. (some already from Scholarly Commons);
• Offer in-person & Webinar training; leverage Webinars, etc. from outside the Library;
• Look for opportunities to hire subject specialists who have/can add data service skills;
• Develop ways to measure librarian engagement with research data generators & users.
In discussing these strategies we did not reach consensus on who in the Library is best positioned to implement these strategies, or even on exactly what needs to be measured. Some suggested that we think of the RDS Office as the center of a loose federation. Others noted that some colleges have expressed interest in hiring data service specialists, potentially homed in the appropriate departmental library; this would suggest a collaborative, partially distributed model of RDS specialists -- i.e., both centrally located and embedded in select libraries.
Ultimately there was consensus that we need a model of collaboration between at least some subject / domain specialists and the new RDS Office that can improve the knowledge of at least some subject specialists in regard to data services. As with any transition, in part this will be achieved by building in expectations for data services skills as we hire. Exactly what level of expectation for subject specialists with regard to RDS remains to be determined.
The tension in the discussions seemed to center around the question of whether RDS is so different from other information and curation services we currently provide as to preclude having the required skill set widely dispersed in the Library (à la collection development). We must recognize the unrealistic expectations of holistic librarianship on the one hand; on the other hand, we want to avoid too much of a library-within-a-library mentality -- i.e., a data library separate and distinct from our library of books, serials and special collections. Undoubtedly an approach somewhere in the middle is what's needed.
Resources for Goal 1:
• Already allocated -- Director of Research Data Services & the 2 APs;
• Another faculty member or AP to ramp up the data literacy instruction program (this person will integrate the efforts of others into the program, but we need a full-time dedicated person) -- this might be a current faculty member or AP reassigned to this task for a while (à la NSM).
• Instructional space (with computers) for data literacy -- such space exists in the Library but is heavily obligated.
• Over time we can anticipate that a percentage of the time of all subject specialists and most other faculty / APs in the Library will be spent on data services or related ancillary tasks. This necessarily means we do less of something else. This trade-off should reflect intentionality, not happenstance.
Goal 2 (addressing Challenge #2): Generate a scope document (building on the eResearch Task Force report, oriented more toward a general audience and more concrete). This document should say what we do, who we do it for (near and long term), and what we don't do, and it should be treated as a living document (i.e., subject to ongoing revision and improvement).
Strategies to achieve Goal 2:
• Assign the task of creating this scoping document to the Library eResearch Implementation Committee & the Director of RDS in some combination, and/or define an additional entity to be involved in generating this document;
• Continue to systematically review RDS activity at other institutions;
• Engage Library faculty and AP as widely as possible to help learn what's going on elsewhere, providing a means by which they can report back from meetings, etc.;
• Define roles for the existing Library players in this & decide if the structure needs amendment to make it more efficient, complete, and compatible with the new RDS Office;
• Identify a long-term faculty advisory group (if not eResearch Implementation, may coincide with the Scholarly Commons advisory group being contemplated);
• Define the balance between what we can do / need to do virtually vs. physically;
• Define how best to synchronize between RDS & Library IT and facilities infrastructure.
In the early discussion session, Goal 2 was seen as needed in order that Library faculty and staff could better understand the nature of RDS and the Library's role in RDS. It was also seen as a pre-requisite to being able to understand and implement collaborations, both internal to the Campus and eventually externally.
The later discussion noted that RDS is still nascent here -- suggesting some risk of being too specific (too concrete) too soon in discussing this goal and strategies to achieve it. Right now RDS is benefiting from entrepreneurial efforts. And while the Library's eResearch Implementation Committee has been needed and helpful, it's not clear if it should continue once the Director of RDS is in place. The Director of RDS should be advised by more than just Library faculty, and so if an advisory body is needed, better that it be a broader advisory body that includes faculty from around Campus, not just internal to the Library or even just the OVCR's Office. It may also be difficult to fully scope the role of RDS until more of the staff (i.e., beyond the Director) are in place.
But eventually, if we want to facilitate the participation of and contributions from Library faculty and staff, there needs to be a clearer sense of what the RDS Office does / will do, and how the Library (and individual librarians) contribute to the goals of RDS Campus-wide.
Resources for Goal 2:
• We do not see this Goal as needing special additional resources beyond what is already in place and/or allocated. In fact, achieving this goal should save resources (since everyone will have a better understanding of the work needed and so will be more efficient).
Goal 3: (Challenges #3 and #4) -- Characterize & report on use of our data repository over time & assess how well we are meeting goals / Demonstrate value of RDS.
There was a consensus that Library faculty will need to know how well we are meeting our goals in regard to the provision of research data services. We also need to know and be able to describe the value of research data services as implemented. These metrics will also be of interest to faculty and researchers outside the Library. This feedback is a first step in understanding how well we are supporting discoverability of research data resources and how well we are doing in meeting funder requirements to support data retention and reuse.
Strategies to achieve Goal 3:
• Measure transaction logs, downloads, and citations in Twitter and in more traditional forms of research publication;
• Iteratively define, based on transaction log data, test use cases and best practices for discoverability and for facilitating re-use (e.g., best-practices readme documents, a policy for supported formats, etc.);
• Define a formal framework for assessing how well we are helping researchers comply with requirements. This should include a way to capture feedback from faculty submitting grants regarding reviewer comments on DMP, etc.
• Compare how well we are helping faculty navigate our own RDS services versus their use of off campus services and repositories.
• Determine whether the Library can help the OVCR measure compliance -- the OVCR is responsible, but it is not clear what they are doing right now. How is the Campus enforcing compliance and measuring it? (We need to be clear about what is/should be done at a Campus level versus what the Library is doing.)
The later discussion also asked, 'What will collaboration between the Office of the CIO, NCSA, and the Library (and to some extent the OVCR) look like -- what would the partnership look like?'
Resources for Goal 3:
• A visiting (2-year) librarian to develop and implement metrics;
• Modest ongoing support from the Assessment Coordinator;
• Over time we can anticipate that a percentage of the time of all subject specialists and most other faculty / APs in the Library will be spent on data services or related ancillary tasks. This necessarily means we do less of something else. This trade-off should reflect intentionality, not happenstance.
_________________________________________________________________________________
15 March 2014
Disclaimer: While every effort was made to reflect the points and advice expressed by participants in the un-retreat discussion groups and captured by the 3 note-takers, the distillation presented here is necessarily the responsibility of the author. Any misrepresentation of points raised or failure to capture ideas expressed is his fault alone.