Supporting the local research data
environment via cross-campus
collaboration and leveraging of
national expertise
Hannah F. Norton, Rolando Garcia Milian,
Michele R. Tennant*, Cecilia Botero
Health Science Center Libraries
* and UF Genetics Institute
University of Florida
Image credit: Modified from Eric Fischer, http://www.flickr.com/photos/walkingsf/5266043943/
Background
• Growing interest in research data and how to
provide support
– Partnership with Research Computing/High
Performance Computing Center (HPCC)
– Involvement in ARL E-Science Institute
• Pilot project asked about Clinical and
Translational Science Institute (CTSI)
researchers’ information needs, including
those related to data and e-science
Pilot Project Results
Survey question: What resources outside of your department do you need to best manage
and analyze your data? (n=45)
Percentage of respondents selecting each resource:
• Training on data management: 44.4%
• Storage capacity: 53.3%
• Data/digital management system for organizing data: 51.1%
• Computing capacity for analysis: 40.0%
• Computing expertise or software: 62.2%
• Data management service to outsource some of the work to: 31.1%
• Other external expertise (e.g. statistician, informatician): 37.8%
• Other: 15.6%
Interview theme:
Learning data management
• Larger labs with many graduate students have
trouble with consistency of data organization and
documentation – best practices training would
help.
• Data management is either learned from PIs or
individually.
• In today’s rapidly changing fields, it is important
to be able to learn new methods for data
management and analysis as quickly as possible.
Interview theme:
Storage and long-term preservation
• Most participants use college- or department-level network
servers for storage. This is convenient, but can be difficult
to access from off-campus.
• Print lab notebooks are still used in many disciplines.
Despite interest in migrating to electronic documentation,
print lab notebooks were cited as
– the gold standard for documenting ethical conduct of research,
– easier to use when doing gloved research,
– and less expensive than electronic options.
• Retention of data is difficult when it must be constantly
migrated to new systems.
• Those working with biological samples indicated that retaining
the samples is more important than retaining electronic data
(samples cannot be exactly duplicated if lost).
Interview theme:
Data sharing and collaboration
• Most participants are not sharing data, other than with
their immediate collaborators. Exceptions are those
who deposit genetic data into national databases.
• Large collaborative projects can have trouble reintegrating
data from multiple investigators on multiple side-projects.
• It can be difficult to find and use existing data that
should be compared or related to the researcher’s
data.
• Research is increasingly collaborative, and it is
important for researchers to learn about resources and
potential collaborators across the institution.
Interview theme:
Overall concerns and observations
• For those working with particularly sensitive data (e.g.
from high containment labs or the Veterans Affairs
hospital), it is important to balance necessary security
measures with processes that enable researchers to
actually work with the data.
• Resource-rich labs with dedicated data people have
fewer problems.
• Institution-level policies or guidelines on data
management would be helpful.
• Even those who have few data management challenges
now are planning to work on more complex research in
the future with bigger, more varied data sets.
Learning from national experts
• A Faculty Enhancement Opportunity (mini-sabbatical)
allowed the library director and two other librarians
to visit three top-tier health science libraries to
observe strategies, programs, and services that could
be applied at UF.
• It also provided funding to bring experts on
topics of interest to UF to provide training and
lead strategic brainstorming sessions.
Learning from national experts
Each visit focused on a particular strength of that
library, but other areas were also addressed.
• Library space and renovation
• E-science and data curation support
• CTSA and bioinformatics support
Learning from national experts
• 3-day visit from:
– Joan Starr, EZID Service Manager, California Digital
Library
– Carly Strasser, Data Curation Specialist, California
Digital Library
– Sherry Lake, Data Specialist, University of Virginia
Libraries
• Presentations on data trends, open data, data
tools, and next steps for librarians
Outcome:
Presentations across campus
• Research Computing Day
• Open Access Week
Outcome:
Best Practices in Data Management
Outcome: Data Management/Curation
Task Force
Membership:
• 2 health science
librarians
• 2 science librarians
• GIS librarian
• 3 humanities/social
science librarians
• Director of Research
Computing
• Representative of Office
of Research
Projects:
• Dataverse
• Focus groups
• DMP Tool
• Survey
• Workshops
Conclusion
• Research data management is an area ripe for
library involvement and leadership.
• At our institution, local training on the best
practices in data management and
development of a library-wide task force on
data management have proven fruitful first
steps in providing concrete guidance to our
users in this area.
Acknowledgements
Thank you to collaborators at UF, including faculty
and staff from:
• Clinical and Translational Science Institute
• High Performance Computing Center
• Digital Library Center
This project has been funded in part with federal
funds from the National Library of Medicine,
National Institutes of Health, under Contract # HHSN-276-2011-00004-C.
This presentation is available for re-use under a Creative Commons Attribution license.