Using assessment of NSF data management plans to enable evidence-based evolution of research data services
Amanda Whitmire, Jake Carlson, Patricia Hswe,
Susan Wells Parham, Lizzy Rolando & Brian Westra
@DMPResearch
Acknowledgements
Jake Carlson ─ University of Michigan Library
Patricia M. Hswe ─ Pennsylvania State University Libraries
Susan Wells Parham ─ Georgia Institute of Technology Library
Lizzy Rolando ─ Georgia Institute of Technology Library
Brian Westra ─ University of Oregon Libraries
This project was made possible in part by the
Institute of Museum and Library Services
grant number LG-07-13-0328.
23 April 2015
levels of data services
Levels: high level; mid-level; the basics
Services: infrastructure; metadata support; DMP review; data curation; facilitate deposit in DRs; consults; website; dedicated “research services”; workshops
From: Reznik-Zellen, Rebecca C.; Adamick, Jessica; and McGinty, Stephen. (2012). "Tiers of Research
Data Support Services." Journal of eScience Librarianship 1(1): Article 5.
http://dx.doi.org/10.7191/jeslib.2012.1002
Informed data services development
surveys
data curation profiles
data management plans (DMPs)
DART Premise
DMP
researcher
Research Data Management: knowledge, capabilities, practices, needs
Research Data Services
“Of the 181 NSF DMPs that were analyzed, 39 (22%) identified Georgia Tech’s
institutional repository, SMARTech.”
“We have a clear road ahead of us: we will target specific schools for
outreach; develop consistent language about repository services for research
data; and focus on the widespread dissemination of information about our new
digital preservation strategy.”
We need a tool
Solution: An analytic rubric
Performance Criteria (rows): Thing 1; Thing 2; Thing 3
Performance Levels (columns): High; Medium; Low
Literature review on
creating & using
analytic rubrics
NSF-tangent & 3rd-party
DMP guidance
NSF DMP guidance
NSF Directorate or Division
Asterisks on the slide mark units with division-specific guidance.

BIO   Biological Sciences
  DBI   Biological Infrastructure
  DEB   Environmental Biology
  EF    Emerging Frontiers Office
  IOS   Integrative Organismal Systems
  MCB   Molecular & Cellular Biosciences
CISE  Computer & Information Science & Engineering
  ACI   Advanced Cyberinfrastructure
  CCF   Computing & Communication Foundations
  CNS   Computer & Network Systems
  IIS   Information & Intelligent Systems
EHR   Education & Human Resources
  DGE   Division of Graduate Education
  DRL   Research on Learning in Formal & Informal Settings
  DUE   Undergraduate Education
  HRD   Human Resources Development
ENG   Engineering
  CBET  Chemical, Bioengineering, Environmental, & Transport Systems
  CMMI  Civil, Mechanical & Manufacturing Innovation
  ECCS  Electrical, Communications & Cyber Systems
  EEC   Engineering Education & Centers
  EFRI  Emerging Frontiers in Research & Innovation
  IIP   Industrial Innovation & Partnerships
GEO   Geosciences
  AGS   Atmospheric & Geospace Sciences
  EAR   Earth Sciences
  OCE   Ocean Sciences
  PLR   Polar Programs
MPS   Mathematical & Physical Sciences
  AST   Astronomical Sciences
  CHE   Chemistry
  DMR   Materials Research
  DMS   Mathematical Sciences
  PHY   Physics
SBE   Social, Behavioral & Economic Sciences
  BCS   Behavioral & Cognitive Sciences
  SES   Social & Economic Sciences
Consolidated guidance
Source: Guidance text
NSF guidelines: The standards to be used for data and metadata format and content (where existing standards are absent or deemed inadequate, this should be documented along with any proposed solutions or remedies)
BIO: Describe the data that will be collected, and the data and metadata formats and standards used.
CSE: The DMP should cover the following, as appropriate for the project: ...other types of information that would be maintained and shared regarding data, e.g. the means by which it was generated, detailed analytical and procedural information required to reproduce experimental results, and other metadata
ENG: Data formats and dissemination. The DMP should describe the specific data formats, media, and dissemination approaches that will be used to make data available to others, including any metadata
GEO AGS: Data Format: Describe the format in which the data or products are stored (e.g. hardcopy logs and/or instrument outputs, ASCII, XML files, HDF5, CDF, etc).
Rubric
Project team: testing & revisions
Advisory Board: feedback & iteration
General Assessment Criteria, with directorate- or division-specific assessment criteria
Performance Levels: Complete / detailed; Addressed issue, but incomplete; Did not address issue

Performance Criteria: Describes what types of data will be captured, created or collected
  Complete / detailed: Clearly defines data type(s). E.g. text, spreadsheets, images, 3D models, software, audio files, video files, reports, surveys, patient records, samples, final or intermediate numerical results from theoretical calculations, etc. Also defines data as: observational, experimental, simulation, model output or assimilation
  Addressed issue, but incomplete: Some details about data types are included, but DMP is missing details or wouldn’t be well understood by someone outside of the project
  Did not address issue: No details included, fails to adequately describe data types.
  Directorates: All

Performance Criteria: Describes how data will be collected, captured, or created (whether new observations, results from models, reuse of other data, etc.)
  Complete / detailed: Clearly defines how data will be captured or created, including methods, instruments, software, or infrastructure where relevant.
  Addressed issue, but incomplete: Missing some details regarding how some of the data will be produced, makes assumptions about reviewer knowledge of methods or practices.
  Did not address issue: Does not clearly address how data will be captured or created.
  Directorates: GEO AGS, GEO EAR SGP, MPS AST

Performance Criteria: Identifies how much data (volume) will be produced
  Complete / detailed: Amount of expected data (MB, GB, TB, etc.) is clearly specified.
  Addressed issue, but incomplete: Amount of expected data (GB, TB, etc.) is vaguely specified.
  Did not address issue: Amount of expected data (GB, TB, etc.) is NOT specified.
  Directorates: GEO EAR SGP, GEO AGS
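The rubric excerpt above lends itself to a simple data structure. What follows is only a minimal sketch, not the project's actual tooling: the names `Criterion`, `RUBRIC`, `applicable`, and `tally` are hypothetical, and it assumes the "Directorates" column lists the units whose guidance a criterion applies to, with "All" meaning every DMP.

```python
from collections import Counter
from dataclasses import dataclass

# Performance levels as worded on the rubric slide.
LEVELS = (
    "Complete / detailed",
    "Addressed issue, but incomplete",
    "Did not address issue",
)


@dataclass(frozen=True)
class Criterion:
    """One rubric row (hypothetical structure, for illustration only)."""
    text: str
    directorates: frozenset  # empty set stands for "All" (an assumption)

    def applies_to(self, unit: str) -> bool:
        return not self.directorates or unit in self.directorates


# The three rows shown in the excerpt.
RUBRIC = [
    Criterion(
        "Describes what types of data will be captured, created or collected",
        frozenset(),
    ),
    Criterion(
        "Describes how data will be collected, captured, or created",
        frozenset({"GEO AGS", "GEO EAR SGP", "MPS AST"}),
    ),
    Criterion(
        "Identifies how much data (volume) will be produced",
        frozenset({"GEO EAR SGP", "GEO AGS"}),
    ),
]


def applicable(unit: str) -> list:
    """Return the rubric rows to score for a DMP submitted to `unit`."""
    return [c for c in RUBRIC if c.applies_to(unit)]


def tally(reviews: list) -> dict:
    """Count performance levels per criterion across reviewed DMPs.

    Each review is a dict mapping criterion text to one of LEVELS.
    """
    counts = {c.text: Counter() for c in RUBRIC}
    for review in reviews:
        for text, level in review.items():
            if level not in LEVELS:
                raise ValueError(f"unknown performance level: {level!r}")
            counts[text][level] += 1
    return counts


if __name__ == "__main__":
    # Made-up reviews, purely to show the shape of the data.
    demo = [
        {RUBRIC[0].text: LEVELS[0]},
        {RUBRIC[0].text: LEVELS[1], RUBRIC[2].text: LEVELS[2]},
    ]
    for text, counter in tally(demo).items():
        print(text, dict(counter))
```

Keeping each row's applicability next to its text means a reviewer is only asked the questions that the relevant directorate's guidance actually raises, and the per-criterion counts can be compared across institutions.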
“Mini-reviews 1 & 2”
[Chart: counts by performance level (Complete / detailed; Addressed issue, but incomplete; Did not address the issue) for each performance criterion]
Performance criteria shown:
Describes what types of data will be captured, created or collected
Identifies metadata standards or formats that will be used for the proposed project
Describes data formats created or used during project
Provides details on when the data will be made publicly available
Describes how the data will be made publicly available
Describes security measures that will be in place to protect the data from unauthorized access
Describes the policies or provisions in place governing the use and reuse of the data
Describes the policies or provisions for redistribution of the data
Describes policies or provisions for building off of the data, such as through the creation of derivatives
Indicates whether or not the data will be archived
Describes plans for archiving and preserving digital data*
Plan discusses the types or formats of data the investigator expects to retain in their possession*
data sharing methods
Not planning to share data: 0
Conference / proceedings: 3
ETD: 1
On request: 9
Personal website: 8
Book: 1
Other data repository or method: 7
National data center: 3
Journal / supplement: 10
Institutional repository: 4
Did not specify: 0
To sum up…
Developing a rubric to empower academic
librarians in providing research data support
http://bit.ly/dmpresearch
@DMPResearch