IASSIST: Wednesday, May 27, 2009
3:45pm - 5:15pm
Welcome to C4: Data Sharing Across the Disciplines
Terrence Bennett, The College of New Jersey
Joel Herndon, Duke University
Shawn Nicholson, Michigan State University
Robert O’Reilly, Emory University
Scholarly Primitive
Clickstream
Acuity
Decisiveness
Seizes
Scavenger
Opportunistic; seldom attacking
Terrence Bennett, The College of New Jersey
Data Sharing Across the Disciplines
Data sharing behavior: An empirical study
Data sharing behavior
• Why do researchers share?
  - Advance scholarship and inquiry
  - Comply with ethical imperatives
  - Support open access
• Why might researchers be reluctant to share?
  - Need for confidentiality
  - Competitive advantage of secrecy
  - Lack of infrastructure that supports sharing
  - Too much trouble
Study: Data sharing in life sciences*
• Surveyed trainees in life sciences (and compared with computer science and chemical engineering)
• Results were disturbing
  - 23% were denied access to published data
  - 21% were denied access to unpublished data
  - 8% had denied requests from others for access to data
  - 51% reported that withholding of data had a negative effect on research progress
*Vogeli, C. et al. (2006). Data withholding and the next generation of scientists: Results of a national survey. Academic Medicine 81(2), p. 128-136.
These results raise new questions
• Are dissertators sharing?
• Do dissertators in the life sciences share better than their counterparts in the social sciences?
Methodology
• Searched the ProQuest Dissertations & Theses (PQDT) database
• Restricted to PhD dissertations
• Limited to most recent five years
• Used PQDT controlled subject index (5 disciplines):
  - Political Science
  - Cell Biology
  - Psychology
  - Biochemistry
  - Genetics
Methodology (continued)
• Random sort of results from each discipline
• Selected 12 from each discipline
• N = 60 (not a multinational sample)
• Coded for 9 variables related to presence of data and availability of data for sharing
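The selection step described in the methodology (a random sort within each discipline, then taking 12 from each) amounts to a stratified random sample. A minimal Python sketch of that idea follows. The record identifiers, the pool size of 500 per discipline, and the function name `stratified_sample` are hypothetical placeholders for illustration; the study's actual PQDT search results and tooling are not described in the slides.

```python
import random

# The five PQDT subject categories named in the methodology.
DISCIPLINES = [
    "Political Science",
    "Cell Biology",
    "Psychology",
    "Biochemistry",
    "Genetics",
]
PER_DISCIPLINE = 12  # dissertations selected from each discipline


def stratified_sample(records_by_discipline, per_stratum=PER_DISCIPLINE, seed=None):
    """Draw per_stratum records at random, without replacement, from each discipline."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    sample = {}
    for discipline, records in records_by_discipline.items():
        if len(records) < per_stratum:
            raise ValueError(f"Not enough records for {discipline}")
        # Equivalent to shuffling the result list and keeping the first per_stratum items.
        sample[discipline] = rng.sample(records, per_stratum)
    return sample


# Hypothetical stand-in for search results: 500 placeholder IDs per discipline.
search_results = {
    d: [f"{d}-{i:04d}" for i in range(1, 501)] for d in DISCIPLINES
}

sample = stratified_sample(search_results, seed=2009)
total = sum(len(chosen) for chosen in sample.values())
print(total)  # 60, i.e. 5 disciplines x 12 dissertations
```

Sampling a fixed number per discipline keeps the groups the same size for comparison, though it does not reflect how many dissertations each discipline actually produced.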
Research questions
• Do abstracts and tables of contents accurately indicate the presence of data?
• What is the nature of the data collected?
  - Origin
  - Functional category
  - Is data scarce? Valuable?
  - Is data collection automated?
• Are there disciplinary differences regarding dataset use, reuse, and availability?
Findings: abstracts and TOCs
• Great variation in the percentage of author-supplied abstracts that indicate the use or availability of data collections
[Figure: pie charts for Poli Sci and Cell Bio showing the share of abstracts with "Data cited" versus "Data not mentioned"; each chart is split 25% / 75%]
For detailed findings, be sure to visit us during the poster session!
Findings: data category*
• Datasets are predominantly dissertation-specific
[Figure: bar chart of counts (axis 0 to 12) by discipline: Genetics, Cell Bio, Psych, Biochem, Poli Sci. Legend: Research; Resource; Reference; Undefined / Other]
*National Science Foundation (2005). The elements of the digital data collections universe. Ch. 2 (p. 17-23) in Long-lived digital data collections enabling research and education in the 21st Century.
Findings: data automation
[Figure: bar chart of counts (axis 0 to 12) by discipline: Genetics, Cell Bio, Psych, Biochem, Poli Sci]
Legend categories:
• Multiple types of data; not automated
• Multiple types of data; highly automated
• Quantitative data, highly automated
• Text or other qualitative data, highly automated
• Quantitative data, not automated
• Text or other qualitative data, not automated
• Unknown/undefined
Findings: data availability
[Figure: bar chart of counts (axis 0 to 12) by discipline: Genetics, Cell Bio, Psych, Biochem, Poli Sci]
Legend categories:
• Data destroyed
• Not in document; web search unsuccessful
• Citation in document may point to data
• Citation to data in document
• Data in document
• Not available
Conclusions
• Dissertation datasets tend to be configured to serve only the immediate need of the dissertation; this leads to interesting questions for archiving and preservation.
• Dissertators in the life sciences may be slightly better than their social sciences counterparts in depositing data in repositories.
Conclusions (continued)
• Highly automated data collecting does not lead to increased data sharing, despite strong theoretical support for expecting that it would.
• Very few dissertators are embracing the open data movement.
Further questions / next steps
• Need stronger empirical data – larger sample; more disciplines; not limited to dissertations
• Implications for saving/preserving/disseminating research data
• Are disciplinary differences in data sharing behavior inevitable?
• What is the role of librarians in promoting data sharing across the disciplines?