Data Reuse Practices among Zoologists

Data Reuse Practices among Zoologists: When Materiality Matters
Elizabeth Yakel, Ph.D.
Zoologists regularly reuse digital data about specimens as well as the specimens themselves.
However, we have no understanding of the circumstances surrounding why zoologists select
digital data versus viewing the actual specimen. For example, is the digital data missing or
unclear, is the data needed not contained in the standard metadata required, or does the research
require reexamination of the actual specimen? These questions are important for the
development of zoological databases and metadata standards, as well as for museums deciding
what data to share with national and international repositories, such as HerpNet, GenBank,
FishNet, or the Global Biodiversity Information Facility (GBIF).
Ilerbaig (2010) argues that specimens are records, but so are the databased records representing
specimens held in natural history databases. These representations range widely, from
abbreviated descriptions to those with rich metadata, sometimes paired with images or longer
DNA sequencing information. Similarly, although we have some evidence that social
expectations surrounding these databases differ (Bowker 2005; Bourne 2005; Costello et al.
2013), we know relatively little about the dynamics of data reuse (McLaughlin et al. 2001;
Pereira 2013; Stoltzfus et al. 2012; Wickett et al. 2012), particularly how and when zoological
researchers move between databases and actual specimens when reusing data.
This research project will enhance our knowledge of natural history collections in museums and
of the strengths of the data practices of repositories that hold information about the specimens
in these collections, advancing understanding in multiple domains. Findings will contribute to the
social study of science, focusing on why data reuse practices differ among zoologists. They also
will enrich knowledge about how well and when surrogates can stand in for the actual biological
specimens. Finally, the findings will contribute to our knowledge of cyberinfrastructure in
zoology by informing the design of tools and services to better support research and data
curation practices.
Student Role:
The student will serve as a research assistant for the project and be involved in research meetings
with my other students working on other projects. In this way s/he will peripherally participate
in a research team and be exposed to a variety of projects. For the “When Materiality Matters”
project, the student will reanalyze 33 interviews with and observations of zoologists originally
collected by the Dissemination Information Packages for Information Reuse (DIPIR) project.
Using NVivo, a qualitative data analysis application, the student will be guided by the mentor in
developing a coding scheme, coding the data, establishing interrater reliability, and analyzing the
results to better understand how decisions about information selection, particularly
databased information versus actual specimens, are made. Even though the student will largely
be analyzing existing data, I will provide the opportunity for the student to collect some
additional data on this project (interviews and observations) in order for the student to develop
data collection (particularly interview) skills.
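As an illustration of the interrater reliability step, the sketch below computes Cohen's kappa for two coders who have each applied one code to the same set of interview excerpts. The project description does not say which agreement statistic will be used, and NVivo offers its own coding comparison tools; Cohen's kappa, the function name, and the code labels "database" and "specimen" are assumptions made only for this example, and the data are invented.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who each assign one code per excerpt.

    Assumption: kappa is used here as one common agreement statistic;
    the project may rely on a different measure.
    """
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must code the same non-empty set of excerpts.")

    n = len(coder_a)

    # Observed agreement: share of excerpts given the same code by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n

    # Chance agreement: for each code, the product of the two coders'
    # marginal proportions, summed over all codes.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    all_codes = set(freq_a) | set(freq_b)
    expected = sum(freq_a[c] * freq_b[c] for c in all_codes) / (n * n)

    if expected == 1.0:
        # Both coders used a single identical code throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical codes for eight excerpts (not real project data).
coder_a = ["database", "database", "specimen", "specimen",
           "database", "specimen", "database", "specimen"]
coder_b = ["database", "specimen", "specimen", "specimen",
           "database", "specimen", "database", "database"]

print(cohens_kappa(coder_a, coder_b))
```

In this invented example the coders agree on six of eight excerpts, and chance agreement is 0.5, so kappa is 0.5. A low value of this kind would normally prompt the coders to discuss disagreements and refine the coding scheme before coding further interviews.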
Contribution to Student Academic and Professional Development:
The student will gain experience with collecting, analyzing, and presenting qualitative data and
will learn what is required to develop a publishable scholarly paper.
Mentoring Plan:
I will meet with the student several times a week throughout the project. Additionally,
I will monitor the student closely during several key phases of the project, including coding,
interviewing, and observing data (specimen) reuse in museums. I will give the student
constructive and iterative feedback on all aspects of the research. Finally, I will involve the
student in the development of the final article on this topic.
References:
Bourne, P. E. (2005). Will a Biological Database be Different from a Biological Journal? PLoS
Computational Biology, 1(3), 179–181. doi:10.1371/journal.pcbi.0010034
Bowker, G. C. (2005). Databasing the World: Biodiversity and the 2000s. In Memory Practices
in the Sciences (pp. 107–136). Cambridge, MA: MIT Press.
Costello, M. J., Michener, W. K., Gahegan, M., Zhang, Z.-Q., & Bourne, P. E. (2013).
Biodiversity Data Should Be Published, Cited, and Peer Reviewed. Trends in Ecology &
Evolution. doi:10.1016/j.tree.2013.05.002
Ilerbaig, J. (2010). Specimens as Records: Scientific Practice and Recordkeeping in Natural
History Research. American Archivist, 73(2), 463–482.
McLaughlin, R. L., Carl, L. M., Middel, T., Ross, M., Noakes, D. L. G., Hayes, D. B., & Baylis,
J. R. (2001). Potentials and Pitfalls of Integrating Data From Diverse Sources: Lessons
from a Historical Database for Great Lakes Stream Fishes. Fisheries, 26(7), 14–23.
Pereira, S. (2013). Motivations and Barriers to Sharing Biological Samples: A Case Study.
Journal of Personalized Medicine, 3(2), 102–110. doi:10.3390/jpm3020102
Stoltzfus, A., O’Meara, B., Whitacre, J., Mounce, R., Gillespie, E. L., Kumar, S., & Vos, R. A.
(2012). Sharing and re-use of phylogenetic trees (and associated data) to facilitate
synthesis. BMC Research Notes, 5(1), 574. doi:10.1186/1756-0500-5-574
Wickett, K. M., Sacchi, S., Dubin, D., & Renear, A. H. (2012). Identifying content and levels of
representation in scientific data. Proceedings of the American Society for Information
Science and Technology, 49(1), 1–10. doi:10.1002/meet.14504901199