Advisory Teleconference for NSF #0534567
Biological Information Specialists
October 18, 2007
Advisory Board:
Thomas Garnett, Smithsonian Institution Libraries
John Kress, Smithsonian Institution
Maryann Martone, University of California at San Diego
Chuck Miller, Missouri Botanical Garden
Neil Smalheiser, University of Illinois at Chicago
GSLIS:
Carole Palmer
Bryan Heidorn
John MacMullen
Jennifer Hill
Marc Snir
Allen Renear
Linda Smith
Staffing / affiliations update
● Dan Wright took a new position with the Institute for Genomic Biology for dissertation purposes
● Jennifer Hill joined the project as RA in late August
● John MacMullen joined the GSLIS faculty
● Sayeed Choudhury is a new GSLIS research fellow for 2007-2009 (Associate Director for Library Digital
Programs and Hodson Director of the Digital Knowledge Center at the Sheridan Libraries of Johns
Hopkins University)
Official approval of degree in mid-August!
Dissemination - publications
● Palmer, Carole L., Heidorn, P. Bryan, Wright, Dan, and Cragin, Melissa (2006).
Graduate Curriculum for Biological Information Specialists: A Key to Integration of
Scale in Biology. 2nd International Digital Curation Conference, Glasgow, Scotland,
November 21-22, 2006.
● Heidorn, P. Bryan, Palmer, Carole L., and Wright, Dan (2007). Biological Information
Specialists for Biological Informatics. Journal of Biomedical Discovery and
Collaboration, 2:1. http://www.j-biomed-discovery.com/content/2/1/1
● Palmer, Carole L., Cragin, Melissa H., Heidorn, P. Bryan, and Smith, Linda C. (under
review). Data Curation for the Long Tail of Science: The Case of Environmental
Sciences. 3rd International Digital Curation Conference, Washington, USA,
December 11-13, 2007.
Dissemination – panels and posters
● Heidorn, P. Bryan, Palmer, Carole L., and Wright, Dan (2006). Building Biodiversity Information
Education: Next Generation Bioinformaticians. Proceedings of Taxonomic Databases Working
Group. P. 16. Abstract and presentation, St. Louis, Missouri, October 2006.
● Palmer, C. L. (2006). Making an impact on science: Moving from research to education. Digital
Archives for Science and Engineering Resources (DASER) Summit III and Tri Society
Symposium, November 3, 2006, Austin, Texas.
● Heidorn, P. B., Palmer, C. L., Cragin, M. H., Smith, L. C. (2007). Data Curation Education and
Biological Information Specialists. DigCCurr2007: An international symposium on Digital
Curation, April 18-20, 2007, Chapel Hill, NC.
● Heidorn, P. B., Palmer, C. L., Wright, D., Cragin, M. H. (2007). Information Specialist’s Training in
Biology. Botany & Plant Biology 2007, July 7-11, 2007, Chicago, IL.
● Heidorn, P. Bryan, and Wei, Qin (2007). Using New Technologies for Education and
Communication. Proceedings of Taxonomic Databases Working Group. Abstract and
presentation, Bratislava, Slovakia, September 2007.
● Cragin, M. H., D’Avolio, L., MacMullen, W. J., Smith, C. A. (2007). The Effects of Context on
Data Quality in Biomedical Data Reuse. Panel for 2007 Annual Meeting of the American Society
for Information Science & Technology (ASIS&T), October 19-25, 2007.
● Heidorn, P. B., Palmer, C. L., Cragin, M. H., Smith, L. C., Wright, D. (forthcoming). Biological
Information Specialist’s Training. Poster at Biocurator 2007, San Jose, CA, October 25-28, 2007.
● MacMullen, W. John (forthcoming). Measuring variation in curators' GO annotations through a
controlled multi-MOD study. Poster for Biocurator 2007, San Jose, CA, October 25-28, 2007.
Related Activities
● Scientific Communications Initiative (SCI) core area in the Center for Informatics Research
in Science and Scholarship (CIRSS) [website not yet official: http://cirss.lis.uiuc.edu/]
– Data Curation Education Program (DCEP) IMLS/RE-05-05-0036
– Investigating Data Curation Profiles Across Research Domains
(partnering with Purdue University) IMLS/LG06-07-0032
● Landinformatics conferences at UIUC, Sept 2007
● Survey developed for UIUC Environmental Council on data curation and management
– To be distributed to ~330 natural and social scientists
● Hosting the 6th annual International Digital Curation Conference at UIUC in 2010
Potential curricular areas from last meeting
● Promoting scientist awareness of data curation issues
● Information access to nontraditional materials
● Instrumentation data management
● Familiarity with metadata standards
● Data standards: scientific data formats, lower-level standards like Unicode
● Ontologies for bioinformatics
● General workflow capture practices, lab and field notebooks in context of workflow
● Literature-based discovery (LBD)
● Synthesis of data from diverse sources (e.g., GenBank + PubMed)
● Use of domain databases, e.g., GenBank, PDB
● Digitization best practices
● Information management (systems analysis, information consulting), including management of born-digital data
● Searching/reference skills
● Familiarity with copyright and other intellectual property issues
● Archiving, preservation, and data curation
Courses offered beginning fall 2006
Biodiversity Informatics (Heidorn): Examines the history and current state of biodiversity informatics with the
objective of understanding the impact of this information on national and global policy. The taxonomic and
functional diversity of organisms is an essential element of biodiversity that has been represented in resources
ranging from 14th-century herbals to the Global Biodiversity Information Facility. Biodiversity informatics is the
organization and study of information about biodiversity. In this course, we will examine how different
constituencies gather and present this information to meet their own, sometimes conflicting, objectives. The
creation and dissemination of biodiversity information will be compared with information practices in other
fields such as genomics.
Ontologies in Natural Science (Renear): Explores the application of formal ontology and related information
modeling techniques in the natural sciences. In the current iteration (2006) we will focus particularly on the
biological and medical sciences. There are no specific prerequisites and the necessary background will be
presented as part of the course, but students should be prepared to make routine use of symbolic languages,
and, more generally, should be comfortable making their own way through difficult material that assumes
background knowledge or skills they may not have. Objective: A general acquaintance with the several
important applications of formal ontology and some more in-depth analysis of important cross-cutting issues.
Specific interests of the participants will be accommodated as far as possible.
Information Transfer and Collaboration in Science (Palmer): Examines the role of information in the production
of scientific knowledge. Building on a foundation of classic readings in scientific communication and
documentation, the course covers a range of contemporary research on scientific information, collaboration,
research practice, and informatics. The focus is on formal and informal information transfer and
communication as a social phenomenon and implications for collaborative science and e-science. The course
has been developed as part of the master's degree in bioinformatics and is also suited for doctoral students
and advanced master's students interested in professional development as science and medical information
specialists. The class will be conducted as a seminar centered on discussion of readings and student
interests. Students will lead class discussions, select and present overviews of relevant research studies, and
give updates on their own research projects. Assignments and projects will allow students to focus on an
established research interest or to explore new areas.
Data curation courses began fall 2007
Foundations of Data Curation (Cragin & MacMullen): Data curation is the
active and on-going management of data through its lifecycle of interest and
usefulness to scholarship, science, and education; curation activities and
policies enable data discovery and retrieval, maintain data quality and add
value, and provide for re-use over time. This course provides an overview of
a broad range of theoretical and practical problems in this emerging field.
Examines issues related to appraisal and selection, long-lived data
collections, research lifecycles, workflows, metadata, and legal and intellectual
property issues.
Digital Preservation (McDonough)
Other BIS courses beginning spring 2008
● Introduction to Biological Informatics Problems and Resources. Explores the current
landscape of biological informatics from the LIS perspective, including: types of problems
studied by biological scientists, methods and instruments used, and which problems have
informatics components; the range of data that exist; the uses of metadata, ontologies, and
controlled vocabularies; data manipulation tools; application software; specific tasks and
workflows; and data-driven science. Lecture, discussion, and hands-on components. 4 credit
hours.
● Questions remain about how best to cover
– Scientific workflow
– Database design for scientific data curation
Summer 2007 data collection from scientists
● 9 interviews conducted for preliminary analysis:
– Taxonomy, molecular simulation, ecoinformatics, animal
science, genomics, proteomics, agricultural science
– All transcribed
● Coding and analysis in progress
– Participant profiles
– Preliminary themes
● Other fields that should be included?
Survey development for curriculum
● Development of survey instrument for a larger, more diverse sample
– Which areas covered in the interviews are most important?
– What additional areas need to be covered?
● Next slides
– Method for further developing response options (e.g., tools)?
– How to define the population and construct the sample?
Tools identified
● Analysis, Visualization
– Build own
– Statistical (SAS, R, etc.)
– Excel
– GIS
– Other
● Lit searching
– Google, Google Scholar
– Web of Science, PubMed
– Journal web sites
● Collaboration tools
– Wikis
– Videoconferencing/Screensharing
– Email
Resources identified
● GenBank
● Taxonomic databases (Tropicos, HerpNET, FishBase, etc.)
● Protein databases (RESID, EBI, etc.)
● Governmental (USGS, National Climatic Data Center)
● NCBI
● Many others
Population and sampling for representativeness
● Defining population parameters
– Fields
– Scientific roles: faculty, research scientists, etc.
● Sources for sample
– Listservs
– Professional associations
– Other