Advisory Teleconference for NSF #0534567
Biological Information Specialists
October 18, 2007

Advisory Board:
Thomas Garnett, Smithsonian Institution Libraries
John Kress, Smithsonian Institution
Maryann Martone, University of California at San Diego
Chuck Miller, Missouri Botanical Garden
Neil Smalheiser, University of Illinois at Chicago

GSLIS:
Carole Palmer
Bryan Heidorn
John MacMullen
Jennifer Hill
Marc Snir
Allen Renear
Linda Smith

Staffing / affiliations update
● Dan Wright took new position with Institute for Genomic Biology for dissertation purposes
● Jennifer Hill joined project as RA in late August
● John MacMullen joined GSLIS faculty
● Sayeed Choudhury, new GSLIS research fellow for 2007-2009
  – Associate Director for Library Digital Programs and Hodson Director of the Digital Knowledge Center at the Sheridan Libraries of Johns Hopkins University

Official approval of degree in mid-August!

Dissemination - publications
● Palmer, Carole L., Heidorn, P. Bryan, Wright, Dan and Cragin, Melissa (2006). Graduate Curriculum for Biological Information Specialists: A Key to Integration of Scale in Biology. 2nd International Digital Curation Conference, Glasgow, Scotland, November 21-22, 2006.
● Heidorn, P. Bryan, Palmer, Carole L., and Wright, Dan (2007). Biological Information Specialists for Biological Informatics. Journal of Biomedical Discovery and Collaboration, 2:1. http://www.j-biomed-discovery.com/content/2/1/1
● Palmer, Carole L., Cragin, Melissa H., Heidorn, P. Bryan, and Smith, Linda C. (under review). Data Curation for the Long Tail of Science: The Case of Environmental Sciences. 3rd International Digital Curation Conference, Washington, USA, December 11-13, 2007.

Dissemination - panels and posters
● Heidorn, P. Bryan, Palmer, Carole L., and Wright, Dan (2006). Building Biodiversity Information Education: Next Generation Bioinformaticians. Proceedings of Taxonomic Databases Working Group, p. 16. Abstract and presentation, St. Louis, Missouri, October 2006.
● Palmer, C. L. (2006). Making an impact on science: Moving from research to education. Digital Archives for Science and Engineering Resources (DASER) Summit III and Tri Society Symposium, November 3, 2006, Austin, Texas.
● Heidorn, P.B., Palmer, C.L., Cragin, M.H., Smith, L.C. (2007). Data Curation Education and Biological Information Specialists. DigCCurr2007: An international symposium on Digital Curation, April 18-20, 2007, Chapel Hill, NC.
● Heidorn, P.B., Palmer, C.L., Wright, D., Cragin, M.H. (2007). Information Specialist’s Training in Biology. Botany & Plant Biology 2007, July 7-11, 2007, Chicago, IL.
● Heidorn, P. Bryan, Wei, Qin (2007). Using New Technologies for Education and Communication. Proceedings of Taxonomic Databases Working Group. Abstract and presentation, Bratislava, Slovakia, September 2007.
● Cragin, M. H., D’Avolio, L., MacMullen, W. J., Smith, C. A. (2007). The Effects of Context on Data Quality in Biomedical Data Reuse. Panel for 2007 Annual Meeting of the American Society for Information Science & Technology (ASIS&T), October 19-25, 2007.
● Heidorn, P.B., Palmer, C.L., Cragin, M.H., Smith, L.C., Wright, D. (forthcoming). Biological Information Specialist’s Training. Poster at Biocurator 2007, San Jose, CA, October 25-28, 2007.
● MacMullen, W. John (forthcoming). Measuring variation in curators' GO annotations through a controlled multi-MOD study. Poster for Biocurator 2007, San Jose, CA, October 25-28, 2007.

Related Activities
● Scientific Communications Initiative (SCI) core area in Center for Informatics Research in Science and Scholarship (CIRSS) [website not yet official: http://cirss.lis.uiuc.edu/]
  – Data Curation Education Program (DCEP), IMLS/RE-05-05-0036
  – Investigating Data Curation Profiles Across Research Domains (partnering with Purdue University), IMLS/LG06-07-0032
● Landinformatics conferences at UIUC, Sept 2007
● Survey developed for UIUC Environmental Council on data curation and management
  – To be distributed to ~330 natural and social scientists
● Hosting 2010 6th annual International Digital Curation Conference at UIUC

Potential curricular areas from last meeting
● Promoting scientist awareness of data curation issues
● Information access to nontraditional materials
● Instrumentation data management
● Familiarity with metadata standards
● Data standards: scientific data formats, lower-level standards like Unicode
● Ontologies for bioinformatics
● General workflow capture practices, lab and field notebooks in context of workflow
● Literature-based discovery (LBD)
● Synthesis of data from diverse sources (e.g., GenBank + PubMed)
● Use of domain databases, e.g., GenBank, PDB
● Digitization best practices
● Information management (systems analysis, information consulting), including management of born-digital data
● Searching/reference skills
● Familiarity with copyright and other intellectual property issues
● Archiving, preservation, and data curation

Courses offered beginning fall 2006

Biodiversity Informatics (Heidorn): Examines the history and current state of biodiversity informatics with the objective of understanding the impact of this information on national and global policy. The taxonomic and functional diversity of organisms is an essential element of biodiversity that has been represented in resources ranging from 14th-century herbals to the Global Biodiversity Information Facility.
Biodiversity informatics is the organization and study of information about biodiversity. In this course, we will examine how different constituencies gather and present this information to meet their own, sometimes conflicting, objectives. The creation and dissemination of biodiversity information will be compared with information practices in other fields such as genomics.

Ontologies in Natural Science (Renear): Explores the application of formal ontology and related information modeling techniques in the natural sciences. In the current iteration (2006) we will focus particularly on the biological and medical sciences. There are no specific prerequisites and the necessary background will be presented as part of the course, but students should be prepared to make routine use of symbolic languages and, more generally, should be comfortable making their own way through difficult material that assumes background knowledge or skills they may not have. Objective: a general acquaintance with several important applications of formal ontology and a more in-depth analysis of important cross-cutting issues. Specific interests of the participants will be accommodated as far as possible.

Information Transfer and Collaboration in Science (Palmer): Examines the role of information in the production of scientific knowledge. Building on a foundation of classic readings in scientific communication and documentation, the course covers a range of contemporary research on scientific information, collaboration, research practice, and informatics. The focus is on formal and informal information transfer and communication as a social phenomenon and implications for collaborative science and e-science. The course has been developed as part of the master's degree in bioinformatics and is also suited for doctoral students and advanced master's students interested in professional development as science and medical information specialists.
The class will be conducted as a seminar centered on discussion of readings and student interests. Students will lead class discussions, select and present overviews of relevant research studies, and give updates on their own research projects. Assignments and projects will allow students to focus on an established research interest or to explore new areas.

Data curation courses began fall 2007

Foundations of Data Curation (Cragin & MacMullen): Data curation is the active and on-going management of data through its lifecycle of interest and usefulness to scholarship, science, and education; curation activities and policies enable data discovery and retrieval, maintain data quality and add value, and provide for re-use over time. This course provides an overview of a broad range of theoretical and practical problems in this emerging field. Examines issues related to appraisal and selection, long-lived data collections, research lifecycles, workflows, metadata, and legal and intellectual property issues.

Digital Preservation (McDonough)

Other BIS beginning spring 2008
● Introduction to Biological Informatics Problems and Resources. Explores the current landscape of biological informatics from the LIS perspective, including: types of problems studied by biological scientists, methods and instruments used, and which problems have informatics components; the range of data that exist; the uses of metadata, ontologies, and controlled vocabularies; data manipulation tools; application software; specific tasks and workflows; and data-driven science. Lecture, discussion, and hands-on components. 4 credit hours.
● Questions remain about how best to cover
  – Scientific workflow
  – Database design for scientific data curation

Summer 2007 data collection from scientists
● 9 interviews conducted for preliminary analysis
  – Taxonomy, molecular simulation, ecoinformatics, animal science, genomics, proteomics, agricultural science
  – All transcribed
● Coding and analysis in progress
  – Participant profiles
  – Preliminary themes
● Other fields that should be included?

Survey development for curriculum
● Development of survey instrument for larger, diverse sample
  – Most important areas from what was covered in the interviews?
  – What additional areas need to be covered?
● Next slides
  – Method for further developing response options (e.g., tools)?
  – How to define population and construct sample?

Tools identified
● Analysis, Visualization
  – Build own
  – Statistical (SAS, R, etc.)
  – Excel
  – GIS
  – Other
● Lit searching
  – Google, Google Scholar
  – Web of Science, PubMed
  – Journal web sites
● Collaboration tools
  – Wikis
  – Videoconferencing/Screensharing
  – Email

Resources identified
● GenBank
● Taxonomic databases (Tropicos, HerpNET, FishBase, etc.)
● Protein databases (RESID, EBI, etc.)
● Governmental (USGS, National Climatic Data Center)
● NCBI
● Many others

Population and sampling for representativeness
● Defining population parameters
  – Fields
  – Scientific roles: faculty, research scientists, etc.
● Sources for sample
  – Listservs
  – Professional associations
  – Other