Linking biodiversity data with the Biological Collections Ontology

Ramona Walls (iPlant Collaborative, University of Arizona)
John Deck (University of California at Berkeley)
Robert Guralnick (University of Colorado at Boulder)
John Wieczorek (University of California at Berkeley)

http://code.google.com/p/bco/


What it means to be an OBO Foundry Ontology

• Shared commitment to creating a suite of interoperable ontologies that span the biological and biomedical domains
  – non-redundancy
  – re-use of existing terms
• Adherence to OBO Foundry principles, including:
  – open access, willingness to collaborate
  – shared formats, relations, URIs, naming conventions
  – good documentation, single locus of authority
• Access to OBO Foundry community resources
  – tools
  – expertise


Scope of the BCO

• Collections of organisms and their parts (museum or voucher specimens)
• Environmental samples
• Surveys, ecological observations

[Figure labels, environmental samples: transect; depth; sample collection point; water sample at depth X; aliquot; metagenome]
[Figure labels, surveys: plot; sub-plot; transect (within plot); individual (within plot); individual (within sub-plot)]


Initial focus of BCO: tracking materials and data through sampling chains

[Figure labels: Moorea Biocode bioinventory event; Museum specimens; Digital image stored on Morphbank; identification; Genbank sequence; Tissue sample at Smithsonian Institution; Gut sample; Metagenomic sequences at CAMERA portal]

[Diagram: a sampling chain expressed with BCO and OBI terms]
Ontology terms: BCO:material sampling process; BCO:material sample; BCO:identification process; BCO:taxonomic name; OBI:sequencing assay; OBI:sequence data
Example nodes: Biocode Sampling; Tissue sampling; Insect specimen; Tissue sample; Identification using key; TaxonID A; Identification using BLAST; TaxonID B; Sequencing; Genbank sequence B; DNA extraction; DNA molecules
KEY: rdfs:Class; subclass of; has specified output; has specified input; instance of; derives from


Example data: processes

Investigation | Study | process ID | process type | has input | has output | date
--- | --- | --- | --- | --- | --- | ---
Moorea Biocode Project | | Moorea Biocode project | planned process | | |
Moorea Biocode Project | Moorea insect inventory | Moorea insect inventory | planned process | | |
Moorea Biocode Project | Moorea insect inventory | insect collection 01 | material sampling process | insect 01 | insect 01 | 2010
Moorea Biocode Project | Moorea insect inventory | tissue sampling 01 | material sampling process | insect 01 | tissue sample 01 | 2010
Moorea Biocode Project | Moorea insect inventory | insect gut sampling 04 | material sampling process | insect 01 | insect gut sample 04 | 2010
Moorea Biocode Project | Moorea insect inventory | insect gut sampling 05 | material sampling process | insect 02 | insect gut sample 05 | 2010
Moorea Biocode Project | Moorea insect inventory | dna isolation 01 | material sampling process | tissue sample 01 | DNA sample 01 | 2010
Moorea Biocode Project | Moorea insect inventory | dna isolation 04 | material sampling process | insect gut sample 04 | DNA sample 04 | 2010
Moorea Biocode Project | Moorea insect inventory | insect observation 06 | observing process | insect in situ 06 | image 01 | 2010
Moorea Biocode Project | Moorea insect inventory | identification 01.1 | tax. iden. by morph. key | insect 01 | insect taxon 01 | 2010
Moorea Biocode Project | Moorea insect inventory | identification 01.2 | tax. iden. using dna barcode | dna isolation 01 | insect taxon 01 | 2011
Moorea Biocode Project | Moorea insect inventory | identification 04.1 | tax. iden. using BLAST | dna isolation 04 | microbial taxon 01 | 2010
Moorea Biocode Project | Moorea insect inventory | identification 06.1 | morph. tax. identification | image 01 | insect taxon 01 | 2010
Moorea Biocode Project | Moorea insect inventory | identification 07.1 | morph. tax. identification | image 02 | insect taxon 02 | 2011


Example data: material entities and information artifacts

Individual | Type | Inferred type
--- | --- | ---
insect 01 | organism or virus or viroid | material sample
insect 02 | organism or virus or viroid | material sample
tissue sample 01 | organism part | material sample
tissue sample 02 | organism part | material sample
insect gut sample 04 | material entity | material sample
DNA sample 01 | DNA | material sample
DNA sample 02 | DNA | material sample
insect in situ 06 | organism or virus or viroid | material target of observation
image 01 | photographic image | information artifact
insect taxon 01 | taxonomic name | information artifact
insect taxon 02 | taxonomic name | information artifact
microbial taxon 01 | taxonomic name | information artifact
microbial taxon 02 | taxonomic name | information artifact


List all processes that took place in 2010 as part of the Moorea insect inventory

Query: BFO:process and BFO:part of occurent BCO_example:Moorea insect inventory and date=2010

Study | process ID | process type | date
--- | --- | --- | ---
Moorea insect inventory | insect collection 01 | material sampling process | 2010
Moorea insect inventory | insect collection 02 | material sampling process | 2010
Moorea insect inventory | tissue sampling 01 | material sampling process | 2010
Moorea insect inventory | tissue sampling 02 | material sampling process | 2010
Moorea insect inventory | insect gut sampling 04 | material sampling process | 2010
Moorea insect inventory | insect gut sampling 05 | material sampling process | 2010
Moorea insect inventory | dna isolation 01 | material sampling process | 2010
Moorea insect inventory | dna isolation 02 | material sampling process | 2010
Moorea insect inventory | dna isolation 04 | material sampling process | 2010
Moorea insect inventory | dna isolation 05 | material sampling process | 2010
Moorea insect inventory | insect observation 06 | observing process | 2010
Moorea insect inventory | identification 01.1 | tax. iden. by morph. key | 2010
Moorea insect inventory | identification 02.1 | tax. iden. by morph. key | 2010
Moorea insect inventory | identification 04.1 | tax. iden. using BLAST | 2010
Moorea insect inventory | identification 04.2 | tax. iden. using BLAST | 2010
Moorea insect inventory | identification 04.3 | tax. iden. using BLAST | 2010
Moorea insect inventory | identification 04.4 | tax. iden. using BLAST | 2010
Moorea insect inventory | identification 04.5 | tax. iden. using BLAST | 2010
Moorea insect inventory | identification 06.1 | morph. tax. identification | 2010


List the output (“has specified output”) of every “taxonomic identification process” that has as input (“has specified input”) the "insect 03"

Study | process ID | process type | has specified input | has specified output
--- | --- | --- | --- | ---
Moorea insect inventory | identification 03.1 | tax. iden. by morph. key | insect 03 | insect taxon 01

Study | process ID | process type | has input | has output
--- | --- | --- | --- | ---
Moorea insect inventory | tissue sampling 03 | material sampling process | insect 03 | tissue sample 03
Moorea insect inventory | dna isolation 03 | material sampling process | tissue sample 03 | DNA sample 03
Moorea insect inventory | identification 03.2 | tax. iden. using dna barcode | DNA sample 03 | insect taxon 03


Future directions - technical

• SPARQL endpoint with example queries
  – Check the BCO wiki (http://code.google.com/p/bco/)
• Implement community curation tools such as Quick Term Templates or BioPortal
  – Requests can go to the Issue tracker now: http://code.google.com/p/bco/issues/list


Future directions - ontological

• Better integration with OBI and other ontologies
• More sophisticated treatment of naming/taxonomy/identification
• Ontological modeling of surveys/inventories
• Mappings to DwC, MIxS, other vocabularies
• Testing with real data sets


Contributors

Steve Baskauf, Vijay Barve, Jim Beach, Reed Beaman, Matthiew Bietz, Stan Blum, Shawn Bowers, Pier Luigi Buttigieg, Neil Davies, Gabi Droege, Dag Endresen, Maria Alejandra Gandolfo, Robert Hanner, Alyssa Janning, Michelle Koo, Kris Krishtalka, John Kunze, Andréa Matsunaga, Peter Midford, Chuck Miller, Norman Morrison, Gil Nelson, OBI Developers, Éamonn O’Tuama, Cynthia Parr, Sujeevan Ratnasingham, Jai Rideout, Robert Robbins, Phillipe Rocca-Serra, Joel Sachs, Inigo San Gil, Herbert Schentz, Mark Schildhauer, Barry Smith, Peter Sterk, Steve Stones-Havas, Brian Stucky, Andrea Thomer, Mellisa Tulig, Dave Vieglais, Brian Wee, Trish Whetzel, Jamie Whitacre, Greg Whitbread, John Wooley


Funding

• RCN4GSC: Research Coordination Network for Genomic Standards Consortium (DBI-0840989)
• IB3 EAGER: An Interoperable Information Infrastructure for Biodiversity Research (IIS-1255035)


Questions?

https://groups.google.com/forum/?fromgroups#!forum/bco-discuss
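

Appendix sketch: the "insect 03" query in plain Python

The "insect 03" example in the slides can be approximated in a few lines of Python. This is only a rough sketch over the four rows shown for that example, not the BCO's SPARQL endpoint or any official tooling. The names `Process`, `PROCESSES`, `IDENTIFICATION_TYPES`, `direct_identifications`, and `chained_identifications` are hypothetical, and the set of identification process labels is hard-coded here as an assumption, where an ontology reasoner would infer it from the subclass hierarchy. The second function additionally assumes that one wants identifications made on samples derived from the insect, which is one possible reading of the sampling-chain example.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Process:
    process_id: str
    process_type: str
    has_input: str
    has_output: str


# Rows transcribed from the "insect 03" example tables in the slides.
PROCESSES = [
    Process("identification 03.1", "tax. iden. by morph. key",
            "insect 03", "insect taxon 01"),
    Process("tissue sampling 03", "material sampling process",
            "insect 03", "tissue sample 03"),
    Process("dna isolation 03", "material sampling process",
            "tissue sample 03", "DNA sample 03"),
    Process("identification 03.2", "tax. iden. using dna barcode",
            "DNA sample 03", "insect taxon 03"),
]

# Assumption: these labels stand in for the subclasses of
# "taxonomic identification process"; a reasoner would infer this.
IDENTIFICATION_TYPES = {
    "tax. iden. by morph. key",
    "tax. iden. using dna barcode",
    "tax. iden. using BLAST",
    "morph. tax. identification",
}


def direct_identifications(entity, processes=PROCESSES):
    """Outputs of identification processes whose input is `entity` itself."""
    return [
        p.has_output
        for p in processes
        if p.has_input == entity and p.process_type in IDENTIFICATION_TYPES
    ]


def chained_identifications(entity, processes=PROCESSES):
    """Like direct_identifications, but also follows sampling steps
    (breadth-first) to samples derived from `entity`."""
    taxa = []
    seen = {entity}
    queue = deque([entity])
    while queue:
        current = queue.popleft()
        for p in processes:
            if p.has_input != current:
                continue
            if p.process_type in IDENTIFICATION_TYPES:
                taxa.append(p.has_output)
            elif p.has_output not in seen:
                # A non-identification step yields a derived material.
                seen.add(p.has_output)
                queue.append(p.has_output)
    return taxa


if __name__ == "__main__":
    print(direct_identifications("insect 03"))
    print(chained_identifications("insect 03"))
```

With these four rows, the direct lookup returns only "insect taxon 01", while the chained lookup also reaches "insect taxon 03" through the tissue sample and DNA sample.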