DNA Barcoding and the Consortium for the Barcode of Life

David E. Schindel, Executive Secretary
National Museum of Natural History
Smithsonian Institution
SchindelD@si.edu; http://www.barcoding.si.edu
202/633-0812; fax 202/633-2938

Data Analysis Working Group, DIMACS, 26 Sept 2005

Species Identification Matters
• Endangered/protected species
• Agricultural pests
• Invasive species
• Disease vectors/pathogens
• Hazards (e.g., bird strikes on airplanes)
• Environmental quality indicators
• Unsustainable harvesting
• Fidelity of cell lines/culture collections

The Practice of Taxonomy / The Uses of Taxonomy
[Diagram. The Practice of Taxonomy: Specimens, Characters, Distributions of Character Variation, Taxonomists, Taxonomic Decision-Making. The Uses of Taxonomy: Specimens, Concerns/Regulations, Socioeconomic Decisions.]

The Problem…
• Taxonomists are a limited resource
• Taxonomic infrastructure is not widely available
• Taxonomic decisions are difficult for nonspecialists
• Therefore, the practice of taxonomy does not scale up to meet the needs of society (or ecology, ecosystem studies, etc.)

A DNA barcode is a short gene sequence taken from standardized portions of the genome, used to identify species

Uses of DNA Barcodes
“Triage” tool for flagging potential new species:
• Undescribed and cryptic species
Research tool for assigning specimens to known species, including:
• Life history stages, damaged specimens, gut contents, droppings
Applied tool for identifying regulated species:
• Disease vectors, agricultural pests, invasives
• Protected species, CITES listed, trade-sensitive

The Mitochondrial Genome
[Diagram of the mitochondrial genome, H-strand and L-strand. Labeled regions: D-Loop, Small ribosomal RNA, Large ribosomal RNA, Cyt b, ND1, ND2, ND3, ND4, ND4L, ND5, ND6, COI, COII, COIII, ATPase subunit 6, ATPase subunit 8.]

How much information is there in a DNA Barcode?
• Human genome:
  – Contains 3 billion base-pairs
  – Identified by 648 bp COI barcode sequence
  – Content-to-label ratio: 5 × 10^6
• Oxford English Dictionary, 2nd Ed., 1989:
  – 20 volumes, 21,730 pages, 500,000 entries, 59 million words, 350 million print characters
  – Identified by 10-character ISBN
  – Content-to-label ratio: 4 × 10^7

Current Norm: High throughput
• Large capacity PCR and sequencing reactions
• ABI 3100 capillary automated sequencer

Future Norm?
• A taxonomic GPS
• Link to reference database
• Usable by nonspecialists
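
The content-to-label ratios quoted for the human genome and the Oxford English Dictionary can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name content_to_label_ratio is hypothetical, and the inputs are the figures given on the slide (3 billion base-pairs labeled by a 648 bp barcode; 350 million print characters labeled by a 10-character ISBN).

```python
def content_to_label_ratio(content_size: float, label_size: float) -> float:
    """Return how many units of content each unit of label stands for."""
    return content_size / label_size


# Human genome: 3 billion base-pairs, identified by a 648 bp COI barcode.
genome_ratio = content_to_label_ratio(3e9, 648)

# Oxford English Dictionary, 2nd Ed.: 350 million print characters,
# identified by a 10-character ISBN.
oed_ratio = content_to_label_ratio(350e6, 10)

print(f"Genome: {genome_ratio:.1e}")  # about 4.6e6, rounded on the slide to 5 × 10^6
print(f"OED:    {oed_ratio:.1e}")     # 3.5e7, rounded on the slide to 4 × 10^7
```

On these figures the two labels compress their content by ratios that differ by less than a factor of ten, which is the comparison the slide draws.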

Consortium for the Barcode of Life (CBOL)
• An international affiliation of:
  – 80+ member organizations, 35+ countries, 6 continents
  – Natural history museums, biodiversity organizations
  – Users: e.g., government agencies
  – Private sector biotech companies, database providers
• First barcoding publications in 2002
• Cold Spring Harbor planning workshops in 2003
• Sloan Foundation grant, launch in May 2004
• Secretariat opens at Smithsonian, September 2004
• First international conference February 2005

CBOL Member Organizations (as of May 2005)

CBOL’s Working Groups
• Database: Designing/constructing the Barcode Section of GenBank
• DNA: Protocols for formalin-fixed and old museum specimens; Producing LIMS for dissemination
• Data Analysis: Beyond phenetic methods; population genetics perspective
• Plants: Identify gene region(s) for barcoding

CBOL’s Goals
• Create a reference barcode database
• Identify high-priority taxa and societal needs
• Promote/facilitate barcoding projects and ‘CBOL campaigns’
• Improve methods, address shared obstacles through WGs
• Populate database from collections
• More portability, less time/expense
• Improve taxonomic research environment

Recent and Planned Activities
• Data standards, Barcode records in GenBank
• Launch of FishBOL, All Birds Initiatives
• International Network for Barcoding Invasive and Pest Species (INBIPS)
• APEC Workshop on Invasives, Beijing
• Mosquitoes and disease vectors
• Plans for CITES species, endangered Vertebrates, Bushmeat

Barcode Section of GenBank
[Diagram. Labels recovered from the figure:]
• Voucher Specimen, Barcode Sequence, Trace files, Species Name
• Specimen Metadata: Georeference, Habitat, Character sets, Images, Behavior, Other genes
• Other Databases: Phylogenetic, Pop’n Genetics, Ecological
• Indices: Catalog of Life, GBIF/ECAT
• Nomenclators: Zoo Record, IPNI, NameBank
• Literature, Publication links: New species (link to content or citation)
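
The elements named in the Barcode Section of GenBank diagram (voucher specimen, barcode sequence, trace files, species name, specimen metadata) suggest a simple record shape. The following is a minimal sketch under stated assumptions, not the GenBank data standard: the class name BarcodeRecord, its field names and types, the validity check, and all example values are hypothetical and chosen only to mirror the labels in the diagram.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class BarcodeRecord:
    """Hypothetical record mirroring the diagram labels (not an official schema)."""
    species_name: str
    voucher_specimen: str       # identifier of the specimen held in a collection
    barcode_sequence: str       # the barcode region as a string of bases
    trace_files: List[str] = field(default_factory=list)
    # Free-form metadata, e.g. georeference or habitat (keys are assumptions).
    specimen_metadata: Dict[str, str] = field(default_factory=dict)

    def is_valid_dna(self) -> bool:
        """True if the sequence is non-empty and uses only A, C, G, T, or N."""
        seq = self.barcode_sequence.upper()
        return bool(seq) and set(seq) <= set("ACGTN")


# Placeholder values for illustration only; none come from a real record.
record = BarcodeRecord(
    species_name="Hypothetical species",
    voucher_specimen="MUSEUM-0001",
    barcode_sequence="ACGTACGTNN",
    trace_files=["trace_forward.ab1"],
    specimen_metadata={"georeference": "0.0, 0.0", "habitat": "placeholder"},
)
print(record.is_valid_dna())
```

Keeping the voucher specimen and trace files on the record alongside the sequence reflects the diagram's emphasis on tying each barcode back to a physical specimen and its raw sequencing data.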