Living Large: Elucidation of the Frankia EAN1pec Genome Sequence Shows Gene Expansion and Metabolic Versatility

Louis S. Tisa¹, David R. Benson², Gary B. Smejkal⁴, Pascal Lapierre², J. Peter Gogarten², Philippe Normand⁵, M. Pilar Francino³, and Paul Richardson³

¹ Dept. Microbiology, U New Hampshire, Durham, NH, USA
² Dept. Mol. Cell Biol., U Connecticut, Storrs, CT, USA
³ JGI, Walnut Creek, CA, USA
⁴ Pressure Biosciences, Inc., Bridgewater, MA, USA
⁵ Ecologie Microbienne UMR CNRS 5557, Université Lyon, Villeurbanne, France

Actinorhizal Symbiosis
• Symbiotic association between Frankia and woody dicotyledonous plants
  – results in formation of root nodules
• Over 250 species of actinorhizal plants

Frankia
• Member of the Actinomycetales
• Hyphal bacteria
  – 67-72% GC
  – generation time 24-48 h
• Structures
  – vesicles
  – spores
[Micrograph: in planta vesicle clusters. Bar = 10 µm]

Three Frankia genotypes
[Figure: host plant labels, in the order shown: Betulaceae, Myricaceae, Casuarinaceae, Elaeagnaceae, Rhamnaceae, Myricaceae, Gymnostoma, Coriariaceae, Datiscaceae, Rosaceae, Ceanothus]

Why Sequence CcI3 and EAN1pec?
CcI3
• Metabolism
• Member of Group I
• Narrow host range
• Markers: KanR, GenR, KasR, NalR, AsO₄³⁻
EAN1pec
• Diverse metabolism
• Member of Group III (globally distributed)
• Broader host range
• Markers: NovR, LinR, KasR, NalR, AsO₄³⁻, Pb²⁺ and CrO₄²⁻
• Limited genetics

Surprise One: Three different genome sizes
• ACN: 6783 CDS, 2 rRNA, 72.8% GC
• CcI3: 4515 CDS, 2 rRNA, 70.1% GC
• EAN: 7492 CDS, 3 rRNA, 71.0% GC
• Circular topology

Comparison of the CDS
• Reciprocal BLAST search with a cutoff of 10⁻⁴
[Venn diagram: CDS shared among Frankia ACN14a, Frankia CcI3 and Frankia EAN1pec. Region counts in the order extracted: 2730, 1190, 630, 2291, 1333, 587, 3725]

Comparative distribution of ORF function
[Bar chart: number of ORFs (0 to 1400) per COG functional group for Frankia sp. CcI3, Frankia sp. EAN1pec, Acidothermus cellulolyticus 11B, Arthrobacter sp. FB24, Kineococcus radiotolerans SRS30216 and Streptomyces coelicolor A3(2). Groups on the x-axis: AA Metab, Carb Metab, Cell Cycle, Cell Mot, Cell Wall, Chromal Struc, Coenz Trans, Defense, Energy, Func Unk, Gen Func, Ion Trans, Intracell Traff, Lipid Metab, Nuc Struc, Nuc Metab, PostTrans Mod, RNA Proc, Replic Repair, Second Metab, Signal Trans, Transcr, Transl]

The EAN Genome is Expanding
• Gene duplication level is higher in EAN (18.5% of the ORFs) than in ACN (7.5%) or CcI3 (9.8%)
• CcI3 has an accelerated rate of gene loss compared to EAN and ACN
[Bar chart: gene counts (-1500 to 1500) for CcI3, ACN and EAN in the categories T+I, Duplicates and ORFans. Labeled values: 1355 and 1054]

What are the major families of duplicated genes in these Frankia strains?
• BlastClust (NCBI) analysis: 25% identity over at least 40% of the length (30% identity over 52% of the length gave the same result)
• An analysis of the top 20 duplicated gene families showed major differences in functional groups
CcI3 (165/444)
• 116 out of 165 (70%) duplicated genes belonged to several classes of transposases and genes associated with prophage and plasmids
• Loss of genes associated with transport and metabolism
EAN (406/1355)
• Transport proteins, dioxygenases, short chain dehydrogenases/reductases (SDR), regulatory proteins, cytochrome P450, monooxygenases
• Also, as in CcI3, 132 out of 406 (32.5%) genes are associated with integrases and transposases
ACN (151/512)
• Transport proteins, SDR, serine-threonine protein kinases, methyltransferases, endonucleases, and a variety of dehydrogenases
• No transposases among the 151 genes of the top 20 families

[Map: present-day native distribution of actinorhizal plant hosts. a, ACN: Betulaceae (orange), Myricaceae (green) and their overlap (khaki). b, EAN: Elaeagnaceae (pink), Myricaceae (green), Rhamnaceae (blue; Tribe Colletieae in South America, Australia and New Zealand); areas of overlap are brown and dark blue. c, CcI3: Casuarina and Allocasuarina of the Casuarinaceae (light blue).]
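
The family criterion quoted for the BlastClust analysis (25% identity over at least 40% of the length) can be illustrated with a minimal sketch. This is not BlastClust itself and not the pipeline used for the poster: it is a simplified single-linkage clustering over pairwise hits. The function name `cluster_families`, the input layout, and the choice to measure coverage against the longer of the two sequences are all assumptions made for illustration.

```python
from collections import defaultdict


def cluster_families(gene_lengths, hits, min_identity=25.0, min_coverage=0.40):
    """Group genes into families by single linkage (illustrative only).

    gene_lengths: dict mapping gene name -> protein length (residues)
    hits: iterable of (query, subject, percent_identity, alignment_length)
    Returns the families with more than one member, each as a sorted list.
    """
    parent = {gene: gene for gene in gene_lengths}

    def find(gene):
        # Walk up to the family representative, compressing the path.
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]
            gene = parent[gene]
        return gene

    for query, subject, identity, aln_len in hits:
        if query == subject:
            continue  # self hits carry no family information
        # Assumption: coverage is measured on the longer sequence.
        longer = max(gene_lengths[query], gene_lengths[subject])
        if identity >= min_identity and aln_len / longer >= min_coverage:
            parent[find(query)] = find(subject)

    families = defaultdict(set)
    for gene in gene_lengths:
        families[find(gene)].add(gene)
    return [sorted(members) for members in families.values() if len(members) > 1]


# Toy input with hypothetical gene names, not data from the poster.
toy_lengths = {"geneA": 100, "geneB": 100, "geneC": 200, "geneD": 100}
toy_hits = [
    ("geneA", "geneB", 30.0, 50),   # passes both thresholds
    ("geneB", "geneC", 26.0, 90),   # passes: 90/200 = 45% coverage
    ("geneC", "geneD", 20.0, 150),  # fails the identity threshold
]
toy_families = cluster_families(toy_lengths, toy_hits)
```

Because linkage is single, geneA and geneC end up in the same family through geneB even though they share no direct hit. That behaviour is one reason the number and size of families depend on the thresholds chosen.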
What about genes identified as potentially involved in symbiosis?
• Nitrogenase components
• Hopanoid biosynthesis
• Uptake hydrogenase biosynthesis
• Hemoglobin
• Nodulation

Nitrogenase Cluster for EAN1pec
[Gene map from the MaGe site showing synteny. Labeled genes: hypothetical proteins, 3 Fd genes, NifS, NifB, NifZ, NifW, NifX, NifN, NifE, NifK, NifD, NifH, NifV]
• NifV (homocitrate synthase) is located in another region of the chromosome

Frankia Vesicles
• Laminated hopanoid lipids
• Restrict oxygen diffusion
• N2 fixation can occur “free-living”
Parsons et al. 1987; Berry et al. PNAS 1993

Cluster I: shcI
[Gene map, http://img.jgi.doe.gov/cgi-bin/pub/main.cgi. Labeled genes: TetR, putative phytoene DH, polyprenyl synthetase, squalene/phytoene synthase, shcI, amine oxidase]
• EAN and ACN have an extra shcI gene

Surprise Two: Potential symbiosis genes are not clustered
[Chromosome map. Labeled positions: katA, katG, HbO, shc2, sodF, nifV, HbN]
• cluster I: shc1, FRAEA6946-6954
• cluster II: hup2*, FRAEA4081-4086
• cluster III: hup1, FRAEA2955-2965
• cluster IV: nif, FRAEA8447-8463
• HbO, FRAEA6420
• HbN, FRAEA4419
• shc2, FRAEA5736
• katA, FRAEA8358
• sodF, FRAEA4204
• nodB-like, FRAEA6279
• NifV, FRAEA4890

Transcription Analysis of Two Frankia hemoglobins
• HboO expression is up-regulated under hypoxic conditions
• HboN expression is up-regulated by NO release
• Nitrogen status did not significantly affect expression

Why the large genome (9.1 Mb) for Frankia EAN1pec?
• Many soil dwellers have large genomes (Streptomyces, Bradyrhizobium, Burkholderia, etc.)
• These “boy-scouts” are always prepared for changing conditions of the soil environment
  – wide array of substrates (uptake systems)
  – need for tight regulation

Metabolism
• Complete Embden-Meyerhof, TCA and pentose phosphate pathways
• Wide arsenal of transport genes
• Large numbers of genes for short chain dehydrogenase/reductase, dioxygenase, etc.

Regulatory mechanisms
• Large number of DNA binding proteins
• Two-component systems
• Sigma factors
• Anti-sigma factors
• Anti-sigma factor antagonists

Is Frankia EAN1pec versatile?
[Figure labels: Quercetin, Catechol]

DNA Regulatory Proteins
[Bar chart: number of regulatory proteins (0 to 160) per family for Frankia sp. EAN1pec, Frankia sp. CcI3, Acidothermus cellulolyticus 11B, Arthrobacter sp. FB24, Kineococcus radiotolerans SRS30216 and Streptomyces coelicolor A3(2). Families on the x-axis: ArsR, DeoR, AraC, AsnC, GntR, IclR, LacI, LuxR, LysR, Lrp, MerR, TetR]

Vesicle development is influenced by:
a. N status
b. Oxygen
c. Mo & Fe
d. Calcium
e. Temperature
f. Host plant

[Figure: proteome profiles of Frankia CcI3 grown under N2 or NH4Cl conditions. Arrows point out proteins specific to N2-grown cells.]

Search for Vesicle-Specific Proteins
[Figure: two-dimensional gel electrophoresis of vesicle proteins isolated by Pressure Cycling Technology. Panel label: Purified Vesicles]

Perspectives
• Frankia genome expansion and contraction reflects biogeographic history of symbioses
• No “symbiosis islands”
• The time is right for functional genetics
  – Proteomic profiles
  – Transcriptome profiles (DNA arrays)
  – Genetics

Acknowledgements
This work was supported by: USDA Hatch grant 486; USDA 2003-01127; NSF EF-0333173; DOE Microbial Genome Program
• TISA LAB: Tania Rawnsley, James Niemann, Teal Furnholm, Nick Beauchemin, Joanne Coulburn, Anna Myers, Arnab Sen (U. North Bengal)
• UConn: David Benson, Peter Gogarten
• UMaine: John Tjepkema
• ULyon: Philippe Normand
• JGI: Pilar Francino, Alla Lapidus, Paul Richardson, Chris Detter
• UNH CSB: Vern Rienhold
• PCT: Gary Smejkal
• All of the Frankia community

[Figure: the PULSE Tube used in Pressure Cycling Technology facilitates high-efficiency lysis of cells and subcellular components]
[Figure: isolation of proteins from Frankia mycelium and vesicles by PCT]
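
As a rough illustration of the “Surprise Two” observation that potential symbiosis genes are not clustered, the locus tags listed on that slide can be sorted and the gaps between neighbouring loci counted. This is a sketch under a loud assumption: it treats the FRAEA locus-tag number as a proxy for chromosomal order, which the poster does not state. The helper names `locus_span` and `gaps_between_loci` are hypothetical.

```python
import re

# Locus tags copied from the "Surprise Two" slide.
SYMBIOSIS_LOCI = {
    "shc1": "FRAEA6946-6954",
    "hup2": "FRAEA4081-4086",
    "hup1": "FRAEA2955-2965",
    "nif": "FRAEA8447-8463",
    "HbO": "FRAEA6420",
    "HbN": "FRAEA4419",
    "shc2": "FRAEA5736",
    "katA": "FRAEA8358",
    "sodF": "FRAEA4204",
    "nodB-like": "FRAEA6279",
    "NifV": "FRAEA4890",
}


def locus_span(tag):
    """Return (first, last) locus numbers for 'FRAEA6946-6954' or 'FRAEA6420'."""
    numbers = [int(n) for n in re.findall(r"\d+", tag)]
    return numbers[0], numbers[-1]


def gaps_between_loci(loci):
    """List (left_name, right_name, gap) for neighbouring loci.

    Assumption: locus-tag numbers increase along the chromosome, so the gap
    is a count of intervening locus tags, not a distance in base pairs.
    """
    spans = sorted(locus_span(tag) + (name,) for name, tag in loci.items())
    return [
        (left[2], right[2], right[0] - left[1])
        for left, right in zip(spans, spans[1:])
    ]


gaps = gaps_between_loci(SYMBIOSIS_LOCI)
for left_name, right_name, gap in gaps:
    print(f"{left_name} -> {right_name}: {gap} locus tags apart")
```

Under that assumption, every pair of neighbouring loci in the list is separated by dozens to more than a thousand locus tags, which is consistent with the slide's statement that there is no single symbiosis cluster.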