Phylogenetic distribution of potential cellulases in bacteria
R. Berlemont & A. Martiny

The cellulose degradation

Project Goals
4 main questions:
1) How do microbial taxa respond to environmental changes?
2) How are extracellular enzyme genes distributed among microbial taxa?
3) Can we predict enzyme function and litter decomposition rates by combining enzyme gene distributions with microbial taxa responses to environmental change?
4) Are microbial communities and their functions resilient to environmental change?

Analyze the gene content in available sequenced bacterial genomes

CAZy classification
• Endo/exo-cellulases, β-glucosidases
• Glycoside hydrolases (GH)
• >100 GH families (14 folds)
• >100 activities
• Multifunctional enzymes contain catalytic domains that belong to different GH families

Mining the CAZy
[Table: one row per GH family (GH1, GH3, GH5, GH6, GH8, GH9, GH12, GH44, GH45, GH48, plus GH8 & BcsZ for cellulose synthesis). Columns: structure ((β/α)8, (α/α)6, β-jelly roll, atypical α/β barrel); mechanism (R or I); enzymes characterized as endocellulase (3.2.1.4), exocellulase (3.2.1.74/91/176) or β-glucosidase (3.2.1.21); other characterized activity (hits); hits in CAZy; orthologs identified in SEED. Sources: (Berlemont et al, 2009) (de Jong et al, 2009) (Romling, 2002) (SEED). The cell values are scrambled in the extracted text.]

Accessing the genomes
• SEED environment
• PATRIC database
• GH: 1, 3, 5, 6, 8, 9, 12, 44, 45, 48

CAZy vs. us
[Table: same GH family table as on the "Mining the CAZy" slide.]
• CAZy: all the NCBI genomes (including genome fragments): 9,631 genes
• SEED: only the sequenced genomes (3711) are considered: 22,523 genes

Variations across phyla
GH rich: Acidobacteria, Bacteroidetes, Dictyoglomi, Fibrobacteres, Thermotogae, Verrucomicrobia, Lentisphaerae (Actinobacteria, Chloroflexi, Firmicutes)

Variations across subphyla
• Actinobacteria
• 402 genomes
• Niches (soil: Streptomycetales) "GH-rich"
• Strain-specific GH distribution
[Figure: 16S rRNA phylogeny; legend: 0 GH, 1 GH, >1 GH]

GH associations
Redundancy, complementarities
[Figure: Spearman correlation of glycoside hydrolase abundance. Labels: endo/exo synergistic model (endocellulases, exocellulases, β-glucosidases), Group I, Group II, Group IIIa, Group IIIb, BcsZ]

GH associations
• Nothing! (can't degrade cellulose) (21%)
• No Group II–III: opportunists (Opp.) (44%)
  – 100% Gr. I and 0% Gr. II / Gr. III
• Putative cellulose degraders (pCD) (35%)
  – 100% Gr. II/III and Gr. I (94%)
• pCD vs. Opp.?

GH associations
Cellulose degraders and opportunists
• Actinobacteria + Bacteroidetes + Proteobacteria + Firmicutes = 86% of sequenced strains
• Putative cellulose degraders and opportunists occur in all phyla
• 35% pCD [16-56%]
• G1(pCD) = (1.7-4.4) × G1(Opp.)
[Figure legend: opportunists, putative cellulose degraders, poor strains]

… and their life styles
"Poor phyla, subphyla and strains"?
e.g.: Cyanobacteria, RuBisCO

Life-style vs. GH content
[Figure labels: autotrophs, intracellular, Actinobacteria]

Phylogeny vs. GH content
Our first "deliverable"
• 16S rRNA phylogeny (Neighbor Joining)
• Glycoside hydrolase-based clustering (Bray-Curtis)
• Distance between the matrices, CADM (Mantel)

Phylogeny vs. GH content?
[Figure: GH-based clustering compared with 16S rRNA phylogeny, 402 Actinobacteria]

Conclusions
• No information concerning the activity of these enzymes
• 3711 sequenced genomes ≠ natural population (pathogens)
• Genomic perspectives
  – New "pipeline" for gene distribution/association
  – Tell me what your GH content is, and I'll tell you who you are!
• Ecological perspectives
  – Functional redundancy in sequenced genomes (substrate, pH, …)
  – 44% opportunists, 35% putative cellulose degraders
• Implications for ecosystem processes
  – β-glucosidases reveal nothing about cellulose degradation!
  – More than 35% of the strains have endo/exo-cellulases!
  – Cellulose degraders have multiple copies of these genes!
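The three genome categories from the "GH associations" slides (no relevant GH, opportunists, putative cellulose degraders) can be illustrated with a minimal Python sketch. It assumes that Group I corresponds to the β-glucosidase families (GH1, GH3) and that the remaining families on the "Accessing the genomes" slide stand in for Groups II/III; the slides do not spell out this mapping. The genome names and counts are hypothetical.

```python
from collections import Counter

# Assumption: Group I = beta-glucosidase families; the slides do not list the
# exact family-to-group mapping, so this split is illustrative only.
GROUP_I_FAMILIES = {"GH1", "GH3"}
# Assumption: the remaining families stand in for Groups II/III (endo/exo-cellulases).
GROUP_II_III_FAMILIES = {"GH5", "GH6", "GH8", "GH9", "GH12", "GH44", "GH45", "GH48"}


def classify_genome(gh_counts):
    """Assign a genome to one of the three categories from its GH gene counts.

    gh_counts maps a GH family name (e.g. "GH5") to the number of genes found.
    """
    has_cellulase = any(gh_counts.get(f, 0) > 0 for f in GROUP_II_III_FAMILIES)
    has_beta_glucosidase = any(gh_counts.get(f, 0) > 0 for f in GROUP_I_FAMILIES)
    if has_cellulase:
        return "putative cellulose degrader"
    if has_beta_glucosidase:
        return "opportunist"
    return "no GH"


# Hypothetical genomes, for illustration only.
genomes = {
    "strain_A": {"GH1": 3, "GH3": 2, "GH5": 4, "GH9": 1},
    "strain_B": {"GH1": 1},
    "strain_C": {},
    "strain_D": {"GH3": 2, "GH48": 1},
}

categories = {name: classify_genome(counts) for name, counts in genomes.items()}
summary = Counter(categories.values())

for category, n in summary.items():
    print(f"{category}: {n} of {len(genomes)} genomes")
```

Applied to a full genome collection, the same tally would yield category shares comparable to the 21% / 44% / 35% split reported on the slide, provided the family-to-group mapping matches the one used by the authors.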
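The comparison outlined on the "Phylogeny vs. GH content" slide (Bray-Curtis clustering of GH content set against a phylogenetic distance matrix) can be sketched as follows. This is a plain permutation Mantel test in pure Python, not the CADM procedure named on the slide, and it is not the authors' pipeline. The GH profiles and phylogenetic distances are hypothetical values chosen for illustration.

```python
import random
from math import sqrt


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two GH count profiles."""
    total = sum(a + b for a, b in zip(u, v))
    if total == 0:
        return 0.0  # two empty profiles are treated as identical
    return sum(abs(a - b) for a, b in zip(u, v)) / total


def distance_matrix(profiles):
    """Square matrix of pairwise Bray-Curtis dissimilarities."""
    return [[bray_curtis(p, q) for q in profiles] for p in profiles]


def upper_triangle(matrix):
    """Flatten the entries above the diagonal into a list."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def mantel(d1, d2, permutations=999, seed=0):
    """One-sided Mantel test: correlation of two distance matrices and a
    permutation p-value obtained by relabelling the rows/columns of d2."""
    n = len(d1)
    x = upper_triangle(d1)
    observed = pearson(x, upper_triangle(d2))
    rng = random.Random(seed)
    order = list(range(n))
    at_least_as_large = 0
    for _ in range(permutations):
        rng.shuffle(order)
        permuted = [[d2[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(x, upper_triangle(permuted)) >= observed:
            at_least_as_large += 1
    p_value = (at_least_as_large + 1) / (permutations + 1)
    return observed, p_value


# Hypothetical data: GH gene counts per strain (columns = GH families) and
# hypothetical pairwise 16S rRNA distances for the same four strains.
gh_profiles = [[4, 2, 0], [3, 2, 1], [0, 1, 5], [0, 0, 6]]
phylo_dist = [
    [0.0, 0.1, 0.6, 0.7],
    [0.1, 0.0, 0.5, 0.6],
    [0.6, 0.5, 0.0, 0.2],
    [0.7, 0.6, 0.2, 0.0],
]

gh_dist = distance_matrix(gh_profiles)
r, p = mantel(phylo_dist, gh_dist)
print(f"Mantel r = {r:.3f}, permutation p = {p:.3f}")
```

A high correlation with a small p-value would suggest that GH content tracks phylogeny. With only four strains the permutation space is tiny, so the p-value here is illustrative rather than informative.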