BerlemontDOEmeeting1

About the cellulases
distribution
Renaud Berlemont, UCI
Adam Martiny Lab.
About the GHx classification
• CAZYdb → Glycoside Hydrolases, …
• Structure – Sequence Alignments : Families (>100) / Clans (14)
• « Convergence – Divergence »
Some statements
• Biochemically confirmed « cellulases » = CMCases
• Many cellulases are active on other substrates (e.g. xylan)
• Many « cellulases » are non-cellulolytic !?
• CMCases ≠ Cellulases
• Cellulose production :
– GH8 (Romling, 2002) – Biofilm / Interaction (w. plant)
– GH5 (Berlemont, 2009) – Biofilm
– GH6 (Delbrassine, in prep) – Cell differentiation
– GH6 (Tunicate, animal)
– GH9 (KORRIGAN, plant)
Some statements
• Best studied cellulose degraders all belong to
the Firmicutes group (e.g. Clostridium)
• ~20 genomes of cellulose degraders have been
completely sequenced
Question 2
How are extracellular
enzyme genes distributed among
microbial taxa ?
Hypothesis 2a
Some extracellular enzymes are broadly distributed across
taxa while others are constrained to a small number of taxa.
Hypothesis 2b
The occurrence of different extracellular enzyme genes
among taxa will be correlated. Some genes will show
patterns of over-dispersion while others will show co-occurrence.
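Hypothesis 2b can be illustrated with a small sketch. It scores each pair of gene families with the phi coefficient on presence/absence vectors: values near +1 suggest co-occurrence, values near -1 suggest over-dispersion. The phi coefficient is only one possible measure and is an assumption here, not the method used in the study; the presence/absence table is invented for illustration.

```python
from itertools import combinations
from math import sqrt


def phi_coefficient(a, b):
    """Phi correlation between two presence/absence (1/0) vectors."""
    n11 = sum(1 for x, y in zip(a, b) if x and y)
    n10 = sum(1 for x, y in zip(a, b) if x and not y)
    n01 = sum(1 for x, y in zip(a, b) if not x and y)
    n00 = sum(1 for x, y in zip(a, b) if not x and not y)
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        return 0.0  # a gene present in all taxa or in none carries no signal
    return (n11 * n00 - n10 * n01) / denom


def pairwise_phi(table):
    """Phi coefficient for every pair of gene families in the table."""
    return {
        (g1, g2): phi_coefficient(table[g1], table[g2])
        for g1, g2 in combinations(sorted(table), 2)
    }


# Hypothetical presence (1) / absence (0) of three families in six taxa.
presence = {
    "GH5": [1, 1, 1, 0, 0, 0],
    "GH6": [1, 1, 1, 0, 0, 0],
    "GH18": [0, 0, 0, 1, 1, 1],
}

scores = pairwise_phi(presence)
for pair, phi in scores.items():
    print(pair, round(phi, 2))
```

With real data, a permutation test or a null model would be needed before reading any pattern into these scores.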
pSEED - FigFams
• Sequenced genomes (patricbrc db - 4089)
In order to analyze as many sequenced genomes as possible
pSEED - FigFams
« FIGfams are sets of protein sequences that are similar along the full
length of the proteins. Proteins are thought of as implementing one or
more abstract functional roles, and all of the members of a single FIGfam
are believed to implement precisely the same set of functional roles ».
« Unambiguous coherent annotation system » …
3.2.1.4 : 1,4-beta-D-endoglucanase, 1,4-beta-D-glucan-4-glucanohydrolase,
beta-1,4-endoglucan hydrolase, beta-1,4-endoglucanase, endoglucanase,
Methodology
CAZYdb (GH families, E.C. 3.2.1.4) → GHx
GHx → Pfam (pro. + euk.) → PfGHx.FASTA
GHx → InterPro (pro.) → IprGHx.FASTA
Home-made script (pSEED) : SEQ → PEG IDs → FigFam IDs
Several FigFam IDs correspond to one GHx family because signal
peptides and accessory domains are not conserved …
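The mapping step (sequence → PEG ID → FigFam ID) can be sketched as below. This is not the home-made script itself, which is not shown in the slides; the function name, the dictionaries and all identifiers are hypothetical placeholders. The sketch only shows why one GHx family can end up with several FigFam IDs.

```python
from collections import defaultdict


def figfams_per_family(seq_to_family, seq_to_peg, peg_to_figfam):
    """Collect the set of FigFam IDs reached from the sequences of each GH family."""
    family_to_figfams = defaultdict(set)
    for seq_id, family in seq_to_family.items():
        peg_id = seq_to_peg.get(seq_id)
        if peg_id is None:
            continue  # sequence not found among the annotated genomes
        figfam_id = peg_to_figfam.get(peg_id)
        if figfam_id is not None:
            family_to_figfams[family].add(figfam_id)
    return dict(family_to_figfams)


# Hypothetical placeholder identifiers (not real pSEED or Pfam IDs).
seq_to_family = {"seq1": "GH5", "seq2": "GH5", "seq3": "GH6", "seq4": "GH6"}
seq_to_peg = {"seq1": "peg_a", "seq2": "peg_b", "seq3": "peg_c"}
peg_to_figfam = {"peg_a": "FIGFAM_A", "peg_b": "FIGFAM_B", "peg_c": "FIGFAM_C"}

mapping = figfams_per_family(seq_to_family, seq_to_peg, peg_to_figfam)
print(mapping)
```

Here the two GH5 sequences land in two different FigFams, as would happen when they differ in signal peptide or accessory domains.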
Methodology
[Flow diagram: GHx → FigFam IDs (pSEED) → genome annotations (pSEED) → GHx (and CBM2) occurrence in sequenced genomes → occurrence list per bacterial group → statistics, alignment]
GHx distribution
A huge data-set
A: Actinobacteria
B: Aquificae
C: Bactero./Chlorobi
D: Chlam./Verruco.
E: Chloroflexi
F: Chrysiogenetes
G: Cyanobacteria
H: Deferribacteres
I: Deinoco./Thermus
J: Dictyoglomi
K: Elusimicrobia
L: Fibrob./Acidobact.
M: Firmicutes
N: Fusobacteria
O: Nitrospirae
P: Gemmatimonadetes
Q: Planctomycetes
R: Proteobacteria
S: Spirochaetes
T: Synergistetes
U: Tenericutes
V: Thermodesulfobact.
W: Thermotogae
Huge bias : A + C + M + R = 88% of the sequenced genomes…
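Because a few groups dominate the sequenced genomes, raw GHx counts per group mostly reflect sequencing effort. A minimal sketch of one way to correct for this is to divide the gene count of each group by its number of genomes. All counts below are invented for illustration and the function names are hypothetical.

```python
def average_gene_content(gene_counts, genome_counts):
    """GHx genes per sequenced genome, for each bacterial group."""
    return {
        group: gene_counts.get(group, 0) / n_genomes
        for group, n_genomes in genome_counts.items()
        if n_genomes > 0
    }


def genome_share(genome_counts, groups):
    """Fraction of all sequenced genomes that belong to the given groups."""
    total = sum(genome_counts.values())
    return sum(genome_counts[g] for g in groups) / total


# Hypothetical numbers of genomes and of GHx genes per group letter.
genomes = {"A": 400, "C": 200, "M": 900, "R": 2000, "L": 10}
ghx_genes = {"A": 1200, "C": 500, "M": 900, "R": 1000, "L": 80}

agc = average_gene_content(ghx_genes, genomes)
share = genome_share(genomes, ["A", "C", "M", "R"])
print(agc)
print(round(share, 3))
```

In this toy table the smallest group has the highest gene content per genome, which raw counts would hide.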
Average Gene Content (AGC)
Life style (Auto Vs. Hetero)
Host association
…
“HKG”
Multi-function
…
GHx distribution in Genomes
Life Style
Autotrophic :
Aquificae
Cyanobacteria
Chrysiogenetes
Nitrospirae
Host associated :
Chlam./Verruco.
Elusimicrobia
Fibrob./Acidobact.*
Fusobacteria
Spirochaetes
Tenericutes
GHx distribution in Genomes
GHx functions
« house keeping »
GH6
endoglucanase ; cellobiohydrolase
GH18
… endo-β-N-acetylglucosaminidase …
Q: Planctomycetes
U: Tenericutes - Mycoplasma
GHx distribution in Genomes
GHx functions
GHx families « specialization »
GH6
endoglucanase ; cellobiohydrolase
GH5
chitosanase ; β-mannosidase ; cellulase ;
glucan β-1,3-glucosidase ; licheninase ;
glucan endo-1,6-β-glucosidase ; mannan endo-β-1,4-mannosidase ;
endo-β-1,4-xylanase ; cellulose β-1,4-cellobiosidase ;
β-1,3-mannanase ; xyloglucan-specific endo-β-1,4-glucanase ;
mannan transglycosylase ; endo-β-1,6-galactanase ;
endoglycoceramidase
How can we tell whether a GH5 enzyme
is a cellulase ?
Complex architectures
GH5
chitosanase (EC 3.2.1.132); β-mannosidase (EC 3.2.1.25); cellulase (EC 3.2.1.4);
glucan β-1,3-glucosidase (EC 3.2.1.58); licheninase (EC 3.2.1.73);
glucan endo-1,6-β-glucosidase (EC 3.2.1.75);
mannan endo-β-1,4-mannosidase (EC 3.2.1.78); endo-β-1,4-xylanase (EC 3.2.1.8);
cellulose β-1,4-cellobiosidase (EC 3.2.1.91); β-1,3-mannanase (EC 3.2.1.-);
xyloglucan-specific endo-β-1,4-glucanase (EC 3.2.1.151);
mannan transglycosylase (EC 2.4.1.-); endo-β-1,6-galactanase (EC 3.2.1.164);
endoglycoceramidase (EC 3.2.1.123)
GH6
endoglucanase (EC 3.2.1.4); cellobiohydrolase (EC 3.2.1.91)
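The contrast between GH5 and GH6 can be made concrete with a small lookup. The EC numbers are the ones listed above; the function name and the choice to treat EC 3.2.1.4 and EC 3.2.1.91 as the cellulose-directed activities are assumptions made for this sketch.

```python
# EC numbers per family, copied from the lists above.
GH_FAMILY_ACTIVITIES = {
    "GH5": {
        "3.2.1.132", "3.2.1.25", "3.2.1.4", "3.2.1.58", "3.2.1.73",
        "3.2.1.75", "3.2.1.78", "3.2.1.8", "3.2.1.91", "3.2.1.-",
        "3.2.1.151", "2.4.1.-", "3.2.1.164", "3.2.1.123",
    },
    "GH6": {"3.2.1.4", "3.2.1.91"},
}

# Assumption: cellulase and cellulose 1,4-beta-cellobiosidase count as cellulolytic.
CELLULOLYTIC_ECS = {"3.2.1.4", "3.2.1.91"}


def cellulolytic_fraction(family):
    """Share of the activities listed for a family that target cellulose."""
    activities = GH_FAMILY_ACTIVITIES[family]
    return len(activities & CELLULOLYTIC_ECS) / len(activities)


for family in sorted(GH_FAMILY_ACTIVITIES):
    print(family, round(cellulolytic_fraction(family), 2))
```

A GH6 hit points to a cellulose-directed activity, whereas a GH5 hit alone leaves the substrate open, so family membership is not enough to call a GH5 enzyme a cellulase.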
[Figure: GH5*, GH5-CBM2 distribution in sequenced genomes. x-axis: Bacterial Groups (A to W); y-axis: PEGs.]
[Figure: GH6*, GH6-CBM2 distribution in sequenced genomes. x-axis: Bacterial Groups (A to W); y-axis: PEGs.]
Free cellulases from the GH6 are associated with cellulose production in actinomycetes !
Is there an efficient combination of
enzymes ?
Some genes are abundant (GH5, 10, 16, 18, 19)
Are these genes really involved in plant cell wall (PCW) breakdown ?
Multi-domain
Why are Fibrobacteria so efficient ?
The keys to success in Fibrobacteria
Things to remember…
• Huge dataset
• Distribution of GHx amongst taxa
• Not all the GHx are equivalent
– Multifunction, housekeeping and specialized GHx families
• Not all the taxa are equivalent
– Life style, metabolism
• Future : « Multi-domain »
What’s next
Looking at the GHx distribution in subgroups
(e.g. Proteobacteria, Firmicutes, …)
→ Detailed table of the GHx distribution
amongst (sub)-taxa
Potential publication ?
• What is the phylogenetic distribution of GHx’s and CBM-GHx’s ?
• Catabolism regulation analysis in Actinobacteria, CebR (GHx vs CBM-GHx) :
– Presence/absence of regulatory sequences upstream of the GHx-coding sequences
• Environmental factors : “life style”, “metabolism”, …
• Gene gain/loss : 16S rRNA vs. presence/absence of GHx’s
Does the cellulose degradation
potential vary with the environment ?
Some case studies …
GHx distribution in metagenomes
% of CBM-linked GHx
Warnecke 2007
Spirochaetes, Fibrobacter,
Bacteroidetes, …
Hess 2011
Bacteroidetes, Fibrobacteria,
Clostridia, …
…Vs. Our study
Using the SSU…
[Figure: Percent of hits to bacterial SSU-rRNA sequences (0 to 120) for samples L1 to L6 and PL. Legend: Fibrobacter/Acidobacter, Bacteroidetes, Cyanobacteria, Firmicutes, Gammaproteobacteria, Betaproteobacteria, Alphaproteobacteria, Actinobacteria, Others.]
…Vs. Our study
Reno 2012 (probably)
Actinobacteria, Alphaproteobacteria,
Bacteroidetes, …
Metagenomes Clustering
[Figure: clustering of the same metagenomes by 16S rRNA and by GHx: GOS, Leaf Litter, Leaf Litter (tr. 1), Leaf Litter (tr. 2), Cow Rumen, Termites, Wood feeding insects, Human metagenome.]
Environment selects for different populations (with different GHx)
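One way to compare metagenomes by their GHx content is a pairwise dissimilarity between GH family profiles. The sketch below uses Bray-Curtis dissimilarity and reports the nearest neighbour of each sample. The metric is an assumption, since the slides do not say how the clustering was computed, and the counts are invented and do not come from any of the datasets named above.

```python
def bray_curtis(p, q):
    """Bray-Curtis dissimilarity between two count profiles stored as dicts."""
    families = set(p) | set(q)
    num = sum(abs(p.get(f, 0) - q.get(f, 0)) for f in families)
    den = sum(p.get(f, 0) + q.get(f, 0) for f in families)
    return num / den if den else 0.0


def nearest_neighbours(profiles):
    """For each sample, the name of the sample with the most similar profile."""
    nearest = {}
    for name, profile in profiles.items():
        candidates = [
            (bray_curtis(profile, other_profile), other)
            for other, other_profile in profiles.items()
            if other != name
        ]
        nearest[name] = min(candidates)[1]
    return nearest


# Hypothetical GH family counts per metagenome (illustration only).
profiles = {
    "Leaf Litter": {"GH5": 10, "GH6": 8, "GH9": 6},
    "Leaf Litter (tr. 1)": {"GH5": 9, "GH6": 9, "GH9": 5},
    "Cow Rumen": {"GH5": 12, "GH6": 0, "GH9": 1},
    "Termites": {"GH5": 11, "GH6": 1, "GH9": 1},
}

neighbours = nearest_neighbours(profiles)
for sample, closest in neighbours.items():
    print(sample, "->", closest)
```

The same function applied to 16S rRNA taxon counts would give the second clustering, so the two can be compared sample by sample.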
Things to remember…
• Different recipes for efficient PCW breakdown
• Depending on the ecosystem
• Leaf litter ≠ Cow Rumen
– Bacterial content
– GH content
• Across ecosystems, bacteria display different strategies to access plant polymers
– [GH6, GH8, GH9]LL > [GH6, GH8, GH9]CR
– [CBM-GHx]LL > [CBM-GHx]CR
What’s next
• Leaf Litter Metagenome
– 22 samples ~ready to be sequenced (TruSeq™ DNA, Illumina) (first year)
– samples to be prepared (second year)
– Compare :
[GHx/16S rRNA in sequenced genomes] vs. [GHx/16S rRNA in Leaf Litter]
– Compare different treatments, metagenomes
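The planned comparison of GHx/16S rRNA ratios can be sketched as follows. Hit counts are normalized by the number of 16S rRNA hits in the same dataset, then the leaf litter ratio is divided by the sequenced-genome ratio for each family. The function names and all counts are hypothetical.

```python
def ghx_per_16s(ghx_hits, ssu_hits):
    """Normalize GHx hit counts by the number of 16S rRNA hits in the same dataset."""
    if ssu_hits <= 0:
        raise ValueError("need at least one 16S rRNA hit to normalize")
    return {family: n / ssu_hits for family, n in ghx_hits.items()}


def enrichment(sample, reference):
    """Ratio of normalized abundances, sample over reference, per GH family."""
    return {
        family: sample[family] / reference[family]
        for family in sample
        if reference.get(family, 0) > 0
    }


# Hypothetical hit counts (illustration only).
genomes = ghx_per_16s({"GH5": 50, "GH6": 10}, ssu_hits=100)
leaf_litter = ghx_per_16s({"GH5": 20, "GH6": 8}, ssu_hits=40)

ratios = enrichment(leaf_litter, genomes)
print(ratios)
```

A ratio above 1 would mean the family is more frequent per 16S rRNA copy in leaf litter than in the sequenced genomes; differences in 16S copy number between taxa are ignored in this sketch.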
Nitrogen fertilization
Nemergut, 2008. The effects of chronic nitrogen fertilization on alpine tundra soil microbial communities: implications for carbon and nitrogen cycling.
[Figure: GHx, GHy and GHz relative to 16S rRNA; panels labeled "control".]
• TruSeq™ DNA (Illumina)
• 24 samples
• 22 samples ready to be sequenced
Complex architectures
CBM2 - Cel5
Cel5 - CBM2
Xyl8 - Cel5
Number of FigFam IDs corresponding to a 2-domain protein
Number of FigFam IDs ≠ number of genes
Metagenomes Clustering
[Figure: clustering of the same metagenomes by 16S rRNA and by GHx: GOS, Leaf Litter, Leaf Litter (tr. 1), Leaf Litter (tr. 2), Cow Rumen, Termites, Wood feeding insects, Human metagenome.]
Environment selects for different GHx potential