CAZy: Integrating Information on Carbohydrate-Active

Carbohydrate-Active Enzymes in Melampsora laricis-populina
Brandi Cantarel, Bernard Henrissat, Pedro M. Coutinho
Architecture et Fonction des Macromolécules Biologiques
(UMR 6098)
CNRS / Aix-Marseille Université, France
1st Melampsora Genome Consortium Workshop,
Nancy (Aug/08)
Outline
• CAZy Database and Website
• Genome Annotation and Comparative Genomics
• Annotation Highlights from Melampsora laricis-populina
• Interpretation and Speculation
Carbohydrate-Active enZymes (CAZymes)
• 1991 Glycoside Hydrolases (112): glycosidases cleave; transglycosidases form
• 1997 Glycosyltransferases (91) (NDP-, NMP-, lipid-phosphorylases): form
• 1998 Polysaccharide Lyases (19): cleave
• 1999 Carbohydrate Esterases (15): modify
• 2000 Carbohydrate-Binding Modules (52)
Adhesion, Recognition, Selectivity
© Coutinho & Henrissat, 2008
CAZy (www.cazy.org) @ AFMB since September 1998
Entry fields: CAZy family, Subfamily, Name of protein, EC number, Organism, GenBank accessions, UniProt accessions, PDB accessions
CAZy: Carbohydrate-Active EnZymes Database (www.cazy.org)
• Sequences/Structures: GenBank; UniProt; PDB
• Genome Sequence
• BLAST, HMMER
• Specialized Library of Modules
• CAZy Sequences
• Modular Annotation
• Family Annotation: Mechanism; Structure; Function
• Biochemical Data: Literature; PubMed; EMP; PMD; Other
• Individual CAZyme Annotation
Annotating CAZymes
Function Prediction is a major bottleneck
• Common Genome Annotation Practices
  • Sequence Similarity ~ Specific Functional Prediction (≠)
  • Erroneous annotations are propagated
  • Original error(s) difficult to track
• Conservative Practices
  • Sequence Similarity = Family inclusion
  • Catalytic machinery checked for borderline cases
  • Functional assignment based on literature
  • Prediction based on subfamily analysis
Annotation and Comparisons
• CAZy - Biochemical Bioinformatics:
  • Correlation of data w/ biochemical databases
  • Manual Literature Curation
  • Text correlation / mining
• CAZy - Phylo-Genetics / -Genomics:
  • Identify Orthologs and Paralogs
  • Identify Analogs (Convergent Evolution)
  • Distinguish close / remote relationships
• Enzyme discovery in a Single Genome
  • Search and list all the CAZymes
  • Infer Properties (Mechanism / Fold) from Families
  • Infer Function from SubFamilies and Known Biochemically Characterized Cases
• Compare CAZyme content of Multiple Genomes
  • Correlate CAZyme content with Lifestyle
  • Discover singularities in Genomes
  • Understand Genome Evolution
© Coutinho, Danchin & Henrissat, 2007
CAZy: On the Genomic Scale
Annotations of CAZymes in Genomes
• Modular Annotation
  • Identify modules
  • Identify gene models with major problems (large truncations, insertions, frameshifts, etc.)
  • Identify Signal peptides, Linkers, GPI-anchors, TMs
• Functional Annotation
  • Sequence similarity to characterized enzymes
  • Make use of Subfamilies with characterized enzymes for reliable annotation
  • Characterized in the literature
  • Provide annotations that will “age well”
  • Several Levels / Categories:
    • Known Cases (++): EC activity assignment
    • High Similarity (+): “candidate” activity
    • Medium Similarity (-): “related to”
    • Low Similarity (--): “distantly related to” (taxon) activity
• Interpretation
  • Analogies with better characterized genomes
  • Singularities in enzyme distribution
  • Interaction with Consortia Biologists
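The four annotation levels listed above (++, +, -, --) can be read as a rule for wording a functional annotation. The following is a minimal sketch of that rule only. The names `Level` and `annotation_label` and the example activity are hypothetical, and the slides do not say how a similarity level is decided, so the level is taken as an input here.

```python
from enum import Enum


class Level(Enum):
    """Annotation levels as listed on the slide (symbols copied from it)."""
    KNOWN = "++"    # characterized case: EC activity assignment
    HIGH = "+"      # high similarity: "candidate" activity
    MEDIUM = "-"    # medium similarity: "related to"
    LOW = "--"      # low similarity: "distantly related to" (taxon) activity


def annotation_label(level, activity, taxon=""):
    """Word an annotation for `activity` according to its level.

    `taxon` is only used for the LOW level, mirroring the
    '"distantly related to" (taxon) activity' wording.
    """
    if level is Level.KNOWN:
        return activity
    if level is Level.HIGH:
        return f"candidate {activity}"
    if level is Level.MEDIUM:
        return f"related to {activity}"
    prefix = f"{taxon} " if taxon else ""
    return f"distantly related to {prefix}{activity}"


if __name__ == "__main__":
    for level in Level:
        print(level.value, annotation_label(level, "invertase", taxon="plant"))
```

Keeping the wording tied to the level is one way to make annotations "age well": a later characterization only changes the level, not free text.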
Sequence Similarity based Modular Analysis of CAZymes
• Genome Sequences
• Filter against CAZy Sequences using BLASTP → CAZymes
• Identify Modular Structure using HMMs of Modular Families → Modular Annotation
• CAZyModO : Genomic entry (1.ModO; 2.Function)
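The two steps above (BLASTP filter, then HMM-based module identification) can be sketched as follows. This is an illustration under stated assumptions, not the CAZy pipeline itself: the BLAST hits are assumed to be in the standard 12-column tabular format (`-outfmt 6`), the E-value cutoff is a hypothetical choice, and the HMM hits are assumed to be already parsed into `(protein, family, start, end)` tuples. All protein and reference names are invented.

```python
import csv
import io
from collections import defaultdict

# Column order of BLAST tabular output (-outfmt 6).
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def blast_filter(blast_tsv, max_evalue=1e-5):
    """Step 1: keep genome proteins with a hit against CAZy sequences.

    max_evalue is a hypothetical cutoff chosen for this sketch.
    """
    reader = csv.DictReader(io.StringIO(blast_tsv),
                            fieldnames=BLAST_COLUMNS, delimiter="\t")
    return {row["qseqid"] for row in reader
            if float(row["evalue"]) <= max_evalue}


def modular_annotation(module_hits, candidates):
    """Step 2: order module hits along each candidate protein.

    module_hits: iterable of (protein, family, start, end) tuples,
    assumed to come from a search with HMMs of modular families.
    Returns protein -> modules joined from N- to C-terminus.
    """
    per_protein = defaultdict(list)
    for protein, family, start, end in module_hits:
        if protein in candidates:
            per_protein[protein].append((start, end, family))
    return {protein: "-".join(family for _, _, family in sorted(hits))
            for protein, hits in per_protein.items()}


# Invented toy data, for illustration only.
EXAMPLE_BLAST = "\n".join([
    "prot1\tGH5_ref\t62.0\t300\t100\t4\t10\t309\t5\t304\t1e-90\t310",
    "prot2\tGT2_ref\t30.0\t80\t50\t3\t1\t80\t1\t80\t0.5\t25",
    "prot3\tGH7_ref\t55.0\t400\t170\t5\t20\t419\t18\t417\t1e-120\t400",
])
EXAMPLE_MODULE_HITS = [
    ("prot1", "CBM1", 320, 355),
    ("prot1", "GH5", 30, 300),
    ("prot2", "GT2", 1, 80),
    ("prot3", "GH7", 20, 420),
]

if __name__ == "__main__":
    cazymes = blast_filter(EXAMPLE_BLAST)
    print(modular_annotation(EXAMPLE_MODULE_HITS, cazymes))
```

In the toy data, `prot2` fails the BLAST filter, so its module hit is ignored even though an HMM matched it.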
Modularity in a Genome: Melampsora laricis-populina
SS-based Functional Analysis of CAZymes
© Coutinho & Henrissat, 2007
Activities in a Genome: Melampsora laricis-populina
Fungal CAZymes : M_lari vs Global Trends

Genome    GH   GT   PL  CBM  LifeStyle
S_cere    45   67    0   12  Saprophyte
A_nige   239  109    8   40  Saprophyte
A_oryz   283  114   21   33  Saprophyte
B_fuck   223   92    9   64  PhytoPath.
T_mela    91   96    3   25  Symbiont
M_gris   231   92    4   63  PhytoPath.
H_jeco   192   93    3   41  Saprophyte
G_zeae   242  102   20   62  PhytoPath.
N_cras   171   74    3   41  Saprophyte
P_anse   229   88    7   75  Saprophyte
S_pomb    46   61    0    8  Saprophyte
C_neof    81   66    3   10  Pathogen
P_chry   179   66    4   47  Saprophyte
L_bico   162   88    7   26  Symbiont
C_cine   210   72   13   90  Saprophyte
M_lari   176   93    6   10  PhytoPath.
P_gram   157   88    4   11  PhytoPath.
U_mayd    97   64    1    9  PhytoPath.

Normal GT set
Medium GH
Low PL / CBM set
CAZyme Family & Functional Annotation
Objectives
• Attribution of CAZymes to Families
• Annotation based on Biochemically Characterized cases
• Understand Evolution

Ascomycota
  Eurotiomycetes: A.nidulans, A.fumigatus, A.niger, A.oryzae
  S.sclerotiorum
  Sordariomycetes: M.grisea, P.anserina, N.crassa, H.jecorina, G.zeae
  Saccharomycotina: C.albicans, S.cerevisiae, C.glabrata
  Archaeascomycetes: S.pombe
Basidiomycota
  Hymenomycetes: C.neoformans, L.bicolor, P.chrysosporium
  U.maydis
Zygomycota
  R.oryzae
© Coutinho, Danchin & Henrissat, 2006
Fungal Genomes:
Kluyveromyces lactis NRRL Y-1140
Pichia stipitis CBS 6054
Saccharomyces cerevisiae S288C
Debaryomyces hansenii CBS767
Eremothecium gossypii ATCC 10895
Yarrowia lipolytica CLIB99
Candida albicans - Private
Candida glabrata CBS138
Phaeosphaeria nodorum SN15 - Private
Aspergillus nidulans FGSC A4 v.2
Aspergillus nidulans FGSC A4 v.3 - Private
Aspergillus clavatus NRRL 1 - Private
Aspergillus flavus NRRL3357 - Private
Aspergillus niger CBS 513.88 – (2007)
Aspergillus niger ATCC 1015 - Private
Aspergillus niger CBS 513.88 - Private
Aspergillus oryzae RIB 40
Aspergillus fumigatus Af293 - Private
Aspergillus terreus NIH2624 - Private
Coccidioides immitis RS - Private
Sclerotinia sclerotiorum 1980 - Private
Botryotinia fuckeliana T4 - Private
Tuber melanosporum - Private
Magnaporthe grisea 70-15
Hypocrea jecorina – Private (2008)
Gibberella zeae - Private
Fusarium verticillioides 7600 - Private
Nectria haematococca mpVI - Private
Fusarium oxysporum lycopersici - Private
Cryphonectria parasitica EP155 v1 - Private
Neurospora crassa OR74A
Chaetomium globosum CBS 148.51 - Private
Podospora anserina – Private (2008)
Schizosaccharomyces pombe 972h-
Schizosaccharomyces japonicus yFS275 - Private
Fungal Genome Crunching: >35 Private (Consortia + Extra) and/or 15 Public @ www.cazy.org
Cryptococcus neoformans H99 - Private
Cryptococcus neoformans var. neoformans JEC21
Postia placenta Mad-698-R - Private
Phanerochaete chrysosporium – Private (2004)
Laccaria bicolor – Private (2008)
Coprinopsis cinerea- Private
Melampsora laricis-populina - Private
Puccinia graminis f. tritici - Private
Ustilago maydis - Private
Malassezia globosa CBS 7966 – Private
Rhizopus oryzae RA 99-880 – Private
Batrachochytrium dendrobatidis JAM81 – Private
Encephalitozoon cuniculi GB-M1










Orthologous Distance Fungal CAZymes
(Preliminary Results)
« Rusts »
Host–Rust Parasite Interaction
• The interaction between rust and host is initiated on the external surface.
• The haustorial mother cell produces a narrow peg that penetrates the host cell wall.
• Pathogen-secreted molecules inside the host cell suppress host defence and enhance susceptibility.
Maheshwari R. The scourge of mankind: From ancient time into the genomic era. Current Science. 2007 (9) 1249-1256.
Infection
• Upon penetration of the plant cell wall by enzymatic dissolution, a haustorium is formed in the periplasmic space of the host cell.
• The interface between the plant and fungal cytoplasm consists of:
  • A gel-like layer consisting of carbohydrates (extrahaustorial matrix)
  • Extrahaustorial membrane, derived from the plant cell wall
• The haustorium is directly connected to the mother cell so that nutrients can be transported from the plant cell to the developing fungal hyphae.
Leonard KL and Szabo LJ. Molecular Plant Pathology (2005). 6 (2), 99-111
M_lari vs Fungal GHs : Highlights

GH   Target S_cere A_nige A_oryz B_fuck T_mela M_gris H_jeco G_zeae P_anse S_pomb C_neof P_chry L_bico C_cine M_lari P_gram U_mayd M_glob
1    PCW         0      3      3      3      2      2      2      3      1      0      0      2      0      2      0      0      0      0
2    PCW         0      6      7      2      2      6      7     10      7      0      0      2      2      2      4     10      1      0
3    PCW         0     17     23     16      6     19     13     22     11      1      7     11      2      7      3      2      3      1
5    CW          5     10     13     15      6     13     11     15     15      3     10     20     22     27     30     27     12      6
7    PCW         0      2      3      2      0      6      2      2      6      0      0      9      0      7      8      8      0      0
10   PCW         0      1      4      2      1      5      1      5      8      0      0      6      0      5      6      5      2      0
11   PCW         0      4      4      3      0      5      4      3      6      0      0      1      0      6      0      0      1      0
12   PCW         0      4      4      4      1      3      2      4      2      0      0      2      3      1     10      3      0      0
13   Gly         8     18     17     10      8     10      5      8      9     12     10      9      8      9      8      5      6      0
15   Gly         1      2      3      4      1      2      2      3      3      2      2      2      2      4      4      3      1      0
16   FCW         5     13     13     21      7     16     16     21     12      3     12     23     31     32     11      9     21      7
17   FCW         4      5      5      6      4      7      4      6      4      1      1      1      3      3      1      1      2      0
18   FCW         2     14     18     10      5     14     20     19     20      1      4     11     10      9     15     17      3      1
20   FCW         0      3      3      1      2      2      3      2      1      0      1      3      2      2      3      2      2      0
26   ?           0      1      1      2      0      0      0      0      1      0      0      0      0      0      5      5      0      0
27   ?           0      4      3      4      0      4      8      2      2      1      0      3      1      0      7     12      1      0
28   PCW         1     21     20     18      2      3      4      6      0      0      1      4      6      3      3      1      1      0
32   Suc         1      6      4      1      1      5      0      5      0      2      1      0      0      0      2      2      2      0
43   PCW         0     10     20      4      1     19      2     17     10      0      0      4      0      4      8      2      4      1
47   FCW         3      5      5      8      5      6      8     10      9      2      3      6      9      8     14     14      3      2
51   PCW         0      4      3      3      0      3      0      2      1      0      1      2      0      1      3      0      2      0
61   CW          0      7      8      9      4     17      3     15     33      0      1     15      8     33      2      3      0      0
78   PCW         0      8      8      8      2      1      1      7      1      0      3      1      0      0      0      0      0      0

GH88 and GH105 (both PCW), 36 values as extracted, split between the two rows unclear:
0 0 1 2 3 2 1 1 0 0 1 3 0 1 1 3 0 0 0 0 2 1 1 0 2 0 1 1 0 1 0 0 0 1 0 0

S marks: three among GH1 to GH20, five among GH26 to GH105 (positions lost).

• Low Plant Cell-Wall (PCW) saccharification (S) capacity (GH1, 3, 43, 78…)
• Original combination of high GH7, 10, 12 but absent GH11
• Large number of GH26, 27 but unknown specificity (extrahaustorial matrix?)
• Capacity to saccharify sucrose (GH32) that is absent from PCW-saccharifying fungi
• Normal FCW-aiming enzymes but probably large set in CW-targeting family GH5
• Differences w/ P_gram may reflect host specificity (Dicot/Monocot?)
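Two of the highlights above (absent families, and differences with P_gram) can be checked directly against the counts. The sketch below assumes the M_lari and P_gram values are the 15th and 16th entries of each GH row in the table above. GH88 and GH105 are left out because their values could not be assigned with confidence. The function names and the `min_diff` threshold are hypothetical choices for illustration.

```python
# (M_lari, P_gram) counts per GH family, as read from the GH table above.
# Assumption: M_lari and P_gram are the 15th and 16th values of each row.
GH_COUNTS = {
    1: (0, 0), 2: (4, 10), 3: (3, 2), 5: (30, 27), 7: (8, 8),
    10: (6, 5), 11: (0, 0), 12: (10, 3), 13: (8, 5), 15: (4, 3),
    16: (11, 9), 17: (1, 1), 18: (15, 17), 20: (3, 2), 26: (5, 5),
    27: (7, 12), 28: (3, 1), 32: (2, 2), 43: (8, 2), 47: (14, 14),
    51: (3, 0), 61: (2, 3), 78: (0, 0),
}


def absent_in_m_lari(counts):
    """GH families with no member in M_lari."""
    return sorted(fam for fam, (m_lari, _) in counts.items() if m_lari == 0)


def differing_families(counts, min_diff=5):
    """GH families whose M_lari and P_gram counts differ by >= min_diff.

    min_diff is an arbitrary threshold chosen for this sketch.
    """
    return sorted(fam for fam, (m_lari, p_gram) in counts.items()
                  if abs(m_lari - p_gram) >= min_diff)


if __name__ == "__main__":
    print("Absent in M_lari:", absent_in_m_lari(GH_COUNTS))
    print("M_lari vs P_gram:", differing_families(GH_COUNTS))
```

Under these assumptions GH11 comes out absent while GH7, GH10 and GH12 are present, which matches the "high GH7, 10, 12 but absent GH11" highlight.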
M_lari vs Fungal CBMs : Highlights

CBM  Target S_cere A_nige A_oryz B_fuck T_mela M_gris H_jeco G_zeae P_anse S_pomb C_neof P_chry L_bico C_cine M_lari P_gram U_mayd M_glob
1    PCW         0      8      3     18      1     22     15     12     30      2      0     31      1     46      0      0      0      0
12   FCW         0      0      0      0      0      0      0      0      0      0      0      0      1      1      1      1      0      0
13   PCW         0      1      2      1      0      0      3      2      0      0      5      5     10     24      0      0      0      0
18   FCW         2     13      5     16     16     33      8     34     30      0      1      1      1      1      0      0      2      0
19   FCW         1      0      1      0      1      0      0      0      0      0      0      0      1      2      5      3      0      0

• No CBMs aiming at Plant Cell-Wall (PCW)
• Few CBMs aiming at Fungal Cell-Wall (FCW)
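The two statements above follow from summing the CBM counts by target. A minimal sketch of that arithmetic is given below, assuming the values are read from the CBM table above with M_lari, P_gram and C_cine as the 15th, 16th and 14th columns. `totals_by_target` is a hypothetical helper name.

```python
# Target assigned to each CBM family in the table above.
CBM_TARGET = {1: "PCW", 12: "FCW", 13: "PCW", 18: "FCW", 19: "FCW"}

# CBM family -> count, read from the table above (three columns only).
CBM_COUNTS = {
    "M_lari": {1: 0, 12: 1, 13: 0, 18: 0, 19: 5},
    "P_gram": {1: 0, 12: 1, 13: 0, 18: 0, 19: 3},
    "C_cine": {1: 46, 12: 1, 13: 24, 18: 1, 19: 2},
}


def totals_by_target(counts):
    """Sum CBM counts per target (PCW or FCW) for one genome."""
    totals = {}
    for family, n in counts.items():
        target = CBM_TARGET[family]
        totals[target] = totals.get(target, 0) + n
    return totals


if __name__ == "__main__":
    for genome, counts in CBM_COUNTS.items():
        print(genome, totals_by_target(counts))
```

C_cine is included only as a contrast: a genome with many PCW-targeting CBMs next to the two rusts, which have none.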
M_lari : Main CAZy Conclusions
• An original distribution of CAZymes, mostly shared with P_gram (differences may relate to the host)
• Sufficient degrading GH + PL (not shown) enzymes to perforate the Plant Cell Wall and form the Haustorium, but not for its saccharification
• GH32 invertases present to saccharify Sucrose (like P_gram and U_mayd)
• Open Question: Are some enzymes present to destroy oligosaccharide elicitors (resulting from FCW-degradation by plant enzymes) and diminish the plant response?

CAZy - Team & Funding
• Bernard Henrissat (DR1)
• Pedro Coutinho (PR2)
• Brandi Cantarel (Post-Doc)
• Corinne Rancurel (IE - Bioinformatics)
• Vincent Lombard (IE - DB Expert)
• Thomas Bernard (PhD Student) (2008)
Centre National de la Recherche Scientifique
Aix-Marseille Universités
ANR-PNRB: E-TriCel
© Coutinho & Henrissat, 2008