An AFLP-marker based linkage map of Heterobasidion

Supporting Information Notes S1–S7
1. Simple sequence repeats and transposable elements
1. Simple sequence repeats identification and characterization
2. Transposable element identification and characterization
2. The mitochondrial genome annotation and analysis
3. Targeted annotation of specific gene families
1. Cerato-platanin family protein
2. Oxidative enzymes in the ROS gene network
3. Lignin peroxidases
4. Copper Radical Oxidases
5. Transporters
6. Peptidases
7. Signal transduction pathways
8. Transcription Factors
4. The mating incompatibility locus (MAT)
5. Wood degradation, enzyme content, expression and growth
1. Wood degradation Heterobasidion genome project outline
6. Pathogenicity
1. Natural product genes in the H. irregulare genome
2. Genes differentially regulated in interactions between H. irregulare and Pine
3. Anchoring pathogenicity QTLs to the genome sequence
7. Trade-off
8. References
Notes S1 Simple sequence repeats and transposable elements
1.1. Simple sequence repeats identification and characterisation
The whole genome was searched for SSRs using SciRoKo (Kofler et al., 2007) with the
‘perfect MISA-mode’ search set at a minimum number of repeats of 14, 7, 5, 4, 4, 4 for
mono-, di-, tri-, tetra-, penta-, and hexanucleotides, respectively.
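The repeat thresholds above can be illustrated with a minimal sketch. This is not SciRoKo itself but a regular-expression approximation of a perfect-repeat search; the function name `find_perfect_ssrs` and the rule of discarding motifs that are themselves repeats of a shorter unit are assumptions made for illustration.

```python
import re

# Minimum number of repeats per motif length (mono- to hexanucleotide),
# taken from the thresholds stated in the text.
MIN_REPEATS = {1: 14, 2: 7, 3: 5, 4: 4, 5: 4, 6: 4}


def find_perfect_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit, min_rep in MIN_REPEATS.items():
        # One motif of `unit` bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Assumption: skip motifs that are repeats of a shorter unit,
            # e.g. "AA" is reported as a mononucleotide repeat only.
            if any(
                motif == motif[:k] * (unit // k)
                for k in range(1, unit)
                if unit % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit))
    return sorted(hits)
```

For example, a run of six ACG units or of fourteen A bases is reported, whereas thirteen A bases fall below the mononucleotide threshold.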
The distribution of SSRs was further described by manual annotation of all microsatellites
present in the 10 largest scaffolds, representing 75 % of the genome. SSRs were scored as
present or absent in the following genomic regions within ORFs: exons, introns, 3’UTRs and
5’UTRs. Since gene expression is reported to be controlled by promoters located upstream of ORFs
(Abeel et al., 2008), SSRs present within 50 bp and within 50-500 bp
upstream or downstream of ORFs were also annotated manually. Finally, all SSRs located further than 500
bp from ORFs were also annotated. Manual annotation was conducted using all JGI filtered
models. The frequency of SSRs was standardized based on the percentage of each of the
above genomic regions as in the Frozen Catalog 090414.
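The standardization described above amounts to dividing the SSR count in each genomic region by the size of that region. A minimal sketch follows, assuming density is expressed as number per Mb; the function name and the counts and region sizes are made up for illustration and are not values from this study.

```python
def ssr_density_per_mb(ssr_counts, region_bp):
    """SSR density (number per Mb) for each genomic region."""
    return {
        region: ssr_counts[region] / (region_bp[region] / 1_000_000)
        for region in ssr_counts
    }


# Hypothetical counts and region sizes, for illustration only.
counts = {"exon": 30, "intron": 12}
sizes = {"exon": 15_000_000, "intron": 4_000_000}
densities = ssr_density_per_mb(counts, sizes)
```

With these invented numbers the intron class has the higher density even though it contains fewer SSRs, which is why raw counts alone are not compared across regions.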
The total number of perfect SSRs found was 2541, and they comprised about 0.0017% of the
genome. There is approximately one microsatellite per 13 kb and the total number of SSRs in
intergenic (n=1372) and intragenic regions (n=1169) is similar. Density of SSRs (number/Mb)
in the ten largest scaffolds was highest in 3’UTR, followed by regions located more than 500
bp from ORFs, regions 50 bp upstream ORFs, 5’UTR, 50 bp downstream ORFs, 50-500 bp
downstream ORFs, introns, 50-500 bp upstream ORFs and exons (Fig. S4).
The most frequent SSRs in exons are trinucleotides followed by hexanucleotides, while
tetranucleotides are dominant in introns (Fig. S4). Trinucleotides are also clearly dominant in
5’UTRs, within 50 bp before ORFs and in 50-500 bp upstream ORFs. Conversely,
tetranucleotides are more frequent than other SSRs in the genome fraction located more than
500 bp from ORFs. Densities of tri- and tetranucleotides are similar in 3’UTR and 50 bp
downstream ORFs. Dinucleotides are present at extremely low frequencies both in exons and
within 50 bp before ORFs. The total number and density of SSRs are higher in 3’ than in 5’UTRs.
Overall, the highest concentration of trinucleotides is found in 5’UTRs and within 50 bp
before ORFs (Fig. S4).
The most frequent perfect fully standardized motifs are ACG, CCG and AGC, both in
intergenic and intragenic regions. Some repeats frequently reported in fungi (e.g., AT, ATC)
(Toth et al., 2000) are rare in the H. irregulare genome (Table S9). On the other hand, H.
irregulare harbours a significant component of repeats absent or rare in fungi (e.g., CG, ACG)
(Table S9).
Although the density of SSRs is comparable between introns and exons, there appears to be a
clear selection in favour of trinucleotides and hexanucleotides in the exonic coding regions
that is absent in introns. Such dominance of triplets over other repeats in coding regions may
be explained by low tolerance for frameshift mutations (Metzgar et al., 2000). In parallel, it
could be argued that dominance of trirepeats and selection against other repeat numbers (in
particular di- and tetrarepeats) in other regions of the genome may be indicative of a delicate
role played by SSRs in gene regulation, for instance by altering the abundance of nuclear
protein binding sites as shown by changes in number of CCG repeats in humans (Richards et
al., 1993, Stallings, 1994). In this light, the overall low number of SSRs, the clear dominance
of trinucleotides in both the 5’UTR and the 50 bp upstream ORFs, and the low representation
of dinucleotides in the 50 bp upstream ORFs in the H. irregulare genome suggest a history
of negative selection against those SSRs (e.g. dinucleotides) that are more likely to disrupt the
functions played by these regions. We identified a dominance of tetranucleotides over
trinucleotides in intronic non-coding regions. This finding is surprising since this dominance
in introns is reported only for vertebrates (Toth et al., 2000, Mun et al., 2006, Lawson &
Zhang, 2006). The parallel finding that tetranucleotides are dominant in regions further than
500 bp from ORFs is suggestive that these regions play a lesser role in the regulation of gene
expression when compared to regions closer to ORFs. The lack of dominance of
trinucleotides was also observed in the 3’UTR (in contrast with the absolute dominance of
these SSRs in 5’UTRs) and downstream of ORFs. Dominance of trirepeats in the 5’UTR has
been generally reported for both animals and plants (Li et al., 2004). However, the H.
irregulare genome is characterized by a higher number and density of SSRs in the 3’UTR, a
trait commonly associated mostly with animals and not plants (Li et al., 2004, Lawson &
Zhang, 2006).
The most abundant triplet in the H. irregulare genome was ACG, a motif rarely found in
genomes of other organisms, including fungi (Toth et al., 2000). We also found ACG repeats
in introns, despite the fact they are reported as absolutely absent in fungi and several other
groups of organisms (Li et al., 2004). CG, a dinucleotide notoriously underrepresented in
most organisms and not reported for other fungi, was also detected.
In conclusion, this is one of the first reports linking the presence and type of SSRs with the
regulatory function of DNA regions immediately upstream of ORFs. This study also provides
an example from the fungi of a specific selection process almost exclusively in favour of
trinucleotides in the 5’UTR, a well-known mechanism in other groups of organisms.
Conversely, frequency of trinucleotides decreases further away from ORFs, and
tetranucleotides dominate in regions further than 500 bp from ORFs, suggesting a loss of
constraint that may be expected in regions less directly involved in the regulation of gene
expression. A similar lack of constraint is exemplified by the dominance of tetranucleotides
detected in introns of the H. irregulare genome.
1.2. Transposable element identification and characterization
RepeatScout (Price et al., 2005) was used for de novo identification of repetitive DNA in the
H. irregulare genome assembly. The default parameters (with l=15) were used. RepeatScout
generated 1,082 consensus sequences. This library was then filtered as follows: 1) all
sequences shorter than 100 bp were eliminated; 2) low-complexity repeats and tandem repeats
were removed as part of the RepeatScout algorithm using Nseg (Wootton & Federhen, 1996)
and TRF (Benson, 1999); 3) repeats with fewer than 5 copies in the genome were removed;
and 4) repeats with significant hits to known proteins in UniProt (The UniProt Consortium,
2008), other than proteins known to belong to TEs, were removed. The classification of the 272
consensus sequences remaining was conducted using the pipeline REPCLASS (Feschotte et
al., 2009). The elements were annotated manually using the REPCLASS classification and a
tBLASTx search (Altschul et al., 1990) against RepBase. TEs belonging to Class I and
Class II as defined by Wicker and colleagues (2007) were identified (Fig. S1). Gypsy-like
elements were the most frequent TEs, corresponding to 9.28 % of the H. irregulare assembly.
Class II TIR elements were the second most frequent classified elements (1.05 %). A further 3.67 % of the
genome was masked by repeated elements belonging to unknown families (Fig. S1).
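The filtering of the RepeatScout library can be sketched as a simple predicate applied to each consensus sequence. The `Consensus` record and its field names are hypothetical and were introduced for this example; step 2 (low-complexity and tandem repeat removal) is assumed to have been done by RepeatScout itself and is therefore not repeated here.

```python
from dataclasses import dataclass


@dataclass
class Consensus:
    name: str
    length: int         # consensus length in bp
    copies: int         # copy number in the genome assembly
    protein_hit: bool   # significant hit to a known UniProt protein
    hit_is_te: bool     # that hit is a known TE protein


def keep(c, min_len=100, min_copies=5):
    """Apply filters 1, 3 and 4 described in the text."""
    if c.length < min_len:        # 1) too short
        return False
    if c.copies < min_copies:     # 3) too few copies
        return False
    if c.protein_hit and not c.hit_is_te:   # 4) likely a host gene family
        return False
    return True


def filter_library(library):
    return [c for c in library if keep(c)]
```

Keeping sequences whose protein hits are TE proteins matters because otherwise the filter against known proteins would also discard genuine transposon-derived repeats.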
To identify full length LTR retrotransposons, a second de novo search was performed with
LTR_STRUC (McCarthy & McDonald, 2003). The program yielded 116 full-length
candidate LTR retrotransposon sequences, which were checked for their homology using the
BLASTN algorithm (Altschul et al., 1990) against the sequences coming from the RepBase
database. Among the 116 putative full length LTRs, 90 were attributed to Gypsy/Ty3-like
elements and 17 to Copia/Ty1-like elements. Nine other elements either showed no significant
homology with known TE families or were homologous to non-LTR retrotransposons;
these sequences were excluded from further analyses. The insertion age of full-length
LTR retrotransposons was determined from the evolutionary distance between the 5’ and 3’ LTRs, derived
from a ClustalW (Thompson et al., 1994) alignment of the two LTR sequences using the
Kimura correction. To convert the sequence distance to a putative insertion age, a
substitution rate of 1.3 × 10^-8 mutations per site per year was used (Ma & Bennetzen, 2004).
H. irregulare underwent a recent burst of activity which peaks at an estimated 0.2 Mya, preceded by a
gradual increase starting 2 Mya. An older period of activity at 4-8 Mya could also be detected.
The decrease between 12 and 8 Mya probably reflects element deterioration leading to loss of
the ability to detect these elements (Fig. S3).
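The dating procedure can be sketched as follows. Two assumptions are made that the text does not state explicitly: that the "Kimura correction" is the Kimura two-parameter distance, and that age is obtained as T = K / (2r), the usual convention when both LTRs of an element diverge independently after insertion. The function names are illustrative.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(ltr5, ltr3):
    """Kimura two-parameter distance between two aligned LTR sequences."""
    transitions = transversions = sites = 0
    for a, b in zip(ltr5.upper(), ltr3.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip alignment gaps and ambiguous bases
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    p = transitions / sites
    q = transversions / sites
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def insertion_age_years(distance, rate=1.3e-8):
    """Convert LTR divergence to age; rate is per site per year."""
    # Both LTRs accumulate substitutions, hence the factor of two.
    return distance / (2 * rate)
```

Under these assumptions, an LTR pair with a corrected distance of about 0.0052 would date to roughly 0.2 Mya, the position of the recent activity peak.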
The number of TE occurrences and the percent of genome coverage were identified by
masking the H. irregulare genome assembly using RepeatMasker (Smit et al., 1996;
www.repeatmasker.org). Of the 379 consensus sequences found, 272 came from the
RepeatScout/REPCLASS pipeline, and the 90 Gypsy/Ty3-like and 17
Copia/Ty1-like full-length elements were identified by LTR_STRUC. RepeatMasker masked
16.21 % of the H. irregulare genome assembly. Identified TEs are not uniformly distributed
across the genome (χ² test, P < 0.05), but are clustered in gene-poor regions (Fig. S2).
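A test of uniform TE distribution can be sketched as a chi-square goodness-of-fit test on TE counts per genomic window. How the genome was binned is not stated in the text, so the windows, the counts and the function name below are assumptions for illustration only.

```python
def chi_square_uniform(te_counts, window_bp):
    """Chi-square statistic and degrees of freedom against a uniform model.

    Expected counts are proportional to window length.
    """
    total_te = sum(te_counts)
    total_bp = sum(window_bp)
    stat = 0.0
    for observed, bp in zip(te_counts, window_bp):
        expected = total_te * bp / total_bp
        stat += (observed - expected) ** 2 / expected
    return stat, len(te_counts) - 1


# Critical value of the chi-square distribution for 3 d.f. at alpha = 0.05.
CRITICAL_0_05_DF3 = 7.815

# Hypothetical TE counts in four equally sized windows.
stat, dof = chi_square_uniform([30, 5, 3, 2], [1_000_000] * 4)
clustered = stat > CRITICAL_0_05_DF3
```

A statistic above the critical value rejects uniformity; it does not by itself show where the TEs cluster, which is why the comparison with gene density is made separately.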
Notes S2 The mitochondrial genome annotation and analysis
The mitochondrial genome (mt-genome) of H. irregulare, TC32-1, comprises 114193 bp and
has a circular structure and a mean GC-content of 22.8%. Open reading frames (ORFs) longer
than 150 bp were identified using ORFfinder, codon usage table 4
(http://www.ncbi.nlm.nih.gov/projects/gorf/). The ORFs that were 300 bp or longer were used
to search the non-redundant NCBI database using BLAST in order to find conserved genes.
Exon/intron boundaries were located by means of CLUSTALW alignment with homologous
genes from other fungal species. The 300 bp or longer ORFs with no significant hits were
considered as non-conserved putative genes. The small and large ribosomal RNA (rns, rnl
rRNA) genes were located by BLASTing the homologous genes from closely related species
to the mt-genome of H. irregulare. The program tRNAscan-SE was used to identify the tRNA
regions.
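The effect of using codon table 4 for ORF detection can be shown with a minimal sketch. In that table TGA encodes tryptophan rather than a stop, so only TAA and TAG terminate an ORF. This is not ORFfinder; it is a simplified scan that accepts ATG as the only start codon (the real table also lists alternative starts), and the function names are assumptions.

```python
# Stop codons under NCBI translation table 4 (TGA encodes Trp).
STOPS_TABLE4 = {"TAA", "TAG"}


def revcomp(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_orfs(seq, min_len=150):
    """Return (strand, start, end, length) for ATG-to-stop ORFs.

    Coordinates are 0-based, end-exclusive, on the strand scanned.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOPS_TABLE4:
                    length = i + 3 - start
                    if length >= min_len:
                        orfs.append((strand, start, i + 3, length))
                    start = None
    return orfs
```

A sequence such as ATG followed by fifty TGA codons and a TAA is read as one 156 bp ORF here, whereas under the standard nuclear code it would terminate at the first TGA.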
Of the 15 protein-coding genes identified, 14 are involved in energy production: seven genes
encode subunits of the NADH dehydrogenase complex (nad1, nad2, nad3, nad4, nad4L, nad5,
nad6), one gene a subunit of the cytochrome bc1 complex (cob), three genes subunits of the cytochrome c
oxidase complex (cox1, cox2, cox3) and three genes subunits of the ATP synthase complex (atp6, atp8,
atp9) (Table S10). In addition, the mt-genome includes a ribosomal small subunit protein 3
(rps3) gene and one extra partial nad2 gene. The rns and rnl rRNA genes and 25 tRNAs were
identified. Two of the ORFs found are weakly similar to each other (E-value 6e-17) and were
annotated as putative plasmid genes (Ppl1 and Ppl2), since they have low-similarity hits to
hypothetical plasmid proteins from P. ostreatus and Moniliophthora perniciosa (E-values 3e-12
and 3e-7). Next to one of the putative plasmid genes are two putative pseudo B-type DNA
polymerase genes (PSdpo1 and PSdpo2). These polymerases are commonly found in
mitochondrial plasmids and sometimes also in mt-genomes. The six non-conserved
hypothetical genes found (NC-ORF1-6) have open reading frames longer than 100 amino
acids, and in five of these InterProScan found transmembrane regions. Four of the
non-conserved hypothetical genes are located adjacent to each other and the other two are both
next to one of the two putative plasmid genes. None of the hypothetical genes shows any
similarity to the others.
A total of 24 group I introns were identified in the mt genes: nine in cox1, two in cox2,
two in cox3, seven in cob, two in nad1, one in nad5 and one in rnl. In ten of these introns, 14
intronic genes were found. Out of these 14 intronic genes, 10 were found in the introns of
cox1, one in cox3, two in cob and one in rnl. There are as many as three intronic genes in
intron four of cox1 and also one putative pseudo-intronic gene. These intronic genes are
conserved homing endonuclease genes (HEGs) with two different kinds of motifs: The
LAGLIDADG motif and the GIY-YIG motif. HEGs are known to invade group I introns and
promote mobility of the introns. Some of these HEGs are also maturases that assist in intron
folding and thereby also in intron splicing.
Notes S3 Targeted annotation of specific gene families
3.1. Cerato-platanin family protein
The three members of the H. irregulare CP family were identified by recursive tBLASTn
searches, initially using the sequences of the Ceratocystis platani CP members as queries
(Comparini et al., 2009) (GenBank accession numbers EF017218.1 and AJ311644), the
sequence of the CP paralog of Ceratocystis fimbriata isolate Cf 4 CF-MANG protein
(EF017221) and the cerato-populin gene from Ceratocystis populicola isolate Cf 2
(EF017219) (Comparini et al., 2009). The ORFs identified were designated CP genes and
used as queries for further searches. This process was repeated until no new CP genes were
recovered. The H. irregulare proteins were found in two separate clades when compared to 77
fungal CP proteins (Fig. S11).
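The recursive search strategy, in which each newly identified gene is used as a query until no new genes are recovered, can be sketched as a simple worklist loop. The `search` argument is a hypothetical caller-supplied function (for instance a wrapper around a tBLASTn run) that maps one query ID to the IDs of its significant hits; no real BLAST API is implied.

```python
def recursive_search(seed_queries, search):
    """Repeat searches until no new hits are recovered.

    `search` maps a query ID to an iterable of hit IDs.
    """
    found = set()
    searched = set()
    queue = list(seed_queries)
    while queue:
        query = queue.pop()
        if query in searched:
            continue
        searched.add(query)
        for hit in search(query):
            if hit not in found:
                found.add(hit)
                queue.append(hit)  # every new hit becomes a query in turn
    return found
```

Tracking which queries have already been run guarantees termination, since the set of sequences in a genome is finite and no query is repeated.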
3.2. Oxidative enzymes in the ROS gene network
Handling massive reactive oxygen species (ROS) production is required for pathogenicity in
Magnaporthe oryzae (Egan et al., 2007) or for the mutualistic relationship between Epichloe
festucae and perennial ryegrass (Tanaka et al., 2008). Prevention of ROS toxicity and control
of ROS signalling require a large gene network of at least 150 genes in Arabidopsis, named
the “ROS gene network” (Mittler et al., 2004). Within this network, H. irregulare possesses 5
peroxiredoxins, 3 catalases and 5 haloperoxidases, numbers comparable to those identified in
other fungal genomes (Table S11). Notable is the absence of Alkylhydroperoxidase D-like
and of Glutathione peroxidase, usually detected in fungi.
Class I peroxidases are found in all living organisms and are members of the ROS network.
Cytochrome C peroxidases (CcP) are found in mitochondria: they play a major role in the
control of H2O2 concentrations. One CcP sequence and four hybrid sequences divergent from
other fungal sequences were detected in H. irregulare. For the production of ROS, fungi may
also use NADPH oxidase homologues (NOx) and ferric reductase (FRe) (Gessler et al., 2007)
with distinct functions: NOx are necessary for superoxide generation during developmental
processes, whereas FRe are involved in metal reduction required to acquire iron from the
infected host. As expected, NOxA and B were detected in H. irregulare. Surprisingly, seven
FRe-encoding sequences, probably resulting from several recent duplications, were found. Iron
uptake is required for virulence, resistance to oxidative stress, asexual/sexual development,
and iron storage (Johnson, 2008). The high number of FRe copies could be associated with
the pathogenic capacity of H. irregulare.
3.3. Lignin peroxidases
In the genome of H. irregulare, six sequences containing the manganese peroxidase (MnP)
characteristic residues (ExxxE and D) were detected, along with one that shows homology to MnP but
lacks the specific residues. Other plant pathogens in the Russulales (such as different species
of the genera Amylostereum, Echinodontium and Heterobasidion) also contained several MnPs,
based on cDNA sequencing. The putative MnPs from H. irregulare did not cluster directly
with the major MnP clades composed of Polyporales species (Fig. S10).
3.4. Copper radical oxidases
Glyoxal oxidase (GLX) is a copper radical oxidase (CRO), with wide substrate specificity for
oxidizing simple aldehydes, such as glyoxal and methylglyoxal, to the corresponding
carboxylic acids (Whittaker et al., 1996). These substrates are found in ligninolytic cultures,
suggesting a role as physiological substrates for GLX. GLX also has been implicated in the
regulation of peroxidase activity, and is activated in vitro by lignin peroxidase (Kersten, 1990;
Kersten & Kirk, 1987). Based on similarities to the galactose oxidase from Dactylium
dendroides, the active site of GLX has been identified, and includes Tyr377, His378, Tyr135,
Cys70 and His471 (Ito et al., 1991; Kersten & Cullen, 1993; Whittaker et al., 1996). These
residues are conserved in more recently identified CRO genes, including 6 in P.
chrysosporium (Vanden Wymelenberg et al., 2006) and 3 in Postia placenta (Martinez et al.,
2009). Five putative copper radical oxidases have been identified in the H. irregulare
genome.
BLAST analysis of the H. irregulare genome identified 5 sequences with significant
similarity to P. chrysosporium glx1 or structurally related copper radical oxidases (Martinez
et al., 2004). All sequences feature predicted secretion signals (SignalP v3.0,
www.cbs.dtu.dk/services/SignalP/). Deduced, mature proteins had identities ranging from
29% to 47% compared to the P. chrysosporium glx1. Multiple alignments identified
conserved residues constituting the Cu-coordinating active site of GLX (Tyr135, Tyr377,
His378, His471) (Whittaker et al., 1996; Whittaker et al., 1999). In addition, a cysteine cross-linked with Tyr135 forms the radical redox site, and is also conserved in all six sequences
(Cys70). Thus, based on structure, these proteins are likely copper radical oxidases.
3.5. Transporters
Transporter proteins were classified with the aid of the system for membrane transporter
classification (http://www.tcdb.org) and further refined manually. The H. irregulare genome
contains 499 transporter gene models, a number comparable to that of other basidiomycete fungi (Table
S13). The largest number of proteins was found among the secondary transporters, with the
major facilitator superfamily being the largest single family.
3.6. Peptidases
Automatically predicted proteinase genes and functions in the H. irregulare genome were
further refined by manual curation using web-based tools at http://merops.sanger.ac.uk/
(Rawlings et al., 2008). The predicted proteinases were categorized into different groups
(Table S14). The presence of distinct members was compared with other basidiomycetes
(Table S15).
3.7. Signal transduction pathways
Five signal transduction pathways were investigated in H. irregulare using the
well-characterized S. cerevisiae genes as probes for the bioinformatic analysis. The five pathways
are: Fus3/Kss1 pheromone pathway, Hog1 osmostress pathway, Mpk1 cell integrity pathway,
the calcium/calcineurin signaling pathway and the cAMP pathway (Gustin et al., 1998,
Rispail et al., 2009).
Fus3/Kss1 pheromone pathway
The pheromone pathway in S. cerevisiae is triggered by the binding of pheromone to the
cognate receptors Ste2p and Ste3p. No Ste2 homologue could be found in the H. irregulare
genome. It has been documented that basidiomycetes (e.g. U. maydis and C. neoformans) lack
receptors of this type (Rispail et al., 2009). Ste3p from S. cerevisiae shows similarity to five
proteins (protein IDs 147162, 181128, 171777, 181123 and 147163) that are located in a
cluster on scaffold 7.
The interaction between the pheromone and the receptor leads to the downstream dissociation
of the heterotrimeric G-protein. A Gpa1 homologue exists in H. irregulare (protein ID 33983)
with typical small G protein α domains. There are two other G-proteins with higher E-values
but still with typical G-protein domains (protein IDs 57348 and 31682). The β-subunit and γ-subunit are also present in the genome.
A Cdc24 is present in the H. irregulare genome and possesses the DH-domain and the
pleckstrin-like domain characteristic of guanine nucleotide exchange factors. Furthermore, Cdc42
(activated by Cdc24) is present and shows the typical Ras GTPase domain (Johnson, 1999).
There is a Ste50 homologue in the H. irregulare genome, a protein that functions as an
adaptor between Cdc42-Ste20 and the MAPK Ste11 (Jung et al., 2011). In all basidiomycetes
analyzed (C. cinereus, L. bicolor, P. chrysosporium, H. irregulare, C. neoformans (serotype
A), C. neoformans (serotype B), C. neoformans (serotype D B-3501A), U. maydis, S. roseus)
this protein is longer than in the Ascomycota owing to the presence of an SH3 domain (Fig. S6).
A Ste5 homologue, a scaffold protein that binds the Ste11p, Ste7p, and Fus3p kinases in the central
MAPK cascade module, is absent from H. irregulare and from all basidiomycetes analysed so
far. However, the Cdc42p-activated signal transducing kinase of the PAK (p21-activated
kinase) family Ste20 orthologue is present (Malleshaiah et al., 2010).
A Bem1 orthologue which has a Phox-like domain and an SH3-domain thought to interact
with Ste5-MAPK complex of the pheromone pathway is present. The central MAPK module
of the pheromone pathway in S. cerevisiae is composed of the MAPKKK Ste11, the MAPKK
Ste7 and the MAPK Fus3 (Gustin et al., 1998). Genes encoding homologues of these
proteins are each present in the H. irregulare genome in one copy. Ste11 shows the typical SAM
interaction domain that characterizes all the other Ste11 MAPKKK proteins in the fungi
studied. The SAM domain in the Ste11 protein is thought to interact with the SAM domain in
the upstream interaction protein Ste50 (Grimshaw et al., 2004).
HOG1 osmotolerance pathway
The Sln1 histidine phosphorelay sensor is used by fungi to sense changes in the osmolarity
of the environment (Posas et al., 1996). In the H. irregulare genome there is one copy of a
Sln1-like protein. Five transmembrane domains were found in the H. irregulare Sln1,
which differs from the two transmembrane domains of the Sln1 sensor in S. cerevisiae. There
is a Sho1 orthologue in H. irregulare; this protein shows four transmembrane domains
and an SH3 domain.
The phosphorelay intermediates of the Hog1 osmoregulation pathway, including a Ypd1-like protein and the response regulator Ssk1, are present in H. irregulare (Posas et al., 1996).
In the budding yeast there is another signaling branch related to the hyperosmolarity pathway
which is activated by the association between Sho1 and the signaling mucin protein Msb2.
The search of the H. irregulare genome did not reveal any orthologue of S. cerevisiae
Msb2. The core components of the Hog1 osmoregulation pathway are all present in the H.
irregulare genome. There is one copy of each of the following proteins: Ssk2 MAPKKK,
Pbs2 MAPKK and the Hog1 MAPK (Brewster et al., 1993; Rispail et al., 2009).
The Mpk1 cell integrity pathway
No members of the Wsc or Mid2 cell integrity gene families are present in the H.
irregulare genome. However, the downstream GTPase protein Rho1 is present in one copy.
The Rho1-activating protein Tus1 is missing, while Sac7 and Bem2 are present. Rom1, a
GEF for Rho1, is present and is characterized by a DH domain (GEF activity), a pleckstrin-like domain and a citron-like domain. The core of the cell integrity pathway is again formed
by the MAPKKK Bck1, the MAPKK Mkk1 and the effector MAPK Mpk1 (Gustin et al.,
1998). There are single copies of each of these elements in the H. irregulare genome.
The Ca2+ pathway
There are three membrane proteins involved in Ca2+ metabolism. The large
transmembrane protein Cch1 is present in the H. irregulare genome with 18 transmembrane
domains. A Mid1 orthologue is also present; it has no transmembrane domains since it is
anchored through a GPI anchor (Locke et al., 2000). Calmodulin (a Ca2+-binding
protein) is present and shows the typical EF-hand domain that characterizes many
Ca2+-binding proteins. There are two isoforms of the calcineurin A catalytic subunit. The
regulatory subunit of calcineurin is also present as a single copy (Cyert, 2003).
The Ca2+ pathway is characterized by different types of vacuolar Ca2+ channels. In the H.
irregulare genome there are two copies of the calcium-transporting ATPase Pmr1 and one
copy of Pmc1. Interestingly, the genome contains five paralogues of the vacuolar H+/Ca2+
exchanger Vcx1, which is involved in the control of cytosolic Ca2+ concentration in S.
cerevisiae (Rispail et al., 2009).
The cyclic-AMP-PKA pathway
The cAMP-PKA pathway is composed of several elements that are all present in the H.
irregulare genome (Pukkila-Worley & Alspaugh, 2004; Shemarova, 2009). The adenylate
cyclase (AC) shows the characteristic LRR domain typical of proteins involved in signal
transduction. The phylogenetic analysis shows three distinct clusters of AC sequences that
correspond to the phylogenetic groups Basidiomycota, Ascomycota and Oomycota (Fig. S7). The
AC does not show any transmembrane domain, which differs from the human AC that shows 9
predicted transmembrane domains. There are two sequences related to phosphodiesterase
in the H. irregulare genome. Phosphodiesterase is responsible for reducing the
intracellular concentration of the second messenger cAMP (Ma et al., 1999). The
phosphodiesterase class I (PDE I) is the low-affinity enzyme for cAMP while PDE II
is the high-affinity one. The PKA is a kinase responsible for the downstream regulation and
activation of target proteins in pathways regulated by the cAMP level. The H. irregulare
genome contains both the catalytic and the regulatory subunit of PKA. The regulatory subunit
is characterized by the type II PKA R subunit domain. Finally, the CAP protein (cyclase-associated protein) is also present (Shemarova, 2009).
3.8. Transcription Factors
Out of 11,464 H. irregulare proteins identified, 440 were characterised as putative
transcription factors (TFs). The 440 H. irregulare TFs are distributed among 36 families. Among
these 440 putative TFs, the zinc finger CCHC-type family is the most abundant,
the second most abundant TF family is the C2H2 zinc finger,
and the fungal-specific transcription factor family Zn2Cys6 ranks third
(Fig. S5).
Notes S4 The mating incompatibility locus
Mating incompatibility in heterothallic Agaricomycotina is a post-fusion response controlled
by the mating-type (MAT) loci. Tetrapolar species have two MAT loci, with one locus (MAT-A) encoding homeodomain transcription factors and controlling clamp-cell formation and
conjugate nuclear division, and the other locus (MAT-B) encoding pheromones and
pheromone receptors and controlling nuclear migration and clamp-cell fusion (Brown &
Casselton, 2001). In bipolar basidiomycetes, these processes are regulated by a single locus,
and where it is known, this locus is homologous to MAT-A in tetrapolar species (Aimi et al.,
2005; James et al., 2006). H. irregulare is a bipolar species in an order of Agaricomycetes
(Russulales) in which the genomic structure of MAT has never been investigated. The species
has multiple mating types, forms clamp connections after heterokaryotization, but has
multinucleate rather than dikaryotic cells in the secondary mycelium (Korhonen & Stenlid,
1998).
Initial queries of the genome identified two distinct regions of the genome apparently
homologous to the MAT loci of other basidiomycetes. The MAT-A gene homologues
(homeodomain genes) were found on scaffold 1, within a genomic region of highly conserved
gene composition similar to other Agaricomycetes (Fig. S9) (James, 2007; Niculita-Hirzel et
al., 2008; Ya et al., 2009). The MAT-B gene homologues (G-protein coupled pheromone
receptors) were found on scaffold 5, with 5 receptors encoded in a genomic region of ~20 kb.
These genes, like those of all basidiomycete MAT-B loci, are homologous to the STE3 gene
(a-factor receptor) of S. cerevisiae (Brown & Casselton, 2001). At least three putative
pheromone genes (protein IDs: 181126, 181135, 181139) have also been identified in this
region and another one (protein ID: 181275) about 110 kb upstream of these. These two
genomic scaffolds are unlinked (Fig. 1), and either could represent the MAT locus of H.
irregulare.
In order to test whether either of the two putative mating-type loci found in the genome of H.
irregulare is the MAT locus, a segregation analysis of single basidiospore progeny was
conducted. A mature basidiocarp was taken from a stump of a fallen Pinus resinosa tree in
Stinchfield Woods, Washtenaw County, Michigan. A sample of 24 single basidiospore
progeny was isolated from the basidiocarp, and the mating types of the progeny were
determined by pairings on 0.5% malt extract agar and analysis of heterokaryotization by
clamp-connection formation. The segregation of the putative MAT loci was analyzed by
genotyping two markers: the MIP locus (protein ID: 59033) adjacent to MAT-A homologues
and one of the STE3 pheromone receptor homologues Ha-STE3.3 (protein ID: 181123), by
restriction enzyme analysis of PCR products of the regions (Fig. S9). Comparison of the
segregation of mating type and the two genomic regions demonstrated that the MAT-A
genomic region but not the MAT-B genomic region displays consistent cosegregation with
mating type. Therefore these data provide strong evidence that the region adjacent to MIP
encoding homeodomain genes paired with a novel or highly derived gene (see below) is the
MAT locus of H. irregulare.
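The logic of the cosegregation analysis can be sketched as follows. Mating types and marker genotypes are coded as 0 or 1 per progeny isolate; the data below are invented for illustration and are not the genotypes scored in this study, and the function names are assumptions.

```python
def cosegregation_fraction(mating_types, marker_alleles):
    """Fraction of progeny consistent with the better of the two phases."""
    if len(mating_types) != len(marker_alleles):
        raise ValueError("one marker genotype is needed per progeny isolate")
    n = len(mating_types)
    same = sum(m == a for m, a in zip(mating_types, marker_alleles))
    # Which allele goes with which mating type is arbitrary, so take the
    # better of the two possible phases.
    return max(same, n - same) / n


def chance_of_perfect_cosegregation(n):
    """Probability that an unlinked marker cosegregates perfectly by chance."""
    return 2 * 0.5 ** n


# Hypothetical scores for 24 progeny.
mating = [0, 1] * 12
linked_marker = [0, 1] * 12          # behaves like a marker at the MAT locus
unlinked_marker = [0, 0, 1, 1] * 6   # segregates independently
```

With 24 progeny, perfect cosegregation of an unlinked marker would be expected by chance with a probability of about 1.2 × 10^-7, which is why consistent cosegregation is taken as strong evidence of linkage.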
Because the MAT loci of basidiomycetes are subject to strong balancing selection due to negative
frequency-dependent selection favouring rare mating-type alleles, the coalescence time and therefore
DNA polymorphism of the MAT alleles in a species have been found to be extensive relative to
neutrally evolving genes (May et al., 1999). Ten homokaryotic isolates of H. irregulare are
being sequenced at the MAT-A and MAT-B regions to determine levels of polymorphism.
Preliminary data confirm the cosegregation data because MAT-A sequences are hyperpolymorphic while MAT-B sequences are not.
Most basidiomycete MAT-A loci encode one or more pairs of homeodomain genes arranged in
divergently transcribed pairs (Kües & Casselton, 1992). Each gene pair is comprised of one
member from each of two classes (HD1 and HD2-types) (Hiscock & Kües, 1999). The HD2-type is considered the typical homeobox DNA binding motif, whereas the HD1-type proteins
have an atypical DNA binding motif, and the heterodimerization of the two types is believed
to be crucial for regulating heterokaryosis and sexual development of basidiomycetes (Kües,
2000). The putative MAT locus of H. irregulare is flanked by the genes MIP and βfg as
observed in most Agaricomycetes and contains two HD1-type homeodomain genes paired in
a divergently transcribed arrangement with a gene with very low similarity to other proteins in
sequence databases (Fig. S5). These two genes (herein termed MAT-Aα2 and MAT-Aβ2) are
clearly homologous but their products share only ~50% amino acid similarity over 1/3 of their
length. Similarity is seen at the N-terminal region between amino acids 48-71 and 150-173,
the regions where other mating-type proteins typically encode their allelic specificity. The
second regions of similarity span amino acids 347-365 and 509-524; they contain two nuclear
localization signals (NLS) but are otherwise S/T/P rich. MAT-Aα2 and MAT-Aβ2 are predicted,
with moderate confidence, to localize to the nucleus, based on the presence of 2-3 NLS
in their sequences (Nakai & Horton, 1999). These two proteins are actively
expressed as evidenced by their hybridization to the NimbleGen microarray and the detection
of MAT-Aα2 by EST sequencing. The major characteristic that distinguishes MAT-Aα2 and
MAT-Aβ2 from the typical HD2 class of homeodomain proteins is the apparent absence of
the DNA-binding homeodomain motif that in C. cinerea is essential for HD2 function (Kües,
2000). Currently, the most parsimonious explanation is that the MAT-Aα2 and MAT-Aβ2
proteins are highly derived HD2 proteins that have lost or highly modified their DNA binding
motifs. The only detectable similarity between the MAT-Aα2 and MAT-Aβ2 proteins and any other
protein is a weak similarity between the H. irregulare MAT-Aβ2 protein and the MAT-A2 protein of P.
chrysosporium (protein ID: 7556) in the N-terminal region of the two proteins. This region is
the specificity-determining region in the other characterized basidiomycete homeodomain
genes (Hiscock & Kües, 1999). The organization of the H. irregulare MAT with two pairs of
HD1 genes divergently transcribed with the highly modified gene suggests that one pair arose
from the other by a duplication event.
In summary, these data provide strong evidence that mating type in H. irregulare is controlled
by homeodomain transcription factors in a manner similar to other Agaricomycetes. Thus, the
evolutionary origin of the bipolar mating system is an additional example of convergent
evolution of bipolarity from tetrapolarity wherein the pheromone receptors of the MAT-B
locus have lost function as mating-type specificity determinants, but have maintained their
structure, genomic location, and presumably their function as regulators of sexual development
(James et al., 2006). One novel aspect of the H. irregulare MAT locus is the absence of
proteins with a detectable HD2 DNA binding motif. Future research will be needed to
determine if these proteins perform a similar function as the HD2 proteins in other
Agaricomycetes.
Notes S5 Wood degradation, enzyme content, expression and growth
The CAZy profile of H. irregulare in Glycoside Hydrolases (GHs), Polysaccharide Lyases (PLs),
Carbohydrate Esterases (CEs) and Carbohydrate-Binding Modules (CBMs) was
compared to those of 8 other fungi, including 7 basidiomycetes (U. maydis, P. placenta, P.
chrysosporium, L. bicolor, C. cinerea, S. commune, C. neoformans) and the ascomycete plant
pathogen M. oryzae. The global number of GHs, PLs, CEs and CBMs of H. irregulare lies
close to the median of each category (Table S16). However, these global numbers
overlook important details that appear during a detailed inspection of the GH profile at the
family level. In particular the spectrum of GH families dedicated to plant cell wall
polysaccharide degradation of H. irregulare is almost as complete as that of the saprophytes
S. commune and that of C. cinerea (Table S16). H. irregulare appears to have all the
enzymatic equipment to digest cellulose (enzymes from families GH5, GH6, GH7 and
GH45), xyloglucan and its side chains (GH27, GH29, GH12 and GH74), and pectin and its
side chains (GH28, GH43, GH51, GH53, GH78, GH88, GH105, PL1, PL4, CE8 and CE12).
In contrast, H. irregulare appears to have a limited xylanolytic potential, with only two
xylanases from family GH10 and none from family GH11. The presence of 17 proteins with
CBM1 modules indicates a strong cellulose-binding capacity (Table S16). The number of
enzymes active in pectin degradation was found to be more than twice that in the other
basidiomycetes and the pathogenic M. oryzae (Table S17).
Growth of H. irregulare was analysed on minimal media containing different carbon sources
and compared to growth of four other basidiomycetes, P. chrysosporium, L. bicolor, S.
commune and P. placenta, along with M. oryzae. Growth of H. irregulare on monomeric and
oligomeric sugars was slow compared to growth on some polysaccharides (data not shown).
Growth on sucrose, however, correlated well with the presence of an invertase (GH32) in the
genome of H. irregulare. Good growth was observed on guar gum (a galactomannan similar
in structure to softwood galacto(gluco)mannans), apple pectin and starch, while poor growth
was observed on beechwood xylan (glucuronoarabinoxylans), which correlates well with the
CAZymes identified in the genome.
Gene models putatively involved in lignin degradation in the H. irregulare genome were
characterized and compared to other fungi (Table S12).
5.1. Transcript profiling of wood degradation
Shavings of Pinus sylvestris (L.) sapwood, placed on moist soil and separated from it by a nylon net, were
inoculated with H. irregulare and incubated for ten days at 20 °C and 80% relative humidity.
Colonized wood shavings were frozen in liquid nitrogen and RNA was extracted (see above).
Out of the 305 carbohydrate-active enzymes (CAZymes) present in the H. irregulare genome,
27 are differentially expressed (p<0.05) under wood degradation compared to growth in
liquid medium (Table S18). The carbohydrate-active part of the transcriptome is dominated by
cellulose-degrading glycoside hydrolases in the families GH61, GH5 and GH12,
which may compensate for the relatively low numbers of GH7, GH10 and GH11 enzymes. Pectin,
a constituent of the middle lamellae between pine tracheid cell walls and ray cells, is the
target of pectic lyases, which are highly expressed during pine wood decay. Moreover,
cellulose fibres are protected by hemicellulose, which is degraded by arabinases and xylanases;
these are expressed at intermediate and low levels.
Generally, enzymes involved in oxidative lignocellulose degradation show lower expression
than carbohydrate-hydrolysing enzymes under wood degradation. Manganese
peroxidase and versatile peroxidases are expressed at threefold the level in liquid medium.
Peroxidases require peroxide as an oxidant for their function. It may be provided by three
different GMC-oxidoreductases whose expression is significantly, 15-fold, higher in wood
than in liquid culture (Table S18). Glyoxal oxidase is another producer of H2O2 and is
also upregulated in wood. Laccases are also moderately upregulated. Cellobiose
dehydrogenase, responsible for the oxidation of cellobiose, is moderately upregulated.
The fungal mycelium may connect energy and mineral resources at different physical
locations. In the context of wood decay this is accomplished by active transport of sugars
away from the site of wood degradation and of ammonium into it. Differentially
expressed genes also include those that encode proteins facilitating transport of nutrients (Table
S18).
Notes S6 Pathogenicity
The Plant-Associated Microbe Gene Ontology (PAMGO) consortium has extended the Gene
Ontology (GO), which describes a gene product in terms of its molecular function,
biological process and cellular localisation, to include terms describing processes involving
multi-organism interactions (Ashburner et al., 2000).
Gene product identifiers from M. oryzae and C. albicans annotated with the term GO:0044403
(symbiosis, encompassing mutualism through parasitism) and its child terms were downloaded
from the AmiGO web site (http://amigo.geneontology.org/cgi-bin/amigo/go.cgi). After
manual curation and refinement of the lists, 219 and 216 unique sequences were retrieved from
M. oryzae and C. albicans, respectively (Arnaud et al., 2009, Torto-Alalibo et al., 2009).
Out of the 219 M. oryzae proteins, 116 had at least one hit among the filtered model gene set of
H. irregulare with an E-value below 1e-10. There were 36 gene products that had 2 or more
hits. From C. albicans, 181 gene products had significant homology with
predicted proteins from the H. irregulare genome, and 123 had more than two matches.
Among the genes identified, only 13 show homology with GO:0044403-classified
genes of both M. oryzae and C. albicans.
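The hit-counting step described above can be illustrated with a minimal sketch. It assumes that the similarity searches were exported as rows of (query, subject, E-value) and that the threshold is an E-value of 1e-10; the identifiers, the data layout and the helper names are hypothetical and are not taken from the original analysis.

```python
from collections import defaultdict

E_VALUE_CUTOFF = 1e-10  # assumed threshold; adjust to the search actually used


def hits_per_query(rows, cutoff=E_VALUE_CUTOFF):
    """Map each query protein to the set of subject models with E-value below cutoff."""
    hits = defaultdict(set)
    for query, subject, e_value in rows:
        if e_value < cutoff:
            hits[query].add(subject)
    return dict(hits)


def shared_subjects(hits_a, hits_b):
    """Subject models hit by at least one query from each of the two query sets."""
    subjects_a = set().union(*hits_a.values())
    subjects_b = set().union(*hits_b.values())
    return subjects_a & subjects_b


# Toy rows (hypothetical identifiers): query, H. irregulare model, E-value
magnaporthe_rows = [
    ("Mo_1", "Hi_A", 1e-50),
    ("Mo_1", "Hi_B", 1e-12),
    ("Mo_2", "Hi_C", 1e-3),  # too weak, discarded by the cutoff
]
candida_rows = [
    ("Ca_1", "Hi_A", 1e-30),
    ("Ca_2", "Hi_D", 1e-20),
]

mo_hits = hits_per_query(magnaporthe_rows)
ca_hits = hits_per_query(candida_rows)
multi_hit_queries = [q for q, subjects in mo_hits.items() if len(subjects) >= 2]
common = shared_subjects(mo_hits, ca_hits)
```

In this toy input one M. oryzae query has two hits, and a single H. irregulare model is matched from both pathogens. The same set intersection would yield the list of shared candidates in a real run.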
The potentially conserved pathogenicity genes common between M. oryzae and C. albicans
are mainly classified as signalling pathway and transcriptional regulation proteins. Two
proteins (Protein id: 148775 and 57164) are MAP kinases which share similarities with the
mitogen-activated protein kinases FUS3/KSS1 and SLT2/MPK1, respectively, from S. cerevisiae. A
catalytic subunit of a cAMP-dependent protein kinase (Protein id: 68091) and an adenylate
cyclase (Protein id: 45792) were shared between M. oryzae and C. albicans. Furthermore,
signalling-related proteins including a Ras-related GTPase (Protein id: 66761) and a small
GTPase (Protein id: 154562) similar to Cdc42 from S. cerevisiae were found. Two proteins
(Protein id: 99199 and 172816) show homology to proteins involved in transcription
alteration and chromatin remodelling by interference with protein-DNA complexes. A
trehalose-6-phosphate synthase (Protein id: 54829) and two proteins related to lipid
metabolism were also among the shared proteins. Protein 38714 is probably involved in
beta-oxidation in the peroxisome and protein 156259 shows similarity to an acyltransferase
specific to short-chain fatty acids.
6.1. Natural product genes in the H. irregulare genome
H. annosum (s.l.) is recognized as a producer of at least 10 different secondary metabolites
such as the fomajorins (Donnelly et al., 1987) and fomannosin (Kepler et al., 1967), which all
belong to the terpenoids, and fomannoxin (Hirotani et al., 1977). These compounds are
produced both in axenic cultures and during interaction with plants and other fungi.
Fomannosin was the first toxin that was isolated from H. annosum (s.l.) and hypothesized to
be involved in pathogenesis by Basset and co-workers in 1967, although they were not able to
detect the compound from infected Pinus taeda stems or roots, nor from H. annosum (s.l.)
cultures on pine sapwood shavings (Basset et al., 1967). Fomannosin was preferentially
produced on media containing high sugar levels, associated with the declining growth phase
of the fungus. Application of fomannosin to stem wounds in pine seedlings causes needle
browning and death, and de novo synthesis of pinosylvin (Basset et al., 1967). Another toxin
produced by H. annosum (s.l.) is fomannoxin, which has a 100-fold greater toxicity to plant
cells than fomannosin (Hirotani et al., 1977). This toxin has been isolated from H. annosum
(s.l.) infected Sitka spruce stem wood (Heslin et al., 1983). Uptake of fomannoxin by Sitka
spruce seedlings resulted in rapid browning of the roots accompanied by chlorosis and
progressive browning of needles (Heslin et al., 1983). This, and the production of fomannoxin
by actively growing hyphae, suggests a role for fomannoxin during pathogenesis (Heslin et
al., 1983). Fomannoxin, fomannosin and fomajorins show varying degrees of toxicity to
plant, fungal and bacterial cells (Sonnenbichler et al., 1989).
H. irregulare strain TC32-1 was grown in liquid Hagem medium (Stenlid 1985) at 20 °C under
stationary conditions for 30 days. Five liters of culture filtrate were extracted using eight 10-g
solid-phase extraction (SPE) columns (Isolute C18 (EC), Biotage, Uppsala, Sweden). The
columns were activated and equilibrated (60 ml CH3OH and 120 ml H2O containing 0.2% v/v
formic acid, respectively) before sample loading. Water-soluble organic and inorganic
substances were washed out with 120 ml H2O, containing 0.2% v/v formic acid, before the
lipophilic analytes were eluted with 90 ml aqueous 95% CH3CN with 0.2% v/v formic acid
present. The 95% aqueous CH3CN eluate was dried under reduced pressure and the residue
was dissolved in 3 ml 30% aqueous CH3CN before it was subjected to preparative reversed
phase HPLC (Reprosil-Pur ODS-3, C18, 5 µm, 20 × 100 mm and 20 × 30 mm guard column,
Dr Maisch GmbH, Ammerbuch, Germany). Separation (3 × 1 ml injected) was performed in
60 min using a gradient of CH3CN in H2O with 0.2% v/v formic acid present (8-70% CH3CN
in 50 min, 70-95% in 5 min, followed by a hold at 95% for 5 min). The flow rate was 13.2
ml/min and the eluate was monitored (UV at 254 nm) and collected into 2-ml fractions in
deep-well plates using a liquid handler (Gilson FC204). Selected fractions were concentrated
under a stream of nitrogen and diluted with H2O before they were subjected to further SPE.
The SPE columns (50 mg and 100 mg, Isolute C18 (EC), International Sorbent Technology,
Hengoed, UK) were treated in the same manner as described above, except that D2O was used
for washing and elution was performed with CDCl3.
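The preparative HPLC program described above can be written down as a small sketch that returns the programmed solvent composition at a given time and the fraction being collected. It assumes that the gradient segments are linear and it ignores the delay volume between pump, column and fraction collector; the function and constant names are hypothetical.

```python
# Gradient program as (time in min, % CH3CN) breakpoints, taken from the text
GRADIENT = [(0.0, 8.0), (50.0, 70.0), (55.0, 95.0), (60.0, 95.0)]
FLOW_ML_PER_MIN = 13.2
FRACTION_ML = 2.0


def percent_acetonitrile(t_min, program=GRADIENT):
    """Linearly interpolate the programmed % CH3CN at time t_min (assumed linear segments)."""
    if not program[0][0] <= t_min <= program[-1][0]:
        raise ValueError("time outside the gradient program")
    for (t0, p0), (t1, p1) in zip(program, program[1:]):
        if t0 <= t_min <= t1:
            return p0 + (p1 - p0) * (t_min - t0) / (t1 - t0)


def fraction_number(t_min, flow=FLOW_ML_PER_MIN, fraction_ml=FRACTION_ML):
    """1-based index of the fraction being collected at t_min (delay volume ignored)."""
    return int(t_min * flow // fraction_ml) + 1


# Total number of 2-ml fractions collected over the 60-min run
total_fractions = round(60 * FLOW_ML_PER_MIN / FRACTION_ML)
```

Under these assumptions a 60-min run at 13.2 ml/min fills 396 fractions of 2 ml. A real retention time would have to be corrected for the system delay before a peak is assigned to a well.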
The structures of the selected compounds were determined using nuclear magnetic resonance
(NMR) spectroscopy and liquid chromatography-mass spectrometry (LC-MS). Data acquired
from these techniques were analyzed and compared to literature data of structures already
published from H. annosum (s.l.). NMR data were obtained using a Bruker DRX400
spectrometer equipped with a 5-mm QNP probe head and a Bruker Avance III 600
spectrometer with a 2.5-mm SEI probe head. All NMR experiments were recorded at 30°C
and with CDCl3 as solvent. Standard pulse sequences supplied by Bruker were used to
perform one-dimensional 1H and two-dimensional 1H-1H COSY and 1H-13C HSQC
experiments. Chemical shifts were determined relative to internal CHCl3 (δH 7.27; δC 77.23).
LC-MS was done using a HP1100 LC system (Hewlett-Packard, Palo Alto, CA, USA)
connected to an Esquire-LC ion-trap mass spectrometer with an electrospray interface
working in positive mode (Bruker Daltonics Inc., Billerica, MA, USA).
LC-MS analysis of the peak eluting after 39.5 min yielded one major ion at m/z 189.3 [M+H]+
which indicated a molecular mass of 188 Da, i.e. the molecular mass of the secondary
metabolite fomannoxin (Fig. 2) (Hirotani et al., 1977). The acquired 1H NMR data were shown
to be identical to data previously reported for fomannoxin (1) (Hirotani et al., 1977). Isolation
and subsequent investigation of the peak eluting at 26.2 min with LC-MS (m/z 187.3
[M-H2O+H]+ and 205.3 [M+H]+) indicated a molecular mass of 204 Da. The 1H NMR spectrum
revealed resonances partly similar to the pattern obtained for fomannoxin, including signals
from a 1,2,4-trisubstituted benzene and an aldehyde group. The chemical shifts of these
signals and the remaining signals were found to be comparable with data previously reported for
the fomannoxin analogue 5-formyl-2-(isopropyl-1´-ol)benzofuran (2) (Donnelly et al.,
1988). The peak eluting after 22.7 min yielded three major ions by LC-MS analysis, m/z
245.3, 263.3 and 285.3, probably corresponding to [M-H2O+H]+, [M+H]+ and [M+Na]+,
respectively, indicating a molecular mass of 262 Da, i.e. the molecular mass of the
sesquiterpene fomannosin. The NMR data were in accordance with a sesquiterpene structure
and were found to be identical to data previously reported for fomannosin (Basset et
al., 1967; Kepler et al., 1967; Paquette et al., 2008). This constitutes the first report on the
production of secondary metabolites from the sequenced strain TC32-1. The results confirm
the production of fomannoxin from the North American species H. irregulare, a metabolite
that was previously only found in the European species H. annosum (s.s.) (Fig. 2).
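The mass assignments above follow from simple adduct arithmetic, sketched below. The adduct mass shifts are standard monoisotopic values rounded to five decimals, whereas the tolerance and the helper names are assumptions chosen for illustration, given the low resolution of ion-trap data.

```python
# Monoisotopic mass shifts (Da) for the singly charged adducts named in the text
PROTON = 1.00728
SODIUM_ION = 22.98922
WATER = 18.01056

ADDUCT_SHIFT = {
    "[M+H]+": PROTON,
    "[M+Na]+": SODIUM_ION,
    "[M-H2O+H]+": PROTON - WATER,
}


def neutral_mass(mz, adduct):
    """Neutral mass M implied by a singly charged ion observed at m/z."""
    return mz - ADDUCT_SHIFT[adduct]


def consistent_mass(observations, tolerance=0.5):
    """Mean neutral mass if all (m/z, adduct) pairs agree within tolerance (assumed 0.5 Da)."""
    masses = [neutral_mass(mz, adduct) for mz, adduct in observations]
    if max(masses) - min(masses) > tolerance:
        raise ValueError("ions do not point to a single neutral mass")
    return sum(masses) / len(masses)


# Ions reported in the text for the three peaks
peak_39_5_min = [(189.3, "[M+H]+")]
peak_26_2_min = [(187.3, "[M-H2O+H]+"), (205.3, "[M+H]+")]
peak_22_7_min = [(245.3, "[M-H2O+H]+"), (263.3, "[M+H]+"), (285.3, "[M+Na]+")]
```

Applied to the reported ions, the three peaks give nominal masses of 188, 204 and 262 Da, in line with the assignments in the text.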
Putative natural product genes were found in the H. irregulare genome (Table S20),
suggestive of a biosynthetic capacity greater than evident from the above chemical
investigations. Computer analysis using a secondary metabolite unique regions finder
(SMURF) web-tool (http://www.jcvi.org/smurf/index.php) and manual curation identified
clustered co-localization of putative natural product genes (Table S21). Two polyketide
synthase (PKS) genes (Prot. Id. 174227 and 174228), of which 174227 has an unusual tridomain
structure including adenylation, carrier protein and ketosynthase domains, form a short
cluster (cluster 1, Table S21) together with a halogenase gene. Further downstream genes in
this cluster indicative for secondary metabolism encode a major facilitator protein (Prot. Id.
52305) and a transcription factor (Prot. Id. 118599) containing a Cys2His2-zinc finger motif.
A third PKS gene (Prot. Id. 50938) for a canonical, non-reducing wA-like synthase is located
next to a gene for a dual Cys2His2 zinc finger/Zn(II)2Cys6 binuclear cluster transcriptional
regulator (Prot. Id. 51550) (cluster 8, Table S21), found in other species in the context of
natural product genes (Zhang et al., 2004, Misiek & Hoffmeister, 2008). No polyketide
metabolites have so far been isolated from H. annosum (s.l.), which suggests that the PKS
genes are tightly regulated.
Genes for multimodular nonribosomal peptide synthetases (NRPSs) were not detected;
however, 13 genes for tridomain enzymes identically composed of typical NRPS domains
(adenylation, thiolation, reduction) are present in the genome and show high homology to
LYS2 from yeast (Ehmann et al., 1999), which functions as α-aminoadipate reductase during
L-lysine synthesis in primary metabolism, or to an ochratoxin biosynthesis NRPS-like gene,
npsPN of Penicillium nordicum (Karolewiez & Geisen, 2005). From the microarray data,
the gene corresponding to Prot. Id. 153301 was induced during infection of pine bark (P = 0.033)
and during growth on wood (P = 0.030), compared with liquid cultures.
Putative terpene cyclase genes include genes corresponding to Prot. Id. 181194, 115814, and
169607. Such an enzymatic activity is required to synthesize the intermediate en route to
fomajorins and fomannosin. From the microarray data, gene 169607 was significantly downregulated during growth on wood (P = 0.019) compared with liquid cultures. This finding fits
well with previous failure to detect fomannosin from infected P. taeda stems or roots, or from
H. annosum (s.l.) cultures on pine sapwood shavings (Basset et al., 1967). Instead, the
production of fomannosin on media containing high sugar levels during the declining growth
phase of the fungus suggests that the primary function of fomannosin may be related to fungal
development or microbial interactions, rather than as a phytotoxin.
Several genes encoding so-called tailoring enzymes, i.e., enzymes for post-backbone
assembly modification, were identified, among those five genes for flavin-dependent
halogenases (Prot. Id. 174229, 181184, 181189, 181191, 181192) and one
dimethylallyltryptophan synthase (DMATS)-type prenyltransferase (Prot. Id. 108351). For either
enzyme category, involvement in fungal natural product assembly has been experimentally
shown (Steffan et al., 2007, Wang et al., 2008). While a DMATS may catalyze the
prenyltransfer during fomannoxin assembly, halogenated natural products have not been
described for the secondary metabolome of H. annosum (s.l.) yet.
Here we report that the genome of H. irregulare contains a number of putative genes for
biosynthesis of secondary metabolites. Chemical analyses of culture filtrates of strain TC32-1
predict the presence of terpene cyclase (fomannosin) and DMATS (fomannoxin) genes,
which were identified in the genome. However, the presence of putative PKS, NRPS and
halogenase genes suggests that the biosynthetic capacity of H. irregulare is greater than evident
from the above chemical investigations, as no compounds have so far been identified that can
be classified as polyketides, non-ribosomal peptides or halogenated metabolites.
6.2. Genes differentially regulated in interactions between H. irregulare and pine
Samples of Pinus sylvestris (L.) bark colonised by H. irregulare for a period of four weeks
were frozen in liquid nitrogen and RNA was extracted (see above).
The 12200 models represented on the microarray were filtered for probes which
cross-hybridized with transcripts from pine. This filtering eliminated 1747 gene models from
further analysis. From the remaining 10453, the most highly expressed genes were selected using
a cut-off of > 400 units mean raw signal. This gives a list of 1353 gene models, of which 43 are
more than 2-fold up-regulated and 15 down-regulated (P<0.05). Genes highly expressed during infection
which are not significantly induced in the bark sample compared to the liquid culture are
presented in Table S22.
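The successive filters described above can be sketched as follows. The record layout, the field names and the toy values are hypothetical, and the down-regulation threshold is assumed to mirror the 2-fold criterion used for up-regulation, which the text does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class GeneModel:
    model_id: str
    cross_hybridizes: bool  # probe also hybridizes with pine transcripts
    mean_raw: float         # mean raw signal in the bark sample
    fold_change: float      # bark / liquid culture (ratio > 1 means induced)
    p_value: float


def classify(models, raw_cutoff=400.0, fold_cutoff=2.0, alpha=0.05):
    """Apply the filters in order and split the survivors by direction of regulation."""
    specific = [m for m in models if not m.cross_hybridizes]
    high = [m for m in specific if m.mean_raw > raw_cutoff]
    up = [m for m in high if m.p_value < alpha and m.fold_change > fold_cutoff]
    # Assumption: down-regulation uses the reciprocal of the up-regulation cut-off
    down = [m for m in high if m.p_value < alpha and m.fold_change < 1.0 / fold_cutoff]
    return specific, high, up, down


# Toy data with invented values, one model per outcome
models = [
    GeneModel("m1", False, 900.0, 3.5, 0.01),    # highly expressed, up-regulated
    GeneModel("m2", False, 650.0, 0.3, 0.02),    # highly expressed, down-regulated
    GeneModel("m3", False, 1200.0, 1.1, 0.40),   # highly expressed, unchanged
    GeneModel("m4", False, 150.0, 4.0, 0.01),    # below the expression cut-off
    GeneModel("m5", True, 2000.0, 5.0, 0.001),   # removed for cross-hybridization
]
specific, high, up, down = classify(models)
```

The order of the filters matters for the counts reported at each stage: the expression cut-off is applied only to models that survive the cross-hybridization filter, as in the text.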
Five of the secreted proteins are classified as CAZymes, one as a lipase, two as oxidases
active on saccharide molecules and one as an acetylglucosaminyltransferase. The remaining
9 secreted proteins have no homology with any protein of known function. However, two of
these have a structure with four predicted transmembrane domains, indicating a membrane
localisation.
6.3. Anchoring pathogenicity QTLs to the genome sequence
An AFLP-marker based linkage map of H. annosum (s.l.) was originally published by Lind et
al. (2005). This map was based on a mapping population of 102 single-spore isolates
originating from a compatible mating between an H. occidentale North American S-type
isolate (TC-122-12) and the now sequenced H. irregulare North American P-type isolate
(TC-32-1) (Olson et al., 2005). This map has been used to map several traits of interest, such
as growth rate (Olson, 2006), virulence (Lind et al., 2007b) and various intraspecific
interactions (Lind et al., 2007a).
To anchor the linkage groups containing the QTLs to the assembled genome, microsatellite
markers were designed using the sequence information. One microsatellite marker on each of
the 14 larger scaffolds was designed, screened for in the progeny isolates and added to the
existing set of markers. Using the JoinMap 3.0 software (van Ooijen, 2001), the new markers
were mapped and new linkage groups formed.
The most interesting linkage group from a virulence point of view is linkage group 15,
containing virulence QTLs from several different experiments on different hosts (Lind et al.,
2007b). This group absorbed microsatellite marker 35, which was designed from the end of
scaffold 12 (at 1557077 base pairs out of 1764121). Linkage group 20, which contained
another virulence QTL, absorbed microsatellite 4, at 524787 base pairs into scaffold 1. The
other linkage groups containing known QTLs could probably also be anchored using an
analogous approach.
With the scaffolds containing the virulence QTLs identified, efforts were made to refine
the positions of the QTLs. The microsatellite markers x1p, x2p and 198 were
successfully added to linkage group 20, fusing it to linkage group 4. Likewise, another 8
markers from scaffold 12 were added to linkage group 15. These new markers also caused linkage
group 9 to fuse with group 15. This group now spans from marker 65_2, at 5473 bp, to marker
138.258, at 1742709 bp, indicating that the entire scaffold 12 is covered. Since no other
markers have linked to the microsatellites at the ends of this scaffold, it seems possible that
this scaffold in fact constitutes an entire single chromosome. This is supported by the
presence of telomeric repeats (Fig. 1).
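The statement that the fused linkage group covers scaffold 12 can be checked with a short calculation. It uses the marker positions and the scaffold length quoted in the text; the helper names are hypothetical.

```python
SCAFFOLD_12_LENGTH = 1_764_121  # bp, from the text


def covered_fraction(first_marker_bp, last_marker_bp, scaffold_length):
    """Fraction of the scaffold lying between the outermost mapped markers."""
    if not 0 <= first_marker_bp <= last_marker_bp <= scaffold_length:
        raise ValueError("marker positions must lie on the scaffold in order")
    return (last_marker_bp - first_marker_bp) / scaffold_length


def distance_to_ends(position_bp, scaffold_length):
    """Distances (bp) from a marker to the start and to the end of the scaffold."""
    return position_bp, scaffold_length - position_bp


# Outermost markers of the fused linkage group: 65_2 and 138.258
coverage = covered_fraction(5_473, 1_742_709, SCAFFOLD_12_LENGTH)
unmapped_tail = distance_to_ends(1_742_709, SCAFFOLD_12_LENGTH)[1]
```

With these positions the outermost markers bracket roughly 98.5% of the scaffold, leaving about 21 kb beyond the last marker.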
Virulence assays described by Lind and colleagues (2007a) were re-mapped using the new
linkage groups, and the QTL regions (Fig. 3) were scanned for candidate genes and compared
to microarray data from RNA extracted from infection sites.
Notes S7 Trade-off
The trade-off between growing in living and dead host tissue within a specific organism can be
illustrated by differences in gene expression profiles (Table 1). The number of genes
differentially expressed in common during colonisation of living and dead host is small, as is the
number of genes uniquely expressed under each condition (Fig 5). Specifically, genes involved
in degradation of lignocellulose and nutrient transport are key components for understanding
the underlying mechanism of trade-off in necrotrophic plant pathogens. Not using the full
potential for wood decomposition represents an energetic cost of the pathogenic lifestyle.
H. irregulare has 17 CAZymes up-regulated compared to liquid culture when grown on wood,
while only half that number are expressed during bark colonisation (Table S24). The cost is
not related to the capacity to degrade pectin, while the capacity to degrade cellulose is
connected with a trade-off. All genes associated with a trade-off effect show higher induction
on wood vs liquid culture (LC) than on bark vs LC.
H. irregulare expresses 27 membrane transport proteins during wood growth (Table S25). The
lower cellulose degradation potential expressed during growth in bark correlates with a lower
capacity for nutrient transport. Out of the 8 MFS1 that are connected with the trade-off effect
only two show a higher expression compared to liquid culture on bark than on wood, while
three of the four expressed both on wood and bark show higher expression on bark.
References
1.
Abeel T, Saeys Y, Bonnet E, Rouzé P, van de Peer Y. 2008. Generic eukaryotic core
promoter prediction using structural features of DNA. Genome Research 18: 310-323.
2.
Aimi T, Yoshida R, Ishikawa M, Bao DP, Kitamoto Y. 2005. Identification and
linkage mapping of the genes for the putative homeodomain protein (hox1) and the
putative pheromone receptor protein homologue (rcb1) in a bipolar basidiomycete,
Pholiota nameko. Current Genetics 48: 184-194.
3.
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic local
alignment search tool. Journal of Molecular Biology 215: 403–410.
4.
Arnaud MB, Costanzo MC, Shah P, Skrzypek MS, Sherlock G. 2009. Gene
ontology and annotation of pathogen genomes: the case of Candida albicans. Trends
in Microbiology 17: 295-303.
5.
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP,
Dolinski K, Dwight SS, Eppig JT, et al. 2000. Gene Ontology: tool for the
unification of biology. Nature Genetics 25: 25-29.
6.
Basset C, Sherwood RT, Kepler JA, Hamilton PB. 1967. Production and biological
activity of fomannosin, a toxic sesquiterpene metabolite of Fomes annosus.
Phytopathology 57: 1046-1052.
7.
Benson G. 1999. Tandem repeats finder: a program to analyze DNA sequences.
Nucleic Acids Research 27: 573-580.
8.
Brewster JL, Devaloir T, Dwyer ND, Winter E, Gustin MC. 1993. An
Osmosensing Signal Transduction Pathway in Yeast. Science 259: 1760-1763.
9.
Brown AJ, Casselton LA. 2001. Mating in mushrooms: increasing the chances but
prolonging the affair. Trends in Genetics 17: 393-400.
10.
Comparini C, Carresi L, Pagni E, Sbrana F, Sebastiani F, Luchi N, Santini A,
Capretti P, Tiribilli B, Pazzagli L, et al. 2009. New proteins orthologous to
cerato-platanin in various Ceratocystis species and the purification and characterization of
cerato-populin from Ceratocystis populicola. Applied Microbiology and Biotechnology
84: 309-322.
11.
Cyert MS. 2003. Calcineurin signaling in Saccharomyces cerevisiae: how yeast go
crazy in response to stress. Biochem Biophys Res Commun 311: 1143-1150.
12.
Donnelly DMX, Fukuda N, Kouno I, Martin M, O´Reilly J. 1988.
Dihydrobenzofurans from Heterobasidion annosum. Phytochemistry 27: 2709-2713.
13.
Donnelly DMX, O’Reilly J, Polonsky J, Sheridan MH. 1987. In vitro production
and biosynthesis of fomajorin D and S by Fomes annosus (Fr.) Cooke. Journal of the
Chemical Society Perkin Transactions 1: 1869-1872.
14.
Egan MJ, Wang ZY, Jones MA, Smirnoff N, Talbot NJ. 2007. Generation of
reactive oxygen species by fungal NADPH oxidases is required for rice blast disease.
Proceedings of the National Academy of Science USA 104: 11772-11777.
15.
Ehmann DE, Gehring AM, Walsh CT. 1999. Lysine biosynthesis in Saccharomyces
cerevisiae: mechanism of α-aminoadipate reductase (Lys2) involves posttranslational
phosphopantetheinylation by Lys5. Biochemistry 38: 6171-6177.
16.
Feschotte C, Keswani U, Ranganathan N, Guibotsy ML, Levine D. 2009.
Exploring the repetitive DNA landscapes using REPCLASS, a tool that automates the
classification of transposable elements in eukaryotic genomes. Genome Biology and
Evolution 1: 205-220.
17.
Gessler NN, Aver'yanov AA, Belozerskaya TA. 2007. Reactive oxygen species in
regulation of fungal development. Biochemistry (Mosc) 72: 1091-1109.
18.
Grimshaw SJ, Mott HR, Stott KM, Nielsen PR, Evetts KA, Hopkins LJ,
Nietlispach D, Owen D. 2004. Structure of the sterile alpha motif (SAM) domain of
the Saccharomyces cerevisiae mitogen-activated protein kinase pathway-modulating
protein STE50 and analysis of its interaction with the STE11 SAM. Journal of
Biological Chemistry 279: 2192-2201.
19.
Gustin MC, Albertyn J, Alexander M, Davenport K. 1998. MAP kinase pathways
in the yeast Saccharomyces cerevisiae. Microbiology and Molecular Biology Review
62: 1264-1300.
20.
Heslin MC, Stuart MR, Murchu PO, Donnelly DMX. 1983. Fomannoxin, a
phytotoxic metabolite of Fomes annosus: in vitro production, host toxicity and
isolation from naturally infected Sitka spruce heartwood. European Journal of Forest
Pathology 13: 11-23.
21.
Hirotani M, O´Reilly J, Donnelly DMX. 1977. Fomannoxin – a toxic metabolite of
Fomes annosus. Tetrahedron Letter 7: 651-652.
22.
Hiscock SJ, Kües U. 1999. Cellular and molecular mechanisms of sexual
incompatibility in plants and fungi. International Review of Cytology 193: 165-295.
23.
Ito N, Phillips SEV, Stevens C, Ogel ZB, McPherson MJ, Keen JN, Yadav KDS,
Knowles PF. 1991. Novel thioether bond revealed by a 1.7 A crystal structure of
galactose oxidase. Nature 350: 87-90.
24.
James TY. 2007. Analysis of mating-type locus organization and synteny in
mushroom fungi- beyond model species. In: Heitman J, Kronstad J, Taylor JW,
Casselton LA, eds. Sex in fungi: molecular determination and evolutionary
implications. Washington, D. C. USA: ASM Press, 317-331.
25.
James TY, Srivilai P, Kües U, Vilgalys R. 2006. Evolution of the bipolar mating
system of the mushroom Coprinellus disseminatus from its tetrapolar ancestors
involves loss of mating-type-specific pheromone receptor function. Genetics 172:
1877-1891.
26.
Johnson DI. 1999. Cdc42: An essential Rho-type GTPase controlling eukaryotic cell
polarity. Microbiology and Molecular Biology Review 63: 54-105.
27.
Johnson L. 2008. Iron and siderophores in fungal-host interactions. Mycological
Research 112: 170-183.
28.
Jung K, Kim S, Okagaki LH, Nielsen K, Bahn Y. 2011. Ste50 adaptor protein
governs sexual differentiation of Cryptococcus neoformans via the pheromone-response
MAPK signaling pathway. Fungal Genetics and Biology 48: 154-165.
29.
Karolewiez A, Geisen R. 2005. Cloning a part of the ochratoxin A biosynthetic gene
cluster of Penicillium nordicum and characterization of the ochratoxin polyketide
synthase gene. Systematics and Applied Microbiology 28: 588-595.
30.
Kepler JA, Wall ME, Mason JE, Basset C, McPhail AT, Sim GA. 1967. The
structure of fomannosin, a novel sesquiterpene metabolite of the fungus Fomes
annosus. Journal of American Chemical Society 89: 1260-1261.
31.
Kersten PJ. 1990. Glyoxal oxidase of Phanerochaete chrysosporium: Its
characterization and activation by lignin peroxidase. Proc Natl Acad Sci USA 87: 2936-2940.
32.
Kersten PJ, Cullen D. 1993. Cloning and characterization of a cDNA encoding
glyoxal oxidase, a peroxide-producing enzyme from the lignin-degrading
basidiomycete Phanerochaete chrysosporium. Proceedings of the National Academy
of Science USA 90: 7411-7413.
33.
Kersten PJ, Kirk TK. 1987. Involvement of a new enzyme, glyoxal oxidase, in
extracellular H2O2 production by Phanerochaete chrysosporium. Journal of
Bacteriology 169: 2195-2201.
34.
Kofler R, Schlötterer C, Lelley T. 2007. SciRoKo: a new tool for whole genome
microsatellite search and investigation. Bioinformatics 23: 1683-1685.
35.
Korhonen K, Stenlid J. 1998. Biology of Heterobasidion annosum. In: Woodward S,
Stenlid J, Karjalainen R, Hüttermann A, eds. Heterobasidion annosum: Biology,
Ecology, Impact and Control. Wallingford UK: CAB International, 43-70.
36.
Kües U. 2000. Life history and developmental processes in the basidiomycete
Coprinus cinereus. Microbiology and Molecular Biology Review 64: 316-353.
37.
Kües U, Casselton LA. 1992. Fungal mating type genes - regulators of sexual
development. Mycological Research 96: 993-1006.
38.
Lawson MJ, Zhang L. 2006. Distinct patterns of SSR distribution in the Arabidopsis
thaliana and rice genomes. Genome Biology 7: R14.
39.
Li Y-C, Korol AB, Fahima T, Nevo E. 2004. Microsatellites within genes: structure,
function, and evolution. Molecular Biology and Evolution 21: 991–1007.
40.
Lind M, Dalman K, Stenlid J, Karlsson B, Olson Å. 2007b. Identification of
quantitative trait loci affecting virulence in the basidiomycete Heterobasidion
annosum s.l. Current Genetics 52: 35-44.
41.
Lind M, Olson Å, Stenlid J. 2005. An AFLP-markers based genetic linkage map of
Heterobasidion annosum locating intersterility genes. Fungal Genetics and Biology
42: 519-527.
42.
Lind M, Stenlid J, Olson Å. 2007a. Genetics and QTL mapping of somatic
incompatibility and intraspecific interactions in the basidiomycete Heterobasidion
annosum s.l. Fungal Genetics and Biology 44: 1242–1251.
43.
Locke EG, Bonilla M, Liang L, Takita Y, Cunningham KW. 2000. A homolog of
voltage-gated Ca2+ channels stimulated by depletion of secretory Ca2+ in yeast.
Molecular and Cellular Biology 20: 6686-6694.
44.
Ma J, Bennetzen JL. 2004. Rapid recent growth and divergence of rice nuclear
genomes. Proceedings of the National Academy of Sciences USA 101: 12404-12410.
45.
Ma PS, Wera S, van Dijck P, Thevelein JM. 1999. The PDE1-encoded low-affinity
phosphodiesterase in the yeast Saccharomyces cerevisiae has a specific function in
controlling agonist-induced cAMP signaling. Molecular Biology of the Cell 10: 91-104.
46.
Malleshaiah MK, Shahrezaei V, Swain PS, Michnick SW. 2010. The scaffold
protein Ste5 directly controls a switch-like mating decision in yeast. Nature 465: 101-105.
47.
Martinez D, Larrondo LF, Putnam N, Sollewijn Gelpke MD, Huang K, Chapman
J, Helfenbein KG, Ramaiya P, Detter JC, Larimer F, et al. 2004. Genome
sequence of the lignocellulose degrading fungus Phanerochaete chrysosporium strain
RP78. Nature Biotechnology 22: 695-700.
48.
Martinez D, Challacombe J, Morgenstern I, Hibbett D, Schmoll M, Kubicek CP,
Ferreira P, Ruiz-Duenas FJ, Martinez AT, Kersten P, et al. 2009. Genome,
transcriptome, and secretome analysis of wood decay fungus Postia placenta supports
unique mechanisms of lignocellulose conversion. Proceedings of the National
Academy of Sciences USA 106: 1954-1959.
49.
May G, Shaw F, Badrane H, Vekemans X. 1999. The signature of balancing
selection: fungal mating compatibility gene evolution. Proceedings of the National
Academy of Sciences USA 96: 9172-9177.
50.
McCarthy E, McDonald JF. 2003. LTR_STRUC: a novel search and identification
program for LTR retrotransposons. Bioinformatics 19: 362-367.
51.
Metzgar D, Bytof J, Wills C. 2000. Selection against frameshift mutations limits
microsatellite expansion in coding DNA. Genome Research 10: 72–80.
52.
Misiek M, Hoffmeister D. 2008. Processing sites involved in intron splicing of
Armillaria natural product genes. Mycological Research 112: 216-224.
53.
Mittler R, Vanderauwera S, Gollery M, van Breusegem F. 2004. Reactive oxygen
gene network of plants. Trends in Plant Science 9: 490-498.
54.
Mun J-H, Kim D-J, Choi H-K, Gish J, Debelle F, Mudge J, Denny R, Endre G,
Saurat O, Dudez A-M, et al. 2006. Distribution of microsatellites in the genome of
Medicago truncatula: a resource of genetic markers that integrate genetic and physical
maps. Genetics 172: 2541-2555.
55.
Nakai K, Horton P. 1999. PSORT: a program for detecting sorting signals in proteins
and predicting their subcellular localization. Trends in Biochemical Sciences 24: 34-35.
56.
Niculita-Hirzel H, Labbé J, Kohler A, le Tacon F, Martin F, Sanders IR, Kües U.
2008. Gene organization of the mating type regions in the ectomycorrhizal fungus
Laccaria bicolor reveals distinct evolution between the two mating type loci. New
Phytologist 180: 329-342.
57.
Olson Å. 2006. Genetic linkage between growth rate and the intersterility genes S and
P in the basidiomycete Heterobasidion annosum s.lat. Mycological Research 110: 979-984.
58.
Olson Å, Lind M, Stenlid J. 2005. In vitro test of virulence in the progeny of a
Heterobasidion interspecific cross. Forest Pathology 35: 321–331.
59.
Paquette LA, Peng X, Yang J, Kang H-J. 2008. The carbohydrate-sesquiterpene
interface. Directed synthetic routes to both (+)- and (-)- fomannosin from D-glucose.
Journal of Organic Chemistry 73: 4548-4558.
60.
Posas F, Wurgler-Murphy SM, Maeda T, Witten EA, Thai TC, Saito H. 1996.
Yeast HOG1 MAP kinase cascade is regulated by a multistep phosphorelay
mechanism in the SLN1-YPD1-SSK1 "two-component" osmosensor. Cell 86: 865-875.
61.
Price AL, Jones NC, Pevzner PA. 2005. De novo identification of repeat families in
large genomes. In: Proceedings of the 13th Annual International Conference on
Intelligent Systems for Molecular Biology (ISMB-05). Detroit, MI, USA, 351-358.
62.
Pukkila-Worley R, Alspaugh JA. 2004. Cyclic AMP signaling in Cryptococcus
neoformans. FEMS Yeast Research 4: 361-367.
63.
Rawlings ND, Morton FR, Kok CY, Kong J, Barrett AJ. 2008. MEROPS: the
peptidase database. Nucleic Acids Research 36: 320-325.
64.
Richards RI, Holman K, Yu S, Sutherland GR. 1993. Fragile X syndrome unstable
element, p(CCG)n, and other simple tandem repeat sequences are binding sites for
specific nuclear proteins. Human Molecular Genetics 2: 1429–1435.
65.
Rispail N, Soanes DM, Ant C, Czajkowski R, Grünler A, Huguet R, Perez-Nadales E, Poli A, Sartorel E, Valiante V, et al. 2009. Comparative genomics of
MAP kinase and calcium-calcineurin signalling components in plant and human
pathogenic fungi. Fungal Genetics and Biology 46: 287-298.
66.
Shemarova IV. 2009. cAMP-dependent signal pathways in unicellular eukaryotes.
Critical Reviews in Microbiology 35: 23-42.
67.
Smit AFA, Hubley R, Green P. 1996-2010. RepeatMasker Open-3.0.
68.
Sonnenbichler J, Bliestle IM, Peipp H, Holdenrieder O. 1989. Secondary fungal
metabolites and their biological activities, I. Isolation of antibiotic compounds from
cultures of Heterobasidion annosum synthesized in the presence of antagonistic fungi
or host plant cells. Biological Chemistry Hoppe-Seyler 370: 1295-1303.
69.
Stallings RL. 1994. Distribution of trinucleotide microsatellites in different categories
of mammalian genomic sequence: implications for human genetic diseases. Genomics
21: 116–121.
70.
Steffan N, Unsöld IA, Li S-M. 2007. Chemoenzymatic synthesis of prenylated indole
derivatives by using a 4-dimethylallyltryptophan synthase from Aspergillus fumigatus.
Chembiochem 8: 1298-1307.
71.
Stenlid J. 1985. Population structure of Heterobasidion annosum as determined by
somatic incompatibility, sexual incompatibility, and isoenzyme patterns. Canadian
Journal of Botany 63: 2268-2273.
72.
Tanaka A, Takemoto D, Hyon GS, Park P, Scott B. 2008. NoxA activation by the
small GTPase RacA is required to maintain a mutualistic symbiotic association
between Epichloe festucae and perennial ryegrass. Molecular Microbiology 68: 1165-1178.
73.
The UniProt Consortium. 2008. The Universal Protein Resource (UniProt). Nucleic
Acids Research 36: 190-195.
74.
Thompson JD, Higgins DG, Gibson TJ. 1994. CLUSTAL W: improving the
sensitivity of progressive multiple sequence alignment through sequence weighting,
position-specific gap penalties and weight matrix choice. Nucleic Acids Research 22:
4673-4680.
75.
Tóth G, Gáspári Z, Jurka J. 2000. Microsatellites in Different Eukaryotic Genomes:
Survey and Analysis. Genome Research 10: 967-981.
76.
Vanden Wymelenberg A, Sabat G, Mozuch M, Kersten PJ, Cullen D, Blanchette
RA. 2006. Structure, organization, and transcriptional regulation of a family of copper
radical oxidase genes in the lignin-degrading basidiomycete Phanerochaete
chrysosporium. Applied and Environmental Microbiology 72: 4871-4877.
77.
Wang S, Xu Y, Maine EA, Wijeratne EMK, Espinosa-Artiles P, Gunatilaka AAL,
Molnár I. 2008. Functional characterization of the biosynthesis of radicicol, an Hsp90
inhibitor resorcylic acid lactone from Chaetomium chiversii. Chemistry & Biology 15:
1328-1338.
78.
van Ooijen JW, Voorrips RP. 2001. JoinMap® 3.0, Software for the calculation of
genetic linkage maps. Plant Research International, Wageningen, the Netherlands.
79.
Whittaker MM, Kersten PJ, Cullen D, Whittaker JW. 1999. Identification of
catalytic residues in glyoxal oxidase by targeted mutagenesis. Journal of Biological
Chemistry 274: 36226-36232.
80.
Wicker T, Sabot F, Hua-Van A, Bennetzen JL, Capy P, Chalhoub B, Flavell A,
Leroy P, Morgante M, Panaud O, et al. 2007. A unified classification system for
eukaryotic transposable elements. Nature Reviews Genetics 8: 973-982.
81.
Whittaker MM, Kersten PJ, Nakamura N, Sanders-Loehr J, Schweizer ES,
Whittaker JW. 1996. Glyoxal oxidase from Phanerochaete chrysosporium is a new
radical-copper oxidase. Journal of Biological Chemistry 271: 681-687.
82.
Wootton JC, Federhen S. 1996. Analysis of compositionally biased regions in
sequence databases. Methods in Enzymology 266: 554-571.
83.
Yi R, Tachikawa T, Ishikawa M, Mukaiyama H, Bao D, Aimi T. 2009.
Genomic structure of the A mating-type locus in a bipolar basidiomycete, Pholiota
nameko. Mycological Research 113: 240-248.
84.
Zhang S, Monahan BJ, Tkacz JS, Scott B. 2004. Indole-diterpene gene cluster from
Aspergillus flavus. Applied and Environmental Microbiology 70: 6875-6883.