Supplementary material for

Recapitulation of the evolution of biosynthetic gene clusters reveals hidden chemical diversity on bacterial genomes

Pablo Cruz-Morales 1,*, Christian E. Martínez-Guerrero 1, Marco A. Morales-Escalante 1, Luis Yáñez-Guerra 1, Johannes Florian Kopp 2, Jörg Feldmann 2, Hilda E. Ramos-Aboites 1 & Francisco Barona-Gómez 1,*

1 Evolution of Metabolic Diversity Laboratory, Langebio-Cinvestav-IPN, Irapuato, Guanajuato, México

2 Trace Element Speciation Laboratory (TESLA), College of Physical Sciences, Aberdeen, Scotland, UK

Authors for Correspondence:

Francisco Barona-Gómez (fbarona@langebio.cinvestav.mx)

Pablo Cruz-Morales (pcruz@langebio.cinvestav.mx)

Supplementary text S1:

EvoMining of Streptomyces sviceus draft genome reveals an Enolase enzyme family member recruited into a new phosphonate BGC

Enolase is a glycolytic enzyme that catalyzes the dehydration of 2-phosphoglycerate (2-PGA) to produce phosphoenolpyruvate (PEP) in a Mg2+-dependent reaction.

The enolase phylogeny (Tree S2) has two main clades. A major clade includes orthologs associated with central metabolism from representatives of most species in the genome database (red branches, Figure S1A). As expected, the general topology of this clade reflects that of the guide species tree (Tree S1). A divergent clade (cyan, blue and green branches, Figure S1A) includes a homolog from Streptomyces viridochromogenes that has previously been identified in the BGC for the phosphinothricin tripeptide (PTT) (1). This clade also includes a homolog from Streptomyces sviceus (GI 297146550; eno2-SSV) that has not been linked to NP biosynthesis.

The S. viridochromogenes PTT enolase or carboxyphosphoenolpyruvate synthase (CPS, GI: 302549806) shares 33% sequence identity with its glycolytic counterpart, i.e. GI 302551949. A detailed sequence analysis showed only a few changes in the active site residues (Figure S2A). To identify the three-dimensional position of these changes, a structural model of eno2-SSV was obtained and compared with the crystal structure of the yeast enolase (PDB: 2ONE), which has been thoroughly characterized (2). This sequence and structural analysis revealed that the mutation E211S (numbering of yeast enolase) affects the active site of CPS. To analyze the effect of this mutation, the CPS substrate, 2-phosphonoformylglycerate, was modeled in the active site of both structures. This analysis showed that the ancestral glutamate residue would not allow the accommodation of the substrate (Figure S2B). Therefore, this particular mutation seems key to substrate specificity in CPS. Overall, this analysis suggests that other members of the divergent clade are related to a new enzyme function, likely involved in NP biosynthesis.
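The sequence comparison described above can be illustrated with a minimal sketch. It assumes a pairwise alignment is already available as two equal-length gapped strings; the toy sequences, the function names and the identity definition (matches over columns without gaps) are illustrative assumptions, not the procedure used in this work. BlastP, for instance, reports identity over the full alignment length.

```python
def percent_identity(aln_a, aln_b):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aln_a, aln_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


def substitutions_at(aln_ref, aln_query, ref_positions):
    """Return {reference residue number: (reference residue, query residue)}.

    Positions are 1-based and count only non-gap characters of the reference,
    which is how numbering relative to a reference enzyme works.
    """
    wanted = set(ref_positions)
    found = {}
    ref_index = 0
    for a, b in zip(aln_ref, aln_query):
        if a == "-":
            continue  # insertion in the query; reference numbering does not advance
        ref_index += 1
        if ref_index in wanted:
            found[ref_index] = (a, b)
    return found


def format_mutation(position, ref_residue, query_residue):
    """Format a substitution in the usual notation, e.g. E211S."""
    return f"{ref_residue}{position}{query_residue}"


# Toy alignment (hypothetical, for illustration only).
REFERENCE = "MKE-LAK"
QUERY = "MRSGLA-"

if __name__ == "__main__":
    print(percent_identity(REFERENCE, QUERY))
    for pos, (ref, qry) in substitutions_at(REFERENCE, QUERY, [3, 6]).items():
        print(format_mutation(pos, ref, qry))
```

With a real alignment of the yeast enolase and a CPS sequence, passing the catalytic positions named in Figure S2 (168, 211 and 345) to `substitutions_at` would list the residues found at those sites.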

The draft genome sequence of S. sviceus has been deposited as a single scaffold with 551 gaps (GenBank accession: CM000951.1 and BioProject PRJNA59513). Six gaps were located in the region of interest, including one at the 5’ end of the enolase homolog, leading to a partial sequence.

Remarkably, neither PKSs nor NRPSs could be found in the gene neighborhood of the CPS gene, although an incomplete CDS for a mutase resembling those related to phosphonates (1) could be detected. On the basis of the phylogenetic analysis we expected that the divergent clade includes enolases that are part of a BGC. To confirm this, the six gaps in the region were closed by iterative PCR amplification (Supplementary Text 1 associated Table 1) and sequencing, followed by manual annotation of the region. The annotation and functional predictions confirmed the presence of a BGC putatively encoding a pathway that shares common steps with PTT biosynthesis, including those related to the formation of phosphinopyruvate from phosphoenolpyruvate (1) (Figure S2C). Moreover, the complete sequence allowed for the identification of the mutase-decarboxylase pair of enzymes present in most Streptomyces phosphonate biosynthetic systems (Figure S1B). Overall, this functional annotation suggests that the product of this BGC is a previously uncharacterized phosphonate natural product.
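Locating the gaps that fall inside a region of interest, as described above, can be sketched as follows. This assumes that gaps in the scaffold are encoded as runs of N characters, which is common for draft assemblies but is an assumption here; the toy sequence, the coordinates and the function names are hypothetical.

```python
import re


def find_gaps(sequence, min_len=1):
    """Return (start, end) intervals of N runs, 0-based and half-open."""
    pattern = re.compile(r"[Nn]{%d,}" % min_len)
    return [(m.start(), m.end()) for m in pattern.finditer(sequence)]


def gaps_in_region(gaps, region_start, region_end):
    """Keep the gaps that overlap the half-open region [region_start, region_end)."""
    return [(s, e) for s, e in gaps if s < region_end and e > region_start]


# Toy scaffold (hypothetical): three gaps of 4, 2 and 6 unknown bases.
SCAFFOLD = "ACGTNNNNACGTNNACGTACGTNNNNNNAC"

if __name__ == "__main__":
    all_gaps = find_gaps(SCAFFOLD)
    print(len(all_gaps), "gaps in scaffold")
    for start, end in gaps_in_region(all_gaps, 10, 24):
        print(f"gap at {start}-{end}, length {end - start}")
```

Primers for gap closure would then be designed on the known sequence flanking each reported interval.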

Supplementary text 1 associated table 1. Annotation of a new phosphonate BGC in S. sviceus

Gene | Locus tag | Predicted function | Length (amino acids) | Closest homolog | ID 1
1 | SSEG_06268 | LysR family transcriptional regulator | 306 | LysR family transcriptional regulator, Nostoc punctiforme PCC 73102 | 36%
2 | SSEG_09941 | ABC transporter ATPase subunit | 326 | ABC transporter, Catenulispora acidiphila DSM 44928 | 63%
3 | SSEG_09940 | ABC-type multidrug transporter permease | 261 | ABC transporter, Catenulispora acidiphila DSM 44928 | 63%
4 | SSEG_06265 | Phosphonate dehydrogenase | 372 | D-isomer specific 2-hydroxyacid dehydrogenase, Cyanothece sp. ATCC 51142 | 45%
5 | SSEG_09939 | Phosphopantetheinyl transferase | 223 | 4'-phosphopantetheinyl transferase, Nocardia asteroides | 39%
6 | SSEG_09938 | Phosphopantetheine-binding protein | 106 | Hypothetical protein, Actinokineospora enzanensis | 46%
7 | SSEG_06262 | Phosphonate-acyltransferase | 556 | Hypothetical protein, Salinispora pacifica | 50%
8 | SSEG_06261 | Manganese transporter MntH | 439 | Mn2+/Fe2+ transporter, NRAMP family, Micromonospora sp. L5 YP_004081943.1 | 52%
9 | SSEG_09937 | Phosphoenolpyruvate phosphomutase | 309 | Phosphoenolpyruvate phosphomutase, Saccharopolyspora spinosa | 61%
10 | SSEG_09936 | Rieske (2Fe-2S) iron-sulfur domain protein | 126 | Rieske (2Fe-2S) iron-sulfur domain-containing protein, Pseudonocardia dioxanivorans CB1190 | 46%
11 | SSEG_09935 | Metallo-dependent amidohydrolase | 361 | Hypothetical protein, Paenibacillus daejeonensis | 50%
12 | SSEG_09934 | Short chain dehydrogenase/reductase family | 284 | Alcohol dehydrogenase, Nocardiopsis halotolerans | 44%
13 | SSEG_09933 | Glutamate-1-semialdehyde aminotransferase | 468 | Glutamate-1-semialdehyde aminotransferase, Pseudomonas mendocina NK-01 | 36%
14 | SSEG_09932 | Aminolevulinate-coenzyme A ligase | 412 | 8-amino-7-oxononanoate synthase, Pontibacter sp. BAB1700 | 58%
15 | Within a gap | Putative alcohol dehydrogenase | 382 | Alcohol dehydrogenase, Streptomyces rimosus | 41%
16 | SSEG_10418 | Hydroxyethylphosphonate dioxygenase | 439 | 2-hydroxyethylphosphonate dioxygenase phpD, Streptomyces viridochromogenes | 46%
17 | SSEG_10417 | 3-phosphoglycerate dehydrogenase | 338 | D-3-phosphoglycerate dehydrogenase, Frankia alni ACN14a | 62%
18 | Within a gap | Phosphonopyruvate decarboxylase | 383 | Phosphonopyruvate decarboxylase, Nocardia brasiliensis ATCC 700358 | 63%
19 | Within a gap | Nicotinamide mononucleotide adenylyltransferase | 184 | Nicotinamide mononucleotide adenylyltransferase phpF, Streptomyces viridochromogenes | 74%
20 | SSEG_10416 | 2,3-bisphosphoglycerate-independent phosphoglycerate mutase | 427 | PhpG, Streptomyces viridochromogenes | 61%
21 | SSEG_10415 | Enolase | 421 | Carboxyphosphoenolpyruvate synthase, Streptomyces viridochromogenes | 50%
22 | SSEG_10414 | Carboxyphosphonoenolpyruvate mutase | 286 | Carboxyphosphonoenolpyruvate mutase, Streptomyces hygroscopicus | 80%
23 | SSEG_10413 | Aldehyde dehydrogenase | 462 | Hypothetical protein, Amycolatopsis nigrescens | 48%
24 | SSEG_08119 | Beta-lactamase domain-containing protein | 255 | Aldehyde dehydrogenase PhpJ, Streptomyces viridochromogenes | 69%

1 Percentage of amino acid sequence identity based on the BlastP alignment.

Supplementary text 1 associated methods.

Streptomyces sviceus BGC gap closure. The gaps and misassemblies found in the region between 8 and 24 kbp downstream and upstream of the PTT enolase (ZP_06914376.1) in the S. sviceus draft genome sequence, which was obtained from GenBank (accession NZ_ABJJ00000000), were closed by PCR amplification and product sequencing (Supplementary text 1 associated table 2). Gap 3 was too long for a single PCR and required 3 iterative rounds of sequencing and primer synthesis until it was closed.

Molecular modeling of the recruited enolases (PhpH). The molecular model of PhpH was constructed with Modeller (3), using as template the crystal structure of the dimeric yeast enolase in complex with magnesium, 2-phosphoglycerate (2-PGA) and phosphoenolpyruvate (PEP), obtained from the Protein Data Bank (PDB: 2ONE) (2). This enolase shares 33% identity with the carboxyphosphoenolpyruvate synthase (PhpH or PTT enolase) from S. viridochromogenes (1). A model of the product of the PTT enolase, carboxyphosphoenolpyruvate, was built with VegaZZ (4) and placed in a position analogous to that of PEP in the active site of the PTT enolase by superimposing the model and the template in PyMOL (The PyMOL Molecular Graphics System, Version 0.99, Schrödinger, LLC; http://www.pymol.org/).

Supplementary text 1 associated table 2. Primers used for gap closing

Fragment | Total sequenced bases | Forward primer | Reverse primer
1 | 148 | F-TGCCGCCCAGTTCGAGCAGA | R-ATCCGAACGCACACCGCTG
2 | 566 | F-CCAGCGTTCTGGCCAGGGCT | R-CACGATCGCGACCGACGACT
3 | 2726 | FA-AAGGCGCCCTGCTTGATGAA; FFB-GAAGTTGATGCGGAACGCCA; FC-GCTGATGGGTTTGTCGTCGC | RA-CAAACTCCAGGCCTTCTACG; RB-GCCGAGAACATCCTGCACGTG; RC-GGTGGCGTGATGGTCACAGC; RD-CGTGTGCACCACCGGCAAGTC
4 | 538 | F-ATTCCGGTTGTTGGCGTGCC | R-TAGTTGTTGATGCTCCACAC
5 | 484 | F-GTCGTCGAAGTCATGGGCGT | R-CATGGTCTTCGACACCCTGG
6 | 535 | F-GAGTGGTCGGCATGGGCCGG | R-GTGACCTCGTGATCCGGGAC
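Basic properties of the primers listed above can be checked with a short script. The sketch below parses the label-sequence notation used in the table and computes GC content and a rough melting temperature with the Wallace rule of thumb (2 °C per A/T, 4 °C per G/C). The Wallace rule is a generic approximation for short oligonucleotides and is used here only as an illustration; the source does not state how the primers were designed or evaluated, and the function names are hypothetical.

```python
def parse_primer(entry):
    """Split an entry such as 'F-TGCC...' into (label, sequence)."""
    label, sequence = entry.split("-", 1)
    return label, sequence.upper()


def gc_percent(sequence):
    """Percentage of G and C bases in the sequence."""
    gc = sum(1 for base in sequence if base in "GC")
    return 100.0 * gc / len(sequence)


def wallace_tm(sequence):
    """Rough melting temperature in degrees C: 2 per A/T plus 4 per G/C."""
    at = sum(1 for base in sequence if base in "AT")
    gc = sum(1 for base in sequence if base in "GC")
    return 2 * at + 4 * gc


def reverse_complement(sequence):
    """Reverse complement of an unambiguous DNA sequence."""
    return sequence.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Primer pair for fragment 1, copied from the table above.
FRAGMENT_1 = ["F-TGCCGCCCAGTTCGAGCAGA", "R-ATCCGAACGCACACCGCTG"]

if __name__ == "__main__":
    for entry in FRAGMENT_1:
        label, seq = parse_primer(entry)
        print(label, len(seq), round(gc_percent(seq), 1), wallace_tm(seq))
```

The high GC content of these primers is consistent with the GC-rich genomes typical of Streptomyces.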

Supplementary text 1 associated references:

1. Blodgett JA et al. (2007) Unusual transformations in the biosynthesis of the antibiotic phosphinothricin tripeptide. Nat Chem Biol 3:480–5.

2. Zhang E, Brewer JM, Minor W, Carreira LA, Lebioda L (1997) Mechanism of enolase: the crystal structure of asymmetric dimer enolase-2-phospho-D-glycerate/enolase-phosphoenolpyruvate at 2.0 Å resolution. Biochemistry 36:12526–34.

3. Sali A, Blundell TL (1993) Comparative protein modelling by satisfaction of spatial restraints. J Mol Biol 234:779–815.

4. Pedretti A, Villa L, Vistoli G (2004) VEGA--an open platform to develop chemo-bioinformatics applications, using plug-in architecture and script programming. J Comput Aided Mol Des 18:167–73.

Figure S1. A. Phylogenetic reconstruction of the actinobacterial enolases (Tree S2). Black branches include homologs associated with glycolysis, while green branches were linked to NP BGCs. A homolog from S. sviceus, highlighted in red, implicates the loci shown in B in phosphonate biosynthesis. B. The gene cluster (top) that encodes a novel biosynthetic pathway for a cryptic phosphonate NP, identified using EvoMining on the genome of S. sviceus. The gene cluster organization is compared with the PTT gene cluster of S. viridochromogenes. At the bottom, the common biosynthetic steps between the PTT and PSV pathways are shown.

Figure S2. Structure-function analysis of enolases and carboxyphosphoenolpyruvate synthases (CPS). A. Sequence alignment of enolases from various organisms and CPS. The amino acid numbers are relative to the yeast enolase. The catalytic residues are indicated at the top; central homologs are shown on a white background and recruited homologs in green, as in the phylogenetic reconstruction in Figure S1A. B. Comparison of the yeast enolase crystal structure bound to its product phosphoenolpyruvate (PEP) with a structural model of the CPS from S. viridochromogenes and its substrate carboxyphosphoenolpyruvate (CPEP). The conserved catalytic base K345, the mutation E211S in the catalytic acid and the mutation E168Q in the residue that holds the catalytic water molecule are indicated and shown as sticks. C. Reactions catalysed by the glycolytic enolases and the CPSs. The colour code is the same as in A and Figure S1A.

Figure S3. Distribution of EvoMining hits by BGC class as annotated by antiSMASH. The most abundant known classes of BGCs are NRPSs (23%) and PKSs (PKS1, PKS2, PKS3 and TransPKS; 18% in total). EvoMining predictions and EvoMining hits detected by ClusterFinder are altogether the most abundant class (30%) and may represent several classes of unprecedented BGCs.

Figure S4. A. HPLC analysis (Vydac C-18 column) of extracts of a leupA-deficient mutant of S. roseus ATCC 31245 in comparison with wild type S. roseus and a leupeptin authentic standard. Leupeptin can be detected (see Figure S5 for MS analyses) in wild type S. roseus, while the leupA mutant can no longer produce leupeptin. B. HPLC analysis (Restek C-18 column) of extracts from E. coli DH10B carrying the 8_10B and 9_18N clones with the leup locus in comparison with a leupeptin authentic standard. Both strains produced leupeptin (see Figure S5 for MS analyses).

Figure S5. A. MS analysis of peaks with retention times equivalent to the leupeptin standard (see Figure S4), confirming heterologous production of leupeptin using genomic clones containing the leup genes. B. MS2 analysis of extracts from genomic clones containing the leup genes. The fragmentation patterns of the m/z = 427.3 ion from the extracts are identical to those of the m/z = 427.3 ion from the leupeptin standard. Similar patterns were obtained from the extracts of wild type S. roseus ATCC 31245.
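As a sanity check on the m/z value quoted in the Figure S5 caption, the expected mass of protonated leupeptin can be computed from its elemental composition. The sketch below assumes the molecular formula C20H38N6O4 for leupeptin (N-acetyl-leucyl-leucyl-argininal) and a singly protonated ion; neither is stated in this supplement, and the function names are hypothetical.

```python
import re

# Monoisotopic masses (Da) of the most abundant isotopes.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON = 1.00727646  # mass of a proton in Da


def monoisotopic_mass(formula):
    """Monoisotopic mass of a simple formula such as 'C20H38N6O4'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


def mz_protonated(formula):
    """m/z of the singly protonated ion [M+H]+."""
    return monoisotopic_mass(formula) + PROTON


LEUPEPTIN = "C20H38N6O4"  # assumed formula, not given in the source

if __name__ == "__main__":
    print(round(mz_protonated(LEUPEPTIN), 4))
```

Under these assumptions the computed value rounds to 427.3, in agreement with the ion selected for fragmentation.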

Figure S6. Postulated pathway for arseno-organic NP biosynthesis in S. coelicolor and S. lividans. The reactions proposed for SLI_1096, SLI_1097 and SLI_1091 are responsible for the biosynthesis of the As-C bond at the early stages of the biosynthetic pathway. The biosynthetic logic proposed for SLI_1088-9 is related to the synthesis of an acyl chain that is proposed to be linked to the As-C containing intermediate by other enzymes in the BGC. At the left, structural predictions of potential products of the pathway, based on high resolution MS data, are shown. This pathway and further studies on the water-soluble As-species present in the samples (data not shown) suggest a non-methylated As-moiety, as shown in the last structure, which has not yet been described in the literature.

Figure S7. A selected EvoMining prediction. This BGC was predicted after identification of a recruited AroA homolog, and it was not identified by ClusterFinder or antiSMASH. Detailed annotation is available as Table S7.
