Greg Challis (Lecture 1)

Lecture 1: Methods for in silico analysis of cryptic natural product biosynthetic gene clusters

Microbial Genomics and Secondary Metabolites Summer School,

MedILS, Split, Croatia, 25-29 June 2007

Greg Challis

Department of Chemistry

Overview

• Introduction: cryptic (orphan) gene clusters in microbial genomes

• Clusters encoding nonribosomal peptide synthetases (NRPSs): domains, modules, substrate specificity, predicting products

• Clusters encoding modular polyketide synthases (PKSs): domains, modules, substrate specificity, predicting products

• Clusters encoding other biosynthetic systems: terpene synthases, iterative PKSs

Introduction

‘Cryptic’ (orphan) biosynthetic gene clusters

• Present in many of the 300 or so sequenced microbial genomes, e.g. Streptomyces avermitilis, Streptomyces coelicolor, Bacillus subtilis, Pseudomonas fluorescens, Pseudomonas syringae, Nostoc punctiforme, Aspergillus nidulans

• Polyketide synthases

• Nonribosomal peptide synthetases

• Terpene synthases

• May prove a valuable new source of bioactive metabolites

Genome sequence of the model antibiotic producer Streptomyces coelicolor M145

[Figure: structures of the prodiginines, actinorhodin, methylenomycin A and the calcium-dependent antibiotic]

Gene clusters directing complex metabolite biosynthesis in the S. coelicolor genome

Biosynthetic System                      | Metabolite                    | Size (kb) | Location
Type II iterative polyketide synthase    | Actinorhodin                  | 22        | 5071-5092
Type II iterative polyketide synthase    | Grey spore pigment            | 8         | 5314-5320
Fatty acid synthase; Butenolide synthase | Methylenomycin                | 20        | SCP1
NRPS; Type I Modular PKS; FAS            | Prodiginines                  | 33        | 5877-5898
Nonribosomal peptide synthetase          | Calcium-dependent antibiotics | 80        | 3210-3249
Nonribosomal peptide synthetase          | Siderophore?                  | 20        | 0489-0499
Nonribosomal peptide synthetase          | Siderophore?                  | 26        | 7681-7691
Nonribosomal peptide synthetase          | Unknown                       | 14        | 6429-6438
Type I Modular polyketide synthase       | Polyene?                      | 70        | 6273-6288
Type I Modular polyketide synthase       | Unknown                       | 10        | 6826-6827
Type I Iterative polyketide synthase     | Polyunsaturated fatty acid?   | 19        | 0124-0129
Type III iterative polyketide synthase   | Tetrahydroxynaphthalene?      | 1         | 1206-1208
Type III polyketide synthase             | Unknown                       | 3.5       | 7669-7671
Type III polyketide synthase             | Unknown                       | 1         | 7221
Sesquiterpene synthase                   | Geosmin?                      | 2         | 6073
Sesquiterpene synthase                   | Geosmin?                      | 2.5       | 5222-5223
Squalene-Hopene cyclase                  | Hopanoids?                    | 15        | 6759-6771
Phytoene synthase                        | Isorenieratene?               | 15        | 0185-0191
NRPS-independent siderophore             | Desferrioxamines?             | 7         | 2780-2785
NRPS-independent siderophore             | Siderophore?                  | 4         | 5799-5801
Type II FAS                              | Unknown                       | 10        | 1265-1273
Butanolide synthase                      | SCB1?                         | 1         | 6266-6267

Bentley et al. Nature (2002) 417 , 141-147

Part 1: Nonribosomal peptide synthetase analysis

Recap of NRPS organisation and function: the gramicidin S synthetase as an example

[Figure: the grsT, grsA and grsB genes. Synthetase 1 carries module 1 (A and E domains); synthetase 2 carries modules 2 to 5 (C and A domains in each) and ends in a TE domain.]

A = Adenylation

PCP = peptidyl carrier protein

C = Condensation

E = Epimerisation

TE = Thioesterase

Recap of NRPS organisation and function: the gramicidin S synthetase as an example

[Figure: peptide intermediates attached to the TE domain of the gramicidin S synthetase]

For further information see Lars Robbel’s poster

Nonribosomal peptide synthetases encoded by the S. coelicolor genome

Biosynthetic System             | Metabolite                    | Size (kb) | Location
NRPS; Type I Modular PKS; FAS   | Prodiginines                  | 33        | 5877-5898
Nonribosomal peptide synthetase | Calcium-dependent antibiotics | 80        | 3210-3249
Nonribosomal peptide synthetase | Siderophore?                  | 20        | 0489-0499
Nonribosomal peptide synthetase | Siderophore?                  | 26        | 7681-7691
Nonribosomal peptide synthetase | Unknown                       | 14        | 6429-6438

A new S. coelicolor NRPS gene cluster

[Figure: gene map of the cluster, with cchI, cchJ, cchH, cchB and cchA labelled]

Non-ribosomal peptide synthetase (cchH)

MbtH-like protein (cchK)

Flavin-dependent monooxygenase (cchB)

Formyl-tetrahydrofolate-dependent formyl transferase (cchA)

Esterase (cchJ)

Export functions

Ferric-siderophore import

Challis and Ravel FEMS Microbiol. Lett. (2000) 187, 111-114

Prediction of domain and module structure

Conserved Domain (CD) search

(http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi)

Deduced domain and module organization

Module 1: A-PCP-E

Module 2: C-A-PCP-E

Module 3: C-A-PCP

(Each PCP domain is drawn with its SH group on the slide.)
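Once a CD search has given the order of domains along the protein, grouping them into modules is mechanical. The following is a minimal sketch, assuming that every condensation (C) domain opens a new module and that the first module has no C domain. The function name `split_into_modules` is hypothetical, and the PCP positions in the example are inferred from the SH labels on the slide.

```python
def split_into_modules(domains):
    """Group an ordered list of NRPS domains into modules.

    Assumption: a new module starts at every condensation (C) domain,
    and the first (loading) module has no C domain.
    """
    modules = []
    for domain in domains:
        # Open a new module at a C domain, or at the very first domain.
        if domain == "C" or not modules:
            modules.append([])
        modules[-1].append(domain)
    return modules


# Domain order read off the slide; PCP positions inferred from the SH labels.
cryptic_nrps = ["A", "PCP", "E", "C", "A", "PCP", "E", "C", "A", "PCP"]

for number, module in enumerate(split_into_modules(cryptic_nrps), start=1):
    print(f"Module {number}: {'-'.join(module)}")
```

Real synthetases can depart from this rule (for example, starter C domains or modules split across proteins), so the output is a first guess to check by eye.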

Prediction of A-domain selectivity pocket residues

Alignment of the A-domain region that contains the selectivity pocket (pocket residues are highlighted on the slide):

GrsA     DASVWEMFMALLTGASLYIILKDTINDFVKFEQYINQKEITVITLPPTYVVHL-----DPERILSIQTLITAGSATSPSLVNKWKEK--VTYINAYGPTETTI

Ncs1-M1  DIAVWELLAAFVGGARLVIAEHRLRGVVPHLPELMTDHRVTVAHFVPSVLEELLGWMADGGRVG-LRLVVCGGEAVPPSQRDRLLALSGARMVHAYGPTETTI

Extracted selectivity pocket residues:

GrsA     D A W T I A A I

Ncs1-M1  D I W H V G A I

Stachelhaus, Mootz and Marahiel Chem. Biol. (1999) 6, 493-505

Challis, Ravel and Townsend Chem. Biol. (2000) 7, 211-224
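The extraction step on this slide can be sketched in a few lines: align the query A domain against GrsA, then read off the residues in the alignment columns where the GrsA pocket residues sit. The column indices below were worked out from the excerpt on the slide and apply only to it; the names `ALIGNMENT`, `POCKET_COLUMNS` and `pocket_code` are hypothetical.

```python
# Aligned A-domain excerpts as shown on the slide ("-" marks a gap).
ALIGNMENT = {
    "GrsA": (
        "DASVWEMFMALLTGASLYIILKDTINDFVKFEQYINQKEITVITLPPTYVVHL-----"
        "DPERILSIQTLITAGSATSPSLVNKWKEK--VTYINAYGPTETTI"
    ),
    "Ncs1-M1": (
        "DIAVWELLAAFVGGARLVIAEHRLRGVVPHLPELMTDHRVTVAHFVPSVLEELLGWMADGGRVG-"
        "LRLVVCGGEAVPPSQRDRLLALSGARMVHAYGPTETTI"
    ),
}

# 0-based alignment columns of the residues highlighted on the slide.
# Assumption: these indices are valid for this excerpt only.
POCKET_COLUMNS = (0, 1, 4, 43, 69, 71, 94, 102)


def pocket_code(aligned_sequence, columns=POCKET_COLUMNS):
    """Return the residues found at the pocket columns of an aligned sequence."""
    return "".join(aligned_sequence[column] for column in columns)


for name, sequence in ALIGNMENT.items():
    print(name, pocket_code(sequence))
# GrsA DAWTIAAI
# Ncs1-M1 DIWHVGAI
```

For a new sequence the alignment has to be recomputed first, since the column indices depend on where the gaps fall.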

Empirical correlation between specificity pocket residues and substrate

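[Figure: groups of A domains with related specificity pocket residues, each labelled by substrate: Ser; Orn; hTyr; Cys (ACV); HPG; Glu, Gln; Leu, Ile, Val; Glu (Fengycin); Leu (Eucarya); Threonine; Asp, Asn, Gln; Valine; Ala, Dab; Cysteine; Trp, Phe; Tyr; Val, Ala (Eucarya); Proline]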

Challis, Ravel and Townsend Chem. Biol. (2000) 7, 211-224
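The correlation is applied by comparing the pocket residues of a new A domain, position by position, with those of A domains whose substrates are known. The sketch below shows one simple way to score such a comparison; the function name `pocket_identity` is hypothetical, and plain identity counting is an assumption rather than the scoring used in the cited papers.

```python
def pocket_identity(code_a, code_b):
    """Fraction of positions at which two equal-length pocket codes agree."""
    if len(code_a) != len(code_b):
        raise ValueError("pocket codes must have the same length")
    if not code_a:
        raise ValueError("pocket codes must not be empty")
    matches = sum(a == b for a, b in zip(code_a, code_b))
    return matches / len(code_a)


# The two eight-residue codes shown on the previous slide.
grsa = "DAWTIAAI"
ncs1_m1 = "DIWHVGAI"

print(pocket_identity(grsa, ncs1_m1))  # 0.5
```

In practice the query would be scored against a table of codes with experimentally known substrates, and a low best score is a warning that the prediction is weak.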

Prediction of substrates and possible products for the S. coelicolor cryptic NRPS

[Figure: the three-module NRPS (domain order A, E, C, A, E, C, A) with the predicted substrate of each module and the possible products]

Challis and Ravel FEMS Microbiol. Lett. (2000) 187, 111-114

Part 2: Modular polyketide synthase analysis

Recap of modular PKS organisation and function: the erythromycin synthase as an example

• Three large modular enzymes (DEBS 1-3), encoded by eryAI, eryAII and eryAIII, assemble 6-DEB

• Each module performs one chain extension

[Figure: the TE domain cyclizes the completed chain to give 6-deoxyerythronolide B (6-DEB)]

Recap of modular PKS organisation and function: the erythromycin synthase as an example

[Figure: acyl groups on the AT and ACP domains, with loss of CO2 during chain extension]

A new S. coelicolor modular PKS cluster

Genes encoding a modular PKS

Prediction of domains and modules in CpkA

Conserved Domain (CD) search

(http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi)

Prediction of domains and modules in CpkB

Prediction of domains and modules in CpkC

Prediction of domains and modules in CpkABC

Pawlik, Kotowska, Chater, Kuczek and Takano Arch. Microbiol. (2007) 187, 87-99
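With domains assigned to modules, the reductive domains in each module suggest how far the β-keto group formed in that extension is processed. The sketch below encodes the usual rule (no KR: keto; KR: hydroxyl; KR and DH: double bond; KR, DH and ER: methylene). It assumes every annotated domain is active, which is not always the case, and the example modules are hypothetical rather than the CpkABC annotation.

```python
def beta_carbon_state(domains):
    """Predict beta-carbon processing from the reductive domains of one module.

    Assumption: every annotated KR, DH and ER domain is active. A DH or ER
    without the domain that acts before it is treated as having no effect.
    """
    present = set(domains)
    if "KR" not in present:
        return "keto"
    if "DH" not in present:
        return "hydroxyl"
    if "ER" not in present:
        return "double bond"
    return "methylene"


# Hypothetical module compositions, for illustration only.
example_modules = [
    ["KS", "AT", "ACP"],
    ["KS", "AT", "KR", "ACP"],
    ["KS", "AT", "DH", "KR", "ACP"],
    ["KS", "AT", "DH", "ER", "KR", "ACP"],
]

for number, module in enumerate(example_modules, start=1):
    print(f"Module {number}: {beta_carbon_state(module)}")
```

Inactive KR domains are a common exception, so each reductive domain is worth checking for intact active-site motifs before trusting the output.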

Prediction of AT domain substrate selectivity

Haydock et al. FEBS Lett. (1995) 374, 246-248

Banskota et al. J. Antibiot. (2006) 59, 168-176

Prediction of KR domain stereoselectivity

Caffrey ChemBioChem (2003) 4, 654-657

Reid et al. Biochemistry (2003) 42, 72-79

Prediction of substrates and possible products for the S. coelicolor cryptic PKS

[Figure: two alternative predicted products]

Non-linear enzymatic logic can complicate things!

Haynes and Challis, Curr. Op. Drug Discov. Develop. (2007) 10, 203-218

Non-linear enzymatic logic can complicate things!

[Figure: a modular PKS drawn as a loading module followed by six extension modules; one module is labelled "Module 5 + 6 + 7". Domain order as extracted:

Load: AT-ACP

KS-AT-KR-ACP

KS-AT-DH-KR-ACP

KS-AT-DH-KR-ACP

KS-AT-KR-ACP

KS-AT-DH-ER-KR-ACP

KS-AT-KR-ACP-TE

A later step in the pathway is labelled BorI, BorJ.]

Haynes and Challis, Curr. Op. Drug Discov. Develop. (2007) 10, 203-218
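The effect of non-linear logic on a prediction can be illustrated with simple counting. A colinear reading assumes each extension module acts once; if one module acts repeatedly, the chain is longer than the module count suggests. The sketch below uses illustrative numbers chosen to echo the "Module 5 + 6 + 7" label in the figure; the function names are hypothetical.

```python
def total_extensions(rounds_per_module):
    """Total chain extensions, given how many rounds each extension module performs."""
    if any(rounds < 0 for rounds in rounds_per_module):
        raise ValueError("a module cannot act a negative number of times")
    return sum(rounds_per_module)


def ketide_units(rounds_per_module):
    """Starter unit plus one unit per chain extension."""
    return 1 + total_extensions(rounds_per_module)


# Six extension modules read colinearly (each used once).
colinear = [1, 1, 1, 1, 1, 1]
# Same six modules, assuming the fifth acts three times (illustrative).
iterative = [1, 1, 1, 1, 3, 1]

print(ketide_units(colinear))   # 7
print(ketide_units(iterative))  # 9
```

A round count of 0 models a skipped module, another form of non-linear behaviour that the sequence alone does not reveal.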

Part 3: Analysis of other biosynthetic systems

Terpene synthases

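[Figure: prenyl diphosphate (OPP) substrates of increasing chain length and the synthases that act on them]

• C10: monoterpene synthase

• C15: sesquiterpene synthase

• C20: diterpene synthase

• C30: triterpene synthase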
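The carbon counts on this slide follow from the five-carbon isoprene unit, so the class of synthase expected for a given substrate size can be looked up directly. A minimal sketch, with hypothetical names and covering only the four classes listed on the slide:

```python
ISOPRENE_CARBONS = 5

# Classes as listed on the slide, keyed by substrate carbon count.
TERPENE_SYNTHASE_CLASSES = {
    10: "monoterpene synthase",
    15: "sesquiterpene synthase",
    20: "diterpene synthase",
    30: "triterpene synthase",
}


def synthase_class(carbons):
    """Return the synthase class for a terpene precursor of the given size."""
    if carbons <= 0 or carbons % ISOPRENE_CARBONS != 0:
        raise ValueError("carbon count must be a positive multiple of 5")
    return TERPENE_SYNTHASE_CLASSES.get(carbons, "not covered on the slide")


for carbons in (10, 15, 20, 30):
    units = carbons // ISOPRENE_CARBONS
    print(f"C{carbons} ({units} isoprene units): {synthase_class(carbons)}")
```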

Iterative polyketide synthases – type III PKSs

[Figure: two type III PKS examples. DpgA (Amycolatopsis orientalis) uses 4 x the same building block to give a CoA-linked (SCoA) product, shown alongside a chlorinated glycopeptide. RppA (Streptomyces griseus) uses 5 x the same building block to give a polyhydroxylated aromatic product that leads to melanin.]

Conclusions

• Reasonably confident in silico predictions of domain / module organisation and substrate specificity of modular PKS / NRPS can be made

• Non-linear enzymatic logic can complicate the reliable prediction of product structure(s)

• For other types of biosynthetic system, reasonably confident predictions of substrate specificity can sometimes be made

• Prediction of chain length and substrate specificity in some iterative PKS systems, especially type III and fungal type I, remains difficult
