Microbial genomics for the improvement of natural product discovery
Steven G Van Lanen and Ben Shen

The quest for the discovery of novel natural products has entered a new chapter with the enormous wealth of genetic data that is now available. This information has been exploited by using whole-genome sequence mining to uncover cryptic pathways, or biosynthetic pathways for previously undetected metabolites. Alternatively, using known paradigms for secondary metabolite biosynthesis, genetic information has been ‘fished out’ of DNA libraries resulting in the discovery of new natural products and isolation of gene clusters for known metabolites. Novel natural products have been discovered by expressing genetic data from uncultured organisms or difficult-to-manipulate strains in heterologous hosts. Furthermore, improvements in heterologous expression have not only helped to identify gene clusters but have also made it easier to manipulate these genes in order to generate new compounds. Finally, and perhaps the most crucial aspect of the efficient and prosperous use of the abundance of genetic information, novel enzyme chemistry continues to be discovered, which has aided our understanding of how natural products are biosynthesized de novo, and enabled us to rework the current paradigms for natural product biosynthesis.

Addresses
School of Pharmacy, University of Wisconsin-Madison, 777 Highland Avenue, Madison, WI 53705, USA

Corresponding author: Shen, Ben (bshen@pharmacy.wisc.edu)

Current Opinion in Microbiology 2006, 9:252–260
This review comes from a themed issue on Ecology and industrial microbiology
Edited by Arnold Demain and Lubbert Dijkhuizen
Available online 2nd May 2006
1369-5274/$ – see front matter © 2006 Elsevier Ltd. All rights reserved.
DOI 10.1016/j.mib.2006.04.002

Introduction
Natural products remain a consistent source of drug leads with more than 40% of new chemical entities reported since 1981 being derived from microbial natural products [1–3].
Perhaps more astonishing is that more than 60% of the anticancer and 70% of the anti-infective antibiotics currently in clinical use are natural products or natural product-based. Despite this impressive track record, most of the big pharmaceutical companies have recently de-emphasized, downsized or even abandoned their natural-product discovery efforts, partly because of the perception that the ‘tank’ of natural products has run dry and that finding new natural-product drug leads is not a profitable endeavor.

Recent progress in several aspects of natural-product research and microbial genomics, however, suggests that the potential of natural-product diversity and discovery is vastly underestimated, offering several promising alternatives to existing methods for the discovery of new natural products (Figure 1). First, the exponential growth in cloning and characterization of natural-product biosynthetic machinery in the past two decades has unveiled unprecedented molecular insights into natural-product biosynthesis, including the observation that genes for natural product biosynthesis are clustered in the microbial genome and that variations of a few common biosynthetic machineries can account for the vast structural diversity observed for natural products. These findings have fundamentally changed the landscape of natural-product research and discovery by enabling the revision of known natural-product structures, the prediction of yet-to-be isolated novel products on the basis of gene sequences, and the systematic generation of ‘unnatural’ natural products by manipulating the genes that govern natural-product biosynthesis (this process is also known as combinatorial biosynthesis).
Second, whole-genome sequencing has revealed that there are far more biosynthetic gene clusters than there are currently known metabolites for a given organism, suggesting that the biosynthetic potential for natural products in microorganisms has been greatly under-explored by traditional methods of natural-product discovery. Third, only 1% of the microbial community is estimated to have been cultivated in the lab, implying that there is a vast biodiversity of natural products in microorganisms that remains to be exploited. Newly emerging cultivation techniques, culture-independent methods that involve expressing gene clusters in model heterologous hosts, and chemoenzymatic bioconversion strategies have enabled access to these previously inaccessible natural-product resources. Finally, biochemical studies of natural-product biosynthetic enzymes have been extremely successful in the discovery of new enzyme pathways and unusual chemical conversions. To date, most known gene clusters include genes whose deduced products either have no homolog or are homologous only to database proteins of unknown function, which is indicative of proteins with novel functions and possibly of the presence of undescribed natural products. Continued discovery and characterization of these novel enzymes should ultimately enable realization of the full potential of microbial genomics-based natural-product discovery.

Glossary
Colinearity rule: Chain extensions during polyketide and polypeptide formation depend on the number, type and organization of modules such that the genetic architecture is directly reflected in the final product.
Genome sequence tags (GSTs): Short (700 base pair [bp]) random DNA fragments from a genomic library that can be used to generate probes to screen entire DNA libraries.
Synthon: DNA fragments of approximately 500 bp that are amplified by PCR and used to ultimately create a contiguous synthetic gene cluster.
Red/ET recombineering: Genetic engineering to reassemble a biosynthetic gene cluster based on recombination that is directed by the λ-phage-derived proteins Redα, Redβ and Redγ.

This review, therefore, highlights the interplay between microbial genomics and natural-product biosynthesis and illustrates the impact of this relationship on natural product discovery by using a few of the numerous examples that have been reported in the past couple of years (Figure 1). Readers are referred to several other recent reviews for more comprehensive coverage of this and related topics [4–8].

Whole-genome sequence mining
It is possible to estimate the biosynthetic potential of a given organism by mining the whole-genome sequence, because natural-product biosynthetic genes are present in clusters in microbial genomes; however, it is the characteristics of the various biosynthetic machineries that have enabled the prediction of the types and sometimes the exact structures of the final natural products. This is best exemplified by the polyketide and non-ribosomal peptide biosynthetic gene clusters, featuring the polyketide synthases and non-ribosomal peptide synthetases (NRPSs), which catalyze the biosynthesis of members of two of the largest families of natural products. Archetypical polyketide synthases are classified into type I, II or III enzymes according to their structural and catalytic-domain organization, and NRPSs have likewise been grouped on the basis of their catalytic architecture [9,10]. Typically, the nascent polyketide or peptide backbone is further modified by tailoring enzymes in post-polyketide synthase or post-NRPS steps of the pathway, most commonly oxygenases, oxidoreductases and glycosyltransferases or other transferases, in order to add further structural functionality to the final natural product.
The conserved features within this machinery have been cornerstones of the genomics-guided discovery of natural products, and discovery methods are continually being altered and optimized to reflect the current paradigms of natural-product biosynthesis [11,12,13].

Figure 1. Microbial genomics and natural product biosynthesis and their impact on natural product discovery. See Figures 2 and 3 for structures of individual natural products (1–18). Highlighted in red boxes are techniques that have been introduced or advanced to utilize the wealth of genomic data available. Likewise, compounds shown in red have been discovered or more closely examined by the respective methodologies. Related to microbial genomics is the deciphering of new enzyme pathways (blue box) that has enabled the reworking of current paradigms for natural product biosynthesis (green) in order to optimize the available techniques to continue the cycle for novel natural product and drug discovery.

The genomes of 294 microorganisms have been sequenced and annotated in the National Center for Biotechnology Information genome project, and a remarkable aspect of this wealth of information is that the number of genes that are expected to be involved in secondary metabolite production dramatically outnumbers the number of known secondary metabolites [14]. An extreme example of this is the cyanobacterium Nostoc punctiforme, which has 22 genes that encode probable polyketide synthases or NRPSs, although only one of these has been related to a secondary metabolite (nostopeptolide). The complete genomes of 13 Actinobacteria, unequivocally the largest group of antibiotic producers, also follow this trend.
A case in point is the genus Corynebacterium: each of the three sequenced genomes contains a modular type I polyketide synthase, but none of these has been correlated to any secondary metabolite. As a last example, a survey of 11 secondary metabolite producers (including Streptomyces coelicolor, which is the best characterized model strain for Streptomyces, and Streptomyces avermitilis, which is an indispensable industrial Streptomyces strain) unveiled a total of 118 NRPS and polyketide synthase clusters, only 14 of which had been assigned to known non-ribosomal peptides or polyketides at the time the genome sequences were reported (this represents less than 12% of the biosynthetic potential available to these organisms) [15].

The use of sequence data to search for novel metabolites has been exemplified by the discovery of the iron chelator coelichelin (1) from S. coelicolor M145 (Figure 2a) [16]. The gene cluster for 1 contains three NRPS modules and, from the colinearity rule (see Glossary) and the authors’ NRPS code, was predicted to produce a tripeptide built from L-δ-N-formyl-δ-N-hydroxyornithine, L-threonine and L-δ-N-hydroxyornithine. By accurately predicting the adenylation domain substrate specificity and understanding the role of hydroxamic acid as a divalent metal chelator, appropriate conditions for production were identified and a new tris-hydroxamate tetrapeptide iron chelator, 1, was produced. These results, along with others not mentioned here [17,18], unquestionably demonstrate the utility of genome mining for the identification of new secondary metabolites.

Genome scanning
As an alternative approach to whole-genome sequence mining, genome scanning provides an efficient way to discover natural-product biosynthetic gene clusters without having the complete genome sequence.
This approach takes advantage of the fact that the genes for natural-product biosynthesis form clusters in a microbial genome, which range in size from 20 to 200 kilobases (kb). By shotgun-sequencing a small number of random genome sequence tags (GSTs; see Glossary) from a library of genomic DNA, it is expected that any given gene cluster will be represented by multiple GSTs. GSTs derived from genes that are likely to be involved in the biosynthesis of natural products are identified and used as probes in order to localize entire biosynthetic gene clusters. The effectiveness of this method was elegantly demonstrated initially by isolating the dynemicin (2) and macromomycin biosynthetic gene clusters from strains known to produce these enediyne antitumor antibiotics, and subsequently by identifying 11 cryptic enediyne biosynthetic loci from 70 actinomycete strains that were previously not known as enediyne producers (Figure 2b). Armed with this genomic information, enediyne antibiotic production in the strains was verified by optimizing medium and fermentation conditions [19]. As a consequence of this type of genome scanning, the enediyne family was revealed to be more widely dispersed among actinomycetes than originally anticipated.

The success of the aforementioned genome-scanning approach using GSTs depended crucially on the recognition of the enediyne polyketide synthase, a novel iterative type I enzyme unique to the enediyne family of natural products [20,21]. Taking advantage of the high sequence conservation among the enediyne polyketide synthases, a PCR method for the rapid amplification of the minimal enediyne polyketide synthase genes was also developed and successfully applied to the cloning and localization of the esperamicin (3) and maduropeptin (4) biosynthetic gene clusters, providing an alternative method to directly access the enediyne biosynthetic machinery (Figure 2b) [22].
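The expectation stated above, that a cluster of 20–200 kb will be represented by multiple GSTs, can be illustrated with a rough back-of-the-envelope sketch. The model below is an assumption made for illustration and is not taken from the cited work: tags are treated as falling uniformly at random along the genome, tag length and cloning bias are ignored, and the probability of sampling a cluster is approximated with a Poisson distribution. The genome size (8,000 kb) and the number of tags (1,000) are hypothetical values chosen only to make the arithmetic concrete; the function names are likewise hypothetical.

```python
import math


def expected_gst_hits(n_tags: int, cluster_kb: float, genome_kb: float) -> float:
    """Expected number of random tags that originate inside one cluster.

    Assumes tags are drawn uniformly at random along the genome and
    ignores tag length and edge effects (illustrative simplification).
    """
    return n_tags * cluster_kb / genome_kb


def prob_cluster_sampled(n_tags: int, cluster_kb: float, genome_kb: float,
                         min_hits: int = 1) -> float:
    """Poisson approximation of P(at least `min_hits` tags hit the cluster)."""
    lam = expected_gst_hits(n_tags, cluster_kb, genome_kb)
    # Probability of observing fewer than `min_hits` tags in the cluster.
    p_fewer = sum(math.exp(-lam) * lam ** k / math.factorial(k)
                  for k in range(min_hits))
    return 1.0 - p_fewer


if __name__ == "__main__":
    GENOME_KB = 8000  # hypothetical genome size, not from the source
    N_TAGS = 1000     # hypothetical number of sequenced GSTs
    # 20 kb and 200 kb are the cluster-size bounds quoted in the text.
    for cluster_kb in (20, 200):
        hits = expected_gst_hits(N_TAGS, cluster_kb, GENOME_KB)
        p2 = prob_cluster_sampled(N_TAGS, cluster_kb, GENOME_KB, min_hits=2)
        print(f"{cluster_kb:>3} kb cluster: {hits:.1f} expected tags, "
              f"P(>=2 tags) = {p2:.3f}")
```

Under these assumptions the expected number of tags per cluster scales linearly with cluster size and with the number of tags sequenced, which is why larger clusters are more readily picked up by a modest sequencing effort.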
An approach for obtaining ‘perfect probes’ to identify all polyketide synthase and NRPS gene clusters in a genome, which in principle is analogous to the GST-based genome scanning method, was developed independently. The utility of this method was first verified in silico by scanning the Bacillus subtilis genome to localize the known polyketide synthase and NRPS clusters, and the method was subsequently used to identify the epothilone (5) gene cluster from Sorangium cellulosum along with several other NRPS and polyketide synthase gene clusters (Figure 2b) [23].

Figure 2. Natural products, the discovery and isolation of which have benefited from microbial genomics as exemplified by (a) whole genome mining, (b) genome scanning, (c) metagenomics and (d) heterologous expression.

Cultivation and metagenomics
Random sequencing of soil isolates suggests that less than 1% of microorganisms have been cultivated in the lab using the most common cultivation conditions [24,25]. Recent investigations have added support to the idea that this phenomenon results from the media, incubation times and inoculum size traditionally used, with the first two of these factors being the most important in the isolation and culture of rarely isolated soil bacteria [26]. Using specific enrichment techniques, primarily by varying the media, hundreds of different organisms from the family Streptosporangiaceae were isolated, and many were hypothesized to produce novel antibiotics on the basis of observations of other members of this family [27]. In a separate example, by changing the media in combination with using selective agents for motile microorganisms, two new antibiotic-producing actinomycete species were discovered, demonstrating the usefulness of a guided-culturing approach for the isolation of rare strains [28].
Finally, the culturing of rare organisms has been improved by the development of new techniques. This is illustrated by the use of microcapsules, derived from a single encapsulated cell, for high-throughput screening [29]. These culturing techniques will undoubtedly improve the access to natural products.

A valuable alternative to cultivating rare or slow-growing organisms is to extract community DNA and produce clone libraries in a cultivation-independent approach termed metagenomics [30]. Typically, large genomic DNA isolates are inserted into suitable carriers, such as bacterial artificial chromosomes (BACs) or cosmids, to be evaluated in fast-growing streptomycetes or other heterologous hosts. This methodology has been especially useful for the evaluation of both marine and terrestrial bacterial symbionts for their potential production of secondary metabolites. Putative biosynthetic genes for the pederin and bryostatin family of antitumor compounds were identified using a metagenomics approach. Pederin (6), isolated from extracts of Paederus species rove beetles, was determined to have originated from a microbial source that has the highest sequence homology to Pseudomonas aeruginosa [31,32]. The resulting polyketide synthase predicted for 6 biosynthesis was shown to be evolutionarily distinct from the archetypical polyketide synthase [33], and this acyltransferase (AT)-less polyketide synthase phylogenetic grouping has proved useful for the discovery of putative biosynthetic genes for the synthesis of onnamide A (7) from the marine sponge Theonella swinhoei [34]. Similarly, a putative biosynthetic gene coding for prebryostatin 1 (8), an antitumor compound isolated from the marine bryozoan Bugula neritina, was also identified using metagenomics and, similarly to 6, the bryostatin polyketide synthase belongs to the AT-less polyketide synthase family [33,35].
These are just a few illustrations of the utility of metagenomics in the search for natural products (Figure 2c), and they also highlight examples of an emerging family of polyketide synthases that use discrete, iterative ATs for polyketide biosynthesis [36–38]. It should be noted that whereas metagenomics has been effective in identifying novel biosynthetic gene clusters, experimental verification of the predicted products of these biosynthetic gene clusters remains a great challenge.

Heterologous expression
Given the difficulties associated with obtaining a functional genetic system or detectable production conditions for a particular secondary metabolite, efforts to heterologously express gene clusters have increased with undeniable success [39]. Escherichia coli, initially perceived as an unsuitable host for heterologous expression, has been engineered to produce 6-deoxyerythronolide B (9) at levels comparable to the model Streptomyces host, S. coelicolor [40]. To further establish E. coli as a suitable heterologous host, 9 was successfully produced by introducing a synthetic 31.7 kb DNA polyketide gene cluster that was prepared using a ‘synthon’ approach (see Glossary) [41]. A chemoenzymatic approach was used to synthesize epothilone using E. coli as a host, by introducing the genes responsible for the second half of the biosynthetic pathway using a three-plasmid system and then by feeding the bacteria with a chemically synthesized substrate as an N-acetylcysteamine (SNAC) thioester (Figure 2d) [42]. E. coli has also been used to confirm the locus for the biosynthetic gene cluster of patellamide A and patellamide C, two cyclic peptides that are produced by the symbiont Prochloron didemni [43,44].
This investigation, similar to that of 6 and 8, represents a cultivation-independent approach to identify natural products, but also provides the first conclusive evidence of a gene cluster isolated from an obligate symbiont, obtained by expressing the marine natural-product pathway in E. coli [44].

Other hosts besides E. coli that are easy to manipulate genetically are amenable to heterologous expression of various natural products. The biosynthetic pathway for rebeccamycin (10) was dissected and reconstituted in the strain Streptomyces albus, which provided an environment capable of supplying precursors without the need for further genetic manipulations [45]. Furthermore, a variety of indolocarbazoles [such as the congener staurosporine (11)] were generated by swapping and/or inserting genes from different loci; this illustrates the utility of heterologous expression in combinatorial biosynthesis. Intriguingly, whereas direct introduction of the fredericamycin (FDM) biosynthetic gene cluster from Streptomyces griseus into Streptomyces lividans resulted in little FDM production unless fdmR, which encodes a pathway-specific activator, was overexpressed under the control of a strong constitutive promoter, introduction of the FDM cluster alone into S. albus yielded FDM in amounts comparable to that from the S. griseus wild-type strain [46]. This result not only duplicated the success of S. albus as a heterologous host, but also underscored how little is currently known about global and pathway-specific regulation of natural-product biosynthetic gene cluster expression in heterologous hosts.

Daptomycin (12; Cubicin®), which has been recently approved for the treatment of Gram-positive infections of skin and skin structure, represents one of only two novel antibiotics to reach the market in 30 years.
The daptomycin gene cluster has been cloned from Streptomyces roseosporus and characterized, and a BAC clone containing the entire 12-gene cluster on a 128 kb DNA fragment was successfully introduced into S. lividans for heterologous expression, resulting in production of 18 mg l−1 of 12 (Figure 2d) [47]. Although the yield is still very poor in comparison with the wild-type S. roseosporus strain under the optimized fermentation conditions, to our knowledge this represents the largest gene cluster that has ever been successfully expressed in a heterologous host.

Another method combines the ease of genetic manipulation in E. coli with the selection of a separate, suitable host for natural-product production. The biosynthetic gene cluster for myxochromide S1 (13), a hybrid polyketide synthase-NRPS system, was rebuilt and engineered in E. coli to contain the entire locus and to include a toluic-acid inducible promoter and a homologous recombination region (Figure 2d). The assembly was performed using Red/ET recombineering (see Glossary) in E. coli; the final DNA was transferred into Pseudomonas putida and integrated into the chromosome [48]. The recombinant P. putida strain produced the desired product, 13, remarkably at a fivefold greater level and in significantly less incubation time than the wild type.

The above examples not only demonstrate the utility of heterologous hosts but also set the stage for probing the functional roles of individual open reading frames or domains, and the strategy will be useful for combinatorial biosynthesis to create unnatural natural products with desired properties. They also illustrate the many challenges to which we have yet to find solutions.
These issues have to be addressed before the production of natural products by expressing their biosynthetic gene clusters in model heterologous hosts becomes a realistic alternative.

Discovering novel chemistry
The discovery of novel natural products using genomics and the improvement of industrial applications using bioinformatics depend on understanding the biochemistry of secondary metabolite production. The polyketide synthase responsible for 9 synthesis has historically been the archetype for the structure and function of polyketide synthases [9], and this modular type I system continues to be the most-studied polyketide synthase [49,50]. However, increasing numbers of polyketide synthases and NRPSs are being discovered that do not fit into the existing paradigms; this has been reviewed elsewhere [37,38]. In addition to polyketide synthases and NRPSs, the analysis of other recombinant enzymes, such as tailoring enzymes, has recently resulted in several discoveries that were unexpected from sequence predictions.

Coronatine (14, Figure 3) and pyoluteorin (15), two hybrid polyketide-peptides found in different Pseudomonas species, have recently been shown to be biosynthesized with unexpected halogenation events (Figure 3a). During 14 biosynthesis, CmaB, a non-heme Fe2+, α-ketoglutarate-dependent enzyme, was shown to carry out γ-chlorination of a thioester-linked substrate, L-allo-isoleucine, followed by cyclopropyl ring formation by a previously unknown protein, CmaC [51]. With respect to 15 biosynthesis, PltA (a predicted FADH2-dependent halogenase) was found not only to regiospecifically insert chlorine at position 5 of thioester-linked proline but also to further modify the monohalogenated proline intermediate by chlorination at position 4 [52].
The first transformation, using cryptic halogenation, represents a unique strategy for biochemical conversions, and the enzymatic steps of CmaB and PltA were impossible to predict from DNA sequence alone.

The difficulties associated with making functional predictions solely on the basis of DNA sequence have also been prominent in the development of a variety of Streptomyces-derived natural products (Figure 3b). In the biosynthesis of jadomycin (16), a polyketide isolated from S. venezuelae ISP 5230, the enzyme JadH was functionally assigned to have both oxygenase and dehydrase activities, although only the former function had been predicted before biochemical analysis [53]. For nystatin (17) production by S. noursei, nysF was originally proposed to encode a 4′-phosphopantetheinyl transferase, but unexpectedly was shown to be a negative regulator of 17 biosynthesis, and therefore deletion of the gene did not abolish production but actually improved yields by 60% [54]. Finally, although OtcC was correctly predicted to be an oxygenase involved in oxytetracycline (18) biosynthesis by S. rimosus, the deletion of otcC resulted in a novel 17-carbon polyketide (19) instead of the native 19-carbon polyketide; the deletion therefore caused not only the expected loss of oxygenase activity but also the production of an incorrect chain length [55].

Conclusions and perspectives
Discovery of novel antibiotics and clinically useful natural products has been in modest decline over the past few decades, despite estimates that more than one million cultures were screened by pharmaceutical companies during this period [2]. Increased antibiotic resistance in combination with the lengthy process of marketing new drugs only compounds this decline [56]. Consequently, new approaches to drug discovery need to be developed and applied.
Genome sequencing of actinomycetes and other microorganisms has revealed a number of biosynthetic enzymes that cannot be related to known metabolites, suggesting that the occurrence of natural products has been underestimated by using classical techniques such as screening fermentation broths or target-based high-throughput screening [2,3]. Therefore, ways to take advantage of the wealth of genomic information have been brought to the forefront, including directed cultivation methods, using heterologous hosts to express gene clusters from metagenomes or organisms with silent phenotypes, or using different high-throughput techniques that are not discussed here, such as mass spectrometry and near-infrared spectroscopy [57–59].

Figure 3. Novel chemistry, the discovery of which has been influenced by microbial genomics as exemplified by (a) the halogenation steps during coronatine (14) and pyoluteorin (15) biosynthesis and (b) unusual enzymatic steps during other natural product biosynthesis in Streptomyces. Abbreviation: PCP, peptidyl carrier protein.

The vast amount of DNA sequence in the public database represents only the beginning of the new genomics era. Recent advances in DNA sequencing have led to the statement that we are on the verge of having “a genome sequencing center in every lab”, implying that obtaining genomic data will no longer be the bottleneck for natural-product science [60,61]. Furthermore, when genome sequencing of the ‘first tier’ organisms is completed, whole genome sequencing will almost certainly expand and diversify to include other relevant natural-product producers and other rarer organisms. Moreover, if the process of genome scanning and mining guides the discovery of a clinically useful compound in a similar fashion to that shown with coelichelin, whole genome sequencing will only gain momentum.
It is apparent from the above examples that, although the field is advancing rapidly, knowledge of a DNA sequence cannot simply be translated into enzyme function and, ultimately, the structure of a natural product. Nevertheless, as has been hinted at throughout this review, ingenuity, dedication and, if necessary, brute force can overcome any shortcomings encountered during the process of drug discovery, and a sense of optimism should prevail as long as we remain cautious of sequence annotations until biochemical assignments are made. A continued and concerted progression in genomics, microbiology and biochemistry will be invaluable in the next generation of drug discovery, whether this occurs by the identification of new or rare organisms, by cultivation or metagenomics techniques, or through rational engineering of the biosynthetic machinery.

Acknowledgements
Current studies on natural product biosynthesis described from the Shen laboratory were supported in part by National Institutes of Health (NIH) grants CA94426, CA78747, CA106150 and CA113297. SVL is the recipient of NIH postdoctoral fellowship CA1059845, and BS is the recipient of an NIH Independent Scientist Award AI51689.

References and recommended reading
Papers of particular interest, published within the annual period of review, have been highlighted as: of special interest; of outstanding interest.

1. Newman DJ, Cragg GM, Snader KM: Natural products as sources of new drugs over the period 1981–2002. J Nat Prod 2003, 66:1022-1037.
2. Baltz RH: Antibiotic discovery from actinomycetes: will a renaissance follow the decline and fall? SIM News 2005, 55:186-196.
3. Koehn FE, Carter GT: The evolving role of natural products in drug discovery. Nat Rev Drug Discov 2005, 4:206-220.
4. Weist S, Sussmuth R: Mutational biosynthesis: a tool for the generation of structural diversity in the biosynthesis of antibiotics. Appl Microbiol Biotechnol 2005, 68:141-150.
5. Bode HB, Müller R: The impact of bacterial genomics on natural product research. Angew Chem Int Ed 2005, 44:6828-6846.
6. Weissman KJ, Leadlay PF: Combinatorial biosynthesis of reduced polyketides. Nat Rev Microbiol 2005, 3:925-936.
7. Keller NP, Turner G, Bennett JW: Fungal secondary metabolism: from biochemistry to genomics. Nat Rev Microbiol 2005, 3:937-947.
8. Pelzer S, Vente A, Bechthold A: Novel natural compounds obtained by genome-based screening and genetic engineering. Curr Opin Drug Discov Devel 2005, 8:228-238.
9. Staunton J, Weissman KJ: Polyketide biosynthesis: a millennium review. Nat Prod Rep 2001, 18:380-416.
10. Finking R, Marahiel MA: Biosynthesis of nonribosomal peptides. Annu Rev Microbiol 2004, 58:453-488.
11. Walsh CT: Polyketide and nonribosomal peptide antibiotics: modularity and versatility. Science 2004, 303:1805-1810.
12. Ostash BE, Ogonyan SV, Luzhetskyy AN, Bechthold A, Fedorenko VA: The use of PCR for detecting genes that encode type I polyketide synthases in genomes of actinomycetes. Russ J Genet 2005, 41:473-478.
13. Ayuso A, Clark D, González I, Salazar O, Anderson A, Genilloud O: A novel actinomycete strain de-replication approach based on the diversity of polyketide synthase and nonribosomal peptide synthetase biosynthetic pathways. Appl Microbiol Biotechnol 2005, 67:795-806.
A fingerprinting approach for polyketide synthase and NRPS was developed using PCR and restriction digest analysis of the amplified fragments.
14. Jenke-Kodama H, Sandmann A, Müller R, Dittmann E: Evolutionary implications of bacterial polyketide synthases. Mol Biol Evol 2005, 22:2027-2039.
15. Ikeda H, Ishikawa J, Hanamoto A, Shinose M, Kikuchi H, Shiba T, Sakaki Y, Hattori M, Omura S: Complete genome sequence and comparative analysis of the industrial microorganism Streptomyces avermitilis. Nat Biotechnol 2003, 21:526-531.
16. Lautru S, Deeth RJ, Bailey LM, Challis GL: Discovery of a new peptide natural product by Streptomyces coelicolor genome mining. Nat Chem Biol 2005, 1:265-269.
A new iron-chelating compound, coelichelin, was isolated from S. coelicolor by mining the entire genome sequence of the strain.
17. Silakowski B, Kunze B, Nordsiek G, Blöcker H, Höfle G, Müller R: The myxochelin iron transport regulon of the myxobacterium Stigmatella aurantiaca Sg a15. Eur J Biochem 2000, 267:6476-6485.
18. May JJ, Wendrich TM, Marahiel MA: The dhb operon of Bacillus subtilis encodes the biosynthetic template for the catecholic siderophore 2,3-dihydroxybenzoate-glycine-threonine trimeric ester bacillibactin. J Biol Chem 2001, 276:7209-7217.
19. Zazopoulos E, Huang K, Staffa A, Liu W, Bachmann BO, Nonaka K, Ahlert J, Thorson JS, Shen B, Farnet CM: A genomics-guided approach for discovering and expressing cryptic metabolic pathways. Nat Biotechnol 2003, 21:187-190.
20. Liu W, Christenson SD, Standage S, Shen B: Biosynthesis of the enediyne antitumor antibiotic C-1027. Science 2002, 297:1170-1173.
21. Ahlert J, Shepard E, Lomovskaya N, Zazopoulos E, Staffa A, Bachmann BO, Huang K, Fonstein L, Czisny A, Whitwam RE et al.: The calicheamicin gene cluster and its iterative type I PKS. Science 2002, 297:1173-1176.
22. Liu W, Ahlert J, Gao Q, Wendt-Pienkowski E, Shen B, Thorson JS: Rapid PCR amplification of minimal enediyne polyketide synthase cassettes leads to a predictive familial classification model. Proc Natl Acad Sci USA 2003, 100:11959-11963.
23. Santi DV, Siani MA, Julien B, Kupfer D, Roe B: An approach for obtaining perfect hybridization probes for unknown polyketide synthase genes: a search for the epothilone gene cluster. Gene 2000, 247:97-102.
24. Hugenholtz P, Goebel BM, Pace NR: Impact of culture-independent studies on the emerging phylogenetic view of bacterial diversity. J Bacteriol 1998, 180:4765-4774.
25. Joseph SJ, Hugenholtz P, Sangwan P, Osborne CA, Janssen PH: Laboratory cultivation of widespread and previously uncultured soil bacteria. Appl Environ Microbiol 2003, 69:7210-7215.
26. Davis KER, Joseph SJ, Janssen PH: Effects of growth medium, inoculum size, and incubation time on culturability and isolation of soil bacteria. Appl Environ Microbiol 2005, 71:826-834.
27. Lazzarini A, Cavaletti L, Toppo G, Marinelli F: Rare genera of actinomycetes as potential producers of new antibiotics. Antonie Van Leeuwenhoek 2001, 79:399-405.
28. Otoguro M, Hayakawa M, Yamazaki T, Iimura Y: An integrated method for the enrichment and selective isolation of Actinokineospora spp. in soil and plant litter. J Appl Microbiol 2001, 91:118-130.
29. Zengler K, Walcher M, Clark G, Haller I, Toledo G, Holland T, Mathur EJ, Woodnutt G, Short J, Keller M: High-throughput cultivation of microorganisms using microcapsules. Methods Enzymol 2005, 397:124-130.
30. Schloss PD, Handelsman J: Metagenomics for studying unculturable microorganisms: cutting the Gordian knot. Genome Biol 2005, 6:229.
31. Piel J: A polyketide synthase-peptide synthetase gene cluster from an uncultured bacterial symbiont of Paederus beetles. Proc Natl Acad Sci USA 2002, 99:14002-14007.
32. Piel J, Wen G, Platzer M, Hui D: Unprecedented diversity of catalytic domains in the first four modules of the putative pederin polyketide synthase. ChemBioChem 2004, 5:93-98.
33. Piel J, Hui D, Fusetani N, Matsunaga S: Targeting modular polyketide synthases with iteratively acting acyltransferases from metagenomes of uncultured bacterial consortia. Environ Microbiol 2004, 6:921-927.
A method was developed to target from metagenomic samples the family of polyketide synthases that use trans-acting ATs. By using this approach, two pederin-type polyketide synthase systems were identified.
34.
Piel J, Hui D, Wen G, Butzke D, Platzer M, Fusetani N, Matsunaga S: Antitumor polyketide biosynthesis by an uncultivated bacterial symbiont of the marine sponge Theonella swinhoei. Proc Natl Acad Sci USA 2004, 101:16222-16227. 35. Hildebrand M, Waggoner LE, Liu H, Sudek S, Allen S, Anderson C, Sherman DH, Haygood M: BryA: an unusual modular polyketide synthase gene from the uncultivated bacterial symbiont of the marine bryozoan Bugula neritina. Chem Biol 2004, 11:1543-1552. 36. Cheng Y, Tang G, Shen B: Type I polyketide synthase requiring a discrete acyltransferase for polyketide biosynthesis. Proc Natl Acad Sci USA 2003, 100:3149-3154. 37. Shen B: Polyketide biosynthesis beyond the type I, II, and III polyketide synthase paradigms. Curr Opin Chem Biol 2003, 7:285-295. 38. Wenzel SC, Müller R: Formation of novel secondary metabolites by bacterial multimodular assembly lines: deviations from textbook biosynthetic logic. Curr Opin Chem Biol 2005, 9:447-458. 39. Wenzel SC, Müller R: Recent developments towards the heterologous expression of complex bacterial natural product biosynthetic pathways. Curr Opin Biotechnol 2005, 16:594-606. 40. Pfeifer B, Hu Z, Licari P, Khosla C: Process and metabolic strategies for improved production of escherichia coli-derived 6-deoxyerythronolide B. Appl Environ Microbiol 2002, 68:3287-3292. 41. Kodumal SJ, Patel KG, Reid R, Menzella HG, Welch M, Santi DV: Total synthesis of long DNA sequences: synthesis of a contiguous 32-kb polyketide synthase gene cluster. Proc Natl Acad Sci USA 2004, 101:15573-15578. 42. Boddy CN, Hotta K, Tse ML, Watts RE, Khosla C: Precursor-directed biosynthesis of epothilone in Escherichia coli. J Am Chem Soc 2004, 126:7436-7437. 43. Long PF, Dunlap WC, Battershill CN, Jaspars M: Shotgun cloning and heterologous expression of the patellamide gene cluster as a strategy to achieving sustained metabolite production. Chem Bio Chem 2005, 6:1760-1765. 44. 
Schmidt EW, Nelson JT, Rasko DA, Sudek S, Eisen JA, Haygood MG, Ravel J: Patellamide A and C biosynthesis by a microcin-like pathway in prochloron didemni, the cyanobacterial symbiont of Lissoclinum patella. Proc Natl Acad Sci USA 2005, 102:7315-7320. The entire patellamide A and C gene cluster from an uncultured symbiont was heterologously expressed in E. coli to confirm its function. 45. Sánchez C, Zhu L, Braña AF, Salas AP, Rohr J, Méndez C, Salas JA: Combinatorial biosynthesis of antitumor indolocarbazole compounds. Proc Natl Acad Sci USA 2005, 102:461-466. A systematic approach was used to analyze the biosynthetic steps of rebeccamycin by engineering a variety of gene cassettes and introducing them into a heterologous host, S. albus. Furthermore, the process of combinatorial engineering was applied to produce more than 30 new compounds. 46. Wendt-Pienkowski E, Huang Y, Zhang J, Li B, Jiang H, Kwon H, Hutchinson CR, Shen B: Cloning, sequence, analysis, and heterologous expression of the fredericamycin biosynthetic gene cluster from Streptomyces griseus. J Am Chem Soc 2005, 127:16442-16452. 47. Miao V, Coëffet-LeGal M, Brian P, Brost R, Penn J, Whiting A, Martin S, Ford R, Parr I, Bouchard M et al.: Daptomycin Current Opinion in Microbiology 2006, 9:252–260 biosynthesis in Streptomyces roseosporus: cloning and analysis of the gene cluster and revision of peptide stereochemistry. Microbiol 2005, 151:1507-1523. The daptomycin gene cluster, spanning nearly 130 kb, was introduced into the heterologous host S. lividans to confirm the identity of the gene cluster. 48. Wenzel SC, Gross F, Zhang Y, Fu J, Stewart F, Müller R: Heterologous expression of a myxobacterial natural products assembly line in pseudomonads via Red/ET recombineering. Chem Biol 2005, 12:349-356. A myxobacterium natural product, myxochromide S1, was produced in P. putida by piecing together the entire cluster in E. coli by using genetic recombination. 49. 
Wu J, Kinoshita K, Khosla C, Cane DE: Biochemical analysis of the substrate specificity of the b-ketoacyl-acyl carrier protein synthase domain of module 2 of the erythromycin polyketide synthase. Biochemistry 2004, 43:16301-16310. 50. Kim C, Alekseyev VY, Chen AY, Tang Y, Cane DE, Khosla C: Reconstituting modular activity from separated domains of 6-deoxyerythronolide B synthase. Biochemistry 2004, 43:13892-13898. 51. Vaillancourt FH, Yeh E, Vosburg DA, O’Connor SE, Walsh CT: Cryptic chlorination by a non-haem iron enzyme during cyclopropyl amino acid biosynthesis. Nature 2005, 436:1191-1194. The cyclopropyl functionality of coronatine was shown to originate by the tandem action of two enzymes: a halogenase CmaB that introduces chlorine, and a previously unknown enzyme CmaC that forms the cyclopropyl ring with concomitant release of the chlorine. 52. Dorrestein PC, Yeh E, Garneau-Tsodikova S, Kelleher NL, Walsh CT: Dichloronation of a pyrrolyl-S-carrier protein by FADH2-dependent halogenase PltA during pyoluteorin biosynthesis. Proc Natl Acad Sci USA 2005, 102:13843-13848. 53. Chen Y, Wang C, Greenwell L, Rix U, Hoffmeister D, Vining LC, Rohr J, Yang K: Functional analyses of oxygenases in jadomycin biosynthesis and identification of JadH as a bifunctional oxygenase/dehydrase. J Biol Chem 2005, 280:22508-22514. 54. Volokhan O, Sletta H, Sekurova ON, Ellingsen TE, Zotchev SB: An unexpected role for the putative 40 -phosphopantetheinyl transferase-encoding gene nysF in the regulation of nystatin biosynthesis in streptomyces noursei ATCC 11455. FEMS Microbiol Lett 2005, 249:57-64. 55. Peric-Concha N, Borovicka B, Long PF, Hranueli D, Waterman PG, Hunter IS: Ablation of the otcC gene encoding a post-polyketide hydroxylase from the oxytetracyline biosynthetic pathway in streptomyces rimosus results in novel polyketides with altered chain length. J Biol Chem 2005, 280:37455-37460. 56. 
Thomson CJ, Power E, Ruebsamen-Waigmann H, Labischinski H: Antibacterial research and development in the 21st Centuryan industry perspective of the challenges. Curr Opin Microbiol 2004, 7:445-450. 57. Geoghegan KF, Kelly MA: Biochemical applications of mass spectrometry in pharmaceutical drug discovery. Mass Spectrom Rev 2005, 24:347-366. 58. McLoughlin SM, Mazur MT, Miller LM, Yin J, Liu F, Walsh CT, Kelleher NL: Chemoenzymatic approaches for streamlined detection of active site modifications on thiotemplate assembly lines using mass spectrometry. Biochemistry 2005, 44:14159-14169. 59. Peric-Concha N, Long PF: Mining the microbial metabolome: a new frontier for natural product lead discovery. Drug Discov Today 2003, 8:1078-1084. 60. Zwick ME: A genome sequencing center in every lab. Eur J Hum Genet 2005, 13:1167-1168. 61. Marguiles M, Egholm M, Altman WE, Attiya S, Bader JS, Bemben LA, Berka J, Braverman MS, Chen Y, Chen Z et al.: Genome sequencing in microfabricated high-density picolitre reactors. Nature 2005, 437:376-380. www.sciencedirect.com