EU MARIE CURIE: LYNGBYA KENYA - PROJECT PIIF-GA-2011-299550

CYP450 BIOSYNTHESIS OF LYNGBYA MAJUSCULA NATURAL PRODUCTS

FINAL REPORT

SCIENTIFIC COORDINATOR
Professor J GRANT BURGESS
School of Marine Science and Technology
Newcastle University
Armstrong Building, Queen Victoria Road
NE1 7RU
United Kingdom
Tel. +44 (0) 191 2226717
Fax. +44 (0) 191 2225491
Email. grant.burgess@newcastle.ac.uk

THE FELLOW
Dr THOMAS M DZEHA
Department of Chemistry and Biochemistry
Pwani University
P.O. Box 195-80108
Kilifi
Kenya
Tel. +254 (0) 788986063
Email. thomas.dzeha@gmail.com

EXECUTIVE SUMMARY

A sustainable supply of marine natural products for potential therapeutics is one of the greatest challenges facing drug discovery efforts today, especially during clinical trials. Nearly 300 compounds with therapeutic potential have been isolated from the tropical marine cyanobacterium Lyngbya majuscula. However, there are considerable concerns regarding the real source of the large number of natural products attributed to L. majuscula. This project focused on the cytochrome P450 biosynthesis of L. majuscula natural products, namely the modular cyclodepsipeptides homodolastatin 16 (HMDS 16) and antanapeptin A (ANTAP A). L. majuscula was collected from Shimoni, Kenya in April 2012. The aim was to establish whether these compounds originate from the cyanobacterium, from the bacteria cohabiting with it, or from both. Bacteria representing γ-proteobacteria, Firmicutes and α-proteobacteria were isolated from the cyanobacterium. Non-ribosomal peptide synthetase (NRPS) and polyketide synthase (PKS) screens, using bioinformatics tools, of those bacteria with known complete genome sequences showed that their modular assembly lines are inconsistent with those of HMDS 16 and ANTAP A.
However, Bacillus licheniformis and Marinobacterium stanieri putatively synthesise the β-amino acid dolamethleuline, and Klebsiella oxytoca is putatively involved in the biosynthesis of the unusual amino acid dolaphenvaline in HMDS 16. Profiling for the cyclodepsipeptides in bacterial supernatants using liquid chromatography mass spectrometry (LCMS) confirmed the presence of HMDS 16, its analogue dolastatin 16, and ANTAP A in L. majuscula but not in the bacteria. These leads improved the prospects for obtaining the complete genome sequence of L. majuscula from its metagenome and for identifying the gene clusters encoding HMDS 16 and ANTAP A. The cloning and heterologous expression of the gene clusters for these especially important anticancer agents is the goal that we aimed to attain by the end of the program.

TABLE OF CONTENTS

Summary
Table of contents
Key words and abbreviations
A SCIENTIFIC REPORT
A1 Project context and objectives
A2 Modular assembly of homodolastatin 16
A3 Investigations of growth conditions for the Kenyan Lyngbya majuscula
A 3.1 Growth of L. majuscula in autoclaved seawater
A 3.2 Growth of Lyngbya in autoclaved seawater and antibiotics
A 3.3 Treatment of Lyngbya majuscula with antibiotics prior to culturing
A 3.4 Growth of Lyngbya in autoclaved seawater, BG11 and KNO3 (15 g L-1)
A 3.5 Growth of Lyngbya in autoclaved seawater, BG11, KNO3 and antibiotics
A 3.6 Summary of the results of bacteria isolates on Lyngbya majuscula
A4 Isolation of Lyngbya majuscula filaments
A 4.1 The ecology of the Kenyan L. majuscula
A 4.2 Do dead L. majuscula filaments contain homodolastatin 16?
A5 Bioinformatics strategies for identifying homodolastatin 16 gene clusters
A6 Significant outcomes of the scientific project
A 6.1 Isolation of culturable bacteria co-habiting with L. majuscula
A 6.2 Phylogeny of bacterial isolates and related taxon
A 6.3 16S rDNA isolation and identification of L. majuscula and A. colombiense
A 6.4 Antibiotic resistant bacteria associated with L. majuscula filament
A 6.5 LCMS profiling of homodolastatin 16 in L. majuscula and epibiotic bacteria
A 6.6 Bioinformatics prediction for NRPS modular compounds
A 6.7 Putative biosynthesis of dolamethleuline and dolaphenvaline fragments
A 6.8 Molecular identification of the Kenyan "Lyngbya majuscula"
A 6.9 Analysis of sequence data and phylogeny of the Kenya L. majuscula
A7 Manuscripts
B IMPACT
B1 Seminars and guest lectures
B2 International conferences and symposia
B 2.1 Federation of European Biochemical Societies (FEBS) 2013 conference at St. Petersburg, Russia from 6-12 July 2013
B 2.2 International Advanced Single Cell Biotechnology at Sheffield University, UK on 12 February 2014
B3 Outstanding outcomes of the project
B 3.1 Grant Applications
B 3.2 Outreach program - Mentoring of Marine Biology students
C THE UNITED KINGDOM AND MY EU MARIE CURIE FELLOWSHIP
D ACKNOWLEDGEMENT
E REFERENCES
F ANNEX
Annex 1: Table 1 of Kenyan Lyngbya majuscula epibiotic bacteria (EB) isolates
Annex 2: Phylogeny of EB isolates
Annex 3: Adenylation domain for Moorea producens
Annex 4: MtaD-M1-Cys Biosynthesis
Annex 5: TycA-M1-D-/L-Phe Biosynthesis
Annex 6: List of PARSE HMM modular domains for M. producens

KEY WORDS: Lyngbya majuscula, Homodolastatin 16, Dolastatin 16, Antanapeptin A, Epibiotic bacteria, 16S rDNA, NRPS/PKS, gene cluster, modular assembly, LCMS, putative biosynthesis, phylogeny, copper sulfate, molecular identification and differential DNA isolation

ABBREVIATIONS

A - Adenylation
ANTAP A - Antanapeptin A
AT - Acyl transferase
Cy - Cyclisation
DH - Dehydratase
DML - Dolamethleuline
DPV - Dolaphenvaline
ER - Enoyl reductase
HIV - Hydroxyisovalerate
HMDS 16 - Homodolastatin 16
KR - Ketoreductase
KS - Ketosynthase
LCMS - Liquid chromatography mass spectrometry
M - Methylation
NMe-Ile - N-methylisoleucine
NRPS - Non-ribosomal peptide synthetase
Phe - Phenylalanine
PKS - Polyketide synthase
T - Thiolation
TE - Thioesterase

A SCIENTIFIC REPORT

A1 Project context and objectives

The marine biotope has been identified as a large and rich area for exploration of biologically active pharmaceuticals of use in medicine and in biotechnology. The expansive diversity in form and function of the marine environment, coupled with the unique adaptations of the marine organisms therein and biosynthesis pathways that differ from those of the terrestrial world, suggests that it is an as yet untapped resource. Nearly 15 marine natural products are in various phases of clinical development, mainly in oncology, with several products already on the market and more on the way.1 Research over the last four decades has shown the filamentous marine cyanobacterium Lyngbya majuscula, of the order Oscillatoriales, to be a prolific source of a diverse range of modular natural products.2-4 Of the nearly 800 compounds isolated from marine cyanobacteria, nearly 300 come from L. majuscula.5 The plethora of marine derived natural products isolated from L.
majuscula worldwide in pan-tropical geographical locations includes compounds with antimicrobial activity, anti-proliferative compounds, anti-HIV agents and compounds that have shown potential as anticancer agents.5,6 Useful L. majuscula natural products include the anticancer agent curacin A, the neurotoxic jamaicamides and the UV-sunscreen pigment scytonemin.7 Investigations into the biosynthesis of L. majuscula natural products have revealed gene clusters encoding modular, mixed polyketide synthase (PKS)/non-ribosomal peptide synthetase (NRPS) assembly lines that incorporate other functional groups through highly unusual mechanisms.3,4 Homodolastatin 16 (1) and antanapeptin A (2), isolated from the Kenyan L. majuscula, exemplify such assemblies.2 The former is a mixed NRPS/PKS modular cyclodepsipeptide with moderate activity towards the oesophageal cancer cell lines WHCO1 and WHCO6 (IC50 values of 4.3 and 10.1 μg mL-1 respectively).2 Its analogue dolastatin 16 (3) shows very strong activity against lung (NCI-H460, GI50 0.00096 μg mL-1), colon (KM20L2, GI50 0.0012 μg mL-1), brain (SF-295, GI50 0.0052 μg mL-1) and melanoma (SKMEL5, GI50 0.0033 μg mL-1) cancer cell lines.8,9

[Structures 1-3. 1: R = N-Me-Ile; 2; 3: R = N-Me-Val]

The biological activities exhibited by homodolastatin 16 (1) and its analogue dolastatin 16 (3) towards cancer cell lines suggest that there is a need to obtain sustainable amounts of these modular compounds for further investigations, including structure-activity relationships and clinical testing.
A sustainable supply of these cyclodepsipeptides could only be achieved through aquaculture of the organism, chemical synthesis or recombinant biosynthesis in a source organism.10 A complex circadian rhythm associated with cyanobacteria rules out aquaculture as a possible alternative.11 Low enantiomeric excess (e.e.) yields and refractory problems associated with chemical synthesis suggest that recombinant biosynthesis of these natural products is the ideal route to sustainability. Gene shuffling, domain deletions and mutations are recombinant biosynthesis engineering tools currently used for the identification and cloning of gene clusters for polyketides, NRPS and hybrid polyketide-NRP metabolites.12

In 2011, the genome of Lyngbya majuscula 3L, a Caribbean strain that produces the tubulin polymerization inhibitor curacin A and the molluscicide barbamide, was sequenced to near completion using Sanger and 454 sequencing approaches.3 Whereas the draft genome sequence revealed gene clusters for curacin A and barbamide, only 3% of the genes were dedicated to secondary metabolite production, biosynthesis, transport and catabolism.3 Questions therefore arise as to whether or not the natural products are strain specific, especially as most taxonomic classifications have been morphological. It is also clear from the draft genome sequence of Lyngbya majuscula 3L that the cyanobacterium not only lacks the nifH gene necessary for nitrogen fixation but also encodes a complex gene regulatory system for microbial association and environmental adaptation. There is a paucity of information on whether the compounds isolated from L. majuscula originate from the cyanobacterium, from the bacteria co-habiting with it, or from both, especially as efforts to culture axenic L. majuscula have been mostly fruitless.
Consequently, there are considerable concerns regarding the real source of the large number of natural products attributed to L. majuscula. Given the biological activities of homodolastatin 16 (1) and its analogue dolastatin 16 (3) towards cancer cell lines, investigations to identify whether the bacteria or the non-axenic filamentous cyanobacterium produces the cyclodepsipeptides were necessary. We therefore report the 16S rDNA isolation and identification of bacteria cohabiting with the Kenyan L. majuscula, including bacteria found on the filament, and the LCMS profiling for homodolastatin 16 (1), antanapeptin A (2) and dolastatin 16 (3) in organic extracts of L. majuscula and of bacterial isolate supernatants. For clarity, putative probing in databases for gene clusters of these modular compounds in the genomes of bacteria closely associated with the Kenyan L. majuscula was carried out. Further, we report a new method for identifying non-axenic cyanobacteria. These findings have important implications for the understanding of symbiotic pathways in L. majuscula and for the recombinant biosynthesis of homodolastatin 16 (1) and its potent anticancer analogue dolastatin 16 (3).

SCIENTIFIC AND TECHNOLOGICAL RESULTS

A2 Modular assembly of homodolastatin 16

A thorough understanding of the modular assembly of homodolastatin 16 was essential in gaining insight into the biosynthesis pathway mechanism of the cyclodepsipeptide. The assembly could be inferred by examining the structure of the cyclodepsipeptide. The structure comprises three proline moieties, an N-methyl leucine, a hydroxyisovalerate (HIV) and two unusual beta-hydroxy amino acids, namely dolaphenvaline (DPV) and dolamethleuline (DML).
We proposed the following putative modular assembly for homodolastatin 16.

Homodolastatin 16 is likely to use the following building blocks. The beta-methyl homo-Phe (DPV) probably comes from methylation of the alpha-keto acid, then transamination to beta-methyl Phe, followed by chain extension as in Ile biosynthesis to yield the homo skeleton. The lactate and HIV hydroxy acids may be activated by keto acid-recognizing A domains with an embedded downstream NADH-dependent dehydrogenase in the module. The DML residue almost certainly uses a hybrid NRPS-PKS module condensing Val and methylmalonyl-CoA. This PKS module should have KR/DH/ER domains to take the initial tethered beta-keto extended Val-Me-mal scaffold to the fully saturated one. This has precedent in statine assembly. The gamma amino group that results is the key indicator.

In our quest to further understand the modular assembly and to identify the gene clusters for homodolastatin 16, we looked at compounds with similar unusual beta-amino acid fragments and found the following literature: the isolation from the cephalaspidean mollusk Philinopsis speciosa and structure elucidation of kulokekahilide;13 and the isolation from the cyanobacterium Lyngbya majuscula and structure elucidation of pitiprolamide.14

A3 Investigations of growth conditions for the Kenyan Lyngbya majuscula

Lyngbya majuscula was investigated for growth conditions to establish how to render it cultivable. It should be noted that there has been controversy regarding whether or not the cyanobacterium possesses the nifH gene for nitrogen fixation. Lundgren et al. in 2003 reported the nifH gene in L. majuscula on the basis of the acetylene reduction assay. Studies by Gerwick and co-workers on the near complete genome of Moorea producens (L. majuscula) assert that the cyanobacterium does not contain nifH genes in its genome but is endowed with a substantial number of genes for microbial association.
In our study we hypothesized that any nitrogen fixation observed in the acetylene reduction studies may have been attributable to nitrogen fixing bacteria associated with L. majuscula. The following experiments were therefore designed to address the question of why it is rather difficult to grow L. majuscula under ordinary laboratory conditions.

A 3.1 Growth of L. majuscula in autoclaved seawater

Lyngbya majuscula was grown in autoclaved seawater from 5th March 2013, with unchanged medium, until 17th April 2013 (1 month 2 weeks) in an orbital shaker incubator (27 °C, mild constant light). The cyanobacterium appeared unhealthy and lacked the green pigmentation for photosynthesis. Two main bacteria species were isolated from the medium, namely the bright orange Shewanella algae sp. and the crusty creamy Klebsiella oxytoca. The same species were obtained from an isolation of bacteria from the L. majuscula mat. Re-culturing of the cyanobacterium for a further two weeks (17th to 30th April) under the same conditions but with fresh autoclaved seawater medium resulted in further degeneration of the cyanobacterium, which showed no growth and no signs of the green pigment for photosynthesis. When the culture was transferred to the refrigerator (4 °C, 2 weeks) regeneration was observed, but when it was transferred to an open well-lighted environment severe deterioration was observed. This confirmed the necessity of a light and dark regime for the effective growth of the cyanobacterium.

A 3.2 Growth of Lyngbya in autoclaved seawater and antibiotics

It was desired to make the cyanobacterium as axenic as possible and also to investigate whether the absence of bacteria had an effect on the growth of L. majuscula.
A cocktail of antibiotics targeting both Gram-positive and Gram-negative bacteria was made, comprising penicillin, streptomycin and chloramphenicol, each at a concentration of 4 mg L-1. The antibiotic treated contents were shaken vigorously to ensure uniform distribution and the cyanobacterium was grown (5th March to 17th April 2013) without any medium change. There was a slight improvement in the growth of L. majuscula compared with the autoclaved seawater alone, with some areas showing the green pigmentation for photosynthesis. On this basis we speculated that bacteria affecting the cyanobacterium negatively were absent. Isolation of bacteria from the medium and from the cyanobacterium led to the following observations: bright orange (Shewanella algae sp.) and bright yellow (Pseudomonas stutzeri) bacteria were present along with the glassy Pseudomonas putida in the medium, results which were replicated from an isolation of the Lyngbya mat. However, the abundance of the bacteria on the plate was a lot less compared with the medium.

A 3.3 Treatment of Lyngbya majuscula with antibiotics prior to culturing

Based on the above observations, it was desired to treat the Lyngbya with the cocktail of antibiotics out of the medium. The intent was to identify antibiotic resistant bacteria and to establish whether or not the surviving bacteria are associated with the non-ribosomal peptide synthetase (NRPS) genes for antibiotics. Consistently, the yellow Pseudomonas stutzeri, the creamy Klebsiella oxytoca and the greenish Shewanella algae sp. dominated the isolation. Genome mining of the KEGG genome database revealed that K. oxytoca possessed the NRPS genes for rifamycin and related antibiotics.
A 3.4 Growth of Lyngbya in autoclaved seawater, BG11 and KNO3 (15 g L-1)

The North Sea water at Newcastle upon Tyne is poor in nitrogen and it was therefore considered necessary to supplement the autoclaved seawater with nitrate nitrogen from KNO3 (15 g L-1). In common with other cyanobacteria, Lyngbya is generally acknowledged to grow under low phosphate conditions and therefore the Lyngbya was not grown under phosphate conditions. Growth from 17th to 30th April showed improvement compared with the autoclaved seawater only and was comparable to the growth of the antibiotic treated Lyngbya. Isolation of bacteria from the medium and from the Lyngbya mat demonstrated that P. stutzeri, S. algae sp. and K. oxytoca were the key bacteria closely associating with L. majuscula. No new recruitment of bacteria was observed on incorporation of KNO3 into the medium.

A 3.5 Growth of Lyngbya in autoclaved seawater, BG11, KNO3 and antibiotics

The effect of antibiotics was determined for the growth of Lyngbya in the above medium. Growth of the cyanobacterium was comparable to that of the Lyngbya in autoclaved seawater, BG11 and KNO3. Whereas Shewanella algae sp. was absent from the bacterial isolates in both the medium and the Lyngbya mat, P. stutzeri and K. oxytoca flourished. It was noted that regardless of the regeneration due to KNO3 and/or the incorporation of the antibiotic cocktail, the cyanobacterium could not survive under constant light.

A 3.6 Summary of the results of bacteria isolates on Lyngbya majuscula

Medium | Bacteria | Condition of Lyngbya
Autoclaved seawater | Shewanella algae sp. (bright orange) | Lyngbya condition poor
Autoclaved seawater and antibiotics | Shewanella algae sp., Pseudomonas stutzeri and Klebsiella oxytoca | Lyngbya fair growth
Autoclaved seawater, BG11 and KNO3 | P. stutzeri, S. algae sp. and K. oxytoca | Lyngbya growth good
Autoclaved seawater, BG11, KNO3 and antibiotics | P. stutzeri and K. oxytoca | Lyngbya growth good
Lyngbya and antibiotics only (no medium) | P. stutzeri and K. oxytoca | Lyngbya growth good

A4 Isolation of Lyngbya majuscula filaments

Lyngbya majuscula was treated with cycloheximide overnight (5 mg L-1) to rid it of eukaryotic cells, protozoa and fungi, rinsed several times with filtered sterile seawater (12 times, 45 μ) and thereafter left in phosphate poor autoclaved seawater overnight. Following this, the cyanobacterium was submerged in phosphate buffered saline to detach filaments. The filaments were thoroughly rinsed, pooled together and weighed to afford 0.25 mg of biomass for DNA extraction and subsequent genome sequencing. In another experiment, bacteria were isolated from the filament and plated on marine agar to determine the assemblage of culturable bacteria associated with the filaments outside the sheath of bacteria. These experiments were repeated for a near dead Lyngbya mat. Bacteria isolated from the live and near dead Lyngbya filaments were designated LFB and DFB respectively. Additional filaments from the live and near dead Lyngbya were thoroughly washed and stained with acridine and nigrosin to establish association of the filaments with heterotrophic bacteria. The following observations were made. Predominantly, a bacteria species with a red tinge at the centre, which could not be re-isolated into a single strain, was isolated from the filaments of the live and near dead Lyngbya. Additionally, a creamy bacteria species characteristic of K. oxytoca was observed. The PCR amplification of the red bacteria did not generate any 16S rDNA sequence as expected.

A 4.1 The ecology of the Kenyan L. majuscula

In the quest to establish whether L. majuscula filaments could be entirely rid of bacteria, filaments prepared as described previously were stained with acridine to establish cyanobacteria-bacteria association of the filament (Fig.
1) and with nigrosin to study cell wall and DNA degradation of the filament by bacteria (Fig. 2).

Fig. 1. Bacteria on the surface of a live L. majuscula filament
Fig. 2. DNA (at the centre) and cell wall material (outside)

Clearly, despite treatment of the cyanobacteria with cycloheximide and several rinses of the cyanobacteria with phosphate buffered saline (PBS), microscopy (Fig. 1) and the isolation of the LFB and DFB bacteria from the filament surface revealed that bacteria are always associated with Lyngbya. This was the case whether the cyanobacteria were actively engaged in phototropism or on the verge of dying. It is widely acknowledged that oxygen is poisonous towards cyanobacteria including L. majuscula. Bacteria utilise oxygen for respiration and cyanobacteria capture carbon dioxide for photosynthesis, creating an energy balance at the cyanobacteria-bacteria interface. For this reason, it was not surprising that certain species of bacteria were found inside the cyanobacteria sheaths and on the filaments.

Fig. 3. A non-uniform near dead filament
Fig. 4. Bacteria embedded onto the surface of a dying filament
The green and the brown are the live and dead parts respectively of the filament. The congregation of live bacteria on the dead filament is shown by the red spots.

We further aimed to monitor the behavior of bacteria on the surface of an untreated filament that was left overnight to die. Whereas the bacteria were not observed to enter the core of the filament and target the DNA material, there was considerable loss of cell wall material (Fig. 3), leading to the speculation that some of the bacteria on the surface survive on organic carbon from the cyanobacteria. These findings were corroborated by a broken cell wall observed on a nigrosin stained filament similarly left to die overnight. The bacteria isolated from the sheath and on the filaments of L.
majuscula are close relatives of human pathogenic bacteria, suggesting that they may not necessarily be pathogenic to it. These pathogenic bacteria, especially K. oxytoca, have been shown through genome mining to possess the niki B gene for nikimycin and rifamycin respectively. This suggests that within the consortium some bacteria species, in addition to other roles, supply chemical defense arsenals to the host substrate. However, it is unclear how the live filament retains its cylindrical shape despite the presence of a myriad species of bacteria, some of which are cellulose degraders, as exhibited by the bacteria on a dying filament (Fig. 4).

A 4.2 Do dead L. majuscula filaments contain homodolastatin 16?

We considered that a comparison of the homodolastatin 16 content of dead and live L. majuscula filaments would provide a clue to the role of surface bacteria in the biosynthesis of homodolastatin 16. The absence of the natural product in a dead filament would suggest that bacteria have a significant role in the biosynthesis of homodolastatin 16. Cycloheximide treated filaments thoroughly cleaned with PBS were left to die naturally in a sterile container (2 days, ambient temperature) and extracted with dichloromethane:methanol 2:1 according to the method of Gerwick and co-workers with modifications. The LCMS profiles of the C-18 eluates of the extracts of these dead filaments compared favorably with those of live filaments.

A5 Bioinformatics strategies for identifying homodolastatin 16 gene clusters

Our overall objective for the project was to identify gene clusters encoding homodolastatin 16 for expression in a Saccharomyces cerevisiae vector, for a sustainable supply of the anticancer cyclodepsipeptides homodolastatin 16 and antanapeptin A. We were cognizant of the fact that a near complete genome of L.
majuscula, since renamed Moorea producens 3L because of issues of identity, had been accomplished in 2011 by Gerwick and co-workers at the Scripps Institution of Oceanography, La Jolla, San Diego, USA. We therefore sought to identify non-ribosomal peptide synthetase (NRPS) domains in the genome of M. producens as a starting point for our understanding of the biosynthesis pathway of homodolastatin 16. A BLAST search of the near complete genome for M. producens against the NRPase HMM from Pfam (see http://pfam.sanger.ac.uk/family/PF08415) found 3 hits, two above cut-offs. These were identified as Pseudomonas pseudoalcaligenes CECT 5344, Klebsiella oxytoca KCTC 1686 and Stenotrophomonas maltophilia K279a. Incidentally, all three species of bacteria were isolated from the Kenyan L. majuscula in the current project. A BLAST search of the peptide sequence for M. producens (see FASTA sequence, Annex 3) revealed the protein of 10623 amino acids to comprise a total of 38 domains, of which 4 were adenylation domains. Two of these domains encoded the MtaD-M1-Cys biosynthesis characteristic of myxothiazol synthetase, epothilone synthetase, bacitracin synthetase and yersiniabactin synthetase (Annex 4). Each of these NRPS natural products is synthesised from the DLYNLSLI modular assembly. The third domain had a DAWTVAAV modular assembly predicting the TycA-M1-D-/L-Phe biosynthesis (Annex 5). This binding pocket putatively expresses the tyrocidine synthetase, gramicidin synthetase and bacitracin synthetase (Annex 5). The fourth adenylation domain was thought to comprise a hypothetical protein. An examination of the PARSE HMM hits for M. producens (Annex 6) revealed the adenylation (A), thiolation (T), methylation (M), thioesterase (TE), enoyl reductase (ER), ketosynthase (KS), acyl transferase (AT), dehydratase (DH) and cyclisation (Cy) domains.

A6 Significant outcomes of the scientific project

A 6.1 Isolation of culturable bacteria co-habiting with L. majuscula

Direct streaking of bacterial isolates from the Kenyan L. majuscula biomass onto marine agar 2216 (10% w/v) consistently led to the isolation of colored colonies that were subcultured to obtain pure strains. The concentration of marine agar was varied between 1% and 10% to differentiate between bacteria growing under poor and rich nutrient medium respectively. Pseudomonas stutzeri (yellow) was isolated from the sub-culturing of colonies embedded in Enterobacter cloacae (creamy), a mixture that was characterized by a green pigment. The bacterial isolates showed diverse morphologies after overnight incubation, including tiny colonies of Bacillus subtilis, glassy Pseudomonas putida, and the large soft Shewanella algae (purple) and Pseudomonas stutzeri (yellow). Hard colonies were observed in Bacillus licheniformis (red). A list of the bacteria identified by 16S rDNA is shown in Table 1. Nearly 70% of all isolates were γ-proteobacteria. Firmicutes were isolated in reasonable quantities (17%) whereas α-proteobacteria and Actinobacteria were minimal.

A 6.2 Phylogeny of bacterial isolates and related taxa

The evolutionary history of the bacterial isolates from L. majuscula, investigated alongside other taxa of pathogenic bacteria and cyanobacteria using the Maximum Parsimony method, shows the α-proteobacteria Ochrobactrum anthropi, Aminobacterium colombiense and the actinobacterium Cellulosimicrobium cellulans as being close relatives of L. majuscula (Annex 2). The mycobacterium A. colombiense was isolated from L. majuscula gDNA as a consequence of cross-reaction with the cyanobacteria primers. Surprisingly, Pseudoalteromonas carrageenovora, isolated from the L. majuscula filament, appears to have no close relationship to the cyanobacteria. This finding is corroborated by Klebsiella oxytoca, Shewanella algae species and Pseudomonas stutzeri, which associate closely with L. majuscula.
Pseudomonas pseudoalcaligenes related closely to the known pathogen Pseudomonas tolaasii, and Enterobacter cloacae had a close relationship with the pathogen Pseudomonas fluorescens. Klebsiella oxytoca related well to Enterobacter cancerogenus and Yokenella regensburgei. The Firmicutes all related fairly well to each other.

A 6.3 16S rDNA isolation and identification of L. majuscula and A. colombiense

The appearance and morphology of the L. majuscula was consistent with the earlier identification of the homodolastatin 16 producing strain.2 16S rDNA was used to confirm the identity of the cyanobacterium. In order to obtain quality genomic DNA for the identification, L. majuscula in filtered (22 μm) autoclaved seawater poor in phosphate was treated with cycloheximide (4 mg mL-1, 12 hrs) to rid it of eukaryotic organisms and thereafter left submerged in phosphate buffered saline (PBS, pH 7.4, overnight) to detach the filaments and to remove extracellular polysaccharides.15 The filament remained associated with bacteria even after several attempts to wash it with PBS and milliQ water (Fig. 1). Species identification under the microscope was not possible. Bacteria on the surface of L. majuscula were killed by exposure of the cyanobacterium to copper sulfate pentahydrate (5 min, 10 min, 30 min and 60 min) prior to weighing aliquots (0.5 g) for DNA extraction. Most bacteria were dead within minutes whilst a few embedded themselves into the L. majuscula filament tissue (Fig. 3). Conventional genomic DNA extraction kits proved inadequate for the cyanobacteria. The addition of lysozyme (50 mg/mL), SDS, RNase and proteinase K in a power bead tube prior to homogenization lyophilized the cyanobacterium and aided lysis of the L. majuscula cell. Specifically, the SDS removed lipid polysaccharide.15 High molecular weight DNA (8 kb) of L.
majuscula was subsequently extracted with phenol:chloroform:isoamyl alcohol (25:24:1) to afford DNA with 260/280 and 260/230 ratios of between 1.8 and 2.0 and between 1.7 and 2.0 respectively, as measured by spectrophotometry on the NanoDrop, allowing for 16S rDNA identification and complete genome sequencing. The 16S rDNA of the copper sulfate exposed L. majuscula isolates matched that of A. colombiense at 89% identity.

A 6.4 Antibiotic resistant bacteria associated with L. majuscula and the filament

In other experiments, L. majuscula was treated with a cocktail of ampicillin, chloramphenicol and streptomycin (4 mg mL-1 each) in growth culture to identify drug resistant bacteria in the consortia likely to offer protection to the cyanobacteria against bacterial infection. It was established that P. stutzeri (yellow), S. algae (pink to orange) and K. oxytoca (cream) resist the antibiotic cocktail treatment. The resistance towards antibiotics was corroborated by the presence of the emrA multidrug efflux system protein emrA [tr:G8WAU0_KLEOK], the K03543 multidrug resistance protein A and the beta lactamase peptidoglycan glycosyltransferase gene clusters in the Klebsiella oxytoca KCTC 1686 genome.16,17 Similarly, the multidrug resistance (MDR) efflux pump F2N102_PSEU6 encoding the TbtABM operon was observed in the genome of the nitrogen fixing P. stutzeri. Replating of detached L. majuscula filaments onto marine agar 2216 (10% w/v) identified Pseudoalteromonas carrageenovora and Ochrobactrum anthropi as the bacteria found on the filament. However, replating of L. majuscula specimens treated with copper sulfate pentahydrate onto the agar did not result in any observed cultures of bacteria but instead showed tiny specks of fragmented cells. It was also reasoned that cyanobacteria filaments left to die would be susceptible to cell wall destruction by bacteria living on the surface.
To investigate this, the filament was left to die on a microscope coverslip (48 hr) and thereafter observed under the microscope with nigrosin stain. Nigrosin stains the DNA and cell wall material of bacteria and cyanobacteria blue. Whereas bacteria were present in a disfigured (noncylindrical) filament, there were no indications of the cell wall material having been dismembered by the bacteria during the 48 hours of decay.

A 6.5 LCMS profiling of homodolastatin 16 in L. majuscula and epibiotic bacteria

Organic extracts of freeze-dried supernatants of epibiotic bacteria (EB) were investigated for the presence of homodolastatin 16 (1) after glass fibre (GFF 44μ) filtration and C18 purification. The 2:1 dichloromethane/methanol eluates from the C18 column were concentrated on a rotary evaporator (23 °C) and dried with nitrogen under vacuum. The extracts were yellow in color. The L. majuscula extracts were treated in the same way. TD-Lyng chl was the first Lyngbya extract fraction to elute from the C18 column and exhibited intense chlorophyll pigmentation. TD-Lyngbia eluted immediately after TD-Lyng chl. Both the TD-Lyngbia and TD-Lyng chl fractions gave similar chromatograms on gradient elution.

Fig. 5. Low resolution LCMS chromatograms. The retention times shown here differ from the high resolution values reported in the text (not shown here).

The molecular ion for dolastatin 15 (m/z 837.9050), consistent with the molecular formula C45H69N6O9, was used as a standard. Dolastatin 15 eluted after 8.56 minutes. Homodolastatin 16 (1), antanapeptin A (2) and dolastatin 16 (3) eluted at 12.06, 12.53 and 11.39 minutes respectively. The peak at 10.42 minutes is a contaminant from the column unrelated to the extracts.
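The theoretical sodium adduct masses quoted in this section for homodolastatin 16 (1) and dolastatin 16 (3) can be cross-checked from their molecular formulae. The following is a minimal sketch and not part of the original analysis: the function names are illustrative, the formula of dolastatin 16 is assumed to contain six nitrogen atoms (so that it differs from homodolastatin 16 by exactly one CH2), and the ion is assumed to be the singly charged [M + Na]+ species.

```python
import re

# Monoisotopic masses (u) of the lightest stable isotopes; standard reference values.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503223,
    "N": 14.00307400443,
    "O": 15.99491461957,
    "Na": 22.98976928,
}
ELECTRON = 0.000548579909  # mass of the electron lost on forming a cation


def parse_formula(formula):
    """Turn a simple formula such as 'C48H72N6O10' into {'C': 48, 'H': 72, ...}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def sodium_adduct_mz(formula):
    """Theoretical m/z of the singly charged [M + Na]+ ion."""
    neutral = sum(MONOISOTOPIC[el] * n for el, n in parse_formula(formula).items())
    return neutral + MONOISOTOPIC["Na"] - ELECTRON


def ppm_error(observed, theoretical):
    """Mass error of an observed m/z in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


# Formulae (assumed as described above) and observed m/z values from the text.
compounds = {
    "homodolastatin 16 (1)": ("C48H72N6O10", 915.5178),
    "dolastatin 16 (3)": ("C47H70N6O10", 901.5041),
}

for name, (formula, observed) in compounds.items():
    theoretical = sodium_adduct_mz(formula)
    print(f"{name}: theoretical {theoretical:.4f}, observed {observed:.4f}, "
          f"error {ppm_error(observed, theoretical):+.1f} ppm")
```

Reporting the error in ppm rather than as an absolute difference makes it easier to compare ions of different mass against the accuracy expected of the instrument.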
The molecular ion m/z 915.5178 [M + Na]+ was consistent with the molecular formula C48H72N6O10 for homodolastatin 16 (theoretical mass m/z 915.5202), whereas m/z 759.4290 [M + Na]+ corresponded to the molecular formula C41H60N4O8 for antanapeptin A (theoretical mass m/z 759.4314), which was previously isolated from L. majuscula along with homodolastatin 16 (1).2 The minor metabolite with the molecular ion m/z 901.5041 [M + Na]+ in the chromatogram is consistent with the molecular formula C47H70N6O10 for the potent anticancer agent dolastatin 16 (3) (theoretical mass m/z 901.5046), which differs from homodolastatin 16 (1) by a methylene group. Examination of the chromatograms and TOF MS ESI+ spectra for the epibiotic bacteria isolates did not show any matches for homodolastatin 16 (1) or its analogue dolastatin 16 (3). Nor was any signal matching that of antanapeptin A (2) observed. Representative spectra for the γ-proteobacteria (Enterobacter cancerogenus, Pseudoalteromonas carrageenovora, Pseudomonas pseudoalcaligenes, Yokenella regensburgei, Klebsiella oxytoca), Firmicutes (Staphylococcus saprophyticus), and α-proteobacteria (Ochrobactrum anthropi) are presented here (Fig. 5). These spectra also accounted for the bacteria closely associated with L. majuscula and those found on the filament of the cyanobacteria.

A 6.6 Bioinformatics prediction for NRPS modular compounds

The absence of homodolastatin 16 (1), antanapeptin A (2) and dolastatin 16 (3) in the bacteria isolates prompted an investigation of whether this outcome was consistent with bioinformatics driven prediction. To ascertain whether the cyclodepsipeptides found in the Kenyan L. majuscula had templates in M. producens, NRPS adenylation scaffolds of 11 prokaryotic microorganisms encoding the AMP-C family, developed from orphan-proline genes, were searched by BLAST against the M. producens genome.16 Long chain fatty acid-CoA ligases (5853 bp, 1951 aa, 462.e-130) for PKS were identified in the M. producens genome.
Similar BLAST searches on A. colombiense returned a long chain fatty acid hit, but one comprising fewer base pairs and amino acids (1512 bp, 504 aa, 157.e-38) than that of the cyanobacteria. NRPS gene clusters were found in P. putida, encoding pyoverdin siderophore biosynthesis (10413 bp, 3471 aa, 410.e-114), and in K. oxytoca, encoding the siderophore enterobactin synthetase (3882 bp, 1294 aa, 840.e0) and yersiniabactin (6099 bp, 2033 aa, 233.e-61).

A 6.7 Putative biosynthesis of dolamethyleuline and dolaphenvaline fragments

Dolamethyleuline (Dml) is a fragment in homodolastatin 16 (1) and dolastatin 16 (3), whereas dolaphenvaline (Dpv) has been observed in both 1 and 3 as well as in kulokekahilide13 and pitiprolamide.14 Valine in step i (Scheme 1) undergoes degradation into fatty acid biosynthesis via isobutyryl-CoA, utilising the enzymes valine dehydrogenase (EC: 1.4.1.23) and the branched chain amino-acid aminotransferase (EC: 2.6.1.42).

Scheme 1

The dehydrogenation in step ii involves 2-oxoisovalerate dehydrogenase E1 component, alpha subunit (EC: 1.2.4.4) and 2-oxoisovalerate dehydrogenase E2 component (dihydrolipoyl transacylase) (EC: 2.3.1.168). These are accompanied by dihydrolipoamide dehydrogenase (EC: 1.8.1.4) for co-factor recycling and 2-oxoisovalerate ferredoxin oxidoreductase alpha subunit (EC: 1.2.7.7). The dolamethyleuline β-amino acid is afforded through a polyketide/fatty acid extension with methylmalonyl-CoA and a β-transaminase in step iv. Bacillus licheniformis, Marinobacterium stanieri, Shewanella sp. and Pseudomonas putida isolated from the Kenyan L. majuscula can putatively synthesize the 2-oxoisovalerate dehydrogenase E1 component, alpha subunit. The accession numbers of the respective proteins of these bacteria are WP_016885941.1, WP_010322356.1, WP_011622791.1 and WP_010955110.1 respectively.
Dpv is synthesized via a benzoyl-CoA biosynthesis into the phenol intermediate of a final transaminase component by an enzyme with aldolase functionalities similar to those of Nikkomycin B (NikB, Scheme 2). The NikB gene has been observed in Streptomyces. In this study it was found in the genomes of Klebsiella oxytoca (WP_004134764.1) and Pseudomonas putida (NP_745483.1). A BLAST search of the genome of the cyanobacterium M. producens with nikB did not find the gene cluster.

Scheme 2

The putative presence of the genes for the Dml and Dpv fragments in bacteria and their absence in M. producens suggested that there could be a symbiotic relationship in bacteria-L. majuscula consortia. This led to the isolation and identification of culturable bacteria from L. majuscula.

A 6.8 Molecular identification of the Kenyan “Lyngbya majuscula”

The failure to identify homodolastatin 16 and antanapeptin A genes in the Moorea producens genome raised concerns about the identity of the Kenyan marine cyanobacterium. The Kenyan L. majuscula had only been identified morphologically, consistent with other specimens collected worldwide in pantropical locations. Morphological identification is limited and unreliable because of the immense diversity of the Oscillatoriales. Molecular identification is highly accurate and specific because it is based on the genomic content of the organism. Whereas the technique works efficiently for axenic species, molecular identification of non-axenic cyanobacteria is especially difficult because bacteria and other microorganisms complicate genomic DNA isolation. At present the identification of non-axenic cyanobacteria mostly utilises the multiple displacement amplification (MDA) method, which has only limited total genomic coverage.
Various approaches for obtaining axenic cultures of cyanobacteria are well documented, including treatment of cyanobacteria cultures with toxic chemicals and mechanical separations.17 However, these methods do not explain how to isolate genomic DNA from non-axenic strains. In this study the Kenyan L. majuscula was treated with toxic copper sulfate (CuSO4.5H2O) for different durations (0, 5 min, 15 min, 30 min, 60 min) with intermittent mechanical separation prior to DNA extraction. Controls to which the toxic chemical was not applied were used. Freeze drying of the samples in liquid nitrogen followed by periodical thawing and sonication (30% pulsar, 10 min maximal amplitude) removed residual bacteria.18 To obtain genomic DNA of the cyanobacterium, homogenized L. majuscula pellets were first exhaustively extracted for bacterial genomic DNA. The resulting bacterial DNA was of mixed species and therefore did not generate a 16S rDNA sequence. Surprisingly, gDNA isolation from the residue, which largely comprised cyanobacteria, provided quality 16S rDNA sequences, with 260/280 and 260/230 ratios of between 1.90 and 2.29 in the Qubit assay. Whereas both controls and copper sulfate treated samples generated 16S rDNA sequences, sequences of the latter had too few nucleotide bases to afford sufficient coverage for a complete and/or draft genome sequence. These observations were corroborated by degraded DNA for the copper treated samples on an electrophoresis gel (Fig. 6) and deformed morphologies of the L. majuscula filament (Fig. 7).

Fig. 6. Lane 1: Gene ruler DNA ladder mix; Lane 2: TD01 (control); Lane 3: TD Conv (Supernatant treated with CuSO4.5H2O); Lane 4: TD Res (Residue treated with CuSO4.5H2O); Lane 5: Lambda DNA/Hind III marker.

Fig. 7. Lyngbya filament treated with CuSO4.5H2O for 15 minutes (left) and for 60 minutes (right)

A 6.9 Analysis of sequence data and phylogeny of the Kenyan L.
majuscula

Differential DNA isolation of the homogenised Kenyan L. majuscula commenced with a sample treated with toxic copper sulfate, generating 16S rDNA sequences that matched Aminobacterium colombiense at 89% identity and 94% coverage. The control, in which the cyanobacterium was not treated with copper sulfate, did not generate any sequence. We assumed that there may have been a cross reaction with the cyanobacteria primers CYA 106F (CGG ACG GGT GAG TAA CGC GTG A) and CYA 781R (GAC TAC TGG GGT ATC TAA TCC CAT T) during amplification. Mismatches arising from the primer CYA 106F are not unusual.19 The low % identity was arguably questionable. Furthermore, only small concentrations of DNA were obtained on isolation. Treatment with copper sulfate of a cyanobacterium residue, rather than the usual supernatant of a TE buffered solution, resulted in 16S rDNA sequences matching that of Cylindrospermum stagnale at 85% identity over 100% coverage and at 88% identity over 95% coverage for different aliquots. Given this uncertainty, the method was tested against the known L. majuscula CCAP 1446/4 strain from the Culture Collection at Oban, Scotland. Both the supernatant and the residue confirmed the identity of the strain, with 100% identity at 100% coverage in 6 replicates. A major remaining drawback was obtaining non-degraded DNA of a quality suited to draft and/or complete genome sequencing. The isolation of soil bacteria DNA assumes exhaustive extraction of bacterial DNA, with the residual soil and humic substances as substrate for the bacteria. By analogy, in this study the cyanobacteria residue was the substrate and comprised the bulk of the cyanobacteria genomic DNA material. Exhaustive extraction of bacterial genomic DNA from the Kenyan L. majuscula, followed by genomic DNA isolation from the residue with copper sulfate and from controls respectively, generated 16S rDNA sequences with sufficient nucleotides for a BLAST search.
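Basic properties of the two primers quoted above can be checked before troubleshooting an amplification. The following is a minimal sketch for illustration only and was not part of the reported workflow: the helper names are hypothetical, and the primer sequences are copied from the text as written 5' to 3'.

```python
# Primer sequences as given in the text (5' to 3'), spaces included.
PRIMERS = {
    "CYA 106F": "CGG ACG GGT GAG TAA CGC GTG A",
    "CYA 781R": "GAC TAC TGG GGT ATC TAA TCC CAT T",
}

# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def normalise(primer):
    """Remove spaces and upper-case the primer sequence."""
    return primer.replace(" ", "").upper()


def gc_fraction(seq):
    """Fraction of bases in seq that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq):
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


for name, raw in PRIMERS.items():
    seq = normalise(raw)
    print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.0%}, "
          f"reverse complement {reverse_complement(seq)}")
```

The reverse complement of the reverse primer is the form to search for when locating its binding site in a 16S rDNA sequence written in the forward orientation.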
Whereas the copper sulfate treated DNA was degraded, as observed on an electrophoresis gel (Fig. 6, Lanes 3 and 4), the control (Fig. 6, Lane 2) was not, and was of good enough quality to generate an assembly library for a draft genome.

An NCBI BLAST search of the generated sequence of the Kenyan marine cyanobacteria, without restricting organism identity, matched the sequence at 99% identity with an uncultured Aminanaerobia bacterium. 16S rDNA fragments of this organism had up to 100% identity with the aforesaid uncultured Aminanaerobia bacterium. These observations were consistent across all the 16S rDNA sequences obtained from the CuSO4.5H2O extractions at 0, 5 min, 10 min and 30 min. CuSO4.5H2O was found to fragment the cyanobacteria genomic DNA. With the BLAST search restricted to cyanobacteria, all the sequences matched an uncultured cyanobacterium. 16S rDNA sequences obtained from the axenic Lyngbya majuscula strain CCAP 1446/4 by the aforesaid method did not show any matches to Aminanaerobia bacteria but instead consistently matched L. majuscula at 100% identity. A phylogeny constructed for the Kenyan “Lyngbya majuscula” showed only distant relations with L. majuscula CCAP 1446/4 and its clones.

The goal of the project was to identify gene clusters encoding the anticancer homodolastatin 16, antanapeptin A and the potent anticancer dolastatin 16, originally isolated from a Papua New Guinea sea hare, and to carry out the expression and recombinant biosynthesis of the anticancer compounds in a heterologous Saccharomyces cerevisiae system. Work on the draft genome of the Kenyan marine cyanobacterium is ongoing at the University of Aberystwyth in collaboration with Dr Justin Pachebat. The expected draft genome of the Kenyan cyanobacteria should reveal the true identity and nature of the organism.
A7 Manuscripts

On manuscript preparation, we are shortly due to submit the manuscript ‘Bacteria living on marine cyanobacteria utilise biofilm exopolysaccharides desiccation and avoidance to resist UV irradiance’ to Photochemistry and Photobiology C journal of Japan. The manuscript is currently at the proofreading stage. I have also drafted a manuscript “Differential DNA isolation as a novel method for identifying non-axenic cyanobacteria” based on a novel technique that is likely to replace the multiple displacement amplification currently used to obtain the draft genome of non-axenic cyanobacteria. A key feature of the publication shall be the observation that molecular identification of the Kenyan L. majuscula is not consistent with the morphological identification previously done by Mirjam Girt of the Oregon State University in 2003. The manuscript is for submission to the Proceedings of the National Academy of Sciences (PNAS) journal of the USA. Separately, we shall soon publish the draft genome of the Kenyan “Lyngbya majuscula” in the Journal of Microbiology. Additionally, our confirmatory LC/MS results and draft genome data shall strengthen our resolve to publish our findings on the source of the anticancer homodolastatin 16, dolastatin 16 and antanapeptin A in Nature Biotechnology.

B IMPACT

B1 Seminars and guest lectures

With regard to seminars and guest lectures, I gave the ‘Cyanobacteria-bacteria interactions’ lecture for the MST3011 Marine Microbiology Mini-Module at Newcastle University in 2012, and “The discovery of novel pharmaceutically relevant natural products from marine cyanobacteria” and “The future prospects for biodegradable resins from marine cyanobacteria” for the 2013 Marine Biology group. Feedback from the Marine Biology Research students was quite good at an average 9/10.
I also presented a talk entitled ‘Biosynthesis of the anticancer cyclohexadepsipeptide homodolastatin 16’ to Professor Ian Head’s research group at Newcastle University, and externally I was invited to give a talk on the isolation and biosynthesis of the modular anticancer cyclohexadepsipeptide to the Research Group of Professor Rebecca Goss at the University of St Andrews, Scotland, UK in April 2013. On outreach, I was a guest speaker at the EU Marie Curie conference held at Durham University in May 2013, a seminar that was organized for North East England.

B2 International conferences and symposia

B 2.1 Federation of European Biochemical Societies (FEBS) 2013 conference at St Petersburg, Russia from 6-12 July 2013.

With regard to international conferences, I was invited to present my talk “Biosynthesis of the modular anticancer cyclohexadepsipeptide homodolastatin 16” at the Federation of European Biochemical Societies (FEBS) 2013 conference at St Petersburg, Russia from 6-12 July 2013. This was an especially prestigious ‘Mechanisms in Biology’ conference attended by 11 Nobel laureates, 7 in Chemistry and 4 in Physiology or Medicine. They included:

1. Sidney Altman (USA), who won the Nobel Prize in Chemistry in 1989 “for the discovery of catalytic properties of RNA” together with Nobel laureate Thomas Cech.
2. Nobel laureate Aaron Ciechanover (Israel), who was awarded the Nobel Prize in Chemistry in 2004 “for the discovery of ubiquitin-mediated protein degradation” together with Nobel laureates Avram Hershko and Irwin Rose.
3. Nobel laureate Jules Hoffmann (France), who won the Nobel Prize in Physiology or Medicine together with Nobel laureates Bruce Beutler and Ralph Steinman in 2011 “for their discovery concerning the activation of innate immunity”.
4. Nobel laureate Robert Huber (Germany), who together with Johann Deisenhofer and Hartmut Michel was awarded the Nobel Prize in Chemistry in 1988 “for the determination of the three-dimensional structure of a photosynthetic reaction centre”.
5. Nobel laureate Roger Kornberg (USA), who was awarded the Nobel Prize in Chemistry in 2006 “for his studies of the molecular basis of eukaryotic transcription”.
6. Nobel laureate Jean-Marie Lehn (France), who together with Donald Cram and Charles Pedersen won the Nobel Prize in Chemistry in 1987 “for their development and use of molecules with structure-specific interactions of high selectivity”.
7. Nobel laureate Richard Roberts (UK), who together with Phillip Sharp was awarded the Nobel Prize in Physiology or Medicine in 1993 “for their discovery of split genes”.
8. Nobel laureate Jack Szostak (USA), who together with Elizabeth Blackburn and Carol Greider was awarded the Nobel Prize in Physiology or Medicine in 2009 “for the discovery of how chromosomes are protected by telomeres and the enzyme telomerase”.
9. Nobel laureate Susumu Tonegawa (Japan), who won the Nobel Prize in Physiology or Medicine in 1987 “for his discovery of the genetic principle for generation of antibody diversity”.
10. Nobel laureate Kurt Wuethrich (Switzerland, USA), who was awarded the Nobel Prize in Chemistry together with John Fenn and Koichi Tanaka in 2002 “for his development of nuclear magnetic resonance spectroscopy for determining the three-dimensional structure of biological macromolecules in solution”.
11. Ada Yonath (Israel), who in 2009 was awarded the Nobel Prize in Chemistry together with Venkatraman Ramakrishnan and Thomas Steitz “for studies of the structure and function of the ribosome”.

I attended nearly all the plenary sessions by the Nobel laureates and experienced their humility in service to science for humanity’s sake.
In all their sessions it became clear that their approach is to focus on a problem, “the goal”, rather than on allegiance to a discipline of science. This tremendously influenced my research during the last half of my Marie Curie Fellowship at Newcastle University. There was also much furor about taking photographs with Nobel laureates at the conference, especially from our British conservatives. However, I argued that not a single African had been honored with a Nobel Prize in Chemistry or in Physiology or Medicine, and I was therefore allowed the privilege. Subsequently I felt most humbled and yet honored to interact freely with Nobel laureates Jack Szostak, Jules Hoffmann and Susumu Tonegawa.

B 2.2 International Advanced Single Cell Biotechnology at Sheffield University, UK on 12 February 2014

I also attended the “International Advanced Single Cell Biotechnology” symposium at Sheffield University, UK on 12 February 2014. The one day symposium was hosted by Dr Wei Huang of the Kroto Research Institute. The keynote speaker was Professor Michael Wagner of the University of Vienna. There were presentations from all over the UK, including Imperial College, London; the Sanger Institute; and Manchester University, and there were also presentations from the USA. I was mostly interested in the symposium because of the difficulties I had encountered in isolating quality DNA from the Kenyan L. majuscula for complete/draft genome sequencing. The meeting was especially useful because I learnt some techniques that were most helpful for my research. A number of questions also remained unanswered in my project, and I found colleagues to partner with for my future research aspirations. In this regard I single out a project which aims “to investigate the role of cyanobacteria toxins on bacteria cell division and cell modulation and the relevance of cyclodepsipeptides in cancer therapy”.
It was a happy and exciting moment to realize that Dr Huabing Yin of Glasgow University had, unknowingly, interests very similar to mine, and I could not help hugging her at the end of her seminar. The symposium aided my networking and, besides my on-going work on single cell technology at Aberystwyth University, Wales, UK, prospects for working with Dr Wei Huang on Raman Tweezer spectroscopy and with Dr Yin are high. It also afforded me the opportunity to realize that the UK is a giant in science, something I had always taken for granted.

B3 Outstanding outcomes of the project

B 3.1 Grant Applications

Regarding achievements and outstanding milestones, the isolation of bacterial pathogens from a marine cyanobacterium opened the window to re-examining the biogenesis of human pathogenic toxins in potable water from Kenya, Tanzania and South Africa. In this regard we submitted a £1.24M proposal ‘Understanding the source of microbial contamination in African coastal borehole waters’ to the Royal Society DFID Capacity Building Initiative in April 2014. The intended project shall dissect the chemistry and biology of bacterial and protozoan pathogens in coastal borehole water, aiming to find solutions to waterborne diseases in Africa and worldwide. The project shall link scientists in the UK from Newcastle University, Aberystwyth University and St. Andrews University with those from Pwani University, Kenya; the University of Dar es Salaam, Tanzania; and the University of Cape Town, South Africa. We are awaiting the outcome of this application, due in October 2014. Earlier on, a scholarship was awarded by Newcastle University to a summer student at the School of Chemistry to investigate the synthesis of some fragments in homodolastatin 16 with the aim of tracing its origin and whether or not the fragments originate from the cyanobacteria or EB or both.
This project was in collaboration with Dr Michael Hall of the School of Chemistry, who advised me on Chemistry related issues of the EU Marie Curie project.

B 3.2 Outreach program - Mentoring of Marine Biology students

The EU Marie Curie Fellowship generated two projects for Marine Biology Honours students at Newcastle University, namely “The role of secondary metabolites in Bacillus licheniformis UV-resistance” and “Exploring Marine Bacteria Polysaccharides from a Desiccated Environment and Evaluating their Hygroscopic Abilities in Application to the Cosmetic Industry”. I supervised both projects and have co-authored with the students a manuscript for publication in a peer reviewed journal, which also includes my own research findings on UV-resistance. Subsequently, these students are considering pursuing PhD studies in the UK. Additionally, nine undergraduate students undertook their projects with our research group during my fellowship as a result of the lectures I gave them on cyanobacteria-bacteria interactions and on marine biotechnology and drug discovery.

C THE UNITED KINGDOM AND MY EU FELLOWSHIP

I very much enjoyed the experience of being an EU Marie Curie IIF Research Fellow in the United Kingdom. The independence of thought and the resolve to make a contribution towards research in the EU were a strong motivation for my work. My stay evidently had a positive impact on me; I made a lot of friends and came to embrace the dry humour of the British people. Unfortunately, Marie Curie Fellows who are not EU citizens are taxed heavily on money which does not originate in the UK, when in reality they do not enjoy the same privileges as locals. Nevertheless, I would still recommend the UK as a destination for early career scientists to develop their expertise.
D ACKNOWLEDGEMENT

I wish to thank Professor J Grant Burgess for hosting me at the School of Marine Science and Technology and for his enduring support, and Dr Michael Hall of the School of Chemistry for helpful discussions and mentorship during my project. Jill Cowans at the Dove Marine Laboratory arranged my purchases of consumables. I wish to most sincerely thank Ms Lisa Inganni and Anthony Gibson for handling my finances. Lastly, I acknowledge the EU for according me the opportunity to work as a Marie Curie IIF in the UK through their funding.

REFERENCES

1. European Science Foundation, Position Paper 15 2010 Marine Biotechnology: A New Vision and Strategy for Europe. www.esf.org/marineboard
2. Davies-Coleman, M. T., Dzeha, T. M., Gray, C. A., et al. 2003 J. Nat. Prod. 66, 5, 712-715.
3. Jones, A. C., Monroe, E. A., Podell, S., et al. 2011 www.pnas.org/cgi/doi/10.1073/pnas.1101137108
4. Ramaswamy, A. V., Sorrels, V. M., Gerwick, W. H. 2007 J. Nat. Prod. XXXX, xxx, 000, A-J. np 0704250 CCC.
5. Gerwick, W. H., Coates, R. C., Engene, N., et al. 2008 Microbe, 3, 6, 277-284.
6. Engene, N. 2012 Int. Journ. Syst. and Evol. Microbiol, 62, 1171-1178.
7. Gu, L., Wang, B., Kulkarni, A., Geders, T. W., et al. 2009 Nature, 459, 731-735.
8. Pettit, G. R., Smith, T. H., Xu, J., Herald, D. 2011 New crystal dolastatin 16 having specified unit cell dimensions, useful as an anti-cancer agent. WO2012148943-A1.
9. Pettit, G. R., Smith, T. H., Xu, J.-P., et al. 2011 J. Nat. Prod., 74 (5), 1003-1008.
10. Sudek, S., Lopanik, N. B., Waggoner, L. E., et al. 2007 J. Nat. Prod., 70 (1), 67-74.
11. Kondo, T., Ishiura, M. 2000 Bioessays 22 (1), 10-15.
12. Cragg, G. M., Newman, D. J. 2013 Biochimica Et Biophysica Acta-General Subjects 1830, 6, 3670-3695.
13. Kimura, J., Takada, Y., Inayoshi, T., et al. 2002 J. Org. Chem. 67, 1760-1767.
14. Montaser, R., Abboud, A. K., Paul, V. J., Luesch, H. 2011 J. Nat. Prod. 74, 109-112.
15. Wu, X., Zarka, A., Boussiba, S. 2000 Plant Mol. Biol. Rep.
18, 385-392.
16. Aziz, R. K., Bartels, D., et al. 2008 The RAST server: Rapid annotations using subsystems technology. BMC Genomics 9.
17. Vaara, T., Vaara, M., Niemela, S. 1979 Appl. Env. Microbiol., 38, 5, 1011-1014.
18. Nicolas Morin, T. V., Larissa Hendrickx, Leys Natalie, Annick Wilmotte 2010 J. Microbiol Methods, 80, 148-154.
19. Nubel, U., Garcia-Pichel, F., Muyzer, G. 1997 App. Env. Microbiol, 63, 8, 3327-3332.

Annex 1. Table 1 of Kenyan Lyngbya majuscula epibiotic bacteria (EB) isolates

Isolate                           Accession  Strain    Identity (%)  Taxon
Shewanella algae                  KC660130   SHALG-01  99            γ-proteobacteria
Shewanella algae                  KC660131   SHALG-02  99            γ-proteobacteria
Marinobacterium stanieri          KC660132   MARIS-01  99            γ-proteobacteria
Acinetobacter johnsonii           KC660133   ACJ-01    99            γ-proteobacteria
Marinobacterium stanieri          KC660134   MARIS-02  99            γ-proteobacteria
Staphylococcus saprophyticus      KC660135   STAPRO    99            Firmicutes
Pseudomonas stutzeri              KC660136   PST-01    99            γ-proteobacteria
Enterobacter cloacae              KC660137   ENTCLO    99            γ-proteobacteria
Cellulosimicrobium cellulans      KC660138   CCL-01    99            Actinobacteria
Cellulosimicrobium cellulans      KC660139   CCL-02    99            Actinobacteria
Pseudomonas pseudoalcaligenes     KC660140   PPS       99            γ-proteobacteria
Pseudomonas putida                KC660141   PPT       99            γ-proteobacteria
Bacillus aereus                   ND         ND        99            Firmicutes
Bacillus licheniformis            KC660142   BLC-01    99            Firmicutes
Bacillus licheniformis            KC660143   BLC-02    99            Firmicutes
Bacillus subtilis                 KC660144   BS-00     99            Firmicutes
Pseudomonas stutzeri              KC660145   PST-02    99            γ-proteobacteria
Enterobacter cancerogenus         ND         ND        99.24         γ-proteobacteria
Klebsiella oxytoca                ND         ND        99.23         γ-proteobacteria
Yokenella regensburgei            ND         ND        99.02         γ-proteobacteria
Ochrobactrum anthropi             ND         ND        99.88         α-proteobacteria
Pseudomonas stutzeri              ND         ND        99.87         γ-proteobacteria
Pseudoalteromonas carrageenovora  ND         ND        99.22         γ-proteobacteria

ND: strain sequences not deposited with GenBank; identities were inferred from BLAST.

Annex 2.
Phylogeny of EB isolates 38 | P a g e Annex 3: Adenylation domain for Moorea producens >gi|332705439|ref|ZP_08425517.1|/1-2887 amino acid adenylation domain protein [Moorea producens 3L] -----------------MNLSEFLQELVISGWQFWA----EEGQVCFQAPDADSTDQVLAQLKQHKRDILT ILQEHPE---VLQVYPLGYGQ-------------------------------------------------QG IWFLWQLFPDNPNYNVSFATRIY--------SQVNVTTW---------------QQTFEALRKRHPLLCS--TFPKCGETPIRQHSEQLD-------FVQIDASTWDENELQTQVVAAHRHPFDLQTDPVMRVRWFTRSEQE -------HILLLTIHHIAWDGSSANI------IVKELS----ELYQAHCAGVAVDLPSLQHT---------YQDYVKWQ--------QQLVEGSKG------ESLWTYWQQQLAGELPVLNLPTDRPHPPIQTNNGAVYRFQ LPEHLVTQVKALSQAEGATLYMTLLAAF-------------QVLLHRYTG-------QEDILVGSPTSGRT-RPEFTSVVGYFVDSMVMRAKVSGSLSFREFLTQVRQ-------TVIDALAHQDYPFSLLVEKLQP--------------ERDLSRSPIFQVF-FGLHNFLQSETQQLFLGETKTLVHWGGMEVETFLFDQYESLEDLVL-----------EIIEINSQLSGFFKYNTDLFDEQTIAQMASHLQTLLAGIVT-----------HPEQRLESL P------LLTQAEQHQLLVEWNQ--------------TTTHYPTDKCIHQLFEEQVEQTPDAI--------A VVFKEEKLSYQELNIRANQLARYLQSLGVSPEV-LVGVC--------------VERSLEMIVGLLGILKAG GVYVPLDPKYPQ-------------ERLDYMFRD--SQMSVLLTQQQLLTLLPQYEAK-----------------VVCLDRDWQKIVTEN-----------------------------PKNVTSEVTAENLAYVIYTSGSTG KPKGVMVAHIGLHNLLKVQIQAFKVSSNSRVLQFASLSFDASIWEIVMALGSGASLY-----LESRENLL------------------PGASLSKWLNEKKITHLTLPPSALAVM-------QKEELPSLQTIVVAGEACPA EVISQWSQGR---QFVNAYGPTESTV-----CATMAE------CSPEYSVLP---IGHPIANTQI----YLL DNNLQP--VPIGIPAEMYIGGIGLARGYLN----------------------RPDLTTQKFIPNP-----FS NKAEQRL----------------YKTGDLARYLPDGNIEFLGRI--------DHQVKIR----GFRIETAEI EAVLNQNPTVKQTVVVA-REDKPGDKHL-------------------CAYIVAQMETATNSNPE--LSETHL NSW-QEIFNQQIYSQ--LSEVTDPLFNTTGYLSNYDKQP--IPEAQMRDWAEDIVTQV-------LANKPN SVWEVGCGTGMLLFKIAPHTRAY---------------------------YGTDISEVSLKYIQTQIAQQPD KYAHVTLAQKAAEEMADIADNSFDVV-------LLSS-----IVQYFPSVEYLLQVI------SNSIRVVKP GGMIFLGDIRSL--PLMRAFHTSVQLHKAPPSLSVQQLKQGIY---RLMQQETELLVSPEL-FVALKDTYP --EITHVQI-RLQRG----SEHNELNKYRYSV-LLHIQAKPTSVIVAPVENGVGMSMEDIEVYLGQQQPES 
ICFSSL-------TNGRVATDMAAVELLSQVESKLNVQQLRQQLRQKLVNGIEPEQLH--QLSASLGYELE LC------------WSHKTEGCFDAVFVRSSLAPEAM-VLTPLTQQSVVGGNWHRYGNNPLASVTGKQLIP Q-WR------------KYLEERLPEY-----MVPSRYVILP-QLPLTPNGKV-----------NRKAL---------------------------------------PAPDNTSSRSTEFVAPETSTEKAL--AAIWAEVLSI -----QQVGIHDNFFESGGHSLLATQVVSRIRQALGKELTLQRLLESPTIAELDSALVQLPRVEDSP KQKP DGLLPTIVPAPSQRYQPFPLTEIQQAYWLGRNSHFDLGNITTHGYLELDCENLALDRLSQAW QQVIDHHDML RMVILPNGEQQVLEQVYPYQIEVLDLRGQPEQIVSTELETIRYRLSHEMFPAGEWPLFKIRV TRLADQRYRL HWSFDALIADAWSMIIVWQQWLQLYQNPDSFLPKLDLTFRDYVLAELSLKDTPQYRRSQQY WWNRLETLPPA 39 | P a g e PELPLVKQTATLEQPEFNCYRAELSAPDWQQLQARAKQASLTPSGVLLAAFADFLAYWSKS PKFTINLTLFN RLPLHPQVNDLVGDFTSLTLLEVNQKNAAPFAQRAQRLQGQLWQDLDHRYVGGVEVQRE L-RRQRGSYQPMG VVFTSTLALNTSAEKGLPSNEWHAWPFDQLGETVYMVSKTPQVWLDNSVAEQNGALLLIW NVVEDLFPEGFL NDMFTSYYHWLQQLATSDVAWAQTCPQLLPLSQLTQRLQVNETYAPVSEETLHNLFVKQV QQRPEAIALITP QRTLTYHELYTEAQALGQQVQQLGATPNTLVAVLMEKGWEQIVAVLGILMAGAAYLPIDAAL PQERQWSLLE QGEVKLVVTQAALNASLGLPDHLHCLVVASQPQEIIDTPLEANVSSSDLAYVIFTSGSTGTPKGVMIDHRG AVNTIQDINQRFDVQPTDRMLAVSALNFDLSVYDIFGLLAAGGTLVMPTPEAAKDPVHWVEL MTTHQVTLWN TVPALMQMLVEYLSEHPDQVTEDLRLALLSGDWIPLNLPTQIQSLWPQGQVVSLGGATEAS IWSVYYPITTV EPEWKSIPYGKPLVNQSLHVLNHNLDPCPNWVPGQLYIGGIGLAQGYWRDEQKTNASFILH PQTGERLYKTG DLARYLPDGNSEFLGREDFQVKISGYRIELGEIEATLLGHATVKETVVAAVGELQSKQLVAYVVFHSESSSDSATEDVHD-------------DMRIDELRHYLQQQLPEYMVPPSYMVLDALPLTANGKVDRKRLPTP-ELISDHYSPDTYIPPRNHQELQLVKLWEEILEVQPIGVGSHFFDLGGHSLLAVRLMNRIEQDF GRSLPLATL FQAPTIEQLAVILQQEQGVPTLSPLVPIQTQGNQPPIFCVHPAGGTVFCYLELSQLLGANQP FYGLQSLGQQ EGQAPLTTVEEMANVYLAAIREVQPQGPYLLMGWSFGGMVALQMAHDLLSQGEQVAFLGL LDTYAPAHMPDE QVLSEDVEVLLELFGGPLSLDWEVLRDLPSEQQSALIWEQAHQANLVPPDLGAAQIERLLQ LMKLNHKAMRS YSPPDYPDVITLLHAEAGSVAVSSTEVTTDPTLGWQAISPSKVEVHTIPGYHEYMVYQPTVV IVAETIKADI EKGLNTDVETSSK >gi|332712440|ref|ZP_08432366.1|/1-3195 amino acid adenylation domain protein [Moorea producens 3L] -----------------MAELNLNRDLGTSNSEVVQLTELGNGVVQITMKDESSRNGFSPSIVEGLRHCFS 
VVAQNQQ--YKVVILTGYGNYFSSGASKEYLIRKTRGEVEVLDLSGLILDCEIPIIAAMQGHSFGGGLLLG LYADFVVFSQESVYATNFMKYGF--------TPVGATSLILREKLGSELAQ--EMIYTGENYRGKELAERG IPFPVVSRQDVLNYAQQLGQKIAKSPRLSLVALKQHLSADIKAKFPEAIKKELEIHQVTFNQP EIASRIQQE FGETVIPNLIQSTVEQKIPNPQPVQLRIPSYGLLKNLTWMPQERRKPKSTEVEVQIKAVPVNF REVLNVLGI FQEYIKKRYRSGIISAENLTFGVEGVGTVVAVGSDVSQWK---VGDEVILAYP--------GNAFSSFVIC SPDDLLAKPSDLSMVEAATIFMSFFTAYYGLHNLAKVQPGERVLIHAASGGAGQAAVQLAQ FFGSEVFATT-SPHKISVLREQGIKHVMNSRTT------EFASEVRELTQGNGVDVIFNSLTHGEYIPKNIDILAPGGRYI EIGRLNIWSHEQVSQRRPDVKYFPFDMSDEFVRDKQFHAKLWDDLALLFESGSLKPLPYKVFPS-EDVVEA 40 | P a g e FRHLQHSKHIGKIVVTMPELYNGVKNSSQQANQESMSHQEELLHQLQSGDISLENAEQLLL GLTDQQILATV PNNGQNKLINTDKTEQILSLLSSGEISLENAQNLLETVDLNSPTKKNLPTAVPNQGQSNQDE AILNQLQSGE VSLEDAEQLLLEIQQKESVTTKSIPDQRITDDIAIIGISCRYPGAKNWKEFWENLKHGVDSVT EPPPGRWEG RSWYHSDPEHPGTACSKYAAFLDDIDKFDPLFFQISPGEAELIEPQQRIFLEEAYHAIEDAGY APDSLKGKH CGVFVGAASSDYIKFLSNSGFGHHRLVLSGTMLSVLPARIAYFLDLKGPVVAVEAACSSSLV AVHQACESIK RGESEIAIAGGISTMLTPDFQVLSSQFQMVSPEGRCKSFDAEASGIVWGEGCGAILLKRYEQ AVQDQDHIYG IIKGTGTNYDGSTNGISAPSSKSQARLAENIYQQFGINPETISYL------EAHGTATPLGDPIEVEAF-T EAFSKWTAQK---QFCAIGSVKTNIGNAATAAGMSSLIKTILCLKNQKLVPSLHFNQPNPNIDFANSPFYV NTEFKAWEVPTGIPRRAAVNSFGLNGTNAHVVVEEAPIEDNRQTSPVSPQGGKATGNSED YLENSVHLLTLS AKTETALGEVISSYQNYLKTNPNLRLGDVCYTASTGRTHFTHRLAVVAPNQQELVEKLRQH QEGKKLAGITS GELLNNTTVAKIAFLFT-GQ--GSQYINMGKQLYQQAPTFRQAINQCEEILSSVETFQETSLRNILYPTDK NSSGSSLLGQTAYTQPALFAIEYALFK--LWQSWGIEPDVVMGHSVGEYVAATVAGVFSLEDGLKLIAARG SLMQKLPGDGKMLWAMAPESKVLETLKAKDLSEKVAIAAINGPQSIVISGEGKAVEAIATNLE SAGITTKPL KVSHAFHSPLMEPMLAEFEAAAKEITYEQPRIPLISNVTGKQVTEQITTAEYWVNHVRQPVQ FAQSMKTLYQ EGYELFLEIGPK--PVLLSMGRQCLPEKI-GVWLPSLRPGVEECQQMLSSLGKLYVEGAKVDWIAFEQNYA RQKVALPTY-PFQRERYWVSSQNGYEQKSY----WLKGKEQHPLLGEKINLAGIEDQHRFQSYIGAESPGY LNHHQVFGKVLFPSTGYLEIAASAGKSLFTSQEQVVVSDVDILQSLVIPETEIKTVQTVVSFAENNSYKFE IFSPSEGENQQTPQWVLHAQGKIYTEPTRNSQAKIDLEKYQAECSQAIEIEEHYREYRSKGI DYGSSFQGIK QLWKGQGKALGEMAFPEELTAQLADYQLHPALLDAAFQIVSYAIPHTETDKIYLPVGVEKFK LYRQTISQVW 
AIAEIRQTNLTANIFLVDNQGTVLVELEGLRVKVTEPVLTQKSAFKEQLKSASVSERQELLTT QISSAIVNI LGLRDGQQIERHQPLFDLGLDSLMAVELKNQLESNLGTSFSSTLLFDYPTVESLVEYLANNV IPIDSFSE---LPTLIPHPEQRYQPFPLNDIQQAYWIGRNQIFDLGNIATHIYIEVDCENLNLESLHQAWRRLID HHDML RMVVLADGNQQILEQVPPYEIEILNLSEESPETIASELEQIRNQMSHEVLPTNQWPLFHLRAT RLNEQCFRL HASIDMLIFDAWSTYVLFKQWSELYNNPQSSLPATEISFRDYVLAELELKDSPQYLSSQQYW FNRLDNLPPA PEIPQAKVTSAITDPQFNTHTAQLSQSDWQQLKNKASKANLTPSGVLLSAFASVLNYWSKS SKFTLNLTLFN RLPLHPQVNELIGDFTSVILLEVDNSQAVPFISRAQKLQRQLWEDLEHRYISGVEVQRELYR R--GRSQPMG VVFTSTLGLKSLADEEVG---RGFGLEHFGEVVYSAAQTPQVLLDHIVTEEKGALAFSWHTVEGLFPEGLI 41 | P a g e EQMFEAYCDLLQQLATSDEPWMETYHQLLPTAQLALQAQVNQTTQSWSEDILHSLFVKQV QVQSEATAVISP QKSLTYGELYQRSHQLGHGLRKLGVKPNQLVAVVMEKGWEQVVAVLGILMSGGAYLPIDP GLPQERQWYLLE QAQVTQVLTQTHLKQSLGWPEGIKCWSVDTEELAEYDPNPLEPVQTSEDLAYVIYTSGSTG LPKGVMIDHRG AINTILDINQRFKVTPSDRVLALAALNFDLSVYDIFGVLGAGGAIVMPPPKAAKDPACWRELII AHEVTLWN SVPALMQMLVEHLLGTSATAVGDLRVVMLSGDWLPVDLPSKIQSLWSNVQVMSLGGATEA SIWSIGYPIEKV GSDWKSIPYGKPLLNQSFYVLNELMEPRPVWVPGQLYIGGVGLAKGYWKNEHKTQASFIT HPVTQEPLYKTG DLGRYLPDGNIEFLGREDFQVKINGYRVELGEIEVALKQFPGIKEAIVTAIGESQQSKRLVAY AVFKEKSVI SDSSLTDIHQTEDKNEVGQPDQEINCTSEQLRKYLWQKLPEYMVPDDYVILEALPLTANGK VDRKRLPKPQR QTIADT--NQNILPQTKTEQQIAAVWTEVLELEEVGIHDNFFAIGGNSLLVIRVHNKLQELLGIELKVVDL FANPTVHFLSQHLTQ-----------------------------------------IGSKELF---------------------------METSKTRG-------------------------------------------DE RV-----------------------------------------------KKGTTRKER----------------------------------------------------------------------------------RNI RKSLR-----GKK >gi|375260917|ref|YP_005020087.1|/1-2032 yersiniabactin synthetase, HMWP2 component [Klebsiella oxytoca KCTC 1686] MISGAPSQDPLLSDNGEAADYQQLRELLIQELNVAPQQLQEESNLIQAGLDSIRLMRWLHW FRKKGYRLTLR ELYAAPTLAAWRQLMRSRSGEKPDDASSPAE-----------------------------------------AAWPVMSEGTPFPLTPVQHAYLTGRMPGQTLGGVGCHLYQEFAGHYLTAPKLEQAITILLQ RHPMLHI--AFRADGQQVWLPQPYWNG--------VTVHDLRQTDEASRQAYLETLRQRLSHRLLRVEMGETFDFQLTL 
LPDNC--HRLHVNIDLLIMDASSFTL------FFDELN--------ALLAGESLPPGDPRYD---------FRAYLLHQQKIN----QPLLDKARA-----------YWLAKASMLPPAPVLPLACEPATLREVRNTRRRMI VPTTRWNAFSQRAGENGVTPTMALATCF-------------AAVLGRWGG------LTRLLLNITLFDRQP LHPAVDEMLADFTNILLLDTACDG-----DTVSNLARKNQL---TFTEDWEHRHWSGVELLRELK---------------RQQSHPHGAPVVFTSNLGR-SLYSSRPESPLGEPE----WGISQTPQVWIDHLAFEHRGEV WLQWDSNDALFPPALVETLFNAYCQLINQLCDDESA---------------------------WKKPFADRM P----------QSQREIRQRVNA--------------TDAPVP-QGLLHEGIFRIALRQPQAL--------A VTDAHYQWNYRELTENARRCAGRLIACGVQPGD-NVAIT--------------MSKGAGQLVAVLAVLLSG AVYVPVSLDQPA-------------ARRGKIYAD----------ANVRLVLTCQHDASAWSDDIP--------------HLTWQQAIEAE-----------------------------PLADQAAHAPTQPAYIIYTSGSTG TPKGVVISHRAALNTCCDINSRYQVGPGDRVLALSALHFDLSVYDIFGVLSAGGSLV---IVMENQRR-------------------DPRAWCELIQRHQVTLWNSVPALFDMLLTWCEGFADAAPEKLRAVMLSGDWIGL DLPARYHAFRPQGQFIAMGGATEASI--WSNACEINR------VPDHWRAIP---YGFPLANQR----YRV 42 | P a g e VDELGR-DCPDWVPGELWIGGIGVAEGYFN----------------------DPVRSEQQFVTQS---------NARW----------------YRTGDLGCYWPDGTLEFLGRR--------DKQVKVG----GYRIELGEI ESALSQLAGVKQSTVVAIGE---KEKTL-------------------AAWVVPQGSAFCVTHHR---DPALP QAW-RGLAGTLPCC----------------------VCPPEISAGQVADFLQHRLLKL-----------KPG QTPGADPLPLMNALAIQPRWRA--------------------------------VVERWLAFLVTQQRLQPA AEGYQVCAGE-APENDPPSFSGHDLT---------------LTQILRGARHELSLLNDARWSPESLAFDHP ASALYIEELATICQQLSRRLQRPVRLLEV-----------GVRTARAAECLLTRL-SADEIEYVGLEHSQE LLLSARQRLAPWSDARLALWSADTLTAHAHSADIIWLNNALHRLL---------------------PEEPGL L--------------AALQQLAVPGALLYVLEFRQLTPSA--LLSTLLLTDGQPEAL------------------------------LHNSADWGAIFTAAAF----------NCQHGDEVEGLQRFLVQCPVSQVRRDPRQL Q---------------SALAERLPGW-----MVPKRIFLLD-ALPLTANGKI-----------DYQTL--------------------------------------KRCHTPEAENRTEADLPLGDIEKQV--AVIWQPLLSM ------GAVSRETDFFQHGGDSLLATRLIGQLHQAGYEARLSDLFNHPRLADFAATLRKTDLPVEQP--------FVHSPEERYRPFALTDVQQAYLVGRQPGFALGGVGSHFFVEFEIADLDIHRLEKVWNRLIAR HDML 
RAVV-RDGQQRVLEQTPPWVIPA-HILHSPEEAL----QVRDRLAHQVLNPEVWPVFDLQVGFVDGMPARL WLCLDNLLLDGLSMQILLSELEHGYRYPQQLPPPLPVTFRDYLQQPALRTPNPDSLA--WWQTQLDDIPPA PALPLRCLPQDVETPRFARLYGAMDSARWRRLKQRAADAHLTPSAVLLSVWSTVLAAWSA QPDFTLNLTLFD RRPLHPQINQILGDFTSLMLLSWHPGES-WLQSARLLQQRLSESLNHRDVSAIRVMRQLARRQNVPAVPMP VVFTSALGFEQD---------NFLARRNLLKPVWGISQTPQVWLDHQVYESEGELRFNWDFVAALFPDGQV ERQFAQYCALLNRMAEDDSSWQ------LPLADLVPPLKVTER--------------RARRLRPERA--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------QPRIAAD-------------------------------------------------------------------------------------------------------------------------------------------KSSVSLIC-------------------------------------------------------DTFREVVGE---------------------------------------------------------------------------------------------------------------------------PVAPAENFFEAGATSLNLVQLHVLLQRHEFATLTLLDL FTHPSPVALANYLAG-----------------------------------------VALKEK-------------------------------------------------------------------------------TK RV------------------------------------------------------------------------------------------------------------------------------------------RPV RRRQR------RI 43 | P a g e Annex 4: MODULAR ASSEMBLY DLYNLSLI DAWTVAAV MtaD-M1-Cys and TycA-M1-D-/L-Phe Biosynthesis DOMAIN NRPS EXPRESSED E-BIT VALUE MtaD-M1-Cy gi|6635397|gb|AAF19812.1|MtaDM1-Cys|Myxothiazol synthetase gi|6724259|gb|AAF26925.1|EpoPM1-Cys|Epothilone synthetase gi|2982194|gb|AAC06346.1|BacAM2-Cys|Bacitracin synthetase gi|408802|gb|AAA27636.1|Irp2M1-Cys|Yersiniabactin synthetase 18e -0.044 gi|48323|emb|CAA78044.1|AngRM1-Cys|Anguibactin synthetase 17e -0.077 gi|2623771|gb|AAC45928.1|TycAM1-D-/L-Phe|tyrocidine synthetase 1 gi|39369|emb|CAA33603.1|GrsAM1-D-/L-Phe|Gramicidin synthetase A gi|2623772|gb|AAC45929.1|TycBM3-D-/L-Phe/Trp|tyrocidine synthet... 
gi|2982196|gb|AAC06348.1|BacCM2-Phe|bacitracin synthetase 3
gi|440169|emb|CAA82227.1|CssAM9-Val|cyclosporine synthetase
15 0.38 18e -0.065
TycA-M1-D-/L-Phe
18e -0.044 18e -0.044 17e -0.077 18e -0.065 17e 0.10 16e -0.16 15e -0.38
HYPOTHETICAL PROTEIN None N/A N/A

The BLAST search for M. producens returned 10623 amino acids comprising 38 domains. Four of these were adenylation domains: two encoding MtaD-M1-Cy, one encoding TycA-M1-D-/L-Phe, and one hypothetical protein. The E-BIT values are negative exponents (exp (-ve)).

Annex 5: List of PARSE HMM modular domains for M. producens

LIST OF PARSE HMMs HITs for gi|332705439|ref|ZP_08425517.1|/1-2887 amino acid adenylation domain protein [Moorea producens 3L]

DOMAIN   AMINO ACID REGION   SIZE   NUMBER   E-BIT SIZE
A        997-1253            228    1        5.6e-52
M        1427-1938           457    1        6.1e-124
T        2079-2146           68     4        1.5e-18
Cy       2171-2610           450    1        7e-159
A        2786-3000           228    1        1.7e-85
T        3189-3253           68     1        3.9e-23
TE       3275-3528           267    1        1.2e-43
ER       320-663             313    1        7.6e-64
KS       823-1261            438    1        4.9e-172
AT       1383-1697           327    1        6.1e-121
DH       1768-1943           192    1        1.4e-21
T        2079-2146           68     1        2.3e-21
Cy       2171-2610           450    1        1.8e-172
A        2786-3000           228    1        8.7e-81
T        3189-3253           68     1        8.9e-18
T        22-87               68     1        3.6e-10
Cy       150-682             450    1        4.4e-87
A        997-1253            228    1        7e-51
T        2081-2146           68     6        1.3e-10
Cy       2171-2610           450    1        3.7e-228
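
The domain hits listed in Annex 5 can be tallied with a short script. The following Python sketch is illustrative only and is not part of the original analysis. It assumes that the DOMAIN, AMINO ACID REGION and E-value columns of the listing pair up row by row in the order printed, and the names HITS, domain_counts and best_hit are hypothetical helpers introduced here.

```python
from collections import Counter

# Domain hits transcribed from the Annex 5 listing, in the order given there.
# Assumption: domain, region and E-value columns pair up row by row.
# Each entry is (domain, region_start, region_end, e_value).
HITS = [
    ("A", 997, 1253, 5.6e-52),
    ("M", 1427, 1938, 6.1e-124),
    ("T", 2079, 2146, 1.5e-18),
    ("Cy", 2171, 2610, 7e-159),
    ("A", 2786, 3000, 1.7e-85),
    ("T", 3189, 3253, 3.9e-23),
    ("TE", 3275, 3528, 1.2e-43),
    ("ER", 320, 663, 7.6e-64),
    ("KS", 823, 1261, 4.9e-172),
    ("AT", 1383, 1697, 6.1e-121),
    ("DH", 1768, 1943, 1.4e-21),
    ("T", 2079, 2146, 2.3e-21),
    ("Cy", 2171, 2610, 1.8e-172),
    ("A", 2786, 3000, 8.7e-81),
    ("T", 3189, 3253, 8.9e-18),
    ("T", 22, 87, 3.6e-10),
    ("Cy", 150, 682, 4.4e-87),
    ("A", 997, 1253, 7e-51),
    ("T", 2081, 2146, 1.3e-10),
    ("Cy", 2171, 2610, 3.7e-228),
]


def domain_counts(hits):
    """Count how many hits were reported for each domain type."""
    return Counter(domain for domain, _start, _end, _e_value in hits)


def best_hit(hits, domain):
    """Return the hit of the given domain type with the smallest E-value."""
    candidates = [hit for hit in hits if hit[0] == domain]
    return min(candidates, key=lambda hit: hit[3])


if __name__ == "__main__":
    for domain, count in domain_counts(HITS).most_common():
        print(domain, count)
    print("Best adenylation (A) hit:", best_hit(HITS, "A"))
```

Under the stated pairing assumption, the tally gives four adenylation (A) hits in this listing, which agrees with the four adenylation domains reported in Annex 4.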