Transcriptomics analysis reveals shared precursors in the biosynthesis of hydrocarbons in photosynthetic algae and plants

Adarsh Jose1,2, Wenmin Qin1, Marna Yandeau-Nelson1 and Basil Nikolau1
1Department of Biophysics, Biochemistry and Molecular Biology, Iowa State University, Ames, IA, United States
2Bioinformatics and Computational Biology, Iowa State University, Ames, IA, United States

1. Introduction

It is generally well accepted that fatty acids are the metabolic precursors of simple, linear hydrocarbons (HCs) such as n-alkanes and n-alkenes. However, the mechanisms and genetic elements of this metabolic conversion are still unknown (Figure 1).

Figure 1: Four possible mechanisms of hydrocarbon biosynthesis. The fatty acid head-to-head condensation mechanism (1), the elongation-decarboxylation (2), and the fatty acid elongation-decarbonylation (3) pathways produce odd-numbered alkanes and alkenes. The final pathway (4) involves a primary alcohol intermediate and would result in even-numbered hydrocarbons.

2. Hydrocarbon Accumulation Conditions

HCs are known to occur in algae and the epidermis of plants. In this project, four experimental systems are being investigated: two microalgae, Emiliania huxleyi and Botryococcus braunii, and two vascular plants (leaf epidermis of Pisum sativum and silks of Zea mays) (Figure 2).

The two algae, Botryococcus braunii (Figure 2.A) and Emiliania huxleyi (Figure 2.B), were found to accumulate hydrocarbons differentially across different levels of nitrogen, phosphate and carbon in the growth media. Pisum sativum accumulates ~10x the level of hydrocarbon on its abaxial leaf surface compared with the adaxial surface (Figure 2.C), while corn silks that have emerged from the husk of Zea mays were shown to accumulate ~3x more hydrocarbon than those encased in the husk (Figure 2.D).

[Figure 2 panels: A. B. braunii – UTEX 572 and B. E. huxleyi-1516, hydrocarbon content under the NP+/C- and NP-/C+ conditions; C. Pea leaves, hydrocarbon content of leaf epidermis for upper and lower surfaces, branches B5 to B17_2 (older branches to the right); D. Corn silk, segmentation of corn silk at 6 days post emergence. Axis labels recovered: Hydrocarbon (umol/g); Hydrocarbon content (g/g dry weight).]

Figure 2: Hydrocarbon accumulation conditions of the four organisms: A. B. braunii – UTEX 572 was grown in Waris medium (low nutrients (NO3-, PO4-), high C (HCO3-)) and modified B3N medium (high nutrients (NO3-, PO4-), low C (HCO3-)). B. E. huxleyi-1516 was grown in C+/NP- and C-/NP+ media after reaching early stationary phase. C. Abaxial and adaxial surfaces of Pisum sativum leaves. BX indicates the branch from which the leaves were obtained. D. Silks obtained from Zea mays, either encased within or emerged from the corn husk.

3. Bioinformatics Analysis Pipeline

Pipeline flowchart, steps and tools:
• Illumina PE Transcriptome raw data
• Quality filter: fastx toolkit
• Genome sequence available? Yes: Reference genome. No: Pool reads and Assemble Transcriptome (Trinity Assembler) to obtain a Reference Transcriptome.
• Map PE Reads to the Reference Sequences: Tophat PE Aligner / Bowtie PE Aligner
• Estimate Normalized Read Counts: CuffLinks as FPKM / RSEM as FPKM
• Statistical Analysis – Diff Exp: Diff Exp using Cuffdiff / DESeq R package
• Statistical Analysis – List Enrichment
• Functional Annotations (BLASTX): from the Genome Annotation, Uniprot-KB, Conserved Domain Database, Sequenced Phylogenetic Neighbors
• Pathway Mapping
• Global Results

Figure 3: Bioinformatics pipeline: The reads were quality filtered and mapped to reference transcript/genome sequences. Reference transcriptomes were assembled de novo when genomes were not available. The mapped short reads were counted and normalized to estimate expression levels.

4. Curating candidate protein sequences

Protein sequences previously identified to be involved in the synthesis of precursors of hydrocarbons, and those hypothesized to be involved in the synthesis of hydrocarbons, were curated from different sources (Figure 4):
• The Arabidopsis Information Resource (TAIR)
• The Arabidopsis Acyl-Lipid Metabolism (ARALIP) website
• The Chlamydomonas reinhardtii genome v.4 portal from the Joint Genome Institute
• A handpicked set of proteins from the UniProt database, based on literature

Query flowchart, steps:
• Curated Protein Sequences: TAIR and ARALIP databases; Chlamydomonas reinhardtii protein models; handpicked protein models from the UniProt protein database
• BLASTX against the de novo assembled contigs / gene models from the four organisms
• Heuristics using BLAST scores
• Summarize expression data for each Enzyme/Transporter and estimate fold change across conditions

Figure 4: Querying for curated genes and pathway: Genes involved in key breakpoints in carbon flux, in the synthesis of precursors of hydrocarbons, and those hypothesized to be involved in the synthesis of hydrocarbons were identified by sequence similarity. BLAST scores are used to summarize gene expression levels.

5. Pathway diagram overlaid with fold changes across the hydrocarbon accumulation conditions

[Pathway diagram with fold change boxes; the legend gives the order of systems represented in each box.]

Continuous color coding of fold change (FC): from green (FC < 1/5), high expression in the low HC condition, through yellow, no change, to red (FC > 5), high expression in the high HC condition.

Pathway drawn using PathVisio (http://pathvisio.org/)
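
The fold change color coding used on the pathway diagram can be illustrated with a small sketch. The poster states only the endpoints (green for FC < 1/5, yellow for no change, red for FC > 5) and that the scale is continuous. The rest of what follows is an assumption made for illustration and may differ from what PathVisio or the authors used: the log-scale interpolation, the pseudocount, the exact RGB values, and the function names `fold_change` and `fold_change_to_rgb`.

```python
import math


def fold_change(high_hc_fpkm, low_hc_fpkm, pseudocount=1.0):
    """Ratio of expression in the high HC condition to the low HC condition.

    The pseudocount is an assumption; it avoids division by zero for
    transcripts with no mapped reads in one condition.
    """
    return (high_hc_fpkm + pseudocount) / (low_hc_fpkm + pseudocount)


def fold_change_to_rgb(fc, limit=5.0):
    """Map a fold change to a green-yellow-red color.

    Colors saturate at FC <= 1/limit (green) and FC >= limit (red);
    FC == 1 maps to yellow. Interpolation is linear in log(FC),
    which is an assumed choice that treats 1/5 and 5 symmetrically.
    """
    if fc <= 0:
        raise ValueError("fold change must be positive")
    # Position on the scale: -1 at 1/limit, 0 at no change, +1 at limit.
    t = math.log(fc) / math.log(limit)
    t = max(-1.0, min(1.0, t))
    if t < 0:
        # Green (0, 255, 0) toward yellow (255, 255, 0).
        return (round(255 * (1 + t)), 255, 0)
    # Yellow (255, 255, 0) toward red (255, 0, 0).
    return (255, round(255 * (1 - t)), 0)


if __name__ == "__main__":
    # Hypothetical FPKM values, not data from the poster.
    fc = fold_change(high_hc_fpkm=9.0, low_hc_fpkm=1.0)
    print(fc, fold_change_to_rgb(fc))
```

With these assumptions, a gene with no change in expression (`fold_change_to_rgb(1.0)`) is drawn in yellow, `(255, 255, 0)`, and any fold change beyond the stated limits is drawn in the saturated end color.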