Characterization of glyceraldehyde-3-phosphate dehydrogenase gene RtGPD1 and development of genetic transformation method by dominant selection in oleaginous yeast Rhodosporidium toruloides

Yanbin Liu*, Chong Mei John Koh, Longhua Sun, Mya Myintzu Hlaing, Minge Du, Ni Peng, Lianghui Ji*

Biomaterials and Biocatalysts Group, Temasek Life Sciences Laboratory, 1 Research Link, National University of Singapore, Singapore 117604

Running title: Rhodosporidium toruloides GPD1 promoter and its applications

* Corresponding authors:
Lianghui Ji: Tel: +65 6872 7483; Fax: +65 6872 7007; Email: jilh@tll.org.sg
Yanbin Liu: Tel: +65 6872 7484; Fax: +65 6872 7007; Email: yanbin@tll.org.sg

Supplementary materials

Fig. S1. T-DNA organization of the constructs used in this study. All were inserted in the backbone of the binary vector pPZP200 (Lee and Gelvin 2008). (a) pEX2. (b) pEC2. (c) pEC3. (d) pEC3GPD-EGFP. (e) pEC3Pxxx-EGFP. (f) pRH203. (g) pRH203Pxxx-RtGFP. (h) pEX2Pxxx-HPT. (i) pEC3Pxxx-HPT3. LB: left border of T-DNA; RB: right border of T-DNA; Pgpd: 595 bp promoter of Umgpd1; Pxxx: RtGPD1 promoter of various lengths; hpt: E. coli hygromycin B phosphotransferase encoding gene; hpt-3: codon-optimized hygromycin resistance gene based on the codon usage bias in R. toruloides; EGFP: gene for enhanced green fluorescent protein; RtGFP: codon-optimized gene for green fluorescent protein based on the codon usage bias in R. toruloides; Tnos: terminator of A. tumefaciens nopaline synthase gene; T35S: terminator of cauliflower mosaic virus 35S gene; cbx cassette: carboxin resistance gene expression cassette.

Fig. S2. Isolation of the full-length sequence of GPD1 from R. toruloides ATCC 10657. (a) Degenerate PCR; (b) Inverse PCR; (c) 5'RACE and 3'RACE; (d) RT-PCR.
Abbreviations: M, 1 kb molecular weight marker (New England Biolabs, Massachusetts, USA), with bands from top to bottom of 10, 8, 6, 5, 4, 3, 2, 1.5, 1.0 and 0.5 kb; c, cDNA as the template; g, genomic DNA as the template; P, PstI; E, EcoRI; B, BamHI; 5', 5'RACE; 3', 3'RACE.

Fig. S3. Schematic illustration of GPD1 from R. toruloides ATCC 10657. (a) Gene organization of R. toruloides GPD1. Exons are marked as solid boxes. (b) Genomic organization of R. toruloides GPD1 (3543 bp). Nucleotides are numbered from the translational initiation codon "ATG". Exons and introns are shown in capital letters and italics, respectively. Abbreviations: ct box, pyrimidine-rich region; tsp, transcriptional start site; polyA site, polyadenylation site. The putative CAAT box is underlined; three nucleotide substitutions (313G>A, 457T>C and 649T>C) in R. glutinis strain ATCC 90781 are indicated in italics and highlighted; the putative substrate binding site (ASCTTNCL) and potential phosphorylation sites are double-underlined; TG repeats upstream of the polyA site are highlighted; residues of the NAD binding site in the protein sequence (D39 and N320) are in italics; catalytic residues (C156 and H183) are boxed and highlighted.

Fig. S4. Morphology and fluorescence microscopy of transformants carrying EGFP and RtGFP constructs (vector series pEC3Pxxx-EGFP and pRH203Pxxx-RtGFP, respectively) driven by GPD1 promoter fragments in U. maydis and R. toruloides, respectively. DIC: differential interference contrast; FL: fluorescence with GFP filter set.

Fig. S5. ATMT of U. maydis using truncated GPD1 promoters from R. toruloides (vector series pEC3Pxxx-HPT). The hygromycin resistance gene (hpt) was driven by the 595 bp U. maydis gpd1 promoter (Um-595) and by the 795 bp (795) and 176 bp (176) R. toruloides GPD1 promoter fragments.

Fig. S6. Southern blot analysis of hpt-3 transformants. Genomic DNA samples (10 µg) of wild-type R.
toruloides ATCC 10657 (WT) and six randomly selected transformants (pEC3Pxxx-HPT3 as binary vector) were digested with BamHI and the blot was probed with a 581 bp digoxigenin-labeled hpt-3 DNA fragment.

Table S1. Comparison of introns in other basidiomycetous GPD1 genes

Strains                          Number   Length, mean ± SD (nt)   Range (nt)
Pycnoporus coccineus             6        63 ± 4                   58-67
Omphalotus olearius              7        67 ± 24                  51-121
Crinipellis perniciosa           8        55 ± 5                   49-65
Thanatephorus cucumeris          10       49 ± 3                   45-55
Xanthophyllomyces dendrorhous    6        117 ± 44                 77-200
Phaffia rhodozyma                6        117 ± 42                 78-195
Cryptococcus neoformans          3        95 ± 43                  59-143
Cryptococcus curvatus            2        110 ± 62                 66-153
Ustilago maydis                  1        407 ± 0                  NA

Note: Sequence information can be found in the GenBank database under the following accession numbers: P. coccineus: AB194780; O. olearius: AJ439986; C. perniciosa: DQ099333; T. cucumeris: AF339929; X. dendrorhous: Y08366; P. rhodozyma: AF006483; C. neoformans: AE017349; C. curvatus: AF126158; U. maydis: X07879. NA: not applicable.

Reference

Lee LY, Gelvin SB (2008) T-DNA binary vectors and systems. Plant Physiol 146:325-332
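
Illustrative note on Table S1. The summary columns of Table S1 (number of introns, mean ± SD of intron length, and range) could be derived from a list of intron lengths roughly as sketched below. This is a minimal sketch and not the procedure used by the authors: the function names `summarize_introns` and `format_row` are hypothetical, the table does not state which SD estimator was used (the sample SD is assumed here), and the three-intron list is invented purely for illustration. Only the single U. maydis intron length (407 nt) is taken from the table.

```python
from statistics import mean, stdev


def summarize_introns(lengths):
    """Return (count, mean, sd, range_text) for intron lengths given in nt."""
    if not lengths:
        raise ValueError("at least one intron length is required")
    n = len(lengths)
    # Assumption: sample SD for n >= 2. It is undefined for a single value,
    # so 0 is reported, matching the "407 ± 0" entry for U. maydis.
    sd = stdev(lengths) if n > 1 else 0.0
    # A range is only meaningful with two or more introns ("NA" otherwise).
    range_text = f"{min(lengths)}-{max(lengths)}" if n > 1 else "NA"
    return n, mean(lengths), sd, range_text


def format_row(name, lengths):
    """Format one tab-separated row in the column order of Table S1."""
    n, avg, sd, range_text = summarize_introns(lengths)
    return f"{name}\t{n}\t{avg:.0f} ± {sd:.0f}\t{range_text}"


# Single intron of 407 nt, as listed for U. maydis in Table S1.
print(format_row("Ustilago maydis", [407]))
# Hypothetical lengths, for illustration only (not from any GenBank record).
print(format_row("hypothetical species", [50, 60, 70]))
```

With a single intron the range is reported as NA and the SD as 0, which is the convention the U. maydis row of the table follows.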