Quantification of MicroRNA Regulation and its Consequences at the Single Cell Level

by

Yannan Zheng

B.S. in Mathematics and Physics, Tsinghua University (2008)

Submitted to the Department of Physics in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY

June 2015

© Massachusetts Institute of Technology 2015. All rights reserved.

Signature of Author: Department of Physics, March 27, 2015

Certified by: Alexander van Oudenaarden, Professor of Physics and Professor of Biology at Massachusetts Institute of Technology, Professor of Quantitative Biology of Gene Regulation at Hubrecht Institute, Thesis Supervisor

Certified by: Jeff Gore, Latham Family Career Development Assistant Professor of Physics, Thesis Co-Supervisor

Accepted by: Professor Nergis Mavalvala, Associate Department Head of Physics

In memory of my grandmother Meixiu Wei, 1913-2014

Submitted to the Department of Physics on March 27, 2015 in partial fulfillment of the requirements for the degree of Doctor of Philosophy

Abstract

MicroRNAs (miRNAs) are a class of small non-coding RNAs that play important roles in posttranscriptional gene regulation. miRNAs regulate more than half of mammalian protein-coding genes. They have been found to participate in almost every cellular process, and their dysregulation is associated with many diseases. miRNAs recognize their targets by base pairing to miRNA response elements (MREs), which are predominantly located in the 3' untranslated region (3'UTR) of mRNAs.
This thesis focuses on a microRNA activity reporter system to investigate various aspects of miRNA regulation of endogenous 3'UTR targets. 3'UTRs with selected MREs mutated (MutUTRs) were designed and validated as a miRNA-unregulated control. This control does not require genetic modification of the cellular background and effectively abolishes the majority of miRNA regulation with minimal perturbation to the UTR sequence.

MicroRNAs can induce target silencing via mRNA transcript degradation and translational inhibition, but the relative contributions of the two mechanisms have been under debate. It is also unclear how miRNA regulation varies with target expression level. Using our reporter system, microRNA regulation at the transcriptional and translational levels was quantified at single-cell resolution over a target expression range of more than 100-fold. The transcriptional regulation was found to be uniform throughout the range of measurement, whereas translational regulation decreases at high target expression. Our data also suggest that translational regulation increases initially at low target expression for certain targets. For all UTRs under study, miRNA regulation from the two sources was found to be of the same order of magnitude.

In addition to target repression, miRNAs also control target expression noise. MicroRNAs decrease protein expression noise for lowly expressed genes but increase noise for highly expressed genes, and the noise regulation seems to happen at the translational level. By linking reporter assays to transcriptome expression, our findings suggest that microRNAs confer precision to protein expression in vivo, and that transcriptional regulation might dominate for endogenous targets.

Finally, we applied the reporter system as miRNA decoys to study miRNA-mediated crosstalk. We also propose that the reporter system could be used to study alternative polyadenylation, which is usually accompanied by the consequent loss of MREs.
Thesis Supervisor: Alexander van Oudenaarden
Title: Professor of Physics and Professor of Biology

Thesis Co-Supervisor: Jeff Gore
Title: Latham Family Career Development Assistant Professor of Physics

Thesis Supervisor: Alexander van Oudenaarden, PhD
Title: Professor of Physics and Biology, Massachusetts Institute of Technology; Director of Hubrecht Institute for Developmental Biology and Stem Cell Research at the Royal Netherlands Academy of Arts and Sciences and University Medical Center Utrecht

Thesis Co-Supervisor and Thesis Committee Chairman: Jeff Gore, PhD
Title: Assistant Professor, Department of Physics, Massachusetts Institute of Technology

Thesis Committee Member: Leonid Mirny, PhD
Title: Associate Professor of Health Sciences and Technology and Physics, Harvard-MIT Division of Health Sciences and Technology, Massachusetts Institute of Technology

Thesis Committee Member: Jeremy L. England, PhD
Title: Assistant Professor, Department of Physics, Massachusetts Institute of Technology

Acknowledgments

The work presented in this thesis was made possible by the generous help of many friends, colleagues, and teachers. For their contributions I thank the following:

Alexander van Oudenaarden for exceptional mentorship and inspiration. He gave me the phone interview and recruited me to MIT, my dream school ever since I was a child, and I will forever be grateful for this opportunity and the chance to work in his lab.

My thesis committee members, Jeff Gore, Jeremy England, and Leonid Mirny, for insightful comments and mentorship. During the dynamic latter stages of my PhD, my committee offered additional scientific guidance and support for me to reach the finish line.

My collaborators Jörn M. Schmiedel, Sandy L. Klemm, and Apratim Sahay in the study of miRNA control of target expression noise, and Nikolai Slavov for targeted mass spectrometry measurements.

Members of the van Oudenaarden lab 2008-2015.
I cannot think of a more fun, intellectually stimulating place to spend one's graduate years. I especially want to thank Ni Ji, Sandy Klemm, Magda Bienko, Dylan Mooijman, Apratim Sahay, Nikolai Slavov, and Stefan Semrau for scientific discussions; Ya Lin, Miaoqing Fang, Jialing Li, Annalisa Pawlosky, and Clinton Hansen for genuine friendship; Shankar Mukherji and Gregor Neuert for training when I first joined the lab; and Monica Wolf for administrative support.

My friends in the Physics Department: Wenlan Chen, Jiexi Zhang, Bo Zhen, Wujie Huang, Wenjun Qiu, and Arghavan Safavi-Naini.

My parents Lianshuang Zheng and Jinfan Xiao for love and support, for always pushing me to work harder on my research projects, and for inspiring my love of nature from childhood.

My husband Yabi Wu for helping me to be a better person.

This thesis is dedicated to my grandma, Meixiu Wei, the one person I love most in the world.

Contents

Abstract
Acknowledgments
Chapter 1 Introduction
  1.1 Background
  1.2 Thesis Outline
  1.3 References
Chapter 2 Design and validation of microRNA activity reporter system
  2.1 Abstract
  2.2 Introduction
    2.2.1 MicroRNAs biogenesis
    2.2.2 Interpretation of miRNA mutants of embryonic stem cells
    2.2.3 MicroRNAs profile and function in embryonic stem cells
    2.2.4 Regulatory network of miRNAs and proteins in ESC proliferation and differentiation
    2.2.5 MicroRNAs target recognition mechanism
  2.3 Results
    2.3.1 Design of two delivery systems for microRNA activity reporter
    2.3.2 Choice of UTRs for study
    2.3.3 Motivation to design MutUTR as miRNA unregulated control
    2.3.4 Design of MutUTR as miRNA unregulated control
    2.3.5 Validation of MutUTR as miRNA unregulated control
    2.3.6 Applications of UTR reporter system
  2.4 Materials and Methods
    2.4.1 MutUTR design
    2.4.2 Plasmid construction
    2.4.3 PCR and sequencing primers design
    2.4.4 Molecular cloning materials and kits
    2.4.5 smFISH probe design
    2.4.6 Cell lines
    2.4.7 Cell culturing
    2.4.8 Transient transfection of plasmids and dox induction
    2.4.9 Cell fixation and hybridization
    2.4.10 Flow cytometry
    2.4.11 Microscopy Imaging and image analysis
  2.5 Supplementary
  2.6 References
Chapter 3 Application of UTR reporter system to study microRNA regulation at transcriptional and translational levels
  3.1 Abstract
  3.2 Introduction
    3.2.1 miRNA-mediated repression of translation
    3.2.2 miRNA-mediated mRNA deadenylation and decay
    3.2.3 Cellular compartmentalization of miRNA repression
    3.2.4 Translation inhibition vs transcript degradation
  3.3 Results
    3.3.1 MicroRNAs exert regulation at both transcriptional and translational levels
    3.3.2 Quantifying miRNA regulation at transcriptional and translational levels
    3.3.3 The transcriptional regulation stays relatively constant and translational regulation saturates at high target expression
  3.4 Discussion
  3.5 Methods
    3.5.1 Flow cytometry experiments
    3.5.2 Flow Cytometry Data Processing
    3.5.3 Background analysis and repression fold calculation
    3.5.4 Microscopy image analysis
  3.6 Supplementary Information
  3.7 References
Chapter 4 Application of reporter system to study microRNA control of protein expression noise
  4.1 Abstract
  4.2 Results
  4.3 Methods
    4.3.1 Reporter plasmid construction
    4.3.2 Transient transfections
    4.3.3 Flow cytometry
    4.3.4 Transcriptome profiling
    4.3.5 Taqman microRNA expression measurements
    4.3.6 Flow cytometry data processing
    4.3.7 Model fit to signal mean and noise
    4.3.8 Mixed microRNA pool noise for correlated individual microRNA pools
    4.3.9 Mapping flow cytometry experiments to transcriptome expression
    4.3.10 Dicer knock-out mESC transcriptome expression data
  4.4 Acknowledgments
  4.5 References
Chapter 5 Application of UTR decoy system to study microRNA-mediated-crosstalk
  5.1 Abstract
  5.2 Introduction
  5.3 Results
  5.4 Methods
    5.4.1 FACS cell sorting
    5.4.2 RNA sequencing
    5.4.3 MicroRNA targets selection
    5.4.4 Targeted Mass Spectrometry
  5.5 Supplementary
  5.6 References
Chapter 6 Double hybridization of GFP-Lin28a 3'UTR transcript reveals a novel expression pattern
  6.1 Abstract
  6.2 Results
  6.3 Discussion
  6.4 Methods
    6.4.1 Taqman microRNA expression measurements
    6.4.2 Co-localized spots detection
    6.4.3 Stable Integration
  6.5 Supplementary
    6.5.1 Supplementary figures
    6.5.2 Supplementary model
    6.5.3 Supplementary sequence information
  6.6 References
Chapter 7 Conclusions and Perspectives
  7.1 Future Directions
  7.2 References

Chapter 1 Introduction

1.1 Background

MicroRNAs (miRNAs) are a class of small non-coding RNAs that play important roles in posttranscriptional gene regulation. The number of identified miRNAs approaches 1-2% of the number of protein-coding genes in worms, flies, and mammals (Bartel, 2009). More than half of mammalian protein-coding genes are predicted to be conserved targets of miRNAs (Friedman et al., 2008). miRNAs have been shown to participate in the regulation of almost every cellular process investigated so far, from stem cell biology to differentiation, and from proliferation to apoptosis (Bushati and Cohen, 2007; Filipowicz et al., 2008). Given this far-reaching role, it is not surprising that dysregulation of miRNAs is associated with many diseases, including cancer, heart ailments, and neurodevelopmental disorders (Chang and Mendell, 2007). Accordingly, miRNAs are being developed as both targets and therapeutics in the clinic, in the hope of harnessing the power of RNA-guided gene regulation to combat disease and infection (Pasquinelli, 2012; Tomari and Zamore, 2005).

miRNAs recognize their targets by base pairing. miRNA response elements (MREs) are predominantly located in the 3' untranslated region (3'UTR) of mRNAs (Bartel, 2009). Each miRNA can regulate hundreds of mRNAs (Lim, 2005). Conversely, a single 3'UTR can contain dozens of MREs and receive combinatorial regulation from multiple miRNAs (Bartel, 2009). miRNA regulation has had a widespread impact on UTR evolution.
A large set of housekeeping genes possesses short 3'UTRs that are specifically depleted of microRNA binding sites, which avoids microRNA regulation (Stark et al., 2005). Genes with tissue-specific expression tend to have longer 3'UTRs with more miRNA-binding sites (Stark et al., 2005). The expression of these genes and that of their regulating miRNAs are anti-correlated or even mutually exclusive in contiguous developmental stages or neighboring tissues (Farh, 2005; Stark et al., 2005). Additionally, 3'UTRs are frequently shortened in tumors and proliferating cells via alternative polyadenylation (APA) (Ji and Tian, 2009; Mayr and Bartel, 2009; Sandberg et al., 2008). Conversely, progressive lengthening of 3'UTRs by APA modulation was observed during mouse embryonic development (Ji et al., 2009).

miRNAs can induce target silencing by mRNA degradation and translation inhibition. In animals, initial evidence suggested that miRNAs repress their targets at the level of translation, with little or no influence on mRNA abundance (Olsen and Ambros, 1999; Seggerson et al., 2002). It has now become clear that miRNAs can also induce mRNA degradation in animals (Behm-Ansmant, 2006; Eulalio, 2007, 2009; Lim, 2005). Furthermore, recent advances in proteome and transcriptome measurements have enabled the modes of miRNA regulation to be dissected on a global scale (Baek, 2008; Guo et al., 2010; Hendrickson, 2009; Selbach, 2008). However, the relative contribution of these two mechanisms remains debated, with genome-wide assays and single-gene analyses often giving conflicting results (Eichhorn et al., 2014; Filipowicz et al., 2008). Moreover, the mechanistic details of miRNA regulation are still poorly understood. Translational repression has been proposed to occur at multiple stages (Fabian et al., 2010; Filipowicz et al., 2008).
In addition, other modes of miRNA regulation, such as compartmentalization and translation activation, have been discovered (Eulalio et al., 2007; Vasudevan et al., 2007).

Even though individual microRNAs only weakly repress the vast majority of their target genes (Baek, 2008; Selbach, 2008) and knockouts rarely show phenotypes (Miska et al., 2007), microRNA regulation must confer advantages, because miRNA targeting is so ubiquitous (Lewis et al., 2005) and many of the miRNA sites are highly conserved (Friedman et al., 2008). miRNAs can act either as a switch (Olsen and Ambros, 1999; Reinhart, 2000) or as a fine-tuner (Bartel and Chen, 2004; Karres et al., 2007; Poy et al., 2004) of gene expression, depending on whether the target residual expression is inconsequential or optimal (Bartel, 2009). Moreover, miRNAs can confer robustness to expression regulation by acting as reinforcers of switches and as noise buffers (Ebert and Sharp, 2012). Biological systems usually need to turn a gene on or off during developmental transitions or in response to external signals. Instead of making the decision, miRNAs can help sharpen and maintain it by further dampening the expression of unwanted transcripts. Thus miRNAs add an additional, functionally redundant layer of repression and provide a failsafe mechanism to ensure the robustness of the gene expression program (Bartel, 2009; Bushati and Cohen, 2007; Ebert and Sharp, 2012). miRNAs have also been proposed to act as buffers against variation in gene expression at homeostasis. The noise reduction results from microRNA-mediated accelerated mRNA turnover and the increased transcriptional activity needed to produce the same amount of protein (Ebert and Sharp, 2012; Noorbakhsh et al., 2013).

miRNAs have recently been discovered to act as mediators of crosstalk between mRNA targets (Salmena et al., 2011).
mRNAs sharing MREs could compete for miRNA binding and titrate the regulating miRNAs away from each other, thereby achieving co-regulation of expression. This discovery reveals a new layer of posttranscriptional regulation.

1.2 Thesis Outline

This thesis focuses on miRNAs in animals and studies the combinatorial effect of miRNA regulation on its natural targets, which are usually located in the 3' untranslated regions (3'UTRs) of mRNA transcripts. MicroRNA activity reporter systems have been constructed for representative 3'UTRs and have been applied to explore the following aspects of miRNA regulation.

1. What is a good miRNA-unregulated control for a UTR reporter system? (Chapter 2)
2. How do transcript degradation and translation inhibition contribute to miRNA regulation? And how does miRNA regulation vary for different target abundances? (Chapter 3)
3. How do miRNAs control target gene expression noise? (Chapter 4)
4. What conditions are needed for miRNA-mediated crosstalk? (Chapter 5)
5. How do miRNAs affect the integrity and alternative polyadenylation of target transcripts? (Chapter 6)

The thesis is structured into seven chapters: one brief introduction chapter, five chapters that address the above questions, and one conclusion chapter.

Chapter 2 introduces a microRNA activity reporter system, which is used as the foundation throughout this thesis. We begin with a literature review of miRNA biogenesis and the miRNA target recognition mechanism. MicroRNA profiles and functions in mouse embryonic stem cells (mESCs) are also introduced as guidance for our choice of UTRs. The 3'UTR of an endogenous gene is appended to a fluorescent reporter, which allows quantification of miRNA target expression at the single-cell level by flow cytometry or microscopy. Another fluorescent protein without microRNA regulation is used as an indicator.
Two reporter/indicator delivery systems, a bi-directional plasmid system and a cotransfection system, are described, and the effectiveness of using the indicator protein to reflect target abundance has been validated in both systems. This chapter introduces MutUTR as the miRNA-unregulated control. By mutating only the miRNA response elements (MREs) on UTRs, MutUTRs effectively abolish the majority of miRNA repression while preserving other structures of the UTR. Their advantages over using reporters followed by a short miRNA-unregulated tail, or using Dgcr8-/- ESCs, as the miRNA-unregulated control are discussed. Finally, we briefly introduce the potential applications of the microRNA reporter system.

Chapter 3 begins with a literature review of the current understanding of miRNA repression mechanisms and of the conflicting views on transcriptional and translational contributions from genome-wide studies and single-gene analyses. A UTR reporter system was applied to address miRNA regulation at the transcriptional and translational levels for different target expression. By directly measuring the integrated fluorescence of reporter and indicator proteins, and labeling reporter transcripts with smFISH probes, we traced miRNA regulation at both levels in single cells over a target expression range of more than 100-fold. The transcriptional regulation is uniform throughout the range of measurement, whereas translational regulation gets titrated at high target expression. Our data also suggest that miRNA regulation increases initially in the low target expression region for certain targets. The molecular mechanisms behind the observed trend are discussed. Microscopy and RNA-Seq confirmed the flow cytometry data. For all UTRs under study, miRNA regulation at the two levels was found to be of the same order of magnitude.

Chapter 4 focuses on how miRNAs control target gene expression noise, which is defined as the standard deviation divided by the mean.
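The noise definition just given (standard deviation divided by mean, i.e. the coefficient of variation) can be illustrated with a minimal Python sketch. The function name `expression_noise`, the choice of the population rather than the sample standard deviation, and the per-cell fluorescence values are assumptions made here for illustration only; they are not taken from the thesis data or analysis code.

```python
from statistics import mean, pstdev


def expression_noise(intensities):
    """Return noise = standard deviation / mean for one population of cells.

    Assumption: the population standard deviation is used; the text only
    says "standard deviation divided by mean".
    """
    mu = mean(intensities)
    if mu == 0:
        raise ValueError("noise is undefined when the mean expression is zero")
    return pstdev(intensities) / mu


# Hypothetical per-cell reporter fluorescence values (arbitrary units).
low_expression = [8.0, 12.0, 10.0, 14.0, 6.0]
# The same distribution scaled 100-fold to mimic a highly expressed reporter.
high_expression = [100.0 * x for x in low_expression]

noise_low = expression_noise(low_expression)
noise_high = expression_noise(high_expression)
print(noise_low, noise_high)
```

Because the two hypothetical populations differ only by a constant scale factor, they give the same noise value. This scale invariance is what makes the measure suitable for comparing cell-to-cell variability across very different expression levels.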
MicroRNAs could reduce protein expression noise when their repressive post-transcriptional effects are antagonized by accelerated transcriptional dynamics. However, since microRNA levels are themselves variable, one should expect the propagation of their fluctuations to introduce additional noise. Mathematical modeling combining the opposing effects predicts that miRNAs decrease protein expression noise for lowly expressed genes but increase noise for highly expressed genes. Assays using our reporter system are consistent with the model. Reporter expression has been mapped to the ESC transcriptome. The endogenous expression of highly repressed miRNA targets falls in the low-expression, reduced-noise region. Thus our findings suggest that microRNAs confer precision to protein expression in vivo and offer a plausible explanation for the preferential targeting of lowly expressed genes.

Chapter 5 is inspired by the ceRNA model (Salmena et al., 2011a), which states that miRNAs can induce crosstalk between their targets, because the targets compete for binding to these limiting regulatory factors. miRNA repression was found to decrease at high target expression for our reporter systems, suggesting the possibility of miRNA-mediated crosstalk. To explore its occurrence, we applied the Lats2 3'UTR reporter system as a miRNA sponge, sorted ESCs according to decoy expression levels, and measured the genome-wide gene expression response by RNA-Seq. No evidence of crosstalk was found at the transcriptome level for targets of the miRNAs regulating the Lats2 3'UTR. To explain the lack of crosstalk, decoy expression and endogenous target site abundance (TA) were estimated and compared with endogenous miRNA abundance. Our estimation supports a model in which the changes in ceRNAs must begin to approach the TA of a miRNA before they can exert a consequential effect on the repression of targets for that miRNA (Denzler et al., 2014).

Chapter 6 describes a novel expression pattern discovered for GFP-Lin28a 3'UTR transcripts.
The CDS and 3'UTR of reporter transcripts were hybridized with smFISH probes of different colors and imaged by microscopy. A significant number of isolated gfp transcripts lacking the Lin28a 3'UTR tail were discovered, and the co-localization percentage is expression dependent. Below an expression threshold of 100 gfp mRNA molecules, the probability of a gfp transcript having a co-localized Lin28a 3'UTR tail was highly variable, between 0 and 1. Above the threshold, the co-localization probability was always high. Several trivial explanations for the observed phenomenon have been ruled out, and the co-localization behavior is miRNA dependent. The mechanism behind this novel expression pattern remains unknown, and the possibility of alternative polyadenylation (APA) is discussed at the end.

Chapter 7 describes final conclusions and future directions for these projects.

1.3 References

Baek, D. (2008). The impact of microRNAs on protein output. Nature 455, 64-71.
Bartel, D.P. (2009). MicroRNAs: target recognition and regulatory functions. Cell 136, 215-233.
Bartel, D.P., and Chen, C.-Z. (2004). Micromanagers of gene expression: the potentially widespread influence of metazoan microRNAs. Nat Rev Genet 5, 396-400.
Behm-Ansmant, I. (2006). mRNA degradation by miRNAs and GW182 requires both CCR4:NOT deadenylase and DCP1:DCP2 decapping complexes. Genes Dev 20, 1885-1898.
Bushati, N., and Cohen, S.M. (2007). MicroRNA functions. Annu Rev Cell Dev Biol 23, 175-205.
Chang, T.C., and Mendell, J.T. (2007). microRNAs in vertebrate physiology and human disease. Annu Rev Genomics Hum Genet 8, 215-239.
Denzler, R., Agarwal, V., Stefano, J., Bartel, D.P., and Stoffel, M. (2014). Assessing the ceRNA Hypothesis with Quantitative Measurements of miRNA and Target Abundance. Molecular Cell 54, 766-776.
Ebert, M.S., and Sharp, P.A. (2012). Roles for microRNAs in conferring robustness to biological processes. Cell 149, 515-524.
Eichhorn, Stephen W., Guo, H., McGeary, Sean E., Rodriguez-Mias, Ricard A., Shin, C., Baek, D., Hsu, S.-h., Ghoshal, K., Villen, J., and Bartel, David P. (2014). mRNA Destabilization Is the Dominant Effect of Mammalian MicroRNAs by the Time Substantial Repression Ensues. Molecular Cell 56, 104-115.
Eulalio, A. (2007). Target-specific requirements for enhancers of decapping in miRNA-mediated gene silencing. Genes Dev 21, 2558-2570.
Eulalio, A. (2009). Deadenylation is a widespread effect of miRNA regulation. RNA 15, 21-32.
Eulalio, A., Behm-Ansmant, I., and Izaurralde, E. (2007). P-bodies: at the crossroads of posttranscriptional pathways. Nature Rev Mol Cell Biol 8, 9-22.
Fabian, M.R., Sonenberg, N., and Filipowicz, W. (2010). Regulation of mRNA translation and stability by microRNAs. Annu Rev Biochem 79, 351-379.
Farh, K.K. (2005). The widespread impact of mammalian microRNAs on mRNA repression and evolution. Science 310, 1817-1821.
Filipowicz, W., Bhattacharyya, S.N., and Sonenberg, N. (2008). Mechanisms of posttranscriptional regulation by microRNAs: are the answers in sight? Nature Rev Genet 9, 102-114.
Friedman, R.C., Farh, K.K.H., Burge, C.B., and Bartel, D.P. (2008). Most mammalian mRNAs are conserved targets of microRNAs. Genome Research 19, 92-105.
Guo, H., Ingolia, N.T., Weissman, J.S., and Bartel, D.P. (2010). Mammalian microRNAs predominantly act to decrease target mRNA levels. Nature 466, 835-840.
Hendrickson, D.G. (2009). Concordant regulation of translation and mRNA abundance for hundreds of targets of a human microRNA. PLoS Biol 7, e1000238.
Inui, M., Martello, G., and Piccolo, S. (2010). MicroRNA control of signal transduction. Nat Rev Mol Cell Biol 11, 252-263.
Ji, Z., Lee, J.Y., Pan, Z., Jiang, B., and Tian, B. (2009). Progressive lengthening of 3' untranslated regions of mRNAs by alternative polyadenylation during mouse embryonic development. Proceedings of the National Academy of Sciences of the United States of America 106, 7028-7033.
Ji, Z., and Tian, B. (2009). Reprogramming of 3' Untranslated Regions of mRNAs by Alternative Polyadenylation in Generation of Pluripotent Stem Cells from Different Cell Types. PLoS ONE 4, e8419.
Karres, J.S., Hilgers, V., Carrera, I., Treisman, J., and Cohen, S.M. (2007). The conserved microRNA miR-8 tunes atrophin levels to prevent neurodegeneration in Drosophila. Cell 131, 136-145.
Lewis, B.P., Burge, C.B., and Bartel, D.P. (2005). Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets. Cell 120, 15-20.
Lim, L.P. (2005). Microarray analysis shows that some microRNAs downregulate large numbers of target mRNAs. Nature 433, 769-773.
Mayr, C., and Bartel, D.P. (2009). Widespread shortening of 3'UTRs by alternative cleavage and polyadenylation activates oncogenes in cancer cells. Cell 138, 673.
Miska, E.A., Alvarez-Saavedra, E., Abbott, A.L., Lau, N.C., Hellman, A.B., McGonagle, S.M., Bartel, D.P., Ambros, V.R., and Horvitz, H.R. (2007). Most Caenorhabditis elegans microRNAs are individually not essential for development or viability. PLoS Genet 3, e215.
Noorbakhsh, J., Lang, A.H., and Mehta, P. (2013). Intrinsic Noise of microRNA-Regulated Genes and the ceRNA Hypothesis. PLoS ONE 8, e72676.
Olsen, P.H., and Ambros, V. (1999). The lin-4 regulatory RNA controls developmental timing in Caenorhabditis elegans by blocking LIN-14 protein synthesis after the initiation of translation. Dev Biol 216, 671-680.
Pasquinelli, A.E. (2012). MicroRNAs and their targets: recognition, regulation and an emerging reciprocal relationship. Nat Rev Genet 13, 271-282.
Poy, M.N., Eliasson, L., Krutzfeldt, J., Kuwajima, S., Ma, X., MacDonald, P.E., Pfeffer, S., Tuschl, T., Rajewsky, N., Rorsman, P., et al. (2004). A pancreatic islet-specific microRNA regulates insulin secretion. Nature 432, 226-230.
Reinhart, B.J. (2000). The 21-nucleotide let-7 RNA regulates developmental timing in Caenorhabditis elegans. Nature 403, 901-906.
Salmena, L., Poliseno, L., Tay, Y., Kats, L., and Pandolfi, P.P. (2011). A ceRNA hypothesis: the Rosetta stone of a hidden RNA language? Cell 146, 353-358.
Sandberg, R., Neilson, J.R., Sarma, A., Sharp, P.A., and Burge, C.B. (2008). Proliferating Cells Express mRNAs with Shortened 3' Untranslated Regions and Fewer MicroRNA Target Sites. Science 320, 1643-1647.
Seggerson, K., Tang, L., and Moss, E.G. (2002). Two genetic circuits repress the Caenorhabditis elegans heterochronic gene lin-28 after translation initiation. Dev Biol 243, 215-225.
Selbach, M. (2008). Widespread changes in protein synthesis induced by microRNAs. Nature 455, 58-63.
Stark, A., Brennecke, J., Bushati, N., Russell, R.B., and Cohen, S.M. (2005). Animal microRNAs confer robustness to gene expression and have a significant impact on 3'UTR evolution. Cell 123, 1133-1146.
Tomari, Y., and Zamore, P.D. (2005). Perspective: machines for RNAi. Genes Dev 19, 517-529.
Vasudevan, S., Tong, Y., and Steitz, J.A. (2007). Switching from repression to activation: microRNAs can up-regulate translation. Science 318, 1931-1934.

Chapter 2 Design and validation of microRNA activity reporter system

2.1 Abstract

In this chapter, we introduce the microRNA activity reporter system, which serves as the foundation for this thesis. Different miRNA regulatory elements (e.g. the 3'UTR of an endogenous gene) are fused behind a fluorescent reporter to allow quantification of miRNA target expression at the single cell level by flow cytometry or microscopy. Another fluorescent protein without modifications at its 3'UTR is used as an indicator to monitor miRNA regulation at different target expression levels. The positive correlation between indicator protein and reporter protein in the absence of miRNA regulation has been validated for both delivery systems used in the thesis: cotransfection and the bi-directional plasmid system.
Dgcr8-/- cells are generally used to study the global effect of miRNA loss, and can be used as the background for a miRNA unregulated control. Alternatively, we can selectively mutate the consequential miRNA response elements (MREs) on 3'UTRs and abolish most miRNA regulation with little change to the sequence. The mutation algorithm is described and the effectiveness of the mutated UTRs is validated. The mutation disentangles miRNA regulation from other factors affecting expression, such as transcript length and AU-rich elements. It also provides several additional advantages over Dgcr8-/- cells. Finally, we introduce some possible applications of the reporter system, which are discussed in detail in the following chapters.

2.2 Introduction

2.2.1 MicroRNAs biogenesis

MicroRNAs (miRNAs) are single-stranded RNAs (ssRNAs) of ~22 nucleotides (nt) in length that are generated from endogenous hairpin-shaped RNA molecules. MicroRNAs function as guide molecules in post-transcriptional gene regulation by base-pairing with the target mRNAs, usually in the 3' untranslated region (UTR).

Canonical miRNA genes are transcribed by RNA polymerase II (Pol II) to generate primary transcripts (pri-miRNAs). Pri-miRNAs are usually several kilobases long and contain local stem-loop structures called hairpins. The initiation step (cropping) occurs in the nucleus and is mediated by the Microprocessor complex, composed of the nuclear RNase III enzyme Drosha and its dsRNA binding protein partner DiGeorge syndrome critical region gene 8 (DGCR8; Pasha in D. melanogaster and C. elegans). Cropping determines one end of the miRNA and generates ~65-nucleotide precursor miRNAs (pre-miRNAs) containing a short stem plus a 2-nt 3' overhang. Pre-miRNAs are recognized by the nuclear export factor exportin 5 (EXP5) and transported to the cytoplasm in a Ran-GTP-dependent manner.
Upon export from the nucleus, the cytoplasmic RNase III enzyme Dicer, together with its dsRNA binding protein cofactors TRBP (TAR RNA-binding protein) and/or PACT, catalyzes the second processing step (dicing) and produces ~22-nucleotide miRNA duplexes. The duplex is separated and usually one strand (the guide strand or miRNA) is selected as the mature miRNA, whereas the other strand (the passenger strand or miRNA*) is degraded. Strand selection is usually determined by thermodynamic stability, although some hairpins produce miRNAs from both strands at comparable frequencies.

Mature miRNAs are incorporated into effector complexes known as 'miRNP' (miRNA-containing ribonucleoprotein complex) or 'miRISC' (miRNA-containing RNA-induced silencing complex). Argonaute (AGO) is the key component of the RISC complex. In humans, all four Ago proteins, AGO1-4, bind to miRNAs with only marginal differences in miRNA repertoire, even though AGO2 is the only one that has endonucleolytic (slicer) activity and functions in the siRNA pathway.

Apart from the canonical miRNA biogenesis pathway described above, various alternative mechanisms can generate miRNAs. Mirtrons are a Drosha-independent example: after splicing from host mRNAs, the short intronic lariat is debranched and refolds into a short stem-loop structure that resembles a pre-miRNA (Okamura et al., 2007; Ruby et al., 2007). Conversely, the biogenesis of miR-451 does not require Dicer and instead involves the catalytic activity of AGO2 (Cheloufi et al., 2010). Even though miRNA biogenesis can be flexible, the vast majority of functional miRNAs still follow the canonical pathway.

2.2.2 Interpretation of miRNA mutants of embryonic stem cells

Both Dgcr8 knockout and Dicer knockout mouse ES cell lines exist and are commonly used as models to study the effects of overall miRNA loss, but neither is a perfect miRNA-null background.
Non-canonical miRNA biogenesis pathways that are independent of Drosha-DGCR8 or of Dicer-TRBP have both been discovered in mammalian cells. DGCR8 is exclusively involved in the miRNA pathway (Figure 2.1), whereas Dicer also participates in the endogenous siRNA pathway. Conditional knockouts of Dicer and Dgcr8 induce highly overlapping phenotypes in cell proliferation and differentiation, but the Dicer knockout phenotype is more severe (Kanellopoulou, 2005; Murchison et al., 2005; Wang et al., 2007). This is possibly due to the existence of a population of DGCR8-independent, Dicer-dependent small RNAs in ESCs, such as mirtrons, endogenous small hairpin RNAs, and hairpin-derived short-interfering RNAs (hp-siRNAs) (Babiarz et al., 2008). Dicer conditional knockout cells are more difficult to isolate and are thought to require a secondary genetic or epigenetic event to grow (Murchison et al., 2005). We chose the Dgcr8 knockout cell line (for brevity referred to as KO ESCs) for this study because of its less severe phenotype.

Figure 2.1 Illustration of Dgcr8 knockout strategy and its biological consequences. (a) Dgcr8 knockout strategy. The Dgcr8 knockout cell line was generated from the parental V6.5 strain by removing exon 3 of the Dgcr8 gene, resulting in the formation of several premature stop codons. (b) Canonical miRNA biogenesis pathway. The Dgcr8 knockout cell line cannot form a functional Microprocessor complex with Drosha and therefore cannot process primary miRNAs into precursor miRNAs through the canonical pathway. (a) is reproduced from (Wang et al., 2007) and (b) is adapted from (Bartel, 2004).

2.2.3 MicroRNAs profile and function in embryonic stem cells

Over one-third of mammalian genes are predicted to be directly targeted by miRNAs (Friedman et al., 2008). Consequently, the unique combination of miRNAs in each cell type shapes its mRNA transcriptome. Recently, miRNAs have emerged as important players in controlling embryonic stem cell fate and behavior. Embryonic stem cells are derived from the inner cell mass of embryos, and are known for their capacity to indefinitely maintain an undifferentiated state in culture (self-renewal) and their potential to develop into every cell type (pluripotency). Knowledge of how protein-coding genes are controlled by key ES cell pluripotency transcription factors (Boyer, 2005) and chromatin modifiers (Benetti, 2008; Boyer et al., 2006; Sinkkonen, 2008) has provided important insights into the molecular control of ES cell identity and cellular reprogramming. The discovery of miRNAs adds a new dimension to the ES cell core regulatory circuitry (Marson, 2008).

In ESCs, the miR-290-295 cluster of miRNAs (for brevity referred to as the miR-290 cluster), expressed from a 2.2-kb polycistronic region on chromosome 7, comprises up to ~70% of total miRNA expression (Houbaviy et al., 2003; Marson, 2008). This miRNA family is homologous to human miR-371-373, a cluster expressed in human ESCs. miR-302-367, a miRNA cluster conserved in both mouse and human, shares the same seed sequence AAGUGC. This cluster is less expressed in naïve ES cells but is upregulated in the primed ES cell state upon differentiation (Rosa and Brivanlou, 2011). Both clusters of miRNAs are ESC specific and diminish in differentiated cells.
ChIP-sequencing data have shown that the promoter of the miR-290 family miRNAs is occupied by the core transcriptional regulators of ESCs, Oct4/Sox2/Nanog/Tcf3 (Marson, 2008). The same core transcription factors also regulate ~250 murine ES cell mRNAs that appear to be under the control of miRNAs of the miR-290 family (Sinkkonen, 2008), thus forming an "incoherent feed-forward" loop that can fine-tune target expression (Alon, 2007). miR-17-92, an oncogenic miRNA cluster that promotes cell proliferation (He, 2005), is also expressed in ESCs. It is interesting to note that the seed of miR-17-92 is shifted by only 1 nt compared to the seed of the miR-290 family miRNAs. Transfection of miRNA mimics from either family into Dgcr8 knockout ESCs can rescue the proliferation defect of the miRNA-null cells and shorten the prolonged G1 phase of the cell cycle by suppressing inhibitors of the G1-S transition. The miR-290 family miRNAs also control de novo DNA methylation via Rbl2 and other transcriptional repressors, and upon differentiation repress the self-renewal program by modulating the epigenetic status of pluripotency genes such as Oct4 (Benetti, 2008; Sinkkonen, 2008).

The primary transcript of let-7 is abundant in ESCs, but its maturation is blocked by Lin28 (Viswanathan et al., 2008). Lin28 is highly expressed in ESCs and is one of the key regulators of ESC pluripotency. Upon differentiation, however, the tug-of-war is reversed: let-7 wins and Lin28 is repressed.

Some tissue-specific miRNAs are silent in ESCs, but their promoters are co-occupied by both the core pluripotency factors and transcriptionally repressive Polycomb group proteins. miRNAs of this type are poised for quick activation upon differentiation (Marson, 2008). Some differentiation-related miRNAs, miR-296 (Marson, 2008) and possibly miR-134 and miR-470, belong to this class. These are lowly expressed in ESCs, but are upregulated upon retinoic-acid-induced differentiation.
The three miRNAs mentioned above also further downregulate the pluripotency network by targeting the coding sequences of Nanog, Oct4 and Sox2 (Tay et al., 2008). Thus their dual promoter occupation and low expression in ESCs can be compared to a Trojan horse (Gangaraju and Lin, 2009).

2.2.4 Regulatory network of miRNAs and proteins in ESC proliferation and differentiation

Cyclin-dependent kinase inhibitor 1A (Cdkn1a, also known as p21), large tumor suppressor 2 (Lats2), and retinoblastoma-like 2 (Rbl2) are all confirmed targets of miR-290 family miRNAs, and are all inhibitors of the cyclin E-CDK2 regulatory pathway (Wang et al., 2008). Thus, by targeting the 3'UTRs of these genes, miR-290 family miRNAs control the ES cell cycle and proliferation by promoting the transition from G1 to S phase (Wang et al., 2007).

In addition to promoting cell growth, miR-290 family miRNAs also affect cell death (Zheng et al., 2011). Lats2 and caspase 2 (Casp2), another validated target of miR-290 cluster miRNAs, are also tumor suppressors and can induce apoptosis following exposure to genotoxic stressors (Zheng et al., 2011). Thus the miR-290 cluster also plays an anti-apoptotic, pro-survival role in ESCs.

Rbl2 is also known as a transcriptional repressor of the de novo DNA methyltransferases (Dnmts) Dnmt3a and Dnmt3b, which are responsible for depositing repressive DNA methylation marks at gene promoters (Benetti, 2008; Sinkkonen, 2008). Thus miR-290 family miRNAs affect methylation and promote differentiation of ESCs through stable silencing of pluripotency factors such as Oct4.

OCT4, SOX2, NANOG and LIN28 are four factors used to reprogram human somatic cells to pluripotent stem cells that exhibit the essential characteristics of embryonic stem cells (Yu et al., 2007). Their mouse homologs are also key pluripotency factors in mouse ESCs. Various miRNAs, such as miR-134, miR-296 and miR-470, have been experimentally shown to target the CDS of the Oct4/Sox2/Nanog trio upon differentiation (Tay et al., 2008).
The negative feedback loop between Lin28 and let-7 is important for both ES cell pluripotency and differentiation (Viswanathan and Daley, 2010).

Figure 2.2 Regulatory network of miRNAs and proteins in ESC proliferation and differentiation. Proteins are represented by ovals and miRNAs are represented by boxes. Red lines represent activation and blue lines represent inhibition. Casp2, Lats2, P21 and Rbl2 transcripts are all targets of the miR-290-295 cluster of miRNAs, the most abundant miRNA family in ESCs. Other regulating miRNAs are not shown here. The Oct4-Sox2-Nanog trio of transcription factors is downregulated by a set of miRNAs targeting the CDS during differentiation. The mutual inhibitory network between Lin28 and let-7 is also crucial in cell-fate decisions.

2.2.5 MicroRNAs target recognition mechanism

Most known miRNA-target recognitions and interactions occur at the 3' untranslated regions (3'UTRs) of mRNA transcripts, even though miRNA targeting sites have also been identified in the mRNA coding sequence (CDS) (Rigoutsos, 2009) and 5'UTR (Lytle et al., 2007b).

MicroRNA target recognition relies heavily on Watson-Crick pairing to the miRNA seed region, which is defined as positions 2-7 counting from the 5' end of a mature miRNA. Structural studies show that Argonaute proteins pre-position nucleotides 2-8 of the miRNA in a geometry resembling an A-form helix, which would enhance both the affinity and specificity for matched mRNA segments. Nucleotide 1 is twisted away from the helix and is not available for target pairing, but an A at position 1 of the site is presumably recognized directly by a protein of the silencing complex.
This is consistent with the site efficacy hierarchy found by genome-wide analysis: 8mer (1-8) >> 7mer-m8 > 7mer-A1 >> 6mer (2-7) > no site, with the 6mer differing only slightly from no site at all (Bartel, 2004; Bartel, 2009; Grimson, 2007). A miRNA family comprises miRNAs with the same seed+m8 sequence (positions 2-8 of the mature miRNA). miRNA members of the same family are expected to regulate the same set of targets, with slight preferences based on different pairing to the 3' end.

Other factors also boost site efficacy. Supplementary 3' pairing centered on nucleotides 13-16 of the miRNA increases target pairing efficacy, and can sometimes even compensate for a mismatch in the seed region. Optimal targeting and repression occur where the binding site is positioned within the 3'UTR at least 15 nt downstream of the stop codon, away from the center of long UTRs, and in an AU-rich neighborhood. This is probably due to more effective competition with the translation machinery and increased accessibility in these regions. Adjacent targeting sites (within 40 nt, but no closer than 8 nt) tend to act cooperatively and lead to marked enhancement in repression. All these factors are combined quantitatively into a single context score that reflects the computationally predicted target efficacy (TargetScan 6.2).

Many miRNA sites are conserved under selective pressure. Sites that are deeply conserved tend to show stronger repression, but non-conserved sites can also be functional (Bartel, 2004; Bartel, 2009; Farh, 2005; Friedman et al., 2008).

2.3 Results

2.3.1 Design of two delivery systems for microRNA activity reporter

To measure miRNA regulation, the 3'UTR of an endogenous gene was inserted behind a fluorescent protein, e.g. enhanced GFP (eGFP), which we refer to as the reporter. Thus we could directly read out miRNA target expression at the protein level. By hybridizing reporter transcripts with smFISH probes, we could also quantify miRNA target expression at the mRNA level.
The reporter was transiently transfected into mouse embryonic stem cells (mESCs) and measured by flow cytometry or microscopy. Since transfection efficiency and the activity of the transcription/translation machinery vary from cell to cell, another fluorescent protein, e.g. mCherry, was co-transfected as an indicator. The indicator has no modifications at its 3'UTR and is devoid of miRNA regulation. By ordering cells according to indicator expression, we could measure miRNA regulation at different target abundances.

The canonical miRNA pathway is blocked in Dgcr8-/- and Dicer-/- mESCs, so these cells can be used as a background to measure reporter expression without miRNA repression. Alternatively, as we will describe shortly, we can use a reporter followed by a mutated 3'UTR (MutUTR) as the miRNA unregulated control in wild type (WT) ESCs. MutUTR is named for the mutation of consequential miRNA response elements (MREs), and the original 3'UTR of the endogenous gene is referred to as OriUTR.

Two delivery systems were designed and constructed: the two-plasmid cotransfection system and the single bi-directional plasmid system (Figure 2.3). In the two-plasmid cotransfection system, the reporter to indicator plasmid ratio was fixed at the population level. This bulk ratio was tunable, and the system could cover broader transfection ranges overall. In the bi-directional plasmid system, by contrast, the reporter and indicator were both driven by a bidirectional Tet-inducible promoter, and the ratio was always fixed at one at the single cell level. This system is especially helpful for validating the cotransfection system at low transfection levels.

Figure 2.3 Illustration of two delivery systems for miRNA activity reporter.
In the two-plasmid cotransfection system, the bulk ratio of reporter to indicator plasmid is tunable and can cover broader transfection ranges overall. In the bi-directional plasmid system, the reporter to indicator ratio is always fixed at one-to-one, even at the single cell level.

By arranging individual cells according to their indicator expression level, we observed that when no UTR was fused behind the reporter, eGFP expression increased concomitantly with mCherry expression in both systems (Figure 2.4). The variation of eGFP expression at a given mCherry level in the cotransfection system was no more pronounced than in the bi-directional plasmid system. Thus, even though the ratio of delivered plasmids varies from cell to cell, the cotransfected mCherry plasmid can be used as a transfection level indicator.

Figure 2.4 Scatterplot of GFP reporter expression versus mCherry indicator expression in two delivery systems. (a) pTRE-GFP-RFP is transfected into V19 mESCs and induced with 1 µg/ml doxycycline. (b) pCAG-GFP and pCAG-RFP are co-transfected into WT mESCs. The expression of GFP is plotted against mCherry on log10 axes. GFP increases on average with increasing mCherry expression. Data were collected by flow cytometry, and 10% of the measured cells are plotted for visualization. Color corresponds to local cell density.

2.3.2 Choice of UTRs for study

Cdkn1a, Lats2, Rbl2, and Casp2 are all experimentally validated targets of miR-290 family miRNAs (Wang et al., 2008; Zheng et al., 2011), the most abundant miRNA family in ES cells (Marson, 2008). They represent various aspects of miR-290 function in ESCs and have significant biological implications. In addition to miR-290 targeting sites, these 3'UTRs also contain targeting sites for other miRNAs, some of which have been experimentally validated in ESCs.
Thus these 3'UTRs serve as examples of natural miRNA targets for studying the combinatorial effect of miRNA regulation, and they were expected to receive strong repression in ESCs.

The negative feedback loop between Lin28 and let-7 is important for both ES cell pluripotency and differentiation (Viswanathan and Daley, 2010). Lin28 serves as an example of a mildly repressed miRNA target, owing to its high expression in ESCs and the low expression of its main regulating miRNA, let-7 (Viswanathan et al., 2008; TaqMan data in Chap. 6).

We also constructed 3'UTR reporter systems for each member of the ESC pluripotency trio Oct4/Sox2/Nanog. No miRNA repression was observed for these UTR reporters in the ESC context (data not shown). It is worth noting that the Nanog transcript level in Dgcr8-/- ESCs is almost 3-fold higher than in WT ESCs, and its expression in miR-295 cluster knockout ESCs lies between the two cell lines (Supplementary Figure 2.2). miRNAs have been reported to target the Nanog CDS during ESC differentiation (Tay et al., 2008). Thus it might be interesting to construct a reporter system for the Nanog CDS in the future, to see whether miRNAs directly repress Nanog expression through this region in WT ESCs.

The 3'UTRs of the endogenous genes chosen for the study are summarized in Figure 2.2.

2.3.3 Motivation to design MutUTR as miRNA unregulated control

In this study, we could choose either Dgcr8-/- or Dicer-/- mouse embryonic stem cells as a "miRNA-null" background to quantify reporter expression without miRNA repression. But not every cell line has miRNA-null mutants available, and sometimes such mutants cannot be obtained at all because of viability issues. MutUTR was therefore designed as a miRNA unregulated control that can be used in the original cell context.

It is tempting to use the reporter protein followed by a short, miRNA-unregulated polyA tail as the negative control for miRNA regulation, because it is easy to construct and applies universally to all UTRs under study.
In fact, in previous studies, GFP followed by the rabbit beta-globin polyA tail (RBGpA) was used as the miRNA unregulated control (Mukherji et al., 2011). This works if the UTR under study is also short, such as the artificially constructed array of N consecutive miRNA targeting sites, which is less than one hundred base pairs (bp) long (Mukherji et al., 2011). But it does not apply to endogenous UTRs, which can be several kilobases long, for two main reasons.

First, in addition to microRNA response elements (MREs), 3'UTRs contain other regulatory elements such as AU-rich elements and poly(A) sites (PAS). Transcript length, secondary structure, and bound RNA-binding proteins (RBPs) also affect transcript stability and translational efficiency. Thus, even in the Dgcr8 knockout cells, at the same transfection level, the expression of GFP-RBGpA is usually much higher than that of GFP-OriUTR. This is true for both the bi-directional (Supplementary Figure 2.3) and cotransfection (Figure 2.5) systems. If we used GFP-RBGpA as the miRNA unregulated expression control, we would also count the contributions of these other factors and exaggerate miRNA regulation. MutUTR differs from OriUTR only at a subset of MREs. Other factors, such as transcript length and GC content, were preserved as much as possible. The expression of GFP-OriUTR and GFP-MutUTR was confirmed to be the same in the Dgcr8 knockout cells (Figure 2.8 and Supplementary Figure 2.4). Thus reporter-MutUTRs should be used instead of reporter-RBGpA as the negative control to study miRNA-mediated regulation of endogenous UTRs.

It is worth noting that even for a miRNA unregulated reporter, such as GFP-RBGpA, the reporter expression in wild type and knockout cell lines is not exactly the same. This could be due to the different doubling times of WT and KO cells, and to other factors caused by the different transcriptomes.
Indeed, Dgcr8 knockout ESCs have a prolonged G1 phase compared to the wild type (Wang et al., 2008), and global miRNA knockout usually results in different gene expression profiles, especially for miRNA targets (Lim, 2005).

Figure 2.5 Bar plot of GFP expression at different mCherry levels for different UTR reporters. pCAG-GFP followed by different 3'UTR tails was co-transfected with pCAG-mCherry into either wild type (WT) or Dgcr8-/- (KO) ESCs. Cells were binned according to mCherry expression, and the mean GFP expression in each bin was calculated. Error bars correspond to the standard error of the mean. Expression of the reporter followed by a miRNA-null target (RBGpA) or a mild target (Lin28a UTR) was similar in WT and KO ESCs. The Lats2 UTR received strong repression, and its reporter expression in KO cells was much higher than in WT cells. In KO ESCs, where the difference arising from miRNA regulation has been eliminated, the expression of GFP-RBGpA is much higher than that of both GFP-Lats2 UTR and GFP-Lin28a UTR, and this is due to factors other than miRNAs.

Second, in the cotransfection system, size-dependent transfection efficiency further amplifies the difference between GFP-RBGpA and GFP-OriUTR. Transfection efficiency depends on plasmid size, and smaller plasmids are delivered into cells more easily. In the cotransfection system, even though we can fix the bulk ratio of reporter to indicator plasmid, reporter plasmids carrying endogenous 3'UTRs are markedly longer than GFP-RBGpA. Thus, at the same indicator plasmid transfection level, cells on average receive more GFP-RBGpA plasmid than GFP-UTR plasmid. We therefore cannot compare reporter expression at the same indicator level, because the amounts of reporter plasmid initially received differ to start with.
This size-dependent delivery efficiency in turn affects the transfection distribution of the indicator plasmid pCAG-mCherry. The distributions of indicator protein (mCherry) expression in cotransfection experiments were drastically different between pCAG-GFP-RBGpA and pCAG-GFP-OriUTR, but were statistically indistinguishable between pCAG-GFP-MutUTR and pCAG-GFP-OriUTR (Figure 2.6).

Figure 2.6 QQ plot of indicator protein expression from different cotransfection experiments. Different reporter plasmids were co-transfected with the indicator plasmid pCAG-mCherry, and the distributions of mCherry protein expression were compared in the QQ plot. (a) mCherry expression in cotransfection of pCAG-GFP-OriUTR vs pCAG-GFP-RBGpA. The transfection efficiency is vastly different, with p-value = 3.34e-38. (b) mCherry expression in pCAG-GFP-OriUTR vs pCAG-GFP-MutUTR. The transfection efficiency is statistically the same, with p-value = 0.567.

2.3.4 Design of MutUTR as miRNA unregulated control

Inspired by the mutagenesis designs used to study the effect of one particular miRNA on gene expression (Mayr et al., 2007; Melton et al., 2010; Tay et al., 2008; Wang et al., 2008; Wu and Belasco, 2005), we carefully selected a list of microRNA targeting sites for mutation. We filtered all the computationally predicted miRNA targeting sites (TargetScanMouse v6.2) by the following factors: miRNA site effectiveness, miRNA expression abundance in mESCs, and probability of conserved targeting. The filtering thresholds were chosen empirically to balance the tradeoff between effective mutation and minimal perturbation of the sequence. Targeting sites with a high probability of evolutionary conservation were mutated regardless of the other factors.
This is because these sites usually have high targeting efficacy and are more likely to be biologically consequential. Mutating them can prevent potential miRNA repression in subpopulations of spontaneously differentiating ES cells. Moreover, mutating them generalizes the application of the mutated UTR to cell contexts other than ESCs. For each site to be mutated, we follow the general protocol of double point mutation of the seed sequence. The above process is iterated until no novel miRNA targeting sites are generated by mutation. The final mutated sequence (MutUTR) is synthesized ab initio from GeneArt® Gene Synthesis. The mutation process is summarized in the flowchart in Figure 2.7.

[Figure 2.7 flowchart: TargetScan prediction of miRNA targeting sites -> filter by targeting efficacy, miRNA expression in ESCs, and conserved targeting -> mutation of seed sequence -> final MutUTR sequence synthesized by GeneArt®]

Figure 2.7 Flowchart of MutUTR design

2.3.5 Validation of MutUTR as miRNA unregulated control

For the chosen set of threshold parameters used in the UTR mutation algorithm, the mutated UTRs usually maintain >95% sequence identity with their original versions, yet the MutUTRs have proven experimentally to be effective. For each endogenous UTR used in this study, reporter expression was measured under four conditions: MutUTR or OriUTR was co-transfected with the indicator plasmid into WT or KO ESCs, and the reporter expression at different transfection levels was quantified. Casp2, Lats2, and Rbl2 were shown to be greatly repressed in ES cells (Wang et al., 2008). Correspondingly, the expression of OriUTR in WT cells is much lower than in the other three conditions. Mutation abolishes the majority of the miRNA repression, and MutUTR expression in WT cells is more similar to the reporter expression in KO cells (Figure 2.8).
Also, unlike GFP-RBGpA, which is expressed at a much higher level than GFP-OriUTRs even in the absence of miRNAs due to miRNA-independent factors (Figure 2.5 and Supplementary Figure 2.3), the expression of MutUTR and OriUTR closely mimic each other in KO ESCs. The similarity exists at both the protein and transcript levels (Supplementary Figure 2.4). Lin28a is only mildly repressed in ESCs, and its reporter expression is similar across all four conditions.

So far, we have demonstrated the effectiveness of the mutation. Because MutUTR expression in WT ESCs is similar to OriUTR expression in KO ESCs, either one could be used as the miRNA unregulated control. MutUTR provides several additional advantages over Dgcr8-/- ESCs. Strictly speaking, no miRNA mutant achieves 100 percent abolishment of endogenous miRNAs, because non-canonical miRNA biogenesis pathways independent of Drosha-DGCR8 and of Dicer-TRBP both exist in mammalian cells. By restricting ourselves to a small perturbation of the MRE sequences while keeping the cellular background unchanged, MutUTR minimizes other experimental variations such as cell seeding densities. It also eliminates any potential secondary effects caused by the different transcriptome resulting from global miRNA loss (Lim, 2005). Dgcr8-/- cells have a prolonged cell cycle compared to wild-type ESCs (Wang et al., 2008), and the slight difference in dilution factor introduces bigger differences at longer transfection times.

2.3.6 Applications of UTR reporter system

By transfecting UTR reporter systems into cells and measuring different combinations of variables, we can explore various aspects of miRNA regulation. For instance, by measuring both protein and mRNA expression of the reporter at different transfection levels, we can trace miRNA regulation strength at both the transcriptional and translational levels for different target abundances (Figure 2.9c and Chap. 3). By studying the variation of reporter expression, we can analyze how miRNA regulation controls target expression noise (Chap. 4).
By measuring endogenous gene expression in response to titration with increasing amounts of transfected decoys, we can explore the possibility of miRNA-mediated crosstalk (Figure 2.9e and Chap. 5). Finally, by labeling different regions of transcripts, for instance CDSs and 3'UTRs, with smFISH probes of different colors and studying their co-localization probability, we can discover whether there are any novel patterns of miRNA-mediated decay on certain targets (Figure 2.9d and Chap. 6).

[Figure 2.8 plot. Four panels: Lats2 Protein barplot, Casp2 Protein barplot, Rbl2 Protein barplot, Lin28 Protein barplot. x-axis: log(mCherry). Legend in each panel: OriUTR, WT; MutUTR, WT; OriUTR, Dgcr8KO; MutUTR, Dgcr8KO.]

Figure 2.8 Validation of MutUTR design. The GFP reporter followed by either the original or the mutated UTR was co-transfected into WT or Dgcr8 KO ESCs with the indicator protein mCherry. For UTRs under strong miRNA repression (Casp2, Lats2, and Rbl2), three of the four conditions (OriUTR in KO cells, MutUTR in WT cells, and MutUTR in KO cells) are very similar to each other, and the expression of GFP-OriUTR in WT ESCs is much lower. The Lin28 UTR serves as an example of a mildly miRNA-regulated target, and all four conditions are similar to each other.
[Figure 2.9 schematic. (a) Transfection of the GFP reporter and the mCherry indicator. (b) Measurable quantities: GFP mRNA, endogenous mRNA (CDS and 3'UTR), mCherry mRNA, GFP protein, and mCherry protein; the drawing also labels indicator mRNA, decoy mRNA, and miRISC. (c) GFP mRNA + GFP protein + mCherry protein: miRNA regulation at transcriptional and translational level. (d) GFP mRNA + UTR mRNA + mCherry protein: miRNA mediated decay. (e) GFP mRNA + endogenous mRNA CDS + mCherry protein: miRNA mediated crosstalk.]

Figure 2.9 Experimental schematics. (a) Cotransfection of the reporter plasmid pCAG-d2eGFP-Ori/MutUTR and the transfection level indicator pCAG-mCherry into WT/KO ESCs. The ratio of reporter plasmid to indicator plasmid is tunable. (b) A list of all measurable quantities. Because the spectra of the fluorophores overlap, the number of quantities that can be measured simultaneously is limited. (c-e) Different aspects of miRNA regulation can be studied by different combinations of measurables.

2.4 Materials and Methods

2.4.1 MutUTR design

The Perl script of TargetScanMouse v6.2 (available for download from http://www.targetscan.org/code) was used to generate computationally predicted miRNA targeting sites (both conserved and nonconserved) for any custom sequence. A quantitative description of each site was also extracted from TargetScan: the context+ score reflects the effectiveness of a targeting site, and PCT stands for probability of conserved targeting. miRNA expression data in mESCs (miRNA frequency by Solexa sequencing and miRNA microarray expression data) were collected from (Marson, 2008).

miRNA targeting sites were then filtered to select those to be mutated. First, the targeting miRNA has to be definitely expressed in mESCs. We chose an expression threshold of 10 counts for sequencing data (~0.01% of the total miRNA count) and a threshold of 30 for microarray data (~ the average of the negative controls). Secondly, the miRNA site has to be effective enough, and only sites with a context score larger than -0.1 were selected for mutation. For evolutionarily highly conserved targeting sites (PCT>0.1), we insisted on mutating the sites even if the targeting miRNA was not expressed in ES cells or the context score was smaller than -0.1, because their mutation could prevent potential miRNA repression in spontaneously differentiating ES cells, which could happen to express the corresponding miRNA. Moreover, their mutation could generalize the application of the mutated UTR to cell contexts other than ESCs. There is a trade-off between effective abolishment of miRNA repression and minimal perturbation of the sequence, and the above thresholds were chosen empirically to balance the two factors. The effectiveness of each MutUTR was validated experimentally.

Two of the six seed positions (e.g. positions (2, 4), (3, 5), or (4, 6)) of each selected miRNA targeting site were mutated. Adenine (A) was interchanged with cytosine (C), and thymine (T) was interchanged with guanine (G). The specific position combination was chosen to maintain GC content. If the miRNA site also involved 3' supplementary pairing, one of the 3' positions (position 13 to 16) was also mutated (Mayr et al., 2007; Melton et al., 2010; Tay et al., 2008; Wang et al., 2008; Wu and Belasco, 2005). The resulting mutated sequence was used as input for another round of TargetScan miRNA site prediction. This process was iterated until no novel miRNA targeting site passing all mutation thresholds was generated. The GC content of the final mutated sequence was checked to make sure it did not change by more than 1 percent. Finally, the mutated sequence was checked for internal restriction enzyme cutting sites (NEBcutter), and XhoI and BamHI/BglII sites (depending on which one was compatible with the internal cutting sites) were added to the 5' and 3' ends as flanking sequences. The final sequence was synthesized ab initio from GeneArt® Gene Synthesis, and was delivered as a plasmid bearing the designed MutUTR sequence.
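The site selection rule and the double point mutation described above can be sketched in a few lines. This is an illustrative sketch rather than the script used for the thesis, and it rests on several assumptions: the function names are hypothetical; a miRNA is treated as expressed if it passes either the sequencing or the microarray threshold; "larger than -0.1" is read as larger in magnitude (score at or below -0.1), since more negative context+ scores indicate stronger predicted repression in TargetScan; and the candidate position pairs are given as 0-based offsets within a 6-nt seed match on the UTR, which is a simplification of the seed position numbering used in the text.

```python
# Base swaps described in the text: A <-> C and T <-> G (DNA letters only).
SWAP = {"A": "C", "C": "A", "T": "G", "G": "T"}

def keep_for_mutation(seq_count, array_signal, context_score, pct):
    """Return True if a predicted site would be selected for mutation.

    Assumptions: either expression dataset may pass its threshold, and
    an effective site has a context+ score of -0.1 or lower.
    """
    expressed = seq_count >= 10 or array_signal >= 30
    effective = context_score <= -0.1
    conserved = pct > 0.1
    # Highly conserved sites are mutated regardless of the other factors.
    return conserved or (expressed and effective)

def gc_count(seq):
    """Number of G and C bases in a DNA string."""
    return sum(base in "GC" for base in seq.upper())

def mutate_positions(seq, positions):
    """Apply the A<->C / T<->G swap at the given 0-based indices."""
    bases = list(seq.upper())
    for i in positions:
        bases[i] = SWAP[bases[i]]
    return "".join(bases)

def double_point_mutation(seed_match, pairs=((0, 2), (1, 3), (2, 4))):
    """Try each candidate pair of positions and keep the mutant whose
    GC count is closest to that of the original seed match."""
    original_gc = gc_count(seed_match)
    candidates = [mutate_positions(seed_match, p) for p in pairs]
    return min(candidates, key=lambda m: abs(gc_count(m) - original_gc))
```

For example, double_point_mutation("GCACTT") returns "TCCCTT": swapping the first and third bases turns one G into T and one A into C, so the GC count of the seed match is unchanged. In the full procedure the mutated UTR would then be passed back to TargetScan, and the loop repeated until no new site passes keep_for_mutation.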
2.4.2 Plasmid construction

d2eGFP (destabilized enhanced green fluorescent protein #2) has a PEST destabilization tag on the C-terminus of eGFP, which targets the protein for degradation and results in rapid protein turnover, along with healthier cells under d2eGFP overexpression. d2eGFP was subcloned from the pcDNA5CMV-d2eGFP vector and ligated into the pCAGGS-RBGpA vector digested with EcoRI and BglII to generate pCAG-d2eGFP-RBGpA. The purpose of the promoter switch was to optimize transgene expression in ESCs.

3'UTRs of endogenous genes were PCR-amplified from mESC genomic DNA. Flanking XhoI and BamHI/BglII cutting sites were appended via the PCR primers. PCR fragments were digested and ligated into the XhoI and BglII double-digested pCAG-d2eGFP-RBGpA backbone to generate pCAG-d2eGFP-OriUTR-RBGpA. MutUTRs were extracted from GeneArt plasmids by either subcloning or digestion, and ligated into the XhoI/BglII double-digested pCAG-d2eGFP backbone to generate pCAG-d2eGFP-MutUTR-RBGpA.

Starting from a previously established bi-directional reporter system (Mukherji et al., 2011), eYFP was replaced with ZsGreen1-1 (Clontech) or d2eGFP using EcoRI and NdeI digestion sites, because eYFP was silenced in mESCs. OriUTRs and MutUTRs were subcloned from the pCAG-d2eGFP-Ori/MutUTR plasmids with the desired cutting sites appended, and ligated into the pTRETIGHT-Bi-directional plasmid. Two color sets, pTRE-d2eGFP-mCherry and pTRE-ZsGreen-mCherry, were used in our system. ClaI and SalI/EcoRV were selected as cutting sites for insertion of UTRs after mCherry, and BglII and XbaI were selected as cutting sites for insertion of UTRs after ZsGreen/d2eGFP.

The bi-directional system has the reporter-to-indicator ratio fixed at the single cell level. In general, however, the cotransfection system is easier to construct, and it is more flexible because it does not require dox induction for transgene activation.
In particular, we have shown that even though the reporter-to-indicator plasmid delivery ratio can deviate from the bulk ratio due to stochasticity, the results in the low target expression region are still trustworthy.

2.4.3 PCR and sequencing primers design

OriUTRs were PCR-amplified from mESC genomic DNA, as very few mouse 3'UTRs contain introns (Hong et al., 2006). If decoys of other regions such as coding sequences (CDSs) had to be constructed, however, the sequence might have to be amplified from cDNA. MutUTRs were subcloned from GeneArt® plasmids. Restriction enzyme recognition sequences identical to the backbone digestion sites were incorporated into the PCR primers. If the UTR sequence itself contained such a cutting site, another enzyme with compatible ends was added to the primer instead. The compatible pairs usually used in this thesis were BamHI and BglII, and XbaI and NheI. All PCR primers are listed in Table 2.1.

Table 2.1 PCR primers
PCR primer | digestion enzyme | sequence | target size (bp)
d2eGFP_F | EcoRI | GGAATTCACCGGTCGCCACCATGGT | 878
d2eGFP_R | BglII | GAAGATCTAAATATTGGCGCTCGAGGCG |
Nanog_3UTR_F | XhoI | CCGCTCGAGGACTTACGCAACATCTGGGC | 223
Nanog_3UTR_R | BamHI | CGCGGATCCCCGACTGCTCTTCCGAAGG |
Oct4_3UTR_F | XhoI | CCGCTCGAGAGGCACCAGCCCTCCCTG |
Oct4_3UTR_R | BamHI | CGCGGATCCAGCTATCTACTGTGTGTCCCAGTC |
Sox2_3UTR_F | XhoI | CCGCTCGAGGGGCTGGACTGCGAACTG |
Sox2_3UTR_R | BamHI | CGCGGATCCCGCTTTCAGTGTCCATATTTCAAAAATTTATTTATCTC |
Lin28a_3UTR_F | XhoI | CCGCTCGAGAGGCCCAGGAGTCAGGGTTATTC |
Lin28a_3UTR_R | BamHI | CGCGGATCCCAGTACCAACTCTGGAGTACCAATAAG |
Casp2_3UTR_F | XhoI | AAGCTCGAGTGCCGCCTGCTATTCCTGC |
Casp2_3UTR_R | BamHI | CGGGATCCTCAACATTTATTTGGCACCTGATGGCAATAC |
Lats2a_3UTR_F | XhoI | AAACTCGAGCGAGGAAACCCAAAATGAGATTTCTTTC |
Lats2a_3UTR_R | BglII | GGAAGATCTGGCTTTAAAGTTTTAATAATAAATTGTGCCAGTAGA |
Rbl2_3UTR_F | XhoI | CCGCTCGAGGGTTAGTGTCCAGGAGGAAACTGTCTTC |
Rbl2_3UTR_R | BamHI | CGCGGATCCTAAGTGCTTTATTGAAAAATACACATATTTTCATATAAAATTACAGTAGCG |
Cdkn1a_3UTR_F | XhoI | CCGCTCGAGAGTGCCCACGGGAGCC |
Cdkn1a_3UTR_R | BglII | GGAAGATCTCCGAATCATCGAGAAGTATTTATTGAGCACCAGCTTTGG |
NanogCDS_F
NanogCDSR Xhol BamHI CCGCTCGAGTGAGTGTGGGTCTTCCTGGTCC CGCGGATCCTGCCCTGACTTTAAGCCCAGATGT Cdknla3UTR_F- Clal CCATCGATAGTGCCCACGGGAGCC RFP __________________________ Sal AATAAGTCGACAATCATCGAGAAGTATTTATTGAGCACCA GCTTTGG Rbl2_3UTR_F_R Clal AATAAATCGATGGTTAGTGTCCAGGAGGAAACTGTCTTC Rbl2_3UTR_R_R FP EcoRV TTGGGATATCTAAGTGCTTTATTGAAAAATACACATATT CATATAAAATTACAGTAGCG Casp_3UTR_F1 Clal CCATCGATTGCCGCCTGCTATTCCTGC Sall TTATTGTCGACTCAACATTTATTTGGCACCTGATGGCAATA C Casp_3UTR_R_R FP 1085 2775 2018 1605 1391 1329 918 2018___ Cdknla3UTR_R_ RFP RFP 226 __________________________ 2018 1391 1605___ 32 1605 Nanog_CDS_F_R FP NanogCDSR_ RFP lats2a_3UTR_F_ Clal CCATCGATTGAGTGTGGGTCTTCCTGGTCC 918 EcoRV TTATTGATATCTGCCCTGACTTTAAGCCCAGATGT Clal CCATCGATACGAGGAAACCCAAAATGAGATTTCTTTTC RFP _________________________ Iats2a_3UTR_R_ RFP EcoRV TTATTGATATCGGCTTTAAAGTTTTAATAATAAATTGTGCC AGTAGA BamHI_lats2aUT RF Nhellats2aOriU TRR BamHI AATAAGGATCCCGAGGAAACCCAAAATGAGATTTCTTTTC Nhel AACAAGCTAGCGGCTTTAAAGTTTTAATAATAAATTGTGCC AGTAGA lats2aMut_F_R Clal CCATCGATACGAGGAAACCCAAAATGAGATTTCTTTTC lats2aMutRR FP EcoRV TTATTGATATCTAGGCTTTAAAGTTTTAATAATAAATTGTGC CAGTAG BamHlIats2aM utUTR F Nhellats2aMut UTRR BamHI Use BamHlIats2aMutUTR_F Nhel AACATGCTAGCTAGGCTTTAAAGTTTTAATAATAAATTGTG CCAGTAG FP 1605__ 1605 1605 1605________________________ 1605 1605 Sequencing primers were designed for verification of DNA sequences. They were designed -50 bases upstream of the sequence to be validated. And their design followed the primer considerations such as length and melting temperature listed by MIT sequencing facility (http://web.mit.edu/biopolymers /www/DNA.html). All sequencing primers were listed in Table 2.2. 
Table 2.2 Sequencing primers sequencing primer sequence pCAGGS4 GCTCTAGAGCCTCTGCTAAC pCAGGS20_d 2eGFP PCAGGS19_b PolyA ATGTCTTGTGCCCAGGAGAG nCh2MREpA CGTGGAACAGTACGAACGC pA2YFPside AGTCAGTGAGCGAGGAAGCT description located on pCAG backbone, for sequencing d2eGFP located on d2eGFP, for sequencing what's after d2eGFP located on pCAG backbone, for sequencing what's before RBGpA located on pTRE backbone, for sequencing what's after mCherry located on pTRE backbone, for sequencing what's before pA on the ZsGreen/GFP side CAGCATATGGGCATATGTTGCC Rbl2Seq1 RbI2SeqMut1 CTTCCCCAGTAGGTACTGTAC p21Seqi GTGATCTGCTGCTCTTTTCCC p2lSeqMutl CCTCTATTTTGGAGGGTTAATCTGG GATACTGATCCCTTGAGCACTC lats2Seql GGATATGACTGAGTTCTTCGGG CaspSeq1 GTCTGTATGCCATGACACTGG CaspSeq2 TCTGGTGATGTCATTCTCTTGC CaspSeq3 GGAAGAGGGCATTTGGATTTCTC CaspSeqMut3 GAAGAGGTCCTTTGGCTGTATC plin28_1 CCTGCACTGTGTTCTCAGGTAC plin28_2 TCTCTCGACCTAAGGGTGACAG plin28 3 ATAACCCTGTCCTTTGGTGCTG

2.4.4 Molecular cloning materials and kits

KOD Hot Start Master Mix (Novagen) was used for PCR amplification. The QIAquick® PCR Purification Kit (Qiagen) was used for purification of PCR fragments after enzymatic digestion or for general cleanup of DNA. The QIAquick® Gel Extraction Kit (Qiagen) was used for extraction of DNA fragments from polyacrylamide gels. All restriction enzymes and endonuclease buffers were ordered from NEB. The Quick Ligation™ Kit (NEB) was used for quick ligation of DNA after digestion. Plasmid DNA was transformed into One Shot® TOP10F' Chemically Competent E. coli cells (Invitrogen C303003) and cultured overnight in LB. The QIAprep® Spin Miniprep Kit (Qiagen) was used for small-scale (from 5 ml LB) plasmid DNA preparation from pelleted bacteria, whereas the HiSpeed® Plasmid Maxi Kit (Qiagen) was used for large-scale (from 200 ml LB) plasmid preparation. The DNeasy® Blood & Tissue Kit (Qiagen) was used for total DNA extraction from cultured mESCs.
2x10^6 cells were used as starting material to prevent overloading of the silica-based spin column.

2.4.5 smFISH probe design

smFISH probes were designed using a custom algorithm [now publicly available at https://www.biosearchtech.com/stellarisdesigner/ ] to locate 20-mer oligonucleotide regions with 35-65% GC content in the cDNA of the gene of interest, with a separation of at least 2 nucleotides between adjacent probes. Pre-designed probes were then subjected to BLAST (Basic Local Alignment Search Tool) analysis against the entire genome, and those with significant alternative targets were removed from the selection process. Oligonucleotide probes were then coupled to the fluorophores tetramethylrhodamine (TMR) (Invitrogen), Alexa 594 (Invitrogen), or Cy5 (GE Amersham), chosen strategically to allow simultaneous visualization of different transcripts. The concentration of coupled probes was measured using a Nanodrop, and the probes were diluted to a working concentration of ~1 ng/µl during FISH hybridization.

For each 3'UTR of choice, we designed FISH probes against its coding sequence (CDS) and 3'UTR separately. We aimed to design 48 probes against each region, but sometimes, due to the limited sequence length and low GC content of the 3'UTR region, as few as 36 probes were designed. Due to the high sequence similarity between the original and mutated versions of each 3'UTR, FISH probes designed against OriUTRs were confirmed to also work for their mutated versions. FISH probes for the transcripts of the fluorescent proteins d2eGFP and mCherry were also designed, and were labeled with Cy5 for fluorescence compatibility in flow cytometry experiments. The sequences of the probes are available upon request.

2.4.6 Cell lines

V6.5 mouse embryonic stem cells (WT mESCs) were derived from the inner cell mass (ICM) of a 3.5 day old male mouse embryo from a C57BL/6 X 129/sv cross background. The Dgcr8-/- ES cell line was a gift from Robert Blelloch's lab.
The cell line was derived from the V6.5 strain, but is incapable of forming a functional microprocessor and of producing canonical miRNAs. The V19 ES cell line was a gift from Laurie Boyer's lab. This cell line was also derived from the V6.5 background, but it contains a reverse tetracycline trans-activator (M2rtTA) driven by the Rosa26 promoter. It is able to activate the expression of genes driven by a Tet-On promoter under doxycycline induction. The Dicer knockout ES cell line and the miR-290 cluster knockout ES cell line were gifts from Phil Sharp's lab.

2.4.7 Cell culturing

All tissue culture plates were gelatin coated for maximum ES cell attachment. A 0.2% gelatin solution was made by dissolving gelatin powder (Sigma G1890) in 1xPBS (GIBCO 14190), and it was autoclaved, filter sterilized, and stored at 4 °C. To make gelatinized plates, we distributed a suitable amount of 0.2% gelatin solution to cover the plate surface, incubated it for at least 10 min at room temperature, and then removed the solution. mESCs were all cultured on gelatinized cell culture plates in standard mESC media (Table 2.3). During the 4 hours of transient transfection, all cell lines were grown in mESC media without the antibiotics Penicillin-Streptomycin.

Table 2.3 mESCs media
Component | Purchase number | Quantity (500 ml) | Final concentration
Knockout DMEM | GIBCO 10829 | 410 ml |
FBS (heat-inactivated) | Hyclone SH30070.03EH | 75 ml | 15%
L-glutamine | GIBCO 25030 (200 mM) | 5 ml | 2 mM
Penicillin-Streptomycin | GIBCO 15140 (10,000 U/mL) | 5 ml | 100 U/ml Pen, 100 µg/ml Strep
MEM Non-Essential Amino Acids | GIBCO 11140 (100x) | 5 ml | 100 µM
β-Mercaptoethanol | Sigma M6250 | 4 µl | 0.1 mM
Leukemia Inhibitory Factor (LIF) | MILLIPORE ESG1107 | 50 µl | 10^3 U/ml

The ES cells were seeded at a density of 2M cells per 10 cm plate, and were passaged every two days at a ratio from 1:6 to 1:10. ESCs were detached with 0.25% Trypsin supplemented with 0.53 mM EDTA.
To facilitate the maintenance of undifferentiated ESCs, γ-irradiated MEF cells (GSC-6202G) were plated as feeder layers one day before the plating of ESCs. MEF cells were seeded at a density of 2M cells per 10 cm plate. The MEF media was the same as the standard mESC media, except that it was not supplemented with LIF and the FBS concentration was reduced to 10%. The MEF feeder layer was only used for maintenance and passage of ESCs, and it was removed by differential sedimentation before the transfection experiment to reduce contamination by MEF cells.

2.4.8 Transient transfection of plasmids and dox induction

One day before the transfection, ESCs were plated at a density of 1.4M cells per 60 mm plate, i.e. 5e4/cm². The cells were transfected about 18 hours later at a confluence of 80%. For each transfection sample in the 60 mm plate format, 20 µl Lipofectamine 2000 (Invitrogen 11668) was mixed with 8 µg plasmid in 1 ml Opti-MEM I Reduced Serum Medium (Invitrogen 31985) according to the Lipofectamine 2000 plasmid DNA transfection protocol. In the cotransfection system, 7 µg reporter plasmid and 1 µg indicator plasmid were mixed and added. In V19 ESC transfections, doxycycline (Sigma-Aldrich D9891) was added to the media right after transfection at a concentration of 1 µg/ml. Cells were cultured in mESC media without antibiotics for 4 hours during the transfection, and were then changed back to standard mESC media. Transfected cells were passaged the next day at a ratio of 1 to 3 from each 60 mm plate into one 10 cm plate. 48 h after the start of transfection, cells were harvested for downstream experiments.

2.4.9 Cell fixation and hybridization

We performed smFISH as previously described (Raj et al., 2006). Harvested cells from a 10 cm plate were fixed with 1 ml fixation buffer (3.7% para-formaldehyde in 1xPBS) at room temperature for 10 minutes, washed twice with 1xPBS, and permeabilized in 1 ml 70% ethanol at 4 °C for at least overnight.
For hybridization, the samples were resuspended in 100 µl of hybridization solution containing labeled DNA probes in 2xSSC, 1 mg/ml BSA, 10 mM VRC, 0.5 mg/ml Escherichia coli tRNA, 0.1 g/ml dextran sulfate, and 25% formamide, and incubated overnight at 30 °C. Optimal probe concentrations during hybridization were determined empirically, and the working concentrations were usually around 1 ng/µl. The next day, the samples were washed twice by incubating in 1 ml of wash solution consisting of 25% formamide and 2xSSC at 30 °C for 30 minutes. The cells were then transferred to ~1 ml FACS buffer for flow cytometry or ~100 µl glox buffer for microscopy imaging.

2.4.10 Flow cytometry

Cells were resuspended in ~1 ml FACS buffer (1xPBS supplemented with 2% RNase-free BSA (NEB B9000S) to reduce cell stickiness to the container). Samples were filtered through the strainer cap of polystyrene test tubes (Falcon #352235) to reduce clumps. Cells were assayed on an LSRII analyzer (BD Biosciences) with FACSDiva software. Single cells were separated from cell clusters and debris by gating on their forward scatter (FSC) and side scatter (SSC) profiles. eGFP and ZsGreen protein intensities were read from the FITC channel, mCherry protein intensity was read from the PE-TxRed channel, and Cy5 intensity (corresponding to the labeled transcript) was read from the APC channel.

2.4.11 Microscopy Imaging and image analysis

We counted the mRNAs in individual cells as described previously (Raj et al., 2006). Briefly, the samples were resuspended in glucose oxidase anti-fade solution, which contains 10 mM Tris (pH 7.5), 2xSSC, and 0.4% glucose, supplemented with glucose oxidase and catalase. Then 3 µl of cell suspension was sandwiched between two coverglasses and mounted on a glass slide using a silicone gasket. Images were taken with a Nikon TE2000 inverted fluorescence microscope equipped with a 100x oil-immersion objective and a Princeton Instruments camera using MetaMorph software.
Stacks of images were taken automatically with 0.35 microns between the z-slices.

To segment the cells, a marker-guided watershed algorithm was used. Briefly, cell boundaries were obtained by running an edge detection algorithm on the bright-field image of the cells. To generate markers, the centroid of the region enclosed by each individual cell boundary was computed. A marker-guided watershed algorithm was then run on the distance transformation of the cell boundaries, using the markers located within the cell boundaries. The resulting cell segmentation image was then manually curated for mis-segmentations. A manual segmentation method was used as a supplement, with GFP mRNA images used as a reference for manually drawing cell boundary polygons.

To quantify the number of RNA molecules in each cell, a LoG (Laplacian of Gaussian) filter was run over each optical slice of an image stack to enhance signals. A threshold was applied to the resulting image stack to pick up mRNA spots. The location of each mRNA spot was then taken to be the pixel with the regional maximum value in each connected region. The number of mRNA spots located within the boundaries of an individual cell was thus quantified.

2.5 Supplementary

[Supplementary Figure 2.1 illustration. Title: Quantitative measurement of mRNA using smFISH. Upper part: 48 20-mer probes, each coupled to a fluorophore, bound to a target mRNA. Lower part: max-Z projection of raw image; computationally detected mRNA dots.]

Supplementary Figure 2.1 Illustration of smFISH and image analysis. Above, illustration of the smFISH technique. The smFISH method probes each mRNA species with 30 or more short, singly labeled oligonucleotide probes that are about 20 nucleotides in length. Simultaneous binding of a probe set, which typically consists of 48 different oligonucleotide probes, to each mRNA molecule results in a diffraction-limited fluorescent spot under the fluorescence microscope. Below, analysis of mRNA spots. The left panel is a fluorescent maximum Z-projection image showing Oct4 transcripts in WT ESCs.
The right panel is the processed image showing each individual mRNA transcript as a single bright pixel. Cells were segmented using bright-field images, and cell boundaries are shown as red polygons.

[Supplementary Figure 2.2 plot. Panels of Oct4, Sox2, and Nanog transcript levels, one row per ES cell line, with the mean annotated in each panel.]

Supplementary Figure 2.2 Transcript levels of the ES cell core transcription factors Oct4, Sox2, and Nanog in various ES cell lines. Transcript levels of Pou5f1 (encodes Oct4), Sox2, and Nanog were measured with smFISH in Dicer knockout ESCs (first row, in red), miR-290-295 cluster knockout ESCs (second row, in black), and wild-type ESCs (third row, in blue). Pou5f1 and Sox2 expression is similar in the three cell lines. The Nanog transcript level in Dicer-/- ESCs is almost 3-fold higher than in WT ESCs. The miR-290-295 cluster and other miRNAs affect Nanog transcript expression, but whether this effect is direct or not needs further study.

[Supplementary Figure 2.3 plot. Panel title: bi-directional plasmid protein barplot. x-axis: log10(mCherry). Legend: pTRE-GFPlin28UTR-RFP, KO; pTRE-GFP-RFP, KO.]

Supplementary Figure 2.3 Bar plot of GFP expression at different mCherry levels in different bi-directional systems. Bi-directional plasmids were transfected into Dgcr8 knockout ESCs. The GFP expression from pTRE-GFPlin28aOriUTR-RFP differs from that of pTRE-GFPRBGpA-RFP. This difference is independent of miRNAs, and comes from factors such as 3'UTR length and RNA binding proteins.
[Supplementary Figure 2.4 plot. Six panels: (a) Casp2 protein barplot, (b) Casp2 mRNA barplot, (c) Rbl2 protein barplot, (d) Rbl2 mRNA barplot, (e) Lats2 protein barplot, (f) Lats2 mRNA barplot. x-axis: log(mCherry). Legend in each panel: OriUTR, Dgcr8 KO; MutUTR, Dgcr8 KO.]

Supplementary Figure 2.4 Bar plot of GFP protein and mRNA at different mCherry levels for transfection of Dgcr8-/- ESCs. Unlike GFP-RBGpA, the expression of which is much higher than that of GFP-OriUTRs even in Dgcr8-/- ESCs, GFP-MutUTR closely mimics the expression of its original counterpart, at both the protein and mRNA levels, in the miRNA null background. The mutation of MREs barely perturbs other factors affecting expression, and MutUTR should be used instead of GFP-RBGpA as the miRNA unregulated control in wild-type cell lines.
41 Supplementary sequence information MutUTR sequence for GeneArt@ ordering >XhoI-Casp2UTRMut-BamHI CTCGAGTGCCGCCTGCTATTCCGGATGTTGGAGGCCACTGGACCACTGGGGGCACAAGGTAGACTTCTC TTCAGAATGGTTTTTGTTCTGTATCCCCTCTAATGGATATGAGATTCTCCCAGGCTTGTTTCCTGTCAGCC ATCTCTGTCTTTGGGTATGAAACATAAGGATGGCTCCTCCGGTGTCGTGTTCTCGAACTATAGAGCCAGC TCTGAATGGATGTGTTACCAGAAGCATTTTAGCTACAGCCTAGAAAATGACATTTTTAACACATTCTTATT GTGGGAAGAGGTCCTTTGGCTGTATCAATGTTGGGGATATTTTTGTTCCCAAGGCATCTTAGGAGTACTT GGATCATAGCTTTTTTTTTTTTTCCTAAATCAGTTAAGGAGTCTCAGAGATCCTATCCTTTTTT1TCCATATC TACACCATCATTTTTCCCACAGTGGAGATTTGGAAGATGTCCCAATTTAATGTAGGTGTTTTCATCTGTCA TTACGGGACAGATGAGATCCTACTACTTGCGAAGTTTCTATGCATACCTTTAAGTTCAGGCCCTAGGTTA CGGACAGTCCCTCAGCCTTTCCATTGGTTCCTTTGTGTTCAGTGCACCCAGCCTTTGAACAGAGCCTAGG GTCTGTATGCCATGACACTGGAAGTCATAGAAATTTCCCTGGTCATGCTTTGTTTGAACTTTAACTGAATG AACCTTATCGGGCATAACGAAATGAAAATGCAGTGACAGCTGAGTGTGCTGTGTCTCACACTATCACCC GTCATCAGGATGTCGCGCCTTCCTTACTGTGGCTTCTGCATGCCCGTACACTGTACTTGACGGCTGGCCT CCAGGGTCTCTCTTGCTTTGTACTGGTTCCCCTCTTTACCTTCACCATTCGATTCTTCTGCCAAGTCTGTGA AGCCGTCCTTTGTAGGATTTGTCTTGCCACTTACGCTGTCCGGTAGTTGCTTATTCTTTCTGCCTTCTGCTT CAGCGTGAGGCTTCTTTGGTTTTCTGTGGCAGCGTCTCCCTTCTCATTGTTTCTCTGTGTTTTAGTGGGGA TAGTACCATATGTGATATAACCTAGAAGAAATTGTCTCTGCTCTTATGAAACTTGCTTATTCTTGAAAACC TTCTGCATTTCCATTTTTTCCTCTCGTACAATTTATTCTCCATGTAACAGAGTAGTTTGGTTTTTAAAATATC TGGTGATGTCATTCTCTTGCTTAGAACACTAGCTTCCTGTTACGCTTCATCTAAAATGCAAATTCTTACACC CAGCTTACGAGATCTGGCTCATACCTTCCCTTTGGATCTCATTAAATGGTGATGTATCACTATGCTCCAGC CCCTCTTAGGTCCTCCTATCCGTCTTGCAGGTGTTCTGAACTCTCCTTTGGCTAGTCTCTGATTTTTGAGTC TGGCGGAGGCCTCTTGACCATTCGGCCCATGCTGTCTACTGTGCCTCCTTATGAGGGCATCATGTTGGTC TCTGTTGTGCTTACTGCAGGCTGTAATGGCCCGTTTGCTTGTGTAACTTGTTCCCTCTGAGGCTGAATGCT CCAAGAGAGTGGGAACTGTGCTTCTTACTTACTGATATCCAGTAACTGGCCCGTACTAGGTCTTCATGCA GGTTTCCTGAGTAAAGGAAGGAGACCAGCATCGAACCTTAGTTAGAGCCTACCTTTTGCAGTTTCTAAAT TGCTATTATAGTGTACAGTTCAATTAGTATATGGGTTTTTTTTTCCAGGTGTTTTATTTTTATCCACTGTTTT GTTGTTGT1TTTTTATATTTTCTAAAGATCACGTTTTAGAAACCTTCTTTCACATCTCCATAGTGCCCAGCA 
AATTTGAGGCCTATGGTAGTTGAGGTGCTCACCGAATGTGTTTTGTATGAACCAAGTGGTTTGAAGACTT GCTCCAACATTCTGCCTTTTGGGTCAGTATAGGCTTCATAAGTGGTAGAATCTTCACACTTCCCACGGAC AAGATTTTGTATTGCCATCAGGGTACCAATAAATGTTGATGGATCC >XhoI-Cdkn1aUTRMut-BgIII CTCGAGAGTGCCCACGTGCGCACAGCCCTCTTCTGCTGTGGGTCAGGAGGCCTCTTCCCCATCTTCGGCC TTAGCCCTCACTCTGTGTGGCGTAATTATTATTTGTGTTTTAATTTAAACGTCTCCTGTATATACGCTGCCT GCCCTCTCCCAGTCTCCAAACTTAAAGTTATTTAAAAAAAGACCCAAACCACACAAAAAAAACCACACCA CACCAAACCTAAATTAGTAGGACGGTAGGGCCCTTAGTGTGGGGGATTTCTATTATGTAGATTATTATTA TTTAAGCCCCTCCCAACCCAAGCTCTGTGTTTCCTATACCGGAGGAACAGTCCTACTGATATCAACCCATC TGCATCCGTTTCACCCAACCCCCCTCCCCCCATTCCCTGCCTGGTTCGTTGCCACTTCTTACCTGGGGGTG ATCCTCAGACCTGAATAGAAATTTGGAAAAATGAGTAGGACTTTGGGGTCTCCTTGTCACCTCTAAGGCC 42 AGCTAGGATGACAGTGAAGCATGAACAGCCTAGAACAGGGATGGCAGTTAGGACTCAACCGTAATATC CCGACTCTTGACATTGCTCAGACCTGTGAAGACAGGAATGGTCACAACTCTGGATCCCCTTTGCCACTCC TGGGGAGCCCACCTCTCCTGTGGGTCTCTG CCAGCTGCCCCTCTATTTTGGAGGGTTAATCTGGTGATCT GATTCTCTTTTCCCCCACCCCATACTTCCCCTTCTGCAGGTCGGCAGGAGGCATATCTAGGAAATTGCCAC CCAGCTCAGTGGACTGGACGTGCATGTATATGCAGGGTACACTAAGTGGGATTCCCTGGTCTTACCTTA GGAATCTCCAGTGGCAACCCCCTGCATTGTGGGTCTAGGGTGGGTCCTTGGTGGTGAGACAGGCCTCCA ATAGCATTCTATGGGGGGTGGTGGTGGGGGTGGGCTTATCTGGGATGGGGACCCCAGTTGGGGTTCTC AGTGACTTCTCCCATTTCTTAGTAGCAGTTGTACAAGGAGCCAGGCCAAGATGGTGTCTTGGGGGCTAA GGGAGCTCACAGGACACTGAGCAATGG CTGATCCTTTCTCATTTTTGAATACCGTGGGTGTCAAAGAAA TTAGTGGGTCTGACTCCAGCCCCAAACATCCCTGTTTCTGTAACATCCTGGTCTGGACTGTATCCCCTTAG CCCGCACCCCAAGAACATGTATTGTGGCTCCCTCCCTGTCTCCACTCAGATTGTAAGCGTCTCACGAGAA GGGACAGCACCCTGCATTGTCCCGAGTCCTCACACCCGACCCCAAAGCTTGGGCTCAATAAATACTTCTC GATGATTAGATCT >Xhol-lats2aUTRMut-BgI I CTCGAGCGAGGAAACCCAAAATGAGATTTCTTTTCAGAAGACAAACTCAAGCTTAGGAAGCATTCATTTT TAGTTCTGGTAAATGGGCAACAGGAAGAGTCAACATGATGTAAAATTAGCCCTCTGAGGACCTTCACTG AAGTAAAACATACT1TTTAAAAAATTAGTACAGTATGGACAGATCCCTTATTTTGTGGATACCCATCTTTT TCTTACTAAATTATAAGGACTGACGGGGAGAAACCATGATTCTGTTATTTCCATGTGTGTTGTATCGGCT AGAAATTGTCCACAGCTAGAAAAGGAAGAGCTGGAGAGCGTGAGGCAAGACGTCTGTTCCATAAGAGA 
GGATGAGGCGACGGAGCTCTGCTCAAGTCACGAGGACCGCTTATCTACACAGTGGCTTAGTTTTGTATT TTCGCACATGTAAAATTGTGATGTAATGTTTGAAAGCTGCTTTTGTATTTTCTCCTTTTCCTATTATAGTTC CTAGAAAGAGTGAGCAGAGAGCTGGTGGGTGTGACTCCGGTGTCTGGTGTGGAGAGTACTGCATGAGC AGGGGTTTCTAGTATAAAATACCGTATCGTTCCATTCACATCCGGTCCTTTTAATACGTTTTTAAATGAGG TATTCTAGACAGTGTGCTTAGATTGTATTGTGTGGATGTGTGTTAAAGAAATGCATATGTATAACTGAAG TGTGAT1TTTTTTTTAATGTGTGTGTGTCTTGGATATGACTGAGTTCTTCGGGAGGCAAATGTAAACATTTG TCATATATAAAACACATCAAACGTGATTAAGTCAGCTTTTCAAAAACATTGACATAATTCTAGCGTTTTGT CCATTTCCGTAGTCCTGTCTGCTGTCAGGTGTGGCTGTGGGAGCTGGACCCTGCTATTCATTCTTTCACTC ACAGGGTTCCCG CAGACCTAGGTGATGTAAGGGTCCTGCTTCCTGTGTTCTCAGCCAACCAGGAGGTTC TTTTAAACCCAGTTCTTTGGGCCTCTTTCACATGAGAGGTGTCTTTAACATCTCAATGTGAATGAATACGT TTTTCTAACTTTGTAAAAAGAAAAAAAGATTCTTTGAAGCAACATTGGAGTACAAAAACAATCAATACTT TTTTCTTAGACATATAGGGGGGTATATAACTATAGATAAACACACAAAATAGTCCTTATGTAAAATTAGT ACGCTTCCTACTTAAGGTGATTTATATTTGAGTACATTCAGTTTCTTTTGCTTTTCAAGGATGGAACACAT CCCATTTTCATTATGCTATGACCAATCTTCTCACCAAGGTTCTTAGCACAGTGCACCCGTTACTTAGGAGT ATCTAGGCAGAAACACTTACAAATTTATCGAGGTCTAAGAAACCTGCCTGTGTCTGGTGTCCATTTGTAT GAATGGCATATTCTGAAGTCTGCTGTGCTGGGATTGTTAATTACATTCTTCTCGCTATTTTGTAGTAATGC CGTGTTATTTACAGCGCTCTGACATAGTTTGATGTGGTAGGTTCTTTCTCAGGAACTCAATTTAACTATTA TTTATTGATATATCATTGCCTTTGAAAGCTTCTACTGGCACAATTTATTATTAAAACTTTAAAGCCTAAGAT CT >Xhol-Rbl2UTRM ut-BglI I 43 CTCGAGGGTTAGTGTCCAGGAGGAAACTGTCTTCACATGAACTGGTTATCCGGACTTAATG CATGCAGG ACTACGGAACCTTGCTCCTGAATCCAGCAACTGATTAAAGGAGGGGATAAAAGGGAAGCGCTTCTGACT AATTGTGGCAGCAAATGCCTGGTATCCCATCACCCAAGGGGTAGGGGACAAGAGGACCAGGAGTTTAA GGCCAACCTGAGCAATACAGCATGTCTGAGGGCAACTTGGACGAAATGTAACCCTTACTCAAAAACAAA GAACCGGAAGGGATGTTTTGGTAAGAAATCAGACTTATCTCACTGTCCTTTGACTATTTTTCATCCCAGTT GCCTCTTCCTCTACTTAGTGCTTACCTTCAACACGGCTCAGAATCCAAACTTGGGGTTTTGAACTCTGGCA AACTTTTACAAGTACTGCAGGAAGCAAATCTTTAGAGGCTTTTGTAGGTAGGCCCCAGGAGAGGAACTG TATTTAACTTCATTTCCACGTTCATATGGTTAGGTCCAACCATGTGTTTTAGGATGAAAACCAATAGACAT TTACAAACAGAACAAGAGGGGCTGGCCCCGACCTGGAAGTGTCCAGGCCTTGGCCTAAAGATACTGAT 
CCCTTGAGCACTCACTCTCCCTTCCCCAGTAGGTCCGGTACAGTTTTAAAGCGTTCCATGTCTGAAGGAA CCTGTGTAATTGGTGGCCCGTTATGGCTGTAAGATGCATAGCATTGTGACCCAGGGTTTGCTGTATATTT ATGATGGCCCGTTCTATGGTTTTAACTTTGGTAGGTACAAGCCTTAGGCTAAACAGCTAATAATTTCTTTT AATGCTTTTCTTAAAAGACTTCGGATATAGCTACATGTTCTGGCCACATGTAAAAAACTTCCATTTGTGGT AGTGGAAGTACATAGGGATCTTTTAGCTAAGTAAAGATTTTTAAGTCAAGTTGAATTGAGAGTATTGA AAAGTTTTGACCCCTTCCTTTTGGAAGTAGTTATCCCCACGAAACTATCTTTGAGGGTATTCCTGGAAGTT AAAAAAATAGGTTGGAGAAGTGAGGTTTTTATTAGTACATAGTACCATTTATACAAATTAGAAAATTATT TAACAGCTATTGATTATCTACGCATATCTTTATTAATCATTATTGTCGTTT1T TAAGTTGGATTAATAATC CTAAGGAAAAAATTCAATTGTAAATTGGATCATGATAAACCAAGTTACTAGGTAACTTCATGATTCTCTA CAGCACCCAGCTGAGGACCTACAAGCCTGGCACTCCCCCCCCCACCACAGAGTAGTGCTGTGCAGAGTA CTTAGAAAACCTTAGTACCGCTAATTTAATTTTATATGAAAATATGTGTATTTTTCAATAAAGAAATTATA AATTAAGATCT > CCG-Xhol-C+Iin28UTRMut+Bgll+TC CCGCTCGAGCGGCCCAGGAGTCAGGGTTATTATGTGGCTAATGGGGAGTTTAAGGAAAGAGGCATCAA TCTGCAG AG.TGG A ATGGGGGTA AGGTGTTCTGGGTACTTG A ArCG ArGTTrTrAGG CCGGGGTTCCCAGTGTCACCCTGTCTTTCCTTGGAGGGAAGGAAAGGATGAGGCAAAGGAACTCCTACC ACACTCTATCTGAAAGCAAGTGAAGGCTTTTGTGGGGGAGGAACCACCCTAGAACCCGAGGCTTTGACC AGTGGCTGGGCTAGGGAAGTTCTTTTGTAGAAGGCTGTGTGATATTTCCCTTGCCAGACGGGAAGCGAA ACAAGTGTCAAACCAAGATTACTGAACCTACCCCTCCAGCTACTATGTTCTGGGGAAGGGACTCCCAGG AGCAGGACGAGGTTATTTTCACACCGTGCTTATTCATAACCCTGTCCTTTGGTGCTGTGCTGGGAATGGT CTCTAGCAACGGGTTGTGATGACAGGCAAAGAGGGTGGTTGGGGGAGACAACTGCAGACCTTCGGCCC ACACCTCACTCCCAGCCCTTTCTGGGCCAATGGGATTTTAATTTATTTGCTCCCTTAGGTAACTGCAACGT GGGTCCCACTTTCTCCAGGATGCCAACTGAACGATCTACGTGCGAATGACGTATCTTGTGCGTTCTTTTT1 1T TTAAT1TFVAAAA I I I I I IICCTCTTCTTAAAATAAGTAATGGGTTTGTATTTTTTTCTATTTTAATCTT CCGGCCCTCATTCCTGCCCTTTGTTCTCAGGTACATGAGCAATCTCCGTGATAATAAGTCCGTAGCAGCTC CAGGTCTGCTCAGCCGTAATACTTTGTTTTGTTTTGTTTTGATCACCATGGAGACCAACCATTTGGAGTGC ACAGCCTGTTGAACTAACGCATTTTTGCCGATTACAGCTGGCTTTTCTGCAAGAGCGTCCTTGAAAAATG TGTCTCACGGGTTTCGATTGAGCTGCCCCAAGACTTGATCTGGATTTGGCAAAACATAGGACATCACTCT AAACAGGAAAGGGTGGTACAGAGACATTAAAAGGCTGGGCCAGGTAAAAGGCACAAGAGGAACTTTC 
CATACCAGATCCATCCTTTTGCCAGATTAGTGGAAGCCTGCCATGCACAGCCGTGTGTGAGAGAGAGAG TGTGTATGTATGTGTGTGTGGATTTTTTTTAATTCCAATTTATGAAGACGAGGTGGGTTTTGTTTATTTGA TTGCTTTTTGTGCTGGGGATAGAATCTTGGGCTTCATTTGTGCTAGGAAGTACACGGACACTGAGTTATC 44 CCAGTAAGAATTCCACTTAAGACCAGTACCCTTATTCCCACACTGTGCTGTCCAGGCATGGGAACATGAG GCAGGGACTCAACTCCTTAGCCTTTCACAATCTTGGCTTTCAGAGAGACTCATGAGTATGGGCCTCAGTG GCAAGTGTCCTGCCCTTCGGTAGCATGATGGTTGATAGCTAAAGGAAAGAGGGGGTGGGGAGTTTCGT TGAAATGCTGTTAGATCGCCAGAAACCTAACGCACTGTGTTGAAACGGGACAAATTCCATAGAACACAT TGGGTGGTGTGTGTGTGTGTCTGATCTTGGTTTCTTGTCTCCCTCTCCCCCCAAATTCGGCCCTCACCCCT AGTTAATTGTATTCGTCTGGCCTTTGTAGGACTTTTACTGTCTCTGAGTTGGTGATTGCTAGGTGGCCTAG TTGTGTAAATATAAATGTGTTGGTCTTCATGTTCTTTTGGGGTTTTATTGTTGAAAAAACTTTTGTTGTATT GAGAGAAAAATAGCCAAAGCATCTTTGACAGAAAGCTCTGCACCAGACAACACCATCTGAAACTTAAAT GTGCGGTCCTCTTCTCAAAGTGAACCTCTGGGACCATGGCTTATCCTTACCTGCTCCTCCTGTGTCTCCCA TTCTGGACCACAGTGACCTTCAGACAGCCCCTCTTCTCCCTCGTAAGAAAACTTAGGCTCATTTACTTCTT TGAGCATCTCTGTAACTCTTGAAGGACCCAGGTTAAAATTCTGAAGAAGCCAGGAACCTCATTATGTCCT TGTCCCTAACTCAGTGAAGAGTTTTGGTTGGTGGTTGTTAGACAGGGCCTCACTCTGTAGCTGGAGATA GAGAGCCTCGGGTTCCTGGCTCTCCTCCTGCCTTCTGCACAGAGTCCCCTGTGCAGGG CTTGCAGGTGCC GCTTCTCCCTGGCAAGACCATTTATTTCATGGTGTGATTCGCCTTTGGATGGATCAAACCAATGTAATCTG TCACCCTTAGGTCGAGAGAAGCAATTGTGGGGCCTTCCATGTAGAAAGTTGGAATCTGGACACCAGAAA AGGGACTATGACTTTACAGTGAGTCACTCAGGAACTTAATGCCGGTGCAAGAAACTTATGTCAAAGAGG CCACAAGATTGTTACTAGGAGACGGACGACTTTATCTCCATGTTGAATGCTAGAAACCAAAGCTTTGTGA GAAATCTTGAATTTATGGGGAGGGTGGGAAAGGGTGTACTTGTCTGTCCTTTCCCCATCTCTTTCCTGAA CTGCAGGAGACTAAGGCCCCCCACCCCCCGGGGCTTGGATGACCCCCACCCCTGCCTGGGGTGTTTTATT TCCTAGTTGATTTTTAATGGACCCGGGCCCTTTTCTTCCTATCGTATAATCATCCTGTGACACATGCTGACT TTTCCTTCCCTTCTCTTCCCTGGGAAAATAAAGACTTATTGGTACTCCAGAGTTGGGAATGAGATCTTC 2.6 References Alon, U. (2007). Network motifs: theory and experimental approaches. Nat Rev Genet 8, 450-461. Babiarz, J.E., Ruby, J.G., Wang, Y., Bartel, D.P., and Blelloch, R. (2008). 
Mouse ES cells express endogenous shRNAs, siRNAs, and other Microprocessor-independent, Dicer-dependent small RNAs. Genes & development 22, 2773-2785.
Bartel, D.P. (2004). MicroRNAs: genomics, biogenesis, mechanism and function. Cell 116, 281-297.
Bartel, D.P. (2009). MicroRNAs: target recognition and regulatory functions. Cell 136, 215-233.
Benetti, R. (2008). A mammalian microRNA cluster controls DNA methylation and telomere recombination via Rbl2-dependent regulation of DNA methyltransferases. Nature Struct Biol 15, 268-279.
Boyer, L.A. (2005). Core transcriptional regulatory circuitry in human embryonic stem cells. Cell 122, 947-956.
Boyer, L.A., Plath, K., Zeitlinger, J., Brambrink, T., Medeiros, L.A., Lee, T.I., Levine, S.S., Wernig, M., Tajonar, A., Ray, M.K., et al. (2006). Polycomb complexes repress developmental regulators in murine embryonic stem cells. Nature 441, 349-353.
Cheloufi, S., Dos Santos, C.O., Chong, M.M.W., and Hannon, G.J. (2010). A dicer-independent miRNA biogenesis pathway that requires Ago catalysis. Nature 465, 584-589.
Farh, K.K. (2005). The widespread impact of mammalian microRNAs on mRNA repression and evolution. Science 310, 1817-1821.
Friedman, R.C., Farh, K.K.H., Burge, C.B., and Bartel, D.P. (2008). Most mammalian mRNAs are conserved targets of microRNAs. Genome Research 19, 92-105.
Gangaraju, V.K., and Lin, H. (2009). MicroRNAs: key regulators of stem cells. Nat Rev Mol Cell Biol 10, 116-125.
Grimson, A. (2007). MicroRNA targeting specificity in mammals: determinants beyond seed pairing. Mol Cell 27, 91-105.
He, L. (2005). A microRNA polycistron as a potential human oncogene. Nature 435, 828-833.
Hong, X., Scofield, D.G., and Lynch, M. (2006). Intron Size, Abundance, and Distribution within Untranslated Regions of Genes. Molecular Biology and Evolution 23, 2392-2404.
Houbaviy, H.B., Murray, M.F., and Sharp, P.A. (2003). Embryonic stem cell-specific microRNAs. Dev Cell 5, 351-358.
Kanellopoulou, C. (2005).
Dicer-deficient mouse embryonic stem cells are defective in differentiation and centromeric silencing. Genes Dev 19, 489-501.
Lim, L.P. (2005). Microarray analysis shows that some microRNAs downregulate large numbers of target mRNAs. Nature 433, 769-773.
Lytle, J.R., Yario, T.A., and Steitz, J.A. (2007). Target mRNAs are repressed as efficiently by microRNA-binding sites in the 5' UTR as in the 3' UTR. Proceedings of the National Academy of Sciences of the United States of America 104, 9667-9672.
Marson, A. (2008). Connecting microRNA genes to the core transcriptional regulatory circuitry of embryonic stem cells. Cell 134, 521-533.
Mayr, C., Hemann, M.T., and Bartel, D.P. (2007). Disrupting the Pairing Between let-7 and Hmga2 Enhances Oncogenic Transformation. Science 315, 1576-1579.
Melton, C., Judson, R.L., and Blelloch, R. (2010). Opposing microRNA families regulate self-renewal in mouse embryonic stem cells. Nature 463, 621-626.
Mukherji, S., Ebert, M.S., Zheng, G.X.Y., Tsang, J.S., Sharp, P.A., and van Oudenaarden, A. (2011). MicroRNAs can generate thresholds in target gene expression. Nat Genet 43, 854-859.
Murchison, E.P., Partridge, J.F., Tam, O.H., Cheloufi, S., and Hannon, G.J. (2005). Characterization of Dicer-deficient murine embryonic stem cells. Proc Natl Acad Sci USA 102, 12135-12140.
Okamura, K., Hagen, J.W., Duan, H., Tyler, D.M., and Lai, E.C. (2007). The mirtron pathway generates microRNA-class regulatory RNAs in Drosophila. Cell 130, 89-100.
Raj, A., Peskin, C.S., Tranchina, D., Vargas, D.Y., and Tyagi, S. (2006). Stochastic mRNA Synthesis in Mammalian Cells. PLoS Biol 4, e309.
Rigoutsos, I. (2009). New tricks for animal microRNAs: targeting of amino acid coding regions at conserved and nonconserved sites. Cancer research 69, 3245-3248.
Rosa, A., and Brivanlou, A.H. (2011). A regulatory circuitry comprised of miR-302 and the transcription factors OCT4 and NR2F2 regulates human embryonic stem cell differentiation. The EMBO Journal 30, 237-248.
Ruby, J.G., Jan, C.H., and Bartel, D.P. (2007). Intronic microRNA precursors that bypass drosha processing. Nature 448, 83-86.
Sinkkonen, L. (2008). MicroRNAs control de novo DNA methylation through regulation of transcriptional repressors in mouse embryonic stem cells. Nature Struct Biol 15, 259-267.
Tay, Y., Zhang, J., Thomson, A.M., Lim, B., and Rigoutsos, I. (2008). MicroRNAs to Nanog, Oct4 and Sox2 coding regions modulate embryonic stem cell differentiation. Nature 455, 1124-1128.
Viswanathan, S.R., and Daley, G.Q. (2010). Lin28: A MicroRNA Regulator with a Macro Role. Cell 140, 445-449.
Viswanathan, S.R., Daley, G.Q., and Gregory, R.I. (2008). Selective Blockade of MicroRNA Processing by Lin28. Science 320, 97-100.
Wang, Y., Baskerville, S., Shenoy, A., Babiarz, J.E., Baehner, L., and Blelloch, R. (2008). Embryonic stem cell-specific microRNAs regulate the G1-S transition and promote rapid proliferation. Nat Genet 40, 1478-1483.
Wang, Y., Medvid, R., Melton, C., Jaenisch, R., and Blelloch, R. (2007). DGCR8 is essential for microRNA biogenesis and silencing of embryonic stem cell self-renewal. Nat Genet 39, 380-385.
Wu, L., and Belasco, J.G. (2005). Micro-RNA regulation of the mammalian lin-28 gene during neuronal differentiation of embryonal carcinoma cells. Mol Cell Biol 25, 9198-9208.
Yu, J., Vodyanik, M.A., Smuga-Otto, K., Antosiewicz-Bourget, J., Frane, J.L., Tian, S., Nie, J., Jonsdottir, G.A., Ruotti, V., Stewart, R., et al. (2007). Induced Pluripotent Stem Cell Lines Derived from Human Somatic Cells. Science 318, 1917-1920.
Zheng, G.X.Y., Ravi, A., Calabrese, J.M., Medeiros, L.A., Kirak, O., Dennis, L.M., Jaenisch, R., Burge, C.B., and Sharp, P.A. (2011). A Latent Pro-Survival Function for the Mir-290-295 Cluster in Mouse Embryonic Stem Cells. PLoS Genet 7, e1002054.
Chapter 3
Application of UTR reporter system to study microRNA regulation at transcriptional and translational levels

3.1 Abstract

MicroRNAs are known to regulate their targets by inducing mRNA degradation and inhibiting translation, but the relative contributions of the two mechanisms have been under debate. It is also unclear how miRNA regulation varies with target expression. Here we apply the UTR reporter system to study miRNA regulation at these two levels, and monitor miRNA regulation over a target expression range that spans more than two orders of magnitude. Unlike some genome-wide studies, which suggest that transcript degradation accounts for most (> 84%) of miRNA repression (Guo et al., 2010), we found that the contributions from the two sources were of the same order for all UTRs under study. Moreover, the strength of miRNA regulation was found to vary with target expression. The transcriptional regulation is more stable throughout the range of measurement, whereas translational regulation saturates and decreases at high target expression. Our data also suggest that miRNA regulation might increase initially in the low target expression region. Taken together, our measurements provide single cell information on miRNA regulation at two levels for a wide range of target expression.

3.2 Introduction

3.2.1 miRNA-mediated repression of translation

mRNA translation can be divided into three steps: initiation, elongation and termination. Initiation starts with the recognition of the 5' cap by the eukaryotic translation initiation factor (eIF) complex eIF4F, which contains the cap-binding protein eIF4E and eIF4G, an important scaffold protein for the assembly of the ribosome initiation complex. eIF4G also interacts with the poly(A)-binding protein (PABP) and brings the two ends of the mRNA into close proximity (Derry et al., 2006). This 'circularization' stimulates translation initiation by increasing the affinity of eIF4E for the cap and by facilitating ribosome recycling.
Translation initiation of some viral mRNAs is independent of the m7G cap; in this case, 40S ribosomes are directly recruited to the internal ribosome entry site (IRES) (Jackson, 2005). Initiation is usually the rate-limiting step in translation, and consequently the common target for translational control (Fabian et al., 2010).

Repression at the initiation step

The first evidence for miRNA-mediated translational repression at the initiation step came from the observation that m7G-capped mRNAs, but not IRES-containing or ApppN-capped (non-functional cap) mRNAs, were repressed when targeted by either endogenous (let-7) (Pillai, 2005) or artificial (CXCR4) (Humphreys et al., 2005) miRNAs. Polysome gradient analysis shows that transcripts targeted by miRNAs shift towards the top of the gradient in sucrose gradient sedimentation. This has been shown both in cultured cells (Bhattacharyya et al., 2006; Huang, 2007) and in C. elegans (Ding and Grosshans, 2009). Several in vitro experiments using cell-free systems faithfully recapitulate the mode of miRNA regulation in cultured cells. In all of them, the presence of the m7G cap was required for translational repression (Mathonnet, 2007; Thermann and Hentze, 2007; Wakiyama et al., 2007; Wang et al., 2006). MicroRNA specificity was validated either by mutating the target sites of the specific miRNA or by transfecting antisense oligonucleotides (antimiRs) that specifically block the targeting miRNA. Some discrepancies exist regarding the role of the poly(A) tail in miRNA-mediated translational repression. The poly(A) tail was reported to be both necessary (Humphreys et al., 2005) and dispensable (Pillai, 2005) for optimal miRNA repression. Translational repression was also observed in the absence of a poly(A) tail (Eulalio, 2009; Eulalio et al., 2008; Wu et al., 2006).
The data suggest that the poly(A) tail per se is not absolutely required for repression, but miRNA-mediated deadenylation might further contribute to translational repression by preventing the synergy between the 5' cap and the 3' poly(A) tail (Wakiyama et al., 2007).

Repression at post-initiation steps

A number of studies concluded that miRNAs can also inhibit translation at post-initiation steps, in addition to suppressing initiation. Polysome sedimentation analyses showed that miRNAs, components of the microRNA ribonucleoprotein complex (miRNP, also known as miRISC) and repressed mRNAs cosediment with actively translating polysomes in both C. elegans and mammalian systems (Maroney et al., 2006; Nottrott et al., 2006; Olsen and Ambros, 1999; Petersen et al., 2006; Seggerson et al., 2002). Moreover, several groups observed that IRES-driven translation is repressed by the miRNA machinery (Lytle et al., 2007a; Nottrott et al., 2006). In addition, repression of pal-1 mRNA by GLD-1 in C. elegans seems to involve the stalling or slowing down of elongating ribosomes (Mootz et al., 2004), and translational repression of miRNA targets was also found in yeast and D. melanogaster embryos (Braat et al., 2004; Clark et al., 2000; Ruegsegger et al., 2001). Several models have been proposed to explain miRNA inhibition at the post-initiation stage. Petersen et al. (Petersen et al., 2006) proposed a drop-off model, in which miRNAs render ribosomes prone to premature termination. Alternatively, Maroney et al. (Maroney et al., 2006) speculated that miRNAs decelerate translation elongation. The inability to detect nascent polypeptides from repressed reporters in both immunoprecipitation (Olsen and Ambros, 1999) and pulse-labeling experiments (Petersen et al., 2006) led Nottrott et al. to propose the proteolysis model (Nottrott et al., 2006).

3.2.2 miRNA-mediated mRNA deadenylation and decay

Evidence for miRNA-mediated mRNA degradation comes from studies on specific miRNA-target pairs, and more generally from transcriptome studies. Both microarray and deep sequencing data showed that the abundance of miRNA targets inversely correlates with the level of the miRNA. Cellular levels of selected miRNAs could be modified by miRNA transfection, antimiRs, genetic knockdown, or differentiation (Baek, 2008; Guo et al., 2010; Hendrickson, 2009; Krutzfeldt, 2005; Lim, 2005; Selbach, 2008). Furthermore, in cultured cells, depleting essential components of the miRNA pathway (for example, Dicer, AGOs or the miRNA silencing effector GW182) increased the abundance of miRNA targets (Behm-Ansmant, 2006; Eulalio, 2007, 2009; Giraldez, 2006; Rehwinkel, 2006; Rehwinkel et al., 2005; Schmitter, 2006). Expression profiles from differentiating and developing cells also provided examples of anti-correlated expression of miRNAs and their targets. For example, the dramatic increase in miR-430 expression at the onset of zebrafish zygotic transcription correlates with the degradation of a large number of maternal mRNAs containing miR-430-binding sites in their 3' UTRs (Farh, 2005; Giraldez, 2006; Mishima, 2006; Stark et al., 2005). Animal miRNAs rarely induce endonucleolytic cleavage of target transcripts, because they bind their targets with only partial complementarity. Instead, they direct their targets to deadenylation and accelerate mRNA destabilization through the 5'-to-3' mRNA decay pathway. Transcripts targeted by miRNAs are primarily deadenylated by the CAF1-CCR4-NOT deadenylase complex, then decapped by the decapping-complex proteins DCP1 and DCP2, and ultimately degraded by the 5'-to-3' exonuclease XRN1.
The role of these decay factors in miRNA-mediated mRNA decay is evidenced by the observation that the abundance of miRNA targets increases when these factors are depleted or when dominant-negative forms are overexpressed (Behm-Ansmant, 2006; Chu and Rana, 2006; Eulalio, 2007, 2009; Piao et al., 2010; Rehwinkel et al., 2005).

3.2.3 Cellular compartmentalization of miRNA repression

P-bodies or GW-bodies are cellular structures that are enriched in mRNA-catabolizing enzymes and translational repressors. They are considered discrete foci for mRNA degradation, and temporary storage sites for repressed mRNAs in yeast and mammals (Eulalio et al., 2007a). The demonstration that AGO proteins, GW182, miRNAs and mRNAs repressed by miRNAs are all enriched in P-bodies implied that P-bodies are involved in miRNA repression (Behm-Ansmant, 2006; Bhattacharyya et al., 2006; Jakymiw, 2005; Leung et al., 2006; Liu et al., 2005; Meister, 2005; Pillai, 2005). There is a good correlation between miRNA-mediated repression and accumulation of mRNAs in visible P-bodies (Bhattacharyya et al., 2006; Huang, 2007; Liu et al., 2005; Pillai, 2005). The endogenous CAT1 mRNA, a target of miR-122, localizes to P-bodies when its translation is repressed, and this localization can be reversed by stress. In addition, overexpression of miR-122 is sufficient to concentrate CAT1 mRNA in P-bodies (Bhattacharyya et al., 2006). Despite observations supporting the role of P-bodies in miRNA regulation, unresolved issues remain. Knockdown of some P-body components disperses visible P-bodies, but has no effect on miRNA function (Eulalio et al., 2007c). The relative distribution of miRISC components between P-bodies and the cytosol also calls the significance of P-bodies into question. Only about 1.3% of enhanced GFP (EGFP)-tagged AGO2 localized to P-bodies in HeLa cells (Leung et al., 2006). Moreover, exchange of miRNPs between P-bodies and cytoplasm is very slow (Andrei, 2005; Kedersha, 2005; Leung et al., 2006).
Collectively, these data suggest that miRNA repression either involves submicroscopic P-bodies or occurs outside of them. Because most P-body components are also found throughout the cytosol (Eulalio et al., 2007a), it is likely that miRNA-mediated repression is initiated in the cytosol and that the microscopically visible P-body is a consequence rather than the cause of silencing (Chu and Rana, 2006; Pillai, 2005).

3.2.4 Translation inhibition vs transcript degradation

Compelling evidence supports that miRNAs repress targets via both translational inhibition and transcript degradation. However, it has been controversial which of these mechanisms dominates. Recent developments in proteomics methods enable genome-wide analyses of both the proteome and the transcriptome, and allow us to assess to what degree silencing is caused by translational repression versus mRNA degradation (Baek, 2008; Eichhorn et al., 2014; Guo et al., 2010; Hendrickson, 2009; Selbach, 2008). All of those studies agree on one main conclusion: miRNAs only modestly inhibit protein production, rarely resulting in more than a fourfold reduction in protein levels. However, they disagree on the extent of the translational contribution. Bartel and colleagues (Baek, 2008; Eichhorn et al., 2014; Guo et al., 2010) used quantitative mass spectrometry and ribosome profiling to measure the effect of miRNAs on protein output, and found that changes in protein and mRNA levels strongly correlate. Accordingly, changes in mRNA levels accounted for most of the regulation. The initial ribosome profiling experiments (Guo et al., 2010) were performed on cytoplasm-extracted and poly(A)-selected RNA only. This might lead to underestimation of translational repression, because such samples miss deadenylated transcripts as well as transcripts in P-bodies. Those issues were addressed in Eichhorn et al. (2014), and mRNA destabilization still explains most (66% to 90%) of miRNA-mediated repression. Hendrickson et al.
(Hendrickson, 2009) used polysome profiling to estimate translation rate, and concluded that mRNA degradation accounted for about 75% of the total changes. However, Selbach et al. (Selbach, 2008) used pulse-labeled mass spectrometry to demonstrate direct translational repression for hundreds of genes. Many genes were down-regulated at the protein level with little mRNA change at early time points (8 hours) after miRNA transfection, a phenomenon recapitulated in Eichhorn et al. (2014).

In contrast to genome-wide studies, which suggest that target degradation is the predominant mode of regulation by miRNAs in mammalian cultured cells, single-gene analyses usually show that repression at the protein level is generally stronger and more robust than repression at the transcript level. There are also numerous examples of miRNA regulation at the translational level with no or minimal effect on mRNA degradation (Behm-Ansmant, 2006; Eulalio, 2007; Filipowicz et al., 2008; Poy et al., 2004; Zhao et al., 2005). Moreover, studies using reporter rather than endogenous genes usually observe major contributions from translational repression (Doench and Sharp, 2004; Kiriakidou, 2007; Nelson et al., 2004; Pillai, 2005; Yekta et al., 2004). The discrepancies between genome-wide analyses and single-gene studies still need to be addressed.

There is also debate regarding the order of events, because deadenylation has been reported both to precede (Beilharz, 2009; Iwasaki et al., 2009; Wakiyama et al., 2007) and to follow translational repression (Fabian, 2009; Zdanowicz, 2009). Even though disruption of transcript circularization could account for both transcript degradation and translational inhibition, the two processes are not entirely coupled, nor is one supplementary to the other.
miRNA-dependent target degradation is seen even when translation of miRNA targets is precluded, either by a defective cap structure that impairs translation (Mishima, 2006; Wakiyama et al., 2007) or by translational arrest using cycloheximide treatment (Eulalio, 2007; Fabian, 2009). Conversely, miRNA-mediated translation inhibition can be observed in the absence of deadenylation. MicroRNA repression still occurs when the poly(A) tail is replaced by a histone mRNA stem-loop structure or by a self-cleavable ribozyme (Eulalio, 2009; Eulalio et al., 2008; Wu et al., 2006).

3.3 Results

3.3.1 MicroRNAs exert regulation at both transcriptional and translational levels

Ample evidence supports that miRNAs induce target silencing at two levels, via transcript degradation and translational inhibition. However, which of these two mechanisms predominates has been highly controversial, with conflicting lines of evidence supporting both views (See 3.2.4). It is also unclear whether miRNA regulation varies with target expression and, if it does, how it varies. We set out to address these questions with our UTR reporter system. The 3'UTR of an endogenous gene was appended downstream of a GFP reporter, so that the fluorescence of the protein can be measured directly. GFP transcripts were hybridized with FISH (fluorescent in situ hybridization) probes, which makes the mRNA level quantifiable as well (Klemm et al., 2014). We compare the expression of the original UTR (OriUTR) with that of its MRE-mutated version (MutUTR), which has been validated as a miRNA-unregulated control (Chap. 2), so that total repression as well as repression at the transcriptional level can be quantified. The translational inhibition can then be derived by dividing the total repression by the transcriptional contribution. Another fluorescent protein, mCherry, was delivered into the cells together with the miRNA activity reporter. mCherry was followed by a short RBGpA which is devoid of miRNA regulation.
The expression of mCherry reflects the variability in delivery efficiency and expression machinery activity in single cells, and it is correlated with reporter expression in the absence of miRNA regulation (Chap. 2). We can further arrange the cells according to mCherry abundance, and quantify miRNA repression at different transfection levels. We thus exploit the variability between individual cells to cover miRNA regulation for target expression varying by orders of magnitude in a single transient transfection experiment. Only cells 4 standard deviations away from the majority background of the transfected population were selected for analysis (See 3.5.2), and the conclusions drawn from those cells are very robust to background estimation (Figure 3.12). By arranging cells according to mCherry expression and overlaying the scatterplots of GFP-OriUTR and GFP-MutUTR (Figure 3.1), we observe that at the same transfection level, GFP-MutUTR expression is always higher than GFP-OriUTR expression on average. This is true for both GFP transcripts and protein, and the difference is more pronounced at the protein level. When GFP protein is plotted against gfp mRNA, we notice that the marginal distributions of gfp transcript and GFP protein both shift towards the lower end for the miRNA-repressed OriUTR (Figure 3.2a), and the shift is again more prominent at the protein level. Both plots illustrate the contribution of transcript degradation to miRNA regulation, and strongly suggest a contribution from translational inhibition. Proof of the latter comes from quantification of the conditional mean of GFP protein expression at different GFP transcript levels. Fewer proteins are produced for GFP-OriUTR on average, except at very high transcript levels (Figure 3.2b). The protein produced per unit mRNA was computed (Figure 3.2c), and we find about three-fold repression of OriUTR relative to MutUTR.
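The conditional-mean analysis described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the analysis code used in this work: the function name `protein_per_mrna`, the choice of equal-width bins in log10(mRNA), and the noise-free synthetic numbers are all assumptions made for the example.

```python
import numpy as np


def protein_per_mrna(mrna, protein, n_bins=10, min_cells=2):
    """Bin cells by log10(mRNA signal) and summarise protein in each bin.

    Hypothetical helper: returns bin centres (log10 mRNA), mean protein,
    SEM of protein, and mean protein divided by mean mRNA for each bin.
    """
    mrna = np.asarray(mrna, dtype=float)
    protein = np.asarray(protein, dtype=float)
    log_m = np.log10(mrna)
    edges = np.linspace(log_m.min(), log_m.max(), n_bins + 1)
    # Digitizing against the interior edges yields bin indices 0..n_bins-1.
    idx = np.digitize(log_m, edges[1:-1])
    centres, mean_p, sem_p, per_mrna = [], [], [], []
    for b in range(n_bins):
        sel = idx == b
        if sel.sum() < min_cells:
            continue  # skip bins with too few cells for an SEM
        centres.append(0.5 * (edges[b] + edges[b + 1]))
        mean_p.append(protein[sel].mean())
        sem_p.append(protein[sel].std(ddof=1) / np.sqrt(sel.sum()))
        per_mrna.append(protein[sel].mean() / mrna[sel].mean())
    return {
        "centres": np.array(centres),
        "mean_protein": np.array(mean_p),
        "sem_protein": np.array(sem_p),
        "per_mrna": np.array(per_mrna),
    }


# Synthetic, noise-free demo (made-up numbers, not thesis data):
# the unregulated reporter makes 3x more protein per transcript.
mrna = np.logspace(2, 5, 2000)
mut = protein_per_mrna(mrna, 30.0 * mrna)
ori = protein_per_mrna(mrna, 10.0 * mrna)
translational_repression = mut["per_mrna"] / ori["per_mrna"]
```

In this toy data set the ratio of the two conversion rates is the same in every bin; with real measurements it would be expected to drift towards 1 at high transcript levels if translational inhibition is de-repressed there.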
The translational inhibition is de-repressed at high transcript levels, indicating that the translational inhibition mechanism is titrated at very high target expression. We only present Casp2 as an example here; other UTRs yield qualitatively similar results (Supplementary Figure 3.5). Thus, our data suggest that miRNA regulation generally happens at both transcriptional and translational levels.

[Figure 3.1, panels (a) and (b): x-axis log(mCherry); legend Casp2MutUTR, Casp2OriUTR.]

Figure 3.1 Scatterplot of GFP protein and mRNA versus the transfection level indicator mCherry. (a) Scatterplot of GFP protein vs mCherry. (b) Scatterplot of gfp mRNA vs mCherry. pCAG-d2eGFP-Casp2Ori/MutUTR were co-transfected with pCAG-mCherry plasmid into wild-type ESCs. Only cells 4 standard deviations away from the transfected majority background were selected for the plot. OriUTR is plotted in red, and MutUTR is plotted in blue. Only 10% of the measured cells are plotted for visualization. MicroRNAs repress both transcript and protein production of GFP-Casp2OriUTR.

Figure 3.2 The relationship between protein and mRNA for Casp2 3'UTR. pCAG-d2eGFP-Casp2Ori/MutUTR were co-transfected with pCAG-mCherry plasmid into wild-type ESCs. Only cells 4 standard deviations away from the transfected majority background were selected for the plot. (a) Scatterplot of GFP protein versus mRNA, with marginal distributions plotted on the side. OriUTR is plotted in red, and MutUTR is plotted in blue. Only 2% of the measured cells are plotted for better visualization. MicroRNAs repress both transcript and protein levels of GFP-OriUTR, and the histograms of both are shifted towards smaller values. (b) Bar plot of protein and mRNA. Cells were binned according to mRNA expression, and the mean protein expression in each bin was calculated. Error bars correspond to SEM.
(c) Protein produced per unit transcript at different transcript expression levels. The conversion rate for OriUTR stays relatively constant, but decreases for MutUTR at high transcript levels. This reflects the de-repression of translational inhibition at high transcript levels, as depicted by the black dotted line.

[Figure 3.2, panels (a)-(c): x-axis log(GFP mRNA); legend OriUTR, MutUTR.]

3.3.2 Quantifying miRNA regulation at transcriptional and translational levels

Having confirmed that miRNAs regulate their targets at both transcriptional and translational levels, we next asked what the relative contributions of the two are, and whether one mechanism dominates the regulation. Cells were arranged according to mCherry intensity, and the repression fold (RF) was calculated at each target expression level as

$$RF = \frac{I_{\mathrm{unregulated}} - I_{\mathrm{background}}}{I_{\mathrm{regulated}} - I_{\mathrm{background}}}$$

where $I_{\mathrm{regulated}}$ and $I_{\mathrm{unregulated}}$ are the OriUTR and MutUTR reporter intensities at the same mCherry level. The quantified repression fold was independent of the specific background estimation method used (See 3.5.3). Transcriptional and total repression were determined by comparing the transcript and protein expression of OriUTR to those of MutUTR, and translational inhibition was derived by dividing the total repression by the contribution from transcript degradation.

3.3.3 The transcriptional regulation stays relatively constant and translational regulation saturates at high target expression

MicroRNA repression of endogenous 3'UTRs was quantified as described, and was assessed over a region that spans more than 100 fold of target expression (Figure 3.3 a-c and Supplementary Figure 3.1 a-b). To compare our measurements with population-based assays, cells with different transfection levels were combined, and a single-value repression fold was calculated (See 3.5.3).
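The quantification just described can be sketched as follows. This is a minimal illustration under stated assumptions rather than the procedure used in this work: it assumes the repression fold is the background-subtracted MutUTR signal divided by the background-subtracted OriUTR signal within each mCherry bin, and the function names, bin edges, backgrounds and synthetic signals are all hypothetical.

```python
import numpy as np


def repression_fold(unregulated, regulated, background):
    """Assumed form: RF = (unregulated - background) / (regulated - background)."""
    return (unregulated - background) / (regulated - background)


def binned_repression(mcherry_mut, signal_mut, mcherry_ori, signal_ori,
                      edges, background):
    """Repression fold (MutUTR over OriUTR) of the mean signal per mCherry bin."""
    rf = np.full(len(edges) - 1, np.nan)
    for i, (lo, hi) in enumerate(zip(edges[:-1], edges[1:])):
        in_mut = (mcherry_mut >= lo) & (mcherry_mut < hi)
        in_ori = (mcherry_ori >= lo) & (mcherry_ori < hi)
        if in_mut.any() and in_ori.any():
            rf[i] = repression_fold(signal_mut[in_mut].mean(),
                                    signal_ori[in_ori].mean(),
                                    background)
    return rf


# Synthetic, noise-free demo (made-up numbers, not thesis data).
mcherry = np.logspace(3, 5, 2000)   # transfection level indicator
edges = np.logspace(3, 5, 6)        # five bins over a 100-fold range
bg_mrna, bg_protein = 50.0, 20.0    # hypothetical channel backgrounds
mrna_mut = bg_mrna + 0.10 * mcherry
mrna_ori = bg_mrna + 0.05 * mcherry        # 2-fold fewer transcripts
prot_mut = bg_protein + 0.6 * mcherry
prot_ori = bg_protein + 0.1 * mcherry      # 6-fold less protein

rf_mrna = binned_repression(mcherry, mrna_mut, mcherry, mrna_ori, edges, bg_mrna)
rf_total = binned_repression(mcherry, prot_mut, mcherry, prot_ori, edges, bg_protein)
# Translational inhibition = total repression / transcript-level repression.
rf_translation = rf_total / rf_mrna
```

Passing a single bin that covers the whole mCherry range gives one pooled value per reporter pair, which is one possible reading of the population-level calculation; the exact pooling procedure is the one given in Section 3.5.3 and is not reproduced here.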
Consistent with our previous studies (Mukherji et al., 2011) and genome-wide assays (Guo et al., 2010; Selbach, 2008), even though miRNA regulation can exceed 10-fold over certain target expression ranges, regulation at the population level is usually subtle and rarely exceeds 4-fold on average (Figure 3.3d and Supplementary Figure 3.1c). Here we extend this conclusion from artificially constructed miRNA targets (e.g. 7 consecutive miR-20 sites; Mukherji et al., 2011) to endogenous UTRs. Genome-wide miRNA regulation strength is expected to be even smaller than what we present in Figure 3.3d, because Casp2, Lats2 and Rbl2 are all strongly repressed miRNA targets. Repression of the mildly regulated Lin28a UTR and of another miR-290 target, P21, is less than twofold (Supplementary Figure 3.1). The relative contributions of transcriptional degradation and translational inhibition also vary between UTRs, with mRNA destabilization accounting for 39% to 88% of total regulation (Figure 3.3d and Supplementary Figure 3.1c). For the same UTR, the contribution also varies with target expression level (Figure 3.3 a-c and Supplementary Figure 3.1 a, b). Despite this variability, transcriptional and translational regulation strengths are of the same order for all UTRs under study, and no single source of regulation dominates.

Transcriptional regulation stays relatively constant

Throughout the measured range, which spans over 100-fold of target expression, miRNA regulation at the transcriptional level appears to be relatively stable, even though the actual value varies between UTRs (blue line in Figure 3.3 a-c and Supplementary Figure 3.1 a-b). Moreover, transcriptional regulation does not saturate at high target expression, in contrast to miRNA regulation at the translational level.
The stability of transcriptional miRNA regulation is further corroborated by two independent experimental techniques, microscopy (Figure 3.5) and RNA sequencing (Supplementary Figure 3.2).

[Figure 3.3 plots: (a) Casp2 Repression Fold, (b) Lats2 Repression Fold and (c) Rbl2 Repression Fold, each showing total repression fold, mRNA degradation repression, protein translation inhibition and an RF = 1 reference line against mCherry; (d) Population Repression for Rbl2, Lats2 and Casp2, with percentage labels 43%, 47% and 39%.]

Figure 3.3 miRNA repression at transcriptional and translational levels, data quantified from flow cytometry. Total miRNA-mediated repression (red), transcriptional repression (blue) and translational contribution (green) were quantified for the Casp2 UTR (a), Lats2 UTR (b) and Rbl2 UTR (c) over a target expression range of about 100-fold. (d) All transfected cells were combined and a single population repression value was derived as in bulk assays, with total repression in red bars and transcriptional repression in blue bars. Error bars correspond to the standard deviation from more than three experimental replicates. The percentages correspond to the relative contribution of transcriptional regulation to total repression.

Flow cytometry measures the integrated fluorescence intensity from cells. However, cells have autofluorescence, and non-specific binding of smFISH probes within cells further increases the background in the mRNA channel. The calculated repression fold is therefore more sensitive to background estimation in the low target expression region. In contrast, FISH-probe-bound transcripts are resolved as diffraction-limited spots under microscopy.
This method achieves single-molecule resolution for transcript detection and overcomes the background issue in the mRNA channel. Microscopy thus complements flow cytometry and provides high-confidence data for transcriptional regulation at low target expression. Naturally, the microscopy approach has its own limits. Transcript spots merge and resolution is lost at high mRNA expression (more than roughly 600 to 1,200 mRNAs per cell). The measurement is also low throughput.

For protein expression, we integrated the fluorescent protein signal within cells (see 3.5.4) to mimic the quantity measured in flow cytometry. For mRNA expression, we counted the number of GFP transcripts. Cells were again arranged according to mCherry expression. Repression at the transcriptional level is obvious from the scatterplot (second column of Figure 3.12). Repression fold was quantified as for flow cytometry, except that the transcript background equals zero. Owing to the limitations of the microscopy experiment, we could only measure up to medium transfection levels, and this range corresponds to the first half of the repression fold curve obtained from the flow cytometry experiment. The two UTRs measured by microscopy, Casp2 and Lats2, confirm that transcriptional regulation is indeed stable at low target expression, which still spans a range of ~10-fold (blue line in Figure 3.5).

The cells were also flow sorted into 5 bins according to mCherry signal, followed by genome-wide RNA sequencing (Chap. 5). The repression fold determined from gfp sequence reads further confirms that transcriptional regulation does not saturate at high target expression (Supplementary Figure 3.2). The repression fold for the first bin (lowest target expression) is slightly lower than the others, likely because of relatively high background reads for gfp.
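The decomposition used throughout this section can be written compactly. The notation below is introduced here for illustration and is not taken from the thesis: P and M denote the mean protein and mRNA signals in a given mCherry bin, the superscripts Mut and Ori denote the unregulated and regulated reporters, and b_P and b_M denote the background in each channel. It is a sketch of the stated procedure, assuming the multiplicative split described in 3.3.2.

```latex
\[
\mathrm{RF}_{\text{total}} = \frac{P^{\text{Mut}} - b_P}{P^{\text{Ori}} - b_P},
\qquad
\mathrm{RF}_{\text{mRNA}} = \frac{M^{\text{Mut}} - b_M}{M^{\text{Ori}} - b_M},
\qquad
\mathrm{RF}_{\text{translation}} = \frac{\mathrm{RF}_{\text{total}}}{\mathrm{RF}_{\text{mRNA}}}
\]
% Microscopy counts single transcripts, so b_M = 0 and
\[
\mathrm{RF}_{\text{mRNA}}^{\text{microscopy}} = \frac{M^{\text{Mut}}}{M^{\text{Ori}}}
\]
```

As a purely hypothetical numerical illustration, a bin with 6-fold total repression and 3-fold transcript-level repression would be assigned a 2-fold translational contribution.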
Animal miRNAs rarely act through perfectly complementary targeting and Ago2-dependent catalysis; rather, they facilitate mRNA destabilization by accelerating transcript deadenylation and decapping. MicroRNAs promote recruitment of the CAF1-CCR4-NOT deadenylases, and the deadenylated mRNA is then decapped by the Dcp1/Dcp2 enzymes. Without the protection of the poly(A) tail and 5' cap, the naked mRNA is vulnerable to 5'-to-3' exonucleolytic decay by Xrn1 and 3'-to-5' decay by the exosome, and is rapidly degraded. Thus the relatively constant transcriptional repression might reflect the different turnover rates of intact transcripts versus deadenylated and decapped transcripts. The deadenylation, decapping and exonucleolytic decay processes themselves are biochemical and are relatively fast compared to the recruitment process. All of the participating enzymes can also be quickly recycled. This might explain the large capacity of transcriptional regulation and its resistance to saturation at high target expression.

[Figure 3.4 image panels: (a) Lats2MutUTR, mCherry; (b) Lats2MutUTR, GFP; (c) Lats2OriUTR, mCherry; (d) Lats2OriUTR, GFP.]

Figure 3.4 Typical microscopy images of GFP and mCherry protein expression for Lats2 Mut/OriUTR in the cotransfection experiment. GFP and mCherry proteins are approximately correlated without miRNA regulation in the MutUTR experiment (a-b). GFP is strongly repressed and the correlation between the two channels is lost with miRNA regulation in the OriUTR experiment (c-d).

[Figure 3.5 plots: Casp2 repression fold and Lats2 repression fold, each showing total repression fold, mRNA degradation repression, protein translation inhibition and an RF = 1 reference line; x-axis: transfection level (RFP total protein).]
Figure 3.5 miRNA repression at transcriptional and translational levels for different target expression, data quantified from the microscopy experiment. pCAG-d2eGFP-Casp2/Lats2 Ori/MutUTR was co-transfected with pCAG-mCherry into WT ESCs. mRNA degradation repression is confirmed to be stable at low target expression, and translational inhibition is confirmed to increase initially. Total repression is plotted in red, transcriptional repression in blue and translational contribution in green.

Translational regulation saturates at high target expression

In contrast to the relative stability of transcriptional regulation, translational regulation varies with target expression level (green line in Figure 3.3 a-c). At the high end of target expression, translational inhibition always decreases with increasing target levels (Figure 3.3). The decrease is consistent with the translational efficiency directly quantified from the reporter mRNA-protein relationship (black line in Figure 3.2c). miRISCs are known to compete with the initiation complex eIF4F for binding to the cap structure and thereby inhibit translation initiation. It is possible that miRISCs and related translational repressors are limiting factors within cells. MicroRNA targets then have to compete with each other for shared regulatory elements, and the resources accessible per target get titrated away when target expression is high, resulting in a decrease of regulation at the translational level.

Surprisingly, for some UTRs (e.g. Casp2, Lats2 and Rbl2, but not Lin28a and P21), translational regulation increases initially at the low end of target expression (Figure 3.3), and regulation only reaches its maximum capacity in the ultra-sensitivity region (Mukherji et al., 2011).
This is further confirmed by microscopy measurements. Figure 3.4 presents typical microscopy images of MutUTR and OriUTR for the Lats2 UTR. For Lats2 MutUTR, which is devoid of miRNA repression, the intensity of the reporter protein GFP roughly correlates with the transfection level indicator mCherry (Figure 3.4 a and b). For Lats2 OriUTR, GFP expression is strongly repressed to background levels in the given field of view, and the correlation is lost (Figure 3.4 c and d). Protein expression was quantified as described (see 3.5.4), and translational repression does increase initially for the Lats2 and Casp2 3'UTRs (green line in Figure 3.5).

One trivial explanation for the observed initial increase of translational repression is that, in the cotransfection system, the ratio of GFP to mCherry plasmids actually delivered to a cell could deviate from the bulk ratio. In the extreme case, some cells might take up only the mCherry plasmid. This variability is more pronounced at low transfection levels and could result in underestimation of repression. To rule out this possibility, we repeated the experiments with bi-directional plasmids, which have a fixed GFP to mCherry ratio of 1:1 at the single-cell level. The initial increase persists with the bi-directional plasmids (Figure 3.6), excluding the possibility that it is caused by contamination from cells transfected with mCherry alone. The repression values derived from the two systems are also similar. However, we do not observe the final decrease of translational inhibition. This is due to the different reporter to indicator ratios in the two systems: the bi-directional system has a fixed ratio of 1, whereas the ratio was set to 7 in the cotransfection system. The bi-directional system therefore has not reached the saturation region, and the derived repression resembles the left part of the repression plot from the cotransfection system.
MicroRNA-mediated translational inhibition is a complicated process that can occur at the translation initiation, elongation and termination stages. These layers might not all act simultaneously; depending on the relative concentrations of miRNA targets and regulating factors, translational regulation at one stage might occur in addition to another. Indeed, the formation of microscopically visible P-bodies has been observed as a consequence of miRNA regulation. These subcellular loci are enriched in translational repressors such as RCK/p54 and eIF4E-transporter. P-bodies are also enriched in mRNA deadenylation and decapping enzymes, which in turn contribute to translational inhibition by disrupting the synergy between the two ends of transcripts. It is possible that P-body aggregates form only at certain target concentrations, and that their formation in turn facilitates miRNA regulation, resulting in the initial increase in translational repression.

[Figure 3.6 plot: "Lats2 UTR repression, bi-directional plasmid", showing total repression fold, mRNA degradation repression, protein translation inhibition and an RF = 1 reference line against mCherry.]

Figure 3.6 miRNA repression at transcriptional and translational levels for the Lats2 UTR, data quantified from bi-directional plasmids. The bi-directional plasmid pTRE-GFP-Lats2Ori/MutUTR-mCherry was transfected into V19 ESCs and induced with 1 µg/ml doxycycline. MicroRNA repression was measured by flow cytometry. Translational inhibition is confirmed to increase initially. The inset shows miRNA repression for the same UTR measured with the cotransfection system. Repression values derived from the two measurements are similar, except that we do not observe the final titration of regulation in the bi-directional system. This is due to the different ratio of reporter to indicator in the two systems: we fixed this ratio at 7 in the cotransfection system, whereas the ratio always equals 1 in the bi-directional system.
Total repression is plotted in red, transcriptional repression in blue and translational contribution in green.

3.4 Discussion

It is still unresolved why genome-wide analyses usually differ from single-gene analyses and reporter assays (Eichhorn et al., 2014). One possible explanation is that single-gene analyses study the effect of a cohort of miRNAs acting on an endogenous gene, whereas genome-wide analyses normally explore the effect of overexpressing or depleting a single miRNA. Additionally, the transgene is usually overexpressed in reporter assays, at levels that can be much higher than the expression of the majority of endogenous genes within cells, especially miRNA targets. We have shown that the relative contributions of translational inhibition and transcriptional degradation vary with target expression, and the discrepancies between reporter assays and genome-wide analyses might simply reflect different modes of regulation in different target expression regions (Table 3.2 and Supplementary Figure 3.6).

Some recent analyses revealed that the relative contributions of miRNA-mediated transcriptional and translational regulation vary at different time points after induction of miRNA activity (Eichhorn et al., 2014; Selbach, 2008). Translational regulation dominates in the early phase after introduction of an exogenous miRNA, and mRNA destabilization accounts for the majority of repression at steady state (Eichhorn et al., 2014). Some of our initial efforts showed that miRNA regulation varies in both strength and profile for cells harvested 24 h to 72 h after transfection (data not shown). Our data add another dimension to the description of miRNA regulation, revealing that the relative contributions of translational and transcriptional regulation also vary with target expression.
It might seem surprising that translational inhibition was observed to increase in the low target expression region for some of the UTRs under study, and further experiments are needed to corroborate this conclusion. Targeted mass spectrometry (targeted MS) could be one possible solution. By sacrificing the total number of monitored peptides, targeted MS can achieve very high resolution for a user-specified list of targeted precursor-fragment pairs ('transitions'), i.e. the fingerprints of the selected proteins. In our pilot experiment, the linearity of detection held down to 10,000 cells. The method also does not require culturing with isotope-labeled amino acids. Most importantly, it is background free, so it could provide confident data for the low target expression region. Cells under normal culturing and transfection were sorted into 5 bins according to mCherry intensity, followed by targeted MS. The reporter protein, the indicator protein, and the proteins of several top miR-290 targets were selected for measurement. However, even though fingerprints for all desired proteins were successfully picked with clear resolution, the initial attempts with sorted samples have not been successful. The main challenges are the time scale of the experiment and the lack of intermediate checks on sample quality. Some efforts have been made to alleviate this (Supplementary Figure 3.4), and the sorted samples were also split for RNA sequencing, which provides confidence in RNA integrity and serves as indirect evidence of sample quality. More work is needed on the reproducibility of targeted MS itself, and we look forward to the results from this independent measurement technique.

3.5 Methods

3.5.1 Flow cytometry experiments

Each set of experiments contained at least the mock transfection experiment (1) and the single color transfection experiments (3 and 4) for characterization of cellular background and color compensation between channels.
For each 3'UTR of choice, experiments 7 and 8 were performed to validate the MutUTR design. Experiments 5 and 6, or 5 and 7, were performed together to allow quantification of miRNA-mediated repression fold and noise control.

No. | Experiment | Abbreviation | Goals
1 | Mock transfected, HB with GFP mRNA-Cy5 FISH probe | Mock T | Single-value estimation of background signal
2 | GFP single transfection, no HB | - | Single color control for GFP. Color compensation for GFP BT into the other 2 channels.
3 | GFP single transfection, HB | GFP single T | FMO control for Cy5. Color compensation for Cy5 BT into the other 2 channels.
4 | mCherry single transfection, no HB | mCh single T | Single color control for mCherry. Color compensation for mCherry BT into the other 2 channels. Also used for bin-wise estimation of background.
5 | WT OriUTR, Co-T/bi-directional TT | XXX OriUTR | miRNA activity reporter for an endogenous UTR
6 | WT MutUTR, Co-T/bi-directional TT | XXX MutUTR | miRNA unregulated control for the corresponding UTR
7 | KO OriUTR, Co-T/bi-directional TT | - | miRNA unregulated control for the corresponding UTR
8 | KO MutUTR, Co-T/bi-directional TT | - | Validation of MutUTR design for the corresponding UTR

Table 3.1 Transfection experiment and control set. HB: hybridization. TT: transient transfection. BT: bleed-through. FMO: fluorescence minus one. WT: wild-type mESCs. KO: Dgcr8 knockout mESCs. Co-T: pCAG-d2eGFP-Ori/MutUTR and pCAG-mCherry cotransfection. Bi-directional: pTRE-d2eGFP-Ori/MutUTR-mCherry or pTRE-mCherry-Ori/MutUTR-ZsGreen bi-directional plasmid.

The bleed-through from GFP into the other two channels was shown to be negligible, so only experiments 1, 3 and 4 were performed for estimation of background signal and color compensation in later experiments.

3.5.2 Flow Cytometry Data Processing

Sample gating: Single cells were first gated from cell clusters and debris according to their forward scatter (FSC) and side scatter (SSC) profiles.
Transfection conditions were optimized, and transfection efficiency was greater than 90%. The majority of cells took up some plasmid (even if only a few copies), and expression of fluorescent proteins resulted in a global shift of fluorescent signals in the corresponding channels, as can be seen in the shift of the 'red eye' in the scatterplots of Figure 3.7. To reduce this background shift, the reporter plasmids can be diluted with a carrier plasmid. In previous studies (Mukherji et al., 2011), the carrier plasmid pUC18b was mixed with the reporter plasmid at a ratio of 50:1 in the transfection. The carrier plasmid is non-fluorescent, so it effectively reduces the background shift and extends the confidently quantifiable region. The carrier plasmid is necessary and cannot be replaced by simply reducing the amount of transfected plasmid, because a certain ratio of DNA to transfection reagent has to be maintained to ensure effective transfection.

Within the 'red eye', fluorescent signals in different channels were correlated due to cell-cycle effects, as were the cellular background signals. Our single-value background estimation did not work for this region. The mCherry single transfection background did not work either: because the co-transfected plasmids compete with each other, the average uptake of mCherry plasmid in the cotransfection experiment was lower than in the mCherry single plasmid transfection experiment, and their 'red eyes' did not fully overlap. For simplicity, we therefore excluded these cells from later analysis. Specifically, the marginal distributions of the transfection level indicator mCherry were plotted (Figure 3.8). A Gaussian distribution was fitted to the bell shape at the lower end, and only cells with log10(mCherry) signal > 3 were gated for later analysis.
This corresponded to cells whose mCherry expression was more than 4 standard deviations above the mean of the Gaussian in the cotransfection experiments. Those cells were far enough from the 'red eye' that cell-cycle effects and the choice of background estimation method had very little effect on the data analysis (Figure 3.9).

Figure 3.7 Scatterplot of GFP protein or mRNA levels versus mCherry expression. Color corresponds to the Jet heat map of local cell density. Note the shift of fluorescent signals for the population majority ('red eye') in the various transfection experiments. Only cells to the right of the black lines were selected for data analysis, since they were far enough from the cell majority background and were robust to whichever data analysis method was applied.

[Figure 3.7 plots: two columns of scatterplots against log10(mCherry) for Mock T, mCh single T, GFP single T, Casp2OriUTR, Casp2MutUTR, Lats2OriUTR and Lats2MutUTR.]

Figure 3.8 Marginal distribution of mCherry expression.
Transfection of fluorescent plasmids did not affect the background signal in other channels (e.g. the mCherry channels of Mock T and GFP single T were statistically identical despite high GFP expression). However, expression of fluorescent proteins increased signals in the corresponding channel, and high transfection efficiency resulted in shifts of the 'red eye' (e.g. compare rows 2 and 4-7 to rows 1 and 3). Gaussians were fitted to the bell shapes at the lower end of the cotransfection experiments (rows 4-7), and the parameters were estimated to be μ = 2.2, σ = 0.2. Only cells with log10(mCherry) values larger than 3 were gated for later analysis. Those cells were more than 4 standard deviations away from the mean of the population majority background (> 0.9999 quantile) and were robust to whichever background estimation method we applied.

[Figure 3.8 plots: marginal distributions of log10(mCherry), one row each for Mock T, mCh single T, GFP single T and the Casp2 and Lats2 Ori/MutUTR cotransfections.]

3.5.3 Background analysis and repression fold calculation

Cells have autofluorescence, and non-specific binding of FISH probes inside cells particularly increases the background in the mRNA channel. The background signal needs to be removed from the measurements to obtain the true biological signal from the protein and mRNA products. At the population level, we need to estimate the mean of the background signals. Two approaches were adopted. The mock transfection experiment (experiment 1 in Table 3.1) provided a single-value estimate of the background in each channel.
The mCherry single plasmid transfection experiment (experiment 4 in Table 3.1) provided a bin-wise background estimate for different mCherry expression levels. Cells were binned according to mCherry intensity, with the bin width set to ~0.2 in log10 space. The means of GFP protein and mRNA were calculated for each bin, and the standard error of the mean (SEM = \sigma/\sqrt{N}, the standard deviation divided by the square root of the number of cells in the bin) was plotted as error bars together with the mean in all bar plots. There was still residual bleed-through between channels after color compensation (the y value increased slightly at very high mCherry levels in the mCherry single transfection experiment, plotted in black in Figure 3.9). The second approach takes this imperfect color compensation into account.

Repression fold was calculated accordingly with the two approaches. For the single-value estimation of background, repression fold was calculated as:

\[ \text{repression fold} = \frac{\text{unregulated}|_{\text{bin}} - \text{background}}{\text{regulated}|_{\text{bin}} - \text{background}} \qquad \text{Equation (1)} \]

For the bin-wise estimation of background, repression fold was calculated as:

\[ \text{repression fold} = \frac{\text{unregulated}|_{\text{bin}} - \text{background}|_{\text{bin}}}{\text{regulated}|_{\text{bin}} - \text{background}|_{\text{bin}}} \qquad \text{Equation (2)} \]

The two approaches yielded essentially the same repression fold in the region used for analysis (i.e. log10(mCherry) > 3, Figure 3.9 c and d), so the conclusions drawn in this study are independent of the background estimation approach.

For the population repression fold calculation, all transfected cells were combined, including the cells in the 'red eyes'. As in bulk experiments such as luciferase assays, background (the single-value estimate from mock transfection) was subtracted from the total signals. The fluorescent signals from all cells were summed, and the total expression of OriUTR was compared to that of MutUTR to derive a single-value repression fold.
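The binning and the two background-subtraction schemes described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis code used in the thesis: the function names (`bin_by_mcherry`, `repression_fold`), the handling of missing or non-positive bins, and the toy numbers are all hypothetical. Only the gate of 3 and the bin width of 0.2 in log10 space are taken from the text.

```python
import math
from collections import defaultdict


def bin_by_mcherry(log10_mcherry, signal, gate=3.0, width=0.2):
    """Group cells by log10(mCherry) and return {bin_index: (mean, SEM)}.

    Cells at or below the gate are dropped; bin 0 starts at the gate.
    """
    groups = defaultdict(list)
    for x, y in zip(log10_mcherry, signal):
        if x > gate:
            groups[int((x - gate) // width)].append(y)
    stats = {}
    for b, ys in groups.items():
        n = len(ys)
        mean = sum(ys) / n
        # SEM = sample standard deviation / sqrt(n); undefined for one cell.
        if n > 1:
            var = sum((y - mean) ** 2 for y in ys) / (n - 1)
            sem = math.sqrt(var / n)
        else:
            sem = float("nan")
        stats[b] = (mean, sem)
    return stats


def repression_fold(unregulated, regulated, background):
    """Per-bin repression fold from two outputs of bin_by_mcherry.

    `background` is either a single number (Equation 1, mock transfection)
    or a {bin_index: value} dict (Equation 2, mCherry single transfection).
    Bins missing from any input, or with a non-positive denominator,
    are skipped (an assumption of this sketch).
    """
    folds = {}
    for b in sorted(set(unregulated) & set(regulated)):
        if isinstance(background, dict):
            if b not in background:
                continue
            bg = background[b]
        else:
            bg = background
        denominator = regulated[b][0] - bg
        if denominator > 0:
            folds[b] = (unregulated[b][0] - bg) / denominator
    return folds


# Toy data (hypothetical): MutUTR is the unregulated reporter, OriUTR the
# regulated one. The cell at log10(mCherry) = 2.5 falls below the gate.
mut = bin_by_mcherry([3.1, 3.1, 3.3, 3.3, 2.5], [110, 130, 220, 260, 999])
ori = bin_by_mcherry([3.1, 3.1, 3.3, 3.3], [40, 40, 60, 60])
print(repression_fold(mut, ori, 20.0))                # single-value background
print(repression_fold(mut, ori, {0: 20.0, 1: 30.0}))  # bin-wise background
```

Passing the background either as a scalar or as a per-bin mapping keeps the two equations in one code path, which makes it easy to check that both give similar results above the gate.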
Figure 3.9 Bar plot of protein and mRNA expression at different transfection levels, and repression fold calculated using different background estimations. Cells were binned according to mCherry expression (i.e. different transfection levels), and the means of GFP protein (a) and GFP mRNA (b) were calculated in each bin. Error bars correspond to SEM. Mock transfection and mCherry single plasmid transfection experiments were conducted in parallel to evaluate cellular background, and the two approaches yielded very similar estimates (black and gray lines in a and b). Repression fold was calculated by subtracting the single-value background estimated from mock transfection (c) or the bin-wise background estimated from mCherry transfection (d), and the repression folds derived from the two approaches are very similar.

[Figure 3.9 plots: (a) Lats2 protein bar plot and (b) Lats2 mRNA bar plot for MutUTR (WT) and OriUTR (WT), with mCh single T background and mean background, against log(mCherry); (c) Lats2 UTR Repression Fold with Mock T mean background subtraction and (d) Lats2 UTR Repression Fold with mCh single T background subtraction, each showing total repression, transcriptional degradation, translational inhibition and an RF = 1 reference line against mCherry.]

3.5.4 Microscopy image analysis

Microscopy imaging, cell segmentation and FISH transcript counting were the same as described in Methods 2.4.11. To quantify fluorescent protein expression, images were taken in the corresponding fluorescent channel at the focal plane of the field of cells, which was chosen by the autofocus function of the microscope. The mCherry protein we used was susceptible to photobleaching, so stack-wise measurement along the z direction was not adopted.
Cells were segmented as previously described, and pixel values within the cell boundaries were summed. Cell height was counted as the number of stacks between the appearance of the first transcript and the disappearance of the last transcript in the view. The expression in the 2-D plane was then multiplied by the z-height to obtain total protein expression. Here we assume that cells were bounded by the slide and cover glass, so that all cells within one field of view had similar height in the z direction and cell volume was proportional to cross-sectional area. Different sample slides were squeezed differently, however, so the z-height had to be taken into account.

To extend the range of measurement, mCherry images were taken under different exposure conditions for the Casp2 UTR experiment. The signal of cells with high mCherry expression was saturated at the larger gain (50 ms, gain 3), but not at the smaller gain (50 ms, gain 1). Cells that were not saturated in either condition were used for regression, and the expression of saturated cells could be extrapolated (Figure 3.11). Images with shorter exposure times could also be used for extrapolation, but exposure times of less than 10 ms are not recommended because they challenge the accuracy of mechanical camera shutters.

To estimate the cellular autofluorescence background, images were taken of the mock transfected sample, and the average was used as the single-value estimate of background in the protein channel, similar to background estimation method 1 in flow cytometry.

Figure 3.10 Illustration of cell segmentation and transcript quantification for a transfected sample. GFP transcripts were labeled with Cy5 probes. (a) Z-projection of transcript expression. (b) Cell segmentation and transcript counting. Cells were segmented according to bright-field images, and the boundaries are plotted in red. GFP transcripts detected by the algorithm are plotted as green '+' markers.
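The saturation correction described in 3.5.4 amounts to a linear calibration between two imaging conditions. The sketch below shows one way this could be done, assuming an ordinary least-squares line fitted on cells that are unsaturated in the high-gain image. The function names, the per-cell `saturated` flags and the toy intensities are hypothetical and are not taken from the thesis.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def extrapolate_saturated(low_gain, high_gain, saturated):
    """Return high-gain-equivalent intensities for every cell.

    low_gain, high_gain: per-cell integrated intensities measured under the
        two conditions (the low-gain image is assumed never to saturate).
    saturated: per-cell booleans, True where the high-gain reading is
        saturated and therefore unreliable.
    Unsaturated cells keep their measured high-gain value and are used for
    the fit; saturated cells get the fitted prediction from their low-gain
    reading.
    """
    fit_x = [lo for lo, sat in zip(low_gain, saturated) if not sat]
    fit_y = [hi for hi, sat in zip(high_gain, saturated) if not sat]
    if len(set(fit_x)) < 2:
        raise ValueError("need at least two distinct unsaturated cells to fit")
    intercept, slope = fit_line(fit_x, fit_y)
    return [
        intercept + slope * lo if sat else hi
        for lo, hi, sat in zip(low_gain, high_gain, saturated)
    ]


# Toy example (hypothetical numbers): the last cell is saturated at high gain.
corrected = extrapolate_saturated(
    low_gain=[1.0, 2.0, 3.0, 4.0],
    high_gain=[5.0, 9.0, 13.0, 99.0],
    saturated=[False, False, False, True],
)
print(corrected)
```

Keeping the measured high-gain value wherever it is valid, and using the fit only for saturated cells, limits the extrapolation error to the cells that actually need it.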
[Figure 3.11 plot: "RFP Img Condition Calibration"; x-axis: RFP focal plane, 50 ms, gain 1; fitted line y = -2094 + 4.41x, r = 0.9996.]

Figure 3.11 Extrapolation of RFP expression. Usually, images with higher gain or longer exposure time were used for quantification because they correlate better with the real signals. To extend the range of measurement, images were taken under different exposure conditions for the RFP channel. Cells with high RFP expression saturated in the gain 3 condition but not in gain 1, and the latter was used to extrapolate RFP expression. For GFP, extrapolation was not needed because protein expression was highly repressed for the measured UTRs.

[Figure 3.12 plots: top row, Casp2 MutUTR Co-T (2439 cells); bottom row, Casp2 OriUTR Co-T (2497 cells); axes include RFP total protein and GFP mRNA.]

Figure 3.12 Microscopy data analysis. The top row shows data from cotransfection of pCAG-d2eGFP-Casp2MutUTR and pCAG-mCherry, and the bottom row shows data from cotransfection of pCAG-d2eGFP-Casp2OriUTR and pCAG-mCherry. Data are presented in a similar fashion to the scatterplots of flow cytometry data, except that GFP transcript levels are plotted on a linear scale because of zero expression in some cells. The first two columns present the expression of GFP protein and GFP mRNA arranged by the expression of the transfection level indicator mCherry. The scatterplot of GFP protein versus mCherry from microscopy is very similar to the flow cytometry data (compare the first column of Figure 3.12 with Figure 3.7). The strongly correlated lower-left part of the GFP protein plot corresponds to the 'red eye' in the flow cytometry scatterplot.
For the second column, if we estimate the cellular autofluorescence in the transcript channel to be ~50 molecules (Klemm et al., 2014), add this background average to the transcript count, and plot the total count on a log scale, the scatterplot also resembles its flow cytometry counterpart (data not shown). By comparing the first and second columns, we observe that at the same transfection level, miRNA repression of both protein and transcript production is obvious. In the third column, transcript expression under miRNA regulation shifts to lower values, and at the same transcript expression, GFP protein expression is lower. Thus, the microscopy data are consistent with the flow cytometry experiment, and miRNA exerts its regulation at both the transcriptional and translational levels.

[Figure 3.13 plots: Casp2 mRNA barplot, Casp2 protein barplot, Lats2 mRNA barplot, Lats2 protein barplot.]

Figure 3.13 Microscopy data bar plot. Protein and transcript expression were quantified for the microscopy experiment. A background threshold of 10^9 on mCherry expression was applied, and only cells to the right of the black line are shown in the repression fold plot. Because microscopy is background-free in the mRNA channel, the transcriptional repression by miRNA is apparent even at low target expression.

3.6 Supplementary Information

[Supplementary Figure 3.1 plots: (a) Lin28 repression fold and (b) P21 repression fold versus mCherry, with curves for overall repression fold, mRNA degradation repression, protein translation inhibition, and an RF = 1 reference line; (c) population repression, with one bar labeled 88%.]
[Supplementary Figure 3.1c, continued: one bar labeled 70%.]

Supplementary Figure 3.1 miRNA repression at the transcriptional and translational levels for the Lin28a and P21 UTRs, quantified from flow cytometry data. Total miRNA-mediated repression (red), transcriptional repression (blue) and translational contribution (green) were quantified for the Lin28a UTR (a) and the P21 UTR (b) over a target expression range of about 100-fold. (c) All transfected cells were combined and a single population repression value was derived, with total repression shown as red bars and transcriptional repression as blue bars. No error bars are included because the experiments were performed fewer than 3 times. The percentages correspond to the relative contribution of transcriptional regulation to total repression.

[Supplementary Figure 3.2 plot: x-axis: mCherry (normalized to bin 0).]

Supplementary Figure 3.2 miRNA-mediated transcriptional repression on the Lats2 UTR, quantified from cell sorting and downstream RNA sequencing. pCAG-d2eGFP-Lats2Ori/MutUTR and pCAG-mCherry were co-transfected into WT ESCs, which were sorted into 5 bins according to mCherry expression and then subjected to RNA sequencing. MicroRNA-mediated transcriptional repression via the Lats2 UTR was calculated from gfp reads. Transcriptional regulation is not titrated even in the last bin.

[Supplementary Figure 3.3 plots: (a) Lats2 repression fold, GLOR vs GLMR, versus mCherry; (b) Lats2 repression fold, RLOX vs RLMX, versus ZsGreen; curves: overall repression fold, mRNA degradation repression, protein translation inhibition, 0.99 quantile of mock-transfection background, RF = 1 reference line.]

Supplementary Figure 3.3 Color switch experiment. Different combinations of fluorescent proteins were used as reporters and indicators. (a) GFP was used as the reporter, and either the original or the mutated version of the Lats2a 3'UTR was appended behind it.
mCherry was used as the indicator protein and was followed by a non-miRNA-regulated tail. The repression fold was calculated for different target expression levels in wild-type ESCs. (b) mCherry was used as the reporter, and ZsGreen was used as the indicator protein. MicroRNA repression strength was again quantified for the Lats2a 3'UTR. A similar trend in miRNA repression fold, i.e. relatively constant repression strength at the transcriptional level and an initial increase and final decrease at the overall protein product level, was observed for both color sets. This indicates that the repression behavior we observed is attributable to miRNA-mediated regulation and not to any intrinsic property of the fluorescent proteins used.

[Supplementary Figure 3.4 plots: panels (a)-(f), labeled paraformaldehyde, EtOH long fix, EtOH short fix, Methanol + Acetic Acid, RNAprotector, RNAlater; x-axis: log10(mCherry).]

Supplementary Figure 3.4 The effects of fixation methods on the fluorescent protein signal. During a FACS experiment, live cells can stay out of cell culture conditions, on ice, for up to several hours between trypsinization and sorting into lysis buffer. The experiment schedule was very restricted due to the short time window of cell viability after trypsinization. In addition, cell states could change and protein could be degraded during this time. We therefore tried various fixation methods, hoping to find one that fixes the cell state without perturbing the fluorescent protein signal. Since the fixed cells also had to be compatible with cell sorting and with downstream mass spectrometry (MS) and RNA sequencing, the fixation method had to preserve the morphology of the cells and the integrity of both proteins and transcripts. The standard paraformaldehyde fixation method was not applicable due to its cross-linking nature.
We tried the following methods, which can be coarsely separated into two types of mechanism, fixation by denaturation and fixation by precipitation. Shown are scatterplots of the fluorescent signal after cell fixation. (a) Fixation by standard 4% paraformaldehyde followed by 70% ethanol. (b) Fixation by 70% ethanol, and preservation in ethanol. (c) Fixation by 70% ethanol for 10 minutes, and preservation in 1X PBS. (d) Fixation by 3:1 v/v methanol + acetic acid. (e) Fixation by RNAprotector® (Qiagen). (f) Fixation by RNAlater® (Ambion). None of the methods fully preserves the fluorescent signal. If fixation before sorting becomes essential in the future, short ethanol fixation looks most promising, and shorter fixation times can be explored.

[Supplementary Figure 3.5 plots: (a) scatterplot of GFP protein versus log(GFP mRNA) with marginal histograms; (b), (c) plots versus GFP mRNA for Lats2 OriUTR and Lats2 MutUTR.]

Supplementary Figure 3.5 The relationship between protein and mRNA for the Lats2 3'UTR. pCAG-d2eGFP-Lats2Ori/MutUTR was co-transfected with the pCAG-mCherry plasmid into wild-type ESCs. Only cells 4 standard deviations away from the transfected majority background were selected for the plot. (a) Scatterplot of GFP protein versus mRNA, with marginal distributions plotted on the sides. OriUTR is plotted in red and MutUTR in blue. Only 1% of the measured cells are plotted for better visualization. MicroRNAs repress both the transcript and protein levels of GFP-OriUTR, and the histograms of both are shifted towards smaller values. (b) Bar plot of protein versus mRNA. Cells were binned according to mRNA expression, and the mean protein expression in each bin was calculated. Error bars correspond to SEM. (c) Protein produced per transcript at different transcript expression levels.
The conversion rate for OriUTR stays relatively constant, but decreases for MutUTR at high transcript levels. This reflects the de-repression of translation inhibition at high transcript levels, as depicted by the black dotted line.

[Table 3.2: entries garbled in extraction; legible fragments: Oct4, Sox2, 203, 0.97, 1.17 (other literature).]

Table 3.2 FISH quantification of gene expression at the transcript level in WT and Dgcr8-/- ESCs.

[Supplementary Figure 3.6 plots: left, mRNA barplot for Casp2 OriUTR and MutUTR versus mCherry (A.U.); right, Casp2 repression fold versus mCherry (A.U.), with curves for total repression fold, mRNA degradation repression, protein translation inhibition, and an RF = 1 reference line.]

Supplementary Figure 3.6 Linking endogenous gene expression to miRNA repression in vivo. The relationship between endogenous gene expression and total miRNA repression fold is monotonic in the low target abundance region. We can therefore link the two and estimate miRNA repression, and the relative contributions from mRNA degradation and translation inhibition, in this region. miRNAs preferentially target lowly expressed genes in vivo (Farh, 2005; Sood et al., 2006). Consistent with this, Table 3.1 shows that most of the miR-290 targets are expressed at fewer than 100 transcripts per cell. Here we used the microscopy measurements. First we linked mRNA levels to indicator expression (left), then we linked indicator expression to miRNA repression (right). We observe that miRNA repression is subtle in the low target abundance region and that the majority of the contribution comes from mRNA degradation. Our estimate is consistent with in vivo genome-wide assays (Baek, 2008; Guo et al., 2010; Hendrickson, 2009).

3.7 References

Andrei, M.A. (2005). A role for eIF4E and eIF4E-transporter in targeting mRNPs to mammalian processing bodies. RNA 11, 717-727.
Baek, D. (2008). The impact of microRNAs on protein output. Nature 455, 64-71.
Behm-Ansmant, I. (2006).
mRNA degradation by miRNAs and GW182 requires both CCR4:NOT deadenylase and DCP1:DCP2 decapping complexes. Genes Dev 20, 1885-1898.
Beilharz, T.H. (2009). microRNA-mediated messenger RNA deadenylation contributes to translational repression in mammalian cells. PLoS ONE 4, e6783.
Bhattacharyya, S.N., Habermacher, R., Martine, U., Closs, E.I., and Filipowicz, W. (2006). Relief of microRNA-mediated translational repression in human cells subjected to stress. Cell 125, 1111-1124.
Braat, A.K., Yan, N., Arn, E., Harrison, D., and Macdonald, P.M. (2004). Localization-dependent oskar protein accumulation; control after the initiation of translation. Dev Cell 7, 125-131.
Chu, C.Y., and Rana, T.M. (2006). Translation repression in human cells by microRNA-induced gene silencing requires RCK/p54. PLoS Biol 4, e210.
Clark, I.E., Wyckoff, D., and Gavis, E.R. (2000). Synthesis of the posterior determinant nanos is spatially restricted by a novel cotranslational regulatory mechanism. Curr Biol 10, 1311-1314.
Derry, M.C., Yanagiya, A., Martineau, Y., and Sonenberg, N. (2006). Regulation of poly(A)-binding protein through PABP-interacting proteins. Cold Spring Harb Symp Quant Biol 71, 537-543.
Ding, X.C., and Grosshans, H. (2009). Repression of C. elegans microRNA targets at the initiation level of translation requires GW182 proteins. EMBO J 28, 213-222.
Doench, J.G., and Sharp, P.A. (2004). Specificity of microRNA target selection in translational repression. Genes Dev 18, 504-511.
Eichhorn, S.W., Guo, H., McGeary, S.E., Rodriguez-Mias, R.A., Shin, C., Baek, D., Hsu, S.-h., Ghoshal, K., Villen, J., and Bartel, D.P. (2014). mRNA destabilization is the dominant effect of mammalian microRNAs by the time substantial repression ensues. Mol Cell 56, 104-115.
Eulalio, A. (2007). Target-specific requirements for enhancers of decapping in miRNA-mediated gene silencing. Genes Dev 21, 2558-2570.
Eulalio, A. (2009).
Deadenylation is a widespread effect of miRNA regulation. RNA 15, 21-32.
Eulalio, A., Behm-Ansmant, I., and Izaurralde, E. (2007a). P-bodies: at the crossroads of post-transcriptional pathways. Nature Rev Mol Cell Biol 8, 9-22.
Eulalio, A., Behm-Ansmant, I., Schweizer, D., and Izaurralde, E. (2007b). P-body formation is a consequence, not the cause, of RNA-mediated gene silencing. Mol Cell Biol 27, 3970-3981.
Eulalio, A., Huntzinger, E., and Izaurralde, E. (2008). GW182 interaction with Argonaute is essential for miRNA-mediated translational repression and mRNA decay. Nature Struct Mol Biol 15, 346-353.
Fabian, M.R. (2009). Mammalian miRNA RISC recruits CAF1 and PABP to affect PABP-dependent deadenylation. Mol Cell 35, 868-880.
Fabian, M.R., Sonenberg, N., and Filipowicz, W. (2010). Regulation of mRNA translation and stability by microRNAs. Annu Rev Biochem 79, 351-379.
Farh, K.K. (2005). The widespread impact of mammalian microRNAs on mRNA repression and evolution. Science 310, 1817-1821.
Filipowicz, W., Bhattacharyya, S.N., and Sonenberg, N. (2008). Mechanisms of post-transcriptional regulation by microRNAs: are the answers in sight? Nature Rev Genet 9, 102-114.
Giraldez, A.J. (2006). Zebrafish MiR-430 promotes deadenylation and clearance of maternal mRNAs. Science 312, 75-79.
Guo, H., Ingolia, N.T., Weissman, J.S., and Bartel, D.P. (2010). Mammalian microRNAs predominantly act to decrease target mRNA levels. Nature 466, 835-840.
Hendrickson, D.G. (2009). Concordant regulation of translation and mRNA abundance for hundreds of targets of a human microRNA. PLoS Biol 7, e1000238.
Huang, J. (2007). Derepression of micro-RNA-mediated protein translation inhibition by apolipoprotein B mRNA-editing enzyme catalytic polypeptide-like 3G (APOBEC3G) and its family members. J Biol Chem 282, 33632-33640.
Humphreys, D.T., Westman, B.J., Martin, D.I., and Preiss, T. (2005).
MicroRNAs control translation initiation by inhibiting eukaryotic initiation factor 4E/cap and poly(A) tail function. Proc Natl Acad Sci USA 102, 16961-16966.
Iwasaki, S., Kawamata, T., and Tomari, Y. (2009). Drosophila argonaute1 and argonaute2 employ distinct mechanisms for translational repression. Mol Cell 34, 58-67.
Jackson, R.J. (2005). Alternative mechanisms of initiating translation of mammalian mRNAs. Biochem Soc Trans 33, 1231-1241.
Jakymiw, A. (2005). Disruption of GW bodies impairs mammalian RNA interference. Nature Cell Biol 7, 1267-1274.
Kedersha, N. (2005). Stress granules and processing bodies are dynamically linked sites of mRNP remodeling. J Cell Biol 169, 871-884.
Kiriakidou, M. (2007). An mRNA m7G cap binding-like motif within human Ago2 represses translation. Cell 129, 1141-1151.
Klemm, S., Semrau, S., Wiebrands, K., Mooijman, D., Faddah, D.A., Jaenisch, R., and van Oudenaarden, A. (2014). Transcriptional profiling of cells sorted by RNA abundance. Nat Meth 11, 549-551.
Krutzfeldt, J. (2005). Silencing of microRNAs in vivo with 'antagomirs'. Nature 438, 685-689.
Leung, A.K., Calabrese, J.M., and Sharp, P.A. (2006). Quantitative analysis of Argonaute protein reveals microRNA-dependent localization to stress granules. Proc Natl Acad Sci USA 103, 18125-18130.
Lim, L.P. (2005). Microarray analysis shows that some microRNAs downregulate large numbers of target mRNAs. Nature 433, 769-773.
Liu, J., Valencia-Sanchez, M.A., Hannon, G.J., and Parker, R. (2005). MicroRNA-dependent localization of targeted mRNAs to mammalian P-bodies. Nature Cell Biol 7, 719-723.
Lytle, J.R., Yario, T.A., and Steitz, J.A. (2007). Target mRNAs are repressed as efficiently by microRNA-binding sites in the 5' UTR as in the 3' UTR. Proc Natl Acad Sci USA 104, 9667-9672.
Maroney, P.A., Yu, Y., Fisher, J., and Nilsen, T.W. (2006).
Evidence that microRNAs are associated with translating messenger RNAs in human cells. Nature Struct Mol Biol 13, 1102-1107.
Mathonnet, G. (2007). MicroRNA inhibition of translation initiation in vitro by targeting the cap-binding complex eIF4F. Science 317, 1764-1767.
Meister, G. (2005). Identification of novel argonaute-associated proteins. Curr Biol 15, 2149-2155.
Mishima, Y. (2006). Differential regulation of germline mRNAs in soma and germ cells by zebrafish miR-430. Curr Biol 16, 2135-2142.
Mootz, D., Ho, D.M., and Hunter, C.P. (2004). The STAR-Maxi-KH domain protein GLD-1 mediates a developmental switch in the translational control of C. elegans PAL-1. Development 131, 3263-3272.
Mukherji, S., Ebert, M.S., Zheng, G.X.Y., Tsang, J.S., Sharp, P.A., and van Oudenaarden, A. (2011). MicroRNAs can generate thresholds in target gene expression. Nat Genet 43, 854-859.
Nelson, P.T., Hatzigeorgiou, A.G., and Mourelatos, Z. (2004). miRNP:mRNA association in polyribosomes in a human neuronal cell line. RNA 10, 387-394.
Nottrott, S., Simard, M.J., and Richter, J.D. (2006). Human let-7a miRNA blocks protein production on actively translating polyribosomes. Nature Struct Mol Biol 13, 1108-1114.
Olsen, P.H., and Ambros, V. (1999). The lin-4 regulatory RNA controls developmental timing in Caenorhabditis elegans by blocking LIN-14 protein synthesis after the initiation of translation. Dev Biol 216, 671-680.
Petersen, C.P., Bordeleau, M.E., Pelletier, J., and Sharp, P.A. (2006). Short RNAs repress translation after initiation in mammalian cells. Mol Cell 21, 533-542.
Piao, X., Zhang, X., Wu, L., and Belasco, J.G. (2010). CCR4-NOT deadenylates mRNA associated with RNA-induced silencing complexes in human cells. Mol Cell Biol 30, 1486-1494.
Pillai, R.S. (2005). Inhibition of translational initiation by Let-7 MicroRNA in human cells. Science 309, 1573-1576.
Poy, M.N., Eliasson, L., Krutzfeldt, J., Kuwajima, S., Ma, X., MacDonald, P.E., Pfeffer, S., Tuschl, T., Rajewsky, N., Rorsman, P., et al. (2004). A pancreatic islet-specific microRNA regulates insulin secretion. Nature 432, 226-230.
Rehwinkel, J. (2006). Genome-wide analysis of mRNAs regulated by Drosha and Argonaute proteins in Drosophila melanogaster. Mol Cell Biol 26, 2965-2975.
Rehwinkel, J., Behm-Ansmant, I., Gatfield, D., and Izaurralde, E. (2005). A crucial role for GW182 and the DCP1:DCP2 decapping complex in miRNA-mediated gene silencing. RNA 11, 1640-1647.
Ruegsegger, U., Leber, J.H., and Walter, P. (2001). Block of HAC1 mRNA translation by long-range base pairing is released by cytoplasmic splicing upon induction of the unfolded protein response. Cell 107, 103-114.
Schmitter, D. (2006). Effects of Dicer and Argonaute down-regulation on mRNA levels in human HEK293 cells. Nucleic Acids Res 34, 4801-4815.
Seggerson, K., Tang, L., and Moss, E.G. (2002). Two genetic circuits repress the Caenorhabditis elegans heterochronic gene lin-28 after translation initiation. Dev Biol 243, 215-225.
Selbach, M. (2008). Widespread changes in protein synthesis induced by microRNAs. Nature 455, 58-63.
Sood, P., Krek, A., Zavolan, M., Macino, G., and Rajewsky, N. (2006). Cell-type-specific signatures of microRNAs on target mRNA expression. Proc Natl Acad Sci USA 103, 2746-2751.
Stark, A., Brennecke, J., Bushati, N., Russell, R.B., and Cohen, S.M. (2005). Animal microRNAs confer robustness to gene expression and have a significant impact on 3' UTR evolution. Cell 123, 1133-1146.
Thermann, R., and Hentze, M.W. (2007). Drosophila miR2 induces pseudo-polysomes and inhibits translation initiation. Nature 447, 875-878.
Wakiyama, M., Takimoto, K., Ohara, O., and Yokoyama, S. (2007). Let-7 microRNA-mediated mRNA deadenylation and translational repression in a mammalian cell-free system. Genes Dev 21, 1857-1862.
Wang, B., Love, T.M., Call, M.E., Doench, J.G., and Novina, C.D. (2006). Recapitulation of short RNA-directed translational gene silencing in vitro. Mol Cell 22, 553-560.
Wu, L., Fan, J., and Belasco, J.G. (2006). MicroRNAs direct rapid deadenylation of mRNA. Proc Natl Acad Sci USA 103, 4034-4039.
Yekta, S., Shih, I.H., and Bartel, D.P. (2004). MicroRNA-directed cleavage of HOXB8 mRNA. Science 304, 594-596.
Zdanowicz, A. (2009). Drosophila miR2 primarily targets the m7GpppN cap structure for translational repression. Mol Cell 35, 881-888.
Zhao, Y., Samal, E., and Srivastava, D. (2005). Serum response factor regulates a muscle-specific microRNA that targets Hand2 during cardiogenesis. Nature 436, 214-220.

Chapter 4 Application of reporter system to study microRNA control of protein expression noise

4.1 Abstract

MicroRNAs repress many genes in metazoan organisms by accelerating mRNA degradation and inhibiting translation, thereby reducing the level of protein. However, microRNAs only slightly reduce the mean expression of most targeted proteins, leading to speculation about their role in the variability of protein expression, or noise. Here we use mathematical modeling and single cell reporter assays to show that microRNAs, in conjunction with increased transcription, decrease protein expression noise for lowly expressed genes but increase noise for highly expressed genes. Genes that are regulated by multiple microRNAs show more pronounced noise reduction. We estimate that hundreds of (lowly expressed) genes in mouse embryonic stem cells have reduced noise due to substantial microRNA regulation. Our findings therefore suggest that microRNAs confer precision to protein expression and thus offer plausible explanations for the commonly observed combinatorial targeting of endogenous genes by multiple microRNAs as well as the preferential targeting of lowly expressed genes.
4.2 Results

MicroRNAs regulate numerous genes in metazoan organisms (Enright et al., 2003; John et al., 2004; Lee et al., 1993; Lewis et al., 2005; Wightman et al., 1993) by accelerating mRNA degradation and inhibiting translation (Guo et al., 2010; Lim, 2005). Although the physiological function of some microRNAs is known in detail (Brennecke et al., 2003; Johnston and Hobert, 2003; Lee et al., 1993; Wightman et al., 1993), it is unclear why microRNA regulation is so ubiquitous and conserved, since individual microRNAs only weakly repress the vast majority of their target genes (Baek, 2008; Selbach, 2008) and knockouts rarely show phenotypes (Miska et al., 2007). One proposed reason for this widespread regulation is the ability of microRNAs to provide precision to gene expression (Bartel and Chen, 2004), and previous work has hypothesized that microRNAs could reduce protein expression variability (noise) when their repressive post-transcriptional effects are antagonized by accelerated transcriptional dynamics (Ebert and Sharp, 2012; Noorbakhsh et al., 2013). However, since microRNA levels are themselves variable, one should expect the propagation of their fluctuations to introduce additional noise (Figure 4.1a).

To test the effects of endogenous microRNAs, we quantified protein levels and fluctuations in mouse embryonic stem cells (mESCs) using a dual fluorescent reporter system (Mukherji et al., 2011), where two different reporters (ZsGreen and mCherry) are transcribed from a common bidirectional promoter (Figure 4.1b). One of the reporters (mCherry) contained several variants and numbers of microRNA binding sites in its 3'UTR, and we quantified single cell fluorescence using a flow cytometer (Figure 4.1c).
We used ZsGreen fluorescence intensity to bin cells with similar transcriptional activity (mostly due to varying plasmid copy numbers) and calculated the mean and noise (standard deviation divided by mean) of the mCherry intensity distribution in each bin (Figure 4.1d). We first assessed the effects of endogenous miR-20a in mESCs on a designed target site in the reporter. In cells with low expression of a reporter (mCherry) containing a miR-20a site, noise was reduced compared to an unregulated control at equal mCherry expression, in contrast to increased noise at high reporter expression (Figure 4.1e). These changes in mCherry noise are more pronounced when the miR-20a sites in the reporter are perfect targets or when there are multiple sites in the 3'UTR (Figure 4.1f, g).

[Figure 4.1 plots: (a) schematic of a microRNA-regulated gene (gene, mRNA, protein, microRNA); (b) reporter plasmid schematic; (c) mCherry intensity [a.u.] versus ZsGreen intensity [a.u.] for no 3'UTR and four bulged miR-20a sites; (d) mCherry intensity [a.u.] distribution of one ZsGreen bin; (e-g) noise versus mCherry intensity mean [a.u.].]

Figure 4.1 microRNA regulation has opposing effects on noise at low and high protein expression. (a) The expression of a microRNA-regulated gene. Noise in protein expression originates from stochastic molecular reactions in the production of the protein (intrinsic noise; jagged arrows) or from fluctuations propagating from external factors (extrinsic noise). (b) Plasmid reporter system coding for two fluorescent proteins, ZsGreen and mCherry, transcribed from a common bidirectional promoter. The mCherry 3'UTR can be modified to contain microRNA binding sites. (c) Overlay of two flow cytometry measurements of mESC populations transiently transfected with different variants of the plasmid system: empty mCherry 3'UTR (black) and mCherry 3'UTR containing four bulged miR-20a binding sites (blue).
For further processing, cells are binned according to ZsGreen intensity (red lines) and cells below ZsGreen background are discarded (grey) (see Methods). (d) Mean and noise (standard deviation divided by mean) of mCherry intensities are calculated from the marginal distributions in each bin. (e-g) Noise of mCherry intensity as a function of mean mCherry intensity in each bin for three different miR-20a-regulated constructs (blue) compared to the respective unregulated constructs (black). Panels are ordered from left to right according to increasing repression of the constructs by miR-20a. Dots: data; lines and shaded areas: model fit.

In order to explore the mechanism for these seemingly opposing effects on protein expression noise, we built a mathematical model in which we decomposed total noise into intrinsic noise and extrinsic noise ($\eta_{tot}^2 = \eta_{int}^2 + \eta_{ext}^2$, Eq. 1) (Elowitz et al., 2002; Swain et al., 2002) (see Supplementary Model of (Jörn M. Schmiedel, 2015)). Intrinsic noise $\eta_{int}$ results from the stochasticity of transcription, translation and decay, but is mostly dominated by transcriptional dynamics (Blake et al., 2003; Raj et al., 2006) and low mRNA copy numbers (Bar-Even et al., 2006; Ozbudak et al., 2002). Extrinsic noise $\eta_{ext}$ stems from fluctuations propagating from external factors to the gene (Pedraza and van Oudenaarden, 2005). The modeling predicted opposing effects of microRNA regulation on intrinsic and extrinsic noise. On the one hand, the model predicted that a microRNA-regulated gene (reg) has reduced intrinsic noise compared to an unregulated gene (unreg) at equal protein expression levels; intrinsic noise is approximately reduced by the square root of the microRNA-mediated fold-repression $r$, $\eta_{int}^{reg} = \eta_{int}^{unreg}/\sqrt{r}$ (Eq. 2) (Figure 4.2a). Noise reduction results from microRNA-mediated accelerated mRNA turnover and the increased transcriptional activity needed to produce the same amount of protein (Ebert and Sharp, 2012).
The model predicts that the effect occurs independently of the mode of microRNA-mediated repression (Jörn M. Schmiedel, 2015). On the other hand, the model predicted that microRNA regulation acts as an additional extrinsic noise source, $\eta_{ext}^{\mu} = \tilde{\eta}_{\mu}\,\phi$ (Eq. 3) (Figure 4.2b). The magnitude of $\eta_{ext}^{\mu}$ depends on the noise in the pool of regulating microRNAs ($\tilde{\eta}_{\mu}$) and on how strongly microRNAs repress the target ($\phi$) (Jörn M. Schmiedel, 2015). Therefore the model predicted that the combined net effect of decreased intrinsic noise and additional extrinsic noise would be decreased total noise at low expression but increased total noise at high expression (Figure 4.2c). Model fits, with the microRNA pool noise $\tilde{\eta}_{\mu}$ as the only free parameter, agree closely with the experimentally observed total noise profiles (Figure 4.1e-g).

[Figure 4.2 plots: (a) intrinsic noise, (b) extrinsic noise and (c) total noise versus protein expression [a.u.].]

Figure 4.2 Predictions of the noise model for a microRNA-regulated gene. (a) Intrinsic noise due to low molecule numbers declines with increasing expression. MicroRNA regulation reduces intrinsic noise as a function of repression due to higher mRNA turnover. (b) Noise in the microRNA pool propagating to the target gene results in additional extrinsic noise that depends on the conferred repression and the saturation of the microRNA pool. (c) The net influence of microRNA regulation is decreased total noise at low expression levels and increased total noise at high expression levels.
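The interplay of the two predicted effects can be illustrated numerically. The sketch below is only an illustration under stated assumptions, not the Supplementary Model of the cited work: it assumes that independent noise sources add in squares (Eq. 1), that intrinsic noise is divided by the square root of the fold-repression (Eq. 2), and that the microRNA-derived extrinsic term is the product of pool noise and a repression-strength factor. The function name, the parameter names and all numerical values are hypothetical.

```python
import math

def total_noise(eta_int_unreg, fold_repression, pool_noise, repression_strength,
                eta_ext_other=0.0):
    """Total noise (CV) of a microRNA-regulated gene in a toy model.

    eta_int_unreg: intrinsic noise of the unregulated gene at the same protein level.
    fold_repression: microRNA-mediated fold-repression r (r = 1 means unregulated).
    pool_noise, repression_strength: assumed factors of the microRNA-derived
        extrinsic noise term (product form is an assumption of this sketch).
    eta_ext_other: any other extrinsic noise, assumed independent.
    """
    eta_int = eta_int_unreg / math.sqrt(fold_repression)   # Eq. 2
    eta_ext_mirna = pool_noise * repression_strength       # assumed form of Eq. 3
    return math.sqrt(eta_int**2 + eta_ext_mirna**2 + eta_ext_other**2)  # Eq. 1

# Hypothetical numbers: 4-fold repression, pool noise 0.3, repression strength 1.
# Low expression: intrinsic noise dominates, so regulation lowers total noise.
unreg_low = total_noise(1.0, 1.0, 0.0, 0.0)
reg_low = total_noise(1.0, 4.0, 0.3, 1.0)

# High expression: intrinsic noise is small, so the added extrinsic term dominates.
unreg_high = total_noise(0.1, 1.0, 0.0, 0.0)
reg_high = total_noise(0.1, 4.0, 0.3, 1.0)
```

With these illustrative values, the regulated gene is less noisy than the unregulated one at low expression and noisier at high expression, which is the qualitative crossover described for Figure 4.2c.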
[Figure 4.3 plots: reporter plasmid schematic with identical 3'UTRs (no 3'UTR or miR-20a sites); axis labels include mean mCherry + ZsGreen intensity [a.u.], sqrt(fold-repression) and fold-repression.]

Figure 4.3 Exploration of intrinsic and extrinsic noise effects. (a) Plasmid reporter system with identical 3'UTRs for ZsGreen and mCherry, used to quantify expression-dependent intrinsic noise. (b) Intrinsic noise as a function of expression for three different miR-20a bi-regulated constructs. Dots: data; lines and shaded areas: model fit. (c) Measured intrinsic noise reduction for bi-regulated constructs compared to fold-repression, as measured independently with mCherry-regulated constructs. n=3 biological replicates. (d) MicroRNA pool noise estimates for nine different microRNAs endogenously expressed in mESC. The subset of microRNAs with two gene copies instead of one is indicated in red. n>3 biological replicates. (e) MicroRNA pool noise estimates for individual and mixed pools, using data from reporters with two perfect binding sites behind mCherry as indicated. Red bars: expectation for mixed pool noise if the sub-pools were fully correlated. n=3 biological replicates.

To distinguish between microRNA-mediated intrinsic and extrinsic noise effects experimentally, we modified the plasmid reporter system so that both reporters contained identical 3'UTRs (Figure 4.3a). In this configuration, intracellular differences in their expression can only result from processes individual to each gene, i.e. intrinsic noise. Comparing identical reporters with and without miR-20a sites, we show that miR-20a regulation reduces intrinsic noise compared to an unregulated construct (Figure 4.3b) by the square root of the fold-repression, as predicted by modeling (Figure 4.3c). These results also show that the observed increase in total noise at high mCherry expression must be due to additional extrinsic noise (Jörn M. Schmiedel, 2015).
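The idea that two identically regulated reporters differ only through intrinsic noise can be sketched with the standard dual-reporter estimator of Elowitz et al. (2002), which is cited above. The text does not state which estimator was actually used, so this is an assumption of the sketch; the function name and the synthetic data are hypothetical, and each reporter is rescaled to unit mean before the comparison.

```python
import numpy as np

def intrinsic_noise(reporter_a, reporter_b):
    """Dual-reporter estimate of intrinsic noise (as a CV).

    Both reporters are first normalized to mean 1, so that the estimator
    eta_int^2 = <(a - b)^2> / (2 <a> <b>) reduces to <(a - b)^2> / 2.
    """
    a = np.asarray(reporter_a, dtype=float)
    b = np.asarray(reporter_b, dtype=float)
    a = a / a.mean()
    b = b / b.mean()
    return np.sqrt(np.mean((a - b) ** 2) / 2.0)

# Synthetic check: two reporters with independent 20% fluctuations and no
# shared (extrinsic) component should give an intrinsic noise close to 0.2.
rng = np.random.default_rng(0)
n_cells = 100_000
rep_a = 1.0 + 0.2 * rng.standard_normal(n_cells)
rep_b = 1.0 + 0.2 * rng.standard_normal(n_cells)
estimate = intrinsic_noise(rep_a, rep_b)

# Perfectly co-varying reporters carry no intrinsic noise by this definition.
zero_case = intrinsic_noise(rep_a, rep_a)
```

In an analysis like the one described, such an estimate would be computed within each expression bin, and the ratio between the unregulated and the regulated construct compared to the square root of the fold-repression.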
The model together with the experiments suggests that the reduction of intrinsic noise is a generic property of microRNAs and should occur irrespective of the specific microRNA or the molecular details of the mRNA-microRNA interaction. To test the generality of these conclusions, we constructed eight additional reporters with mCherry 3'UTRs containing a perfect binding site for a variety of microRNAs that are endogenously expressed in mESC (Jörn M. Schmiedel, 2015). For all constructs, the intrinsic noise reduction was approximately the square root of the fold-repression (Jörn M. Schmiedel, 2015). This was also confirmed by direct measurement for miR-291a target sites (Figure 4.3c and (Jörn M. Schmiedel, 2015)) and for reporters containing AU-rich elements (Barreau et al., 2005; Jörn M. Schmiedel, 2015), the latter further supporting the idea that reduction of intrinsic noise is a generic property of post-transcriptional repressors. Additional extrinsic noise stems from the variability of the microRNA pool, and consistent with this we find that microRNA pool noise indeed differs between microRNAs (Figure 4.3d). The validity of these results is supported by the observation that different constructs assaying the same microRNA yield similar pool noise estimates (Jörn M. Schmiedel, 2015). Although microRNA pool noise decreases for microRNAs conferring stronger repression, it is still substantial for the most potent and highly expressed microRNAs in mESC (miR-290 cluster (Marson, 2008)) (Figure 4.3d). Interestingly, the microRNAs with two independent gene copies producing the identical mature microRNA (Figure 4.3d, red) tend to have lower microRNA pool noise than single-gene microRNAs. This suggested to us that microRNA pools could have lower noise if they consist of independently transcribed microRNAs, so that uncorrelated fluctuations can average out.
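The averaging argument can be made concrete with a small worked example. The sketch below treats a mixed pool simply as the sum of its sub-pool abundances, which is a simplifying assumption (a reporter may weight each microRNA by its repressive strength), and all names and numbers are hypothetical. For fully correlated sub-pools the standard deviations add, whereas for independent sub-pools the variances add.

```python
import math

def mixed_pool_noise(means, noises, correlated):
    """Noise (CV) of a pool formed by summing several microRNA sub-pools.

    means: mean abundance of each sub-pool.
    noises: CV of each sub-pool.
    correlated: True for fully correlated fluctuations, False for independent ones.
    """
    sds = [m * cv for m, cv in zip(means, noises)]
    if correlated:
        total_sd = sum(sds)                            # standard deviations add
    else:
        total_sd = math.sqrt(sum(sd * sd for sd in sds))  # variances add
    return total_sd / sum(means)

# Two equally abundant sub-pools, each with a hypothetical noise of 0.4.
cv_correlated = mixed_pool_noise([100.0, 100.0], [0.4, 0.4], correlated=True)
cv_independent = mixed_pool_noise([100.0, 100.0], [0.4, 0.4], correlated=False)
```

In this toy case the fully correlated pool keeps the noise of its parts (0.4), while the independent pool is less noisy by a factor of the square root of two, which is the kind of reduction the two-gene-copy observation hints at.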
To test this hypothesis, we constructed reporters with perfect target sites for miR-20a and either miR-16 or miR-290 in the mCherry 3'UTR and compared them to reporters with two perfect target sites for miR-16, miR-20a or miR-290, respectively. We find that the noise levels in the mixed pools are lower than expected if the individual microRNA pools were fully correlated and can be lower than the noise in the individual microRNA pools (Schmiedel et al., 2015). Therefore our data show that, if noise between different microRNAs is not correlated, combinatorial regulation can result in lower noise of the target protein.

In contrast to our artificial 3'UTRs, endogenous mRNAs often contain many binding sites for different microRNAs, and with less complementarity (Enright et al., 2003; Krek et al., 2005). To test whether our findings are likely to apply in vivo, we constructed mCherry reporters with the 3'UTRs from Wee1, Lats2, Casp2 and Rbl2, all predicted to be combinatorially regulated by mESC microRNAs (Schmiedel et al., 2015). This multi-microRNA regulation resulted in 3 to 5.5-fold repression compared to the control 3'UTRs containing mutated sites, and reduced total noise except when reporter expression levels were high (Figure 4.4a and (Schmiedel et al., 2015)). Model fits estimate intrinsic noise reduction for the wild-type 3'UTRs to be as large as the square root of fold-repression (Schmiedel et al., 2015), consistent with our findings for the artificial 3'UTRs. Furthermore, little additional noise at high expression levels results from low noise in the mixed microRNA pools regulating the wild-type 3'UTRs (Schmiedel et al., 2015), corroborating that combinatorial microRNA regulation is a potent way to optimize overall noise reduction. To determine whether the reporter assay covers expression levels relevant to endogenous genes, we used fluorescence-activated cell sorting and RNA sequencing (Schmiedel et al., 2015).
The reporter assay covers the range of 25% to 99% (~1 RPKM to ~500 RPKM) of expressed genes in mESC (Figure 4.4b). Model-based extrapolation shows that reduction of total noise for the endogenous 3'UTRs extends in a graded fashion up to the top 10% of the transcriptome expression distribution (Figure 4.4c). While most microRNAs individually repress genes only to a small extent (Baek, 2008; Selbach, 2008), we find that hundreds of genes are substantially repressed (>2 fold) by the combinatorial action of microRNAs in mESC (Schmiedel et al., 2015), as determined from transcriptome expression data for wild-type and microRNA-deficient Dicer knockout mESC (Leung, 2011). Furthermore, most of the highly repressed genes have low expression levels ((Schmiedel et al., 2015), consistent with refs. (Farh, 2005; Sood et al., 2006)), suggesting that these genes should have reduced protein expression noise as a consequence of microRNA regulation in vivo.

In summary, our integrated theoretical and experimental analyses show that reduction of intrinsic noise is a generic property of microRNAs, and more generally post-transcriptional regulation, that is linked to repression of protein expression. MicroRNAs preferentially target lowly expressed genes, for which noise reduction will be strongest, while selectively avoiding ubiquitous and highly expressed genes (Farh, 2005; Sood et al., 2006). Combinatorial microRNA regulation, a widely observed phenomenon in vivo (Enright et al., 2003; Krek et al., 2005), enhances overall noise reduction by providing strong repression to endogenous genes with only little additional noise from microRNA pools. Combinatorial microRNA regulation may thus be a potent mechanism to reinforce cellular identity by reducing gene expression fluctuations that are undesirable for the cell.

Figure 4.4 Reduction of total noise dominates for microRNA-regulated endogenous 3'UTRs. (a) Noise as a function of mean for mCherry with Lats2 3'UTR (blue) or control 3'UTR with point-mutated microRNA binding sites (black). Dots: data, lines and shaded area: model fit. (b) Mapping fluorescent reporter range to mESC transcriptome. (Upper panel) FACS sorting and least-squares regression were used to determine the conversion between mean mCherry fluorescence intensities and mCherry mRNA levels (as measured by RNA-seq). (Lower panel) Range covered by mCherry in relation to transcriptome expression in mESC (~25% to ~99%). (c) Model-based extrapolation of total noise in assayed endogenous 3'UTRs relative to control 3'UTRs as a function of transcriptome expression (blue line and area: mean and 95% confidence interval based on parameter estimates of three biological replicates).

4.3 Methods

4.3.1 Reporter plasmid construction

Starting from a previously established reporter system (Mukherji et al., 2011), eYFP was replaced with ZsGreen1-1 (Clontech) using EcoRI and NdeI digestion sites. MicroRNA binding sites were inserted into the mCherry 3'UTR using ClaI and EcoRV digestion sites and into the ZsGreen1-1 3'UTR using NdeI and XbaI digestion sites. N=1 bulged (fully complementary to the microRNA except for a central bulge, as in (Mukherji et al., 2011)) and perfect (fully complementary) microRNA target sites were created by annealing complementary single-stranded oligonucleotides (IDT) carrying the respective overhangs for the digestion sites at 65 °C for 30 minutes, after initial heating to 95 °C for 5 minutes.
The N=4 bulged miR-20a binding-site 3'UTR contains random 50 bp spacers between individual binding sites and was synthesized (IDT gBlocks). Wee1 wild-type and mutated 3'UTR fragments (nt 130-610) as well as Casp2 and Rbl2 wild-type and mutated 3'UTRs were synthesized (IDT gBlocks). The Lats2 wild-type 3'UTR was amplified from murine embryonic stem cell cDNA and was sequence confirmed. The mutated version of the Lats2 3'UTR was synthesized (GeneArt). The mutated 3'UTRs were synthesized with double point mutations in all predicted microRNA binding sites (TargetScan 6.2 (Garcia et al., 2011)) of significantly expressed mESC microRNAs (Marson, 2008). Seed positions 3 and 5 were mutated such that purines and pyrimidines were interchanged, yielding mutated 3'UTRs that maintain >95% sequence similarity to wild-type 3'UTRs. Refer to Supplementary Table S1 of (Schmiedel et al., 2015) for a list of mutated microRNA seed sites. Synthesized fragments were PCR amplified to append the necessary digestion sites. MicroRNA binding sites and 3'UTRs were cloned into the digested and dephosphorylated plasmid backbone using T4 ligase (NEB). For a list of target site sequences, endogenous 3'UTR sequences and their mutated versions refer to (Schmiedel et al., 2015).

4.3.2 Transient transfections

Murine embryonic stem cells (V19, below passage 20) were plated two days before transfection in 2 ml synthetic 2i medium (Ying et al., 2003) (Gibco) on gelatinized 6-well plates, starting at ~10^5 cells. Medium was refreshed after 24 hours. Reporter plasmids were diluted 1:25 in pUC19b carrier plasmid (NEB) and mixed with Lipofectamine 2000 (Invitrogen). 10 µl reagent with 4 µg DNA in 300 µl Opti-MEM was added to 2 ml 2i medium per well. 4 hours post transfection, cells were detached using Accutase (EMD Millipore), split 1:2 and passaged onto gelatinized 60 mm plates in 3 ml 2i medium containing 3 µg doxycycline. Medium was refreshed 24 hours after passaging.
4.3.3 Flow cytometry

Cells were assayed on an LSRFortessa analyzer (BD Biosciences) two days after transfection. Cells were gated according to their forward (FSC-A) and side (SSC-A) scatter profiles. Each set of experiments contained at least one cell population transfected with the corresponding unregulated reporter construct and one mock-transfected cell population (pUC19b carrier plasmid only), which was used to characterize background fluorescence.

4.3.4 Transcriptome profiling

Cells were transfected with plasmid containing the mCherry-Wee1 wild-type 3'UTR (as described above). Cells were sorted into four fractions (~100,000 cells each) on a FACSAriaIII cell sorter (BD Biosciences) according to ZsGreen intensities. RNA from cells in each fraction was extracted using Trizol LS (Life Technologies). Sequencing libraries were prepared from the isolated RNA using the Illumina TruSeq Stranded mRNA kit. Libraries were sequenced on an Illumina HiSeq 2500 sequencer. Sequencing results were mapped to RefSeq mRNAs (mm10) and mCherry sequences using Bowtie v2.2.0 (Langmead and Salzberg, 2012). Reads per kilobase of gene model per million mapped reads (RPKM) were calculated for all transcripts, and transcript isoforms were then aggregated to GeneSymbols. For further analysis we only considered genes expressed above 0.1 RPKM, which corresponds to about one transcript per mouse embryonic stem cell (Dominic Grün, personal communication).

4.3.5 Taqman microRNA expression measurements

RNA was isolated from mESC V19 cells two days after transient transfection with the control plasmid reporter (no 3'UTR behind mCherry) : pUC19 carrier plasmid mix (as described above) using the Life Technologies mirVana miRNA Isolation Kit. Expression of microRNAs mmu-miR-16, mmu-miR-20a and mmu-miR-290 was assayed using Life Technologies Taqman microRNA assays.
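The RPKM normalization and the 0.1 RPKM expression cut-off of Section 4.3.4 can be sketched as follows. This is a minimal illustration in plain Python, not the pipeline actually used. The function names are hypothetical, and summing isoform values per gene is an assumption, since the text only states that isoforms were "aggregated to GeneSymbols".

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of gene model per million mapped reads."""
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    # reads / (length in kb) / (mapped reads in millions)
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


def aggregate_by_gene(isoform_rpkm, isoform_to_gene):
    """Sum isoform-level RPKM per gene symbol (assumed aggregation rule)."""
    totals = {}
    for isoform, value in isoform_rpkm.items():
        gene = isoform_to_gene[isoform]
        totals[gene] = totals.get(gene, 0.0) + value
    return totals


def filter_expressed(gene_rpkm, threshold=0.1):
    """Keep genes expressed above the threshold (0.1 RPKM in the text)."""
    return {gene: value for gene, value in gene_rpkm.items() if value > threshold}
```

For example, `rpkm(500, 2000, 10_000_000)` evaluates 500 reads on a 2 kb transcript in a library of 10 million mapped reads.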
4.3.6 Flow cytometry data processing

For uni-regulated constructs (3'UTR only behind mCherry), cells were binned according to ZsGreen intensities (bin-width 0.2 in log10 space). The lower bin limit was set to the 0.9999-quantile of the background distribution. In each bin, cells below the 0.001-quantile and above the 0.999-quantile were discarded to deal with outliers. 1,000 iterations of 50% bootstrapping were used to evaluate the uncertainty of the data in each bin. From each bootstrap, mean and noise of mCherry intensities were calculated. The means of the mean and noise values over all 1,000 bootstraps serve as observables for each particular bin. The standard deviations of the mean and noise values over all 1,000 bootstraps serve as uncertainties of the observables for each particular bin.

For bi-regulated constructs (identical 3'UTRs behind ZsGreen and mCherry), cells were binned according to the summed [ZsGreen + mCherry] intensity (bin-width 0.2 in log10 space). The lower bin limit was set to the 0.9999-quantile of the summed [ZsGreen + mCherry] intensity of the background distribution. In each bin, ZsGreen intensity was normalized such that ZsGreen and mCherry intensity distributions had identical means. Mean and standard deviation of intrinsic noise were calculated in each bin by bootstrapping as described above. Intrinsic noise was calculated as $\eta_{int} = \sqrt{\langle (z - m)^2 \rangle / (2 \langle z \rangle \langle m \rangle)}$ (Elowitz et al., 2002), with $z$ and $m$ as ZsGreen and mCherry intensities of cells and $\langle x \rangle$ denoting the mean of a variable over all cells in the bin.

The aforementioned observables describe mean and noise of the flow cytometry measurements. Actual biological signal mean and noise were deconvolved from measurement noise as described in (Schmiedel et al., 2015).

4.3.7 Model fit to signal mean and noise

For all model fits to single cell data, a MATLAB implementation of the profile likelihood approach (Raue et al., 2009) was used to determine optimal fits and 95% confidence intervals of parameter estimates.
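As an illustration of the bi-regulated processing described in Section 4.3.6, the sketch below computes the intrinsic noise of one bin with the two-reporter estimator of Elowitz et al. (2002), $\eta_{int}^2 = \langle (z-m)^2 \rangle / (2 \langle z \rangle \langle m \rangle)$, and bootstraps its uncertainty. It is a minimal sketch and not the original analysis code. The function names are hypothetical, and "50% bootstrapping" is interpreted here as drawing half of the cells with replacement in each iteration, which is an assumption.

```python
import math
import random


def mean(values):
    return sum(values) / len(values)


def intrinsic_noise(z, m):
    """Two-reporter intrinsic noise: sqrt(<(z - m)^2> / (2 <z> <m>)).

    z, m: per-cell ZsGreen and mCherry intensities from one bin.
    ZsGreen is first rescaled so that both channels have the same mean.
    """
    if len(z) != len(m) or not z:
        raise ValueError("z and m must be non-empty and of equal length")
    mean_m = mean(m)
    scale = mean_m / mean(z)
    z_scaled = [scale * value for value in z]
    mean_sq_diff = mean([(a - b) ** 2 for a, b in zip(z_scaled, m)])
    # after rescaling, <z> equals <m>
    return math.sqrt(mean_sq_diff / (2.0 * mean_m * mean_m))


def bootstrap_intrinsic_noise(z, m, n_iter=1000, fraction=0.5, seed=0):
    """Mean and standard deviation of the noise estimate over bootstraps."""
    rng = random.Random(seed)
    n = len(z)
    k = max(1, int(fraction * n))  # assumed: resample 50% of cells
    estimates = []
    for _ in range(n_iter):
        idx = [rng.randrange(n) for _ in range(k)]  # draw with replacement
        estimates.append(
            intrinsic_noise([z[i] for i in idx], [m[i] for i in idx])
        )
    centre = mean(estimates)
    spread = math.sqrt(mean([(e - centre) ** 2 for e in estimates]))
    return centre, spread
```

The returned pair corresponds to the observable and its uncertainty for a single bin; outlier trimming and background deconvolution are not included.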
Uni-regulated reporter constructs (3'UTR only behind mCherry): The mass action kinetics model (see Supplementary Equation 21 of (Schmiedel et al., 2015)) was fitted to the background-corrected and binned mean signal intensity data, taking ZsGreen signal intensities as proportional to the transcription rate and mean mCherry signal intensity as proportional to the mean protein concentration. From the fit, microRNA-mediated repression R and microRNA saturation S could be estimated. The noise model (Supplementary Equation 37 of (Schmiedel et al., 2015)) was fitted to the corrected mCherry signal noise data of both the regulated reporter and the respective unregulated control reporter simultaneously. The fit yielded parameter estimates for the scaling factor x, which relates mean mCherry signal intensity <mCherry> to molecule numbers, the microRNA-independent extrinsic noise $\eta_{ext}$ and the effective microRNA pool noise.

Bi-regulated constructs (identical 3'UTRs behind ZsGreen and mCherry): Intrinsic signal noise was fitted as inversely proportional to the square root of the summed [ZsGreen + mCherry] mean intensity <z+m>, $\eta_{int} = \gamma / \sqrt{\langle z+m \rangle}$, with $\gamma$ as a scaling factor. Scaling factors for both the regulated reporters and the respective unregulated control reporters were estimated, and their ratio yielded the intrinsic noise reduction conferred by microRNA regulation.

4.3.8 Mixed microRNA pool noise for correlated individual microRNA pools

The hypothetical microRNA pool noise of fully correlated individual microRNA pools (cf. Figure 4.3e) was calculated as $\eta_{x+y} = \sqrt{\sigma_x^2 + \sigma_y^2 + 2 \rho \sigma_x \sigma_y} / (\langle x \rangle + \langle y \rangle)$. Here, the correlation coefficient $\rho$ was set to 1. The standard deviation $\sigma_i$ was calculated as $\sigma_i = \eta_i \cdot \langle i \rangle$, the product of the noise in the individual microRNA pool $\eta_i$ (known from mCherry reporters only regulated by the specific microRNA) times the relative microRNA pool size $\langle i \rangle$, which we measured using Taqman microRNA assays.
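The expectation for a mixed pool described in Section 4.3.8 can be sketched numerically. The sketch assumes the standard expression for the standard deviation of a sum of two variables with correlation coefficient rho, divided by the summed pool size. The function name and the example numbers are hypothetical and are not measured values.

```python
import math


def mixed_pool_noise(noise_x, size_x, noise_y, size_y, rho=1.0):
    """Noise (std / mean) of the summed pool x + y.

    noise_*: noise of each individual microRNA pool.
    size_*:  relative pool size of each microRNA.
    rho:     correlation coefficient between the two pools
             (1.0 reproduces the fully correlated expectation).
    """
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    sd_x = noise_x * size_x  # sigma_i = noise_i * <i>
    sd_y = noise_y * size_y
    variance = sd_x ** 2 + sd_y ** 2 + 2.0 * rho * sd_x * sd_y
    return math.sqrt(max(variance, 0.0)) / (size_x + size_y)


# Hypothetical example: two equally sized pools with noise 0.4 and 0.2.
correlated = mixed_pool_noise(0.4, 1.0, 0.2, 1.0, rho=1.0)
uncorrelated = mixed_pool_noise(0.4, 1.0, 0.2, 1.0, rho=0.0)
```

With rho = 1 the result is the size-weighted average of the individual noise values, whereas any rho below 1 gives a lower value, which is the averaging effect the comparison with the red bars is meant to detect.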
4.3.9 Mapping flow cytometry experiments to transcriptome expression

To calculate the conversion factor from mCherry fluorescence intensities to RPKM values, a least-squares fit of the respective values over the four bins was performed. Comparability of transcriptome expression from different bins is supported by their high similarity (R² > 0.96, Figure S10C of (Schmiedel et al., 2015)). Relative effects of microRNA regulation on total noise as a function of RPKM values were calculated based on the parameters obtained from model fits to noise data from endogenous 3'UTRs (n=3) and the mCherry fluorescence intensity to RPKM conversion factor.

4.3.10 Dicer knock-out mESC transcriptome expression data

Microarray expression data and Ago2 CLIP-seq data from wild-type and Dicer knock-out mouse embryonic stem cells were obtained from Gene Expression Omnibus GSE25310, and data were processed as described in (Leung, 2011). MicroRNA-mediated repression was calculated as the fold-change in mean expression between wild-type and Dicer knock-out samples. Loess regression was performed to obtain an error model relating the standard deviation of expression for each gene to its mean expression over three replicates, for both wild-type and knock-out samples. Significance of fold-changes was assessed at alpha<0.05 (Bonferroni corrected) by calculating z-scores as $z = (\langle KO \rangle - \langle WT \rangle) / \sqrt{\sigma_{KO}^2 + \sigma_{WT}^2}$. Genes below microarray intensity of -4.2 (40% of genes) were discarded as background. Genes were labeled as 'AGO-bound' if at least one read cluster in their 3'UTR was found in CLIP-seq data. Genes were labeled as predicted microRNA targets if they contain at least one predicted conserved microRNA binding site (TargetScan 6.2 (Garcia et al., 2011)) for a microRNA seed family expressed above 0.1% of total microRNA expression in mESC (Marson, 2008).

4.4 Acknowledgments

This chapter is in collaboration with Jörn M. Schmiedel, Sandy L. Klemm, and Apratim Sahay under the instruction of Nils Blüthgen, Debora S.
Marks, and Alexander van Oudenaarden. The original paper 'MicroRNA control of protein expression noise' has been accepted by Science. We thank Margaret Ebert, Shankar Mukherji, Dylan Mooijman, Lennart Kester, Dominic Grün and Mauro Muraro for discussions and help, the Boyer lab for mESC line V19, the Cuppen lab for sequencing and Stefan van der Elst for help with FACS sorting. This work was supported by NWO (VICI award, AvO), ERC (ERC-AdG 294325-GeneNoiseControl, AvO), DFG (GK1772, JMS), EMBO (STF, JMS), DFG (SPP 1395, NB), BMBF (FORSYS & BCCN, NB), and HMS institutional support (DSM).

4.5 References

Baek, D. (2008). The impact of microRNAs on protein output. Nature 455, 64-71.
Bar-Even, A., Paulsson, J., Maheshri, N., Carmi, M., O'Shea, E., Pilpel, Y., and Barkai, N. (2006). Noise in protein expression scales with natural protein abundance. Nature Genetics 38, 636-643.
Barreau, C., Paillard, L., and Osborne, H.B. (2005). AU-rich elements and associated factors: are there unifying principles? Nucleic Acids Research 33, 7138-7150.
Bartel, D.P., and Chen, C.-Z. (2004). Micromanagers of gene expression: the potentially widespread influence of metazoan microRNAs. Nat Rev Genet 5, 396-400.
Blake, W.J., Kaern, M., Cantor, C.R., and Collins, J.J. (2003). Noise in eukaryotic gene expression. Nature 422, 633-637.
Brennecke, J., Hipfner, D.R., Stark, A., Russell, R.B., and Cohen, S.M. (2003). bantam Encodes a Developmentally Regulated microRNA that Controls Cell Proliferation and Regulates the Proapoptotic Gene hid in Drosophila. Cell 113, 25-36.
Ebert, M.S., and Sharp, P.A. (2012). Roles for microRNAs in conferring robustness to biological processes. Cell 149, 515-524.
Elowitz, M.B., Levine, A.J., Siggia, E.D., and Swain, P.S. (2002). Stochastic Gene Expression in a Single Cell. Science 297, 1183-1186.
Enright, A., John, B., Gaul, U., Tuschl, T., Sander, C., and Marks, D. (2003). MicroRNA targets in Drosophila. Genome Biology 5, R1.
Farh, K.K. (2005).
The widespread impact of mammalian microRNAs on mRNA repression and evolution. Science 310, 1817-1821.
Garcia, D.M., Baek, D., Shin, C., Bell, G.W., Grimson, A., and Bartel, D.P. (2011). Weak seed-pairing stability and high target-site abundance decrease the proficiency of lsy-6 and other microRNAs. Nat Struct Mol Biol 18, 1139-1146.
Guo, H., Ingolia, N.T., Weissman, J.S., and Bartel, D.P. (2010). Mammalian microRNAs predominantly act to decrease target mRNA levels. Nature 466, 835-840.
John, B., Enright, A.J., Aravin, A., Tuschl, T., Sander, C., and Marks, D.S. (2004). Human MicroRNA Targets. PLoS Biol 2, e363.
Johnston, R.J., and Hobert, O. (2003). A microRNA controlling left/right neuronal asymmetry in Caenorhabditis elegans. Nature 426, 845-849.
Schmiedel, J.M., Klemm, S.L., Zheng, Y., Sahay, A., Blüthgen, N., Marks, D.S., and van Oudenaarden, A. (2015). MicroRNA control of protein expression noise. Science.
Krek, A., Grün, D., Poy, M.N., Wolf, R., Rosenberg, L., Epstein, E.J., MacMenamin, P., da Piedade, I., Gunsalus, K.C., Stoffel, M., et al. (2005). Combinatorial microRNA target predictions. Nature Genetics 37, 495-500.
Langmead, B., and Salzberg, S.L. (2012). Fast gapped-read alignment with Bowtie 2. Nat Meth 9, 357-359.
Lee, R.C., Feinbaum, R.L., and Ambros, V. (1993). The C. elegans heterochronic gene lin-4 encodes small RNAs with antisense complementarity to lin-14. Cell 75, 843-854.
Leung, A.K. (2011). Genome-wide identification of Ago2 binding sites from mouse embryonic stem cells with and without mature microRNAs. Nat Struct Mol Biol 18, 237-244.
Lewis, B.P., Burge, C.B., and Bartel, D.P. (2005). Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets. Cell 120, 15-20.
Lim, L.P. (2005). Microarray analysis shows that some microRNAs downregulate large numbers of target mRNAs. Nature 433, 769-773.
Marson, A. (2008).
Connecting microRNA genes to the core transcriptional regulatory circuitry of embryonic stem cells. Cell 134, 521-533.
Miska, E.A., Alvarez-Saavedra, E., Abbott, A.L., Lau, N.C., Hellman, A.B., McGonagle, S.M., Bartel, D.P., Ambros, V.R., and Horvitz, H.R. (2007). Most Caenorhabditis elegans microRNAs are individually not essential for development or viability. PLoS Genet 3, e215.
Mukherji, S., Ebert, M.S., Zheng, G.X.Y., Tsang, J.S., Sharp, P.A., and van Oudenaarden, A. (2011). MicroRNAs can generate thresholds in target gene expression. Nat Genet 43, 854-859.
Noorbakhsh, J., Lang, A.H., and Mehta, P. (2013). Intrinsic Noise of microRNA-Regulated Genes and the ceRNA Hypothesis. PLoS ONE 8, e72676.
Ozbudak, E.M., Thattai, M., Kurtser, I., Grossman, A.D., and van Oudenaarden, A. (2002). Regulation of noise in the expression of a single gene. Nature Genetics 31, 69-73.
Pedraza, J.M., and van Oudenaarden, A. (2005). Noise Propagation in Gene Networks. Science 307, 1965-1969.
Raj, A., Peskin, C.S., Tranchina, D., Vargas, D.Y., and Tyagi, S. (2006). Stochastic mRNA Synthesis in Mammalian Cells. PLoS Biol 4, e309.
Raue, A., Kreutz, C., Maiwald, T., Bachmann, J., Schilling, M., Klingmüller, U., and Timmer, J. (2009). Structural and practical identifiability analysis of partially observed dynamical models by exploiting the profile likelihood. Bioinformatics 25, 1923-1929.
Selbach, M. (2008). Widespread changes in protein synthesis induced by microRNAs. Nature 455, 58-63.
Sood, P., Krek, A., Zavolan, M., Macino, G., and Rajewsky, N. (2006). Cell-type-specific signatures of microRNAs on target mRNA expression. Proceedings of the National Academy of Sciences of the United States of America 103, 2746-2751.
Swain, P.S., Elowitz, M.B., and Siggia, E.D. (2002). Intrinsic and extrinsic contributions to stochasticity in gene expression. Proceedings of the National Academy of Sciences 99, 12795-12800.
Wightman, B., Ha, I., and Ruvkun, G. (1993).
Posttranscriptional regulation of the heterochronic gene lin-14 by lin-4 mediates temporal pattern formation in C. elegans. Cell 75, 855-862.
Ying, Q.-L., Stavridis, M., Griffiths, D., Li, M., and Smith, A. (2003). Conversion of embryonic stem cells into neuroectodermal precursors in adherent monoculture. Nat Biotech 21, 183-186.

Chapter 5

Application of UTR decoy system to study microRNA-mediated crosstalk

5.1 Abstract

MicroRNA regulation strength is observed to decrease at high target expression for nearly all of our UTR reporters, which suggests that the regulating miRNAs or co-factors are subject to titration in this regime. Titration of miRNAs is a necessary condition for miRNA-mediated crosstalk between co-regulated targets. To study whether over-expression of UTR decoys is sufficient to induce crosstalk in general, we applied the Lats2 3'UTR reporter system as miRNA sponges, sorted ESCs according to decoy expression levels, and measured the genome-wide gene expression response by RNA-Seq. No statistically significant derepression was found for targets of miR-290 or of the other miRNAs targeting the Lats2 3'UTR. We estimated that decoy expression and the number of added miRNA target sites were comparable to or much higher than endogenous miRNA expression. However, the total expression of endogenous miRNA targets, i.e. the endogenous target site abundance (TA), was 1.5 to 4 fold as high as the added MREs. The lack of miRNA-mediated crosstalk in our system supports a model in which the changes in ceRNAs must begin to approach the TA of a miRNA before they can exert a consequential effect on the repression of targets for that miRNA (Denzler et al., 2014).

5.2 Introduction

In recent years, endogenous RNAs have been found to communicate via a language of microRNA response elements (MREs). By harboring the same set of MREs, RNAs can compete with each other for binding to the shared pool of regulating miRNAs.
Therefore those competing endogenous RNAs (ceRNAs) could titrate miRNA availability and co-regulate each other's expression (Salmena et al., 2011b).

Diverse RNA species have been added to this ceRNA regulatory network, including protein-coding mRNAs and non-coding RNAs (ncRNAs) such as pseudogenes, long non-coding RNAs (lncRNAs), and circular RNAs (circRNAs). By combining computational prediction with experimental validation, Tay et al. (Tay, 2011) discovered that protein-coding transcripts, such as VAPA and CNOT6L, could function as PTEN ceRNAs. Their expression is significantly correlated with that of PTEN in vivo, and those ceRNAs modulate PTEN expression in a miRNA-dependent manner. PTENP1 is a pseudogene of PTEN (Poliseno, 2010). The proximal region of its 3'UTR shares 95% identity with PTEN and contains conserved binding sites for five of the co-regulating miRNAs. Retroviral overexpression of the PTENP1 3'UTR upregulates PTEN in a Dicer-dependent manner, and siRNA knockdown of endogenous PTENP1 in prostate cancer cells results in a decrease in PTEN levels. A similar correlation of expression is found between KRAS and its pseudogene KRAS1P. T cells transformed with the primate virus Herpesvirus saimiri (HVS) express a viral ncRNA called H. saimiri U-rich RNA 1 (HSUR-1). It contains miR-27 sites and in turn accelerates the turnover of miR-27 (Libri et al., 2012). Moreover, the expression of HSUR-1 is correlated with the miR-27 target gene FOXO1, suggesting its ability to control host gene expression and its function as a ceRNA. Another ncRNA example is linc-RoR. Being highly expressed in human ESCs, it is able to titrate miR-145 from OCT4, SOX2 and NANOG transcripts and is essential for ESC pluripotency and self-renewal (Wang et al., 2013). The circRNA ciRS-7 has recently been identified as a natural sponge for miR-7. It contains more than 70 conserved binding sites for miR-7, and it is highly expressed and completely resistant to miRNA-mediated target degradation.
Thus circRNAs with tandem MREs may be exceptionally potent modulators of miRNA-mediated crosstalk (Hansen et al., 2013).

Even before the discovery of ceRNAs, artificial miRNA sponges had been shown to be effective miRNA inhibitors (Ebert et al., 2007). These sponges are usually expressed from strong promoters, contain multiple binding sites for a miRNA of interest and have been shown to derepress miRNA targets at least as effectively as chemically modified antisense oligonucleotides. Intriguingly, imperfectly complementary 'bulged sponges' sequester miRNAs more effectively than perfectly complementary miRNA sponges, and endonucleolytic cleavage of perfectly complementary targets might reduce their ability to retain miRNAs over longer times.

Efforts have been made to identify ceRNA networks at a transcriptome-wide scale. Several miRNA-target prediction algorithms, including TargetScan, miRanda, rna22 and PITA, have been used in Cupid to identify ceRNA interactions in breast cancer cell lines (Chiu et al., 2015). Sumazin et al. (Sumazin, 2011) investigated the mRNA and miRNA network in glioblastoma cells using data from the Cancer Genome Atlas and a new multivariate analysis method called Hermes. They identified a post-transcriptional regulation layer of surprising magnitude, comprising over 248,000 pairwise miRNA-mediated interactions and 7,000 RNAs that can function as miRNA sponges. High-throughput biochemical techniques such as crosslinking immunoprecipitation followed by high-throughput sequencing (CLIP-Seq) have been integrated in starBase, and genome-wide interaction maps of endogenous miRNA targets provided further insight into the ceRNA network (Li et al., 2013).

Although multiple examples of ceRNA interactions have been described, little is known about the molecular conditions necessary for optimal ceRNA activity. The abundance of ceRNAs and miRNAs as well as their stoichiometry are obviously important.
Other factors, such as the ability of HSUR-1 to catalyze miRNA turnover (Cazalla et al., 2010) and the exceptional stability of circRNAs (Memczak et al., 2013), could further increase the potency of sponging decoys. The effectiveness of a ceRNA would also depend on the accessibility, affinity, and subcellular localization of the RNAs. Titration of other RNA-binding proteins (RBPs) and indirect interactions may also be intertwined with the ceRNA network.

Changes in ceRNA expression levels have to be large enough to relieve miRNA repression; whether or not this can happen under physiological conditions is another issue. PTENP1 RNA is expressed at a much lower level than the PTEN mRNA (~1%) in DU145 cells, and it is unlikely to significantly perturb PTEN and other ceRNAs in this context. Yet it is conceivable that the effectiveness of the crosstalk depends on the sensitivity of the regulated genes to subtle changes in expression level. PTEN is a haploinsufficient tumor suppressor, and even a 20% decrease in expression can promote cancer growth (Alimonti et al., 2010).

5.3 Results

In order to apply the UTR decoy system to study miRNA-mediated crosstalk, we transfected wild-type mESCs with either functional decoys (Lats2 OriUTR) or controls with mutated miRNA response elements (MREs) (Lats2 MutUTR). Transfected cells were sorted into five bins according to mCherry intensities (Figure 5.1). Cells were split in half for downstream applications. RNA sequencing (RNA-Seq) was performed to measure genome-wide miRNA-mediated crosstalk at the transcript level, and targeted mass spectrometry (MS) was conducted in parallel on a list of preselected proteins to illustrate the crosstalk at the protein level. Due to the limited capacity of targeted MS, only a dozen computationally predicted top targets of the miR-290 family, including Lats2 itself, were selected; those proteins were the most likely to be affected by Lats2 UTR overexpression if crosstalk occurred.

Figure 5.1 Flow cytometry data and gate positions illustration. pCAG-d2eGFP-Lats2 Ori/MutUTR was co-transfected with pCAG-mCherry into WT ESCs, and cells were sorted according to mCherry intensities. Positively transfected cells were sorted into 4 bins (B1, B2, B3, and B4), and 1 bin (B0) was set as background control. miRNA repression of GFP protein production is evident from the plot.

Sorting data recapitulated what we have observed on FACS analyzers. MicroRNA repression of GFP protein production increases initially and decreases at high target expression (Table 5.1). The 5 bins of sorted cells cover an mCherry expression range of ~3 × 10^3 fold, with the means of B4 and B1 differing by more than 150 fold (data not shown).

Figure 5.4 Expression of top miR-290 targets in sorted samples. Lats2 and Tgfb2 exhibit higher expression in OriUTR transfected samples compared to MutUTR transfected ones. But a similar number of counterexamples were also observed (Casp2, Tet1, and Nr2c2). Moreover, we do not observe a consistent increase in gene expression upon higher decoy expression. Thus miRNA-mediated crosstalk is weak even for top miR-290 targets.

Table 5.1 GFP protein intensities and repression fold at protein level, data quantified from cell sorting.
GFP protein expression averages were extracted from FACS; repression fold was quantified using $RF = (GFP_{unreg} - GFP_{bg}) / (GFP_{reg} - GFP_{bg})$.

The targeted MS measurement was not successful, so we restrict the discussion to RNA-Seq data from here on. RNA-Seq has very low, if any, background signal (Wang et al., 2009). Thus we define the fold change of a gene's expression as $FC = (\text{expression in OriUTR transfection}) / (\text{expression in MutUTR transfection})$, and no background subtraction is needed. If overexpression of decoys does lead to miRNA-mediated crosstalk, we expect to see titration of regulating miRNAs by OriUTR but not by MutUTR. This will result in derepression of co-regulated genes in OriUTR transfected bins compared to MutUTR ones. Consequently, the FC of miRNA targets defined above will be larger than 1 on average, and the log fold changes (LFCs) will be larger than 0 on average.

Distributions of fold changes were plotted on a log scale, and the i-th bin was compared to the 0-th bin control with a two-sample Kolmogorov-Smirnov test (KS test). As expected, the expression distribution of all genes was unaffected by decoy overexpression, and the LFCs were always centered on 0 irrespective of decoy levels (Supplementary Figure 5.3, bin1 to bin4). However, we did not observe significant shifts in LFCs for miR-290 targets either (Figure 5.2), and the differences between the cumulative distribution functions (CDFs) of miR-290 target LFCs across bins are no more significant than those of all genes (Figure 5.3). We concluded that no genome-wide miRNA-mediated crosstalk for miR-290 targets was detected under our experimental conditions.

It has been hypothesized that miRNA-RNA competition would only apply to a small subset of moderate or low abundance miRNAs, as the overexpression of decoys would have little impact on regulation by highly abundant miRNAs (Wee et al., 2012).
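The quantities used in this comparison (repression fold, log fold change, and the two-sample KS comparison between bins) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the analysis code behind the figures. The function names are hypothetical, base 2 is assumed for the logarithm, genes with zero expression in either sample are skipped (an assumption), and only the KS statistic D is computed, without a p-value. A library routine such as `scipy.stats.ks_2samp` would provide the p-value as well.

```python
import math


def repression_fold(gfp_unreg, gfp_reg, gfp_bg):
    """RF = (GFP_unreg - GFP_bg) / (GFP_reg - GFP_bg)."""
    denominator = gfp_reg - gfp_bg
    if denominator <= 0:
        raise ValueError("regulated signal must exceed background")
    return (gfp_unreg - gfp_bg) / denominator


def log2_fold_changes(ori_expr, mut_expr):
    """log2(OriUTR / MutUTR) per gene; no background subtraction."""
    lfc = {}
    for gene, ori in ori_expr.items():
        mut = mut_expr.get(gene, 0.0)
        if ori > 0 and mut > 0:  # skip genes undetected in either sample
            lfc[gene] = math.log2(ori / mut)
    return lfc


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    na, nb = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < na and j < nb:
        x = min(a[i], b[j])
        while i < na and a[i] <= x:
            i += 1
        while j < nb and b[j] <= x:
            j += 1
        d = max(d, abs(i / na - j / nb))
    return d
```

Applied to the LFCs of miRNA targets in a sorted bin versus the background bin, a D value close to 0 indicates that the two distributions are nearly indistinguishable.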
The miR-290 family is the most abundant miRNA family expressed in ESCs, accounting for 70% of total ESC miRNA expression (Marson, 2008). Based on TaqMan® assays, ES cells are estimated to have on average ~7,600 copies of miR-290 family miRNAs per cell (Chen, 2007). To estimate the number of decoys, RNA-Seq RPKM values were regressed against smFISH counts for 8 available genes, which cover a wide expression range (~100 fold); 1 RPKM roughly corresponds to 1 transcript per cell (data not shown). This means that about 900 and 8000 decoy transcripts per cell were originally expressed in bin3 and bin4, and that even after miRNA-mediated degradation more than 200 and 2000 decoy mRNAs per cell remained, respectively (Table 5.2, gfp reads in MutUTR and OriUTR transfections). Given that each Lats2 OriUTR contains two binding sites for miR-290 miRNAs (TargetScan 6.2), the number of overexpressed miR-290 MREs is comparable to the total number of miR-290 miRNAs.

On the other hand, decoys have to compete against all ceRNAs for miR-290 binding. mESCs are a hard-to-transfect cell line. Even in the highest expressed bin (bin4), decoy transcripts account for only ~0.9% and ~2.5% of total transcriptome expression for OriUTR and MutUTR respectively (Supplementary Figure 5.6). This is in drastic contrast to transfection of human colon cancer HCT116 cells, in which decoy expression could constitute more than half of the transcriptome (Apratim Sahay, personal communication). The total expression of all miR-290 targets is about 10,500 transcripts per cell. The average number of miR-290 MREs for those targets is ~1.3 per transcript, and the median is 1. Thus decoy overexpression, which adds another ~30% of miR-290 MREs to the total transcriptome in bin4, may still be unable to perturb endogenous miR-290 availability significantly.

Next we examined several other miRNAs targeting the Lats2 3'UTR.
Those include miR-31, miR-135, miR-103/107, and miR-15/16, which were expressed at ~0.01%, ~0.1%, ~1%, and ~5% of miR-290 cluster expression (extracted from the Solexa-Seq data of Marson, 2008). The summed expression of all endogenous targets of each of those miRNAs was on the same order as that of the miR-290 targets. None of the miRNA target sets explored exhibited significant miRNA-mediated crosstalk on a genome-wide scale (Supplementary Figure 5.4).

Factors such as the number of shared MREs, miRNA binding affinity, and the overlap of regulating miRNAs positively affect the strength of miRNA-mediated crosstalk (Tay et al., 2014). We therefore restricted ourselves to the top targets of the miR-290 family, which usually contain more than two miR-290 MREs per transcript with strong miRNA binding affinity, and monitored their expression across the sorted samples (Figure 5.4). Only a few genes (Lats2 and Tgfbr2) exhibit higher expression in OriUTR-transfected samples than in MutUTR-transfected ones, and a similar number of counterexamples were also observed (Casp2, Tet1, and Nr2c2). Moreover, even for endogenous Lats2, which is a perfect competing RNA for the Lats2 decoy UTRs, we do not observe a consistent increase in gene expression with increasing amounts of decoy expression. For example, expression in bin3 and bin4 is lower than in bin1 and bin2. We therefore suspect that the apparent variations in gene expression across samples largely reflect expression noise for lowly expressed genes, as the expression of more abundant genes (Cdkn1a) and of housekeeping genes (Supplementary Figure 5.2) is very uniform across all measured samples.

It is important to point out that the derepression of miRNA regulation we observed for the UTR reporters happened at the translational rather than the transcriptional level. Therefore, even though we did not observe miRNA-mediated crosstalk at the transcriptome level, crosstalk might exist at the proteome level.
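The copy-number argument used in this section can be made explicit with a few lines of arithmetic. The sketch below only restates the per-cell estimates quoted in the text (miR-290 copies, decoy transcripts in bin3 and bin4, two sites per Lats2 OriUTR, 10,500 endogenous target transcripts with ~1.3 MREs each); the variable and function names are illustrative, and using the post-degradation decoy counts for the added MRE fraction is an assumption about how the ~30% figure was obtained.

```python
# Per-cell estimates quoted in the text (all approximate).
MIR290_COPIES = 7600        # miR-290 family miRNA copies per cell (Chen, 2007)
SITES_PER_DECOY = 2         # miR-290 sites per Lats2 OriUTR (TargetScan 6.2)
ENDOGENOUS_TARGETS = 10500  # total miR-290 target transcripts per cell
MRE_PER_TARGET = 1.3        # average miR-290 MREs per endogenous target

# Decoy transcripts per cell: originally expressed, and remaining after
# miRNA-mediated degradation (rounded values from the text).
DECOYS_PER_CELL = {
    "bin3": {"expressed": 900, "remaining": 200},
    "bin4": {"expressed": 8000, "remaining": 2000},
}


def mre_budget(expressed, remaining):
    """Compare decoy MREs with the miRNA pool and with endogenous MREs."""
    endogenous_mres = ENDOGENOUS_TARGETS * MRE_PER_TARGET
    return {
        # Decoy sites offered per miR-290 copy, before degradation.
        "decoy_mres_per_mirna": expressed * SITES_PER_DECOY / MIR290_COPIES,
        # Extra sites relative to the endogenous pool (assumption: based on
        # the decoys that remain after degradation).
        "added_mre_fraction": remaining * SITES_PER_DECOY / endogenous_mres,
    }


budget = {name: mre_budget(**counts) for name, counts in DECOYS_PER_CELL.items()}
for name, b in budget.items():
    print(f"{name}: {b['decoy_mres_per_mirna']:.2f} decoy MREs per miR-290 copy, "
          f"{b['added_mre_fraction']:.0%} added to the endogenous MRE pool")
```

For bin4 the added fraction comes out at about 29%, in line with the ~30% quoted above: the decoy sites are numerous compared with the miRNA pool, yet they remain a minority of all miR-290 sites in the cell.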
At the transcriptome level, even though the number of overexpressed MREs was comparable to or much larger than endogenous miRNA expression, the high abundance of endogenous miRNA targets made it difficult to observe crosstalk between ceRNAs. Our study is consistent with the high target site abundance (TA) model proposed by Denzler et al. (2014), which states that changes in ceRNAs must begin to approach the TA of a miRNA before they can exert a consequential effect on the repression of targets of that miRNA. Interestingly, siRNA knockdown of endogenous PTEN, which was expressed at fewer than 40 transcripts per cell, was sufficient to induce significant ceRNA crosstalk (Apratim Sahay, personal communication). It would be intriguing to see whether subcellular localization and downstream signaling processes could amplify transcript perturbations.

[Figure 5.2: histograms of log2 FC for miR-290 targets, one panel per bin; p-values printed in the panels: Bin 1, 0.27; Bin 2, 3.3e-06; Bin 3, 0.59; Bin 4, 0.53; plot residue removed.]

Figure 5.2 Distributions of LFCs for miR-290 family targets are not affected by Lats2 UTR decoy overexpression. The log fold change distribution of bin X (red stairs) was overlaid on that of the bin0 background control (grey bars) for miR-290 targets, and the means of the LFCs were plotted as a red line and a grey dashed line respectively. A two-sample KS test was performed between the two bins, and the differences were not significant. The small p-value for bin2 is due to the small variation of the LFCs in bin2 (Supplementary Figure 5.5).

[Figure 5.3: CDF curves for bin 0 to bin 4 in two panels titled "miR-290 targets" and "all genes", with log2(fold changes) on the x-axis; plot residue removed.]

Figure 5.3 Cumulative distribution function (CDF) of log2(fold changes) for all genes and miR-290 family targets.
CDFs of LFCs from bin0 to bin4 are shown for all genes (left) and miR-290 targets (right). No consistent shift of the LFCs to the right was observed for miR-290 targets, and the differences between bins for miR-290 targets were no larger than those for the all-genes control. The biggest change comes from bin2 (magenta line) and is caused by the smaller variation in LFCs across all sequenced genes in bin2 (Supplementary Figure 5.5).

5.4 Methods

5.4.1 FACS cell sorting

Wild-type mESCs were co-transfected with plasmids pCAG-d2eGFP-Lats2 Ori/MutUTR and pCAG-mCherry. Two days after transfection, cells were harvested and sorted into five fractions (~100,000 cells each) on a FACSAria III cell sorter (BD Biosciences) according to mCherry intensity.

5.4.2 RNA sequencing

RNA from the cells in each fraction was extracted using Trizol LS (Life Technologies). Sequencing libraries were prepared from the isolated RNA using the Illumina TruSeq Stranded mRNA kit and sequenced on an Illumina HiSeq 2500 sequencer. Sequencing results were mapped using Bowtie v2.2.0 to an RNA sequence library consisting of RefSeq mRNAs (mm10) except Lats2, plus the gfp, Lats2 CDS, Lats2 OriUTR, Lats2 MutUTR, and mCherry sequences. Reads per kilobase per million mapped reads (RPKM) were calculated for all transcripts, and transcript isoforms were then aggregated to gene symbols. Both RNA-Seq and smFISH data are available for the following genes: Pou5f1, Sox2, Nanog, Lin28a, Lats2, Casp2, Rbl2, and Cdkn1a (data not shown). Based on their regression, 1 RPKM roughly corresponds to 1 transcript per cell. For further analysis we considered only genes expressed above 1 RPKM, because lowly expressed genes are susceptible to small-number deviations and have highly variable LFCs.

5.4.3 MicroRNA targets selection

Targets of a given miRNA family were predicted using TargetScanMouse version 6.2 (http://www.targetscan.org/mmu_61/).
Target sets defined by three different criteria (conserved targets only, top 1000 targets, and top 500 targets) were extracted for each miRNA family. The top miRNA targets were ranked by total context score (i.e. site efficacy), irrespective of site conservation. Targets were also filtered by a gene expression threshold of 1 RPKM. Our conclusions are independent of the target set chosen, and all figures are shown for the top 1000 targets.

5.4.4 Targeted Mass Spectrometry

A dozen computationally predicted top targets of miR-290 miRNAs (LATS2, RBL2, CASP2, P21, TGFBR2, TET1, E2F2, EDNRB), together with mCherry, d2EGFP, and GAPDH, were selected for targeted MS monitoring. At least three interference-free, sequence-specific fragment ions (transition ions) were selected for each protein to be detected. Sorted cells were resuspended in lysis buffer (provided by Nikolai) and sent for targeted MS at a Harvard mass spectrometry core facility on an Agilent 6460 Triple Quadrupole Mass Spectrometer with an Agilent 1290 uHPLC. Data were analyzed using the Skyline Targeted Proteomics Environment.

5.5 Supplementary

Sample   gfp       Lats2 UTR   mCherry
Ori0     91.25     4.29        5.37
Ori1     84.62     10.77       24.26
Ori2     110.44    32.86       123.74
Ori3     291.44    146.96      572.71
Ori4     2818.43   1780.41     4705.22
Mut0     86.31     13.84       5.13
Mut1     105.26    39.38       22.82
Mut2     240.86    205.73      121.89
Mut3     839.31    904.79      446.89
Mut4     7557.12   8499.21     3865.01

Table 5.2 gfp, Lats2 UTR, and mCherry RPKM reads for sorted samples. The reads for Lats2 UTR are the sum of the Lats2 OriUTR and Lats2 MutUTR reads.

[Supplementary Figure 5.1: three bar-plot panels, (a) GFP reporter, (b) Lats2 UTR, (c) mCherry, across samples Ori0, Mut0, Ori1, Mut1, Ori2, Mut2, Ori3, Mut3, Ori4, Mut4; plot residue removed.]

Supplementary Figure 5.1 Histograms of RPKM reads for gfp, Lats2 UTR, and mCherry. (a) RNA-Seq reads for gfp have a relatively high background, yet the miRNA-mediated repression at the transcriptional level is explicit, as reads from Ori samples are always lower than reads from the corresponding Mut samples.
(b) RNA-Seq reads for Lats2 UTR also reflect the miRNA repression at the mRNA level. The difference between Ori0 and Mut0 reflects the different stabilities of OriUTR and MutUTR. (c) In contrast to the miRNA activity reporter, RNA expression of the indicator mCherry is comparable between Ori and Mut samples.

[Supplementary Figure 5.2: bar plots of Beta-Actin and Gapdh expression across samples Ori0, Mut0, ..., Ori4, Mut4; plot residue removed.]

Supplementary Figure 5.2 Expression of housekeeping genes is homogeneous across sorted samples. RNA-Seq reads for the housekeeping genes Beta-Actin and Gapdh are homogeneous across sorted samples.

[Supplementary Figure 5.3: histograms of log2 FC for all genes, one panel per bin (Bin1 to Bin4), each with a KS-test p-value; the panel-to-value assignment is not reliably recoverable from the extracted text; plot residue removed.]

Supplementary Figure 5.3 Distributions of LFCs for all genes are not affected by Lats2 UTR decoy overexpression. The log fold change distribution of bin X (red stairs) was overlaid on that of the bin0 background control (grey bars) for all RNA-Seq genes, and the means of the LFCs were plotted as a red line and a grey dashed line respectively. Since the power of the KS test depends on sample size, 500 genes (similar to the total number of miRNA targets after expression filtering) were sub-sampled from all genes. A two-sample KS test was performed between bin X and bin0, and the differences between samples were not significant.

[Supplementary Figure 5.4, upper panel: TargetScan map of the mouse Lats2 3'UTR; plot residue removed.]
[Supplementary Figure 5.4, continued: conserved sites for miRNA families broadly conserved among vertebrates on the mouse Lats2 3'UTR (site labels legible in the residue include miR-135, miR-25/32/92/363/367, miR-203, miR-103/107, miR-200bc/429, and miR-31), followed by histograms of log2 FC for the sample pairs Ori1/Mut1 to Ori4/Mut4 with KS-test p-values; the panel-to-value assignment is not reliably recoverable; plot residue removed.]

Supplementary Figure 5.4 miRNAs targeting the Lats2 3'UTR and an example of LFCs for miR-15 targets. MicroRNAs targeting the Lats2 3'UTR were predicted by TargetScanMouse version 6.2 (http://www.targetscan.org/mmu_61/). The targets of all regulating miRNAs with sites broadly conserved among vertebrates (shown above) were tested for significant shifts in their LFC distributions. The results are negative; miR-15/16 targets are shown as an example, with p-values given by the KS test.

[Supplementary Figure 5.5: boxplot of genome-wide FC for the sample pairs Ori0/Mut0 to Ori4/Mut4; plot residue removed.]

Supplementary Figure 5.5 Boxplot of LFCs for sorted samples. LFC was defined as LFC = log2(gene expression in OriUTR transfection / gene expression in MutUTR transfection). The LFCs for all 5 pairs of bins were centered on 0 with few outliers. The variation for bin2 was significantly smaller than for the other 4 bins, due to experimental variation of RNA-Seq.
[Supplementary Figure 5.6: bar plot of total reads with and without plasmid reads for each sorted sample; plot residue removed. The accompanying table is reproduced below.]

Sample   Plasmid reads   Total reads w/o plasmids (x 10^7)   Plasmids / total reads
Ori0     1080            1.22                                0.01%
Ori1     1273            1.18                                0.01%
Ori2     2807            1.15                                0.02%
Ori3     9750            1.04                                0.09%
Ori4     114445          1.26                                0.90%
Mut0     923             0.93                                0.01%
Mut1     2579            1.51                                0.02%
Mut2     6143            0.97                                0.06%
Mut3     22421           0.88                                0.25%
Mut4     259108          1.09                                2.33%

Supplementary Figure 5.6 Expression from transfected plasmids constitutes only a small fraction of total transcript reads from ESCs. Total RNA-Seq reads for all sorted samples were on the order of 10^7 RPKM. Transcripts expressed from the transfected plasmids constitute only a small fraction of total reads, with a percentage < 2.5% even for the highest expressed bins.

5.6 References

Alimonti, A., Carracedo, A., Clohessy, J.G., Trotman, L.C., Nardella, C., Egia, A., Salmena, L., Sampieri, K., Haveman, W.J., Brogi, E., et al. (2010). Subtle variations in Pten dose determine cancer susceptibility. Nat Genet 42, 454-458.

Cazalla, D., Yario, T., and Steitz, J.A. (2010). Down-regulation of a host microRNA by a Herpesvirus saimiri noncoding RNA. Science 328, 1563-1566.

Chen, C. (2007). Defining embryonic stem cell identity using differentiation-related microRNAs and their potential targets. Mamm Genome 18, 316-327.

Chiu, H.S., Llobet-Navas, D., Yang, X.R., Chung, W.J., Ambesi-Impiombato, A., Iyer, A., Kim, H.R., Seviour, E.G., Luo, Z.J., Sehgal, V., et al. (2015). Cupid: simultaneous reconstruction of microRNA-target and ceRNA networks. Genome Research 25, 257-267.

Denzler, R., Agarwal, V., Stefano, J., Bartel, D.P., and Stoffel, M. (2014). Assessing the ceRNA Hypothesis with Quantitative Measurements of miRNA and Target Abundance. Molecular Cell 54, 766-776.

Ebert, M.S., Neilson, J.R., and Sharp, P.A. (2007). MicroRNA sponges: competitive inhibitors of small RNAs in mammalian cells. Nature Methods 4, 721-726.
Hansen, T.B., Jensen, T.I., Clausen, B.H., Bramsen, J.B., Finsen, B., Damgaard, C.K., and Kjems, J. (2013). Natural RNA circles function as efficient microRNA sponges. Nature 495, 384-388.

Li, J.-H., Liu, S., Zhou, H., Qu, L.-H., and Yang, J.-H. (2013). starBase v2.0: decoding miRNA-ceRNA, miRNA-ncRNA and protein-RNA interaction networks from large-scale CLIP-Seq data. Nucleic Acids Research.

Libri, V., Helwak, A., Miesen, P., Santhakumar, D., Borger, J.G., Kudla, G., Grey, F., Tollervey, D., and Buck, A.H. (2012). Murine cytomegalovirus encodes a miR-27 inhibitor disguised as a target. Proceedings of the National Academy of Sciences of the United States of America 109, 279-284.

Marson, A. (2008). Connecting microRNA genes to the core transcriptional regulatory circuitry of embryonic stem cells. Cell 134, 521-533.

Memczak, S., Jens, M., Elefsinioti, A., Torti, F., Krueger, J., Rybak, A., Maier, L., Mackowiak, S.D., Gregersen, L.H., Munschauer, M., et al. (2013). Circular RNAs are a large class of animal RNAs with regulatory potency. Nature 495, 333-338.

Poliseno, L. (2010). A coding-independent function of gene and pseudogene mRNAs regulates tumour biology. Nature 465, 1033-1038.

Salmena, L., Poliseno, L., Tay, Y., Kats, L., and Pandolfi, P.P. (2011). A ceRNA hypothesis: the Rosetta Stone of a hidden RNA language? Cell 146, 353-358.

Sumazin, P. (2011). An extensive microRNA-mediated network of RNA-RNA interactions regulates established oncogenic pathways in glioblastoma. Cell 147, 370-381.

Tay, Y. (2011). Coding-independent regulation of the tumor suppressor PTEN by competing endogenous mRNAs. Cell 147, 344-357.

Tay, Y., Rinn, J., and Pandolfi, P.P. (2014). The multilayered complexity of ceRNA crosstalk and competition. Nature 505, 344-352.

Wang, Y., Xu, Z., Jiang, J., Xu, C., Kang, J., Xiao, L., Wu, M., Xiong, J., Guo, X., and Liu, H. (2013). Endogenous miRNA Sponge lincRNA-RoR Regulates Oct4, Nanog, and Sox2 in Human Embryonic Stem Cell Self-Renewal.
Developmental Cell 25, 69-80.

Wang, Z., Gerstein, M., and Snyder, M. (2009). RNA-Seq: a revolutionary tool for transcriptomics. Nature Reviews Genetics 10, 57-63.

Wee, L.M., Flores-Jasso, C.F., Salomon, W.E., and Zamore, P.D. (2012). Argonaute Divides Its RNA Guide into Domains with Distinct Functions and RNA-Binding Properties. Cell 151, 1055-1067.

Chapter 6

Double hybridization of GFP-Lin28a3'UTR transcript reveals a novel expression pattern

6.1 Abstract

By hybridizing the CDS and 3'UTR of reporter plasmid transcripts with smFISH probes of different colors and studying their co-localization under the microscope, we discovered an expression-dependent threshold behavior for the GFP-Lin28a3'UTR transcript. A significant number of isolated gfp transcripts lacking the Lin28a 3'UTR tail were found. Below an expression threshold of 100 gfp mRNA molecules, the probability of a GFP spot having a co-localized Lin28a 3'UTR tail was highly variable, between 0 and 1. Above the threshold, the co-localization probability was always high. This phenomenon was not caused by non-specific binding of GFP probes, nor by imperfect hybridization or detection efficiencies. Stable integration of GFP-Lin28a3'UTR into the genome recapitulated the expression pattern, with an even higher transition threshold. Transfection of the plasmid into Dgcr8-/- ESCs partially rescued but did not abolish the threshold behavior. The mechanism behind this novel expression pattern remains unknown, and the possibility of alternative polyadenylation (APA) is discussed at the end.

6.2 Results

The microRNA activity reporter plasmid pCAG-d2eGFP-Lin28a3'UTR was transiently transfected into wild-type ESCs. We hybridized eGFP with Alexa probes and the Lin28a3'UTR region with Cy5 probes. For intact transcripts expressed from the transfected plasmid, we expect to see co-localization of an eGFP spot with a Lin28a3'UTR spot, and an overlay of the two channel images would yield a yellow spot (Figure 6.1).
However, we observed many isolated green spots (Figure 6.2), which would correspond to gfp transcripts without the Lin28a 3'UTR tail. The majority of isolated red spots would correspond to endogenous Lin28a transcripts rather than decoy transcripts lacking the 5' GFP head: the expression distribution of isolated red spots was compared with the distribution of endogenous Lin28a expression, and the two are statistically indistinguishable (data not shown).

[Figure 6.1: schematic showing the transfection plasmid, endogenous Lin28a, the FISH probe set (eGFP-Alexa and Lin28-3'UTR-Cy5), the decoy mRNA with miRISC, and the microscopy spot colors with their interpretation: intact decoy transcript; endogenous Lin28a or decoy transcript without the 5' head; decoy transcript without the 3' tail; figure residue removed.]

Figure 6.1 Schematic of the GFP-Lin28a3'UTR co-localization experiment. The pCAG-d2eGFP-Lin28a3'UTR plasmid was transiently transfected into wild-type ESCs, and cells were hybridized with the eGFP-Alexa probe (green) and the Lin28a3'UTR-Cy5 probe (red). The Cy5 probe binds to the 3'UTR region of both transgene transcripts and endogenous Lin28a transcripts. An intact transcript expressed from the transfected plasmid would appear as co-localized green and red spots, i.e. a yellow spot.

Figure 6.2 Representative images of abnormally low co-localization of eGFP (green) and Lin28a3'UTR (red) spots. Cells are arranged in order of increasing eGFP expression. Small yellow spots correspond to co-localized red and green spots, i.e. intact transcripts transcribed from the reporter plasmid. Big yellow spots correspond to active transcription sites on transfected plasmids.

We then set out to quantify the co-localization of eGFP and Lin28a3'UTR spots. We noticed that the percentage of eGFP spots having a co-localized Lin28a3'UTR tail depended on gfp expression.
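The co-localization percentage quantified in this section can be sketched as follows, using the criteria given in Methods (Section 6.4.2): a 2-D squared distance of at most 5 pixels², a z distance of at most 1 stack, one-to-one pairing, and an iteratively re-estimated shift between channels. This is an illustrative sketch, not the thesis code. The function names, the (x, y, z) tuple format, the nearest-first resolution of competing matches, the rounding of the shift to whole pixels, and the cap on the number of rounds are all assumptions.

```python
def match_spots(green, red, shift=(0, 0, 0), max_d2=5, max_dz=1):
    """Pair green (eGFP) and red (3'UTR) spots one-to-one.

    Spots are (x, y, z) tuples; `shift` is the red-minus-green channel
    offset, subtracted from the red positions before comparing.
    """
    candidates = []
    for i, (gx, gy, gz) in enumerate(green):
        for j, (rx, ry, rz) in enumerate(red):
            dx = rx - shift[0] - gx
            dy = ry - shift[1] - gy
            dz = rz - shift[2] - gz
            d2 = dx * dx + dy * dy
            if d2 <= max_d2 and abs(dz) <= max_dz:
                candidates.append((d2, abs(dz), i, j))
    candidates.sort()  # assumption: the closest candidate pairs win
    used_green, used_red, pairs = set(), set(), []
    for _, _, i, j in candidates:
        if i not in used_green and j not in used_red:
            used_green.add(i)
            used_red.add(j)
            pairs.append((i, j))
    return pairs


def colocalize(green, red, max_rounds=10):
    """Iterate matching and channel-shift estimation until the shift is stable."""
    shift = (0, 0, 0)
    pairs = match_spots(green, red, shift)
    for _ in range(max_rounds):
        if not pairs:
            break
        # Average red-minus-green offset of the current pairs, rounded to
        # whole pixels/stacks (assumption).
        new_shift = tuple(
            round(sum(red[j][k] - green[i][k] for i, j in pairs) / len(pairs))
            for k in range(3)
        )
        if new_shift == shift:
            break
        shift = new_shift
        pairs = match_spots(green, red, shift)
    fraction = len(pairs) / len(green) if green else float("nan")
    return pairs, shift, fraction
```

The returned fraction is the number of paired eGFP spots divided by all eGFP spots in the cell, i.e. the quantity that is plotted against the total gfp spot count per cell.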
A co-localized spot is defined as one eGFP spot and one Lin28a3'UTR spot located within a 2-D squared distance of 5 pixels² of each other (after correcting for the shift between channels) and differing in the z direction by no more than 1 stack (see Methods). The co-localization percentage is plotted against the total gfp expression within the cell, which is the sum of isolated and co-localized GFP spots. The co-localization exhibits a sharp transition (Figure 6.3). Specifically, the co-localization percentage stabilizes at ~0.73 ± 0.13 above 100 gfp mRNA molecules per cell. Below the threshold, the co-localization is highly variable, ranging from 0 to 1, and a significant fraction of cells have a low co-localization percentage (<0.5). This expression pattern could not be explained by imperfect hybridization and detection efficiencies (Supplementary Figure 6.1) or by non-specific binding of GFP probes (Supplementary Figure 6.2). Nor could the sharp increase in co-localization percentage at high mRNA levels (>100 transcripts per cell) be explained by an increased chance of random co-localization at higher spot densities (Supplementary Figure 6.3).

[Figure 6.3: scatter plot of co-localization percentage (y-axis) against GFP mRNA decoy number per cell (x-axis, 0 to 1000); plot residue removed.]

Figure 6.3 Co-localization of gfp mRNA with the Lin28a3'UTR tail. pCAG-d2eGFP-Lin28a3'UTR was transiently transfected into wild-type ESCs, and the percentage of gfp transcripts with a co-localized Lin28a3'UTR tail is plotted against gfp mRNA levels. The co-localization exhibits a sharp transition. Below ~100 gfp transcripts per cell, the co-localization is highly variable, and many cells have a low co-localization percentage. Above the threshold, the co-localization percentage stabilizes around 0.73.
The red lines are plotted as a guide to the eye.

To rule out the possibility that the threshold behavior is an artifact of transient transfection, we stably integrated the d2eGFP-Lin28a3'UTR transgene into V19 ESCs (see Methods) and induced its expression at a dox concentration of 2 µg/ml. A significant fraction of cells with moderate to high gfp expression exhibit a low co-localization percentage (Figure 6.4, below y ~ 0.5). Interestingly, the gfp expression threshold below which low co-localization is tolerated is even larger than that for transient transfection, at around 200 gfp transcripts per cell. The marginal distribution of the co-localization percentage is bimodal. One fraction of cells centers around a co-localization percentage of 0.73; this high co-localization mode is likely to contain only intact transcripts, with the imperfect co-localization percentage merely reflecting imperfect detection efficiency. Another significant fraction of cells have low co-localization, peaking around 0.25; these cells are likely to express a combination of intact transcripts and short-UTR transcripts. The threshold also seemed to depend on dox concentration and induction time (data not shown), and the cellular background of V19 ESCs was confirmed to be clean (no gfp expression under dox induction before transgene integration). In addition, transient transfection of neither GFP-Lats2UTR nor GFP-Casp2UTR exhibits the threshold behavior (Supplementary Figure 6.4), indicating that transient transfection itself generally ensures expression of the full transcript from the delivered plasmid and that the threshold behavior is specific to the Lin28a 3'UTR.

[Figure 6.4: scatter plot titled "V19 ESCs" of co-localization percentage (y-axis) against GFP mRNA level per cell (x-axis, 0 to 600); plot residue removed.]

Figure 6.4 Co-localization of GFP mRNA with the Lin28 3'UTR tail for stably integrated d2eGFP-Lin28a3'UTR in V19 ESCs.
GFP-Lin28a3'UTR was stably integrated into the collagen locus of V19 ESCs, and its expression was induced by doxycycline. A significant fraction of cells express isolated GFP transcripts without the Lin28a 3'UTR tail, and the marginal distribution of the co-localization percentage is bimodal.

To study whether the threshold behavior is miRNA dependent, we performed the transfection experiment in Dgcr8 knockout ESCs and compared the threshold with that in wild-type ESCs. The threshold is smaller in Dgcr8-/- ESCs, but it is not entirely abrogated (Figure 6.5). Both experiments were performed three times, with both cell lines measured in parallel; the difference between cell lines was validated and the experimental reproducibility verified. To further evaluate whether miRNA regulation directly affects the transition behavior, we mutated all significant MREs on the Lin28a 3'UTR, transfected pCAG-d2eGFP-Lin28MutUTR into wild-type ESCs, and compared the threshold with that of the original UTR in wild-type ESCs. As in Dgcr8-/- ESCs, the transition is shifted to a smaller value (Supplementary Figure 6.5).

To explain the miRNA dependency of the transition threshold, a simple Michaelis-Menten kinetics model was proposed (see Supplementary model), in which the threshold is predicted to depend on the effective miRNA concentration. According to model fitting, the effective miRNA concentration in wild-type ESCs is predicted to be 5-fold higher than in Dgcr8-/- ESCs (Supplementary Figure 6.8). We therefore performed Taqman qRT-PCR to measure the expression of miRNAs targeting the Lin28a 3'UTR. However, miRNA expression in Dgcr8-/- ESCs was confirmed to be zero, which is not proportional to the predicted effective miRNA concentration (Supplementary Figure 6.7). We also attempted to shift the threshold towards higher values by increasing the effective miRNA concentration.
We co-transfected the plasmid with let-7 miRNA mimics, or performed the transfection together with retinoic acid differentiation, during which endogenous let-7 expression has been reported to increase. However, both experiments proved difficult, as let-7 repression of the Lin28a 3'UTR strongly decreased transgene expression (data not shown).

[Figure 6.5: scatter plot titled "Dgcr8KO ESCs" of co-localization percentage (y-axis) against GFP mRNA decoy number per cell (x-axis, 0 to 1000); plot residue removed.]

Figure 6.5 Co-localization of gfp mRNA with the Lin28a 3'UTR tail in Dgcr8-/- cells. pCAG-d2eGFP-Lin28a3'UTR was transiently transfected into Dgcr8-/- ESCs, which are devoid of mature miRNA expression. The sharp transition in co-localization persists, but the transition threshold is shifted towards a smaller value. The red line corresponds to the threshold in WT ESCs and is plotted as a reference.

To see whether the co-localization transition also occurs in the natural context, we hybridized the Lin28a CDS with an Alexa probe and the Lin28a 3'UTR with a Cy5 probe, and measured the co-localization of the coding region with its 3'UTR tail. Although very rare (<1%), low co-localization does occur in the natural context (Supplementary Figure 6.6). It should be noted that, owing to the ~70% sequence similarity between the coding regions of Lin28a and Lin28b, the CDS probe cannot distinguish between the two homologs (the 3'UTRs of the two genes are very different, though), so the x value might be slightly skewed. By quantifying the co-localization percentage of Lin28a UTRs with a CDS spot, we found Lin28a to be the dominant form in wild-type ESCs. Lin28 expression is much lower in Dgcr8-/- ESCs, with a much higher proportion of Lin28b expression (data not shown).

6.3 Discussion

We discovered that co-localization of the CDS and UTR regions of GFP-Lin28a3'UTR exhibited a threshold behavior depending on transcript expression.
The shift of the threshold towards smaller values for Dgcr8-/- cell transfection and for transfection of the Lin28a MutUTR suggested that the expression pattern was miRNA dependent, and a miRNA-mediated decay mechanism was proposed to explain the phenomenon (Supplementary model). However, the lack of proportionality between the transition threshold and miRNA expression in wild-type versus Dgcr8-/- cells calls this hypothesis into question. Moreover, the stably integrated GFP-Lin28a3'UTR was able to exceed the threshold of wild-type ESCs without perturbing endogenous miRNA expression. It is also hard to imagine how an mRNA in the middle of degradation could be stable enough to be observed.

Here we discuss the possibility of alternative polyadenylation (APA). APA is a widespread regulatory mechanism that controls gene expression and expands protein diversity. Earlier studies based on expressed sequence tag (EST) databases estimated that 54% and 32% of human and mouse transcripts were alternatively cleaved (Tian et al., 2005), and more recent studies based on deep sequencing have brought the current estimate for human genes to 70-75% (Derti et al., 2012; Shi, 2012). The most common type of APA is 3'UTR APA, which uses alternative poly(A) sites (PASs) located within the same terminal exon and produces mRNA isoforms with 3'UTRs of different lengths without affecting the encoded protein. The Lin28a OriUTR sequence with APA sites highlighted is shown in the Supplementary sequence information (Tian et al., 2005). Evidence of APA for Lin28 orthologues in human and chicken has been documented in APADB (http://tools.genxpro.net/apadb/), a database of mammalian alternative polyadenylation determined by 3'-end sequencing. If one of the proximal APA sites on the Lin28a OriUTR is used, the shorter UTR may not be long enough to bind a sufficient number of probes to be resolved as a diffraction-limited spot, and the alternative transcript form may appear as an isolated GFP CDS spot.
Analysis of diverse human tissues and cell lines has demonstrated a substantial anti-correlation between proliferation and 3'UTR length caused by APA. Examples include T cell activation (Sandberg et al., 2008) and various cancer cells (Mayr and Bartel, 2009). Progressive lengthening of 3'UTRs through APA modulation was observed during mouse embryonic development (Ji et al., 2009), and the generation of induced pluripotent stem cells (iPSCs) from differentiated cells was accompanied by global 3'UTR shortening (Ji and Tian, 2009). The use of proximal PASs in mESCs is therefore certainly possible. Dgcr8-/- ESCs lack all mature canonical miRNAs and have prolonged cell cycles, and the slower proliferation rate might explain the reduced selection of proximal APA sites in this cell line. Incidentally, we found that one of the most proximal APA sites is mutated in the Lin28a MutUTR (Supplementary sequence information), which could potentially explain the differences between OriUTR and MutUTR. The selection of APA sites also depends on many factors, such as extracellular stimuli (Shell et al., 2005), transcription activity, chromatin modifications, regulatory proteins such as splicing and 3'-end-processing factors, and RNA-binding proteins (RBPs), as reviewed in Elkon et al. (2013). Those factors might explain the variations between transiently transfected and stably integrated transcripts, and between the artificially constructed GFP-Lin28a3'UTR and endogenous Lin28a. Finally, we propose that by hybridizing different regions of a transcript with probes of different colors and studying their co-localization, APA could be studied in a very quantitative manner.

6.4 Methods

6.4.1 Taqman microRNA expression measurements

Small RNA (<200 nt) was isolated from wild-type and Dgcr8-/- mESCs using the mirVana miRNA Isolation Kit (Ambion AM1560). Expression of the mature microRNAs regulating the Lin28a 3'UTR was assayed using Taqman microRNA assays (Life Technologies).
6.4.2 Co-localized spots detection

eGFP-Alexa and lin28UTR-Cy5 spots were first detected independently using the standard FISH spot detection algorithm (see 2.4.11), and spot positions were recorded. Alexa and Cy5 spots within a 2-D squared distance of 5 pixels² and within an absolute z-distance of 1 stack were considered co-localized. Each Alexa spot could be co-localized with at most one Cy5 spot and vice versa. The average position shift over all co-localized spots within one image was calculated and taken as the shift between the Alexa and Cy5 channels. Spot positions were corrected for this channel shift before the next round of co-localization assignment. The iteration stopped when two consecutive rounds yielded the same channel shift, which usually happened within 3 rounds. A typical Z-projected image and its detected spots are shown in Figure 6.6.

[Figure 6.6 image: detected eGFP Alexa spots and lin28 Cy5 spots]

Figure 6.6 Z projection of a microscopy image and its computationally detected spots. Images of different fluorescent channels are not perfectly aligned but usually have a minor shift, which was calculated by taking the average position shift of co-localized spots. Spot positions were shift corrected for the next round of co-localization assessment. In this case, the shift between the Alexa and Cy5 channels was calculated to be [0, 1, 0].

6.4.3 Stable Integration

Step 1. Insert d2eGFP-lin28a3'UTR behind the pTet regulator of the ptet.splicePL3 plasmid to create a Tet-On system for d2eGFP-lin28a3'UTR expression.

d2eGFP-lin28a3'UTR was PCR amplified out of the pCAG-d2eGFP-lin28a3'UTR plasmid with
Forward primer G-EcoRI-d2eGFP: GGAATTCACCGGTCGCCACCATGGT
Reverse primer 1 lin28UTR-SpeI-CC r.c.: GGACTAGTAGATCCCAGTACCAACTCTGGAG
Reverse primer 2 lin28UTR-RBGpA-SpeI-CC r.c.: GGACTAGTGATCTCCATAAGAGAAGAGGGACAGC
r.c.: reverse complementary.
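As a quick sanity check on the primer design above, the sketch below (an illustration added here, not part of the original protocol) confirms that the forward primer carries an EcoRI site, that reverse primer 1 carries a SpeI site, and that the 3' portion of reverse primer 1 is the reverse complement of the end of the Lin28a OriUTR. The recognition sequences GAATTC (EcoRI) and ACTAGT (SpeI) are the standard ones; the helper name `reverse_complement` and the 28-nt UTR excerpt (taken from the end of the OriUTR listing in Section 6.5.3) are choices made for this example.

```python
# Sanity check of the cloning primers (illustrative sketch, not the original workflow).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


FORWARD = "GGAATTCACCGGTCGCCACCATGGT"          # G-EcoRI-d2eGFP
REVERSE_1 = "GGACTAGTAGATCCCAGTACCAACTCTGGAG"   # lin28UTR-SpeI-CC r.c.

# Excerpt from the end of the Lin28a OriUTR listing (Section 6.5.3),
# up to the gap that precedes "CTTCTC..." in that listing.
UTR_TAIL = "ACTTATTGGTACTCCAGAGTTGGTACTG"

ECORI_SITE = "GAATTC"
SPEI_SITE = "ACTAGT"

has_ecori = ECORI_SITE in FORWARD
has_spei = SPEI_SITE in REVERSE_1

# Read reverse primer 1 on the sense strand, then find the longest prefix
# of it that occurs in the UTR excerpt (the annealing portion).
sense = reverse_complement(REVERSE_1)
anneal_len = 0
for n in range(len(sense), 0, -1):
    if sense[:n] in UTR_TAIL:
        anneal_len = n
        break

print("EcoRI site in forward primer:", has_ecori)
print("SpeI site in reverse primer 1:", has_spei)
print("annealing portion on sense strand:", sense[:anneal_len], f"({anneal_len} nt)")
```

The remaining 5' bases of each primer are non-annealing overhang that carries the restriction site used for cloning.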
ptet.splicePL3-OSKMpA was triple digested with EcoRI, SpeI and SphI. The SphI digestion further fragmented the OSKMpA insert so that it could be distinguished from the ~5.1 kb ptet.splicePL3 backbone. The PCR-amplified d2eGFP-lin28UTR fragment was double digested with EcoRI and SpeI and ligated into the ptet.splicePL3 backbone. Positive clones were further confirmed by sequencing with the pTet-splice sequence primer: AGTGAAAGTCGAGCTCGGTA.

[Figure 6.7 image: circular map of pTet-SplicePL3 (5.2 kb) and linear map of the region after insertion. Features labelled on the linear map: NotI(1), TRE(117, 324), CMV2_promoter(325, 444), d2EGFP(474, 1319), NotI(1321), SV40_int(4331, 4346), SV40_3_splice(4361, 4399), SV40PA-terminator(4975, 5260), NotI(5867)]

Figure 6.7 Plasmid map of ptet.splicePL3 and the regional linear map after insertion of d2eGFP-lin28a3'UTR. d2eGFP-lin28a3'UTR was cloned into the ptet.splicePL3 plasmid with EcoRI and SpeI. TRE: tetracycline response element.

Step 2. Cut ptet.splicePL3-d2eGFP-lin28a3'UTR with NotI, and clone into the mCol.loxneo plasmid.

This step added homologous recombination arms to the transgene for genomic integration. It also added a drug resistance cassette for positive selection of integrants. The mCol.loxneo plasmid was digested with NotI and dephosphorylated with Antarctic phosphatase (NEB M0289S) to prevent vector self-ligation. Since one NotI site exists between d2eGFP and lin28a3'UTR, a partial digestion was performed (1 U NotI per µg plasmid, digestion at 37 °C for 15 min, heat inactivation at 65 °C for 20 min), and the 5.85 kb fragment was selected. In addition, since the supercoiled form of the ptet.splicePL3-d2eGFP-lin28a3'UTR plasmid runs at about the same position on the gel as the desired fragment, the plasmid was first digested with ScaI and PvuI, and only the ~7.7 kb linearized fragment was gel extracted for the downstream partial digestion.
Positive clones from the ligation were digested with Sac to check the orientation of the insert, and clones with the correct orientation were confirmed by sequencing with the mCol.loxneo sequence primer: TCGCATTGTCTGAGTAGGTGT.

[Figure 6.8 image: plasmid map of mCol1a1.loxNEO (17959 bp; 5' arm, 3' arm, ApR, pgk-NEO-pA, LoxP, NotI and PvuI sites) and schematic of homologous recombination into the collagen locus]

Figure 6.8 Plasmid map of mCol.loxneo and illustration of homologous recombination of the transgene into the collagen locus. The mCol.loxneo-d2eGFP-lin28a3'UTR plasmid was linearized with PvuI; its 5' and 3' ends contain arms homologous to the 3' untranslated region of the Col1a1 (collagen, type I, alpha 1) locus. The linearized vector also contains a pgk (phosphoglycerate kinase) driven neo (neomycin) resistance cassette for the selection of successfully targeted cells.

Step 3. Electroporation of the mCol.loxneo-d2eGFP-lin28a3'UTR plasmid into the V19 ES cell line, in which rtTA is constitutively expressed, and drug selection for stable integration.

The mCol.loxneo targeting vector contains both 5' and 3' homology arms directed at the 3' untranslated region of the Col1a1 locus as well as a pgk-driven neo resistance cassette for selection of transgenic cells. The resulting ~16.5 kb targeting construct (mCol.loxneo-d2eGFP-lin28a3'UTR) was linearized by PvuI restriction enzyme digestion (20 µg), precipitated, and resuspended in 1 ml of 1x PBS containing 500,000 V19 ESCs (V6.5 ESCs containing a reverse tetracycline transactivator (M2rtTA) driven by the Rosa26 promoter), and electroporated at 400 V, 25 µF for 1 pulse. The cells were plated onto two gelatinized 10-cm plates that had been pre-plated with neo-resistant MEFs (Global Stem). After 24 h, G418 (Geneticin(R), GIBCO 10131) was added to the ESC medium at a concentration of 350 µg per ml. Neo-resistant colonies were picked 10 days later, expanded, and tested with dox (doxycycline hyclate, Sigma-Aldrich D9891) induction at a concentration of 2 µg/ml.
Figure 6.9 Illustration of dox induction of d2eGFP-lin28a3'UTR expression in V19 ESCs. V19 ESCs are derived from a C57BL6/J background. The cell line is the same as V6.5 ESCs, except that it contains a reverse tetracycline trans-activator (M2rtTA) gene driven by the Rosa26 (reverse orientation splice acceptor) promoter, so that rtTA is constitutively expressed in V19 cells. The 'reverse' Tet repressor (rTetR) domain of rtTA binds TetOP (tetracycline/doxycycline-responsive operator) and activates the expression of d2eGFP-lin28a3'UTR in the presence of doxycycline.

The ptet.splicePL3-OSKMpA plasmid, the mCol.loxneo plasmid and V19 ESCs were all gifts from Laurie Boyer's lab, and were originally created by Rudolf Jaenisch's lab.

6.5 Supplementary

6.5.1 Supplementary figures

[Supplementary Figure 6.1 plot; x-axis: simulated GFP mRNA decoy # per cell]

Supplementary Figure 6.1 Simulated co-localization of gfp mRNA with the Lin28a3'UTR tail assuming a detection efficiency of 0.73. The hybridization and detection efficiency of transcript spots is not perfect. To rule out the possibility that imperfect resolution produces the observed threshold effect, we simulated the co-localization for various transcript expression levels. The probability that each GFP spot has a co-localized lin28 spot detected was set to 0.73, the same as the stabilized average co-localization percentage, and the co-localization of each spot was treated as independent of the others. 100 simulations were performed for each integer transcript level. The horizontal red line is the average co-localization percentage of the simulated dots. The vertical red line is the same as in Figure 6.3 and is plotted for visual guidance. The simulation indicates that the observed co-localization threshold effect is not due to imperfect detection efficiency.
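The simulation described in this caption can be sketched in a few lines. The version below is a minimal illustration, assuming each GFP spot independently gains a co-localized Lin28 spot with probability 0.73 and using 100 repeats per transcript level as stated; the function name, the random seed, and the three transcript levels shown are assumptions made here and do not reproduce the original analysis code.

```python
import random
import statistics


def simulate_colocalization(n_spots, p_detect=0.73, n_repeats=100, rng=None):
    """Simulate the co-localized fraction for cells containing n_spots GFP spots.

    Each spot independently has a detected co-localized partner with
    probability p_detect. Returns one co-localized fraction per simulated cell.
    """
    rng = rng or random.Random()
    fractions = []
    for _ in range(n_repeats):
        hits = sum(rng.random() < p_detect for _ in range(n_spots))
        fractions.append(hits / n_spots)
    return fractions


rng = random.Random(0)  # fixed seed so the sketch is reproducible
results = {}
for n in (10, 100, 1000):
    fractions = simulate_colocalization(n, rng=rng)
    results[n] = (statistics.mean(fractions), statistics.stdev(fractions))
    print(f"{n:5d} spots: mean = {results[n][0]:.3f}, sd = {results[n][1]:.3f}")
```

Under this null model the mean co-localized fraction stays near 0.73 at every expression level and only its spread shrinks as the spot number grows, which is why imperfect detection alone would not produce a threshold.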
Supplementary Figure 6.2 Low co-localization at low mRNA level is NOT due to non-specific binding of the GFP probes inside ESCs. (a) ESCs were mock transfected and hybridized with GFP-Alexa probes. No false positive GFP dots were observed in the absence of the reporter plasmid. (b) ESCs were transfected with reporter plasmids and hybridized with GFP-Alexa. Even the dots in the lowly expressing cells are authentic GFP spots. Both images are set to the same contrast.

[Supplementary Figure 6.3 plots; left panel: constant shift of GFP dots, constant offset = [10, -10, -2]; right panel: randomly generated GFP dots; x-axis: GFP mRNA decoys # per cell]

Supplementary Figure 6.3 Increased random chance of co-localization at higher dot densities does not produce the sharp increase of co-localization with the Lin28a 3'UTR. Left, GFP dots are shifted by a large constant offset and their co-localization percentage with the original Lin28-Cy5 dots is re-calculated. Right, the same number of GFP dots are randomly generated within the same cell, and the co-localization is re-calculated. Both ways of perturbing the GFP dot positions result in only a slight increase (<0.1) in co-localization percentage at the highest dot densities. In reality, the co-localization does not increase further at higher dot density because of poorer resolution for densely connected dots (>800 per cell). Hence the sharp increase in co-localization percentage of GFP dots with the Lin28a 3'UTR tail is not caused by increased random chance of co-localization at higher dot densities.

[Supplementary Figure 6.4 plots; panel titles: pCAG-eGFP-Lats2aUTR, pCAG-eGFP-Casp2UTR]
[Supplementary Figure 6.4 plots; x-axis: GFP mRNA number per cell]

Supplementary Figure 6.4 Co-localization of GFP mRNA with the Casp2 or Lats2 3'UTR tail. pCAG-d2eGFP-Casp2UTR or pCAG-d2eGFP-Lats2UTR was transiently transfected into wild-type ESCs. The GFP transcript was hybridized with Alexa probes and the 3'UTR of Casp2 or Lats2a was hybridized with Cy5 probes. The co-localization is calculated as in the Lin28a case. Neither UTR exhibits the threshold behavior, so the sharp transition is unique to the Lin28a 3'UTR.

[Supplementary Figure 6.5 plot; title: Lin28aMutUTR, WT ESCs; x-axis: GFP mRNA decoy # per cell]

Supplementary Figure 6.5 Co-localization of gfp mRNA with the mutated Lin28a 3'UTR tail in wild-type ESCs. MicroRNA response elements (MREs) on the Lin28a 3'UTR were mutated, and pCAG-d2eGFP-Lin28MutUTR was transiently transfected into wild-type ESCs. The sharp transitioning behavior of co-localization persists; however, the transitioning threshold is shifted towards a smaller value. The red line corresponds to the threshold in WT ESCs and is plotted as a reference.

[Supplementary Figure 6.6 plot; title: WT ESCs; x-axis: Lin28 CDS number per cell]

Supplementary Figure 6.6 Co-localization of endogenous Lin28a CDS with respect to its 3'UTR tail. The Lin28a CDS is hybridized with Alexa probes and the 3'UTR is hybridized with Cy5 probes. The co-localization of the endogenous coding sequence with respect to its 3'UTR tail is measured. Out of 66 cells expressing fewer than 100 endogenous Lin28a transcripts, two cells are observed to have a low co-localization percentage.

[Supplementary Figure 6.7 plots; first panel title: WT ESC, no MEF feeder layer]
[Supplementary Figure 6.7 plots; second panel title: Dgcr8KO ESC, no MEF feeder layer; assays on the x-axis: RNU6B, SNO202 (small RNA positive controls), let-7a, let-7c, miR-30, miR-125, miR-130 (regulating miRNAs), miR-294 (reference miRNA), cel-lin4, H2O (negative controls)]

Supplementary Figure 6.7 Taqman qRT-PCR measurements of selected mature microRNA expression. Expression of selected Mus musculus mature microRNAs was measured using Taqman microRNA assays. let-7a, let-7c, miR-30, miR-125 and miR-130 are all experimentally validated regulating miRNAs of the Lin28a 3'UTR. U6 RNA RNU6B and small nucleolar RNA SNO202 were used as normalizing controls. C. elegans miRNA cel-lin4 and water were used as negative controls. miR-294, the most abundantly expressed miRNA in ESCs, was measured as a reference. n=3 technical replicates for all measurements.

6.5.2 Supplementary model

miRNA-mediated-decay model to explain the co-localization transition curve

[Model schematic: gene, intact mRNA (m_c), degrading mRNA (m_i), degraded mRNA and protein, with rates k_polyA and k_d]

Intact mRNA (m_c) will be tagged with RNA FISH probes against both CDS and 3'UTR (yellow dot), and degrading mRNA (m_i) will only be tagged with RNA FISH probes against CDS (green dot). k_polyA is the rate at which the polyA-tail is deadenylated (and some of the 3'UTR is digested), and it depends on both microRNA mediated degradation and a 'constitutive' degradation of the mRNA. k_d is the rate at which the mRNA is digested once the polyA-tail is removed. We assume this rate to be constant. The differential equations for intact and degrading mRNAs are

$$\frac{dm_c}{dt} = r - k_{polyA}\, m_c, \qquad \frac{dm_i}{dt} = k_{polyA}\, m_c - k_d\, m_i,$$

where r denotes the synthesis rate of intact mRNA. The steady states for both forms of transcripts are

$$m_c = \frac{r}{k_{polyA}}, \qquad m_i = \frac{k_{polyA}\, m_c}{k_d},$$

and the co-localization percentage at a particular total transcript level is

$$coloc = \frac{m_c}{m_c + m_i} = \frac{1}{1 + k_{polyA}/k_d}.$$

Assume the miRNA mediated degradation follows Michaelis-Menten kinetics, i.e.
$$k_{polyA} = d_m + \frac{\theta}{\lambda + m_c}$$

θ: effective concentration of miRNAs regulating Lin28a 3'UTR
λ: dissociation constant
d_m: constitutive degradation rate

By substituting the k_polyA expression into coloc,

$$coloc = \frac{1}{1 + \dfrac{d_m}{k_d} + \dfrac{\theta / k_d}{\lambda + m_c}}$$

[Supplementary Figure 6.8 plot; x-axis: decoy mRNA # per cell; data: WT ESC, Dgcr8KO; fits: WT ESC fit, effective miRNA con=68; Dgcr8KO fit, effective miRNA con=13]

Supplementary Figure 6.8 Model fitting of the co-localization curve. The average co-localization percentage at different decoy mRNA levels was calculated, and the data were fitted with the proposed model. The effective regulating miRNA concentration in wild-type ESCs was predicted to be ~5 fold as high as that of Dgcr8-/- ESCs.

6.5.3 Supplementary sequence information

Lin28a OriUTR sequence with APA sites highlighted in yellow and canonical PAS in red

GGCCCAGGAGTCAGGGTTATTCTTTGGCTAATGGGGAGTTTAAGGAAAGAGGCATCAATCTGCAGAGT GGAGAAAGTGGGGGTAAGGGTGGGTTGCGTGGGTAGCTTGCACTGCCGTGTCTCAGGCCGGGGTTCC CAGTGTCACCCTGTCTTTCCTTGGAGGGAAGGAAAGGATGAGACAAAGGAACTCCTACCACACTCTATC TGAAAGCAAGTGAAGGCTTTTGTGGGGAGGAACCACCCTAGAACCCGAGG CTTTGCCAAGTGGCTGGG CTAGGGAAGTTCTTTTGTAGAAGGCTGTGTGATATTTCCCTTGCCAGACGGGAAGCGAAACAAGTGTCA AACCAAGATTACTGAACCTACCCCTCCAGCTACTATGTTCTGGGGAAGGGACTCCCAGGAGCAGGGCGA GGTTATTTTCACACCGTGCTTATTCATAACCCTGTCCTTTGGTGCTGTGCTGGGAATGGTCTCTAGCAACG GGTTGTGATGACAGGCAAAGAGGGTGGTTGGGGAGACAACTGCTGACCTGCTGCCCACACCTCACTCC CAGCCCTTTCTGGGCCAATGGGATTTTAATTTATTTGCTCCCTTAGGTAACTGCACCTTGGGTCCCACTTT CTCCAGGATGCCAACTGCACTATCTACGTGCGAATGACGTATCTTGTGCG I I I I I I I I I I I I I AA1TFTTA AAAT1TTTTCATCTTCTTAATATAAATAATGGGTTTGTATTTTTGTATATTTTAATCTTAAGGCCCTCATT CCTGCACTGTGTTCTCAGGTACATGAGCAATCTCAGGGATAATAAGTCCGTAGCAGCTCCAGGTCTGCTC AGCAGGAATACTTTGTT1GTTTTGTTTTGATCACCATGGAGACCAACCATTTGGAGTGCACAGCCTGTT GAACTACCTCATTTTTGCCGATTACAGCTGGCTTTTCTGCCATAGCGTCCTTGAAAAATGTGTCTCACGGG TTTCGATTGAGCTGCCCCAAGACTTGATCTGGATTTGGCAAAACATAGGACATCACTCTAAACAGGAAA
GGGTGGTACAGAGACATTAAAAGGCTGGGCCAGGTGAAAGGCACAAGAGGAACTTTCCATACCAGATC CATCCTTTTGCCAGATTAGTGGAAGCCTGCCATGCACAGCAGGGTGTGAGAGAGAGAGTGTGTATGTAT GTGTGTGTGGATTT11T1TAATGCAAATTTATGAAGACGAGGTGGGTTTTGTTTATTTGATTGC1TT1TGT GCTGGGGATGGAATCTTGGGCTTCATTTGTGCTAGGAAGTACACTGCCACTGAGTTATCCCAGTAAGAA TGCAACTTAAGACCAGTACCCTTATTCCCACACTGTGCTGTCCAGGCATGGGAACATGAGGCAGGGACT CAACTCCTTAGCCTTTCACAATCTTGGCTTTCTGAGAGACTCATGAGTATGGGCCTCAGTGGCAAGTGTC CTGCCCTGCTGTAGCGTGATGGTTGATAGCTAAAGGAAAGAGGGGGTGGGGAGTTTCGTTTACATGCTT TGAGATCGCCACAAACCTACCTCACTGTGTTGAAACGGGACAAATGCAATAGAACACATTGGGTGGTGT GTGTGTGTGTCTGATCTTGGTTTCTTGTCTCCCTCTCCCCCCAAATGCTGCCCTCACCCCTAGTTAATTGTA TTCGTCTGGCCTTTGTAGGACTTTACTGTCTCTGAGTTGGTGATTGCTAGGTGGCCTAGTTGTGTAAATA TAAATGTGTTGGTCTTCATGTTCTTTTGGGGTTTTATTGTTTACAAAACTTTTGTTGTATTGAGAGAAAAA TAGCCAAAGCATCTTTGACAGAAAGCTCTGCACCAGACAACACCATCTGAAACTTAAATGTGCGGTCCTC TTCTCAAAGTGAACCTCTGGGACCATGGCTTATCCTTACCTGTTCCTCCTGTGTCTCCCATTCTGGACCAC AGTGACCTTCAGACAGCCCCTCTTCTCCCTCGTAAGAAAACTTAGGCTCATTTACTCTTTGAGCATCTCT GTAACTCTTGAAGGACCCATGTGAAAATTCTGAAGAAGCCAGGAACCTCATTCTTTCCTTGTCCCTAACT CAGTGAAGAGTTTTGGTTGGTGGTTTTGAGACAGGGCCTCACTCTGTAGCTGGAGATAGAGAGCCTCGG GTTCCTGGCTCTCCTCCTGCCTTCTGCACAGAGTCCCCTGTGCAGGGATTGCAGGTGCCGCTTCTCCCTG GCAAGACCATTTATTTCATGGTGTGATTCGCCTTTGGATGGATCAAACCAATGTAATCTGTCACCCTTAG GTCGAGAGAAGCAATTGTGGGGCCTTCCATGTAGAAAGTTGGAATCTGGACACCAGAAAAGGGACTAT GAATGTACAGTGAGTCACTCAGGAACTTAATGCCGGTGCAAGAAACTTATGTCAAAGAGGCCACAAGAT TGTTACTAGGAGACGGACGAATGTATCTCCATGTTTACTGCTAGAAACCAAAGCTTTGTGAGAAATCTTG AATTTATGGGGAGGGTGGGAAAGGGTGTACTTGTCTGTCCTTTCCCCATCTCTTTCCTGAACTGCAGGAG ACTAAGGCCCCCCACCCCCCGGGGCTTGGATGACCCCCACCCCTGCCTGGGGTGTTTTATTTCCTAGTTG ATTTTTACTGTACCCGGGCCCTTGTATTCCTATCGTATAATCATCCTGTGACACATGCTGACTTTTCCTTCC ACTTATTGGTACTCCAGAGTTGGTACTG CTTCTCTTCCCTGGGAA 138 Lin28a MutUTR sequence with APA sites highlighted in yellow and canonical PAS in red GGCCCAGGAGTCAGGGTTATTATGTGGCTAATGGGGAGTTTAAGGAAAGAGGCATCAATCTGCAGAGT GGAGAAAGTGGGGGTAAGGGTGGGTTGCGTGGGTAGCTTGAACGGACGTGTCTCAGGCCGGGGTTCC 
CAGTGTCACCCTGTCTTTCCTTGGAGGGAAGGAAAGGATGAGGCAAAGGAACTCCTACCACACTCTATC TGAAAGCAAGTGAAGGCTTTTGTGGGGGAGGAACCACCCTAGAACCCGAGGCTTTGACCAGTGGCTGG GCTAGGGAAGTTCTTTTGTAGAAGGCTGTGTGATATTTCCCTTGCCAGACGGGAAGCGAAACAAGTGTC AAACCAAGATTACTGAACCTACCCCTCCAGCTACTATGTTCTGGGGAAGGGACTCCCAGGAGCAGGACG AGGTTATTTTCACACCGTGCTTATTCATAACCCTGTCCTTTGGTGCTGTGCTGGGAATGGTCTCTAGCAAC GGGTTGTGATGACAGGCAAAGAGGGTGGTTGGGGGAGACAACTGCAGACCTTCGGCCCACACCTCACT CCCAGCCCTTTCTGGGCCAATGGGATTTTAATTTATTTGCTCCCTTAGGTAACTGCAACGTGGGTCCCACT TTCTCCAGGATGCCAACTGAACGATCTACGTGCGAATGACGTATCTTGTGCGTTC I I I I I I I I I IIAATTTT TAAAATTTTTTTTCCTCTTCTTAAAATAAGTAATGGGTTTGTATTTTTTTCTATTTTAATCTTCCGGCCCTCA TTCCTGCCCTTTGTTCTCAGGTACATGAGCAATCTCCGTGATAATAAGTCCGTAGCAGCTCCAGGTCTGCT CAGCCGTAATACTTTGTTTTTTGTTTTGATCACCATGGAGACCAACCATTTGGAGTGCACAGCCTGTT GAACTAACGCATTTUTGCCGATTACAGCTGGCTTTTCTGCAAGAGCGTCCTTGAAAAATGTGTCTCACGG GTTTCGATTGAGCTGCCCCAAGACTTGATCTGGATTTGGCAAAACATAGGACATCACTCTAAACAGGAA AGGGTGGTACAGAGACATTAAAAGGCTGGGCCAGGTAAAAGGCACAAGAGGAACTTTCCATACCAGAT CCATCCTTTTGCCAGATTAGTGGAAGCCTGCCATGCACAGCCGTGTGTGAGAGAGAGAGTGTGTATGTA TGTGTGTGTGGATTTTTTTTAATTCCAATTTATGAAGACGAGGTGGGTTTTGTTTATTTGATTGC1T1T1GT GCTGGGGATAGAATCTTGGGCTTCATTTGTGCTAGGAAGTACACGGACACTGAGTTATCCCAGTAAGAA TTCCACTTAAGACCAGTACCCTTATTCCCACACTGTGCTGTCCAGGCATGGGAACATGAGGCAGGGACTC AACTCCTTAGCCTTTCACAATCTTGGCTTTCAGAGAGACTCATGAGTATGGGCCTCAGTGGCAAGTGTCC TGCCCTTCGGTAGCATGATGGTTGATAGCTAAAGGAAAGAGGGGGTGGGGAGTTTCGTTGAAATGCTG TTAGATCGCCAGAAACCTAACGCACTGTGTTGAAACGGGACAAATTCCATAGAACACATTGGGTGGTGT GTGTGTGTGTCTGATCTTGGTTTCTTGTCTCCCTCTCCCCCCAAATTCGGCCCTCACCCCTAGTTAATTGTA TTCGTCTGGCCTTTGTAGGACTTTTACTGTCTCTGAGTTGGTGATTGCTAGGTGGCCTAGTTGTGTAAATA TAAATGTGTTGGTCTTCATGTTCTTTTGGGGTTTTATTGTTGAAAAAACTTTTGTTGTATTGAGAGAAAAA TAGCCAAAGCATCTTTGACAGAAAGCTCTGCACCAGACAACACCATCTGAAACTTAAATGTGCGGTCCTC TTCTCAAAGTGAACCTCTGGGACCATGGCTTATCCTTACCTGCTCCTCCTGTGTCTCCCATTCTGGACCAC AGTGACCTTCAGACAGCCCCTCTTCTCCCTCGTAAGAAAACTTAGGCTCATTTACTTCTTTGAGCATCTCT GTAACTCTTGAAGGACCCAGGTTAAAATTCTGAAGAAGCCAGGAACCTCATTATGTCCTTGTCCCTAACT 
CAGTGAAGAGTTTTGGTTGGTGGTTGTTAGACAGGGCCTCACTCTGTAGCTGGAGATAGAGAGCCTCGG GTTCCTGGCTCTCCTCCTGCCTTCTGCACAGAGTCCCCTGTGCAGGGCTTGCAGGTGCCGCTTCTCCCTG GCAAGACCATTTATTTCATGGTGTGATTCGCCTTTGGATGGATCAAACCAATGTAATCTGTCACCCTTAG GTCGAGAGAAGCAATTGTGGGGCCTTCCATGTAGAAAGTTGGAATCTGGACACCAGAAAAGGGACTAT GACTTTACAGTGAGTCACTCAGGAACTTAATGCCGGTGCAAGAAACTTATGTCAAAGAGGCCACAAGAT TGTTACTAGGAGACGGACGACTTTATCTCCATGTTGAATGCTAGAAACCAAAGCTTTGTGAGAAATCTTG AATTTATGGGGAGGGTGGGAAAGGGTGTACTTGTCTGTCCTTTCCCCATCTCTTTCCTGAACTGCAGGAG ACTAAGGCCCCCCACCCCCCGGGGCTTGGATGACCCCCACCCCTGCCTGGGGTGTTTTATTTCCTAGTTG ATTTTTAATGGACCCGGGCCCTTTTCTTCCTATCGTATAATCATCCTGTGACACATGCTGACTTTTCCTTCC ACTTATTGGTACTCCAGAGTTGGGAATG CTTCTCTTCCCTGGGAA 139 6.6 References Derti, A., Garrett-Engele, P., Maclsaac, K.D., Stevens, R.C., Sriram, S., Chen, R., Rohl, C.A., Johnson, J.M., and Babak, T. (2012). A quantitative atlas of polyadenylation in five mammals. Genome Research. Elkon, R., Ugalde, A.P., and Agami, R. (2013). Alternative cleavage and polyadenylation: extent, regulation and function. Nat Rev Genet 14, 496-506. Ji, Z., Lee, J.Y., Pan, Z., Jiang, B., and Tian, B. (2009). Progressive lengthening of 3' untranslated regions of mRNAs by alternative polyadenylation during mouse embryonic development. Proceedings of the National Academy of Sciences of the United States of America 106, 70287033. Ji, Z., and Tian, B. (2009). Reprogramming of 3' Untranslated Regions of mRNAs by Alternative Polyadenylation in Generation of Pluripotent Stem Cells from Different Cell Types. PLoS ONE 4, e8419. Mayr, C., and Bartel, D.P. (2009). Widespread shortening of 3' UTRs by alternative cleavage and polyadenylation activates oncogenes in cancer cells. Cell 138, 673. Sandberg, R., Neilson, J.R., Sarma, A., Sharp, P.A., and Burge, C.B. (2008). Proliferating Cells Express mRNAs with Shortened 3' Untranslated Regions and Fewer MicroRNA Target Sites. Science 320, 1643-1647. Shell, S.A., Hesse, C., Morris, S.M., and Milcarek, C. (2005). 
Elevated Levels of the 64-kDa Cleavage Stimulatory Factor (CstF-64) in Lipopolysaccharide-stimulated Macrophages Influence Gene Expression and Induce Alternative Poly(A) Site Selection. Journal of Biological Chemistry 280, 39950-39961.
Shi, Y. (2012). Alternative polyadenylation: New insights from global analyses. RNA 18, 2105-2117.
Tian, B., Hu, J., Zhang, H., and Lutz, C.S. (2005). A large-scale analysis of mRNA polyadenylation of human and mouse genes. Nucleic Acids Research 33, 201-212.

Chapter 7 Conclusions and Perspectives

In summary, this thesis constructed a reporter system for the 3'UTRs of genes to investigate the combinatorial effect of miRNA regulation on endogenous targets. MutUTR was proposed as a general, effective and flexible miRNA-unregulated control that requires no genetic modification of the cellular background. MicroRNA regulation at the transcriptional and translational levels was quantified at single cell resolution over a target expression range of more than 100 fold. Its first order (repression strength) and second order (noise control) effects were quantified, and the potential mechanisms and consequences were discussed. The reporter system could also be used as a natural sponge to titrate away miRNAs. Combined with high throughput techniques, miRNA-mediated-crosstalk could be studied at the genome-wide level. A novel expression pattern was discovered for GFP-Lin28a3'UTR, and the phenomenon was miRNA dependent. We believe that the results of this work represent an important step in the quantification of miRNA regulation at the single cell level. Below we discuss some ideas for possible directions that may result from our work.

7.1 Future Directions

Chapter 2

During the design of MutUTRs, miRNA response elements (MREs) with significant context score (targeting efficacy) and conservation probability were mutated irrespective of the expression of the targeting miRNAs in ESCs. Thus in principle, MutUTRs could be used in cell contexts other than ESCs.
The versatility of MutUTRs remains to be validated in other systems. The empirically chosen thresholds used in the mutation algorithm for the selection of mutation sites could be fine-tuned or tailored in the future. The mutation algorithm could also be easily adapted to study the effect of a particular miRNA, or of other regulatory elements such as AU-rich elements.

Chapter 3

The quantification of miRNA regulation at the transcriptional and translational levels can be extended in the following directions. Models can be built to explain the observed regulation transfer functions at both levels. Previously, a molecular titration model was built to fit the threshold behavior of miRNA regulation (Mukherji et al., 2011). It explains the decrease of miRNA regulation in the high target expression region, but it misses the initial increase of regulation in the low target expression region seen for certain targets. Additional experimental support is needed to corroborate this initial increase. Any conclusion drawn from the low target expression region is susceptible to background issues, so a background-free method would be especially helpful. Targeted mass spectrometry (MS) could be pursued further. Alternatively, a luciferase reporter, which is also background free, could be constructed. It would also be valuable to link our quantitative observations with molecular mechanisms; certain miRNA regulation pathway mutants (e.g. GW182 mutants), or constructs that specifically eliminate the possibility of miRNA regulation at a certain stage (e.g. decapped or IRES-containing constructs), could be especially useful. Recent studies reveal another dimension of miRNA regulation and show that translational inhibition and transcript degradation dominate at different times after induction of miRNA activity (Eichhorn et al., 2014).
The reporter system could easily be applied to study this, adding a temporal dimension to reveal the full map of miRNA regulation dynamics. It is still unresolved why genome-wide analyses (Baek, 2008; Eichhorn et al., 2014; Guo et al., 2010; Hendrickson, 2009; Selbach, 2008) and single gene analyses (Behm-Ansmant, 2006; Eulalio, 2007; Filipowicz et al., 2008; Poy et al., 2004; Zhao et al., 2005) usually arrive at different conclusions about the relative contribution of transcriptional regulation. The differences could arise because genome-wide assays commonly perturb a single miRNA, whereas endogenous genes are subject to the combinatorial effect of multiple miRNAs. We hypothesize that the discrepancy might merely reflect different modes of regulation in different target expression regions. MicroRNAs preferentially target lowly expressed genes, while selectively avoiding ubiquitous and highly expressed genes (Farh, 2005; Sood et al., 2006), and reporter assays usually overexpress the reporter construct. If the initial increase in translational repression is real, the difference between the two approaches simply reflects the change in relative transcriptional contribution between the low and the medium/high target expression regions.

Chapter 4

We have studied miRNA regulation of target expression noise at the protein level. The next step is to explore noise control at the mRNA level, and how noise propagates from one level to the next. Our initial studies show that miRNAs decrease protein noise at low protein expression, but increase noise at high protein expression (Figure 7.1a and (Schmiedel et al., 2015)). The differences in mRNA expression noise are less obvious, and the two curves overlap over the measurable region (Figure 7.1b). If we compare transcript noise and protein noise at the same target abundance (i.e. the same indicator protein level), we observe that translation from few transcripts increases noise, while translation could suppress noise at high transcript levels.
The crossover of mRNA and protein noise happens for both OriUTR (reg) and MutUTR (unreg). The crossover is postponed to higher target abundance for OriUTR, which makes sense because miRNA-mediated transcript degradation reduces the effective level of OriUTR transcripts (Figure 7.2). Here we present only the Casp2 3'UTR as an example; other UTRs yield similar results (data not shown). It is of interest to understand the mechanisms behind this observation and its biological consequences.

[Figure 7.1 plots; (a) protein noise vs GFP protein; (b) mRNA noise vs GFP mRNA; curves: Casp2OriUTR, Casp2MutUTR]

Figure 7.1 miRNA regulation of protein and mRNA noise. pCAG-d2eGFP-Casp2Ori/MutUTR was co-transfected with pCAG-mCherry into wild-type mESCs. (a) miRNAs decrease protein noise at low protein expression, but increase noise at high protein expression. (b) Expression noise of OriUTR and MutUTR overlaps at the mRNA level.

[Figure 7.2 plots; x-axis: log10(mCherry); curves: Casp2Ori mRNA, Casp2Ori Pro, Casp2Mut mRNA, Casp2Mut Pro; legend: regulated, unregulated]

Figure 7.2 Noise propagation from mRNA to protein level for different target abundance. pCAG-d2eGFP-Casp2Ori/MutUTR was co-transfected with pCAG-mCherry into wild-type mESCs. Transcript noise and protein noise were quantified for different indicator protein levels. Translation from few transcripts increases noise while it suppresses noise at high transcript levels. The crossover of mRNA and protein noise happens for both OriUTR (reg) and MutUTR (unreg), and the crossover is shifted to higher target abundance for OriUTR.

Chapter 5

No miRNA-mediated-crosstalk was found for targets of the miRNAs that have MREs on the Lats2a 3'UTR, even under the highest decoy expression condition.
Modeling predicts that as either miRNA abundance or the number of miRNA-binding sites increases, miRNAs become increasingly refractory to competition by changes in the concentration of individual RNA target species (Ala et al., 2013; Mukherji et al., 2011; Wee et al., 2012). In the case of Lats2a 3'UTR overexpression, even though the number of added MREs was comparable to or much higher than the expression of the regulating miRNAs, the high abundance of endogenous target sites prevented effective crosstalk. Our result is consistent with the recent model proposed by (Denzler et al., 2014). The number of added MREs required for miRNA target derepression is independent of miRNA levels, but depends on endogenous target site abundance, which usually exceeds that of the miRNAs. ceRNAs must begin to approach the target site abundance of a miRNA before they can exert a consequential effect on the repression of targets of that miRNA, which rarely occurs in vivo. It is therefore natural to wonder why siRNA knockdown of endogenous PTEN, which is present at fewer than 40 transcripts per cell (Apratim Sahay, personal communication), is able to induce miRNA-mediated-crosstalk. The sensitivity might be attributed to the fact that PTEN is a haploinsufficient tumor suppressor, and even a 20% decrease in expression can promote cancer growth (Alimonti et al., 2010). Downstream signaling such as PI3K/AKT phosphorylation might be able to amplify the signal. Other mechanisms, such as subcellular localization leading to local enrichment of regulating elements, and ceRNA network topology, might also augment the crosstalk effect. Some miRNA decoys employ additional mechanisms to boost their sponging potency. The ncRNA HSUR-1 can elicit degradation of the miRNAs it binds (Cazalla et al., 2010). The circRNA CDR1as contains tandem (>70) binding sites for miR-7, and its circular structure is resistant to nuclease activity and is especially stable (Hansen et al., 2013; Memczak et al., 2013).
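Returning to the titration argument above, the dependence of derepression on endogenous target-site abundance can be illustrated with a toy equilibrium-binding calculation. This is a sketch under stated assumptions and not the model of Denzler et al. (2014): a single miRNA pool binds every site with one dissociation constant, per-site occupancy is taken as a stand-in for repression, and all numbers (molecules per cell) and function names are hypothetical values chosen for illustration.

```python
import math


def site_occupancy(mirna_total, sites_total, kd):
    """Fraction of target sites bound by miRNA at equilibrium.

    Solves (M - C)(T - C) = Kd * C for the complex C and takes the root
    that satisfies C <= min(M, T).
    """
    s = mirna_total + sites_total + kd
    bound = (s - math.sqrt(s * s - 4.0 * mirna_total * sites_total)) / 2.0
    return bound / sites_total


def relative_derepression(mirna_total, endogenous_sites, decoy_sites, kd):
    """Relative loss of per-site occupancy after adding decoy sites."""
    before = site_occupancy(mirna_total, endogenous_sites, kd)
    after = site_occupancy(mirna_total, endogenous_sites + decoy_sites, kd)
    return 1.0 - after / before


# Hypothetical numbers (molecules per cell), for illustration only.
MIRNA, KD, DECOY = 1000.0, 10.0, 1000.0
few_sites = relative_derepression(MIRNA, 500.0, DECOY, KD)
many_sites = relative_derepression(MIRNA, 10000.0, DECOY, KD)
print(f"endogenous sites =   500: occupancy drops by {few_sites:.0%}")
print(f"endogenous sites = 10000: occupancy drops by {many_sites:.0%}")
```

In this toy setting the same number of decoy sites removes a much smaller share of occupancy when endogenous sites are abundant, in line with the qualitative argument that decoys must approach the endogenous target-site pool before they matter.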
Chapter 6

Alternative polyadenylation (APA) was proposed to explain the observed co-localization transitioning behavior of GFP-Lin28a3'UTR transcripts. Northern blot, qRT-PCR or 3' RACE (rapid amplification of cDNA ends) could be applied directly to evaluate the possibility of APA (Elkon et al., 2013). If APA indeed explains the observed phenomenon, mutation of the alternative poly(A) sites could be used to study the transition quantitatively in detail. Additionally, progressive lengthening of 3'UTRs by APA modulation was observed during mouse embryonic development (Ji et al., 2009). The high incidence of APA in cancer cell lines, with a consequential loss of 3'UTR miRNA response elements, suggests a pervasive role for APA in oncogene activation without genetic alteration (Mayr and Bartel, 2009). Examples include HMGA2, one of the endogenous UTRs used in our previous studies (Mukherji et al., 2011). It would therefore be interesting to label different regions of APA transcripts with probes of different colors and study their co-localization quantitatively at the single cell level upon ES cell differentiation or oncogenic transformation. For APA isoforms of miRNA targets, it is especially interesting to study how the loss of miRNA target sites affects transcript stability and translation efficiency, and what the biological consequences are. The only caveat is that the transcript segment between alternative polyadenylation sites has to be long enough to be resolved as a diffraction-limited spot under the microscope.

7.2 References

Ala, U., Karreth, F.A., Bosia, C., Pagnani, A., Taulli, R., Leopold, V., Tay, Y., Provero, P., Zecchina, R., and Pandolfi, P.P. (2013). Integrated transcriptional and competitive endogenous RNA networks are cross-regulated in permissive molecular environments. Proceedings of the National Academy of Sciences 110, 7154-7159.
Alimonti, A., Carracedo, A., Clohessy, J.G., Trotman, L.C., Nardella, C., Egia, A., Salmena, L., Sampieri, K., Haveman, W.J., Brogi, E., et al. (2010). Subtle variations in Pten dose determine cancer susceptibility. Nat Genet 42, 454-458.
Baek, D. (2008). The impact of microRNAs on protein output. Nature 455, 64-71.
Behm-Ansmant, I. (2006). mRNA degradation by miRNAs and GW182 requires both CCR4:NOT deadenylase and DCP1:DCP2 decapping complexes. Genes Dev 20, 1885-1898.
Cazalla, D., Yario, T., and Steitz, J.A. (2010). Down-regulation of a host microRNA by a Herpesvirus saimiri noncoding RNA. Science 328, 1563-1566.
Denzler, R., Agarwal, V., Stefano, J., Bartel, D.P., and Stoffel, M. (2014). Assessing the ceRNA Hypothesis with Quantitative Measurements of miRNA and Target Abundance. Molecular Cell 54, 766-776.
Eichhorn, S.W., Guo, H., McGeary, S.E., Rodriguez-Mias, R.A., Shin, C., Baek, D., Hsu, S.-h., Ghoshal, K., Villen, J., and Bartel, D.P. (2014). mRNA Destabilization Is the Dominant Effect of Mammalian MicroRNAs by the Time Substantial Repression Ensues. Molecular Cell 56, 104-115.
Elkon, R., Ugalde, A.P., and Agami, R. (2013). Alternative cleavage and polyadenylation: extent, regulation and function. Nat Rev Genet 14, 496-506.
Eulalio, A. (2007). Target-specific requirements for enhancers of decapping in miRNA-mediated gene silencing. Genes Dev 21, 2558-2570.
Farh, K.K. (2005). The widespread impact of mammalian microRNAs on mRNA repression and evolution. Science 310, 1817-1821.
Filipowicz, W., Bhattacharyya, S.N., and Sonenberg, N. (2008). Mechanisms of posttranscriptional regulation by microRNAs: are the answers in sight? Nat Rev Genet 9, 102-114.
Guo, H., Ingolia, N.T., Weissman, J.S., and Bartel, D.P. (2010). Mammalian microRNAs predominantly act to decrease target mRNA levels. Nature 466, 835-840.
Hansen, T.B., Jensen, T.I., Clausen, B.H., Bramsen, J.B., Finsen, B., Damgaard, C.K., and Kjems, J. (2013). Natural RNA circles function as efficient microRNA sponges. Nature 495, 384-388.
Hendrickson, D.G. (2009). Concordant regulation of translation and mRNA abundance for hundreds of targets of a human microRNA. PLoS Biol 7, e1000238.
Ji, Z., Lee, J.Y., Pan, Z., Jiang, B., and Tian, B. (2009). Progressive lengthening of 3' untranslated regions of mRNAs by alternative polyadenylation during mouse embryonic development. Proceedings of the National Academy of Sciences of the United States of America 106, 7028-7033.
Mayr, C., and Bartel, D.P. (2009). Widespread shortening of 3' UTRs by alternative cleavage and polyadenylation activates oncogenes in cancer cells. Cell 138, 673.
Memczak, S., Jens, M., Elefsinioti, A., Torti, F., Krueger, J., Rybak, A., Maier, L., Mackowiak, S.D., Gregersen, L.H., Munschauer, M., et al. (2013). Circular RNAs are a large class of animal RNAs with regulatory potency. Nature 495, 333-338.
Mukherji, S., Ebert, M.S., Zheng, G.X.Y., Tsang, J.S., Sharp, P.A., and van Oudenaarden, A. (2011). MicroRNAs can generate thresholds in target gene expression. Nat Genet 43, 854-859.
Poy, M.N., Eliasson, L., Krutzfeldt, J., Kuwajima, S., Ma, X., MacDonald, P.E., Pfeffer, S., Tuschl, T., Rajewsky, N., Rorsman, P., et al. (2004). A pancreatic islet-specific microRNA regulates insulin secretion. Nature 432, 226-230.
Salmena, L., Poliseno, L., Tay, Y., Kats, L., and Pandolfi, P.P. (2011). A ceRNA hypothesis: the Rosetta stone of a hidden RNA language? Cell 146, 353-358.
Schmiedel, J.M., Klemm, S.L., Zheng, Y., Sahay, A., Blüthgen, N., Marks, D.S., and van Oudenaarden, A. (2015). miRNA control of protein expression noise. Science.
Selbach, M. (2008). Widespread changes in protein synthesis induced by microRNAs. Nature 455, 58-63.
Sood, P., Krek, A., Zavolan, M., Macino, G., and Rajewsky, N. (2006). Cell-type-specific signatures of microRNAs on target mRNA expression. Proceedings of the National Academy of Sciences of the United States of America 103, 2746-2751.
Wee, L.M., Flores-Jasso, C.F., Salomon, W.E., and Zamore, P.D. (2012). Argonaute Divides Its RNA Guide into Domains with Distinct Functions and RNA-Binding Properties. Cell 151, 1055-1067.
Zhao, Y., Samal, E., and Srivastava, D. (2005). Serum response factor regulates a muscle-specific microRNA that targets Hand2 during cardiogenesis. Nature 436, 214-220.