Quantification of MicroRNA Regulation and its Consequences
at the Single Cell Level
by
Yannan Zheng
B.S. in Mathematics and Physics
Tsinghua University (2008)
Submitted to the Department of Physics
in partial fulfillment of the requirements for the degree of
Doctor of Philosophy
at the
MASSACHUSETTS INSTITUTE OF TECHNOLOGY
June 2015
© Massachusetts Institute of Technology 2015. All rights reserved.
Signature of Author ...................................................
Department of Physics
March 27, 2015
Certified by ........................
Alexander van Oudenaarden
Professor of Physics and Professor of Biology at Massachusetts Institute of Technology
Professor of Quantitative Biology of Gene Regulation at Hubrecht Institute
Thesis Supervisor
Certified by ........................
Jeff Gore
Latham Family Career Development Assistant Professor of Physics
Thesis Co-Supervisor
Accepted by ........................
Professor Nergis Mavalvala
Associate Department Head of Physics
In memory of my grandmother
Meixiu Wei
1913-2014
Quantification of MicroRNA Regulation and its Consequences
at the Single Cell Level
by
Yannan Zheng
Submitted to the Department of Physics
on March 27, 2015 in partial fulfillment of
the requirements for the degree of
Doctor of Philosophy
Abstract
MicroRNAs (miRNAs) are a class of small non-coding RNAs that play important roles in
posttranscriptional gene regulation. miRNAs regulate more than half of mammalian protein-coding
genes. They have been found to participate in almost every cellular process, and their
dysregulation is associated with many diseases. miRNAs recognize their targets by base pairing to
miRNA response elements (MREs), which are predominantly located in the 3' untranslated regions
(3'UTRs) of mRNAs. This thesis focuses on a microRNA activity reporter system to investigate
various aspects of miRNA regulation of endogenous 3'UTR targets. Mutation of selected
MREs on 3'UTRs (MutUTRs) was designed and validated as a miRNA-unregulated control. This
approach does not require genetic modification of the cellular background and effectively
abolishes the majority of miRNA regulation with minimal perturbation of the UTR sequences.
MicroRNAs can induce target silencing via mRNA transcript degradation and translational
inhibition, but the relative contributions of the two mechanisms have been debated. It is also
unclear how miRNA regulation varies with target expression level. MicroRNA regulation at the
transcriptional and translational levels was quantified at single-cell resolution over a target
expression range of more than 100-fold using our reporter system. The transcriptional regulation
was found to be uniform throughout the range of measurement, whereas translational regulation
decreases at high target expression. Our data also suggest that translational regulation
initially increases at low target expression for certain targets. For all UTRs under study, miRNA
regulation at the two levels was found to be of the same order of magnitude. In addition to target
repression, miRNAs also control target expression noise. MicroRNAs decrease protein expression
noise for lowly expressed genes but increase noise for highly expressed genes, and the noise
regulation appears to occur at the translational level. By linking reporter assays to
transcriptome expression, our findings suggest that microRNAs confer precision to protein
expression in vivo, and that transcriptional regulation might dominate for endogenous targets.
Finally, we applied the reporter system as miRNA decoys to study miRNA-mediated crosstalk. We
also propose that the reporter system could be used to study alternative polyadenylation, which
is usually accompanied by loss of MREs.
Thesis Supervisor: Alexander van Oudenaarden
Title: Professor of Physics and Professor of Biology
Thesis Co-Supervisor: Jeff Gore
Title: Latham Family Career Development Assistant Professor of Physics
Thesis Supervisor: Alexander van Oudenaarden, PhD
Title: Professor of Physics and Biology, Massachusetts Institute of Technology
Director of Hubrecht Institute for Developmental Biology and Stem Cell Research at the
Royal Netherlands Academy of Arts and Sciences and University Medical Center Utrecht
Thesis Co-Supervisor and Thesis Committee Chairman: Jeff Gore, PhD
Title: Assistant Professor, Department of Physics, Massachusetts Institute of Technology
Thesis Committee Member: Leonid Mirny, PhD
Title: Associate Professor of Health Sciences and Technology and Physics, Harvard-MIT Division
of Health Sciences and Technology, Massachusetts Institute of Technology
Thesis Committee Member: Jeremy L. England, Ph.D.
Title: Assistant Professor, Department of Physics, Massachusetts Institute of Technology
Acknowledgments
The work presented in this thesis was made possible by the generous help of many friends,
colleagues, and teachers. For their contributions I thank the following:
Alexander van Oudenaarden for exceptional mentorship and inspiration. He gave me the phone
interview and recruited me to MIT, my dream school ever since I was a child, and I will forever
be grateful for this opportunity and the chance to work in his lab.
My thesis committee members, Jeff Gore, Jeremy England, and Leonid Mirny for insightful
comments and mentorship. During the dynamic latter stages of my PhD, my committee offered
additional scientific guidance and support for me to reach the finish line.
My collaborators Jörn M. Schmiedel, Sandy L. Klemm, and Apratim Sahay in the study of miRNA
control of target expression noise, and Nikolai Slavov for targeted mass spectrometry
measurements.
Members of the van Oudenaarden lab 2008-2015. I cannot think of a more fun, intellectually
stimulating place to spend one's graduate years. I especially want to thank Ni Ji, Sandy Klemm,
Magda Bienko, Dylan Mooijman, Apratim Sahay, Nikolai Slavov, and Stefan Semrau for scientific
discussions; Ya Lin, Miaoqing Fang, Jialing Li, Annalisa Pawlosky, and Clinton Hansen for
genuine friendship; Shankar Mukherji and Gregor Neuert for training when I first joined the lab;
and Monica Wolf for administrative support.
My friends in the Physics Department: Wenlan Chen, Jiexi Zhang, Bo Zhen, Wujie Huang, Wenjun
Qiu, and Arghavan Safavi-Naini.
My parents, Lianshuang Zheng and Jinfan Xiao, for love and support, for always pushing me to
work harder on my research projects, and for inspiring my love of nature from childhood.
My husband, Yabi Wu, for helping me to be a better person.
This thesis is dedicated to my grandma, Meixiu Wei, the one person I love most in the world.
Contents
Abstract
Acknowledgments
Chapter 1 Introduction
1.1 Background
1.2 Thesis Outline
1.3 References
Chapter 2 Design and validation of microRNA activity reporter system
2.1 Abstract
2.2 Introduction
2.2.1 MicroRNAs biogenesis
2.2.2 Interpretation of miRNA mutants of embryonic stem cells
2.2.3 MicroRNAs profile and function in embryonic stem cells
2.2.4 Regulatory network of miRNAs and proteins in ESC proliferation and differentiation
2.2.5 MicroRNAs target recognition mechanism
2.3 Results
2.3.1 Design of two delivery systems for microRNA activity reporter
2.3.2 Choice of UTRs for study
2.3.3 Motivation to design MutUTR as miRNA unregulated control
2.3.4 Design of MutUTR as miRNA unregulated control
2.3.5 Validation of MutUTR as miRNA unregulated control
2.3.6 Applications of UTR reporter system
2.4 Materials and Methods
2.4.1 MutUTR design
2.4.2 Plasmid construction
2.4.3 PCR and sequencing primers design
2.4.4 Molecular cloning materials and kits
2.4.5 smFISH probe design
2.4.6 Cell lines
2.4.7 Cell culturing
2.4.8 Transient transfection of plasmids and dox induction
2.4.9 Cell fixation and hybridization
2.4.10 Flow cytometry
2.4.11 Microscopy Imaging and image analysis
2.5 Supplementary
2.6 References
Chapter 3 Application of UTR reporter system to study microRNA regulation at transcriptional and translational levels
3.1 Abstract
3.2 Introduction
3.2.1 miRNA-mediated repression of translation
3.2.2 miRNA-mediated mRNA deadenylation and decay
3.2.3 Cellular compartmentalization of miRNA repression
3.2.4 Translation inhibition vs transcript degradation
3.3 Results
3.3.1 MicroRNAs exert regulation at both transcriptional and translational levels
3.3.2 Quantifying miRNA regulation at transcriptional and translational levels
3.3.3 The transcriptional regulation stays relatively constant and translational regulation saturates at high target expression
3.4 Discussion
3.5 Methods
3.5.1 Flow cytometry experiments
3.5.2 Flow Cytometry Data Processing
3.5.3 Background analysis and repression fold calculation
3.5.4 Microscopy image analysis
3.6 Supplementary Information
3.7 References
Chapter 4 Application of reporter system to study microRNA control of protein expression noise
4.1 Abstract
4.2 Results
4.3 Methods
4.3.1 Reporter plasmid construction
4.3.2 Transient transfections
4.3.3 Flow cytometry
4.3.4 Transcriptome profiling
4.3.5 Taqman microRNA expression measurements
4.3.6 Flow cytometry data processing
4.3.7 Model fit to signal mean and noise
4.3.8 Mixed microRNA pool noise for correlated individual microRNA pools
4.3.9 Mapping flow cytometry experiments to transcriptome expression
4.3.10 Dicer knock-out mESC transcriptome expression data
4.4 Acknowledgments
4.5 References
Chapter 5 Application of UTR decoy system to study microRNA-mediated-crosstalk
5.1 Abstract
5.2 Introduction
5.3 Results
5.4 Methods
5.4.1 FACS cell sorting
5.4.2 RNA sequencing
5.4.3 MicroRNA targets selection
5.4.4 Targeted Mass Spectrometry
5.5 Supplementary
5.6 References
Chapter 6 Double hybridization of GFP-Lin28a3'UTR transcript reveals a novel expression pattern
6.1 Abstract
6.2 Results
6.3 Discussion
6.4 Methods
6.4.1 Taqman microRNA expression measurements
6.4.2 Co-localized spots detection
6.4.3 Stable Integration
6.5 Supplementary
6.5.1 Supplementary figures
6.5.2 Supplementary model
6.5.3 Supplementary sequence information
6.6 References
Chapter 7 Conclusions and Perspectives
7.1 Future Directions
7.2 References
Chapter 1 Introduction
1.1 Background
MicroRNAs (miRNAs) are a class of small non-coding RNAs which play important roles in
posttranscriptional gene regulation. The number of identified miRNAs approaches 1-2% of the
number of protein-coding genes in worms, flies, and mammals (Bartel, 2009). More than half of
mammalian protein-coding genes are predicted to be conserved targets of miRNAs
(Friedman et al., 2008). miRNAs have
been shown to participate in the regulation of almost every cellular process investigated so far,
from stem cell biology to differentiation, and from proliferation to apoptosis (Bushati and Cohen,
2007; Filipowicz et al., 2008). Given this far-reaching role, it is not surprising that dysregulation
of miRNAs is associated with many diseases, including cancer, heart ailments and
neurodevelopmental disorders (Chang and Mendell, 2007). Accordingly, miRNAs are being
developed as both targets and therapeutics in the clinic, in the hope of harnessing the power of
RNA-guided gene regulation to combat disease and infection (Pasquinelli, 2012; Tomari and Zamore, 2005).
miRNAs recognize their targets by base pairing. miRNA response elements (MREs) are
predominantly located in the 3' untranslated regions (3'UTRs) of mRNAs (Bartel, 2009). Each
miRNA can regulate hundreds of mRNAs (Lim, 2005). Conversely, one 3'UTR can contain
dozens of MREs and receive combinatorial regulation from multiple miRNAs (Bartel, 2009).
miRNA regulation has had a widespread impact on UTR evolution. A large set of housekeeping genes
possesses short 3'UTRs that are specifically depleted of microRNA binding sites, thereby avoiding
microRNA regulation (Stark et al., 2005). Genes with tissue-specific expression tend to have
longer 3'UTRs with more miRNA-binding sites (Stark et al., 2005). The expression of these
genes and that of their regulating miRNAs are anti-correlated or even mutually exclusive in contiguous
developmental stages or neighboring tissues (Farh, 2005; Stark et al., 2005). Additionally, 3'UTRs
are frequently shortened in tumors and proliferating cells via alternative polyadenylation (APA)
(Ji and Tian, 2009; Mayr and Bartel, 2009; Sandberg et al., 2008). Conversely, progressive
lengthening of 3'UTRs by APA modulation was observed during mouse embryonic development
(Ji et al., 2009).
miRNAs can induce target silencing by mRNA degradation and translation inhibition. In animals,
initial evidence suggested that miRNAs repress their targets at the level of translation, with little
or no influence on mRNA abundance (Olsen and Ambros, 1999; Seggerson et al., 2002). It has
now become clear that miRNAs can also induce mRNA degradation in animals (Behm-Ansmant,
2006; Eulalio, 2007, 2009; Lim, 2005). Furthermore, recent advances in proteome and
transcriptome measurements have enabled the modes of miRNA regulation to be dissected on a
global scale (Baek, 2008; Guo et al., 2010; Hendrickson, 2009; Selbach, 2008). However, the
relative contributions of these two mechanisms have been debated, with genome-wide assays
and single-gene analyses often yielding conflicting results (Eichhorn et al., 2014; Filipowicz
et al., 2008). Moreover, the mechanistic details of miRNA regulation are still poorly understood.
Translational repression has been proposed to occur at multiple stages (Fabian et al., 2010; Filipowicz
et al., 2008), and other modes of miRNA regulation, such as compartmentalization and translation
activation, have been discovered (Eulalio et al., 2007; Vasudevan et al., 2007).
Even though individual microRNAs only weakly repress the vast majority of their target genes
(Baek, 2008; Selbach, 2008) and knockouts rarely show phenotypes (Miska et al., 2007),
microRNA regulation must confer advantages because miRNA targeting is so ubiquitous (Lewis
et al., 2005) and many of the miRNA sites are highly conserved (Friedman et al., 2008). miRNAs
can act either as a switch (Olsen and Ambros, 1999; Reinhart, 2000) or as a fine-tuner (Bartel and
Chen, 2004; Karres et al., 2007; Poy et al., 2004) of gene expression, depending on whether the
residual expression of the target is inconsequential or optimal (Bartel, 2009). Moreover, miRNAs can
confer robustness to expression regulation by acting as reinforcers of switches and as noise buffers
(Ebert and Sharp, 2012). Biological systems usually need to turn a gene on or off during
developmental transitions or in response to external signals. Instead of making the decision,
miRNAs can help sharpen and maintain the decision by further dampening the expression of
unwanted transcripts. miRNAs thus add an additional, functionally redundant layer of repression
that provides a failsafe mechanism to ensure the robustness of gene expression programs
(Bartel, 2009; Bushati and Cohen, 2007; Ebert and Sharp, 2012). miRNAs have also been proposed
to act as buffers against variation in gene expression at homeostasis. The noise reduction results
from microRNA-mediated acceleration of mRNA turnover and the increased transcriptional activity
needed to produce the same amount of protein (Ebert and Sharp, 2012; Noorbakhsh et al., 2013).
miRNAs have recently been found to act as mediators of crosstalk between mRNA targets
(Salmena et al., 2011). mRNAs sharing MREs can compete for miRNA binding and titrate
the regulating miRNAs away from each other, thereby achieving co-regulation of expression. This
discovery reveals a new layer of posttranscriptional regulation.
1.2 Thesis Outline
This thesis focuses on miRNAs in animals, and studies the combinatorial effect of miRNA
regulation on its natural targets, which are usually located at 3' untranslated regions (3' UTRs) of
mRNA transcripts. MicroRNA activity reporter systems have been constructed for representative
3'UTRs, and have been applied to explore the following aspects of miRNA regulation.
1. What is a good miRNA-unregulated control for a UTR reporter system? (Chapter 2)
2. How do transcript degradation and translation inhibition contribute to miRNA regulation,
and how does miRNA regulation vary with target abundance? (Chapter 3)
3. How do miRNAs control target gene expression noise? (Chapter 4)
4. What conditions are needed for miRNA-mediated crosstalk? (Chapter 5)
5. How do miRNAs affect the integrity/alternative polyadenylation of target transcripts?
(Chapter 6)
The thesis is structured into seven chapters: one brief introductory chapter, five
chapters that address the questions above, and one concluding chapter.
Chapter 2 introduces a microRNA activity reporter system, which is used as the foundation
throughout this thesis. We begin with a literature review of miRNA biogenesis and the miRNA
target recognition mechanism. MicroRNA profiles and functions in mouse embryonic stem cells
(mESCs) are also introduced as guidance for our choice of UTRs. The 3'UTR of an endogenous gene
is appended downstream of a fluorescent reporter, which allows quantification of miRNA target
expression at the single-cell level by flow cytometry or microscopy. Another fluorescent protein
without microRNA regulation is used as an indicator. Two reporter/indicator delivery systems, a
bi-directional plasmid and a cotransfection system, are described, and the effectiveness of using
the indicator protein to reflect target abundance has been validated in both systems. This chapter
introduces MutUTR as the miRNA-unregulated control. By mutating only the miRNA response elements
(MREs) on UTRs, MutUTRs effectively abolish the majority of miRNA repression while preserving the
other structures of the UTR. Its advantages over using reporters followed by a short
miRNA-unregulated tail, or using Dgcr8-/- ESCs as the miRNA-unregulated control, are discussed.
Finally, we briefly introduce the potential applications of the microRNA reporter system.
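The idea of mutating only the MREs can be illustrated with a minimal sketch. It assumes the canonical rule that a site pairs with miRNA nucleotides 2-8 (the seed), and it uses a hypothetical mutation scheme (replacing each base in the site with its complement) and a made-up toy UTR. The actual MutUTR design procedure is described in Chapter 2 and may differ. The miRNA sequence is that of let-7a.

```python
# Toy illustration of MRE mutation; the mutation scheme is an assumption,
# not the design procedure used in the thesis.
COMPLEMENT = {"A": "U", "U": "A", "C": "G", "G": "C"}


def seed_match(mirna):
    """Return the 7-nt mRNA site complementary to miRNA nucleotides 2-8."""
    seed = mirna[1:8]
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def find_sites(utr, mirna):
    """Return the start indices of all seed-match sites in the UTR."""
    site = seed_match(mirna)
    return [i for i in range(len(utr) - len(site) + 1) if utr.startswith(site, i)]


def mutate_sites(utr, mirna):
    """Complement every base that lies inside a seed-match site.

    Sequence length and all bases outside the sites are left unchanged.
    """
    site_len = len(seed_match(mirna))
    positions = set()
    for start in find_sites(utr, mirna):
        positions.update(range(start, start + site_len))
    return "".join(
        COMPLEMENT[base] if i in positions else base for i, base in enumerate(utr)
    )


LET7A = "UGAGGUAGUAGGUUGUAUAGUU"
toy_utr = "AAACUACCUCAAAGGG"  # hypothetical UTR fragment with one let-7 site
mut_utr = mutate_sites(toy_utr, LET7A)
print(find_sites(toy_utr, LET7A), find_sites(mut_utr, LET7A))
```

A real design would also need to check that the mutation does not create a new site for another miRNA, which this sketch does not attempt.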
Chapter 3 begins with a literature review of the current understanding of miRNA repression
mechanisms and of the conflicting views on transcriptional/translational contributions from
genome-wide studies and single-gene analyses. A UTR reporter system was applied to address miRNA
regulation at the transcriptional and translational levels for different target expression levels.
By directly measuring the integrated fluorescence of reporter and indicator proteins, and labeling
reporter transcripts with smFISH probes, we traced miRNA regulation at both levels in single cells
over a target expression range of more than 100-fold. The transcriptional regulation is uniform
throughout the range of measurement, whereas translational regulation gets titrated at high target
expression. Our data also suggest that translational regulation initially increases at low target
expression for certain targets. The molecular mechanisms behind the observed trend are discussed.
Microscopy and RNA-Seq confirmed the flow cytometry data. For all UTRs under study, miRNA
regulation at the two levels was found to be of the same order of magnitude.
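How repression can be followed across the target expression range is sketched below. The sketch assumes that cells are binned by indicator fluorescence on a log scale and that the repression fold in each bin is the mean reporter signal of the MutUTR control divided by that of the wild-type UTR reporter. The function name, the binning scheme, and the toy data are illustrative assumptions. The procedure actually used, including background correction, is described in Section 3.5.3.

```python
import math
from statistics import mean


def repression_fold_by_bin(control, regulated, n_bins=5):
    """Bin cells by log10 indicator level and return (bin_center, fold) pairs.

    control / regulated: lists of (indicator, reporter) fluorescence pairs for
    cells carrying the MutUTR and the wild-type UTR reporter, respectively.
    Assumes positive indicator values that are not all identical.
    """
    all_ind = [ind for ind, _ in control + regulated]
    lo, hi = math.log10(min(all_ind)), math.log10(max(all_ind))
    width = (hi - lo) / n_bins

    def bin_index(ind):
        # Clamp so that the maximum value falls into the last bin.
        return min(int((math.log10(ind) - lo) / width), n_bins - 1)

    def bin_means(cells):
        bins = [[] for _ in range(n_bins)]
        for ind, rep in cells:
            bins[bin_index(ind)].append(rep)
        return [mean(b) if b else None for b in bins]

    ctrl, reg = bin_means(control), bin_means(regulated)
    result = []
    for i in range(n_bins):
        # Skip bins that lack cells in either sample.
        if ctrl[i] is not None and reg[i] is not None and reg[i] > 0:
            center = 10 ** (lo + (i + 0.5) * width)
            result.append((center, ctrl[i] / reg[i]))
    return result


# Toy data: the regulated reporter is repressed two-fold at every level.
levels = (1, 3, 30, 300, 2000)
toy_control = [(x, float(x)) for x in levels]
toy_regulated = [(x, x / 2) for x in levels]
for center, fold in repression_fold_by_bin(toy_control, toy_regulated, n_bins=3):
    print(f"indicator ~{center:.1f}: fold {fold:.2f}")
```

Applying the same binning to smFISH transcript counts instead of reporter fluorescence would give the transcript-level fold under the same assumptions.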
Chapter 4 focuses on how miRNAs control target gene expression noise, which is defined
as the standard deviation divided by the mean. MicroRNAs could reduce protein expression noise when
their repressive post-transcriptional effects are antagonized by accelerated transcriptional
dynamics. However, since microRNA levels are themselves variable, one should expect the
propagation of their fluctuations to introduce additional noise. Mathematical modeling combining
the opposing effects predicts that miRNAs decrease protein expression noise for lowly expressed
genes but increase noise for highly expressed genes. Assays using our reporter system are
consistent with the model. Reporter expression has been mapped to the ESC transcriptome.
Endogenous expression of highly repressed miRNA targets belongs to the low-expression and
reduced-noise region. Thus our findings suggest that microRNAs confer precision to protein
expression in vivo and offer a plausible explanation for the preferential targeting of lowly
expressed genes.
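The noise definition above (standard deviation divided by mean) translates directly into code. The following is a minimal sketch with hypothetical function names. The ratio helper is an illustrative convenience, and a meaningful comparison would presumably require the two cell populations to be matched for expression level.

```python
from statistics import mean, pstdev


def expression_noise(values):
    """Noise as defined in the text: standard deviation divided by mean."""
    m = mean(values)
    if m <= 0:
        raise ValueError("mean expression must be positive")
    return pstdev(values) / m


def noise_ratio(regulated, unregulated):
    """Noise of the miRNA-regulated reporter relative to the control.

    Values below 1 indicate reduced noise; values above 1 indicate added noise.
    """
    return expression_noise(regulated) / expression_noise(unregulated)


toy_cells = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical fluorescence values
print(expression_noise(toy_cells))
```

Because the measure is normalized by the mean, rescaling all values by a constant factor leaves the noise unchanged.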
Chapter 5 is inspired by the ceRNA model (Salmena et al., 2011a), which states that miRNAs can
induce crosstalk between their targets through competitive binding to a limiting pool of miRNAs.
miRNA repression was found to decrease at high target expression for our reporter systems,
suggesting the possibility of miRNA-mediated crosstalk. To explore its occurrence, we used the
Lats2 3'UTR reporter system as a miRNA sponge, sorted ESCs according to decoy expression
levels, and measured the genome-wide gene expression response by RNA-Seq. No evidence of crosstalk
was found at the transcriptome level for targets of the miRNAs that regulate the Lats2 3'UTR. To
explain the lack of crosstalk, decoy expression and endogenous target site abundance (TA) were
estimated and compared with endogenous miRNA abundance. Our estimation supports a model in
which the changes in ceRNAs must begin to approach the TA of a miRNA before they can exert
a consequential effect on the repression of targets for that miRNA (Denzler et al., 2014).
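The intuition that decoys matter only once they approach the target site abundance can be sketched with a toy equilibrium model. The sketch assumes simple 1:1 binding between a single miRNA pool and a pool of equivalent sites with one dissociation constant. The model, the function name, and all numbers are illustrative assumptions and are not taken from the thesis or from Denzler et al. (2014).

```python
import math


def occupied_fraction(mirna, sites, kd):
    """Fraction of target sites bound by miRNA at equilibrium (1:1 binding).

    Solves C^2 - (M + T + Kd) * C + M * T = 0 for the complex C and returns
    C / T. All quantities share the same arbitrary units (e.g. copies per cell).
    """
    s = mirna + sites + kd
    bound = (s - math.sqrt(s * s - 4.0 * mirna * sites)) / 2.0
    return bound / sites


# Hypothetical numbers: endogenous target abundance (TA), miRNA copies, Kd.
TA, MIRNA, KD = 10_000.0, 5_000.0, 100.0
baseline = occupied_fraction(MIRNA, TA, KD)
for decoy in (100.0, 1_000.0, 10_000.0, 100_000.0):
    occ = occupied_fraction(MIRNA, TA + decoy, KD)
    print(f"decoy = {decoy:>9.0f}: occupancy relative to baseline {occ / baseline:.2f}")
```

In this toy model a decoy level far below the TA barely changes site occupancy, whereas a decoy level well above the TA strongly reduces it.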
Chapter 6 describes a novel expression pattern discovered for GFP-Lin28a3'UTR transcripts.
The CDS and 3'UTR of reporter transcripts were hybridized with smFISH probes of different colors
and imaged by microscopy. A significant number of isolated gfp transcripts lacking the downstream
Lin28a 3'UTR tail were discovered, and the co-localization percentage is expression dependent.
Below an expression threshold of 100 gfp mRNA molecules, the probability of GFP having a
co-localized Lin28a 3'UTR tail was highly variable between 0 and 1. Above the threshold, the
co-localization probability was always high. Several trivial explanations for the observed
phenomenon have been ruled out, and the co-localization behavior is miRNA dependent. The
mechanism behind this novel expression pattern remains unknown, and the possibility of
alternative polyadenylation (APA) is discussed at the end of the chapter.
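A co-localization percentage of the kind described above could be computed as in the following sketch. It assumes that spots in the two channels are given as coordinate tuples and that a CDS spot counts as co-localized when any UTR spot lies within a chosen distance. The function name, the distance threshold, and the toy coordinates are hypothetical. The detection method actually used is described in Section 6.4.2.

```python
import math


def colocalized_fraction(cds_spots, utr_spots, max_dist):
    """Fraction of CDS spots with at least one UTR spot within max_dist.

    Spots are coordinate tuples of equal dimension (e.g. x, y, z in pixels).
    This simple rule does not enforce one-to-one pairing of spots.
    """
    if not cds_spots:
        raise ValueError("no CDS spots")
    hits = sum(
        1
        for c in cds_spots
        if any(math.dist(c, u) <= max_dist for u in utr_spots)
    )
    return hits / len(cds_spots)


# Toy cell: three gfp CDS spots, two of which have a nearby 3'UTR spot.
toy_cds = [(0.0, 0.0, 0.0), (10.0, 10.0, 0.0), (20.0, 0.0, 0.0)]
toy_utr = [(0.5, 0.0, 0.0), (10.0, 10.2, 0.0)]
print(colocalized_fraction(toy_cds, toy_utr, max_dist=1.0))
```

Computing this fraction per cell and plotting it against the number of gfp spots in that cell would show the expression dependence described above, under the stated assumptions.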
Chapter 7 describes final conclusions and future directions for these projects.
1.3 References
Baek, D. (2008). The impact of microRNAs on protein output. Nature 455, 64-71.
Bartel, D.P. (2009). MicroRNAs: target recognition and regulatory functions. Cell 136, 215-233.
Bartel, D.P., and Chen, C.-Z. (2004). Micromanagers of gene expression: the potentially
widespread influence of metazoan microRNAs. Nat Rev Genet 5, 396-400.
Behm-Ansmant, I. (2006). mRNA degradation by miRNAs and GW182 requires both CCR4:NOT
deadenylase and DCP1:DCP2 decapping complexes. Genes Dev 20, 1885-1898.
Bushati, N., and Cohen, S.M. (2007). MicroRNA functions. Annu Rev Cell Dev Biol 23, 175-205.
Chang, T.C., and Mendell, J.T. (2007). microRNAs in vertebrate physiology and human disease.
Annu Rev Genomics Hum Genet 8, 215-239.
Denzler, R., Agarwal, V., Stefano, J., Bartel, David P., and Stoffel, M. (2014). Assessing the
ceRNA Hypothesis with Quantitative Measurements of miRNA and Target Abundance. Molecular
Cell 54, 766-776.
Ebert, M.S., and Sharp, P.A. (2012). Roles for microRNAs in conferring robustness to biological
processes. Cell 149, 515-524.
Eichhorn, Stephen W., Guo, H., McGeary, Sean E., Rodriguez-Mias, Ricard A., Shin, C., Baek,
D., Hsu, S.-h., Ghoshal, K., Villen, J., and Bartel, David P. (2014). mRNA Destabilization Is the
Dominant Effect of Mammalian MicroRNAs by the Time Substantial Repression Ensues.
Molecular Cell 56, 104-115.
Eulalio, A. (2007). Target-specific requirements for enhancers of decapping in miRNA-mediated
gene silencing. Genes Dev 21, 2558-2570.
Eulalio, A. (2009). Deadenylation is a widespread effect of miRNA regulation. RNA 15, 21-32.
Eulalio, A., Behm-Ansmant, I., and Izaurralde, E. (2007). P-bodies: at the crossroads of
post-transcriptional pathways. Nat Rev Mol Cell Biol 8, 9-22.
Fabian, M.R., Sonenberg, N., and Filipowicz, W. (2010). Regulation of mRNA translation and
stability by microRNAs. Annu Rev Biochem 79, 351-379.
Farh, K.K. (2005). The widespread impact of mammalian microRNAs on mRNA repression and
evolution. Science 310, 1817-1821.
Filipowicz, W., Bhattacharyya, S.N., and Sonenberg, N. (2008). Mechanisms of
post-transcriptional regulation by microRNAs: are the answers in sight? Nat Rev Genet 9, 102-114.
Friedman, R.C., Farh, K.K.H., Burge, C.B., and Bartel, D.P. (2008). Most mammalian mRNAs
are conserved targets of microRNAs. Genome Research 19, 92-105.
Guo, H., Ingolia, N.T., Weissman, J.S., and Bartel, D.P. (2010). Mammalian microRNAs
predominantly act to decrease target mRNA levels. Nature 466, 835-840.
Hendrickson, D.G. (2009). Concordant regulation of translation and mRNA abundance for
hundreds of targets of a human microRNA. PLoS Biol 7, e1000238.
Inui, M., Martello, G., and Piccolo, S. (2010). MicroRNA control of signal transduction. Nat Rev
Mol Cell Biol 11, 252-263.
Ji, Z., Lee, J.Y., Pan, Z., Jiang, B., and Tian, B. (2009). Progressive lengthening of 3' untranslated
regions of mRNAs by alternative polyadenylation during mouse embryonic development.
Proceedings of the National Academy of Sciences of the United States of America 106, 7028-7033.
Ji, Z., and Tian, B. (2009). Reprogramming of 3' Untranslated Regions of mRNAs by Alternative
Polyadenylation in Generation of Pluripotent Stem Cells from Different Cell Types. PLoS ONE 4,
e8419.
Karres, J.S., Hilgers, V., Carrera, I., Treisman, J., and Cohen, S.M. (2007). The conserved
microRNA miR-8 tunes atrophin levels to prevent neurodegeneration in Drosophila. Cell 131,
136-145.
Lewis, B.P., Burge, C.B., and Bartel, D.P. (2005). Conserved seed pairing, often flanked by
adenosines, indicates that thousands of human genes are microRNA targets. Cell 120, 15-20.
Lim, L.P. (2005). Microarray analysis shows that some microRNAs downregulate large numbers
of target mRNAs. Nature 433, 769-773.
Mayr, C., and Bartel, D.P. (2009). Widespread shortening of 3'UTRs by alternative cleavage and
polyadenylation activates oncogenes in cancer cells. Cell 138, 673.
Miska, E.A., Alvarez-Saavedra, E., Abbott, A.L., Lau, N.C., Hellman, A.B., McGonagle, S.M.,
Bartel, D.P., Ambros, V.R., and Horvitz, H.R. (2007). Most Caenorhabditis elegans microRNAs
are individually not essential for development or viability. PLoS Genet 3, e215.
Noorbakhsh, J., Lang, A.H., and Mehta, P. (2013). Intrinsic Noise of microRNA-Regulated Genes
and the ceRNA Hypothesis. PLoS ONE 8, e72676.
Olsen, P.H., and Ambros, V. (1999). The lin-4 regulatory RNA controls developmental timing in
Caenorhabditis elegans by blocking LIN-14 protein synthesis after the initiation of translation.
Dev Biol 216, 671-680.
Pasquinelli, A.E. (2012). MicroRNAs and their targets: recognition, regulation and an emerging
reciprocal relationship. Nat Rev Genet 13, 271-282.
Poy, M.N., Eliasson, L., Krutzfeldt, J., Kuwajima, S., Ma, X., MacDonald, P.E., Pfeffer, S., Tuschl,
T., Rajewsky, N., Rorsman, P., et al. (2004). A pancreatic islet-specific microRNA regulates
insulin secretion. Nature 432, 226-230.
Reinhart, B.J. (2000). The 21-nucleotide let-7 RNA regulates developmental timing in
Caenorhabditis elegans. Nature 403, 901-906.
Salmena, L., Poliseno, L., Tay, Y., Kats, L., and Pandolfi, P.P. (2011). A ceRNA hypothesis: the
Rosetta stone of a hidden RNA language? Cell 146, 353-358.
Sandberg, R., Neilson, J.R., Sarma, A., Sharp, P.A., and Burge, C.B. (2008). Proliferating Cells
Express mRNAs with Shortened 3' Untranslated Regions and Fewer MicroRNA Target Sites.
Science 320, 1643-1647.
Seggerson, K., Tang, L., and Moss, E.G. (2002). Two genetic circuits repress the Caenorhabditis
elegans heterochronic gene lin-28 after translation initiation. Dev Biol 243, 215-225.
Selbach, M. (2008). Widespread changes in protein synthesis induced by microRNAs. Nature 455,
58-63.
Stark, A., Brennecke, J., Bushati, N., Russell, R.B., and Cohen, S.M. (2005). Animal microRNAs
confer robustness to gene expression and have a significant impact on 3'UTR evolution. Cell 123,
1133-1146.
Tomari, Y., and Zamore, P.D. (2005). Perspective: machines for RNAi. Genes Dev 19, 517-529.
Vasudevan, S., Tong, Y., and Steitz, J.A. (2007). Switching from repression to activation:
microRNAs can up-regulate translation. Science 318, 1931-1934.
Chapter 2 Design and validation of microRNA activity reporter
system
2.1 Abstract
In this chapter, we introduce the microRNA activity reporter system, which serves as the
foundation for this thesis. Different miRNA regulatory elements (e.g. the 3'UTR of an endogenous
gene) are fused behind a fluorescent reporter to allow quantification of miRNA target expression
at the single cell level by flow cytometry or microscopy. Another fluorescent protein without
modifications at its 3'UTR is used as an indicator to monitor miRNA regulation at different target
expression levels. The positive correlation between indicator protein and reporter protein in
the absence of miRNA regulation has been validated for both delivery systems used in the thesis:
cotransfection and the bi-directional plasmid system. Dgcr8-/- cells are generally used to
study the global effect of miRNA loss, and can be used as a background for the miRNA-unregulated
control. Alternatively, we can selectively mutate the consequential miRNA response elements
(MREs) in a 3'UTR and abolish most miRNA regulation with minimal change to the sequence. The
mutation algorithm is described and the effectiveness of the mutated UTRs is validated. The
mutation disentangles miRNA regulation from other factors affecting expression, such as
transcript length and AU-rich elements. It also provides several additional advantages over
Dgcr8-/- cells. Finally, we introduce some possible applications of the reporter system, which
are discussed in detail in the following chapters.
2.2 Introduction
2.2.1 MicroRNA biogenesis
MicroRNAs (miRNAs) are single-stranded RNAs (ssRNAs) of ~22 nucleotides (nt) in length that
are generated from endogenous hairpin-shaped RNA molecules. MicroRNAs function as guide
molecules in post-transcriptional gene regulation by base-pairing with target mRNAs, usually
in the 3' untranslated region (UTR).
Canonical miRNA genes are transcribed by RNA polymerase II (Pol II) to generate primary
transcripts (pri-miRNAs). Pri-miRNAs are usually several kilobases long and contain local
stem-loop structures called hairpins. The initiation step (cropping) occurs in the nucleus and is
mediated by the Microprocessor complex, composed of the nuclear RNase III enzyme Drosha and
its dsRNA-binding protein partner DiGeorge syndrome critical region gene 8 (DGCR8; Pasha in
D. melanogaster and C. elegans). Cropping determines one end of the miRNA and generates
~65-nucleotide precursor miRNAs (pre-miRNAs) containing a short stem plus a 2-nt 3' overhang.
Pre-miRNAs are recognized by the nuclear export factor exportin 5 (EXP5) and transported to the
cytoplasm in a Ran-GTP-dependent manner. Upon export from the nucleus, the cytoplasmic
RNase III enzyme Dicer, together with its dsRNA-binding protein cofactors TRBP (TAR RNA-binding
protein) and/or PACT, catalyzes the second processing step (dicing) and produces ~22-nucleotide
miRNA duplexes. The duplex is separated and usually one strand (the guide strand or
miRNA) is selected as the mature miRNA, whereas the other strand (the passenger strand or
miRNA*) is degraded. Strand selection is usually determined by thermodynamic stability, although
some hairpins produce miRNAs from both strands at comparable frequencies.
Mature miRNAs are incorporated into effector complexes known as 'miRNP' (miRNA-containing
ribonucleoprotein complex) or 'miRISC' (miRNA-containing RNA-induced silencing
complex). Argonaute (AGO) is the key component of the RISC complex. In humans, all four AGO
proteins, AGO1-4, bind to miRNAs with only marginal differences in miRNA repertoire, even
though AGO2 is the only one that has endonucleolytic activity (slicer activity) and functions
in the siRNA pathway.
Apart from the canonical miRNA biogenesis pathway described above, various alternative
mechanisms can generate miRNAs. Mirtrons provide a Drosha-independent example: after
splicing from host mRNAs, the short intronic lariat is debranched and refolds into a short
stem-loop structure that resembles a pre-miRNA (Okamura et al., 2007; Ruby et al., 2007).
Conversely, the biogenesis of miR-451 does not require Dicer and instead involves the catalytic
activity of AGO2 (Cheloufi et al., 2010). Even though miRNA biogenesis can be flexible, the vast
majority of functional miRNAs still follow the canonical pathway.
2.2.2 Interpretation of miRNA mutants of embryonic stem cells
Both Dgcr8 knockout and Dicer knockout mouse ES cell lines exist and are commonly used as
models to study the effects of global miRNA loss, but neither is a perfect miRNA-null
background. Non-canonical miRNA biogenesis pathways that are independent of either Drosha-DGCR8
or Dicer-TRBP have been discovered in mammalian cells. DGCR8 is exclusively involved in the
miRNA pathway (Figure 2.1), whereas Dicer also participates in the endogenous siRNA pathway.
Conditional knockouts of Dicer and Dgcr8 induce highly overlapping phenotypes in cell
proliferation and differentiation, but the Dicer knockout phenotype is more severe
(Kanellopoulou, 2005; Murchison et al., 2005; Wang et al., 2007). This is possibly due to the
existence of a population of DGCR8-independent, Dicer-dependent small RNAs such as mirtrons,
endogenous small hairpin RNAs, and hairpin-derived short interfering RNAs (hp-siRNAs) in ESCs
(Babiarz et al., 2008). Dicer conditional knockout cells are more difficult to isolate and are
thought to require a secondary genetic or epigenetic event to grow (Murchison et al., 2005). We
chose the Dgcr8 knockout cell line (for brevity referred to as KO ESCs) for this study because of
its less severe phenotype.
[Figure 2.1: panel (a), Dgcr8 targeting schematic (genomic exons, loxP sites, Hind III sites, 5'
and 3' probes); panel (b), canonical miRNA biogenesis pathway from miRNA gene through pri-miRNA,
pre-miRNA and the miRNA:miRNA* duplex in the cytoplasm to mature miRNA within RISC.]
Figure 2.1 Illustration of Dgcr8 knockout strategy and its biological consequences.
(a) Dgcr8 knockout strategy. The Dgcr8 knockout cell line was generated from the parental V6.5
strain by removing exon 3 of the Dgcr8 gene, resulting in the formation of several premature stop
codons. (b) Canonical miRNA biogenesis pathway. The Dgcr8 knockout cell line cannot form a
functional Microprocessor complex with Drosha and therefore cannot process primary miRNAs into
precursor miRNAs in the canonical pathway. (a) is copied from (Wang et al., 2007) and
(b) is adapted from (Bartel, 2004).
2.2.3 MicroRNA profile and function in embryonic stem cells
Over one-third of mammalian genes are predicted to be directly targeted by miRNAs (Friedman
et al., 2008). Consequently, the unique combination of miRNAs in each cell type shapes its
mRNA transcriptome. Recently, miRNAs have emerged as important players in controlling
embryonic stem cell fate and behavior. Embryonic stem cells are derived from the inner cell mass
of embryos, and are known for their capacity to indefinitely maintain an undifferentiated state in
culture (self-renewal) and their potential to develop into every cell type (pluripotency). Knowledge
of how protein-coding genes are controlled by key ES cell pluripotency transcription factors
(Boyer, 2005) and chromatin modifiers (Benetti, 2008; Boyer et al., 2006; Sinkkonen, 2008) has
provided important insights into the molecular control of ES cell identity and cellular
reprogramming. Recent discoveries about miRNAs add a new dimension to the ES cell core
regulatory circuitry (Marson, 2008).
In ESCs, the miR-290-295 cluster of miRNAs (for brevity referred to as the miR-290 cluster),
expressed from a 2.2-kb polycistronic region on chromosome 7, comprises up to ~70% of total
miRNA expression (Houbaviy et al., 2003; Marson, 2008). This miRNA family is homologous to
human miR-371-373, a cluster expressed in human ESCs. miR-302-367, a miRNA cluster
conserved in both mouse and human, also shares the same seed sequence AAGUGC. This cluster
is expressed at lower levels in naïve ES cells but is upregulated in the primed ES cell state upon
differentiation (Rosa and Brivanlou, 2011). Both clusters of miRNAs are ESC specific and
diminish in differentiated cells. ChIP-sequencing data have shown that the promoter of the miR-290
family miRNAs is occupied by the core transcriptional regulators of ESCs, Oct4/Sox2/Nanog/Tcf3
(Marson, 2008). The same core transcription factors also regulate ~250 murine ES cell mRNAs
that appear to be under the control of miRNAs of the miR-290 family (Sinkkonen, 2008), thus
forming an "incoherent feed-forward" loop that can fine-tune target expression (Alon, 2007).
miR-17-92, an oncogenic miRNA cluster that promotes cell proliferation (He, 2005), is also
expressed in ESCs. It is interesting to note that the seed of miR-17-92 is shifted by only 1 nt
compared to the seed of the miR-290 family miRNAs. Transfection of miRNA mimics from either
family into Dgcr8 knockout ESCs can rescue the proliferation defect of the miRNA-null cells
and shorten the prolonged G1 phase of the cell cycle by suppressing inhibitors of the G1-S
transition. The miR-290 family miRNAs also control de novo DNA methylation via Rbl2 and other
transcriptional repressors, and repress the self-renewal program by modulating the epigenetic
status of pluripotency genes, such as Oct4, upon differentiation (Benetti, 2008; Sinkkonen, 2008).
The primary transcript of let-7 is abundant in ESCs, but its maturation is blocked by Lin28
(Viswanathan et al., 2008). Lin28 is highly expressed in ESCs and is one of the key regulators of
ESC pluripotency. Upon differentiation, however, the tug-of-war is reversed: let-7 wins and
Lin28 is repressed. Some tissue-specific miRNAs are silent in ESCs, but their promoters are
co-occupied by both the core pluripotency factors and transcriptionally repressive Polycomb group
proteins. These miRNAs are poised for quick activation upon differentiation
(Marson, 2008). Some differentiation-related miRNAs, miR-296 (Marson, 2008) and possibly
miR-134 and miR-470, belong to this class. They are expressed at low levels in ESCs but are
upregulated upon retinoic-acid-induced differentiation. The three miRNAs mentioned above also
further downregulate the pluripotency network by targeting the coding sequences of Nanog, Oct4
and Sox2 (Tay et al., 2008). Thus their double promoter occupation and low expression in ESCs
can be compared to a Trojan horse (Gangaraju and Lin, 2009).
2.2.4 Regulatory network of miRNAs and proteins in ESC proliferation and differentiation
Cyclin-dependent kinase inhibitor 1A (Cdkn1a, also known as P21), large tumor suppressor 2
(Lats2), and retinoblastoma-like 2 (Rbl2) are all confirmed targets of miR-290 family miRNAs,
and are all inhibitors of the cyclin E-CDK2 regulatory pathway (Wang et al., 2008). Thus, by
targeting the 3'UTRs of these genes, miR-290 family miRNAs control the ES cell cycle and
proliferation by promoting the transition from G1 to S phase (Wang et al., 2007). In addition to
promoting cell growth, miR-290 family miRNAs also affect cell death (Zheng et al., 2011). Lats2
and another validated target of the miR-290 cluster miRNAs, caspase 2 (Casp2), are also tumor
suppressors and can induce apoptosis following exposure to genotoxic stressors (Zheng et al.,
2011). Thus the miR-290 cluster also plays an anti-apoptotic, pro-survival role in ESCs. Rbl2 is
also known as a transcriptional repressor of the de novo DNA methyltransferases (Dnmts) Dnmt3a
and Dnmt3b (responsible for depositing repressive DNA methylation marks at gene promoters)
(Benetti, 2008; Sinkkonen, 2008). Thus miR-290 family miRNAs affect methylation and promote
differentiation of ESCs through stable silencing of pluripotency factors such as Oct4.
OCT4, SOX2, NANOG and LIN28 are four factors used to reprogram human somatic cells to
pluripotent stem cells that exhibit the essential characteristics of embryonic stem cells (Yu et al.,
2007). Their mouse homologs are also key pluripotency factors in mouse ESCs. Various
miRNAs, such as miR-134, miR-296 and miR-470, have been experimentally shown to target the
CDS of the Oct4/Sox2/Nanog trio upon differentiation (Tay et al., 2008). The negative feedback
loop between Lin28 and let-7 is important for both ES cell pluripotency and differentiation
(Viswanathan and Daley, 2010).
[Figure 2.2: network diagram. Recoverable labels: miR-134, miR-296, miR-470; pro-differentiation;
pro-pluripotency; apoptosis; de novo DNA methylation; G1-S cell cycle progression.]
Figure 2.2 Regulatory network of miRNAs and proteins in ESC proliferation and
differentiation.
Proteins are represented by ovals and miRNAs are represented by boxes. Red lines represent
activation and blue lines represent inhibition. Casp2, Lats2, P21 and Rbl2 transcripts are all targets
of the miR-290-295 cluster of miRNAs, the most abundant miRNA family in ESCs. Other regulating
miRNAs are not shown here. The Oct4-Sox2-Nanog trio of transcription factors is downregulated
by a set of miRNAs acting at the CDS during differentiation. The mutual inhibitory network
between Lin28 and let-7 is also crucial in cell-fate decisions.
2.2.5 MicroRNA target recognition mechanism
Most known miRNA-target recognition and interaction occurs in the 3' untranslated regions
(3'UTRs) of mRNA transcripts, even though miRNA targeting sites have also been identified in
the mRNA coding sequence (CDS) (Rigoutsos, 2009) and the 5'UTR (Lytle et al., 2007b).
MicroRNA target recognition relies heavily on Watson-Crick pairing to the miRNA seed
region, which is defined as positions 2-7 counting from the 5' end of a mature miRNA. Structural
studies show that the Argonaute protein pre-positions nucleotides 2-8 of the miRNA in a geometry
resembling an A-form helix, which would enhance both the affinity and specificity for matched
mRNA segments. Nucleotide 1 is twisted away from the helix and not available for target pairing,
but an A at position 1 of the site is presumably recognized directly by a protein of the silencing
complex. This is consistent with the hierarchy of site efficacy found in genome-wide analyses:
8mer (1-8) >> 7mer-m8 > 7mer-A1 >> 6mer (2-7) > no site, with the 6mer differing only slightly
from no site at all (Bartel, 2004; Bartel, 2009; Grimson, 2007). A miRNA family comprises
miRNAs with the same seed+m8 sequence (positions 2-8 of the mature miRNA). Members of the same
family are expected to regulate the same set of targets, with slight preferences based on different
pairing to the 3' end.
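The site types in this hierarchy can be illustrated with a minimal sketch that labels each seed match in a UTR as 8mer, 7mer-m8, 7mer-A1 or 6mer. The function name `classify_sites` and the example sequences are hypothetical, and the miRNA is an illustrative sequence whose positions 2-7 are the AAGUGC seed mentioned in Section 2.2.3. This is plain string matching under the definitions given in the text, not the TargetScan implementation.

```python
# Illustrative only: names and sequences are assumptions, not taken from the thesis.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Reverse complement of an RNA string written 5'->3'."""
    return "".join(COMPLEMENT[nt] for nt in reversed(rna))


def classify_sites(mirna, utr):
    """Return (start, site_type) for every seed match in utr.

    Both sequences are read 5'->3'. The 6-nt core pairs with miRNA
    positions 2-7; an extra match to position 8 lies immediately 5' of
    the core on the mRNA, and the A opposite position 1 lies immediately 3'.
    """
    mirna = mirna.upper().replace("T", "U")
    utr = utr.upper().replace("T", "U")
    if len(mirna) < 8:
        raise ValueError("miRNA must be at least 8 nt long")
    core = reverse_complement(mirna[1:7])  # pairs with positions 2-7
    m8_base = COMPLEMENT[mirna[7]]         # pairs with position 8
    sites = []
    start = utr.find(core)
    while start != -1:
        end = start + len(core)
        has_m8 = start > 0 and utr[start - 1] == m8_base
        has_a1 = end < len(utr) and utr[end] == "A"
        if has_m8 and has_a1:
            kind = "8mer"
        elif has_m8:
            kind = "7mer-m8"
        elif has_a1:
            kind = "7mer-A1"
        else:
            kind = "6mer"
        sites.append((start, kind))
        start = utr.find(core, start + 1)  # allow overlapping matches
    return sites


example_mirna = "AAAGUGCUUCCCUUUUGUGUGU"  # positions 2-7 = AAGUGC
example_utr = "CCAGCACUUACCGGGCACUUGG"
sites = classify_sites(example_mirna, example_utr)
```

Each returned pair gives the 0-based start of the 6-nt core match and its site type, so the list can be sorted or filtered by the efficacy hierarchy quoted above.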
Other factors also boost site efficacy. Supplementary 3' pairing centered on nucleotides 13-16 of
the miRNA increases target pairing efficacy, and can sometimes even compensate for a mismatch
in the seed region. Optimal targeting and repression occur when the binding site is positioned
within the 3'UTR at least 15 nt downstream of the stop codon, away from the center of long UTRs,
and in an AU-rich neighborhood. This is probably because such sites avoid interference from the
translation machinery and are more accessible. Adjacent targeting sites (within 40 nt, but
no closer than 8 nt) tend to act cooperatively, and lead to marked enhancement in repression. All
these factors are combined quantitatively into a single context score that reflects the
computationally predicted target efficacy (TargetScan 6.2). Many miRNA sites are conserved under
selective pressure. Sites that are deeply conserved tend to show stronger repression, but
non-conserved sites can also be functional (Bartel, 2004; Bartel, 2009; Farh, 2005; Friedman et
al., 2008).
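Two of the positional rules above are simple enough to sketch in code. The helper names below are hypothetical, site positions are assumed to be 0-based offsets from the first nucleotide after the stop codon, and spacing is assumed to be measured between site start coordinates with inclusive bounds. This does not reproduce the TargetScan context score; it only flags the distance rules stated in the text.

```python
from itertools import combinations


def passes_stop_codon_rule(site_start, min_offset=15):
    """True if the site starts at least min_offset nt downstream of the stop codon."""
    return site_start >= min_offset


def cooperative_pairs(site_starts, min_gap=8, max_gap=40):
    """Pairs of sites whose spacing falls in the cooperative window.

    The window (no closer than 8 nt, within 40 nt) follows the text; the
    exact coordinate convention is an assumption of this sketch.
    """
    ordered = sorted(site_starts)
    return [
        (a, b)
        for a, b in combinations(ordered, 2)
        if min_gap <= b - a <= max_gap
    ]


site_starts = [10, 15, 30, 100]
usable = [s for s in site_starts if passes_stop_codon_rule(s)]
pairs = cooperative_pairs(site_starts)
```

Keeping the two checks separate makes it easy to swap in a different coordinate convention if sites are annotated by their seed ends rather than their starts.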
2.3 Results
2.3.1 Design of two delivery systems for microRNA activity reporter
To measure miRNA regulation, the 3'UTR of an endogenous gene was inserted behind a fluorescent
protein, e.g. enhanced GFP (eGFP), which we refer to as the reporter. This allows us to directly
read out miRNA target expression at the protein level. By hybridizing reporter transcripts with
smFISH probes, we could also quantify miRNA target expression at the mRNA level. The reporter was
transiently transfected into mouse embryonic stem cells (mESCs) and measured by flow cytometry or
microscopy. Since transfection efficiency and the activity of the transcription/translation
machinery vary from cell to cell, another fluorescent protein, e.g. mCherry, was co-transfected as
an indicator. The indicator has no modifications in its 3'UTR and is devoid of miRNA regulation.
By ordering cells according to indicator expression, we could measure miRNA regulation at
different target abundances.
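The idea of ordering cells by indicator expression can be sketched as a binning step: cells are grouped by log10 indicator level, and the mean and standard error of the log10 reporter level are computed per bin. The function name and bin edges are hypothetical, and averaging in log space is an assumption of this sketch rather than a statement of the exact analysis used here.

```python
import math
from bisect import bisect_right
from statistics import mean, stdev


def binned_reporter_mean(indicator, reporter, bin_edges):
    """Summarize reporter expression within bins of indicator expression.

    indicator, reporter: per-cell fluorescence values (linear scale).
    bin_edges: ascending log10 indicator edges; bin i covers
    [bin_edges[i], bin_edges[i + 1]).
    Returns (low_edge, high_edge, mean_log10_reporter, sem) per bin that
    holds at least two cells.
    """
    bins = [[] for _ in range(len(bin_edges) - 1)]
    for ind, rep in zip(indicator, reporter):
        if ind <= 0 or rep <= 0:
            continue  # log10 is undefined for non-positive values
        i = bisect_right(bin_edges, math.log10(ind)) - 1
        if 0 <= i < len(bins):
            bins[i].append(math.log10(rep))
    summary = []
    for i, values in enumerate(bins):
        if len(values) < 2:
            continue  # SEM needs at least two cells
        sem = stdev(values) / math.sqrt(len(values))
        summary.append((bin_edges[i], bin_edges[i + 1], mean(values), sem))
    return summary


toy = binned_reporter_mean(
    indicator=[100, 100, 1000, 1000],
    reporter=[10, 1000, 100, 100],
    bin_edges=[1.5, 2.5, 3.5],
)
```

Comparing the per-bin means of a regulated reporter with those of an unregulated control at the same indicator level then gives the repression at each transfection level.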
The canonical miRNA pathway is blocked in Dgcr8-/- and Dicer-/- mESCs, which can therefore be
used as a background to measure reporter expression without miRNA repression. Alternatively, as
we will describe shortly, we can use a reporter followed by a mutated 3'UTR (MutUTR) as the
miRNA-unregulated control in wild type (WT) ESCs. MutUTR is named for the mutation of
consequential miRNA response elements (MREs), and the original 3'UTR of the endogenous gene is
referred to as OriUTR.
Two delivery systems were designed and constructed: a two-plasmid cotransfection system and a
single bi-directional plasmid system (Figure 2.3). In the two-plasmid cotransfection system,
the reporter to indicator plasmid ratio was fixed at the population level. This bulk ratio was
tunable and the system could cover a broader overall transfection range. In the bi-directional
plasmid system, by contrast, the reporter and indicator were both driven by a bidirectional
Tet-inducible promoter, and the ratio was always fixed at one at the single cell level. This
system is especially helpful for validating the cotransfection system at low transfection levels.
[Figure 2.3: schematic. Two-plasmid co-transfection system: miRNA activity reporter with Ori/Mut
3'UTR plus transfection level indicator, tunable bulk ratio. Bi-directional plasmid system:
reporter with Ori/Mut 3'UTR and indicator on one plasmid, ratio 1.]
Figure 2.3 Illustration of two delivery systems for miRNA activity reporter.
In the two-plasmid cotransfection system, the bulk reporter-to-indicator ratio is tunable and
the system can cover a broader overall transfection range. In the bi-directional plasmid system,
the reporter to indicator ratio is always fixed at one-to-one, even at the single cell level.
By arranging individual cells according to their indicator expression level, we observed that when
no UTR was fused behind the reporter, eGFP expression increased concomitantly with mCherry
expression in both systems (Figure 2.4). The variation of eGFP expression at a given mCherry
level in the cotransfection system was no more pronounced than in the bi-directional plasmid
system. Thus, even though the plasmid delivery ratio varies between single cells, the
co-transfected mCherry plasmid can be used as a transfection level indicator.
[Figure 2.4: density-colored scatterplots with x axis log10(mCherry); panels: bi-directional
plasmid and co-transfection.]
Figure 2.4 Scatterplot of GFP reporter expression versus mCherry indicator expression in
two delivery systems.
(a) pTRE-GFP-RFP was transfected into V19 mESCs and induced with 1 µg/ml doxycycline. (b)
pCAG-GFP and pCAG-RFP were co-transfected into WT mESCs. GFP expression is plotted
against mCherry expression. GFP increases on average with increasing mCherry expression. Data
were collected by flow cytometry, and 10% of the measured cells are plotted for visualization.
Color corresponds to local cell density.
2.3.2 Choice of UTRs for study
Cdkn1a, Lats2, Rbl2, and Casp2 are all experimentally validated targets of miR-290 family
miRNAs (Wang et al., 2008; Zheng et al., 2011), the most abundant miRNA family in ES cells
(Marson, 2008). They represent various aspects of miR-290 function in ESCs and have significant
biological implications. In addition to miR-290 targeting sites, these 3'UTRs also contain
targeting sites for other miRNAs, some of which have been experimentally validated in ESCs.
Thus these 3'UTRs serve as examples of natural miRNA targets for studying the combinatorial
effect of miRNA regulation, and they were expected to be strongly repressed in ESCs. The negative
feedback loop between Lin28 and let-7 is important for both ES cell pluripotency and
differentiation (Viswanathan and Daley, 2010). Lin28 serves as an example of a mildly repressed
miRNA target, owing to its high expression in ESCs and the low expression of its main regulating
miRNA, let-7 (Viswanathan et al., 2008; Taqman data in Chap. 6).
We also constructed 3'UTR reporter systems for each member of the ESC pluripotency trio
Oct4/Sox2/Nanog. No miRNA repression was observed for these UTR reporters in the ESC context
(data not shown). It is worth noting that the Nanog transcript level in Dgcr8-/- ESCs is almost
3-fold higher than in WT ESCs, and its expression in miR-295 cluster knockout ESCs lies between
those of the two cell lines (Supplementary Figure 2.2). miRNAs have been reported to target the
Nanog CDS during ESC differentiation (Tay et al., 2008). Thus it might be interesting to construct
a reporter system for the Nanog CDS in the future, to see whether miRNAs directly repress Nanog
expression through this region in WT ESCs.
The 3'UTRs of the endogenous genes chosen for the study are summarized in Figure 2.2.
2.3.3 Motivation to design MutUTR as miRNA unregulated control
In this study, we could choose either Dgcr8-/- or Dicer-/- mouse embryonic stem cells as a
"miRNA-null" background to quantify reporter expression without miRNA repression. However,
miRNA-null mutants are not available for every cell line, and sometimes such mutants cannot be
obtained at all because of viability issues. MutUTR was therefore designed as a
miRNA-unregulated control that can be used in the original cell context.
It is tempting to use a reporter protein followed by a short, miRNA-unregulated polyA tail as the
negative control for miRNA regulation, because it is easy to construct and can be used for any
UTR under study. Indeed, in previous studies, GFP followed by the rabbit beta-globin polyA tail
(RBGpA) was used as the miRNA-unregulated control (Mukherji et al., 2011). This works if the UTR
under study is also short, such as artificially constructed arrays of N consecutive miRNA
targeting sites, which are less than one hundred base pairs (bp) long (Mukherji et al., 2011). But
it does not apply to endogenous UTRs, which can be several kilobases long, for two main reasons.
First, in addition to microRNA response elements (MREs), 3'UTRs contain other regulatory
elements such as AU-rich elements and poly(A) sites (PAS). In addition, transcript length,
secondary structure, and bound RNA-binding proteins (RBPs) all affect transcript stability and
translational efficiency. Thus, even in Dgcr8 knockout cells, at the same transfection level, the
expression of GFP-RBGpA is usually much higher than that of GFP-OriUTR. This is true for both the
bi-directional (Supplementary Figure 2.3) and cotransfection (Figure 2.5) systems. If we used
GFP-RBGpA as the miRNA-unregulated expression control, we would also count contributions
from these other factors and exaggerate miRNA regulation. MutUTR differs from OriUTR only in a
subset of MREs. Other factors, such as transcript length and GC content, were preserved as much
as possible. The expression of GFP-OriUTR and GFP-MutUTR was confirmed to be the same in
Dgcr8 knockout cells (Figure 2.8 and Supplementary Figure 2.4). Thus reporter-MutUTRs
should be used instead of reporter-RBGpA as the negative control for studying miRNA-mediated
regulation of endogenous UTRs. It is worth noting that even for a miRNA-unregulated reporter,
such as GFP-RBGpA, reporter expression in wild type and knockout cell lines is not exactly
the same. This could be due to the different doubling times of WT and KO cells, and to other
factors caused by their different transcriptomes. Indeed, Dgcr8 knockout ESCs have a prolonged G1
phase compared to the wild type (Wang et al., 2008), and global miRNA knockout usually results in
a different gene expression profile, especially for miRNA targets (Lim, 2005).
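The argument that an RBGpA control exaggerates miRNA regulation can be made explicit with a short worked relation. The symbols are introduced here for illustration only: E_X denotes the mean reporter expression of construct X in WT cells at a fixed indicator level, and the sketch assumes that non-miRNA effects act equally on OriUTR and MutUTR.

```latex
\begin{align*}
R_{\mathrm{Mut}} &= \frac{E_{\mathrm{MutUTR}}}{E_{\mathrm{OriUTR}}}
  && \text{(apparent fold repression with MutUTR as control)} \\
R_{\mathrm{RBGpA}} &= \frac{E_{\mathrm{RBGpA}}}{E_{\mathrm{OriUTR}}}
  = \frac{E_{\mathrm{RBGpA}}}{E_{\mathrm{MutUTR}}} \cdot R_{\mathrm{Mut}}
  && \text{(apparent fold repression with RBGpA as control)}
\end{align*}
```

Whenever E_RBGpA exceeds E_MutUTR, as the knockout data suggest for long endogenous UTRs, the RBGpA-based estimate is inflated by the factor E_RBGpA / E_MutUTR, which reflects UTR length and other non-miRNA effects rather than miRNA repression.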
[Figure 2.5: Co-transfection. GFP expression versus log(mCherry) for WT and KO cells carrying
GFP-RBGpA, GFP-lats2UTR, or GFP-lin28UTR.]
Figure 2.5 Bar plot of GFP expression at different mCherry levels for different UTR
reporters.
pCAG-GFP followed by different 3'UTR tails was co-transfected with pCAG-mCherry into
either wild type (WT) or Dgcr8-/- (KO) ESCs. Cells were binned according to mCherry expression,
and the mean GFP expression in each bin was calculated. Error bars correspond to the standard
error of the mean.
Expression of the reporter followed by a miRNA-null target (RBGpA) or a mild target (Lin28a UTR)
was similar in WT and KO ESCs. The Lats2 UTR was strongly repressed, and reporter
expression in KO cells was much higher than in WT cells. In KO ESCs, where the difference due to
miRNA regulation has been eliminated, the expression of GFP-RBGpA is much higher
than that of both GFP-Lats2UTR and GFP-Lin28aUTR; this difference is due to factors other than
miRNAs.
Secondly, in the cotransfection system, size-dependent transfection efficiency further
amplifies the difference between GFP-RBGpA and GFP-OriUTR. Smaller plasmids are delivered into
cells more easily. In the cotransfection system, even though we can fix the bulk ratio of reporter
to indicator plasmid, reporter plasmids carrying endogenous 3'UTRs are markedly longer than
GFP-RBGpA. Thus, at the same indicator plasmid transfection level, cells on average receive more
GFP-RBGpA plasmid than GFP-UTR plasmid. We therefore cannot compare reporter expression at the
same indicator level, because the amount of reporter plasmid received differs from the start.
This size-dependent delivery efficiency in turn affects the transfection distribution of the
indicator plasmid pCAG-mCherry. The distributions of mCherry indicator expression in
cotransfection experiments were drastically different between pCAG-GFP-RBGpA and
pCAG-GFP-OriUTR, but were statistically indistinguishable between pCAG-GFP-MutUTR and
pCAG-GFP-OriUTR (Figure 2.6).
[Figure 2.6: QQ plots, panels (a) and (b). Recoverable axis labels: "GFP-RBGpA CoT, WT" and
"GFP-lin28MutUTR CoT, WT".]
Figure 2.6 QQplot of indicator protein expression from different cotransfection experiments.
Different reporter plasmids were co-transfected with indicator plasmid pCAG-mCherry, and the
distributions of mCherry protein expression were compared in the QQplot. (a) mCherry expression
in cotransfection of pCAG-GFP-OriUTR vs pCAG-GFP-RBGpA. The transfection efficiency is
vastly different with p-value = 3.34e-38. (b) mCherry expression in pCAG-GFP-OriUTR vs
pCAG-GFP-MutUTR. The transfection efficiency is statistically the same, with a p-value = 0.567.
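A QQ plot of this kind pairs matching quantiles of two samples; points on the diagonal indicate similar distributions. The sketch below is illustrative only: `qq_points` and `max_quantile_gap` are hypothetical helper names, the sample values are made up, and the code does not reproduce the p-values quoted above, whose underlying test is not specified here.

```python
from statistics import quantiles

def qq_points(sample_x, sample_y, n=10):
    """Return paired quantiles (x_q, y_q) of two samples, i.e. the points of a QQ plot."""
    qx = quantiles(sample_x, n=n, method="inclusive")
    qy = quantiles(sample_y, n=n, method="inclusive")
    return list(zip(qx, qy))

def max_quantile_gap(points):
    """Largest vertical distance of any QQ point from the diagonal y = x."""
    return max(abs(x - y) for x, y in points)

# Hypothetical log(mCherry) values from two cotransfection experiments
same = qq_points([1, 2, 3, 4, 5], [1, 2, 3, 4, 5], n=4)
shifted = qq_points([1, 2, 3, 4, 5], [2, 3, 4, 5, 6], n=4)
print(same, max_quantile_gap(same))
print(shifted, max_quantile_gap(shifted))
```

A systematic offset from the diagonal, as in the shifted example, is the pattern expected when one reporter plasmid is delivered less efficiently than the other.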
2.3.4 Design of MutUTR as miRNA unregulated control
Inspired by mutagenesis designs used to study the effect of one particular miRNA on gene expression
(Mayr et al., 2007; Melton et al., 2010; Tay et al., 2008; Wang et al., 2008; Wu and Belasco, 2005),
we carefully selected a list of miRNA targeting sites for mutation. We filtered all
computationally predicted miRNA targeting sites (TargetScanMouse v6.2) by the following
factors: miRNA site effectiveness, miRNA expression abundance in mESCs, and probability of
conserved targeting. The filtering thresholds were chosen empirically to balance the tradeoff
between effective mutation and minimal perturbation of the sequence. We mutated targeting sites
with a high probability of evolutionary conservation regardless of the other factors, because these
sites usually have high targeting efficacy and are more likely to be biologically
consequential. Their mutation can prevent potential miRNA repression in subpopulations of
spontaneously differentiating ES cells. Moreover, their mutation extends the applicability of the
mutated UTR to cell contexts other than ESCs. For each site to be mutated, we followed the general
protocol of double point mutation of the seed sequence. The above process was iterated until no
novel miRNA targeting sites were generated by mutation. The final mutated sequence (MutUTR)
was synthesized ab initio by GeneArt® Gene Synthesis. The mutation process is
summarized in the flowchart in Figure 2.7.
[Figure 2.7 flowchart: TargetScan prediction of miRNA targeting sites -> filter by targeting efficacy, miRNA expression in ESCs, and conserved targeting -> mutation of seed sequence -> final MutUTR sequence synthesized by GeneArt®]
Figure 2.7 Flowchart of MutUTR Design
2.3.5 Validation of MutUTR as miRNA unregulated control
For the chosen set of threshold parameters used in the UTR mutation algorithm, the mutated UTRs
usually maintain >95% sequence identity with their original versions, yet the MutUTRs
proved experimentally effective. For each endogenous UTR used in this study, reporter
expression was measured under four conditions: MutUTR or OriUTR was co-transfected with
indicator plasmid into WT or KO ESCs, and the reporter expression at different transfection levels
was quantified. Casp2, Lats2, and Rbl2 were shown to be greatly repressed in ES cells (Wang et
al., 2008). Correspondingly, the expression of OriUTR in WT is much lower than in the other
three conditions. Mutation abolishes the majority of the miRNA repression, and MutUTR expression
in WT cells is similar to reporter expression in KO cells (Figure 2.8). Also, unlike GFP-RBGpA, which is
expressed much higher than GFP-OriUTRs even in the absence of miRNAs due to other factors (Figure
2.5 and Supplementary Figure 2.3), MutUTR and OriUTR expression closely mimic each
other in KO ESCs. This similarity exists at both the protein and transcript levels (Supplementary
Figure 2.4). Lin28a is only mildly repressed in ESCs, and the reporter expression is similar
in all four conditions.
So far, we have demonstrated the effectiveness of mutation. Given the similarity between MutUTR
expression in WT ESCs and OriUTR expression in KO ESCs, either one could be used as a
miRNA-unregulated control. MutUTR provides several additional advantages over Dgcr8-/- ESCs. Strictly
speaking, no miRNA mutant achieves 100 percent abolishment of endogenous miRNAs, and both
Drosha-DGCR8-independent and Dicer-TRBP-independent non-canonical miRNA biogenesis pathways exist
in mammalian cells. By restricting ourselves to small perturbations of the MRE sequences,
MutUTR minimizes other experimental variations such as cell seeding densities. It also eliminates
any potential secondary effects caused by the altered transcriptome resulting from global miRNA loss
(Lim, 2005). Dgcr8-/- cells have a prolonged cell cycle compared to wild-type ESCs (Wang et
al., 2008), and the slight difference in dilution factor introduces bigger differences at longer
transfection times.
2.3.6 Applications of UTR reporter system
By transfecting UTR reporter systems into cells and measuring different combinations of variables,
we can explore various aspects of miRNA regulation. For instance, by measuring both protein and
mRNA expression of the reporter at different transfection levels, we can trace miRNA regulation
strength at both transcriptional and translational levels for different target abundances (Figure 2.9c
and Chap. 3). By studying the variation of reporter expression, we can analyze how miRNA
regulation controls target expression noise (Chap. 4). By measuring endogenous gene expression
in response to increasing amounts of transfected decoys, we can explore the possibility
of miRNA-mediated crosstalk (Figure 2.9e and Chap. 5). Finally, by labeling different regions
of transcripts, for instance CDS and 3'UTR, with smFISH probes of different colors and studying their
co-localization probability, we can discover whether there are any novel patterns of miRNA-mediated
decay on certain targets (Figure 2.9d and Chap. 6).
[Figure 2.8: four panels titled "Lats2 Protein barplot", "Casp2 Protein barplot", "Rbl2 Protein barplot", and "Lin28 Protein barplot". Legend in each panel: OriUTR, WT; MutUTR, WT; OriUTR, Dgcr8KO; MutUTR, Dgcr8KO. x-axis: log(mCherry).]
Figure 2.8 Validation of MutUTR design.
A GFP reporter followed by either the original or the mutated UTR was co-transfected into WT or Dgcr8
KO ESCs together with the indicator protein mCherry. For UTRs under strong miRNA repression (Casp2,
Lats2, and Rbl2), three of the four conditions (OriUTR in KO cells, MutUTR in WT cells, and
MutUTR in KO cells) are very similar to each other, whereas expression of GFP-OriUTR in WT ESCs
is much lower. The Lin28 UTR serves as an example of a mildly miRNA-regulated target, and all
four conditions are similar to each other.
[Figure 2.9: schematic, panels (a)-(e). Labeled species: GFP mRNA, mCherry (indicator) mRNA, endogenous mRNA (CDS and 3'UTR), decoy mRNA, miRISC, GFP protein, mCherry protein. (c) GFP mRNA + GFP protein + mCherry protein: miRNA regulation at transcriptional and translational level. (d) GFP mRNA + UTR mRNA + mCherry protein: miRNA mediated decay. (e) GFP mRNA + endogenous mRNA CDS + mCherry protein: miRNA mediated crosstalk.]
Figure 2.9 Experimental schematics.
(a) Cotransfection of reporter plasmid pCAG-d2eGFP-Ori/MutUTR and transfection level
indicator pCAG-mCherry into WT/KO ESCs. The ratio of reporter plasmid to indicator plasmid
is tunable. (b) A list of all measurable quantities; due to the overlapping spectra of fluorophores,
the number of quantities that can be measured simultaneously is limited. (c-e) Different aspects of miRNA
regulation can be studied by different combinations of measurables.
2.4 Materials and Methods
2.4.1 MutUTR design
The Perl script of TargetScanMouse v6.2 (available for download from http://www.targetscan.org/code)
was used to generate computationally predicted miRNA targeting sites (both conserved and
non-conserved) for any custom sequence. A quantitative description of each site was also extracted from
TargetScan. The context+ score reflects the effectiveness of a targeting site, and PCT stands for
probability of conserved targeting. miRNA expression data in mESCs (miRNA Frequency by
Solexa Sequencing and miRNA Microarray Expression Data) were collected from (Marson, 2008).
MicroRNA targeting sites were filtered to select those for mutation. First, the targeting miRNA had
to be definitely expressed in mESCs. We chose an expression threshold of 10 counts for
sequencing data (~0.01% of total miRNA count), and a threshold of 30 for microarray data (~ the
average of negative control). Second, the miRNA site had to be effective enough: only sites
with a context score larger in magnitude than -0.1 (i.e., more negative) were selected for mutation.
For evolutionarily highly conserved targeting sites (PCT > 0.1), we mutated the sites even if the
targeting miRNA was not expressed in ES cells or the context score was weaker than -0.1, because
their mutation could prevent potential miRNA repression in spontaneously differentiating ES cells,
which could happen to express the corresponding miRNA. Moreover, their mutation could extend the
applicability of the mutated UTR to cell contexts other than ESCs. There is a trade-off between effective abolishment
of miRNA repression and minimal perturbation of the sequence, and the above thresholds were
chosen empirically to balance the two factors. The effectiveness of each MutUTR was
experimentally validated.
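The selection rule above can be sketched in a few lines. This is an illustration under stated assumptions, not the script used in this thesis: the `Site` record and function names are hypothetical, a miRNA is treated as expressed if it passes either the sequencing or the microarray threshold (how the two were combined is an assumption), and "effective enough" is read as a context score at or below -0.1.

```python
from dataclasses import dataclass

@dataclass
class Site:
    mirna: str            # name of the targeting miRNA
    context_score: float  # TargetScan context score; more negative = stronger site
    pct: float            # probability of conserved targeting
    seq_count: float      # sequencing read count in mESCs
    array_signal: float   # microarray signal in mESCs

def is_expressed(site, min_count=10, min_array=30):
    # Assumption: passing either expression threshold counts as "expressed".
    return site.seq_count >= min_count or site.array_signal >= min_array

def select_for_mutation(sites, score_cutoff=-0.1, pct_cutoff=0.1):
    selected = []
    for s in sites:
        conserved = s.pct > pct_cutoff                # conserved sites are always mutated
        effective = s.context_score <= score_cutoff   # assumed direction of the cutoff
        if conserved or (is_expressed(s) and effective):
            selected.append(s)
    return selected

# Made-up example sites (names and values are placeholders)
sites = [
    Site("miR-A", -0.30, 0.0, 500, 100),  # expressed and effective
    Site("miR-B", -0.05, 0.0, 500, 100),  # expressed but weak
    Site("miR-C", -0.30, 0.0, 0, 0),      # effective but not expressed
    Site("miR-D", -0.02, 0.5, 0, 0),      # conserved, so mutated regardless
]
print([s.mirna for s in select_for_mutation(sites)])
```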
Two of the six seed positions (e.g. positions (2, 4), (3, 5), or (4, 6)) of each selected miRNA targeting
site were mutated. Adenine (A) was interchanged with cytosine (C), and thymine (T) was
interchanged with guanine (G). The specific position combination was chosen to maintain GC content.
If the miRNA site also involved 3' supplementary pairing, one of the 3' positions (positions 13 to
16) was also mutated (Mayr et al., 2007; Melton et al., 2010; Tay et al., 2008; Wang et al., 2008;
Wu and Belasco, 2005).
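The double point mutation can be illustrated with a short sketch. It assumes the site is given as a 6-nt DNA string and that positions are 1-based indices into that string; both are conventions chosen for this example, and the function names and the example seed match are hypothetical.

```python
# Base swaps described in the text: A <-> C and T <-> G
SWAP = {"A": "C", "C": "A", "T": "G", "G": "T"}

def gc_count(seq):
    return sum(base in "GC" for base in seq)

def mutate_pair(site, pair):
    """Swap the bases at the two 1-based positions in `pair`."""
    bases = list(site)
    for pos in pair:
        bases[pos - 1] = SWAP[bases[pos - 1]]
    return "".join(bases)

def mutate_seed(site, pairs=((2, 4), (3, 5), (4, 6))):
    """Try each candidate position pair and keep the mutant whose GC count is closest to the original."""
    candidates = [mutate_pair(site, p) for p in pairs]
    return min(candidates, key=lambda m: abs(gc_count(m) - gc_count(site)))

print(mutate_pair("GCACTT", (2, 4)))  # GAAATT: loses two G/C bases
print(mutate_seed("GCACTT"))          # GCAATG: positions (4, 6) keep the GC count unchanged
```

Because A<->C and T<->G each change GC content by one base, pairing one swap that gains a G/C with one that loses a G/C leaves the total unchanged.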
The resulting mutated sequence was used as input for another round of TargetScan miRNA site
prediction. This process was iterated until no novel miRNA targeting sites passing all mutation
thresholds were generated. The GC content of the final mutated sequence was checked to make sure it
did not change by more than 1 percent. Finally, the mutated sequence was checked for internal
restriction enzyme cutting sites (NEBcutter), and XhoI and BamHI/BglII sites (depending on which
one was compatible with the internal cutting sites) were added to the 5' and 3' ends as flanking
sequences. The final sequence was synthesized ab initio by GeneArt® Gene Synthesis and was
delivered as a plasmid bearing the designed MutUTR sequence.
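The iteration and the GC-content check can be sketched as a simple loop. This is an outline under assumptions, not the actual pipeline: `predict_sites` and `mutate_site` are hypothetical callables standing in for the TargetScan prediction step and the seed mutation step, and the 1 percent tolerance is interpreted as one percentage point of GC content.

```python
def gc_percent(seq):
    return 100.0 * sum(base in "GC" for base in seq.upper()) / len(seq)

def iterate_mutation(seq, predict_sites, mutate_site, max_rounds=20):
    """Re-predict and mutate until no site passing the thresholds remains.

    predict_sites(seq) -> list of sites still passing all thresholds
    mutate_site(seq, site) -> sequence of the same length with that site mutated
    """
    for _ in range(max_rounds):
        sites = predict_sites(seq)
        if not sites:
            return seq
        for site in sites:
            seq = mutate_site(seq, site)
    raise RuntimeError("targeting sites remain after max_rounds")

def gc_shift_ok(original, mutated, tolerance=1.0):
    """True if GC content changed by at most `tolerance` percentage points."""
    return abs(gc_percent(mutated) - gc_percent(original)) <= tolerance

# Toy stand-ins: a "site" is the start index of the motif GCACTT
def toy_predict(seq):
    return [i for i in range(len(seq) - 5) if seq[i:i + 6] == "GCACTT"]

def toy_mutate(seq, start):
    return seq[:start] + "GCAATG" + seq[start + 6:]

original = "AAGCACTTAAGCACTTAA"
mutated = iterate_mutation(original, toy_predict, toy_mutate)
print(mutated, gc_shift_ok(original, mutated))
```

Re-running the prediction after every round matters because a point mutation that destroys one seed match can create a match for a different miRNA.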
2.4.2 Plasmid construction
d2eGFP (destabilized enhanced green fluorescent protein #2) has a PEST destabilization tag on
the C-terminus of eGFP, which targets the protein for degradation and results in rapid protein turnover,
along with healthier cells under d2eGFP overexpression. d2eGFP was subcloned from the pcDNA5CMV-d2eGFP vector and ligated into the pCAGGS-RBGpA vector digested with EcoRI and
BglII to generate pCAG-d2eGFP-RBGpA. The purpose of the promoter switch was to optimize
transgene expression in ESCs.
3'UTRs of endogenous genes were PCR-amplified from mESCs genomic DNA. Flanking XhoI
and BamHI/BglII cutting sites were appended from PCR primers. PCR fragments were digested,
and ligated into XhoI and BglII double digested pCAG-d2eGFP-RBGpA backbone to generate
pCAG-d2eGFP-OriUTR-RBGpA. MutUTRs were extracted from GeneArt plasmids by either
subcloning or digestion, and ligated into XhoI/BglII double digested pCAG-d2eGFP backbone to
generate pCAG-d2eGFP-MutUTR-RBGpA.
Starting from a previously established bi-directional reporter system (Mukherji et al., 2011), eYFP
was replaced with ZsGreen1-1 (Clontech) or d2eGFP using EcoRI and NdeI digestion sites,
because eYFP was silenced in mESCs. OriUTRs and MutUTRs were subcloned from pCAG-d2eGFP-Ori/MutUTR plasmids with the desired cutting sites appended, and ligated into the pTRE-TIGHT-Bi-directional plasmid. Two color sets, pTRE-d2eGFP-mCherry and pTRE-ZsGreen-mCherry, were used in our system. ClaI and SalI/EcoRV were selected as cutting sites for insertion
of UTRs after mCherry, and BglII and XbaI were selected as cutting sites for insertion of UTRs
after ZsGreen/d2eGFP.
The bi-directional system has the reporter to indicator ratio fixed at the single-cell level. In general,
however, the cotransfection system is easier to construct, and it is more flexible because it does not
require dox induction for transgene activation. In particular, we have shown that even though the
reporter to indicator plasmid delivery ratio could deviate from the bulk ratio due to stochasticity, the
result in the low target expression region is still trustworthy.
2.4.3 PCR and sequencing primers design
OriUTRs were PCR-amplified from mESC genomic DNA, as very few mouse 3'UTRs contain
introns (Hong et al., 2006). If decoys of other regions, such as coding sequences (CDSs), had
to be constructed, the sequence might have to be amplified from cDNA instead. MutUTRs were subcloned
from GeneArt® plasmids. Restriction enzyme recognition sequences identical to the backbone digestion
sites were incorporated into PCR primers. If the UTR sequence itself contained one of these cutting sites,
another enzyme with compatible ends was added to the primer instead. The compatible pairs usually
used in this thesis were BamHI and BglII, and XbaI and NheI. All PCR primers are listed in Table
2.1.
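The enzyme choice described above can be expressed as a small check. This is a sketch rather than the procedure actually followed: the function names are hypothetical, and splitting a primer into a short 5' clamp, the recognition site, and a gene-specific annealing region is our reading of the primers in Table 2.1. The recognition sequences used are the standard ones for these enzymes.

```python
# Recognition sequences (all palindromic, so one strand is enough to scan)
SITES = {
    "XhoI": "CTCGAG",
    "BamHI": "GGATCC",
    "BglII": "AGATCT",
    "XbaI": "TCTAGA",
    "NheI": "GCTAGC",
}
# Enzyme pairs that leave compatible sticky ends
COMPATIBLE = {"BamHI": "BglII", "BglII": "BamHI", "XbaI": "NheI", "NheI": "XbaI"}

def pick_enzyme(insert, preferred):
    """Use the preferred enzyme unless the insert contains its site; then try the compatible partner."""
    seq = insert.upper()
    if SITES[preferred] not in seq:
        return preferred
    partner = COMPATIBLE.get(preferred)
    if partner is not None and SITES[partner] not in seq:
        return partner
    raise ValueError("insert contains sites for " + preferred + " and its compatible partner")

def build_primer(clamp, enzyme, annealing):
    """Concatenate 5' clamp, recognition site, and gene-specific annealing sequence."""
    return clamp + SITES[enzyme] + annealing

print(pick_enzyme("AAAGGATCCTTT", "BamHI"))  # BglII, because the insert has an internal BamHI site
print(build_primer("CCG", "XhoI", "GACTTACGCAACATCTGGGC"))
```

With the clamp `CCG` and the annealing sequence shown, the second call reproduces the Nanog 3'UTR forward primer sequence given in Table 2.1.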
Table 2.1 PCR primers
PCR primer | digestion enzyme | sequence | target size (bp)
d2eGFP_F | EcoRI | GGAATTCACCGGTCGCCACCATGGT | 878
d2eGFP_R | BglII | GAAGATCTAAATATTGGCGCTCGAGGCG |
Nanog_3UTR_F | XhoI | CCGCTCGAGGACTTACGCAACATCTGGGC | 223
Nanog_3UTR_R | BamHI | CGCGGATCCCCGACTGCTCTTCCGAAGG |
Oct4_3UTR_F | XhoI | CCGCTCGAGAGGCACCAGCCCTCCCTG | 226
Oct4_3UTR_R | BamHI | CGCGGATCCAGCTATCTACTGTGTGTCCCAGTC |
Sox2_3UTR_F | XhoI | CCGCTCGAGGGGCTGGACTGCGAACTG | 1085
Sox2_3UTR_R | BamHI | CGCGGATCCCGCTTTCAGTGTCCATATTTCAAAAATTTATTTATCTC |
Lin28a_3UTR_F | XhoI | CCGCTCGAGAGGCCCAGGAGTCAGGGTTATTC | 2775
Lin28a_3UTR_R | BamHI | CGCGGATCCCAGTACCAACTCTGGAGTACCAATAAG |
Casp2_3UTR_F | XhoI | AAGCTCGAGTGCCGCCTGCTATTCCTGC | 2018
Casp2_3UTR_R | BamHI | CGGGATCCTCAACATTTATTTGGCACCTGATGGCAATAC |
Lats2a_3UTR_F | XhoI | AAACTCGAGCGAGGAAACCCAAAATGAGATTTCTTTC | 1605
Lats2a_3UTR_R | BglII | GGAAGATCTGGCTTTAAAGTTTTAATAATAAATTGTGCCAGTAGA |
Rbl2_3UTR_F | XhoI | CCGCTCGAGGGTTAGTGTCCAGGAGGAAACTGTCTTC | 1391
Rbl2_3UTR_R | BamHI | CGCGGATCCTAAGTGCTTTATTGAAAAATACACATATTTTCATATAAAATTACAGTAGCG |
Cdkn1a_3UTR_F | XhoI | CCGCTCGAGAGTGCCCACGGGAGCC | 1329
Cdkn1a_3UTR_R | BglII | GGAAGATCTCCGAATCATCGAGAAGTATTTATTGAGCACCAGCTTTGG |
NanogCDS_F | XhoI | CCGCTCGAGTGAGTGTGGGTCTTCCTGGTCC | 918
NanogCDS_R | BamHI | CGCGGATCCTGCCCTGACTTTAAGCCCAGATGT |
Cdkn1a_3UTR_F_RFP | ClaI | CCATCGATAGTGCCCACGGGAGCC |
Cdkn1a_3UTR_R_RFP | SalI | AATAAGTCGACAATCATCGAGAAGTATTTATTGAGCACCAGCTTTGG |
Rbl2_3UTR_F_RFP | ClaI | AATAAATCGATGGTTAGTGTCCAGGAGGAAACTGTCTTC | 1391
Rbl2_3UTR_R_RFP | EcoRV | TTGGGATATCTAAGTGCTTTATTGAAAAATACACATATTCATATAAAATTACAGTAGCG |
Casp_3UTR_F_RFP | ClaI | CCATCGATTGCCGCCTGCTATTCCTGC | 2018
Casp_3UTR_R_RFP | SalI | TTATTGTCGACTCAACATTTATTTGGCACCTGATGGCAATAC |
NanogCDS_F_RFP | ClaI | CCATCGATTGAGTGTGGGTCTTCCTGGTCC | 918
NanogCDS_R_RFP | EcoRV | TTATTGATATCTGCCCTGACTTTAAGCCCAGATGT |
lats2a_3UTR_F_RFP | ClaI | CCATCGATACGAGGAAACCCAAAATGAGATTTCTTTTC | 1605
lats2a_3UTR_R_RFP | EcoRV | TTATTGATATCGGCTTTAAAGTTTTAATAATAAATTGTGCCAGTAGA |
BamHI_lats2aUTR_F | BamHI | AATAAGGATCCCGAGGAAACCCAAAATGAGATTTCTTTTC | 1605
NheI_lats2aOriUTR_R | NheI | AACAAGCTAGCGGCTTTAAAGTTTTAATAATAAATTGTGCCAGTAGA |
lats2aMut_F_RFP | ClaI | CCATCGATACGAGGAAACCCAAAATGAGATTTCTTTTC | 1605
lats2aMut_R_RFP | EcoRV | TTATTGATATCTAGGCTTTAAAGTTTTAATAATAAATTGTGCCAGTAG |
BamHI_lats2aMutUTR_F | BamHI | Use BamHI_lats2aMutUTR_F | 1605
NheI_lats2aMutUTR_R | NheI | AACATGCTAGCTAGGCTTTAAAGTTTTAATAATAAATTGTGCCAGTAG |
Sequencing primers were designed for verification of DNA sequences. They were placed ~50
bases upstream of the sequence to be validated, and their design followed primer
considerations such as length and melting temperature listed by the MIT sequencing facility
(http://web.mit.edu/biopolymers/www/DNA.html). All sequencing primers are listed in Table
2.2.
Table 2.2 Sequencing primers
sequencing primer | sequence | description
pCAGGS4 | GCTCTAGAGCCTCTGCTAAC | located on pCAG backbone, for sequencing d2eGFP
pCAGGS20_d2eGFP | ATGTCTTGTGCCCAGGAGAG | located on d2eGFP, for sequencing what's after d2eGFP
pCAGGS19_bPolyA | CAGCATATGGGCATATGTTGCC | located on pCAG backbone, for sequencing what's before RBGpA
mCh2MREpA | CGTGGAACAGTACGAACGC | located on pTRE backbone, for sequencing what's after mCherry
pA2YFPside | AGTCAGTGAGCGAGGAAGCT | located on pTRE backbone, for sequencing what's before pA on the ZsGreen/GFP side
Rbl2Seq1, Rbl2SeqMut1 | CTTCCCCAGTAGGTACTGTAC; GATACTGATCCCTTGAGCACTC |
p21Seq1 | GTGATCTGCTGCTCTTTTCCC |
p21SeqMut1 | CCTCTATTTTGGAGGGTTAATCTGG |
lats2Seq1 | GGATATGACTGAGTTCTTCGGG |
CaspSeq1 | GTCTGTATGCCATGACACTGG |
CaspSeq2 | TCTGGTGATGTCATTCTCTTGC |
CaspSeq3 | GGAAGAGGGCATTTGGATTTCTC |
CaspSeqMut3 | GAAGAGGTCCTTTGGCTGTATC |
plin28_1 | CCTGCACTGTGTTCTCAGGTAC |
plin28_2 | TCTCTCGACCTAAGGGTGACAG |
plin28_3 | ATAACCCTGTCCTTTGGTGCTG |
2.4.4 Molecular cloning materials and kits
KOD Hot Start Master Mix (Novagen) was used for PCR amplification. The QIAquick® PCR
Purification Kit (Qiagen) was used for purification of PCR fragments after enzymatic digestion or
general cleanup of DNA. The QIAquick® Gel Extraction Kit (Qiagen) was used for extraction of DNA
fragments from polyacrylamide gels. All restriction enzymes and endonuclease buffers were
ordered from NEB. The Quick Ligation™ Kit (NEB) was used for quick ligation of DNA after
digestion. Plasmid DNA was transformed into One Shot® TOP10F' Chemically Competent E. coli
cells (Invitrogen C303003) and cultured overnight in LB. The QIAprep® Spin Miniprep Kit (Qiagen)
was used for small-scale (from 5 ml LB) plasmid DNA preparation from pelleted bacteria,
whereas the HiSpeed® Plasmid Maxi Kit (Qiagen) was used for large-scale (from 200 ml LB)
plasmid preparation. The DNeasy® Blood & Tissue Kit (Qiagen) was used for total DNA extraction
from cultured mESCs. 2×10^6 cells were used as starting material to prevent overloading of the silica-based spin column.
2.4.5 smFISH probe design
smFISH probes were designed using a custom algorithm [now publicly available at
https://www.biosearchtech.com/stellarisdesigner/ ] to locate 20-nucleotide oligonucleotide regions with
35-65% GC content in the cDNA of the gene of interest, with a separation of at least 2 nucleotides
between adjacent probes. Candidate probes were then subjected to BLAST (Basic Local
Alignment Search Tool) analysis against the entire genome, and those with significant alternative
targets were removed from the selection. Oligonucleotide probes were then coupled to the
fluorophores tetramethylrhodamine (TMR) (Invitrogen), Alexa 594 (Invitrogen), or Cy5 (GE
Amersham), chosen strategically to allow simultaneous visualization of different transcripts. The
concentration of coupled probes was measured using a Nanodrop, and probes were diluted to a working
concentration of ~1 ng/µl during FISH hybridization.
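The GC and spacing constraints can be illustrated with a simple greedy scan. This is a stand-in written for this example, not the custom algorithm or the Stellaris designer: the function names are hypothetical, probes are assumed to be 20 nt long and reported as the reverse complement of the target window, and the BLAST specificity filter is omitted.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def gc_fraction(seq):
    return sum(base in "GC" for base in seq) / len(seq)

def design_probes(target, probe_len=20, spacing=2, gc_range=(0.35, 0.65), max_probes=48):
    """Greedy left-to-right tiling: accept a window if its GC fraction is in range,
    then skip at least `spacing` nucleotides before the next probe."""
    target = target.upper()
    probes = []
    start = 0
    while start + probe_len <= len(target) and len(probes) < max_probes:
        window = target[start:start + probe_len]
        if gc_range[0] <= gc_fraction(window) <= gc_range[1]:
            antisense = window.translate(COMPLEMENT)[::-1]  # probe hybridizes to the mRNA
            probes.append((start, antisense))
            start += probe_len + spacing
        else:
            start += 1  # slide by one nucleotide and try again
    return probes

# Toy 44-nt target with 50% GC everywhere
for start, probe in design_probes("ACGT" * 11):
    print(start, probe)
```

A greedy scan does not maximize the number of probes on GC-poor sequences, which is one reason short or AT-rich 3'UTRs can yield fewer probes than desired.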
For each 3'UTR of choice, we designed FISH probes against its coding sequence (CDS) and
3'UTR separately. We aimed to design 48 probes against each region, but sometimes, due to
limited sequence length and low GC content in the 3'UTR region, as few as 36 probes were
designed. Due to the high sequence similarity between the original and mutated versions of each 3'UTR,
FISH probes designed against OriUTRs were confirmed to also work for their mutated versions.
FISH probes for the transcripts of the fluorescent proteins d2eGFP and mCherry were also designed, and
were labeled with Cy5 for fluorescence compatibility in flow cytometry experiments. The
sequences of the probes are available upon request.
2.4.6 Cell lines
V6.5 mouse embryonic stem cells (WT mESCs) were derived from the inner cell mass (ICM) of a
3.5-day-old male mouse embryo from a C57BL/6 × 129/sv cross background. The Dgcr8-/- ES cell line
is a gift from Robert Blelloch's lab. The cell line was derived from the V6.5 strain, but is incapable
of forming a functional microprocessor and producing canonical miRNAs. The V19 ES cell line is a gift
from Laurie Boyer's lab. This cell line was also derived from the V6.5 background, but it contains a
reverse tetracycline trans-activator (M2rtTA) driven by the Rosa26 promoter, and is able to activate
the expression of Tet-On promoter driven genes under doxycycline induction. The Dicer-/- ES cell line
and the miR-290 cluster knockout ES cell line are gifts from Phil Sharp's lab.
2.4.7 Cell culturing
All tissue culture plates were gelatin-coated for maximum ES cell attachment. A 0.2% gelatin
solution was made by dissolving gelatin powder (Sigma G1890) in 1xPBS (GIBCO 14190);
it was autoclaved, filter-sterilized, and stored at 4°C. To make gelatinized plates, we added
enough 0.2% gelatin solution to cover the plate surface, incubated it for at least 10
min at room temperature, and then removed the solution.
mESCs were all cultured on gelatinized cell culture plates in standard mESC media. During the
4 hours of transient transfection, all cell lines were grown in mESC media without the antibiotics
Penicillin-Streptomycin.
Table 2.3 mESCs media
Component | Purchase number | Quantity (500 ml) | Final concentration
Knockout DMEM | GIBCO 10829 | 410 ml |
FBS (heat-inactivated) | Hyclone SH30070.03EH | 75 ml | 15%
L-glutamine | GIBCO 25030 (200 mM) | 5 ml | 2 mM
Penicillin-Streptomycin | GIBCO 15140 (10,000 U/mL) | 5 ml | 100 U/ml Pen, 100 µg/ml Strep
MEM Non-Essential Amino Acids | GIBCO 11140 (100x) | 5 ml | 100 µM
β-Mercaptoethanol | Sigma M6250 | 4 µl | 0.1 mM
Leukemia Inhibitory Factor (LIF) | MILLIPORE ESG1107 | 50 µl | 10^3 U/ml
The ES cells were seeded at a density of 2M per 10cm plate, and were passaged every two days
at a ratio of 1:6 to 1:10. ESCs were detached with 0.25% Trypsin supplemented with 0.53 mM
EDTA.
To facilitate the maintenance of undifferentiated ESCs, γ-irradiated MEF cells (GSC-6202G) were
plated as feeder layers one day before the plating of ESCs. MEF cells were seeded at a density of
2M per 10cm plate. The MEF media was the same as standard mESC media except that it was
not supplemented with LIF and the FBS concentration was reduced to 10%. The MEF feeder layer
was only used for maintenance and passage of ESCs, and it was removed by differential
sedimentation before the transfection experiment to reduce contamination by MEF cells.
2.4.8 Transient transfection of plasmids and dox induction
One day before the transfection, ESCs were plated at a density of 1.4M per 60mm plate, i.e.
5e4/cm^2. The cells were transfected about 18 hours later at a confluence of 80%. For each
transfection sample in 60mm plate format, 20 µl Lipofectamine 2000 (Invitrogen 11668) was
mixed with 8 µg plasmid in 1 ml Opti-MEM I Reduced Serum Medium (Invitrogen 31985)
according to the Lipofectamine 2000 plasmid DNA transfection protocol. In the cotransfection system,
7 µg reporter plasmid and 1 µg indicator plasmid were mixed and added. In V19 ESC
transfections, doxycycline (Sigma-Aldrich D9891) was added to the media right after transfection
at a concentration of 1 µg/ml. Cells were cultured in mESC media without antibiotics for 4 hours
during the transfection, and were then changed back to standard mESC media. Transfected cells
were passaged the next day at a ratio of 1 to 3, from each 60mm plate into one 10cm plate. Cells
were harvested for downstream experiments 48 hours after the start of transfection.
2.4.9 Cell fixation and hybridization
We performed smFISH as previously described (Raj et al., 2006). Cells harvested from a 10cm plate
were fixed with 1 ml fixation buffer (3.7% para-formaldehyde in 1xPBS) at room temperature for
10 minutes, washed twice with 1xPBS, and permeabilized in 1 ml 70% ethanol at 4°C at least
overnight. For hybridization, the samples were resuspended in 100 µl of hybridization solution
containing labeled DNA probes in 2xSSC, 1 mg/ml BSA, 10 mM VRC, 0.5 mg/ml Escherichia coli
tRNA, 0.1 g/ml dextran sulfate, and 25% formamide, and incubated overnight at 30°C. Optimal
probe concentrations during hybridization were determined empirically, and the working
concentrations were usually around 1 ng/µl. The next day, the samples were washed twice by
incubating in 1 ml of wash solution consisting of 25% formamide and 2xSSC at 30°C for 30
minutes. The cells were then transferred to ~1 ml FACS buffer for flow cytometry or ~100 µl glox
buffer for microscopy imaging.
2.4.10 Flow cytometry
Cells were resuspended in ~1 ml FACS buffer (1xPBS supplemented with 2% RNase-free BSA
(NEB B9000S)) to reduce cell stickiness to the container. Samples were filtered through the strainer cap
of polystyrene test tubes (Falcon #352235) to reduce clumps. Cells were assayed on an LSRII
analyzer (BD Biosciences) with FACSDiva software. Single cells were gated away from cell
clusters and debris according to their forward scatter (FSC) and side scatter (SSC) profiles. eGFP
and ZsGreen protein intensities were read from the FITC channel, mCherry protein intensity was read
from the PE-TxRed channel, and Cy5 intensity (corresponding to labeled transcripts) was read from
the APC channel.
2.4.11 Microscopy Imaging and image analysis
We counted the mRNAs in individual cells as described previously (Raj et al., 2006).
Briefly, the samples were resuspended in glucose oxidase anti-fade solution, which
contains 10 mM Tris (pH 7.5), 2xSSC, and 0.4% glucose, supplemented with glucose oxidase and
catalase. Then 3 µl of cell suspension was sandwiched between two coverglasses and mounted on
a glass slide using a silicone gasket. Images were taken with a Nikon TE2000 inverted
fluorescence microscope equipped with a 100x oil-immersion objective and a Princeton
Instruments camera using MetaMorph software. Stacks of images were taken automatically with
0.35 microns between the z-slices.
To segment the cells, a marker-guided watershed algorithm was used. Briefly, cell boundaries were
obtained by running an edge detection algorithm on the bright-field image of the cells. To generate
markers, the centroid of the region enclosed by each individual cell boundary was computed. A
marker-guided watershed algorithm was then run on the distance transformation of the cell
boundaries, using the markers located within the cell boundaries. The resulting cell segmentation
image was then manually curated for mis-segmentations. A manual segmentation method was used
as a supplement, with GFP mRNA images serving as a reference for manual drawing of cell
boundary polygons. To quantify the number of RNA molecules in each cell, a LoG filter was run
over each optical slice of an image stack to enhance signals. A threshold was applied to the resulting
image stack to pick up mRNA spots. The location of each mRNA spot was then taken to be the
pixel with the regional maximum value in each connected region. The number of mRNA spots located
within the cell boundaries of an individual cell was thus quantified.
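The thresholding and spot-counting steps can be sketched in pure Python on a single 2D slice. This is a simplified illustration, not the image analysis code used here: the function names are hypothetical, the input is assumed to be already filtered, 8-connectivity is an assumption, and the real analysis operates on 3D image stacks.

```python
def find_spots(image, threshold):
    """Return one (row, col) per connected above-threshold region, at its brightest pixel."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    spots = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] <= threshold:
                continue
            # Flood-fill the connected region starting at (r, c)
            stack = [(r, c)]
            seen[r][c] = True
            best = (r, c)
            while stack:
                y, x = stack.pop()
                if image[y][x] > image[best[0]][best[1]]:
                    best = (y, x)  # track the regional maximum
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and not seen[ny][nx] and image[ny][nx] > threshold):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
            spots.append(best)
    return spots

def count_spots_in_cell(spots, cell_mask):
    """Count spots whose location falls inside a boolean cell mask."""
    return sum(1 for y, x in spots if cell_mask[y][x])

# Toy filtered image with two bright regions
image = [
    [0, 0, 0, 0, 0],
    [0, 5, 9, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 7],
]
mask = [[col < 3 for col in range(5)] for _ in range(4)]  # cell covers the left three columns
spots = find_spots(image, threshold=3)
print(spots, count_spots_in_cell(spots, mask))
```

Reducing each connected region to a single pixel is what turns a diffraction-limited blob into one counted transcript.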
2.5 Supplementary
[Supplementary Figure 2.1: "Quantitative measurement of mRNA using smFISH". Upper schematic: 48 20-mer probes, each coupled to a fluorophore, bound to the target mRNA. Lower panels: max-Z projection of raw image; computationally detected mRNA dots.]
Supplementary Figure 2.1 Illustration of smFISH and image analysis.
Above: illustration of the smFISH technique. The smFISH method probes each mRNA species with 30 or more
short, singly labeled oligonucleotide probes, each about 20 nucleotides long. Simultaneous binding
of a probe set, which typically consists of 48 different oligonucleotide probes, to each mRNA molecule
results in a diffraction-limited fluorescent spot under a fluorescence microscope.
Below: analysis of mRNA spots. The left panel is a fluorescence maximum Z-projection image showing
Oct4 transcripts in WT ESCs. The right panel is the processed image showing each individual mRNA
transcript as a single bright pixel. Cells were segmented using bright-field images, and cell boundaries
are shown as red polygons.
[Supplementary Figure 2.2: histograms of smFISH transcript counts per cell; panel titles and axis values are not recoverable apart from "Sox2".]
Supplementary Figure 2.2 Transcript levels of ES cell core transcription factors Oct4, Sox2
and Nanog in various ES cell lines.
Transcript levels of Pou5f1 (encodes Oct4), Sox2 and Nanog were measured with smFISH for
Dicer knockout ESCs (first row, in red), miR-290-295 cluster knockout ESCs (second row, in
black), and wild-type ESCs (third row, in blue). Pou5f1 and Sox2 expression is similar in the
three cell lines. The Nanog transcript level in Dicer-/- ESCs is almost 3-fold higher than in WT ESCs.
The miR-290-295 cluster and other miRNAs affect Nanog transcript expression; whether this effect is
direct or not needs further study.
39
-
6
bi-directional plasmid protein barplot
5--pTRE-GFPlin28UTR-RFP, KO
pTRE-GFP-RFP, KO
-
5
C
* 4.5-
0 40.
LL
03.50
2.52
1.5
2
3
4
log1 0(mCherry)
5
Supplementary Figure 2.3 Bar plot of GFP expression at different mCherry levels in
different bi-directional systems.
Bi-directional plasmids were transfected into Dgcr8 knockout ESCs. The GFP expression from
pTRE-GFPlin28aOriUTR-RFP differs from that of pTRE-GFPRBGpA-RFP. This difference is independent
of miRNA, and comes from factors such as 3'UTR length, RNA binding proteins, etc.
[Supplementary Figure 2.4: six panels showing protein and mRNA barplots for Casp2, Rbl2, and Lats2. Legend in each panel: OriUTR, Dgcr8 KO; MutUTR, Dgcr8 KO. x-axis: log(mCherry).]
Supplementary Figure 2.4 Bar plot of GFP protein and mRNA at different mCherry levels
for Dgcr8-/- ESCs transfection.
Unlike GFP-RBGpA, whose expression is much higher than that of GFP-OriUTRs even in Dgcr8-/-
ESCs, GFP-MutUTR closely mimics the expression of its original counterpart, at both protein
and mRNA levels, in the miRNA null background. The mutation of MREs barely perturbs other
factors affecting expression, and MutUTR should be used instead of GFP-RBGpA as the
miRNA-unregulated control in wild-type cell lines.
Supplementary sequence information
MutUTR sequence for GeneArt® ordering
>XhoI-Casp2UTRMut-BamHI
CTCGAGTGCCGCCTGCTATTCCGGATGTTGGAGGCCACTGGACCACTGGGGGCACAAGGTAGACTTCTC
TTCAGAATGGTTTTTGTTCTGTATCCCCTCTAATGGATATGAGATTCTCCCAGGCTTGTTTCCTGTCAGCC
ATCTCTGTCTTTGGGTATGAAACATAAGGATGGCTCCTCCGGTGTCGTGTTCTCGAACTATAGAGCCAGC
TCTGAATGGATGTGTTACCAGAAGCATTTTAGCTACAGCCTAGAAAATGACATTTTTAACACATTCTTATT
GTGGGAAGAGGTCCTTTGGCTGTATCAATGTTGGGGATATTTTTGTTCCCAAGGCATCTTAGGAGTACTT
GGATCATAGCTTTTTTTTTTTTTCCTAAATCAGTTAAGGAGTCTCAGAGATCCTATCCTTTTTTTTCCATATC
TACACCATCATTTTTCCCACAGTGGAGATTTGGAAGATGTCCCAATTTAATGTAGGTGTTTTCATCTGTCA
TTACGGGACAGATGAGATCCTACTACTTGCGAAGTTTCTATGCATACCTTTAAGTTCAGGCCCTAGGTTA
CGGACAGTCCCTCAGCCTTTCCATTGGTTCCTTTGTGTTCAGTGCACCCAGCCTTTGAACAGAGCCTAGG
GTCTGTATGCCATGACACTGGAAGTCATAGAAATTTCCCTGGTCATGCTTTGTTTGAACTTTAACTGAATG
AACCTTATCGGGCATAACGAAATGAAAATGCAGTGACAGCTGAGTGTGCTGTGTCTCACACTATCACCC
GTCATCAGGATGTCGCGCCTTCCTTACTGTGGCTTCTGCATGCCCGTACACTGTACTTGACGGCTGGCCT
CCAGGGTCTCTCTTGCTTTGTACTGGTTCCCCTCTTTACCTTCACCATTCGATTCTTCTGCCAAGTCTGTGA
AGCCGTCCTTTGTAGGATTTGTCTTGCCACTTACGCTGTCCGGTAGTTGCTTATTCTTTCTGCCTTCTGCTT
CAGCGTGAGGCTTCTTTGGTTTTCTGTGGCAGCGTCTCCCTTCTCATTGTTTCTCTGTGTTTTAGTGGGGA
TAGTACCATATGTGATATAACCTAGAAGAAATTGTCTCTGCTCTTATGAAACTTGCTTATTCTTGAAAACC
TTCTGCATTTCCATTTTTTCCTCTCGTACAATTTATTCTCCATGTAACAGAGTAGTTTGGTTTTTAAAATATC
TGGTGATGTCATTCTCTTGCTTAGAACACTAGCTTCCTGTTACGCTTCATCTAAAATGCAAATTCTTACACC
CAGCTTACGAGATCTGGCTCATACCTTCCCTTTGGATCTCATTAAATGGTGATGTATCACTATGCTCCAGC
CCCTCTTAGGTCCTCCTATCCGTCTTGCAGGTGTTCTGAACTCTCCTTTGGCTAGTCTCTGATTTTTGAGTC
TGGCGGAGGCCTCTTGACCATTCGGCCCATGCTGTCTACTGTGCCTCCTTATGAGGGCATCATGTTGGTC
TCTGTTGTGCTTACTGCAGGCTGTAATGGCCCGTTTGCTTGTGTAACTTGTTCCCTCTGAGGCTGAATGCT
CCAAGAGAGTGGGAACTGTGCTTCTTACTTACTGATATCCAGTAACTGGCCCGTACTAGGTCTTCATGCA
GGTTTCCTGAGTAAAGGAAGGAGACCAGCATCGAACCTTAGTTAGAGCCTACCTTTTGCAGTTTCTAAAT
TGCTATTATAGTGTACAGTTCAATTAGTATATGGGTTTTTTTTTCCAGGTGTTTTATTTTTATCCACTGTTTT
GTTGTTGT1TTTTTATATTTTCTAAAGATCACGTTTTAGAAACCTTCTTTCACATCTCCATAGTGCCCAGCA
AATTTGAGGCCTATGGTAGTTGAGGTGCTCACCGAATGTGTTTTGTATGAACCAAGTGGTTTGAAGACTT
GCTCCAACATTCTGCCTTTTGGGTCAGTATAGGCTTCATAAGTGGTAGAATCTTCACACTTCCCACGGAC
AAGATTTTGTATTGCCATCAGGGTACCAATAAATGTTGATGGATCC
>XhoI-Cdkn1aUTRMut-BglII
CTCGAGAGTGCCCACGTGCGCACAGCCCTCTTCTGCTGTGGGTCAGGAGGCCTCTTCCCCATCTTCGGCC
TTAGCCCTCACTCTGTGTGGCGTAATTATTATTTGTGTTTTAATTTAAACGTCTCCTGTATATACGCTGCCT
GCCCTCTCCCAGTCTCCAAACTTAAAGTTATTTAAAAAAAGACCCAAACCACACAAAAAAAACCACACCA
CACCAAACCTAAATTAGTAGGACGGTAGGGCCCTTAGTGTGGGGGATTTCTATTATGTAGATTATTATTA
TTTAAGCCCCTCCCAACCCAAGCTCTGTGTTTCCTATACCGGAGGAACAGTCCTACTGATATCAACCCATC
TGCATCCGTTTCACCCAACCCCCCTCCCCCCATTCCCTGCCTGGTTCGTTGCCACTTCTTACCTGGGGGTG
ATCCTCAGACCTGAATAGAAATTTGGAAAAATGAGTAGGACTTTGGGGTCTCCTTGTCACCTCTAAGGCC
AGCTAGGATGACAGTGAAGCATGAACAGCCTAGAACAGGGATGGCAGTTAGGACTCAACCGTAATATC
CCGACTCTTGACATTGCTCAGACCTGTGAAGACAGGAATGGTCACAACTCTGGATCCCCTTTGCCACTCC
TGGGGAGCCCACCTCTCCTGTGGGTCTCTG CCAGCTGCCCCTCTATTTTGGAGGGTTAATCTGGTGATCT
GATTCTCTTTTCCCCCACCCCATACTTCCCCTTCTGCAGGTCGGCAGGAGGCATATCTAGGAAATTGCCAC
CCAGCTCAGTGGACTGGACGTGCATGTATATGCAGGGTACACTAAGTGGGATTCCCTGGTCTTACCTTA
GGAATCTCCAGTGGCAACCCCCTGCATTGTGGGTCTAGGGTGGGTCCTTGGTGGTGAGACAGGCCTCCA
ATAGCATTCTATGGGGGGTGGTGGTGGGGGTGGGCTTATCTGGGATGGGGACCCCAGTTGGGGTTCTC
AGTGACTTCTCCCATTTCTTAGTAGCAGTTGTACAAGGAGCCAGGCCAAGATGGTGTCTTGGGGGCTAA
GGGAGCTCACAGGACACTGAGCAATGG CTGATCCTTTCTCATTTTTGAATACCGTGGGTGTCAAAGAAA
TTAGTGGGTCTGACTCCAGCCCCAAACATCCCTGTTTCTGTAACATCCTGGTCTGGACTGTATCCCCTTAG
CCCGCACCCCAAGAACATGTATTGTGGCTCCCTCCCTGTCTCCACTCAGATTGTAAGCGTCTCACGAGAA
GGGACAGCACCCTGCATTGTCCCGAGTCCTCACACCCGACCCCAAAGCTTGGGCTCAATAAATACTTCTC
GATGATTAGATCT
>XhoI-lats2aUTRMut-BglII
CTCGAGCGAGGAAACCCAAAATGAGATTTCTTTTCAGAAGACAAACTCAAGCTTAGGAAGCATTCATTTT
TAGTTCTGGTAAATGGGCAACAGGAAGAGTCAACATGATGTAAAATTAGCCCTCTGAGGACCTTCACTG
AAGTAAAACATACT1TTTAAAAAATTAGTACAGTATGGACAGATCCCTTATTTTGTGGATACCCATCTTTT
TCTTACTAAATTATAAGGACTGACGGGGAGAAACCATGATTCTGTTATTTCCATGTGTGTTGTATCGGCT
AGAAATTGTCCACAGCTAGAAAAGGAAGAGCTGGAGAGCGTGAGGCAAGACGTCTGTTCCATAAGAGA
GGATGAGGCGACGGAGCTCTGCTCAAGTCACGAGGACCGCTTATCTACACAGTGGCTTAGTTTTGTATT
TTCGCACATGTAAAATTGTGATGTAATGTTTGAAAGCTGCTTTTGTATTTTCTCCTTTTCCTATTATAGTTC
CTAGAAAGAGTGAGCAGAGAGCTGGTGGGTGTGACTCCGGTGTCTGGTGTGGAGAGTACTGCATGAGC
AGGGGTTTCTAGTATAAAATACCGTATCGTTCCATTCACATCCGGTCCTTTTAATACGTTTTTAAATGAGG
TATTCTAGACAGTGTGCTTAGATTGTATTGTGTGGATGTGTGTTAAAGAAATGCATATGTATAACTGAAG
TGTGAT1TTTTTTTTAATGTGTGTGTGTCTTGGATATGACTGAGTTCTTCGGGAGGCAAATGTAAACATTTG
TCATATATAAAACACATCAAACGTGATTAAGTCAGCTTTTCAAAAACATTGACATAATTCTAGCGTTTTGT
CCATTTCCGTAGTCCTGTCTGCTGTCAGGTGTGGCTGTGGGAGCTGGACCCTGCTATTCATTCTTTCACTC
ACAGGGTTCCCG CAGACCTAGGTGATGTAAGGGTCCTGCTTCCTGTGTTCTCAGCCAACCAGGAGGTTC
TTTTAAACCCAGTTCTTTGGGCCTCTTTCACATGAGAGGTGTCTTTAACATCTCAATGTGAATGAATACGT
TTTTCTAACTTTGTAAAAAGAAAAAAAGATTCTTTGAAGCAACATTGGAGTACAAAAACAATCAATACTT
TTTTCTTAGACATATAGGGGGGTATATAACTATAGATAAACACACAAAATAGTCCTTATGTAAAATTAGT
ACGCTTCCTACTTAAGGTGATTTATATTTGAGTACATTCAGTTTCTTTTGCTTTTCAAGGATGGAACACAT
CCCATTTTCATTATGCTATGACCAATCTTCTCACCAAGGTTCTTAGCACAGTGCACCCGTTACTTAGGAGT
ATCTAGGCAGAAACACTTACAAATTTATCGAGGTCTAAGAAACCTGCCTGTGTCTGGTGTCCATTTGTAT
GAATGGCATATTCTGAAGTCTGCTGTGCTGGGATTGTTAATTACATTCTTCTCGCTATTTTGTAGTAATGC
CGTGTTATTTACAGCGCTCTGACATAGTTTGATGTGGTAGGTTCTTTCTCAGGAACTCAATTTAACTATTA
TTTATTGATATATCATTGCCTTTGAAAGCTTCTACTGGCACAATTTATTATTAAAACTTTAAAGCCTAAGAT
CT
>XhoI-Rbl2UTRMut-BglII
CTCGAGGGTTAGTGTCCAGGAGGAAACTGTCTTCACATGAACTGGTTATCCGGACTTAATG CATGCAGG
ACTACGGAACCTTGCTCCTGAATCCAGCAACTGATTAAAGGAGGGGATAAAAGGGAAGCGCTTCTGACT
AATTGTGGCAGCAAATGCCTGGTATCCCATCACCCAAGGGGTAGGGGACAAGAGGACCAGGAGTTTAA
GGCCAACCTGAGCAATACAGCATGTCTGAGGGCAACTTGGACGAAATGTAACCCTTACTCAAAAACAAA
GAACCGGAAGGGATGTTTTGGTAAGAAATCAGACTTATCTCACTGTCCTTTGACTATTTTTCATCCCAGTT
GCCTCTTCCTCTACTTAGTGCTTACCTTCAACACGGCTCAGAATCCAAACTTGGGGTTTTGAACTCTGGCA
AACTTTTACAAGTACTGCAGGAAGCAAATCTTTAGAGGCTTTTGTAGGTAGGCCCCAGGAGAGGAACTG
TATTTAACTTCATTTCCACGTTCATATGGTTAGGTCCAACCATGTGTTTTAGGATGAAAACCAATAGACAT
TTACAAACAGAACAAGAGGGGCTGGCCCCGACCTGGAAGTGTCCAGGCCTTGGCCTAAAGATACTGAT
CCCTTGAGCACTCACTCTCCCTTCCCCAGTAGGTCCGGTACAGTTTTAAAGCGTTCCATGTCTGAAGGAA
CCTGTGTAATTGGTGGCCCGTTATGGCTGTAAGATGCATAGCATTGTGACCCAGGGTTTGCTGTATATTT
ATGATGGCCCGTTCTATGGTTTTAACTTTGGTAGGTACAAGCCTTAGGCTAAACAGCTAATAATTTCTTTT
AATGCTTTTCTTAAAAGACTTCGGATATAGCTACATGTTCTGGCCACATGTAAAAAACTTCCATTTGTGGT
AGTGGAAGTACATAGGGATCTTTTAGCTAAGTAAAGATTTTTAAGTCAAGTTGAATTGAGAGTATTGA
AAAGTTTTGACCCCTTCCTTTTGGAAGTAGTTATCCCCACGAAACTATCTTTGAGGGTATTCCTGGAAGTT
AAAAAAATAGGTTGGAGAAGTGAGGTTTTTATTAGTACATAGTACCATTTATACAAATTAGAAAATTATT
TAACAGCTATTGATTATCTACGCATATCTTTATTAATCATTATTGTCGTTT1T TAAGTTGGATTAATAATC
CTAAGGAAAAAATTCAATTGTAAATTGGATCATGATAAACCAAGTTACTAGGTAACTTCATGATTCTCTA
CAGCACCCAGCTGAGGACCTACAAGCCTGGCACTCCCCCCCCCACCACAGAGTAGTGCTGTGCAGAGTA
CTTAGAAAACCTTAGTACCGCTAATTTAATTTTATATGAAAATATGTGTATTTTTCAATAAAGAAATTATA
AATTAAGATCT
>CCG-XhoI-C+Lin28UTRMut+BglII+TC
CCGCTCGAGCGGCCCAGGAGTCAGGGTTATTATGTGGCTAATGGGGAGTTTAAGGAAAGAGGCATCAA
TCTGCAG AG.TGG
A ATGGGGGTA AGGTGTTCTGGGTACTTG A ArCG ArGTTrTrAGG
CCGGGGTTCCCAGTGTCACCCTGTCTTTCCTTGGAGGGAAGGAAAGGATGAGGCAAAGGAACTCCTACC
ACACTCTATCTGAAAGCAAGTGAAGGCTTTTGTGGGGGAGGAACCACCCTAGAACCCGAGGCTTTGACC
AGTGGCTGGGCTAGGGAAGTTCTTTTGTAGAAGGCTGTGTGATATTTCCCTTGCCAGACGGGAAGCGAA
ACAAGTGTCAAACCAAGATTACTGAACCTACCCCTCCAGCTACTATGTTCTGGGGAAGGGACTCCCAGG
AGCAGGACGAGGTTATTTTCACACCGTGCTTATTCATAACCCTGTCCTTTGGTGCTGTGCTGGGAATGGT
CTCTAGCAACGGGTTGTGATGACAGGCAAAGAGGGTGGTTGGGGGAGACAACTGCAGACCTTCGGCCC
ACACCTCACTCCCAGCCCTTTCTGGGCCAATGGGATTTTAATTTATTTGCTCCCTTAGGTAACTGCAACGT
GGGTCCCACTTTCTCCAGGATGCCAACTGAACGATCTACGTGCGAATGACGTATCTTGTGCGTTCTTTTT1
1T TTAAT1TFVAAAA I I I I I IICCTCTTCTTAAAATAAGTAATGGGTTTGTATTTTTTTCTATTTTAATCTT
CCGGCCCTCATTCCTGCCCTTTGTTCTCAGGTACATGAGCAATCTCCGTGATAATAAGTCCGTAGCAGCTC
CAGGTCTGCTCAGCCGTAATACTTTGTTTTGTTTTGTTTTGATCACCATGGAGACCAACCATTTGGAGTGC
ACAGCCTGTTGAACTAACGCATTTTTGCCGATTACAGCTGGCTTTTCTGCAAGAGCGTCCTTGAAAAATG
TGTCTCACGGGTTTCGATTGAGCTGCCCCAAGACTTGATCTGGATTTGGCAAAACATAGGACATCACTCT
AAACAGGAAAGGGTGGTACAGAGACATTAAAAGGCTGGGCCAGGTAAAAGGCACAAGAGGAACTTTC
CATACCAGATCCATCCTTTTGCCAGATTAGTGGAAGCCTGCCATGCACAGCCGTGTGTGAGAGAGAGAG
TGTGTATGTATGTGTGTGTGGATTTTTTTTAATTCCAATTTATGAAGACGAGGTGGGTTTTGTTTATTTGA
TTGCTTTTTGTGCTGGGGATAGAATCTTGGGCTTCATTTGTGCTAGGAAGTACACGGACACTGAGTTATC
CCAGTAAGAATTCCACTTAAGACCAGTACCCTTATTCCCACACTGTGCTGTCCAGGCATGGGAACATGAG
GCAGGGACTCAACTCCTTAGCCTTTCACAATCTTGGCTTTCAGAGAGACTCATGAGTATGGGCCTCAGTG
GCAAGTGTCCTGCCCTTCGGTAGCATGATGGTTGATAGCTAAAGGAAAGAGGGGGTGGGGAGTTTCGT
TGAAATGCTGTTAGATCGCCAGAAACCTAACGCACTGTGTTGAAACGGGACAAATTCCATAGAACACAT
TGGGTGGTGTGTGTGTGTGTCTGATCTTGGTTTCTTGTCTCCCTCTCCCCCCAAATTCGGCCCTCACCCCT
AGTTAATTGTATTCGTCTGGCCTTTGTAGGACTTTTACTGTCTCTGAGTTGGTGATTGCTAGGTGGCCTAG
TTGTGTAAATATAAATGTGTTGGTCTTCATGTTCTTTTGGGGTTTTATTGTTGAAAAAACTTTTGTTGTATT
GAGAGAAAAATAGCCAAAGCATCTTTGACAGAAAGCTCTGCACCAGACAACACCATCTGAAACTTAAAT
GTGCGGTCCTCTTCTCAAAGTGAACCTCTGGGACCATGGCTTATCCTTACCTGCTCCTCCTGTGTCTCCCA
TTCTGGACCACAGTGACCTTCAGACAGCCCCTCTTCTCCCTCGTAAGAAAACTTAGGCTCATTTACTTCTT
TGAGCATCTCTGTAACTCTTGAAGGACCCAGGTTAAAATTCTGAAGAAGCCAGGAACCTCATTATGTCCT
TGTCCCTAACTCAGTGAAGAGTTTTGGTTGGTGGTTGTTAGACAGGGCCTCACTCTGTAGCTGGAGATA
GAGAGCCTCGGGTTCCTGGCTCTCCTCCTGCCTTCTGCACAGAGTCCCCTGTGCAGGG CTTGCAGGTGCC
GCTTCTCCCTGGCAAGACCATTTATTTCATGGTGTGATTCGCCTTTGGATGGATCAAACCAATGTAATCTG
TCACCCTTAGGTCGAGAGAAGCAATTGTGGGGCCTTCCATGTAGAAAGTTGGAATCTGGACACCAGAAA
AGGGACTATGACTTTACAGTGAGTCACTCAGGAACTTAATGCCGGTGCAAGAAACTTATGTCAAAGAGG
CCACAAGATTGTTACTAGGAGACGGACGACTTTATCTCCATGTTGAATGCTAGAAACCAAAGCTTTGTGA
GAAATCTTGAATTTATGGGGAGGGTGGGAAAGGGTGTACTTGTCTGTCCTTTCCCCATCTCTTTCCTGAA
CTGCAGGAGACTAAGGCCCCCCACCCCCCGGGGCTTGGATGACCCCCACCCCTGCCTGGGGTGTTTTATT
TCCTAGTTGATTTTTAATGGACCCGGGCCCTTTTCTTCCTATCGTATAATCATCCTGTGACACATGCTGACT
TTTCCTTCCCTTCTCTTCCCTGGGAAAATAAAGACTTATTGGTACTCCAGAGTTGGGAATGAGATCTTC
2.6 References
Alon, U. (2007). Network motifs: theory and experimental approaches. Nat Rev Genet 8, 450-461.
Babiarz, J.E., Ruby, J.G., Wang, Y., Bartel, D.P., and Blelloch, R. (2008). Mouse ES cells express
endogenous shRNAs, siRNAs, and other Microprocessor-independent, Dicer-dependent small
RNAs. Genes & development 22, 2773-2785.
Bartel, D.P. (2004). MicroRNAs: genomics, biogenesis, mechanism and function. Cell 116, 281-
297.
Bartel, D.P. (2009). MicroRNAs: target recognition and regulatory functions. Cell 136, 215-233.
Benetti, R. (2008). A mammalian microRNA cluster controls DNA methylation and telomere
recombination via Rbl2-dependent regulation of DNA methyltransferases. Nature Struct Biol 15,
268-279.
Boyer, L.A. (2005). Core transcriptional regulatory circuitry in human embryonic stem cells. Cell
122, 947-956.
Boyer, L.A., Plath, K., Zeitlinger, J., Brambrink, T., Medeiros, L.A., Lee, T.I., Levine, S.S.,
Wernig, M., Tajonar, A., Ray, M.K., et al. (2006). Polycomb complexes repress developmental
regulators in murine embryonic stem cells. Nature 441, 349-353.
Cheloufi, S., Dos Santos, C.O., Chong, M.M.W., and Hannon, G.J. (2010). A dicer-independent
miRNA biogenesis pathway that requires Ago catalysis. Nature 465, 584-589.
Farh, K.K. (2005). The widespread impact of mammalian microRNAs on mRNA repression and
evolution. Science 310, 1817-1821.
Friedman, R.C., Farh, K.K.H., Burge, C.B., and Bartel, D.P. (2008). Most mammalian mRNAs
are conserved targets of microRNAs. Genome Research 19, 92-105.
Gangaraju, V.K., and Lin, H. (2009). MicroRNAs: key regulators of stem cells. Nat Rev Mol Cell
Biol 10, 116-125.
Grimson, A. (2007). MicroRNA targeting specificity in mammals: determinants beyond seed
pairing. Mol Cell 27, 91-105.
He, L. (2005). A microRNA polycistron as a potential human oncogene. Nature 435, 828-833.
Hong, X., Scofield, D.G., and Lynch, M. (2006). Intron Size, Abundance, and Distribution within
Untranslated Regions of Genes. Molecular Biology and Evolution 23, 2392-2404.
Houbaviy, H.B., Murray, M.F., and Sharp, P.A. (2003). Embryonic stem cell-specific microRNAs.
Dev Cell 5, 351-358.
Kanellopoulou, C. (2005). Dicer-deficient mouse embryonic stem cells are defective in
differentiation and centromeric silencing. Genes Dev 19, 489-501.
Lim, L.P. (2005). Microarray analysis shows that some microRNAs downregulate large numbers
of target mRNAs. Nature 433, 769-773.
Lytle, J.R., Yario, T.A., and Steitz, J.A. (2007). Target mRNAs are repressed as efficiently by
microRNA-binding sites in the 5' UTR as in the 3' UTR. Proceedings of the National Academy
of Sciences of the United States of America 104, 9667-9672.
Marson, A. (2008). Connecting microRNA genes to the core transcriptional regulatory circuitry of
embryonic stem cells. Cell 134, 521-533.
Mayr, C., Hemann, M.T., and Bartel, D.P. (2007). Disrupting the Pairing Between let-7 and
Hmga2 Enhances Oncogenic Transformation. Science 315, 1576-1579.
Melton, C., Judson, R.L., and Blelloch, R. (2010). Opposing microRNA families regulate
self-renewal in mouse embryonic stem cells. Nature 463, 621-626.
Mukherji, S., Ebert, M.S., Zheng, G.X.Y., Tsang, J.S., Sharp, P.A., and van Oudenaarden, A.
(2011). MicroRNAs can generate thresholds in target gene expression. Nat Genet 43, 854-859.
Murchison, E.P., Partridge, J.F., Tam, O.H., Cheloufi, S., and Hannon, G.J. (2005).
Characterization of Dicer-deficient murine embryonic stem cells. Proc Natl Acad Sci USA 102,
12135-12140.
Okamura, K., Hagen, J.W., Duan, H., Tyler, D.M., and Lai, E.C. (2007). The mirtron pathway
generates microRNA-class regulatory RNAs in Drosophila. Cell 130, 89-100.
Raj, A., Peskin, C.S., Tranchina, D., Vargas, D.Y., and Tyagi, S. (2006). Stochastic mRNA
Synthesis in Mammalian Cells. PLoS Biol 4, e309.
Rigoutsos, I. (2009). New tricks for animal microRNAs: targeting of amino acid coding regions
at conserved and nonconserved sites. Cancer research 69, 3245-3248.
Rosa, A., and Brivanlou, A.H. (2011). A regulatory circuitry comprised of miR-302 and the
transcription factors OCT4 and NR2F2 regulates human embryonic stem cell differentiation. The
EMBO Journal 30, 237-248.
Ruby, J.G., Jan, C.H., and Bartel, D.P. (2007). Intronic microRNA precursors that bypass drosha
processing. Nature 448, 83-86.
Sinkkonen, L. (2008). MicroRNAs control de novo DNA methylation through regulation of
transcriptional repressors in mouse embryonic stem cells. Nature Struct Biol 15, 259-267.
Tay, Y., Zhang, J., Thomson, A.M., Lim, B., and Rigoutsos, I. (2008). MicroRNAs to Nanog, Oct4
and Sox2 coding regions modulate embryonic stem cell differentiation. Nature 455, 1124-1128.
Viswanathan, S.R., and Daley, G.Q. (2010). Lin28: A MicroRNA Regulator with a Macro Role.
Cell 140, 445-449.
Viswanathan, S.R., Daley, G.Q., and Gregory, R.I. (2008). Selective Blockade of MicroRNA
Processing by Lin28. Science 320, 97-100.
Wang, Y., Baskerville, S., Shenoy, A., Babiarz, J.E., Baehner, L., and Blelloch, R. (2008).
Embryonic stem cell-specific microRNAs regulate the G1-S transition and promote rapid
proliferation. Nat Genet 40, 1478-1483.
Wang, Y., Medvid, R., Melton, C., Jaenisch, R., and Blelloch, R. (2007). DGCR8 is essential for
microRNA biogenesis and silencing of embryonic stem cell self-renewal. Nat Genet 39, 380-385.
Wu, L., and Belasco, J.G. (2005). Micro-RNA regulation of the mammalian lin-28 gene during
neuronal differentiation of embryonal carcinoma cells. Mol Cell Biol 25, 9198-9208.
Yu, J., Vodyanik, M.A., Smuga-Otto, K., Antosiewicz-Bourget, J., Frane, J.L., Tian, S., Nie, J.,
Jonsdottir, G.A., Ruotti, V., Stewart, R., et al. (2007). Induced Pluripotent Stem Cell Lines Derived
from Human Somatic Cells. Science 318, 1917-1920.
Zheng, G.X.Y., Ravi, A., Calabrese, J.M., Medeiros, L.A., Kirak, O., Dennis, L.M., Jaenisch, R.,
Burge, C.B., and Sharp, P.A. (2011). A Latent Pro-Survival Function for the Mir-290-295 Cluster
in Mouse Embryonic Stem Cells. PLoS Genet 7, e1002054.
Chapter 3 Application of UTR reporter system to study microRNA
regulation at transcriptional and translational levels
3.1 Abstract
MicroRNAs are known to regulate their targets by inducing mRNA degradation and inhibiting
translation, but the relative contributions of the two mechanisms have been under debate. It is also
unclear how miRNA regulation varies with target expression. Here we apply the UTR
reporter system to study miRNA regulation at these two levels, and monitor miRNA regulation
over a target expression range that spans more than two orders of magnitude. Unlike some
genome-wide studies, which suggest that transcript degradation accounts for most (> 84%) of miRNA
repression (Guo et al., 2010), we found that the contributions from the two sources were of the
same order for all UTRs under study. Moreover, the strength of miRNA regulation was found to vary
with target expression. The transcriptional regulation is more stable throughout the range
of measurement, whereas translational regulation saturates and decreases at high target
expression. Our data also suggest that miRNA regulation might increase initially in the low target
expression region. Taken together, our measurements provide single-cell information on miRNA
regulation at two levels for a wide range of target expression.
3.2 Introduction
3.2.1 miRNA-mediated repression of translation
mRNA translation can be divided into three steps: initiation, elongation and termination. Initiation
starts with the recognition of the 5' cap by the eukaryotic translation initiation factor (eIF) eIF4F,
which contains eIF4G, an important scaffold protein for the assembly of the ribosome initiation
complex. eIF4G also interacts with the poly(A)-binding protein (PABP) and brings the two ends
of the mRNA into close proximity (Derry et al., 2006). This 'circularization' stimulates translation
initiation by increasing the affinity of eIF4E for the cap and by facilitating ribosome recycling.
Translation initiation of some viral mRNAs is independent of the m7G cap; in this case, 40S
ribosomes are recruited directly to the internal ribosome entry site (IRES) (Jackson, 2005).
Initiation is usually the rate-limiting step in translation, and consequently the common target for
translational control (Fabian et al., 2010).
Repression at the initiation step
The first proof of miRNA-mediated translational repression at the initiation step came from the
repression of m7G-capped, but not IRES-containing or ApppN-capped (non-functional), mRNAs
targeted by either endogenous (let-7) (Pillai, 2005) or artificial (CXCR4) (Humphreys et al., 2005)
miRNAs. Polysome gradient analysis shows that transcripts targeted by miRNAs shift towards the
top in sucrose gradient sedimentation. This was shown both in cultured cells (Bhattacharyya et al.,
2006; Huang, 2007) and in C. elegans (Ding and Grosshans, 2009). Several in vitro experiments
using cell-free systems faithfully recapitulate the mode of miRNA regulation in cultured cells. In
all of them, the presence of the m7G cap was required for translational repression (Mathonnet,
2007; Thermann and Hentze, 2007; Wakiyama et al., 2007; Wang et al., 2006). MicroRNA
specificity was validated either by mutating the target sites of a specific miRNA or by transfecting
antisense oligonucleotides (antimiRs) that specifically block the targeting miRNA.
Some discrepancies exist regarding the role of the poly(A) tail in miRNA-mediated translational
repression. The poly(A) tail was reported to be both necessary (Humphreys et al., 2005) and
dispensable (Pillai, 2005) for optimal miRNA repression. Translational repression was also observed
in the absence of a poly(A) tail (Eulalio, 2009; Eulalio et al., 2008; Wu et al., 2006). The data
suggest that the poly(A) tail per se is not absolutely required for repression, but that miRNA-mediated
deadenylation might further contribute to translational repression by preventing the synergy
between the 5' cap and the 3' poly(A) tail (Wakiyama et al., 2007).
Repression at post-initiation steps
A number of studies concluded that miRNAs could also inhibit translation at post-initiation steps,
in addition to suppressing initiation.
Polysome sedimentation analyses showed that miRNAs, components of the microRNA ribonucleoprotein
complex (miRNP, also known as miRISC) and repressed mRNAs co-sediment with actively translating
polysomes in both C. elegans and mammalian systems (Maroney
et al., 2006; Nottrott et al., 2006; Olsen and Ambros, 1999; Petersen et al., 2006; Seggerson et al.,
2002). Moreover, several groups observed IRES-driven translation being repressed by the miRNA
machinery (Lytle et al., 2007a; Nottrott et al., 2006). In addition, repression of pal-1 mRNA
by GLD-1 in C. elegans seems to involve the stalling or slowing down of elongating ribosomes
(Mootz et al., 2004). Translational repression of miRNA targets was also found in yeast and D.
melanogaster embryos (Braat et al., 2004; Clark et al., 2000; Ruegsegger et al., 2001).
Several models have been proposed to explain miRNA inhibition at the post-initiation stage. Petersen
et al. (Petersen et al., 2006) proposed a drop-off model, in which miRNAs render ribosomes prone
to premature termination. Alternatively, Maroney et al. (Maroney et al., 2006) speculated that
miRNAs decelerate translation elongation. The inability to detect nascent polypeptides from
repressed reporters in both immunoprecipitation (Olsen and Ambros, 1999) and pulse-labeling
experiments (Petersen et al., 2006) led Nottrott et al. to propose the proteolysis model
(Nottrott et al., 2006).
3.2.2 miRNA-mediated mRNA deadenylation and decay
Evidence for miRNA-mediated mRNA degradation comes from studies of specific miRNA-target
pairs, and more generally from transcriptome studies. Microarray and deep sequencing data both
showed that the abundance of miRNA targets inversely correlates with the level of miRNA.
Cellular levels of selected miRNAs could be modified by miRNA transfection, antimiRs or genetic
knockdown, and differentiation (Baek, 2008; Guo et al., 2010; Hendrickson, 2009; Krutzfeldt,
2005; Lim, 2005; Selbach, 2008). Furthermore, in cultured cells, depleting essential components
of the miRNA pathway (for example, Dicer, AGOs or the miRNA silencing effector GW182)
increased the abundance of miRNA targets (Behm-Ansmant, 2006; Eulalio, 2007, 2009; Giraldez,
2006; Rehwinkel, 2006; Rehwinkel et al., 2005; Schmitter, 2006). Expression profiles from
differentiating and developing cells also provided examples of anti-correlated expression of
miRNAs and their targets. For example, the dramatic increase in miR-430 expression at the onset
of zebrafish zygotic transcription correlates with the degradation of a large number of maternal
mRNAs containing miR-430-binding sites in their 3' UTRs (Farh, 2005; Giraldez, 2006; Mishima,
2006; Stark et al., 2005).
Animal miRNAs rarely induce endonucleolytic cleavage of target transcripts, because their binding
is only partially complementary. Instead, they direct their targets to deadenylation and accelerate
mRNA destabilization through the 5'-to-3' mRNA decay pathway. Transcripts targeted by miRNAs are
primarily deadenylated by the CAF1-CCR4-NOT deadenylase complex, then decapped by
the decapping-complex proteins DCP1 and DCP2, and ultimately degraded by the 5'-to-3'
exonuclease XRN1. The role of these mRNA decay factors in miRNA-mediated repression is evidenced
by the observation that the abundance of miRNA targets increases when these factors are depleted or
when dominant-negative forms are overexpressed (Behm-Ansmant, 2006; Chu and Rana, 2006;
Eulalio, 2007, 2009; Piao et al., 2010; Rehwinkel et al., 2005).
3.2.3 Cellular compartmentalization of miRNA repression
P-bodies, or GW-bodies, are cellular structures that are enriched in mRNA-catabolizing enzymes
and translational repressors. They are considered discrete foci for mRNA degradation and
temporary storage sites for repressed mRNAs in yeast and mammals (Eulalio et al., 2007a).
The demonstration that AGO proteins, GW182, miRNAs and mRNAs repressed by miRNAs are
all enriched in P-bodies implied that P-bodies are involved in miRNA repression
(Behm-Ansmant, 2006; Bhattacharyya et al., 2006; Jakymiw, 2005; Leung et al., 2006; Liu et al., 2005;
Meister, 2005; Pillai, 2005). There is a good correlation between miRNA-mediated repression and
accumulation of mRNAs in visible P-bodies (Bhattacharyya et al., 2006; Huang, 2007; Liu et al.,
2005; Pillai, 2005). The endogenous CAT1 mRNA, a target of miR-122, localizes to P-bodies
when translation is repressed, and this can be reversed by stress. In addition, overexpression of
miR-122 is sufficient to concentrate CAT1 mRNA in P-bodies (Bhattacharyya et al., 2006).
Despite observations supporting the role of P-bodies in miRNA regulation, unresolved issues
remain. Knockdown of some P-body components disperses visible P-bodies, but has no effect on
miRNA function (Eulalio et al., 2007c). The relative distribution of miRISC components between
P-bodies and the cytosol also calls the significance of P-bodies into question. Only ~1.3% of
enhanced GFP (EGFP)-tagged AGO2 localized to P-bodies in HeLa cells (Leung et al., 2006). Moreover,
exchange of miRNPs between P-bodies and the cytoplasm is very slow (Andrei, 2005; Kedersha, 2005;
Leung et al., 2006). Collectively, these data suggest that miRNA repression either involves
submicroscopic P-bodies or occurs outside of them. Because most P-body components are also
found throughout the cytosol (Eulalio et al., 2007a), it is likely that miRNA-mediated repression is
initiated in the cytosol and that the microscopically visible P-body is a consequence rather than the
cause of silencing (Chu and Rana, 2006; Pillai, 2005).
3.2.4 Translation inhibition vs transcript degradation
Compelling evidence indicates that miRNAs repress targets via both translational inhibition and
transcript degradation. However, which of these mechanisms dominates has been controversial.
Recent developments in proteomics methods enable genome-wide analyses of both the proteome and
the transcriptome, and allow us to assess to what degree silencing is caused by translational
repression versus mRNA degradation (Baek, 2008; Eichhorn et al., 2014; Guo et al., 2010;
Hendrickson, 2009; Selbach, 2008). All of those studies agree on one main conclusion: miRNAs
only modestly inhibit protein production, rarely resulting in more than a fourfold reduction in
protein levels. However, they disagree on the extent of the translational contribution. Bartel and
colleagues (Baek, 2008; Eichhorn et al., 2014; Guo et al., 2010) used quantitative mass spectrometry
and ribosome profiling to measure the effect of miRNAs on protein output, and found that changes
in protein and mRNA levels strongly correlate. Accordingly, changes in mRNA levels accounted
for most of the regulation. The initial ribosome profiling experiments (Guo et al., 2010) were
performed on cytoplasm-extracted and poly(A)-selected RNA only. This might lead to
underestimation of translational repression, because it misses deadenylated transcripts as well
as transcripts in P-bodies. Those issues were addressed by Eichhorn et al. (2014), and
mRNA destabilization still explained most (66%-90%) of miRNA-mediated repression. Hendrickson
et al. (Hendrickson, 2009) used polysome profiling to estimate translation rates, and concluded that
mRNA degradation accounted for about 75% of the total changes. However, Selbach et al.
(Selbach, 2008) used pulse-labeled mass spectrometry to demonstrate direct translational
repression for hundreds of genes. Many genes were down-regulated at the protein level with little
mRNA change at early time points (8 hours) after miRNA transfection, a phenomenon
recapitulated by Eichhorn et al. (2014).
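To make percentages such as "about 75%" comparable across studies, it helps to state how a share of repression can be apportioned. The sketch below assumes that repression is multiplicative (total fold repression = mRNA fold change x translational fold change), so that shares are taken on a log scale. This convention and the function name are assumptions made for illustration; the cited studies may define their percentages differently.

```python
import math


def mrna_share_of_repression(protein_fold, mrna_fold):
    """Share of total repression attributable to the mRNA change.

    Both arguments are fold reductions (control / repressed), so values
    above 1 mean repression. The share is computed on a log scale, which
    is an assumed convention, not one taken from the cited studies.
    """
    if protein_fold <= 1.0:
        raise ValueError("protein_fold must exceed 1 (net repression)")
    return math.log(mrna_fold) / math.log(protein_fold)


# Hypothetical example: a fourfold drop in protein with a twofold drop in mRNA.
share = mrna_share_of_repression(protein_fold=4.0, mrna_fold=2.0)
translational_fold = 4.0 / 2.0  # remaining fold repression at translation
```

Under this convention, equal fold changes at the mRNA and translational levels give each mechanism a share of one half.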
In contrast to genome-wide studies, which suggest that target degradation is the predominant mode
of regulation by miRNAs in cultured mammalian cells, single-gene analyses usually show that
repression at the protein level is stronger and more robust than repression at the transcript
level. There are also numerous examples of miRNA regulation at the translational level with no
or minimal effect on mRNA degradation (Behm-Ansmant, 2006; Eulalio, 2007; Filipowicz et al.,
2008; Poy et al., 2004; Zhao et al., 2005). Moreover, studies using reporter rather than endogenous
genes usually observe major contributions from translational repression (Doench and Sharp, 2004;
Kiriakidou, 2007; Nelson et al., 2004; Pillai, 2005; Yekta et al., 2004). The discrepancies between
genome-wide analyses and single-gene studies still need to be addressed.
There is also debate regarding the order of events, because deadenylation has been reported both to
precede (Beilharz, 2009; Iwasaki et al., 2009; Wakiyama et al., 2007) and to follow translational
repression (Fabian, 2009; Zdanowicz, 2009). Even though disruption of transcript circularization
could account for both transcript degradation and translational inhibition, the two processes are
not entirely coupled, nor is one supplementary to the other. miRNA-dependent target degradation
is seen even when translation of miRNA targets is precluded, either by a defective cap structure
that impairs translation (Mishima, 2006; Wakiyama et al., 2007) or by translational arrest using
cycloheximide treatment (Eulalio, 2007; Fabian, 2009). Conversely, miRNA-mediated translational
inhibition can be observed in the absence of deadenylation. MicroRNA repression still occurs
when the poly(A) tail is replaced by a histone mRNA stem-loop structure or by a self-cleaving
ribozyme (Eulalio, 2009; Eulalio et al., 2008; Wu et al., 2006).
3.3 Results
3.3.1 MicroRNAs exert regulation at both transcriptional and translational levels
Numerous lines of evidence indicate that miRNAs induce target silencing at two levels, via transcript
degradation and translational inhibition. However, which of these two mechanisms occurs
predominantly has been highly controversial, with conflicting lines of evidence supporting both
views (see 3.2.4). It is also unclear whether miRNA regulation varies with target
expression and, if it does, how.
We set out to address these questions with our UTR reporter system. The 3'UTR of an
endogenous gene was appended downstream of a GFP reporter, so that the fluorescence of the protein
can be measured directly. GFP transcripts were hybridized with FISH (fluorescent in
situ hybridization) probes, which makes the mRNA level quantifiable as well (Klemm et al., 2014).
We compare the expression of the original UTR (OriUTR) with that of its MRE-mutated
version (MutUTR), which has been validated as a miRNA-unregulated control (Chap. 2), so that total
repression as well as repression at the transcriptional level can be quantified. The translational
inhibition can then be derived by dividing the total repression by the transcriptional contribution.
Another fluorescent protein, mCherry, was delivered into the cells together with the miRNA activity
reporter. mCherry was followed by a short RBGpA, which is devoid of miRNA regulation. The expression
of mCherry reflects the variability in delivery efficiency and in the activity of the expression
machinery in single cells, and it is correlated with reporter expression in the absence of miRNA
regulation (Chap. 2). We can further arrange the cells according to mCherry abundance, and quantify
miRNA repression at different transfection levels. We thus exploit the variability among individual
cells to cover miRNA regulation for target expression varying by orders of magnitude in a single
transient transfection experiment.
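The decomposition described above can be sketched in a few lines. This is a minimal illustration, not the analysis code used in this work: the function names, the dictionary layout with keys "mcherry", "protein" and "mrna", and the use of per-bin means in linear units are all assumptions made for the example.

```python
import numpy as np


def binned_mean(x, y, edges):
    """Mean of y within each bin of x; NaN where a bin is empty."""
    idx = np.digitize(x, edges) - 1  # bin index of every cell
    means = np.full(len(edges) - 1, np.nan)
    for b in range(len(edges) - 1):
        in_bin = idx == b
        if in_bin.any():
            means[b] = y[in_bin].mean()
    return means


def repression_by_mcherry(ori, mut, edges):
    """Fold repression of OriUTR relative to MutUTR per log10(mCherry) bin.

    ori and mut are dicts of equal-length arrays (hypothetical layout).
    Returns (total, transcript_level, translational) fold repression.
    """
    def per_bin(cells, key):
        return binned_mean(np.log10(cells["mcherry"]), cells[key], edges)

    total = per_bin(mut, "protein") / per_bin(ori, "protein")
    transcript_level = per_bin(mut, "mrna") / per_bin(ori, "mrna")
    # Translational inhibition: total repression divided by the
    # transcript-level contribution, as described in the text.
    translational = total / transcript_level
    return total, transcript_level, translational
```

Binning on log10(mCherry) compares OriUTR and MutUTR cells at the same transfection level, so differences in plasmid delivery do not masquerade as miRNA repression.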
Only cells 4 standard deviations away from the majority background of the transfected population
were selected for analysis (see 3.5.2), and the conclusions drawn from those cells are very robust to
background estimation (Figure 3.12). By arranging cells according to mCherry expression and
overlaying the scatterplots of GFP-OriUTR and GFP-MutUTR (Figure 3.1), we observe that at the
same transfection level, GFP-MutUTR expression is on average always higher than GFP-OriUTR
expression. This is true for both GFP transcripts and protein, and the difference is more pronounced
at the protein level. When GFP protein is plotted against gfp mRNA, we notice that the marginal
distributions of gfp transcript and GFP protein both shift towards lower values for the
miRNA-repressed OriUTR (Figure 3.2a), and the shift is also more prominent at the protein level.
Both plots illustrate the contribution of transcript degradation to miRNA regulation, and strongly
suggest a contribution from translational inhibition. Proof of the latter comes from quantification
of the conditional mean of GFP protein expression at different GFP transcript levels. On average,
less protein is produced for GFP-OriUTR except at very high transcript levels (Figure 3.2b). The
protein produced per unit mRNA was computed (Figure 3.2c), and we find about three-fold
repression of OriUTR relative to MutUTR. The translational inhibition is de-repressed at high
transcript levels, indicating that the translational inhibition mechanism is titrated at very high
target expression. We present only Casp2 as an example here; other UTRs yield qualitatively similar
results (Supplementary Figure 3.5). Thus, our data suggest that miRNA regulation generally happens
at both the transcriptional and translational levels.
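The cell selection and the protein-per-mRNA calculation in this paragraph can be illustrated as follows. This is a hedged sketch rather than the procedure of Section 3.5.2: estimating the background from a separate array of background cells, the threshold form (mean plus n_sd standard deviations), and all function names are assumptions made for the example.

```python
import numpy as np


def select_above_background(signal, background, n_sd=4.0):
    """Mask of cells whose signal exceeds background mean + n_sd * SD.

    How the background is estimated here is an assumption for illustration.
    """
    threshold = background.mean() + n_sd * background.std(ddof=1)
    return signal > threshold


def protein_per_mrna(mrna, protein, edges):
    """Bin cells by log10(mRNA); return mean protein, its SEM, and the
    protein produced per unit mRNA in each bin (NaN for sparse bins)."""
    n_bins = len(edges) - 1
    mean_p = np.full(n_bins, np.nan)
    sem_p = np.full(n_bins, np.nan)
    ratio = np.full(n_bins, np.nan)
    idx = np.digitize(np.log10(mrna), edges) - 1
    for b in range(n_bins):
        in_bin = idx == b
        n = int(in_bin.sum())
        if n < 2:  # need at least two cells to estimate an SEM
            continue
        mean_p[b] = protein[in_bin].mean()
        sem_p[b] = protein[in_bin].std(ddof=1) / np.sqrt(n)
        ratio[b] = mean_p[b] / mrna[in_bin].mean()
    return mean_p, sem_p, ratio
```

Dividing the MutUTR ratio by the OriUTR ratio bin by bin would then give the fold translational repression at each transcript level, the quantity that the text reports as roughly three-fold for Casp2.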
[Figure 3.1, panels (a) and (b): horizontal axis log(mCherry); legend: Casp2MutUTR, Casp2OriUTR.]
Figure 3.1 Scatterplot of GFP protein and mRNA versus the transfection level indicator mCherry.
(a) Scatterplot of GFP protein vs mCherry. (b) Scatterplot of gfp mRNA vs mCherry. pCAG-d2eGFP-Casp2Ori/MutUTR was co-transfected with the pCAG-mCherry plasmid into wild-type ESCs. Only cells 4 standard deviations away from the transfected majority background were selected for the plot. OriUTR is plotted in red and MutUTR in blue. Only 10% of the measured cells are plotted for visualization. MicroRNAs repress both transcript and protein production of GFP-Casp2OriUTR.
Figure 3.2 The relationship between protein and mRNA for the Casp2 3'UTR.
pCAG-d2eGFP-Casp2Ori/MutUTR was co-transfected with the pCAG-mCherry plasmid into wild-type ESCs. Only cells 4 standard deviations away from the transfected majority background were selected for the plot. (a) Scatterplot of GFP protein versus mRNA, with marginal distributions plotted on the side. OriUTR is plotted in red and MutUTR in blue. Only 2% of the measured cells are plotted for better visualization. MicroRNAs repress both transcript and protein levels of GFP-OriUTR, and the histograms of both are shifted towards smaller values. (b) Bar plot of protein versus mRNA. Cells were binned according to mRNA expression, and the mean protein expression in each bin was calculated. Error bars correspond to SEM. (c) Protein produced per unit transcript at different transcript expression levels. The conversion rate for OriUTR stays relatively constant, but decreases for MutUTR at high transcript levels. This reflects the de-repression of translational inhibition at high transcript levels, as depicted by the black dotted line.
[Figure 3.2 plots: panels (a), (b) and (c); x-axis log(GFP mRNA); legend: OriUTR, MutUTR]
3.3.2 Quantifying miRNA regulation at transcriptional and translational levels
Having confirmed that miRNAs regulate their targets at both transcriptional and translational levels, we next ask what the relative contributions of the two are and whether one mechanism dominates regulation.
Cells were arranged according to mCherry intensity, and the repression fold (RF) was calculated for different target expression levels as
$$\mathrm{RF} = \frac{(\text{unregulated} \mid \text{bin}) - \text{background}}{(\text{regulated} \mid \text{bin}) - \text{background}}$$
The quantified repression fold was independent of the specific background estimation method used (See 3.5.3). Transcriptional and total repression were determined by comparing the transcript and protein expression of OriUTR to that of MutUTR, and translational inhibition was derived by dividing the total repression by the contribution from transcriptional degradation.
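The decomposition described above can be illustrated with a minimal Python sketch. The function names and the example numbers are hypothetical and are not taken from the analysis code of this study; the sketch only assumes that bin means of the unregulated (MutUTR) and regulated (OriUTR) reporters and a scalar background are available for the protein and mRNA channels.

```python
def repression_fold(unregulated, regulated, background=0.0):
    """Ratio of background-subtracted bin means (MutUTR over OriUTR)."""
    denominator = regulated - background
    if denominator <= 0:
        raise ValueError("regulated signal must exceed the background")
    return (unregulated - background) / denominator


def decompose_repression(mut_protein, ori_protein, mut_mrna, ori_mrna,
                         protein_bg=0.0, mrna_bg=0.0):
    """Return (total, transcriptional, translational) repression folds.

    Total repression is read from the protein channel, transcriptional
    repression from the mRNA channel, and translational inhibition is
    the total divided by the transcriptional contribution.
    """
    total = repression_fold(mut_protein, ori_protein, protein_bg)
    transcriptional = repression_fold(mut_mrna, ori_mrna, mrna_bg)
    translational = total / transcriptional
    return total, transcriptional, translational


# Hypothetical bin means in arbitrary fluorescence units.
total, transcriptional, translational = decompose_repression(
    mut_protein=1100.0, ori_protein=200.0,
    mut_mrna=450.0, ori_mrna=250.0,
    protein_bg=100.0, mrna_bg=50.0,
)
print(total, transcriptional, translational)
```

With these made-up inputs, a tenfold total repression splits into a twofold transcriptional and a fivefold translational contribution, because the two contributions multiply.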
3.3.3 The transcriptional regulation stays relatively constant and translational regulation
saturates at high target expression
MicroRNA repression of endogenous 3'UTRs was quantified as described, and assessed over a range spanning more than 100 fold of target expression (Figure 3.3 a-c and Supplementary Figure 3.1 a-b). To compare our measurements with population-based assays, cells with different transfection levels were combined, and a single repression fold value was calculated (See 3.5.3). Consistent with our previous studies (Mukherji et al., 2011) and genome-wide assays (Guo et al., 2010; Selbach, 2008), even though miRNA regulation can exceed 10 fold within certain target expression ranges, its regulation at the population level is usually subtle and rarely exceeds 4 fold on average (Figure 3.3d and Supplementary Figure 3.1c). Here we extend this conclusion from artificially constructed miRNA targets (e.g. 7 consecutive miR-20 sites; Mukherji et al., 2011) to endogenous UTRs. Genome-wide miRNA regulation strength is expected to be even smaller than what is presented in Figure 3.3d, because Casp2, Lats2 and Rbl2 are all strongly repressed miRNA targets. Repression of the mildly regulated Lin28a UTR and of another miR-290 target, P21, is less than twofold (Supplementary Figure 3.1). Also, the relative contributions of transcriptional degradation and translational inhibition vary between UTRs, with mRNA destabilization accounting for 39% to 88% of total regulation (Figure 3.3d and Supplementary Figure 3.1c). For the same UTR, the contributions also vary with target expression level (Figure 3.3 a-c and Supplementary Figure 3.1 a, b). Despite this variability, transcriptional and translational regulation strengths are of the same order for all UTRs under study, and neither source dominates.
Transcriptional regulation stays relatively constant
Throughout the measured range, which spans over 100 fold of target expression, miRNA regulation at the transcriptional level appears relatively stable, even though the actual value varies between UTRs (blue line in Figure 3.3 a-c and Supplementary Figure 3.1 a-b). Moreover, transcriptional regulation does not saturate at high target expression, in contrast to miRNA regulation at the translational level. The stability of transcriptional miRNA regulation is further corroborated by two independent experimental techniques, microscopy (Figure 3.5) and RNA sequencing (Supplementary Figure 3.2).
[Figure 3.3 plots: (a) Casp2 Repression Fold, (b) Lats2 Repression Fold, (c) Rbl2 Repression Fold, (d) Population Repression; x-axis mCherry in (a)-(c); legend: total repression fold, mRNA degradation repression, protein translation inhibition, RF = 1 reference line; percentage labels in (d): 43%, 47%, 39%]
Figure 3.3 miRNA repression at transcriptional and translational levels, quantified from flow cytometry data.
Total miRNA-mediated repression (red), transcriptional repression (blue) and translational contribution (green) were quantified for the Casp2 UTR (a), Lats2 UTR (b) and Rbl2 UTR (c) over a target expression range of about 100 fold. (d) All transfected cells were combined and a single population repression value was derived, as in bulk assays, with total repression shown as red bars and transcriptional repression as blue bars. Error bars correspond to the standard deviation of more than three experimental replicates. The percentages give the relative contribution of transcriptional regulation to total repression.
Flow cytometry measures the integrated fluorescence intensity of cells. However, cells autofluoresce, and non-specific binding of smFISH probes within cells further increases the background in the mRNA channel. The calculated repression fold is therefore more sensitive to background estimation in the low target expression region. In contrast, under the microscope FISH-probe-bound transcripts are resolved as diffraction-limited spots. This method achieves single-molecule resolution for transcript detection and overcomes the background issue in the mRNA channel. Microscopy thus complements flow cytometry and provides high-confidence data for transcriptional regulation at low target expression. Naturally, the microscopy approach has its own limits. Transcript spots become interconnected and resolution is lost at high mRNA expression (more than roughly 600 to 1200 mRNAs per cell). The measurement is also low throughput.
We integrated the fluorescent protein signal within cells (See 3.5.4) to obtain protein expression, mimicking the quantity measured in flow cytometry, and counted GFP transcripts to obtain mRNA expression. Cells were again arranged according to mCherry expression. Repression at the transcriptional level is obvious from the scatterplot (second column of Figure 3.12). The repression fold is quantified as in flow cytometry, except that the transcript background equals zero. Due to the limitations of the microscopy experiment, we can only measure up to medium transfection levels, and this range corresponds to the first half of the repression fold curve given by the flow cytometry experiment. The two UTRs measured by microscopy, Casp2 and Lats2, confirm that transcriptional regulation is indeed stable at low target expression, which still spans a range of ~10 fold (blue line in Figure 3.5).
The cells were also flow sorted into 5 bins according to mCherry signal, followed by downstream genome-wide RNA sequencing (Chap. 5). The repression fold determined from sequence reads of gfp further confirms that transcriptional regulation does not saturate at high target expression (Supplementary Figure 3.2). The repression fold for the first bin (lowest target expression) is slightly lower than the others, which is likely caused by relatively high background reads for gfp.
Animal miRNAs rarely act through perfectly complementary targeting and Ago2-dependent catalysis; rather, they facilitate mRNA destabilization by accelerating transcript deadenylation and decapping. MicroRNAs promote recruitment of the CAF1-CCR4-NOT deadenylases, and the deadenylated mRNA is then decapped by the Dcp1/Dcp2 enzymes. Without the protection of the poly(A) tail and 5' cap, the naked mRNA is vulnerable to 5'-to-3' exonucleolytic decay by Xrn1 and 3'-to-5' decay by the exosome, and is rapidly degraded. Thus the relatively constant transcriptional repression might reflect the different turnover rates of intact transcripts versus deadenylated and decapped transcripts. The deadenylation, decapping and exonucleolytic decay processes themselves are biochemical reactions that are relatively fast compared to the recruitment process, and all of the participating enzymes can be quickly recycled. That might explain the large capacity of transcriptional regulation and its resistance to saturation at high target expression.
[Figure 3.4 images: (a) Lats2MutUTR, mCherry; (b) Lats2MutUTR, GFP; (c) Lats2OriUTR, mCherry; (d) Lats2OriUTR, GFP]
Figure 3.4 Typical microscopy images of GFP and mCherry protein expression for Lats2 Mut/OriUTR in the cotransfection experiment.
GFP and mCherry proteins are approximately correlated without miRNA regulation in the MutUTR experiment (a-b). With miRNA regulation in the OriUTR experiment, GFP is strongly repressed and the correlation between the two channels is lost (c-d).
[Figure 3.5 plots: (a) Casp2 repression fold, (b) Lats2 repression fold; x-axis transfection level (RFP total protein); legend: total repression fold, mRNA degradation repression, protein translation inhibition, RF = 1 reference line]
Figure 3.5 miRNA repression at transcriptional and translational levels for different target expression, quantified from microscopy data.
pCAG-d2eGFP-Casp2/Lats2 Ori/MutUTR was co-transfected with pCAG-mCherry into WT ESCs. mRNA degradation repression is confirmed to be stable at low target expression, and translational inhibition is confirmed to increase initially. Total repression is plotted in red, transcriptional repression in blue and translational contribution in green.
Translational regulation saturates at high target expression
In contrast to the relative stability of transcriptional regulation, translational regulation varies with target expression level (green line in Figure 3.3 a-c).
At the high end of target expression, translational inhibition always decreases as target levels increase (Figure 3.3). The decrease is consistent with the translational efficiency directly quantified from the reporter mRNA-protein relationship (black line in Figure 3.2c). It has been reported that miRISCs compete with the eIF4F initiation complex for binding to the cap structure and thereby inhibit translation initiation. It is possible that miRISCs and related translational repressors are limiting factors within cells. MicroRNA targets then have to compete with each other for shared regulatory factors, and the resources accessible per target are titrated away when target expression is high, resulting in decreased regulation at the translational level.
Surprisingly, for some UTRs (e.g. Casp2, Lats2 and Rbl2, but not Lin28a and P21), translational regulation initially increases at the low end of target expression (Figure 3.3), and regulation only reaches its maximum capacity in the ultrasensitive region (Mukherji et al., 2011). This is further confirmed by microscopy. Figure 3.4 presents one typical microscopy image each of MutUTR and OriUTR for the Lats2 UTR. We observe that for Lats2 MutUTR, which is devoid of miRNA repression, the intensity of the reporter protein GFP roughly correlates with the transfection level indicator mCherry (Figure 3.4 a and b). For Lats2 OriUTR, GFP expression is strongly repressed to background levels in the given field of view, and the correlation is lost (Figure 3.4 c and d). Protein expression is quantified as described (See 3.5.4), and translational repression does increase initially for the Lats2 and Casp2 3'UTRs (green line in Figure 3.5).
One trivial explanation for the observed initial increase in translational repression is that, in the cotransfection system, the ratio of GFP to mCherry plasmids actually delivered to a cell could deviate from the bulk ratio. In the extreme case, some cells might take up only the mCherry plasmid. This variability is more pronounced at low transfection levels and could result in underestimation of repression. To rule out this possibility, we conducted the experiments with bi-directional plasmids, which have a fixed GFP to mCherry ratio of 1:1 at the single-cell level. The initial increase persists with the bi-directional plasmids (Figure 3.6), excluding the possibility that it is caused by contamination with cells transfected with mCherry alone. The repression values derived from the two systems are also similar. However, we do not observe the final decrease in translational inhibition. This is due to the different reporter to indicator ratios in the two systems: the bi-directional system has a fixed ratio of 1, whereas the ratio was set to 7 in the cotransfection system. Thus the bi-directional system has not reached the saturation region, and the derived repression resembles the left part of the repression plot from the cotransfection system.
MicroRNA-mediated translational inhibition is a complicated process that can act at the translation initiation, elongation and termination stages. These layers of regulation might not act simultaneously. Depending on the relative concentrations of miRNA targets and regulatory factors, translational regulation at one stage might occur in addition to that at another. Indeed, the formation of microscopically visible P-bodies has been observed as a consequence of miRNA regulation. These subcellular foci are enriched in translational repressors such as RCK/p54 and eIF4E-transporter. P-bodies are also enriched in mRNA deadenylation and decapping enzymes, which in turn contribute to translational inhibition by disrupting the synergy between the two ends of transcripts. It is possible that P-body aggregates form only at certain target concentrations, and that their formation in turn facilitates miRNA regulation, resulting in the initial increase in translational repression.
[Figure 3.6 plot: Lats2 UTR repression, bi-directional plasmid; x-axis mCherry; legend: total repression fold, mRNA degradation repression, protein translation inhibition, RF = 1 reference line]
Figure 3.6 miRNA repression at transcriptional and translational levels for the Lats2 UTR, quantified from bi-directional plasmid data.
The bi-directional plasmid pTRE-GFP-Lats2Ori/MutUTR-mCherry was transfected into V19 ESCs and induced with 1 µg/ml doxycycline. MicroRNA repression was measured by flow cytometry. Translational inhibition is confirmed to increase initially. The inset shows miRNA repression for the same UTR measured in the cotransfection system. Repression values derived from the two measurements are similar, except that we do not observe the final titration of regulation in the bi-directional system. This is due to the different ratios of reporter to indicator in the two systems: we fixed this ratio at 7 in the cotransfection system, whereas it always equals 1 in the bi-directional system. Total repression is plotted in red, transcriptional repression in blue and translational contribution in green.
3.4 Discussion
It is still unresolved why genome-wide analyses usually differ from single-gene analyses and reporter assays (Eichhorn et al., 2014). One possible explanation is that single-gene analyses study the effect of a cohort of miRNAs acting on an endogenous gene, whereas genome-wide analyses normally explore the effect of overexpressing or depleting a single miRNA. Additionally, the transgene is usually overexpressed in reporter assays, at levels that can be much higher than those of most endogenous genes within cells, especially miRNA targets. We have shown that the relative contributions of translational inhibition and transcriptional degradation vary with target expression, and the discrepancies between reporter assays and genome-wide analyses might simply reflect different modes of regulation in different target expression regions (Table 3.2 and Supplementary Figure 3.6). Some recent analyses revealed that the relative contributions of miRNA-mediated transcriptional and translational regulation vary at different time points after induction of miRNA activity (Eichhorn et al., 2014; Selbach, 2008). Translational regulation dominates in the early phase after introduction of an exogenous miRNA, and mRNA destabilization accounts for the majority of repression at steady state (Eichhorn et al., 2014). Some of our initial efforts showed that miRNA regulation varies in both strength and profile for cells harvested 24 h to 72 h after transfection (data not shown). In addition, our data add another dimension to the description of miRNA regulation, revealing that the relative contributions of translational and transcriptional regulation also vary with target expression.
It might seem surprising that translational inhibition was observed to increase in the low target expression region for some of the UTRs under study, and further experiments are needed to corroborate this conclusion. Targeted mass spectrometry (targeted MS) could be one possible solution. By sacrificing the total number of monitored peptides, targeted MS can achieve very high resolution for a user-specified list of targeted precursor-fragment pairs ('transitions'), i.e. the fingerprints of the selected proteins. In our pilot experiment, the linearity of detection held down to 10,000 cells. The method also does not require culturing with isotope-labeled amino acids. Most importantly, it is background free, so it could provide confident data for the low target expression region. Cells under normal culturing and transfection conditions were sorted into 5 bins according to mCherry intensity, followed by targeted MS. The reporter protein, the indicator protein and the proteins of several top miR-290 targets were selected for measurement. However, even though the fingerprints of all desired proteins were successfully identified with clear resolution, the initial attempts with sorted samples have not been successful. The main challenges are the time scale of the experiment and the lack of intermediate checks on sample quality. Some efforts have been made to alleviate this (Supplementary Figure 3.4), and the sorted samples were also split for RNA sequencing, which provides confidence in RNA integrity and serves as indirect evidence of sample quality. More work is needed on the reproducibility of targeted MS itself, and we look forward to the results of this independent measurement technique.
3.5 Methods
3.5.1 Flow cytometry experiments
Each set of experiments contained at least the mock transfection (experiment 1) and the single-color transfections (experiments 3 and 4) for characterization of cellular background and color compensation between channels. For each 3'UTR of choice, experiments 7 and 8 were performed to validate the MutUTR design. Experiments 5 and 6, or 5 and 7, were performed together to allow quantification of the miRNA-mediated repression fold and noise control.
No. | Experiment | Abbreviation | Goals
1 | Mock transfected, HB with GFP mRNA-Cy5 FISH probe | Mock T | Single value estimation of background signal
2 | GFP single transfection, no HB | | Single color control for GFP. Color compensation for GFP BT into other 2 channels.
3 | GFP single transfection, HB | GFP single T | FMO control for Cy5. Color compensation for Cy5 BT into other 2 channels.
4 | mCherry single transfection, no HB | mCh single T | Single color control for mCherry. Color compensation for mCherry BT into other 2 channels. Also used as bin-wise estimation of background.
5 | WT OriUTR, Co-T/bi-directional TT | XXX OriUTR | miRNA activity reporter for an endogenous UTR
6 | WT MutUTR, Co-T/bi-directional TT | XXX MutUTR | miRNA unregulated control for the corresponding UTR
7 | KO OriUTR, Co-T/bi-directional TT | | miRNA unregulated control for the corresponding UTR
8 | KO MutUTR, Co-T/bi-directional TT | | Validation of MutUTR design for the corresponding UTR
Table 3.1 Transfection experiment and control set.
HB: hybridization. TT: transient transfection. BT: bleed-through. FMO: fluorescence minus one. WT: wild-type mESCs. KO: Dgcr8 knockout mESCs. Co-T: pCAG-d2eGFP-Ori/MutUTR and pCAG-mCherry cotransfection. Bi-directional: pTRE-d2eGFP-Ori/MutUTR-mCherry or pTRE-mCherry-Ori/MutUTR-ZsGreen bi-directional plasmid. The bleed-through from GFP into the other two channels was shown to be negligible, so only experiments 1, 3 and 4 were performed for estimation of background signal and color compensation in later experiments.
3.5.2 Flow Cytometry Data Processing
Sample gating:
Single cells were first gated from cell clusters and debris according to their forward scatter (FSC)
and side scatter (SSC) profiles.
Transfection conditions were optimized, and transfection efficiency was greater than 90%. The majority of cells took up at least some plasmid (even if only a few copies), and expression of the fluorescent proteins resulted in a global shift of fluorescence signals in the corresponding channels, as can be seen in the shift of the 'red eye' in the scatterplots of Figure 3.7. To reduce this background shift, the reporter plasmids can be diluted with a carrier plasmid. In previous studies (Mukherji et al., 2011), the carrier plasmid pUC18b was mixed with the reporter plasmid at a ratio of 50:1 in the transfection. The carrier plasmid is non-fluorescent, so it effectively reduces the background shift and extends the confidently quantifiable region. A carrier plasmid is necessary for this purpose and cannot be replaced by simply reducing the amount of transfected plasmid, because a certain ratio of DNA to transfection reagent has to be maintained to ensure effective transfection.
Within the 'red eye', fluorescence signals in different channels were correlated due to cell-cycle effects, as were the cellular background signals. Our single value background estimation did not work for this region. The mCherry single transfection background did not work either: because the co-transfected plasmids compete for uptake, the average uptake of mCherry plasmid in the cotransfection experiment was lower than in the mCherry single transfection experiment, and their 'red eyes' did not fully overlap. For simplicity, we therefore excluded these cells from later analysis.
Specifically, the marginal distributions of the transfection level indicator mCherry were plotted (Figure 3.8). A Gaussian distribution was fitted to the bell shape at the lower end, and only cells with log10(mCherry) > 3 were gated for later analysis. This corresponded to cells with mCherry expression more than 4 standard deviations above the mean of the Gaussian in the cotransfection experiments. These cells were far enough from the 'red eye' that cell-cycle effects and the choice of background estimation method had very little effect on the data analysis (Figure 3.9).
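The gate described above amounts to a threshold at the fitted mean plus 4 standard deviations. A minimal Python sketch, assuming the Gaussian has already been fitted to the low-end peak (the values 2.2 and 0.2 are the fit parameters reported with Figure 3.8; the function names are hypothetical and the example cells are made up):

```python
import math


def gate_threshold(mu, sigma, n_sd=4.0):
    """Lower bound of the analysis gate in log10(mCherry) units."""
    return mu + n_sd * sigma


def normal_cdf(z):
    """Cumulative probability of a standard normal variable at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def gate_cells(log10_mcherry, threshold):
    """Indices of cells whose log10(mCherry) lies above the threshold."""
    return [i for i, value in enumerate(log10_mcherry) if value > threshold]


threshold = gate_threshold(mu=2.2, sigma=0.2)      # close to 3.0
background_quantile = normal_cdf(4.0)              # quantile of the cutoff
selected = gate_cells([2.1, 2.9, 3.4, 4.8], threshold)
print(threshold, background_quantile, selected)
```

The quantile of the cutoff within the fitted background distribution exceeds 0.9999, so only a negligible fraction of background-like cells is expected to pass the gate.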
Figure 3.7 Scatterplot of GFP protein or mRNA levels versus mCherry expression.
Colors follow the Jet heat map of local cell density. Note the shift of fluorescence signals of the population majority ('red eye') in the various transfection experiments. Only cells to the right of the black lines were selected for data analysis, since they were far enough from the majority background to be robust to the choice of data analysis method.
[Figure 3.7 plots: panel rows Mock T, mCh single T, GFP single T, Casp2OriUTR, Casp2MutUTR, Lats2OriUTR, Lats2MutUTR; x-axis log10(mCherry)]
Figure 3.8 Marginal distribution of mCherry expression.
Transfection of fluorescent plasmids did not affect the background signal in other channels (e.g. the mCherry channels of Mock T and GFP single T were statistically identical despite high GFP expression).
However, expression of fluorescent proteins increased the signal in the corresponding channel, and high transfection efficiency resulted in shifts of the 'red eye' (e.g. compare rows 2 and 4-7 with rows 1 and 3).
Gaussians were fitted to the bell shapes at the lower end of the cotransfection experiments (rows 4-7); the parameters were estimated to be μ = 2.2 and σ = 0.2. Only cells with log10(mCherry) larger than 3 were gated for later analysis. These cells were more than 4 standard deviations above the mean of the population majority background (> 0.9999 quantile) and were very robust to the choice of background estimation method.
[Figure 3.8 plots: histogram panels for Mock T, mCh single T, GFP single T and the Casp2 and Lats2 Ori/MutUTR cotransfections; x-axis log10(mCherry)]
3.5.3 Background analysis and repression fold calculation
Cells autofluoresce, and non-specific binding of FISH probes inside cells particularly increases the background in the mRNA channel. The background signal needs to be separated from the measurements to obtain the true biological signal from protein and mRNA products. At the population level, this requires an estimate of the mean background signal. Two approaches were adopted. The mock transfection experiment (experiment 1 in Table 3.1) provided a single value estimate of the background in each channel, and the mCherry single plasmid transfection experiment (experiment 4 in Table 3.1) provided a bin-wise background estimate for different mCherry expression levels.
Cells were binned according to mCherry intensity, with the bin width set to ~0.2 in log10 space. The mean GFP protein and mRNA levels were calculated for each bin, and the standard error of the mean ($\mathrm{SEM} = \sigma/\sqrt{n}$, where $\sigma$ is the standard deviation and $n$ the number of cells in the bin) is plotted as error bars together with the mean in all bar plots. Some residual bleed-through between channels remained after color compensation (the y value increased slightly at very high mCherry levels in the mCherry single transfection experiment, plotted in black in Figure 3.9). The second approach accounts for this imperfect color compensation.
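The binning step can be sketched as follows. This is an illustrative Python example under stated assumptions, not the code used in this study: the function name is hypothetical, bins are indexed by floor(log10(mCherry) / width), and the sample standard deviation (n - 1 denominator) is assumed for the SEM.

```python
import math
from collections import defaultdict


def bin_mean_sem(log10_mcherry, signal, bin_width=0.2):
    """Group cells into log10(mCherry) bins; return {bin: (mean, SEM, n)}.

    A bin with index k covers [k * bin_width, (k + 1) * bin_width).
    """
    bins = defaultdict(list)
    for x, y in zip(log10_mcherry, signal):
        bins[math.floor(x / bin_width)].append(y)

    summary = {}
    for index, values in sorted(bins.items()):
        n = len(values)
        mean = sum(values) / n
        if n > 1:
            variance = sum((v - mean) ** 2 for v in values) / (n - 1)
            sem = math.sqrt(variance / n)   # sigma / sqrt(n)
        else:
            sem = float("nan")              # SEM undefined for a single cell
        summary[index] = (mean, sem, n)
    return summary


# Four hypothetical cells falling into two adjacent bins.
result = bin_mean_sem([3.05, 3.15, 3.25, 3.35], [10.0, 14.0, 20.0, 20.0])
print(result)
```

Integer bin indices are used as dictionary keys to avoid comparing floating-point bin edges.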
The repression fold was calculated accordingly for each of the two approaches.
For the single value estimation of background, the repression fold was calculated as:
$$\text{repression fold} = \frac{(\text{unregulated} \mid \text{bin}) - \text{background}}{(\text{regulated} \mid \text{bin}) - \text{background}} \tag{1}$$
And for the bin-wise estimation of background, the repression fold was calculated as:
$$\text{repression fold} = \frac{(\text{unregulated} \mid \text{bin}) - (\text{background} \mid \text{bin})}{(\text{regulated} \mid \text{bin}) - (\text{background} \mid \text{bin})} \tag{2}$$
These two approaches yielded essentially the same repression fold in the region used for analysis (i.e. log10(mCherry) > 3, Figure 3.9 c and d), so the conclusions drawn in this study are independent of the background estimation approach.
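The two background treatments can be compared side by side with a short Python sketch. The function names and all numbers are hypothetical; the inputs are assumed to be dictionaries of bin means keyed by mCherry bin, and bins in which the regulated signal does not exceed the background are skipped as an added safeguard.

```python
def rf_single_background(unregulated, regulated, background):
    """Equation (1): one scalar background shared by all bins."""
    folds = {}
    for b in sorted(set(unregulated) & set(regulated)):
        denominator = regulated[b] - background
        if denominator > 0:  # skip bins indistinguishable from background
            folds[b] = (unregulated[b] - background) / denominator
    return folds


def rf_binwise_background(unregulated, regulated, background_by_bin):
    """Equation (2): background estimated separately in each bin."""
    folds = {}
    shared = set(unregulated) & set(regulated) & set(background_by_bin)
    for b in sorted(shared):
        denominator = regulated[b] - background_by_bin[b]
        if denominator > 0:
            folds[b] = (unregulated[b] - background_by_bin[b]) / denominator
    return folds


# Hypothetical bin means for MutUTR (unregulated) and OriUTR (regulated).
mut = {15: 1100.0, 16: 2100.0}
ori = {15: 300.0, 16: 500.0}
print(rf_single_background(mut, ori, background=100.0))
print(rf_binwise_background(mut, ori, {15: 100.0, 16: 110.0}))
```

When the signal is far above the background, as in the gated region, a small bin-to-bin change in background shifts the ratio only slightly, which is in line with the agreement between the two approaches reported here.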
For the population repression fold calculation, all transfected cells were combined, including the cells in the 'red eyes'. As in bulk experiments such as luciferase assays, the background (single value estimate from mock transfection) was subtracted from the total signal. The fluorescence signals from all cells were summed, and the total expression of OriUTR was compared to that of MutUTR to derive a single repression fold value.
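A minimal Python sketch of this bulk-style calculation is given below. The function name and numbers are hypothetical, and the sketch assumes that the OriUTR and MutUTR samples contain the same number of cells, since the text does not state how differing cell numbers were handled.

```python
def population_repression_fold(mut_signals, ori_signals, background):
    """Single repression fold from summed, background-subtracted signals.

    Assumes both samples contain the same number of cells; otherwise the
    sums would first need to be normalised per cell.
    """
    if len(mut_signals) != len(ori_signals):
        raise ValueError("samples must contain the same number of cells")
    mut_total = sum(s - background for s in mut_signals)
    ori_total = sum(s - background for s in ori_signals)
    if ori_total <= 0:
        raise ValueError("regulated total must exceed the background")
    return mut_total / ori_total


# Three hypothetical cells per sample, including one near background.
fold = population_repression_fold(
    mut_signals=[1100.0, 2100.0, 150.0],
    ori_signals=[300.0, 500.0, 120.0],
    background=100.0,
)
print(fold)
```

Because the sums are dominated by the brightest cells, the resulting single value is weighted towards high transfection levels.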
Figure 3.9 Bar plot of protein and mRNA expression at different transfection levels, and repression fold calculated using different background estimations.
Cells were binned according to mCherry expression (i.e. different transfection levels), and the means of GFP protein (a) and GFP mRNA (b) were calculated in each bin. Error bars correspond to SEM. Mock transfection and mCherry single plasmid transfection experiments were conducted in parallel to evaluate cellular background, and the two approaches yield very similar estimates (black and gray lines in a and b). Repression fold was calculated by subtracting the single value background estimated from mock transfection (c) or the bin-wise background estimated from mCherry single transfection (d); the repression folds derived from the two approaches are very similar.
[Figure 3.9 plots: (a) Lats2 protein bar plot, (b) Lats2 mRNA bar plot, (c) Lats2 UTR repression fold with Mock T mean background subtraction, (d) Lats2 UTR repression fold with mCh single T background subtraction; x-axis mCherry; legends: MutUTR WT, OriUTR WT, mCh single T background, mean background in (a, b); total repression, transcriptional degradation, translational inhibition, RF = 1 reference line in (c, d)]
3.5.4 Microscopy image analysis
Microscopy imaging, cell segmentation and FISH transcript counting were performed as described in
Methods 2.4.11.
To quantify fluorescent protein expression, images were taken in the corresponding fluorescent
channel at the focal plane of the field of cells, which was chosen by the autofocus function of the
microscope. The mCherry protein we used was susceptible to photobleaching, so stack-wise
measurement along the z direction was not adopted. Cells were segmented as previously described,
and pixel values within the cell boundaries were summed. Cell height was counted as the number of
stacks between the appearance of the first transcript and the disappearance of the last transcript in the
view. The expression in the 2-D plane was then multiplied by the z-height to obtain total protein expression.
Here we assume that cells were bounded by the slide and cover glass, so that all cells within one
field of view had similar z-direction height and cell volume was proportional to cross-sectional area.
However, different sample slides were squeezed differently, so the z-height had to be taken into account.
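The quantification just described (sum of focal-plane pixel values inside each segmented cell, scaled by the z-height of the slide) can be written as a short sketch. The function name, the convention that label 0 marks background pixels, and the toy arrays are assumptions for illustration only.

```python
import numpy as np

def total_protein(image, labels, z_height):
    """Sum focal-plane pixel values inside each cell and scale by z-height.

    image    : 2-D array of fluorescence values at the focal plane
    labels   : 2-D integer array from segmentation (0 = no cell, assumed)
    z_height : number of z-stacks spanned by the cells on this slide
    """
    totals = {}
    for cell_id in np.unique(labels):
        if cell_id == 0:
            continue                      # skip pixels outside any cell
        plane_sum = image[labels == cell_id].sum()
        totals[int(cell_id)] = float(plane_sum * z_height)
    return totals

# Toy example with two cells on a 4 x 4 image (hypothetical values).
image = np.array([[1., 2., 0., 0.],
                  [3., 4., 0., 0.],
                  [0., 0., 5., 5.],
                  [0., 0., 5., 5.]])
labels = np.array([[1, 1, 0, 0],
                   [1, 1, 0, 0],
                   [0, 0, 2, 2],
                   [0, 0, 2, 2]])
totals = total_protein(image, labels, z_height=12)
```

Because the z-height enters as a per-slide factor, values from slides that were squeezed differently become comparable under the stated assumption that all cells on one slide share the same height.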
To extend the range of measurement, mCherry images were taken under different exposure conditions
for the Casp2UTR experiment. The signal of cells with high mCherry expression was saturated at the
larger gain (50 ms, gain 3) but not at the smaller gain (50 ms, gain 1). Cells that were not saturated
in either condition were used for regression, and the expression of saturated cells was
extrapolated from the fit (Figure 3.11). Images with shorter exposure time could also be used for extrapolation,
but exposure times of less than 10 ms were not recommended because they challenge the accuracy of
mechanical camera shutters.
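A minimal sketch of this extrapolation is given below, assuming a simple linear relation between the two gain settings and a boolean mask that flags saturated cells. The function name and the way saturation is flagged are assumptions; the slope and intercept are used only to generate synthetic data for the example.

```python
import numpy as np

def extrapolate_saturated(low_gain, high_gain, saturated):
    """Fit high-gain vs low-gain signal on unsaturated cells and use the
    fit to replace the high-gain values of saturated cells."""
    ok = ~saturated
    slope, intercept = np.polyfit(low_gain[ok], high_gain[ok], 1)
    corrected = high_gain.astype(float).copy()
    corrected[saturated] = intercept + slope * low_gain[saturated]
    return corrected, slope, intercept

# Synthetic cells (hypothetical): the high-gain signal clips at a ceiling.
low = np.linspace(1e3, 3.5e4, 50)
true_high = -2094 + 4.41 * low
ceiling = 1.0e5
high = np.minimum(true_high, ceiling)
saturated = high >= ceiling

corrected, slope, intercept = extrapolate_saturated(low, high, saturated)
```

Only cells that are unsaturated at both settings enter the fit, so the regression is not biased by clipped values.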
To estimate cellular autofluorescence background, images were taken on the mock transfected
sample, and the average was taken as the single value estimation for background in the protein
channel, similar to background estimation method 1 in flow cytometry.
Figure 3.10 Illustration of cell segmentation and transcript quantification of transfection
sample.
GFP transcripts were labeled with Cy5 probes. (a) Z-projection of transcript expression. (b) Cell
segmentation and transcript counting. Cells were segmented according to bright-field images, and
the boundaries are plotted in red. GFP transcripts detected by the algorithm are plotted as green
plus signs (+).
[Figure 3.11 plot: "RFP Imaging Condition Calibration". x-axis: RFP focal plane, 50 ms, gain 1. Linear fit: y = -2094 + 4.41*x, r = 0.9996.]
Figure 3.11 Extrapolation of RFP expression.
Usually, images with higher gain or longer exposure time were used for quantification because they
correlate better with the real signal. To extend the range of measurement, images with different
exposure conditions were taken for the RFP channel. Cells with high RFP expression were saturated in
the gain 3 condition but not in gain 1, so the latter was used to extrapolate RFP
expression. For GFP, extrapolation was not needed because protein expression was highly
repressed for the measured UTRs.
[Figure 3.12 panels: top row, Casp2 MutUTR co-transfection, 2439 cells; bottom row, Casp2 OriUTR co-transfection, 2497 cells. Recoverable axis labels: RFP total protein, GFP mRNA.]
Figure 3.12 Microscopy data analysis.
The top row shows data from cotransfection of pCAG-d2eGFP-Casp2MutUTR and pCAG-mCherry, and
the bottom row shows data from cotransfection of pCAG-d2eGFP-Casp2OriUTR and pCAG-mCherry.
Data are presented in a similar fashion to the scatterplots of flow cytometry data, except that GFP
transcript levels are plotted on a linear scale because some cells have zero expression.
The first two columns present the expression of GFP protein and GFP mRNA as a function of the
transfection level indicator mCherry. The scatterplot of GFP protein versus mCherry
from microscopy is very similar to the flow cytometry data (compare the first column in Figure 3.12 with
Figure 3.7). The strongly correlated lower-left part of the GFP protein plot corresponds to the 'red
eye' in the flow cytometry scatterplot. For the second column, if we estimate the cellular
autofluorescence in the transcript channel to be ~50 molecules (Klemm et al., 2014), add this
average background to the transcript count, and plot the total count on a log scale, the scatterplot
also resembles its flow cytometry counterpart (data not shown).
In the first and second columns, we observe that at the same transfection level miRNA
clearly represses both protein and transcript levels. In the third column,
we notice that transcript expression under miRNA regulation shifts to lower values and that, at
the same transcript expression, GFP protein expression is lower. Thus, the microscopy data are
consistent with the flow cytometry experiments, and miRNA exerts its regulation at both the
transcriptional and translational levels.
[Figure 3.13 panels: Casp2 mRNA bar plot, Casp2 protein bar plot, Lats2 mRNA bar plot, Lats2 protein bar plot.]
Figure 3.13 Microscopy data bar plot.
Protein and transcript expression were quantified for the microscopy experiments. A background
threshold of 10^9 on mCherry expression was applied, and only cells on the right side of the black
line were used for the repression fold plot. Because microscopy is background-free in the mRNA channel,
the transcriptional repression by miRNA is apparent even at low target expression.
3.6 Supplementary Information
[Supplementary Figure 3.1 panels: (a) Lin28 repression fold and (b) P21 repression fold versus mCherry, showing overall repression fold, mRNA degradation repression, protein translation inhibition and the RF = 1 reference line; (c) population repression, with percentage labels 88% and 70%.]
Supplementary Figure 3.1 miRNA repression at transcriptional and translational levels for
Lin28a and P21 UTRs, quantified from flow cytometry data.
Total miRNA-mediated repression (red), transcriptional repression (blue) and translational
contribution (green) were quantified for the Lin28a UTR (a) and the P21 UTR (b) over a target
expression range of about 100-fold. (c) All transfected cells were combined and a single
population repression value was derived, with total repression represented by red bars and
transcriptional repression by blue bars. No error bars are included because the experiments were
performed fewer than three times. The percentages correspond to the relative contribution of
transcriptional regulation to total repression.
[Supplementary Figure 3.2 plot: x-axis, mCherry (normalized to bin0).]
Supplementary Figure 3.2 miRNA-mediated transcriptional repression on Lats2 UTR, quantified
from cell sorting and downstream RNA sequencing.
pCAG-d2eGFP-Lats2Ori/MutUTR and pCAG-mCherry were co-transfected into WT ESCs, which were
sorted into 5 bins according to mCherry expression and then subjected to RNA sequencing. MicroRNA-mediated
transcriptional repression via the Lats2 UTR was calculated from gfp reads. Transcriptional
regulation is not titrated in the last bin.
[Supplementary Figure 3.3 panels: (a) lats2 repression fold, GLOR vs GLMR, versus mCherry; (b) lats2 repression fold, RLOX vs RLMX, versus ZsGreen. Curves: overall repression fold, mRNA degradation repression, protein translation inhibition, RF = 1 reference line; 0.99 quantile of mock-transfection background.]
Supplementary Figure 3.3 Color switch experiment.
Different combinations of fluorescent proteins were used as reporters and indicators. (a) GFP was
used as the reporter, followed by either the original or the mutated version of the Lats2a 3'UTR.
mCherry was used as the indicator protein and was followed by a non-miRNA-regulated
tail. The repression fold was calculated for different target expression levels in wild-type ESCs.
(b) mCherry was used as the reporter and ZsGreen as the indicator protein. MicroRNA
repression strength was again quantified for the Lats2a 3'UTR. A similar trend in miRNA repression fold,
i.e. relatively constant repression strength at the transcriptional level and an initial increase followed by
a final decrease at the overall protein level, was observed for both color sets, indicating that the
miRNA repression behavior we observed is attributable to miRNA-mediated regulation and not to any
intrinsic property of the fluorescent proteins used.
[Supplementary Figure 3.4 panels (a-f): paraformaldehyde, EtOH long fix, EtOH short fix, methanol + acetic acid, RNAprotector, RNAlater; x-axis, log10(mCherry).]
Supplementary Figure 3.4 The effects of fixation methods on fluorescent protein signal.
During FACS experiments, live cells could remain outside cell culture conditions, on ice, for up
to several hours between trypsinization and sorting into lysis buffer. The experimental schedule
was very restricted because of the short window of cell viability after trypsinization. In addition, cell states
could change and protein could be degraded during this time. We therefore tried various
fixation methods, hoping to find one that fixes the cell state without perturbing the
fluorescent protein signal. Since the fixed cells also had to be compatible with cell sorting and with
downstream mass spectrometry (MS) and RNA sequencing, the fixation method had to preserve
the morphology of the cells and the integrity of both protein and transcripts. The standard
paraformaldehyde fixation method was not applicable because of its cross-linking nature. We tried the
following methods, which can be coarsely separated into two types of mechanism: fixation by
denaturation and fixation by precipitation. Shown are scatterplots of the fluorescent signal after cell
fixation. (a) Fixation by standard 4% paraformaldehyde followed by 70% ethanol. (b) Fixation by
70% ethanol and preservation in ethanol. (c) Fixation by 70% ethanol for 10 minutes and
preservation in 1X PBS. (d) Fixation by 3:1 v/v methanol + acetic acid. (e) Fixation by
RNAprotector® (Qiagen). (f) Fixation by RNAlater® (Ambion). None of the methods fully
preserved the fluorescent signal. However, if fixation before sorting becomes essential in the future,
short ethanol fixation looks most promising, and shorter fixation times can be explored.
[Supplementary Figure 3.5 panels: Lats2 OriUTR and MutUTR; x-axis, GFP mRNA (log scale).]
Supplementary Figure 3.5 The relationship between protein and mRNA for Lats2 3'UTR.
pCAG-d2eGFP-Lats2Ori/MutUTR was co-transfected with the pCAG-mCherry plasmid into wild-type
ESCs. Only cells 4 standard deviations away from the transfected majority background were selected for the
plot. (a) Scatterplot of GFP protein versus mRNA, with marginal distributions plotted on the side.
OriUTR is plotted in red and MutUTR in blue. Only 1% of the measured cells
are plotted for better visualization. MicroRNAs repress both transcript and protein levels of
GFP-OriUTR, and the histograms of both are shifted towards smaller values. (b) Bar plot of protein
and mRNA. Cells were binned according to mRNA expression, and the mean protein expression
in each bin was calculated. Error bars correspond to SEM. (c) Protein produced per
transcript at different transcript expression levels. The conversion rate for OriUTR stays relatively
constant, but decreases for MutUTR at high transcript levels. This reflects the de-repression of
translation inhibition at high transcript levels, as depicted by the black dotted line.
[Table 3.2 body is illegible in this copy; surviving fragments: Oct4, Sox2, 203, 0.97, 1.17, "(other literature)".]
Table 3.2 FISH quantification of gene expression at transcript level in WT and Dgcr8-/- ESCs.
[Supplementary Figure 3.6 panels: left, mRNA bar plot for Casp2 OriUTR and MutUTR versus mCherry (A.U.); right, Casp2 repression fold versus mCherry (A.U.), showing total repression fold, mRNA degradation repression, protein translation inhibition and the RF = 1 reference line.]
Supplementary Figure 3.6 Linking endogenous gene expression to miRNA repression in vivo.
The relationship between endogenous gene expression and total miRNA repression fold is
monotonic in the low target abundance region. We can therefore link the two and estimate miRNA
repression, as well as the relative contributions of mRNA degradation and translation inhibition, in this
region. miRNAs preferentially target lowly expressed genes in vivo (Farh, 2005; Sood et al., 2006).
Consistent with this, Table 3.1 shows that most miR-290 targets are expressed at fewer than 100
transcripts per cell. Here we used microscopy measurements. First we linked mRNA levels with
indicator expression (left), then we linked indicator expression with miRNA repression (right).
We observe that miRNA repression is subtle in the low target abundance region and that
most of the contribution comes from mRNA degradation. Our estimate is consistent with in vivo
genome-wide assays (Baek, 2008; Guo et al., 2010; Hendrickson, 2009).
3.7 References
Andrei, M.A. (2005). A role for eIF4E and eIF4E-transporter in targeting mRNPs to mammalian
processing bodies. RNA 11, 717-727.
Baek, D. (2008). The impact of microRNAs on protein output. Nature 455, 64-71.
Behm-Ansmant, I. (2006). mRNA degradation by miRNAs and GW182 requires both CCR4:NOT
deadenylase and DCP1:DCP2 decapping complexes. Genes Dev 20, 1885-1898.
Beilharz, T.H. (2009). microRNA-mediated messenger RNA deadenylation contributes to
translational repression in mammalian cells. PLoS ONE 4, e6783.
Bhattacharyya, S.N., Habermacher, R., Martine, U., Closs, E.I., and Filipowicz, W. (2006). Relief
of microRNA-mediated translational repression in human cells subjected to stress. Cell 125, 1111-1124.
Braat, A.K., Yan, N., Arn, E., Harrison, D., and Macdonald, P.M. (2004). Localization-dependent
oskar protein accumulation; control after the initiation of translation. Dev Cell 7, 125-131.
Chu, C.Y., and Rana, T.M. (2006). Translation repression in human cells by microRNA-induced
gene silencing requires RCK/p54. PLoS Biol 4, e210.
Clark, I.E., Wyckoff, D., and Gavis, E.R. (2000). Synthesis of the posterior determinant nanos is
spatially restricted by a novel cotranslational regulatory mechanism. Curr Biol 10, 1311-1314.
Derry, M.C., Yanagiya, A., Martineau, Y., and Sonenberg, N. (2006). Regulation of poly(A)-binding protein through PABP-interacting proteins. Cold Spring Harb Symp Quant Biol 71, 537-543.
Ding, X.C., and Grosshans, H. (2009). Repression of C. elegans microRNA targets at the initiation
level of translation requires GW182 proteins. EMBO J 28, 213-222.
Doench, J.G., and Sharp, P.A. (2004). Specificity of microRNA target selection in translational
repression. Genes Dev 18, 504-511.
Eichhorn, S.W., Guo, H., McGeary, S.E., Rodriguez-Mias, R.A., Shin, C., Baek,
D., Hsu, S.-h., Ghoshal, K., Villen, J., and Bartel, D.P. (2014). mRNA destabilization is the
dominant effect of mammalian microRNAs by the time substantial repression ensues.
Mol Cell 56, 104-115.
Eulalio, A. (2007). Target-specific requirements for enhancers of decapping in miRNA-mediated
gene silencing. Genes Dev 21, 2558-2570.
Eulalio, A. (2009). Deadenylation is a widespread effect of miRNA regulation. RNA 15, 21-32.
Eulalio, A., Behm-Ansmant, I., and Izaurralde, E. (2007a). P-bodies: at the crossroads of post-transcriptional pathways. Nature Rev Mol Cell Biol 8, 9-22.
Eulalio, A., Behm-Ansmant, I., Schweizer, D., and Izaurralde, E. (2007b). P-body formation is a
consequence, not the cause, of RNA-mediated gene silencing. Mol Cell Biol 27, 3970-3981.
Eulalio, A., Huntzinger, E., and Izaurralde, E. (2008). GW182 interaction with Argonaute is
essential for miRNA-mediated translational repression and mRNA decay. Nature Struct Mol Biol
15, 346-353.
Fabian, M.R. (2009). Mammalian miRNA RISC recruits CAF1 and PABP to affect PABP-dependent deadenylation. Mol Cell 35, 868-880.
Fabian, M.R., Sonenberg, N., and Filipowicz, W. (2010). Regulation of mRNA translation and
stability by microRNAs. Annu Rev Biochem 79, 351-379.
Farh, K.K. (2005). The widespread impact of mammalian microRNAs on mRNA repression and
evolution. Science 310, 1817-1821.
Filipowicz, W., Bhattacharyya, S.N., and Sonenberg, N. (2008). Mechanisms of post-transcriptional regulation by microRNAs: are the answers in sight? Nature Rev Genet 9, 102-114.
Giraldez, A.J. (2006). Zebrafish MiR-430 promotes deadenylation and clearance of maternal
mRNAs. Science 312, 75-79.
Guo, H., Ingolia, N.T., Weissman, J.S., and Bartel, D.P. (2010). Mammalian microRNAs
predominantly act to decrease target mRNA levels. Nature 466, 835-840.
Hendrickson, D.G. (2009). Concordant regulation of translation and mRNA abundance for
hundreds of targets of a human microRNA. PLoS Biol 7, e1000238.
Huang, J. (2007). Derepression of micro-RNA-mediated protein translation inhibition by
apolipoprotein B mRNA-editing enzyme catalytic polypeptide-like 3G (APOBEC3G) and its
family members. J Biol Chem 282, 33632-33640.
Humphreys, D.T., Westman, B.J., Martin, D.I., and Preiss, T. (2005). MicroRNAs control
translation initiation by inhibiting eukaryotic initiation factor 4E/cap and poly(A) tail function.
Proc Natl Acad Sci USA 102, 16961-16966.
Iwasaki, S., Kawamata, T., and Tomari, Y. (2009). Drosophila argonaute1 and argonaute2 employ
distinct mechanisms for translational repression. Mol Cell 34, 58-67.
Jackson, R.J. (2005). Alternative mechanisms of initiating translation of mammalian mRNAs.
Biochem Soc Trans 33, 1231-1241.
Jakymiw, A. (2005). Disruption of GW bodies impairs mammalian RNA interference. Nature Cell
Biol 7, 1267-1274.
Kedersha, N. (2005). Stress granules and processing bodies are dynamically linked sites of mRNP
remodeling. J Cell Biol 169, 871-884.
Kiriakidou, M. (2007). An mRNA m7G cap binding-like motif within human Ago2 represses
translation. Cell 129, 1141-1151.
Klemm, S., Semrau, S., Wiebrands, K., Mooijman, D., Faddah, D.A., Jaenisch, R., and van
Oudenaarden, A. (2014). Transcriptional profiling of cells sorted by RNA abundance. Nat Meth
11, 549-551.
Krutzfeldt, J. (2005). Silencing of microRNAs in vivo with 'antagomirs'. Nature 438, 685-689.
Leung, A.K., Calabrese, J.M., and Sharp, P.A. (2006). Quantitative analysis of Argonaute protein
reveals microRNA-dependent localization to stress granules. Proc Natl Acad Sci USA 103, 18125-18130.
Lim, L.P. (2005). Microarray analysis shows that some microRNAs downregulate large numbers
of target mRNAs. Nature 433, 769-773.
Liu, J., Valencia-Sanchez, M.A., Hannon, G.J., and Parker, R. (2005). MicroRNA-dependent
localization of targeted mRNAs to mammalian P-bodies. Nature Cell Biol 7, 719-723.
Lytle, J.R., Yario, T.A., and Steitz, J.A. (2007). Target mRNAs are repressed as efficiently by
microRNA-binding sites in the 5' UTR as in the 3' UTR. Proc Natl Acad Sci USA
104, 9667-9672.
Maroney, P.A., Yu, Y., Fisher, J., and Nilsen, T.W. (2006). Evidence that microRNAs are
associated with translating messenger RNAs in human cells. Nature Struct Mol Biol 13, 1102-1107.
Mathonnet, G. (2007). MicroRNA inhibition of translation initiation in vitro by targeting the cap-binding complex eIF4F. Science 317, 1764-1767.
Meister, G. (2005). Identification of novel argonaute-associated proteins. Curr Biol 15, 2149-2155.
Mishima, Y. (2006). Differential regulation of germline mRNAs in soma and germ cells by
zebrafish miR-430. Curr Biol 16, 2135-2142.
Mootz, D., Ho, D.M., and Hunter, C.P. (2004). The STAR-Maxi-KH domain protein GLD-1
mediates a developmental switch in the translational control of C. elegans PAL-1. Development
131, 3263-3272.
Mukherji, S., Ebert, M.S., Zheng, G.X.Y., Tsang, J.S., Sharp, P.A., and van Oudenaarden, A.
(2011). MicroRNAs can generate thresholds in target gene expression. Nat Genet 43, 854-859.
Nelson, P.T., Hatzigeorgiou, A.G., and Mourelatos, Z. (2004). miRNP: mRNA association in
polyribosomes in a human neuronal cell line. RNA 10, 387-394.
Nottrott, S., Simard, M.J., and Richter, J.D. (2006). Human let-7a miRNA blocks protein
production on actively translating polyribosomes. Nature Struct Mol Biol 13, 1108-1114.
Olsen, P.H., and Ambros, V. (1999). The lin-4 regulatory RNA controls developmental timing in
Caenorhabditis elegans by blocking LIN-14 protein synthesis after the initiation of translation.
Dev Biol 216, 671-680.
Petersen, C.P., Bordeleau, M.E., Pelletier, J., and Sharp, P.A. (2006). Short RNAs repress
translation after initiation in mammalian cells. Mol Cell 21, 533-542.
Piao, X., Zhang, X., Wu, L., and Belasco, J.G. (2010). CCR4-NOT deadenylates mRNA associated
with RNA-induced silencing complexes in human cells. Mol Cell Biol 30, 1486-1494.
Pillai, R.S. (2005). Inhibition of translational initiation by Let-7 MicroRNA in human cells.
Science 309, 1573-1576.
Poy, M.N., Eliasson, L., Krutzfeldt, J., Kuwajima, S., Ma, X., MacDonald, P.E., Pfeffer, S., Tuschl,
T., Rajewsky, N., Rorsman, P., et al. (2004). A pancreatic islet-specific microRNA regulates
insulin secretion. Nature 432, 226-230.
Rehwinkel, J. (2006). Genome-wide analysis of mRNAs regulated by Drosha and Argonaute
proteins in Drosophila melanogaster. Mol Cell Biol 26, 2965-2975.
Rehwinkel, J., Behm-Ansmant, I., Gatfield, D., and Izaurralde, E. (2005). A crucial role for
GW182 and the DCP1:DCP2 decapping complex in miRNA-mediated gene silencing. RNA 11,
1640-1647.
Ruegsegger, U., Leber, J.H., and Walter, P. (2001). Block of HAC1 mRNA translation by long-range base pairing is released by cytoplasmic splicing upon induction of the unfolded protein
response. Cell 107, 103-114.
Schmitter, D. (2006). Effects of Dicer and Argonaute down-regulation on mRNA levels in human
HEK293 cells. Nucleic Acids Res 34, 4801-4815.
Seggerson, K., Tang, L., and Moss, E.G. (2002). Two genetic circuits repress the Caenorhabditis
elegans heterochronic gene lin-28 after translation initiation. Dev Biol 243, 215-225.
Selbach, M. (2008). Widespread changes in protein synthesis induced by microRNAs. Nature 455,
58-63.
Sood, P., Krek, A., Zavolan, M., Macino, G., and Rajewsky, N. (2006). Cell-type-specific
signatures of microRNAs on target mRNA expression. Proc Natl Acad Sci USA 103, 2746-2751.
Stark, A., Brennecke, J., Bushati, N., Russell, R.B., and Cohen, S.M. (2005). Animal microRNAs
confer robustness to gene expression and have a significant impact on 3' UTR evolution. Cell
123, 1133-1146.
Thermann, R., and Hentze, M.W. (2007). Drosophila miR2 induces pseudo-polysomes and inhibits
translation initiation. Nature 447, 875-878.
Wakiyama, M., Takimoto, K., Ohara, O., and Yokoyama, S. (2007). Let-7 microRNA-mediated
mRNA deadenylation and translational repression in a mammalian cell-free system. Genes Dev
21, 1857-1862.
Wang, B., Love, T.M., Call, M.E., Doench, J.G., and Novina, C.D. (2006). Recapitulation of short
RNA-directed translational gene silencing in vitro. Mol Cell 22, 553-560.
Wu, L., Fan, J., and Belasco, J.G. (2006). MicroRNAs direct rapid deadenylation of mRNA. Proc
Natl Acad Sci USA 103, 4034-4039.
Yekta, S., Shih, I.H., and Bartel, D.P. (2004). MicroRNA-directed cleavage of HOXB8 mRNA.
Science 304, 594-596.
Zdanowicz, A. (2009). Drosophila miR2 primarily targets the m7GpppN cap structure for
translational repression. Mol Cell 35, 881-888.
Zhao, Y., Samal, E., and Srivastava, D. (2005). Serum response factor regulates a muscle-specific
microRNA that targets Hand2 during cardiogenesis. Nature 436, 214-220.
Chapter 4 Application of reporter system to study microRNA control
of protein expression noise
4.1 Abstract
MicroRNAs repress many genes in metazoan organisms by accelerating mRNA degradation and
inhibiting translation, thereby reducing the level of protein. However, microRNAs only slightly
reduce the mean expression for most targeted proteins, leading to speculation about their role in
the variability of protein expression, or noise. Here we use mathematical modeling and single cell
reporter assays to show that microRNAs - in conjunction with increased transcription - decrease
protein expression noise for lowly expressed genes, but increase noise for highly expressed genes.
Genes that are regulated by multiple microRNAs show more pronounced noise reduction. We
estimate that hundreds of (lowly expressed) genes in mouse embryonic stem cells have reduced
noise due to substantial microRNA regulation. Our findings therefore suggest that microRNAs
confer precision to protein expression and thus offer plausible explanations for the commonly
observed combinatorial targeting of endogenous genes by multiple microRNAs as well as the
preferential targeting of lowly expressed genes.
4.2 Results
MicroRNAs regulate numerous genes in metazoan organisms (Enright et al., 2003; John et al.,
2004; Lee et al., 1993; Lewis et al., 2005; Wightman et al., 1993) by accelerating mRNA
degradation and inhibiting translation (Guo et al., 2010; Lim, 2005). Although the physiological
function of some microRNAs is known in detail (Brennecke et al., 2003; Johnston and Hobert,
2003; Lee et al., 1993; Wightman et al., 1993), it is unclear why microRNA regulation is so
ubiquitous and conserved, since individual microRNAs only weakly repress the vast majority of
their target genes (Baek, 2008; Selbach, 2008) and knockouts rarely show phenotypes (Miska et
al., 2007). One proposed reason for this widespread regulation is the ability of microRNAs to
provide precision to gene expression (Bartel and Chen, 2004), and previous work has hypothesized
that microRNAs could reduce protein expression variability (noise) when their repressive post-transcriptional effects are antagonized by accelerated transcriptional dynamics (Ebert and Sharp,
2012; Noorbakhsh et al., 2013). However, since microRNA levels are themselves variable, one
should expect the propagation of their fluctuations to introduce additional noise (Figure 4.1a).
To test the effects of endogenous microRNAs, we quantified protein levels and fluctuations in
mouse embryonic stem cells (mESCs) using a dual fluorescent reporter system (Mukherji et al.,
2011), in which two different reporters (ZsGreen and mCherry) are transcribed from a common bi-directional promoter (Figure 4.1b). The 3'UTR of one reporter (mCherry) carried different variants and
numbers of microRNA binding sites, and we quantified single-cell fluorescence using
a flow cytometer (Figure 4.1c).
We used ZsGreen fluorescence intensity to bin cells with similar transcriptional activity (mostly
due to varying plasmid copy numbers) and calculated the mean and noise (standard deviation divided
by mean) of the mCherry intensity distribution in each bin (Figure 4.1d).
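The per-bin calculation of mean and noise can be sketched as follows. This is an illustrative example on synthetic data under stated assumptions: the function name, the log-spaced bin edges, the minimum cell count per bin and the lognormal signal model are hypothetical choices, not the settings used for the measurements.

```python
import numpy as np

def noise_per_bin(zsgreen, mcherry, edges, min_cells=50):
    """Mean and noise (standard deviation / mean) of mCherry in each
    ZsGreen bin; bins with too few cells are skipped."""
    idx = np.digitize(zsgreen, edges) - 1
    means, noises = [], []
    for b in range(len(edges) - 1):
        vals = mcherry[idx == b]
        if vals.size < min_cells:
            continue
        m = vals.mean()
        means.append(m)
        noises.append(vals.std(ddof=1) / m)
    return np.array(means), np.array(noises)

# Synthetic population (hypothetical): mCherry tracks ZsGreen with
# multiplicative lognormal scatter.
rng = np.random.default_rng(1)
zsgreen = 10 ** rng.uniform(2, 5, 30000)
mcherry = zsgreen * rng.lognormal(0, 0.3, zsgreen.size)
edges = np.logspace(2, 5, 13)          # 12 bins, a quarter decade each

means, noises = noise_per_bin(zsgreen, mcherry, edges)
```

Plotting `noises` against `means` for a regulated and an unregulated construct gives the kind of noise-versus-mean comparison discussed in the following paragraphs. Note that the finite bin width itself contributes some spread to each bin, so narrower bins reduce that contribution at the cost of fewer cells per bin.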
We first assessed the effect of endogenous miR-20a in mESCs on a designed target site in the
reporter. In cells with low expression of a reporter (mCherry) containing a miR-20a site, noise was
reduced compared to an unregulated control at equal mCherry expression, whereas noise was increased
at high reporter expression (Figure 4.1e). These changes in mCherry noise are more
pronounced when the miR-20a sites in the reporter are perfect targets or when there are multiple
sites in the 3'UTR (Figure 4.1f, g).
[Figure 4.1 panels (a-g): (c) mCherry versus ZsGreen intensity [a.u.] for no 3'UTR and four bulged miR-20a sites; (d) mCherry intensity [a.u.] distribution of one ZsGreen bin; (e-g) noise versus mCherry intensity mean [a.u.].]
Figure 4.1 microRNA regulation has opposing effects on noise at low and high protein
expression.
(a) The expression of a microRNA regulated gene. Noise in protein expression originates from
stochastic molecular reactions in the production of the protein (intrinsic noise; jagged arrows) or
fluctuations propagating from external factors (extrinsic noise). (b) Plasmid reporter system
coding for two fluorescent proteins, ZsGreen and mCherry, transcribed from a common bi-directional
promoter. The mCherry 3'UTR can be modified to contain microRNA binding sites. (c)
Overlay of two flow cytometry measurements of mESC populations transiently transfected with
different variants of the plasmid system: empty mCherry 3'UTR (black) and mCherry 3'UTR
containing four bulged miR-20a binding sites (blue). For further processing cells are binned
according to ZsGreen intensity (red lines) and cells below ZsGreen background are discarded (grey)
(See Methods). (d) Mean and noise (standard deviation divided by mean) of mCherry intensities
are calculated from marginal distributions in each bin. (e-g) Noise of mCherry intensity as a
function of mean mCherry intensity in each bin for three different miR-20a regulated constructs
(blue) compared to respective unregulated constructs (black). Panels are ordered from left to right
according to increasing repression of constructs by miR-20a. Dots: data, lines and shaded area:
model fit.
In order to explore the mechanism for these seemingly opposing effects on protein expression
noise, we built a mathematical model in which we decomposed total noise into intrinsic noise and
extrinsic noise ($\eta_{tot}^2 = \eta_{int}^2 + \eta_{ext}^2$, Eq. 1) (Elowitz et al., 2002; Swain et al., 2002) (see
Supplementary Model of (Jörn M. Schmiedel, 2015)). Intrinsic noise $\eta_{int}$ results from the
stochasticity of transcription, translation and decay but is mostly dominated by transcriptional
dynamics (Blake et al., 2003; Raj et al., 2006) and low mRNA copy numbers (Bar-Even et al.,
2006; Ozbudak et al., 2002). Extrinsic noise $\eta_{ext}$ stems from fluctuations propagating from
external factors to the gene (Pedraza and van Oudenaarden, 2005). The modeling predicted
opposing effects of microRNA regulation on intrinsic and extrinsic noise. On the one hand, the
model predicted that a microRNA-regulated gene (reg) has reduced intrinsic noise compared to an
unregulated gene (unreg) at equal protein expression levels; intrinsic noise is approximately
reduced by the square root of microRNA-mediated fold-repression $r$,
$\eta_{int}^{reg} = \eta_{int}^{unreg} / \sqrt{r}$ (Eq. 2) (Figure
4.2a). Noise reduction results from microRNA-mediated accelerated mRNA turnover and
increased transcriptional activity needed to produce the same amount of protein (Ebert and Sharp,
2012). The model predicts that the effect occurs independently of the mode of microRNA-mediated
repression (Jörn M. Schmiedel, 2015). On the other hand, the model predicted that
microRNA regulation acts as an additional extrinsic noise source $\eta_{ext} = \tilde{\eta}_{\mu}\,\phi$ (Eq. 3) (Figure
4.2b). The magnitude of $\eta_{ext}$ depends on the noise in the pool of regulating microRNAs ($\tilde{\eta}_{\mu}$) and
on how strongly microRNAs repress the target ($\phi$) (Jörn M. Schmiedel, 2015). Therefore the
model predicted that the combined net effects of decreased intrinsic and additional extrinsic noise
would result in decreased total noise at low expression, but increased total noise at high expression
(Figure 4.2c); and model fits, with the microRNA pool noise $\tilde{\eta}_{\mu}$ as the only free parameter, yield
accurate agreement with the experimentally observed total noise profiles (Figure 4.1e-g).
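To make the qualitative prediction concrete, the following toy calculation combines the three ingredients described above: total noise as the quadrature sum of intrinsic and extrinsic noise, intrinsic noise reduced by the square root of the fold-repression r, and an additional extrinsic term given by the product of microRNA pool noise and a repression-strength factor. It is a sketch under simplifying assumptions that are not taken from the model in this thesis: the unregulated intrinsic noise is assumed to scale as the inverse square root of protein expression, a constant baseline extrinsic noise is added, the repression-strength factor is held constant (no pool saturation), and all parameter values are hypothetical.

```python
import numpy as np

def total_noise(p, r=1.0, pool_noise=0.0, phi=0.0, b=100.0, eta_ext0=0.2):
    """Toy total noise at protein expression level p.

    r          : microRNA-mediated fold-repression (1 = unregulated)
    pool_noise : noise of the regulating microRNA pool
    phi        : repression-strength factor (assumed constant here)
    b          : scale of intrinsic noise, eta_int^2 = b / p (assumption)
    eta_ext0   : baseline extrinsic noise shared by both genes (assumption)
    """
    eta_int_sq = (b / p) / r                 # intrinsic noise reduced by sqrt(r)
    eta_mu_sq = (pool_noise * phi) ** 2      # microRNA-derived extrinsic noise
    return np.sqrt(eta_int_sq + eta_ext0 ** 2 + eta_mu_sq)

p = np.logspace(1, 5, 50)                    # protein expression [a.u.]
unreg = total_noise(p)
reg = total_noise(p, r=4.0, pool_noise=0.4, phi=0.75)

# Expression level at which the two curves cross in this toy model.
p_cross = 100.0 * (1 - 1 / 4.0) / (0.4 * 0.75) ** 2
```

In this sketch the regulated gene is less noisy than the unregulated one below `p_cross` and noisier above it, which reproduces the qualitative crossover between low and high expression described in the text.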
[Figure 4.2 panels: (a) intrinsic noise, (b) extrinsic noise and (c) total noise versus protein expression [a.u.].]
Figure 4.2 Predictions of the noise model for a microRNA-regulated gene.
(a) Intrinsic noise due to low molecule numbers declines with increasing expression. MicroRNA
regulation reduces intrinsic noise as a function of repression due to higher mRNA turnover. (b)
Noise in microRNA pool propagating to target gene results in additional extrinsic noise dependent
on conferred repression and saturation of the microRNA pool. (c) Net influence of microRNA
regulation results in decreased total noise at low and increased total noise at high expression levels.
[Figure 4.3, panels a-e. Recoverable axis labels: mean mCherry + ZsGreen intensity [a.u.]; sqrt(fold-repression); fold-repression. Recoverable legend entries: no 3'UTR; miR-20a.]
Figure 4.3 Exploration of intrinsic and extrinsic noise effects.
(a) Plasmid reporter system with identical 3'UTRs for ZsGreen and mCherry, to quantify
expression-dependent intrinsic noise. (b) Intrinsic noise as a function of expression for three
different miR-20a bi-regulated constructs. Dots: data, lines and shaded area: model fit. (c)
Measured intrinsic noise reduction for bi-regulated constructs compared to fold-repression, as
measured independently by mCherry-regulated constructs. 3 biological replicates. (d) MicroRNA
pool noise estimates for nine different microRNAs endogenously expressed in mESC. Subset of
microRNAs with two instead of one gene copies indicated in red. n>3 biological replicates. (e)
MicroRNA pool noise estimates for individual and mixed pools, using data from reporters with
two perfect binding sites behind mCherry as indicated. Red bars: expectation for mixed pool noise
when sub-pools were fully correlated. n=3 biological replicates.
To distinguish between microRNA-mediated intrinsic and extrinsic noise effects experimentally,
we modified the plasmid reporter system so that both reporters contained identical 3'UTRs (Figure
4.3a). Now intracellular differences in their expression can only result from processes individual
to each gene, i.e. intrinsic noise. Comparing identical reporters both with and without miR-20a
sites, we show that miR-20a regulation reduces intrinsic noise compared to an unregulated
construct (Figure 4.3b) by the square root of fold-repression, as predicted by modeling (Figure
4.3c). These results also show that the observed increase in total noise at high mCherry expression
must be due to additional extrinsic noise (Jörn M. Schmiedel, 2015).
The model together with the experiments suggests that the reduction of intrinsic noise is a generic
property of microRNAs and should occur irrespective of the specific microRNAs or the molecular
details of the mRNA-microRNA interaction. To test the generality of these conclusions we
constructed eight additional reporters with mCherry 3'UTRs containing a perfect binding site for
a variety of microRNAs that are endogenously expressed in mESC (Jörn M. Schmiedel, 2015).
For all constructs, the intrinsic noise reduction was approximately the square root of fold-repression (Jörn M. Schmiedel, 2015). This was also confirmed by direct measurement for miR-291a target sites (Figure 4.3c and (Jörn M. Schmiedel, 2015)) and reporters containing AU-rich
elements (Barreau et al., 2005; Jörn M. Schmiedel, 2015), the latter further supporting the
plausibility that reduction of intrinsic noise is a generic property of post-transcriptional repressors.
Additional extrinsic noise stems from the variability of the microRNA pool and consistent with
this we find that microRNA pool noise indeed differs between microRNAs (Figure 4.3d). The
validity of these results is supported by the observation that different constructs assaying the same
microRNA yield similar pool noise estimates (Jörn M. Schmiedel, 2015). Although
microRNA pool noise decreases for microRNAs conferring stronger repression, it is still
substantial for the most potent and highly expressed microRNAs in mESC (miR-290 cluster
(Marson, 2008)) (Figure 4.3d). Interestingly, the microRNAs with two independent gene copies,
producing the identical mature microRNA (Figure 4.3d, red), tend to have lower microRNA pool
noise compared to single gene microRNAs. This suggested to us that microRNA pools could have
lower noise if they consist of independently transcribed microRNAs and thus uncorrelated
fluctuations can average out. To test this hypothesis, we constructed reporters with a perfect target
site for miR-20a and either miR-16 or miR-290 in the mCherry 3'UTR and compared them to
reporters with two perfect target sites for miR-16, miR-20a or miR-290, respectively. We find that
the noise levels in the mixed pools are lower than expected if the individual microRNA pools were
fully correlated and can be lower than the noise in the individual microRNA pools (Jörn M.
Schmiedel, 2015). Therefore our data show that, if noise between different microRNAs is not
correlated, combinatorial regulation can result in lower noise of the target protein.
In contrast to our artificial 3'UTRs, endogenous mRNAs often contain many binding sites to
different microRNAs and with less complementarity (Enright et al., 2003; Krek et al., 2005). To
test if our findings are likely applicable in vivo, we constructed mCherry reporters with the 3'UTRs
from Wee1, Lats2, Casp2 and Rbl2, all predicted to be combinatorially regulated by mESC
microRNAs (Jörn M. Schmiedel, 2015). This multiple-microRNA regulation resulted in 3- to 5.5-fold repression compared to the control 3'UTRs containing mutated sites and reduced total noise
except when reporter expression levels were high (Figure 4.4a and (Jörn M. Schmiedel, 2015)).
Model fits estimate intrinsic noise reduction for the wild-type 3'UTRs as large as the square root
of fold-repression (Jörn M. Schmiedel, 2015), consistent with our findings for the artificial 3'UTRs.
Furthermore, little additional noise at high expression levels results from low noise in the mixed
microRNA pools regulating the wild-type 3'UTRs (Jörn M. Schmiedel, 2015), corroborating that
combinatorial microRNA regulation is a potent way to optimize overall noise reduction.
To determine whether the reporter assay covers expression levels relevant to endogenous genes,
we used fluorescence-activated cell sorting and RNA sequencing (Jörn M. Schmiedel, 2015). The
reporter assay covers the range of 25% to 99% (~1 RPKM to ~500 RPKM) of expressed genes in
mESC (Figure 4.4b). Model-based extrapolation shows that reduction of total noise for the
endogenous 3'UTRs extends in a graded fashion up to the top 10% of the transcriptome expression
distribution (Figure 4.4c). While most microRNAs individually repress genes only to a small
extent (Baek, 2008; Selbach, 2008), we find that hundreds of genes are substantially repressed
(>2 fold) by the combinatorial action of microRNAs in mESC (Jörn M. Schmiedel, 2015), as
determined from transcriptome expression data for wild-type and microRNA-deficient Dicer
knockout mESC (Leung, 2011). Furthermore, most of the highly repressed genes have low
expression levels ((Jörn M. Schmiedel, 2015) consistent with refs. (Farh, 2005; Sood et al., 2006)),
suggesting that these genes should have reduced protein expression noise as a consequence of
microRNA regulation in vivo.
In summary, our integrated theoretical and experimental analyses show that reduction of intrinsic
noise is a generic property of microRNA, and more generally post-transcriptional regulation that
is linked to repression of protein expression. MicroRNAs preferentially target lowly expressed
genes, for which noise reduction will be strongest, while selectively avoiding ubiquitous and
highly expressed genes (Farh, 2005; Sood et al., 2006). Combinatorial microRNA regulation, a
widely observed phenomenon in vivo (Enright et al., 2003; Krek et al., 2005), enhances overall
noise reduction by providing strong repression to endogenous genes with only little additional
noise from microRNA pools. Combinatorial microRNA regulation may thus be a potent
mechanism to reinforce cellular identity by reducing gene expression fluctuations that are
undesirable for the cell.
[Figure 4.4 plots: (a) noise versus mean mCherry intensity [a.u.] for the Lats2 3'UTR (wt and mut); (b) mCherry mRNA levels [RPKM] and percentage of genes expressed below a given level in the mESC transcriptome; (c) noise reduction versus mRNA levels [RPKM], with panel labels including Lats2 3'UTR, Casp2 3'UTR and Wee1 3'UTR.]
Figure 4.4 Reduction of total noise dominates for microRNA-regulated endogenous 3'UTRs.
(a) Noise as function of mean for mCherry with Lats2 3'UTR (blue) or control 3'UTR with point-mutated microRNA binding sites (black). Dots: data, lines and shaded area: model fit. (b) Mapping
fluorescent reporter range to mESC transcriptome. (Upper panel) FACS sorting and least-squares
regression were used to determine conversion between mean mCherry fluorescent intensities and
mCherry mRNA levels (as measured by RNA-seq). (Lower panel) Range covered by mCherry in
relation to transcriptome expression in mESC (~25% to ~99%). (c) Model-based extrapolation of
total noise in assayed endogenous 3'UTRs relative to control 3'UTRs as a function of
transcriptome expression (blue line and area: mean and 95% confidence interval based on
parameter estimates of three biological replicates).
4.3 Methods
4.3.1 Reporter plasmid construction
Starting from a previously established reporter system (Mukherji et al., 2011), eYFP was replaced
with ZsGreen1-1 (Clontech) using EcoRI and NdeI digestion sites. MicroRNA binding sites were
inserted into the mCherry 3'UTR using ClaI and EcoRV digestion sites and into the ZsGreen1-1
3'UTR using NdeI and XbaI digestion sites. N=1 bulged (fully complementary to microRNA except
central bulge, as in (Mukherji et al., 2011)) and perfect (fully complementary) microRNA target
sites were created by annealing complementary single-stranded oligonucleotides (IDT) with respective
overhangs for digestion sites at 65°C for 30 minutes, after heating to 95°C for 5
minutes. The N=4 bulged miR-20a binding site 3'UTR contains random 50 bp spacers between
individual binding sites and was synthesized (IDT gBlocks). Wee1 wild-type and mutated 3'UTR
fragments (nt 130-610) as well as Casp2 and Rbl2 wild-type and mutated 3'UTRs were
synthesized (IDT gBlocks). The Lats2 wild-type 3'UTR was amplified from murine embryonic
stem cell cDNA and was sequence confirmed. The mutated version of Lats2 3'UTR was
synthesized (GeneArt).
The mutated 3'UTRs were synthesized with double point mutations in all predicted microRNA
binding sites (Targetscan6.2 (Garcia et al., 2011)) of significantly expressed mESC microRNAs
(Marson, 2008). Seed positions 3 and 5 were mutated such that purines and pyrimidines were
interchanged, yielding mutated 3'UTRs that maintain >95% sequence similarity to wild-type
3'UTRs. Refer to Supplementary Table S1 of (Jörn M. Schmiedel, 2015) for a list of mutated
microRNA seed sites. Synthesized fragments were PCR amplified to append necessary digestion
sites. MicroRNA binding sites and 3'UTRs were cloned into digested and dephosphorylated
plasmid backbone using T4 ligase (NEB). For a list of target site sequences, endogenous 3'UTR
sequences and their mutated versions refer to (Jörn M. Schmiedel, 2015).
4.3.2 Transient transfections
Murine embryonic stem cells V19 below passage 20 were plated two days before transfection in 2
ml synthetic 2i medium (Ying et al., 2003) (Gibco) on gelatinized 6-well plates, starting at ~10^5
cells. Medium was refreshed after 24 hours. Reporter plasmids were diluted 1:25 in pUC19b
carrier plasmid (NEB) and mixed with Lipofectamine 2000 (Invitrogen). 10 µl reagent with 4 µg
DNA in 300 µl Opti-MEM was added to 2 ml 2i medium per well. 4 hours post transfection, cells
were detached using Accutase (EMD Millipore), split 1:2 and passaged onto gelatinized 60 mm
plates in 3 ml 2i medium containing 3 µg doxycycline. Medium was refreshed 24 hours after
passaging.
4.3.3 Flow cytometry
Cells were assayed on a LSRFortessa analyzer (BD Biosciences) two days after transfection. Cells
were gated according to their forward (FSC-A) and side (SSC-A) scatter profiles. Each set of
experiments contained at least one cell population transfected with the corresponding unregulated
reporter construct and one mock transfected cell population (pUC19b carrier plasmid only), which
was used to characterize background fluorescence.
4.3.4 Transcriptome profiling
Cells were transfected with plasmid containing mCherry-Wee1 wild-type 3'UTR (as described
above). Cells were sorted into four fractions (~100,000 cells each) on a FACSAriaIII cell sorter
(BD Biosciences) according to ZsGreen intensities. RNA from cells in each fraction was extracted
using Trizol LS (Life Technologies). Sequencing libraries were prepared from isolated RNA using the
Illumina TruSeq Stranded mRNA kit. Libraries were sequenced on an Illumina HiSeq 2500
sequencer. Sequencing results were mapped to RefSeq mRNAs (mm10) and mCherry sequences
using Bowtie v2.2.0 (Langmead and Salzberg, 2012). Reads per kilobase gene model per million
mapped reads (RPKM) were calculated for all transcripts, and transcript isoforms were then
aggregated to GeneSymbols. For further analysis we only considered genes expressed above 0.1
RPKM, which corresponds to about one transcript per mouse embryonic stem cell (Dominic Grün,
personal communication).
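The RPKM normalization and isoform aggregation described above can be sketched in Python as follows. This is a minimal illustration, not the pipeline used for this work: the function names, the example read counts, and the choice to aggregate isoforms by summing their RPKM values are assumptions.

```python
from collections import defaultdict

def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)

def aggregate_to_genes(isoforms, total_mapped_reads, min_rpkm=0.1):
    """Sum isoform RPKM per gene symbol (assumed rule) and keep genes above min_rpkm.

    isoforms: iterable of (gene_symbol, read_count, transcript_length_bp).
    """
    per_gene = defaultdict(float)
    for gene, reads, length_bp in isoforms:
        per_gene[gene] += rpkm(reads, length_bp, total_mapped_reads)
    return {gene: value for gene, value in per_gene.items() if value > min_rpkm}

# Hypothetical counts: two isoforms of one gene and one barely detected gene.
example = [("GeneA", 500, 2000), ("GeneA", 100, 1000), ("GeneB", 1, 5000)]
expressed = aggregate_to_genes(example, total_mapped_reads=10_000_000)
print(expressed)
```

In this toy input, GeneB falls below the 0.1 RPKM cutoff and is dropped, mirroring the expression filter stated in the text.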
4.3.5 Taqman microRNA expression measurements
RNA was isolated from mESC V19 cells two days after transient transfection with control
plasmid reporter (no 3'UTR behind mCherry) : pUC19 carrier plasmid mix (as described above)
using the Life Technologies mirVana miRNA Isolation Kit. Expression of microRNAs mmu-miR-16, mmu-miR-20a and mmu-miR-290 was assayed using Life Technologies Taqman microRNA
assays.
4.3.6 Flow cytometry data processing
For uni-regulated constructs (3'UTR only behind mCherry), cells were binned according to
ZsGreen intensities (bin-width 0.2 in log10 space). The lower bin limit was set to the 0.9999-quantile of the background distribution. In each bin, cells below the 0.001-quantile and above the 0.999-quantile were discarded to deal with outliers. 1,000 iterations of 50% bootstrapping were used to
evaluate uncertainty of the data in each bin. From each bootstrap, mean and noise of mCherry
intensities were calculated. Mean of mean and noise values over all 1,000 bootstraps serve as
observables for each particular bin. Standard deviation of mean and noise values over all 1,000
bootstraps serve as uncertainty of the observables for each particular bin.
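The per-bin outlier trimming and bootstrap procedure can be sketched as follows. This is an illustrative Python version under stated assumptions: noise is taken to be the coefficient of variation (standard deviation divided by mean), "50% bootstrapping" is interpreted as drawing half the cells of a bin with replacement, and the function names are hypothetical.

```python
import random
import statistics

def noise(values):
    """Noise as coefficient of variation (assumed definition): std / mean."""
    return statistics.pstdev(values) / statistics.fmean(values)

def trim_outliers(values, lower_q=0.001, upper_q=0.999):
    """Discard cells below the lower and above the upper quantile (rank-based)."""
    ordered = sorted(values)
    lo = int(lower_q * len(ordered))
    hi = int(upper_q * len(ordered))
    return ordered[lo:hi] if hi > lo else ordered

def bootstrap_bin(values, n_iter=1000, fraction=0.5, seed=0):
    """Bootstrap mean and noise of one expression bin; return observables and their SDs."""
    rng = random.Random(seed)
    k = max(2, int(fraction * len(values)))
    means, noises = [], []
    for _ in range(n_iter):
        sample = rng.choices(values, k=k)  # resampling with replacement (assumption)
        means.append(statistics.fmean(sample))
        noises.append(noise(sample))
    return {
        "mean": statistics.fmean(means),
        "mean_sd": statistics.pstdev(means),
        "noise": statistics.fmean(noises),
        "noise_sd": statistics.pstdev(noises),
    }

# Simulated mCherry intensities for the cells of a single bin (hypothetical data).
sim = random.Random(1)
cells = [sim.gauss(100.0, 10.0) for _ in range(2000)]
print(bootstrap_bin(trim_outliers(cells), n_iter=200))
```

The returned standard deviations play the role of the per-bin uncertainties that would enter a subsequent model fit.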
For bi-regulated constructs (identical 3'UTRs behind ZsGreen and mCherry), cells were binned
according to the summed [ZsGreen + mCherry] intensity (bin-width 0.2 in log10 space). The lower
bin limit was set to the 0.9999-quantile of the summed [ZsGreen + mCherry] intensity of the
background distribution. In each bin, ZsGreen intensity was normalized such that ZsGreen and
mCherry intensity distributions had identical means. Mean and bootstrapped standard deviations
for intrinsic noise were calculated in each bin by bootstrapping as described above. Intrinsic noise
was calculated as $\eta_{\mathrm{int}} = \sqrt{\frac{\langle (z-m)^2 \rangle}{2 \langle z \rangle \langle m \rangle}}$ (Elowitz et al., 2002), with z and m as ZsGreen and mCherry
intensities of cells and <x> denoting the mean of a variable over all cells in the bin.
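A minimal Python sketch of this calculation is given below, assuming the standard two-reporter estimator of Elowitz et al. (2002), in which the squared intrinsic noise is the mean squared difference between the two reporters divided by twice the product of their means. The function name is hypothetical, and the rescaling step implements the normalization of ZsGreen to the mCherry mean described above.

```python
import math
import statistics

def intrinsic_noise(z, m):
    """Two-reporter intrinsic noise for paired ZsGreen (z) and mCherry (m) intensities."""
    if len(z) != len(m):
        raise ValueError("z and m must contain one value per cell")
    mean_m = statistics.fmean(m)
    scale = mean_m / statistics.fmean(z)
    z_norm = [value * scale for value in z]  # give both reporters identical means
    sq_diff = [(a - b) ** 2 for a, b in zip(z_norm, m)]
    # After normalization <z> equals <m>, so the denominator is 2 * <m>^2.
    return math.sqrt(statistics.fmean(sq_diff) / (2.0 * mean_m * mean_m))

# Perfectly correlated reporters carry no intrinsic noise.
print(intrinsic_noise([2.0, 4.0, 6.0], [1.0, 2.0, 3.0]))
```

Because fluctuations shared by both reporters cancel in the difference z - m, only gene-specific (intrinsic) variability contributes to the result.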
The aforementioned observables describe mean and noise of the flow cytometry measurements.
Actual biological signal mean and noise were deconvolved from measurement noise as described
in (Jörn M. Schmiedel, 2015).
4.3.7 Model fit to signal mean and noise
For all model fits to single cell data, a MATLAB implementation of the profile likelihood approach
(Raue et al., 2009) was used to determine optimal fits and 95% confidence intervals of parameter
estimates.
Uni-regulated reporter constructs (3'UTR only behind mCherry):
The mass action kinetics model (see Supplementary Equation 21 of (Jörn M. Schmiedel, 2015))
was fitted to the background corrected and binned mean signal intensity data by using ZsGreen
signal intensities as being proportional to the transcription rate, and mean mCherry signal intensity
as being proportional to the mean protein concentration. From the fit, microRNA-mediated
repression R and microRNA saturation S could be estimated.
The noise model (Supplementary Equation 37 of (Jörn M. Schmiedel, 2015)) was fitted to the
corrected mCherry signal noise data of both the regulated reporter and the respective unregulated
control reporter simultaneously. The fit yielded parameter estimates for the scaling factor $\chi$,
which relates mean mCherry signal intensity <mCherry> to molecule numbers, the
microRNA-independent extrinsic noise $\eta_{\mathrm{ext}}$ and the effective microRNA pool noise $\tilde{\eta}_{\mu}$.
Bi-regulated constructs (identical 3'UTRs behind ZsGreen and mCherry):
Intrinsic signal noise was fitted as inversely proportional to the square root of summed [ZsGreen + mCherry]
mean intensities <z+m> as $\eta_{\mathrm{int}} = \gamma / \sqrt{\langle z+m \rangle}$, with $\gamma$ as a scaling factor. Scaling factors for both the
regulated reporters and the respective unregulated control reporters were estimated and their ratio
yielded the intrinsic noise reduction conferred by microRNA regulation.
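The scaling-factor comparison can be illustrated with a short Python sketch. It assumes the Poisson-like form in which intrinsic noise equals a scaling factor divided by the square root of the summed mean intensity, and it substitutes an unweighted closed-form least-squares estimate for the profile likelihood fit used in this work. The function name and the data values are hypothetical.

```python
import math

def fit_scaling_factor(mean_intensities, intrinsic_noise_values):
    """Least-squares estimate of gamma in eta_int = gamma / sqrt(mean intensity)."""
    x = [1.0 / math.sqrt(value) for value in mean_intensities]
    numerator = sum(xi * yi for xi, yi in zip(x, intrinsic_noise_values))
    denominator = sum(xi * xi for xi in x)
    return numerator / denominator

# Hypothetical binned data for an unregulated and a regulated bi-reporter construct.
means = [100.0, 400.0, 1600.0]
noise_unreg = [10.0 / math.sqrt(v) for v in means]
noise_reg = [5.0 / math.sqrt(v) for v in means]

gamma_unreg = fit_scaling_factor(means, noise_unreg)
gamma_reg = fit_scaling_factor(means, noise_reg)
print(gamma_unreg / gamma_reg)  # intrinsic noise reduction conferred by regulation
```

Under the model prediction, this ratio would be compared with the square root of the independently measured fold-repression.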
4.3.8 Mixed microRNA pool noise for correlated individual microRNA pools
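The hypothetical microRNA pool noise of fully correlated individual microRNA pools (cf. Figure
4.3e) was calculated as
$\tilde{\eta}_{x+y} = \frac{\sigma_{x+y}}{\langle x+y \rangle} = \frac{\sqrt{\sigma_x^2 + \sigma_y^2 + 2\rho\,\sigma_x \sigma_y}}{\langle x \rangle + \langle y \rangle}$.
Here, the correlation coefficient $\rho$ was set to 1. The standard deviation $\sigma$ was calculated as $\sigma_i = \tilde{\eta}_i \cdot \langle i \rangle$, the product of noise in the individual microRNA pool $\tilde{\eta}_i$ (known from mCherry
reporters only regulated by the specific microRNA) times the relative microRNA pool size <i>,
which we measured using Taqman microRNA assays.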
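The expectation for a mixed pool can be computed with the short Python sketch below. It assumes the standard expression for the variance of a sum of two correlated variables; the function name and the example noise values and pool sizes are hypothetical.

```python
import math

def mixed_pool_noise(noise_x, size_x, noise_y, size_y, rho=1.0):
    """Noise of a pool made of two sub-pools with correlation coefficient rho."""
    sigma_x = noise_x * size_x  # standard deviation = pool noise * relative pool size
    sigma_y = noise_y * size_y
    variance = sigma_x ** 2 + sigma_y ** 2 + 2.0 * rho * sigma_x * sigma_y
    return math.sqrt(variance) / (size_x + size_y)

# Two equally sized pools with noise 0.4 and 0.2 (illustrative values only).
print(mixed_pool_noise(0.4, 1.0, 0.2, 1.0, rho=1.0))  # fully correlated sub-pools
print(mixed_pool_noise(0.4, 1.0, 0.2, 1.0, rho=0.0))  # independent sub-pools
```

With rho set to 1 the mixed pool noise is the size-weighted average of the individual noise values, while for independent pools it is lower, which is the averaging effect tested with the mixed-site reporters.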
4.3.9 Mapping flow cytometry experiments to transcriptome expression
To calculate the conversion factor from mCherry fluorescent intensities to RPKM values a least-squares fit of respective values over the four bins was performed. Comparability of transcriptome
expression from different bins is given by high similarity (R² > 0.96, Figure S10C of (Jörn M.
Schmiedel, 2015)). Relative effects of microRNA regulation on total noise as a function of RPKM
values were calculated based on the parameters obtained from model fits to noise data from
endogenous 3'UTRs (n=3) and the mCherry fluorescent intensity to RPKM conversion factor.
4.3.10 Dicer knock-out mESC transcriptome expression data
Microarray expression data and Ago2 CLIP-seq data from wild-type and Dicer knockout mouse
embryonic stem cells were obtained from Gene Expression Omnibus GSE25310 and data was
processed as described in (Leung, 2011). MicroRNA-mediated repression was calculated as the
fold-change in mean expression between wild-type and Dicer knock-out samples. Loess regression
was performed to obtain an error model relating standard deviations of expression for each gene
as a function of mean expression over three replicates for both wild-type and knock-out samples.
Significance of fold-changes was assessed at alpha<0.05 (Bonferroni corrected) by calculating z-scores as
$z = \frac{\langle KO \rangle - \langle WT \rangle}{\sqrt{\sigma_{KO}^2 + \sigma_{WT}^2}}$. Genes below microarray intensity of -4.2 (40% of genes) were
discarded as background. Genes were labeled as 'AGO-bound' if at least one read cluster in their
3'UTR was found in CLIP-seq data. Genes were labeled as predicted microRNA targets if they
contain at least one predicted conserved microRNA binding site (Targetscan6.2 (Garcia et al.,
2011)) for a microRNA seed family expressed above 0.1% of total microRNA expression in mESC
(Marson, 2008).
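The significance test on fold-changes can be sketched in Python as follows. This is an illustration under assumptions: the z-score is taken as the difference of mean knockout and wild-type expression divided by the root of the summed squared standard deviations, the p-value is two-sided under a standard normal distribution, and the function names and input values are hypothetical.

```python
import math

def z_score(mean_ko, mean_wt, sd_ko, sd_wt):
    """z-score of the expression difference between Dicer knockout and wild type."""
    return (mean_ko - mean_wt) / math.sqrt(sd_ko ** 2 + sd_wt ** 2)

def two_sided_p(z):
    """Two-sided p-value of a z-score under a standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def is_significant(z, n_tests, alpha=0.05):
    """Bonferroni correction: compare the p-value against alpha / number of tests."""
    return two_sided_p(z) < alpha / n_tests

z = z_score(mean_ko=9.0, mean_wt=8.0, sd_ko=0.3, sd_wt=0.4)
print(z, two_sided_p(z), is_significant(z, n_tests=10))
```

The standard deviations entering the z-score would come from the loess error model described above rather than from the three replicates directly.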
4.4 Acknowledgments
This chapter is in collaboration with Jörn M. Schmiedel, Sandy L. Klemm, and Apratim Sahay
under the instruction of Nils Blüthgen, Debora S. Marks, and Alexander van Oudenaarden. The
original paper 'MicroRNA control of protein expression noise' is accepted in Science.
We thank Margaret Ebert, Shankar Mukherji, Dylan Mooijman, Lennart Kester, Dominic Grün
and Mauro Muraro for discussions and help, the Boyer lab for mESC line V19, the Cuppen lab for
sequencing and Stefan van der Elst for help with FACS sorting. Support by NWO (VICI award,
AvO), ERC (ERC-AdG 294325-GeneNoiseControl, AvO), DFG (GK1772, JMS), EMBO (STF,
JMS), DFG (SPP 1395, NB), BMBF (FORSYS & BCCN, NB), HMS institutional support (DSM).
4.5 References
Baek, D. (2008). The impact of microRNAs on protein output. Nature 455, 64-71.
Bar-Even, A., Paulsson, J., Maheshri, N., Carmi, M., O'Shea, E., Pilpel, Y., and Barkai, N. (2006).
Noise in protein expression scales with natural protein abundance. Nature Genetics 38, 636-643.
Barreau, C., Paillard, L., and Osborne, H.B. (2005). AU-rich elements and associated factors: are
there unifying principles? Nucleic Acids Research 33, 7138-7150.
Bartel, D.P., and Chen, C.-Z. (2004). Micromanagers of gene expression: the potentially
widespread influence of metazoan microRNAs. Nat Rev Genet 5, 396-400.
Blake, W.J., Kaern, M., Cantor, C.R., and Collins, J.J. (2003). Noise in eukaryotic gene expression.
Nature 422, 633-637.
Brennecke, J., Hipfner, D.R., Stark, A., Russell, R.B., and Cohen, S.M. (2003). bantam Encodes a
Developmentally Regulated microRNA that Controls Cell Proliferation and Regulates the
Proapoptotic Gene hid in Drosophila. Cell 113, 25-36.
Ebert, M.S., and Sharp, P.A. (2012). Roles for microRNAs in conferring robustness to biological
processes. Cell 149, 515-524.
Elowitz, M.B., Levine, A.J., Siggia, E.D., and Swain, P.S. (2002). Stochastic Gene Expression in
a Single Cell. Science 297, 1183-1186.
Enright, A., John, B., Gaul, U., Tuschl, T., Sander, C., and Marks, D. (2003). MicroRNA targets
in Drosophila. Genome Biology 5, R1.
Farh, K.K. (2005). The widespread impact of mammalian microRNAs on mRNA repression and
evolution. Science 310, 1817-1821.
Garcia, D.M., Baek, D., Shin, C., Bell, G.W., Grimson, A., and Bartel, D.P. (2011). Weak seed-pairing stability and high target-site abundance decrease the proficiency of lsy-6 and other
microRNAs. Nat Struct Mol Biol 18, 1139-1146.
Guo, H., Ingolia, N.T., Weissman, J.S., and Bartel, D.P. (2010). Mammalian microRNAs
predominantly act to decrease target mRNA levels. Nature 466, 835-840.
John, B., Enright, A.J., Aravin, A., Tuschl, T., Sander, C., and Marks, D.S. (2004). Human
MicroRNA Targets. PLoS Biol 2, e363.
Johnston, R.J., and Hobert, O. (2003). A microRNA controlling left/right neuronal asymmetry in
Caenorhabditis elegans. Nature 426, 845-849.
Jörn M. Schmiedel, S.L.K., Yannan Zheng, Apratim Sahay, Nils Blüthgen, Debora S. Marks,
Alexander van Oudenaarden (2015). miRNA control of protein expression noise. Science.
Krek, A., Grün, D., Poy, M.N., Wolf, R., Rosenberg, L., Epstein, E.J., MacMenamin, P., da
Piedade, I., Gunsalus, K.C., Stoffel, M., et al. (2005). Combinatorial microRNA target predictions.
Nature Genetics 37, 495-500.
Langmead, B., and Salzberg, S.L. (2012). Fast gapped-read alignment with Bowtie 2. Nat Meth 9,
357-359.
Lee, R.C., Feinbaum, R.L., and Ambros, V. (1993). The C. elegans heterochronic gene lin-4
encodes small RNAs with antisense complementarity to lin-14. Cell 75, 843-854.
Leung, A.K. (2011). Genome-wide identification of Ago2 binding sites from mouse embryonic
stem cells with and without mature microRNAs. Nature Struct Mol Biol 18, 237-244.
Lewis, B.P., Burge, C.B., and Bartel, D.P. (2005). Conserved seed pairing, often flanked by
adenosines, indicates that thousands of human genes are microRNA targets. Cell 120, 15-20.
Lim, L.P. (2005). Microarray analysis shows that some microRNAs downregulate large numbers
of target mRNAs. Nature 433, 769-773.
Marson, A. (2008). Connecting microRNA genes to the core transcriptional regulatory circuitry of
embryonic stem cells. Cell 134, 521-533.
Miska, E.A., Alvarez-Saavedra, E., Abbott, A.L., Lau, N.C., Hellman, A.B., McGonagle, S.M.,
Bartel, D.P., Ambros, V.R., and Horvitz, H.R. (2007). Most Caenorhabditis elegans microRNAs
are individually not essential for development or viability. PLoS Genet 3, e215.
Mukherji, S., Ebert, M.S., Zheng, G.X.Y., Tsang, J.S., Sharp, P.A., and van Oudenaarden, A.
(2011). MicroRNAs can generate thresholds in target gene expression. Nat Genet 43, 854-859.
Noorbakhsh, J., Lang, A.H., and Mehta, P. (2013). Intrinsic Noise of microRNA-Regulated Genes
and the ceRNA Hypothesis. PLoS ONE 8, e72676.
Ozbudak, E.M., Thattai, M., Kurtser, I., Grossman, A.D., and van Oudenaarden, A. (2002).
Regulation of noise in the expression of a single gene. Nature Genetics 31, 69-73.
Pedraza, J.M., and van Oudenaarden, A. (2005). Noise Propagation in Gene Networks. Science
307, 1965-1969.
Raj, A., Peskin, C.S., Tranchina, D., Vargas, D.Y., and Tyagi, S. (2006). Stochastic mRNA
Synthesis in Mammalian Cells. PLoS Biol 4, e309.
Raue, A., Kreutz, C., Maiwald, T., Bachmann, J., Schilling, M., Klingmüller, U., and Timmer, J.
(2009). Structural and practical identifiability analysis of partially observed dynamical models by
exploiting the profile likelihood. Bioinformatics 25, 1923-1929.
Selbach, M. (2008). Widespread changes in protein synthesis induced by microRNAs. Nature 455,
58-63.
Sood, P., Krek, A., Zavolan, M., Macino, G., and Rajewsky, N. (2006). Cell-type-specific
signatures of microRNAs on target mRNA expression. Proceedings of the National Academy of
Sciences of the United States of America 103, 2746-2751.
Swain, P.S., Elowitz, M.B., and Siggia, E.D. (2002). Intrinsic and extrinsic contributions to
stochasticity in gene expression. Proceedings of the National Academy of Sciences 99, 12795-12800.
Wightman, B., Ha, I., and Ruvkun, G. (1993). Posttranscriptional regulation of the heterochronic
gene lin-14 by lin-4 mediates temporal pattern formation in C. elegans. Cell 75, 855-862.
Ying, Q.-L., Stavridis, M., Griffiths, D., Li, M., and Smith, A. (2003). Conversion of embryonic
stem cells into neuroectodermal precursors in adherent monoculture. Nat Biotech 21, 183-186.
Chapter 5 Application of UTR decoy system to study microRNA-mediated-crosstalk
5.1 Abstract
MicroRNA regulation strength is observed to decrease at high target expression for nearly all of
our UTR reporters, which suggests that regulating miRNAs or co-factors are subject to titration in this
region. Titration of miRNAs is a necessary condition for miRNA-mediated-crosstalk of co-regulated targets. To study whether over-expression of UTR decoys is sufficient to induce crosstalk
generally, we applied the Lats2 3'UTR reporter system as miRNA sponges, sorted ESCs according to
decoy expression levels, and measured the genome-wide gene expression response by RNA-Seq. No
statistically significant derepression was found for targets of miR-290 or of other miRNAs
targeting the Lats2 3'UTR. We estimated that decoy expression and added miRNA targeting sites were
comparable to or much higher than endogenous miRNA expression. However, the total expression
of endogenous miRNA targets, or the endogenous target site abundance (TA), was between 1.5-
and 4-fold as high as the added MREs. The lack of miRNA-mediated-crosstalk in our system
supports a model in which the changes in ceRNAs must begin to approach the TA of a miRNA
before they can exert a consequential effect on the repression of targets for that miRNA (Denzler
et al., 2014).
5.2 Introduction
In recent years, endogenous RNAs have been found to communicate via a language of microRNA response
elements (MREs). By harboring the same set of MREs, RNAs could compete with each other
for binding to the shared pool of regulating miRNAs. Therefore those competing endogenous
RNAs (ceRNAs) could titrate miRNA availability and co-regulate the expression of each other
(Salmena et al., 2011b).
Diverse RNA species have been added into this ceRNA regulatory network, including protein-coding mRNAs and non-coding RNAs (ncRNAs) such as pseudogenes, long non-coding RNAs
(lncRNAs), and circular RNAs (circRNAs). By combining computational prediction with
experimental validation, Tay et al. (Tay, 2011) discovered that protein-coding transcripts, such as
VAPA and CNOT6L, could function as PTEN ceRNAs. Their expression is significantly
correlated with PTEN in vivo, and those ceRNAs modulate PTEN expression in a miRNA-dependent manner. PTENP1 is a pseudogene of PTEN (Poliseno, 2010). The proximal region of
its 3'UTR shares 95% identity with PTEN and contains conserved binding sites for five of the co-regulating miRNAs.
Retroviral overexpression of the PTENP1 3'UTR upregulates PTEN in a
Dicer-dependent manner, and siRNA knockdown of endogenous PTENP1 in prostate cancer cells
results in a decrease in PTEN levels. A similar correlation of expression is found
between KRAS and its pseudogene KRAS1P. T cells transformed with the primate virus Herpesvirus
saimiri (HVS) express a viral ncRNA called H. saimiri U-rich RNA 1 (HSUR-1). It contains miR-27 sites and in turn accelerates the turnover of miR-27 (Libri et al., 2012). Moreover, the
expression of HSUR-1 is correlated with the miR-27 target gene FOXO1, suggesting its ability to
control host gene expression and its function as a ceRNA. Another ncRNA example is linc-RoR.
Being highly expressed in human ESCs, it is able to titrate miR-145 from
OCT4, SOX2 and NANOG transcripts and is essential for ESC pluripotency and self-renewal
(Wang et al., 2013). circRNA ciRS-7 has been recently identified as a natural sponge for miR-7.
It contains more than 70 conserved binding sites for miR-7. And it is highly expressed with
complete resistance to miRNA-mediated target degradation. Thus circRNAs with tandem MREs
may be exceptionally potent miRNA-mediated-crosstalk modulators (Hansen et al., 2013).
Actually before the discovery of ceRNAs, artificial miRNA sponges have been shown as effective
miRNA inhibitors (Ebert et al., 2007). These sponges are usually expressed from strong promoters,
contain multiple binding sites for a miRNA of interest and have been shown to derepress miRNA
targets at least as effectively as chemically modified antisense oligonucleotides. Intriguingly,
imperfectly complementary 'bulged sponges' sequester miRNAs more effectively than perfectly
complementary miRNA sponges, and endonucleolytic cleavage of perfect targets might reduce their
ability to hold miRNAs for a longer time.
Efforts have been made to identify ceRNAs networks at transcriptome-wide scale. Several
miRNA-target prediction algorithms, including TargetScan, miRanda, RNA22 and PITA, have been
used in Cupid to identify ceRNA interactions in breast cancer cell lines (Chiu et al., 2015).
Sumazin et al. (Sumazin, 2011) investigated the mRNA and miRNA network in glioblastoma cells
using data from the Cancer Genome Atlas and a new multivariate analysis method called
Hermes. They identified a post-transcriptional regulation layer of surprising magnitude,
comprising over 248,000 pairwise miRNA-mediated interactions and 7,000 RNAs that can
function as miRNA sponges. High-throughput biochemical techniques such as crosslinking
immunoprecipitation followed by high-throughput sequencing (CLIP-Seq) have been
integrated in starBase, and genome-wide interaction maps of endogenous miRNA-targets
provided further insight into the ceRNA network (Li et al., 2013).
Although multiple examples of ceRNA interactions have been described, little is known about the
molecular conditions necessary for optimal ceRNA activity. The abundance of ceRNAs and
miRNAs as well as their stoichiometry are obviously important. Other factors like the miRNA
catalytic activity of HSUR-1 (Cazalla et al., 2010), and exceptional stability of circRNAs
(Memczak et al., 2013) could further increase the potency of sponging decoys. The effectiveness
of a ceRNA would also depend on the accessibility, affinity, and subcellular localization of RNAs.
Titration with other RNA binding proteins (RBPs) and indirect interactions may also be
intertwined with ceRNAs network. Changes in the ceRNA expression levels have to be large
enough to relieve the miRNA repression, whether or not this could happen under physiological
conditions is another issue. PTENP1 RNA is expressed at a much lower level than the PTEN
mRNA (~I%) in DU145 cells, and it is unlikely to significantly perturb PTEN and other ceRNAs
in this context. Yet it is conceivable that the effectiveness of the crosstalk depends on the
sensitivity of the regulated genes to subtle changes in expression level. PTEN is a haploinsufficient
tumor suppressor, and even 20% decrease in expression can promote cancer growth (Alimonti et
al., 2010).
5.3 Results
In order to apply the UTR decoy system to study miRNA-mediated-crosstalk, we transfected wild type
mESCs with either functional decoys (Lats2 OriUTR) or controls with mutated miRNA response
elements (MREs) (Lats2 MutUTR). Transfected cells were sorted into five bins according to
mCherry intensities (Figure 5.1). Cells were split in half for downstream applications. RNA
sequencing (RNA-Seq) was performed to measure genome-wide miRNA-mediated-crosstalk at the
transcript level, and targeted mass spectrometry (MS) was conducted in parallel on a list of preselected proteins to illustrate the crosstalk at the protein level. Due to the limited capacity of targeted MS,
only a dozen computationally predicted top targets of the miR-290 family, including Lats2 itself, were
selected; those proteins were the most likely to be affected by Lats2 UTR overexpression if
crosstalk occurred.
[Figure 5.1 panels: Lats2 MutUTR and Lats2 OriUTR transfections; x-axis: mCherry; sorting gates B0 to B4]
Figure 5.1 Flow cytometry data and gate positions illustration.
pCAG-d2eGFP-Lats2 Ori/MutUTR was co-transfected with pCAG-mCherry into WT ESCs, and
cells were sorted according to mCherry intensities. Positively transfected cells were sorted into 4
bins (B1, B2, B3, and B4), and 1 bin (B0) was set as background control. miRNA repression on
GFP protein production is evident from the plot.
Sorting data recapitulated what we have observed on FACS analyzers. MicroRNA repression on
GFP protein production increases initially and decreases at high target expression (Table 5.1). The
5 bins of sorted cells cover an mCherry expression range of ~3 x 10^3 fold, with the means of B4
and B1 differing by more than 150 fold (data not shown).
[Figure 5.4 panels: Lats2, Casp2, Nr2c2, Hif1an, E2f2, Tet1, Tgfbr2, Cdkn1a; x-axis in each panel: O1, M1, O2, M2, O3, M3, O4, M4 (Ori and Mut samples for bins 1 to 4)]
Figure 5.4 Expression of top miR-290 targets in sorted samples.
Lats2 and Tgfbr2 exhibit higher expression in OriUTR transfected samples compared to MutUTR
transfected ones, but a similar number of counterexamples were also observed (Casp2, Tet1, and
Nr2c2). Moreover, we do not observe a consistent increase in gene expression with higher decoy
expression. Thus miRNA-mediated-crosstalk is weak even for top miR-290 targets.
B4
B1
mChery si nsle 1or T
ftpressgn
50
87
48
51
65
7J
3.8
TI68
/
Table 5.1 GFP protein intensities and repression fold at protein level, data quantified from
cell sorting.
GFP protein expression averages were extracted from FACS data, and the repression fold was
quantified as RF = (GFPunreg - GFPbg)/(GFPreg - GFPbg).
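The repression fold in the caption above can be written as a small function. This is only a minimal sketch: the function name and the example intensities are hypothetical and are not values from Table 5.1.

```python
def repression_fold(gfp_unreg, gfp_reg, gfp_bg):
    """RF = (GFPunreg - GFPbg) / (GFPreg - GFPbg), with background subtraction."""
    denom = gfp_reg - gfp_bg
    if denom <= 0:
        # The regulated signal must exceed background for RF to be meaningful.
        raise ValueError("regulated signal must exceed background")
    return (gfp_unreg - gfp_bg) / denom


# Hypothetical mean intensities (arbitrary units), for illustration only.
rf_example = repression_fold(gfp_unreg=500.0, gfp_reg=150.0, gfp_bg=50.0)
print(rf_example)
```

Subtracting the background from both numerator and denominator matters most in dim bins, where autofluorescence is a sizeable part of the measured GFP signal.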
The targeted MS measurement was not successful, so from here on we restrict the discussion
to the RNA-Seq data. RNA-Seq has very low, if any, background signal (Wang et al., 2009).
We therefore define the fold change of a gene's expression as FC = (expression in OriUTR
transfection) / (expression in MutUTR transfection), and no background subtraction is needed. If
overexpression of decoys does lead to miRNA-mediated-crosstalk, we expect to see titration of the
regulating miRNAs by OriUTR but not by MutUTR. This would result in derepression of co-regulated
genes in OriUTR transfected bins compared to MutUTR ones. Consequently, the FC of miRNA targets
defined above would be larger than 1 on average, and the log fold changes (LFCs) would be larger
than 0 on average.
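The fold change definition above can be sketched in a few lines of Python. The gene names and RPKM values are hypothetical, and requiring a gene to pass the 1 RPKM cut-off in both samples is an assumption made for this illustration.

```python
import math


def log2_fold_changes(ori_rpkm, mut_rpkm, min_rpkm=1.0):
    """Return {gene: log2(Ori / Mut)} for genes at or above min_rpkm in both samples."""
    lfcs = {}
    for gene, ori in ori_rpkm.items():
        mut = mut_rpkm.get(gene)
        if mut is None or ori < min_rpkm or mut < min_rpkm:
            continue  # skip genes that are missing or too lowly expressed
        lfcs[gene] = math.log2(ori / mut)
    return lfcs


# Hypothetical RPKM values for one sorted bin (OriUTR vs MutUTR transfection).
ori = {"geneA": 20.0, "geneB": 8.0, "geneC": 0.5}
mut = {"geneA": 10.0, "geneB": 8.0, "geneC": 0.4}

lfc = log2_fold_changes(ori, mut)
mean_lfc = sum(lfc.values()) / len(lfc)
print(lfc, mean_lfc)
```

A mean LFC above 0 for a target set, relative to the background bin, is what derepression by decoy titration would look like in this representation.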
Distributions of fold changes were plotted on a log scale, and the ith bin was compared to the 0th bin
control with a two-sample Kolmogorov-Smirnov test (KS test). As expected, the expression
distribution of all genes was unaffected by decoy overexpression, and the LFCs were always
centered on 0 irrespective of decoy levels (Supplementary Figure 5.3, bin1 to bin4). However,
we did not observe significant shifts in LFCs for miR-290 targets either (Figure 5.2), and the
differences between the cumulative distribution functions (CDFs) of miR-290 target LFCs across bins
are no more significant than those of all genes (Figure 5.3). We concluded that no genome-wide
miRNA-mediated-crosstalk for miR-290 targets was detected under our experimental conditions.
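As an illustration of the bin-versus-bin0 comparison, the sketch below implements the two-sample KS statistic in plain Python with a standard asymptotic approximation for the p-value. It is not the code used in this work; in practice a library routine such as SciPy's `ks_2samp` would normally be used. The input lists are hypothetical LFC values.

```python
import math


def ks_two_sample(x, y):
    """Return (D, p) where D is the maximum distance between the two empirical CDFs.

    The p-value uses the asymptotic Kolmogorov series with a common
    small-sample correction; it is an approximation, not an exact value.
    """
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        v = min(xs[i], ys[j])
        while i < n and xs[i] <= v:
            i += 1
        while j < m and ys[j] <= v:
            j += 1
        d = max(d, abs(i / n - j / m))
    n_eff = n * m / (n + m)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.3:
        return d, 1.0  # the series is unreliable here and the true value is ~1
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return d, min(1.0, max(0.0, p))


# Hypothetical LFCs: a background bin, and the same values shifted to the right.
background = [i / 100 - 1.0 for i in range(200)]
shifted = [v + 0.5 for v in background]

d_same, p_same = ks_two_sample(background, background)
d_shift, p_shift = ks_two_sample(background, shifted)
print(d_same, p_same, d_shift, p_shift)
```

Because the KS test responds to any difference in distribution shape, a narrower LFC spread in one bin can produce a small p-value even when the mean does not shift.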
It has been hypothesized that miRNA-RNA competition would only apply to a small subset of
moderate or low abundance miRNAs, as the overexpression of decoys would have little impact on
regulation by highly abundant miRNAs (Wee et al., 2012). The miR-290 family is the most abundant
miRNA family expressed in ESCs, and it accounts for 70% of total ESC miRNA expression
(Marson, 2008). It is estimated that ES cells have on average ~7,600 copies of miR-290 family
miRNAs per cell based on TaqMan assays (Chen, 2007). To estimate the number of decoys,
RNA-Seq RPKM reads were regressed against smFISH counts for 8 available genes, which cover
a wide expression range (~100 fold). According to this regression, 1 RPKM roughly corresponds to
1 transcript per cell (data not shown). This means that about 900 and 8000 decoy transcripts were
expressed per cell in bin3 and bin4 originally, and even after miRNA-mediated degradation, more
than 200 and 2000 decoy mRNAs remain per cell respectively (Table 5.2, gfp reads in MutUTR and
OriUTR transfection). Given that each Lats2 OriUTR contains two binding sites for miR-290
miRNAs (TargetScan 6.2), the number of overexpressed miR-290 MREs is comparable to the total
number of miR-290 miRNAs. On the other hand, decoys have to compete against all ceRNAs for
miR-290 miRNA binding. mESCs are a hard-to-transfect cell line. Even in the highest
expressing bin (bin4), decoy transcripts only account for ~1% and ~2.5% of the total transcriptome
expression for OriUTR and MutUTR respectively (Supplementary Figure 5.6). This is in drastic
contrast to transfection of human colon cancer cells HCT116, in which decoy expression could
constitute more than half of the transcriptome (Apratim Sahay, personal communication). The total
expression of all miR-290 targets is about 10,500 transcripts per cell. The average miR-290 MRE
number for those targets is ~1.3 per transcript, and the median is 1. Thus decoy overexpression,
which adds another ~30% of miR-290 MREs to the total transcriptome in bin4, may still not be
able to perturb endogenous miR-290 availability in a significant way.
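The stoichiometry argument above can be checked with the numbers quoted in the text. The sketch assumes that the ~30% figure comes from the lower bound of 2,000 remaining decoy transcripts in bin4; that reading, and the variable names, are assumptions made for this illustration.

```python
# Values quoted in the text above.
endogenous_target_transcripts = 10_500  # miR-290 target transcripts per cell
mean_mres_per_target = 1.3              # average miR-290 MREs per target transcript
decoy_transcripts_bin4 = 2_000          # assumed lower bound ("more than 2000" remain)
mres_per_decoy = 2                      # miR-290 sites per Lats2 OriUTR
mir290_copies_per_cell = 7_600          # miR-290 family miRNAs per cell

endogenous_mres = endogenous_target_transcripts * mean_mres_per_target
decoy_mres = decoy_transcripts_bin4 * mres_per_decoy

added_fraction = decoy_mres / endogenous_mres            # extra MREs relative to endogenous
decoy_to_mirna_ratio = decoy_mres / mir290_copies_per_cell

print(f"endogenous MREs per cell: {endogenous_mres:.0f}")
print(f"decoy MREs per cell:      {decoy_mres}")
print(f"added MRE fraction:       {added_fraction:.0%}")
print(f"decoy MREs per miRNA:     {decoy_to_mirna_ratio:.2f}")
```

Under these assumptions the decoys add roughly 0.3 times the endogenous miR-290 site pool and supply about half as many sites as there are miR-290 copies, which is consistent with the weak perturbation described above.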
Next we examined several other miRNAs targeting the Lats2 3'UTR. Those include miR-31, miR-135, miR-103/107, and miR-15/16, which were expressed at ~0.01%, ~0.1%, ~1%, and ~5% of
miR-290 cluster expression (extracted from Solexa-Seq data from (Marson, 2008)). The
summed expression of all endogenous targets of those miRNAs was on the same order as that of
miR-290 targets. None of the miRNA target sets explored exhibited significant miRNA-mediated-crosstalk on a genome-wide scale (Supplementary Figure 5.4).
Factors such as the number of shared MREs, miRNA binding affinity and overlap of regulating
miRNAs positively affect the strength of miRNA-mediated-crosstalk (Tay et al., 2014). We therefore
limited ourselves to the top targets of miR-290 family miRNAs, which usually contain more than
two miR-290 MREs per transcript with strong miRNA binding affinity, and monitored their
expression throughout the sorted samples (Figure 5.4). Only a few genes (Lats2 and Tgfbr2) exhibit
higher expression in OriUTR transfected samples compared to MutUTR transfected ones, but a
similar number of counterexamples were also observed (Casp2, Tet1, and Nr2c2). Moreover, even
for endogenous Lats2, which is a perfect competing RNA for Lats2 decoy UTRs, we do not
observe a consistent increase in gene expression with increasing decoy expression. For
example, expression in bin3 and bin4 is smaller than that in bin1 and bin2. Thus we suspect the
apparent variations of gene expression throughout samples are largely expression noise for lowly
expressed genes, as the expression of more abundant genes (Cdkn1a) and housekeeping genes
(Supplementary Figure 5.2) is very uniform throughout all measured samples.
It is important to point out that the derepression of miRNA regulation we observed for UTR
reporters happened at the translational rather than the transcriptional level. Therefore, even though we did
not observe miRNA-mediated-crosstalk at the transcriptome level, crosstalk might exist at the proteome
level. At the transcriptome level, even though the number of overexpressed MREs was comparable to
or much larger than endogenous miRNA expression, the high abundance of endogenous
miRNA targets made it difficult to observe crosstalk between ceRNAs. Our study is consistent with
the high target site abundance (TA) model proposed by Denzler et al. (2014), which states that the
changes in ceRNAs must begin to approach the TA of a miRNA before they can exert a
consequential effect on the repression of targets for that miRNA. Interestingly, siRNA knockdown
of endogenous PTEN, which was expressed at fewer than 40 transcripts per cell, was sufficient to
induce significant ceRNA crosstalk (Apratim Sahay, personal communication). It might be
intriguing to see if subcellular localization and downstream signaling processes could amplify
transcript perturbation.
[Figure 5.2 panels: Bin 1 (pval = 0.27), Bin 2 (pval = 3.3e-06), Bin 3 (pval = 0.59), Bin 4 (pval = 0.53); x-axis in each panel: log2 FC]
Figure 5.2 Distributions of LFCs for miR-290 family targets are not affected by Lats2 UTR
decoy overexpression.
The log fold change distribution of bin X (red stairs) was overlaid on top of the bin0 (grey bars)
background control for miR-290 targets, and the means of LFCs were plotted as a red line and a grey
dashed line respectively. A two-sample KS test was performed between the two bins, and the
differences were not significant. The small p-value for bin2 is due to the small variation of the LFCs
in bin2 (Supplementary Figure 5.5).
[Figure 5.3 panels: miR-290 targets and all genes; curves: bin 0, bin 1, bin 2, bin 3, bin 4; x-axis: log2(fold changes); y-axis: cumulative fraction from 0 to 1]
Figure 5.3 Cumulative distribution function (CDF) of log2 (fold changes) for all genes and
miR-290 family targets.
CDFs of LFCs from bin0 to bin4 are shown for all genes (left) and miR-290 targets (right). No
consistent shift of LFCs to the right was observed for miR-290 targets, and the differences
throughout bins for miR-290 targets were no larger than those of the all-genes control. The biggest change
comes from bin2 (magenta line), and this is caused by the smaller variation of LFCs for all the sequenced
genes in bin2 (Supplementary Figure 5.5).
5.4 Methods
5.4.1 FACS cell sorting
Wild type mESCs were co-transfected with plasmid pCAG-d2eGFP-Lats2Ori/MutUTR and
pCAG-mCherry. Two days after transfection, cells were harvested and sorted into five fractions
(~100,000 cells each) on a FACSAriaIII cell sorter (BD Biosciences) according to mCherry
intensities.
5.4.2 RNA sequencing
RNA from cells in each fraction was extracted using Trizol LS (Life Technologies). Sequencing
libraries were prepared from the isolated RNA using the Illumina TruSeq Stranded mRNA kit. Libraries
were sequenced on an Illumina HiSeq 2500 sequencer. Sequencing results were mapped using
Bowtie v2.2.0 to an RNA sequence library consisting of RefSeq mRNAs (mm10) except Lats2, plus
gfp, Lats2 CDS, Lats2 OriUTR, Lats2 MutUTR, and mCherry sequences. Reads per kilobase per
million mapped reads (RPKM) were calculated for all transcripts, and transcript isoforms were then
aggregated to gene symbols. Both RNA-Seq and smFISH data are available for the following
genes: Pou5f1, Sox2, Nanog, Lin28a, Lats2, Casp2, Rbl2, and Cdkn1a (data not shown). Based on
their regression, 1 RPKM roughly corresponds to 1 transcript per cell. For further analysis we only
considered genes expressed above 1 RPKM, because lowly expressed genes were susceptible to
small-number deviations and highly variable LFCs.
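The RPKM normalization and expression filter described above can be sketched as follows. The read counts, transcript lengths and gene names are hypothetical, and summing isoform RPKMs to obtain a gene-level value is an assumption; the text does not state how isoforms were aggregated.

```python
from collections import defaultdict


def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


def aggregate_to_genes(isoform_rpkm, isoform_to_gene):
    """Sum isoform-level RPKM values per gene symbol (assumed aggregation rule)."""
    genes = defaultdict(float)
    for isoform, value in isoform_rpkm.items():
        genes[isoform_to_gene[isoform]] += value
    return dict(genes)


def expressed(gene_rpkm, threshold=1.0):
    """Keep genes expressed above the threshold (1 RPKM in the text)."""
    return {g: v for g, v in gene_rpkm.items() if v > threshold}


# Hypothetical example: 1e7 mapped reads, three isoforms of two genes.
total_reads = 1e7
isoforms = {
    "isoA.1": rpkm(500, 2000, total_reads),
    "isoA.2": rpkm(100, 1000, total_reads),
    "isoB.1": rpkm(5, 1000, total_reads),
}
gene_of = {"isoA.1": "GeneA", "isoA.2": "GeneA", "isoB.1": "GeneB"}

gene_rpkm = aggregate_to_genes(isoforms, gene_of)
kept = expressed(gene_rpkm)
print(gene_rpkm, kept)
```

With the calibration quoted above (1 RPKM roughly equal to 1 transcript per cell), the 1 RPKM cut-off removes genes present at less than about one copy per cell.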
5.4.3 MicroRNA targets selection
Targets of a given miRNA family were predicted using TargetScanMouse version 6.2
(http://www.targetscan.org/mmu_61/). Target sets given by three different criteria (conserved targets only,
top 1000 targets, and top 500 targets) were extracted for each miRNA family. The top miRNA
targets were ranked by total context score (i.e. site efficacy), irrespective of site conservation.
Targets were also filtered by a gene expression threshold of 1 RPKM. The conclusions we drew are
independent of which target set we chose, and all figures are shown for the top 1000 targets.
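The selection of top targets can be sketched as below. The sketch assumes that a more negative total context score means stronger predicted repression, and that the expression filter is applied after taking the top N; both are assumptions for illustration. The gene names and scores are hypothetical, and no TargetScan file format is implied.

```python
def select_top_targets(context_scores, gene_rpkm, top_n=1000, min_rpkm=1.0):
    """Rank genes by total context score (most negative first), keep the
    top_n, then drop genes below the expression threshold."""
    ranked = sorted(context_scores, key=context_scores.get)
    top = ranked[:top_n]
    return [g for g in top if gene_rpkm.get(g, 0.0) >= min_rpkm]


# Hypothetical scores and expression values.
scores = {"g1": -0.9, "g2": -0.1, "g3": -0.5, "g4": -0.7}
expression = {"g1": 12.0, "g2": 50.0, "g3": 0.2, "g4": 3.0}

targets = select_top_targets(scores, expression, top_n=3)
print(targets)
```

Here g3 falls within the top three by score but is removed by the expression filter, so the final set can be smaller than the nominal N.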
5.4.4 Targeted Mass Spectrometry
A dozen computationally predicted top targets of miR-290 miRNAs, LATS2A, RBL2, CASP2,
P21, TGFBR2, TET1, E2F2, EDNRB, together with mCherry, d2EGFP and GAPDH, were
selected for targeted MS monitoring. At least three interference-free, sequence-specific fragment
ions (transition ions) were selected for each protein to be detected. Sorted cells were resuspended in
lysis buffer (provided by Nikolai) and sent for targeted MS at a Harvard Mass Spectrometry core
facility on an Agilent 6460 Triple Quadrupole Mass Spectrometer with an Agilent 1290 uHPLC.
Data were analyzed using the Skyline Targeted Proteomics Environment.
5.5 Supplementary
Sample   gfp        Lats2 UTR   mCherry
Ori0     91.25      4.29        5.37
Ori1     84.62      10.77       24.26
Ori2     110.44     32.86       123.74
Ori3     291.44     146.96      572.71
Ori4     2818.43    1780.41     4705.22
Mut0     86.31      13.84       5.13
Mut1     105.26     39.38       22.82
Mut2     240.86     205.73      121.89
Mut3     839.31     904.79      446.89
Mut4     7557.12    8499.21     3865.01
Table 5.2 gfp, Lats2 UTR and mCherry RPKM reads for sorted samples.
The reads for Lats2 UTR are the sum of Lats2 OriUTR and Lats2 MutUTR reads.
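As a rough illustration of how the gfp reads in Table 5.2 reflect repression at the mRNA level, the sketch below takes the Mut/Ori ratio of gfp RPKM in each bin. This ratio is not a quantity reported in the text, and it is not corrected for the different mCherry levels of the paired samples, so it should be read only as an order-of-magnitude check.

```python
# gfp RPKM values copied from Table 5.2 (bins 0 to 4).
gfp_ori = [91.25, 84.62, 110.44, 291.44, 2818.43]
gfp_mut = [86.31, 105.26, 240.86, 839.31, 7557.12]

# Uncorrected Mut/Ori ratio per bin; values above 1 indicate fewer gfp
# transcripts in the OriUTR sample than in the MutUTR sample.
mut_over_ori = [m / o for m, o in zip(gfp_mut, gfp_ori)]

for bin_index, ratio in enumerate(mut_over_ori):
    print(f"bin{bin_index}: Mut/Ori gfp ratio = {ratio:.2f}")
```

The ratio is close to 1 in the background bin and between roughly 2 and 3 in bins 2 to 4, in line with the statement that several hundred to a few thousand decoy transcripts remain per cell after miRNA-mediated degradation.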
[Supplementary Figure 5.1 panels: (a) GFP reporter, (b) Lats2 UTR, (c) mCherry; x-axis in each panel: Ori0, Mut0, Ori1, Mut1, Ori2, Mut2, Ori3, Mut3, Ori4, Mut4]
Supplementary Figure 5.1 Histograms of RPKM reads for gfp, Lats2 UTR and mCherry.
(a) RNA-Seq reads for gfp have a relatively high background, yet the miRNA-mediated repression
at the transcript level is clear, as reads from Ori samples are always lower than reads from the
corresponding Mut samples. (b) RNA-Seq reads for Lats2 UTR also reflect the miRNA repression
at the mRNA level. The difference between Ori0 and Mut0 reflects the different stability of OriUTR
and MutUTR. (c) In contrast to the miRNA activity reporter, RNA expression of the indicator mCherry is
comparable for Ori and Mut samples.
[Supplementary Figure 5.2 panels: Beta-Actin and Gapdh; x-axis in each panel: Ori0, Mut0, Ori1, Mut1, Ori2, Mut2, Ori3, Mut3, Ori4, Mut4]
Supplementary Figure 5.2 Expression of housekeeping genes is homogeneous throughout
sorted samples.
RNA-Seq reads for the housekeeping genes Beta-Actin and Gapdh are homogeneous throughout sorted
samples.
[Supplementary Figure 5.3 panels: Bin 1, Bin 2, Bin 3, Bin 4, each annotated with a KS-test p-value; x-axis in each panel: log2 FC]
Supplementary Figure 5.3 Distributions of LFCs for all genes are not affected by Lats2 UTR
decoy overexpression.
The log fold change distribution of bin X (red stairs) was overlaid on top of the bin0 (grey bars)
background control for all RNA-Seq genes, and the means of LFCs were plotted as a red line and a
grey dashed line respectively. Since the power of the KS test depends on sample size, 500 genes (similar
to the total number of miRNA targets after expression filtering) were sub-sampled from all genes.
A two-sample KS test was performed between bin X and bin0, and the differences between samples
were not significant.
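The sub-sampling step in the caption above can be sketched as follows. The fixed seed and the hypothetical LFC list are assumptions added so that the example is reproducible; the text does not say how the 500 genes were drawn.

```python
import random


def subsample_lfcs(all_gene_lfcs, n=500, seed=0):
    """Draw n values without replacement so that the all-genes control has
    roughly the same sample size as the miRNA target set."""
    values = list(all_gene_lfcs)
    if len(values) <= n:
        return values
    return random.Random(seed).sample(values, n)


# Hypothetical all-genes LFCs centered on 0.
all_lfcs = [((i % 41) - 20) / 20.0 for i in range(5000)]
control = subsample_lfcs(all_lfcs)
print(len(control))
```

Matching the sample size keeps the all-genes comparison from appearing more significant than the target-set comparison simply because it contains more genes.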
[Supplementary Figure 5.4 panels: TargetScan map of conserved miRNA family sites on the mouse Lats2 3' UTR (legible families include miR-135, miR-25/32/92/363/367, miR-203, miR-103/107, miR-200bc/429 and miR-31); LFC histograms for Ori1/Mut1, Ori2/Mut2, Ori3/Mut3 and Ori4/Mut4, each annotated with a KS-test p-value; x-axis: log2 FC]
Supplementary Figure 5.4 miRNAs targeting Lats2 3'UTR and example of LFCs for miR-15
targets.
MicroRNAs targeting the Lats2 3'UTR were predicted by TargetScanMouse version 6.2
(http://www.targetscan.org/mmu_61/). Targets of regulating miRNAs having broadly conserved
sites among vertebrates (shown above) were all tested for significance of LFC distributions. The
results are negative, and miR-15/16 targets are shown as an example; p-values are given by the KS
test.
[Supplementary Figure 5.5 plot: boxplot of genome-wide FC; x-axis: Ori0/Mut0, Ori1/Mut1, Ori2/Mut2, Ori3/Mut3, Ori4/Mut4]
Supplementary Figure 5.5 Boxplot of LFCs for sorted samples.
LFC was defined as LFC = log2 (gene expression in OriUTR transfection/ gene expression in
MutUTR transfection). LFCs for all 5 pairs of bins were all centered on 0 with few outliers. The
variation for bin2 was significantly smaller than the other 4 bins, due to experimental variation of
RNA-Seq.
[Supplementary Figure 5.6 plot: total reads with plasmids and total reads without plasmids (x 10^6) for Ori0 to Ori4 and Mut0 to Mut4]

Sample   Plasmid reads   Total reads w/o plasmids (x 10^7)   Plasmids / total reads
ORI0     1080            1.22                                0.01%
ORI1     1273            1.18                                0.01%
ORI2     2807            1.15                                0.02%
ORI3     9750            1.04                                0.09%
ORI4     114445          1.26                                0.90%
MUT0     923             0.93                                0.01%
MUT1     2579            1.51                                0.02%
MUT2     6143            0.97                                0.06%
MUT3     22421           0.88                                0.25%
MUT4     259108          1.09                                2.33%
Supplementary Figure 5.6 Expression from transfected plasmids only constitutes a small
fraction of total transcript reads from ESCs.
Total RNA-Seq reads for all sorted samples were on the order of 10^7 RPKM. Transcripts expressed
from transfected plasmids only constitute a small fraction of total reads, with a percentage < 2.5%
even for the highest expressing bins.
5.6 References
Alimonti, A., Carracedo, A., Clohessy, J.G., Trotman, L.C., Nardella, C., Egia, A., Salmena, L.,
Sampieri, K., Haveman, W.J., Brogi, E., et al. (2010). Subtle variations in Pten dose determine
cancer susceptibility. Nat Genet 42, 454-458.
Cazalla, D., Yario, T., and Steitz, J.A. (2010). Down-regulation of a host microRNA by a
Herpesvirus saimiri noncoding RNA. Science 328, 1563-1566.
Chen, C. (2007). Defining embryonic stem cell identity using differentiation-related microRNAs
and their potential targets. Mamm Genome 18, 316-327.
Chiu, H.S., Llobet-Navas, D., Yang, X.R., Chung, W.J., Ambesi-Impiombato, A., Lyer, A., Kim,
H.R., Seviour, E.G., Luo, Z.J., Sehga, V., et al. (2015). Cupid: simultaneous reconstruction of
microRNA-target and ceRNA networks. Genome Research 25, 257-267.
Denzler, R., Agarwal, V., Stefano, J., Bartel, D.P., and Stoffel, M. (2014). Assessing the
ceRNA Hypothesis with Quantitative Measurements of miRNA and Target Abundance. Molecular
Cell 54, 766-776.
Ebert, M.S., Neilson, J.R., and Sharp, P.A. (2007). MicroRNA sponges: competitive inhibitors of
small RNAs in mammalian cells. Nature Methods 4, 721-726.
Hansen, T.B., Jensen, T.I., Clausen, B.H., Bramsen, J.B., Finsen, B., Damgaard, C.K., and Kjems,
J. (2013). Natural RNA circles function as efficient microRNA sponges. Nature 495, 384-388.
Li, J.-H., Liu, S., Zhou, H., Qu, L.-H., and Yang, J.-H. (2013). starBase v2.0: decoding miRNA-ceRNA, miRNA-ncRNA and protein-RNA interaction networks from large-scale CLIP-Seq data.
Nucleic Acids Research.
Libri, V., Helwak, A., Miesen, P., Santhakumar, D., Borger, J.G., Kudla, G., Grey, F., Tollervey,
D., and Buck, A.H. (2012). Murine cytomegalovirus encodes a miR-27 inhibitor disguised as a
target. Proceedings of the National Academy of Sciences of the United States of America 109,
279-284.
Marson, A. (2008). Connecting microRNA genes to the core transcriptional regulatory circuitry of
embryonic stem cells. Cell 134, 521-533.
Memczak, S., Jens, M., Elefsinioti, A., Torti, F., Krueger, J., Rybak, A., Maier, L., Mackowiak,
S.D., Gregersen, L.H., Munschauer, M., et al. (2013). Circular RNAs are a large class of animal
RNAs with regulatory potency. Nature 495, 333-338.
Poliseno, L. (2010). A coding-independent function of gene and pseudogene mRNAs regulates
tumour biology. Nature 465, 1033-1038.
Salmena, L., Poliseno, L., Tay, Y., Kats, L., and Pandolfi, P.P. (2011). A ceRNA hypothesis: the
Rosetta Stone of a hidden RNA language? Cell 146, 353-358.
Sumazin, P. (2011). An extensive microRNA-mediated network of RNA-RNA interactions
regulates established oncogenic pathways in glioblastoma. Cell 147, 370-381.
Tay, Y. (2011). Coding-independent regulation of the tumor suppressor PTEN by competing
endogenous mRNAs. Cell 147, 344-357.
Tay, Y., Rinn, J., and Pandolfi, P.P. (2014). The multilayered complexity of ceRNA crosstalk and
competition. Nature 505, 344-352.
Wang, Y., Xu, Z., Jiang, J., Xu, C., Kang, J., Xiao, L., Wu, M., Xiong, J., Guo, X., and Liu, H.
(2013). Endogenous miRNA Sponge lincRNA-RoR Regulates Oct4, Nanog, and Sox2 in Human
Embryonic Stem Cell Self-Renewal. Developmental Cell 25, 69-80.
Wang, Z., Gerstein, M., and Snyder, M. (2009). RNA-Seq: a revolutionary tool for transcriptomics.
Nature reviews Genetics 10, 57-63.
Wee, L.M., Flores-Jasso, C.F., Salomon, W.E., and Zamore, P.D. (2012). Argonaute Divides Its
RNA Guide into Domains with Distinct Functions and RNA-Binding Properties. Cell 151, 1055-1067.
Chapter 6 Double hybridization of GFP-Lin28a3'UTR transcript
reveals a novel expression pattern
6.1 Abstract
By hybridizing the CDS and 3'UTR of reporter plasmid transcripts with smFISH probes of different
colors and studying their co-localization under the microscope, we discovered an expression-dependent
threshold behavior for the GFP-Lin28a3'UTR transcript. A significant number of isolated gfp
transcripts lacking the downstream Lin28a 3'UTR tail were observed. Below an expression threshold of
100 gfp mRNA molecules, the probability of a GFP spot having a co-localized Lin28a 3'UTR tail was
highly variable, between 0 and 1. Above the threshold, the co-localization probability was
always high. This phenomenon was caused neither by non-specific binding of GFP probes nor by
imperfect hybridization or detection efficiencies. Stable integration of GFP-Lin28a3'UTR into the
genome recapitulated the expression pattern with an even higher transition threshold.
Transfection of the plasmid into Dgcr8-/- ESCs partially rescued but did not abolish the threshold
behavior. The mechanism behind this novel expression pattern remains unknown, and the
possibility of alternative polyadenylation (APA) is discussed at the end.
6.2 Results
The microRNA activity reporter plasmid pCAG-d2eGFP-Lin28a3'UTR was transiently transfected
into wild-type ESCs. We hybridized eGFP with Alexa probes and the Lin28a3'UTR
region with Cy5 probes. For intact transcripts expressed from transfected plasmids, we
expect co-localization of an eGFP spot and a Lin28a3'UTR spot, so that an overlay of the
two channel images yields a yellow spot (Figure 6.1). However, we observed many isolated
green spots (Figure 6.2), which would correspond to gfp transcripts without the Lin28a 3'UTR tail. The
majority of isolated red spots would correspond to endogenous Lin28a transcripts rather than
decoy transcripts without the 5' GFP head, because the distribution of isolated red spots per cell
is statistically indistinguishable from the distribution of endogenous Lin28a expression
(data not shown).
[Figure 6.1 schematic labels: transfection plasmid; endogenous Lin28a; FISH probe set (eGFP-Alexa, Lin28-3'UTR-Cy5); decoy mRNA; miRISC. Microscopy spot colors and representation: intact decoy transcript; decoy transcript w/o 3' tail; endogenous Lin28a; decoy transcript w/o 5' head]
Figure 6.1 GFP-Lin28a3'UTR co-localization experiment schematics.
The pCAG-d2eGFP-Lin28a3'UTR plasmid was transiently transfected into wild-type ESCs, and cells
were hybridized with the eGFP-Alexa probe (green) and the Lin28a3'UTR-Cy5 probe (red). The Cy5 probe
binds to the 3'UTR region of both transgene transcripts and endogenous Lin28a transcripts.
An intact transcript expressed from the transfected plasmid appears as co-localized green and
red, i.e. a yellow spot.
Figure 6.2 Representative images of abnormally low co-localization of eGFP (green) and
lin28a3'UTR (red) spots.
Cells are arranged according to increasing eGFP expression. Small yellow spots correspond to co-localized red and green spots, i.e. intact transcripts transcribed from the reporter plasmid. Big yellow
spots correspond to active transcription sites on transfected plasmids.
We then set out to quantify the co-localization of eGFP and Lin28a3'UTR spots. We noticed that
the percentage of eGFP having a co-localized Lin28a3'UTR tail was dependent on gfp expression.
A co-localized spot is defined as one eGFP spot and one Lin28a3'UTR spot located within a 2-D
squared distance of 5 pixels² (after correction of the shifts between channels) and differing in the z
direction by no more than 1 stack (See Methods). The co-localization percentage is
plotted against the total gfp expression within the cell, which is the sum of
isolated and co-localized GFP spots. The co-localization exhibits a sharp
transition (Figure 6.3). Specifically, the co-localization percentage stabilizes at ~
0.73 ± 0.13 above 100 gfp mRNA molecules per cell. Below the threshold, the co-localization
is highly variable, ranging from 0 to 1, and a significant fraction of cells bear a low co-localization
percentage (<0.5). This expression pattern cannot be explained by imperfect
hybridization and detection efficiencies (Supplementary Figure 6.1), or by non-specific binding
of GFP probes (Supplementary Figure 6.2). The sharp increase in co-localization percentage at
high mRNA levels (>100 transcripts per cell) cannot be explained by an increased random
chance of co-localization at higher spot densities either (Supplementary Figure 6.3).
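The co-localization criterion above can be sketched as a simple spot-matching routine. This is an illustration under stated assumptions, not the analysis code used here: spot coordinates are taken to be already corrected for channel shifts, the distance cut-off is treated as inclusive, and each 3'UTR spot is matched to at most one GFP spot by a greedy nearest-neighbour rule. The coordinates are hypothetical.

```python
def colocalization_fraction(gfp_spots, utr_spots, max_sq_dist=5.0, max_dz=1):
    """Fraction of GFP spots with a 3'UTR partner.

    Spots are (x, y, z) tuples with x, y in pixels and z as a stack index.
    A pair co-localizes if its squared 2-D distance is at most max_sq_dist
    and its z indices differ by at most max_dz.
    """
    if not gfp_spots:
        return float("nan")
    unused = list(utr_spots)
    matched = 0
    for gx, gy, gz in gfp_spots:
        best_index, best_d2 = None, None
        for k, (ux, uy, uz) in enumerate(unused):
            d2 = (gx - ux) ** 2 + (gy - uy) ** 2
            if d2 <= max_sq_dist and abs(gz - uz) <= max_dz:
                if best_d2 is None or d2 < best_d2:
                    best_index, best_d2 = k, d2
        if best_index is not None:
            unused.pop(best_index)  # each UTR spot can be used only once
            matched += 1
    return matched / len(gfp_spots)


# Hypothetical spots in one cell.
gfp = [(10, 10, 3), (50, 50, 5), (80, 20, 2)]
utr = [(11, 11, 3), (50, 53, 5), (200, 200, 1)]

fraction = colocalization_fraction(gfp, utr)
print(fraction)
```

In this toy cell only the first GFP spot has a partner within the cut-off (squared distance 2), the second is too far in 2-D (squared distance 9), and the third has no nearby 3'UTR spot, giving a fraction of one third.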
[Figure 6.3 plot: co-localization percentage (0 to 1) versus GFP mRNA decoy # per cell (0 to 1000)]
Figure 6.3 Co-localization of gfp mRNA with Lin28a3'UTR tail.
pCAG-d2eGFP-Lin28a3'UTR was transiently transfected into wild-type ESCs, and the percentage
of gfp transcripts with a co-localized Lin28a3'UTR tail is plotted against gfp mRNA levels. The
co-localization exhibits a sharp transition. Below ~100 gfp transcripts per cell, the co-localization is highly variable, and many cells bear a low co-localization percentage. Above the
threshold, the co-localization percentage stabilizes around 0.73. The red lines are plotted as
a guide to the eye.
To rule out the possibility that the threshold behavior is an artifact of transient
transfection, we stably integrated the d2eGFP-Lin28a3'UTR transgene into V19 ESCs (See Methods),
and induced its expression at a dox concentration of 2 µg/ml. A significant fraction of cells
with moderate to high gfp expression exhibit a low co-localization percentage (Figure 6.4,
below y ~ 0.5). Interestingly, the gfp expression threshold below which low co-localization is tolerated is
even larger than that of transient transfection, at around 200 gfp transcripts per cell. The
marginal distribution of the co-localization percentage is bimodal. One fraction of cells centers
around a co-localization percentage of 0.73; this high co-localization mode is likely to contain only intact
transcripts, with the less-than-perfect co-localization percentage merely due to imperfect detection
efficiency. Another significant fraction of cells bears low co-localization and peaks around 0.25;
these cells are likely to express a combination of intact transcripts and short-UTR transcripts. The
threshold also seemed to depend on dox concentration and induction time (data not shown), and
the cellular background of V19 ESCs was confirmed to be clean (no gfp expression under dox
induction before transgene integration). In addition, transient transfection of neither GFP-Lats2UTR nor
GFP-Casp2UTR exhibits the threshold behavior (Supplementary Figure 6.4), indicating that the transient
transfection approach itself generally guarantees the expression of full-length transcripts
from the delivered plasmid, and that the threshold behavior is specific to the Lin28a 3'UTR.
[Figure 6.4 plot: V19 ESCs; co-localization percentage versus GFP mRNA level per cell (0 to 600)]
Figure 6.4 Co-localization of GFP mRNA with Lin28 3'UTR tail for stably integrated
d2eGFP-Lin28a3'UTR in V19 ESCs.
GFP-Lin28a3'UTR was stably integrated into the collagen locus of V19 ESCs, and its expression
was induced by doxycycline. A significant fraction of cells express isolated GFP transcripts
without the Lin28a 3'UTR tail, and the marginal distribution of the co-localization percentage exhibits
bimodality.
To study whether the threshold behavior is miRNA dependent, we performed the transfection experiment
on Dgcr8 knockout ESCs and compared the threshold with that of wild type ESCs. We observed
that the threshold is smaller in Dgcr8-/- ESCs, but it is not entirely abrogated (Figure 6.5). The
experiment was performed three times, with both cell lines measured in parallel, and the difference
between cell lines was reproducible.
To further evaluate whether miRNA regulation directly affects the transition behavior, we mutated all
significant MREs on the Lin28a 3'UTR, transfected pCAG-d2eGFP-Lin28MutUTR into wild type
ESCs, and compared the threshold with that of the unmutated reporter in wild type ESCs. Similarly to
Dgcr8-/- ESCs, the transition threshold is shifted to a smaller value (Supplementary Figure 6.5).
To explain the miRNA dependency of the transition threshold, a simple Michaelis-Menten kinetics
model was proposed (See Supplementary model), in which the threshold is predicted to depend
on the effective miRNA concentration. According to model fitting, the effective miRNA concentration
in wild type ESCs is predicted to be 5 fold higher than that of Dgcr8-/- ESCs (Supplementary
Figure 6.8). We therefore performed Taqman qRT-PCR to measure the expression of miRNAs
targeting the Lin28a 3'UTR. However, the miRNA expression in Dgcr8-/- ESCs was confirmed to be zero,
which is not proportional to the predicted effective miRNA concentration (Supplementary Figure 6.7).
We also attempted to shift the threshold towards higher values by increasing the effective miRNA
concentration. We co-transfected the plasmid with let-7 miRNA mimics, or performed the transfection
together with retinoic acid differentiation, during which endogenous let-7 expression has been
reported to increase. However, both experiments proved difficult, as repression by let-7
on the Lin28a 3'UTR strongly decreased transgene expression (data not shown).
[Figure 6.5 plot: Dgcr8 KO ESCs; co-localization percentage versus GFP mRNA decoy # per cell (0 to 1000)]
Figure 6.5 Co-localization of gfp mRNA with Lin28a 3'UTR tail in Dgcr8-/- cells.
pCAG-d2eGFP-Lin28a3'UTR is transiently transfected into Dgcr8-/- ESCs, which are devoid of
mature miRNA expression. The sharp transition in co-localization still persists;
however, the transition threshold is shifted towards a smaller value. The red line corresponds to
the threshold in WT ESCs, and it is plotted as a reference.
To see whether the co-localization transitioning behavior also happens in its natural context, we hybridized
the Lin28a CDS with Alexa probes and the Lin28a 3'UTR with Cy5 probes, and measured the
co-localization of the coding region with respect to its 3'UTR tail. Although very rare (<1%),
low co-localization does happen in the natural context (Supplementary Figure 6.6). It should be
noted that due to the ~70% sequence similarity between the coding regions of Lin28a and Lin28b,
the CDS probe cannot distinguish between the two homologs (the 3'UTRs of these two genes are
very different, though), and the x value might be slightly skewed. By quantifying the co-localization percentage of Lin28a UTRs with a CDS spot, we found Lin28a to be the dominant
form in wild type ESCs. Lin28 expression is much lower in Dgcr8-/- ESCs, with a much higher
proportion of Lin28b expression (data not shown).
6.3 Discussion
We discovered that the co-localization of the CDS and UTR regions of GFP-Lin28a3'UTR transcripts exhibited a
threshold behavior depending on transcript expression. The shift of the threshold towards smaller
values for transfection into Dgcr8-/- cells and for transfection of Lin28a MutUTR suggested that the
expression pattern was miRNA dependent, and a miRNA-mediated-decay mechanism was
proposed to explain the phenomenon (Supplementary model). However, the disproportionality between the
transitioning threshold and miRNA expression in wild type versus Dgcr8-/- cells calls
this hypothesis into question. Moreover, the stably integrated GFP-Lin28a3'UTR was able to beat the
threshold of wild type ESCs without perturbing endogenous miRNA expression. It is also hard
to imagine how an mRNA in the middle of degradation could be stable enough to be observed.
Here we discuss the possibility of alternative polyadenylation (APA). APA is a widespread
regulatory mechanism that controls gene expression and expands protein diversity. Earlier studies
based on expressed sequence tag (EST) databases estimated that 54% of human and 32% of
mouse transcripts were alternatively cleaved (Tian et al., 2005), and more recent studies based on
deep sequencing brought the current estimate for human genes to 70-75% (Derti et al., 2012;
Shi, 2012). The most common type of APA is 3'UTR APA, which utilizes alternative poly(A)
sites (PASs) located within the same terminal exon and produces mRNA isoforms with 3'UTRs of
different lengths without affecting the encoded protein. The Lin28a OriUTR sequence with APA sites
highlighted is shown in Supplementary sequence information (Tian et al., 2005). Evidence
of APA of Lin28 orthologues in human and chicken has been documented in APADB
(http://tools.genxpro.net/apadb/), a database for mammalian alternative polyadenylation
determined by 3'-end sequencing.
If one of the proximal APA sites on the Lin28a OriUTR is adopted, the shorter UTR may not be long
enough to bind a sufficient number of probes to be resolved as a diffraction limited spot, and the
alternative form of the transcript may appear as an isolated GFP CDS spot. Analyses of diverse human
tissues and cell lines demonstrated a substantial anti-correlation between proliferation and 3'UTR
length caused by APA. Examples include T cell activation (Sandberg et al., 2008) and various
cancer cells (Mayr and Bartel, 2009). Progressive lengthening of 3'UTRs by APA modulation was
observed during mouse embryonic development (Ji et al., 2009), and the generation of induced
pluripotent stem cells (iPSCs) from differentiated cells was accompanied by global 3'UTR
shortening (Ji and Tian, 2009). Thus the utilization of proximal PASs in mESCs is entirely
possible. Dgcr8-/- ESCs lack all mature canonical miRNAs and are associated with
prolonged cell cycles, and the slower proliferation rate might explain the reduced selection of
proximal APA sites in this cell line. Incidentally, we found that one of the most proximal APA sites
is mutated in Lin28a MutUTR (Supplementary sequence information), which could potentially
explain the differences between OriUTR and MutUTR. The selection of APA sites also depends on
many factors, such as extracellular stimuli (Shell et al., 2005), transcription activity, chromatin
modifications, regulatory proteins such as splicing and 3'-end-processing factors, and RNA-binding proteins (RBPs), as reviewed in (Elkon et al., 2013). These factors might explain the
variations between transiently transfected and stably integrated transcripts, and between the artificially
constructed GFP-Lin28a3'UTR and endogenous Lin28a.
Finally, we propose that by hybridizing different regions of a transcript with probes of different colors
and studying their co-localization, APA could be studied in a quantitative manner.
6.4 Methods
6.4.1 Taqman microRNA expression measurements
Small RNA (<200 nt) was isolated from wild-type and Dgcr8-/- mESCs using the mirVana miRNA
Isolation Kit (Ambion AM1560). Expression of the mature microRNAs regulating the Lin28a 3'UTR
was assayed using Taqman microRNA assays (Life Technologies).
6.4.2 Co-localized spots detection
eGFP-Alexa and lin28UTR-Cy5 spots were first detected independently using the standard FISH
spot detection algorithm (see 2.4.11), and the spot positions were recorded. Alexa and Cy5 spots
within a 2-D squared distance of 5 pixels² and within an absolute z distance of 1 stack
were considered co-localized. Each Alexa dot could be co-localized with at most one Cy5
dot and vice versa. The average position shift over all co-localized spots within one image was
calculated and taken as the shift between the Alexa and Cy5 channels. The channel shift was
corrected for in the next round of co-localized spot determination. The iteration stopped when
two adjacent rounds yielded the same channel shift; in practice it converged quickly,
usually within 3 rounds. A typical Z projected image and its detected spots are shown in Figure
6.6.
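The iterative procedure described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the code used in this work: the function names, the greedy nearest-first rule for enforcing one-to-one pairing, the rounding of the shift, and the toy spot coordinates are all hypothetical; only the distance thresholds (5 pixels² in x-y, 1 stack in z) and the stop criterion are taken from the text.

```python
from itertools import product


def match_spots(alexa, cy5, shift=(0.0, 0.0, 0.0), max_sq_xy=5.0, max_dz=1.0):
    """Return one-to-one (alexa_index, cy5_index) pairs of co-localized spots.

    Spots are (x, y, z) tuples; `shift` is subtracted from the Cy5 positions.
    Greedy nearest-first pairing is an assumption, not the original method.
    """
    candidates = []
    for (i, a), (j, c) in product(enumerate(alexa), enumerate(cy5)):
        dx = c[0] - shift[0] - a[0]
        dy = c[1] - shift[1] - a[1]
        dz = c[2] - shift[2] - a[2]
        sq_xy = dx * dx + dy * dy
        if sq_xy <= max_sq_xy and abs(dz) <= max_dz:
            candidates.append((sq_xy, abs(dz), i, j))
    candidates.sort()  # closest candidate pairs are assigned first
    used_alexa, used_cy5, pairs = set(), set(), []
    for _, _, i, j in candidates:
        if i not in used_alexa and j not in used_cy5:
            used_alexa.add(i)
            used_cy5.add(j)
            pairs.append((i, j))
    return pairs


def estimate_channel_shift(alexa, cy5, max_rounds=10):
    """Iterate matching and shift estimation until the shift stops changing."""
    shift = (0.0, 0.0, 0.0)
    for _ in range(max_rounds):
        pairs = match_spots(alexa, cy5, shift)
        if not pairs:
            return shift, pairs
        # Mean displacement (Cy5 minus Alexa) over all co-localized pairs.
        new_shift = tuple(
            round(sum(cy5[j][k] - alexa[i][k] for i, j in pairs) / len(pairs), 6)
            for k in range(3)
        )
        if new_shift == shift:
            return shift, pairs
        shift = new_shift
    return shift, match_spots(alexa, cy5, shift)


# Hypothetical toy data: three Cy5 spots offset by one pixel in y, plus unpaired spots.
alexa_spots = [(10, 10, 3), (20, 15, 4), (30, 40, 5), (50, 50, 2)]
cy5_spots = [(10, 11, 3), (20, 16, 4), (30, 41, 5), (80, 80, 7)]
channel_shift, matched = estimate_channel_shift(alexa_spots, cy5_spots)
coloc_fraction = len(matched) / len(alexa_spots)
```

With this toy input the estimate settles in the second round at a shift of one pixel in y, and three of the four Alexa spots are counted as co-localized.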
Figure 6.6 Z projection of a microscopy image and its computationally detected spots.
Images of different fluorescent channels are not perfectly aligned but usually have a minor shift,
which was calculated by taking the average position shift of co-localized spots. Spot positions
were shift corrected for the next round of co-localized spot assessment. In this case, the shift between
the Alexa and Cy5 channels was calculated to be [0, 1, 0].
[Figure 6.6 image. Legend: eGFP Alexa; lin28 Cy5.]
6.4.3 Stable Integration
Step 1. Insert d2eGFP-lin28a3'UTR behind the pTet regulator of the ptet.splicePL3 plasmid to
create a Tet-On system for d2eGFP-lin28a3'UTR expression.
d2eGFP-lin28a3'UTR was PCR amplified from the pCAG-d2eGFP-lin28a3'UTR plasmid with
Forward primer G-EcoRI-d2eGFP: GGAATTCACCGGTCGCCACCATGGT
Reverse primer 1 lin28UTR-SpeI-CC r.c.: GGACTAGTAGATCCCAGTACCAACTCTGGAG
Reverse primer 2 lin28UTR-RBGpA-SpeI-CC r.c.: GGACTAGTGATCTCCATAAGAGAAGAGGGACAGC
r.c.: reverse complementary.
ptet.splicePL3-OSKMpA was triple digested with EcoRI, SpeI and SphI. The SphI digestion
further fragmented the OSKMpA insert so that it could be distinguished from the ~5.1 kb ptet.splicePL3
backbone. The PCR amplified d2eGFP-lin28UTR fragment was double digested with EcoRI
and SpeI, and ligated into the ptet.splicePL3 backbone. Positive clones were further confirmed by sequencing
with the pTet-splice sequence primer: AGTGAAAGTCGAGCTCGGTA.
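As a quick sanity check on the primer design, the restriction sites and the annealing region of the reverse primer can be verified computationally. The sketch below is illustrative only: the helper names are hypothetical, the recognition sequences GAATTC (EcoRI) and ACTAGT (SpeI) are the standard ones, and the template fragment is simply the last full line of the Lin28a OriUTR listing in Section 6.5.3 as it appears in this document.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase or lowercase DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def annealing_length(primer, template):
    """Length of the longest 3'-terminal stretch of `primer` whose reverse
    complement occurs in `template` (top strand, 5' to 3')."""
    best = 0
    for n in range(1, len(primer) + 1):
        if reverse_complement(primer[-n:]) in template:
            best = n
        else:
            break
    return best


ECORI = "GAATTC"  # EcoRI recognition site
SPEI = "ACTAGT"   # SpeI recognition site

forward = "GGAATTCACCGGTCGCCACCATGGT"
reverse_1 = "GGACTAGTAGATCCCAGTACCAACTCTGGAG"
reverse_2 = "GGACTAGTGATCTCCATAAGAGAAGAGGGACAGC"

# Last full line of the OriUTR listing (Section 6.5.3), used as a template fragment.
utr_end = "ACTTATTGGTACTCCAGAGTTGGTACTG"

has_ecori = ECORI in forward
has_spei = SPEI in reverse_1 and SPEI in reverse_2
anneal_1 = annealing_length(reverse_1, utr_end)
```

For reverse primer 1, the 17 nucleotides at its 3' end are the reverse complement of the end of that UTR fragment, consistent with the primer annealing to the distal end of the Lin28a 3'UTR.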
[Figure 6.7 image: circular map of pTet-SplicePL3 (5.2 kb) and linear map of the region after insertion. Labelled features of the linear map include NotI (1), TRE (117-324), CMV2 promoter (325-444), d2EGFP (474-1319), NotI (1321), SV40 intron and 3' splice site, SV40 PA terminator (4975-5260) and NotI (5867).]
Figure 6.7 Plasmid map of ptet.splicePL3 and the regional linear map after insertion of
d2eGFP-lin28a3'UTR.
d2eGFP-lin28a3'UTR was cloned into the ptet.splicePL3 plasmid with EcoRI and SpeI. TRE:
Tetracycline response element.
Step 2. Cut ptet.splicePL3-d2eGFP-lin28a3'UTR with NotI, and clone the fragment into the mCol.loxneo plasmid.
This step added homologous recombination arms to the transgene for genomic integration.
It also added a drug resistance cassette for positive selection of integration.
The mCol.loxneo plasmid was digested with NotI and dephosphorylated with Antarctic phosphatase (NEB
M0289S) to prevent vector self-ligation. Since one NotI site exists between d2eGFP and
lin28a3'UTR, partial digestion was performed (1 U NotI per µg plasmid, digestion at 37 °C for 15 min,
heat inactivation at 65 °C for 20 min), and the 5.85 kb fragment was selected. Also, since the supercoiled
conformation of the ptet.splicePL3-d2eGFP-lin28a3'UTR plasmid runs at about the same position as
the desired fragment on the gel, the plasmid was first digested with ScaI and PvuI, and only the
~7.7 kb linearized fragment was gel extracted for the downstream partial digestion. The ligated positive
clones were digested with Sac to check the ligation directionality, and clones with the correct orientation
were confirmed by sequencing with the mCol.loxneo sequence primer: TCGCATTGTCTGAGTAGGTGT.
[Figure 6.8 image: plasmid map of the mCol.loxneo tetO.OSKM vector (17959 bp), showing the 5' arm, 3' arm, ApR, the pgk-neo-pA cassette, LoxP and NotI sites, together with a schematic of homologous recombination into the collagen locus.]
Figure 6.8 Plasmid map of mCol.loxneo and illustration of homologous recombination of the
transgene into the collagen locus.
The mCol.loxneo-d2eGFP-lin28a3'UTR plasmid was linearized with PvuI; its 5' and 3' ends contain
homology arms toward the 3' untranslated region of the Col1a1 (collagen, type I, alpha 1) locus. The
linearized vector also contains a pgk (phosphoglycerate kinase) driven neo (neomycin) resistance
cassette for the selection of successful transgenic cells.
Step 3. Electroporate the mCol.loxneo-d2eGFP-lin28a3'UTR plasmid into the V19 ES cell line, in
which rtTA is constitutively expressed, and apply drug selection for stable integration.
The mCol.loxneo targeting vector contains both 5' and 3' homology arms toward the 3' untranslated
region of the Col1a1 locus as well as a pgk-driven neo resistance cassette for selection of transgenic
cells. The resulting ~16.5 kb targeting construct (mCol.loxneo-d2eGFP-lin28a3'UTR) was
linearized by PvuI restriction enzyme digestion (20 µg), precipitated and resuspended in 1 ml of
1x PBS containing 500,000 V19 ESCs (V6.5 ESCs containing a reverse tetracycline trans-activator (M2rtTA) driven by the Rosa26 promoter), and electroporated at 400 V, 25 µF for 1 pulse.
The cells were plated onto two gelatinized 10-cm plates pre-plated with neo-resistant MEFs (Global Stem). After 24 h, G418 (Geneticin(R), GIBCO 10131) was added to the ESC
medium at a concentration of 350 µg per ml. Neo-resistant colonies were picked 10 days later,
expanded and tested with dox (Doxycycline hyclate, Sigma-Aldrich D9891) induction at a
concentration of 2 µg/ml.
Figure 6.9 Illustration of dox induction of d2eGFP-lin28a3'UTR expression in V19 ESCs.
V19 ESCs are derived from a C57BL6/J background. The cell line is the same as V6.5 ESCs, except that
it contains a reverse tetracycline trans-activator (M2rtTA) gene driven by the Rosa26 (Reverse
orientation splice acceptor) promoter, so that rtTA is constitutively expressed in V19 cells. The
'reverse' Tet repressor (rTetR) domain of rtTA binds TetOP (tetracycline/doxycycline-responsive
operator) and activates the expression of d2eGFP-lin28a3'UTR in the presence of doxycycline.
The ptet.splicePL3-OSKMpA plasmid, the mCol.loxneo plasmid and V19 ESCs were all gifts from Laurie
Boyer's lab, and were originally created by Rudolf Jaenisch's lab.
6.5 Supplementary
6.5.1 Supplementary figures
[Supplementary Figure 6.1 plot. x-axis: simulated GFP mRNA decoy # per cell (100 to 1000).]
Supplementary Figure 6.1 Simulated co-localization of gfp mRNA with Lin28a3'UTR tail
assuming a detection efficiency of 0.73.
The hybridization and detection efficiency of transcript spots is not perfect. To rule out the
possibility that imperfect resolution results in the observed threshold effect, we simulated the co-localization for various transcript expression levels. The probability that each GFP spot has a co-localized lin28 spot detected was set to 0.73, the same as the stabilized average co-localization
percentage, and the co-localization events of different spots are independent of each other. 100 simulations
were performed for each integer transcript level. The horizontal red line is the average co-localization percentage of the simulated dots. The vertical red line is the same as in Figure 6.3, and
is plotted for visual guidance. The simulation indicates that the observed co-localization threshold
effect is not due to imperfect detection efficiency.
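The simulation described in this caption amounts to independent Bernoulli draws per spot. A minimal sketch is given below, assuming only what the caption states (detection probability 0.73, independent spots, 100 simulations per transcript level); the function names, the random seed and the three example transcript levels are hypothetical choices for illustration.

```python
import random
import statistics


def simulate_coloc_fraction(n_spots, p_detect, rng):
    """Fraction of n_spots GFP spots that get a co-localized partner,
    with each spot detected independently with probability p_detect."""
    hits = sum(rng.random() < p_detect for _ in range(n_spots))
    return hits / n_spots


def simulate_levels(levels, n_sim=100, p_detect=0.73, seed=0):
    """Run n_sim independent simulations for each transcript level."""
    rng = random.Random(seed)
    return {
        n: [simulate_coloc_fraction(n, p_detect, rng) for _ in range(n_sim)]
        for n in levels
    }


# Hypothetical example levels; the caption uses every integer level.
results = simulate_levels([10, 100, 1000])
for n, fractions in results.items():
    print(n, round(statistics.mean(fractions), 3),
          round(statistics.pstdev(fractions), 3))
```

Under these assumptions the mean simulated co-localization stays near 0.73 at every level and only the scatter shrinks as the transcript number grows, so imperfect detection alone produces no threshold.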
Supplementary Figure 6.2 Low co-localization at low mRNA level is NOT due to non-specific
binding of the GFP probes inside ESCs.
(a) ESCs were mock transfected and hybridized with GFP-Alexa probes. No false positive GFP
dots were observed in the absence of the reporter plasmid. (b) ESCs were transfected with reporter
plasmids and hybridized with GFP-Alexa. Even the dots in the lowly expressing cells are authentic
GFP spots. Both images are set to the same contrast.
[Supplementary Figure 6.3 plots. Left panel: constant shift of GFP dots (constant offset = [10, -10, -2]). Right panel: randomly generated GFP dots. x-axis of both panels: GFP mRNA decoys # per cell (0 to 1000).]
Supplementary Figure 6.3 Increased random chance of co-localization at higher dot
densities does not result in the sharp increase of co-localization with Lin28a 3'UTR.
Left, GFP dots are shifted by a large constant offset and their co-localization percentage with the original
Lin28-Cy5 dots is re-calculated. Right, the same number of GFP dots is randomly generated within
the same cell, and the co-localization is re-calculated. Both ways of perturbing the GFP dot positions
result in only a slight increase (<0.1) in co-localization percentage at the highest dot densities. In
the real data, the co-localization does not increase further at higher dot densities, due to poorer resolution
of densely connected dots (>800 per cell). Thus the sharp increase in the co-localization percentage
of GFP dots with the Lin28a 3'UTR tail is not caused by an increased random chance of co-localization at
higher dot densities.
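The two controls in this caption can be illustrated with a short sketch. It is a simplified stand-in rather than the analysis code of this work: the function names are hypothetical, the cell is approximated by a rectangular bounding box (an assumption, since the actual cell outline is not given here), the toy coordinates are invented, and the default offset is the one read from the figure panel.

```python
import random


def coloc_fraction(gfp, cy5, max_sq_xy=5.0, max_dz=1.0):
    """Fraction of GFP spots with a not-yet-used Cy5 spot within the thresholds."""
    used, hits = set(), 0
    for gx, gy, gz in gfp:
        for j, (cx, cy, cz) in enumerate(cy5):
            if j in used:
                continue
            if (cx - gx) ** 2 + (cy - gy) ** 2 <= max_sq_xy and abs(cz - gz) <= max_dz:
                used.add(j)
                hits += 1
                break
    return hits / len(gfp) if gfp else 0.0


def shifted_control(gfp, cy5, offset=(10, -10, -2)):
    """Co-localization after moving every GFP spot by a constant offset."""
    moved = [(x + offset[0], y + offset[1], z + offset[2]) for x, y, z in gfp]
    return coloc_fraction(moved, cy5)


def random_control(gfp, cy5, bounds, seed=0):
    """Co-localization of the same number of GFP spots placed uniformly at
    random inside a bounding box ((x0, x1), (y0, y1), (z0, z1))."""
    rng = random.Random(seed)
    (x0, x1), (y0, y1), (z0, z1) = bounds
    fake = [(rng.uniform(x0, x1), rng.uniform(y0, y1), rng.randint(z0, z1))
            for _ in gfp]
    return coloc_fraction(fake, cy5)


# Hypothetical toy cell: four perfectly co-localized spot pairs.
cy5_spots = [(0, 0, 5), (100, 0, 5), (0, 100, 5), (100, 100, 5)]
gfp_spots = list(cy5_spots)
observed = coloc_fraction(gfp_spots, cy5_spots)
shifted = shifted_control(gfp_spots, cy5_spots)
randomized = random_control(gfp_spots, cy5_spots, ((0, 100), (0, 100), (0, 10)))
```

In this sparse toy example the shifted control drops to zero, which is the behavior expected when co-localization is not driven by chance overlap.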
[Supplementary Figure 6.4 plots. Panel titles: pCAG-eGFP-lats2aUTR; pCAG-eGFP-Casp2UTR. x-axis of both panels: GFP mRNA number per cell (200 to 1000).]
Supplementary Figure 6.4 Co-localization of GFP mRNA with Casp2 or Lats2 3'UTR tail.
pCAG-d2eGFP-Casp2UTR or pCAG-d2eGFP-Lats2UTR was transiently transfected into wild-type ESCs. The GFP transcript was hybridized with Alexa probes and the 3'UTR of Casp2 or Lats2 was
hybridized with Cy5 probes. The co-localization is calculated as in the Lin28a case. The co-localization for neither UTR exhibits the threshold behavior, and the sharp transition is unique to
the Lin28a 3'UTR among the UTRs tested.
[Supplementary Figure 6.5 plot. Panel title: Lin28aMutUTR, WT ESCs. x-axis: GFP mRNA decoy # per cell (100 to 1000).]
Supplementary Figure 6.5 Co-localization of gfp mRNA with mutated Lin28a 3'UTR tail in
wild type ESCs.
MicroRNA response elements (MREs) on the Lin28a 3'UTR are mutated, and pCAG-d2eGFP-Lin28MutUTR is transiently transfected into wild type ESCs. The sharp transitioning behavior of
co-localization still persists; however, the transitioning threshold is shifted towards a smaller value.
The red line corresponds to the threshold in WT ESCs, and it is plotted as a reference.
[Supplementary Figure 6.6 plot. Panel title: WT ESCs. x-axis: Lin28 CDS number per cell (0 to 500).]
Supplementary Figure 6.6 Co-localization of endogenous Lin28a CDS with respect to its
3'UTR tail.
The Lin28a CDS is hybridized with Alexa probes and the 3'UTR is hybridized with Cy5 probes. The co-localization of the endogenous coding sequence with respect to its 3'UTR tail is measured. Out of 66
cells expressing fewer than 100 endogenous Lin28a transcripts, two cells are observed to have a low
co-localization percentage.
[Supplementary Figure 6.7 plots. Panel titles: WT ESC, No MEF feeder layer; Dgcr8KO ESC, No MEF feeder layer. x-axis categories: RNU6B and SNO202 (small RNA positive controls), let-7a, let-7c, miR30, miR125 and miR130 (regulating miRNAs), miR294 (reference miRNA), cel-lin4 and H2O (negative controls).]
Supplementary Figure 6.7 Taqman qRT-PCR measurements of selected mature microRNA
expression.
Expression of selected Mus musculus mature microRNAs was measured using Taqman
microRNA assays. let-7a, let-7c, miR-30, miR-125 and miR-130 are all experimentally validated
miRNAs regulating the Lin28a 3'UTR. U6 RNA RNU6B and small nucleolar RNA SNO202 were
used as normalizing controls. C. elegans miRNA cel-lin4 and water were used as negative controls.
miR-294, the most abundantly expressed miRNA in ESCs, was measured as a reference. n=3
technical replicates for all measurements.
6.5.2 Supplementary model
miRNA-mediated-decay model to explain the co-localization transition curve
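[Model diagram: the gene produces intact mRNA (m_c), which is translated into protein; intact mRNA is converted into degrading mRNA (m_i) at rate k_polyA, and degrading mRNA is converted into degraded mRNA at rate k_d.]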
Intact mRNA ($m_c$) will be tagged with RNA FISH probes against both CDS and 3'UTR (yellow
dot), and degrading mRNA ($m_i$) will only be tagged with RNA FISH probes against CDS (green
dot).
$k_{polyA}$ is the rate at which the polyA-tail is deadenylated (and some of the 3'UTR is digested), and
it depends on both microRNA mediated degradation and a 'constitutive' degradation of the
mRNA.
$k_d$ is the rate at which the mRNA is digested once the polyA-tail is removed. We assume this rate
to be constant.
The differential equations for intact and degrading mRNAs are
$$\frac{dm_c}{dt} = r - k_{polyA}\, m_c$$
$$\frac{dm_i}{dt} = k_{polyA}\, m_c - k_d\, m_i$$
where $r$ is the production rate of intact mRNA.
The steady states of the two forms of transcripts are
$$m_c = \frac{r}{k_{polyA}}, \qquad m_i = \frac{k_{polyA}\, m_c}{k_d}$$
and the co-localization percentage at a particular total transcript level is
$$\mathrm{coloc} = \frac{m_c}{m_c + m_i} = \frac{1}{1 + k_{polyA}/k_d}$$
Assume the miRNA mediated degradation follows Michaelis-Menten kinetics, i.e.
$$k_{polyA} = d_m + \frac{\theta}{\lambda + m_c}$$
$\theta$: effective concentration of miRNAs regulating Lin28a 3'UTR
$\lambda$: dissociation constant
$d_m$: constitutive degradation rate
Substituting the expression for $k_{polyA}$ into coloc gives
$$\mathrm{coloc} = \frac{1}{1 + \dfrac{d_m}{k_d} + \dfrac{\theta / k_d}{\lambda + m_c}}$$
4*
0.96-**
WT
ESC
0.4 -,i
0.3
"+++
++
.+
+*
S0.2
+++
*
*$
+Dgcr8KO
++
++
0.7Dgcr8KO
6
*4
+
0r
++**4*
*
50
/*
+,----WT
ESC
fit, effective mi RNA con=68
fit, effective miRNA con=1 3
100
150
200
250
300
decoy mRNA # per cell
Supplementary Figure 6.8 Model fitting of co-localization curve.
The average co-localization percentage at different decoy mRNA levels was calculated, and the
data were fitted with the proposed model. The effective regulating miRNA concentration in wild type ESCs
was predicted to be ~5 fold as high as that in Dgcr8-/- ESCs.
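To make the dependence of the threshold on the effective miRNA concentration concrete, the model can be evaluated numerically. The sketch below is illustrative and rests on several assumptions: it takes the model to be coloc = 1 / (1 + k_polyA/k_d) with k_polyA = d_m + theta / (lam + m_c), it uses the two fitted effective miRNA concentrations shown in the figure legend (68 and 13), and the values of lam, d_m and k_d are hypothetical placeholders, since the fitted values are not listed here. The "half transition" point, where coloc is midway between its value at m_c = 0 and its plateau, is a convenience definition introduced for this example; under the assumed model it works out to lam + theta / (k_d + d_m).

```python
def coloc(m_c, theta, lam, d_m, k_d):
    """Co-localization fraction predicted by the miRNA-mediated-decay model."""
    k_polyA = d_m + theta / (lam + m_c)  # Michaelis-Menten form of deadenylation
    return 1.0 / (1.0 + k_polyA / k_d)


def half_transition(theta, lam, d_m, k_d):
    """m_c at which coloc is midway between its m_c = 0 value and its plateau."""
    return lam + theta / (k_d + d_m)


# Hypothetical parameter values; only the two theta values come from the figure legend.
LAM, D_M, K_D = 10.0, 0.1, 1.0
THETA_HIGH, THETA_LOW = 68.0, 13.0

m_half_high = half_transition(THETA_HIGH, LAM, D_M, K_D)
m_half_low = half_transition(THETA_LOW, LAM, D_M, K_D)

# Check the closed form against the definition of the midpoint.
floor_high = coloc(0.0, THETA_HIGH, LAM, D_M, K_D)
plateau = 1.0 / (1.0 + D_M / K_D)
midpoint_high = 0.5 * (floor_high + plateau)
```

With these placeholder parameters the larger effective miRNA concentration moves the half transition point to a higher transcript number, in line with the qualitative statement that the threshold depends on the effective miRNA concentration.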
6.5.3 Supplementary sequence information
Lin28a OriUTR sequence with APA sites highlighted in yellow and canonical PAS in red
GGCCCAGGAGTCAGGGTTATTCTTTGGCTAATGGGGAGTTTAAGGAAAGAGGCATCAATCTGCAGAGT
GGAGAAAGTGGGGGTAAGGGTGGGTTGCGTGGGTAGCTTGCACTGCCGTGTCTCAGGCCGGGGTTCC
CAGTGTCACCCTGTCTTTCCTTGGAGGGAAGGAAAGGATGAGACAAAGGAACTCCTACCACACTCTATC
TGAAAGCAAGTGAAGGCTTTTGTGGGGAGGAACCACCCTAGAACCCGAGG CTTTGCCAAGTGGCTGGG
CTAGGGAAGTTCTTTTGTAGAAGGCTGTGTGATATTTCCCTTGCCAGACGGGAAGCGAAACAAGTGTCA
AACCAAGATTACTGAACCTACCCCTCCAGCTACTATGTTCTGGGGAAGGGACTCCCAGGAGCAGGGCGA
GGTTATTTTCACACCGTGCTTATTCATAACCCTGTCCTTTGGTGCTGTGCTGGGAATGGTCTCTAGCAACG
GGTTGTGATGACAGGCAAAGAGGGTGGTTGGGGAGACAACTGCTGACCTGCTGCCCACACCTCACTCC
CAGCCCTTTCTGGGCCAATGGGATTTTAATTTATTTGCTCCCTTAGGTAACTGCACCTTGGGTCCCACTTT
CTCCAGGATGCCAACTGCACTATCTACGTGCGAATGACGTATCTTGTGCG I I I I I I I I I I I I I AA1TFTTA
AAAT1TTTTCATCTTCTTAATATAAATAATGGGTTTGTATTTTTGTATATTTTAATCTTAAGGCCCTCATT
CCTGCACTGTGTTCTCAGGTACATGAGCAATCTCAGGGATAATAAGTCCGTAGCAGCTCCAGGTCTGCTC
AGCAGGAATACTTTGTT1GTTTTGTTTTGATCACCATGGAGACCAACCATTTGGAGTGCACAGCCTGTT
GAACTACCTCATTTTTGCCGATTACAGCTGGCTTTTCTGCCATAGCGTCCTTGAAAAATGTGTCTCACGGG
TTTCGATTGAGCTGCCCCAAGACTTGATCTGGATTTGGCAAAACATAGGACATCACTCTAAACAGGAAA
GGGTGGTACAGAGACATTAAAAGGCTGGGCCAGGTGAAAGGCACAAGAGGAACTTTCCATACCAGATC
CATCCTTTTGCCAGATTAGTGGAAGCCTGCCATGCACAGCAGGGTGTGAGAGAGAGAGTGTGTATGTAT
GTGTGTGTGGATTT11T1TAATGCAAATTTATGAAGACGAGGTGGGTTTTGTTTATTTGATTGC1TT1TGT
GCTGGGGATGGAATCTTGGGCTTCATTTGTGCTAGGAAGTACACTGCCACTGAGTTATCCCAGTAAGAA
TGCAACTTAAGACCAGTACCCTTATTCCCACACTGTGCTGTCCAGGCATGGGAACATGAGGCAGGGACT
CAACTCCTTAGCCTTTCACAATCTTGGCTTTCTGAGAGACTCATGAGTATGGGCCTCAGTGGCAAGTGTC
CTGCCCTGCTGTAGCGTGATGGTTGATAGCTAAAGGAAAGAGGGGGTGGGGAGTTTCGTTTACATGCTT
TGAGATCGCCACAAACCTACCTCACTGTGTTGAAACGGGACAAATGCAATAGAACACATTGGGTGGTGT
GTGTGTGTGTCTGATCTTGGTTTCTTGTCTCCCTCTCCCCCCAAATGCTGCCCTCACCCCTAGTTAATTGTA
TTCGTCTGGCCTTTGTAGGACTTTACTGTCTCTGAGTTGGTGATTGCTAGGTGGCCTAGTTGTGTAAATA
TAAATGTGTTGGTCTTCATGTTCTTTTGGGGTTTTATTGTTTACAAAACTTTTGTTGTATTGAGAGAAAAA
TAGCCAAAGCATCTTTGACAGAAAGCTCTGCACCAGACAACACCATCTGAAACTTAAATGTGCGGTCCTC
TTCTCAAAGTGAACCTCTGGGACCATGGCTTATCCTTACCTGTTCCTCCTGTGTCTCCCATTCTGGACCAC
AGTGACCTTCAGACAGCCCCTCTTCTCCCTCGTAAGAAAACTTAGGCTCATTTACTCTTTGAGCATCTCT
GTAACTCTTGAAGGACCCATGTGAAAATTCTGAAGAAGCCAGGAACCTCATTCTTTCCTTGTCCCTAACT
CAGTGAAGAGTTTTGGTTGGTGGTTTTGAGACAGGGCCTCACTCTGTAGCTGGAGATAGAGAGCCTCGG
GTTCCTGGCTCTCCTCCTGCCTTCTGCACAGAGTCCCCTGTGCAGGGATTGCAGGTGCCGCTTCTCCCTG
GCAAGACCATTTATTTCATGGTGTGATTCGCCTTTGGATGGATCAAACCAATGTAATCTGTCACCCTTAG
GTCGAGAGAAGCAATTGTGGGGCCTTCCATGTAGAAAGTTGGAATCTGGACACCAGAAAAGGGACTAT
GAATGTACAGTGAGTCACTCAGGAACTTAATGCCGGTGCAAGAAACTTATGTCAAAGAGGCCACAAGAT
TGTTACTAGGAGACGGACGAATGTATCTCCATGTTTACTGCTAGAAACCAAAGCTTTGTGAGAAATCTTG
AATTTATGGGGAGGGTGGGAAAGGGTGTACTTGTCTGTCCTTTCCCCATCTCTTTCCTGAACTGCAGGAG
ACTAAGGCCCCCCACCCCCCGGGGCTTGGATGACCCCCACCCCTGCCTGGGGTGTTTTATTTCCTAGTTG
ATTTTTACTGTACCCGGGCCCTTGTATTCCTATCGTATAATCATCCTGTGACACATGCTGACTTTTCCTTCC
ACTTATTGGTACTCCAGAGTTGGTACTG
CTTCTCTTCCCTGGGAA
Lin28a MutUTR sequence with APA sites highlighted in yellow and canonical PAS in red
GGCCCAGGAGTCAGGGTTATTATGTGGCTAATGGGGAGTTTAAGGAAAGAGGCATCAATCTGCAGAGT
GGAGAAAGTGGGGGTAAGGGTGGGTTGCGTGGGTAGCTTGAACGGACGTGTCTCAGGCCGGGGTTCC
CAGTGTCACCCTGTCTTTCCTTGGAGGGAAGGAAAGGATGAGGCAAAGGAACTCCTACCACACTCTATC
TGAAAGCAAGTGAAGGCTTTTGTGGGGGAGGAACCACCCTAGAACCCGAGGCTTTGACCAGTGGCTGG
GCTAGGGAAGTTCTTTTGTAGAAGGCTGTGTGATATTTCCCTTGCCAGACGGGAAGCGAAACAAGTGTC
AAACCAAGATTACTGAACCTACCCCTCCAGCTACTATGTTCTGGGGAAGGGACTCCCAGGAGCAGGACG
AGGTTATTTTCACACCGTGCTTATTCATAACCCTGTCCTTTGGTGCTGTGCTGGGAATGGTCTCTAGCAAC
GGGTTGTGATGACAGGCAAAGAGGGTGGTTGGGGGAGACAACTGCAGACCTTCGGCCCACACCTCACT
CCCAGCCCTTTCTGGGCCAATGGGATTTTAATTTATTTGCTCCCTTAGGTAACTGCAACGTGGGTCCCACT
TTCTCCAGGATGCCAACTGAACGATCTACGTGCGAATGACGTATCTTGTGCGTTC I I I I I I I I I IIAATTTT
TAAAATTTTTTTTCCTCTTCTTAAAATAAGTAATGGGTTTGTATTTTTTTCTATTTTAATCTTCCGGCCCTCA
TTCCTGCCCTTTGTTCTCAGGTACATGAGCAATCTCCGTGATAATAAGTCCGTAGCAGCTCCAGGTCTGCT
CAGCCGTAATACTTTGTTTTTTGTTTTGATCACCATGGAGACCAACCATTTGGAGTGCACAGCCTGTT
GAACTAACGCATTTUTGCCGATTACAGCTGGCTTTTCTGCAAGAGCGTCCTTGAAAAATGTGTCTCACGG
GTTTCGATTGAGCTGCCCCAAGACTTGATCTGGATTTGGCAAAACATAGGACATCACTCTAAACAGGAA
AGGGTGGTACAGAGACATTAAAAGGCTGGGCCAGGTAAAAGGCACAAGAGGAACTTTCCATACCAGAT
CCATCCTTTTGCCAGATTAGTGGAAGCCTGCCATGCACAGCCGTGTGTGAGAGAGAGAGTGTGTATGTA
TGTGTGTGTGGATTTTTTTTAATTCCAATTTATGAAGACGAGGTGGGTTTTGTTTATTTGATTGC1T1T1GT
GCTGGGGATAGAATCTTGGGCTTCATTTGTGCTAGGAAGTACACGGACACTGAGTTATCCCAGTAAGAA
TTCCACTTAAGACCAGTACCCTTATTCCCACACTGTGCTGTCCAGGCATGGGAACATGAGGCAGGGACTC
AACTCCTTAGCCTTTCACAATCTTGGCTTTCAGAGAGACTCATGAGTATGGGCCTCAGTGGCAAGTGTCC
TGCCCTTCGGTAGCATGATGGTTGATAGCTAAAGGAAAGAGGGGGTGGGGAGTTTCGTTGAAATGCTG
TTAGATCGCCAGAAACCTAACGCACTGTGTTGAAACGGGACAAATTCCATAGAACACATTGGGTGGTGT
GTGTGTGTGTCTGATCTTGGTTTCTTGTCTCCCTCTCCCCCCAAATTCGGCCCTCACCCCTAGTTAATTGTA
TTCGTCTGGCCTTTGTAGGACTTTTACTGTCTCTGAGTTGGTGATTGCTAGGTGGCCTAGTTGTGTAAATA
TAAATGTGTTGGTCTTCATGTTCTTTTGGGGTTTTATTGTTGAAAAAACTTTTGTTGTATTGAGAGAAAAA
TAGCCAAAGCATCTTTGACAGAAAGCTCTGCACCAGACAACACCATCTGAAACTTAAATGTGCGGTCCTC
TTCTCAAAGTGAACCTCTGGGACCATGGCTTATCCTTACCTGCTCCTCCTGTGTCTCCCATTCTGGACCAC
AGTGACCTTCAGACAGCCCCTCTTCTCCCTCGTAAGAAAACTTAGGCTCATTTACTTCTTTGAGCATCTCT
GTAACTCTTGAAGGACCCAGGTTAAAATTCTGAAGAAGCCAGGAACCTCATTATGTCCTTGTCCCTAACT
CAGTGAAGAGTTTTGGTTGGTGGTTGTTAGACAGGGCCTCACTCTGTAGCTGGAGATAGAGAGCCTCGG
GTTCCTGGCTCTCCTCCTGCCTTCTGCACAGAGTCCCCTGTGCAGGGCTTGCAGGTGCCGCTTCTCCCTG
GCAAGACCATTTATTTCATGGTGTGATTCGCCTTTGGATGGATCAAACCAATGTAATCTGTCACCCTTAG
GTCGAGAGAAGCAATTGTGGGGCCTTCCATGTAGAAAGTTGGAATCTGGACACCAGAAAAGGGACTAT
GACTTTACAGTGAGTCACTCAGGAACTTAATGCCGGTGCAAGAAACTTATGTCAAAGAGGCCACAAGAT
TGTTACTAGGAGACGGACGACTTTATCTCCATGTTGAATGCTAGAAACCAAAGCTTTGTGAGAAATCTTG
AATTTATGGGGAGGGTGGGAAAGGGTGTACTTGTCTGTCCTTTCCCCATCTCTTTCCTGAACTGCAGGAG
ACTAAGGCCCCCCACCCCCCGGGGCTTGGATGACCCCCACCCCTGCCTGGGGTGTTTTATTTCCTAGTTG
ATTTTTAATGGACCCGGGCCCTTTTCTTCCTATCGTATAATCATCCTGTGACACATGCTGACTTTTCCTTCC
ACTTATTGGTACTCCAGAGTTGGGAATG
CTTCTCTTCCCTGGGAA
6.6 References
Derti, A., Garrett-Engele, P., MacIsaac, K.D., Stevens, R.C., Sriram, S., Chen, R., Rohl, C.A.,
Johnson, J.M., and Babak, T. (2012). A quantitative atlas of polyadenylation in five mammals.
Genome Research.
Elkon, R., Ugalde, A.P., and Agami, R. (2013). Alternative cleavage and polyadenylation: extent,
regulation and function. Nat Rev Genet 14, 496-506.
Ji, Z., Lee, J.Y., Pan, Z., Jiang, B., and Tian, B. (2009). Progressive lengthening of 3' untranslated
regions of mRNAs by alternative polyadenylation during mouse embryonic development.
Proceedings of the National Academy of Sciences of the United States of America 106, 7028-7033.
Ji, Z., and Tian, B. (2009). Reprogramming of 3' Untranslated Regions of mRNAs by Alternative
Polyadenylation in Generation of Pluripotent Stem Cells from Different Cell Types. PLoS ONE 4,
e8419.
Mayr, C., and Bartel, D.P. (2009). Widespread shortening of 3' UTRs by alternative cleavage
and polyadenylation activates oncogenes in cancer cells. Cell 138, 673.
Sandberg, R., Neilson, J.R., Sarma, A., Sharp, P.A., and Burge, C.B. (2008). Proliferating Cells
Express mRNAs with Shortened 3' Untranslated Regions and Fewer MicroRNA Target Sites.
Science 320, 1643-1647.
Shell, S.A., Hesse, C., Morris, S.M., and Milcarek, C. (2005). Elevated Levels of the 64-kDa
Cleavage Stimulatory Factor (CstF-64) in Lipopolysaccharide-stimulated Macrophages Influence
Gene Expression and Induce Alternative Poly(A) Site Selection. Journal of Biological Chemistry
280, 39950-39961.
Shi, Y. (2012). Alternative polyadenylation: New insights from global analyses. RNA 18, 2105-2117.
Tian, B., Hu, J., Zhang, H., and Lutz, C.S. (2005). A large-scale analysis of mRNA polyadenylation
of human and mouse genes. Nucleic Acids Research 33, 201-212.
Chapter 7 Conclusions and Perspectives
In summary, this thesis constructed a reporter system for the 3'UTRs of genes to investigate the
combinatorial effect of miRNA regulation on endogenous targets. MutUTR was proposed as a
general, effective and flexible miRNA-unregulated control, which requires no genetic modification of
the cellular background. MicroRNA regulation at the transcriptional and translational
level was quantified at single cell resolution over a target expression range of more than 100 fold.
Its first order (repression strength) and second order (noise control) effects were quantified, and
the potential mechanisms and consequences were discussed. The reporter system could also
be used as a natural sponge to titrate away miRNAs. Combined with high throughput techniques,
miRNA-mediated crosstalk could be studied at the genome-wide level. A novel expression pattern
was discovered for GFP-Lin28a3'UTR, and the phenomenon was miRNA dependent. We believe
that the results of this work represent an important step in the quantification of miRNA regulation at
the single cell level. Below we discuss some ideas for possible directions that may result from
our work.
7.1 Future Directions
Chapter 2 During the design of MutUTRs, miRNA response elements (MREs) with a significant
context score (targeting efficacy) and conserved probability were mutated irrespective of the
expression of the targeting miRNAs in ESCs. Thus in principle, MutUTRs could be used in cell
contexts other than ESCs. The versatility of MutUTRs remains to be validated in other systems.
The empirically chosen thresholds used in the mutation algorithm for the selection of mutation sites
could be fine-tuned or tailored in the future. The mutation algorithm could also be easily adapted to
study the effect of a particular miRNA, or of other regulatory elements such as AU-rich elements.
Chapter 3 The quantification of miRNA regulation at the transcriptional and translational level
can be extended in the following directions. Models can be built to explain the observed regulation
transfer functions at both levels. Previously, a molecular titration model was built to fit the
threshold behavior of miRNA regulation (Mukherji et al., 2011). It explains the decrease of
miRNA regulation in the high target expression region, but this model misses the initial increase of
regulation in the low target expression region for certain targets.
Of course, additional experimental support is needed to corroborate this initial increase. Any
conclusion extracted from the low target expression region is susceptible to background issues,
and a background-free method would be especially helpful. Targeted mass spectrometry (MS)
could be further pursued. Alternatively, a luciferase reporter, which is also background
free, could be constructed.
It might be especially useful to link our quantitative observations with molecular mechanisms.
Certain miRNA regulation pathway mutants (e.g. GW182 mutants), or constructs that specifically
eliminate the possibility of miRNA regulation at a certain stage (e.g. decapped or IRES containing
constructs), could be especially useful.
Recent studies reveal another dimension of miRNA regulation, and show that translational
inhibition and transcriptional degradation dominate at different times after the induction of miRNA
activity (Eichhorn et al., 2014). The reporter system could easily be applied to study this, adding
a temporal dimension to reveal the full map of miRNA regulation dynamics.
It is still an unresolved issue why genome-wide (Baek, 2008; Eichhorn et al., 2014; Guo et al.,
2010; Hendrickson, 2009; Selbach, 2008) and single gene analyses (Behm-Ansmant, 2006; Eulalio,
2007; Filipowicz et al., 2008; Poy et al., 2004; Zhao et al., 2005) usually arrive at different
conclusions about the relative contribution of transcriptional regulation. The differences could
come from the single miRNA perturbations commonly used in genome-wide assays, and from the
combinatorial effect of miRNA regulation on endogenous genes. We hypothesize that the
discrepancy might merely reflect different modes of regulation in different target expression
regions. MicroRNAs preferentially target lowly expressed genes, while selectively avoiding
ubiquitous and highly expressed genes (Farh, 2005; Sood et al., 2006), whereas reporter assays usually
overexpress the reporter construct. If the initial increase in translational repression is real, the
difference between the two approaches simply reflects the change in the relative transcriptional
contribution between the low and medium/high target expression regions.
Chapter 4 We have studied miRNA regulation of target expression noise at protein level. The
next step is to explore noise control at mRNA level, and how noise propagate from one level to
the next.
Our initial studies show that miRNAs decrease protein noise at low protein expression, but increase
noise at high protein expression (Figure 7.1a; Schmiedel et al., 2015). The difference in
mRNA expression noise is less obvious, and the two curves overlap over the measurable region
(Figure 7.1b). If we compare transcript noise and protein noise at the same target abundance (i.e.
the same indicator protein level), we observe that translation from few transcripts increases noise,
while translation can suppress noise at high transcript levels. The crossover of mRNA and protein
noise occurs for both OriUTR (regulated) and MutUTR (unregulated). The crossover is shifted to
higher target abundance for OriUTR, which makes sense because miRNA-mediated transcript
degradation reduces the effective level of OriUTR transcripts (Figure 7.2). Here we present only
the Casp2 3'UTR as an example; other UTRs yield similar results (data not shown). It is of interest
to understand the mechanisms behind this observation and its biological consequences.
[Figure 7.1 plot: panel (a) protein noise versus GFP protein; panel (b) mRNA noise versus GFP mRNA; curves for Casp2 OriUTR and Casp2 MutUTR.]
Figure 7.1 miRNA regulation of protein and mRNA noise.
pCAG-d2eGFP-Casp2Ori/MutUTR was co-transfected with pCAG-mCherry into wild-type
mESCs. (a) miRNAs decrease protein noise at low protein expression, but increase noise at high
protein expression. (b) The expression noise of OriUTR and MutUTR overlaps at the mRNA level.
[Figure 7.2 plots: mRNA and protein (Pro) noise versus log10(mCherry) for Casp2 Ori (regulated) and Casp2 Mut (unregulated).]
Figure 7.2 Noise propagation from mRNA to protein level for different target abundance.
pCAG-d2eGFP-Casp2Ori/MutUTR was co-transfected with pCAG-mCherry into wild-type
mESCs. Transcript noise and protein noise were quantified for different indicator protein levels.
Translation from few transcripts increases noise, while it suppresses noise at high transcript levels.
The crossover of mRNA and protein noise occurs for both OriUTR (regulated) and MutUTR
(unregulated). The crossover is shifted to higher target abundance for OriUTR.
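The comparison of transcript and protein noise at matched indicator protein levels can be sketched as follows. This is a minimal illustration and not the analysis pipeline behind Figures 7.1 and 7.2: it assumes that noise is quantified as the coefficient of variation (standard deviation divided by mean) within bins of log10(mCherry), and the function name, bin count, and minimum cell count are hypothetical choices.

```python
import numpy as np


def binned_noise(indicator, target, n_bins=10, min_cells=20):
    """Coefficient of variation of `target` within bins of log10(indicator).

    Assumption: noise = std / mean per bin. Bins with fewer than
    `min_cells` cells, or with a non-positive mean, are skipped.
    Returns (bin centers in log10 units, noise per bin).
    """
    indicator = np.asarray(indicator, dtype=float)
    target = np.asarray(target, dtype=float)

    # Drop cells without a positive indicator signal before taking the log.
    keep = indicator > 0
    log_ind = np.log10(indicator[keep])
    target = target[keep]

    # Equal-width bins on the log scale; the interior edges give
    # bin indices 0 .. n_bins - 1 for every cell.
    edges = np.linspace(log_ind.min(), log_ind.max(), n_bins + 1)
    bin_index = np.digitize(log_ind, edges[1:-1])

    centers, noise = [], []
    for b in range(n_bins):
        values = target[bin_index == b]
        if values.size < min_cells or values.mean() <= 0:
            continue
        centers.append(0.5 * (edges[b] + edges[b + 1]))
        noise.append(values.std(ddof=1) / values.mean())
    return np.array(centers), np.array(noise)


if __name__ == "__main__":
    # Synthetic stand-in data (not measurements from this work).
    rng = np.random.default_rng(0)
    mcherry = rng.lognormal(mean=9.0, sigma=1.0, size=5000)
    gfp = rng.poisson(0.01 * mcherry)
    for center, cv in zip(*binned_noise(mcherry, gfp)):
        print(f"log10(mCherry) = {center:.2f}, noise = {cv:.3f}")
```

Applying the same function to single-cell mRNA counts and to protein fluorescence, each paired with the same indicator channel, would give two noise curves that can be compared bin by bin to locate a crossover.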
Chapter 5 No miRNA-mediated crosstalk was found for targets of the miRNAs that have MREs
on the Lats2a 3'UTR, even under the highest decoy expression condition. Modeling predicts that
as either miRNA abundance or the number of miRNA-binding sites increases, miRNAs become
increasingly refractory to competition by changes in the concentration of individual RNA target
species (Ala et al., 2013; Mukherji et al., 2011; Wee et al., 2012). In the case of Lats2a 3'UTR
overexpression, even though the number of added MREs was comparable to or much higher than
the abundance of the regulating miRNAs, the high abundance of endogenous target sites prevented
effective crosstalk. Our result is consistent with the recent model proposed by Denzler et al.
(2014): the number of added MREs required for miRNA target derepression is independent of
miRNA levels, but depends on endogenous target site abundance, which usually exceeds that of the
miRNAs. ceRNAs must begin to approach the target site abundance of a miRNA before they can
exert a consequential effect on the repression of targets of that miRNA, which rarely occurs in
vivo. It is therefore natural to wonder why siRNA knockdown of endogenous PTEN, which is
present at fewer than 40 transcripts per cell (Apratim Sahay, personal communication), is able to
induce miRNA-mediated crosstalk. The sensitivity might be attributable to the fact that PTEN is a
haploinsufficient tumor suppressor, and even a 20% decrease in expression can promote cancer
growth (Alimonti et al., 2010). Downstream signaling pathways such as PI3K/AKT
phosphorylation might be able to amplify the signal. Other mechanisms, such as subcellular
localization and the resulting local enrichment of regulating elements, and ceRNA network
topology, might also augment the crosstalk effect. Some miRNA decoys employ specific
mechanisms to boost their sponging potency. The ncRNA HSUR-1 has miRNA catalytic
activity, and can elicit degradation of the bound miRNAs (Cazalla et al., 2010). The circRNA
CDR1as contains tandem (>70) binding sites for miR-7, and its circular structure is resistant to
nuclease activity and is especially stable (Hansen et al., 2013; Memczak et al., 2013).
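The argument that a large pool of endogenous target sites buffers against added decoy sites can be illustrated with a toy equilibrium titration calculation. This is only a sketch under stated assumptions, not the model of Denzler et al. (2014) or Mukherji et al. (2011): it assumes one-to-one reversible binding, a single dissociation constant shared by all sites, and site occupancy as a stand-in for repression. All function names and parameter values are hypothetical.

```python
import math


def bound_complex(mirna_total, sites_total, kd):
    """Equilibrium amount of miRNA:site complex for one-to-one binding.

    Solves C^2 - (M + S + Kd) * C + M * S = 0 for the physical root.
    All quantities must share the same units (e.g. copies per cell).
    """
    s = mirna_total + sites_total + kd
    return 0.5 * (s - math.sqrt(s * s - 4.0 * mirna_total * sites_total))


def site_occupancy(mirna_total, endogenous_sites, added_sites, kd):
    """Fraction of all target sites bound by miRNA (assumed equal affinity)."""
    total_sites = endogenous_sites + added_sites
    return bound_complex(mirna_total, total_sites, kd) / total_sites


def relative_occupancy_drop(mirna_total, endogenous_sites, added_sites, kd):
    """Fractional loss of site occupancy caused by adding decoy sites."""
    before = site_occupancy(mirna_total, endogenous_sites, 0.0, kd)
    after = site_occupancy(mirna_total, endogenous_sites, added_sites, kd)
    return 1.0 - after / before


if __name__ == "__main__":
    # Illustrative numbers only: 1000 miRNA copies, 1000 added decoy sites.
    for endogenous in (100.0, 1000.0, 10000.0):
        drop = relative_occupancy_drop(1000.0, endogenous, 1000.0, 100.0)
        print(f"endogenous sites = {endogenous:.0f}, occupancy drop = {drop:.3f}")
```

In this toy model the same number of added sites produces a smaller relative loss of occupancy when the endogenous site pool is larger, which is the qualitative behavior the paragraph above describes.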
Chapter 6 Alternative polyadenylation (APA) was proposed to explain the observed
co-localization transition behavior of GFP-Lin28a 3'UTR transcripts. Northern blot, qRT-PCR or
3' RACE (rapid amplification of cDNA ends) could be immediately applied to evaluate the
possibility of APA (Elkon et al., 2013). If APA indeed explains the observed phenomenon,
mutation of the alternative poly(A) sites could be employed to study the transition quantitatively
in detail. Additionally, progressive lengthening of 3'UTRs by APA modulation was observed during
mouse embryonic development (Ji et al., 2009). The high incidence of APA in cancer cell lines,
with a consequential loss of 3'UTR miRNA response elements, suggests a pervasive role for APA
in oncogene activation without genetic alteration (Mayr and Bartel, 2009). Examples include
HMGA2, one of the endogenous UTRs used in our previous studies (Mukherji et al., 2011). Thus
it would be intriguing to label different regions of APA transcripts with probes of different colors,
and to study their co-localization quantitatively at the single-cell level upon ES cell
differentiation or oncogenic transformation. For the APA forms of miRNA targets, it is especially
interesting to study how the loss of miRNA target sites affects transcript stability and translation
efficiency, and what the biological consequences are. The only caveat is that the transcript region
between alternative polyadenylation sites has to be long enough for the probe sets to be resolved
as diffraction-limited spots under the microscope.
7.2 References
Ala, U., Karreth, F.A., Bosia, C., Pagnani, A., Taulli, R., Leopold, V., Tay, Y., Provero, P.,
Zecchina, R., and Pandolfi, P.P. (2013). Integrated transcriptional and competitive endogenous
RNA networks are cross-regulated in permissive molecular environments. Proceedings of the
National Academy of Sciences 110, 7154-7159.
Alimonti, A., Carracedo, A., Clohessy, J.G., Trotman, L.C., Nardella, C., Egia, A., Salmena, L.,
Sampieri, K., Haveman, W.J., Brogi, E., et al. (2010). Subtle variations in Pten dose determine
cancer susceptibility. Nat Genet 42, 454-458.
Baek, D. (2008). The impact of microRNAs on protein output. Nature 455, 64-71.
Behm-Ansmant, I. (2006). mRNA degradation by miRNAs and GW182 requires both CCR4:NOT
deadenylase and DCP1:DCP2 decapping complexes. Genes Dev 20, 1885-1898.
Cazalla, D., Yario, T., and Steitz, J.A. (2010). Down-regulation of a host microRNA by a
Herpesvirus saimiri noncoding RNA. Science 328, 1563-1566.
Denzler, R., Agarwal, V., Stefano, J., Bartel, D.P., and Stoffel, M. (2014). Assessing the ceRNA
Hypothesis with Quantitative Measurements of miRNA and Target Abundance. Molecular Cell
54, 766-776.
Eichhorn, S.W., Guo, H., McGeary, S.E., Rodriguez-Mias, R.A., Shin, C., Baek,
D., Hsu, S.-h., Ghoshal, K., Villen, J., and Bartel, D.P. (2014). mRNA Destabilization Is the
Dominant Effect of Mammalian MicroRNAs by the Time Substantial Repression Ensues.
Molecular Cell 56, 104-115.
Elkon, R., Ugalde, A.P., and Agami, R. (2013). Alternative cleavage and polyadenylation: extent,
regulation and function. Nat Rev Genet 14, 496-506.
Eulalio, A. (2007). Target-specific requirements for enhancers of decapping in miRNA-mediated
gene silencing. Genes Dev 21, 2558-2570.
Farh, K.K. (2005). The widespread impact of mammalian microRNAs on mRNA repression and
evolution. Science 310, 1817-1821.
Filipowicz, W., Bhattacharyya, S.N., and Sonenberg, N. (2008). Mechanisms of
post-transcriptional regulation by microRNAs: are the answers in sight? Nat Rev Genet 9, 102-114.
Guo, H., Ingolia, N.T., Weissman, J.S., and Bartel, D.P. (2010). Mammalian microRNAs
predominantly act to decrease target mRNA levels. Nature 466, 835-840.
Hansen, T.B., Jensen, T.I., Clausen, B.H., Bramsen, J.B., Finsen, B., Damgaard, C.K., and Kjems,
J. (2013). Natural RNA circles function as efficient microRNA sponges. Nature 495, 384-388.
Hendrickson, D.G. (2009). Concordant regulation of translation and mRNA abundance for
hundreds of targets of a human microRNA. PLoS Biol 7, e1000238.
Ji, Z., Lee, J.Y., Pan, Z., Jiang, B., and Tian, B. (2009). Progressive lengthening of 3' untranslated
regions of mRNAs by alternative polyadenylation during mouse embryonic development.
Proceedings of the National Academy of Sciences of the United States of America 106, 7028-7033.
Schmiedel, J.M., Klemm, S.L., Zheng, Y., Sahay, A., Blüthgen, N., Marks, D.S., and
van Oudenaarden, A. (2015). miRNA control of protein expression noise. Science.
Mayr, C., and Bartel, D.P. (2009). Widespread shortening of 3' UTRs by alternative cleavage and
polyadenylation activates oncogenes in cancer cells. Cell 138, 673.
Memczak, S., Jens, M., Elefsinioti, A., Torti, F., Krueger, J., Rybak, A., Maier, L., Mackowiak,
S.D., Gregersen, L.H., Munschauer, M., et al. (2013). Circular RNAs are a large class of animal
RNAs with regulatory potency. Nature 495, 333-338.
Mukherji, S., Ebert, M.S., Zheng, G.X.Y., Tsang, J.S., Sharp, P.A., and van Oudenaarden, A.
(2011). MicroRNAs can generate thresholds in target gene expression. Nat Genet 43, 854-859.
Poy, M.N., Eliasson, L., Krutzfeldt, J., Kuwajima, S., Ma, X., MacDonald, P.E., Pfeffer, S., Tuschl,
T., Rajewsky, N., Rorsman, P., et al. (2004). A pancreatic islet-specific microRNA regulates
insulin secretion. Nature 432, 226-230.
Salmena, L., Poliseno, L., Tay, Y., Kats, L., and Pandolfi, P.P. (2011). A ceRNA hypothesis: the
Rosetta stone of a hidden RNA language? Cell 146, 353-358.
Selbach, M. (2008). Widespread changes in protein synthesis induced by microRNAs. Nature 455,
58-63.
Sood, P., Krek, A., Zavolan, M., Macino, G., and Rajewsky, N. (2006). Cell-type-specific
signatures of microRNAs on target mRNA expression. Proceedings of the National Academy of
Sciences of the United States of America 103, 2746-2751.
Wee, L.M., Flores-Jasso, C.F., Salomon, W.E., and Zamore, P.D. (2012). Argonaute Divides Its
RNA Guide into Domains with Distinct Functions and RNA-Binding Properties. Cell 151,
1055-1067.
Zhao, Y., Samal, E., and Srivastava, D. (2005). Serum response factor regulates a muscle-specific
microRNA that targets Hand2 during cardiogenesis. Nature 436, 214-220.