Investigating the competing endogenous RNA hypothesis
Genome-wide and in Single Cells
by
Apratim Sahay
B.S. in Physics and Mathematics, University of Chicago (2008)
Submitted to the Department of Physics
in partial fulfillment of the requirements for the degree of
DOCTOR OF PHILOSOPHY
at the
MASSACHUSETTS INSTITUTE OF TECHNOLOGY
June 2015
© Massachusetts Institute of Technology 2015. All rights reserved.
Author: Signature redacted
Department of Physics
May 22nd, 2015
Certified by: Signature redacted
Alexander van Oudenaarden
MIT Professor of Physics and Professor of Biology
Director, Hubrecht Institute for Developmental Biology
Thesis Supervisor
Certified by: Signature redacted
Jeff Gore
Latham Family Career Development Assistant Professor of Physics
Thesis Supervisor
Accepted by: Signature redacted
Professor Nergis Mavalvala
Associate Department Head of Physics
Investigating the competing endogenous RNA hypothesis Genome-wide
and in Single Cells
by
Apratim Sahay
Submitted to the Department of Physics
on May 22nd, 2015, in partial fulfillment of the requirements for the degree of
DOCTOR OF PHILOSOPHY
Abstract
The observation that microRNAs (miRNAs), through a titration mechanism, can couple interactions of their common targets (competing endogenous RNAs, or ceRNAs) has
prompted a general "ceRNA hypothesis" that RNAs can regulate each other indirectly
through global RNA-miRNA-RNA networks. These ceRNAs are said to "crosstalk" with
each other by competing for common miRNAs. Although many individual ceRNAs have
been found, fundamental questions about both the magnitude and generality of the crosstalk
effect remain. In our study we combine RNA sequencing and single-molecule FISH (smFISH)
approaches to both measure the magnitude of the crosstalk effect genome-wide by perturbing three known ceRNAs (Pten, Vapa, Cnot6l) and to identify mechanisms by which the
crosstalk effect acts. We identify hundreds of putative ceRNAs and dissect the contributions
of individual miRNAs in transmitting crosstalk. We demonstrate that while the crosstalk
effect is pervasive, it nevertheless remains bounded by the size of the perturbation. Furthermore, we show that both the number and the affinity of miRNA binding sites shared between
targets are crucial in determining the crosstalk strength. Using the smFISH
data, we examined the single-cell gene expression profiles of pairs of ceRNAs and found that
ceRNA gene expression is correlated only in the presence of active miRNAs. Additionally, on
inspecting the intra-cellular localization of RNA molecules, we found a miRNA-dependent
colocalization of ceRNAs, suggesting a new signature of crosstalk between ceRNAs that
extends and modifies the original hypothesis.
Thesis Supervisor: Alexander van Oudenaarden
Title: MIT Professor of Physics and Professor of Biology
Director, Hubrecht Institute for Developmental Biology
Thesis Supervisor: Jeff Gore
Title: Latham Family Career Development Assistant Professor of Physics
This work is dedicated to my grandparents
Gaur Priya Devi & Krishnanand Sahay,
Veena Srivastava & Shailendra Nath Srivastava
who instilled in me their love for the life of the mind
and the desire to share its fruits with others.
Acknowledgements
This thesis would not have been possible without the help, encouragement and support of
many people to whom I owe a debt of gratitude. First and foremost, Alexander van Oudenaarden, my thesis advisor, who welcomed
me into his lab and gave me great freedom and
support throughout my PhD. Alexander's grasp of experimental biophysics is truly broad
and deep, which I found as he led the lab through the smFISH era, the RNA sequencing
era and the single-cell sequencing era. Not only was he an inspiring scientist, but he also
created a fantastic group of enormously talented students and post-docs in building 68 that
buzzed with stimulating ideas. After introducing me to microRNAs and suggesting an experimental plan of attack, he then stepped back to let me find my own way. Always there to
offer a suggestion, to share in excitement or to help think through a problem, he has been
a great mentor. After his move to Utrecht, he offered me numerous opportunities to visit
him there and work with another set of fantastic people. Finally, I am also thankful for the
opportunity as a graduate student to be able to make mistakes. I will be forever grateful
for Alexander's limitless patience throughout this process.
I sincerely thank my thesis committee members, Jeff Gore, Jeremy England and Mehran
Kardar for their support and advice throughout my graduate years. Jeff in particular for
his blend of unflappable enthusiasm and guidance during some of the more trying phases of
research.
Next, my wonderful collaborators - Joern Schmiedel, Yannan Zheng, Sandy Klemm,
Dominic Gruen. Joern came to MIT a year into my thesis project and has helped shape
and sharpen my ideas tremendously. His enthusiasm and dogged persistence in solving
problems were a great boost whenever I was stuck in dark alleys. Yannan and I started
and finished our PhDs together and have been through all the ups and downs of graduate
student life together. She taught me a lot about microRNA biology and was an invaluable
source of experimental guidance, especially cell culturing and cloning. Dominic helped set up
the RNA Sequencing pipeline in Utrecht and generously shared his expertise in microRNA
bioinformatic analysis. Sandy was a fantastic friend, a critical sounding board for hypotheses,
and taught me the intricacies of live-cell FACS sorting.
My graduate life would not have been half as much fun without the tremendous people
at the AvO lab: Dong Hyun Kim for mentoring me in worm biology when I first came to the
lab and training me in the dark arts of FISH. He and Christoph Engert were vital founts
of friendship, mentorship and cheer. To the postdocs: Stefan, Jeroen, Magda, Nick, Nikolai,
Lenny, Shalev, Philipp, Anna, Gregor, Arjun, Scott, who took the time to provide critical
advice on experiments, research, and life. To the amazing graduate students in the lab
office who shared all the joy and frustrations of research. You made the AvO lab fun and
exciting: Ruizhen, Miaoqing, Bernardo, Clinton, Ni, Annnalisa, Dylan, Kay, Juan, Shankar,
Hyun. Lastly, Monica Wolf, Annemiek van Rooijen, Crystal, Cathy and Katie who have
meticulously taken care of any and all administrative issues that have cropped up.
During my time at MIT, I've been lucky to have some wonderful roommates and friends, Michelle, Andrew, Andrew Stecker, Arghavan and David, who have been fantastic at keeping a
balanced life. Friends on the squash courts who have offered huge support and camaraderie
over the years, thank you for helping me maintain my sanity: Najib, Ann, Pam, Jan, Frans,
Christopher, Christoph, Justin, Mehmood.
Finally I would like to thank my parents Aparajita and Avinash, for being so amazingly supportive throughout my entire academic career, and life in general, and providing
countless opportunities to me. My sisters Ananya and Apoorva for your love and feigned
excitement at my research! My cousins, Sunny, Pranay, and Abhilash for their encouragement and shared geekdom. My extended family in India for their tremendous support over
the years. Lastly, my wife Liz, without whom I would never have been introduced to the
world of biology, and without whose unwavering support none of this would have happened.
Your intelligence, encouragement and limitless love make all things possible.
Table of Contents
Acknowledgements
List of Figures
1 Introduction
1.1 MicroRNAs - discovery, biogenesis, target binding and competition
1.1.1 Discovery of miRNA Regulation
1.1.2 Biogenesis of miRNAs
1.1.3 miRNAs: target binding and competition
1.2 ceRNAs: Discovery
1.2.1 Different types of endogenous ceRNAs
1.2.2 3'UTRs as ceRNAs
1.2.3 Circular RNAs
1.2.4 Pseudogenes as ceRNAs
1.2.5 Long non coding (lncRNA) as ceRNAs
1.3 Modulators of crosstalk activity
1.3.1 Abundance of miRNA binding sites and miRNA concentration
1.3.2 MiRNA binding affinity
1.3.3 MRE Accessibility and Local concentrations
1.3.4 Post-transcriptional network effects
1.4 Summary and Outline
2 Assessment of the ceRNA hypothesis with integrated genome-wide measurements reveals bounded yet pervasive crosstalk activity
2.1 Results
2.1.1 ODE biochemical model of crosstalk predicts that crosstalk strength should be bounded by 1
2.1.2 Quantification of crosstalk following siRNA knockdown of sender
2.1.3 Pervasive yet bounded mRNA Crosstalk upon siRNA knockdown
2.1.4 Crosstalk strength correlates with the number of shared binding sites
2.1.5 miRNAs hierarchically contribute to transmitting crosstalk
2.1.6 Pten miRNAs have the greatest crosstalk power due to high [miRNA]:Target abundance ratios
2.1.7 Transfecting Pten UTR as a sponge de-represses putative ceRNAs in a dose-dependent and miRNA-dependent manner
2.2 Discussion and conclusions
2.3 Methods and Materials
2.3.1 Cell culture and siRNA Transfection
2.3.2 RNA extraction
2.3.3 RT-PCR
2.3.4 Reporter Plasmid Construction
2.3.5 Transient Transfection of plasmid
2.3.6 FACS sorting
2.3.7 RNA Sequencing
2.3.8 RNASeq Data Analysis
2.3.9 miRNA-mRNA Target prediction
2.3.10 miRNA expression Data sources
2.3.11 Target Abundance and Sequestration estimation
2.3.12 GO term analysis
2.3.13 TMM (Trimmed Mean of M-values) Normalization
2.4 Supplementary Figures and Tables
3 A single molecule analysis of ceRNAs reveals miRNA-dependent correlation and colocalization
3.1 Results
3.1.1 Quantification of gene expression for Pten, Vapa and Cnot6l in single cells with 3-colour smFISH
3.1.2 Presence of shared miRNAs generates correlated fluctuations of Pten ceRNAs in single cells
3.1.3 Pten, Vapa, Cnot6l are mutually reciprocal ceRNAs
3.1.4 Individual molecules of Pten ceRNAs are colocalized in a miRNA-dependent manner
3.2 Discussion
3.3 Methods
3.3.1 Fluorescent in situ hybridization and imaging
3.3.2 Image analysis
3.3.3 siRNA transfection and cell culturing
4 MicroRNA-mediated control of protein expression noise
4.1 Background
4.2 Effects of microRNAs on gene expression noise
4.3 Conclusions
5 Conclusions and Future Directions
References
Appendix A: Mathematical model of microRNA regulation by Joern Schmiedel
List of Figures
1.1 Canonical miRNA biogenesis pathway (adapted from Davis-Dusenbery & Hata, 2010)
1.2 Logic of the ceRNA language (adapted from Salmena, 2011)
1.3 Various types of validated competing endogenous RNAs (adapted from Tay & Pandolfi, 2014)
1.4 Extensive co-targeting of miRNAs - many targets share miRNA binding sites (adapted from Obermayer (2014)). The color of the edges indicates the number of pairs which share a given pair of miRNAs while the size of the nodes indicates the total number of shared targets for a given miRNA
2.1 ODE biochemical model of a miRNA mediated crosstalk predicts that crosstalk strength should be bounded
2.2 siRNA knockdown of 3 different endogenous senders shows crosstalk strength is bounded by 1
2.3 Crosstalk is miRNA-mediated and pervasive on a genome-wide scale
2.4 Crosstalk strength of receivers with sender CNOT6L does not depend on their predicted number of shared binding sites with CNOT6L. Related to Figure 2.5
2.5 Crosstalk strength of receivers correlates with the predicted number of miRNA binding sites shared with the sender
2.6 Dissecting relative contributions of miRNAs in transmitting crosstalk
2.7 Greater miRNA:Target ratios underlie Pten's superior ability to send crosstalk
2.8 Derepression of Pten ceRNAs is detected upon modulating the levels of Pten 3' UTR with a transiently transfected synthetic reporter construct
2.9 Normalization is required for FACS Sorted RNAseq data as reads from plasmid occupy a large percentage of total sequencing reads leading to an overall offset in fold changes
2.10 Transfecting Pten UTR as a sponge derepresses putative ceRNAs in a dose-dependent and miRNA dependent manner
2.11 Predicted TargetScan conserved miRNA binding sites in the 3'UTR of the ceRNAs chosen in this study
2.12 Crosstalk is microRNA mediated and pervasive on a genome-wide scale. Related to Figure 2.3
2.13 Distribution of log2 fold changes (PTEN UTR/NULL) for all genes post TMM normalization is centered around zero in each bin, i.e. no bin-dependent effects are seen. Related to Figure 2.10
3.1 Measuring Pten, Vapa and Cnot6l gene expression in single cells with 3-colour single-molecule FISH
3.2 Crosstalk helps ceRNAs co-fluctuate in single cells thereby tightening their stoichiometric ratios in the presence of active miRNAs
3.3 Pten does not lose correlation in DICER for a gene with which it doesn't share miRNAs
3.4 Measuring crosstalk strength with smFISH for 3 different senders in HCT116 and DICER -/-
3.5 Single molecule FISH shows Pten ceRNAs are colocalized in a DICER dependent manner
4.1 Opposing noise effects of microRNA regulation at low and high gene expression
4.2 Noise model predictions for a microRNA regulated gene
4.3 microRNA-mediated intrinsic noise effects
4.4 Estimation of microRNA pool noise and noise effects for endogenous genes
5.1 Colocalization of ceRNAs can enhance crosstalk by increasing their local concentrations hence promoting rates of miRNA association between ceRNA as free miRNAs are more likely to bind to nearby mRNA than other targets (adapted from Jens (2015))
Chapter 1
Introduction
According to the central dogma of molecular biology, RNAs are passive messengers of genetic information, carrying out DNA instructions for protein production in cells. Past studies
of gene regulatory networks focused on transcriptional regulation in the form of
protein transcription factors binding to DNA, but increasing evidence suggests that post-transcriptional regulation is a significant part of the regulatory network. MicroRNAs,
a class of short noncoding RNAs 18-25 nucleotides in length, were shown
to inhibit their target genes by binding with imperfect complementarity to sites in the 3' untranslated regions (UTRs)
of target RNA transcripts, leading to decreased expression of their target proteins either by mRNA degradation or by translational inhibition
(Bartel 2009). Their discovery has dramatically increased the complexity of gene regulatory networks.
MicroRNAs can act in a combinatorial manner, as a single mRNA usually contains
binding sites for multiple miRNAs. At the same time, an individual miRNA often targets up to 200
transcripts that are diverse in their function. Within the network of potential interactions
that ensues, miRNAs have been thought to function mainly as fine tuners of gene regulation
by weakly dampening protein output (Bartel 2004), but more recently attention has been
directed to their system-level effects. In particular, if microRNAs act to negatively regulate
RNAs, could RNAs themselves regulate microRNA levels? After all, each target binding
site sequesters miRNAs from their other targets. The central mechanism underlying the
ceRNA hypothesis proposed by Salmena (2011) is the idea that RNA species are coupled
to one another through the miRNA binding sites they share. Therefore, they may
have interactions that are not direct, but instead indirect and mediated by competition for, and
depletion of, shared microRNA pools. Thus RNAs could be said to "crosstalk" with each
other. Moreover, the hypothesis contends that these indirect RNA interactions result in a
biologically important mRNA network, either by causing functional changes in protein levels, by
inducing correlations between different RNA species, or by reducing noise in protein levels. This
mechanism is believed to play a role in many biological processes, from cancer (Tay 2011)
to cell differentiation (Cesana 2011).
In the next section, we discuss miRNA biology and literature summarizing the experimental evidence of RNA-RNA crosstalk, as well as the modulators of crosstalk activity.
1.1 MicroRNAs - discovery, biogenesis, target binding and competition
1.1.1 Discovery of miRNA Regulation
MicroRNAs were first discovered in the nematode C. elegans in 1993, when lin-4, a short
non-coding RNA, was found to base-pair imperfectly with complementary sequences in the
3'UTR of the lin-14 transcript (Wightman 1993, Lee 1993) and block lin-14 gene expression. Reduction of LIN-14 protein resulted in mis-timing of the developmental stages of
the animal. lin-4 remained the only miRNA discovered until 2000, when another miRNA
important in the development of C. elegans, let-7, was discovered (Reinhart et al., 2000).
Analogues of lin-4 and let-7 were found in a wide range of other species, including humans,
and in the following years over 1500 different miRNA sequences were discovered. A huge
amount of research focused on the identification of target sites (Lewis 2005, Stark 2003), their
likely cellular functions (Giraldez 2006, Vigorito 2007) and their biogenesis (Hutvagner and
Zamore, 2002). MiRNAs have been ascribed roles in nearly every biological process, including
apoptosis (Cimmino 2005), pluripotency (Subramanyam 2011), and cell-cycle control
(Ivanovska 2008).
Figure 1.1 | Canonical miRNA biogenesis pathway (adapted from Davis-Dusenbery & Hata, 2010).
1.1.2 Biogenesis of miRNAs
In animals, miRNAs are transcribed by RNA Pol II as long primary transcripts (pri-miRNAs) with both a 5' cap and a 3' poly-adenylated end (Cai 2004). miRNA genes are often genomically clustered, such that pri-miRNA transcripts contain multiple mature miRNA
sequences (Lau 2001). These primary miRNAs are recognized and clipped by the microprocessor complex, comprising the RNase III enzyme DROSHA (Lee 2002) and its co-factor,
the RNA-binding protein DGCR8 (Gregory 2004), into hairpin loops (pre-miRNAs) 60-65 bp long. These hairpin loops are bound and exported from the nucleus into the cytoplasm by Exportin-5. Once
in the cytoplasm, the pre-miRNAs are bound by a second RNase III enzyme, DICER1,
which cleaves the precursor loops into short double-stranded RNAs 20-24 nt long (Grishok 2001),
containing the mature miRNA "guide strand" and the "passenger strand". In a less understood
process, DICER1 loads the mature miRNA into the Argonaute complex (usually AGO2),
which in turn recruits the RNA-induced silencing complex (RISC) (Sontheimer 2005). Upon
loading of the miRNA into the RISC complex, the passenger strand of the double-stranded
miRNA is usually degraded, while the guide strand bound to the silencing complex seeks
out its complementary RNA sequence. As biogenesis consists of multiple steps, numerous
mechanisms for modulating it have been described, with implications for ceRNA
competition that will be discussed later. In particular, overexpression of Ago2 was found to
increase mature miRNA levels in some cells, while disruption of the DICER1 enzyme resulted
in lowered levels of mature miRNAs (Diederichs 2007, Tay 2011). We will use cells lacking
DICER1 as an important control in all our experiments in Chapters 2 and 3.
1.1.3 miRNAs: target binding and competition
The specificity of the target recognition process depends upon a crucial "seed" region of
the miRNA (usually nt 2-7/8) recognizing as few as 6-7 nucleotides in the 3'UTR of the target
mRNA (called the microRNA Response Element or MRE). In most cases, even a single
mismatch in the seed sequence disrupts miRNA binding (Lewis 2005). Even
so, with so few nucleotides in the seed region responsible for target recognition, individual miRNAs can potentially bind to a large number of target mRNAs. However, as with any
bimolecular binding reaction of the form $A + B \rightleftharpoons AB$, the mass-action law dictates what
proportion of targets is bound, and thus repressed, by a miRNA. This law relates the
molecular concentrations of the miRNA and its targets to the dissociation constant of the interaction, $K_d = [A][B]/[AB]$. If miRNAs are limiting, then increasing the number of targets results in lower occupancy
per target. Put another way, each miRNA bound to a target necessarily prevents, to some
extent, the binding of that miRNA to other target sites. Thus, target sites can be said to
compete with each other for miRNAs. More generally, competition and saturation effects
occur in other parts of the miRNA regulatory process. When mature miRNAs are loaded
onto the Argonaute complex, there is competition for access due to the small number of
molecules involved, which can lead to saturation of the RISC machinery.
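The mass-action argument above can be made concrete with a small numerical sketch. It assumes a single miRNA species and a single class of target site at equilibrium, with total concentrations and a Kd expressed in the same arbitrary units. The function names and the numbers are illustrative assumptions, not values from this work. The code solves the binding quadratic for the complex AB and reports the fraction of target sites occupied as the target pool grows while the miRNA pool is held fixed.

```python
import math


def bound_complex(mirna_total, target_total, kd):
    """Equilibrium concentration of the miRNA:target complex for A + B <-> AB.

    Solves c^2 - (A_tot + B_tot + Kd) * c + A_tot * B_tot = 0 and returns the
    physically meaningful (smaller) root.
    """
    s = mirna_total + target_total + kd
    return (s - math.sqrt(s * s - 4.0 * mirna_total * target_total)) / 2.0


def target_occupancy(mirna_total, target_total, kd):
    """Fraction of target sites that carry a bound miRNA at equilibrium."""
    return bound_complex(mirna_total, target_total, kd) / target_total


if __name__ == "__main__":
    # Hypothetical numbers: a fixed miRNA pool, an increasing target pool.
    for targets in (10.0, 100.0, 1000.0):
        occ = target_occupancy(mirna_total=100.0, target_total=targets, kd=10.0)
        print(f"targets={targets:7.1f}  occupancy per target={occ:.3f}")
```

In this toy setting the occupancy per target decreases monotonically as targets are added, which is the sense in which target sites compete for a limiting miRNA.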
The concept of competitive target inhibition by miRNAs inside the cell was first shown
by Ebert (2007), who used plasmids overexpressing miRNA seed sites (up to ~10,000
copies) to "sponge up" specific endogenous miRNAs and thereby titrate those miRNAs away
from their other targets, resulting in a specific up-regulation of the corresponding miRNAs'
targets. Consistent with the limited power of miRNA repression, they measured a mild
1.5-2 fold up-regulation of the miRNA target. To stress the large number of strong
binding sites for a single miRNA that had been introduced into the cell, they used the term
"miRNA sponge". Later, Seitz (2009) proposed that sponges of this kind
may have a biological function, and that the role of a substantial fraction of computationally
identified miRNA targets may be to sequester miRNAs, preventing them from binding to
their authentic targets. Such sponges had been discovered in plants, where over-expression of
the long non-coding RNA IPS1 sequesters miR-399 and results in the up-regulation of the miR-399
target gene (Franco-Zorrilla 2007). To what extent similar competition and saturation
effects occur naturally in animals remained unexplored.
1.2 ceRNAs: Discovery
Figure 1.2 | Logic of the ceRNA language (adapted from Salmena, 2011).
In 2010 the Pandolfi group devised a combined computational/experimental strategy to
search for potential competing endogenous RNAs (termed ceRNAs) for the tumour-suppressor
gene Pten, based on the number of predicted miRNA binding sites shared with other transcripts. This computational analysis identified over a hundred protein-coding genes that
shared at least 7 miRNA binding sites with Pten. These genes were considered candidate
ceRNAs for Pten. For a subset of these genes (6 out of the 8 genes tested) they demonstrated a depletion of expression upon Pten knockdown via siRNA and, conversely, an
up-regulation upon overexpression of the PTEN 3'UTR. Specifically, the genes Vapa and Cnot6l
were confirmed as bona fide bi-directional Pten ceRNAs, as transfecting the 3'UTRs of these
mRNAs intensified their miRNA sponging and led to an increase in PTEN protein abundance. Such a change in PTEN protein levels was shown to have functional significance:
it antagonized PI(3)K signaling and caused growth and tumor suppression (Tay 2011).
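The screening logic described above, ranking transcripts by how many miRNAs they share with a reference gene, can be sketched as follows. This is only an illustration of the idea and not the published pipeline: it treats each gene as a set of targeting miRNAs (ignoring site multiplicity and site quality), the 7-miRNA threshold is taken from the text, and the gene names other than Pten, the miRNA labels and the helper name `candidate_cernas` are all hypothetical.

```python
def candidate_cernas(reference, targets_by_gene, min_shared=7):
    """Return (gene, n_shared) pairs sharing >= min_shared miRNAs with `reference`.

    `targets_by_gene` maps a gene name to the set of miRNAs predicted to
    target its 3'UTR. Results are sorted by decreasing overlap, then by name.
    """
    ref_mirnas = targets_by_gene[reference]
    hits = []
    for gene, mirnas in targets_by_gene.items():
        if gene == reference:
            continue
        n_shared = len(ref_mirnas & mirnas)
        if n_shared >= min_shared:
            hits.append((gene, n_shared))
    return sorted(hits, key=lambda pair: (-pair[1], pair[0]))


# Toy prediction table with made-up miRNA labels (assumption, not real data).
TOY_TARGETS = {
    "Pten": {f"miR-x{i}" for i in range(1, 11)},            # miR-x1 .. miR-x10
    "GeneA": {f"miR-x{i}" for i in range(1, 9)},            # shares 8
    "GeneB": {"miR-x1", "miR-x2", "miR-x3"},                # shares 3
    "GeneC": {f"miR-x{i}" for i in range(2, 9)} | {"miR-x11"},  # shares 7
}

if __name__ == "__main__":
    for gene, n_shared in candidate_cernas("Pten", TOY_TARGETS):
        print(gene, n_shared)
```

A set intersection is the simplest possible scoring; a more faithful version would need to weight sites by number and predicted affinity, which the later chapters argue both matter.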
The authors went further and extrapolated that all kinds of RNA transcripts talk to
each other in a miRNA-mediated language, and proposed a "crosstalk" hypothesis: RNAs
sharing multiple MREs in their 3' UTRs (or in other ncRNAs) communicate with each other
and regulate their expression levels by competing for a limited pool of miRNAs (Salmena
2011). Upregulating a given RNA would lead to an increase in the total number of MREs
and thereby attract miRNA binding towards it. As a result, the targeting miRNAs would
be sequestered, leading to the de-repression of other RNAs sharing the same MREs. This
indirect correlation between competing targets was termed the ceRNA or crosstalk effect
(Figure 1.2).
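The titration logic of the hypothesis can be illustrated with a minimal equilibrium sketch. It assumes one shared miRNA pool and two targets, a "sender" and a "receiver", each with a single independent binding site; it is not the ODE model developed later in this work, and all names and numbers are hypothetical. Free miRNA is found by bisection on the mass-balance equation, and the receiver's occupancy is then read off for different sender levels.

```python
def free_mirna(mirna_total, targets, kds, tol=1e-12):
    """Solve m + sum_i T_i * m / (Kd_i + m) = M_total for the free miRNA m.

    The left-hand side increases monotonically with m, so bisection on
    [0, M_total] converges to the unique solution.
    """
    lo, hi = 0.0, mirna_total
    while hi - lo > tol * max(1.0, mirna_total):
        mid = 0.5 * (lo + hi)
        bound = sum(t * mid / (kd + mid) for t, kd in zip(targets, kds))
        if mid + bound > mirna_total:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def receiver_occupancy(mirna_total, sender, receiver, kd_sender, kd_receiver):
    """Fraction of receiver sites bound by miRNA when a sender competes."""
    m = free_mirna(mirna_total, [sender, receiver], [kd_sender, kd_receiver])
    return m / (kd_receiver + m)


if __name__ == "__main__":
    # Hypothetical concentrations in arbitrary units.
    for sender in (10.0, 100.0, 1000.0):
        occ = receiver_occupancy(100.0, sender, 50.0, 10.0, 10.0)
        print(f"sender={sender:7.1f}  receiver occupancy={occ:.3f}")
```

Raising the sender level lowers the free miRNA concentration and hence the receiver's occupancy, which is the de-repression the hypothesis describes; lowering the sender has the opposite effect.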
While the ceRNA hypothesis was a natural consequence of target competition and sequestration, it nevertheless made a startling claim: a new, pervasive gene regulatory network
must exist due to the highly promiscuous and clustered nature of miRNA-target binding
(Karreth 2011, Sumazin 2011). These papers proposed that shared miRNA target sites
linked dense networks of thousands of genes in a regulatory complex and, moreover, that the expression of these genes is correlated in many cancer cell types. In order to test computational
predictions of ceRNAs, individual ceRNAs were either down regulated or over-expressed
and expression levels of other ceRNAs were measured. In this manner, many new ceRNAs
were discovered. In the following section, we briefly discuss some classes of transcripts that
have been identified as ceRNAs.
1.2.1 Different types of endogenous ceRNAs
Figure 1.3 | Various types of validated competing endogenous RNAs (adapted from Tay & Pandolfi, 2014).
1.2.2 3'UTRs as ceRNAs
3' UTRs are critical for mRNA stability and typically contain MREs for several different
miRNAs. They can also be viewed as ceRNAs, because 3' UTRs regulate not only
the stability of their own transcripts in cis, but are also likely to attract miRNAs away from
transcripts with shared MREs, thereby regulating such transcripts in trans. This suggests
that mutations or changes in the abundance, structure, or length of 3' UTRs may affect their
ability to sponge miRNAs. Supporting this view, alternative polyadenylation of 3' UTRs
has been observed, leading to their lengthening during embryogenesis and their shortening in
proliferating cells (Mayr 2009) and in cancer (Mercer 2011). These changes in the length of
3' UTRs affect the interaction of miRNAs with such transcripts and affect protein output.
Moreover, by reducing the number of MREs, 3' UTR shortening will also modify the ability
of these mRNAs to compete for and sequester miRNAs, and thereby to function as ceRNAs.
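The link between 3' UTR length and MRE count can be illustrated with a toy seed-match scan. The sketch below uses only the simplest site definition mentioned earlier (a 6-mer complementary to miRNA nucleotides 2-7) and ignores everything a real target-prediction tool would consider, such as flanking context, conservation and site accessibility. The sequences and function names are illustrative assumptions.

```python
# RNA complement table for A/C/G/U.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mirna):
    """6-mer on the mRNA (5'->3') complementary to miRNA nucleotides 2-7."""
    seed = mirna[1:7]
    return seed.translate(COMPLEMENT)[::-1]


def find_mres(utr, mirna):
    """Start positions (0-based) of perfect 6-mer seed matches in `utr`."""
    site = seed_site(mirna)
    return [i for i in range(len(utr) - len(site) + 1)
            if utr[i:i + len(site)] == site]


# Illustrative miRNA and a made-up UTR carrying three seed matches.
MIRNA = "UGAGGUAGUAGGUUGUAUAGUU"
FULL_UTR = "AAAUACCUCGGGCCCUACCUCAUUUACCUCAA"

if __name__ == "__main__":
    short_utr = FULL_UTR[:20]  # mimic use of a proximal polyadenylation site
    print("full UTR sites:     ", find_mres(FULL_UTR, MIRNA))
    print("shortened UTR sites:", find_mres(short_utr, MIRNA))
```

Truncating the toy UTR removes the distal sites, so the shortened transcript both escapes some repression and offers fewer sites with which to sequester the miRNA.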
1.2.3 Circular RNAs
RNAs that are covalently linked at the ends to form circles had been described in plants
(Sanger 1976), but a new class of noncoding circular RNAs (circRNAs) with more than 5000 members was recently identified and characterized in mammals. These RNAs are processed by the spliceosome
in an unusual head-to-tail fashion, resulting in circular transcripts that contain multiple miRNA binding sites and act as miRNA sponges to deplete the cell of specific miRNAs, essentially alleviating repression of the mRNAs they target (Memczak 2013, Hansen
2013). These studies found that one circRNA, ciRS-7, contained >70 MREs for the miRNA miR-7
and formed complexes with AGO in a miR-7-dependent manner. smFISH then showed that
circRNA-miRNA complexes localize to P-bodies, suggesting that the complexes were being
sequestered from the translational machinery. circRNAs have proven to be highly effective at
sequestering miRNAs compared to their linear counterparts, partly because they are almost immune to miRNA-mediated target destabilization due to their inherent resistance to RNA
exonucleases. Effective "supersponge" ceRNAs have precisely such properties: resistance to
degradation, high expression levels, and multiple miRNA binding sites. Further characterization
of this abundant class of non-coding RNAs will be necessary to determine how universal
this mechanism is for sequestering miRNAs inside cells, and to establish their ceRNA function.
1.2.4 Pseudogenes as ceRNAs
Pseudogenes, a class of non-coding RNAs, are transcribed yet possess features such as premature stop codons, deletions/insertions, or frameshift mutations that prevent them from
producing functional proteins. Hence they have been considered "junk" DNA. However, they
are thought to act as "perfect sponges" because they possess many of the same MREs located on their ancestral genes; for example, PTENP1 is able to change the miRNA network
normally involved in the regulation of PTEN [Tay 2011, Poliseno 2010]. PTENP1, the processed pseudogene of PTEN, represents the first reported example of an RNA transcript that
acts as a ceRNA for PTEN. Within the coding region, the PTENP1 sequence differs from
the PTEN sequence by only 18 mismatches, thus PTEN-targeting microRNAs that bind
to MREs are usually PTENP1-targeting as well. (Poliseno, 2011) tested ceRNA activity
of PTENP1 in prostate cancer cells, and showed that inhibiting the common microRNAs
miR-17, -19, -21, -26 and -214 de-repressed PTENP1. Conversely, PTENP1 3'UTR overexpression led to the de-repression of PTEN. Another pseudogene acting as a ceRNA is the
Oct4 pseudogene, Oct4-pg4 (Wang 2013). The Oct4 pseudogene was shown to sponge away
miR-145, and hence upregulate Oct4. These studies have attributed a miRNA-sponge function to pseudogenes; however, the difficulty of reliably quantifying pseudogene expression
(due to the aforesaid sequence similarity) has hindered attempts to quantitatively study
their ceRNA function on a large scale.
1.2.5 Long non-coding RNAs (lncRNAs) as ceRNAs
Similar to pseudogenes, long non-coding RNAs (lncRNAs) do not have any protein-coding capacity,
but are found pervasively across the transcriptome (~10,000). They are good candidates
to act as ceRNAs because they are peppered with miRNA binding sites, and have an
ability to sequester miRNAs (Chi 2009). Moreover, lncRNAs are also known to display
specific expression patterns in different tissues, developmental stages, cell types and disease
and thus have been recognized as ideal candidates to tune post-transcriptional regulation
(Guttman 2012). Two such lncRNA ceRNAs that have been discovered acting as miRNA
sponges are HULC and ROR. The lncRNA HULC has been shown to act as a ceRNA:
it sequesters a set of miRNAs, including miR-372, and its overexpression reduces miR-372 expression and activity in the liver cancer cell line Hep3B. This miR-372 sequestration
increases the translational level of the miR-372 target gene, PRKACB (Cesana, 2011).
Recently, (Wang 2013) showed that lnc-RoR competes for miR-145 binding with the well-known
core pluripotency factors Oct4, Nanog and Sox2 in pluripotent embryonic stem
cells and thereby protects them from miR-145 induced degradation. Interestingly, lnc-RoR
was expressed at a greater level (>100 copies/cell) than miR-145 (10-20 copies/cell),
suggesting that it acts as a good sponge.
1.3 Modulators of crosstalk activity
The size of the crosstalk effect depends upon whether or not a single ceRNA perturbation
has an appreciable effect on the total miRNA target pool so as to titrate away miRNAs
from other shared targets and thereby relieve their miRNA induced repression. Recent
mathematical models of miRNA gene regulation (Bosia 2013, Figliuzzi 2013, Ala 2013)
have aimed to quantitatively model ceRNA crosstalk through both steady-state and kinetic
descriptions for a small number of interacting miRNA-ceRNA species. The quantitative predictions of these models may not sufficiently explain the magnitudes of the endogenously
measured ceRNA effect due to the limited number of ceRNAs modeled and the use of free
kinetic parameters of transcription, degradation and association rates that are difficult to
experimentally ascertain (Ebert 2012). However, they illustrate some useful principles of
miRNA-target competition: (i) the optimal regime for ceRNA crosstalk occurs when target
concentrations are close to the binding Kd of the miRNA-target interaction; (ii) crosstalk between targets is intensified with a greater number of shared miRNAs; (iii) more highly expressed
targets that form a greater proportion of a miRNA's total target pool are better senders of
crosstalk; (iv) ceRNA effects will be selective and hierarchical, depending on the particular
affinities and binding strengths of miRNA-target pairs (Figliuzzi 2013); (v) ceRNA effects
can be indirect, i.e. if ceRNA1 shares miRNA1 with ceRNA2, which in turn shares miRNA2 with
ceRNA3, then ceRNA1 will be indirectly coupled to ceRNA3 through ceRNA2 even though they do
not share any miRNAs directly in common with each other (Ala 2013).
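Principle (v) can be illustrated with a small graph traversal. The sketch below is not taken from the cited models; it assumes a hypothetical map from each miRNA to its set of targets and counts how many miRNA-mediated hops separate a starting ceRNA from every other ceRNA. The function name `cerna_hops` and the toy network are illustrative assumptions.

```python
from collections import deque

def cerna_hops(targets_of, start):
    """Return {ceRNA: number of miRNA-mediated hops from `start`}.

    targets_of: hypothetical dict mapping each miRNA to the set of RNAs it targets.
    """
    # Invert the miRNA -> targets map to obtain target -> miRNAs.
    mirnas_of = {}
    for mirna, targets in targets_of.items():
        for rna in targets:
            mirnas_of.setdefault(rna, set()).add(mirna)

    # Breadth-first search over ceRNAs; two ceRNAs are adjacent if they share a miRNA.
    hops = {start: 0}
    queue = deque([start])
    while queue:
        rna = queue.popleft()
        for mirna in mirnas_of.get(rna, ()):
            for neighbour in targets_of[mirna]:
                if neighbour not in hops:
                    hops[neighbour] = hops[rna] + 1
                    queue.append(neighbour)
    return hops

# Toy network (assumed): ceRNA1 and ceRNA3 share no miRNA but are linked via ceRNA2.
network = {
    "miRNA1": {"ceRNA1", "ceRNA2"},
    "miRNA2": {"ceRNA2", "ceRNA3"},
}
hops = cerna_hops(network, "ceRNA1")
print(hops)
```

A hop count of 1 corresponds to direct coupling through a shared miRNA, while a count of 2 or more marks indirect coupling. The sketch only captures topology; it says nothing about the magnitude of the transmitted effect.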
Quantitative prediction of ceRNA effect in miRNA networks critically requires knowledge of the relative concentrations of miRNAs and targets in the cell. Both of these are
experimentally difficult to measure. Absolute concentrations of miRNA have been reported
to range up to 120,000 copies per cell in various cell types (Bissels et al., 2009; Calabrese
et al., 2007; Denzler et al., 2014; Lim et al., 2003; Mukherji et al., 2011). Estimated total
target concentrations for a given miRNA vary from 500 copies per cell to over 440,000
(Denzler et al., 2014; Loeb et al., 2012; Wee et al., 2012). Estimates of target abundance
are made in silico and diverge widely. Consequently, differing target pool sizes predict very different characteristics of miRNA target competition
networks. Recently, researchers have critically advanced the field by making state-of-the-art measurements of both miRNA abundance and the total abundance of
miRNA-binding sites (Bosson 2014).
1.3.1 Abundance of miRNA binding sites and miRNA concentration
Firstly, to directly determine bound miRNA target sites, (Bosson 2014) relied on crosslinking and immunoprecipitation (CLIP) of the Argonaute 2 (AGO2) protein to identify
AGO2-bound mRNA and consequently target-site abundance in vivo. CLIP protocols first use
ultra-violet (UV) light to induce protein-RNA cross-links; the AGO2 protein is then immunoprecipitated using a specific antibody, pulling down both the guide miRNAs and their
bound targets, which are stringently purified to get rid of unbound RNA, digested into
short RNAs, and prepared for sequencing. By quantifying the CLIP reads at each miRNA
seed-site they were able to specifically and reproducibly estimate the concentration of bound
targets. Secondly, they measured miRNA concentrations with a small RNA-seq assay and
normalized the counts to miR-295 copies per cell quantified by northern blot. With these
data, they show that for the thirty most highly expressed miRNAs in ES cells, total 6-mer/7-mer/8-mer target pools were more abundant than the corresponding miRNA concentrations. Thus any
perturbation of a ceRNA for those miRNAs is unlikely to titrate them away, as binding sites
are already in excess. Similarly, (Denzler 2014) reported that even for the most highly expressed
miRNA, miR-122, total target binding sites are above miRNA levels; consequently miR-122
targets are not derepressed until unphysiologically high amounts of miR-122
sponges are added. These studies, done on primary cells, have considerably diminished the possibility
of an appreciable ceRNA effect that is purely stoichiometric in nature. It is important to
realize that these CLIP protocols (Bosson 2014) pool together millions of cells, yielding an
average binding profile which may not be reflective of dynamic conditions in single cells.
Moreover, the studies by the Pandolfi group were done in cancer cell lines, which are known
to have altered miRNA concentrations. Therefore, we cannot rule out ceRNA effects in all
types of cells.
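The two quantities compared in this subsection can be sketched numerically. The snippet below is only an illustration of the arithmetic, not the published pipeline: it rescales sequencing read counts so that a reference miRNA (miR-295, as in the text) matches an independently measured copy number, and then computes what fraction of a miRNA's target pool a single sender contributes. All numbers, and the name `miR-A`, are hypothetical placeholders rather than measurements.

```python
def copies_per_cell(reads, ref_name, ref_copies):
    """Scale read counts so that the reference miRNA equals its measured copies per cell."""
    scale = ref_copies / reads[ref_name]
    return {name: count * scale for name, count in reads.items()}

def fractional_pool_change(sender_sites, total_target_sites):
    """Fraction of a miRNA's total target pool contributed by one sender's sites."""
    return sender_sites / total_target_sites

# Hypothetical read counts and reference copy number (illustration only).
reads = {"miR-295": 50_000, "miR-A": 5_000}
copies = copies_per_cell(reads, "miR-295", ref_copies=10_000)

# Hypothetical pool: a sender carrying 100 sites in a pool of 20,000 sites.
fraction = fractional_pool_change(sender_sites=100, total_target_sites=20_000)
print(copies["miR-A"], fraction)
```

With these placeholder values, removing the sender entirely would change the target pool by half a percent, which is the kind of small stoichiometric shift the studies above argue is typical when binding sites are in excess.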
1.3.2 MiRNA binding affinity
The two main factors that affect miRNA-binding affinity are the number of miRNA binding sites on a target and the free energy of the miRNA-target hybridization (ΔG). Given
the variation in binding affinities across targets, miRNAs will preferentially bind targets
with greater affinity before spreading to lower-affinity sites. Thus the total target pool is
partitioned into hierarchical affinity classes that do not compete equally. Conceptually, all
binding sites of the same affinity (Kd) "see" the same concentration of free miRNA, which
means that they can be grouped together. Targets with affinity much greater than the rest
of the pool would act in a simple 1:1 titration regime with the miRNA. Since high-affinity
target sites more favorably bind the available miRNA pool, competition can occur without
approaching expression levels of the total pool of weak and strong sites combined.
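The partitioning into affinity classes can be made concrete with a minimal equilibrium-binding sketch. It assumes simple reversible single-site binding with no complex degradation (a simplification relative to the kinetic models cited above), so that every class with dissociation constant Kd has occupancy free/(Kd + free) at the shared free miRNA concentration. The function names and all numerical values are illustrative assumptions.

```python
def free_mirna(total_mirna, site_classes, iterations=200):
    """Solve miRNA conservation for the free miRNA concentration by bisection.

    site_classes: list of (total_sites, kd) tuples, one per affinity class.
    """
    lo, hi = 0.0, total_mirna
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        # Bound miRNA summed over classes at the trial free concentration.
        bound = sum(sites * mid / (kd + mid) for sites, kd in site_classes)
        if mid + bound > total_mirna:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def occupancy(free, kd):
    """Fraction of sites of a given Kd that are occupied at this free miRNA level."""
    return free / (kd + free)

# Hypothetical pool (arbitrary concentration units): few strong sites, many weak sites.
classes = [(200.0, 1.0), (5000.0, 100.0)]
total = 1000.0
free = free_mirna(total, classes)
occ_high = occupancy(free, 1.0)
occ_low = occupancy(free, 100.0)
print(free, occ_high, occ_low)
```

Because both classes see the same free miRNA level, the high-affinity sites end up close to saturation while the low-affinity sites remain mostly unoccupied, even though the weak sites dominate the pool numerically.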
1.3.3 MRE accessibility and local concentrations
Going back to the binding reaction of the form A + B ⇌ AB, one notices that the relevant
concentration of each species is not the global concentration (which assumes a well-mixed cellular
environment); rather, binding probabilities are determined by local concentrations. If
miRNAs or mRNAs are kept sequestered in sub-cellular structures, local concentrations may
deviate from the average by a large magnitude. Structures such as P-bodies or RNA granules
can harbor RNAs and miRNAs in small volumes, thereby concentrating them and possibly
altering binding and unbinding of miRNAs. While these phenomena are very difficult to
quantify, altered local concentrations can change the competition between miRNA-mRNAs
and enhance the size of the ceRNA effect. Essentially, rather than competing for binding
with the whole target pool, miRNAs could bind much more favorably to locally available
target sites. smFISH studies can allow us to quantify the localization of ceRNAs, which we
will perform in Chapter 3.
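To give a sense of scale for the local-concentration argument, the following sketch converts copy numbers into molar concentrations for two compartments. The volumes (2 pL for a whole cell, 1 fL for a granule-like structure) and the copy numbers are assumptions chosen purely for illustration, not measured values.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

def copies_to_nanomolar(copies, volume_litres):
    """Convert a number of molecules in a given volume to a concentration in nM."""
    return copies / (AVOGADRO * volume_litres) * 1e9

# Assumed volumes and copy numbers (illustration only).
cell_volume = 2e-12      # 2 pL whole cell
granule_volume = 1e-15   # 1 fL sub-cellular structure

whole_cell_nM = copies_to_nanomolar(100, cell_volume)   # 100 copies spread over the cell
local_nM = copies_to_nanomolar(10, granule_volume)      # 10 copies confined to the granule
print(whole_cell_nM, local_nM, local_nM / whole_cell_nM)
```

Under these assumptions, confining only a tenth of the molecules to the small compartment raises their local concentration about 200-fold over the whole-cell average, which is why localization can shift a miRNA-target pair into or out of the regime where competition matters.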
1.3.4 Post-transcriptional network effects
[Figure 1.4: miRNA co-targeting network. Legend: shared pairs (edge color); connectivity, i.e. total # of shared pair targets (node size).]
Figure 1.4 | Extensive co-targeting of miRNAs: many targets share miRNA binding
sites (adapted from Obermayer 2014). The color of the edges indicates the number of pairs
which share a given pair of miRNAs, while the size of the nodes indicates the total number
of shared targets for a given miRNA.
A systematic analysis of the ceRNA effect is impeded by the complexity of natural
miRNA-ceRNA regulatory networks. The ceRNA effect depends both on the underlying
dynamical binding parameters of miRNAs and target RNAs and on the topology of the network.
The miRNA-RNA network is known to be highly clustered: certain miRNAs often target
genes in tandem, and consequently there appear strong correlations in network connectivity (Figure 1.4). An implication of the highly interconnected nature of the miRNA-RNA
target network is that perturbations of gene expression can potentially propagate in the network through a cascade of coregulated target RNAs and miRNAs that share targets (Nitzan
2014). Pairs of miRNAs which have a greater number of shared targets would therefore act
as key nodes in the ceRNA network. Conversely, certain ceRNAs, which are commonly targeted
by a large number of miRNA species, can transmit crosstalk more selectively than others.
Whether or not small effects caused by propagation of the ceRNA effect are biologically meaningful remains to be investigated. Similar network propagation issues affect other
gene regulatory mechanisms. It has often been observed after a gene perturbation (e.g. of
a transcription factor, miRNA, or drug target) that unrelated genes (off-targets) changed
expression, i.e. genes whose connection to the perturbed genes was not traceable.
1.4 Summary and Outline
Following the discovery of transcripts that can sequester miRNAs thereby releasing other
targets from miRNA-mediated repression, a new principle for post-transcriptional gene regulation has been proposed. This layer of gene regulation works through competition for
miRNA binding between different RNAs, and thus has the capability to form a large-scale
regulatory network across the transcriptome. The competing endogenous RNA (ceRNA) or
RNA-RNA crosstalk hypothesis certainly seems an attractive explanation for the functionality of non-coding RNAs and pseudogenes, and until now, many ceRNAs, both coding and
non-coding, have been implicated in varied biological contexts, from cancer (Fang 2013) to
muscle differentiation (Cesana 2011). Nonetheless, only a handful of ceRNAs have been
experimentally identified and many features of the proposed ceRNA hypothesis remain unexamined. Our aim in this thesis is to address some of the fundamental questions about
the generality and magnitude of the crosstalk mechanism. In Chapter 2 we describe the
results of perturbing single ceRNAs (Pten, Vapa and Cnot6l) and quantifying the effects on
the transcriptome, both to extract the size of the ceRNA effect and to test the contribution
of specific microRNAs. As will be seen, the ceRNA effect is bounded yet pervasive across
the transcriptome. We find that in addition to the number of shared miRNA binding sites
between the perturbed ceRNA and its targets, the affinity of shared miRNA-target binding
is crucial in determining the magnitude of the ceRNA effect. Chapter 3 investigates three
specific ceRNAs at a single-cell level with single-molecule resolution to explore how ceRNA
co-regulation plays out in single cells. Unexpectedly, we find significant co-localization of
these ceRNAs which can enhance crosstalk locally through competition, thus allowing us
to revise the original hypothesis. Moreover, we find miRNA-coupling between ceRNAs is
capable of buffering their individual fluctuations and producing surprising correlations in
gene expression. Chapter 4 studies the role of miRNAs in dampening fluctuations in protein
levels (Schmiedel et al. 2015). We find that miRNA regulation provides a significant reduction in intrinsic protein noise at low expression levels which scales with miRNA repression,
but variability in miRNA concentrations itself propagates to target fluctuations at higher
expression levels.
Chapter 2
Assessment of the ceRNA hypothesis with integrated genome-wide measurements reveals bounded yet pervasive crosstalk activity
MicroRNAs (miRNAs) are an abundant class of small non-coding RNAs that play complex
roles in post-transcriptional regulation of gene expression. Individual genes are typically regulated by many distinct miRNAs, and conversely individual miRNAs often target multiple
genes leading to complex regulatory networks (Friedman 2009) that drive a large variety of cellular processes, from differentiation and proliferation to apoptosis and cancer [Yi
2008, Sluijter 2010, Cimmino 2005]. Several recent studies have added a new facet of post-transcriptional gene regulation: one that is mediated by transcripts with shared miRNA
binding sites (Salmena 2011; Tay 2011; Tay 2014). This stems from the bidirectional effects between miRNAs and their target mRNAs, where a change in one transcript might
affect the expression of other transcripts by sequestering miRNAs from their shared targets
and thereby inhibiting miRNA repression of those other targets. These transcripts, coupled
by their shared miRNAs, are said to 'crosstalk' or regulate each other by competing for
common miRNAs. Based upon such a target competition and sequestration mechanism,
the competing endogenous RNA (ceRNA) hypothesis proposes a rich network of protein
coding-independent regulatory interactions mediated by miRNAs.
Although many individual ceRNAs have been found, fundamental questions about the
magnitude of the effect remain. The experimental setup usually consists of altering the level
of a particular transcript, ncRNA or 3' UTR (a 'sender'), then measuring the change in
other genes ('receivers') that share MREs (miRNA response elements) with the sender, and
verifying that this change in receiver expression is miRNA dependent. In this way, perturbation of senders by siRNA knockdown or UTR overexpression assays indicates that specific
receivers move in a correlated fashion (Tay 2011, Salmena 2011): they are reduced when
senders are knocked down and are de-repressed when senders are upregulated. However,
such a competition mechanism faces three major limitations in accounting for the magnitude of the observed ceRNA effects. Firstly, individual miRNAs have long been thought
to confer limited repression (~2-fold; Bartel 2004, Baek 2008). Secondly, given the large
target abundances in a cell, any sender perturbation is only thought to add or subtract
very few sites from the total target pool of a targeting miRNA (Arvey 2011), implying
that any change in the repressive influence of that miRNA on individual receivers would be muted, and
thus any consequent crosstalk would be small. Thirdly, mathematical models predict an optimum regime where crosstalk might be possible, namely when the regulating miRNA and its
target binding sites are at near-equal effective concentrations (modulo binding Kd) (Jens 2015,
Bosia 2013, Figliuzzi 2013). While estimates of miRNA concentrations exist (tens to 120,000
copies per cell) [Bissels 2009, Denzler 2014], estimates of total target abundances and binding affinities are highly variable, making it difficult to assess whether genes are susceptible to
crosstalk in an endogenous environment. However, a recent study of ceRNA effects for the
exceptionally highly expressed liver-specific miR-122 determined that no target competition
occurs in vivo because of the large relative abundance of the miRNA target pools (Denzler
2014). Thus the hypothesis remains controversial despite a variety of examples, including pseudogenes
(Poliseno 2010), circRNAs (Hansen 2011), and lncRNAs (Cesana 2011), which suggest the
existence of ceRNA interactions.
The logic of crosstalk, supplemented with the highly interconnected network of miRNA-mRNA interactions, suggests that ceRNA effects should be pervasive across the transcriptome
(Sumazin 2011). Since each sender typically sequesters multiple miRNAs, which in
turn have other targets, perturbing the levels of one sender could potentially result in a
change in expression of hundreds of RNAs competing for shared miRNAs. Signal propagation through miRNA → ceRNA → miRNA could take place, affecting distant receivers
(Nitzan 2014, Bosia 2013). However, no widespread ceRNA effects have been shown experimentally. Existing studies typically focus on perturbing a sender and testing only a handful
of ceRNAs. For example, after computationally searching for Pten ceRNAs based upon the
number of shared miRNAs, (Tay 2011) found hundreds of possible ceRNA candidates but
tested only a selected few (Vapa, Cnot6l, Serinc1 and Znf460) that each shared at least 7 miRNA
binding sites with Pten. Consequently, it has proved difficult to ascertain whether crosstalk
is restricted to a select few sender-receiver pairs with high numbers of shared miRNAs, or
only to those with favourable stoichiometric [miRNA]/target pool ratios, or whether crosstalk
is instead a general phenomenon.
Identifying which miRNAs are involved in transmitting crosstalk between a particular
sender and a receiver is crucial to refining the ceRNA hypothesis. Current methods to
identify ceRNAs rely upon computational miRNA-mRNA target predictions. In particular,
they emphasize the number of shared miRNAs between a sender-receiver pair (Salmena
2011, Ala 2013). However, each miRNA-mRNA interaction is affected differently by the
strength of the miRNA-mRNA binding and by the local concentration of each interacting
species. Thus the ability of a specific miRNA to transmit crosstalk will be influenced by its
differential sequestration by the sender and differential repression on the receiver, and not
only on the number of shared miRNA binding sites. In the case of Pten ceRNAs, miR-17,
miR-19 and miR-26 families have been validated as transmitting crosstalk but it remains
unknown whether other miRNAs are functional in the Pten ceRNA network.
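The count-based candidate selection described above can be sketched as follows. This is an illustrative reconstruction of the general idea (rank transcripts by the number of miRNAs they share with a sender and keep those above a threshold), not the procedure of any cited study; the function name, the toy site table and the default threshold are assumptions. As the paragraph notes, such a ranking ignores binding strength and local concentration.

```python
def shared_mirna_candidates(sites, sender, min_shared=7):
    """Rank transcripts by the number of miRNAs they share with `sender`.

    sites: hypothetical dict mapping each transcript to the set of miRNAs predicted to bind it.
    Returns (transcript, n_shared) pairs with n_shared >= min_shared, most shared first.
    """
    sender_mirnas = sites[sender]
    ranked = []
    for gene, mirnas in sites.items():
        if gene == sender:
            continue
        n_shared = len(sender_mirnas & mirnas)
        if n_shared >= min_shared:
            ranked.append((gene, n_shared))
    # Sort by descending shared count, then by name for a stable order.
    return sorted(ranked, key=lambda pair: (-pair[1], pair[0]))

# Toy prediction table (assumed); a low threshold is used because the table is tiny.
toy_sites = {
    "sender": {"miR-a", "miR-b", "miR-c"},
    "geneX": {"miR-a", "miR-b", "miR-c", "miR-d"},
    "geneY": {"miR-a", "miR-b"},
    "geneZ": {"miR-c"},
}
candidates = shared_mirna_candidates(toy_sites, "sender", min_shared=2)
print(candidates)
```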
2.1 Results
In our study we used RNA Sequencing to quantify both the magnitude and extent of the
crosstalk effect genome-wide by directly measuring the effect of perturbation of 3 different
senders on the transcriptome. The senders we chose to knock down (Pten, Vapa and Cnot6l)
share many miRNA binding sites, and were each experimentally demonstrated as putative
ceRNAs, competing for miRNAs with each other in the colon carcinoma HCT116 cell line.
Genome-wide measurement of the transcriptome after the perturbation of senders using
RNA Sequencing would allow an assessment of key features of the ceRNA hypothesis.
In particular, it would permit a quantification of the magnitude of crosstalk strength for
thousands of potential receivers. Our work is focused on three major questions: (a) How large
is the magnitude of crosstalk in an endogenous system? (b) Are ceRNA effects restricted or are they
extensive when thousands of sender-receiver pairs are tested, and what are the characteristics
of a good sender? (c) Which miRNAs are involved in transmitting crosstalk, and what are the
characteristics of miRNAs that make them good at transmitting crosstalk?
We used a highly simplified model of miRNA regulation of a single sender-receiver aimed
at quantifying the magnitude of crosstalk interactions. The model predicts that crosstalk
strength is bounded by 1 and is usually much smaller for reasonable binding parameters.
On evaluating the crosstalk strength transcriptome-wide in our experiments, we found that
crosstalk strength is indeed bounded for each of the senders, yet it is surprisingly pervasive
across the genome, including hundreds of genes at all expression levels. We uncover putative
ceRNAs for each sender based on the difference of crosstalk strength in the HCT116 and
HCT116 DICER -/- colon carcinoma cells. We further characterize the influence of shared
miRNAs between senders and receivers upon the crosstalk strength and determine that
crosstalk strength is intensified when sender-receiver pairs share more miRNAs. Using
our quantification of crosstalk strength, we estimate the power of a miRNA to transduce
crosstalk for each sender, and find that there is a hierarchy of miRNA crosstalk power, i.e.
miRNAs are differentiated in their ability to affect ceRNAs. Surprisingly, we find that the
miRNAs targeting Pten have the highest crosstalk power of the three senders. We suggest
that the ability of a gene to be a good sender of crosstalk (like Pten) is dependent upon its
ability to sequester miRNAs and the overall stoichiometry of its [miRNA] / target pools.
We further find that we can modulate the levels of these putative ceRNAs by transfecting a plasmid carrying endogenous Pten 3'UTR sponges into the cells at varying levels.
Specifically, we find that a subset of 'robust' Pten ceRNAs are both de-repressed in a dose-dependent manner and depleted when Pten is knocked down, suggesting that Pten exists in
an optimal regime for crosstalk.
2.1.1 ODE biochemical model of crosstalk predicts that crosstalk strength should be bounded by 1
The endogenous molecular environment consisting of numerous miRNAs and targets is
complicated; any perturbation of a sender changes target pools for many different miRNAs
targeting many receivers (Figure 2.1a). To characterize the strength of the ceRNA effect,
we need to answer two questions: How does a change in the sender influence the free miRNA
pool? How does the corresponding change in the miRNA pool influence the receiver? We
sought to understand the simplest system, consisting of one sender, one transmitting miRNA
and one receiver. In the simplest titration mass-action ODE model (analogous to Buchler
2008; Mukherji 2011) of two mRNAs regulated by one miRNA which is recycled after interacting with its target, we take into account the dynamic properties of miRNAs ($\mu$), free
mRNAs (ceRNAs) for the two targets ($m_1$ and $m_2$), and complexes of the miRNA with its
targets ($m_1\mu$ and $m_2\mu$). The model's parameters are transcription and degradation rates
for $m_{1,2}$ ($v^m_{1,2}$ and $d^m$, resp.), and association, dissociation, and degradation rates for
the complex $m_{1,2}\mu$ ($k_{on}$, $k_{off}$, $d^{m\mu}$). For illustration (but this simplification can be relaxed),
all the transcription, degradation and association rates are assumed equal. Considering one
target as the sender and the other as the receiver of crosstalk (Figure 2.1a), we would
like to know the impact of the variation of the single sender $m_1$'s transcription rate on the
receiver $m_2$ (their derivative is what we term crosstalk strength).
$$\frac{d[m_i]}{dt} = v^m_i - d^m\,[m_i] - k_{on}\,[m_i][\mu] + k_{off}\,[m_i\mu] \tag{2.1}$$
$$\frac{d[m_i\mu]}{dt} = k_{on}\,[m_i][\mu] - k_{off}\,[m_i\mu] - d^{m\mu}\,[m_i\mu] \tag{2.2}$$
$$[\mu^T] = [\mu] + [m_1\mu] + [m_2\mu] \tag{2.3}$$
where we assumed that the total miRNA concentration is a constant $[\mu^T]$. We can obtain
the steady state solutions for each species:
$$[m_i\mu] = \frac{[m_i][\mu]}{K_d}, \qquad [\mu] = \frac{[\mu^T]}{1 + w} \tag{2.4}$$
where $K_d$ is the effective dissociation constant of the miRNA complex, $K_d = \frac{k_{off} + d^{m\mu}}{k_{on}}$, and
$$[m_1] = \frac{1}{2}\left([m_1^0] - [\mu^*] - K_d^* + \sqrt{\left([m_1^0] - [\mu^*] - K_d^*\right)^2 + 4\,[m_1^0]\,K_d^*}\right) \tag{2.5}$$
where we defined a microRNA "target load" $w = [m_1]/K_d + [m_2]/K_d$, which describes the
sequestration of the miRNA by the two regulated mRNAs and captures the competition
between those two co-regulated genes for the same miRNA. $[m_1^0] = v^m_1/d^m$ is the steady state
mRNA level without any microRNA regulation, and the effective miRNA concentration is
$[\mu^*] = (d^{m\mu}/d^m)\,[\mu^T]$. The effect of the competing mRNA can be subsumed into an apparent
$K_d^* = K_d\,(1 + w_2)$, corrected by the miRNA target load that the other mRNA contributes to
the miRNA, $w_2 = [m_2]/K_d$.
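For readers who want the intermediate steps, the closed-form steady state for the sender can be recovered from the model's rate equations as sketched below. The notation ($[m_i]$ free mRNA, $[\mu]$ free miRNA, $[m_i\mu]$ complex, $[\mu^T]$ total miRNA, $d^m$ and $d^{m\mu}$ the mRNA and complex degradation rates) follows the parameter list in the text; the derivation itself is a reconstruction and may differ in presentation from the one in the appendix.

```latex
% Steady state of the complex: association balances dissociation plus degradation.
k_{on}[m_i][\mu] = (k_{off} + d^{m\mu})\,[m_i\mu]
\;\Rightarrow\;
[m_i\mu] = \frac{[m_i][\mu]}{K_d}, \qquad K_d = \frac{k_{off} + d^{m\mu}}{k_{on}}

% miRNA conservation, with the receiver absorbed into an apparent constant K_d^*.
[\mu] = \frac{[\mu^T]}{1 + ([m_1] + [m_2])/K_d}
      = \frac{K_d\,[\mu^T]}{K_d^* + [m_1]}, \qquad K_d^* = K_d\left(1 + \frac{[m_2]}{K_d}\right)

% Steady state of the free sender: production balances both degradation routes.
v^m_1 = d^m[m_1] + d^{m\mu}[m_1\mu]
\;\Rightarrow\;
[m_1^0] = [m_1]\left(1 + \frac{[\mu^*]}{K_d^* + [m_1]}\right),
\qquad [m_1^0] = \frac{v^m_1}{d^m},\quad [\mu^*] = \frac{d^{m\mu}}{d^m}[\mu^T]

% Rearranged as a quadratic in [m_1]; its positive root is the closed-form solution.
[m_1]^2 + \left(K_d^* + [\mu^*] - [m_1^0]\right)[m_1] - [m_1^0]\,K_d^* = 0
```

Note that $K_d^*$ still depends on the receiver level $[m_2]$, so the expression for $[m_1]$ is closed only once $[m_2]$ is fixed; the coupled system for both targets has to be solved self-consistently.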
The quantity we are interested in, crosstalk strength, is the sensitivity of $m_2$ to $m_1$
levels, $dm_2/dm_1$. To see how it varies with sender expression, we fixed all parameters but the
transcription rate of the sender. For the steady state solution of the model, the dissociation
constant for the miRNA-target complex, $K_d$, dictates the threshold at which the miRNA
is bound or unbound by the sender. Most miRNAs are bound as the sender levels increase
above the threshold while they become unbound below it (Figure 2.1b). The model gives us
the steady state concentration of the receiver as depending on the free miRNA concentration.
As free miRNA levels decrease, the receiver gets progressively unbound (Figure 2.1c). If
there are too many sender molecules then all miRNA would be bound by it, leaving the
receiver free, thus no crosstalk would be observed. If there would be too many miRNAs
then both ceRNAs would be bound by the miRNA, and no crosstalk would be observed.
Above the threshold, miRNA repression is lost and receiver levels grow almost linearly
with transcription rate of the sender while its variation is maximal close to the threshold.
Thus crosstalk is only present in a narrow range near the threshold Kd, where bound
miRNA-mRNA complexes are most sensitive to free miRNA concentrations. Moreover, the
model illustrates that both the binding affinities and the overall stoichiometry of the system
dictate whether or not there is crosstalk between ceRNAs (Figure 2.1d) [similar to
Figliuzzi 2013].
The magnitude of the crosstalk strength between the sender and receiver can be shown
to be the product of two factors: the response of the miRNA level to perturbations of
the transcription rate of the sender, and the response of the level of the receiver to
the perturbations of the miRNA level (See Appendix A). The former depends upon the
fraction of miRNAs bound by the sender (Sequestration; determined by Kd) while the latter
depends upon the relative repression conferred by a miRNA upon its target (Repression;
determined by rates of degradation and association and by the relative concentrations of
free miRNA). As the sequestration factor is always less than 1 (miRNAs are always bound
to targets other than just the sender) and the repressive effect of the miRNA on the receiver
mRNA is also always less than 1, their product will also be less than 1. Thus, the crosstalk
strength can be shown to be bounded by 1.
$$\mathrm{CS}_{\mathrm{receiver}} < \mathrm{Sequestration} \times \mathrm{Repression}_{\mathrm{receiver}} < 1 \tag{2.6}$$
In simulations of the single sender-receiver model, where we swept parameters (with
biologically reasonable values from the literature), crosstalk strength was below 0.1 in about 90% of expression states in all
systems. The simulations show that crosstalk is strongest
when the expression of the sender is in the sender's ultra-sensitive regime and the expression
of the receiver is below the receiver's ultra-sensitive regime. Though the single sender-receiver model is perhaps too simple, it does make a testable prediction: crosstalk strength
in an endogenous system should be small and generally bounded by 1. To evaluate this
general prediction we used RNA Sequencing to quantify both the magnitude and extent of
the crosstalk effect genome-wide for three different sender mRNAs.
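The behaviour of the minimal model can be explored with a short numerical sketch. This is not the simulation code used for the results above; it is a minimal steady-state solver under stated assumptions: sender and receiver have equal parameters, concentrations are expressed in units of Kd, the ratio of complex to mRNA degradation rates (`theta`) is set to 1, and crosstalk strength is taken as the relative (log-log) change of the free receiver per relative change of the free sender, following the definition given in the Figure 2.1 caption. Function and parameter names are illustrative.

```python
import math

def free_mirna(m0_levels, mu_total, kd, theta, iterations=200):
    """Free miRNA at steady state, found by bisection on miRNA conservation.

    m0_levels: unrepressed mRNA levels (transcription rate / degradation rate).
    theta: assumed ratio of complex degradation rate to free mRNA degradation rate.
    """
    lo, hi = 0.0, mu_total
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        # Steady-state complex level of each target: m0 * mu / (Kd + theta * mu).
        bound = sum(m0 * mid / (kd + theta * mid) for m0 in m0_levels)
        if mid + bound > mu_total:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def free_mrna(m0, mu, kd, theta):
    """Free mRNA level given the free miRNA concentration."""
    return m0 / (1.0 + theta * mu / kd)

def crosstalk_strength(m0_sender, m0_receiver, mu_total, kd=1.0, theta=1.0, eps=1e-4):
    """Central-difference estimate of d ln[receiver] / d ln[sender]."""
    def log_levels(scale):
        s0 = m0_sender * scale  # perturb only the sender's transcription rate
        mu = free_mirna([s0, m0_receiver], mu_total, kd, theta)
        return (math.log(free_mrna(s0, mu, kd, theta)),
                math.log(free_mrna(m0_receiver, mu, kd, theta)))
    s_hi, r_hi = log_levels(1.0 + eps)
    s_lo, r_lo = log_levels(1.0 - eps)
    return (r_hi - r_lo) / (s_hi - s_lo)

# Sweep the sender from 0.01 to 1000 (units of Kd) with assumed receiver and miRNA levels.
sender_levels = [10 ** (k / 4) for k in range(-8, 13)]
strengths = [crosstalk_strength(s, 1.0, mu_total=10.0) for s in sender_levels]
for s, cs in zip(sender_levels, strengths):
    print(f"sender m0 = {s:9.3f}  crosstalk strength = {cs:.4f}")
```

In this equal-parameter setting the relative crosstalk strength stays between 0 and 1 across the sweep and is largest when the sender level is comparable to the miRNA pool, in line with the qualitative picture described in the text. With unequal affinities or degradation rates the same solver can be reused, but the bound should then be re-examined rather than assumed.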
[Figure 2.1: schematic of the endogenous situation and of the minimal system (sender, transmitting miRNA, receiver), annotated "How does a change in the sender influence the miRNA pool?" and "How does the corresponding change in the miRNA pool influence the receiver?"; plots against [sender] and [miRNA]; panel annotation: CS = dm2/dm1 = dm2/dTL * dTL/dm1 < 1, where dm2/dTL is the microRNA-mediated change in mRNA2 upon change of target load (TL) and dTL/dm1 is the change of target load upon change in mRNA1.]
Figure 2.1 | ODE biochemical model of miRNA-mediated crosstalk predicts
that crosstalk strength should be bounded. (a) Generally, RNAs (wavy lines) in
an endogenous system of multiple miRNAs (circles) interacting with many targets will sequester miRNAs and produce RNA competition effects. This competition between competing
endogenous RNA (ceRNA) species for their miRNAs is termed 'crosstalk' or the ceRNA effect. (b) We study a minimal model with only one 'sender', one transmitting miRNA and
one 'receiver' under simple mass-action kinetics to computationally ascertain how a change
in the sender influences the miRNA pool and how the corresponding change in the miRNA
pool influences the receiver under reasonable biochemical binding parameters. Steady state
concentrations in the system are obtained by fixing all parameters but the transcription rate
of the sender. All binding parameters are assumed equal between sender and receiver. Sender
and receiver expressions are normalized by their (equal) dissociation constants. Numerical
simulations of the model show that bound miRNA-target complexes are formed and free
miRNA declines as more sender target sites are introduced into the system until the sender
saturates the miRNA pool. Maximal change requires free miRNA concentration around the
dissociation constant (Kd) of sender binding sites. Inset contains the derivative of miRNA
concentrations with respect to sender concentrations, which is always negative because an
increase in the sender always causes an increase in the level of bound miRNA-target complex.
(c) Under repression by miRNAs, the
receiver levels decline upon increase of miRNA levels until they are maximally repressed.
Inset contains the derivative of receiver concentrations with respect to miRNA concentrations; it is always negative, because an increase in [miRNA] always increases repression of the receiver, and has a peak around
the Kd of the receiver. (d) Combining the dynamics in (b) and (c) we obtain the response of the
receiver to sender levels. The receiver is sensitive to variations in the level of its competitor
(sender) via the change of the free miRNA concentration [miRNA], and is progressively
derepressed as the sender starts to sequester the miRNA. Its derivative is what we refer to
as the crosstalk strength (CS), i.e. the relative change in the free levels of the receiver upon
a relative change in the sender. The inset depicts the crosstalk strength in this model (parameter set). The crosstalk strength increases in the regime where free and bound molecules
have similar concentrations. Crosstalk is bounded by 1 because it is the product of two factors
that are each less than 1: the fraction of miRNAs bound by the sender and the change in
repression of the receiver upon changes in its target pool.
2.1.2 Quantification of crosstalk following siRNA knockdown of sender
Previous studies of ceRNAs have focused on only one sender or on only a few targets of a
miRNA, even though a perturbation in ceRNA levels that changes miRNA activity would
be expected to affect many hundreds of genes. To obtain a more comprehensive view of
the effects of sender knockdown in an endogenous system, we knocked down a sender using
siRNA and quantified the concomitant changes in the transcriptome using RNAseq (Figure
2.2a). These experiments were performed in triplicate using siRNA pools (a combination
of four independent siRNAs), which are specifically designed to achieve strong target
knockdown and minimize off-target effects. We chose to knock down Pten, Vapa and Cnot6l
because each of them has previously been shown to be a strong sender of crosstalk [Tay 2011];
moreover, they are targeted by many different validated miRNA families [figure], each
of which, in turn, targets many different RNAs. This allowed us to simultaneously i) test
thousands of possible sender-receiver pairs for crosstalk, ii) isolate the contribution
of specific miRNAs in transmitting crosstalk and iii) test the impact of shared miRNAs on
crosstalk. As any siRNA knockdown experiment has confounding direct and indirect effects
that are either a) due to off-target effects of siRNA transfection or b) not mediated through
competition for miRNAs but instead due to the changes in sender transcription (Pten,
for example, is a key antagonist of the PI3K-AKT/PKB signalling pathway), all our siRNA
knockdown experiments were performed in parallel with two essential controls:
a) with negative control siRNAs (gene expression levels following the knockdown were
compared to expression data collected from three replicates that were transfected with
negative control siRNA);
b) in the HCT 116 DICER -/- cell line.
The DICER -/- HCT 116 cell line has a deletion in exon 5 of the DICER enzyme, which
is crucial in the processing of mature microRNAs [Cummins 2006]; additionally, mature
microRNAs are known to be significantly depleted in these cells [Tay 2011]. We therefore expect crosstalk
to be significantly reduced in the DICER -/- cells for any putative ceRNA, as observed
previously [Tay 2011], allowing us to use this cell line as a control to eliminate non-miRNA-mediated fold changes.
After treating the cells with siRNA, we waited for 24 hours to ensure a strong knockdown,
extracted RNA and prepared RNA-sequencing libraries for each of the knockdowns. We
sequenced on an Illumina HiSeq 2500 at a depth of roughly 20-30 million short reads per
sample. We quantified gene expression in each condition using reads per kilobase per million mapped reads
(RPKM) normalization and averaging RPKM over three biological replicates. To remove
variability from low-abundance RNA species, we removed genes that had 0 read counts in
any library and measured fold changes for each gene between the sender knockdown libraries
and the negative control libraries. We achieved a direct knockdown of 70-80% for
each of the three senders. A representative RPKM expression scatter plot of siPten versus the
negative control sample (Figure 2.2b) shows that Pten is the most strongly differentially
expressed gene. We also confirmed that siRNA-mediated gene silencing is independent of
DICER processing and hence fully functional in the DICER -/- cell line, as comparable
knockdown fold changes for the senders were observed in the DICER -/- cells.
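The normalization and filtering steps described above can be sketched in a few lines. This is a minimal illustration and not the pipeline used for the thesis: the function names (`rpkm`, `expressed_in_all`, `knockdown_fraction`) and all numbers are assumptions introduced here for the example.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def expressed_in_all(counts_per_library):
    """Keep a gene only if it has at least one read in every library."""
    return all(count > 0 for count in counts_per_library)


def mean(values):
    return sum(values) / len(values)


def knockdown_fraction(control_rpkms, knockdown_rpkms):
    """Relative reduction of mean RPKM: (control - knockdown) / control."""
    control = mean(control_rpkms)
    return (control - mean(knockdown_rpkms)) / control


# Hypothetical numbers: a 2 kb gene with 1000 reads in a 25-million-read library.
example_rpkm = rpkm(read_count=1000, gene_length_bp=2000, total_mapped_reads=25_000_000)

# Hypothetical sender: three control and three knockdown replicates (RPKM).
example_knockdown = knockdown_fraction([20.0, 22.0, 18.0], [4.0, 5.0, 3.0])
```

With these toy inputs the sender is reduced by 80%, which is in the range of the knockdowns reported above.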
[Figure 2.2 image not reproduced. Panel (a): schematic of si-VAPA, si-PTEN and si-CNOT6L transfection of HCT 116 and DICER -/- cells, RNA extraction for RNAseq after 24 h, 3 biological replicates. Panel (b): gene expression scatter plots, x-axes "si-neg. control log2(RPKM)" and "si-neg. control RPKM". Panel (c): x-axes "PTEN Crosstalk Strength", "VAPA Crosstalk Strength" and "CNOT6L Crosstalk Strength".]
Figure 2.2 | siRNA knockdown of 3 different endogenous senders shows crosstalk
strength is bounded by 1. (a) Experimental system for quantifying crosstalk strength
genome-wide upon siRNA knockdown of either Pten, Vapa or Cnot6l in HCT116 and
miRNA-deficient DICER -/- HCT 116 cells. Each cell line was transfected with sender siRNA and negative control siRNA in parallel, and RNA was extracted after 24 hours.
For each sample, RNAseq libraries were created and transcript expression was quantified
by sequencing. All RNAseq experiments were performed with 3 biological replicates. (b)
RNAseq mean expression (in units of log2 RPKM) scatter plot for the Pten knockdown and
negative control in HCT 116 cells. Each dot represents the mean expression of one gene, for all genes
expressed at greater than 0.1 RPKM in the two libraries (n=13,700 genes). The direct fold
change in Pten (shown in green) due to the si-Pten knockdown was 80%. Crosstalk strength
for each receiver gene is defined as its fold change normalized to the fold change of Pten
(sender). Genes below the diagonal (purple line) have positive crosstalk strength as they
are reduced upon Pten knockdown. The right panel is a zoomed-in version that highlights
changes in genes with expression similar to Pten. The magnitude of the crosstalk strength of a gene can
be estimated as its relative distance from the diagonal compared to Pten's distance from
the diagonal. Genes marked in light blue have a lower crosstalk strength than those marked
in dark blue. Most genes fall along the diagonal and show no change in expression, i.e. no
crosstalk. In contrast, the previously known Pten ceRNAs Cnot6l and Vapa both have positive
CS and are marked in red for comparison. Expression is in units of RPKM. (c) Volcano
plot of crosstalk strength versus statistical significance (p-value) in each of the sender knockdowns. Crosstalk strength is bounded by 1 (dotted green line) but can have larger
negative values. CS=1 for each of the senders (black dots) by construction. P-values are
adjusted for multiple comparisons by the Benjamini and Hochberg false discovery rate (FDR)
method with α = 0.05.
2.1.3 Pervasive yet bounded mRNA Crosstalk upon siRNA knockdown
Different receivers will in general respond differently to a change in sender levels, depending
upon exactly which miRNAs are being sequestered by the sender and on the repressive effect
of those miRNAs, and thus can exhibit more or less crosstalk. We wished to quantify the crosstalk
strength between each sender and all its potential receivers for each of the 3 different siRNA
knockdown RNAseq datasets in the HCT and DICER cell lines. We defined the 'crosstalk
strength' of a receiver with respect to a sender, in the respective cell line/condition, as the
ratio of the relative change in the receiver levels after the sender knockdown to the relative change
in sender levels after its siRNA knockdown. For example, for HCT 116 cells, when the sender
is Pten, then for a receiver gene X we compare its mean expression in the negative control
(termed 'HN') replicates to its expression in the siPten (termed 'HP') biological replicates:
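$$
\mathrm{CS}_{\text{cells}=\mathrm{HCT},\ \text{receiver}=X,\ \text{sender}=\mathrm{Pten}}
= \frac{\text{fold change of gene } X \text{ in HN over HP}}{\text{fold change of Pten in HN over HP}}
= \frac{\left(X_{\mathrm{HN}} - X_{\mathrm{HP}}\right)/X_{\mathrm{HN}}}{\left(\mathrm{Pten}_{\mathrm{HN}} - \mathrm{Pten}_{\mathrm{HP}}\right)/\mathrm{Pten}_{\mathrm{HN}}}
$$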
This means that when the crosstalk strength is 0.1 and the sender levels are reduced by
80%, the receiver levels will be reduced by 8% through crosstalk. The crosstalk strength,
so defined, depends on the relative direction (sign) of the fold change. Genes with
positive crosstalk strength are reduced when the sender is knocked down, i.e. they
co-vary with the sender as implied by the ceRNA hypothesis. Genes that are upregulated on
sender knockdown will thereby have negative crosstalk strength and should not be considered
putative ceRNAs.
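The definition of crosstalk strength can be written out directly. The sketch below is illustrative only: the function names and the expression values are hypothetical, chosen to reproduce the worked case in the text (CS of 0.1 with an 80% sender knockdown).

```python
def relative_change(control, knockdown):
    """Relative reduction between negative control (HN) and knockdown (HP)."""
    return (control - knockdown) / control


def crosstalk_strength(receiver_hn, receiver_hp, sender_hn, sender_hp):
    """Relative change of the receiver divided by relative change of the sender."""
    return relative_change(receiver_hn, receiver_hp) / relative_change(sender_hn, sender_hp)


# Hypothetical mean RPKM values.
sender_hn, sender_hp = 10.0, 2.0  # sender reduced by 80%

# Receiver reduced by 8%: positive CS, co-varies with the sender.
cs_down = crosstalk_strength(50.0, 46.0, sender_hn, sender_hp)

# Receiver increased by 10%: negative CS, not a putative ceRNA.
cs_up = crosstalk_strength(50.0, 55.0, sender_hn, sender_hp)

# The sender measured against itself has CS = 1 by construction.
cs_sender = crosstalk_strength(sender_hn, sender_hp, sender_hn, sender_hp)
```

Because the sender's own relative change is the denominator, a receiver only reaches CS = 1 if it is reduced by the same fraction as the sender.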
We calculated the crosstalk strength for all 13,700 expressed genes in each of the
sender libraries as described above and examined its distribution in the HCT 116 and
DICER cell lines. Most genes showed no expression change on knocking down the Pten,
Vapa or Cnot6l senders; thus the CS distribution was centered around zero in both HCT and
DICER. As expected from the reduction of miRNA activity in DICER -/-, the distribution
of CS in DICER was substantially shifted towards smaller values than CS in HCT (Figure
2.3). Strikingly, however, crosstalk strength in all the conditions was bounded by +1: almost no genes were down-regulated more than the sender itself, i.e. receiver gene
expression fold changes were smaller than the 70-80% fold change of the sender (Figure
2.2c). Hundreds of genes had statistically significant (p<0.05) CS between 0.1 and 0.5, but
relatively few had greater crosstalk strength that was also significant. We obtained the statistical significance of each gene's crosstalk strength by calculating z-values from our replicates
and using the Benjamini-Hochberg method to adjust p-values for multiple comparisons. Interestingly, genes with negative crosstalk strengths had comparatively higher p-values
(more replicate variability), indicating that they tended to be expressed at lower levels. Taken
together, our prediction that crosstalk strength should be bounded was supported by the
genome-wide expression data.
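For readers unfamiliar with the Benjamini-Hochberg adjustment, the following is a minimal sketch of the procedure. How the z-values were computed from the replicates is not specified here, so the conversion below simply assumes a two-sided test against a standard normal distribution; the function names and the example p-values are illustrative assumptions.

```python
import math


def two_sided_p_from_z(z):
    """Two-sided p-value for a z-score under a standard normal (assumed null)."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four genes.
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adj]
```

In this toy case only the first gene remains below 0.05 after adjustment, although three of the four raw p-values were below that threshold.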
[Figure 2.3 image not reproduced. Panel (a): distributions of HCT CS and DICER CS for each sender; x-axes "PTEN Crosstalk Strength", "VAPA Crosstalk Strength" and "CNOT6L Crosstalk Strength".]
Figure 2.3 | Crosstalk is miRNA-mediated and pervasive on a genome-wide
scale. (a) Probability density of the crosstalk strength distribution in both HCT (black)
and DICER (red) for all genes expressed above 0.1 RPKM in each of the 3 senders. Observed
crosstalk strength in all of the knockdowns is always less than 1. CS is higher in HCT cells
compared to DICER for many genes, and more genes have negative CS in DICER. The
inset shows the same distributions, but with the number of genes whose CS HCT > CS
DICER calculated for each of the 30 bins across HCT CS. This indicates that hundreds of
genes exhibit miRNA-mediated crosstalk across the genome for each of the three senders
with low to moderate crosstalk strength (0.1<CS<0.5).
To determine whether or not these numerous positively crosstalking genes were indeed
miRNA mediated, we chose only those genes whose crosstalk strength in HCT116 was
greater than that in DICER. We found such genes across a range of crosstalk strengths,
from low (n=440 Pten ceRNAs with CS=0.1) to high (n=65 Pten ceRNAs with
CS=0.5) (Figure 2.3), indicating that putative ceRNAs were found pervasively across the
transcriptome. In addition, on examining the expression range of these putative ceRNAs,
we found that they were expressed across 3 orders of magnitude. These included some previously discovered ceRNAs (Cnot6l, Serinc1, Vapa, Zeb2) but also hundreds of novel ceRNAs
(Figure ??). A GO-term analysis for putative Pten ceRNAs showed significant enrichment
for a range of biological processes including "protein phosphorylation" and "regulation of phosphate metabolic process" (Table 2.3), which are also GO terms linked to the functional role
of Pten, which acts as a tumor suppressor through the function of its phosphatase protein
product.
To assess whether these putative ceRNAs were actually responding to changes in miRNA
levels due to depletion of the sender, we performed a miRNA enrichment analysis. miRNA-mediated crosstalk would require that these putative ceRNAs are enriched in binding sites for the miRNAs that target their particular senders. Indeed, we found many sender-targeting miRNAs that
are enriched in their respective putative ceRNA lists (Table 2.1). These include miRNA
families (miR-17, miR-19, miR-93, miR-26) previously implicated in Pten ceRNA networks
[Poliseno 2010]. Intriguingly, we also found statistically significant miRNA enrichment in
these ceRNA sets for miRNAs that are not known to have binding sites on the sender,
suggesting that ceRNA effects can propagate via the interconnected miRNA-target networks.
2.1.4 Crosstalk strength correlates with the number of shared binding sites
We reasoned intuitively that if miRNAs of different families are sequestered by a sender, then
each miRNA released upon sender knockdown would repress its targets independently,
thus amplifying any crosstalk between the sender and receivers which share binding sites.
Indeed, our model, along with others [Ala 2013], suggests that crosstalk depends on the
overlap of miRNA-binding sites between senders and receivers. Specifically, it increases
with the number of shared miRNA response elements (MREs). To test this hypothesis, we first
tested the weaker claim: genes that share multiple miRNA binding sites with the sender
must have greater crosstalk strength than the set of all genes. The second, stronger claim
we tested was: the more miRNA binding sites a receiver shares with the sender, the more
its crosstalk strength ought to increase.
We tested the weaker claim by ranking genes exclusively by the number of shared miRNAs
with the sender (independently for Pten, Vapa and Cnot6l). We counted all the predicted
TargetScan binding sites shared between any given mRNA and the sender, and
then ranked this list of genes by the number of shared binding sites. We thus obtained a list
of the top 500 Pten, Vapa and Cnot6l "shared miRNA predicted ceRNAs". We then compared
the CS of these genes in HCT to that in DICER cells, and found that their HCT CS
is significantly greater than their DICER CS for Pten and Vapa but not for Cnot6l (Figure
2.5a). This suggests that our measurement of crosstalk strength was miRNA dependent and
supports the hypothesized correlation between shared binding sites and crosstalk
strength. To rule out a systematic CS bias in HCT versus DICER, we also
compared CS distributions within the three HCT 116 sender libraries. We found that the HCT
crosstalk strength of these "top 500 shared miRNA predicted ceRNAs" was significantly
greater than that of the control set (consisting of all genes) for Pten and Vapa but not for Cnot6l (Figure
2.5b).
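The ranking step used for the weaker claim can be sketched as follows. This is an illustration under simplifying assumptions: it counts shared miRNA families as a stand-in for shared predicted binding sites, and the gene names, miRNA names and the helper names `shared_count` and `top_shared` are hypothetical rather than taken from the analysis code.

```python
def shared_count(sender_mirnas, gene_mirnas):
    """Number of miRNA families predicted to target both sender and gene."""
    return len(set(sender_mirnas) & set(gene_mirnas))


def top_shared(sender_mirnas, gene_to_mirnas, n=500):
    """Rank genes by shared miRNA count (ties broken by name) and keep the top n."""
    counts = {gene: shared_count(sender_mirnas, mirnas)
              for gene, mirnas in gene_to_mirnas.items()}
    ranked = sorted(counts, key=lambda gene: (-counts[gene], gene))
    return ranked[:n]


# Hypothetical predicted targeting tables.
sender = ["miR-17", "miR-19", "miR-26", "miR-93"]
predictions = {
    "geneA": ["miR-17", "miR-19", "miR-26"],  # 3 shared
    "geneB": ["miR-17", "let-7"],             # 1 shared
    "geneC": ["let-7"],                       # 0 shared
    "geneD": ["miR-19", "miR-93"],            # 2 shared
}
top2 = top_shared(sender, predictions, n=2)
```

The resulting gene list would then be compared against the all-genes background, for example with a one-sided Kolmogorov-Smirnov test as in Figure 2.5.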
We caution that not all of these computationally predicted genes that share miRNA
binding sites with a sender have positive CS. For example, 155 of the "top 500" genes
that share more than 3 miRNA binding sites with Pten have negative crosstalk strength,
demonstrating that computational methods of predicting ceRNAs have to be supplemented
by experimental tests because of the high number of false positives among TargetScan binding
site predictions.
Because the second claim is more quantitative than a simple comparison, we wished
to remove contamination from non-ceRNAs and required that our basic condition, HCT CS >
DICER CS, be met. To further increase stringency, we took this list of candidate ceRNAs
and required that they share at least four miRNA-binding sites with the sender. For Pten
we found 858 such genes and for Vapa 610 genes.
We then binned these receiver genes into quintiles of the number of shared miRNAs. For
each of these quintile gene-sets, we computed the median CS independently for each sender.
Consistent with the model, the crosstalk strength was shifted to higher levels in receivers
that share more and more miRNA binding sites with the sender (Figure 2.5c). The greater
CS of Pten may indicate a greater propensity to sequester miRNAs or a greater affinity
for miRNAs (see discussion). These results confirmed that shared miRNA binding sites play
a significant role in transmitting crosstalk between a sender and a receiver.
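The binning used for the stronger claim can be illustrated with a short sketch. It assumes each candidate receiver is represented as a pair (number of shared miRNAs, crosstalk strength), sorts by the first entry and splits the list into five roughly equal groups; the function name and the toy values are assumptions made for the example, and at least as many genes as bins are required.

```python
from statistics import median


def median_cs_by_quantile(genes, n_bins=5):
    """genes: list of (n_shared_mirnas, cs) pairs.

    Sort by number of shared miRNAs, split into n_bins roughly equal
    groups and return the median crosstalk strength of each group.
    """
    ordered = sorted(genes, key=lambda gene: gene[0])
    n = len(ordered)
    medians = []
    for b in range(n_bins):
        chunk = ordered[b * n // n_bins:(b + 1) * n // n_bins]
        medians.append(median(cs for _, cs in chunk))
    return medians


# Hypothetical receivers sharing 4 to 13 miRNAs with the sender.
toy = [(4, 0.10), (5, 0.12), (6, 0.20), (7, 0.22), (8, 0.30),
       (9, 0.32), (10, 0.40), (11, 0.42), (12, 0.50), (13, 0.52)]
quintile_medians = median_cs_by_quantile(toy)
```

A monotonic rise of the per-bin medians, as in this toy input, is the pattern reported for Pten and Vapa in Figure 2.5c.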
With this analysis, we found that Cnot6l shows no evidence of a dependence of crosstalk strength on the number of shared miRNAs (Figure 2.4): (i) the genes that share more than
8 miRNA binding sites with Cnot6l have lower crosstalk strength than those that share no
miRNA binding sites; (ii) there is no increase in crosstalk strength for genes binned by the
# of shared miRNAs with Cnot6l.
[Figure 2.4 image not reproduced. Recoverable labels: "all genes", "top 500 shared miRNA".]
Figure 2.4 | [caption partly illegible in the source] ... "top 500" genes (black) that share greater than 7 miRNA binding sites with CNOT6L ... distribution for all genes (grey). The median
crosstalk strength is smaller for "top 500" genes than that for all genes, indicating that CNOT6L
crosstalk is not dependent on the number of shared miRNAs.
[Figure 2.5 image not reproduced. Recoverable labels: HCT 116, DICER, "top 500 shared miRNA", "all genes"; x-axes "PTEN Crosstalk Strength" and "VAPA Crosstalk Strength"; "Bins (# of Shared miRNA)".]
Figure 2.5 | Crosstalk strength of receivers correlates with the predicted number
of miRNA binding sites shared with the sender. (a) and (b) Crosstalk strength is
higher for receivers that have the largest number of predicted miRNA binding sites in common with their respective senders, both between HCT116 and DICER and within HCT116
cells. (a) Cumulative distributions of crosstalk strength with respect to each sender for receivers that
share the most binding sites with the sender. The crosstalk strength distributions for this
set of genes are shown in HCT116 and DICER ("top 500 shared miRNAs" indicates the
ranked list of genes sharing 4 or more binding sites with the sender, see text).
These genes show a significant increase in CS_HCT116 compared to CS_DICER; p < 10^-9 for the
difference between the distributions, calculated by the one-sided Kolmogorov-Smirnov
(K-S) test. (b) Same as above, but the crosstalk strength distributions with respect to each sender are
for all genes in HCT116 and the set of "top 500 shared miRNA" genes, also in HCT116.
These genes show a significant increase in CS compared to the 'all genes' background set:
p < 10^-5 and p < 10^-4 for CS_Pten and CS_Vapa respectively (K-S test). (c) Genes that share the
most binding sites with the respective senders were grouped into bins based on their # of
shared binding sites (colored; the # of shared binding sites is indicated on the x-axis). Only those
receivers with CS_HCT116 > CS_DICER were selected. The median crosstalk strength in each bin
is reported (for each sender). The distribution of CS for each bin was significantly different
from the preceding bin, with all p-values less than 10^-3 (K-S test). Each bin had at least 90
genes.
2.1.5 miRNAs hierarchically contribute to transmitting crosstalk
With these quantitative genome-wide measurements of crosstalk strength, we next turned
to measuring the ability of a miRNA to transmit crosstalk. Given that different miRNAs
vary in their concentrations, binding affinities, target abundances, all of which modulate
their ability to transduce crosstalk, we wished to dissect their individual contributions.
To determine which miRNAs were involved in mediating crosstalk, and to what extent,
we developed a metric to quantify the bulk effect of sender knockdown upon the predicted
targets of a miRNA. Rather than evaluating the crosstalk strength for a particular target of
a miRNA, our metric characterizes the cumulative, concordant variation of all of its
target genes. Specifically, for each miRNA, we calculated the difference
between the median CS of its targets (genes that contain a predicted binding site for that
miRNA) and its non-targets (genes that don't contain a binding site for that miRNA):
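$$
\text{CT power}_{\text{miRNA}_i} = \operatorname{median}\left(\mathrm{CS}_{\text{targets of miRNA}_i}\right) - \operatorname{median}\left(\mathrm{CS}_{\text{non-targets of miRNA}_i}\right) \tag{2.7}
$$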
We term this shift in the CS distribution for targets vs non-targets the Crosstalk Power
of each miRNA (Figure 2.6a). Note that we don't require the crosstalk strength of a
particular gene to be statistically significant, as we are interested in the cumulative effect
of a miRNA on all its targets. Reassuringly, the crosstalk power of conserved miRNAs
that have known binding sites on Pten and Vapa (again, not for Cnot6l) is greater than
the crosstalk power of miRNAs that are not predicted to have binding sites on these
senders (Figure 2.6c). This suggests that crosstalk power can be used to discriminate between sender-targeting miRNAs and non-sender-targeting miRNAs. The crosstalk power of
the sender-targeting miRNA families is shown in Figure 2.6b. We found that miRNAs
differ considerably in their ability to transmit crosstalk, as exemplified by miR-374ab and
miR-875, which emerged as the miRNAs with the greatest Pten and Vapa crosstalk
power respectively. Strikingly, almost all Pten-targeting miRNAs have positive CT power,
including many miRNAs that have greater crosstalk power than miR-17, miR-19, miR-20a and
miR-26a, which have previously been shown to directly mediate crosstalk for Pten [Poliseno
2010].
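Equation 2.7 translates into a few lines of code. The sketch below is illustrative: `crosstalk_power`, the gene names and the CS values are hypothetical, and the target set stands in for the TargetScan predictions of a single miRNA.

```python
from statistics import median


def crosstalk_power(cs_by_gene, targets):
    """Median CS of predicted targets minus median CS of non-targets."""
    target_cs = [cs for gene, cs in cs_by_gene.items() if gene in targets]
    non_target_cs = [cs for gene, cs in cs_by_gene.items() if gene not in targets]
    return median(target_cs) - median(non_target_cs)


# Hypothetical crosstalk strengths for six genes with respect to one sender.
cs_by_gene = {"g1": 0.30, "g2": 0.20, "g3": 0.10,
              "g4": 0.00, "g5": -0.10, "g6": 0.05}

# Hypothetical predicted targets of one miRNA.
mirna_targets = {"g1", "g2", "g3"}

ct_power = crosstalk_power(cs_by_gene, mirna_targets)
```

A positive value means the miRNA's predicted targets are, as a group, reduced more strongly than other genes when the sender is knocked down; no individual gene needs to be significant.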
Pten therefore has the ability to transmit crosstalk by sequestering many different
miRNAs, allowing it to interact promiscuously with ceRNAs. In general, however, not all
miRNAs that are predicted to have binding sites on the senders necessarily have a positive
crosstalk power; nor do all miRNAs with positive crosstalk powers necessarily have binding
sites on the senders. For example, we uncovered 92 different miRNA with positive crosstalk
power for Pten and 67 different miRNA with positive crosstalk power for Vapa that do not
have any predicted binding sites on the respective genes.
One factor that our model suggests can influence the ability of a miRNA to transduce
crosstalk for a sender is the cumulative number of its binding sites sequestered by that
sender. A more effective sender of crosstalk would sequester many miRNAs (higher Kd)
but would only be weakly repressed by them, enabling a greater contribution to free miRNA
pools when the sender is perturbed. However, it is very difficult to experimentally quantify miRNA sequestration on mRNAs. We therefore estimated the sequestration fraction
bioinformatically for each of the miRNAs which target Pten (and similarly for Vapa and Cnot6l).
To do so, we calculated the ratio of the number of predicted TargetScan binding sites on
Pten (scaled by Pten's expression) to the number of predicted TargetScan binding sites for that miRNA
on all its other targets (scaled by their expression). This ratio quantifies, for each miRNA,
the fraction potentially sequestered by Pten. As expected from the model, we find that
miRNA crosstalk power for each sender is correlated with the fraction of miRNA
binding sites sequestered by the sender (Figure 2.6d). Notably, those miRNAs which are
sequestered by Pten the most also tend to have the greatest crosstalk power (miR-374ab,
miR-410).
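The sequestration estimate described above can be sketched as an expression-weighted ratio of site counts. The function name `sequestration_fraction` and the site counts and expression values are hypothetical; the sketch assumes every target has an expression value and that the miRNA has at least one target other than the sender.

```python
def sequestration_fraction(sender, sites_per_gene, expression):
    """Expression-weighted sites on the sender divided by those on all other targets.

    sites_per_gene: predicted site count per target gene for ONE miRNA.
    expression: expression level (e.g. RPKM) per gene.
    """
    weighted = {gene: n_sites * expression[gene]
                for gene, n_sites in sites_per_gene.items()}
    on_sender = weighted.get(sender, 0.0)
    on_others = sum(w for gene, w in weighted.items() if gene != sender)
    return on_sender / on_others


# Hypothetical predictions for a single miRNA.
sites = {"Pten": 2, "geneA": 1, "geneB": 3}
expr = {"Pten": 10.0, "geneA": 20.0, "geneB": 20.0}

fraction = sequestration_fraction("Pten", sites, expr)
```

Repeating this for every sender-targeting miRNA gives the x-axis values against which crosstalk power is compared in Figure 2.6d.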
[Figure 2.6 image not reproduced. Recoverable labels: "PTEN miRNA CT power", "VAPA miRNA CT power", "PTEN Crosstalk Strength", "miRNA Crosstalk power", "% miRNA sequestration"; correlation annotations rho = 0.37 and rho = 0.35.]
Figure 2.6 | Dissecting relative contributions of miRNAs in transmitting
crosstalk. (a) Histogram of Pten crosstalk strength in HCT116 for predicted TargetScan
targets of let-7 (red) and its non-targets (gray). The bulk contribution of let-7 in transmitting Pten crosstalk to all of its targets can be estimated by the difference in the medians
of the two distributions. We defined this difference as the "Crosstalk power" (CT power)
of the miRNA let-7 for the sender Pten. Crosstalk power can similarly be calculated for all
153 conserved miRNA families expressed in HCT cells from each of the sender crosstalk
strength distributions. CT power is larger for those miRNAs whose targets undergo a large
overall repression when the sender is knocked down. Only those genes with CS HCT > CS
DICER were considered. (b) miRNA CT power for all the miRNAs which target Pten and Vapa
shows the differential ability of sender-targeting miRNAs to transmit crosstalk. miRNAs
with negative CT power are those whose targets tend to be up-regulated when the sender
is knocked down, and are thus unlikely to be involved in the ceRNA effect. miRNAs which
have shared binding sites in all three senders are in bold. See [table] for miRNA CT
power and p-values for all 153 miRNA families. (c) Cumulative distributions of miRNA
crosstalk power for all the miRNAs which target the sender (red) compared to the miRNAs
which don't target the sender (black). (d) miRNA CT power for sender-targeting miRNAs
is correlated with the fraction of a miRNA's binding sites that lie on the sender, i.e. its sequestration fraction.
miRNAs with higher CT power also tend to be (relatively) highly sequestered by the
sender.
2.1.6 Pten miRNAs have the greatest crosstalk power due to high [miRNA]:Target abundance ratios
A recent study using Argonaute CLIP assays [Bosson 2014] has shown that higher miRNA:Target
ratios are correlated with higher Argonaute binding on genes and, consequently, greater
miRNA repression. It has been experimentally demonstrated that only the most abundant
miRNAs confer significant repression, suggesting that ceRNAs are free from miRNAs when
those miRNAs have low concentrations. Conversely, previous analysis of miRNA repression
showed that miRNAs with lower miRNA:Target abundance ratios deliver minimal repression
[Garcia 2011; Arvey 2010]. A possible explanation is that lowly expressed miRNAs have a low
probability of finding their target sites on transcripts, because miRNA-target encounters
occur by mass action. Additionally, when microRNAs that are expressed at a low level
have hundreds of different targets (i.e. have high target abundance), a single miRNA would
have a limited repressive impact on any one gene.
We sought to investigate differences in the relative miRNA and target levels for our three
senders. We first obtained miRNA expression profiles in HCT116 from a miRNA microarray
[Yan 2011] and found that the average expression of miRNAs which target Pten was greater
than that of Vapa- or Cnot6l-targeting miRNAs. Surprisingly, even though Cnot6l has many
more predicted miRNA binding sites than Pten (44 and 24 respectively; Figure 2.7a), the
average expression of Pten-targeting miRNAs is almost four times greater than the average
expression of Cnot6l-targeting miRNAs (Figure 2.7c).
We next estimated Target Abundances (TA) for each miRNA by summing the predicted
6-mer, 7-mer and 8-mer binding sites on each of its targets, scaled by the RPKM expression
of that target in HCT116 in our data [following Bosson 2014]. We averaged the target
abundance over the miRNAs targeting each of Pten, Vapa and Cnot6l. Interestingly, we found
the opposite hierarchy between the 3 senders: Pten had the lowest TA, while Cnot6l had an
average TA about 10-fold higher (Figure 2.7b). Thus, Pten has the greatest [miRNA]:TA
ratio of the three, allowing us to hypothesize that miRNAs targeting Pten might confer
greater repression on their targets, compared to Vapa and Cnot6l, rendering Pten ceRNAs
more susceptible to crosstalk.
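The target abundance estimate and the resulting [miRNA]:TA ratio can be sketched as below. This is an illustration in relative units only, consistent with the non-absolute nature of the data; the function names, site counts and expression values are assumptions, and genes missing from the expression table are treated as not expressed.

```python
def target_abundance(sites_per_gene, rpkm_by_gene):
    """Sum of predicted site counts (6-, 7- and 8-mer combined), weighted by target RPKM."""
    return sum(n_sites * rpkm_by_gene.get(gene, 0.0)
               for gene, n_sites in sites_per_gene.items())


def mirna_to_ta_ratio(mirna_expression, ta):
    """Relative [miRNA]:TA ratio (both quantities are in relative units)."""
    return mirna_expression / ta


# Hypothetical data for one miRNA.
sites = {"geneA": 2, "geneB": 1, "geneC": 4}
rpkm_by_gene = {"geneA": 10.0, "geneB": 5.0}  # geneC not expressed
ta = target_abundance(sites, rpkm_by_gene)

# Hypothetical microarray signal for the same miRNA.
ratio = mirna_to_ta_ratio(50.0, ta)
```

Averaging TA and miRNA expression over all miRNAs that target a given sender gives the per-sender quantities compared in Figure 2.7.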
[Figure 2.7 image not reproduced. Recoverable labels: PTEN, VAPA, CNOT6L; "Targeting miRNA", "Background miRNA"; "[miRNA]", "Target Abundance".]
Figure 2.7 | Greater miRNA:Target ratios underlie Pten's superior ability to
send crosstalk. (a) TargetScan-based prediction of the number of different miRNA families targeting each of Pten, Vapa and Cnot6l. (b) Average target abundance of sender-targeting miRNAs (in log10 units). Target abundance (TA) for each of the 153 human
conserved miRNAs was calculated (see methods) by summing the predicted 6-mer, 7-mer
and 8-mer binding sites on each of its targets, scaled by target expression in HCT116.
(c) Average miRNA expression for each set of sender-targeting miRNAs. miRNA expression values in
HCT116 cells are from a miRNA microarray dataset [Yan 2011] and are in relative units.
Pten is targeted by highly expressed miRNAs compared to Cnot6l. (d) Median crosstalk
power for all miRNAs which target Pten, Vapa and Cnot6l respectively. Pten miRNAs have
greater crosstalk power. Crosstalk powers for each miRNA (for each sender) were calculated
from the crosstalk strength distribution as in the text. (e) miRNAs with greater crosstalk
power also have a higher [miRNA]:Target ratio, as exemplified by Pten, which has the greatest
[miRNA]:Target ratio and miRNA crosstalk power of the three senders. Targeting miRNAs
(red) are all those miRNAs which target the sender. Background miRNAs (white) are those
miRNAs that don't have predicted binding sites on the sender.
We used miRNA crosstalk power as a proxy for repression and ranked the senders by
our experimentally determined crosstalk power for each miRNA. Pten clearly emerged as
the best sender of crosstalk: its miRNAs had much greater crosstalk power than those of the
other two senders, Vapa and Cnot6l (Figure 2.7d), with about twice the median crosstalk power
of Vapa's miRNAs. In fact, for each sender, we observed that the miRNAs which target it
had both greater [miRNA]:TA and greater crosstalk power, on average, than the background set of
miRNAs that did not target them (Figure 2.7e).
Thus, we conclude that senders such as Cnot6l, which are targeted largely by low-abundance miRNAs with comparatively more targets, have a much weaker ability to transmit
crosstalk than a sender such as Pten. However, we are only making a comparative claim between the senders: as both the miRNA expression data and the target abundance
estimates are in non-absolute concentrations, we cannot be certain that miRNA concentrations are in excess of the target pool size or vice versa. Moreover, we observed no
correlation between the [miRNA]:TA ratios and the miRNA crosstalk power for just the
Pten miRNAs, only an overall correspondence between the median miRNA crosstalk power and the
average [miRNA]:TA of the three senders. We found individual miRNAs that had high Pten
miRNA crosstalk power but a low [miRNA]:TA, and vice versa.
2.1.7 Transfecting Pten UTR as a sponge de-represses putative ceRNAs in a dose-dependent and miRNA-dependent manner
As Pten emerged as the strongest sender of crosstalk in the siRNA knockdown experiments,
we wanted to exclude the possibility that transcriptional regulation via PTEN protein, a
tumour suppressor and a key member of the PI3K-mTOR pathway, may have been a
factor in the widespread crosstalk changes observed. We sought to clarify two questions:
a) whether miRNA binding sites on the Pten 3' UTR were directly responsible for the
crosstalk effects; b) to what extent Pten ceRNAs would be de-repressed by modulating the
amount of Pten 3' UTR, i.e. by varying the levels of MREs supplied by an endogenous 3' UTR.
[Figure 2.8 image not reproduced. Recoverable labels: pTRE-Tight promoter with PTEN 3'UTR and NULL 3'UTR constructs; "20% transfection efficiency"; x-axis "log10 mCherry [a.u.]".]
Figure 2.8 | Derepression of Pten ceRNAs is detected upon modulating the levels
of Pten 3' UTR with a transiently transfected synthetic reporter construct. (a)
A synthetic two-color reporter construct for measuring the effect of Pten 3' UTR sponging
in single cells. The construct consists of a bidirectional tetracycline-responsive promoter
that drives the transcription of two fluorescent reporter proteins: ZsGreen and mCherry.
We fused the Pten 3' UTR to ZsGreen, and the unmodified plasmid is used as a control (NULL 3'
UTR). (b) Flow cytometry measurement of HCT116 cells transiently transfected with the Pten
3' UTR sponge plasmid and induced with doxycycline for 18 h (cells positive for plasmid
are in purple) indicates robust expression of the Pten 3' UTR sponge across 3 decades.
Transfection efficiency is about 20%. (c) The Pten 3' UTR is under robust repression throughout
the plasmid expression range. It exerts a strong influence on ZsGreen levels, as seen by the
difference in transfer function between the Pten 3' UTR and NULL 3' UTR transfections. Cells were
binned by mCherry expression and the mean ZsGreen expression was calculated for each bin.
(d) Total RNA from sorted cells (purple in (b)) carrying the Pten 3' UTR plasmid was probed
for the expression of known Pten ceRNAs with RT-PCR. Expression of the indicated genes was
normalized to their expression in the un-transfected cells. Data are mean ± sem.
Constructing and transfecting Pten 3'UTR reporter sponge into HCT116 cells
The Pten 3'UTR contains predicted sites for 25 different miRNA families
(http://www.targetscan.org) and there is direct evidence from RNA immunoprecipitation for Pten
regulation by miR-106ab, miR-130 and the miR-17-92 cluster (which encodes the microRNAs
miR-17, -18, -19a, -19b, -20a, and -92) in HCT 116 (Tay, 2011). To explore the genome-wide
effects on competing RNAs of sponging away miRNAs with an endogenous 3'UTR, we
adapted a plasmid-reporter system previously developed in our lab [Mukherji 2011].
The plasmid contains two genes that encode fluorescent proteins (ZsGreen and mCherry),
which are transcribed at identical levels from a common bidirectional tetracycline-inducible
promoter, and it contains multiple-cloning sites for inserting any 3'UTR of interest (Figure 2.8a).
To probe the effect of microRNA sponging, we constructed a variant carrying the entire
Pten 3'UTR fused to the ZsGreen gene. We transfected this plasmid into HCT116 cells and
used the original plasmid (without the Pten 3'UTR) as a control (we call this the NULL UTR).
The mCherry/ZsGreen fluorescence from the NULL UTR construct allows us to isolate the
effect of the Pten 3'UTR sponge alone. In order to induce the promoter with doxycycline,
we co-transfected these cells with the rtTA plasmid, as HCT 116 does not endogenously
produce the rtTA transcription factor. On quantifying single-cell fluorescence 18 hours later
using a flow cytometer, we observed robust expression of the Pten 3'UTR sponge construct
across 3 orders of magnitude (Figure 2.8b). In principle, plasmid induction starts immediately
after the addition of doxycycline, but we observed more derepression of the confirmed Pten
ceRNAs Vapa, Cnot6l and SERINC1 at 18h than at 12h or 36h [supp figure], possibly due to
miRNA degradation timescales.
To ascertain whether overexpression of the Pten 3'UTR from a synthetic construct was capable
of sponging away endogenous miRNAs, and thus derepressing other targets, we measured its
effect on some previously established Pten ceRNAs. Taking the ratio of ZsGreen to
mCherry across bins of mCherry fluorescence in the flow cytometry measurements of
the Pten 3'UTR and NULL plasmid transfections allowed us to calculate the fold repression
of the Pten 3'UTR across the transfection range. We observed that the Pten 3'UTR was under
weak repression (up to 2-fold) throughout the transfection range (Figure 2.8c). FACS sorting
only the mCherry-expressing cells and measuring the bulk RNA levels of the Pten
ceRNAs Vapa, Cnot6l and SERINC1 with RT-PCR showed that they were de-repressed [figure]
by 40-80%, which confirmed that the Pten 3'UTR sponge was functionally engaging the
miRNAs in the cell and competing with its known ceRNAs (Figure 2.8d). Additionally,
in the RNAseq analysis we could discriminate between reads from the Pten UTR and from the
Pten coding sequence, and found that Pten mRNA (cds) was increasingly de-repressed as the
exogenous Pten 3'UTR sponged away miRNAs from the endogenous Pten mRNA, confirming
our observation that the Pten 3'UTR was under mild repression [figure 2.10 b]. Having observed
an increase in Pten (coding sequence) expression throughout all the bins, we reasoned that
the transfected Pten UTR sponges could also derepress other potential ceRNAs across the
transcriptome.
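As an illustration of the fold-repression calculation described above, the following is a minimal Python sketch and not the analysis code used in this work. It follows the procedure stated in the caption of Figure 2.8c (cells binned by mCherry, mean ZsGreen per bin) and takes the NULL-to-Pten ratio of those means. The function names, the bin width and the toy fluorescence values are assumptions made for the example.

```python
import math
from collections import defaultdict


def mean_zsgreen_by_bin(cells, bin_width=0.5):
    """Bin cells by log10(mCherry) and return {bin_index: mean ZsGreen}.

    `cells` is a list of (mcherry, zsgreen) fluorescence pairs in a.u.
    `bin_width` is the (assumed) bin size in log10 units.
    """
    sums = defaultdict(lambda: [0.0, 0])  # bin -> [sum of ZsGreen, cell count]
    for mcherry, zsgreen in cells:
        if mcherry <= 0:
            continue  # log10 is undefined for non-positive readings
        b = math.floor(math.log10(mcherry) / bin_width)
        sums[b][0] += zsgreen
        sums[b][1] += 1
    return {b: total / n for b, (total, n) in sums.items()}


def fold_repression(utr_cells, null_cells, bin_width=0.5):
    """Per-bin fold repression = mean ZsGreen (NULL) / mean ZsGreen (3'UTR)."""
    utr = mean_zsgreen_by_bin(utr_cells, bin_width)
    null = mean_zsgreen_by_bin(null_cells, bin_width)
    return {b: null[b] / utr[b]
            for b in sorted(utr) if b in null and utr[b] > 0}


# Toy data (hypothetical): the 3'UTR halves ZsGreen at every mCherry level.
null_cells = [(2e3, 2000.0), (3e3, 3000.0), (2e4, 20000.0)]
utr_cells = [(2e3, 1000.0), (3e3, 1500.0), (2e4, 10000.0)]
print(fold_repression(utr_cells, null_cells))
```

Only bins populated in both transfections are compared, which mirrors the need to read the two transfer functions at the same mCherry level.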
FACS sorting cells with varying amounts of Pten 3'UTR for RNASeq Assay
In order to isolate cell populations expressing varying amounts of Pten 3'UTR, we used FACS
sorting. We used mCherry fluorescence intensity to bin cells with similar transcriptional
activity (which varies between cells, e.g. due to differing plasmid copy numbers), and hence
similar levels of the Pten 3'UTR sponge. For both the Pten 3'UTR and NULL 3'UTR
transfections, we then FACS sorted 100,000 live cells into each of 4 different bins spanning
3 orders of magnitude (Figure 2.10a; see Methods), extracted RNA (500-1000 ng per bin) and
quantified the transcriptome of each bin using RNA sequencing. Because plasmid expression
accounted for up to 30% of the total reads in bins 2 and 3 (Figure 2.9b) and, moreover,
because repression made the Pten 3'UTR and ZsGreen expression very different between
corresponding Pten UTR and NULL bins, estimating fold changes was not straightforward.
Even after explicitly removing the reads coming from the plasmid and performing RPKM
normalization, we observed an overall offset in the distribution of fold changes
(Pten UTR/NULL) in bins 1, 2 and 3 (Figure 2.9c). We therefore used the more appropriate
TMM (trimmed mean of M-values) normalization method [Robinson 2010] to estimate a scale
factor that removes this offset. After doing so, we could reliably measure the fold changes
of the transcriptome in each bin, defined as the ratio of the TMM-normalized values in each
Pten 3'UTR bin to the TMM-normalized values in the corresponding NULL UTR bin.
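The offset estimation can be sketched as follows. This is a simplified, unweighted trimmed mean written for illustration, not the code used for the thesis; the published TMM method [Robinson 2010] additionally weights each gene by its estimated precision. The trimming fractions (30% from each end of M, 5% from each end of A) are taken from the Figure 2.9 caption; the function name and the toy counts are assumptions.

```python
import math


def tmm_log2_offset(utr_counts, null_counts, trim_m=0.30, trim_a=0.05):
    """Unweighted trimmed mean of M-values between two libraries.

    utr_counts / null_counts: dicts {gene: read count} for the Pten UTR
    and NULL UTR libraries of one bin. Returns the log2 offset that
    should be subtracted from every gene's M value.
    """
    n_utr = sum(utr_counts.values())
    n_null = sum(null_counts.values())

    genes = []
    for g, y_utr in utr_counts.items():
        y_null = null_counts.get(g, 0)
        if y_utr == 0 or y_null == 0:
            continue  # fold change undefined when either count is zero
        m = math.log2((y_utr / n_utr) / (y_null / n_null))
        a = 0.5 * math.log2((y_utr / n_utr) * (y_null / n_null))
        genes.append((g, m, a))

    def kept(pairs, trim):
        # Keep genes after dropping the lowest and highest `trim` fraction.
        ranked = sorted(pairs, key=lambda t: t[1])
        lo = math.floor(len(ranked) * trim)
        return {g for g, _ in ranked[lo:len(ranked) - lo]}

    keep = (kept([(g, m) for g, m, _ in genes], trim_m)
            & kept([(g, a) for g, _, a in genes], trim_a))
    ms = [m for g, m, _ in genes if g in keep]
    if not ms:
        raise ValueError("no genes left after trimming")
    return sum(ms) / len(ms)


# Toy libraries (hypothetical): identical endogenous expression, but the
# UTR library spends half of its reads on a plasmid transcript.
null_lib = {f"gene{i}": 100 for i in range(10)}
null_lib["plasmid"] = 0
utr_lib = {f"gene{i}": 100 for i in range(10)}
utr_lib["plasmid"] = 1000
offset = tmm_log2_offset(utr_lib, null_lib)
print(offset)  # endogenous genes appear 2-fold down before correction
```

Subtracting the returned offset from each gene's M value recentres the unchanged majority of genes on zero, which is the correction applied to each bin.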
Now that we could reliably infer the magnitude of fold changes caused by the sponging
effect of the Pten UTR, we decided to explore the concordance between genes identified as
putative Pten ceRNAs by the siPten knockdown and genes derepressed by Pten UTR
overexpression. Such genes would be sensitive to the levels of Pten, and so would be
very likely to interact with Pten through the crosstalk mechanism. To identify these
genes, we first obtained the distribution of null fold changes from technical replicates in both
the siPten knockdown and the Pten 3'UTR RNAseq data, and defined null thresholds for the
fold change (FC) and the crosstalk strength (CS) of 1 standard deviation above 0 (Figure 2.10c).
We only considered genes whose FC and CS were above these thresholds. These genes were
therefore both reduced when Pten was knocked down and de-repressed when miRNAs were
sponged away by the Pten 3'UTR, making them sensitive to perturbations in Pten levels in
both directions. We found
2305 genes meeting these criteria in bin 0, 2493 in bin 1, 2090 in bin 2 and 2470 in bin 3.
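The selection rule above amounts to thresholding two per-gene quantities and intersecting the results. A minimal Python sketch follows, under stated assumptions: CS and FC are given as log-scale values per gene, positive CS means the gene fell when Pten was knocked down, and the null threshold is one (population) standard deviation of replicate-versus-replicate changes. All names and numbers are hypothetical.

```python
import statistics


def null_threshold(null_changes):
    """One standard deviation of the null (replicate vs replicate) changes."""
    return statistics.pstdev(null_changes)


def robust_cernas(cs, fc, cs_null, fc_null):
    """Genes whose knockdown CS and sponge FC both exceed their null thresholds.

    cs, fc: dicts {gene: value}; cs_null, fc_null: lists of null changes.
    """
    cs_cut = null_threshold(cs_null)
    fc_cut = null_threshold(fc_null)
    shared = cs.keys() & fc.keys()  # genes measured in both experiments
    return sorted(g for g in shared if cs[g] > cs_cut and fc[g] > fc_cut)


# Toy example (hypothetical values).
null_changes = [-0.1, 0.1, -0.1, 0.1]                 # threshold is about 0.1
cs = {"A": 0.5, "B": 0.05, "C": 0.4, "D": -0.3}
fc = {"A": 0.3, "B": 0.6, "C": 0.02, "E": 0.9}
print(robust_cernas(cs, fc, null_changes, null_changes))
```

In the toy data only gene A passes both cuts: B fails the CS threshold, C fails the FC threshold, and D and E are each measured in only one experiment.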
[Figure 2.9: panels (a)-(d). Labels recoverable from the original image: "RPKM normalization" and "TMM normalization" M-A plots for BIN 0 to BIN 3; x-axis A = 0.5*log2(WT*NULL); legend PTEN UTR and NULL UTR.]
Figure 2.9 | Normalization is required for FACS-sorted RNAseq data, as reads
from the plasmid occupy a large percentage of total sequencing reads, leading to an
overall offset in fold changes. (a) Schematic of two libraries A and B, with a small set
of genes in library B having an enormous number of sequencing reads, thereby reducing the
sequencing "real estate" for the rest of the genes. An overall scale factor is required to
normalize the library sizes. (b) Proportion of plasmid reads (mCherry + ZsGreen + Pten 3'UTR)
among the total sequencing reads from the indicated sorted bins (in the Pten 3'UTR and the
NULL 3'UTR data sets). Total RNA output from each bin is quite different, with reads from the
plasmid taking up increasing sequencing real estate. (c) M (fold change) versus A (average
expression) plots comparing RPKM values from the Pten UTR and NULL datasets for each bin show a
clear offset from zero in bins 1, 2 and 3 (left panel). Genes indicated in red are in the middle
40% of M values and middle 90% of A values, which are used to estimate the TMM factor
as described in the Methods. The green line shows the estimated TMM factor and is offset
from zero in bins 1, 2 and 3. The panel on the right contains the same M-A plots with the offset
removed after normalizing the fold changes with the TMM factors. The y-axis is in log scale.
(d) Estimated TMM normalization factors from (c), used to normalize the library size of
the respective bin.
We hypothesized that these genes were most likely to be 'robust Pten ceRNAs', and thus
would show a bin-dependent signature of crosstalk. We observed increasing derepression of
these robust Pten ceRNAs as more Pten UTRs were expressed in the system (Figure
2.10d). Notably, these robust Pten ceRNAs have a median fold change of 0.19 even when
very few Pten UTR sponges are present (bin 1), and the median fold change increases to
0.27 when 10^3-fold more Pten UTR sponges are present (bin 3).
In order to verify that the fold changes in the transcriptome that we observed with
the Pten 3'UTR plasmid were miRNA-dependent, we examined whether the magnitude of
derepression in each bin was correlated with the overall functional efficiency of the miRNA
binding sites (based on the context+ score of the site). We relied on our siPten knockdown
data to select those miRNAs that were involved in Pten crosstalk with high confidence.
We had ascertained that miR-17, miR-19a, miR-20a and miR-130 were both
strongly involved in transmitting Pten crosstalk to their other targets and highly
enriched in the putative Pten ceRNAs. Moreover, they have been shown to physically bind
to the Pten 3'UTR by RNA immunoprecipitation (RIP assays) [Poliseno 2010]. With our
cleaner Pten 3'UTR overexpression system, we investigated whether these miRNAs were
being directly sequestered by the Pten 3'UTR, thus derepressing their targets, as such a
dependence would result in bin-dependent increases in the fold changes of their targets. We
analyzed the bin-dependent derepression of the targets of these miRNAs
as a function of site number, site type (6-, 7- and 8-nt sites), site position, and the other
determinants used by TargetScan to calculate the total context+ scores of predicted miRNA
targets [Lewis 2005; Garcia 2011]. Binding sites with greater context+ scores have been shown
to be more effectively bound by miRNAs and repressed. When predicted targets of miR-17, miR-19a,
miR-93 and miR-130 were distributed into 4 context+ score bins and the distribution of
fold changes was plotted, increasing target derepression with context+ score was clear in
bins 2 and 3, but not in bins 0 and 1 (Figure 2.10e). Thus, higher-affinity miRNA binding sites
in a receiver lead to greater crosstalk strength even for a fixed amount of the sender
(PTEN 3'UTR level) within a bin.
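The grouping of targets by context+ score can be sketched as below. This is an illustrative Python fragment, not the analysis code of this work. It assumes TargetScan's convention that more negative context+ scores indicate stronger predicted repression, splits the targets into four equal-sized groups, and reports the median log fold change per group; the function name and the toy values are hypothetical.

```python
import statistics


def median_fc_by_context_bin(targets, n_bins=4):
    """Median log fold change per context+ score group, weakest sites first.

    targets: list of (context_plus_score, log_fold_change), one per gene.
    Requires at least n_bins targets.
    """
    # Rank from weakest (score nearest 0) to strongest (most negative).
    ranked = sorted(targets, key=lambda t: t[0], reverse=True)
    size = len(ranked) / n_bins
    medians = []
    for i in range(n_bins):
        chunk = ranked[round(i * size):round((i + 1) * size)]
        medians.append(statistics.median(fc for _, fc in chunk))
    return medians


# Toy targets (hypothetical): derepression grows with site strength.
targets = [(-0.05, 0.00), (-0.10, 0.02), (-0.20, 0.10), (-0.25, 0.12),
           (-0.30, 0.20), (-0.35, 0.22), (-0.50, 0.40), (-0.60, 0.50)]
print(median_fc_by_context_bin(targets))
```

A miRNA-dependent sponge effect would show up as medians that rise from the first group to the last, as seen for bins 2 and 3 in Figure 2.10e.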
[Figure 2.10: panels (a)-(e). Labels recoverable from the original image: mCherry, PTEN UTR and ZsGreen constructs; transfection + flow sorting; BIN 0 to BIN 3 along log(mCherry); fold-change distributions for BIN 1, BIN 2 and BIN 3 (WT/NULL); target panels for miR-17, miR-19a, miR-93 and miR-130 with x-axis "Log2 Fold Change".]
Figure 2.10 | Transfecting Pten UTR as a sponge derepresses putative ceRNAs
in a dose-dependent and miRNA-dependent manner. (a) Schematic of FACS sorting: cells are
transfected with bidirectional plasmids expressing mCherry and ZsGreen, with the
Pten 3'UTR or without it (NULL 3'UTR). The transfected cells are sorted on the flow
sorter into 4 different bins depending on mCherry expression and collected for downstream
RNAseq. (b) Expression (in RPKM) of mCherry, ZsGreen, Pten 3'UTR and Pten coding
sequence in each bin for the cells transfected with the Pten 3'UTR plasmid. The Pten coding
sequence (RPKM) is increasingly upregulated in each bin, indicating that the Pten 3'UTR plasmid
is capable of sponging away miRNAs. (c) Distribution of RNAseq
fold changes for bin 1 (WT/NULL) and of Pten CS. We refine potential Pten ceRNAs by
intersecting the sets of genes repressed in the Pten knockdown and derepressed in the Pten UTR
transfection. A threshold for null changes, "σFC" or "σCS", is determined as 1 std. deviation
of the fold change in the technical replicates (gray bar). Only genes that have positive Pten
crosstalk strength (+CS) and positive fold changes in each bin are considered 'robust
Pten ceRNAs', as they are sensitive to Pten levels in both directions, i.e. they are reduced
when Pten is knocked down and de-repressed when Pten 3'UTR sponges are introduced.
(d) Cumulative distributions of fold change for genes in the intersection of the two
datasets. Inset shows the median fold change for robust ceRNAs in each bin. Robust ceRNAs
are increasingly derepressed in each bin. P-values for the difference in medians relative to
the preceding bin were calculated by the Wilcoxon rank-sum test (P<0.05 (bin 1), P<0.01 (bin 2),
P<10^-16 (bin 3)). (e) Cumulative distributions of fold changes for all targets of the indicated
Pten miRNAs (those enriched in the list of Pten ceRNAs from the knockdown dataset)
with increasing context+ scores (colour).
2.2
Discussion and conclusions
Recent experimental studies have suggested that miRNA-mediated competition between
RNAs could be a new channel of post-transcriptional gene regulation, and that such RNA-RNA
'crosstalk' operates in many different biological contexts. Our study represents the first
genome-wide measurement of crosstalk strength in response to the knockdown of three different
genes. Previous studies of the ceRNA hypothesis have concentrated on only one or a few
targets of a miRNA, even though a perturbation of a ceRNA that changes miRNA activity
is expected to affect hundreds of targets. Quantifying the magnitude of crosstalk has also
proven challenging, as existing studies rely on qPCR or luciferase assays, both of which have
difficulty extracting precise fold changes due to issues with primer/enzyme efficiency
or amplification biases. In order to fully test the generality and magnitude of the
miRNA-mediated crosstalk hypothesis, it is necessary to perform perturbation experiments to see
how altering the expression level of one 'sender' mRNA affects other 'receiver'
mRNAs regulated by the same miRNAs. It is thus essential to measure crosstalk strength
transcriptome-wide using a quantitative assay.
Knocking down an individual mRNA is expected to affect the transcriptome widely, making
it difficult to isolate the miRNA-mediated effects. We therefore used the DICER -/-
cell line, which has depleted levels of mature miRNAs, to isolate only those fold changes
that were miRNA-dependent. Careful scrutiny of RNA-seq crosstalk strength measurements
in HCT116 and DICER -/- cells yielded a high-confidence set of putative ceRNAs for each of the
three senders. We studied whether this cohort of putative ceRNAs was actually miRNA-mediated
and hence in accord with a ceRNA effect. Firstly, we found them to be enriched
in miRNA-binding sites for their respective senders. Secondly, the hypothesis implies that
the crosstalk effect should be stronger if genes share more miRNA binding sites with
the sender. However, to the best of our knowledge, such a feature has not been experimentally
demonstrated. We binned our list of putative ceRNAs by the number of shared miRNA
binding sites and found that the magnitude of their crosstalk strength was correlated with the
number of binding sites shared with their senders. We suspect that this feature implies that
multiple miRNAs act cooperatively on receivers. Thirdly, different miRNAs have different
total numbers of binding sites in the transcriptome, are sequestered to different extents by each
sender, and therefore should differ in their ability to transmit crosstalk ("crosstalk power").
By considering the difference in CS distributions for targets vs non-targets, we ranked the
crosstalk power of all miRNAs expressed in the cell line. Other miRNAs besides miR-17,
miR-19 and miR-26 (tested by Tay (2011)) have greater PTEN crosstalk power; we therefore
suggest that manipulating the levels of more highly ranked miRNAs from our list would
be more effective in future studies. As an example, because PTEN ceRNAs have known
oncogenic effects, miR-374ab, which we found to have the highest crosstalk power for PTEN,
could be a useful target for miRNA-based cancer therapies (Cai 2013).
The originating studies of the crosstalk hypothesis had computationally found many possible
ceRNA candidates, but had experimentally tested only a few genes. Our data indicate
that ceRNAs are pervasive across the transcriptome and are broadly expressed (3-decade
range of RPKM). The functional relevance of a broad class of ceRNAs may be the concordant
buffering of many genes involved in similar biological functions. Indeed, in a GO-term
analysis of putative ceRNAs, we observe that Pten, Vapa and Cnot6l
ceRNAs share functional roles with their respective senders. Such covariation among a broad
class of ceRNAs when a sender is perturbed could help maintain a stoichiometric balance in pathways.
However, we caution that it is difficult to construct a null model for whether these
covariations are themselves caused by transcriptional-level changes of the sender. Our candidate
ceRNAs may not be true ceRNAs. A limitation of our analysis is its dependence on DICER
-/- cells to control for spurious crosstalk effects that result from purely transcriptional
network perturbations. For example, it is well known that regulatory network structures such
as incoherent feed-forward loops can produce positive correlation between an mRNA and
its targeting miRNAs/transcription factors (Tsang 2007). How many of the ceRNA candidates
identified in our analysis are directly repressed by targeting miRNAs is currently unknown.
Detailed experimental work is needed to examine these candidate ceRNAs; in particular,
assays for miRNA binding and siRNA knockdown experiments can provide more conclusive
evidence for ceRNA interactions in individual receiver-sender pairs. Combining
our transcriptomic analysis with biochemical assays that identify binding partners
will enable a greater understanding of the crosstalk mechanism.
The size of the ceRNA effect has been widely considered larger than expected purely from
steady-state target competition, because of the typically large number of targets (Broderick
& Zamore 2014). We find that crosstalk strength, though substantial, is usually less than
0.5 for most genes, and is generally bounded by 1 for the three different senders that we
tested. Crosstalk strength is nevertheless larger than we expected based on most sequestration
models (including ours). For example, we estimate sequestration of most miRNAs by Pten to be
less than 1% and Pten mRNA repression to be at most 2-fold (based on the PTEN 3'UTR
sponging data). So CS < Sequestration × Repression implies CS < 2%. In order
to explain the relatively large CS magnitude, we suggest two possibilities. Firstly, it
remains unclear whether the total binding sites for a miRNA are truly in excess of miRNA
concentrations locally. Estimates of total average binding sites in the cell might be irrelevant to
individual miRNA-target interactions that depend on local miRNA/target concentrations.
A recent theoretical study (Figliuzzi 2013) also finds that substantial crosstalk requires a
small number of competing target sites. They propose that ceRNA function may require
a channel of 'stoichiometric decay', in which a bound miRNA needs to be destabilized or
functionally depleted by other mechanisms such as trapping in P-bodies. Secondly, the
topology of the ceRNA-miRNA network may play an important role, as strongly interconnected
sender:miRNA:receiver subnetworks could enhance crosstalk. For example, the miRNAs
(miR-17 and miR-19) that are strongly implicated in PTEN ceRNA crosstalk in our data are
co-transcribed in polycistronic regions and tend to have similar sets of targets, suggesting
that their repressive effects can be amplified for a large number of ceRNAs (Yip 2014).
The discrepancy between the ceRNA effect we detect on overexpressing the PTEN 3'UTR
(even at low amounts) and the lack of any detectable ceRNA effect on overexpression of
synthetic miR-122 seed sites (Denzler 2014) may be due to at least two reasons. Firstly,
we used a full-length endogenous PTEN 3'UTR (3.3 kb), which contains multiple binding
sites for 25 miRNA families, while Denzler et al. used a short (125 bp) AldoA mRNA with a
single miRNA binding site (miR-122). The functionality of miRNA target sites is affected
by numerous 3'UTR properties, including the presence of multiple target sites in close
proximity (Grimson 2007, Broderick 2011), the position of the site in the 3'UTR (Majoros
2007), and the synergistic repression by multiple miRNAs (Lai 2012). Thus an endogenous
3'UTR with multiple miRNA sites could have greater sequestration and miRNA-repression
ability than AldoA. Moreover, in our system the PTEN 3'UTR sponge is under a constant
repressive fold change of 2-3, unlike the miR-122 sponge, which exhibited a
loss of repression at higher induction levels (from 2-fold to 0.1), suggesting that miR-122
was saturated. Other endogenous 3'UTRs we measured also had constant repression fold
changes, 2-fold (Wee1) and 5-fold (Lats2) (cf. Chapter 4, Schmiedel 2015), at all induction
levels, suggesting that endogenous 3'UTRs carrying more seed sites attract more
miRNA repression. Secondly, the ceRNA effect depends on the cellular concentrations of
miRNAs; our HCT116 cancer cell line has a different miRNA expression profile from that
of primary cells. The oncogenic miR-17-92 cluster in particular, which we found to have high
PTEN crosstalk power, is known to be significantly upregulated in the HCT116 cell line
(Wang 2008).
2.3
Methods and Materials
2.3.1
Cell culture and siRNA Transfection
The HCT 116 colorectal cancer cell line was obtained from ATCC (American Type Culture
Collection). The HCT116 DICER exon5 -/- cell line was a kind gift from Dr. B. Vogelstein
and was generated as described previously (Cummins 2006). HCT116 wild-type and HCT
116 DICER -/- cells were grown in ATCC-formulated McCoy's 5a Medium Modified
(Catalog No. 30-2007) plus 10% (v/v) FBS, penicillin/streptomycin (Gibco) and L-glutamine at
37°C in a humidified atmosphere with 5% CO2. Cells were grown adherently in 10 cm dishes
or 6-well plates at a seeding density of 1.0x10^5 cells/cm^2 until they were 50% confluent
(40-50 hours), upon which they were trypsinized, re-plated and transfected with 25 nM siRNA
for 24 hours. Transfection of siRNA oligonucleotides was performed with
Dharmafect lipid transfection reagent according to the manufacturer's protocols. siRNAs
were purchased from Dharmacon as SMARTpools. Titration of the siRNA and the transfection
reagent was performed (data not shown), and the lowest working amounts of the siRNA
(25 nM) and the transfection reagent were applied in the present study. With this protocol
more than 90% of cells were positive for the fluorescent siGLO RISC-free control siRNA. A
list of reagents used in this study is given below. A master mix was created for
each individual condition in order to eliminate pipetting errors and to increase consistency
between wells. Each siRNA was transfected in triplicate in each of HCT116 and DICER
-/-, and all the knockdown experiments were done simultaneously to avoid an additional
source of variation. After 24 hours, cells were harvested for the various assays.
Reagent                                        Source
McCoy's 5A Medium; Fetal Bovine Serum (FBS)    ATCC (30-2007, 30-2020)
Trypsin                                        ATCC (30-2101)
siGENOME siRNA pool for nontargeting 1         Dharmacon (Catalog D-001206)
5X siRNA buffer                                Dharmacon (Catalog B-002000)
SMARTpool si-Pten                              Dharmacon (Catalog M-003023)
SMARTpool si-Pten                              Dharmacon (Catalog M-021382)
SMARTpool si-Pten                              Dharmacon (Catalog M-016411)
2.3.2
RNA extraction
Total RNA was extracted from cells using Trizol reagent for the RT-PCR assay or using
RNeasy (Qiagen) for RNA-sequencing assays, following the manufacturer's protocols.
RNA pellets were resuspended in 20 µl RNase-free sterile water, and RNA quantity was assessed
spectrophotometrically using the NanoDrop ND-1000 UV-VIS Spectrophotometer (Thermo
Fisher). The RNA integrity number (RIN) was assessed with a 2100 Agilent Bioanalyzer to
verify RNA quality for all experimental samples. Only samples with RIN > 9 were used for
sequencing.
2.3.3
RT-PCR
mRNA levels of various transcripts were measured using RT-PCR. Reverse transcription
into cDNA was done using a First Strand Synthesis kit (Invitrogen). RT-PCR was performed
in triplicate reactions using SYBR Green mix (Applied Biosystems), run on an Applied
Biosystems 7500 Real-Time PCR instrument. Levels of various genes after siRNA knockdown
were measured with the ΔΔCt method, using human Actin for normalization. The
primers used are listed in a supplementary table.
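For reference, the ΔΔCt calculation can be written out as below. This is a generic sketch of the standard 2^(-ΔΔCt) formula rather than the script used here; it assumes roughly 100% amplification efficiency for both primer pairs, and the function name and Ct values are hypothetical.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^(-ddCt) method.

    Each Ct is the mean threshold cycle of the replicate reactions.
    The reference gene (e.g. Actin) corrects for input differences.
    """
    d_ct_treated = ct_target_treated - ct_ref_treated
    d_ct_control = ct_target_control - ct_ref_control
    dd_ct = d_ct_treated - d_ct_control
    return 2.0 ** (-dd_ct)


# Hypothetical knockdown: the target crosses threshold 2 cycles later
# after siRNA treatment while Actin is unchanged.
print(relative_expression(26.0, 18.0, 24.0, 18.0))
```

A result below 1 indicates reduced expression relative to the control sample; here the two-cycle delay corresponds to a four-fold reduction.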
2.3.4
Reporter Plasmid Construction
Starting from a previously established reporter system (Mukherji 2011) based on the plasmid
pTRE-Tight-BI (Clontech), eYFP was replaced with ZsGreen1-1 (Clontech) using the EcoRI and NdeI
digestion sites. We received the psiCHECK2-Pten 3'UTR plasmid as a kind gift from Yvonne
Tay. The Pten 3'UTR sequence was cloned from that plasmid using custom primers and
inserted into the ZsGreen MCS of the bidirectional plasmid via the NdeI and XbaI
digestion sites using standard cloning techniques. This reporter plasmid is referred to as the
Pten 3'UTR sponge plasmid in the text. The "NULL" plasmid, which we used as a control,
consists of the same construct but without the Pten 3'UTR, i.e. just the plasmid
containing the bidirectional tetracycline-responsive promoter that drives the transcription
of the two fluorescent reporter proteins, ZsGreen and mCherry. All constructs were sequence
confirmed.
2.3.5
Transient Transfection of plasmid
HCT 116 cells were grown in 2 ml of culturing medium (antibiotic-free) in 6-well dishes for
two days before the transfection. Pten 3'UTR or NULL plasmids were mixed with the rtTA
plasmid at a ratio of 3:1 (1.5 µg reporter plasmid : 0.5 µg rtTA plasmid) and then
co-transfected into the cells in a medium consisting of 10 µl Lipofectamine 2000 (Invitrogen)
and 250 µl Opti-MEM. 6 hours post-transfection, when the cells had stabilized, they were
detached with trypsin, passaged onto 60 mm plates in 3 ml culturing medium and induced
with 1 µg/ml doxycycline (Sigma). Live cells were taken for the flow sorting assay 18 hours
post-induction.
2.3.6
FACS sorting
At the end of the transfection period, live cells were trypsinized, pelleted and resuspended
into a single-cell suspension in McCoy's 5A medium. These transfected cells were sorted
by FACS into ice-cold PBS + 3% FBS using a BD Biosciences Aria II flow cytometer in the
following manner: (i) Single cells were gated using their FSC-A and SSC-A scatter profiles.
(ii) Only those cells containing the reporter plasmid were chosen, based on their mCherry
expression values. (iii) We collected cells into 4 different bins based on their mCherry
expression values (see figure). 100,000 cells from each bin (the same bins were used for sorting
both Pten UTR and NULL UTR) were sorted into Eppendorf tubes containing ice-cold PBS
1% FBS buffer, and their RNA was extracted as above. This method gave a total of 500-1000
ng of RNA per bin. For analytic flow cytometry, cells were detached with 0.05% trypsin-EDTA,
washed and resuspended in sterile 3% FBS PBS. Measurements were performed on
a BD Biosciences LSR Fortessa platform.
2.3.7
RNA Sequencing
From isolated RNA, poly(A)+ RNA sequencing libraries were prepared using the Illumina
TruSeq Stranded mRNA kit in the MIT BioMicro Center. The prepared libraries were
multiplexed and sequenced on an Illumina HiSeq 2500 sequencer to obtain single-end 40-bp
reads. On average we obtained 20 million reads per sample. For each sample, there were
three biological replicates. Reads were aligned with the Burrows-Wheeler Aligner (BWA) (Li
and Durbin, 2009) using parameters [q (PHRED-quality)=30, l (seed length)=30] to
modENCODE integrated transcript models on the basis of the human genome (hg19 version). We
allowed a maximum edit distance of 2 (option "aln -n2") and used the flag "-uniq=1" to map
only unique reads. The output was converted into SAM format using the BWA "samse" option
and processed with a custom perl script. Each library had 85% mapped reads. For the Pten
3'UTR plasmid transfection experiment, we split the Pten sequence into Pten 3'UTR
and Pten cds, and added the sequences of mCherry and ZsGreen to the hg19 transcript model.
Reads were aggregated across isoforms, and expression per gene locus was calculated in
reads per million mapped reads (RPM). Whenever expression was measured in RPKM, the
length of the merged isoforms was used for normalization.
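The two expression units used above follow the standard definitions, sketched here in Python for concreteness. The function names and the example numbers are illustrative assumptions, not values from our data.

```python
def rpm(gene_reads, total_mapped_reads):
    """Reads per million mapped reads for one gene locus."""
    return gene_reads * 1e6 / total_mapped_reads


def rpkm(gene_reads, total_mapped_reads, merged_length_bp):
    """Reads per kilobase of (merged-isoform) transcript per million reads."""
    return rpm(gene_reads, total_mapped_reads) / (merged_length_bp / 1e3)


# Hypothetical gene: 2,000 reads in a library of 20 million mapped reads,
# with a merged isoform length of 4,000 bp.
print(rpm(2000, 20_000_000))          # reads per million
print(rpkm(2000, 20_000_000, 4000))   # additionally length-normalized
```

RPM is sufficient when the same gene is compared across libraries; the length normalization in RPKM matters when expression is compared between genes.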
2.3.8
RNASeq Data Analysis
Genes with no zero-read counts in any of the libraries were retained, resulting in a total of
13,700 (out of 23,704) expressed genes. RPKM values were averaged over the 3 biological
replicates. The CV was estimated in a gene-independent manner by pooling all the CV
measurements at a given expression level in the following way: loess regression was performed to
obtain an error model relating the expression CV of each gene to its expression mean
across all samples. The expression CV of each gene was then adjusted to the
loess-fitted line
of expression CV versus expression mean. Significance of fold changes was assessed by calculating
z-scores, from which standard Benjamini-Hochberg multiple-hypothesis-corrected p-values were obtained.
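The last step, from z-scores to corrected p-values, can be sketched as follows. This is an illustrative Python version using a two-sided normal test and the standard Benjamini-Hochberg step-up adjustment; it takes the z-scores as given, since the exact mapping from the fitted CV to a standard error is not spelled out here. Function names and inputs are hypothetical.

```python
import math


def two_sided_p(z):
    """Two-sided p-value of a z-score under the standard normal."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical z-scores for four genes.
z_scores = [2.6, 0.3, 3.5, 1.9]
p_values = [two_sided_p(z) for z in z_scores]
print(benjamini_hochberg(p_values))
```

Genes are then called significant when their adjusted p-value falls below the chosen false discovery rate.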
2.3.9
miRNA-mRNA Target prediction
Genes were labeled as predicted microRNA targets if they contained at least one predicted
conserved microRNA binding site (TargetScan 6.2 (Garcia, 2011)) for a microRNA seed family
expressed in HCT 116.
2.3.10
miRNA expression Data sources
For expression of miRNAs in HCT 116, we used microarray data sets generated by [Yan
2011], which were downloaded from NCBI GEO (Series GSE26819).
2.3.11
Target Abundance and Sequestration estimation
For each conserved human miRNA, the total number of predicted 6-, 7-, and 8-nt 3'UTR
binding sites on each gene was weighted by the RPKM expression value of that gene in the
untreated HCT 116 RNAseq data, and summed over all genes, to yield the TA for that miRNA.
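The target abundance just defined, and a sender's share of it (the sequestration estimate of this section), can be computed as in the following Python sketch. It is an illustration under assumed inputs: per-gene predicted site counts for one miRNA and per-gene RPKM values. The gene names and numbers are hypothetical.

```python
def target_abundance(site_counts, rpkm):
    """TA of one miRNA: sum over genes of (# predicted sites) x (RPKM)."""
    return sum(n * rpkm.get(gene, 0.0) for gene, n in site_counts.items())


def sequestration(sender, site_counts, rpkm):
    """Fraction of the miRNA's target abundance contributed by `sender`."""
    ta = target_abundance(site_counts, rpkm)
    if ta == 0:
        return 0.0  # miRNA has no expressed predicted targets
    return site_counts.get(sender, 0) * rpkm.get(sender, 0.0) / ta


# Hypothetical miRNA with sites on three expressed genes.
site_counts = {"PTEN": 2, "GENE_A": 1, "GENE_B": 3}
rpkm_values = {"PTEN": 10.0, "GENE_A": 40.0, "GENE_B": 20.0}
print(target_abundance(site_counts, rpkm_values))
print(sequestration("PTEN", site_counts, rpkm_values))
```

In this toy case PTEN contributes 20 of 120 site-weighted RPKM units, so about one sixth of the miRNA's target pool.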
We estimated the fraction of miRNA i sequestered by Pten (similarly for Vapa and
Cnot6l) as

$$\mathrm{Sequestration}^{\mathrm{miRNA}_i}_{\mathrm{Pten}} = \frac{[\#\ \text{of predicted miRNA}_i\ \text{binding sites on Pten}] \times [\text{Pten RPKM expression}]}{\sum_j [\#\ \text{of predicted miRNA}_i\ \text{binding sites on gene } j] \times [\text{gene } j\ \text{RPKM expression}]}$$

2.3.12
GO term analysis
GO term analysis was performed in R using the GOstats package [Falcon 2007]. For each
set of putative Pten, Vapa or Cnot6l ceRNAs, we collected the GO terms associated with
each mRNA in the set. For each term, we then computed a p-value using a hypergeometric
test, to indicate the enrichment of the term in the ceRNA set compared to the background
set of all genes.
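The enrichment p-value for a single GO term is the upper tail of a hypergeometric distribution. The Python sketch below shows the plain test for illustration only; it is not the GOstats implementation, and the function name and counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_term, n_set, n_overlap):
    """P(X >= n_overlap) for a GO term.

    n_background: genes in the background set
    n_term:       background genes annotated with the term
    n_set:        genes in the ceRNA set (drawn without replacement)
    n_overlap:    ceRNA-set genes annotated with the term
    """
    upper = min(n_term, n_set)
    total = comb(n_background, n_set)
    tail = sum(comb(n_term, k) * comb(n_background - n_term, n_set - k)
               for k in range(n_overlap, upper + 1))
    return tail / total


# Hypothetical term: 4 of 10 background genes carry it, and 2 of the
# 3 genes in the set do.
print(hypergeom_enrichment_p(10, 4, 3, 2))
```

Because one such test is run per GO term, the resulting p-values would normally be corrected for multiple testing before interpretation.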
2.3.13
TMM (Trimmed Mean of M-values) Normalization
Methods for normalization of RNA-sequencing gene expression data commonly assume equal
total expression between compared samples. The number of reads expected to map to a gene
depends not only on the expression level and length of the gene, but also on the composition
of the RNA population that is being sampled [Robinson 2010]. Thus, if a large number of
genes are unique to, or highly expressed in, one experimental condition, the sequencing
'real estate' available for the remaining genes in that sample is decreased. If not adjusted for,
this sampling artifact can skew any fold-change analysis. This is precisely the
situation in our FACS-sorted sequencing dataset. By transfecting reporter plasmids into
cells and inducing thousands of transcripts, we change the global gene expression to
different extents in each bin. We sorted cells by their expression of mCherry transcripts, and
consequently found a large (3-decade) increase in mCherry read counts between the untransfected
bin (bin 0) and the fully saturated bin 3 (Figure 2.9b). mCherry and ZsGreen reads combined
were as much as 30% of the total reads in the last bin.
Define $Y_{gk}$ as the observed read count for gene g in library k and $N_k$ as the total number
of reads for library k. Remove all instances of $Y_{gk} = 0$, as fold changes cannot be calculated for them.
Then for each bin, let k and k' stand for the Pten UTR and control NULL UTR libraries.
Define the gene-wise log-fold-changes $M_g$ between these two libraries as:
$$M_g = \log_2 \frac{Y_{gk}/N_k}{Y_{gk'}/N_{k'}}$$
and absolute expression levels $A_g$ as
$$A_g = \frac{1}{2} \log_2 \left( \frac{Y_{gk}}{N_k} \times \frac{Y_{gk'}}{N_{k'}} \right) \quad \text{for } Y_{g\bullet} \neq 0$$
If RPKM normalization introduced no bias, one would expect the distribution
of M values to be centered around zero. This is not so, due to the distorting effects of
the different amounts of plasmid RNA in each bin. To eliminate their effects, we robustly
summarize the M and A values by a trimmed mean. A trimmed mean is the mean after
removing the upper x% and lower x% of the data. We use a double trimming of both the
M and A values: trimming the top 30% and bottom 30% of M values, and the top 5%
and bottom 5% of A values. After trimming, we define the TMM factor as the mean M of the
remaining genes. This TMM factor is then used to normalize the library size for library
k. We estimated the TMM factor for bin 1 as 0.91, indicating that the Pten UTR library size
had to be reduced by that factor. After performing this normalization, we find that the overall
offset in fold changes is 0, as expected (Figure 2.9c). The TMM factor is reasonably stable
for different choices of trim percentiles [data not shown].
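The double-trimming procedure described above can be sketched with NumPy. This is a simplified, unweighted version that follows the description in this section (the published TMM method of Robinson and Oshlack additionally weights genes by their estimated precision); the function name, the use of quantiles to set the trimming bounds, and the toy read counts are assumptions made for illustration. The function returns the mean M on the log2 scale together with its linear-scale equivalent.

```python
import numpy as np


def tmm_factor(y_k, y_ref, trim_m=0.30, trim_a=0.05):
    """Unweighted trimmed mean of M-values between library k and a reference.

    Returns (mean M on the log2 scale, 2 ** mean M).
    """
    y_k = np.asarray(y_k, dtype=float)
    y_ref = np.asarray(y_ref, dtype=float)
    n_k, n_ref = y_k.sum(), y_ref.sum()      # total reads per library
    keep = (y_k > 0) & (y_ref > 0)           # drop genes with zero counts
    p_k, p_ref = y_k[keep] / n_k, y_ref[keep] / n_ref
    m = np.log2(p_k / p_ref)                 # gene-wise log fold change
    a = 0.5 * np.log2(p_k * p_ref)           # gene-wise absolute expression
    m_lo, m_hi = np.quantile(m, [trim_m, 1.0 - trim_m])
    a_lo, a_hi = np.quantile(a, [trim_a, 1.0 - trim_a])
    inside = (m >= m_lo) & (m <= m_hi) & (a >= a_lo) & (a <= a_hi)
    mean_m = m[inside].mean()
    return mean_m, 2.0 ** mean_m


# Toy example: 100 genes with identical counts in both libraries, plus one
# plasmid-like transcript that takes up half of the reads in library k.
reference = np.array([100.0] * 100 + [1.0])
library_k = np.array([100.0] * 100 + [10000.0])
log2_factor, linear_factor = tmm_factor(library_k, reference)
print(log2_factor, linear_factor)
```

In the toy example the unchanged genes all appear roughly two-fold down in library k purely because of the extra transcript, and the trimmed mean recovers that composition offset while ignoring the outlier itself.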
2.4
Supplementary Figures and Tables
[Figure 2.11 image: TargetScan site maps of conserved sites for miRNA families broadly conserved among vertebrates in the 3' UTRs of (a) human PTEN, (b) human VAPA and (c) human CNOT6L.]
Figure 2.11 | Predicted TargetScan conserved miRNA binding sites in the
3'UTR of the ceRNAs chosen in this study. (a) PTEN is targeted by 25 conserved
miRNAs. (b) VAPA is targeted by 28 conserved miRNAs. (c) CNOT6L is targeted by 44
conserved miRNAs.
[Figure 2.12 image: volcano plots with x-axes PTEN Crosstalk Strength, VAPA Crosstalk Strength and CNOT6L Crosstalk Strength.]
Figure 2.12 | Crosstalk is microRNA mediated and pervasive on a genome-wide
scale. Related to Figure 2.3. Volcano plot of magnitude of Crosstalk Strength versus
P-value in each of the sender-knockdowns for putative ceRNAs, i.e. only those genes with
crosstalk strength in HCT 116 greater than in DICER cells (HCT CS > DICER CS). Data
points for genes marked in green have P-value < 10^-3 and are statistically significant.
The number of putative ceRNAs for the given sender is indicated in the legend.
[Figure 2.13 image: histograms of Fold Change (WT/NULL) for BIN1, BIN2 and BIN3.]
Figure 2.13 | Distribution of log2 fold changes (PTEN UTR/NULL) for all genes post
TMM normalization is centered around zero in each bin, i.e. no bin-dependent effects are
seen. Related to Figure 2.10.
Table 2.1 | MicroRNAs enriched in genes with positive PTEN Crosstalk Strength with
hypergeometric p-value less than 0.05. MicroRNAs in bold are those that are predicted to
target PTEN.
miRNA seed family | P-value | Enrichment factor
miR-200bc/429/548a | 0 | 1.87
miR-17/17-5p/20ab/20b-5p/93/106ab/427/518a-3p/519d | 1.01E-13 | 1.8
miR-23abc/23b-3p | 4.02E-13 | 1.79
miR-340-5p | 4.33E-13 | 1.7
miR-101/101ab | 6.78E-13 | 1.9
miR-19ab | 9.20E-13 | 1.78
miR-181abcd/4262 | 4.15E-12 | 1.75
miR-144 | 1.41E-11 | 1.87
miR-300/381/539-3p | 8.74E-11 | 1.83
miR-590-3p | 4.63E-10 | 1.62
miR-130ac/301ab/301b/301b-3p/454/721/4295/3666 | 1.12E-09 | 1.8
miR-93/93a/105/106a/291a-3p/294/295/302abcde/372/373/428/519a/520be/520acd-3p/1378/1420ac | 1.55E-09 | 1.8
miR-30abcdef/30abe-5p/384-5p | 2.77E-08 | 1.58
miR-26ab/1297/4465 | 2.92E-08 | 1.72
miR-186 | 1.04E-07 | 1.68
miR-141/200a | 1.32E-07 | 1.74
miR-25/32/92abc/363/363-3p/367 | 1.55E-07 | 1.71
miR-15abc/16/16abc/195/322/424/497/1907 | 6.52E-07 | 1.53
miR-27abc/27a-3p | 1.13E-06 | 1.55
miR-216b/216b-5p | 1.65E-06 | 2.14
miR-148ab-3p/152 | 2.19E-06 | 1.71
miR-495/1192 | 2.56E-06 | 1.62
miR-96/507/1271 | 2.74E-06 | 1.57
miR-21/590-5p | 2.89E-06 | 2.13
miR-132/212/212-3p | 4.94E-06 | 1.94
miR-543 | 4.98E-06 | 1.69
miR-503 | 5.20E-06 | 2
miR-153 | 7.92E-06 | 1.69
miR-374ab | 8.67E-06 | 1.67
miR-205/205ab | 9.67E-06 | 1.86
miR-448/448-3p | 2.29E-05 | 1.67
miR-124/124ab/506 | 2.27E-05 | 1.42
miR-7/7ab | 3.60E-05 | 1.8
miR-410/344de/344b-1-3p | 6.57E-05 | 1.67
miR-155 | 7.06E-05 | 1.81
miR-433 | 9.37E-05 | 1.87
miR-1ab/206/613 | 0.0001 | 1.57
miR-221/222/222ab/1928 | 0.0001 | 1.76
miR-217 | 0.00018 | 1.86
miR-202-3p | 0.0001 | 1.58
miR-128/128ab | 0.0003 | 1.218
miR-320abcd/4429 | 0.0004 | 1.54
miR-140/140-5p/876-3p/1244 | 0.0005 | 1.8
miR-223 | 0.001 | 1.8
miR-544/544ab/544-3p | 0.001 | 1.58
miR-218/218a | 0.001 | 1.48
miR-iSI | 0.001 | 1.411
miR-499-5p | 0.001 | 1.76
miR-199ab-5p | 0.002 | 1.61
miR-29abcd | 0.002 | 1.42
miR-139-5p | 0.003 | 1.74
miR-194 | 0.003 | 1.69
miR-494 | 0.003 | 1.56
miR-103a/107/107ab | 0.003 | 1.53
miR-208ab/208ab-3p | 0.003 | 2.06
miR-137/137ab | 0.004 | 1.4
miR-224 | 0.006 | 1.63
let-7/98/4458/4500 | 0.006 | 1.1
miR-421 | 0.007 | 1.59
miR-290-5p/292-5p/371-5p/293 | 0.01 | 1.63
miR-9/9ab | 0.01 | 1.35
miR-135ab/135a-5p | 0.01 | 1.47
miR-653 | 0.01 | 1.78
miR-142-3p | 0.01 | 1.66
miR-196abc | 0.01 | 1.75
miR-377 | 0.01 | 1.46
miR-18ab/4735-3p | 0.019 | 1.71
miR-425/425-5p/489 | 0.03 | 1.78
miR-138/138ab | 0.03 | 1.47
miR-24/24ab/24-3p | 0.04 | 1.12
miR-324-5p | 0.04 | 1.94
miR-33a-3p/365/365-3p | 0.04726543453742 | 1.64
Table 2.2 | MicroRNAs enriched in genes with positive VAPA Crosstalk Strength with
hypergeometric p-value less than 0.05. MicroRNAs in bold are those that are predicted to
target VAPA.
miRNA seed family | P-value | Enrichment factor
miR-17/17-5p/20ab/20b-5p/93/106ab/427/518a-3p/519d | 6.53E-08 | 1.8
miR-200bc/429/548a | 1.50E-07 | 1.8
miR-93/93a/105/106a/291a-3p/294/295/302abcde | 2.49E-07 | 1.94
miR-30abcdef/30abe-5p/384-5p | 2.90E-07 | 1.72
miR-202-3p | 1.60E-06 | 1.96
miR-300/381/539-3p | 2.80E-06 | 1.84
miR-186 | 1.09E-05 | 1.79
let-7/98/4458/4500 | 7.00E-05 | 1.7
miR-23abc/23b-3p | 7.86E-05 | 1.64
miR-19ab | 0.0001 | 1.62
miR-27abc/27a-3p | 0.0001 | 1.62
miR-590-3p | 0.0001 | 1.55
miR-217 | 0.0001 | 2.17
miR-148ab-3p/152 | 0.0002 | 1.8
miR-340-5p | 0.0003 | 1.53
miR-9/9ab | 0.0003 | 1.58
miR-144 | 0.001 | 1.64
miR-130ac/301ab/301b/301b-3p/454/721/4295/3666 | 0.002 | 1.63
miR-26ab/1297/4465 | 0.002 | 1.62
miR-25/32/92abc/363/363-3p/367 | 0.002 | 1.64
miR-128/128ab | 0.002 | 1.57
miR-101/101ab | 0.005 | 1.63
miR-182 | 0.007 | 1.5
miR-503 | 0.007 | 1.93
miR-374ab | 0.008 | 1.64
miR-141/200a | 0.01 | 1.59
miR-192/215 | 0.01 | 2.4
miR-543 | 0.01 | 1.61
miR-196abc | 0.01 | 2.04
miR-139-5p | 0.01 | 1.89
miR-15abc/16/16abc/195/322/424/497/1907 | 0.01 | 1.44
miR-155 | 0.02 | 1.75
miR-221/222/222ab/1928 | 0.03 | 1.72
miR-181abcd/4262 | 0.04 | 1.42
Table 2.3 | Table of Biological Processes Gene Ontology (GO) annotations significantly
enriched in putative PTEN ceRNAs. Only the top 10 are shown.
GO Term | P-value | Enrichment | Description
GO:0001568 | 2.37E-09 | 2.73 | blood vessel development
GO:0036211 | 8.06E-09 | 1.59 | protein modification process
GO:0031323 | 8.72E-09 | 1.50 | regulation of cellular metabolic process
GO:2000112 | 2.69E-08 | 1.53 | regulation of cellular macromolecule biosynthetic process
GO:0072358 | 3.35E-08 | 2.66 | cardiovascular system development
GO:0019220 | 7.64E-08 | 1.89 | regulation of phosphate metabolic process
GO:0009891 | 1.07E-07 | 1.79 | positive regulation of biosynthetic process
GO:2001141 | 1.51E-07 | 1.50 | regulation of RNA biosynthetic process
GO:0009892 | 1.57E-07 | 1.70 | negative regulation of metabolic process
GO:0045944 | 1.76E-07 | 2.09 | positive regulation of transcription from RNA polymerase II promoter
Table 2.4 | Table of Biological Processes Gene Ontology (GO) annotations significantly
enriched in putative VAPA ceRNAs. Only the top 10 are shown.
GO Term | P-value | Enrichment | Description
GO:0000279 | 2.20E-14 | 3.37 | M phase
GO:0051301 | 4.61E-11 | 3.08 | cell division
GO:0048285 | 1.61E-10 | 3.20 | organelle fission
GO:0000278 | 4.49E-09 | 3.86 | mitotic cell cycle
GO:0007067 | 2.50E-05 | 2.90 | mitosis
GO:0007098 | 3.62E-05 | 6.79 | centrosome cycle
GO:0000236 | 6.22E-05 | 3.89 | mitotic prometaphase
GO:0006796 | 6.50E-05 | 1.61 | phosphate-containing compound metabolic process
GO:0022403 | 6.56E-05 | 2.39 | cell cycle phase
GO:0009889 | 6.83E-05 | 1.45 | regulation of biosynthetic process
Table 2.5 | Table of Biological Processes Gene Ontology (GO) annotations significantly
enriched in putative CNOT6L ceRNAs. Only the top 10 are shown.
GO Term | P-value | Enrichment | Description
GO:0045766 | 1.07E-05 | 6.68 | positive regulation of angiogenesis
GO:0009890 | 7.93E-05 | 1.95 | negative regulation of biosynthetic process
GO:0071276 | 0.0001 | 22.6913 | cellular response to cadmium ion
GO:0048714 | 0.0001 | 84.91 | positive regulation of oligodendrocyte differentiation
GO:0048514 | 0.0001 | 2.50 | blood vessel morphogenesis
GO:0071294 | 0.0002 | 18.91 | cellular response to zinc ion
GO:0010035 | 0.0002 | 2.54 | response to inorganic substance
GO:0001944 | 0.0003 | 2.29 | vasculature development
GO:0051918 | 0.0003 | 42.45 | negative regulation of fibrinolysis
GO:0031324 | 0.0005 | 1.69 | negative regulation of cellular metabolic process
Chapter 3
A single molecule analysis of ceRNAs
reveals miRNA-dependent correlation and
colocalization
In the preceding chapter, we presented the genome-wide measurement of crosstalk strength
for three different senders (Cnot6l, Pten, Vapa) and identified key features (the number of shared miRNAs, the target abundance of miRNAs and the binding affinity of miRNAs) that impact the
magnitude of crosstalk strength. We were able to isolate these factors by profiling transcript abundances for a bulk population of cells after perturbing sender levels. However, the
theoretical framework of the ceRNA hypothesis depends upon relative concentrations of
targets and miRNAs, sequestration of miRNAs and the repression of miRNA targets. Each
of these processes occurs within individual cells. The cell is a highly complex environment
that cannot be approximated as well-mixed due to the presence of numerous sub-cellular
structures. Local concentrations of miRNA binding sites and miRNAs may differ from the
average by large amounts, thus affecting the rates of miRNA sequestration or repression.
Hence, the absolute intracellular concentrations of these species, their spatial localization,
and their dynamics, along with other molecules involved in miRNA biogenesis (Argonaute2), have
to be taken into account for a quantitative understanding of the ceRNA hypothesis. Moreover, bulk measurements of crosstalk in cells following sender-knockdowns may mask some
other features of miRNA-mRNA coupling, such as buffering of individual ceRNA fluctuations by their shared
miRNA. This chapter focuses on quantifying endogenous transcription
of known ceRNAs (Cnot6l, Pten, Vapa) in single cells with single-molecule resolution by an
in situ hybridization (smFISH) assay and analyzing their spatial localizations.
3.1
Results
3.1.1
Quantification of gene expression for Pten, Vapa and Cnot6l in
single cells with 3-colour smFISH
In order to quantify the absolute abundance of ceRNAs in single cells with molecular resolution, we used the single-molecule RNA FISH (smFISH) method (Femino et al., 1998; Raj,
Bogaard, et al., 2008) which labels each RNA molecule of a particular species with a fluorescently coupled set of complementary oligonucleotide probe sequences. For each of Pten,
Vapa and Cnot6l, we designed 25 to 48 fluorescently labeled probes, each 20 bases long,
complementary to the coding-sequence of the target transcript (Figure 3.1). Cnot6l had a
shorter coding sequence that admitted only 25 probes. We hybridized individual probe-sets
for each of the three genes to fixed and permeabilized cells, stringently washed unbound
probes and finally imaged the cells under the fluorescence microscope. For simultaneous
detection of three different genes, we labeled our probes with spectrally distinguishable fluorophores (Cy5, Alexa Fluor 594, and Cy5) and imaged with the appropriate filter sets.
The large number of fluorophores bound to a single mRNA results in diffraction-limited
fluorescent spots corresponding to single transcripts (Figure 3.1). We found no obvious
sub-cellular localization of these transcripts in the cytoplasm suggesting that ceRNAs are
not uniquely captured in a particular structure. In order to quantify the expression level of
endogenous mRNA in individual cells, we counted the fluorescent spots from the 3D images
of cells using custom MATLAB scripts adapted from (Raj 2008). Each computationally
identified molecule was assigned to a cell by manually tracing individual cell boundaries
based on the DAPI nuclear staining signal.
To robustly estimate expression levels we counted gene expression in 300-400 cells. We
also performed identical smFISH experiments in the miRNA-deficient HCT 116 DICER -/- cell line. In HCT 116, the ceRNAs Cnot6l, Pten and Vapa were expressed at an average of
12, 28 and 82 molecules/cell respectively. Expression levels of Pten, Vapa and Cnot6l were
largely unchanged in the miRNA-deficient DICER -/- cell line. Quantification of Pten and
Vapa mRNA levels with our sensitive single-molecule method revealed a 1:3 ratio, contrary
to the previously reported ratio of 1:100 in the same HCT 116 cell line as estimated
by qPCR (Tay 2011, Ala 2013). We note that this reported disparity in expression levels
had led many authors to conclude that the crosstalk mechanism could not allow a
sender expressed at extremely low levels to affect the levels of a receiver expressed 100-fold
higher [Ebert, Sharp 2012]. Having established our quantitative 3-colour smFISH data set,
we then proceeded to analyze the correlative structure of the single-cell gene expression of
the three ceRNAs.
[Figure 3.1 image: probe-set schematics (PTEN ORF, 48 probes; CNOT6L ORF, 25 probes; VAPA ORF) and a representative smFISH micrograph.]
Figure 3.1 | Measuring Pten, Vapa and Cnot6l gene expression in single cells
with 3-colour single-molecule FISH. (a) Multiple 20-mer oligonucleotide probes for
Pten, Vapa and Cnot6l were constructed and labeled with distinct dyes to allow simultaneous measurement of gene expression with a smFISH assay. (b) Spots corresponding
to single mRNA molecules resulting from the transcription of the genes Vapa (red, detected with oligonucleotide probes coupled to Alexa 594) and Cnot6l (green, oligonucleotide
probes coupled to Cy5) in HCT116 cells. Representative maximum intensity z-projection.
Diffraction-limited spots (molecules) in each channel were automatically identified with a
custom MATLAB script and assigned to individual cells which had been manually segmented based on DAPI nuclear staining.
3.1.2
Presence of shared miRNAs generates correlated fluctuations of
Pten ceRNAs in single cells
We used the smFISH data to determine whether ceRNAs are correlated in individual cells, which
would suggest that shared miRNAs co-regulate their fluctuations, or whether they vary independently,
which would indicate that miRNA coupling occurs on a slower timescale than gene expression
fluctuations. Previous studies of Pten and Vapa have shown that their gene expression levels
are correlated across different bulk tumor samples (Ala 2013). However, any competition for and
sequestration of miRNAs, and the consequent crosstalk, is a single-cell phenomenon, i.e. ceRNAs
sponge miRNAs away from each other within a noisy intracellular environment consisting of
different levels of ceRNAs, miRNAs and RISC/DICER enzymes. Thus it becomes necessary
to study the expression of ceRNAs in single cells to investigate how the presence of shared
miRNA binding sites in the three ceRNAs influences their gene expression. We plotted ceRNA
pairwise gene expression in single cells for both the HCT116 and DICER datasets. Strikingly,
we observed a significant correlation (Pearson correlation coefficient ρ ~ 0.40) between the
gene expression of the three ceRNA pairs in HCT116 cells that was lost in the miRNA-deficient DICER cells (ρ ~ 0.10) (Figure 3.2b,c). Thus, a cell with low or high expression
levels of one of the ceRNAs is likely to be in the corresponding expression state of the other
ceRNAs only in the presence of functional miRNAs.
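A pairwise correlation with a bootstrapped confidence interval can be sketched as below. This is a minimal sketch, assuming that the bootstrap resamples whole cells with replacement and uses percentile intervals; the function name, the number of resamples and the synthetic counts are hypothetical and stand in for the per-cell smFISH counts.

```python
import numpy as np


def pearson_with_bootstrap_ci(x, y, n_boot=2000, alpha=0.05, seed=0):
    """Pearson correlation of per-cell counts with a percentile bootstrap CI."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    rho = np.corrcoef(x, y)[0, 1]
    rng = np.random.default_rng(seed)
    n = len(x)
    boot = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)     # resample cells with replacement
        boot[b] = np.corrcoef(x[idx], y[idx])[0, 1]
    lo, hi = np.quantile(boot, [alpha / 2.0, 1.0 - alpha / 2.0])
    return rho, lo, hi


# Synthetic counts for 300 cells: two genes sharing a common fluctuating
# component (illustrative only, not measured data).
rng = np.random.default_rng(1)
shared = rng.poisson(15, size=300)
gene_a = shared + rng.poisson(15, size=300)
gene_b = shared + rng.poisson(15, size=300)
rho, lo, hi = pearson_with_bootstrap_ci(gene_a, gene_b)
print(f"rho = {rho:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```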
In order to control for possible large-scale transcriptional network imbalances in the
DICER cell line that might result in all genes fluctuating randomly, we performed smFISH
on Twist1 and Pten. Twist1 is a highly expressed transcription factor that induces epithelial
to mesenchymal transition (Yang 2010) and, significantly, does not have any predicted miRNA
binding sites in common with Pten. Moreover, Twist1 had negligible Pten crosstalk strength
in our RNA-seq si-Pten knockdown experiment, making it an attractive negative control. We
found that Twist1 and Pten expression levels were significantly negatively correlated
in HCT 116 cells (ρ = -0.31) and remained negatively correlated in DICER cells (ρ =
-0.29) (Figure 3.3).
[Figure 3.2 image: (a) schematic of two genes coupled by a shared miRNA, with extrinsic noise, intrinsic noise and stoichiometry indicated; (b,c) pairwise scatter plots of PTEN, VAPA and CNOT6L mRNA counts in HCT116 and DICER -/- cells; (d) distributions of Log2(Stoichiometric Ratio) for PTEN / VAPA, PTEN / CNOT6L and VAPA / CNOT6L.]
Figure 3.2 | Crosstalk helps ceRNAs co-fluctuate in single cells thereby tightening their stoichiometric ratios in the presence of active miRNAs. (a) Two
genes that are coupled by a common microRNA (red) will thereby also manage to couple their endogenous
"intrinsic" fluctuations and therefore have reduced deviations in their
stoichiometric expressions (upper marginal histogram) when compared to genes that don't
share miRNAs (grey). (b,c) Using 3-colour smFISH to quantify expression of Pten, Vapa
and Cnot6l in HCT116 and control DICER -/- single cells. Over 300 cells were analysed.
Scatter plots of single-cell transcript counts for each pair of the ceRNAs Pten, Vapa and Cnot6l. The left column is smFISH data for untreated HCT116 and the right column that for control
DICER -/-. Correlation coefficients are on the top right, and show significant loss of correlation
in DICER -/-. Error bars are bootstrapped 95% confidence intervals. (d) Using the data in (b,c)
we computed the ratio of the transcript counts for two genes in each cell and refer to this as
the stoichiometric ratio of the two genes. Red and black curves are the distributions of the log2
(Stoichiometric Ratio) for each pair of genes in HCT116 and DICER cells respectively. The
variance of each distribution is indicated in the top left.
[Figure 3.3 image: scatter plots of Twist1 versus PTEN mRNA counts in (a) HCT116 (ρ = -0.31) and (b) DICER (ρ = -0.29) cells.]
Figure 3.3 | Pten does not lose correlation in DICER for a gene with which it
doesn't share miRNAs. (a,b) Scatter plot of gene expression in single cells for Pten
and Twist1 in HCT116 (left) and DICER (right) cell lines. Twist1 was chosen as a control
as it does not share any predicted miRNA binding sites with Pten and is highly expressed.
The two genes remain negatively correlated in both cell lines. The correlation coefficient is shown on
the top right.
Another possible explanation for the observed difference in ceRNA correlations between
HCT 116 and DICER is that their cell cycles proceed at different rates. However, we had
cultured the two cell lines under identical conditions and found only a small difference in
doubling time between them (~21 h and ~23 h for HCT116 and DICER respectively). Nevertheless, we accounted for a possible cell-cycle mechanism in explaining such a correlation
by calculating the concentration of mRNAs (dividing the number of mRNA molecules in each cell by its
cellular volume) and found a similar loss of ceRNA correlation in DICER cells compared
to HCT 116 cells. Together, these data suggest that individual ceRNAs are correlated, i.e. they co-fluctuate with each other in single cells, due to the buffering effect of active
miRNAs.
Stoichiometric ratio of ceRNAs is tightened by active miRNAs
We speculated that shared miRNAs could couple fluctuations of ceRNAs and thus regulate
the stoichiometry of gene expression. By dynamically buffering individual fluctuations in
each species via miRNA-mediated crosstalk, ceRNAs could have tighter stoichiometric ratios
with each other than with non-miRNA-regulated genes (Figure 3.2a). Cellular processes
are acutely sensitive to changes in dosage for many genes, and thus ceRNAs may be used
in pathways to minimize fluctuations. Pten, for example, is a haplo-insufficient gene such
that even moderate Pten down-regulation resulting from the loss of a single allele may
be tumorigenic (Kwabi-Addo 2001). To compare the range of these ceRNA fluctuations in
the HCT116 and DICER cell lines, we calculated a 'stoichiometric ratio', defined as the
ratio between the individual mRNA counts for each ceRNA pair in each cell. Notably, the
stoichiometric ratio is calculated for each single cell in our dataset, and is thus different
from the Pearson correlation, which is defined for two mRNA count series over an entire cell population. When the distribution of 'stoichiometric ratio' values is plotted for each of the
three ceRNA pairs (Pten & Vapa; Pten & Cnot6l; Vapa & Cnot6l) in HCT116 and DICER
-/- cells, significant differences can be detected between the two cell lines. The distribution of
the ceRNA stoichiometric ratio is tighter in HCT116 cells compared to DICER -/-, as measured
by the variance of the distribution, implying that the loss of active miRNAs in DICER -/-
causes ceRNAs to fluctuate independently of each other (Figure 3.2d).
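The per-cell ratio and its spread can be computed directly from two count vectors. The sketch below assumes that cells with a zero count for either gene are excluded before taking the logarithm and that the sample variance is used; both choices, the function name and the toy counts are assumptions for illustration.

```python
import numpy as np


def log2_ratio_variance(counts_a, counts_b):
    """Variance across cells of log2(count_a / count_b), skipping zero counts."""
    a = np.asarray(counts_a, dtype=float)
    b = np.asarray(counts_b, dtype=float)
    keep = (a > 0) & (b > 0)                 # the log ratio is undefined otherwise
    log_ratio = np.log2(a[keep] / b[keep])
    return log_ratio.var(ddof=1)


# Toy per-cell counts (illustrative only). In the first pair the two genes
# rise and fall together, so the ratio is constant; in the second they do not.
coupled = log2_ratio_variance([10, 20, 40, 80], [5, 10, 20, 40])
uncoupled = log2_ratio_variance([10, 20, 40, 80], [40, 5, 20, 10])
print(coupled, uncoupled)
```

A smaller variance corresponds to a tighter stoichiometric ratio between the two genes.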
3.1.3
Pten, Vapa, Cnot6l are mutually reciprocal ceRNAs
As ceRNAs share miRNA binding sites, it is expected that they should behave in a bidirectional manner, i.e. their interactions should be reciprocal. In order to study their reciprocal
effects, we knocked down each of the 3 transcripts separately (with three biological replicates) using 25 nM
si-Pten, 25 nM si-Vapa or 25 nM si-Cnot6l, and counted the number of transcripts of Pten,
Vapa and Cnot6l simultaneously using smFISH for each of the knockdowns. We had already
quantified the crosstalk strength genome-wide for Pten, Vapa and Cnot6l, as described in
Chapter 2, by knocking them down individually with siRNA and sequencing the transcriptomes; there we observed a significantly greater crosstalk strength in HCT116 compared to
DICER -/- in 4 of the 6 possible sender-receiver pairs. Given that smFISH measurements
yield absolute mRNA expression levels rather than relative RPKM values, we anticipated
that quantifications of crosstalk strength would be more accurate when performed at
single-molecule resolution. Pairwise analysis of scatter plots for each of the receivers reveals that
they are each depleted when any individual sender is knocked down (Figure 3.4a,b,c).
[Figure 3.4 image: (a-c) single-cell scatter plots with marginal histograms of PTEN, VAPA and CNOT6L mRNA counts in wild-type and knockdown cells; (d) bar plots of crosstalk strength (CS) for each sender-receiver pair in HCT116 and DICER -/- cells.]
Figure 3.4 | Measuring crosstalk strength with smFISH for 3 different senders
in HCT116 and DICER -/-. (a) Single-cell mRNA counts with 3-colour FISH on Vapa,
Pten and Cnot6l in WT (black) and 25 nM si-Vapa knockdown (violet). Each dot is the
mRNA count/cell for the two indicated mRNA species. Marginal histograms for each mRNA
in the two different conditions are on the top and right of each scatter plot. Bars indicate the
mean expression in each single-cell distribution (black = WT and violet = si-Vapa). Knocking
down Vapa by 60% results in a 33% reduction of Pten. (b,c) Same as (a) with 25 nM
si-Pten knockdown (pink) and 25 nM si-Cnot6l knockdown (cyan). (d) Crosstalk strength for
a receiver with respect to a sender is defined as in the text. Average CS measured in 3 different
biological replicates for each sender-receiver pair in HCT116 and DICER -/- cells. Error bars
are standard deviations of 3 independent sets of knockdown experiments.
For instance, Pten is reduced by 33% when Vapa levels are knocked down by si-Vapa
by 60%, thus $CS^{Pten}_{Vapa}$ = 0.53 (Figure 3.4a). Similarly we could calculate the crosstalk
strength for each of the six possible sender-receiver pairs. Even though Pten is not as highly
expressed as Vapa, it again emerges as the best sender of crosstalk, as corroborated by
our genome-wide RNA sequencing results. Importantly, though senders suffer similar fold
knockdowns in DICER -/- as in HCT116 cells, the receiver reduction (and consequently
crosstalk strength) is much weaker in DICER -/- cells for all 6 sender-receiver pairs, indicating that mature miRNAs are essential for the crosstalk mechanism (Figure 3.4d). Notably,
we always measure a non-zero 'residual' crosstalk effect in DICER -/-, due to the attenuation
but not elimination of mature microRNAs as reported by TaqMan miRNA qPCR in the
DICER -/- cell line (Tay 2011). Taken together, we find that the ceRNA effect is indeed
bi-directional and miRNA dependent.
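A crosstalk strength can be computed from mean smFISH counts along the following lines. This sketch assumes that crosstalk strength is the fractional change in the receiver divided by the fractional change in the sender, which is how the panel labels of Figure 3.4 read; the formal definition is the one given in Chapter 2. The function names and the per-cell counts are hypothetical.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence of per-cell counts."""
    return sum(values) / len(values)


def fractional_change(control_counts, knockdown_counts):
    """Fractional reduction of the mean count after knockdown."""
    control = mean(control_counts)
    return (control - mean(knockdown_counts)) / control


def crosstalk_strength(receiver_ctrl, receiver_kd, sender_ctrl, sender_kd):
    """Assumed definition: receiver fractional change / sender fractional change."""
    return fractional_change(receiver_ctrl, receiver_kd) / fractional_change(sender_ctrl, sender_kd)


# Hypothetical per-cell mRNA counts (illustrative only).
sender_control = [70, 80, 90]       # mean 80
sender_knockdown = [35, 40, 45]     # mean 40, a 50% knockdown
receiver_control = [45, 50, 55]     # mean 50
receiver_knockdown = [36, 40, 44]   # mean 40, a 20% reduction

cs = crosstalk_strength(receiver_control, receiver_knockdown,
                        sender_control, sender_knockdown)
print(cs)
```

Under this definition a value of 1 would mean the receiver falls by the same fraction as the sender, and a value of 0 would mean the receiver does not respond at all.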
3.1.4
Individual molecules of Pten ceRNAs are colocalized in a miRNA-dependent manner
On inspecting our smFISH dataset closely, we surprisingly found some of the individual
ceRNA molecules were colocalized with each other (Figure 3.5b). As discussed in the
introduction, local concentrations of miRNAs and mRNAs can differ considerably from
average cellular concentrations. If ceRNAs are co-localized with each other, or sequestered
in miRNA processing machinery, then their competition for miRNAs could substantially
increase as their effective local concentrations would be much greater than the average
concentration of all possible competing miRNA binding sites. Put another way, bound
miRNAs released from a sender would have greater propensity to bind to other receiver
mRNAs in its vicinity than diffusing to other far-away binding sites. We speculated that
the high magnitude of crosstalk strength that we observed for the three reciprocal Pten,
Vapa, Cnot6l ceRNAs might be explained by such colocalization of their transcripts.
Quantifying degree of colocalization between ceRNAs
In order to measure the degree of colocalization between ceRNAs, we used the 3-colour
smFISH expression datasets to first identify the precise 3D locations of the centers of each
diffraction-limited fluorescent spot. To do so, we fitted a Gaussian to each spot's intensity
trace in each channel and thereby calculated the centre of each spot. The channels were
aligned using TetraSpeck™ Microspheres (Invitrogen). For each spot in one channel, we then found all
the spots in another channel that lay within 2 pixels in the xy plane and within 2 z-planes, a tolerance chosen
to control for possible stage drift during the imaging procedure. This method allows the
automated quantification of the number of transcripts being colocalized in each single cell.
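The matching step can be sketched with NumPy once spot centres are available. This is not the MATLAB code used for the analysis; it is a minimal sketch assuming that spot centres are given as (x, y, z) with x and y in pixels and z in plane units, that "within 2 pixels in the xy plane" means a Euclidean xy distance of at most 2, and that the channels are already aligned. The function names and coordinates are hypothetical.

```python
import numpy as np


def colocalized_mask(spots_a, spots_b, max_xy=2.0, max_z=2.0):
    """For each spot in channel A, True if any channel-B spot lies within
    max_xy pixels in the xy plane and max_z planes in z."""
    a = np.asarray(spots_a, dtype=float).reshape(-1, 3)
    b = np.asarray(spots_b, dtype=float).reshape(-1, 3)
    if len(a) == 0 or len(b) == 0:
        return np.zeros(len(a), dtype=bool)
    diff = a[:, None, :] - b[None, :, :]          # all pairwise offsets
    dist_xy = np.hypot(diff[..., 0], diff[..., 1])
    dist_z = np.abs(diff[..., 2])
    return ((dist_xy <= max_xy) & (dist_z <= max_z)).any(axis=1)


def colocalized_fraction(spots_a, spots_b, **kwargs):
    """Fraction of channel-A spots that have a channel-B partner in one cell."""
    mask = colocalized_mask(spots_a, spots_b, **kwargs)
    return float(mask.mean()) if len(mask) else float("nan")


# Hypothetical spot centres in one cell (x, y in pixels; z in planes).
channel_a = [(0, 0, 0), (10, 10, 0), (5, 5, 5)]
channel_b = [(1, 1, 1), (5, 5, 9)]
print(colocalized_fraction(channel_a, channel_b))   # fraction of A spots with a B partner
print(colocalized_fraction(channel_b, channel_a))   # fraction of B spots with an A partner
```

The two fractions differ because each is normalized by the number of spots in its own channel, even though the same colocalized pair is being counted.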
[Figure 3.5 image: (a) probe-set schematics for the PTEN, CNOT6L and VAPA ORFs with a dual-colour micrograph; (b) a 3-colour smFISH z-slice; (c) bar plots of colocalization percentages for each ceRNA pair in HCT116 and DICER -/- cells.]
Figure 3.5 | Single molecule FISH shows ceRNAs are colocalized in [...]. (a) Probe sets for the three
transcripts were coupled to different fluorophores, allowing detection of the expression of the three
genes simultaneously. A representative dual-colour image for Vapa and Cnot6l
in HCT116 cells is shown (maximum intensity projection). (b) A single z-slice of a 3-colour
FISH image for Pten, Vapa and Cnot6l in HCT116 cells. Arrows indicate colocalized transcripts
for each pair of genes. (c) We computationally detected the precise 3D location of
each transcript's intensity peak and calculated the percentage of transcripts that are
colocalized between pairs of ceRNAs in HCT116 and the control DICER -/- in different
experimental conditions (indicated below each bar plot). For example, the colocalization percentage
of Vapa with Pten indicates the ratio of colocalized Vapa and Pten molecules to the
total number of Vapa molecules. For each condition, more than 300 cells were analysed and
the colocalization percentage represents the mean colocalization percentage of a cell.
For each pair of ceRNAs, we define the average colocalization fraction as follows:
$$\text{Colocalized fraction of ceRNA}_1\text{ with ceRNA}_2 = \left\langle \frac{\#\text{ of colocalized transcripts of ceRNA}_1\text{ with ceRNA}_2}{\text{total }\#\text{ of transcripts of ceRNA}_1} \right\rangle$$
where $\langle\,\rangle$ denotes the average over all the cells. Note that the colocalization fraction is not
symmetric, as the denominators are different even though the numerators are identical, i.e. the
colocalized fraction of Pten with Vapa will always be greater than the colocalized fraction of
Vapa with Pten because Vapa expression (denominator) is greater than Pten expression (denominator) even though the number of colocalized Pten and Vapa (numerator) molecules
is identical.
In order to test whether the colocalization was miRNA dependent, we measured the colocalization fraction between each ceRNA pair in all our experimental conditions, and for both the HCT116 and DICER -/- cell lines. We found that the colocalization fraction for each ceRNA pair was significantly higher in HCT116 compared to DICER -/- in all the conditions, suggesting that miRNAs were partly responsible for colocalization (Figure 3.5c). The fraction of Cnot6l colocalized with Vapa was surprisingly high and ranged from 25-40% in the siRNA knockdown conditions. Most other ceRNA pairs had colocalization fractions between 2-10% in HCT116. However, this is likely to be a lower bound for colocalization of ceRNA species over a cell cycle because we only take snapshots of gene expression with smFISH. To test the specificity of our colocalization algorithm, and to exclude the possibility that the colocalization was independent of common miRNAs between ceRNAs, we used Twist1 as a negative control. We checked for colocalization between Pten and Twist1, which do not share any miRNA binding sites. We found no colocalization between the two, suggesting that colocalization was specific to ceRNA species. We also estimated a null model for random colocalization in the following manner: we took the probability of two transcripts randomly colocalizing to be the volume of a voxel occupied by a diffraction-limited spot divided by the cellular volume. The size of a voxel for a diffraction-limited spot is ~0.2 µm x 0.2 µm x 0.3 µm while the volume of a cell is ~10 µm x 10 µm x 5 µm; thus the probability of random colocalization is negligible. Taken together, we find that colocalization of ceRNAs is miRNA dependent and differs considerably for each ceRNA pair.
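The null-model arithmetic can be written out as a short Python sketch using the approximate dimensions quoted above. The helper `expected_random_fraction`, which extends the single-pair probability to many partner transcripts under the assumption of independent, uniformly placed spots, is an addition made for illustration and is not part of the original estimate.

```python
# Null-model estimate: chance that one transcript falls in the voxel of another.
voxel_um3 = 0.2 * 0.2 * 0.3      # diffraction-limited spot, in cubic micrometres
cell_um3 = 10.0 * 10.0 * 5.0     # approximate cell volume, in cubic micrometres
p_pair = voxel_um3 / cell_um3    # probability for one specific pair of transcripts

def expected_random_fraction(n_partner, p=p_pair):
    """Chance that a transcript overlaps at least one of n_partner spots.

    Assumes (for illustration only) independent, uniformly placed spots.
    """
    return 1.0 - (1.0 - p) ** n_partner
```

The single-pair probability is of order 10^-5, and even with a hundred partner transcripts per cell the expected random fraction under this assumption stays well below the 2-10% fractions reported for most ceRNA pairs.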
3.2 Discussion
Here we used an smFISH assay to quantify endogenous transcripts of known ceRNAs (Cnot6l, Pten, Vapa) in single cells with single-molecule resolution and analyzed their spatial localization. Our smFISH single-cell measurement of crosstalk strength for these three ceRNAs, which share at least 7 miRNA binding sites, is consistent with the previous chapter's population-level result. However, we measured the crosstalk effects of Vapa, Pten and Cnot6l on each other with much greater accuracy, and found that they affected each other reciprocally, both in mean-level changes and dynamically in single cells. In analyzing the single-cell expression profiles, we uncovered a miRNA-dependent correlation and stoichiometric covariation of ceRNA expression in single cells, along with a miRNA-dependent colocalization of their mRNA molecules. These findings may have important implications for a crosstalk-based mechanism of post-transcriptional regulation.
Firstly, if microRNAs promote a stoichiometric balance among genes that share miRNA
binding sites then this could explain the paradox of weak miRNA repression on individual
targets versus strong evolutionary selection of microRNA-targeting. Stoichiometric balance
is crucial within macromolecular complexes and cellular networks where imbalances can
lead to severe malfunctions. As microRNAs are known to extensively co-target functionally
shared gene networks and proteins in macromolecular complexes, we suggest that microRNAs may be selected for their combinatorial regulation on many different ceRNAs together
rather than on individual targets. The individual repressive effect of a miRNA on its shared
targets would be correlated through the crosstalk channel and allow for stoichiometric expression of a large set of miRNA targets. Such a crosstalk-based co-regulatory mechanism at the transcript level would provide a flexible, adaptive means of compensating for environmental, genetic or random perturbations in mRNA abundance.
Secondly, our observation that ceRNAs exhibit reduced gene expression correlations in miRNA-deficient DICER -/- cells may be taken as a general signature of crosstalk to help in their identification. Putative ceRNAs could be identified without perturbing the cell, i.e. without relying upon either down-regulating or up-regulating the levels of a particular
sender and observing changes in a particular receiver. Instead, the intrinsic variability of sender transcript levels in a cell would correlate with the levels of a receiver through the shared miRNA crosstalk channel. Recent advances in single-cell sequencing technology have made it possible to measure the entire transcriptome of hundreds of cells, and thereby compute single-cell correlations between all possible pairs of genes (Gruen, Kester 2014). Pairs of genes that appear to lose correlation in DICER -/- when compared to HCT116 would thus be attractive ceRNA candidates. Such an unbiased, "loss of correlation" based approach to identifying ceRNAs would circumvent two major limitations of the sender perturbation strategy. The first is the reliance on microRNA-target predictions to identify putative ceRNAs. Computational target predictions are often noisy and have limited accuracy and consistency; in practice, false positives and false negatives in the target predictions often make it difficult to identify mRNAs with common targeting miRNAs. The second is that perturbing a sender mRNA causes a cascade of transcriptional and protein-level changes, which makes the construction of a null model challenging.
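A minimal sketch of the proposed "loss of correlation" screen is given below. It ranks gene pairs by how much their single-cell Pearson correlation drops between the wild-type and DICER -/- conditions. The function names, the input layout (gene name mapped to a list of per-cell counts) and the toy numbers are all assumptions made for illustration; a real screen would additionally need normalization and a statistical test for the difference of correlations.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def rank_by_lost_correlation(wild_type, dicer_ko):
    """Rank gene pairs by r(wild type) - r(DICER -/-), largest loss first.

    Each argument maps gene name -> list of per-cell transcript counts.
    """
    genes = sorted(set(wild_type) & set(dicer_ko))
    scored = []
    for g1, g2 in combinations(genes, 2):
        loss = pearson(wild_type[g1], wild_type[g2]) - pearson(dicer_ko[g1], dicer_ko[g2])
        scored.append(((g1, g2), loss))
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Made-up counts for four cells per condition (illustrative only).
wt = {"A": [10, 20, 30, 40], "B": [12, 19, 33, 41], "C": [5, 9, 4, 8]}
ko = {"A": [10, 20, 30, 40], "B": [30, 12, 25, 14], "C": [6, 8, 5, 9]}
ranking = rank_by_lost_correlation(wt, ko)
```

In this toy data the pair (A, B) is strongly correlated in the wild type but not in the knockout, so it appears at the top of the ranking as the most attractive candidate.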
3.3 Methods
3.3.1 Fluorescent in situ hybridization and imaging
Hybridization and washes were carried out according to previously established protocols (Femino 1998; Raj, 2008). Briefly, we hybridized probes for at least 18 hours at 30 °C and used wash buffers with a formamide concentration of 25%. Optimal washing conditions and probe concentrations were determined empirically for each gene. For nuclear staining, we applied DAPI after the wash steps. Z-stacks of images were taken with a Nikon Ti-E inverted fluorescence microscope equipped with a 100x oil-immersion objective and a Photometrics Pixis 1024B CCD (charge-coupled device) camera using MetaMorph software (Molecular Devices, Downingtown, PA). The image-plane pixel dimension was 0.13 µm and the Z spacing between planes was 0.4 µm.
Table of smFISH experimental conditions

Treatment | Cell-line | smFISH species
untreated | HCT116 and DICER -/- | Pten, Vapa, Cnot6l, Twist1
25 nM si-non-targeting neg. control | HCT116 and DICER -/- | Pten, Vapa, Cnot6l
25 nM si-Pten | HCT116 and DICER -/- | Pten, Vapa, Cnot6l
25 nM si-Vapa | HCT116 and DICER -/- | Pten, Vapa, Cnot6l
25 nM si-Cnot6l | HCT116 and DICER -/- | Pten, Vapa, Cnot6l
3.3.2 Image analysis
The transcript distribution was measured by counting smFISH-labeled mRNA in single cells as previously described (Raj, Bogaard, et al., 2008). Briefly, a Laplacian of Gaussian (LoG) filter is applied to each optical plane of the image stack to enhance the fluorescent signal. To pick up true mRNA spots, an intensity threshold is chosen in the range where the number of identified spots, plotted as a function of the threshold, reaches a plateau. The locations of mRNA spots are then taken to be the regional maximum pixel value of each connected region. Cell boundaries are manually traced using the DAPI and bright-field images. The number of mRNA spots located within the cell boundaries of an individual cell can thus be quantified.
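The threshold-selection step can be illustrated with a small Python sketch. It takes a list of candidate spot intensities (assumed to be already filtered), counts the spots above each trial threshold, and picks the middle of the longest run of thresholds over which the count does not change. The function names, the rule of taking the longest non-zero plateau, and the intensities are assumptions for this example; the published procedure relies on inspecting the plateau rather than on this exact rule.

```python
def spots_above(intensities, threshold):
    """Number of candidate spots whose filtered intensity exceeds the threshold."""
    return sum(1 for value in intensities if value > threshold)

def plateau_threshold(intensities, thresholds):
    """Return (threshold, count) at the middle of the longest constant, non-zero run."""
    counts = [spots_above(intensities, t) for t in thresholds]
    best_start, best_len, start = 0, 1, 0
    for i in range(1, len(counts) + 1):
        # Close the current run when the count changes or the list ends.
        if i == len(counts) or counts[i] != counts[start]:
            if counts[start] > 0 and i - start > best_len:
                best_start, best_len = start, i - start
            start = i
    mid = best_start + best_len // 2
    return thresholds[mid], counts[mid]

# Made-up filtered intensities: dim background candidates plus five bright spots.
candidates = [3, 4, 4, 5, 6, 7, 8, 9, 50, 52, 55, 58, 60]
threshold, n_spots = plateau_threshold(candidates, list(range(0, 65, 5)))
```

For these toy values the count stays at five spots for every threshold between 10 and 45, so the sketch selects a threshold in the middle of that range.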
3.3.3
siRNA transfection and cell culturing
Transfections and cell culturing were carried out as described in Chapter 2.
Chapter 4
MicroRNA-mediated control of protein expression noise
4.1 Background
MicroRNAs regulate a large number of genes in metazoan organisms (Friedman et al., 2009; Lewis et al., 2005; John et al., 2004; Lee et al., 1993; Wightman et al., 1993; Enright et al., 2003) by accelerating mRNA degradation and inhibiting translation (Guo et al., 2010; Lim et al., 2005). Although the physiological function of some microRNAs is known in detail (Lee et al., 1993; Wightman et al., 1993; Brennecke et al., 2003; Johnston and Hobert, 2003), it is not clear why microRNA regulation is so ubiquitous and conserved, since individual microRNAs only weakly repress the vast majority of their target genes (Baek et al., 2008; Selbach et al., 2008) and knockouts rarely result in mutant phenotypes (Miska et al., 2007). One proposed reason for this widespread regulation is the ability of microRNAs to provide robustness to gene expression (Bartel and Chen, 2004; Hornstein and Shomron, 2006), e.g. by buffering stochastic variability in gene expression (Ebert and Sharp, 2012).

Footnote 1: This chapter has been adapted from a paper entitled "MicroRNA control of protein expression noise" that has been published (Science 3 April 2015: 128-132) with lead author Jörn Schmiedel. My contribution was to aid in experimental design and in writing an earlier version of the final paper.
In this work we use mathematical modeling and single cell reporter assays to show that
microRNAs - in conjunction with increased transcription - decrease protein expression
noise for lowly expressed genes, but increase noise for highly expressed genes. Genes that
are regulated by multiple microRNAs show more pronounced noise reduction. We estimate
that hundreds of (lowly expressed) genes in mouse embryonic stem cells have reduced noise
due to substantial microRNA regulation. Our findings therefore suggest that microRNAs
confer precision to protein expression and thus offer plausible explanations for the commonly
observed combinatorial targeting of endogenous genes by multiple microRNAs as well as
the preferential targeting of lowly expressed genes.
Gene expression is inherently variable due to the stochasticity of all molecular reactions
(Raj et al., 2006; see Figure 4.1a). Noise in the expression of a gene is thought to mainly
originate from transcriptional dynamics (Blake et al., 2003; Raj, Peskin, et al., 2006), low
number of mRNA molecules (Ozbudak et al., 2002; Bar-Even et al., 2006) or fluctuations
that propagate to the gene from external sources, such as varying numbers of transcription
factors or ribosomes (Pedraza and van Oudenaarden, 2005; Paulsson, 2004). Previous work
has hypothesized that microRNAs should be able to reduce gene expression noise when
their repressive post-transcriptional effects are antagonized by accelerated transcriptional
dynamics (Ebert and Sharp, 2012; Noorbakhsh et al., 2013). However, this has not been
shown experimentally and since microRNA levels themselves are variable, the propagation
of their fluctuations should theoretically contribute additional gene expression noise.
4.2 Effects of microRNAs on gene expression noise
To explore the effects of endogenous microRNAs on protein expression noise, we adapted a
single-cell plasmid reporter system (Mukherji et al., 2011) to measure microRNA-dependent
expression fluctuations in mouse embryonic stem cells (mESC). The plasmid contains two
genes that encode fluorescent proteins (ZsGreen and mCherry), which are transcribed from
a common bi-directional promoter (Figure 4.1b).
[Figure 4.1: (a) model scheme of a microRNA-regulated gene; (b) reporter plasmid with the pTRE-Tight bi-directional promoter; (c) mCherry versus ZsGreen intensity [a.u.]; (d) mCherry intensity [a.u.] distribution in one ZsGreen bin; (e) noise versus mCherry intensity mean [a.u.] for constructs with one bulged, one perfect or four bulged miR-20a sites, each compared to no 3'UTR.]
Figure 4.1 | Opposing noise effects of microRNA regulation at low and high gene expression. (a) Model scheme for the expression of a microRNA-regulated gene. The microRNA can reversibly bind the mRNA (not depicted) to inhibit its translation and decrease its stability. If the mRNA is degraded in the mRNA-microRNA complex, the microRNA is recycled. Noise in gene expression originates from the stochasticity of molecular reactions (intrinsic noise; jagged reaction arrows), or variability in the cellular machinery (extrinsic noise; external factors with fluctuating levels). (b) The plasmid reporter system. The plasmid carries a pTRE-Tight bi-directional promoter from which ZsGreen and mCherry are transcribed. The mCherry 3'UTR can be modified to contain no or a certain number and type of microRNA binding sites. (c) Overlay of two flow cytometry measurements of mouse embryonic stem cells transiently transfected with two different variants of the reporter system, one with no mCherry 3'UTR (black) and the other with four bulged miR-20a binding sites in the mCherry 3'UTR (blue). For further processing we binned cells according to ZsGreen intensity (red vertical lines) and discarded cells in ZsGreen background (grey) (see Appendix C, Methods). a.u.: arbitrary units. (d) Example of mCherry intensity distributions in one ZsGreen bin. In each bin we calculate the mean and noise, defined as the coefficient of variation (standard deviation divided by mean), of mCherry intensity distributions. (e) Noise of mCherry intensity as a function of mean mCherry intensity in each bin for three different miR-20a regulated constructs (blue) compared to respective unregulated constructs (black). Panels are ordered from left to right according to increasing repression of constructs by miR-20a (cf. Figure C.1). Dots and error bars represent data mean and bootstrapped standard deviation, respectively. Dashed lines and patches represent optimal model fit and 95% confidence interval, respectively.
To probe the effect of microRNAs, we constructed variants of the plasmid with different
numbers and types of microRNA binding sites in the 3'UTR of the mCherry gene. We
transfected plasmids into mESCs and quantified single cell fluorescence two days later using
a flow cytometer (Figure 4.1c). We used ZsGreen fluorescence intensity to bin cells with
similar transcriptional activity (e.g. due to varying plasmid copy numbers) and in each bin
we calculated mean and noise of mCherry intensities over all cells in the bin (Figure 4.1d; see Appendix C, Materials and Methods and Supplementary Note). We define noise
as the standard deviation of the protein expression distribution divided by its mean, which
is an intuitive measure of the relative size of expression fluctuations.
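The binning and noise calculation described above can be sketched as follows. The sketch groups cells by logarithmically spaced ZsGreen bins and reports the mean and coefficient of variation of mCherry in each bin. The function name, the logarithmic bin width, the background cut-off and the intensity values are illustrative assumptions; the actual gating and binning parameters are described in Appendix C.

```python
from math import floor, log10
from statistics import mean, pstdev

def noise_per_bin(zsgreen, mcherry, bins_per_decade=4, background=0.0):
    """Group cells by log10(ZsGreen) and return {bin: (mean, CV)} of mCherry."""
    grouped = {}
    for green, red in zip(zsgreen, mcherry):
        if green <= background:
            continue  # discard cells in the ZsGreen background
        index = floor(log10(green) * bins_per_decade)
        grouped.setdefault(index, []).append(red)
    result = {}
    for index, values in grouped.items():
        if len(values) < 2:
            continue  # a single cell has no meaningful spread
        m = mean(values)
        result[index] = (m, pstdev(values) / m)  # noise = std / mean
    return result

# Made-up intensities for six cells (arbitrary units, illustrative only).
zsgreen = [150, 200, 300, 1500, 2000, 3000]
mcherry = [50, 100, 150, 900, 1000, 1100]
stats = noise_per_bin(zsgreen, mcherry, bins_per_decade=1, background=10)
```

Each entry of `stats` then corresponds to one point in a noise-versus-mean plot such as Figure 4.1e.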
We started by assessing the effects of miR-20a, a microRNA endogenously expressed in mESC, on mCherry protein expression noise (Figure 4.1e). In cells with low mCherry expression, miR-20a regulation reduces noise compared to an unregulated control. In contrast, in cells with high mCherry expression, miR-20a regulation increases noise. These changes in mCherry noise are more pronounced for reporters where miR-20a repression of mCherry protein is stronger, e.g. when using perfect and multiple target sites (Figure 4.1f,g and Figure C.1).
We utilized a mathematical model in order to understand these opposing effects of microRNA regulation on protein expression noise (see Appendix C, Supplementary Model). In this work, we adopt the commonly used decomposition of total noise $\eta_{\mathrm{tot}}$ into intrinsic noise $\eta_{\mathrm{int}}$ and extrinsic noise $\eta_{\mathrm{ext}}$ (Elowitz, Levine, et al., 2002; Swain et al., 2002). Here, squared total noise is the sum of the squared intrinsic and extrinsic noise components:
$$\eta_{\mathrm{tot}}^2 = \eta_{\mathrm{int}}^2 + \eta_{\mathrm{ext}}^2 \qquad (4.1)$$
[Figure 4.2: (a) intrinsic noise, (b) extrinsic noise and (c) total noise, each plotted against protein expression [a.u.], with and without mRNA-miRNA interaction.]
Figure 4.2 | Noise model predictions for a microRNA-regulated gene. (a) Intrinsic noise due to low molecule numbers declines with increasing expression. MicroRNA regulation reduces intrinsic noise as a function of repression, because higher mRNA numbers are necessary and the propagation of noise from the mRNA to the protein level is dampened. (b) MicroRNA regulation results in additional extrinsic noise due to fluctuations in the microRNA pool that are propagated to the target gene, dependent on conferred repression and saturation of the microRNA pool (cf. Figure C.2). (c) The net influence of microRNA regulation is decreased total noise at low and increased total noise at high expression levels.
Intrinsic noise stems from the reactions internal to the expression of the gene and is dominated by transcriptional dynamics and low mRNA copy numbers. Extrinsic noise stems from
fluctuations propagating from external factors to the gene (Figure 4.2a). The modeling
results in two key predictions. Firstly, the model predicts that a microRNA-regulated gene
(reg) has reduced intrinsic noise compared to an unregulated gene (unreg) at equal protein
expression levels; the size of intrinsic noise reduction is approximately equal to the square
root of microRNA-mediated fold-repression r (Figure 4.2a):
$$\eta_{\mathrm{int}}^{\mathrm{reg}} \approx \frac{\eta_{\mathrm{int}}^{\mathrm{unreg}}}{\sqrt{r}} \qquad (4.2)$$
The model predicts that the effect and its size are independent of the mode of repression,
since translational inhibition requires higher mRNA levels and therefore reduces intrinsic
noise resulting from low mRNA copy numbers, while accelerated mRNA degradation dampens the propagation of noise from the mRNA to the protein level (see Appendix C, Supplementary Note 1; Ebert et al., 2012; Pedraza et al., 2005; Fraser et al., 2004). To achieve equal protein expression given increased mRNA turnover, there must be increased transcription rates. Reduction of intrinsic noise can therefore be understood as the combined effect of microRNA-mediated accelerated turnover and increased transcriptional activity (Ebert and Sharp, 2012). Secondly, the model predicts (Figure 4.2b and Figure C.2)
that microRNA regulation acts as an additional extrinsic noise source given by
$$\eta_{\mathrm{ext}}^{\mu} = \phi\,\tilde{\eta}_{\mu} \qquad (4.3)$$
where $\tilde{\eta}_{\mu}$ denotes the noise in the pool of regulating microRNAs (see Appendix C, Supplementary Model), and $\phi$ is the microRNA repression (see Figure C.2). The combined
effects of decreased intrinsic and additional extrinsic noise result in decreased total noise at
low expression, but increased total noise at high expression (Figure 4.2c); model fits, with the microRNA pool noise as the only free parameter, yield accurate agreement with the experimentally observed total noise profiles (Figure 4.1e-g). To distinguish the effects of microRNA regulation on intrinsic and extrinsic noise experimentally, we modified our plasmid reporter system such that both ZsGreen and mCherry are regulated by miR-20a through identical 3'UTRs (Figure 4.3a and Figure C.3a). As a result of this design, both fluorescent reporters share the same regulatory inputs and cellular environment, and intracellular differences in their expression can only result from processes inherent to each gene, i.e. the processes that create intrinsic noise (Elowitz, Levine, et al., 2002; Swain et al., 2002). Results from this experimental design show that miR-20a regulation reduces intrinsic noise compared to an unregulated construct (Figure 4.3b and Figure C.3b). As predicted by our model, the intrinsic noise is reduced by the square root of fold-repression conferred by miR-20a (Figure 4.3c; see also Figure C.3d), confirming our results reported in Figure 4.1e. These results further imply that the observed increase in total noise at high mCherry expression must be due to additional extrinsic noise (Figure C.3c).
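The interplay of the two predictions can be made concrete with a toy calculation. The sketch below combines the decomposition of Eq. 4.1 with the intrinsic noise reduction of Eq. 4.2 and an additional extrinsic term standing in for Eq. 4.3. Several ingredients are assumptions made purely for illustration and are not taken from the model in Appendix C: unregulated intrinsic noise squared is taken to scale as a constant divided by protein expression, the baseline extrinsic noise is constant, the microRNA term is independent of the baseline and therefore adds in squares, and all parameter values are invented.

```python
from math import sqrt

def total_noise(protein, fold_repression=1.0, pool_term=0.0,
                poisson_scale=100.0, base_extrinsic=0.3):
    """Total noise from Eq. 4.1 under simple, assumed scaling laws.

    Assumptions (not from the thesis): unregulated intrinsic noise squared is
    poisson_scale / protein, baseline extrinsic noise is constant, and the
    microRNA pool term is independent of the baseline (adds in squares).
    """
    intrinsic_sq = poisson_scale / protein / fold_repression  # Eq. 4.2, squared
    extrinsic_sq = base_extrinsic ** 2 + pool_term ** 2        # baseline plus microRNA term
    return sqrt(intrinsic_sq + extrinsic_sq)

def crossover_expression(fold_repression, pool_term, poisson_scale=100.0):
    """Expression level at which regulated and unregulated total noise are equal."""
    return poisson_scale * (1.0 - 1.0 / fold_repression) / pool_term ** 2

r, pool = 4.0, 0.2            # hypothetical fold-repression and extrinsic microRNA term
low, high = 100.0, 100000.0   # hypothetical low and high protein expression levels
p_star = crossover_expression(r, pool)
```

Under these assumptions the regulated gene is less noisy than the unregulated one below `p_star` and noisier above it, which is the qualitative behaviour shown in Figure 4.2c; stronger repression or a quieter microRNA pool moves the crossover to higher expression.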
[Figure 4.3: (a) modified reporter plasmid with identical 3'UTRs on both genes; (b) intrinsic noise versus mean mCherry + ZsGreen intensity [a.u.]; (c) intrinsic noise reduction versus sqrt(fold-repression).]
Figure 4.3 | MicroRNA-mediated intrinsic noise effects. (a) Modified plasmid reporter system where ZsGreen and mCherry have identical 3'UTRs, which allows expression-dependent intrinsic noise to be quantified. (b) Intrinsic noise as a function of mean ZsGreen + mCherry intensities in each bin, showing that microRNA regulation reduces intrinsic noise. Dots and error bars represent data mean and bootstrapped standard deviation, respectively. Dashed lines and patches represent optimal model fit and 95% confidence interval, respectively. (c) Measured intrinsic noise reduction for bi-regulated constructs is proportional to square root of fold-repression, as measured independently by mCherry-regulated constructs (cf. Figure C.1). Error bars indicate standard deviation of three biological replicates.
In summary, our data show that miR-20a regulation reduces intrinsic noise while it
increases extrinsic noise of target genes, resulting in lower total noise at low expression but
increased total noise at high expression levels.
Our analyses so far suggest that the reduction of intrinsic noise is a generic property
of microRNAs as post-transcriptional repressors of protein expression and therefore noise
reduction should occur irrespective of the specific microRNAs or the molecular details of
the mRNA-microRNA interaction. In contrast, additional extrinsic noise stems from the
variability of the microRNA pools and should therefore depend on the specific microRNA.
To investigate these hypotheses, we constructed reporters with binding sites for eight additional
microRNAs that are endogenously expressed in mESC over a wide range (Figure
C.4). Since the molecular details of mRNA-microRNA interactions do not affect microRNA-mediated noise effects, we chose perfect target sites to allow for high specificity with respect to the regulating microRNA pool and to optimize measurement signals. The data from all eight reporters consistently show intrinsic noise reduction as large as the square root of fold-repression (Figure C.3e), and we additionally confirmed this by directly measuring intrinsic noise reduction for miR-291a (cf. Figure 4.3c). We furthermore found that AU-rich elements, which induce post-transcriptional repression of protein expression due to binding of various co-factors (Barreau et al., 2005), also reduce intrinsic noise by the square root of fold-repression (Figure C.3f). These data therefore support the hypothesis that reduction of intrinsic noise is a generic property of microRNAs as post-transcriptional repressors that is independent of the specific identity of the regulating microRNA.
Next we used our mathematical model to extract the microRNA pool noise from the fits to the experimental data. We find that microRNA pool noise differs across all assayed microRNAs (Figure 4.4a), while estimates of microRNA pool noise for different constructs assaying the same microRNA are similar (Figure C.7), validating that our model fits can faithfully estimate microRNA pool noise. Although microRNA pool noise decreases for microRNAs that repress the reporters more strongly, it is still substantial even for the most highly expressed microRNAs in mESC (miR-290 cluster, including miR-290, miR-291a and miR-295; Marson et al., 2008). Interestingly, we find that the subset of assayed microRNAs with two independent gene copies producing the identical mature microRNA (Figure 4.4a, marked in red) tends to have lower microRNA pool noise compared to microRNAs that confer similar repression but only have one gene copy (Figure 4.4a, marked in black).
[Figure 4.4: (a) microRNA pool noise versus fold-repression; (b) microRNA pool noise for individual and mixed pools; (c, d) noise versus mCherry intensity mean [a.u.] for the Wee1 and Lats2 3'UTRs, wild-type and mutated; (e) mCherry mRNA levels [RPKM] versus fluorescence, and the mESC transcriptome distribution of mRNA levels [RPKM]; (f) relative noise effects versus mRNA levels [RPKM] for the Wee1, Lats2, Casp2 and Rbl2 3'UTRs.]
Figure 4.4 | Estimation of microRNA pool noise and noise effects for endogenous genes. (a) MicroRNA pool noise estimates from reporters with perfect target sites for nine different microRNAs endogenously expressed in mESC. Subsets of microRNAs with one (black) or two gene copies (red) show negative scaling of pool noise with conferred fold-repression, with the latter subset having lower noise levels. (b) MicroRNA pool noise estimates of individual pools of miR-16, miR-20a and miR-290 compared to mixed pools of miR-16 + miR-20a and miR-20a + miR-290, as determined from a reporter regulated by two perfect target sites for the respective microRNA species. Red bars in columns for mixed pools show the expected microRNA pool noise if individual microRNA sub-pools were fully correlated. (c) Total noise levels for the 3'UTR of the cell cycle regulator Wee1, wild-type (blue) and microRNA binding sites point-mutated (black) versions. (d) Total noise levels for the 3'UTR of the tumor suppressor Lats2, wild-type (blue) and microRNA binding sites point-mutated (black) versions. (e) Mapping fluorescent reporter levels to the transcriptome of mESC. (Upper panel) FACS sorting and least-squares regression were used to determine the conversion between mean mCherry fluorescence intensities and mCherry mRNA levels (as measured by RNA-seq). (Lower panel) The range covered by the fluorescent reporter system in relation to the transcriptome expression (n = 13751) in mESC (25% to ~99% of transcriptome expression). (f) Relative microRNA-mediated effects on total noise in assayed endogenous 3'UTRs compared to their point-mutated 3'UTR versions as a function of transcriptome expression. Blue line and area represent model-based extrapolation of noise effects to transcriptome expression (mean and 95% confidence interval based on parameter estimates of n=3 measurements). Black dots indicate crossover from reduced to increased total noise. Red dots indicate endogenous transcriptome expression of the respective gene in mESC. Red dashed lines indicate maximally expected reduction of total noise given the observed repression. Error bars in (a) & (b) indicate standard deviation of at least three biological replicates. In (c) & (d) dots and error bars represent data mean and bootstrapped standard deviation, respectively. Dashed lines and patches represent optimal model fit and 95% confidence interval.
This suggests that microRNA pools could have lower noise if they consist of independently transcribed microRNAs. We reasoned that these findings should extend to genes that
are regulated by different microRNAs, where uncorrelated fluctuations between the different
microRNAs can average out, resulting in lower noise of the overall pool. To test this hypothesis, we constructed reporters with a perfect target site for miR-20a and an additional
perfect target site for either miR-16 or miR-290 in the mCherry 3'UTR and compared them
to reporters with two perfect target sites for miR-16, miR-20a or miR-290, respectively.
When estimating microRNA pool noise from the total noise profiles (Figure C.8) we find
that the noise levels in the mixed pools are lower than expected if the individual microRNA
pools were fully correlated (see Appendix C, Methods) and can be lower than the noise in
the individual microRNA pools (Figure 4.4b). Taken together these experiments show that
although microRNA regulation increases extrinsic protein expression noise, mixed pools of
microRNAs can attenuate this effect.
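The averaging argument can be illustrated with the standard formula for the variance of a sum of two correlated quantities. The sketch below assumes, purely for illustration, that the mixed pool is the plain sum of two sub-pool abundances; the actual expected values shown as red bars in Figure 4.4b are computed as described in Appendix C, Methods, and may weight the sub-pools differently. The function name and numbers are hypothetical.

```python
from math import sqrt

def mixed_pool_noise(means, noises, correlation):
    """CV of the sum of two microRNA sub-pools with a given correlation.

    Assumes the pool is the plain sum of the two sub-pool abundances.
    """
    (m1, m2), (n1, n2) = means, noises
    s1, s2 = m1 * n1, m2 * n2  # standard deviations of the sub-pools
    variance = s1 ** 2 + s2 ** 2 + 2.0 * correlation * s1 * s2
    return sqrt(variance) / (m1 + m2)

means, noises = (100.0, 100.0), (0.4, 0.4)   # hypothetical sub-pools
fully_correlated = mixed_pool_noise(means, noises, correlation=1.0)
independent = mixed_pool_noise(means, noises, correlation=0.0)
```

For two equal, fully correlated sub-pools the mixed pool is as noisy as each sub-pool, whereas for independent sub-pools the noise drops by a factor of the square root of two, below that of either sub-pool alone.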
So far we investigated microRNA-mediated noise effects using nearly or fully complementary microRNA binding sites in an artificial 3'UTR setting. Endogenous microRNA
targets however often harbor many binding sites, albeit with less complementarity, for different microRNAs in their 3'UTRs (Friedman et al., 2009; Enright et al., 2003; Krek et
al., 2005; Stark et al., 2005). To test if our findings extend also to those situations, we
constructed four mCherry reporters with the 3'UTRs for the genes Wee1, Lats2, Casp2 and Rbl2, which all have multiple binding sites for different microRNAs endogenously expressed in mESC. We then compared protein expression noise for constructs with the wild-type 3'UTRs to versions with point-mutated microRNA binding sites (see Appendix C, Methods). The microRNAs together confer between 3 and 5.5-fold repression for the wild-type 3'UTRs compared to the point-mutated 3'UTRs (Figure C.9a). For all wild-type 3'UTRs we observed reduced total noise at low and intermediate expression compared to the mutated 3'UTRs (Figure 4.4c,d and Figure C.9a). As observed for the artificial 3'UTR constructs, intrinsic noise for the wild-type 3'UTR constructs is reduced by the square root of fold-repression (Figure C.3g), indicating that our previous findings on the reduction of intrinsic noise can be extrapolated to endogenous microRNA targets. Interestingly, total noise is hardly increased at high expression levels and the estimated noise levels for the mixed microRNA pools regulating the endogenous 3'UTRs are very low compared to the noise levels estimated for single microRNA pools (Figure C.9b), consistent with the findings above that mixing of different microRNA species results in lowered microRNA pool noise.
Finally, we determined if the expression range covered by our reporter assay covers
relevant expression levels of endogenous genes. We collected cells at different mCherry fluorescence intensities using fluorescence-activated cell sorting, and measured mCherry mRNA
levels in conjunction with the whole transcriptome using mRNA sequencing (see Appendix C, Methods; Figure C.10a). We find that our reporter assay covers the range of 25% to 99% (~1 RPKM to ~500 RPKM) of expressed genes in mESC (Figure 4.4e), indicating that
the noise effects observed in our reporter assay are relevant to endogenous genes. For all
four 3'UTRs that we assayed with our reporter, reduction of total noise extends in a graded
fashion up to the top 10% of the transcriptome expression distribution (Figure 4.4f).
While most microRNAs individually repress genes only to a small extent (Baek et al., 2008; Selbach et al., 2008), we find that hundreds of genes are substantially repressed (>2 fold) by the combinatorial action of microRNAs in mESC (Figure C.11), as determined from data comparing the transcriptome expression between wild-type and microRNA-deficient Dicer knockout mESC (Leung et al., 2011). Furthermore, most of the highly repressed genes have low expression levels (see Figure C.11; Stark et al., 2005; Farh et al., 2005; Sood et al., 2006), suggesting that these genes should have reduced protein expression noise as a consequence of microRNA regulation.
4.3 Conclusions
Genome-scale analysis of microRNA binding data (Farh et al., 2005; Sood et al., 2006)
has shown that microRNAs preferentially target lowly expressed genes that are dominated
by intrinsic noise, while selectively avoiding ubiquitous and highly expressed genes that
are more sensitive to extrinsic fluctuations. Our integrated theoretical and experimental
approach has shown that microRNAs reduce intrinsic noise while increasing extrinsic noise.
Together these results suggest that a common effect of microRNAs is to reduce gene
expression noise. Our work has further shown that combinatorial microRNA regulation,
a widely observed phenomenon in vivo (Friedman et al., 2009; Enright et al., 2003; Krek
et al., 2005; Stark et al., 2005), enhances overall noise reduction by amplifying repression
and buffering stochastic fluctuations in the abundance of single microRNAs. Combinatorial
microRNA regulation may thus be a potent mechanism to reinforce cellular identity by
reducing gene expression fluctuations that are undesirable for the cell.
The principle established in this work is that fluctuations in protein abundance can be
effectively regulated at the level of transcription. Here, we have focused on the capacity
of microRNAs to regulate gene expression noise; however, any translationally invariant
mechanism that decreases the timescale of mRNA fluctuations will, in principle, produce a
similar effect. This conceptual perspective provides a foundation for studying a broad range
of transcriptional regulators as alternative instruments for controlling protein noise.
Chapter 5
Conclusions and Future Directions
It is now well established that miRNAs play an important role in gene regulation through either translational repression or mRNA degradation. Because a single miRNA can target many different mRNA
species, its impact may extend further. In this thesis we have investigated the ceRNA
hypothesis, which proposes a new layer of post-transcriptional gene regulation mediated by the titration of common miRNAs by competing targets. This RNA-RNA crosstalk
effect is a subject of intense activity and considerable controversy. Indeed, it is difficult to imagine
that perturbing the expression of individual miRNA targets, which account for only a small part
of the total number of binding sites in the cell, could influence enough miRNA to
significantly change the repression of other targets. The focus of this work has been to interrogate key questions about the ceRNA mechanism: its generality, its dependence on shared
miRNAs, and the size of the effect. We aimed to answer these questions by integrating
three kinds of experiments: (a) perturbing the levels of 3 known ceRNAs and systematically
searching for miRNA-mediated crosstalk effects on the transcriptome; (b) modulating the
levels of binding sites in the cell by over-expressing an endogenous PTEN 3'UTR and sorting cells carrying specific amounts of the PTEN 3'UTR to isolate dose-dependent crosstalk
effects; and (c) quantifying the expression and spatial localization of ceRNAs in single cells.
While initial studies of the ceRNA hypothesis were restricted to a few computationally
predicted ceRNAs, our results show that an appreciable crosstalk effect exists quite
pervasively across the genome, i.e., the levels of hundreds of genes, across all expression
scales, appear co-regulated along with the perturbed sender. By carefully selecting
genes whose crosstalk was lowered in a miRNA-deficient control, we could ascertain that
the effect was miRNA-mediated. More specifically, the size of the crosstalk effect
correlates with the number of shared miRNAs and the quality of miRNA binding sites in
the receiver genes. Thus both the overlap of miRNA binding sites and the affinity of those
miRNAs are important determinants of crosstalk. In the case of VAPA, PTEN, and CNOT6L,
we found that shared miRNA binding sites made their interactions reciprocal: perturbations in each caused changes in the other. These findings suggest that combinatorial miRNA
targeting could be a mechanism that cells use to concordantly shape the expression of an
entire class of genes which may function in similar pathways or need to be expressed
stoichiometrically.
[Figure 5.1. Panel labels: binding equation; binding or unbinding. x-axis: free miRNA concentration F (units of Kd).]
Figure 5.1 | Colocalization of ceRNAs can enhance crosstalk by increasing their local concentrations, hence promoting rates of miRNA association between ceRNAs, as free miRNAs
are more likely to bind to nearby mRNAs than to other targets (adapted from Jens and Rajewsky, 2015).
The size of the ceRNA effect was bounded by 1 for all receivers, for each of the 3 senders.
That is, the fold change in a receiver was always lower than the fold change in the perturbed
sender. The existence of a hard bound emerged naturally from our minimal ceRNA model
because each receiver is only weakly repressed by a miRNA and each sender sequesters
only a fraction of the total miRNA pool. However, the moderate crosstalk strength we
measured for many genes was still larger than predicted by our minimal miRNA-ceRNA
model and by other steady-state models of target competition. To examine this discrepancy in
more detail, we used smFISH to measure the intracellular concentrations of these molecules
at the single-molecule level and, surprisingly, found that different ceRNA species colocalize
with each other. Thus, we hypothesize that the strong crosstalk for PTEN, VAPA and
CNOT6L (between 0.2 and 0.5), and possibly other ceRNAs, might be explained by localization.
Effectively, localization renders the available pool of interacting binding sites much smaller
than the total, amplifying crosstalk between select ceRNAs. Put another way, colocalization
of bound ceRNAs increases their local concentrations making it more likely that dynamically
binding/un-binding miRNAs from one ceRNA will bind to another nearby ceRNA. We are
currently working on extending our minimal model to take localization effects into account.
Experimentally, one can apply new multiplexed smFISH technologies to potentially search
for colocalization between multiple ceRNAs (Lee 2014). Though more challenging, with
recently developed technologies to visualize sub-cellular localization of miRNAs (Pitchiaya
2014), one can probe the miRNAs we have identified to search for spatial colocalizations
between ceRNA-miRNA pairs.
Genome-wide ceRNA studies have measured expression changes in population averages
of cells after perturbing either the number of targets or miRNAs, and have found crosstalk effects
to be small. We think miRNA-mediated crosstalk effects are more visible in unperturbed
single cells. As we found in Chapter 4, the pools of different miRNA families can themselves
be quite noisy and propagate noise to their target proteins. Our work suggests that both
miRNA pool noise and miRNA coupling between ceRNAs are mechanisms that suppress their
independent fluctuations, leading to more correlated and even stoichiometric expression of
genes in single cells. However, we caution that we only demonstrated this effect in fixed
cells by observing differences in ceRNA correlations between HCT 116 and miRNA-deficient DICER cells. Future studies could track ceRNA levels dynamically in single cells after
antagonizing specific miRNAs to isolate which miRNAs are responsible for reduced
fluctuations. Measuring correlations or noise would be a more sensitive measure of miRNA-induced
interactions between ceRNAs than perturbing individual ceRNAs.
References
Ebert, M.S., Neilson, J.R., and Sharp, P.A. (2007). MicroRNA sponges: competitive inhibitors of
small RNAs in mammalian cells. Nat. Methods 4, 721-726
Baek, D, Villen, J., Shin, C., Camargo, F.D., Gygi, S.P., and Bartel, D.P. (2008). The impact of
microRNAs on protein output. Nature 455, 64-71.
Selbach, M., B. Schwanhausser, N. Thierfelder, Z. Fang, R. Khanin, N. Rajewsky. (2008).
Widespread changes in protein synthesis induced by microRNAs. Nature 455: 58-63
Bartel DP. (2004). MicroRNAs: genomics, biogenesis, mechanism, and function. Cell 116: 281-297.
Wightman, B., Ha, I. and Ruvkun, G. (1993). Post-transcriptional regulation of the heterochronic
gene lin-14 by lin-4 mediates temporal pattern formation in C. elegans. Cell 75, 855-862.
Reinhart, B. J., Slack, F. J., Basson, M., Pasquinelli, A. E., Bettinger, J. C., Rougvie, A. E. and
Horvitz, H. R. (2000). The 21-nucleotide let-7 RNA regulates developmental timing in
Caenorhabditis elegans. Nature 403, 901-906.
Cai S, Han HJ, Kohwi-Shigematsu T (2003) Tissue-specific nuclear architecture and gene
expression regulated by SATB1. Nat Genet 34(1):42-51
Hansen, T.B., Kjems, J., and Damgaard, C.K. (2013). Circular RNA and miR-7 in cancer. Cancer
Research 73(18), 5609-5612.
Franco-Zorrilla JM, Valli A, Todesco M, Mateos I, Puga MI, Rubio-Somoza I, Leyva A, Weigel D,
Garcia JA, Paz-Ares J. (2007). Target mimicry provides a new mechanism for regulation of
microRNA activity. Nat Genet 39: 1033-1037.
Seitz, H. (2009). Redefining microRNA targets. Curr. Biol. 19, 870-873.
Mayr, C., and Bartel, D.P. (2009). Widespread shortening of 3'UTRs by alternative cleavage
and polyadenylation activates oncogenes in cancer cells. Cell 138, 673-684.
Poliseno, L., Salmena, L., Zhang, J., Carver, B., Haveman, W.J., and Pandolfi, P.P. (2010). A
coding-independent function of gene and pseudogene mRNAs regulates tumour biology. Nature
465, 1033-1038.
Cesana M, Cacchiarelli D, Legnini I, Santini T, Sthandier O, Chinappi M, Tramontano A, Bozzoni
I. (2011). A long noncoding RNA controls muscle differentiation by functioning as a competing
endogenous RNA. Cell 147:358 -369.
Lewis B, Shih I, et al. (2003). Prediction of mammalian microRNA targets. Cell 115(7): 787-798.
Giraldez, A.J., Mishima, Y., Rihel, J., Grocock, R.J., Van Dongen, S., Inoue, K., Enright, A.J.,
and Schier, A.F. (2006). Zebrafish MiR-430 promotes deadenylation and clearance of maternal
mRNAs. Science 312, 75-79.
Ebert, M. S. & Sharp, P. A. (2010). Emerging roles for natural microRNA sponges. Curr. Biol. 20, R858-R861.
Memczak S, et al. (2013). Circular RNAs are a large class of animal RNAs with regulatory
potency. Nature 495:333- 338.
Brewster, R.C., Weinert, F.M., Garcia, H.G., Song, D., Rydenfelt, M., and Phillips, R. (2014). The
transcription factor titration effect dictates level of gene expression. Cell 156, 1312-1323.
Buchler, N.E., and Louis, M. (2008). Molecular titration and ultrasensitivity in regulatory
networks. J. Mol. Biol. 384, 1106-1119.
Tay Y, Kats L, Salmena L, Weiss D, Tan SM, Ala U, Karreth F, Poliseno L, Provero P, Di Cunto F,
Lieberman J, Rigoutsos I, Pandolfi PP. (2011) Coding-independent regulation of the tumor
suppressor PTEN by competing endogenous mRNAs. Cell 147, 344-357.
Karreth FA et al (2011) In vivo identification of tumor-suppressive PTEN ceRNAs in an
oncogenic BRAF-induced mouse model of melanoma. Cell, 147:382-395
Sumazin P, Yang X, Chiu HS, Chung WJ, Iyer A, Llobet-Navas D, Rajbhandari P, Bansal M,
Guarnieri P, Silva J. (2011). An extensive microRNA-mediated network of RNA-RNA interactions
regulates established oncogenic pathways in glioblastoma. Cell 147: 370-381
Yi et al. (2008). A skin microRNA promotes differentiation by repressing 'stemness'. Nature 452,
225-229.
Sluijter, J.P.G. et al. (2010). MicroRNA-1 and -499 regulate differentiation and proliferation in
human-derived cardiomyocyte progenitor cells. Arterioscler. Thromb. Vasc. Biol. 30, 859-868.
Cimmino, A. et al. (2005). miR-15 and miR-16 induce apoptosis by targeting Bcl2. Proc. Natl.
Acad. Sci. USA 102, 13944-13949.
Jens M, Rajewsky N, (2015) Competition between target sites of regulators shapes posttranscriptional gene regulation, Nature Reviews Genetics 16, 113-126
Nitzan M., et al. Interactions between distant ceRNAs in regulatory networks. Biophys. J., 106
(2014), pp. 2254-2266
Salmena, L., Poliseno, L., Tay, Y., Kats, L., and Pandolfi, P. P. (2011). A ceRNA hypothesis: the
Rosetta Stone of a hidden RNA language? Cell 146, 353-358.
Tay, Y, Rinn, J., and Pandolfi, P.P. (2014). The multilayered complexity of ceRNA crosstalk and
competition. Nature 505, 344-352.
Figliuzzi, M., Marinari, E., and De Martino, A. (2013). MicroRNAs as a selective channel of
communication between competing RNAs: a steady-state theory. Biophys. J. 104, 1203-1213.
Bosia C, Pagnani A, Zecchina R. (2013) Modelling Competing Endogenous RNA Networks
PLoS ONE vol. 8 (6) pp. e 66609
Denzler, R., Agarwal, V., Stefano, J., Bartel, D.P., and Stoffel, M. (2014). Assessing the ceRNA
hypothesis with quantitative measurements of miRNA and target abundance. Mol. Cell 54, 766-776.
Cummins J. M. et al. (2006). The colorectal microRNAome. Proc. Natl. Acad. Sci. U.S.A. 103,
3687-3692
Arvey A, Larsson E, Sander C, Leslie CS, Marks DS. (2010). Target mRNA abundance dilutes
MicroRNA and siRNA activity. Mol Syst Biol, 6(363).
Li, H. & Durbin, R. (2010). Fast and accurate long-read alignment with Burrows-Wheeler
transform. Bioinformatics 26, 589-595
Yan H, Choi AJ, Lee BH, Ting AH. (2011) Identification and functional analysis of epigenetically
silenced microRNAs in colorectal cancer cells. PLoS One 6(6):e20628
Garcia et al., Weak seed-pairing stability and high target-site abundance decrease the
proficiency of lsy-6 and other microRNAs. (2011) Nature Structural & Molecular Biology 18,
1139-1146
Robinson M, Oshlack A, (2010) A scaling normalization method for differential expression
analysis of RNA-seq data Genome Biology, 11:R25
Falcon, S. & Gentleman, R. (2007). Using GOstats to test gene lists for GO term association.
Bioinformatics 23, 257-258
Mukherji, S., M. S. Ebert, ..., A. van Oudenaarden. (2011). MicroRNAs can generate thresholds in
target gene expression. Nat. Genet. 43: 854-859
Levine E, McHale P, Levine H. (2007). Small regulatory RNAs may sharpen spatial expression
patterns. PLoS computational biology, 3(11):e233,
Yuan Y, Liu B, Xie P, Zhang MQ, Li Y, Xie Z, Wang X. (2015). Model-guided quantitative analysis
of microRNA-mediated regulation on competing endogenous RNAs using a synthetic gene
circuit. Proc Natl Acad Sci 112: 3158-3163.
Ala U, Karreth FA, Bosia C, Pagnani A, Taulli R, Leopold V, Tay Y, Provero P, Zecchina R,
Pandolfi PP. (2013). Integrated transcriptional and competitive endogenous RNA networks are
cross-regulated in permissive molecular environments. Proc Natl Acad Sci 110: 7154-7159.
R Core Team. (2011). R: a language and environment for statistical computing. R Foundation for
Statistical Computing, Vienna, Austria. http://www.R-project.org/.
Hausser J, Zavolan M. (2014). Identification and consequences of miRNA-target interactions - beyond repression of gene expression. Nat Rev Genet 15: 599-612.
Bosson AD, Zamudio JR, Sharp PA. (2014). Endogenous miRNA and target concentrations
determine susceptibility to potential ceRNA competition. Mol Cell 56: 347-359.
Broderick JA, Zamore PD. (2014). Competitive endogenous RNAs cannot alter microRNA
function in vivo. Mol Cell 54: 711-713.
Supplementary Note
Derivation of the mathematical model of microRNA regulation
In order to investigate the regulation of genes by microRNAs we build a kinetic model describing the expression of
a gene that is regulated on the post-transcriptional level by a microRNA.
We start with a previously published model of microRNA regulation [Mukherji et al., 2011] that we extend to
include the competition of multiple mRNAs for the same microRNA regulation and the turnover of the microRNA
(see later section).
The model is an ordinary differential equation model that describes the temporal evolution of free mRNA levels
$[m_i]$ as well as the levels of the complex between mRNAs and the microRNA $[m_i\mu]$ for an unlimited number of
regulated genes. We denote different genes and the parameters associated with them by subscripts. We assume
that mRNA is transcribed with the constant rate $v_i$ and constitutively degraded with a rate $d_i \cdot [m_i]$. The mRNA
can bind to the free microRNA to reversibly form the complex $m_i\mu$, with the associated on-rate $k_i^{on}$ and off-rate
$k_i^{off}$. When bound in the complex, the mRNA is degraded with the rate $d_i^{\mu} \cdot [m_i\mu]$.
By assuming mass-action kinetics, the ordinary differential equations for the free levels of an mRNA $m_i$ and the
levels of the respective complex $m_i\mu$ can be written as
$$\frac{d[m_i]}{dt} = v_i - d_i \cdot [m_i] - k_i^{on} \cdot [m_i] \cdot [\mu] + k_i^{off} \cdot [m_i\mu] , \tag{1}$$
$$\frac{d[m_i\mu]}{dt} = k_i^{on} \cdot [m_i] \cdot [\mu] - k_i^{off} \cdot [m_i\mu] - d_i^{\mu} \cdot [m_i\mu] . \tag{2}$$
In the beginning we assume that the turnover of microRNAs is much slower and therefore we treat the total
microRNA concentration as constant. Consequently, the following conservation relation holds
$$[\mu_T] = [\mu] + \sum_{j=1}^{N} [m_j\mu] , \tag{3}$$
where $[\mu_T]$, $[\mu]$ and $[m_j\mu]$ are the levels of total microRNA, free microRNA and all complexes formed by the
microRNA with the regulated mRNAs $m_j$ with $j = 1, \ldots, N$, respectively.
Solving equation 2 for steady state, i.e. setting the time derivative of $[m_i\mu]$ equal to zero, we obtain
$$[m_i\mu] = \frac{[m_i] \cdot [\mu]}{K_i} , \tag{4}$$
where $K_i = \frac{k_i^{off} + d_i^{\mu}}{k_i^{on}}$ is the dissociation constant of the mRNA-microRNA interaction. It follows from equation
4 that the concentrations of the complexes formed by two different mRNAs with the microRNA are related by
$$\frac{[m_x\mu]}{[m_y\mu]} = \frac{[m_x]}{[m_y]} \cdot \frac{K_y}{K_x} . \tag{5}$$
Using equations 3 and 5 we can solve the steady state of $[m_i\mu]$ as
$$[m_i\mu] = \frac{[m_i]}{K_i} \cdot \frac{[\mu_T]}{1+w} . \tag{6}$$
Here we define the sum over all free levels of regulated mRNAs normalized by their respective dissociation
constants
$$w = \sum_{j=1}^{N} \frac{[m_j]}{K_j} \tag{7}$$
as the microRNA workload. It follows from equations 4 and 6 that the inverse of one plus the microRNA workload
is the fraction of free microRNA
$$\frac{[\mu]}{[\mu_T]} = \frac{1}{1+w} . \tag{8}$$
The workload describes the sequestration of the microRNA by all regulated mRNAs and therefore captures the
competition between co-regulated genes.
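As a numerical illustration of the model up to this point, the sketch below integrates the mass-action equations for two hypothetical mRNAs that share one microRNA and then checks that the simulated steady state has a free microRNA fraction of 1/(1+w). The function name `simulate`, the forward-Euler scheme and all parameter values are assumptions made for illustration only; the dissociation constant is computed as (k_off + d_mu)/k_on, which is what setting equation 2 to zero implies.

```python
# Sketch: forward-Euler integration of equations 1-3 for N mRNAs sharing one
# microRNA. Function name and all parameter values are illustrative assumptions.

def simulate(v, d, d_mu, k_on, k_off, mu_total, dt=0.001, steps=200_000):
    """Integrate to (near) steady state; return (free mRNAs, complexes, free microRNA)."""
    n = len(v)
    m = [v[i] / d[i] for i in range(n)]   # start from the unregulated level v_i/d_i
    c = [0.0] * n                          # complex levels [m_i mu]
    for _ in range(steps):
        mu_free = mu_total - sum(c)        # conservation relation (equation 3)
        for i in range(n):
            bind = k_on[i] * m[i] * mu_free
            unbind = k_off[i] * c[i]
            m[i] += dt * (v[i] - d[i] * m[i] - bind + unbind)   # equation 1
            c[i] += dt * (bind - unbind - d_mu[i] * c[i])       # equation 2
    return m, c, mu_total - sum(c)


# Two hypothetical mRNAs competing for a microRNA pool of size 2.
v, d, d_mu = [1.0, 0.5], [0.1, 0.1], [0.5, 0.5]
k_on, k_off = [1.0, 1.0], [0.5, 0.5]
m, c, mu_free = simulate(v, d, d_mu, k_on, k_off, mu_total=2.0)

K = [(k_off[i] + d_mu[i]) / k_on[i] for i in range(2)]   # from equation 2 at steady state
w = sum(m[i] / K[i] for i in range(2))                    # microRNA workload
print("free mRNA:", m)
print("free microRNA fraction:", mu_free / 2.0, "vs 1/(1+w):", 1.0 / (1.0 + w))
```

Forward Euler is used only to keep the sketch dependency-free; a stiff parameter set would call for a proper ODE solver.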
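With equation 6 we can write the steady state of the free mRNA levels implicitly as
$$[m_i] = \frac{v_i}{d_i + \frac{d_i^{\mu} \cdot [\mu_T]}{K_i \cdot (1+w)}} = \frac{[m_i^0]}{1 + \frac{[\mu_T^*]}{K_i \cdot (1+w)}} , \tag{9}$$
where we define
$$[m_i^0] = \frac{v_i}{d_i} \tag{10}$$
as the steady state concentration of the mRNA when it is not regulated by a microRNA. Further it is beneficial
to also define the effective total microRNA concentration as
$$[\mu_T^*] = \frac{d_i^{\mu}}{d_i} \cdot [\mu_T] . \tag{11}$$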
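The steady state above is implicit because the workload w depends on every free mRNA level. One way to resolve it numerically, sketched here under stated assumptions, is to bisect on w until the workload implied by the free levels equals the workload used to compute them. The function name `steady_state` and the input values are hypothetical; `x[i]` stands for d_mu_i * mu_T / (d_i * K_i), the effective total microRNA concentration divided by K_i.

```python
# Sketch: self-consistent solution of the implicit steady state by bisection
# on the workload w. Names and numbers are illustrative assumptions.

def steady_state(m0, K, x, tol=1e-12):
    """m0[i] = v_i/d_i, K[i] = dissociation constant,
    x[i] = d_mu_i * mu_T / (d_i * K_i). Returns (free mRNA levels, workload)."""

    def free_levels(w):
        # implicit steady state of each mRNA at a given workload
        return [m0_i / (1.0 + x_i / (1.0 + w)) for m0_i, x_i in zip(m0, x)]

    def mismatch(w):
        # workload implied by the free levels minus the assumed workload
        return sum(m / k for m, k in zip(free_levels(w), K)) - w

    lo, hi = 0.0, sum(m / k for m, k in zip(m0, K))   # w cannot exceed the unregulated workload
    while hi - lo > tol * (1.0 + hi):
        mid = 0.5 * (lo + hi)
        if mismatch(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    w = 0.5 * (lo + hi)
    return free_levels(w), w


levels, w = steady_state(m0=[10.0, 5.0], K=[1.0, 1.0], x=[10.0, 10.0])
print("free mRNA levels:", levels)
print("workload:", w, "free microRNA fraction:", 1.0 / (1.0 + w))
```

Bisection is chosen over plain fixed-point iteration because the mismatch changes sign exactly once on the bracket, so convergence does not depend on the parameter regime.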
To quantify the effect of the microRNA regulation on the free levels of an mRNA we introduce the measure of
repression as
$$R_i = 1 - \frac{[m_i]}{[m_i^0]} . \tag{12}$$
Therefore repression of 0% (Ri = 0) means the free levels of the mRNA are not changed by the microRNA
regulation and repression of 100% (Ri = 1) means the levels of the free mRNA are completely suppressed by the
microRNA regulation.
Using the implicit expression for the steady state of the free mRNA (equation 9), the repression of regulated
mRNAs can be re-written in terms of the workload as
$$R_i = 1 - \frac{[m_i]}{[m_i^0]} \tag{13}$$
$$= \frac{[\mu_T^*] / K_i}{1 + w + [\mu_T^*] / K_i} \tag{14}$$
$$= \frac{x_i}{1 + w + x_i} , \tag{15}$$
where $x_i = \frac{d_i^{\mu} \cdot [\mu_T]}{d_i \cdot K_i}$ is the ratio between the maximal microRNA mediated mRNA degradation rate constant
(at zero microRNA workload, $w = 0$) and the constitutive mRNA degradation rate constant. Therefore, at a given
workload of the microRNA, the repression of any regulated mRNA is simply determined by its ratio between the
maximal microRNA mediated degradation rate constant and the constitutive mRNA degradation rate constant.
The workload at which each mRNA's repression is reduced to half of the maximal repression present at zero
workload $w = 0$ is
$$w \big|_{R_i = \frac{1}{2} R_i(w = 0)} = 1 + x_i . \tag{16}$$
An increased ratio of microRNA mediated to constitutive mRNA degradation rate constant therefore increases
repression and also shifts the loss of repression to higher microRNA workload values.
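A short numerical check of the two statements above, as a sketch: repression written as x/(1 + w + x), and the half-repression workload 1 + x. The function names and the chosen values of x are illustrative assumptions.

```python
# Sketch: repression as a function of workload, and the workload at which
# repression falls to half of its zero-workload value. Values are illustrative.

def repression(x, w):
    """Repression R = x / (1 + w + x) for degradation ratio x and workload w."""
    return x / (1.0 + w + x)

def half_repression_workload(x):
    """Workload at which R equals half of R(w = 0)."""
    return 1.0 + x

for x in (0.1, 1.0, 10.0):
    w_half = half_repression_workload(x)
    print(f"x={x:5.1f}  R(w=0)={repression(x, 0.0):.3f}  "
          f"w_half={w_half:5.1f}  R(w_half)={repression(x, w_half):.3f}")
```

The printed rows show both effects named in the text: a larger x raises the repression at zero workload and moves the half-repression point to a higher workload.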
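The steady state of the mRNA can also be solved explicitly as
$$[m_i] = \frac{1}{2} \left( [m_i^0] - [\mu_T^*] - K_i \cdot (1 + w_i) + \sqrt{\left( [m_i^0] - [\mu_T^*] - K_i \cdot (1 + w_i) \right)^2 + 4 \cdot [m_i^0] \cdot K_i \cdot (1 + w_i)} \right) , \tag{17}$$
where
$$w_i = w - \frac{[m_i]}{K_i} \tag{18}$$
is the workload of the microRNA contributed by all regulated mRNAs except mRNA $m_i$.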
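The explicit solution can be checked against the implicit steady state numerically. The sketch below assumes the explicit form is the positive root of the quadratic obtained by substituting w = w_i + [m_i]/K_i into the implicit equation, and takes the effective total microRNA concentration to be (d_mu/d) times the total. The function name and the input numbers are hypothetical.

```python
import math

# Sketch: explicit steady state of one mRNA given the workload w_other
# contributed by all other regulated mRNAs. Values are illustrative.

def free_mrna_explicit(m0, mu_star, K, w_other):
    """Positive root of m^2 + m*(a + mu_star - m0) - m0*a = 0, a = K*(1 + w_other)."""
    a = K * (1.0 + w_other)
    b = m0 - mu_star - a
    return 0.5 * (b + math.sqrt(b * b + 4.0 * m0 * a))

m0, mu_star, K, w_other = 10.0, 10.0, 1.0, 0.5
m = free_mrna_explicit(m0, mu_star, K, w_other)

# Plug the result back into the implicit form to confirm consistency.
w_total = w_other + m / K
m_implicit = m0 / (1.0 + mu_star / (K * (1.0 + w_total)))
print("explicit:", m, "implicit re-evaluation:", m_implicit)
```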
The competition of co-regulated mRNAs results in an apparent dissociation constant $K_i^*$ for each regulated
mRNA, depending on the workload contributed by all co-regulated mRNAs:
$$K_i^* = K_i \cdot (1 + w_i) . \tag{19}$$
Further, to quantify the influence of an mRNA towards the microRNA, we introduce the fraction of microRNA
sequestered by mRNA $m_i$ as
$$S_i = \frac{[m_i] / K_i}{1 + w} . \tag{20}$$
Quantification of mRNA crosstalk
To investigate the coupling between co-regulated genes that share a common microRNA regulation, we introduce
the measure of crosstalk strength. Crosstalk strength describes the relative change in the free levels of the receiver
$m_r$ upon a relative change in the free levels of the sender $m_s$:
$$C_r = \frac{\partial \ln([m_r])}{\partial \ln([m_s])} = \frac{\partial [m_r]}{\partial [m_s]} \cdot \frac{[m_s]}{[m_r]} . \tag{44}$$
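Using the implicit equation for the steady state of the free mRNA levels (equation 9) and the theorem on implicit
differentiation we can rewrite equation 44 and solve it as
$$C_r = \frac{\frac{[m_s]}{K_s} \cdot \frac{d_r^{\mu} \cdot [\mu_T]}{K_r \cdot (1+w)^2}}{d_r + \frac{d_r^{\mu} \cdot [\mu_T]}{K_r \cdot (1+w)} - \frac{[m_r]}{K_r} \cdot \frac{d_r^{\mu} \cdot [\mu_T]}{K_r \cdot (1+w)^2}} \tag{45}$$
$$= \frac{\frac{[m_s]}{K_s} \cdot \frac{d_r^{\mu} \cdot [\mu_T]}{K_r \cdot (1+w)^2}}{d_r + \frac{d_r^{\mu} \cdot [\mu_T]}{K_r} \cdot \frac{1 + w_r}{(1+w)^2}} \tag{46}$$
$$= \frac{\frac{[m_s]}{K_s} \cdot d_r^{\mu} \cdot [\mu_T]}{d_r \cdot K_r \cdot (1+w)^2 + d_r^{\mu} \cdot [\mu_T] \cdot (1 + w_r)} , \tag{47}$$
where $w_r = w - [m_r]/K_r$ is the workload contributed by all regulated mRNAs except the receiver (cf. equation 18).
Crosstalk strength is always positive, because all terms in equation 47 are positive. And it is always smaller
than 1, because
$$\frac{[m_s]}{K_s} \le w_r < 1 + w_r . \tag{48}$$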
Crosstalk strength can be reformulated in terms of repression and sequestration as
$$C_r = S_s \cdot \frac{R_r}{1 - R_r \cdot S_r} , \tag{49}$$
where $R_r$ is the repression of the receiver in the given state (cf. equation 12) and $S_r$ and $S_s$ are the fractions of
microRNA sequestered by the receiver and the sender, respectively (cf. equation 20).
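The reformulation in terms of repression and sequestration can be compared with the definition of crosstalk strength as a log-log derivative. The sketch below holds the sender's free level as the independent variable, computes the receiver's steady state from the quadratic solution of the implicit steady-state equation, and evaluates both the numerical derivative and S_s * R_r / (1 - R_r * S_r). Function names, the finite-difference step and all parameter values are assumptions for illustration; `w_rest` is the workload of any additional co-regulated mRNAs, held fixed.

```python
import math

# Sketch: crosstalk strength as a numerical log-log derivative versus the
# closed form in terms of repression and sequestration. Values are illustrative.

def receiver_level(m0_r, mu_star_r, K_r, w_other):
    """Explicit receiver steady state given the workload of all other mRNAs."""
    a = K_r * (1.0 + w_other)
    b = m0_r - mu_star_r - a
    return 0.5 * (b + math.sqrt(b * b + 4.0 * m0_r * a))

def crosstalk_numeric(m_s, K_s, m0_r, mu_star_r, K_r, w_rest=0.0, rel=1e-6):
    """Central finite difference of ln[m_r] with respect to ln[m_s]."""
    def log_receiver(ms):
        return math.log(receiver_level(m0_r, mu_star_r, K_r, w_rest + ms / K_s))
    up, down = m_s * (1.0 + rel), m_s * (1.0 - rel)
    return (log_receiver(up) - log_receiver(down)) / (math.log(up) - math.log(down))

def crosstalk_closed_form(m_s, K_s, m0_r, mu_star_r, K_r, w_rest=0.0):
    """S_s * R_r / (1 - R_r * S_r), evaluated at the receiver's steady state."""
    m_r = receiver_level(m0_r, mu_star_r, K_r, w_rest + m_s / K_s)
    w = w_rest + m_s / K_s + m_r / K_r
    S_s = (m_s / K_s) / (1.0 + w)      # fraction sequestered by the sender
    S_r = (m_r / K_r) / (1.0 + w)      # fraction sequestered by the receiver
    R_r = 1.0 - m_r / m0_r             # repression of the receiver
    return S_s * R_r / (1.0 - R_r * S_r)

args = (2.0, 1.0, 10.0, 10.0, 1.0)     # m_s, K_s, m0_r, mu_star_r, K_r
print("numeric:", crosstalk_numeric(*args))
print("closed form:", crosstalk_closed_form(*args))
```

For these illustrative values both routes give a crosstalk strength of roughly 0.245, that is, a receiver fold change well below the sender fold change.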
Further it can be shown that given a certain concentration of the sender $[m_s]$ crosstalk strength will be maximal
when all concentrations of co-regulated mRNAs (including the receiver) are close to zero
$$[m_j] \to 0 \quad \forall j \neq s . \tag{50}$$
Therefore crosstalk strength at a given concentration of the sender $[m_s]$ will always be equal to or less than
$$C_r \le S_s \cdot R_r^{\max} . \tag{51}$$
Equation 51 can also be used to estimate the limits of crosstalk effects among several mRNAs.
Case 1: When the receiver is sequentially influenced by multiple senders, all of which share the same
microRNA regulation, the sum over all crosstalk strengths must be smaller than 1:
$$\sum_{s} C_r \overset{(*)}{\le} R_r^{\max} \cdot \sum_{s} S_s \overset{(**)}{<} R_r^{\max} < 1 . \tag{52}$$
The sum over all crosstalk strengths from different senders can be re-written as the product of repression of the
receiver times the sum over all fractions of microRNA sequestration by the senders (*). The sum over all fractions
of microRNA sequestration must be smaller than 1 (**).
Case 2: When the receiver is sequentially influenced by multiple senders, all of which share a common microRNA
regulation with the receiver, but no common microRNA regulation among each other, the sum over all crosstalk
strengths must be smaller than 1. Let us denote the different microRNA regulators with the index $k$, then we can
formulate this as
$$\sum_{k} C_r^{k} \overset{(*)}{\le} \sum_{k} R_r^{k,\max} \cdot S_s^{k} \overset{(**)}{<} \sum_{k} R_r^{k,\max} \overset{(***)}{<} 1 . \tag{53}$$
The sum over all crosstalk strengths from the different senders with microRNA regulation $k$ towards the receiver
can be re-written as the product of repression of the receiver by microRNA regulation $k$ times the fraction of
microRNA $k$ sequestered by the respective sender (*). The fraction of a microRNA $k$ sequestered by the respective
sender is always smaller than 1 (**). Assuming that the repression effects by different microRNA regulations are additive,
the sum over all receiver repressions by the microRNAs must be smaller than 1 (***).
Also for all cases in-between, where the receiver is influenced by multiple senders, some of which might share
a common microRNA regulation, the sum over all crosstalk strengths must be smaller than 1.
References
[Baccarini et al., 2011] Baccarini, A., Chauhan, H., Gardner, T. J., Jayaprakash, A. D., Sachidanandam, R. and
Brown, B. D. (2011). Kinetic analysis reveals the fate of a microRNA following target regulation in mammalian
cells. Current Biology 21, 369-376.
[Baek et al., 2008] Baek, D., Villén, J., Shin, C., Camargo, F. D., Gygi, S. P. and Bartel, D. P. (2008). The impact
of microRNAs on protein output. Nature 455, 64-71.
[Bruggeman et al., 2009] Bruggeman, F. J., Blüthgen, N. and Westerhoff, H. V. (2009). Noise management by
molecular networks. PLoS Computational Biology 5, e1000506.
[Elf and Ehrenberg, 2003] Elf, J. and Ehrenberg, M. (2003). Fast evaluation of fluctuations in biochemical networks
with the linear noise approximation. Genome Research 13, 2475-2484.
[Gantier et al., 2011] Gantier, M. P., McCoy, C. E., Rusinova, I., Saulep, D., Wang, D., Xu, D., Irving, A. T.,
Behlke, M. A., Hertzog, P. J., Mackay, F. and Williams, B. R. G. (2011). Analysis of microRNA turnover in
mammalian cells following Dicer1 ablation. Nucleic Acids Research 39, 5692-5703.
[Haley and Zamore, 2004] Haley, B. and Zamore, P. D. (2004). Kinetic analysis of the RNAi enzyme complex.
Nature Structural & Molecular Biology 11, 599-606.
[Lim et al., 2003] Lim, L. P., Lau, N. C., Weinstein, E. G., Abdelhakim, A., Yekta, S., Rhoades, M. W., Burge,
C. B. and Bartel, D. P. (2003). The microRNAs of Caenorhabditis elegans. Genes & Development 17, 991-1008.
[Mukherji et al., 2011] Mukherji, S., Ebert, M. S., Zheng, G. X. Y., Tsang, J. S., Sharp, P. A. and van Oudenaarden,
A. (2011). MicroRNAs can generate thresholds in target gene expression. Nature Genetics 43, 854-859.
[Paulsson, 2004] Paulsson, J. (2004). Summing up the noise in gene networks. Nature 427, 415-418.
[Pedraza and van Oudenaarden, 2005] Pedraza, J. M. and van Oudenaarden, A. (2005). Noise propagation in gene
networks. Science 307, 1965-1969.
[Schwanhäusser et al., 2011] Schwanhäusser, B., Busse, D., Li, N., Dittmar, G., Schuchhardt, J., Wolf, J., Chen,
W. and Selbach, M. (2011). Global quantification of mammalian gene expression control. Nature 473, 337-342.