Molecular Titration by MicroRNAs and Target Mimic
Inhibitors
by
Margaret S. Ebert
M.Phil., Molecular Biology
University of Cambridge 2004
B.S., Molecular, Cellular, and Developmental Biology
Yale University 2003
SUBMITTED TO THE DEPARTMENT OF BIOLOGY
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTORATE OF PHILOSOPHY
AT THE
MASSACHUSETTS INSTITUTE OF TECHNOLOGY
SEPTEMBER 2010
© 2010 Massachusetts Institute of Technology
All rights reserved
Signature of Author:
Margaret S. Ebert
Department of Biology
June 18, 2010
Certified by:
Phillip A. Sharp
Institute Professor of Biology
Thesis Supervisor
Accepted by:
Stephen P. Bell
Professor of Biology
Chair, Biology Graduate Committee
ABSTRACT
MicroRNAs (miRNAs) are short, highly conserved non-coding RNA molecules that
repress gene expression in a sequence-dependent manner. Each miRNA is predicted to
target hundreds of genes, and a majority of protein-coding genes are computationally
predicted to be miRNA targets. To test miRNA functions experimentally, we introduced
the miRNA "sponge" method, which uses miRNA target mimics to sequester mature
miRNAs and thereby create continuous miRNA loss of function in cell lines and
transgenic organisms. Sponge RNAs contain complementary binding sites to a miRNA of
interest, and are produced from transgenes within cells. As with most miRNA target
genes, a sponge's binding sites are specific to the miRNA seed region, which allows them
to block a whole family of related miRNAs. This transgenic approach has proven to be a
powerful tool to generate miRNA phenotypes in a variety of experimental systems.
Bulk measurements on populations of cells have indicated that, although pervasive,
repression due to miRNAs is on average quite modest. To assay repression in single cells,
we performed quantitative fluorescence microscopy and flow cytometry to monitor a
target gene's protein expression in the presence and absence of regulation by miRNA.
We found that repression among individual cells varies dramatically. miRNAs establish a
threshold level of target mRNA below which protein production is highly repressed and
above which expression responds ultrasensitively to target mRNA input until reaching
high enough mRNA levels to almost escape repression. We constructed a mathematical
model describing molecular titration of target mRNAs by miRNAs. The model predicted,
and experiments confirmed, that the ultrasensitive regime could be shifted to higher
target mRNA levels by increasing the miRNA concentration or the number of miRNA
binding sites in the 3' untranslated region (UTR) of the target mRNA. Thus even a single
species of miRNA can act both as a switch to effectively silence gene expression and as a
fine-tuner of gene expression. This fits the emerging paradigm in which miRNAs help to
confer robustness to biological processes by reinforcing transcriptional programs,
attenuating leaky transcripts, and perhaps buffering random fluctuations in transcript
copy number.
Thesis Supervisor: Phillip A. Sharp
Title: Institute Professor of Biology
Table of contents
Abstract
Acknowledgments
Chapter 1. Introduction
  miRNA biogenesis in mammalian systems
  Regulation of miRNA biogenesis
  miRNA target recognition
  Mechanisms of gene repression
  Modulation of miRNA-mediated repression
  Target prediction and validation
  miRNA expression profiling
  Functional manipulation
  References
Chapter 2. MicroRNA sponge inhibitors: competitive inhibitors of small RNAs in mammalian cells
  Introduction
  Results
  Discussion
  Methods
  References
  Figures
  Supplementary Information
Chapter 3. MicroRNA sponge inhibitors: progress and possibilities
  Introduction
  Recent applications of miRNA sponges
  Stable miRNA sponge expression
  miRNA sponges in transgenic animals
  Are there natural miRNA sponges?
  Concluding remarks
  References
  Figures
  Supplementary Information
Chapter 4. MicroRNAs generate gene expression thresholds with ultrasensitive transitions
  Introduction
  Results and Discussion
  Methods
  References
  Figures
  Supplementary Information
Chapter 5. Roles for microRNAs in conferring robustness to biological processes
  Introduction
  miRNA-target architectures that increase robustness
  miRNAs attenuate leaky transcripts
  miRNAs set gene expression thresholds for their targets
  miRNAs may buffer transcriptional noise
  Some miRNA phenotypes appear upon stress
  miRNAs, robustness and disease
  Implications for miRNA influence in evolution
  Conclusions
  References
  Figures
Chapter 6. Conclusions and future directions
  Conclusions
  Future directions
  References
  Figures
Appendix. Genome-wide dissection of microRNA functions and co-targeting networks using gene set signatures
  Introduction
  Results
  Discussion
  Experimental Procedures
  References
  Figures
  Supplemental Information
Curriculum vitae
Acknowledgments
The work presented in this thesis was made possible by the generous help of many
friends, colleagues, and teachers. For their contributions I thank the following:
members of the Koch Institute fifth floor labs for advice, protocols and reagents, and for
their welcoming and cooperative spirit;
Koch Institute core facilities staff Glenn Paradis, Michele Perry, and Mike Jennings for
flow cytometry training and cell sorting;
Margarita Siafaca for administrative support;
Mary Lindstrom for lab management and for making figures for this thesis;
members of the Sharp lab 2005-2010, for countless questions answered and conversations
shared, with special thanks to Amanda Young, Joel Neilson, Anthony Leung, Grace
Zheng, Amy Seila White, Mauro Calabrese, Lourdes Aleman, and John Doench for
training on new techniques;
Joe Markson and Peter Ebert for helpful discussions;
Tudor Fulga; Madhu Kumar; Moshe Gatt; Shankar Mukherji, John Tsang, and Gregor
Neuert for friendly and productive collaborations;
thesis committee members Jackie Lees, Dave Bartel, and Alexander van Oudenaarden for
insightful comments and recommendations;
and Phil Sharp for exceptional mentorship and inspiration.
Chapter 1. Introduction
This chapter was written by Margaret S. Ebert.
In the past decade we have witnessed a revolution in molecular biology sparked by the
discovery of RNA interference (RNAi): pathways in which small RNAs complexed with
regulatory proteins sequence-specifically interact with messenger RNAs (mRNAs) to
silence gene expression. This thesis concerns the branch of RNAi called microRNAs
(miRNAs). miRNAs were discovered in 1993 when Victor Ambros's group cloned the C.
elegans lin-4 gene, which was known to control larval developmental timing (Lee et al.
1993). Lin-4 is a non-coding RNA that produces a 22-nucleotide (nt) form with partial
complementarity to sequences in the 3' untranslated region (UTR) of lin-14; lin-4 was
shown genetically to down-regulate LIN-14 protein. The next animal miRNA was not
discovered until 2000, when Gary Ruvkun's group reported that let-7 is a 21-nt RNA that
down-regulates lin-41 and other target genes involved in temporal control of worm
development (Reinhart et al. 2000). Unlike lin-4, let-7 was found to be conserved in other
phyla including vertebrates (Pasquinelli et al. 2000). Since 2000, cloning and sequencing
of small RNAs from a variety of organisms has revealed the expression of miRNAs in
animals, viruses, and plants (Lagos-Quintana et al. 2001, Lau et al. 2001, Reinhart et al.
2002, Pfeffer et al. 2004). Some are deeply conserved, some species-specific. At present
there are several hundred confirmed miRNAs in mammals representing about 200
conserved miRNA families (Chiang et al. 2010).
miRNA biogenesis in mammalian systems
Most miRNAs are processed from longer precursors that are transcribed by Pol II, capped
and polyadenylated (Lee et al. 2004). These transcripts are called pri-miRNAs. A
minority of pri-miRNAs are transcribed by Pol III (Borchert et al. 2006). miRNA
hairpins (stem-loops with imperfect pairing in the stem) are located in intergenic regions
or introns; about 40% are in introns of protein-coding host genes (Rodriguez et al. 2004).
Some occur in clusters of two or more within one pri-miRNA (Baskerville and Bartel
2005). The hairpin is recognized and excised by the nuclear RNaseIII enzyme Drosha and
its double-stranded RNA binding protein partner DGCR8 (Han et al. 2004). Associated
proteins include p68 and p72 RNA helicases and hnRNPs that may promote Drosha
processing of some precursors (Gregory et al. 2004). The excision occurs cotranscriptionally, before the introns are spliced (Morlando et al. 2008). The excised
hairpin, called the pre-miRNA, is ~70 nt long with a 2-nt 3' overhang. A Drosha-independent processing pathway also exists: mirtrons are pre-miRNAs generated by
splicing and debranching of short introns (Ruby et al. 2007, Okamura et al. 2007).
The pre-miRNA is transported to the cytoplasm by Exportin-5 in a Ran-GTP-dependent
manner (Lund et al. 2004). In the cytoplasm, pre-miRNAs are rapidly recognized and cut
by the RNaseIII enzyme Dicer (Hutvágner and Zamore 2002) and its double-stranded
RNA binding protein partner TRBP (Chendrimada et al. 2005) or PACT (Lee et al.
2006). The product is a ~22-nucleotide duplex with 2-nt 3' overhangs. The strands of the
duplex are unwound and one strand is loaded into one of the four Argonaute proteins to
form the core miRNA effector complex (Pillai et al. 2004). Typically the miRNA strand
whose 5' end is less stably paired gets incorporated into Argonaute as the mature miRNA
guide strand (Schwarz et al. 2003). The remaining strand, called the passenger strand or
miRNA star (*) strand, is degraded. Some miRNA duplexes incorporate each strand
frequently, e.g. miR-17-5p and -3p, and the relative amounts of miRNA and miRNA* (or
-5p and -3p) can vary substantially among different tissues (Landgraf et al. 2007, Chiang
et al. 2010). One Dicer-independent miRNA has been discovered: miR-451 is processed
from its pre-miRNA by Ago2 (Cheloufi et al. 2010), the sole Argonaute that has
endonucleolytic activity for paired RNAs (Meister et al. 2004). Biochemical purifications
have identified several proteins that associate with Argonautes and can influence miRNA
loading or co-regulate target mRNAs. These include FMRP (Caudy et al. 2002), Gemin-3
and -4 (Mourelatos et al. 2002), GW182 (Liu et al. 2005a), MOV10 (Meister et al. 2005),
and RCK/p54 (Chu and Rana 2006).
Regulation of miRNA biogenesis
miRNA expression is regulated at multiple levels. Intron-embedded pri-miRNAs undergo
the same transcriptional regulation as their host genes. In embryonic cells and tumor
cells, the Drosha processing of some pri-miRNAs is blocked (Thomson et al. 2006). In
some cell lines Drosha processing is more efficient when there are more cell-cell contacts
(Hwang et al. 2009). The splicing regulatory protein KSRP associates with both Drosha
and Dicer complexes and binds to the loop region of a subset of miRNA precursors,
promoting their processing (Trabucchi et al. 2009). The pri-miRNA can be subject to A-to-I editing by ADAR adenosine deaminase; this can prevent Drosha processing (Yang et
al. 2006) or alter the miRNA sequence and thereby change its target specificity
(Kawahara et al. 2007). The let-7 pre-miRNAs are recognized by Lin28, which recruits
the uridylyl transferase TUT4 to add a 3' oligouridine tail, blocking Dicer processing
(Heo et al. 2009).
miRNA turnover is also regulated. Mature miRNAs are protected through their
association with Argonaute protein complexes (Diederichs and Haber 2008). Perhaps for
this reason they can be very stable - the heart muscle-specific miRNA miR-208 has a
measured in vivo half-life of about 12 days (van Rooij et al. 2007) - but a broad range of
differential miRNA half-lives are observed in cultured cells (Bail et al. 2010). The
turnover of miR-150 occurs rapidly in stimulated T cells (Monticelli et al. 2005) and
miR-122 degradation is accelerated by interferon beta signaling in liver cells (Pedersen et
al. 2007). On the other hand, 3' monoadenylation of mature miR-122 by the polyA
polymerase GLD-2 has a stabilizing effect (Katoh et al. 2009). Non-templated addition of
adenines and uridines to the miRNA 3' end is common and adds to the 3' heterogeneity
that arises from imprecise processing (Chiang et al. 2010).
miRNA target recognition
Most known miRNA-target interactions occur at partially complementary sites in 3'
UTRs. Some binding sites have been identified in coding regions (Rigoutsos 2009) and it
is possible for miRNAs to repress expression by binding to sites in the 5' UTR (Lytle et
al. 2007). The major specificity determinant for miRNA-target binding is called the seed,
which is defined as miRNA nucleotides 2-8 and is the region of highest evolutionary
conservation (Lim et al. 2003) and greatest importance for repression (Doench and Sharp
2004). Structural studies show it is presented for target recognition by the Argonaute
protein, pre-structured for base-pairing, and that miRNA position 1 is not paired to the
target (Parker et al. 2005). Genome-wide analysis of target repression shows the strongest
effects from 8mer seed matches, viz. matches to miRNA positions 2-8 plus an A opposite
position 1 (Baek et al. 2008). More moderate effects are observed with 7mer seed
matches: pairing at positions 2-8, or pairing at positions 2-7 plus an A opposite position
1. miRNAs are grouped into families that share a common seed and whose members may
be encoded at near or distant genomic loci. Seed family members are expected to regulate
the same set of targets, with perhaps slight preferences based on different pairing to the 3'
ends (Grimson et al. 2007). The contribution of the miRNA 3' end to base-pairing is
unresolved and may be minimal (Bartel 2009). Features of the sequence flanking the seed
match also contribute to the strength of repression. Optimal targeting occurs where the
binding site is located more than 15 nt downstream of the stop codon (presumably
placing it out of the way of translating ribosomes); near the proximal or distal end of the
3' UTR rather than in the middle; in an AU-rich, relatively unstructured region; and with
other miRNA binding sites nearby (Grimson et al. 2007). Some cooperativity is observed
where sites are 13-35 nt apart (Saetrom et al. 2007). Sites that are deeply conserved tend
to show stronger repression but non-conserved sites can also be functional (Farh et al.
2005).
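The site types described in this section can be made concrete with a short sketch. It is only an illustration of the definitions given above (8mer; 7mer pairing at positions 2-8; 7mer pairing at positions 2-7 plus an A opposite position 1) and not a reimplementation of any published prediction tool. The function names and the labels "7mer-m8" and "7mer-A1" are assumptions introduced here, matches to positions 2-7 alone are ignored because they are not discussed in the text, and the toy UTR in the usage note is hypothetical.

```python
# Watson-Crick partners for RNA; G:U wobble pairs are deliberately not allowed.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Return the reverse complement of an RNA string written 5'->3'."""
    return "".join(COMPLEMENT[nt] for nt in reversed(rna))

def find_seed_sites(mirna, utr):
    """Locate seed-matched sites for `mirna` in `utr` (both written 5'->3').

    Returns (index, site_type) tuples, where index is the 0-based position in
    the UTR of the 6-nt core that pairs with miRNA positions 2-7.
    """
    mirna = mirna.upper().replace("T", "U")
    utr = utr.upper().replace("T", "U")
    core = reverse_complement(mirna[1:7])    # pairs with miRNA positions 2-7
    pos8_partner = COMPLEMENT[mirna[7]]      # pairs with miRNA position 8
    hits = []
    start = utr.find(core)
    while start != -1:
        # On the mRNA, the partner of position 8 lies 5' of the core and the
        # nucleotide opposite miRNA position 1 lies 3' of it.
        has_m8 = start > 0 and utr[start - 1] == pos8_partner
        has_a1 = start + 6 < len(utr) and utr[start + 6] == "A"
        if has_m8 and has_a1:
            hits.append((start, "8mer"))
        elif has_m8:
            hits.append((start, "7mer-m8"))
        elif has_a1:
            hits.append((start, "7mer-A1"))
        start = utr.find(core, start + 1)
    return hits

LET7A = "UGAGGUAGUAGGUUGUAUAGUU"  # mature let-7a, seed (positions 2-8) = GAGGUAG
```

For example, `find_seed_sites(LET7A, "AAACUACCUCAGGG")` reports a single 8mer site, since the UTR contains CUACCUC (the match to positions 2-8) followed by an A.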
Mechanisms of gene repression
miRNAs were first thought to act through translational repression. In recent years there
have been reports suggesting mechanisms involving steps in both translation initiation
and elongation. The observation of miRNAs and target mRNAs cosedimenting with
polysome fractions supports a post-initiation mechanism such as slowed elongation or
enhanced termination (Olsen and Ambros 1999, Petersen et al. 2006). Other evidence
implicates the initiation step of translation: dependence of repression on the presence of
the m7G mRNA cap (Pillai et al. 2005), or association of the miRNA complex with the
inhibitory factor eIF6, which blocks formation of the 80S ribosome (Chendrimada et al.
2007). The mechanism of translational repression remains unresolved and it is possible
that different miRNAs in different cellular contexts use more than one mechanism. It
appears too that the experimental protocols used in these reports generate different
outcomes: applying different DNA or mRNA transfection methods (Lytle et al. 2007) or
using different promoters for target reporters (Kong et al. 2008) altered the apparent
mechanism of repression. Moreover, the interpretation of some experimental results is
problematic where experimental modulations such as inefficient IRES elements or an
m7A cap create a new rate-limiting step in translation (Nissan and Parker 2008).
Increasingly miRNAs have been shown to act through mRNA degradation in addition to
and independent of translational repression. Targets of let-7 and lin-4 in C. elegans show
enhanced mRNA degradation (Bagga et al. 2005) and in mammalian cells, genome-wide
analysis indicates mRNA knockdown often in excess of translational repression for target
genes (Baek et al. 2008, Hendrickson et al. 2009). Where there is extensive
complementarity to the miRNA, Ago2 can perform endonucleolytic cleavage of target
mRNA (Meister et al. 2004), but the general mRNA knockdown effects are not
dependent on Ago2 catalysis, with the known exception of miR-196 and its almost
perfectly complementary target HoxB8 (Yekta et al. 2004). Rather the mechanism
involves accelerated deadenylation and decapping of the target mRNA (Wu et al. 2006,
Piao et al. 2010). The miRNA complex recruits the CAF1 and CCR4 deadenylases, and
the deadenylated message is then decapped by Dcp1/Dcp2 enzymes, after which it is
vulnerable to 5'-to-3' exonucleolytic decay by Xrn1 and 3'-to-5' decay by the exosome
(Fabian et al. 2009). This accelerated turnover mechanism appears to play a role in the
clearance of many deposited maternal transcripts during the maternal-zygotic transition in
zebrafish and other animals (Giraldez et al. 2006).
The subcellular localization of miRNA repression is also an ongoing area of
investigation. Mature miRNAs are cytoplasmic with the known exception of miR-29b,
which contains a 6-nt nuclear localization sequence in its 3' end (Hwang et al. 2007).
Although miRNA complexes are associated with polysomes, they are also found in P-bodies, cytoplasmic granules that exclude ribosomes and contain RNA degradation
enzymes such as Dcp1/2 and Xrn1 (Liu et al. 2005b). Whether localization in P-bodies is
a cause (Liu et al. 2005a) or consequence (Eulalio et al. 2007) of repression is not entirely
clear. Additionally, the dynamics of miRNA complexes (dis)associating with P-bodies
appear to be slow whereas those of the (dis)association with stress granules, another type
of translationally silent cytoplasmic granule, are rapid (Leung et al. 2006). Upon cellular
stress, Argonaute proteins traffic to stress granules in a miRNA-dependent manner, and
target mRNAs may be stored there for subsequent re-initiation of translation or for
degradation.
Modulation of miRNA-mediated repression
Some miRNA targets show reversible repression. In hippocampal neurons, miR-134-mediated repression of Limk1 near synapses is partially rescued by BDNF treatment
(Schratt et al. 2006). In hepatocarcinoma cells, HuR binds an AU-rich motif in the 3'
UTR of CAT-1 mRNA, releasing it from P-bodies and relieving miR-122-mediated
repression upon amino acid starvation or other stresses (Bhattacharyya et al. 2006). In
zebrafish and mammalian cells, Dnd1 binds several target mRNAs in U-rich regions
adjacent to miRNA binding sites, thereby occluding miRNA binding and rescuing target
expression (Kedde et al. 2007). In addition to modulation of specific targets by RNA-binding proteins, some miRNA-target interactions may be preempted by the expression
of competing target RNAs with strong seed matches (see Chapter 3). There are also
factors that may globally modulate miRNA activity by (de)stabilizing Argonaute
proteins. mLin-41 is an E3 ubiquitin ligase that polyubiquitinates Ago2, promoting its
protein turnover in murine stem cells (Rybak et al. 2009). On the other hand, proline
hydroxylation of Ago2 by C-P4H(I) appears to stabilize the protein (Qi et al. 2008). These
post-translational modifications may be important as Argonaute is a limiting component
in the miRNA pathway: transfection of artificial miRNA is seen to compete with
endogenous miRNAs for loading into Argonaute complexes, such that many miRNA
target genes are partially derepressed (Khan et al. 2009).
Target prediction and validation
We assume that miRNAs function through their target genes, so it is critical to identify
the set of target genes for each miRNA. Computational methods such as TargetScan and
PicTar score all or some of the following parameters in annotated 3' UTRs: number of
seed matches, type of seed matches (e.g. 8mer match or 7mer match), context around the
site, and species conservation (Lewis et al. 2003, Krek et al. 2005, Friedman et al. 2009).
With these approaches, a typical mammalian miRNA has several hundred predicted
targets.
To validate target predictions, there are several commonly used assays. One approach is
to append the 3' UTR or UTR fragment of a predicted target onto a reporter gene such as
luciferase, and transfect cultured cells with the reporter plasmid. By comparing the
average protein output of a wild-type UTR reporter with that of a version with the miRNA binding sites
mutated (typically with point mutations in the seed) in cells expressing the miRNA of
interest, one can assess the degree of repression (Lewis et al. 2003). Adding or inhibiting
the miRNA should modulate the strength of repression. While the luciferase assay is
convenient and sensitive, its limitations are beginning to be recognized: since it measures
the population average of transfected cells, it may obscure substantial cell-to-cell
differences in repression. Furthermore, the luciferase reporters are typically driven not by
cellular promoters but by strong viral promoters that may produce enough target mRNA
to overwhelm the pool of endogenous miRNAs and under-report the miRNA activity
relative to its physiological activity (see Chapter 4).
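The reporter comparison described above reduces to a simple ratio, sketched below. Normalizing each well to a co-transfected control reporter is an assumption added here (it is common practice but is not stated in the text), and the function names and numerical readings are hypothetical. Because the calculation uses only the population mean, it returns the same value whether every cell is repressed two-fold or half the cells are fully silenced, which is the limitation noted above.

```python
from statistics import mean

def normalized_output(reporter, control):
    """Mean reporter/control ratio across replicate wells."""
    return mean(r / c for r, c in zip(reporter, control))

def fold_repression(wt_reporter, wt_control, mut_reporter, mut_control):
    """Output of the seed-mutant UTR reporter relative to the wild-type UTR.

    Values above 1 indicate that the intact miRNA binding sites lower
    reporter output in cells expressing the miRNA.
    """
    wt = normalized_output(wt_reporter, wt_control)
    mut = normalized_output(mut_reporter, mut_control)
    return mut / wt

# Hypothetical readings (arbitrary units), three replicate wells each.
wt_reporter, wt_control = [50.0, 55.0, 45.0], [100.0, 100.0, 100.0]
mut_reporter, mut_control = [100.0, 110.0, 90.0], [100.0, 100.0, 100.0]
repression = fold_repression(wt_reporter, wt_control, mut_reporter, mut_control)
```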
Genome-wide approaches are also used to provide evidence for target predictions.
Microarray or RNA deep sequencing in the presence and absence of a miRNA of interest
reveals changes in target mRNA abundance (Lim et al. 2005). To capture total repression
including translational repression, mass spectrometry measures target protein abundance
(Baek et al. 2008), and ribosome profiling measures translational activity on target
mRNAs (Hendrickson et al. 2009). With all of these methods, the degree of target
repression can be correlated to the number and quality of miRNA binding sites in the
target genes. Unlike in the UTR reporter assay, however, repression is not necessarily due
to direct targeting but may also reflect secondary effects. The biochemical association of
miRNA complexes with targets can be assayed by immunopurification of Argonaute
complexes followed by isolation and sequencing of the associated mRNAs (Karginov et
al. 2007, Azuma-Mukai et al. 2008). More recently, this method has been improved by
means of cross-linking the complexes in cells (Chi et al. 2009, Zisoulis et al. 2010), in
some cases site-specifically (Hafner et al. 2010), before immunopurification of the
miRNA-target complexes.
miRNA expression profiling
The first miRNAs were discovered by classical genetics, but most have been identified by
small RNA cloning. This procedure takes advantage of the expected size range (~19-25
nt, amenable to gel purification of total RNA) and end chemistries (5' monophosphate
and 3' hydroxyl, amenable to oligoribonucleotide linker ligation) of the mature miRNAs.
Cloning analyses from a myriad of human and rodent tissues led to the compilation of a
basic miRNA expression atlas (Landgraf et al. 2007). More recently, high-throughput
Illumina sequencing has enabled confident identification of even very rare sequences
(Chiang et al. 2010). Further evidence that a newly discovered small RNA is a miRNA
includes the existence of a hairpin precursor, and dependence on Drosha and Dicer for
expression.
miRNA expression profiles vary not only by tissue but also by developmental stage and
other conditions. A common practice is to screen miRNA expression in a tissue of
interest comparatively, for example before and after differentiation (Sempere et al. 2004,
Xie et al. 2009); in diseased tissue and its healthy counterpart (Ikeda et al. 2007,
Valastyan et al. 2009); or before and after the application of a specific stimulus (Krol et
al. 2010), to find miRNAs whose expression increases or decreases substantially. Such
miRNAs then become candidates for regulating the process in question. To test whether
they play a causal role, one can experimentally manipulate the miRNA expression or
activity in several ways.
Functional manipulation
Gain-of-function approaches increase the expression of the miRNA by introducing a
hairpin precursor (Dickens et al. 2005) or a transfectable miRNA duplex (Doench et al.
2003). Cells treated in this manner can provide useful information about targets and
phenotypes: adding brain- or muscle-specific miRNA to HeLa cells changed their mRNA
expression profiles to resemble those of the corresponding tissue type (Lim et al. 2005),
and ectopically expressing a B cell-specific miRNA in hematopoietic progenitor cells
increased the fraction of cells committed to the B lineage (Chen et al. 2004). Nonetheless,
ectopic miRNA expression does not necessarily reveal physiological targets. Gain-of-function experiments are more natural models when they serve to restore physiological
miRNA function, as by introducing let-7 precursors to lung cancer cells that have
aberrantly down-regulated expression of the endogenous let-7 family members (Kumar et
al. 2008).
Loss-of-function approaches for miRNAs include genetic knockouts (Thai et al. 2007,
van Rooij et al. 2007, Johnnidis et al. 2008); chemically modified antisense
oligonucleotide inhibitors (Hutvágner et al. 2004); and target mimic inhibitors called
miRNA sponges (see Chapters 2 and 3 for a detailed discussion of these strategies).
Whether by abrogating the miRNA's expression or preventing it from accessing targets,
deletion and inhibitor strategies cause derepression of the set of target genes.
Alternatively one can block a specific miRNA-target interaction using 'target protector'
oligonucleotides that pair to the miRNA binding site and its gene-specific flanking
sequence (Choi et al. 2007). In some cases miRNA-target interactions are disrupted or
created by natural mutations. For example, a chromosomal translocation in the oncogene
HMGA2 results in removal of a region of the 3' UTR that contains multiple let-7 binding
sites (Mayr et al. 2007). In sheep, a single nucleotide polymorphism in the 3' UTR of the
myostatin gene creates a seed match for a muscle-specific miRNA, resulting in muscle
hypertrophy (Clop et al. 2006).
To date miRNAs have been implicated in controlling embryonic development, regulating
the physiology of many organs of the body, and preventing or exacerbating human
diseases. Nonetheless, much remains to be learned about the target genes and functions of
miRNAs. This thesis addresses some of the fundamental properties of miRNA-target
interactions in mammalian cells. Chapter 2 describes a method that we developed to
inhibit specific miRNA seed families in cell lines and in transgenic animals. Chapter 3
reviews the expanding applications of this loss-of-function method and their
contributions to the field to date. It also considers the potential for endogenous transcripts
to act as target mimics to inhibit miRNAs in the same manner as our artificial inhibitors.
Chapter 4 describes the results of assaying miRNA activity in single mammalian cells
with a quantitative reporter for both target gene transcription and target protein
expression. As will be seen, the target gene expression threshold set by miRNA
concentration and miRNA binding sites helps explain the efficacy of sponge inhibitors:
above a certain level of target mRNA, the endogenous pool of miRNA is overwhelmed
and additional target transcripts are free to be translated. Chapter 5 explores how the gene
expression threshold fits the emerging roles of miRNAs in conferring robustness to gene
expression. One of those potential roles, the buffering of random fluctuations in protein
output, could be assayed with an experimental model described in Chapter 6. Another
role, the coordinate regulation of multiple components of signaling pathways and protein
complexes, is described in the Appendix.
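The threshold behavior summarized above can be illustrated with a minimal titration sketch. It assumes an idealized limit in which every miRNA molecule binds a site irreversibly, each transcript carries one site, and target and sponge sites compete with equal affinity; the function name and the unit-free quantities are hypothetical and are not taken from the model in Chapter 4.

```python
def free_target(target, mirna, sponge=0.0):
    """Unbound target mRNA under an idealized tight-binding titration.

    All arguments are copy numbers in arbitrary units (illustrative only).
    Assumes one binding site per transcript and equal affinity for
    target and sponge sites.
    """
    total_sites = target + sponge
    if total_sites <= mirna:
        # Below the threshold: every site is occupied by a miRNA.
        return 0.0
    # Above the threshold: the miRNA pool is spread over all sites.
    bound_fraction = mirna / total_sites
    return target * (1.0 - bound_fraction)


if __name__ == "__main__":
    # Sweep target level at a fixed miRNA pool, without and with a sponge.
    for t in (50, 100, 200, 400):
        print(t, free_target(t, 100), round(free_target(t, 100, sponge=1000), 1))
```

Without a sponge, the free target stays at zero until the target level exceeds the miRNA pool and then rises linearly. With a large excess of sponge sites, most target transcripts are free at every target level, which is the qualitative behavior the sponge method relies on.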
References
Azuma-Mukai A, Oguri H, Mituyama T, Qian ZR, Asai K, Siomi H, Siomi MC.
Characterization of endogenous human Argonautes and their miRNA partners in RNA
silencing. Proc. Natl Acad. Sci. USA 105, 7964-7969 (2008).
Baek D, Villen J, Shin C, Camargo FD, Gygi SP, Bartel DP. The impact of microRNAs
on protein output. Nature 455, 64-71 (2008).
Bagga S, Bracht J, Hunter S, Massirer K, Holtz J, Eachus R, Pasquinelli AE. Regulation
by let-7 and lin-4 miRNAs results in target mRNA degradation. Cell 122, 553-563
(2005).
Bail S, Swerdel M, Liu H, Jiao X, Goff LA, Hart RP, Kiledjian M. Differential regulation
of microRNA stability. RNA 16, 1032-1039 (2010).
Bartel DP. MicroRNAs: target recognition and regulatory functions. Cell 136, 215-233
(2009).
Baskerville S, Bartel DP. Microarray profiling of microRNAs reveals frequent
coexpression with neighboring miRNAs and host genes. RNA 11, 241-247 (2005).
Beilharz TH, Humphreys DT, Clancy JL, Thermann R, Martin DI, Hentze MW, Preiss T.
microRNA-mediated messenger RNA deadenylation contributes to translational
repression in mammalian cells. PLoS One 4, e6783 (2009).
Bhattacharyya SN, Habermacher R, Martine U, Closs EI, Filipowicz W. Relief of
microRNA-mediated translational repression in human cells subjected to stress. Cell 125,
1111-1124 (2006).
Borchert GM, Lanier W, Davidson BL. RNA polymerase III transcribes human
microRNAs. Nat. Struct. Mol. Biol. 13, 1097-1101 (2006).
Caudy AA, Myers M, Hannon GJ, Hammond SM. Fragile X-related protein and VIG
associate with the RNA interference machinery. Genes Dev. 16, 2491-2496 (2002).
Cheloufi S, Dos Santos CO, Chong MM, Hannon GJ. A dicer-independent miRNA
biogenesis pathway that requires Ago catalysis. Nature (2010).
Chen CY, Zheng D, Xia Z, Shyu AB. Ago-TNRC6 triggers microRNA-mediated decay
by promoting two deadenylation steps. Nat. Struct. Mol. Biol. 16, 1160-1166 (2009).
Chen CZ, Li L, Lodish HF, Bartel DP. MicroRNAs modulate hematopoietic lineage
differentiation. Science 303, 83-86 (2004).
Chendrimada TP, Finn KJ, Ji X, Baillat D, Gregory RI, Liebhaber SA, Pasquinelli AE,
Shiekhattar R. MicroRNA silencing through RISC recruitment of eIF6. Nature 447, 823-828 (2007).
Chendrimada TP, Gregory RI, Kumaraswamy E, Norman J, Cooch N, Nishikura K,
Shiekhattar R. TRBP recruits the Dicer complex to Ago2 for microRNA processing and
gene silencing. Nature 436, 740-744 (2005).
Chi SW, Zang JB, Mele A, Darnell RB. Argonaute HITS-CLIP decodes microRNA-mRNA interaction maps. Nature 460, 479-486 (2009).
Chiang HR, Schoenfeld LW, Ruby JG, Auyeung VC, Spies N, Baek D, Johnston WK,
Russ C, Luo S, Babiarz JE, Blelloch R, Schroth GP, Nusbaum C, Bartel DP. Mammalian
microRNAs: experimental evaluation of novel and previously annotated genes. Genes
Dev. 24, 992-1009 (2010).
Chu CY, Rana TM. Translation repression in human cells by microRNA-induced gene
silencing requires RCK/p54. PLoS Biol. 4, e210 (2006).
Clop A, Marcq F, Takeda H, Pirottin D, Tordoir X, Bibé B, Bouix J, Caiment F, Elsen
JM, Eychenne F, Larzul C, Laville E, Meish F, Milenkovic D, Tobin J, Charlier C,
Georges M. A mutation creating a potential illegitimate microRNA target site in the
myostatin gene affects muscularity in sheep. Nat. Genet. 38, 813-818 (2006).
Dickins RA, Hemann MT, Zilfou JT, Simpson DR, Ibarra I, Hannon GJ, Lowe SW.
Probing tumor phenotypes using stable and regulated synthetic microRNA precursors.
Nat. Genet. 37, 1289-1295 (2005).
Diederichs S, Haber DA. Dual role for argonautes in microRNA processing and
posttranscriptional regulation of microRNA expression. Cell 131, 1097-1108 (2007).
Doench JG, Petersen CP, Sharp PA. siRNAs can function as miRNAs. Genes Dev. 17,
438-442 (2003).
Doench JG, Sharp PA. Specificity of microRNA target selection in translational
repression. Genes Dev. 18, 504-511 (2004).
Eulalio A, Behm-Ansmant I, Schweizer D, Izaurralde E. P-body formation is a
consequence, not the cause, of RNA-mediated gene silencing. Mol. Cell Biol. 27, 3970-3981 (2007).
Eulalio A, Huntzinger E, Izaurralde E. GW182 interaction with Argonaute is essential for
miRNA-mediated translational repression and mRNA decay. Nat. Struct. Mol. Biol. 15,
346-353 (2008).
Eulalio A, Huntzinger E, Nishihara T, Rehwinkel J, Fauser M, Izaurralde E.
Deadenylation is a widespread effect of miRNA regulation. RNA 15, 21-32 (2009).
Fabian MR, Mathonnet G, Sundermeier T, Mathys H, Zipprich JT, Svitkin YV, Rivas F,
Jinek M, Wohlschlegel J, Doudna JA, Chen CY, Shyu AB, Yates JR 3rd, Hannon GJ,
Filipowicz W, Duchaine TF, Sonenberg N. Mammalian miRNA RISC recruits CAF1 and
PABP to affect PABP-dependent deadenylation. Mol. Cell 35, 868-880 (2009).
Farh KK, Grimson A, Jan C, Lewis BP, Johnston WK, Lim LP, Burge CB, Bartel DP.
The widespread impact of mammalian MicroRNAs on mRNA repression and evolution.
Science 310, 1817-1821 (2005).
Feinbaum R, Ambros V. The timing of lin-4 RNA accumulation controls the timing of
postembryonic developmental events in Caenorhabditis elegans. Dev. Biol. 210, 87-95
(1999).
Friedman RC, Farh KK, Burge CB, Bartel DP. Most mammalian mRNAs are conserved
targets of microRNAs. Genome Res. 19, 92-105 (2009).
Giraldez AJ, Mishima Y, Rihel J, Grocock RJ, Van Dongen S, Inoue K, Enright AJ,
Schier AF. Zebrafish MiR-430 promotes deadenylation and clearance of maternal
mRNAs. Science 312, 75-79 (2006).
Gregory RI, Yan KP, Amuthan G, Chendrimada T, Doratotaj B, Cooch N, Shiekhattar R.
The Microprocessor complex mediates the genesis of microRNAs. Nature 432, 235-240
(2004).
Grimson A, Farh KK, Johnston WK, Garrett-Engele P, Lim LP, Bartel DP. MicroRNA
targeting specificity in mammals: determinants beyond seed pairing. Mol. Cell 27, 91-105 (2007).
Grishok A, Pasquinelli AE, Conte D, Li N, Parrish S, Ha I, Baillie DL, Fire A, Ruvkun
G, Mello CC. Genes and mechanisms related to RNA interference regulate expression of
the small temporal RNAs that control C. elegans developmental timing. Cell 106, 23-34
(2001).
Hafner M, Landthaler M, Burger L, Khorshid M, Hausser J, Berninger P, Rothballer A,
Ascano M Jr, Jungkamp AC, Munschauer M, Ulrich A, Wardle GS, Dewell S, Zavolan
M, Tuschl T. Transcriptome-wide identification of RNA-binding protein and microRNA
target sites by PAR-CLIP. Cell 141, 129-141 (2010).
Han J, Lee Y, Yeom KH, Kim YK, Jin H, Kim VN. The Drosha-DGCR8 complex in
primary microRNA processing. Genes Dev. 18, 3016-3027 (2004).
Hendrickson DG, Hogan DJ, McCullough HL, Myers JW, Herschlag D, Ferrell JE,
Brown PO. Concordant regulation of translation and mRNA abundance for hundreds of
targets of a human microRNA. PLoS Biol. 7, e1000238 (2009).
Heo I, Joo C, Cho J, Ha M, Han J, Kim VN. Lin28 mediates the terminal uridylation of
let-7 precursor MicroRNA. Mol. Cell 32, 276-284 (2008).
Heo I, Joo C, Kim YK, Ha M, Yoon MJ, Cho J, Yeom KH, Han J, Kim VN. TUT4 in
concert with Lin28 suppresses microRNA biogenesis through pre-microRNA uridylation.
Cell 138, 696-708 (2009).
Hutvágner G, Simard MJ, Mello CC, Zamore PD. Sequence-specific inhibition of small
RNA function. PLoS Biol. 2, E98 (2004).
Hwang HW, Wentzel EA, Mendell JT. A hexanucleotide element directs microRNA
nuclear import. Science 315, 97-100 (2007).
Hwang HW, Wentzel EA, Mendell JT. Cell-cell contact globally activates microRNA
biogenesis. Proc. Natl Acad. Sci. USA 106, 7016-7021 (2009).
Ikeda S, Kong SW, Lu J, Bisping E, Zhang H, Allen PD, Golub TR, Pieske B, Pu WT.
Altered microRNA expression in human heart disease. Physiol. Genomics 31, 367-373
(2007).
Jin P, Zarnescu DC, Ceman S, Nakamoto M, Mowrey J, Jongens TA, Nelson DL, Moses
K, Warren ST. Biochemical and genetic interaction between the fragile X mental
retardation protein and the microRNA pathway. Nat. Neurosci. 7, 113-117 (2004).
Johnnidis JB, Harris MH, Wheeler RT, Stehling-Sun S, Lam MH, Kirak O,
Brummelkamp TR, Fleming MD, Camargo FD. Regulation of progenitor cell
proliferation and granulocyte function by microRNA-223. Nature 451, 1125-1129 (2008).
Karginov FV, Conaco C, Xuan Z, Schmidt BH, Parker JS, Mandel G, Hannon GJ. A
biochemical approach to identifying microRNA targets. Proc. Natl Acad. Sci. USA 104,
19291-19296 (2007).
Katoh T, Sakaguchi Y, Miyauchi K, Suzuki T, Kashiwabara S, Baba T, Suzuki T.
Selective stabilization of mammalian microRNAs by 3' adenylation mediated by the
cytoplasmic poly(A) polymerase GLD-2. Genes Dev. 23, 433-438 (2009).
Kawahara Y, Zinshteyn B, Sethupathy P, Iizasa H, Hatzigeorgiou AG, Nishikura K.
Redirection of silencing targets by adenosine-to-inosine editing of miRNAs. Science 315,
1137-1140 (2007).
Kedde M, Strasser MJ, Boldajipour B, Oude Vrielink JA, Slanchev K, le Sage C, Nagel
R, Voorhoeve PM, van Duijse J, Orom UA, Lund AH, Perrakis A, Raz E, Agami R.
RNA-binding protein Dnd1 inhibits microRNA access to target mRNA. Cell 131, 1273-1286 (2007).
Khan AA, Betel D, Miller ML, Sander C, Leslie CS, Marks DS. Transfection of small
RNAs globally perturbs gene regulation by endogenous microRNAs. Nat. Biotechnol. 27,
549-555 (2009).
Kong YW, Cannell IG, de Moor CH, Hill K, Garside PG, Hamilton TL, Meijer HA,
Dobbyn HC, Stoneley M, Spriggs KA, Willis AE, Bushell M. The mechanism of microRNA-mediated translation repression is determined by the promoter of the target gene.
Proc. Natl Acad. Sci. USA 105, 8866-8871 (2008).
Krek A, Grün D, Poy MN, Wolf R, Rosenberg L, Epstein EJ, MacMenamin P, da
Piedade I, Gunsalus KC, Stoffel M, Rajewsky N. Combinatorial microRNA target
predictions. Nat. Genet. 37, 495-500 (2005).
Krol J, Busskamp V, Markiewicz I, Stadler MB, Ribi S, Richter J, Duebel J, Bicker S,
Fehling HJ, Schübeler D, Oertner TG, Schratt G, Bibel M, Roska B, Filipowicz W.
Characterizing light-regulated retinal microRNAs reveals rapid turnover as a common
property of neuronal microRNAs. Cell 141, 618-631 (2010).
Kumar MS, Erkeland SJ, Pester RE, Chen CY, Ebert MS, Sharp PA, Jacks T.
Suppression of non-small cell lung tumor development by the let-7 microRNA family.
Proc. Natl Acad. Sci. USA 105, 3903-3908 (2008).
Landgraf P, Rusu M, Sheridan R, Sewer A, Iovino N, Aravin A, Pfeffer S, Rice A,
Kamphorst AO, Landthaler M, Lin C, Socci ND, Hermida L, Fulci V, Chiaretti S, Foà R,
Schliwka J, Fuchs U, Novosel A, Müller RU, Schermer B, Bissels U, Inman J, Phan Q,
Chien M, Weir DB, Choksi R, De Vita G, Frezzetti D, Trompeter HI, Hornung V, Teng
G, Hartmann G, Palkovits M, Di Lauro R, Wernet P, Macino G, Rogler CE, Nagle JW, Ju
J, Papavasiliou FN, Benzing T, Lichter P, Tam W, Brownstein MJ, Bosio A, Borkhardt
A, Russo JJ, Sander C, Zavolan M, Tuschl T. A mammalian microRNA expression atlas
based on small RNA library sequencing. Cell 129, 1401-1414 (2007).
Lagos-Quintana M, Rauhut R, Lendeckel W, Tuschl T. Identification of novel genes
coding for small expressed RNAs. Science 294, 853-858 (2001).
Lau NC, Lim LP, Weinstein EG, Bartel DP. An abundant class of tiny RNAs with
probable regulatory roles in Caenorhabditis elegans. Science 294, 858-862 (2001).
Lee RC, Ambros V. An extensive class of small RNAs in Caenorhabditis elegans.
Science 294, 862-864 (2001).
Lee RC, Feinbaum RL, Ambros V. The C. elegans heterochronic gene lin-4 encodes
small RNAs with antisense complementarity to lin-14. Cell 75, 843-854 (1993).
Lee Y, Hur I, Park SY, Kim YK, Suh MR, Kim VN. The role of PACT in the RNA
silencing pathway. EMBO J. 25, 522-532 (2006).
Lee Y, Kim M, Han J, Yeom KH, Lee S, Baek SH, Kim VN. MicroRNA genes are
transcribed by RNA polymerase II. EMBO J. 23, 4051-4060 (2004).
Leung AK, Calabrese JM, Sharp PA. Quantitative analysis of Argonaute protein reveals
microRNA-dependent localization to stress granules. Proc. Natl Acad. Sci. USA 103,
18125-18130 (2006).
Lewis BP, Shih IH, Jones-Rhoades MW, Bartel DP, Burge CB. Prediction of mammalian
microRNA targets. Cell 115, 787-798 (2003).
Lim LP, Lau NC, Garrett-Engele P, Grimson A, Schelter JM, Castle J, Bartel DP, Linsley
PS, Johnson JM. Microarray analysis shows that some microRNAs downregulate large
numbers of target mRNAs. Nature 433, 769-773 (2005).
Lim LP, Lau NC, Weinstein EG, Abdelhakim A, Yekta S, Rhoades MW, Burge CB,
Bartel DP. The microRNAs of Caenorhabditis elegans. Genes Dev. 17, 991-1008 (2003).
Liu J, Carmell MA, Rivas FV, Marsden CG, Thomson JM, Song JJ, Hammond SM,
Joshua-Tor L, Hannon GJ. Argonaute2 is the catalytic engine of mammalian RNAi.
Science 305, 1437-1441 (2004).
Liu J, Rivas FV, Wohlschlegel J, Yates JR 3rd, Parker R, Hannon GJ. A role for the P-body component GW182 in microRNA function. Nat. Cell Biol. 7, 1261-1266 (2005).
Liu J, Valencia-Sanchez MA, Hannon GJ, Parker R. MicroRNA-dependent localization
of targeted mRNAs to mammalian P-bodies. Nat. Cell Biol. 7, 719-723 (2005).
Lund E, Güttinger S, Calado A, Dahlberg JE, Kutay U. Nuclear export of microRNA
precursors. Science 303, 95-98 (2004).
Lytle JR, Yario TA, Steitz JA. Target mRNAs are repressed as efficiently by microRNA-binding sites in the 5' UTR as in the 3' UTR. Proc. Natl Acad. Sci. USA 104, 9667-9672
(2007).
Mayr C, Hemann MT, Bartel DP. Disrupting the pairing between let-7 and Hmga2
enhances oncogenic transformation. Science 315, 1576-1579 (2007).
Meister G, Landthaler M, Patkaniowska A, Dorsett Y, Teng G, Tuschl T. Human
Argonaute2 mediates RNA cleavage targeted by miRNAs and siRNAs. Mol. Cell 15,
185-197 (2004).
Meister G, Landthaler M, Peters L, Chen PY, Urlaub H, Lührmann R, Tuschl T.
Identification of novel argonaute-associated proteins. Curr. Biol. 15, 2149-2155 (2005).
Monticelli S, Ansel KM, Xiao C, Socci ND, Krichevsky AM, Thai TH, Rajewsky N,
Marks DS, Sander C, Rajewsky K, Rao A, Kosik KS. MicroRNA profiling of the murine
hematopoietic system. Genome Biol. 6, R71 (2005).
Morlando M, Ballarino M, Gromak N, Pagano F, Bozzoni I, Proudfoot NJ. Primary
microRNA transcripts are processed co-transcriptionally. Nat. Struct. Mol. Biol. 15, 902-909 (2008).
Mourelatos Z, Dostie J, Paushkin S, Sharma A, Charroux B, Abel L, Rappsilber J, Mann
M, Dreyfuss G. miRNPs: a novel class of ribonucleoproteins containing numerous
microRNAs. Genes Dev. 16, 720-728 (2002).
Nissan T, Parker R. Computational analysis of miRNA-mediated repression of
translation: implications for models of translation initiation inhibition. RNA 14, 1480-1491
(2008).
Okamura K, Hagen JW, Duan H, Tyler DM, Lai EC. The mirtron pathway generates
microRNA-class regulatory RNAs in Drosophila. Cell 130, 89-100 (2007).
Olsen PH, Ambros V. The lin-4 regulatory RNA controls developmental timing in
Caenorhabditis elegans by blocking LIN-14 protein synthesis after the initiation of
translation. Dev. Biol. 216, 671-680 (1999).
Parker JS, Roe SM, Barford D. Structural insights into mRNA recognition from a PIWI
domain-siRNA guide complex. Nature 434, 663-666 (2005).
Pasquinelli AE, Reinhart BJ, Slack F, Martindale MQ, Kuroda MI, Maller B, Hayward
DC, Ball EE, Degnan B, Muller P, Spring J, Srinivasan A, Fishman M, Finnerty J, Corbo
J, Levine M, Leahy P, Davidson E, Ruvkun G. Conservation of the sequence and
temporal expression of let-7 heterochronic regulatory RNA. Nature 408, 86-89 (2000).
Pedersen IM, Cheng G, Wieland S, Volinia S, Croce CM, Chisari FV, David M.
Interferon modulation of cellular microRNAs as an antiviral mechanism. Nature 449,
919-922 (2007).
Petersen CP, Bordeleau ME, Pelletier J, Sharp PA. Short RNAs repress translation after
initiation in mammalian cells. Mol. Cell 21, 533-542 (2006).
Pfeffer S, Zavolan M, Grässer FA, Chien M, Russo JJ, Ju J, John B, Enright AJ, Marks
D, Sander C, Tuschl T. Identification of virus-encoded microRNAs. Science 304, 734-736 (2004).
Piao X, Zhang X, Wu L, Belasco JG. CCR4-NOT deadenylates mRNA associated with
RNA-induced silencing complexes in human cells. Mol. Cell Biol. 30, 1486-1494 (2010).
Pillai RS, Artus CG, Filipowicz W. Tethering of human Ago proteins to mRNA mimics
the miRNA-mediated repression of protein synthesis. RNA 10, 1518-1525 (2004).
Pillai RS, Bhattacharyya SN, Artus CG, Zoller T, Cougot N, Basyuk E, Bertrand E,
Filipowicz W. Inhibition of translational initiation by Let-7 MicroRNA in human cells.
Science 309, 1573-1576 (2005).
Qi HH, Ongusaha PP, Myllyharju J, Cheng D, Pakkanen O, Shi Y, Lee SW, Peng J, Shi
Y. Prolyl 4-hydroxylation regulates Argonaute 2 stability. Nature 455, 421-424 (2008).
Reinhart BJ, Slack FJ, Basson M, Pasquinelli AE, Bettinger JC, Rougvie AE, Horvitz
HR, Ruvkun G. The 21-nucleotide let-7 RNA regulates developmental timing in
Caenorhabditis elegans. Nature 403, 901-906 (2000).
Reinhart BJ, Weinstein EG, Rhoades MW, Bartel B, Bartel DP. MicroRNAs in plants.
Genes Dev. 16, 1616-1626 (2002).
Rigoutsos I. New tricks for animal microRNAs: targeting of amino acid coding regions
at conserved and nonconserved sites. Cancer Res. 69, 3245-3248 (2009).
Rodriguez A, Griffiths-Jones S, Ashurst JL, Bradley A. Identification of mammalian
microRNA host genes and transcription units. Genome Res. 14, 1902-1910 (2004).
Ruby JG, Jan CH, Bartel DP. Intronic microRNA precursors that bypass Drosha
processing. Nature 448, 83-86 (2007).
Rybak A, Fuchs H, Hadian K, Smirnova L, Wulczyn EA, Michel G, Nitsch R,
Krappmann D, Wulczyn FG. The let-7 target gene mouse lin-41 is a stem cell specific E3
ubiquitin ligase for the miRNA pathway protein Ago2. Nat. Cell Biol. 11, 1411-1420
(2009).
Saetrom P, Heale BS, Snøve O Jr, Aagaard L, Alluin J, Rossi JJ. Distance constraints
between microRNA target sites dictate efficacy and cooperativity. Nucleic Acids Res. 35,
2333-2342 (2007).
Schratt GM, Tuebing F, Nigh EA, Kane CG, Sabatini ME, Kiebler M, Greenberg ME. A
brain-specific microRNA regulates dendritic spine development. Nature 439, 283-289
(2006).
Schwarz DS, Hutvágner G, Du T, Xu Z, Aronin N, Zamore PD. Asymmetry in the
assembly of the RNAi enzyme complex. Cell 115, 199-208 (2003).
Sempere LF, Freemantle S, Pitha-Rowe I, Moss E, Dmitrovsky E, Ambros V. Expression
profiling of mammalian microRNAs uncovers a subset of brain-expressed microRNAs
with possible roles in murine and human neuronal differentiation. Genome Biol. 5, R13
(2004).
Thai TH, Calado DP, Casola S, Ansel KM, Xiao C, Xue Y, Murphy A, Frendewey D,
Valenzuela D, Kutok JL, Schmidt-Supprian M, Rajewsky N, Yancopoulos G, Rao A,
Rajewsky K. Regulation of the germinal center response by microRNA-155. Science 316,
604-608 (2007).
Thomson JM, Newman M, Parker JS, Morin-Kensicki EM, Wright T, Hammond SM.
Extensive post-transcriptional regulation of microRNAs and its implications for cancer.
Genes Dev. 20, 2202-2207 (2006).
Trabucchi M, Briata P, Garcia-Mayoral M, Haase AD, Filipowicz W, Ramos A, Gherzi
R, Rosenfeld MG. The RNA-binding protein KSRP promotes the biogenesis of a subset
of microRNAs. Nature 459, 1010-1014 (2009).
Valastyan S, Reinhardt F, Benaich N, Calogrias D, Szász AM, Wang ZC, Brock JE,
Richardson AL, Weinberg RA. A pleiotropically acting microRNA, miR-31, inhibits
breast cancer metastasis. Cell 137, 1032-1046 (2009).
van Rooij E, Sutherland LB, Qi X, Richardson JA, Hill J, Olson EN. Control of stress-dependent cardiac growth and gene expression by a microRNA. Science 316, 575-579
(2007).
Wu L, Fan J, Belasco JG. MicroRNAs direct rapid deadenylation of mRNA. Proc. Natl
Acad. Sci. USA 103, 4034-4039 (2006).
Xie H, Lim B, Lodish HF. MicroRNAs induced during adipogenesis that accelerate fat
cell development are downregulated in obesity. Diabetes 58, 1050-1057 (2009).
Yang W, Chendrimada TP, Wang Q, Higuchi M, Seeburg PH, Shiekhattar R, Nishikura
K. Modulation of microRNA processing and expression through RNA editing by ADAR
deaminases. Nat. Struct. Mol. Biol. 13, 13-21 (2006).
Yekta S, Shih IH, Bartel DP. MicroRNA-directed cleavage of HOXB8 mRNA. Science
304, 594-596 (2004).
Zdanowicz A, Thermann R, Kowalska J, Jemielity J, Duncan K, Preiss T, Darzynkiewicz
E, Hentze MW. Drosophila miR2 primarily targets the m7GpppN cap structure for
translational repression. Mol. Cell 35, 881-888 (2009).
Zisoulis DG, Lovci MT, Wilbert ML, Hutt KR, Liang TY, Pasquinelli AE, Yeo GW.
Comprehensive discovery of endogenous Argonaute binding sites in Caenorhabditis
elegans. Nat. Struct. Mol. Biol. 17, 173-179 (2010).
Chapter 2. MicroRNA sponges: competitive inhibitors of small
RNAs in mammalian cells
This chapter was written by Margaret S. Ebert and edited by Joel R. Neilson and Phillip
A. Sharp.
This chapter was published as an article in Nature Methods vol. 4 pp. 721-726 (2007).
Copyright belongs to the authors.
MicroRNAs are predicted to regulate thousands of mammalian genes, but relatively
few targets have been experimentally validated and few microRNA loss-of-function
phenotypes have been assigned. As an alternative to chemically modified antisense
oligonucleotides, we developed microRNA inhibitors that can be expressed in cells,
as RNAs produced from transgenes. Termed 'microRNA sponges,' these
competitive inhibitors are transcripts expressed from strong promoters, containing
multiple, tandem binding sites to a microRNA of interest. When vectors encoding
these sponges are transiently transfected into cultured cells, sponges derepress
microRNA targets at least as strongly as chemically modified antisense
oligonucleotides. They specifically inhibit microRNAs with a complementary
heptameric seed, such that a single sponge can be used to block an entire microRNA
seed family. RNA polymerase II promoter (Pol II)-driven sponges contain a
fluorescence reporter gene for identification and sorting of sponge-treated cells. We
envision the use of stably expressed sponges in animal models of disease and
development.
Introduction
MicroRNAs are 20-24-nucleotide RNAs derived from hairpin precursors. Through
pairing with partially complementary sites in 3' untranslated regions (UTRs), they
mediate post-transcriptional silencing of a predicted 30% of protein-coding genes in
mammals (Lewis et al. 2005). MicroRNAs have been implicated in critical processes
including differentiation, apoptosis, proliferation, and the maintenance of cell and tissue
identity; furthermore, their misexpression has been linked to cancer and other diseases
(Lu et al. 2005, Li et al. 2007, Cheng et al. 2007, Chang et al. 2007, He et al. 2005, Care
et al. 2007). But relatively few microRNA-target interactions have been experimentally
validated in cell culture or in mouse models, and the functions of most microRNAs
remain to be discovered. Creating genetic knockouts to determine the function of
microRNA families is difficult, as individual microRNAs expressed from multiple
genomic loci may repress a common set of targets containing a complementary seed
sequence. Thus, a method for inhibiting these functional classes of paralogous
microRNAs in vivo is needed. Presently, loss-of-function phenotypes are induced by
means of chemically modified antisense oligonucleotides - 2' O-methyl, locked nucleic
acid (LNA) and others - which are presumed to pair with and block mature microRNAs
through extensive sequence complementarity (Hutvágner et al. 2004, Meister et al. 2004,
Orom et al. 2006). Typically, oligonucleotide inhibitors are transiently transfected into
cells, providing a correspondingly transient derepression of microRNA targets. One type
of inhibitor has been demonstrated to silence microRNAs in vivo: 'antagomirs,' which
are 2' O-methyl, phosphorothioate, cholesterol-modified antisense oligonucleotides; their
effect in an animal, however, is only achieved with a high dose (Krützfeldt et al. 2005).
Antisense oligonucleotides work as competitive inhibitors of microRNAs, presumably by
annealing to the mature microRNA guide strand after the RNA-induced silencing
complex has removed the passenger strand (Davis et al. 2006). Delivering a dose
sufficient to saturate the cellular pool of microRNAs is critical to their function. We
reasoned that a microRNA target expressed at a sufficiently high level could,
analogously, function as a competitive inhibitor of cognate microRNA(s). To boost the
affinity of a decoy target for its cognate microRNA, multiple binding sites could be
inserted into its 3' UTR. By designing the microRNA binding sites with a bulge at the
position normally cleaved by Argonaute 2, these targets would be able to stably interact
with, or 'soak up', microribonucleoprotein complexes (microRNPs) loaded with the
corresponding microRNA. Such inhibitor RNAs could be expressed transiently from
transfected plasmids or stably from chromosomal insertions. Because the interaction
between microRNA and target is nucleated by and largely dependent on base-pairing in
the seed region (positions 2-8 of the microRNA), a decoy target should interact with all
members of a microRNA seed family. In so doing, it should better inhibit functional
classes of microRNAs than do antisense oligonucleotides, which are thought to block
single microRNA sequences.
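The notion of a seed family can be made concrete with a short sketch that groups microRNAs by the nucleotides at positions 2-8. The sequences and names below are hypothetical placeholders, not real microRNAs, and the helper names are chosen only for illustration.

```python
from collections import defaultdict


def seed(mirna):
    """Return positions 2-8 (1-based, counted from the 5' end)."""
    return mirna[1:8]


def seed_families(mirnas):
    """Group microRNA names by their shared seed heptamer."""
    families = defaultdict(list)
    for name, sequence in mirnas.items():
        families[seed(sequence)].append(name)
    return dict(families)


# Hypothetical 22-nt sequences, written 5'->3'; toy-miR-a and toy-miR-b
# share a seed but differ elsewhere, as paralogous family members often do.
TOY_MIRNAS = {
    "toy-miR-a": "UGACCUAGGAUCCAUGCAAGCU",
    "toy-miR-b": "CGACCUAGCUUAGGCAUCGAUA",
    "toy-miR-c": "UAGGCUAACGUCAGUCCAUGGA",
}

if __name__ == "__main__":
    for heptamer, members in seed_families(TOY_MIRNAS).items():
        print(heptamer, members)
```

Under this grouping, a decoy site matched to one seed would be expected to engage every member of that family, whereas an antisense oligonucleotide complementary along its full length is matched to a single member.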
We made decoy targets for several microRNA seed families, named them 'microRNA
sponges,' and tested their ability to derepress microRNA targets in mammalian cells.
Here we present evidence that microRNA sponges are at least as effective as present
antisense technology, that their activity is specific to microRNA seed families, and that
they can be used to validate target predictions and assay microRNA loss-of-function
phenotypes.
Results
Construction of microRNA sponges
We constructed Pol II sponges by inserting tandemly arrayed microRNA binding sites
into the 3' UTR of a reporter gene encoding destabilized GFP driven by the CMV
promoter (Fig. 1a). Binding sites for a particular microRNA seed family were perfectly
complementary in the seed region with a bulge at positions 9-12 to prevent RNA
interference-type cleavage and degradation of the sponge RNA (Fig. 1b). We also
constructed perfectly complementary sponges for individual microRNAs. As a control,
we constructed a sponge with repeated binding sites complementary to an artificial
microRNA based on a sequence from the CXCR4 gene (but not complementary to any
known microRNA). Binding site information for all sponge constructs is available in
Supplementary Table 1.
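The binding-site design described above can be sketched as follows. This is an illustration under stated assumptions, not the cloning procedure used for the constructs: the microRNA sequence and the spacer are hypothetical, and the mismatches at positions 9-12 are introduced here by copying the microRNA's own bases into the site, which is only one of several ways to prevent pairing. The actual site sequences are those listed in Supplementary Table 1.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Perfectly complementary site, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def bulged_site(mirna, first=9, last=12):
    """Site complementary to the miRNA except at positions first..last.

    Positions are 1-based from the miRNA 5' end. miRNA position p faces
    site index len(mirna) - p, because the two strands are antiparallel.
    """
    n = len(mirna)
    site = list(reverse_complement(mirna))
    for p in range(first, last + 1):
        # An identical base cannot Watson-Crick or wobble pair with itself.
        site[n - p] = mirna[p - 1]
    return "".join(site)


def paired_positions(mirna, site):
    """miRNA positions (1-based) that face a Watson-Crick partner."""
    n = len(mirna)
    return [p for p in range(1, n + 1)
            if COMPLEMENT[mirna[p - 1]] == site[n - p]]


def sponge_array(site, copies=7, spacer="AAUU"):
    """Tandem array of sites separated by a hypothetical spacer."""
    return spacer.join([site] * copies)


if __name__ == "__main__":
    toy_mirna = "UGACCUAGGAUCCAUGCAAGCU"  # hypothetical 22-nt sequence
    site = bulged_site(toy_mirna)
    print(site)
    print(paired_positions(toy_mirna, site))
    print(len(sponge_array(site)))
```

The seed (positions 2-8) stays fully paired, so seed-family members are still recognized, while the unpaired stretch at positions 9-12 removes the central pairing that Argonaute 2 cleavage depends on.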
We constructed a second class of microRNA sponges to take advantage of strong RNA
polymerase III promoters (Pol III), which are known to drive expression of the most abundant cellular RNAs (Fig. 1c). We subcloned tandemly arrayed microRNA binding
sites from the GFP sponge constructs into a modified U6 small nuclear RNA promoter-terminator vector, which produces short (<300 nt) RNAs with structurally stabilized 5'
and 3' ends (Paul et al. 2003). As they lack an open reading frame, these U6 sponges are
substrates for microRNA binding, but not for translation or translational repression.
Efficacy of microRNA sponges
We transfected HEK293T cells expressing abundant endogenous miR-20 with the
CXCR4 control sponge plasmid (C-CX) or with sponge plasmids imperfectly (C-20b) or
perfectly (C-20pf) complementary to miR-20. We cotransfected a sponge plasmid and a
TK promoter-driven gene encoding Renilla reniformis luciferase (RLuc) regulated by 7
bulged miR-20 sites and an unregulated gene encoding firefly luciferase as a transfection
control, at a ratio of 8:1 sponge plasmid to target plasmid. We assayed the expression of
the RLuc target 24 h after transfection and observed that it was rescued by both Pol II- and Pol III-driven sponges with bulged or perfect miR-20 binding sites (Fig. 2a). At 48 h,
we observed similar results (data not shown). We measured amounts of reporter mRNA
by real-time PCR and found that derepression occurred mostly at the translational level
(data not shown). For both sponge classes, sponges with 4-7 bulged binding sites
produced stronger derepressive effects than sponges with two perfect binding sites. This
difference may be due to the availability of more binding sites in the bulged sponges,
and/or to the greater stability expected of bulged sponge RNAs compared to sponge
RNAs that can be cleaved by miR-20-loaded Argonaute 2. Between the two sponge
classes, the CMV sponges and U6 sponges derepressed the target reporter about equally
well - nearly 50% rescue of a target with 7 miR-20 binding sites relative to an
unrepressed control reporter - but the U6 sponges also produced a general inhibition of
RLuc expression (Supplementary Fig. 1). Fluorescence in situ hybridization with a probe
against the U6 sponge RNAs primarily labeled the nucleus, as in previous work (Paul et
al. 2003) (data not shown). How an inhibitor localized primarily to the nucleus can
function against microRNA localized primarily in the cytoplasm is not clear. We
speculate that a sufficient fraction of the U6 sponge RNA is present in the cytoplasm to
inhibit mature microRNA.
We performed subsequent assays with the GFP bulged sponges, as they gave the highest
activity on both microRNA target reporters tested (miR-16 and miR-20). Cells
transfected with these sponge plasmids expressed large amounts of GFP, with only slight
repression by endogenous microRNAs. Transfected at low doses, the sponge plasmids
expressed GFP mRNA at a subsaturating level such that translation was visibly repressed
by endogenous microRNAs relative to unregulated GFP control constructs (data not
shown). Thus, the sponge mRNAs function by associating with active microRNPs.
To quantify the inhibition of cognate microRNAs by sponges, we used a target reporter
with a single bulged binding site for an artificial microRNA based on the CXCR4
sequence. (This system, established in our laboratory, has been used to show that
transfected small interfering RNA (siRNA) enters the same effector pathway as
endogenous microRNA (Doench et al. 2003).) The majority of predicted microRNA
targets contain a single binding site in their 3' UTR, so this target reporter probably
mimics the response of a natural microRNA target. We cotransfected the CXCR4 siRNA
at varying concentrations and included the CXCR4 sponge containing 7 bulged binding
sites to the microRNA, or, as a negative control, a sponge containing 7 bulged binding
sites to miR-21, a microRNA not expressed in 293T cells (Fig. 2b). At transfected siRNA
concentrations of 1 and 5 nM, the luciferase target was repressed 2-2.5-fold, similar to
the observed regulation by endogenous microRNAs of natural UTRs containing one
binding site. Furthermore, flow cytometry analysis of GFP revealed that the CXCR4
sponge targeted by 5 nM CXCR4 siRNA was repressed to the same extent that a miR-21
sponge was repressed by endogenous miR-21 in T98G, a cell type that highly expresses
that microRNA (data not shown). We infer that this range of transfected siRNA
corresponds to the concentration range of natural endogenous microRNAs acting on
typical target messages. In this range, the CXCR4 sponge rescued target gene expression
75-95% (1.8-1.9-fold derepression) and rescue was above 60% even at the highest siRNA
concentration tested (20 nM). We conclude that the GFP sponge RNAs are being
produced and accumulating to a sufficiently high level to inhibit most endogenous
microRNAs.
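The relationship between fold repression, fold derepression and percent rescue quoted above can be checked with a short calculation. The sketch assumes that rescue is expressed as target expression in the presence of the sponge relative to an unrepressed control; this is an assumed reading that is consistent with the quoted ranges, not a definition stated in the text, and the function names are hypothetical.

```python
def relative_expression(fold_repression, fold_derepression=1.0):
    """Target expression as a fraction of the unrepressed control.

    fold_repression: how many fold the microRNA lowers expression.
    fold_derepression: how many fold the sponge raises it again.
    """
    return fold_derepression / fold_repression


def percent_rescue(fold_repression, fold_derepression):
    """Expression with sponge, as a percentage of the unrepressed control."""
    return 100.0 * relative_expression(fold_repression, fold_derepression)


if __name__ == "__main__":
    # Endpoints of the quoted ranges: 2-2.5-fold repression,
    # 1.8-1.9-fold derepression. How the endpoints pair up is assumed.
    for repression, derepression in ((2.0, 1.8), (2.5, 1.9)):
        print(repression, derepression,
              round(percent_rescue(repression, derepression)))
```

Under this reading, a 2-fold repressed target that is derepressed 1.8-fold returns to about 90% of the unrepressed level, and a 2.5-fold repressed target derepressed 1.9-fold returns to about 76%, both within the 75-95% range reported.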
To compare the efficacy of inhibiting endogenous microRNAs by microRNA sponges to
that of present antisense technology, we transfected 293T cells with target reporters and
either a 2' O-methyl antisense oligonucleotide, LNA antisense oligonucleotide, or a
bulged GFP sponge, or with control inhibitors (Fig. 2c). The GFP sponge more strongly
derepressed the target reporter than the 2' O-methyl antisense oligonucleotide transfected
at standard conditions (20 nM) for all microRNAs tested (miR-16, 18, 20, 21 and 30).
This effect could be increased slightly by cotransfecting sponge and oligonucleotide. A
miR-20 sponge outperformed, but a miR-16 sponge only performed about as well as, an
LNA antisense oligonucleotide transfected at 20 nM (Fig. 2c and Supplementary Fig. 2).
Perhaps the cross-reactivity of the sponges to seed family members, such as miR-17-5p in
the case of the miR-20 sponge, allows them to rescue the effects of entire microRNA
families more completely than specific antisense oligonucleotides.
We tested sponges with artificial target reporters and 3' UTR reporters in two additional
human cell lines and in mouse 3T3 cells and found them to be similarly active in all cell
lines (data not shown).
To investigate the possibility of expressing sponges continuously from multicopy
chromosomal insertions, we constructed polyclonal cell lines by cotransfecting 293T
cells with linearized GFP sponge plasmids and a puromycin selection marker. After
sorting the cell lines for a high-GFP fraction, we assayed the activity of endogenous
microRNA in comparison to cells transiently transfected with sponge plasmids. The
stable miR-16 sponge-expressing cell line allowed threefold higher expression of a miR-16 target (relative to an untargeted control reporter) than the stable CXCR4 sponge cell
line or the parental 293T cells (Supplementary Fig. 3). This represents an activity
approximately 40% as strong as that of the transiently transfected sponge. Thus, sponges
expressed from transgenes have the potential to at least partially inhibit endogenous
microRNAs.
Seed specificity of microRNA sponges
To assess the specificity of the Pol II-driven sponges, we transfected HeLa cells with
target reporters and sponges against two microRNAs with different seeds: miR-20, miR-21 or a 50:50 combination of the two sponges (Fig. 3a). Dose-dependent derepression
was apparent in samples treated with a 50:50 mixture of the two plasmids. Each target
was derepressed by its cognate microRNA sponge and unaffected by the other microRNA
sponge relative to treatment with the CXCR4 sponge control. In contrast, we expected
sponges based on the sequence of a given microRNA to be recognized as targets by
multiple microRNAs that share the seed. In HeLa cells, microRNA expression profiling
detects high levels of miR-30c and miR-30d, and a much lower level of miR-30e (Barad
et al. 2004). We reasoned that a sponge element based on the sequence of the low-abundance microRNA would recognize each family member through the common seed
and thereby derepress a target of the high-abundance microRNA family member.
Accordingly, we assayed a target reporter with perfect sites for miR-30c with either a
2'-O-methyl antisense oligonucleotide against miR-30e or a sponge with 6 bulged sites
against miR-30e (Fig. 3b). As expected, the antisense oligonucleotide derepressed the
miR-30c target only weakly (<1.5-fold), presumably by inhibiting only the low-abundance miR-30e. In contrast, the sponge designed to miR-30e derepressed the target
by over fourfold, suggesting cross-reactivity with the more abundant miR-30 family
members. Consistent with this, transfection of 20 nM 2'-O-methyl oligonucleotide against
the more abundant miR-30c derepressed the miR-30c target to a slightly greater extent
than the miR-30e sponge. Further supporting the generality of seed recognition by
sponges, we observed derepression of perfect target reporters for miR-15a, miR-15b and
miR-16, which share a common seed, by treatment with sponges based on the miR-16
sequence (data not shown).
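The seed-family reasoning above can be illustrated with a minimal Python sketch. It uses the four microRNA sequences printed with Figure 3 and assumes that the heptameric seed corresponds to nucleotides 2-8 of the mature microRNA; the function names are hypothetical and serve only this illustration.

```python
# Mature microRNA sequences as printed with Figure 3 (written 5' to 3').
MIRNAS = {
    "miR-20": "UAAAGUGCUUAUAGUGCAGGUA",
    "miR-21": "UAGCUUAUCAGACUGAUGUUGA",
    "miR-30c": "UGUAAACAUCCUACACUCUCAGC",
    "miR-30e": "UGUAAACAUCCUUGACUGGA",
}


def seed(mirna: str) -> str:
    """Return nucleotides 2-8 (assumed definition of the heptameric seed)."""
    return mirna[1:8]


def seed_families(mirnas: dict) -> dict:
    """Group microRNA names by their seed sequence."""
    families = {}
    for name, sequence in mirnas.items():
        families.setdefault(seed(sequence), []).append(name)
    return families


families = seed_families(MIRNAS)
for seed_sequence, members in families.items():
    print(seed_sequence, members)
```

Under this assumption miR-30c and miR-30e fall into one group while miR-20 and miR-21 each form their own, which is the pattern of cross-reactivity and specificity that the reporter assays suggest.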
Validation of predicted microRNA targets
To test the ability of sponges to derepress natural microRNA targets, we assayed the
E2F1 protein, a demonstrated target of the miR-20 seed family and a predicted target of
miR-18 (O'Donnell et al. 2005; Fig. 4a). The amount of the target protein increased by
about 1.5-fold after treatment with the miR-18 GFP sponge and by about 2.5-fold after
treatment with the miR-20 GFP sponge, as shown in relation to lanes loaded with 1 or 1.5
times the amount of lysate from the control CXCR4 sponge treatment. This difference
likely results from the presence of two miR-20 binding sites and one miR-18 site in the
E2F1 3' UTR, plus the added inhibition of the coexpressed miR-20 family member miR-17-5p. These effects were recapitulated in a luciferase assay wherein the RLuc reporter
was fused to a fragment of the E2F1 UTR spanning the two miR-20 sites (Fig. 4b). Thus,
sponges show direct effects on natural and endogenous targets, and can be used to
validate target predictions. To test some predicted targets that had not yet been
experimentally validated, we used a luciferase reporter regulated by a large fragment of
the CD69 3' UTR (Neilson et al. 2007) or by the E2F5 UTR. As predicted by the
TargetScan 4.0 and miRanda algorithms, respectively, these UTRs are each regulated by
a single miR-20 site (Lewis et al. 2003, John et al. 2004). Correspondingly, each reporter
was derepressed upon treatment with a miR-20 sponge in 293T cells (Fig. 4c and
Supplementary Fig. 4).
Effect of sponges on microRNA levels
Antisense oligonucleotides have been shown to reduce the cellular concentration of their
cognate microRNAs (Krützfeldt et al. 2005, Davis et al. 2006). These results from
northern blots are complicated by the possibility that the complementary RNA could
compete with a labeled probe for base-pairing to the microRNA or prevent transfer of the
short RNA to the hybridization matrix. We expected that the overexpression of a
microRNA target, namely, expression of a microRNA sponge construct, would not alter
the amount of endogenous microRNA. But northern blot analysis showed a modest
(typically about twofold, ranging from 1.2-3-fold) specific decrease in free microRNA
24-48 h after transfection of the corresponding sponge (Fig. 5). We observed this effect
for bulged and perfect sponges of both the Pol II and Pol III classes. The northern blots
also showed microRNA signal near the location of the bands detected by probing against
the GFP and U6 sponge RNAs, respectively. Thus, cellular microRNA concentration may
be unchanged by sponge expression and the loss of a northern blot signal explained by
microRNA retention at the top of the gel owing to interaction with the cognate sponge
RNA. It is important to note that the signal of the GFP sponge RNAs is comparable to the
signal of endogenous miR-16 detected with the same-length DNA probe and after the
same exposure time, supporting the expected inhibition of microRNAs by excess binding
sites in the form of Pol II-driven sponges. To evaluate the abundance of GFP sponge
RNAs in transfected cells, we quantified GFP transcripts by real-time PCR in relation to
GFP plasmid standards (data not shown). We estimated the copy number of bulged GFP
mRNAs in transiently transfected 293T cells to be at least 1,000-2,000 per cell. If all
seven binding sites in the sponge RNA's UTR were used to bind microRNA, then this
level of sponge expression should allow inhibition of approximately 10^4 microRNAs per
cell, which would be sufficient to inhibit most microRNAs in most cell types.
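The capacity estimate above is simple arithmetic, sketched here in Python. The copy numbers and the seven sites per transcript come from the text; the occupancy parameter and the function name are assumptions added for illustration, with full occupancy corresponding to the case in which all seven sites are used.

```python
def sponge_site_capacity(copies_per_cell, sites_per_transcript, occupancy=1.0):
    """Upper bound on microRNAs bound per cell by sponge transcripts.

    occupancy is the assumed fraction of sites that are bound (1.0 = all sites).
    """
    return copies_per_cell * sites_per_transcript * occupancy


low = sponge_site_capacity(1000, 7)   # lower copy-number estimate
high = sponge_site_capacity(2000, 7)  # upper copy-number estimate
print(low, high)
```

With 1,000-2,000 transcripts per cell and seven sites each, the bound is 7,000-14,000 sites per cell, that is, on the order of 10^4. A lower assumed occupancy scales the estimate down proportionally.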
Discussion
Sponges designed as decoy targets for microRNAs were effective and specific inhibitors
of microRNA seed families. Somewhat surprisingly, the sponges with perfectly
complementary binding sites were not degraded so rapidly as to be ineffective at
competing microRNA from targets. Although these sponge RNAs should be degraded by
Argonaute 2-catalyzed cleavage, they probably also stably associate with microRNAs
complexed to the cleavage-incompetent Argonautes 1, 3 and 4. They could also form
stable interactions with other microRNAs that share the same seed but vary at nucleotides
10-11, producing a bulge that protects against endonucleolytic cleavage.
Inclusion of the GFP reporter in the sponge mRNA is useful for assessing transfection
efficiency and for tracking those cells that express high levels of the inhibitor RNA. We
envision multiple applications of the GFP sponges for target validation and phenotypic
analysis. Cells with poor transfection rates can be subjected to fluorescence-activated cell
sorting to isolate subpopulations expressing the sponge RNA and thus suppressing
microRNA activity. This could be critical for detecting typically subtle (less than
twofold) changes in the levels of proteins targeted by endogenous microRNA.
Alternatively, transfected cells can be immunostained for predicted targets or phenotypic
markers and two-color flow cytometry can be used to assess the correlation between GFP
expression and target-protein level. In these applications GFP expression serves both as
an indicator of sponge plasmid dose and as a sensor of cellular microRNA activity. By
contrast, chemically modified antisense oligonucleotides, which lack a reporter function,
limit the experimenter to pooled cell analyses and dilute the inhibitor's effect often to
unobservable levels in cell lines with low transfection rates. The properties of antisense
oligonucleotides and sponges are summarized in Supplementary Table 2.
There might be several ways to improve the sponge technology described in this study.
Addition of more microRNA binding sites to the sponge UTRs would increase the dose
of antisense sequences and should therefore increase the potency of the sponges. Testing
a microRNA sponge with 6, 10 or 18 sites showed a marginal increase in activity above 6
sites, with apparently saturating effect, but for sponges expressed at lower levels from
chromosomal insertions, the additional sites may be beneficial (data not shown).
Alternatively, the spacing between sites might be optimized to enhance the binding of
miRNPs to every possible site, although previous results suggest that nearby sites are
fully functional (Doench and Sharp 2004). One could also construct sponges with
combinations of seed binding sites for two or more microRNA families of interest. To
express sponges at a high level transiently in vivo, one could use viral vectors as in a
recent work using adenovirus delivered to cardiac tissue (Care et al. 2007). Finally, there may be Pol III
elements other than U6 that would produce sponge RNAs at a high level that are
transported to the cytoplasm where they would encounter mature microRNA. Just as
sponges inhibit endogenous microRNAs, they could also be used to inhibit siRNAs. In a
short hairpin RNA-expressing cell line, a siRNA sponge could provide another level of
regulatory control.
An extension of the current technology would be to express sponges from stably
integrated transgenes in vivo. Just as short hairpin RNAs, mRNA inhibitors expressed
from transgenes, have expanded the experimental scope of siRNAs, transgenic sponges
could expand the scope of antisense microRNA inhibitors. Beyond assaying long-term
effects of microRNA loss of function in cell lines, we envision the use of drug-inducible
sponges in xenograft models to investigate microRNA contributions to tumorigenesis;
bone marrow reconstitution approaches to investigate microRNA roles in immune cell
development; and, ultimately, germline transgenic sponge mice to ascertain the functions
of microRNA families at cell, tissue, organ and organism levels. In principle, microRNA
sponges expressed from appropriate promoters should be applicable in any transgenic
model organism, including worm, fly and plants.
Methods
Construction of sponge plasmids and reporters
We annealed, ligated, gel purified and cloned oligonucleotides for microRNA binding
sites with 4-nt spacers for bulged sites, or with no spacers for perfect sites, into pcDNA5-CMV-d2eGFP vector (Invitrogen/Clontech) digested with XhoI and ApaI. We
constructed Pol III sponges by subcloning the UTR into pTZ-U6+27 vector (see
Acknowledgments). We constructed luciferase reporters by the same oligonucleotide
annealing method or by subcloning the UTR into pcDNA5-TK-RLuc vector. We PCR-amplified and ligated the E2F1 UTR fragment (nucleotides 393-978), the CD69 UTR
fragment (nucleotides 25-899) and the E2F5 UTR (1-653) into the same vector.
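As a small illustration of how a perfectly complementary binding site relates to its microRNA, the following Python sketch takes the reverse complement of the miR-30c sequence printed with Figure 3 and compares it with the miR-30c perfect site listed in Supplementary Table 1. The helper name is hypothetical, and the sketch does not cover bulged sites, spacers or the cloning overhangs used in practice.

```python
# RNA complement table for Watson-Crick pairing (A-U, G-C).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def perfect_site(mirna: str) -> str:
    """Return the fully complementary target site, written 5' to 3'."""
    return mirna.translate(COMPLEMENT)[::-1]


MIR_30C = "UGUAAACAUCCUACACUCUCAGC"               # from Figure 3
TABLE_SITE_30C = "GCUGAGAGUGUAGGAUGUUUACA"        # from Supplementary Table 1

site = perfect_site(MIR_30C)
print(site, site == TABLE_SITE_30C)
```

The computed site matches the tabulated miR-30c perfect site. A bulged site would differ from this sequence at the central positions opposite microRNA nucleotides 9-12.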
Luciferase assays
We plated 293T cells or HeLa cells the day before transfection and transfected them in
triplicate with Lipofectamine 2000 (Invitrogen) and 50 ng of pGL3 (Firefly luciferase
plasmid), 90 ng of RLuc target reporter plasmid, and 700 ng of sponge plasmid. We
transfected the E2F1 UTR reporter at 4.5 ng and the E2F5 UTR reporter at 0.9 ng. We
cotransfected 2' O-methyl antisense (Dharmacon) and LNA antisense (Exiqon,
Dharmacon) oligonucleotides at 20 nM. We transfected the CXCR4 microRNA in the
form of a siRNA mixed in varying ratios with negative control siRNA (Dharmacon) to
maintain 20 nM total siRNA concentration. We performed all assays at 24 h after
transfection with the dual luciferase assay (Promega) on an Optocomp I luminometer
(MGM Instruments).
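The normalization implied by the dual luciferase readout can be sketched as follows. This is a minimal Python illustration, not the analysis code used in the study: the luminometer readings are invented placeholder values, and the function names are hypothetical. It divides RLuc by firefly luciferase for each triplicate well and then expresses sponge-treated samples relative to the control sponge.

```python
from statistics import mean, stdev


def relative_activity(rluc, firefly):
    """RLuc units divided by firefly luciferase units, per well."""
    return [r / f for r, f in zip(rluc, firefly)]


def fold_derepression(sponge_wells, control_wells):
    """Mean relative activity with the sponge over that with the control."""
    return mean(sponge_wells) / mean(control_wells)


# Hypothetical triplicate readings (arbitrary light units).
control = relative_activity([100, 110, 90], [1000, 1000, 1000])
sponge = relative_activity([200, 220, 180], [1000, 1000, 1000])

fold = fold_derepression(sponge, control)
print(round(fold, 2), round(stdev(sponge), 3))
```

Dividing by the firefly signal corrects for well-to-well differences in transfection efficiency, and the standard deviation across the triplicate wells corresponds to the error bars described in the figure legends.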
Additional methods
Primers used, western blot and northern blot analyses, construction of stable cell lines,
and quantification of sponge RNAs are described in Supplementary Methods.
Author contributions
M.S.E. and J.R.N. conceived the experimental design and made the sponge constructs.
M.S.E. performed the experiments and wrote the manuscript. P.A.S. supervised the work.
References
Barad O, Meiri B, Avniel A, Aharonov R, Barzilai A, Bentwich I, Einav U, Gilad S,
Hurban P, Karov Y, Lobenhofer EK, Sharon E, Shiboleth YM, Shtutman M, Bentwich Z,
Einat P. MicroRNA expression detected by oligonucleotide microarrays: system
establishment and expression profiling in human tissues. Genome Res. 14, 2486-2494
(2004).
Care A, Catalucci D, Felicetti F, Bonci D, Addario A, Gallo P, Bang ML, Segnalini P, Gu
Y, Dalton ND, Elia L, Latronico MV, Hoydal M, Autore C, Russo MA, Dorn GW 2nd,
Ellingsen O, Ruiz-Lozano P, Peterson KL, Croce CM, Peschle C, Condorelli G.
MicroRNA-133 controls cardiac hypertrophy. Nat. Med. 13, 613-618 (2007).
Chang TC, Wentzel EA, Kent OA, Ramachandran K, Mullendore M, Lee KH, Feldmann
G, Yamakuchi M, Ferlito M, Lowenstein CJ, Arking DE, Beer MA, Maitra A, Mendell
JT. Transactivation of miR-34a by p53 broadly influences gene expression and promotes
apoptosis. Mol. Cell 26, 745-752 (2007).
Cheng HY, Papp JW, Varlamova O, Dziema H, Russell B, Curfman JP, Nakazawa T,
Shimizu K, Okamura H, Impey S, Obrietan K. MicroRNA modulation of circadian-clock
period and entrainment. Neuron 54, 813-829 (2007).
Davis S, Lollo B, Freier S, Esau C. Improved targeting of miRNA with antisense
oligonucleotides. Nucleic Acids Res. 34, 2294-2304 (2006).
Doench JG, Petersen CP, Sharp PA. siRNAs can function as miRNAs. Genes Dev. 17,
438-442 (2003).
Doench JG, Sharp PA. Specificity of microRNA target selection in translational
repression. Genes Dev. 18, 504-511 (2004).
He L, Thomson JM, Hemann MT, Hernando-Monge E, Mu D, Goodson S, Powers S,
Cordon-Cardo C, Lowe SW, Hannon GJ, Hammond SM. A microRNA polycistron as a
potential human oncogene. Nature 435, 828-833 (2005).
Hutvagner G, Simard MJ, Mello CC, Zamore PD. Sequence-specific inhibition of small
RNA function. PLoS Biol. 2, e98 (2004).
John B, Enright AJ, Aravin A, Tuschl T, Sander C, Marks DS. Human microRNA
targets. PLoS Biol. 2, e363 (2004).
Krützfeldt J, Rajewsky N, Braich R, Rajeev KG, Tuschl T, Manoharan M, Stoffel M.
Silencing of microRNAs in vivo with 'antagomirs'. Nature 438, 685-689 (2005).
Lewis BP, Burge CB, Bartel DP. Conserved seed pairing, often flanked by adenosines,
indicates that thousands of human genes are microRNA targets. Cell 120, 15-20 (2005).
Lewis BP, Shih IH, Jones-Rhoades MW, Bartel DP, Burge CB. Prediction of mammalian
microRNA targets. Cell 115, 787-798 (2003).
Li QJ, Chau J, Ebert PJ, Sylvester G, Min H, Liu G, Braich R, Manoharan M, Soutschek
J, Skare P, Klein LO, Davis MM, Chen CZ. miR-181a is an intrinsic modulator of T cell
sensitivity and selection. Cell 129, 147-161 (2007).
Lu J, Getz G, Miska EA, Alvarez-Saavedra E, Lamb J, Peck D, Sweet-Cordero A, Ebert
BL, Mak RH, Ferrando AA, Downing JR, Jacks T, Horvitz HR, Golub TR. MicroRNA
expression profiles classify human cancers. Nature 435, 834-838 (2005).
Meister G, Landthaler M, Dorsett Y, Tuschl T. Sequence-specific inhibition of
microRNA- and siRNA-induced RNA silencing. RNA 10, 544-550 (2004).
Neilson JR, Zheng GX, Burge CB, Sharp PA. Dynamic regulation of miRNA expression
in ordered stages of cellular development. Genes Dev. 21, 578-589 (2007).
O'Donnell KA, Wentzel EA, Zeller KI, Dang CV, Mendell JT. c-Myc-regulated
microRNAs modulate E2F1 expression. Nature 435, 839-843 (2005).
Orom UA, Kauppinen S, Lund AH. LNA-modified oligonucleotides mediate specific
inhibition of microRNA function. Gene 372, 137-141 (2006).
Paul CP, Good PD, Li SX, Kleihauer A, Rossi JJ, Engelke DR. Localized expression of
small RNA inhibitors in human cells. Mol. Ther. 7, 237-247 (2003).
Acknowledgments
This work was funded by US Public Health Service grants U19-AI056900 from the
National Cancer Institute, by an Integrative Cancer Biology Program Grant U54
CA112967 from the National Institutes of Health to P.A.S. and partially by Cancer
Center Support (core) P30-CA14051 from the National Cancer Institute. M.S.E. is
supported by a Howard Hughes Medical Institute Predoctoral Fellowship and a Paul and
Cleo Schimmel Scholarship. J.R.N. is supported by the Cancer Research Institute. We
thank A. Garfinkel and M. Kumar for luciferase reporter preparations, A. Leung for
assistance with fluorescence in situ hybridization, D. Engelke (University of Michigan)
for the U6 vector and members of the Sharp laboratory for helpful discussions.
Figures
Figure 1. Design of microRNA sponges.
(a) We constructed GFP sponges by inserting multiple microRNA binding sites into the 3' UTR
of a 2-h destabilized GFP reporter gene driven by the CMV promoter. (b) The imperfect pairing
between a microRNA and a sponge with bulged binding sites is diagrammed for miR-21. We
designed sponges with a bulge to protect against endonucleolytic cleavage by Argonaute 2. (c)
We constructed U6 sponges by subcloning the microRNA binding site region into a vector
containing a U6 snRNA promoter with 5' and 3' stem-loop elements.
[Figure 1 schematics: (a) CMV promoter, d2eGFP, microRNA binding sites (bulged or perfect), BGH poly(A); (b) pairing of a bulged sponge site with miR-21; (c) Pol III U6 promoter, 5' stem-loop, microRNA binding sites (bulged or perfect), 3' stem-loop.]
Figure 2. Efficacy of microRNA sponges.
(a-c) RLuc activity relative to firefly luciferase activity was assayed in 293T cells 24 h after
transfection with RLuc microRNA target reporters, firefly luciferase transfection control and
microRNA sponge plasmids. An RLuc target regulated by 7 miR-20 sites was derepressed by
GFP sponges and U6 sponges with bulged or perfect binding sites for miR-20 (a). C, CMV
sponge; U, U6 sponge. CX, CXCR4 control; 20b, 7 bulged miR-20 sites; 20pf, two perfect miR-20 sites. Bars represent the expression of the miR-20 target relative to an untargeted control
reporter. We measured an artificial CXCR4 target reporter with a single bulged binding site in the
presence of a control GFP sponge against miR-21 (miR-21 sponge) or a GFP sponge containing
seven CXCR4 binding sites (CXCR4 sponge; b). We transfected cells with 20 nM antisense
oligonucleotide (2'-O-methyl 20 or LNA 20) or with the CMV bulged sponge against miR-20
(sponge 20; c). Negative controls: mock (no oligonucleotides or sponges), 2'-O-methyl against
miR-30, LNA against miR-122, CXCR4 sponge. We performed each experiment at least three
times and have shown a representative example. Error bars, s.d.; n = 3.
Figure 3. Specificity of microRNA sponges.
(a) We assayed RLuc activity relative to firefly luciferase activity in HeLa cells 24 h after
transfection with RLuc microRNA target reporters, firefly luciferase transfection control and
microRNA sponge plasmids. Targets of miR-20 and miR-21 are specifically derepressed by the
corresponding GFP sponge. Bars are normalized to the relative RLuc units of samples treated
with the CXCR4 control sponge. (b) We assayed a perfect target reporter of miR-30c in HeLa
cells transfected with oligonucleotide or sponge inhibitors of miR-30e. Controls: 2'-O-methyl
anti-miR-181, CXCR4 sponge. MicroRNA sequences below show the heptameric seed sequence
in bold, with nucleotide differences between the two family members underlined. We performed
each experiment at least three times and have shown a representative example. Error bars, s.d.; n = 3.
[Figure 3 bar charts: (a) miR-20 target and miR-21 target; x-axis: fraction of miR-20 sponge / miR-21 sponge (1.0/0.0, 0.5/0.5, 0.0/1.0). (b) miR-30c target; x-axis: control, 2'-O-methyl anti-miR-30e, miR-30e sponge.]
miR-20 5'-UAAAGUGCUUAUAGUGCAGGUA-3'
miR-21 5'-UAGCUUAUCAGACUGAUGUUGA-3'
miR-30c 5'-UGUAAACAUCCUACACUCUCAGC-3'
miR-30e 5'-UGUAAACAUCCUUGACUGGA-3'
Figure 4. Validation of microRNA targets.
(a) We assayed 293T cells transfected with GFP sponges against miR-18, miR-20 or the CXCR4
control by western blot 48 h after transfection. The increase in endogenous E2F1 upon inhibition
of miR-18 or miR-20 is shown relative to the control samples loaded at indicated amounts; E2F1
is the 60 kDa band indicated; the other bands are nonspecific (top). Beta-actin loading control
(bottom). (b) We assayed 293T cells transfected with an RLuc reporter fused to a fragment of the
E2F1 UTR spanning two miR-20 sites, firefly luciferase and GFP sponges. Bars represent RLuc
units relative to firefly luciferase units. (c) We assayed RLuc activity relative to firefly luciferase
activity in 293T cells transfected with an RLuc reporter fused to a fragment of the CD69 UTR
containing a predicted miR-20 binding site, firefly luciferase and GFP sponges. We performed
each experiment at least three times and have shown a representative example. Error bars, s.d.; n = 3.
[Figure 4 panels: (a) western blot, bands labeled E2F1 and beta-actin; (b, c) bar charts, x-axis: Sponge (C-CX, C-20).]
Figure 5. Effect of sponges on microRNA levels.
(a) We transfected 293T cells with sponge plasmids and collected total RNA for northern blot
analysis 48 h later. We probed the blot for miR-16 (top; 24-h exposure), then stripped the blot and
reprobed it for GFP mRNA, U6 sponge RNA (both 24-h exposure) and a tRNA loading control
(3-h exposure; bottom). (b) Quantitation of miR-16 relative to tRNA for each sponge-treated
sample. We performed northern blots for miR-16 and miR-20 >10 times in 293T and HeLa cells,
and show results from a representative blot.
a
[Figure 5 panel a: northern blot; lanes (Sponge): C-CX, C-16b, C-16pf, U-CX, U-16b, U-16pf; rows: miR-16, GFP sponges, U6 sponges, tRNA-Gln.]
Supplementary Information
Supplementary Figure 1. Effect of sponges on a miR-20 target and an untargeted control.
We transfected 293T cells with a Renilla luciferase vector containing seven miR-20 sites or an
otherwise identical vector containing seven CXCR4 control sites and the sponge plasmids
indicated. Bars represent Renilla luciferase units relative to Firefly luciferase units. Error bars
represent standard deviation among triplicate samples. Results are representative of a minimum
of three independent experiments.
[Supplementary Figure 1 bar chart; x-axis: Sponge (C-CX, C-20b, C-20pf, U-CX, U-20b, U-20pf).]
Supplementary Figure 2. Comparison of sponges to antisense oligos.
We transfected 293T cells with 20 nM antisense oligo (2' O-methyl or LNA) or with the CMV
bulged sponge against miR-16. Negative controls: mock (no oligos or sponges), 2' O-methyl
against miR-30, LNA against miR-122, CXCR4 sponge. The target reporter contains nine miR-16
sites and is derepressed slightly more strongly by the LNA and sponge than by the 2' O-methyl
oligo. Error bars denote standard deviation in triplicate samples. Results are representative of a
minimum of three independent experiments.
Supplementary Figure 3. Inhibition of microRNA by a stably expressed sponge.
We stably transfected 293T cells with the miR-16 sponge or CXCR4 sponge (control) plasmid,
sorted for high GFP expression, and tested by dual luciferase assay with a Renilla luciferase
reporter for miR-16 or an untargeted Renilla luciferase control. Bars represent expression of the
miR-16 target relative to the untargeted control in each cell line. The miR-16 target is rescued
about 40 percent as well by the stably expressed sponge as by the transiently transfected sponge.
Error bars represent standard deviation in triplicate samples.
Supplementary Figure 4. Validation of new microRNA targets.
We fused the E2F5 UTR (which contains a predicted miR-20 binding site) to a Renilla luciferase
reporter. We transfected 293T cells with the UTR reporter, Firefly luciferase, and GFP sponges.
Bars represent Renilla luciferase units relative to Firefly luciferase units. Error bars represent
standard deviation among triplicate samples. Results are representative of a minimum of three
independent experiments. Interestingly, miR-20 is now shown to directly regulate at least two
members of the E2F family of transcription factors.
[Supplementary Figure 4 bar chart; x-axis: Sponge (C-20, C-CX).]
Supplementary Table 1. Sequences of sponges and reporters. Sites are written 5' to 3'.
CXCR4 bulged: Renilla luciferase reporter, 7 sites or 1 site; CMV sponge, 7 sites; U6 sponge, 4 sites
AAGUUUUCAGAAAGCUAACA
miR-16 bulged: Renilla luciferase reporter, CMV sponge and U6 sponge, 9 sites
AAUAUUC UAUGCUGCUA
miR-16 perfect: CMV sponge and U6 sponge, 2 sites
CGCCAAUAUUUACGUGCUGCUA
miR-18 bulged: CMV sponge, 8 sites
UAUCUGCAC UUAGGCAC-CUUA
miR-20 bulged: Renilla luciferase reporter and CMV sponge, 7 sites; U6 sponge, 4 sites
UACCUGCACUC GCGCACUUUA
miR-20 perfect: CMV sponge and U6 sponge, 2 sites
CUACCUGCACUAUAAGCACUUUA
miR-21 bulged: Renilla luciferase reporter, 6 sites; CMV sponge, 7 sites
UCAACAUCAGGACAUAAGCUA
miR-30c perfect: Renilla luciferase reporter, 2 sites
GCUGAGAGUGUAGGAUGUUUACA
miR-30e bulged: CMV sponge, 6 sites
UCCAGUCCCUAUGUUUACA
Supplementary Table 2. Comparison of microRNA sponges to antisense oligos.
Method | 2'-O-methyl antisense | LNA antisense | MicroRNA sponge
Composition | Modified RNA oligo | Modified RNA and DNA oligo | mRNA containing a 3' UTR
Number of binding sites | One | One | Multiple
Means of addition to cells | Transient transfection | Transient transfection | Transient transfection or stable expression from chromosomal insertions
Reporter function | None | None | GFP or other genetically encoded reporter proteins
Specificity | Single microRNA | Single microRNA | MicroRNA seed family
Supplementary Methods
Primers
E2F1 UTR fragment: forward primer AATATTCTAGACTCTAACTGCACTTTCGGCC and
reverse primer AATAAGGGCCCGAAGCAAATCAAAGTGCAGATTG.
CD69 UTR fragment: forward primer AGCTAGCTCGAGACTGTGCCATAGCACCACAG and
reverse primer ATGCATGCGGCCGCACAGCTTAAACTTTATAGTGGGTTTT.
E2F5 UTR: forward primer GACTCGAGATTCCATGGAAACTTGGGAC and reverse primer
CCGCGGCCGCAATGTTTTATACAATTTTATTTT.
Western blot
We transfected 293T cells two days in a row with Lipofectamine 2000 and sponge plasmids.
Fluorescence microscopy confirmed that 95-100 percent of the cells were GFP-positive 48 hours
after the first transfection. We lysed cells in RIPA buffer and resolved the lysates on a Tris-HCl
4-20% gel, transferred to a nitrocellulose membrane, and probed with anti-E2F1 (Santa Cruz sc-193), stripped, and re-probed for beta-actin (Sigma A5441). We imaged the blots with Western
Lightning chemiluminescence reagent (PerkinElmer) and film. We performed the experiment
three times and have shown a representative result.
Northern blot
We transfected 293T cells with Lipofectamine 2000 and sponge plasmids. We harvested total
RNA by Trizol extraction 48 hours post-transfection. We ran 20 µg RNA on a 12%
polyacrylamide gel, along with end-labeled 10-bp DNA ladder (Invitrogen), transferred to
Hybond N+ membrane, and probed against miR-16, then stripped and reprobed for glutamine
tRNA, then for the 3' end of the d2eGFP coding region, then for the 3' end of the U6 sponge
RNA. We imaged the blots with a Storm scanner (Molecular Dynamics) and quantified the bands
with ImageQuant software (Amersham Biosciences). We performed the experiment at least three
times each for miR-16 and for miR-20 and have shown a representative result.
Construction of stable cell lines
We cotransfected 293T cells with linearized GFP sponge plasmids for miR-16 or the CXCR4
control at a 20:1 ratio to linear puromycin marker (Clontech). We cultured the cells in 2.5 µg/ml
puromycin for about six weeks and sorted on a MoFlo FACS instrument (Cytomation) for the
highest 10 percent of GFP expression. We cultured these fractions for another week before
performing transfection assays.
Quantification of sponge RNAs
We transfected 293T cells with CMV sponges (CXCR4, miR-16, miR-20) and harvested total
RNA 24-48 hours later. We treated RNA with DNaseI (Ambion) and reverse transcribed it with
random primers using MMLV Reverse Transcriptase (Ambion). We used the cDNA samples or
no-RTase controls as templates for real-time PCR with SYBRGreen detection (Applied
Biosystems) and primers in the coding region of GFP. We used a dilution series of GFP plasmid
standards to estimate the number of GFP cDNAs present in each reaction. We ran each PCR
experiment in triplicate and averaged the results of three experiments.
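The standard-curve step described above can be sketched in Python. This is an illustrative reconstruction under stated assumptions rather than the analysis actually performed: the plasmid copy numbers and threshold-cycle (Ct) values are invented, the function names are hypothetical, and the fit is an ordinary least-squares line of Ct against log10 of the copy number.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    x = [math.log10(c) for c in copies]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(ct_values) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, ct_values))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate template copies in a reaction."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical dilution series of GFP plasmid standards.
standard_copies = [1e3, 1e4, 1e5, 1e6]
standard_ct = [30.00, 26.68, 23.36, 20.04]

slope, intercept = fit_standard_curve(standard_copies, standard_ct)
estimate = copies_from_ct(25.02, slope, intercept)
print(round(slope, 2), round(intercept, 2), round(estimate))
```

Converting the estimated cDNA copies per reaction into transcripts per cell additionally requires the amount of input RNA per reaction, the RNA yield per cell and an assumed reverse-transcription efficiency, none of which are modeled here.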
Chapter 3. MicroRNA sponge inhibitors: progress and possibilities
This chapter was written by Margaret S. Ebert and edited by Phillip A. Sharp.
The microRNA (miRNA) "sponge" method was introduced three years ago as a
means to create continuous miRNA loss of function in cell lines and transgenic
organisms. Sponge RNAs contain complementary binding sites to a miRNA of
interest, and are produced from transgenes within cells. As with most miRNA target
genes, a sponge's binding sites are specific to the miRNA seed region, which allows
them to block a whole family of related miRNAs. Whether termed sponges, decoys,
erasers, or lentiviral antagomirs, this transgenic approach has proven to be a useful
tool to probe miRNA functions in a variety of experimental systems. In this review
we discuss the recent applications of miRNA sponges with particular emphasis on
stable sponge expression in cancer studies and in transgenic animals. We also
consider the likelihood that there exist natural mRNAs or non-coding RNAs that
function as miRNA sponges to inhibit miRNA families.
Introduction
The widespread involvement of microRNAs (miRNAs) in regulating developmental
processes, physiological responses, and pathological conditions in animals has been
amply demonstrated (He and Hannon 2004, Bushati and Cohen 2007, Bartel 2009).
Nonetheless, the specific functions of each miRNA in the various contexts in which it is
expressed are only beginning to be discovered. The typical miRNA is computationally
predicted to regulate hundreds of target genes (Friedman et al. 2009), and while there has
been progress in compiling sets of predicted targets into pathways (Tsang et al. 2010; see
Appendix), every prediction still needs to be experimentally validated. The best
experimental approaches create a loss of function in the miRNA of interest. Loss-of-function approaches are superior because they reveal functions that depend on
physiological miRNA levels; by contrast, adding exogenous miRNA to the system can
result in repression of non-physiological target mRNAs since miRNA-target interaction is
strongly concentration-dependent (Mukherji et al. 2010; see Chapter 4).
There are three general methods for miRNA loss-of-function studies: genetic knockouts,
antisense oligonucleotide inhibitors (Meister et al. 2004, Orom et al. 2006, Krützfeldt et
al. 2005), and sponges (Ebert et al. 2007). A sponge mRNA containing multiple target
sites complementary to a seed family acts as a dominant negative. When the sponge is
expressed at high levels, it inhibits the activity of the set of miRNAs with the common
seed but not other seed families of miRNAs. While deleting a miRNA is the only way to
guarantee complete loss of its activity, the sponge method offers several advantages. First
is the convenience of making dominant negative transgenics over knockouts, and the
applicability to a broader range of model organisms and cell lines. Second, many
miRNAs have seed family members encoded at multiple distant loci; due to this
functional redundancy, these miRNAs would have to be knocked out individually and the
animals bred to generate the complete knockout strain. Furthermore, some miRNA
precursors are transcribed in clusters; the proximity of the miRNAs within a cluster may
make it difficult to cleanly delete one miRNA without affecting the processing of the
others. Since sponges have a trans dominant activity, the clustering of miRNA precursors
is irrelevant.
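As a rough illustration of seed-family specificity, the following Python sketch checks which miRNAs would be recognized by a given sponge site. It assumes that the seed is nucleotides 2-8 of the miRNA (a common convention, not one stated above) and that a perfect seed match is sufficient for binding. The sequences and the names miR-Xa, miR-Xb, and miR-Y are hypothetical placeholders, not real miRNAs.

```python
# Illustrative sketch only: seed-based recognition by a sponge site.
# Assumption: seed = miRNA nucleotides 2-8, counted from the 5' end.
# All names and sequences are hypothetical placeholders.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed(mirna):
    """Return nucleotides 2-8 of a miRNA written 5' to 3'."""
    return mirna[1:8]


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def inhibited_by(site, mirnas):
    """Names of miRNAs whose seed match occurs within the sponge site."""
    return sorted(
        name for name, seq in mirnas.items()
        if reverse_complement(seed(seq)) in site
    )


mirnas = {
    "miR-Xa": "UCAGGAUCCAUUGCAAGUCGAU",  # seed CAGGAUC
    "miR-Xb": "ACAGGAUCGUAACGGUUCAGGC",  # same seed, different 3' end
    "miR-Y": "UGGCAUUCAGGCUAAGCUUGCA",   # unrelated seed
}

# A site fully complementary to miR-Xa (real sponge sites are often bulged).
site = reverse_complement(mirnas["miR-Xa"])
family = inhibited_by(site, mirnas)
```

Under these assumptions, a site designed against one family member also captures the second member with the same seed, while the miRNA with an unrelated seed is left alone.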
Sponges also offer advantages over chemically modified antisense oligonucleotide
inhibitors. First, these antisense inhibitors are specific for one miRNA since they depend
upon extensive sequence complementarity. Thus, neutralizing a family of miRNAs
requires the delivery of a mixture of oligonucleotides. In addition, many cells both in
vitro and in vivo are resistant to the uptake of antisense oligonucleotides. In contrast, for
difficult-to-transfect cell lines or cells in vivo, the sponge transgene can be delivered by a
retroviral vector. Inclusion of an open reading frame for a selectable marker or reporter
gene in the vector allows for selection or screening, fluorescence-activated cell sorting, or
even laser capture microdissection of cells strongly expressing the sponge. (See
Supplementary Information for suggestions on sponge design.) This makes it possible to
isolate a fraction of cells in which the family of miRNAs is strongly inhibited, which can
reveal even subtle changes in target gene expression. In principle one could include
regulatory elements in the sponge promoter to make it drug-inducible or tissue-specific
for the tissue of choice. By contrast, the cholesterol-modified 'antagomir'
oligonucleotides that can be injected into the mouse cannot access all tissues, and mostly
accumulate in the liver (Krützfeldt et al. 2005). Finally, antagomirs require repeated
administration in large doses to inhibit a miRNA over long durations, whereas one could
generate germline transgenic sponge-expressing animals to continuously inhibit the
miRNA of interest for the lifetime of the animal. The current status of sponge technology
will be described below as examples of the above principles. It is remarkable how useful
this technology has become.
Recent applications of miRNA sponges
The immediate application of miRNA sponges as first described was transient treatment
and assay in cell culture models. A number of reports demonstrate the flexibility of the
method with respect to cell type, promoter, vector, reporter gene, and type of miRNA
targeted. Sponges were transfected or transduced into human, mouse, and rat cell lines
such as non-small cell lung cancer (Kumar et al. 2008), B cell lymphoma (Bolisetty et al.
2009), embryonic neural stem cells (Rybak et al. 2008), and dissociated hippocampal
neurons (Edbauer et al. 2010). Sponge RNAs were transcribed from strong promoters
such as CMV (Elcheva et al. 2009, Rybak et al. 2008), U6 (Sayed et al. 2008), and viral
LTRs (Kumar et al. 2008). The most commonly used vectors were plasmids (Elcheva et
al. 2009, Kumar et al. 2008, Edbauer et al. 2010, Rybak et al. 2008) but some used
retroviruses (Bolisetty et al. 2009), lentiviruses (Nachmani et al. 2009, Horie et al. 2009)
or adenovirus (Sayed et al. 2008). Individual miRNAs e.g. miR-155 (Bolisetty et al.
2009) or large seed families e.g. let-7 (Kumar et al. 2008) were successfully targeted. The
most common reporter gene was eGFP (Elcheva et al. 2009, Kumar et al. 2008, Bolisetty
et al. 2009, Rybak et al. 2008, Nachmani et al. 2009), but mCherry (Edbauer et al. 2010)
and luciferase (Horie et al. 2009) were also used. Typically, cellular assays and target
validation assays (visualization of derepressed target protein or 3' UTR reporter
expression) were performed 24-72 hours after introduction of the sponge construct.
Transient delivery to tissue is also feasible: Care et al. used an adenoviral eGFP sponge
to inhibit miR-133 in cardiac myocytes in vivo in a mouse model of cardiac hypertrophy
(Care et al. 2007). Krol et al. used adeno-associated virus (AAV) to deliver sponges to
mice subretinally. In the latter case, the eGFP sponge was driven by the rhodopsin
promoter to allow for specific expression in photoreceptor cells, and each animal
received a combination sponge for three light-regulated miRNAs (miR-182, -96, and -183) in one eye and an empty control sponge in the other (Krol et al. 2010). Three weeks
post-injection, retinas were isolated and dissected into retinal layers using laser capture
microdissection for eGFP-expressing cells. Western blotting revealed strong derepression
for the target glutamate transporter SLC1A1.
One fortuitous aspect of sponge treatment is that it can cause a significant and specific
reduction in the miRNA level (Sayed et al. 2008, Rybak et al. 2008, Horie et al. 2009).
This may indicate that miRNA-target interaction stimulates degradation of the miRNA.
Another positive outcome is the absence of any feedback response that would upregulate
the miRNA upon introduction of increased target sites in the form of the miRNA sponge.
Even though early results with transiently introduced sponges were encouraging, it was
not certain that sponge mRNAs would be able to accumulate to levels sufficient to inhibit
miRNA in stable expression formats. Recent results indicate that this is possible.
Stable miRNA sponge expression
Continuous expression of the sponge inhibitor makes it possible to perform long-term
miRNA loss-of-function studies in cell culture and in vivo assays such as bone marrow
reconstitution and cancer xenografts. Several groups have achieved stable miRNA
sponge activity by expressing the transgene from one or more chromosomal integrations
(Scherr et al. 2007, Haraguchi et al. 2009, Gentner et al. 2009, Bonci et al. 2008,
Valastyan et al. 2009, Starczynowski et al. 2010, Ma et al. 2010a, Ma et al. 2010b, Gatt et
al. 2010, Papapetrou et al. 2010). In principle, stably propagated episomal vectors
(Kimchi et al. 1999) should also yield similar results. The challenge for stable expression
is to produce a sufficient dose of sponge mRNA given much lower transgene copy
numbers compared to transient plasmid transfection. The good news from recent reports
is that even partial miRNA inhibition can yield measurable and interesting phenotypes.
Papapetrou et al. sought to probe the role of the erythroid-specific, closely clustered
miRNAs miR-144 and miR-451 in blood cell development. To this end they used
lentiviral sponges marked with a different color fluorescent reporter for each miRNA to
dissect their relative contributions in erythropoiesis (Papapetrou et al. 2010). Bone
marrow reconstitution was performed with a 1:1 mixture of green control sponge with red
(miR-144) or yellow (miR-451) sponge, or both. Three to four weeks after
transplantation, the competitive repopulation of the chimeric blood was analyzed by flow
cytometry. Both miRNAs were found to be required for normal progression through the
first stage of erythroblast maturation, and their simultaneous inhibition showed that they
act additively.
One of the most common applications of stably expressed sponges is to mimic the down-regulation of specific miRNAs that are aberrantly expressed in certain disease states. For
example, by screening miRNA expression and metastatic potential of a panel of
mammary cell lines, Valastyan et al. identified miR-31 as strongly down-regulated in
aggressive metastatic cancer (Valastyan et al. 2009). They set up an experimental model
wherein human non-metastatic breast cancer cells transduced with retroviral eGFP
sponges for miR-31 or an irrelevant sequence were orthotopically implanted in mouse
mammary fat pads. Primary tumor size was not significantly affected by the inhibition of
miR-31, but, while the control sponge tumors did not metastasize, miR-31 sponge tumors
metastasized to the lungs, forming ten times more lesions (easily identifiable by their
GFP fluorescence). This result allowed the authors to identify miR-31 as a suppressor of
metastasis. A similar approach was taken to show that miR-10b (Ma et al. 2010a) and
miR-9 (Ma et al. 2010b) promote breast cancer metastasis. The recent finding that
reduction in the expression of a tumor suppressor by a mere 20 percent can promote the
development of cancer (Alimonti et al. 2010) suggests that screens with sponges, which
may alter target gene expression to a similar extent, could be generally informative.
A related experiment is the application of a sponge to mimic the genetic state of patients
with a genomic deletion of a particular miRNA or miRNA cluster. For example, the miR-15a-16-1 cluster is located within a region of chromosome 13q14 that is frequently
deleted in leukemia, prostate cancer, and other malignancies (Bottoni et al. 2005, Bandi
et al. 2009, Bonci et al. 2008, Hanlon et al. 2009, Corthals et al. 2010, Gatt et al. 2010).
Bonci et al. and Gatt et al. used lentiviral GFP sponges with sites for miR-15a and miR-16, respectively, and tested transduced human prostate cancer and multiple myeloma cell
lines by xenograft assay. In both cases the miR-15/16-inhibited cancers developed larger,
more invasive tumors than their negative controls; in the multiple myeloma study, the
animals showed substantially decreased survival, from a median of 80 to 31 days.
Analysis of the tumors implicated several signaling pathways in which the miR-15/16
family acts to suppress survival, proliferation and invasiveness (Gatt et al. 2010).
Another instance of a disease-associated miRNA cluster deletion occurs in the 5q- subtype of myelodysplastic syndrome (MDS) (Starczynowski et al. 2010). In this case the
miRNAs in the cluster, miR-145 and -146a, have different seeds. To model the partial
loss of these two miRNAs in hematopoietic stem/progenitor cells, Starczynowski et al.
used a combination sponge containing 8-9 bulged sites for each miRNA. Cells transduced
with retroviral YFP sponges were transplanted into lethally irradiated recipient mice, and
were mixed with wild-type cells to mimic the chimerism of human 5q- patients. Eight
weeks post-transplantation, the animals' blood cells manifested most of the features of
MDS. Observation over the long term proved the benefit of including a fluorescent
reporter in the competition assay: over the course of several months, YFP+ cells were
depleted from the blood of the sponge-transduced (but not vector control) recipients, yet
thrombocytosis was still evident, indicating a cell non-autonomous effect of miRNA
depletion. This correlated with an increased serum IL-6 concentration attributable to the
derepression of miR-146 target gene TRAF6. Sustained, systemic phenotypes may result
from transient miRNA perturbation in a subset of cells if secreted cytokines operate in a
positive feedback loop, as in the recently described inflammatory cascade driven by IL6,
let-7 down-regulation, and NF-kappaB (Iliopoulos et al. 2009). As in the case of miR-15a-16-1 depletion in cancer, the ability of the stable sponge to partially knock down
miRNA activity provides a good mimic for the partial loss of miRNA expression in
patients with a heterozygous deletion. The miR-145-146a miRNA cluster was shown to
be haploinsufficient in conferring protection against disease (Starczynowski et al. 2010).
miRNA sponges in transgenic animals
The first transgenic organisms made to express miRNA sponges were plants (Franco-Zorrilla et al. 2007). These incorporated a single bulged binding site for the miRNA of
interest in the context of an overexpressed non-coding RNA, and successfully generated
phenotypes opposite those of the corresponding miRNA-overexpressing plants.
Stable, germline miRNA sponge expression in an animal model organism was first
achieved in Drosophila using the Gal4-UAS (Upstream Activation Sequence) system
(Loya et al. 2009). The sponge constructs consist of five UAS elements, a fluorescent
reporter, and ten bulged miRNA binding sites in the 3' UTR. Gal4 expressed from a
tissue-specific promoter drives high expression of the sponge transgene. These inhibitors
were able to completely suppress a neomorphic phenotype caused by an overexpressed
miRNA in the eye, and to largely rescue expression of a target UTR reporter regulated by
an endogenous miRNA in the wing imaginal disc. Hypomorphic phenotypes were
enhanced by means of a sensitized background: the heterozygous miRNA deletion
mutant, which has a reduced level of the miRNA but no detectable phenotype on its own.
In this background, the sponge transgenics could phenocopy miRNA-null mutant flies.
Varying the number of transgene copies also modulated the inhibitory effect, which could
be used in combination with the miRNA genetic background to generate allelic series.
The power of the Gal4 inducible system to dissect a null phenotype was shown by
inhibiting a miRNA's activity in specific subtypes of cells. It is known that the miR-8
knockout has neuromuscular junction defects; activating the expression of a miR-8
sponge specifically in neurons or in muscle cells revealed the locally required activity
(and regulation of the target gene Ena) in the postsynaptic muscle cell, even though miR-8 is present in both pre- and post-synaptic cells. The ability to probe miRNA function in
restricted subsets of cells could be critical, as there are cases of miRNA-target
interactions restricted to one cell type; an extreme example is miR-273 repressing the
transcription factor die-1 in the right chemosensory ASE neuron, and lsy-6 repressing
cog-1 in the left chemosensory ASE neuron in C. elegans (Chang et al. 2004).
Transgenic vertebrates expressing sponges are a work in progress. The recent
development of the Tol2 transposon system and various Gal4 strains should facilitate the
introduction of sponge transgenes for tissue-specific expression in zebrafish (Asakawa
and Kawakami 2008). In the mouse, an inducible sponge could be created by means of
the Cre-lox system (to remove a transcriptional stop cassette with tissue-specific
recombinase expression) or with a tet-responsive element driving the sponge and tissue-
specific reverse tet transactivator (rtTA) expression in combination with feeding the
animal doxycycline. A sensitized background of DGCR8 and/or Dicer heterozygosity,
which show partially reduced levels for some miRNAs (Murchison et al. 2005, Wang et
al. 2007), might enhance the loss of function. It should be noted, however, that the Dicer
heterozygous state can accelerate the development of tumors in mouse models (Kumar et
al. 2009).
Are there natural miRNA sponges?
Given the ability of stably integrated mRNA-based miRNA sponges to specifically and in
some cases inducibly inhibit miRNA seed families, it seems reasonable to expect that
nature might also have invented this type of miRNA inhibitor. There are further reasons
to support this hypothesis. First, miRNAs have been shown to be very stable (Bail et al.
2010), some with in vivo half-lives of more than a week (van Rooij et al. 2007); thus it
should be more effective to induce a sponge RNA to sequence-specifically sequester a
miRNA than to sequence-specifically degrade the mature miRNA strand, which is
encased in an Argonaute protein complex. Sequestration by a target mimic RNA would
operate through seed specificity, so an entire functional class of miRNA seed family
members would be inhibited. Finally, effective sponges should be easy to evolve as they
require only short stretches of complementarity to miRNA seeds in regions of relatively
unstructured RNA. A sponge could contain sites for one miRNA family or for a
combination of miRNAs such that it could serve as a specific rescue molecule for one or
a few target genes.
One can imagine several scenarios in which the expression of a sponge RNA could add a
layer of regulation to post-transcriptional control of miRNA targets. During a
developmental transition or in response to a cellular stress, when a miRNA is
transcriptionally down-regulated, induction of a sponge RNA could sharpen the loss of
that miRNA activity over time (Figure 3A). A miRNA induced to respond to a transient
stress could be inhibited shortly thereafter by the accumulation of a stress-induced sponge
(Figure 3B). Alternatively, such a stress-induced sponge could act as a quality control
mechanism, setting a threshold above which miRNA expression must rise to successfully
enact a change in the expression of critical target genes. A viral sponge RNA could
inhibit a host miRNA to change the infected cell's gene expression program so as to
evade immune response or hijack cellular pathways to promote viral propagation. A
sponge RNA expressed in a specific tissue could uncouple the activity of an intron-derived miRNA from the expression of its host gene. A tissue-specific sponge could also
neutralize passenger strand miRNPs to enhance the specificity of miRNA loading
(beyond what is determined by the thermodynamic asymmetry of the miRNA duplex that
normally controls strand assembly), as has been done with artificial sponges to prevent
passenger strand-mediated off-target effects from shRNA vectors (Mockenhaupt et al.
2010). A sponge could be constitutively expressed to fine-tune the activity of a miRNA
to a slightly lower level. In certain cellular contexts such as in neurons, spatially
separated zones of translation could experience major consequences from local
sequestration of miRNA and the ensuing rescue of expression of a small pool of
messages.
All speculation aside, the best reason to believe that there could be natural miRNA
sponges in animal systems is that there is already evidence for one in plants (Franco-Zorrilla et al. 2007). The TPSI family of non-coding RNAs (IPS1 and its paralog At4) are
processed as mRNAs but contain very short, poorly conserved open reading frames. They
also contain in the 3' UTR a 23-nt sequence that is highly conserved among different
plant species, and that can act as a single bulged binding site for miR-399. In fact, the
miRNA's nucleotides 1-10 are perfectly paired in more than 80 percent of IPS1 genes;
there is additional strong, conserved pairing to the miRNA's 3' end. The mismatches
opposite nucleotides 10 and 11 protect the mRNA from endonucleolytic cleavage by
miR-399-loaded Argonautes. While the TPSI RNAs are induced upon phosphate
starvation, miR-399 is also induced, and the miR-399 target gene PHO2 is initially down-regulated (Chitwood and Timmermans 2007). Franco-Zorrilla et al. found that
overexpressing IPS1 in the presence of miR-399 was able to rescue the level of PHO2
mRNA and thereby lower the shoot Pi content. Whether the endogenous TPSI levels are
sufficient to derepress PHO2 to incur the same physiological response remains to be
shown. As miR-399 and its sponge inhibitor are both induced by phosphate stress, they
appear to act in an incoherent manner to regulate PHO2 target expression. Depending on
the relative production and turnover rates of the miRNA and the sponge RNA, this type
of regulatory architecture could serve to generate a brief pulse of miRNA activity
followed by an attenuation period during which target mRNA levels recover (Chitwood
and Timmermans 2007).
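The pulse behavior described above can be sketched with a toy model in Python. This is a minimal sketch, not a model of the miR-399 system: it assumes that both RNAs are induced at time zero with constant production and first-order turnover, that each sponge RNA sequesters one miRNA stoichiometrically, and that free miRNA is simply the excess of miRNA over sponge. The rate constants and the function names level and free_mirna are hypothetical, chosen so that the miRNA rises quickly while the sponge accumulates more slowly to a higher plateau.

```python
import math

# Toy model of an incoherent circuit: one stress signal induces both a
# miRNA and its sponge. All parameters are arbitrary, illustrative values.


def level(production, turnover, t):
    """Abundance at time t of an RNA induced at t = 0.

    Constant production and first-order turnover give an exponential
    approach to the steady state production / turnover.
    """
    return (production / turnover) * (1.0 - math.exp(-turnover * t))


def free_mirna(t, k_m=100.0, d_m=1.0, k_s=30.0, d_s=0.2):
    """Free (active) miRNA, assuming 1:1 stoichiometric sequestration."""
    mirna = level(k_m, d_m, t)
    sponge = level(k_s, d_s, t)
    return max(mirna - sponge, 0.0)


times = [0.5 * i for i in range(41)]  # 0 to 20, arbitrary time units
activity = [free_mirna(t) for t in times]
peak_time = times[activity.index(max(activity))]
```

With these assumed rates, free miRNA rises to an early peak and then returns to zero once the sponge outnumbers the miRNA. Swapping the rates so that the sponge accumulates faster than the miRNA abolishes the pulse, which is the sense in which the outcome depends on relative production and turnover.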
mRNAs that act as competitive inhibitors of regulatory small RNAs (sRNAs) were also
recently discovered in prokaryotes (Overgaard et al. 2009, Figueroa-Bossi et al. 2009). In
this case a constitutively expressed, long-lived sRNA binds to and is destabilized by a
target mimic RNA which is induced by chitobiose, a breakdown product of chitin from
the outer membrane (Mandin and Gottesman 2009). What results is derepression of a
chitoporin gene whose message is normally degraded by the sRNA.
In animal systems, one place to look for potential sponge RNAs is in viral transcripts,
which can be expressed at very high levels. In fact, there are hints that a viral miRNA
sponge might be at work in cells lytically infected with murine cytomegalovirus (Buck et
al. 2010). Upon infection, Buck et al. observed rapid post-transcriptional down-regulation
of miR-27a and -27b, in a manner dependent on RNA polymerase activity; higher
multiplicity of infection correlated with lower miR-27 levels. A gain-of-function
experiment showed that the miR-27 family suppresses viral replication, supporting the
possibility that inhibition of this miRNA family by a viral sponge RNA could facilitate
viral replication.
Cellular RNAs are also potential candidates for miRNA target mimics. Recently, genome-wide analysis of chromatin marks uncovered hundreds of large intergenic non-coding
RNAs (lincRNAs) (Guttman et al. 2009), some of which localize to the cytoplasm where
they could interact with mature miRNAs. There are also dozens of PolIII products and
PolII-generated mRNA-like non-coding RNAs of undetermined function listed in non-coding RNA databases; some have been detected at high levels in specific cell types or
under specific conditions (Pang et al. 2005). While such RNAs may be transcribed from
intergenic promoters or promoters within 3' UTRs, another mechanism that can generate
a UTR RNA was recently observed in mouse embryonic development: an exon exclusion
event causes the entire coding region of the mRNA to be spliced out, leaving the
untranslated regions in a non-coding transcript (Kanadia and Cepko 2010). Such
transcripts could act as target mimics for the miRNA or combination of miRNAs with
binding sites in their 3' UTRs.
Concluding remarks
Sponges are simply mRNAs with target sites in their 3' UTR that are expressed at
sufficient levels to competitively interfere with miRNA regulation of specific endogenous
targets. This can be pictured in the context of a cell as a population of miRNAs
associated with Argonaute proteins that are bound as RNPs to the population of mRNAs
with target sites of different affinities. Surprisingly, the concentration dependence of
miRNA interactions with targets of different affinity is not well understood, but survey
experiments suggest that miRNA concentrations of 500-1,000 copies per cell are necessary for
silencing (Calabrese et al. 2008, Mukherji et al. 2010). The concentration of individual
miRNAs in some differentiated tissues can range as high as 30,000-50,000 copies per cell. Under
steady-state conditions most of these miRNAs are probably bound to endogenous
mRNAs with different affinities, and the addition of sponges at levels of perhaps 50-100
RNAs per cell, with 5-10 target sites each, can, at least in some cases, compete the
endogenous miRNAs away from targets sufficiently to produce a physiological change.
Typically, the sponge's binding sites are designed to be of high affinity with extensive
complementarity to the miRNA. Thus, the effectiveness of a sponge for de-repression of
a specific target might be expected to vary with the abundance of the miRNA, the nature
of the pool of its endogenous targets, and the affinity of the interaction of the miRNA
with the specific target mRNA. Given the large number of variables, it is difficult to
predict whether a sponge will be effective in a particular cell. Recent statistical analysis
of dependences of silencing by miRNA and siRNA on the cellular abundance of target
mRNAs illustrates these issues (Arvey et al. 2010). As outlined above, it is encouraging
that the use of sponges has generated phenotypes with multiple vectors and in multiple
organisms. They will probably be used more broadly as the study of miRNAs focuses on
more physiological questions requiring perturbation of their activities in specific cell
states.
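The copy numbers quoted above can be combined in a back-of-the-envelope calculation, sketched here in Python. It simply multiplies the quoted ranges and assumes that every sponge site is available to bind one miRNA; it ignores binding affinity and the pool of endogenous targets, which the text identifies as important variables. The variable and function names are illustrative only.

```python
# Back-of-the-envelope arithmetic using the ranges quoted in the text.
# Assumption: each sponge binding site can occupy one miRNA at a time.


def product_range(a, b):
    """Multiply two (low, high) ranges of positive numbers."""
    return (a[0] * b[0], a[1] * b[1])


sponge_rnas_per_cell = (50, 100)      # sponge transcripts per cell
sites_per_sponge = (5, 10)            # binding sites per transcript
silencing_threshold = (500, 1000)     # miRNA copies needed for silencing
abundant_mirna = (30000, 50000)       # copies of a highly expressed miRNA

# Total sponge binding sites per cell: 250 to 1,000.
sponge_sites = product_range(sponge_rnas_per_cell, sites_per_sponge)

# Sponge sites as a fraction of a highly abundant miRNA population.
fraction_low = sponge_sites[0] / abundant_mirna[1]
fraction_high = sponge_sites[1] / abundant_mirna[0]
```

On these numbers the sponge contributes 250-1,000 sites per cell, which is comparable to the 500-1,000 copies suggested as necessary for silencing but only a small fraction of a miRNA present at 30,000-50,000 copies. This is consistent with the caution above that sponge effectiveness is likely to depend on miRNA abundance.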
In conclusion, the miRNA sponge has become a versatile method for miRNA inhibition
in cell culture and in vivo. The successful production of transgenic fruit flies whose
sponge activity mimics known null mutant phenotypes was a major advance that should
encourage attempts to generate other transgenic sponge animals. In plants, the discovery
of an endogenous mRNA that acts as a natural miRNA sponge to attenuate a stress
response may open the door to discovering more natural target mimics.
Acknowledgments
We thank Mary Lindstrom for help preparing the figures. This work was supported by
United States Public Health Service grant R01-CA133404 from the National Institutes of
Health to P.A.S. and partially by Cancer Center Support (core) grant P30-CA14051 from
the National Cancer Institute.
References
Alimonti A, Carracedo A, Clohessy JG, Trotman LC, Nardella C, Egia A, Salmena L,
Sampieri K, Haveman WJ, Brogi E, Richardson AL, Zhang J, Pandolfi PP. Subtle
variations in Pten dose determine cancer susceptibility. Nature Genet. 42, 454-458
(2010).
Arvey A, Larsson E, Sander C, Leslie CS, Marks DS. Target mRNA abundance dilutes
microRNA and siRNA activity. Mol. Syst. Biol. 6, 363 (2010).
Asakawa K, Kawakami K. Targeted gene expression by the Gal4-UAS system in
zebrafish. Dev. Growth Differ. 50, 391-399 (2008).
Bail S, Swerdel M, Liu H, Jiao X, Goff LA, Hart RP, Kiledjian M. Differential regulation
of microRNA stability. RNA 16, 1032-1039 (2010).
Bandi N, Zbinden S, Gugger M, Arnold M, Kocher V, Hasan L, Kappeler A, Brunner T,
Vassella E. miR-15a and miR-16 are implicated in cell cycle regulation in a Rb-dependent manner and are frequently deleted or down-regulated in non-small cell lung
cancer. Cancer Res. 69, 5553-5559 (2009).
Bartel DP. MicroRNAs: target recognition and regulatory functions. Cell 136, 215-233
(2009).
Bolisetty MT, Dy G, Tam W, Beemon KL. Reticuloendotheliosis virus strain T induces
miR-155, which targets JARID2 and promotes cell survival. J. Virol. 83, 12009-12017
(2009).
Bonci D, Coppola V, Musumeci M, Addario A, Giuffrida R, Memeo L, D'Urso L,
Pagliuca A, Biffoni M, Labbaye C, Bartucci M, Muto G, Peschle C, De Maria R. The
miR-15a-miR-16-1 cluster controls prostate cancer by targeting multiple oncogenic
activities. Nature Med. 14, 1271-1277 (2008).
Bottoni A, Piccin D, Tagliati F, Luchin A, Zatelli MC, degli Uberti EC. miR-15a and
miR-16-1 down-regulation in pituitary adenomas. J. Cell Physiol. 204, 280-285 (2005).
Buck AH, Perot J, Chisholm MA, Kumar DS, Tuddenham L, Cognat V, Marcinowski L,
Dölken L, Pfeffer S. Post-transcriptional regulation of miR-27 in murine cytomegalovirus
infection. RNA 16, 307-315 (2010).
Bushati N, Cohen SM. microRNA functions. Annu. Rev. Cell Dev. Biol. 23, 175-205
(2007).
Calabrese JM. Dicer deletion and short RNA expression analysis in mouse embryonic
stem cells. Doctoral thesis (2008).
Care A, Catalucci D, Felicetti F, Bonci D, Addario A, Gallo P, Bang ML, Segnalini P, Gu
Y, Dalton ND, Elia L, Latronico MV, Hoydal M, Autore C, Russo MA, Dorn GW 2nd,
Ellingsen O, Ruiz-Lozano P, Peterson KL, Croce CM, Peschle C, Condorelli G.
MicroRNA-133 controls cardiac hypertrophy. Nature Med. 13, 613-618 (2007).
Chang S, Johnston RJ Jr, Frokjaer-Jensen C, Lockery S, Hobert O. MicroRNAs act
sequentially and asymmetrically to control chemosensory laterality in the nematode.
Nature 430, 785-789 (2004).
Chitwood DH, Timmermans MC. Target mimics modulate miRNAs. Nature Genet. 39,
935-936 (2007).
Corthals SL, Jongen-Lavrencic M, de Knegt Y, Peeters JK, Beverloo HB, Lokhorst HM,
Sonneveld P. Micro-RNA-15a and micro-RNA-16 expression and chromosome 13
deletions in multiple myeloma. Leuk. Res. 34, 677-681 (2010).
Ebert MS, Neilson JR, Sharp PA. MicroRNA sponges: competitive inhibitors of small
RNAs in mammalian cells. Nature Methods 4, 721-726 (2007).
Edbauer D, Neilson JR, Foster KA, Wang CF, Seeburg DP, Batterton MN, Tada T, Dolan
BM, Sharp PA, Sheng M. Regulation of Synaptic Structure and Function by FMRP-Associated MicroRNAs miR-125b and miR-132. Neuron 65, 373-384 (2010).
Elcheva I, Goswami S, Noubissi FK, Spiegelman VS. CRD-BP protects the coding
region of betaTrCP1 mRNA from miR-183-mediated degradation. Mol. Cell 35, 240-246
(2009).
Figueroa-Bossi N, Valentini M, Malleret L, Bossi L. Caught at its own game: regulatory
small RNA inactivated by an inducible transcript mimicking its target. Genes Dev. 23,
2004-2015 (2009).
Franco-Zorrilla JM, Valli A, Todesco M, Mateos I, Puga MI, Rubio-Somoza I, Leyva A,
Weigel D, Garcia JA, Paz-Ares J. Target mimicry provides a new mechanism for
regulation of microRNA activity. Nature Genet. 39, 1033-1037 (2007).
Friedman RC, Farh KK, Burge CB, Bartel DP. Most mammalian mRNAs are conserved
targets of microRNAs. Genome Res. 19, 92-105 (2009).
Gatt ME, Ebert MS, Mani M, Zhang Y, Gazit R, Carrasco DE, Dutta J, Adamia S,
Munshi NC, Minvielle S, Avet-Loiseau H, Tai YT, Anderson KC, Carrasco DR.
MicroRNAs 15a/16-1 function as tumor suppressor genes in multiple myeloma.
Submitted (2010).
Gentner B, Schira G, Giustacchini A, Amendola M, Brown BD, Ponzoni M, Naldini L.
Stable knockdown of microRNA in vivo by lentiviral vectors. Nature Methods 6, 63-66
(2009).
Guttman M, Amit I, Garber M, French C, Lin MF, Feldser D, Huarte M, Zuk O, Carey
BW, Cassady JP, Cabili MN, Jaenisch R, Mikkelsen TS, Jacks T, Hacohen N, Bernstein
BE, Kellis M, Regev A, Rinn JL, Lander ES. Chromatin signature reveals over a
thousand highly conserved large non-coding RNAs in mammals. Nature 458, 223-227
(2009).
Hanlon K, Rudin CE, Harries LW. Investigating the targets of MIR-15a and MIR-16-1 in
patients with chronic lymphocytic leukemia (CLL). PLoS One 4, e7169 (2009).
Haraguchi T, Ozaki Y, Iba H. Vectors expressing efficient RNA decoys achieve the long-term suppression of specific microRNA activity in mammalian cells. Nucleic Acids Res.
37, e43 (2009).
He L, Hannon GJ. MicroRNAs: small RNAs with a big role in gene regulation. Nature
Rev. Genet. 5, 522-531 (2004).
Horie T, Ono K, Nishi H, Iwanaga Y, Nagao K, Kinoshita M, Kuwabara Y, Takanabe R,
Hasegawa K, Kita T, Kimura T. MicroRNA-133 regulates the expression of GLUT4 by
targeting KLF15 and is involved in metabolic control in cardiac myocytes. Biochem.
Biophys. Res. Commun. 389, 315-320 (2009).
Iliopoulos D, Hirsch HA, Struhl K. An epigenetic switch involving NF-kappaB, Lin28,
Let-7 MicroRNA, and IL6 links inflammation to cell transformation. Cell 139, 693-706
(2009).
Kanadia RN, Cepko CL. Alternative splicing produces high levels of noncoding isoforms
of bHLH transcription factors during development. Genes Dev. 24, 229-234 (2010).
Kimchi A. Functional approaches to gene isolation in mammalian cells. Science 285, 299
(1999).
Kotin RM, Siniscalco M, Samulski RJ, Zhu XD, Hunter L, Laughlin CA, McLaughlin S,
Muzyczka N, Rocchi M, Berns KI. Site-specific integration by adeno-associated virus.
Proc. Natl Acad. Sci. USA 87, 2211-2215 (1990).
Krol J, Busskamp V, Markiewicz I, Stadler MB, Ribi S, Richter J, Duebel J, Bicker S,
Fehling HJ, Schübeler D, Oertner TG, Schratt G, Bibel M, Roska B, Filipowicz W.
Characterizing Light-Regulated Retinal MicroRNAs Reveals Rapid Turnover as a
Common Property of Neuronal MicroRNAs. Cell 141, 618-631 (2010).
Krützfeldt J, Rajewsky N, Braich R, Rajeev KG, Tuschl T, Manoharan M, Stoffel M.
Silencing of microRNAs in vivo with 'antagomirs'. Nature 438, 685-689 (2005).
Kumar MS, Erkeland SJ, Pester RE, Chen CY, Ebert MS, Sharp PA, Jacks T.
Suppression of non-small cell lung tumor development by the let-7 microRNA family.
Proc. Natl Acad. Sci. USA 105, 3903-3908 (2008).
Kumar MS, Pester RE, Chen CY, Lane K, Chin C, Lu J, Kirsch DG, Golub TR, Jacks T.
Dicer1 functions as a haploinsufficient tumor suppressor. Genes Dev. 23, 2700-2704
(2009).
Loya CM, Lu CS, Van Vactor D, Fulga TA. Transgenic microRNA inhibition with
spatiotemporal specificity in intact organisms. Nature Methods 6, 897-903 (2009).
Ma L, Reinhardt F, Pan E, Soutschek J, Bhat B, Marcusson EG, Teruya-Feldstein J, Bell
GW, Weinberg RA. Therapeutic silencing of miR-10b inhibits metastasis in a mouse
mammary tumor model. Nature Biotech. 28, 341-347 (2010).
Ma L, Young J, Prabhala H, Pan E, Mestdagh P, Muth D, Teruya-Feldstein J, Reinhardt
F, Onder TT, Valastyan S, Westermann F, Speleman F, Vandesompele J, Weinberg RA.
miR-9, a MYC/MYCN-activated microRNA, regulates E-cadherin and cancer metastasis.
Nature Cell Biol. 12, 247-256 (2010).
Mandin P, Gottesman S. Regulating the regulator: an RNA decoy acts as an OFF switch
for the regulation of an sRNA. Genes Dev. 23, 1981-1985 (2009).
Meister G, Landthaler M, Dorsett Y, Tuschl T. Sequence-specific inhibition of
microRNA- and siRNA-induced RNA silencing. RNA 10, 544-550 (2004).
Mockenhaupt S, Schurmann N, Grimm D. Alleviation of adverse shRNA off-targeting
via vector-encoded passenger strand decoys. Keystone symposium poster (2010).
Mukherji S, Ebert MS, Zheng GZ, Tsang JS, Sharp PA, van Oudenaarden A. microRNAs
generate gene expression thresholds with ultrasensitive transitions. Submitted (2010).
Murchison EP, Partridge JF, Tam OH, Cheloufi S, Hannon GJ. Characterization of
Dicer-deficient murine embryonic stem cells. Proc. Natl Acad. Sci. USA 102, 12135-12140
(2005).
Nachmani D, Stern-Ginossar N, Sarid R, Mandelboim O. Diverse herpesvirus
microRNAs target the stress-induced immune ligand MICB to escape recognition by
natural killer cells. Cell Host Microbe 5, 376-385 (2009).
Orom UA, Kauppinen S, Lund AH. LNA-modified oligonucleotides mediate specific
inhibition of microRNA function. Gene 10, 137-141 (2006).
Overgaard M, Johansen J, Moller-Jensen J, Valentin-Hansen P. Switching off small RNA
regulation with trap-mRNA. Mol. Microbiol. 73, 790-800 (2009).
Pang KC, Stephen S, Engström PG, Tajul-Arifin K, Chen W, Wahlestedt C, Lenhard B,
Hayashizaki Y, Mattick JS. RNAdb--a comprehensive mammalian noncoding RNA
database. Nucleic Acids Res. 33, D125-130 (2005).
Papapetrou EP, Korkola JE, Sadelain M. A genetic strategy for single and combinatorial
analysis of miRNA function in mammalian hematopoietic stem cells. Stem Cells 28,
287-296 (2010).
Rybak A, Fuchs H, Smirnova L, Brandt C, Pohl EE, Nitsch R, Wulczyn FG. A feedback
loop comprising lin-28 and let-7 controls pre-let-7 maturation during neural stem-cell
commitment. Nature Cell Biol. 10, 987-993 (2008).
Sayed D, Rane S, Lypowy J, He M, Chen IY, Vashistha H, Yan L, Malhotra A, Vatner D,
Abdellatif M. MicroRNA-21 targets Sprouty2 and promotes cellular outgrowths. Mol.
Biol. Cell 19, 3272-3282 (2008).
Scherr M, Venturini L, Battmer K, Schaller-Schoenitz M, Schaefer D, Dallmann I,
Ganser A, Eder M. Lentivirus-mediated antagomir expression for specific inhibition of
miRNA function. Nucleic Acids Res. 35, e149 (2007).
Starczynowski DT, Kuchenbauer F, Argiropoulos B, Sung S, Morin R, Muranyi A, Hirst
M, Hogge D, Marra M, Wells RA, Buckstein R, Lam W, Humphries RK, Karsan A.
Identification of miR-145 and miR-146a as mediators of the 5q- syndrome phenotype.
Nature Med. 16, 49-58 (2010).
Tsang JS, Ebert MS, van Oudenaarden A. Genome-wide dissection of microRNA
functions and cotargeting networks using gene set signatures. Mol. Cell 38, 140-153
(2010).
Valastyan S, Reinhardt F, Benaich N, Calogrias D, Szász AM, Wang ZC, Brock JE,
Richardson AL, Weinberg RA. A pleiotropically acting microRNA, miR-31, inhibits
breast cancer metastasis. Cell 137, 1032-1046 (2009).
van Rooij E, Sutherland LB, Qi X, Richardson JA, Hill J, Olson EN. Control of
stress-dependent cardiac growth and gene expression by a microRNA. Science 316, 575-579
(2007).
Wang Y, Medvid R, Melton C, Jaenisch R, Blelloch R. DGCR8 is essential for
microRNA biogenesis and silencing of embryonic stem cell self-renewal. Nature Genet.
39, 380-385 (2007).
Figures
Figure 1.
(A) In the absence of sponge treatment, target mRNAs for a particular miRNA seed family are
repressed.
(B) After introduction of the sponge transgene, sponge mRNAs are expressed at a high level and
sequester the miRNA complexes, rescuing the expression of the endogenous targets.
Sponge-treated cells can be identified by their eGFP reporter expression.
(C) Pairing of a miRNA with a bulged sponge site shows mismatches opposite miRNA
nucleotides 9-12. The miRNA seed region is highlighted.
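[Figure 1 graphic not reproduced. Panel labels: (A) No Sponge; (B) Sponge Treated.
Panel (C) pairing, with the miRNA segment AGAC (nucleotides 9-12) bulged out:
miR-21 3'-AGUUGUAGUC AGAC UAUUCGAU-5'
Sponge 5'-UCAACAUCAGGAC AUAAGCUA-3']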
Figure 2.
(A) Tissue-specific expression of the Gal4 transcription factor was used to drive miRNA sponge
expression under the control of upstream activating sequences (UAS) in transgenic fruitflies.
(B) Dissection of a complex phenotype using tissue-specific sponges. A developmental defect in
the axonal branching and synaptic boutons of neuromuscular junctions (NMJ) was observed in
the miR-8 knockout (second panel) and in miR-8 heterozygous flies expressing a miR-8 sponge
inhibitor specifically in muscle (fourth panel). Wild-type appearance of the NMJ is seen in the
miR-8 heterozygote (first panel) and in miR-8 heterozygous flies expressing a miR-8 sponge
specifically in neurons (third panel). Sponge expression is indicated by GFP fluorescence (shown
in green).
Figure 3. Roles for natural sponges in regulating miRNA activity.
(A) Rapid transitions: transcriptional down-regulation of a miRNA is sharpened by induction of a
sponge RNA that sequesters the lingering mature miRNA.
(B) Transient responses: a stress-induced miRNA is allowed a pulse of activity before being
inhibited by accumulating stress-induced sponge RNA.
[Figure 3 graphic not reproduced. Labels: Stage 1, Stage 2; curves: miRNA activity with
sponge, miRNA activity without sponge, sponge expression; horizontal axis: Time.]
Supplementary Information
Optimizing the sponge construct
Design of miRNA binding sites: Sites perfectly complementary to the miRNA show some
inhibitory activity (Ebert et al. 2007, Sayed et al. 2008, Gentner et al. 2009), perhaps because
miRNAs complexed with the catalytically inactive Argonautes 1, 3, and 4 can still be titrated by
these sites without cleavage of the sponge RNA. More effective are bulged sites mispaired
opposite miRNA positions 9-12 (Ebert et al. 2007, Gentner et al. 2009), presumably because they
form a more stable interaction with the miRNA, including miRNA complexed with Ago2.
Typical sponge constructs contain 4-10 binding sites separated by a few nucleotides each.
Increasing the number of binding sites may have diminishing marginal utility, as each site
increases the probability of sponge RNA degradation. Variations in the bulged mismatches and
the spacers can be introduced to reduce the risk of recombination during cloning and reduce the
risk of introducing multiple unintended binding sites for other regulatory factors. Sites should be
placed in an unstructured, non-coding region of the RNA; for Pol III-generated sponges, terminal
stem-loops can be included as stabilizing elements (Ebert et al. 2007).
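
The bulged-site design described above can be illustrated with a short script. This is a minimal sketch and not the cloning procedure used in this work: it takes a mature miRNA sequence, forms the perfectly complementary site, and replaces the nucleotides opposite miRNA positions 9-12 with a shorter mismatched segment. The miR-21 sequence and the GAC bulge follow the pairing drawn in Figure 1C; the function names, the CCGG spacer, and the choice of six sites are assumptions made only for illustration.

```python
# Illustrative sketch: build a bulged sponge site from a mature miRNA (RNA, 5'->3').
# Function names, spacer and site count are hypothetical, not taken from the thesis.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the perfectly complementary site, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def bulged_site(mirna, bulge, first=9, last=12):
    """Replace the site bases opposite miRNA positions first..last (1-based,
    counted from the miRNA 5' end) with the mismatched segment `bulge`."""
    site = reverse_complement(mirna)
    n = len(mirna)
    start = n - last       # 0-based index in the site opposite miRNA position `last`
    end = n - first + 1    # exclusive end, just past the base opposite position `first`
    return site[:start] + bulge + site[end:]


def sponge_insert(site, n_sites, spacer):
    """Concatenate identical sites separated by a short spacer."""
    return spacer.join([site] * n_sites)


if __name__ == "__main__":
    mir21 = "UAGCUUAUCAGACUGAUGUUGA"
    site = bulged_site(mir21, "GAC")
    print(site)
    print(sponge_insert(site, n_sites=6, spacer="CCGG"))
```

For simplicity the sketch repeats one identical site; as noted above, varying the bulge and the spacers between repeats may be preferable in practice to limit recombination during cloning.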
Expression and delivery: To maximize sponge expression, the strongest available promoter for
the cell type of interest should be used. For transient assays, plasmid transfection can deliver the
highest dose of the sponge transgene. For longer-term assays, viral transduction with high
multiplicity of infection should be performed. In vivo delivery can be achieved with adenoviral or
adeno-associated viral (AAV) vectors; AAV vectors may be ideal given their ability to infect
non-dividing cells and give high expression from a non-random integration site (Kotin et al.
1990). It should be noted that optimized sponges may still exhibit different degrees of inhibition
in different contexts: where miRNA concentration is very high, complete titration demands a very
high dose of sponge RNA. On the other hand, where the pool of endogenous targets for the
miRNA of interest is large, there should be less free miRNA available, so a lower dose of sponge
RNA should suffice to give strong inhibition.
Chapter 4. MicroRNAs generate gene expression thresholds with
ultrasensitive transitions
This chapter was written by Margaret S. Ebert and Shankar Mukherji and edited by
Phillip A. Sharp and Alexander van Oudenaarden.
MicroRNAs (miRNAs) are short, highly conserved non-coding RNA molecules that
repress gene expression in a sequence-dependent manner. Each miRNA is predicted
to target hundreds of genes (Lewis et al. 2005, Selbach et al. 2008, Baek et al. 2008,
Friedman et al. 2009) and a majority of protein-coding genes are predicted to be
miRNA targets (Friedman et al. 2009). Bulk measurements on populations of cells
have indicated that, although pervasive, repression due to miRNAs is on average
quite modest (~2-fold) (Selbach et al. 2008, Baek et al. 2008, Bartel and Chen 2004).
Information on the magnitude of repression in single cells, however, has been
lacking. Here we perform single-cell measurements using quantitative fluorescence
microscopy and flow cytometry to monitor a target gene's protein expression in the
presence and absence of regulation by miRNA. We find that while the average level
of repression is modest and in agreement with previous population-based
measurements, the repression among individual cells varies dramatically. In
particular, we show that regulation by miRNAs establishes a threshold level of
target mRNA below which protein production is highly repressed. Beyond this
threshold, there is a regime in which expression responds ultrasensitively to target
mRNA input until reaching high enough mRNA levels to almost escape repression
by miRNA. We constructed a mathematical model describing repression of target
gene expression by both non-catalytic and catalytic activity of miRNA. The model
predicted, and experiments confirmed, that the ultrasensitive regime could be
shifted to higher target mRNA levels by transfecting additional miRNA or by
increasing the number of miRNA binding sites in the 3' UTR of the target mRNA.
The ultrasensitive transition is not observed when the miRNA targets a perfectly
complementary site that can undergo catalytic cleavage. These results demonstrate
that even a single species of miRNA can act both as a switch to effectively silence
gene expression and as a fine-tuner of gene expression.
Introduction
MicroRNAs regulate protein synthesis in the cell cytoplasm by promoting target
mRNAs' degradation or inhibiting their translation. Their importance is suggested by
their abundance, with some miRNAs expressed as high as 50,000 copies per cell (Lim et
al. 2003); by their sequence conservation, with some miRNAs conserved from sea
urchins to humans (Grimson et al. 2008); and by their number of targets, the majority of
protein-coding genes (Friedman et al. 2009). miRNAs can regulate a large variety of
cellular processes, from differentiation and proliferation to apoptosis (Yi et al. 2008,
Sluijter et al. 2010, Esau et al. 2004, Cimmino et al. 2005, Li and Carthew 2005,
Bernstein et al. 2003). Further, miRNAs also confer robustness to systems by stabilizing
gene expression during stress and in developmental transitions (Li et al. 2009, Li et al.
2006).
Results and Discussion
Despite the evidence for the importance of gene regulation by miRNAs, the typical
magnitude of observed repression by miRNAs is relatively small (Baek et al. 2008), with
some notable exceptions such as the switch-like transitions due to miRNAs lin-4 and
let-7 targeting the heterochronic genes lin-14 and lin-41 respectively in Caenorhabditis
elegans (Bagga et al. 2005). Importantly however, most of the previous studies of
regulation by miRNAs in mammalian cells have measured population averages, which
often obscure how individual cells respond to signals (Raj and van Oudenaarden 2008).
To assay for miRNA activity in single mammalian cells, we constructed a two-color
fluorescent reporter system that permits simultaneous monitoring of protein levels in the
presence and absence of regulation by miRNA (Fig. 1a). The construct consists of a
bidirectional Tet-inducible promoter driving two genes expressing the fluorescent
proteins mCherry and eYFP tagged with nuclear localization sequences. The 3' UTR of
mCherry is engineered to contain N binding sites for miRNA regulation. In the initial
experiments, the inserted sites are recognized by miR-20, which is expressed
endogenously in HeLa cells along with its seed family members miR-17-5p and
miR-106b. The 3' UTR of eYFP is left unchanged so that it can serve as a reporter of the
transcriptional activity in a single cell.
We constructed cell lines that stably expressed the fluorescent reporter construct with
either a single bulged miR-20 binding site or no site in the mCherry 3' UTR. The levels
of eYFP and mCherry protein were measured for single cells using quantitative
fluorescence microscopy. Arranging individual cells according to their eYFP expression
level, we observed that cells whose mCherry 3' UTR lacks miRNA binding sites had a
concomitant increase in mCherry expression (Fig. 1b). This indicates that in the absence
of miRNA targeting of the mCherry mRNA, the level of expression of eYFP is directly
related to the level of expression of mCherry. However, in cells with a miR-20 site in the
mCherry 3' UTR, the eYFP fluorescence initially increases with no corresponding
increase in mCherry expression level (Fig. 1c). To capture this behavior quantitatively,
we measured joint distributions of mCherry and eYFP levels in single cells, binned the
single cell data according to their eYFP levels, and calculated the mean mCherry level in
each eYFP bin (Supplementary Fig. 1). We refer to this binned joint distribution as the
transfer function. As suggested by the representative single cells shown in Fig. 1c, the
transfer function shows a threshold-linear behavior in which the mCherry level, which
represents the target protein production, does not appreciably rise until the curve reaches
a threshold level of eYFP.
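
The binning step that defines the transfer function can be sketched as follows. This is an illustrative Python version and not the custom MATLAB software used for the analysis; the function name, the log-spaced bin edges, and the requirement of at least two cells per bin are assumptions. It expects background-subtracted, strictly positive eYFP values.

```python
import numpy as np


def transfer_function(eyfp, mcherry, n_bins=20):
    """Bin single cells by eYFP level and return, for each occupied bin,
    the bin center, the mean mCherry level and its standard error."""
    eyfp = np.asarray(eyfp, dtype=float)
    mcherry = np.asarray(mcherry, dtype=float)
    # Log-spaced edges (an assumption) spanning the observed eYFP range.
    edges = np.logspace(np.log10(eyfp.min()), np.log10(eyfp.max()), n_bins + 1)
    # Assign each cell to a bin; clip so the extreme cells fall in the end bins.
    idx = np.clip(np.digitize(eyfp, edges) - 1, 0, n_bins - 1)
    centers, means, sems = [], [], []
    for b in range(n_bins):
        values = mcherry[idx == b]
        if values.size < 2:          # skip bins too sparse for a standard error
            continue
        centers.append(np.sqrt(edges[b] * edges[b + 1]))   # geometric bin center
        means.append(values.mean())
        sems.append(values.std(ddof=1) / np.sqrt(values.size))
    return np.array(centers), np.array(means), np.array(sems)
```

Plotting the returned means against the bin centers on log-log axes gives the curve whose slope is examined in the text.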
We developed a simple mathematical model of miRNA-mediated regulation that could
reproduce the nonlinearity in the above transfer function. This model (Fig. 2a) is similar
to previous models (Elf et al. 2003) used to describe protein-protein titration (Buchler and
Louis 2008) and small RNA (sRNA) regulation in bacterial systems (Levine et al. 2007).
It describes the concentration of free target mRNA (r) subject to regulation by miRNA
(m). We assume that only r can be translated into protein. Experimentally, we expect the
mCherry signal to be proportional to the concentration of r, and the eYFP signal to be
proportional to the concentration of r_untargeted. The core of the model involves the binding
of r to m to form a mRNA-miRNA complex and the release of m from the complex back
into the pool of active miRNA molecules either with or without the accompanying
destruction of r. We assume that the total amount of miRNA is fixed; experimentally we
observe no decrease in the miR-20 level beyond experimental uncertainty as a function of
eYFP (see Supplementary Fig. 2). The qualitative shape of the transfer functions
generated by the model depends on two key lumped parameters. The first parameter, λ,
which behaves like a dissociation constant, governs the sharpness of the threshold (Fig.
2b). On a log-log plot relating r to r_untargeted (Fig. 2d) the increased sharpness manifests
itself as a slope (which we refer to as the logarithmic gain) greater than 1, marking an
ultrasensitive transition connecting the branches of the transfer function of slope 1 that
indicate little protein expression (below the ultrasensitive transition) and nearly maximal
protein production (above the ultrasensitive transition). λ is inversely proportional to the
rate at which miRNA binds the target mRNA (k_on); as k_on increases at a constant k_off, λ
decreases and thus sharpens the transition. The threshold constant θ plays a role in the
placement of the threshold and also in the sharpness of the transition between the
threshold and escape regimes (Fig. 2c). θ is proportional to the concentration of free
miRNA available within the cell; as the total concentration of free miRNAs increases, θ
increases and pushes the ultrasensitive transition to higher values of r_untargeted (Fig. 2e).
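
The qualitative roles of the two lumped parameters can be explored numerically. The sketch below does not reproduce the model of Fig. 2a, whose equations are not given in this section. It assumes instead the standard closed form for equilibrium titration of a target by a fixed pool of binder, with `lam` standing for the dissociation-like constant and `theta` for the threshold constant; the function names and all numerical values are hypothetical.

```python
import numpy as np


def free_target(r_untargeted, lam, theta):
    """Free target level for a given untargeted (total) level, assuming simple
    equilibrium titration: r + theta * r / (lam + r) = r_untargeted."""
    r_untargeted = np.asarray(r_untargeted, dtype=float)
    d = r_untargeted - lam - theta
    # Positive root of r**2 + (lam + theta - r_untargeted) * r - lam * r_untargeted = 0
    return 0.5 * (d + np.sqrt(d ** 2 + 4.0 * lam * r_untargeted))


def log_gain(r_untargeted, r_free):
    """Logarithmic gain: local slope of the transfer function on log-log axes."""
    return np.gradient(np.log(r_free), np.log(r_untargeted))


if __name__ == "__main__":
    r_total = np.logspace(-1, 3, 400)
    for lam, theta in [(5.0, 10.0), (0.1, 10.0), (0.1, 100.0)]:
        gain = log_gain(r_total, free_target(r_total, lam, theta))
        peak = r_total[np.argmax(gain)]
        print(f"lam={lam}, theta={theta}: max gain {gain.max():.2f} near r={peak:.1f}")
```

Under this assumed form, the slope approaches 1 both far below and far above the threshold, lowering `lam` raises the maximum gain, and raising `theta` moves the steep region to higher input levels, in line with the behavior described above.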
The mathematical model thus suggests experiments that could be performed to modulate
the ultrasensitive transitions generated by miRNA-mediated regulation. As our stable cell
lines could not achieve high enough levels of reporter expression to capture the complete
ultrasensitive transition to escape from miRNA-mediated repression, we carried out the
remainder of our experiments by transiently transfecting HeLa cells with reporter
constructs and measuring fluorescence via flow cytometry to increase the number of cells
in our datasets.
To sharpen the transitions by increasing k_on, we increased the number of miRNA binding
sites N in the 3' UTR of mCherry. The maximum logarithmic gain increases smoothly
from approximately 1 when N = 1 to 1.8 when N = 7 (Fig. 3a); as expected from the
model, the effect is stronger going from 1 to 4 binding sites than from 4 to 7 sites. We
were also able to recapitulate a similar transfer function with N = 7 in the 3' UTR of
eYFP, thus isolating the effect to miR-20 mediated regulation rather than any property
intrinsic to the mCherry reporter (Supplementary Fig. 3). Interestingly, unlike with
previous experiments using bacterial sRNA (Levine et al. 2007), we can also directly test
the importance of titration to generate thresholds by using miR-20 binding sites that are
perfectly complementary to the endogenous miR-20, thus converting the interaction
between target and miRNA into a strongly catalytic, RNAi-type repression. We observe
that when the miR-20 bulged binding sites are replaced by a perfectly complementary
binding site that yields the same maximum repression as N= 7 bulged sites, the
ultrasensitive transition is abolished altogether (Fig. 3a, grey points).
To measure the fold repression as a function of target expression level, we measure the
transfer function in the absence of miR-20 binding sites and calculate the ratio of this
control transfer function to transfer functions in the presence of 1, 4, and 7 miR-20 sites
(Fig. 3b). As expected from Fig. 3a, increasing the number of binding sites increases the
fold repression at lower eYFP levels, from just over 2-fold repression with a single
miR-20 site to approximately 10-fold repression with seven miR-20 sites, while not
significantly changing the fold repression at high eYFP (Fig. 3b). Seen this way, we
demonstrate that rather than being only a subtle effect as suggested by population-based
averages, which in this case results in at most 2.5-fold repression with seven binding sites
(Supplementary Fig. 4), regulation by miR-20 can exert very strong repression of protein
production at low target transcript levels. Moreover the boundary of the regime of
strongest repression is marked by the ultrasensitive transition, so shifting this transition to
lower or higher target mRNA levels can be of functional significance.
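
The contrast between bin-wise and population-level fold repression can be made concrete with a small calculation. The sketch below assumes that the control (no-site) and targeted transfer functions have already been evaluated on the same eYFP bins; the function names and the numbers in the example are hypothetical and are not measurements from this work.

```python
import numpy as np


def fold_repression(control_mean, target_mean):
    """Per-bin fold repression: control mCherry divided by targeted mCherry.
    Bins in which the targeted reporter is not detectable return NaN."""
    control_mean = np.asarray(control_mean, dtype=float)
    target_mean = np.asarray(target_mean, dtype=float)
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(target_mean > 0, control_mean / target_mean, np.nan)


def population_fold(control_mean, target_mean, cell_counts):
    """Fold repression of the population average, weighting bins by cell number."""
    w = np.asarray(cell_counts, dtype=float)
    control_total = np.sum(np.asarray(control_mean, dtype=float) * w)
    target_total = np.sum(np.asarray(target_mean, dtype=float) * w)
    return control_total / target_total


if __name__ == "__main__":
    control = [10.0, 100.0, 1000.0]   # made-up mean mCherry per eYFP bin, no sites
    target = [1.0, 20.0, 800.0]       # made-up mean mCherry per eYFP bin, with sites
    print(fold_repression(control, target))
    print(population_fold(control, target, [100, 100, 100]))
```

In this made-up example the lowest bin is repressed 10-fold while the population average shows well under 2-fold repression, because the average is dominated by the highly expressing cells that escape repression.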
Consistent with the model, the ultrasensitive transition can be shifted to either higher or
lower eYFP levels by transfecting either miR-20 mimic oligonucleotides (siRNAs) or
miRNA sponges that inhibit miR-20 activity (Ebert et al. 2007) (Fig. 3c, d;
Supplementary Fig. 5). Increasing the level of miRNA increased the fold repression
below the threshold; the threshold mRNA level needed for protein expression; and the
sharpness of the transition. In the extreme case of seven miR-20 binding sites with 30 nM
miR-20 mimic transfected (Fig. 3d), miRNA-mediated repression can achieve ~40-fold
repression compared to a target with no miRNA binding site; the threshold is shifted to a
10-fold higher eYFP level; and the transition between repressed and unrepressed
expression is quite sharp with a maximum logarithmic gain of ~5.4 (Fig. 3d), compared
to ~1.8 without the transfected miR-20 mimic, i.e. at endogenous levels (Fig. 3a). To
quantitatively compare the data to the model, we simultaneously fit all the datasets,
holding λ constant across the fits for a particular value of N and θ constant for a
particular amount of transfected siRNA mimic. Interestingly, we see that the fit parameter
θ increases with increasing siRNA mimic (Fig. 3e), but in a saturable fashion, while 1/λ
increases linearly with N (Fig. 3f). This suggests that the amount of transfected miRNA
entering functional complexes is limited by entry into the cytoplasm and/or availability of
miRNP components.
In order to test the generality of these findings, that the strength of repression of a
miRNA target depends strongly on the relative amounts of the miRNA and its target, we
sought to recapitulate the results in more physiological settings. First, we tested whether
similar ultrasensitive transitions would be observed when the reporter construct
incorporated naturally occurring miRNA binding sequences by fusing the 3' UTRs of the
oncogene HMGA2 and the major GABA transporter gene SLC6A1 to the mCherry
reporter and performing dual-color flow cytometry. The HMGA2 3' UTR contains seven
binding sites for the miRNA family let-7, which is abundant in HeLa cells, while
SLC6A1 contains three binding sites for the neuronal miRNA miR-218, which we
supplied exogenously. The experiments showed that we could indeed observe
ultrasensitive transitions with these constructs (Supplementary Fig. 6) and for HMGA2,
we increased the ultrasensitive threshold incrementally by transfecting higher doses of
let-7 siRNA mimic (Supplementary Fig. 6).
Finally we used a standard dual luciferase assay (see Methods, Supplementary Fig. 7) to
measure target expression in mouse embryonic stem cells (ES cells) using only their
endogenous pool of miRNA to retain physiological relevance. Furthermore, we measured
a transfer function complementary to that in the experiments with HeLa cells: the mRNA
target level remained fixed while the miRNA concentration varied. To test varying
miRNA concentrations we exploited the fact that different miRNA species are present at
different abundances in ES cells (Calabrese et al. 2007). Finally, to gauge the strength of
miRNA repression, target expression in wild-type ES cells was normalized to target
expression in ES cells that lack the enzyme Dicer and thus contain no miRNAs. We
observe a similar threshold-linear curve except that it reflects the level of miRNAs
(Supplementary Fig. 7): at high miRNA abundances, repression is 5-fold but decreases
with miRNA abundance until at the lowest miRNA levels target expression in wild-type
cells is virtually indistinguishable from that in the miRNA-free Dcr-/- cells.
The threshold in regulation by miRNA is determined by the level of the miRNA and by
the number and affinity of the target sites. Taking the case described above for regulation
by endogenous miR-20 in HeLa cells, the threshold transition occurs at approximately 60
target mRNAs per cell with seven typical sites in the 3' UTR at an endogenous level of
approximately 2,000 miR-20 molecules per cell (Supplementary Fig. 2, Supplementary
Fig. 8). Many of these miRNAs, as miRNP complexes, could be bound to the endogenous
miR-20 target mRNAs in the cell, leaving a limited pool for binding to the reporter
mRNAs. Since these experiments are done at steady state conditions, this suggests that
the miRNA system probably has limited capacity to accommodate increases in target
populations. These results are consistent with our ability to strongly suppress miR-20
regulation of the target reporter by adding high levels of miR-20 target sites in the form
of an exogenous sponge inhibitor (Ebert et al. 2007) (Supplementary Fig. 5). The sponge
phenomenon has been observed in multiple mammalian (Edbauer et al. 2010;
Starczynowski et al. 2010) and non-mammalian (Loya et al. 2009) organisms indicating
its generality in miRNA regulation.
Our analysis of miRNA-mediated gene regulation at high target expression levels is
consistent with previous population-based results, but measuring single cells offers a
level of detail inaccessible to bulk assays. The detailed picture, which revealed the
ultrasensitive response bounded by a high degree of repression at low target mRNA
levels and little repression at high levels of target mRNA, may have important
implications for miRNA-mediated regulation. There has been disparity between the
concept of miRNAs as switches, exemplified by the lin-14 developmental switch in
Caenorhabditis elegans where there is a high degree of repression by the miRNA lin-4,
versus many observations of miRNA-mediated regulation in mammalian cells where they
are best considered as fine-tuners of gene expression. These results show that for some
miRNA-target interactions, the miRNA behaves both as a switch, in the target expression
regime below the threshold, and as a fine-tuner, in the ultrasensitive transition between
the threshold and the minimal repression regime at high mRNA levels.
The target expression thresholds generated by miRNAs could be important in
development. Ultrasensitivity characterizes developmental switches such as cell fate
decisions. To maintain their identity, differentiated cells must be able to distinguish
between leaky and legitimate transcripts. In addition to participating in feedback and
feed-forward networks (Tsang et al. 2007, Stark et al. 2005), tissue-specific miRNAs
could use molecular titration to set a threshold below which transcripts would be treated
as leaky. Such a phenomenon is consistent with the reported tendency of Drosophila
miRNAs to target mRNAs that are highly expressed in neighboring tissues derived from a
common progenitor (Stark et al. 2005), and with the observed tendency of mammalian
miRNAs induced upon differentiation to target mRNAs that were highly expressed in the
previous developmental stage (Farh et al. 2005). The ultrasensitive transition would
minimize the range of uncertainty between leaky and legitimate messages. Decisive
on-off regulation of gene expression is necessary in differentiation and in the continual
reinforcement of cell/tissue identity throughout the life of the animal.
Methods
Reporter plasmid construction:
Fluorescent reporters were cloned into pTRE-Tight-BI (Clontech). NLS sequences
(ATGGGCCCTAAAAAGAAGCGTAAAGTC) were appended to the N-terminus of the
eYFP and mCherry ORFs (Clontech) by PCR. The NLS-eYFP was inserted with EcoRI
and NdeI. The NLS-mCherry was inserted with BamHI and ClaI. Regulatory elements
were placed into the eYFP 3' UTR with NdeI and XbaI; they were placed into the
mCherry 3' UTR with ClaI and EcoRV. N = 1 bulged miR-20 binding site
(TACCTGCACTCGCGCACTTTA) was appended by PCR. N= 4 and N= 7 miR-20
sites, separated by CCGG spacers, were PCR-amplified from miR-20 sponge constructs
(Ebert et al. 2007). All constructs were sequence-confirmed. HMGA2 w.t. and
seed-mutant 3' UTRs (Mayr et al. 2007) were a gift from Christine Mayr, David Bartel lab.
The SLC6A1 3' UTR fragment (nt 703-2041) was PCR-amplified from human genomic
DNA.
Generation of stable lines:
Reporter plasmids were linearized with AseI and cotransfected at 20:1 ratio with linear
puromycin marker (Clontech). Transfected cells were selected in 2.5 µg/ml puromycin
with 200 µg/ml G418. Individual eYFP-positive colonies were isolated, grown, and
sorted for eYFP-positivity upon dox induction (MoFlo instrument, DAKO-Cytomation).
Fluorescence microscopy:
Cells were plated on glass-bottomed Nunc chambers (#1), induced with dox for 4 days,
and imaged in a Nikon TEI-2000 inverted fluorescence microscope with a Princeton
Instruments Pixis back-cooled CCD camera. Images were processed using custom
software in MATLAB. Briefly, following subtraction of camera background and any
cellular autofluorescence, pixel values in both eYFP and mCherry channels
corresponding to cells expressing the construct were extracted. The single-cell data were
then binned along the eYFP axis. Figure 1d reports the result of this binning procedure;
the error bars are the standard errors of the mean within its corresponding bin.
Transient transfection:
Tet-On HeLa cells (Clontech) below passage 10 were plated in G418 (Gibco) 200 µg/ml
and doxycycline (Sigma) 1 µg/ml media in 12-well dishes the day before transfection.
Reporter plasmids were diluted 1:50 in pUC18b carrier plasmid (Qiagen HiSpeed
maxipreps) and mixed with DreamFect Gold (Oz Biosciences), 8 µl reagent and 2 µg
DNA per well. miR-20a, let-7b, and miR-218 mimics (Dharmacon) were cotransfected at
the indicated concentrations. For U6 sponge assays, reporter plasmids were diluted 1:50
in sponge plasmid. Media was changed 24hr post-transfection. Assays were performed
48hr post-transfection. Reporter transfections were also performed with Lipofectamine
2000 (Invitrogen) with the same results.
Flow cytometry:
Cells were run on LSRII analyzer (Becton Dickinson) with FACSDiva software. The raw
FACS data were analyzed with FlowJo to gate cells according to their forward (FSC-A)
and side (SSC-A) scatter profiles; specifically we chose cells near the peak of the
(FSC-A, SSC-A) distribution. Untransfected cells were used to characterize the cellular
autofluorescence in the LSRII analyzer from which we obtain the mean and standard
deviation of the autofluorescence distribution. The mean autofluorescence plus twice the
standard deviation was subtracted from each cell's eYFP and mCherry fluorescence
values. Following background subtraction, cells with eYFP fluorescence
levels less than 0 (i.e. indistinguishable from background) were excluded from further
analysis and mCherry fluorescence levels less than 0 were set equal to 0. The single-cell
data were then binned in the same manner as described above.
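
The background-subtraction rule described above can be written out as a short sketch. It is illustrative only and is not the analysis code used here: the function and argument names are hypothetical, and the use of the sample standard deviation (ddof=1) is an assumption, since the text does not specify the estimator.

```python
import numpy as np


def subtract_background(eyfp, mcherry, auto_eyfp, auto_mcherry, n_sd=2.0):
    """Subtract (mean + n_sd * SD) of the untransfected-cell autofluorescence
    from each channel, drop cells whose eYFP falls below 0, and set negative
    mCherry values to 0."""
    eyfp = np.asarray(eyfp, dtype=float)
    mcherry = np.asarray(mcherry, dtype=float)
    # Per-channel background estimated from untransfected cells.
    bg_eyfp = np.mean(auto_eyfp) + n_sd * np.std(auto_eyfp, ddof=1)
    bg_mcherry = np.mean(auto_mcherry) + n_sd * np.std(auto_mcherry, ddof=1)
    eyfp_corr = eyfp - bg_eyfp
    mcherry_corr = mcherry - bg_mcherry
    keep = eyfp_corr >= 0                      # indistinguishable from background otherwise
    return eyfp_corr[keep], np.clip(mcherry_corr[keep], 0.0, None)
```

Cells are filtered on the eYFP channel only, so that cells with detectable transcriptional activity but fully repressed mCherry are retained with an mCherry value of 0.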
Fluorescence-activated cell sorting:
Cells transfected with the N= 0 or N= 7 reporter were sorted 48hr post-transfection into
low and high fractions using a MoFlo high-speed sorting instrument
(DAKO-Cytomation). Cell pellets were washed and snap-frozen before RNA isolation.
RT-PCR:
Total RNA was harvested using RNeasy Micro Plus kit with the protocol modified for
inclusion of small RNAs (Qiagen). RNA was treated with DNaseI (Ambion) and
reverse-transcribed with oligo-dT primer using MMLV RTase (Ambion). qPCR for mCherry and
eYFP was performed in triplicate reactions using SYBRGreen mix (Applied Biosystems),
run on an Applied Biosystems 7500 Real-Time PCR instrument. Single-stranded DNA
standards spiked into untransfected cell cDNAs were used for estimation of mCherry
mRNAs per cell. miR-20 was measured with miScript RT-PCR assay (Qiagen) in
quadruplicate reactions using miR-31 and snoRNA as controls.
Small RNA Northern blot:
Total RNA was extracted from transfected cells with TRIzol (Invitrogen). 24 µg of total
RNA was run on 12% polyacrylamide gel (UreaGel system, National Diagnostics), with
miR-20 mimic as a standard, spiked into yeast sheared total RNA (Ambion). The blot
was probed for miR-20a and tRNAgn as a loading control. Quantitation of bands was
performed with ImageJ.
mES cell luciferase assays:
Reporters were constructed by insertion of two bulged binding sites into the 3' UTR of
CMV Renilla luciferase. Cells were transfected in triplicate in 24-well plates with 2 µl
Lipofectamine 2000 (Invitrogen), 0.01 µg of CMV-Renilla plasmid, 0.1 µg of pGL3
(Promega), and 0.69 µg of pWS (carrier plasmid). Cells were lysed and assayed 24hr
post-transfection by Dual Luciferase reporter assay (Promega) using a Glomax 20/20
luminometer (Promega).
References
Baek D, Villen J, Shin C, Camargo FD, Gygi SP, Bartel DP. The impact of microRNAs
on protein output. Nature 455, 64-71 (2008).
Bagga S, Bracht J, Hunter S, Massirer K, Holtz J, Eachus R, Pasquinelli AE. Regulation
by let-7 and lin-4 miRNAs results in target mRNA degradation. Cell 122, 553-563
(2005).
Bartel DP, Chen CZ. Micromanagers of gene expression: the potentially widespread
influence of metazoan microRNAs. Nat. Rev. Genet. 5, 396-400 (2004).
Bernstein E, Kim SY, Carmell MA, Murchison EP, Alcorn H, Li MZ, Mills AA, Elledge
SJ, Anderson KV, Hannon GJ. Dicer is essential for mouse development. Nat. Genet. 35,
215-217 (2003).
Buchler N, Louis M. Molecular titration and ultrasensitivity in regulatory networks. J.
Mol. Biol. 384, 1106-1119 (2008).
Calabrese JM, Seila AC, Yeo GW, Sharp PA. RNA sequence analysis defines Dicer's
role in mouse embryonic stem cells. Proc. Natl Acad. Sci. USA 104, 18097-18102
(2007).
Cimmino A, Calin GA, Fabbri M, Iorio MV, Ferracin M, Shimizu M, Wojcik SE,
Aqeilan RI, Zupo S, Dono M, Rassenti L, Alder H, Volinia S, Liu CG, Kipps TJ, Negrini
M, Croce CM. miR-15 and miR-16 induce apoptosis by targeting Bcl2. Proc. Natl Acad.
Sci. USA 102, 13944-13949 (2005).
Ebert MS, Neilson JR, Sharp PA. MicroRNA sponges: competitive inhibitors of small
RNAs in mammalian cells. Nat. Meth. 4, 721-726 (2007).
Edbauer D, Neilson JR, Foster KA, Wang CF, Seeburg DP, Batterton MN, Tada T, Dolan
BM, Sharp PA, Sheng M. Regulation of Synaptic Structure and Function by FMRP-Associated MicroRNAs miR-125b and miR-132. Neuron 65, 373-384 (2010).
Elf J, Paulsson J, Berg OG, Ehrenberg M. Near-critical phenomena in intracellular
metabolite pools. Biophys. J. 84, 154-170 (2003).
Esau C, Kang X, Peralta E, Hanson E, Marcusson EG, Ravichandran LV, Sun Y, Koo S,
Perera RJ, Jain R, Dean NM, Freier SM, Bennett CF, Lollo B, Griffey R. MicroRNA-143
regulates adipocyte differentiation. J. Biol. Chem. 279, 52361-52365 (2004).
Farh KK, Grimson A, Jan C, Lewis BP, Johnston WK, Lim LP, Burge CB, Bartel DP.
The widespread impact of mammalian microRNAs on mRNA repression and evolution.
Science 310, 1817-1821 (2005).
Friedman RC, Farh KK, Burge CB, Bartel DP. Most Mammalian mRNAs are conserved
targets of microRNAs. Genome Res. 19, 92-105 (2009).
Grimson A, Srivastava M, Fahey B, Woodcroft BJ, Chiang HR, King N, Degnan BM,
Rokhsar DS, Bartel DP. Early origins and evolution of microRNAs and Piwi-interacting
RNAs in animals. Nature 455, 1193-1197 (2008).
Levine E, Zhang Z, Kuhlman T, Hwa T. Quantitative characteristics of gene regulation
by small RNA. PLoS Biol. 5, e229 (2007).
Lewis BP, Burge CB, Bartel DP. Conserved seed pairing, often flanked by adenosines,
indicates that thousands of human genes are microRNA targets. Cell 120, 15-20 (2005).
Li X, Carthew RW. A microRNA mediates EGF receptor signaling and promotes
photoreceptor differentiation in the Drosophila eye. Cell 123, 1267-1277 (2005).
Li X, Cassidy JJ, Reinke CA, Fischboeck S, Carthew RW. A microRNA imparts
robustness against environmental fluctuation during development. Cell 137, 273-282
(2009).
Li Y, Wang F, Lee JA, Gao FB. MicroRNA-9a ensures the precise specification of
sensory organ precursors in Drosophila. Genes Dev. 20, 2793-2805 (2006).
Lim LP, Lau NC, Weinstein EG, Abdelhakim A, Yekta S, Rhoades MW, Burge CB,
Bartel DP. The microRNAs of Caenorhabditis elegans. Genes Dev. 17, 991-1008 (2003).
Loya CM, Lu CS, Van Vactor D, Fulga TA. Transgenic microRNA inhibition with
spatiotemporal specificity in intact organisms. Nat. Meth. 6, 897-903 (2009).
Mayr C, Hemann MT, Bartel DP. Disrupting the pairing between let-7 and Hmga2
enhances oncogenic transformation. Science 315, 1576-1579 (2007).
Raj A, van Oudenaarden A. Nature, nurture, or chance: stochastic gene expression and its
consequences. Cell 135, 216-226 (2008).
Selbach M, Schwanhausser B, Thierfelder N, Fang Z, Khanin R, Rajewsky N.
Widespread changes in protein synthesis induced by microRNAs. Nature 455, 58-63
(2008).
Sluijter JP, van Mil A, van Vliet P, Metz CH, Liu J, Doevendans PA, Goumans MJ.
MicroRNA-1 and -499 regulate differentiation and proliferation in human-derived
cardiomyocyte progenitor cells. Arterioscler. Thromb. Vasc. Biol. 30, 859-868 (2010).
Starczynowski DT, Kuchenbauer F, Argiropoulos B, Sung S, Morin R, Muranyi A, Hirst
M, Hogge D, Marra M, Wells RA, Buckstein R, Lam W, Humphries RK, Karsan A.
Identification of miR-145 and miR-146a as mediators of the 5q- syndrome phenotype.
Nat. Med. 16, 49-58 (2010).
Stark A, Brennecke J, Bushati N, Russell RB, Cohen SM. Animal microRNAs confer
robustness to gene expression and have a significant impact on 3' UTR evolution. Cell
123, 1133-1146 (2005).
Tsang J, Zhu J, van Oudenaarden A. MicroRNA-mediated feedback and feedforward
loops are recurrent network motifs in mammals. Mol. Cell 26, 753-767 (2007).
Yi R, Poy MN, Stoffel M, Fuchs E. A skin microRNA promotes differentiation by
repressing 'stemness'. Nature 452, 225-229 (2008).
Acknowledgments
This work was supported by an NIH Director's Pioneer Award to A.v.O.
(1DP1OD003936); and by United States Public Health Service grants R01-CA133404 from the National Institutes of Health, P01-CA42063 from the National
Cancer Institute and partially by Cancer Center Support (core) grant P30-CA14051 from
the National Cancer Institute (to P.A.S.). M.S.E. was supported by a HHMI Predoctoral
Fellowship and a Paul and Cleo Schimmel Scholarship. G.X.Z. and J.S.T. were partially
supported by Natural Sciences and Engineering Research Council of Canada Post
Graduate Scholarships. We thank Gregor Neuert for help with cloning the reporter genes,
Koch Institute flow cytometry staff for training and cell sorting, and David Bartel for
helpful discussions.
Author Contributions
M.S.E., J.S.T., P.A.S. and A.v.O. conceived the project. M.S.E., S.M. and G.X.Z.
performed the experiments. S.M. and M.S.E. analyzed the data. S.M. performed the
modeling. S.M., M.S.E., A.v.O. and P.A.S. interpreted the results and wrote the paper.
Figures
Figure 1: Quantitative fluorescence microscopy reveals microRNA-mediated gene
expression threshold.
a. The two-color fluorescent reporter construct consists of a bidirectional Tet promoter that coregulates the enhanced yellow fluorescent protein (eYFP) and mCherry. Each fluorescent protein
is tagged with a nuclear localization sequence (NLS) to aid in image analysis. The 3' UTR of the
mCherry gene is engineered to contain N binding sites for the microRNA miR-20.
b. Sample fluorescence microscopy data from representative single cells stably expressing eYFP
and mCherry both in the presence and absence of regulation of mCherry by miR-20. The cells are
arranged according to eYFP intensity.
c. Transfer function relating eYFP to mCherry generated by binning according to eYFP intensity
and plotting the mean mCherry in each bin (see Supplementary Fig. 1).
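[Figure 1 image. Panel a: schematic of the bidirectional PTRE-Tight promoter driving eYFP and mCherry, with N miR-20 binding site(s), sequence (TACCTGCACTCGCGCACTTTA)N, in the mCherry 3' UTR. Panel c: mean mCherry plotted against eYFP (a.u., 0 to 100) for N = 0 and N = 1.]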
Figure 2: Biochemical model of microRNA-mediated gene regulation.
a. The model describes the steady-state level of mRNA free to be translated (r), which we
experimentally observe as the mCherry signal, as a function of transcriptional activity (r_untargeted),
which we experimentally observe as eYFP. miRNA and mRNA bind with rate k_on, unbind with
rate k_off, and result in mRNA decay, but not miRNA decay, with rate γ_R*.
b. Steady-state solutions for r as a function of r_untargeted for various values of k_on.
c. Steady-state solutions for r as a function of r_untargeted for various values of [miRNA]total.
d, e. Same as b and c except depicted in log-log axes. The slope of the log-log curve is known as
the logarithmic gain. Notably, thresholds in the linear representation appear as segments with
logarithmic gain greater than 1 in the log-log representation. Increasing k_on increases the
maximum logarithmic gain, but does not change its position along the r_untargeted axis, while
increasing [miRNA]total increases the maximum logarithmic gain and shifts it to higher levels of
r_untargeted.
[Figure 2 image. Panel a: reaction scheme showing transcription (k_R), free mRNA (r), translation, the miRNA-mRNA complex (r*), and decay of the complex (γ_R*). Panels b-e: r plotted against r_untargeted on linear axes (b, c) and on log10 axes (d, e).]
Figure 3: Modulating the ultrasensitive response.
a. Log-log transfer functions for N = 0, 1, 4, and 7. Additionally, we can abolish the ultrasensitive
response by using a miR-20 binding site that is perfectly complementary to miR-20.
b. Ratio of N = 0 transfer function to N= 1, 4, and 7 transfer functions, depicting the fold
repression as a function of eYFP expression.
c, d. Effects of titrating defined amounts of miR-20 mimic siRNA on the transfer function for N =
4 (c) and N = 7 (d).
e, f. Following simultaneous fitting of all transfer function data to the quantitative model, the
fitting parameter θ, proportional to the total amount of active miR-20 in the cell, is plotted against
the amount of miR-20 mimic transfected (e), and 1/λ, proportional to the rate of mCherry-miR-20
association, is plotted against N (f).
[Figure 3 image (Mukherji et al., Fig. 3), panels a-f. Axis labels recoverable from the original: eYFP (x 10^4), log10(eYFP), and [miRNA] transfected (nM, 0 to 30). Legends distinguish the N values and the miR-20 mimic concentrations.]
Supplementary Information
Figure S1: Data processing.
a. Each cell's raw eYFP and mCherry intensities, either from fluorescence microscopy (not
shown) or flow cytometry (shown here), are plotted.
b. Then the background, autofluorescent levels of eYFP and mCherry are subtracted. The
background-corrected correlation data are then binned according to eYFP levels, and for each
eYFP bin its mean mCherry signal is calculated; this binned curve is depicted in c.
[Figure S1 image (Mukherji et al., Fig. S1). Flow diagram: raw single-cell data, then "subtract autofluorescent background", then "calculate mean mCherry in each eYFP bin"; the x-axes are log10(eYFP).]
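The processing steps in Figure S1 can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code used for the figures: the function name, the logarithmically spaced bins, and the default of 20 bins are assumptions made here, and the background levels are taken as given constants rather than estimated from untransfected cells.

```python
import numpy as np

def transfer_function(eyfp, mcherry, eyfp_bg=0.0, mcherry_bg=0.0, n_bins=20):
    """Bin background-corrected cells by eYFP and return the mean mCherry per bin.

    Hypothetical helper: log-spaced bins and n_bins=20 are illustrative choices.
    """
    # Step 1: subtract the autofluorescent background from each channel.
    eyfp = np.asarray(eyfp, dtype=float) - eyfp_bg
    mcherry = np.asarray(mcherry, dtype=float) - mcherry_bg

    # Log-spaced bins require strictly positive eYFP values.
    keep = eyfp > 0
    eyfp, mcherry = eyfp[keep], mcherry[keep]

    # Step 2: assign every cell to an eYFP bin.
    edges = np.logspace(np.log10(eyfp.min()), np.log10(eyfp.max()), n_bins + 1)
    idx = np.clip(np.digitize(eyfp, edges) - 1, 0, n_bins - 1)

    # Step 3: mean mCherry in each bin (NaN where a bin is empty).
    centers = np.sqrt(edges[:-1] * edges[1:])  # geometric bin centers
    means = np.array([mcherry[idx == b].mean() if np.any(idx == b) else np.nan
                      for b in range(n_bins)])
    return centers, means
```

Plotting `means` against `centers` on log-log axes would give a binned curve of the kind shown in panel c.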
Figure S2: miR-20 expression in Tet-On HeLa cells.
a. Absolute miR-20 expression measured by northern blot. Total RNA from Tet-On HeLa cells
transfected with various reporter constructs was probed for miR-20 expression compared to a
standard curve of miR-20 mimic spiked into yeast RNA. tRNAGln serves as a loading control.
b. Relative miR-20 expression above and below the threshold measured by RT-PCR. Cells
transfected with the N= 7 target reporter or the N= 0 control reporter were sorted into low and
high fractions. Total RNA was assayed for miR-20 and normalized to miR-31 as a loading
control. Bar height and error bars represent the average relative normalized miR-20 value in the
high fraction compared to the low fraction and the s.e.m. of three RT-PCR assays.
[Figure S2 image (Mukherji et al., Fig. S2). Panel a: northern blot with miR-20 mimic standards labeled in copies/cell, 20 nt and 30 nt size markers, and miR-20a and tRNAGln bands. Panel b: bar chart of relative miR-20 level for the N = 0 and N = 7 reporters.]
Figure S3: Dye swap control.
The binding site region from the N = 7 reporter was fused onto the 3' UTR of eYFP instead of
mCherry. Cells transfected with this construct were assayed by flow cytometry at 48hr post-transfection.
[Figure S3 image. Log-log transfer function for the N = 7 dye-swap reporter plotted against log10(eYFP), with a reference line of slope = 1.]
Figure S4: Average fold repression as a function of N.
Using the flow cytometry data, we compute the ratio of the mean eYFP level to the mean
mCherry level for N= 1, 4, and 7. We then normalize this ratio by the mean eYFP to mean
mCherry ratio for N= 0; we refer to this normalized ratio as the fold repression. Error bars are
estimated by bootstrapping.
[Figure S4 image. Bar chart of average fold repression for N = 1, N = 4, and N = 7.]
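The fold-repression calculation described for Figure S4 can be sketched as follows. This is an illustrative reading of the legend, not the original analysis script: the function names are hypothetical, and the choices of 1000 resamples, resampling whole cells so that each eYFP value stays paired with its mCherry value, and reporting the bootstrap standard deviation as the error bar are assumptions.

```python
import numpy as np

def fold_repression(eyfp_n, mcherry_n, eyfp_0, mcherry_0):
    """(mean eYFP / mean mCherry) for the N-site reporter, normalized to N = 0."""
    ratio_n = np.mean(eyfp_n) / np.mean(mcherry_n)
    ratio_0 = np.mean(eyfp_0) / np.mean(mcherry_0)
    return ratio_n / ratio_0

def bootstrap_fold_repression(eyfp_n, mcherry_n, eyfp_0, mcherry_0,
                              n_boot=1000, seed=0):
    """Return the point estimate and a bootstrap standard error.

    Assumption: cells are resampled with replacement, independently for the
    targeted and control populations, keeping each cell's two channels paired.
    """
    eyfp_n, mcherry_n = np.asarray(eyfp_n, float), np.asarray(mcherry_n, float)
    eyfp_0, mcherry_0 = np.asarray(eyfp_0, float), np.asarray(mcherry_0, float)
    rng = np.random.default_rng(seed)
    estimates = np.empty(n_boot)
    for i in range(n_boot):
        i_n = rng.integers(0, len(eyfp_n), len(eyfp_n))
        i_0 = rng.integers(0, len(eyfp_0), len(eyfp_0))
        estimates[i] = fold_repression(eyfp_n[i_n], mcherry_n[i_n],
                                       eyfp_0[i_0], mcherry_0[i_0])
    point = fold_repression(eyfp_n, mcherry_n, eyfp_0, mcherry_0)
    return point, estimates.std(ddof=1)
```

Because the ratio is taken between population means, this estimate weights highly expressing cells more heavily than a per-cell ratio would, which is worth keeping in mind when comparing it with the eYFP-resolved fold repression in Figure 3b.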
Figure S5: Inhibition of endogenous miR-20 using miRNA sponges.
Reporter plasmids were cotransfected with Pol III-driven sponges containing seven CXCR4
control sites or seven bulged miR-20 binding sites. Samples were assayed by flow cytometry at
48hr post-transfection.
a. N= 0 reporter.
b. N = 1 reporter.
c. N = 1 perfect reporter.
d. N= 4 reporter.
e. N = 7 reporter.
[Figure S5 image (Mukherji et al., Fig. S5), panels a-e. Each panel shows log-log transfer functions plotted against log10(eYFP) for one reporter, comparing "+ control sponge" with "+ miR-20 sponge".]
Figure S6: Ultrasensitivity in endogenous 3' UTRs.
a. The 3' UTR of HMGA2 or a version with the seven let-7 seed matches mutated was fused to
mCherry. The reporters were cotransfected with varying concentrations of let-7b mimic. Cells
were assayed by flow cytometry 48hr post-transfection.
b. The 3' UTR of SLC6A1, which contains three seed matches for miR-218, was fused to
mCherry. The reporter was transfected with or without miR-218 mimic. Cells were assayed by
flow cytometry 48hr post-transfection.
[Figure S6 image (Mukherji et al., Fig. S6). Log-log transfer functions plotted against log10(eYFP). Panel a legend: mutant HMGA2 3' UTR; HMGA2 3' UTR; HMGA2 3' UTR + 10 nM, 31 nM, or 100 nM let-7 mimic. Panel b legend: SLC6A1 3' UTR with or without 30 nM miR-218 mimic.]
Figure S7: Fold repression as a function of microRNA abundance.
a. Schematic depicting dual luciferase assay used to measure fold repression in mES cells. The 3'
UTR of Renilla luciferase is re-engineered for each measurement to contain two binding sites for
different miRNAs.
b. Fold repression as a function of miRNA concentration in copies per cell.
[Figure S7 image (Mukherji et al., Fig. S7). Panel a: schematic of the N = 2 reporter. R-luc carrying 2 bulged miRNA sites or CXCR4 control sites in its 3' UTR is transfected, with F-luc as the loading control; expression of the construct with miRNA sites is measured relative to the construct with CXCR4 control sites, and fold repression is defined as a ratio of two relative expression values (the labels of the two conditions are not recoverable from the extracted text). Panel b: fold repression plotted against [miRNA] (x 10^3 per cell, 0 to 14).]
Figure S8: mRNA quantitation above and below the threshold.
a. Cells transfected with the N = 7 target reporter were sorted into low and high fractions
separated by the ultrasensitive transition. Plots are in log-log scale.
b. Corresponding low and high fractions were collected from cells transfected with the N = 0
control reporter.
c. RT-PCR from the N= 0 control reporter's sorted fractions. The absolute mCherry mRNA
levels were estimated by making a standard curve using a DNA oligo spanning the mCherry
cDNA's amplicon, spiked into cDNA from untransfected cells. Shown are the average mCherry
mRNAs per cell +/- s.e.m. from three RT-PCR assays. The threshold for the N = 7 target reporter
is represented by the average mCherry mRNAs per cell present in the corresponding low fraction
of the untargeted control reporter.
d. mRNA knockdown above and below the threshold. The relative mCherry mRNA expression in
the N= 0 and N = 7 low and high fractions was calculated by normalizing the mCherry RT-PCR
signal to the eYFP RT-PCR signal in each fraction. Bar height and error bars indicate the average
relative mCherry to eYFP value and s.e.m. of four RT-PCR assays.
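[Figure S8 image (Mukherji et al., Fig. S8). Panels a, b: sorting gates for the low and high fractions on log-log flow cytometry plots. Panel d: bar chart of relative mCherry mRNA for the N = 0 low, N = 7 low, N = 0 high, and N = 7 high fractions.]
Panel c values:
Fraction      mRNAs per cell
N = 0 low     56 +/- 36
N = 0 high    1066 +/- 472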
Molecular titration model of miRNA-mediated gene regulation
In order to describe our data, we devised a simple mathematical model of the biochemistry of
miRNA-mediated gene regulation. The model is largely similar to models of protein-protein
interactions proposed by Buchler and Louis as well as models of sRNA regulation of expression
proposed by Levine et al. The model describes the time evolution of the target mRNA free of
miRNA (r) and the target mRNA bound by miRNA (r*) and assumes that the turnover of miRNA
is slow compared to the timescale of gene expression so that it can be held constant. The model
consists of the following set of coupled, first-order, ordinary differential equations and the
conservation relation for miRNA:
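$$\frac{dr}{dt} = k_R - k_{on}\, r\,[\mathrm{miRNA}] + k_{off}\, r^* - \gamma_R\, r \qquad (1)$$

$$\frac{dr^*}{dt} = k_{on}\, r\,[\mathrm{miRNA}] - k_{off}\, r^* - \gamma_{R^*}\, r^* \qquad (2)$$

$$[\mathrm{miRNA}]_T = [\mathrm{miRNA}] + r^* \qquad (3)$$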
For the sake of simplicity, we assume that no translation can occur from the miRNA-bound target
mRNA such that for the purposes of protein production it is sufficient to track only the free target
mRNA (r). Solving for the steady-state level of r yields:
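$$r = \frac{1}{2}\left[ r_{untargeted} - \lambda - \theta + \sqrt{\left(r_{untargeted} - \lambda - \theta\right)^2 + 4\lambda\, r_{untargeted}} \right] \qquad (4)$$

where:

$$r_{untargeted} = \frac{k_R}{\gamma_R}, \qquad \lambda = \frac{\gamma_{R^*} + k_{off}}{k_{on}}, \qquad \theta = \frac{\gamma_{R^*}}{\gamma_R}\,[\mathrm{miRNA}]_{total} \qquad (5)$$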
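The intermediate algebra between Equations 1-3 and Equation 4 is not spelled out in the text. One way to fill it in, offered as a sketch rather than as the original derivation, is to set both time derivatives to zero, use Equation 2 together with the conservation relation to eliminate [miRNA] and r*, and use the sum of Equations 1 and 2 to relate r to r_untargeted = k_R/γ_R. Here λ = (γ_R* + k_off)/k_on and θ = (γ_R*/γ_R)[miRNA]_T.

$$\begin{aligned}
0 &= k_{on}\, r\left([\mathrm{miRNA}]_T - r^*\right) - \left(k_{off} + \gamma_{R^*}\right) r^*
  &&\Rightarrow\quad r^* = \frac{r\,[\mathrm{miRNA}]_T}{\lambda + r} \\
0 &= k_R - \gamma_R\, r - \gamma_{R^*}\, r^*
  &&\Rightarrow\quad r_{untargeted} - r - \frac{\theta\, r}{\lambda + r} = 0 \\
0 &= r^2 - \left(r_{untargeted} - \lambda - \theta\right) r - \lambda\, r_{untargeted}
\end{aligned}$$

The last line is a quadratic in r whose product of roots, -λ r_untargeted, is negative, so exactly one root is positive; taking that root gives the expression in Equation 4.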
Just as in the Buchler and Louis and Levine et al. cases, when the dissociation constant (here
denoted by λ) is small, meaning that the interaction strength is high between the miRNA and its
target, it is possible to achieve a threshold-linear relationship between the free target mRNA and
the total amount of mRNA (denoted by r_untargeted, which in the experiments is reported by the
eYFP signal). In our case, because we allow recycling of the miRNA following destruction of its
bound target mRNA, the titration effect only becomes apparent when the rate at which free
miRNAs are removed from the system (k_on) is much larger than the rate at which they reappear in
the system, which itself consists of two parts: unbinding of the miRNA from its target (k_off) and
destruction of the target (γ_R*). In the most extreme case, for example, where k_on >> k_off + γ_R* such
that λ -> 0, one obtains:
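$$r = \frac{1}{2}\left[ r_{untargeted} - \theta + \sqrt{\left(r_{untargeted} - \theta\right)^2} \right] \qquad (6)$$

$$r = \begin{cases} 0 & \text{if } r_{untargeted} < \theta \\ r_{untargeted} - \theta & \text{if } r_{untargeted} > \theta \end{cases} \qquad (7)$$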
In this limit, we see that the constant θ sets the level of expression at which the threshold takes
place.
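A short numerical sketch can make the threshold-linear limit concrete. The functions below evaluate the steady-state expression for r, compare it with the λ -> 0 limit, and estimate the logarithmic gain by finite differences. The function names and the parameter values (θ = 100 and λ = 0.1, in arbitrary units) are illustrative assumptions, not fitted values from this work.

```python
import numpy as np

def free_mrna(r_untargeted, lam, theta):
    """Steady-state free target mRNA r as a function of r_untargeted."""
    b = r_untargeted - lam - theta
    return 0.5 * (b + np.sqrt(b**2 + 4.0 * lam * r_untargeted))

def log_gain(r_untargeted, lam, theta, eps=1e-4):
    """Slope of log(r) versus log(r_untargeted), by central difference."""
    up = free_mrna(r_untargeted * (1.0 + eps), lam, theta)
    down = free_mrna(r_untargeted * (1.0 - eps), lam, theta)
    return (np.log(up) - np.log(down)) / (np.log(1.0 + eps) - np.log(1.0 - eps))

# Illustrative parameters only (arbitrary units), chosen so that lam << theta.
theta, lam = 100.0, 0.1
x = np.array([10.0, 50.0, 100.0, 200.0, 400.0])   # r_untargeted values

r = free_mrna(x, lam, theta)
r_limit = np.maximum(x - theta, 0.0)              # threshold-linear limit
gain = log_gain(x, lam, theta)

for xi, ri, li, gi in zip(x, r, r_limit, gain):
    print(f"r_untargeted={xi:6.1f}  r={ri:9.4f}  limit={li:6.1f}  log gain={gi:6.2f}")
```

With these values the full solution stays close to the limiting form except in a narrow window around r_untargeted = θ, and the logarithmic gain is largest near that point, consistent with the description of thresholds as segments of logarithmic gain greater than 1.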
Chapter 5. Roles for microRNAs in conferring robustness to
biological processes
This chapter was written by Margaret S. Ebert and edited by Phillip A. Sharp.
Biological systems use a variety of mechanisms to maintain their functions in the
face of environmental and genetic perturbations. Increasing evidence suggests that,
among their roles as post-transcriptional repressors of gene expression, microRNAs
(miRNAs) help to confer robustness to gene expression by reinforcing
transcriptional programs and attenuating leaky transcripts, and they may in some
contexts help suppress random fluctuations in transcript copy number. These
activities have important consequences for normal development and physiology,
disease, and evolution. Here we will discuss examples and principles of miRNAs
acting in networks that contribute to robustness in several animal systems.
Introduction
microRNAs (miRNAs) are ~20-24-nucleotide-long hairpin-derived RNAs that post-transcriptionally repress the expression of target genes. As a class, miRNAs constitute
about 1-2% of the genes in worms, flies, and mammals (Bartel 2009). About 60% of
protein-coding genes are computationally predicted as targets based on conserved base-pairing between the 3' UTR and the 5' region of the miRNA termed the seed (Friedman et
al. 2009). The diversity of miRNA expression increases over the course of embryonic
development (Thomson et al. 2006), and the diversity of the miRNA repertoire in animal
genomes has increased with increasing organismal complexity (Lee et al. 2007,
Heimberg et al. 2008). While many miRNAs and their target binding sites are deeply
conserved, suggesting important function, many of these interactions seem to produce
only very subtle repression (~2-fold), and many miRNAs can be knocked out without
creating any obvious phenotype (Leaman et al. 2005, Miska et al. 2007). As more
miRNA-target relationships are validated and more phenotypes are described, a view is
emerging that miRNAs evolved to play the role not of the primary decision-maker but
rather of the reinforcer, one that sharpens transitions and entrenches identities.
Robustness refers to a system's ability to maintain its function in spite of internal or
external perturbations (Kitano 2004). In biology, such systems can be considered at
several levels: a biochemical pathway regulating the expression of a protein, a cluster of
cells undergoing differentiation, or an organism responding to variable nutrient sources,
for example. All of these biological systems, like sophisticated man-made systems, use
controls such as feedback loops and back-up components to be able to carry on reliably
when conditions change or one component fails. It is clear that animals living in the wild
face unpredictable environments such as fluctuations in temperature or food availability,
though we may lose sight of this aspect of biology when we grow our model organisms
under standard, consistent laboratory conditions. The production of macromolecules
within cells over time and between different cells of the same type also suffers from
inherent noisiness that must be managed for biochemical pathways to function robustly.
These requirements are especially relevant to the development and physiology of
multicellular organisms with complex body plans. Not only must embryonic cells choose
many different fates, but they must also remember their choice to maintain their cell type
identity in the adult. From the cell's perspective, environment may mean one of many
microenvironments within the organism, such as different regions along a morphogen
gradient during embryonic development, which must be sensed and interpreted for
normal morphogenesis.
Why might miRNAs, in addition to other regulators of gene expression, have been
selected for making biological systems more robust? As post-transcriptional regulators,
miRNAs can intervene late in the pathway of gene expression to counteract variation
from the upstream processes of transcription and splicing. Their mechanisms of target
repression may also be specifically suited for various types of regulation: by accelerating
mRNA degradation, they swiftly and irreversibly reduce target protein production; by
inhibiting translation, they allow for temporary silencing followed by restoration of the
target message to a translationally competent state (Bhattacharyya et al. 2006). As
titrating molecules for mRNAs, miRNAs partition among thousands of targets in
equilibrium association to stabilize protein expression. In the sections that follow, we will
discuss the roles miRNAs play in gene regulatory networks, and specifically in
dampening leaky transcripts and buffering the effects of mRNA fluctuations. These
sections will shed light on the contributions of miRNAs during periods of stress and
pathological states.
miRNA-target architectures that increase robustness
miRNAs participate in several stereotyped network motifs that are enriched in nature and
known to act in making systems robust (Milo et al. 2002). One is the simple negative
feedback loop, in which component A activates component B and component B inhibits
component A. This motif contributes to the homeostasis of component A (and component
B). For example, methyl CpG-binding protein 2 (MeCP2) acts through BDNF to induce
the neuronal miRNA miR-132, which feeds back to repress MeCP2 (Klein et al. 2007;
Figure 1A). Homeostasis in the level of MeCP2 expression is important, as over- or
under-expression of this regulator causes neurodevelopmental defects.
One of the most common feedback motifs known to involve miRNAs is the mutual
negative feedback loop, in which components A and B inhibit each other (Chang et al.
2004, Bracken et al. 2008, Burk et al. 2008, Juan et al. 2009, Kefas et al. 2009, Roush
and Slack 2009, Xu et al. 2009, Zhao et al. 2009). Typically this motif helps to establish
bistability between a precursor cell type and a terminally differentiated cell fate. For
example, the transcription factor NFI-A suppresses expression of the primary miR-223
transcript in undifferentiated myeloid precursors, and upon retinoic acid-induced
differentiation into granulocytes, miR-223 accumulates and represses NFI-A, thereby
helping to prevent a return to the precursor state (Fazi et al. 2005; Figure 1B).
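The bistability that a mutual negative feedback loop can produce is easy to see in a toy simulation. The sketch below integrates a generic two-component mutual-repression model with Hill-type repression; it is not a model of NFI-A and miR-223 specifically, and the equations, the function name, and the parameter values (alpha = 4, Hill coefficient n = 2) are assumptions chosen only to illustrate the motif.

```python
def simulate_toggle(a0, b0, alpha=4.0, n=2, dt=0.01, steps=5000):
    """Euler integration of a generic mutual-repression motif.

    da/dt = alpha / (1 + b**n) - a
    db/dt = alpha / (1 + a**n) - b
    All quantities are dimensionless; parameters are illustrative.
    """
    a, b = a0, b0
    for _ in range(steps):
        da = alpha / (1.0 + b**n) - a
        db = alpha / (1.0 + a**n) - b
        a, b = a + dt * da, b + dt * db
    return a, b

# Two starting points that differ only in which component has the initial edge.
a_hi, b_lo = simulate_toggle(2.0, 0.5)   # A starts ahead
a_lo, b_hi = simulate_toggle(0.5, 2.0)   # B starts ahead
print(f"A ahead at start: A={a_hi:.2f}, B={b_lo:.2f}")
print(f"B ahead at start: A={a_lo:.2f}, B={b_hi:.2f}")
```

In this toy system, whichever component starts ahead ends up dominant and holds the other at a low level, which is the sense in which the motif can lock in one of two states once a differentiation signal has tipped the balance.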
Positive feedback loops, in which component A and component B activate each other,
also contribute to switches in development. For example, the "2 degrees" vulval
precursor cell fate is established in the worm when LIN-12 directly activates transcription
of miR-61, which then represses vav-1, a negative regulator of LIN-12 activity (Yoo and
Greenwald, 2005; Figure 1C). In this case the indirect link may build additional control
into the lineage decision, as LIN-12 expression must be sustained enough for miR-61 to
accumulate to sufficiently lower the level of VAV-1 protein in order to allow for adequate
LIN-12 activity.
Feedforward loops are another set of common motifs that involve miRNAs and are
consistent with conferring robustness. In a coherent feed-forward loop, component A
inhibits (or activates) component B and activates (or inhibits, respectively) component C,
which is another repressor of component B. This architecture can increase the fidelity of
inhibition of the downstream component by acting on it redundantly; that is, a transient
loss of component A can be compensated for by the lingering presence of component C.
As with positive and negative feedback loops, it is often used in lineage commitment. For
example, CCAAT enhancer binding protein alpha (C/EBPalpha) inhibits transcription of
the cell cycle regulator E2F1 during granulopoiesis (Pulikkan et al. 2010). C/EBPalpha
also induces miR-223, which post-transcriptionally represses E2F1. As is often the case,
this feedforward loop is interlocked with a feedback loop: E2F1 inhibits production of
miR-223 (Figure 1D). This example illustrates several principles of miRNA networks in
development: 1, In these loops, the miRNA often targets a transcriptional regulator. 2,
Combining feedforward with feedback motifs may allow cells to distinguish between
transient fluctuations (which should be counteracted) and permanent changes (which
should be enhanced or maintained). 3, There are often other network motifs involving a
cell type-specific miRNA that reinforce the same cell fate decision, as with miR-223 and
NFI-A in granulocytes (see above).
Incoherent feedforward loops also comprise three components, but instead of reinforcing
a signal, the added component sends a contradictory signal. Component A activates
component B and simultaneously activates component C, which is a repressor of
component B. This motif can play several roles depending on the relative rates of
production and turnover of its components. Where component C is slower to accumulate
or decay compared to the downstream component B, the feedforward can create a pulse
(or, where component A is an inhibitor of both B and C, a delay) in the expression of
component B. Where both component C and B respond on the same timescale, the
feedforward can buffer the expression of component B against fluctuations in component
A. Overall, it fine-tunes the target protein level below the level set by transcriptional
control. One example of an incoherent feedforward loop is c-myc activating transcription
of both E2F1 and the 17~92 miRNA cluster on chromosome 13 (O'Donnell et al. 2005).
The constituent seed family members miR-17-5p and -20 directly repress E2F1 (Figure
1E). There are also a few cases of a pure incoherent feedforward loop wherein a miRNA
expressed from an intron of a host gene targets that very gene, e.g. miR-26a and its
host/target CTDSP2 (Tsang et al. 2007). Finally, the incoherent feedforward motif can
act indirectly to disambiguate a signaling pathway whose components are produced in
different cells: the chemorepellent axon guidance ligand Slit is the host gene for miR-218,
and miR-218 targets the ROBO receptors for Slit (Tie et al. 2010). Thus expression
of Slit leads to signaling through the ROBO pathway, but in the same cell, the co-expressed miRNA represses the ROBO pathway. During neural development, this
feedforward control could prevent the Slit-expressing cell from sending a repellent signal
to itself or from wasting its secreted ligand on its own cell-surface ROBO receptors,
thereby making the paracrine signaling more robust.
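The buffering role of the incoherent feedforward loop described above can be illustrated with a steady-state toy calculation. In this sketch, component A drives production of B in proportion to its own level and also drives a miRNA-like repressor C in proportion to its own level, and C reduces B output through a simple saturating repression term. The functional form, the function name, and the parameters (beta, c, k) are assumptions for illustration only and are not taken from any of the cited studies.

```python
def b_steady_state(a, beta=1.0, c=1.0, k=1.0, with_mirna=True):
    """Steady-state level of B for a given level of the upstream activator A.

    Production of B is beta * a. The repressor C is present at level c * a and
    divides B output by (1 + C / k). Setting with_mirna=False removes the
    repressor arm, leaving B directly proportional to A.
    """
    repression = 1.0 + (c * a / k if with_mirna else 0.0)
    return beta * a / repression

# Response of B to a two-fold fluctuation in A (from 5 to 10, arbitrary units).
fold_with = b_steady_state(10.0) / b_steady_state(5.0)
fold_without = (b_steady_state(10.0, with_mirna=False)
                / b_steady_state(5.0, with_mirna=False))
print(f"fold change in B with the miRNA arm:    {fold_with:.3f}")
print(f"fold change in B without the miRNA arm: {fold_without:.3f}")
```

Under these assumptions a two-fold change in A is passed on in full when the repressor arm is absent but is strongly attenuated when it is present, at the cost of a lower absolute level of B, which matches the statement that the motif fine-tunes the target protein level below the level set by transcriptional control.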
In addition to miRNAs' roles in feedback and feedforward loops, it is common for a
single miRNA family to coordinately regulate multiple components of a signaling
pathway or protein complex (Tsang et al. 2010; see Appendix). Where the target
components act coherently to stimulate or suppress the signaling output, the miRNA
could act as a master regulator to enhance a decisive signal or tightly shut off signaling.
For example, miR-181 regulates T cell signaling by repressing multiple phosphatases in
the cascade: targets SHP-2 and PTPN22 act immediately downstream of the T cell
receptor complex, with the latter inhibiting Lck kinase; targets DUSP6 and DUSP5 act
late in the pathway to inhibit phosphoErk in the cytoplasm and in the nucleus,
respectively (Li et al. 2007; Figure 1F). In this way, miR-181 helps to set different
activation thresholds in different stages of T cell development. In double positive T cells,
miR-181 level is relatively high, which heightens sensitivity even to low-affinity self
antigens and therefore facilitates positive and negative selection. In more mature
differentiated T cells, miR-181 level is downregulated and the signaling pathway is only
responsive to high-affinity foreign antigens.
miRNA-target networks are highly connected not only due to coordinate regulation of
interacting targets by a single miRNA, but also due to the co-targeting of common genes
by combinations of miRNAs (Tsang et al. 2010). Different miRNA seeds regulate
overlapping sets of target genes, with each miRNA backing up the other for a given
shared target. There is also redundancy in the miRNAs themselves: many have
functionally redundant seed family members expressed from the same polycistronic
cluster or from distant genomic loci (Kim 2005).
miRNAs attenuate leaky transcripts
The regulated expression of miRNAs can provide robustness to developmental processes.
Global gene expression analysis in fly, fish, and mouse has shown that miRNAs and their
targets tend to have anticorrelated RNA expression across tissues, especially in
neighboring tissues derived from common progenitors (Stark et al. 2005, Farh et al. 2005,
Sood et al. 2006, Tsang et al. 2007, Shkumatava et al. 2009). This suggests that miRNAs
can act to reinforce the transcriptional gene expression program by repressing leaky
transcripts. For example, in the Drosophila embryo, neurectodermal progenitors express
miR-124 as they differentiate into neurons. Neuronal genes that are activated during this
transition tend not to have miR-124 sites whereas genes expressed in epidermal tissues
that are also ectodermal derivatives are enriched for miR-124 sites (Stark et al. 2005).
Thus expression of miR-124 stabilizes the neuronal transition. A reciprocal pattern holds
for the ectoderm-specific miR-9a. These miRNA-target relationships likely involve many
instances of the coherent feedforward loop described above, with tissue-specific
transcription factors in the role of component A. Even amongst closely related cell types
there can be an inverse relationship: the miR-124 target repo is expressed only in lateral
glia of the central nervous system, cells shown to apparently lack miR-124 expression
(Stark et al. 2005). Intriguingly, this anticorrelative pattern may apply not only to
transcription but also to alternative splicing: a non-muscle-specific isoform of
tropomyosin-1 is targeted by the muscle-specific miRNA miR-1 whereas the three
muscle-expressed isoforms lack miR-1 sites, a trend that is conserved in vertebrates
(Stark et al. 2005). Thus a mis-splicing event that generated the cytoplasmic
gut/brain/epidermis isoform in muscle cells would be corrected by miRNA-mediated
repression.
For the anticorrelated expression trend to have arisen over the course of evolution, there
must have been selective pressure in the form of biological variation or noise at the level
of transcription or alternative splicing of functionally important genes. Leakiness may be
a necessary tradeoff for cells that are differentiated from plastic progenitors. In the
context where miRNAs continuously reinforce cell type identity once differentiation is
initiated, miRNAs would serve best as long-lived molecules. Indeed some miRNAs show
extreme stability compared to mRNAs, such as the heart muscle-specific miR-208, which
has a half-life of about 12 days in vivo (van Rooij et al. 2007).
Controlling leaky transcripts may be especially important when transient spikes in
mRNA level are enhanced by positive feedback loops. This appears to be the function of
a miRNA that acts as a gatekeeper for sensory organ precursor (SOP) determination in
flies. miR-9a targets the proneural transcription factor Senseless (Li et al. 2006).
Normally, only one cell in a proneural cluster becomes a SOP; it arises when a transient
and apparently random increase in Senseless protein feeds back positively through other
proneural genes (Figure 2). Lateral inhibition via Delta-Notch signaling maintains the
neighboring cells in a non-SOP fate by repressing Senseless and its downstream
proneural genes. miR-9a is expressed in all the cells of the neuroectodermal clusters and
then after differentiation only in the non-SOPs, keeping Senseless expression low.
Deletion of miR-9a resulted in the appearance of variable numbers of extra sensory
bristles in about 40% of mutant animals. Given the incomplete penetrance of the
phenotype and the randomness of the development of ectopic sensory organs, it seems the
miRNA is required to suppress some of the random spikes in Senseless protein level. By
setting a threshold above which the proneural transcription factor is allowed to trigger the
feedback, miR-9a makes the cell fate switch less error-prone (Cohen et al. 2006).
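The thresholding idea can be illustrated with a minimal sketch, assuming a one-variable positive feedback loop in which the proneural factor activates its own production cooperatively and the miRNA is represented only as an added first-order loss term. The function name, the Hill-type form, the rate constants, and the spike size are illustrative assumptions, not values taken from Li et al. or Cohen et al.

```python
def final_level(spike, loss_rate, v_max=3.0, dt=0.01, t_end=50.0):
    """Euler-integrate dx/dt = v_max*x^2/(1+x^2) - loss_rate*x from x = spike.

    x is a dimensionless stand-in for the proneural factor level; the
    self-activation term and all parameter values are assumptions.
    """
    x = spike
    for _ in range(int(t_end / dt)):
        x += dt * (v_max * x * x / (1.0 + x * x) - loss_rate * x)
    return x


# Same transient spike, without and with the extra miRNA-dependent loss.
without_mirna = final_level(spike=0.5, loss_rate=1.0)
with_mirna = final_level(spike=0.5, loss_rate=1.4)
print(f"no miRNA: {without_mirna:.2f}, with miRNA: {with_mirna:.2f}")
```

In this toy model the unstable fixed point, which acts as the threshold, sits near 0.38 when the loss rate is 1.0 and near 0.69 when it is 1.4, so a spike of 0.5 triggers the feedback only when the miRNA term is absent.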
miRNAs set gene expression thresholds for their targets
The ability of a miRNA to set an expression threshold for its target genes was recently
generalized. Mukherji et al. assayed miRNA activity in single cells using a bidirectional
tet-responsive promoter that drove, in one direction, an mCherry reporter targeted by
endogenous miRNA and, in the other direction, an eYFP reporter that served as a proxy for
target transcription (Mukherji et al. 2010). When the reporters were induced over a wide
range of target mRNA production, the strength of miRNA-mediated repression varied
dramatically: in the low target input regime the mCherry reporter was almost completely
repressed; above a certain threshold of target input, repression was very mild; and at very
high target levels it was essentially null (Figure 3). A simple mathematical model predicted
the threshold effect from the titration-like nature of the miRNA-target interaction. The
threshold was modulated by changing the
miRNA concentration and the number and strength of the binding sites in the target's 3'
UTR. This RNA titration system could allow cells expressing a certain set of miRNAs to
discriminate between low-level, transient, leaky transcripts, and legitimate transcripts
expressed at higher levels and for sustained periods. By post-transcriptionally silencing
and inducing destruction of the sub-threshold transcripts, miRNAs contribute to cell fate
decisions and the maintenance of cell/tissue identity. The use of cell type-specific
combinations of miRNAs and co-targeting of multiple miRNAs per mRNA not only
provides vast combinatorial control, but also allows for tuning of the threshold based on
total miRNA concentration and total number of binding sites. Though most target genes
have only one conserved binding site per miRNA seed family, the majority have sites for
multiple miRNA families, with an average of more than four total conserved sites per 3'
UTR (Friedman et al. 2009).
Below the miRNA-determined threshold for target mRNA production, target protein
expression is essentially switched off. Above the threshold, target protein output
increases steeply in what is termed an ultrasensitive transition (Mukherji et al. 2010).
Across this transition, the target is repressed at every possible degree until it is entirely
derepressed. The lower the miRNA concentration, the lower the target mRNA expression
at the point of escape from repression. The pool of endogenous target mRNAs also
contributes to the concentration of available miRNA; during a developmental transition
where a miRNA is upregulated and its pool of target genes is down-regulated, its
effective concentration and therefore its potency could greatly increase for a small
number of functionally important targets.
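The titration argument can be sketched with a minimal equilibrium binding calculation. This is an illustration under stated assumptions (one binding site per target, a fixed miRNA pool, a single dissociation constant `kd`, and protein output proportional to free target mRNA), not the model of Mukherji et al.; the function name and the numerical values are hypothetical.

```python
import math


def free_target(total_target, mirna, kd):
    """Free (unbound) target mRNA at binding equilibrium.

    Solves F + mirna*F/(kd + F) = total_target for F >= 0, i.e. the
    positive root of F^2 + (kd + mirna - total_target)*F - kd*total_target = 0.
    """
    b = kd + mirna - total_target
    return 0.5 * (-b + math.sqrt(b * b + 4.0 * kd * total_target))


# Scan target input across a threshold set by the miRNA pool (arbitrary units).
for total in (10, 50, 90, 110, 200, 1000):
    free = free_target(total, mirna=100.0, kd=1.0)
    print(f"target input {total:5d}: fraction escaping repression = {free / total:.3f}")
```

With tight binding, the free fraction stays near zero while target input is below the miRNA pool and approaches one well above it. Raising `mirna` moves the transition to higher target input, and lowering it moves the point of escape to lower target input.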
miRNAs may buffer transcriptional noise
As discussed above, anticorrelated expression of miRNAs and targets is common.
Perhaps surprisingly, incoherent expression of miRNAs and targets is also prevalent.
Tsang et al. used expression profiles of human and mouse host genes to infer the
expression of intron-embedded miRNAs and compared this large dataset to that of
predicted target gene expression. To avoid ambiguity resulting from mixed cell types in a
given tissue, they also used data from homogeneous isolated neuronal cell populations.
About 70% of the 60 miRNAs analyzed had a significantly higher number of targets
(genes with conserved seed matches) in the top 10 percentile of correlated or
anticorrelated genes, whereas at most 8% had a significantly higher number of targets in
the middle ten percentile sets (Tsang et al. 2007). Correlated and anti-correlated
miRNA-target patterns were about equally prevalent. Co-expression across a variety of conditions
implies transcriptional control by (a) common transcription factor(s). What benefit might
accrue from expressing a gene and simultaneously expressing a repressor for that gene?
The additional resources used to do so may pay for enhanced robustness in gene
expression.
Random fluctuations in protein levels arise from several sources. Intrinsic noise refers to
variation arising from stochastic events including promoter binding, mRNA decay,
translation, and protein degradation (Raser and O'Shea 2005). Extrinsic noise refers to
variation arising from differences such as transcription factor or ribosome concentration
or cell cycle stage. Both sources cause protein levels to fluctuate in a given cell over time
and between clonally identical cells. The degree to which protein level fluctuates around
its mean may be influenced by the rates at which transcription and translation occur. To
synthesize 100 molecules of a certain protein per cell, one can imagine two extreme
strategies: transcribe 10 copies of mRNA and translate 10 protein copies per mRNA, or
transcribe 100 mRNAs and repress their translation so as to translate 1 protein copy per
mRNA (Figure 4A). What would be the consequences of these strategies? The mean
protein output would be the same but the expected variance would be substantially
different (Figure 4B). Transcription occurs in stochastic bursts (Blake et al. 2006) and
higher transcription rates correlate with lower noise (Paulsson 2004). Translation events
amplify transcriptional noise (Paulsson 2004, Pedraza and van Oudenaarden 2005) such
that noise increases linearly with the rate of translation (Ozbudak et al. 2002). By
transcribing a gene at a high rate and simultaneously reducing its translation rate using
miRNAs, cells should reduce fluctuations in target protein number. Specific
co-expression of miRNA and target in an incoherent feedforward loop has been proposed as
a way to do this (Hornstein and Shomron 2006).
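The two extreme strategies can be compared in a toy calculation. The sketch below assumes that the mRNA number per cell is Poisson distributed and that each mRNA yields a Poisson number of proteins; it ignores transcriptional bursting and degradation dynamics, so it is a simplified illustration rather than any of the cited models. All function names are hypothetical.

```python
import math
import random


def poisson(rng, lam):
    """Draw one Poisson(lam) sample (Knuth's method; fine for lam < ~700)."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def protein_stats(mean_mrna, proteins_per_mrna):
    """Analytic mean, variance and CV of protein number in the toy model."""
    mean = mean_mrna * proteins_per_mrna
    variance = mean * (1 + proteins_per_mrna)
    return mean, variance, math.sqrt(variance) / mean


def simulate(mean_mrna, proteins_per_mrna, cells=4000, seed=0):
    """Monte Carlo estimate of the mean and variance across `cells` cells."""
    rng = random.Random(seed)
    counts = []
    for _ in range(cells):
        mrna = poisson(rng, mean_mrna)
        counts.append(poisson(rng, mrna * proteins_per_mrna))
    mean = sum(counts) / cells
    variance = sum((c - mean) ** 2 for c in counts) / (cells - 1)
    return mean, variance


for label, m, b in (("10 mRNA x 10 proteins", 10, 10), ("100 mRNA x 1 protein", 100, 1)):
    mean, var, cv = protein_stats(m, b)
    sim_mean, sim_var = simulate(m, b)
    print(f"{label}: analytic mean={mean}, var={var}, CV={cv:.2f}; "
          f"simulated mean={sim_mean:.1f}, var={sim_var:.0f}")
```

Under these assumptions both strategies give a mean of 100 proteins, but the variance is 1100 for the 10 x 10 strategy and 200 for the 100 x 1 strategy, because the variance-to-mean ratio of this compound Poisson model is 1 plus the number of proteins made per mRNA.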
In prokaryotes, small non-coding regulatory RNAs (sRNAs) induce target mRNA
degradation and are seen to reduce protein noise, but only when target transcription falls
below a certain threshold (Levine et al. 2010). Below the target expression threshold set
by miRNA, variation in target mRNA input is transmitted into disproportionately small
variation in protein output (slope < 1, Figure 3); in this regime, random fluctuations in
target mRNA number could be suppressed (Mukherji et al. 2010). But within the
ultrasensitive transition, which corresponds to higher transcription rates, the variation in
mRNA input corresponds to greater variation in protein output for miRNA-targeted
genes than for non-targeted genes (slope > 1, Figure 3). Perhaps the cost of tunability
within the ultrasensitive transition is increased noise in protein expression. There are
other, ubiquitously active cis regulatory elements that may help buffer fluctuations by
attenuating the translation of mRNAs, but that lack the tunability provided by
combinations of miRNA binding sites and cell type-specific miRNA repertoires. For
example, upstream open reading frames and weak noncanonical Kozak sequences reduce
the efficiency of translation initiation (Calvo et al. 2009), and rare codons and secondary
structures slow translation elongation. Pairing strong transcription with these mechanisms
may reduce protein fluctuations arising from fluctuations in mRNA copy number.
In the regime where miRNAs are expected to suppress noise, coordinate targeting of
multiple pathway components may be especially useful since noise can propagate
through a network (Blake et al. 2003, Pedraza and van Oudenaarden 2005). Moreover, in
an incoherent feedforward loop, if the miRNA has a half-life comparable to that of the
target mRNA and protein - which is plausible given the wide range of observed miRNA
stability (Bail et al. 2010) - then this motif could serve to partially decouple the target
protein output from fluctuations in the upstream transcription factors. When the
transcription factor spikes, it will induce the target gene but also increase miRNA-
mediated repression; when it plummets, the drop in target transcription will be
counteracted by post-transcriptional derepression. As a result, the expression level of
target proteins with short half-lives would be stabilized over time. The consequences for
fluctuations in the level of a protein depend on the timescale of fluctuation (the time it
takes for a peak or trough to return to the mean). It is not yet clear whether incoherent
feedforward regulation buffers target genes at the relevant timescale to produce
physiological effects.
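The decoupling argument can be made concrete with a small numerical sketch. It assumes a deterministic three-variable model in which a transcription factor (TF) drives synthesis of both the target mRNA and the miRNA, the miRNA adds a degradation term to the mRNA, and all species have equal decay rates; the rate constants, pulse shape, and function name are illustrative assumptions only.

```python
def max_relative_deviation(k_repress, dt=0.01, t_end=40.0):
    """Euler-integrate the loop and return max |protein/baseline - 1|.

    k_repress = 0 removes the miRNA arm (TF -> target only).
    """
    a = b = 1.0             # TF-driven synthesis of target mRNA and of miRNA
    d_m = d_s = d_p = 1.0   # decay rates (equal half-lives, an assumption)
    tf = 1.0
    s = b * tf / d_s                       # miRNA at steady state
    m = a * tf / (d_m + k_repress * s)     # target mRNA at steady state
    p = baseline = m / d_p                 # target protein at steady state
    worst = 0.0
    for step in range(int(t_end / dt)):
        t = step * dt
        tf = 2.0 if 10.0 <= t < 15.0 else 1.0   # transient two-fold TF spike
        dm = a * tf - (d_m + k_repress * s) * m
        ds = b * tf - d_s * s
        dp = m - d_p * p
        m, s, p = m + dm * dt, s + ds * dt, p + dp * dt
        worst = max(worst, abs(p / baseline - 1.0))
    return worst


no_loop = max_relative_deviation(k_repress=0.0)
with_loop = max_relative_deviation(k_repress=4.0)
print(f"max relative protein excursion: no miRNA {no_loop:.2f}, with miRNA {with_loop:.2f}")
```

With these arbitrary parameters the miRNA arm reduces the largest relative excursion of the protein level during and after the TF pulse. Whether real loops operate in such a regime, and at the relevant timescale, is the open question raised above.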
Some miRNA phenotypes appear upon stress
Many miRNAs show no phenotype when inhibited or knocked out in cells or animals
under normal conditions (Leaman et al. 2005, Miska et al. 2007). Dicer-mutant zebrafish
lacking any detectable miRNAs still develop all the major organs and differentiated cell
types (Giraldez et al. 2005). If miRNAs act to confer accuracy and uniformity to
developmental transitions, then loss of a miRNA may result not in catastrophic defects
but rather in imprecise, variable phenotypes. If other feedback or back-up mechanisms
are in place, then the loss of robustness may only be detected by applying additional
perturbations. Indeed, several miRNA knockout animals have shown losses of function
only upon stress.
In Drosophila, miR-7, like miR-9a, plays a role in the determination of sensory organs
(Li and Carthew 2005). Loss of miR-7 had no observable impact on the development of
the sensory organs under normal, uniform conditions; expression of the proneural
transcription factor Atonal was also detected at wild-type level (Li et al. 2009). But when
an environmental perturbation was added during larval development - fluctuating the
temperature between 31°C and 18°C roughly every 90 minutes - the miR-7 mutant eyes
showed abnormally low Atonal expression and abnormally high, irregular expression of
the antineural transcription factor Yan. Sensory organ precursor (SOP) defects also
appeared: some groups of antennal SOPs failed to develop, or developed with abnormal
patterning; their cells showed low Atonal levels. These outcomes were better understood
by elucidating the regulatory networks for these components.
In photoreceptor determination, Yan inhibits and Pnt-P1 activates the
transcription of the miR-7 precursor through binding at an upstream enhancer element
(Figure 5A). Yan also inhibits miR-7 production indirectly by repressing phyllopod, an
E3 ubiquitin ligase that promotes the degradation of TTK69, a transcription factor that
represses miR-7 precursor transcription. Pnt-P1 also transcriptionally inhibits Yan. If the
Yan protein level drops transiently in a Yan-ON state, miR-7 can still be repressed by
TTK69; since Yan is a target of miR-7, this loop sustains the expression of Yan protein.
If the Pnt-P1 protein level drops transiently in a Yan-OFF state, Yan expression can still
be kept low through the activity of miR-7. By contrast, when there is a persistent
decrease in Yan protein, the mutual negative feedback with miR-7 switches the cell state
to miR-7 ON-Yan OFF, and the coherent feedforward loop further reinforces the
sustained high expression of miR-7. Thus, in the absence of miR-7, there is no
counterweight to Yan, so the switch mechanism to achieve the Yan-OFF state is
impaired. miR-7 also participates in an incoherent feedforward loop in SOP
determination: Atonal activates E(spl) but likewise activates miR-7, which represses
E(spl) (Figure 5B). There is an interlocking negative feedback loop from E(spl) to
Atonal. In wild-type flies, transient rises in Atonal protein level would be counteracted by
rises in E(spl), whereas a sustained increase in Atonal would be maintained by miR-7
repressing E(spl). In the miR-7 mutants, Atonal expression is decreased due to
derepression of E(spl), so cells are impaired for switching to the Atonal-ON state, and
sensory organs fail to develop in normal numbers.
Several other miRNA knockout phenotypes appear in response to internal stressors. In
the Drosophila larva, loss of the muscle-specific miRNA miR-1 does not impair the
formation or physiological function of muscle, but a dramatic phenotype appears when
the larvae start feeding and their muscle cells undergo rapid post-mitotic growth (Sokol
and Ambros 2005, Brennecke et al. 2005). The mutants experienced paralysis, growth
arrest, and death, and showed severely deformed body wall musculature. This larval
phenotype was rescued by providing a protein-free diet that blocks the normal rapid
growth phase. In mice, deletion of the heart muscle-specific miRNA miR-208 has little
phenotype under normal conditions but results in a failure to induce cardiac remodeling
upon stress (van Rooij et al. 2007). When the mice were treated to induce pressure
overload or hypothyroidism, miR-208 activity was required in the cardiomyocytes to help
upregulate betaMHC by targeting the thyroid receptor signaling pathway.
miRNAs, robustness and disease
When feedback or feedforward loops get co-opted in inappropriate contexts, they may
contribute to disease. Iliopoulos et al. describe a network of feedback loops that flips an
epigenetic switch in cancer. Transient activation of Src or other triggers of NF-kappaB
induced stable transformation of a mammary epithelial cell line (Iliopoulos et al. 2009).
NF-kappaB transcriptionally activates IL6 and inhibits let-7 family members by
activating Lin28B (which prompts destruction of let-7 precursor RNAs). The ensuing
drop in let-7 level derepresses IL6, a direct let-7 target, and IL6 is further activated by
derepression of the let-7 target Ras. IL6 feeds back in both an autocrine and paracrine
fashion to activate NF-kappaB, which further inhibits let-7, and it signals through STAT3
to promote cell growth and motility (Figure 6). In a xenograft model, inhibiting
NF-kappaB, Lin28B, or IL6 suppressed tumor growth. What function does this inflammatory
network serve in a healthy animal? In normal tissue, a transient inflammatory cue could
signal through this pathway to induce cell growth to repair damage. The miRNA holds
the positive feedbacks in check, as in the case of miR-9a in Drosophila SOP
determination (see above). In cancer, where let-7 is typically down-regulated (Kumar et
al. 2008, Dong et al. 2010), the positive feedbacks would go unchecked, and continuous,
self-reinforcing proliferation would result. In human tumors the positive feedback loop
would be made even stronger by the presence of oncogenic v-Src or Ras-V12 (Iliopoulos
et al. 2009).
Another example involves co-option of a developmental process in cancer. The
transcription factor ZEB1 induces the epithelial-to-mesenchymal transition, which is
important for tissue remodeling during embryonic development. ZEB1 suppresses
transcription of miR-200 family members, and the miR-200 family strongly represses
ZEB1 (Bracken et al. 2008, Burk et al. 2008). In development, this mutual negative
feedback reinforces the mesenchymal cell fate decision. Within carcinomas, some tumor
cells lose miR-200 expression and switch to a mesenchymal state, which promotes their
ability to metastasize (Gibbons et al. 2009).
In a healthy animal, robust signaling processes keep cells behaving appropriately. In a
cancer patient, the tumor might actually become robust against therapy by virtue of
unhinging the usual controls and making gene expression less stable. miRNAs are
globally depleted in tumors relative to their normal tissue counterparts (Lu et al. 2005)
and modeling this state by knocking down components of the miRNA biogenesis
pathway (Kumar et al. 2007) or by heterozygous deletion of Dicer (Kumar et al. 2009)
accelerates tumor growth. In addition, 3' UTRs are globally shortened in tumors via
alternative polyadenylation site choice (Sandberg et al. 2008, Mayr and Bartel 2009). The
combined effect of these trends should be widespread derepression of miRNA target
genes and potentially also un-buffering of gene expression. Might this increase the
heterogeneity and plasticity of the tumor cell population by epigenetic dysregulation?
Perhaps the tumor can be thought of as analogous to a clonal population of bacteria or
yeast where noise in the population adapts them to unpredictably changing environmental
conditions (Acar et al. 2008, Cagatay et al. 2009). For cancer cells, these changing
conditions could include increasingly hypoxic tumor cores, new microenvironments for
metastases, or on-and-off chemotherapy regimens, and the consequence of their
noise-driven adaptability would be that a fraction of the cells survive almost any condition.
Implications for miRNA influence in evolution
While we speculate about the ability of miRNAs to act as buffers of gene expression,
there is as yet only one well-characterized example of a general mutation buffering
agent. The chaperone Hsp90 assists the folding of client proteins such that it can
compensate for point mutations in the protein coding regions of client genes (Rutherford
and Lindquist 1998). In doing so Hsp90 acts as a capacitor of phenotypic variation,
storing cryptic genetic variation until environmental stress overwhelms Hsp90 and
reveals the mutant proteins, allowing them to affect phenotypes and become substrates
for selection. Do miRNAs potentiate cryptic genetic variation in their target genes? If so,
the mutations would be in the promoters or transcription factors driving target gene
expression or the expression of downstream genes. The ability of the miRNA to
compensate for otherwise elevated target protein levels would allow such mutations to
accrue without selective penalty. The emergence of non-lethal mutations that give diverse
phenotypes is one requirement for evolvability (Kitano 2004). Analogously to the case of
Hsp90, loss of miRNA activity due to mutations or misexpression of components in the
miRNA pathway could unleash the mutated gene products for exposure to natural
selection.
Another way to describe miRNAs' hypothetical role in enhancing evolvability is
canalization. Canalization refers to the evolved robustness of a trait conferred by
entrenching mechanisms that protect against environmental or genetic perturbations
(Hornstein and Shomron 2006). Where miRNAs reduce fluctuations in target gene
expression and stabilize signaling decisions, they would tighten the linkage between
genotype and phenotype, thereby increasing the heritability of traits (Peterson et al.
2009). The more heritable a trait is, the more efficiently it is selected. Thus not only does
evolution favor robustness, but robustness also promotes evolution (Kitano 2004).
Conclusions
Multicellular organisms must manage the tasks of development and physiology in
unpredictably changing environments and with imperfect genetic and biochemical
components. Random noise in gene expression must be reduced or, as in the case of some
cell fate decisions, harnessed to a system control network to designate one fate or another
among neighboring cells. Robustness goes beyond the job of keeping one state the same
in the face of perturbations. In development, it can mean not sending a signal until the
right time, and then sending it strongly and irreversibly. The addition of miRNAs to
metazoan genomes over time and the diversity of miRNA repertoires among different
tissues of developing animals suggest that miRNAs are involved in reinforcing
developmental decisions to make organismal complexity reliable and heritable from one
generation to the next.
Acknowledgments
We thank Mary Lindstrom for help making the figures, and Shankar Mukherji, Anthony
Leung, and Dave Bartel for insightful comments on the manuscript.
References
Acar M, Mettetal JT, van Oudenaarden A. Stochastic switching as a survival strategy in
fluctuating environments. Nat. Genet. 40, 471-475 (2008).
Bail S, Swerdel M, Liu H, Jiao X, Goff LA, Hart RP, Kiledjian M. Differential regulation
of microRNA stability. RNA 16, 1032-1039 (2010).
Bar-Even A, Paulsson J, Maheshri N, Carmi M, O'Shea E, Pilpel Y, Barkai N. Noise in
protein expression scales with natural protein abundance. Nat. Genet. 38, 636-643 (2006).
Bartel DP. MicroRNAs: target recognition and regulatory functions. Cell 136, 215-233
(2009).
Bartel DP, Chen CZ. Micromanagers of gene expression: the potentially widespread
influence of metazoan microRNAs. Nat. Rev. Genet. 5, 396-400 (2004).
Bhattacharyya SN, Habermacher R, Martine U, Closs EI, Filipowicz W. Relief of
microRNA-mediated translational repression in human cells subjected to stress. Cell 125,
1111-1124 (2006).
Blake WJ, Balázsi G, Kohanski MA, Isaacs FJ, Murphy KF, Kuang Y, Cantor CR, Walt
DR, Collins JJ. Phenotypic consequences of promoter-mediated transcriptional noise.
Mol. Cell 24, 853-865 (2006).
Bracken CP, Gregory PA, Kolesnikoff N, Bert AG, Wang J, Shannon MF, Goodall GJ. A
double-negative feedback loop between ZEB1-SIP1 and the microRNA-200 family
regulates epithelial-mesenchymal transition. Cancer Res. 68, 7846-7854 (2008).
Brennecke J, Stark A, Cohen SM. Not miR-ly muscular: microRNAs and muscle
development. Genes Dev. 19, 2261-2264 (2005).
Burk U, Schubert J, Wellner U, Schmalhofer O, Vincan E, Spaderna S, Brabletz T. A
reciprocal repression between ZEB1 and members of the miR-200 family promotes EMT
and invasion in cancer cells. EMBO Rep. 9, 582-589 (2008).
Cagatay T, Turcotte M, Elowitz MB, Garcia-Ojalvo J, Süel GM. Architecture-dependent
noise discriminates functionally analogous differentiation circuits. Cell 139, 512-522
(2009).
Chang S, Johnston RJ Jr, Frøkjaer-Jensen C, Lockery S, Hobert O. MicroRNAs act
sequentially and asymmetrically to control chemosensory laterality in the nematode.
Nature 430, 785-789 (2004).
Calvo SE, Pagliarini DJ, Mootha VK. Upstream open reading frames cause widespread
reduction of protein expression and are polymorphic among humans. Proc. Natl Acad.
Sci. USA 106, 7507-7512 (2009).
Cohen SM, Brennecke J, Stark A. Denoising feedback loops by thresholding--a new role
for microRNAs. Genes Dev. 20, 2769-2772 (2006).
Dong Q, Meng P, Wang T, Qin W, Qin W, Wang F, Yuan J, Chen Z, Yang A, Wang H.
MicroRNA let-7a inhibits proliferation of human prostate cancer cells in vitro and in vivo
by targeting E2F2 and CCND2. PLoS One 5, e10147 (2010).
Farh KK, Grimson A, Jan C, Lewis BP, Johnston WK, Lim LP, Burge CB, Bartel DP.
The widespread impact of mammalian MicroRNAs on mRNA repression and evolution.
Science 310, 1817-1821 (2005).
Fazi F, Rosa A, Fatica A, Gelmetti V, De Marchis ML, Nervi C, Bozzoni I. A
minicircuitry comprised of microRNA-223 and transcription factors NFI-A and
C/EBPalpha regulates human granulopoiesis. Cell 123, 819-831 (2005).
Friedman RC, Farh KK, Burge CB, Bartel DP. Most mammalian mRNAs are conserved
targets of microRNAs. Genome Res. 19, 92-105 (2009).
Gibbons DL, Lin W, Creighton CJ, Rizvi ZH, Gregory PA, Goodall GJ, Thilaganathan N,
Du L, Zhang Y, Pertsemlidis A, Kurie JM. Contextual extracellular cues promote tumor
cell EMT and metastasis by regulating miR-200 family expression. Genes Dev. 23,
2140-2151 (2009).
Giraldez AJ, Cinalli RM, Glasner ME, Enright AJ, Thomson JM, Baskerville S,
Hammond SM, Bartel DP, Schier AF. MicroRNAs regulate brain morphogenesis in
zebrafish. Science 308, 833-838 (2005).
Heimberg AM, Sempere LF, Moy VN, Donoghue PC, Peterson KJ. MicroRNAs and the
advent of vertebrate morphological complexity. Proc. Natl Acad. Sci. USA 105,
2946-2950 (2008).
Hornstein E, Shomron N. Canalization of development by microRNAs. Nat. Genet. 38
Suppl, S20-4 (2006).
Iliopoulos D, Hirsch HA, Struhl K. An epigenetic switch involving NF-kappaB, Lin28,
Let-7 MicroRNA, and IL6 links inflammation to cell transformation. Cell 139, 693-706
(2009).
Juan AH, Kumar RM, Marx JG, Young RA, Sartorelli V. Mir-214-dependent regulation
of the polycomb protein Ezh2 in skeletal muscle and embryonic stem cells. Mol. Cell 36,
61-74 (2009).
Kefas B, Comeau L, Floyd DH, Seleverstov O, Godlewski J, Schmittgen T, Jiang J,
diPierro CG, Li Y, Chiocca EA, Lee J, Fine H, Abounader R, Lawler S, Purow B. The
neuronal microRNA miR-326 acts in a feedback loop with notch and has therapeutic
potential against brain tumors. J. Neurosci. 29, 15161-15168 (2009).
Kim VN. MicroRNA biogenesis: coordinated cropping and dicing. Nat. Rev. Mol. Cell
Biol. 6, 376-385 (2005).
Kitano H. Biological robustness. Nat. Rev. Genet. 5, 826-837 (2004).
Klein ME, Lioy DT, Ma L, Impey S, Mandel G, Goodman RH. Homeostatic regulation
of MeCP2 expression by a CREB-induced microRNA. Nat. Neurosci. 10, 1513-1514
(2007).
Kumar MS, Erkeland SJ, Pester RE, Chen CY, Ebert MS, Sharp PA, Jacks T.
Suppression of non-small cell lung tumor development by the let-7 microRNA family.
Proc. Natl Acad. Sci. USA 105, 3903-3908 (2008).
Kumar MS, Lu J, Mercer KL, Golub TR, Jacks T. Impaired microRNA processing
enhances cellular transformation and tumorigenesis. Nat. Genet. 39, 673-677 (2007).
Kumar MS, Pester RE, Chen CY, Lane K, Chin C, Lu J, Kirsch DG, Golub TR, Jacks T.
Dicer1 functions as a haploinsufficient tumor suppressor. Genes Dev. 23, 2700-2704
(2009).
Leaman D, Chen PY, Fak J, Yalcin A, Pearce M, Unnerstall U, Marks DS, Sander C,
Tuschl T, Gaul U. Antisense-mediated depletion reveals essential and specific functions
of microRNAs in Drosophila development. Cell 121, 1097-1108 (2005).
Lee CT, Risom T, Strauss WM. Evolutionary conservation of microRNA regulatory
circuits: an examination of microRNA gene complexity and conserved microRNA-target
interactions through metazoan phylogeny. DNA Cell Biol. 26, 209-218 (2007).
Levine E, Huang M, Huang Y, Kuhlman T, Shi H, Zhang Z, Hwa T. On noise and silence
in small RNA regulation. Submitted (2010).
Li QJ, Chau J, Ebert PJ, Sylvester G, Min H, Liu G, Braich R, Manoharan M, Soutschek
J, Skare P, Klein LO, Davis MM, Chen CZ. miR-181a is an intrinsic modulator of T cell
sensitivity and selection. Cell 129, 147-161 (2007).
Li X, Carthew RW. A microRNA mediates EGF receptor signaling and promotes
photoreceptor differentiation in the Drosophila eye. Cell 123, 1267-1277 (2005).
Li X, Cassidy JJ, Reinke CA, Fischboeck S, Carthew RW. A microRNA imparts
robustness against environmental fluctuation during development. Cell 137, 273-282
(2009).
Li Y, Wang F, Lee JA, Gao FB. MicroRNA-9a ensures the precise specification of
sensory organ precursors in Drosophila. Genes Dev. 20, 2793-2805 (2006).
Lu J, Getz G, Miska EA, Alvarez-Saavedra E, Lamb J, Peck D, Sweet-Cordero A, Ebert
BL, Mak RH, Ferrando AA, Downing JR, Jacks T, Horvitz HR, Golub TR. MicroRNA
expression profiles classify human cancers. Nature 435, 834-838 (2005).
Mayr C, Bartel DP. Widespread shortening of 3'UTRs by alternative cleavage and
polyadenylation activates oncogenes in cancer cells. Cell 138, 673-684 (2009).
Milo R, Shen-Orr S, Itzkovitz S, Kashtan N, Chklovskii D, Alon U. Network motifs:
simple building blocks of complex networks. Science 298, 824-827 (2002).
Miska EA, Alvarez-Saavedra E, Abbott AL, Lau NC, Hellman AB, McGonagle SM,
Bartel DP, Ambros VR, Horvitz HR. Most Caenorhabditis elegans microRNAs are
individually not essential for development or viability. PLoS Genet. 3, e215 (2007).
Mukherji S, Ebert MS, Zheng GZ, Tsang JS, Sharp PA, van Oudenaarden A. MicroRNAs
set gene expression thresholds with ultrasensitive transitions. Submitted (2010).
O'Donnell KA, Wentzel EA, Zeller KI, Dang CV, Mendell JT. c-Myc-regulated
microRNAs modulate E2F1 expression. Nature 435, 839-843 (2005).
Ozbudak EM, Thattai M, Kurtser I, Grossman AD, van Oudenaarden A. Regulation of
noise in the expression of a single gene. Nat. Genet. 31, 69-73 (2002).
Paulsson J. Summing up the noise in gene networks. Nature 427, 415-418 (2004).
Pedraza JM, van Oudenaarden A. Noise propagation in gene networks. Science 307,
1965-1969 (2005).
Peterson KJ, Dietrich MR, McPeek MA. MicroRNAs and metazoan macroevolution:
insights into canalization, complexity, and the Cambrian explosion. Bioessays 31,
736-747 (2009).
Pulikkan JA, Dengler V, Peramangalam PS, Peer Zada AA, Müller-Tidow C, Bohlander
SK, Tenen DG, Behre G. Cell-cycle regulator E2F1 and microRNA-223 comprise an
autoregulatory negative feedback loop in acute myeloid leukemia. Blood 115, 1768-1778
(2010).
Raser JM, O'Shea EK. Noise in gene expression: origins, consequences, and control.
Science 309, 2010-2013 (2005).
Roush SF, Slack FJ. Transcription of the C. elegans let-7 microRNA is temporally
regulated by one of its targets, hbl-1. Dev. Biol. 334, 523-534 (2009).
Rutherford SL, Lindquist S. Hsp90 as a capacitor for morphological evolution. Nature
396, 336-342 (1998).
Sandberg R, Neilson JR, Sarma A, Sharp PA, Burge CB. Proliferating cells express
mRNAs with shortened 3' untranslated regions and fewer microRNA target sites. Science
320, 1643-1647 (2008).
Shalgi R, Lieber D, Oren M, Pilpel Y. Global and local architecture of the mammalian
microRNA-transcription factor regulatory network. PLoS Comput. Biol. 3, e131 (2007).
Shkumatava A, Stark A, Sive H, Bartel DP. Coherent but overlapping expression of
microRNAs and their targets during vertebrate development. Genes Dev. 23, 466-481
(2009).
Sokol NS, Ambros V. Mesodermally expressed Drosophila microRNA-1 is regulated by
Twist and is required in muscles during larval growth. Genes Dev. 19, 2343-2354 (2005).
Sood P, Krek A, Zavolan M, Macino G, Rajewsky N. Cell-type-specific signatures of
microRNAs on target mRNA expression. Proc. Natl Acad. Sci. USA 103, 2746-2751
(2006).
Stark A, Brennecke J, Bushati N, Russell RB, Cohen SM. Animal MicroRNAs confer
robustness to gene expression and have a significant impact on 3'UTR evolution. Cell
123, 1133-1146 (2005).
Thattai M, van Oudenaarden A. Intrinsic noise in gene regulatory networks. Proc. Natl
Acad. Sci. USA 98, 8614-8619 (2001).
Thomson JM, Newman M, Parker JS, Morin-Kensicki EM, Wright T, Hammond SM.
Extensive post-transcriptional regulation of microRNAs and its implications for cancer.
Genes Dev. 20, 2202-2207 (2006).
Tie J, Pan Y, Zhao L, Wu K, Liu J, Sun S, Guo X, Wang B, Gang Y, Zhang Y, Li Q,
Qiao T, Zhao Q, Nie Y, Fan D. MiR-218 inhibits invasion and metastasis of gastric
cancer by targeting the Robo1 receptor. PLoS Genet. 6, e1000879 (2010).
Tsang JS, Ebert MS, van Oudenaarden A. Genome-wide dissection of microRNA
functions and cotargeting networks using gene set signatures. Mol. Cell 38, 140-153
(2010).
Tsang J, Zhu J, van Oudenaarden A. MicroRNA-mediated feedback and feedforward
loops are recurrent network motifs in mammals. Mol. Cell 26, 753-767 (2007).
van Rooij E, Sutherland LB, Qi X, Richardson JA, Hill J, Olson EN. Control of
stress-dependent cardiac growth and gene expression by a microRNA. Science 316, 575-579
(2007).
Varghese J, Cohen SM. microRNA miR-14 acts to modulate a positive autoregulatory
loop controlling steroid hormone signaling in Drosophila. Genes Dev. 21, 2277-2282
(2007).
Xu N, Papagiannakopoulos T, Pan G, Thomson JA, Kosik KS. MicroRNA-145 regulates
OCT4, SOX2, and KLF4 and represses pluripotency in human embryonic stem cells. Cell
137, 647-658 (2009).
Yang X, Feng M, Jiang X, Wu Z, Li Z, Aau M, Yu Q. miR-449a and miR-449b are direct
transcriptional targets of E2F1 and negatively regulate pRb-E2F1 activity through a
feedback loop by targeting CDK6 and CDC25A. Genes Dev. 23, 2388-2393 (2009).
Yoo AS, Greenwald I. LIN-12/Notch activation leads to microRNA-mediated
downregulation of Vav in C. elegans. Science 310, 1330-1333 (2005).
Zhao C, Sun G, Li S, Shi Y. A feedback regulatory loop involving microRNA-9 and
nuclear receptor TLX in neural stem cell fate determination. Nat. Struct. Mol. Biol. 16,
365-371 (2009).
Figures
Figure 1. miRNA-target network motifs
(A) A negative feedback loop contributes to homeostasis for MeCP2 protein in neurons.
(B) A mutual negative feedback loop contributes to bistability between myeloid precursors and
granulocytes.
(C) A positive feedback loop enforces lineage commitment of nematode "2 degrees" vulval cells.
(D) A coherent feedforward loop both directly and indirectly inhibits the cell cycle regulator
E2F1 in granulopoiesis.
(E) An incoherent feedforward loop activates E2F1 while indirectly repressing it through the
miR-20 seed family.
(F) miR-181 coordinately targets four components to modulate the sensitivity of the T cell
receptor signaling pathway.
[Figure 1 diagrams. Recoverable node labels: (1A) MeCP2, BDNF, miR-132. (1B) NFI-A, miR-223. (1C) LIN12, miR-61, Vav-1. (1D) C/EBPa, miR-223, E2F1. (1E) miR-17-5p, 17-3p, 18, 19a, 20, 19b-1, 92-1; E2F1. (1F) T cell receptor, miR-181, SHP-2, PTPN22, LCK, DUSP6, DUSP5, ERK (cytoplasmic), ERK (nuclear).]
Figure 2. miR-9a suppresses random spikes in the level of the proneural transcription factor
Senseless, setting a threshold for positive feedback activation in Drosophila sensory organ
precursor (SOP) formation. Figure adapted from Cohen et al. 2006.
[Figure 2 diagram: SOP formation compared between wild-type and miR-9a mutant.]
Figure 3. miRNA-target interaction produces non-linear target protein output. Below a
certain threshold of target mRNA production, the target is strongly repressed. Above the
threshold, repression is fine-tuned along every degree in an ultrasensitive transition. The
threshold can be modulated by changing the miRNA concentration or the number of miRNA
binding sites in the target mRNA.
[Figure 3 plot. Recoverable labels: "Switch" and "Fine-tuning" regimes; "Lower [miRNA] or fewer binding sites"; "Higher [miRNA] or more binding sites"; x-axis "mRNA Input"; curves "Untargeted" and "miRNA target".]
Figure 4. (A) A given mean protein level can be achieved by a variety of transcription-translation
strategies. (B) The variance around the mean is anticipated to be smaller when strong
transcription is paired with attenuated translation.
[Figure 4 plots: panels (A) and (B). Recoverable axis labels: "Transcription rate" and "Protein copies per cell".]
Figure 5. The requirement for miR-7 in Drosophila SOP cell fate switching is revealed by adding
an environmental perturbation during development. Diagrams adapted from Li et al. 2009. (A)
miR-7 and the antineural transcription factor Yan participate in a coherent feedforward loop in
Drosophila SOP determination such that SOP cells switch to a miR-7 ON-Yan OFF state. (B)
miR-7 and the proneural transcription factor Atonal participate in an incoherent feedforward loop
in Drosophila SOP determination such that SOP cells switch to a miR-7 ON-Atonal ON state.
[Figure 5 diagrams. Recoverable node labels: Notch, EGFR, Su(H), ERK, E(spl)C, Pnt-P1, miR-7, Yan, TTK69, TTK88, Phyl, Atonal, Senseless.]
Figure 6. A transient inflammatory cue induces stable malignant transformation through an
NF-kappaB/IL6 positive feedback network that is normally kept in check by let-7. Diagram adapted
from Iliopoulos et al. 2009.
[Figure 6 diagram. Recoverable node labels: Src, Ras, NF-kappaB, IL-6, Lin-28B, let-7, STAT3.]
Chapter 6. Conclusions and future directions
This chapter was written by Margaret S. Ebert.
Conclusions
The discussions in Chapter 5 lead to four suggestions that connect systems biology
concepts to the experimental study of miRNA biology. First, it is important to think of
miRNAs and targets in terms of regulatory networks. Small differences in target protein
expression can have dramatic outcomes when the target protein participates in a positive
feedback loop, and potentially even more so when the loop involves paracrine signaling.
Second, the output of a regulatory motif depends on the relative stabilities of the molecules
involved. One must consider the miRNA's half-life in relation to the processes it
regulates, and consider dynamics (pulses and lags in target protein), not just steady states
(fine-tuning of target protein). Third, to test the potential buffering effects of miRNAs on
target gene expression, experimenters will need to make use of single-cell assays that
measure mRNA and protein levels among different cells within a clonal population (see
Future directions). Fourth, miRNA loss-of-function phenotypes may need to be teased out by
culturing the mutant animals or cells under non-standard conditions that mimic the
stresses of a natural environment. If miRNAs are responsible for sharpening
developmental transitions, then mutants may show ambiguous intermediate states
between characteristic developmental stages.
Another way of thinking that emerges from previous chapters concerns the relationship
not between a single miRNA and a single target gene in isolation, but rather
between a pool of different miRNAs that co-target the gene of interest and a pool of
different endogenous mRNAs that compete for binding to those miRNAs. Depending on
the abundance and quality of binding sites in the endogenous mRNAs, the free miRNA
pool may be much smaller than suggested by total cellular miRNA concentration, as a
large fraction of the miRNA complexes may be partitioned among hundreds of targets.
The link between the work on miRNA sponges (described in Chapters 2 and 3) and the
work on miRNA-generated expression thresholds (described in Chapters 4 and 5) can be
thought of as molecular titration. miRNAs titrate target mRNAs, and miRNA sponges
titrate miRNAs. The effectiveness of sponge inhibitors is probably due in part to the fact
that many of the miRNAs are already sequestered on endogenous target mRNAs. Perhaps
some miRNA target genes whose repression is functionally inconsequential evolved
binding sites to act as sponges, tuning miRNA availability to a precise level for the
regulation of a small number of targets whose repression does have important phenotypic
consequences (Seitz 2009). In this sense, the distinction between 'target' and 'sponge' is
blurred. We see that a sponge mRNA expressed at an insufficient level just behaves as a
miRNA target; a miRNA target overexpressed to a very high level behaves as a miRNA
sponge. Indeed, in cells above but not below the threshold, the miR-20 N = 7 mCherry
target reporter partially derepresses another target of miR-20 but not a target of the
unrelated miRNA miR-16 (data not shown). The more highly expressed a given miRNA
is, the higher the threshold for an N = 7 target such as a GFP sponge mRNA; hence the
higher the concentration of sponge required to sequester it. The less abundant the
miRNA, the more complete its suppression at a given level of sponge expression.
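The titration argument above can be made concrete with a minimal equilibrium-binding sketch. It is not the model used in this thesis: it assumes a single miRNA pool binding a single target mRNA species 1:1 with one dissociation constant, and the function name `free_target` and all numerical values are hypothetical, chosen only to show how a threshold appears near the point where target abundance equals miRNA abundance.

```python
import math


def free_target(total_target, total_mirna, kd):
    """Free (unbound) target mRNA at equilibrium for 1:1 miRNA-target binding.

    Solves F**2 + F*(M - T + Kd) - Kd*T = 0 for the non-negative root, where
    T is total target, M is total miRNA, and Kd is the dissociation constant.
    All quantities share the same arbitrary concentration units (assumption).
    """
    b = total_target - total_mirna - kd
    return 0.5 * (b + math.sqrt(b * b + 4.0 * kd * total_target))


if __name__ == "__main__":
    kd = 1.0  # hypothetical dissociation constant
    for mirna in (50.0, 200.0):  # hypothetical low and high miRNA abundance
        print(f"total miRNA = {mirna}")
        for target in (10.0, 50.0, 100.0, 200.0, 400.0, 800.0):
            free = free_target(target, mirna, kd)
            print(f"  total target = {target:6.1f}  free target = {free:8.2f}")
```

In this sketch, raising `total_mirna` moves the threshold to a higher target level, which mirrors the statement that a more abundant miRNA requires a higher concentration of sponge to sequester it.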
The threshold result also has interesting implications for how we assay miRNA targets.
Conventionally, to validate a miRNA target prediction, one fuses the putative target's 3'
UTR to a luciferase reporter driven by a strong viral promoter and transfects cultured
cells with many copies of plasmid DNA. After one to three days, the cells are lysed and
the average luciferase expression is measured. Typically the degree of repression reported
by this assay is less than two-fold. However, most of the luciferase
expression arises from cells containing many copies of the reporter gene, presumably
those located above the target expression threshold, so there are likely other cells in the
population that express less luciferase mRNA and experience stronger repression. By
averaging over the population of cells, the strength of repression is diluted. We expect
that transcriptional output from cellular promoters in chromosomal DNA would
correspond to the lower range of expression from transfected luciferase reporter plasmids,
placing it below and around the presumed threshold. Thus we conclude that the bulk
reporter assay most commonly used to test miRNA targets may be systematically
underestimating the in vivo potency of miRNAs in these interactions. On the other hand,
in experiments where endogenous target genes are tested in the presence of added
miRNA, there is opportunity to overestimate the strength of repression. By introducing
non-physiological concentrations of miRNA, one can shift the threshold such that certain
targets become strongly repressed whereas they are only weakly repressed in their natural
context. Perhaps the parameters for thresholding could be applied not only for
interpreting but also for predicting miRNA targeting effects. Given predictions that score
the overall quality and number of binding sites for a given set of miRNAs, and expression
data for mRNAs and miRNAs in cell types of interest, one could weight the target
predictions by the relative concentration of miRNA to mRNA.
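The dilution of repression by population averaging can be illustrated with a short sketch. It reuses a simple 1:1 equilibrium titration model as a stand-in for the measured threshold behavior (an assumption, not the analysis performed in this thesis), and the per-cell expression levels, miRNA abundance, and dissociation constant are hypothetical values.

```python
import math
from statistics import median


def free_target(total_target, total_mirna, kd):
    """Free target mRNA at equilibrium for 1:1 miRNA-target binding (assumed model)."""
    b = total_target - total_mirna - kd
    return 0.5 * (b + math.sqrt(b * b + 4.0 * kd * total_target))


def fold_repression(levels, total_mirna, kd):
    """Return (per-cell fold repression list, bulk fold repression).

    Per-cell fold repression is total/free target in each cell. Bulk fold
    repression is the ratio of summed signals, as a lysate measurement would be.
    """
    free = [free_target(t, total_mirna, kd) for t in levels]
    per_cell = [t / f for t, f in zip(levels, free)]
    bulk = sum(levels) / sum(free)
    return per_cell, bulk


# Hypothetical reporter mRNA levels in eight transfected cells (arbitrary units).
LEVELS = [1.0, 3.0, 10.0, 30.0, 100.0, 300.0, 1000.0, 3000.0]
per_cell, bulk = fold_repression(LEVELS, total_mirna=100.0, kd=1.0)

if __name__ == "__main__":
    print(f"bulk fold repression:            {bulk:.2f}")
    print(f"median per-cell fold repression: {median(per_cell):.2f}")
```

Under these assumed parameters the bulk value is dominated by the few highly expressing cells above the threshold, while most individual cells below the threshold are repressed far more strongly.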
Threshold effects are prevalent in biology. Switch-like thresholds can be created by other
mechanisms such as strongly cooperative binding of regulatory proteins, but molecular
titration is more tunable than cooperativity (Buchler and Louis 2008). miRNA-target
titration allows different thresholds to be generated for different genes in the same tissue
(by virtue of different miRNA binding sites) and for the same gene in different tissues
(by virtue of different miRNA concentrations). Over developmental time and in response
to environmental cues, miRNA profiles change. This resets the threshold for many target
genes such that the protein output for some targets now falls below or rises above the
level required for a functional outcome. Where steep miRNA thresholds connect to such
protein thresholds, switch-like responses can be produced.
Perhaps as important as the conclusions from this work are the experimental tools that we
made. Sponge vectors are being tested and adapted in many labs around the world, and
their use in published reports is still rising. Efforts to generate transgenic mice with
inducible tissue-specific miRNA sponges are underway. The bidirectional eYFP-mCherry
vector will be a powerful new tool for measuring miRNA activity in lieu of
luciferase reporters or single-color fluorescent reporters. Its inducibility allows for
assaying any possible level of physiological transcription by varying the doxycycline
concentration. It could be transferred to lentiviral vectors for delivery to a broader
selection of cell lines. We envision its application in target validation assays and assays
for regulators of the miRNA pathway. It should be possible to use automated flow
cytometry with stable dual color miRNA reporter cell lines to screen libraries of chemical
compounds for their effect on miRNA activity.
Future directions
In Chapter 3 we considered the possibility that there exist natural RNAs that act as
miRNA sponge inhibitors to sequester one or more miRNA seed families and rescue the
expression of their target genes. Recently a set of hundreds of mRNA-like (spliced,
polyadenylated, > 200 nt) mammalian non-coding RNAs were discovered (Guttman et al.
2009). This library of sequences will be screened computationally for potential miRNA
target mimics, scoring for the prevalence and quality of miRNA binding sites. Candidates
will be assessed with respect to their expression profile and subcellular localization to
determine whether these RNAs might plausibly encounter the miRNA(s) whose sites they
contain. Any prospective sponge RNAs will be assayed for interaction with the
miRNA(s) in question and for the ability to derepress other targets of the same miRNA
seed(s) in their natural context.
In Chapter 4 we used a bidirectional dual color reporter to assay miRNA activity in HeLa
cell lines, and observed dramatic variation in target repression among individual cells. To
what extent does miRNA activity vary amongst different cells in animal tissue? To
address this question we plan to apply the same reporter system to measure miRNA
activity in vivo. One approach is to generate tumors stably expressing the eYFP-mCherry
constructs from chromosomal insertions. This could be done by subcutaneous xenograft
in immunocompromised mice using stable Tet-On HeLa cell lines with multi-site miR-20
reporters or the N = 0 control reporter. Tissue sections from the tumors could be imaged
by fluorescence microscopy, and the relative expression of the mCherry target reporter to
the eYFP internal control would serve as an indication of miRNA activity. The
mCherry:eYFP ratio in miR-20 targeted tumors would be normalized to the same ratio
from the N = 0 control tumors generated contralaterally in the same animal. Another
approach is to generate transgenic mice expressing the dual color reporters throughout the
body. This approach could provide information about miRNA activity in healthy tissue
and at different stages of the animal's development. Another advantage is the option to
express the miRNA reporter genes from isogenic chromosomal insertions: a system
developed in the Jaenisch laboratory allows for a reporter construct to be site-specifically
inserted immediately downstream of the ColA1 collagen locus in murine embryonic stem
(mES) cells that also stably express the rtTA transcription factor for tet inducibility
(Beard et al. 2006). mES cell clones with single-copy integration of miRNA-targeted or
control reporters can be used to generate the transgenic mice. Previously this system was
shown to achieve robust inducible fluorescent reporter expression in the majority of cells
in the liver, spleen, thymus, intestine, and skin, and measurable expression in many more
organs and cell types. Blood cells and tissue sections could be harvested for analysis by
flow cytometry and fluorescence microscopy to measure the relative mCherry and eYFP
expression. Transgenic miRNA reporter mice could be crossed to various genetic mutants
to explore the effects of different signaling environments on miRNA activity, for example
in mouse models of cancer.
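The normalization described above, in which the mCherry:eYFP ratio of a targeted reporter is divided by the same ratio from the N = 0 control, amounts to the following arithmetic. The function and argument names are hypothetical, and the sketch assumes background-subtracted fluorescence intensities in consistent units.

```python
def normalized_target_ratio(mcherry_target, eyfp_target,
                            mcherry_control, eyfp_control):
    """mCherry:eYFP ratio of the targeted reporter relative to the N = 0 control.

    A value near 1 suggests little miRNA activity on the reporter; a value
    well below 1 suggests repression. Inputs are assumed to be positive,
    background-subtracted intensities.
    """
    target_ratio = mcherry_target / eyfp_target
    control_ratio = mcherry_control / eyfp_control
    return target_ratio / control_ratio
```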
In Chapter 5, one of the prevalent miRNA-target network motifs we described is the
incoherent feedforward loop. In its stereotypical form, this motif consists of a transcription
factor that induces both a protein-coding gene and a miRNA that represses that gene
(Tsang et al. 2007). To our knowledge this motif has not been experimentally tested to
see if it performs the following three predicted functions: (1) fine-tuning of protein output
compared to equal transcriptional induction of the protein-coding gene without
miRNA-mediated repression; (2) reduction in stochastic fluctuations in protein output compared to
transcriptional induction that produces the same mean protein output without
miRNA-mediated repression; (3) production of a timed pulse of target protein whose attenuation
phase depends on miRNA-mediated repression. To test these predictions, we will adapt
the bidirectional tet-inducible fluorescence reporter system so as to express both a protein
reporter and a miRNA that targets the reporter. Nuclear-localized eCFP serves as the
quantitative reporter of protein output; from the other side of the bidirectional promoter,
the precursor for the liver-specific miRNA miR-122 is or is not inserted; finally, the
eCFP 3' UTR contains seven bulged binding sites for miR-122 or no sites (Figure 1). The
incoherent feedforward loop is constituted when the construct containing both miRNA
and eCFP target sites is expressed in HeLa or mES cells expressing the rtTA transcription
factor in the presence of doxycycline. Versions lacking the miRNA or the target sites
remove a link from the loop such that eCFP is induced without being regulated by
miRNA. miR-122 is chosen as it is not expressed in the cell lines to be used; this should
provide a clean background and a large dynamic range of inducible expression. The
tet-inducible promoter modulates transcription rate in a manner that can be finely tuned by
varying the doxycycline concentration. Stable lines will be assayed by flow cytometry
and fluorescence microscopy. To assay whether the incoherent feedforward loop
generates a pulse of eCFP expression, fluorescence microscopy images of live cells
induced with doxycycline will be acquired at timed intervals over the course of at least
24-48 hours. To assay whether fluctuations in eCFP are buffered by miRNA repression,
cell lines expressing constructs with and without miRNA or target sites will be induced
with different doxycycline concentrations such that the mean eCFP expression between
samples of cells is equal; then the variance of each cell population will be compared.
Finally, the motif in which miRNA is induced in concert with target mRNA will be
compared to a scenario in which the miRNA is constitutively expressed. We expect that
the incoherent feedforward loop will still produce a target expression threshold, but one
with a somewhat flattened ultrasensitive transition where repression is relatively weaker
at lower target mRNA production levels and relatively stronger at higher target mRNA
production levels.
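The proposed noise comparison, in which populations are matched for mean eCFP expression and then compared for variance, could be summarized per population as sketched below. The choice of the coefficient of variation as the noise measure is an assumption, and the two intensity lists are invented placeholder values, not measurements.

```python
from statistics import mean, pvariance


def noise_summary(intensities):
    """Mean, population variance, and coefficient of variation of per-cell signal."""
    m = mean(intensities)
    var = pvariance(intensities, mu=m)
    return {"mean": m, "variance": var, "cv": var ** 0.5 / m}


# Hypothetical per-cell eCFP intensities from two doxycycline-matched populations.
WITH_LOOP = [90.0, 95.0, 100.0, 105.0, 110.0]      # miRNA and target sites present
WITHOUT_LOOP = [60.0, 80.0, 100.0, 120.0, 140.0]   # no miRNA or no target sites

if __name__ == "__main__":
    for label, cells in (("with loop", WITH_LOOP), ("without loop", WITHOUT_LOOP)):
        s = noise_summary(cells)
        print(f"{label}: mean={s['mean']:.1f} variance={s['variance']:.1f} cv={s['cv']:.3f}")
```

The comparison is only meaningful once the means agree, which is why the doxycycline concentration would be adjusted separately for each construct before the variances are compared.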
In 2010 the study of miRNAs and other RNAi pathways is flourishing. The work in this
thesis suggests that there is still fundamental knowledge about the molecular biology of
miRNA regulation that can be gleaned from simple experimental models such as HeLa
cells expressing artificial target reporters. In the future, we look forward to the dissection
of miRNA target networks and functions in more physiological contexts.
Acknowledgments
John Tsang provided helpful guidance for testing the incoherent feedforward loop.
Evgeny Kiner, an undergraduate research student, helped construct the reporters for the
incoherent feedforward loop.
References
Beard C, Hochedlinger K, Plath K, Wutz A, Jaenisch R. Efficient method to generate
single-copy transgenic mice by site-specific integration in embryonic stem cells. Genesis
44, 23-28 (2006).
Buchler NE, Louis M. Molecular titration and ultrasensitivity in regulatory networks. J.
Mol. Biol. 384, 1106-1119 (2008).
Guttman M, Amit I, Garber M, French C, Lin MF, Feldser D, Huarte M, Zuk O, Carey
BW, Cassady JP, Cabili MN, Jaenisch R, Mikkelsen TS, Jacks T, Hacohen N, Bernstein
BE, Kellis M, Regev A, Rinn JL, Lander ES. Chromatin signature reveals over a
thousand highly conserved large non-coding RNAs in mammals. Nature 458, 223-227
(2009).
Seitz H. Redefining microRNA targets. Curr. Biol. 19, 870-873 (2009).
Tsang J, Zhu J, van Oudenaarden A. MicroRNA-mediated feedback and feedforward
loops are recurrent network motifs in mammals. Mol. Cell 26, 753-767 (2007).
Figures
Figure 1. Experimental model for an incoherent feedforward loop.
In the presence of rtTA transcription factor and doxycycline, a bidirectional tet-inducible
construct drives expression of the eCFP reporter and miR-122 precursor. miR-122 represses
eCFP through interaction with multiple bulged binding sites in the 3' UTR.
[Figure 1 diagram. Recoverable labels: rtTA (+ dox), TRE, pre-miR-122, NLS-eCFP, miR-122 sites, miR-122, eCFP.]
Appendix. Genome-wide dissection of microRNA functions and
co-targeting networks using gene set signatures
This chapter was written by John S. Tsang and Margaret S. Ebert and edited by
Alexander van Oudenaarden.
This article was published in Molecular Cell vol. 38 pp. 140-153 (2010). Permission to
use the full article was obtained from Elsevier.
MicroRNAs (miRNAs) are emerging as important regulators of diverse biological
processes and pathologies in animals and plants. Though hundreds of human
miRNAs are known, only a few have known functions. Here we predict human
miRNA functions by using a new method that systematically assesses the statistical
enrichment of several miRNA targeting signatures in annotated gene sets such as
signaling networks and protein complexes. Some of our top predictions are
supported by published experiments, yet many are entirely new or provide
mechanistic insights to known phenotypes. Our results indicate that coordinated
miRNA targeting of closely connected genes is prevalent across pathways. We use
the same method to infer which miRNAs regulate similar targets and provide the
first genome-wide evidence of pervasive co-targeting, where a handful of "hub"
miRNAs are involved in a majority of co-targeting relationships. Our method and
analyses pave the way to systematic discovery of miRNA functions.
Introduction
MicroRNAs (miRNAs) regulate diverse biological processes in animals and plants
(Bushati and Cohen 2007) and are among the most abundant regulatory factors in the
human genome, comprising 3-5% of known human genes (Griffiths-Jones et al. 2008).
miRNAs recognize target mRNAs by imperfect base pairing to sites in the 3' untranslated
region (3' UTR), usually with perfect pairing of the miRNA seed region (nucleotides
2-8), ultimately leading to translational repression and/or mRNA degradation (Bushati and
Cohen 2007). Thousands of human genes are predicted to be targeted by miRNAs
(Rajewsky 2006), suggesting that miRNAs play a pervasive role in the regulation of gene
expression.
Although hundreds of human miRNAs have been identified and new ones are continually
being discovered (Griffiths-Jones et al. 2008), the function of most miRNAs remains
unknown. Increasingly, miRNA expression changes are being linked to phenotypes, but
the mechanistic role of the miRNA in the underlying biological network is often unclear.
Given that many human miRNAs can target up to thousands of genes, how often do
miRNAs target a set of related genes to regulate a specific pathway or process? Though
recent studies show that a few miRNAs have pathway-specific functions (Xiao and
Rajewsky 2009), earlier work suggests that miRNAs primarily serve to fine-tune and
confer robustness upon the expression of many genes (Bartel and Chen 2004, Farh et al.
2005, Stark et al. 2005).
The prevalence of multiple miRNAs targeting the same gene ("co-targeting") is also
unclear. While many genes contain putative binding sites for multiple miRNAs (Krek et
al. 2005, Stark et al. 2005), many putative sites may not be functional in vivo. More
specifically, the combinations of miRNAs that function together by regulating common
targets are unknown. Knowledge of such co-targeting relationships would also enable one
to infer a miRNA's function from the function of its co-targeting miRNAs.
Typically miRNA function is predicted by assessing whether the predicted targets of a
given miRNA are enriched for particular functional annotations. Such an approach has
several limitations: (1) target prediction is imperfect and can lead to spurious targets
(Rajewsky 2006); (2) having a subset of one's favorite pathway genes in the putative
target set does not necessarily mean that the miRNA functions in the pathway; (3)
predicted target sets are often so large (hundreds to thousands of genes) and have such
heterogeneous functional annotations that standard algorithms are not sufficiently
sensitive to make high-confidence predictions. Rather than progressing from a miRNA to
a potentially spurious target set that may or may not have enriched function, here we
introduce a computational method called mirBridge, which starts with a gene set of
known function, then assesses whether functional sites for a given miRNA are enriched in
the gene set compared to random gene sets with similar properties.
We apply mirBridge to a variety of annotated gene sets for signaling pathways, diseases,
drug treatments, and protein complexes. We also use mirBridge to infer miRNA pairs
that tend to function together by regulating common targets and use the results to
assemble a miRNA-miRNA co-targeting network. Together, our analyses provide: (1)
hundreds of miRNA function predictions, many of which are supported by published
experiments; (2) genome-wide evidence that many miRNAs coordinately regulate
multiple components of pathways or protein complexes; and (3) evidence that miRNA
co-targeting is highly prevalent, with a small number of "hub" miRNA families involved
in a large fraction of the co-targeting interactions. Both the mirBridge method and the
predictions it has generated can serve as important resources for the future experimental
dissection of miRNA functions.
Results
mirBridge: linking miRNAs to gene sets
Many gene sets contain tens to hundreds of putative targets for any particular miRNA.
However, for a variety of reasons (e.g. mRNA secondary structure occludes binding, or
the miRNA and the target are not expressed together) many target sites are not functional
in vivo. The goal of mirBridge is to infer whether an unusually large proportion and
number of putative target sites for a miRNA (m) in a given gene set (G) are likely to be
functional in vivo. Toward this end, mirBridge computes a score by combining the results
of three statistical tests that evaluate different aspects of likely functional target-site
enrichment in G. It is essential that the enrichment of sites in G be compared to
enrichment in appropriate control gene sets. Below we describe the individual tests and
the method for constructing the control gene sets (see Supplemental Experimental
Procedures for details).
The following definitions are essential to the methodology of mirBridge. First, any gene
with one or more seed-matched sites for m in its 3' UTR is deemed a "putative target."
Second, seed-matched sites can be classified into two categories (Figure 1A): "conserved
sites" (CS) are sites that are conserved across mammalian genomes; "high-context
scoring sites" (HCS) are sites with a context score above a predefined threshold. The
context score reflects the likelihood of a seed-matched site to confer repression based on
several features, including the distance of the site from the stop codon, accessibility of the
site based on secondary structure, and the extent of base pairing beyond the seed
(Grimson et al. 2007).
The first test used by mirBridge, called "conservation enrichment signature" (CE), infers
whether the number of CS in G is significantly higher than that of random gene sets
containing the same number of putative targets as G. This test is similar to evaluating
whether the sites have evolved at a slower rate compared to random putative target sets,
but is fundamentally different from prior tests that utilize sequence conservation (Lewis
et al. 2005, Stark et al. 2005) (see Supplemental Experimental Procedures). The second
test, called "context-score signature" (CTX), evaluates whether the number of HCS is
significantly higher than that of random gene sets containing the same number of putative
targets as G. The CTX test is designed to detect enrichment of sites in G that are likely
functional but not necessarily conserved. The third test, called "site occurrence signature"
(OC), evaluates whether the number of putative target sites in G is unusually high
compared to random gene sets containing the same number of genes. While target site
abundance alone is not necessarily indicative of functional targeting by m, functional
targeting enrichment becomes a likely scenario even when G tests as moderately
significant for the CE and/or CTX tests. Note that both CE and CTX are based on
comparison with random gene sets with the same number of putative targets to detect
enrichment in the proportion rather than the number of CS or HCS. This ensures that the
comparisons are valid, as gene sets with more putative targets tend to have more CS or
HCS. Because true positives are more likely than false positives to test as simultaneously
significant across the tests, we combine the three tests and form a composite score
("OC-CE-CTX") to increase sensitivity without sacrificing specificity.
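Each of the three tests compares a count observed in G with the same count in matched random gene sets. A generic way to turn such a comparison into a p value is sketched below. This is an assumption about the general approach, not the published mirBridge implementation: the function name and the add-one correction are illustrative choices.

```python
def empirical_p_value(observed_count, random_set_counts):
    """One-sided empirical p value for enrichment against random gene sets.

    observed_count: number of sites (e.g. CS, HCS, or putative target sites) in G.
    random_set_counts: the same count in each matched random gene set.
    The add-one correction keeps the estimate above zero when no random set
    reaches the observed count (a common convention, assumed here).
    """
    at_least = sum(1 for c in random_set_counts if c >= observed_count)
    return (at_least + 1) / (len(random_set_counts) + 1)
```

For example, an observed count exceeded by none of 999 random sets would receive a p value of 0.001 under this convention.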
We developed a nearest-neighbor gene sampling algorithm, motivated by the principle of
kernel-based density estimators (Wegman 1972), to generate random gene sets that are
similar to the input gene set with respect to general conservation level, 3' UTR length,
and GC content, which primarily bias the CE, OC, and CTX tests, respectively.
Simultaneous adjustment is particularly important because these factors are correlated
with each other across genes. Specifically, for the OC test, comparable random gene sets
are generated by replacing each member of G with a randomly drawn gene that has
similar GC content, 3' UTR length, and general conservation level (Figure 1B). To ensure
that the number of putative targets in the random gene sets is the same as that in G for the
CE and CTX tests, the same nearest-neighbor procedure is used, but only putative targets
in G are replaced by random putative targets (Figure 1C).
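The replacement step for the OC test can be sketched as a nearest-neighbor draw in a standardized feature space. Everything specific here is an assumption made for illustration: the z-score scaling, the Euclidean distance, the neighborhood size `k`, and the gene names and feature values are hypothetical, and the actual procedure is described in the Supplemental Experimental Procedures.

```python
import math
import random


def standardize(features):
    """Z-score each feature column so that no single feature dominates distance."""
    genes = list(features)
    n_dim = len(next(iter(features.values())))
    scaled = {g: [] for g in genes}
    for d in range(n_dim):
        column = [features[g][d] for g in genes]
        mu = sum(column) / len(column)
        sd = math.sqrt(sum((x - mu) ** 2 for x in column) / len(column)) or 1.0
        for g in genes:
            scaled[g].append((features[g][d] - mu) / sd)
    return scaled


def matched_random_set(gene_set, features, k=3, rng=None):
    """Replace each gene with a random draw from its k nearest neighbors."""
    rng = rng or random.Random()
    scaled = standardize(features)
    drawn = []
    for gene in gene_set:
        others = [g for g in scaled if g != gene]
        others.sort(key=lambda g: math.dist(scaled[g], scaled[gene]))
        drawn.append(rng.choice(others[:k]))
    return drawn


# Hypothetical genes: (general conservation level, 3' UTR length in nt, GC content).
FEATURES = {
    "geneA": (0.90, 1200, 0.45), "geneB": (0.88, 1250, 0.46),
    "geneC": (0.20, 300, 0.60), "geneD": (0.25, 320, 0.58),
    "geneE": (0.85, 1100, 0.44), "geneF": (0.30, 350, 0.61),
}

if __name__ == "__main__":
    print(matched_random_set(["geneA", "geneC"], FEATURES, k=2))
```

For the CE and CTX tests the same draw would be restricted to putative targets, so that the random sets contain the same number of putative targets as G.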
Finally, to obtain the OC-CE-CTX p value, the p values of the individual tests are
combined using a customized version of the inverse-normal method that corrects for
dependencies among tests (Joachim 1999). When multiple gene sets and/or miRNAs are
tested simultaneously, multiple hypothesis testing is corrected by computing the false
discovery rate (FDR) using the q-value method (Storey and Tibshirani 2003). "FDR" and
"q value" are used interchangeably below.
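The combination step can be illustrated with the generic equal-weight inverse-normal method, adjusted for a common correlation among tests. This is a sketch of the general technique only; the customized version used by mirBridge is not specified here, and the correlation parameter `rho` and the function name are assumptions.

```python
from math import sqrt
from statistics import NormalDist


def combine_p_values(p_values, rho=0.0):
    """Combine one-sided p values with the inverse-normal method.

    Each p value is mapped to a z score, the z scores are summed, and the sum
    is rescaled by its standard deviation under an assumed common pairwise
    correlation rho (rho = 0 gives the independent-test case).
    Inputs must lie strictly between 0 and 1.
    """
    std_normal = NormalDist()
    z_scores = [std_normal.inv_cdf(1.0 - p) for p in p_values]
    n = len(z_scores)
    combined_z = sum(z_scores) / sqrt(n + n * (n - 1) * rho)
    return 1.0 - std_normal.cdf(combined_z)


if __name__ == "__main__":
    oc, ce, ctx = 0.04, 0.02, 0.10  # hypothetical per-test p values
    print(combine_p_values([oc, ce, ctx], rho=0.0))
    print(combine_p_values([oc, ce, ctx], rho=0.5))
```

Assuming positive correlation makes the combined p value less extreme than the independent case, which is the purpose of correcting for dependencies among the tests.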
Besides 3' UTR length, GC content, and general conservation, other less apparent factors
could bias mirBridge results, but their effects are likely small (see Supplemental
Experimental Procedures). The statistical model in mirBridge was also designed to
incorporate additional factors if needed; in principle, any number of factors can be
accounted for by our nearest-neighbor sampling procedure.
mirBridge is fundamentally different from testing whether the number of predicted
miRNA targets in a gene set is significantly higher than expected using the Fisher Exact
Test (FET), a standard way to assess the significance of gene set overlaps. First,
mirBridge takes gene set properties into account; second, it combines different and
important biological characteristics of target sites; and finally, it uses metrics (CE and
CTX) that focus on the proportion of likely functional target sites instead of the number
of predicted target overlaps. In fact, mirBridge has superior sensitivity and specificity
compared to FET as shown in the applications below.
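For reference, the FET baseline being compared against can be written as a one-sided hypergeometric tail probability for the overlap between a gene set and a predicted target set. The function and parameter names below are hypothetical, and this is a sketch of the standard test rather than code from this study.

```python
from math import comb


def fisher_exact_enrichment(n_genes, n_targets, set_size, overlap):
    """One-sided Fisher exact p value for gene set overlap.

    n_genes: size of the background gene universe.
    n_targets: predicted targets of the miRNA in the universe.
    set_size: number of genes in the gene set.
    overlap: predicted targets observed in the gene set.
    Returns P(overlap or more targets) under random sampling without replacement.
    """
    total = comb(n_genes, set_size)
    upper = min(n_targets, set_size)
    tail = sum(
        comb(n_targets, i) * comb(n_genes - n_targets, set_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total
```

This test uses only the overlap count, so it cannot distinguish gene sets by 3' UTR length, GC content, conservation level, or site quality, which are the properties that mirBridge is designed to take into account.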
Inferring human miRNA functions
To link human miRNA families (miRNAs with a shared seed sequence) to functions, we
applied mirBridge to gene sets from (1) canonical signaling pathways from MSigDB
(Subramanian et al. 2005); (2) KEGG (Kanehisa and Goto 2000); (3) human protein
complexes from the CORUM database (Ruepp et al. 2008); (4) gene co-expression
modules (Segal et al. 2004); (5) Gene Ontology (GO) Biological Process; (6) GO
Component; and (7) GO Function (Ashburner et al. 2000). At a FDR cutoff of 0.2,
mirBridge predicts 185, 128, 1198, 456, 432, 71, and 175 distinct miRNA-function
associations, respectively (Tables S1-S7). Most predictions implicate pathways or protein
complexes with multiple putative targets for the miRNA, whereas some have only one (or
very few) putative targets containing multiple high-quality sites (e.g. miR-33 and statin
pathway). The latter fits the paradigm implied in some recent papers where a miRNA
phenotype seems to be accounted for by one (or just a few) targets: "miR-X regulates
process Y by targeting gene Z." However, the prevalence of coordinate targeting of
multiple related genes suggests that most miRNAs exert their phenotypic effects by
targeting multiple network components.
To facilitate a succinct discussion of such a large set of predictions, Tables 1 and 2 show
a selection of predictions that either already have support from the literature or wherein
the predicted pathway (1) has known activity in the tissue where the miRNA is known to
be expressed; or (2) represents core cellular processes (e.g. "apoptosis") and has a large
number of putative targets for the miRNA. We also favor predictions that reoccur in
closely related or synonymous gene sets, e.g. "cell cycle" and "G1 to S transition."
mirBridge is sensitive to biological signals and can independently uncover known
miRNA functions
Although mirBridge is not trained on any dataset of known miRNA functions, several of
the top hits already have experimental support in the literature (Table 1), such as the
association of miR-16 with the cell cycle, Wnt signaling, and prostate cancer (Calin et al.
2005, Linsley et al. 2007) (Figure S1A). This is also an example in which mirBridge links
a disease and the pathways underlying its pathology: miR-16 has been shown to work
through the Wnt pathway to function as a tumor suppressor in prostate cancer (Bonci et
al. 2008). Analogously, miR-7 hits the ErbB pathway in glioblastoma (Kefas et al. 2008,
Webster et al. 2009); miR-221/222 hits the estrogen signaling pathway in breast cancer
(Miller et al. 2008, Zhao et al. 2008); and let-7 hits the G1-S cell-cycle pathway in breast
cancer (Schultz et al. 2008, Yu et al. 2007). mirBridge can also implicate a pathway of
interest given the tissue specificity of a miRNA: miR-7 is predicted to regulate the insulin
receptor pathway and is known to be highly expressed in insulin-producing cells of
pancreatic islets (Bravo-Egana et al. 2008, Correa-Medina et al. 2009, Joglekar et al.
2009). mirBridge also independently uncovered feedback loops: miR-146 is predicted to
target several upstream signaling genes in the NF-kB pathway, whereas its transcription
is known to be activated by NF-kB (Taganov et al. 2006) (Figure S1B). Another notable
prediction supported by the literature is miR-34 targeting BCL2 and several additional
anti-apoptotic genes in the BAD pathway (Chang et al. 2007, Cloonan et al. 2008, He et
al. 2007). This prediction provides an attractive hypothesis for how miR-34 upregulation
could lead to apoptosis. In sum, these results are reassuring and indicate that mirBridge
can capture biologically relevant signals.
mirBridge is significantly more sensitive than the standard approach of evaluating gene
set overlaps using FET. For instance, when FET is applied to the canonical pathway gene
sets, only five predictions can be made at the 0.2 FDR cutoff (Table S8); all five have
FDRs greater than 0.18, and only one has support from the literature (miR-16 and the
Gleevec pathway, given that miR-16 is associated with leukemia). Furthermore, none of
the top mirBridge predictions supported by published experiments were uncovered. For
example, for miR-16, none of the cell-cycle related pathways are ranked near the top,
even if we ignore the statistical significance and order the pathways within each miRNA
family by their q values (the top cell-cycle related entry has rank 54, q = 0.55). These
results suggest that mirBridge can better uncover biologically relevant signals than FET.
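For reference, the FET baseline discussed above reduces to a hypergeometric tail probability on the overlap between a predicted target set and a gene set. The sketch below is a minimal one-sided version with invented counts; it is not the authors' implementation, and the function and argument names are hypothetical.

```python
from math import comb


def overlap_p_value(population, n_targets, n_gene_set, overlap):
    """One-sided Fisher's exact test: P(overlap >= observed) under random draws."""
    total = comb(population, n_gene_set)
    upper = min(n_targets, n_gene_set)
    tail = sum(
        comb(n_targets, k) * comb(population - n_targets, n_gene_set - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers: 20 genes, 5 predicted targets, a 4-gene pathway, 3 genes shared.
p_overlap = overlap_p_value(population=20, n_targets=5, n_gene_set=4, overlap=3)
```

Note that this test sees only the count of shared genes; it has no access to site conservation or context, which is the limitation the comparison above is meant to expose.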
It is important to note that the comprehensiveness of our predictions is dependent on the
gene sets used. Some known miRNA functions are not in our predicted list because the
appropriate gene set(s) were not included in the analysis. For example, miR-200 is known
to function in the epithelial-mesenchymal transition (Burk et al. 2008, Gregory et al.
2008, Korpal et al. 2008, Park et al. 2008), but none of the gene sets used in our analysis
captures this process. However, when mirBridge is applied to genes whose function
annotation in the GeneCards database includes "epithelial-mesenchymal transition," miR-141/200a has the lowest q value among all miRNAs (q = 0.08).
To further assess the ability of mirBridge to predict known miRNA functions
independently, we compiled nine additional miRNA phenotypes from the literature and
applied mirBridge to seemingly relevant gene sets from KEGG or GeneCards (Table
S10). Of nine phenotypes, four miRNA-gene set p values are significant and two are
marginally significant (Table 3). In a multiple hypothesis testing context in which all
miRNAs are tested simultaneously for the phenotype gene set, however, only two would
have been predicted at an FDR cutoff of 0.2 even though the desired miRNA ranks at or
near the top for all four of the significant cases. This suggests that, for these specific gene
sets, mirBridge is sensitive to the relevant biological signals but lacks sufficient statistical
power after multiple-testing correction. It follows that the hundreds of low-FDR
predictions that are made by mirBridge are compelling candidates for experimental
follow-up given that these emerged in the simultaneous testing of thousands of miRNA-gene set combinations. We expect the statistical power of mirBridge to continue to
improve as additional genomes and knowledge of miRNA-target interactions become
available.
We also sought to understand cases where mirBridge failed to predict the correct
functions. Closer examination of the three failed cases in Table 3 suggests that, for let-7
and miR-133, the gene sets used do not capture the biology relevant to the miRNA
targeting. The cell cycle may be a key pathway through which let-7 exerts its effect on
lung cancer (Esquela-Kerscher et al. 2008, Kumar et al. 2008, Schultz et al. 2008), but
the non-small cell lung cancer gene set lacks most cell cycle genes and other postulated
targets such as HMGA2 and MYC (let-7 does hit the G1-S cell-cycle transition pathway;
Table 1). Similarly, for miR-133 and cardiac hypertrophy, two out of the three known
targets relevant to the phenotype are not in the GeneCards set (CDC42 and WHSC2; Care
et al. 2007). Finally, for miR-122, it turns out that inhibition of miR-122 by antagomir
treatment tends to downregulate, rather than upregulate, cholesterol biosynthetic genes
(Krutzfeldt et al. 2005), suggesting that the effect of miR-122 on cholesterol biosynthetic
genes is indirect. Thus, the insignificant mirBridge p value for miR-122 and cholesterol
biosynthesis genes is not surprising.
mirBridge provides many new miRNA function predictions
The majority of mirBridge predictions are as yet untested (Tables 2 and S1-S7). Some
pathways predicted in common for multiple miRNAs seem particularly compelling
because the miRNAs are known to be co-regulated. For example, the apoptosis pathway
is predicted for miR-23 and -24, which are different in sequence but are co-expressed
from the same cluster (Chhabra et al. 2009). Some predictions seem reasonable based on
the function of the miRNA host gene. For example, the statin/cholesterol homeostasis
pathway is linked to miR-33, which is embedded in an intron of a transcription factor
(SREBP2) that regulates cholesterol synthesis and uptake (Figure S1C). Other predictions
seem plausible based on known miRNA functions with similar developmental placement
and timing. For example, axon guidance pathways are predicted for miR-124, which has
already been shown to positively regulate neurogenesis (Cheng et al. 2009, Visvanathan
et al. 2007). Consistently, miR-124 was linked to the SNARE protein complex as it
putatively targets VAMP3, a component of SNARE, via three conserved and high
context-scoring sites; VAMP3 is known to function in the docking and fusion of synaptic
vesicles with the presynaptic membrane (Sudhof 2004).
mirBridge predictions can also provide mechanistic interpretations of published
experiments. For example, it is known that activation of PIP3 signaling leads to the
hypertrophic response in cardiac myocytes and that miR-1 expression is down-regulated
upon hypertrophic stress (Care et al. 2007, Heineke and Molkentin 2006, Sayed et al.
2007). mirBridge links miR-1 to the PIP3 pathway, and the putative miR-1 targets in the
pathway are all pro-hypertrophic except PTPN1 (Table 1), suggesting that the downregulation of miR-1 helps to drive pathway activation (Figure 2). Post-transcriptional
repression by miR-1 could allow these genes to be transcribed at higher (or leaky) levels
without triggering a hypertrophic response, such that a reduction in miR-1 expression
would suffice to rapidly activate signaling at multiple levels. For example, de-repression
of the most downstream factors (e.g. CDC42) could quickly lead to sarcomere
remodeling, a first step in the hypertrophic response (Nagai et al. 2003). Increasing levels
of upstream factors coupled with positive feedback loops would intensify the response.
We envision that a useful application of mirBridge would be to probe a function of
interest guided by the known expression profile of miRNAs. Because we are interested in
neurotransmitter pathways, we applied mirBridge to manually curated gene sets for these
pathways (see Supplemental Experimental Procedures). miR-218, a known neuronal
miRNA (Sempere et al. 2004), is the most and second-most significant hit for GABA and
glutamate gene sets, respectively (q = 0.025 and 0.033). That these two neurotransmitter
activities may be regulated by the same miRNA is intriguing given that glutamate and
GABA are, respectively, the major excitatory and inhibitory neurotransmitters and that
the latter can be enzymatically converted from the former. In addition, we tested a gene
set for synaptic vesicle formation because miR-218 is enriched at synapses of
hippocampal neurons (Siegel et al. 2009). miR-135, a brain-enriched miRNA (Sempere et
al. 2004), and miR-218 are the top two hits (q = 0.000003 and 0.024, respectively). In
sum, the mirBridge hits for these gene sets extend early experimental findings to
implicate miR-218 as a potential regulator of neuronal activity at hippocampal synapses.
miRNA co-targeting is prevalent
Our miRNA-pathway map indicates that some miRNAs function in the same pathway(s)
by targeting a similar set of genes. Indeed, many miRNAs may function together (via
"co-targeting") to regulate target-gene expression. To assess the prevalence of co-targeting and infer which miRNAs are co-targeting partners, we next used sets of genes
likely regulated by particular miRNAs to create a miRNA-to-miRNA mapping.
Specifically, our inputs to mirBridge were the predicted target sets (PTS) of 73 deeply
conserved human miRNA families. We call a miRNA family Y a "co-targeting partner"
of a miRNA family X if at least one of Y's seed-matched sequences has a significant
mirBridge q value in the PTS of X and denote the relationship as "X->Y." We predicted
co-targeting relationships for all ordered pairs of the 73 families (73 × 72 = 5256
distinct pairs).
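The pair counts can be checked directly. In the sketch below the family names are placeholders; only the number of families (73) comes from the text. Ordered pairs treat X->Y and Y->X as separate tests, whereas an overlap-based comparison of two target sets is symmetric and so involves half as many pairs.

```python
from itertools import combinations, permutations

families = [f"family_{i}" for i in range(73)]  # placeholder family names

ordered_pairs = list(permutations(families, 2))    # (X, Y) and (Y, X) both kept
unordered_pairs = list(combinations(families, 2))  # each pair counted once

print(len(ordered_pairs), len(unordered_pairs))  # prints: 5256 2628
```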
Our results indicate that miRNA co-targeting is prevalent: 221 distinct X->Y co-targeting
relationships are inferred at an FDR cutoff of 0.2 (Table S11). A subset of these
predictions corresponds to miRNA genomic clusters (Yu et al. 2006), such as the miR-19b-2/106a cluster on Xq26.2 and the miR-17-18-19a-20-92 cluster on 13q31.3 (Table
S11). Co-targeting pairs in close genomic proximity are not surprising: these miRNAs are
polycistronic and co-expressed, and are thus likely to function together to regulate
common targets. In fact, clustered miRNAs are enriched for co-targeting relationships:
when X and Y are members of a genomic cluster, they are predicted as co-targeting
partners 25% of the time, compared to 3% when X and Y are not clustered.
Consistently, the median q value of clustered pairs is significantly lower than that of
unclustered ones (p < 2.1 × 10^-7, Mann-Whitney test; see Table S11 for the clusters
used in this analysis), indicating that our method for detecting co-targeting is sensitive,
specific, and capable of uncovering biologically relevant signals.
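The comparison of q values between clustered and unclustered pairs can be sketched as a rank-sum test. The version below is a rough illustration only: it uses a two-sided normal approximation without a tie correction (both are assumptions, since the text does not give these details), and the q values are invented.

```python
from math import erfc, sqrt


def mann_whitney_u(x, y):
    """Return (U statistic for x, two-sided p value by normal approximation)."""
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    z = (u - n1 * n2 / 2) / sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return u, erfc(abs(z) / sqrt(2))


clustered = [0.01, 0.05, 0.10, 0.15]          # hypothetical q values
unclustered = [0.30, 0.45, 0.60, 0.75, 0.90]  # hypothetical q values
u_stat, p_two_sided = mann_whitney_u(clustered, unclustered)
```

For samples this small an exact test would be preferable; the approximation is adequate only for the thousands of pairs considered in the analysis above.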
If our predictions reflect bona fide biological signals, we also expect a significant
percentage of the X->Y pairs to possess mutual co-targeting relationships, i.e. each
miRNA's putative binding sites would have a score below the FDR cutoff in the other
miRNA's PTS. Indeed, 96/221 (43%) of the X->Y predicted pairs do. Though the
remaining 57% of the X->Y pairs do not have the corresponding Y->X pairs falling
below the FDR cutoff, there is nonetheless a significant correlation between their q
values (Spearman correlation = 0.42 (p = 0); Figure S2). Also, the reciprocal (Y->X) q
values of significant X->Y pairs are lower than those of pairs with q values greater than
0.2 (p < 5 × 10^-14, Mann-Whitney test). The general reciprocation of co-targeting
scores indicates that a significant percentage of our predictions are specific and that the
signals we are detecting are likely biologically relevant.
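The reciprocity check described above amounts to asking, for each directed prediction, whether the reverse edge is also present. A minimal sketch with toy family labels (not taken from the data):

```python
def reciprocity(edges):
    """Fraction of directed edges (X, Y) whose reverse (Y, X) is also present."""
    edge_set = set(edges)
    mutual = sum(1 for (x, y) in edge_set if (y, x) in edge_set)
    return mutual / len(edge_set)


toy_edges = [("A", "B"), ("B", "A"), ("A", "C"), ("C", "D")]
toy_fraction = reciprocity(toy_edges)  # A->B and B->A are mutual: 2 of 4 edges
```

Applied to the predictions in the text, the same calculation corresponds to the 96 of 221 edges reported as mutual.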
We also tested whether co-targeting relationships could be inferred from gene set
overlaps, where the X->Y q value was computed using FET on the number of genes
shared between the PTSs of the miRNA family pair. This analysis failed to provide
informative results because almost all tested pairs have a significant q value: 2264 (86%)
and 2628 (100%) of the pairs have a q value of less than 0.05 by using the Bonferroni and
FDR correction, respectively. This suggests that a core set of genes are frequently
predicted as targets for many miRNA family pairs; these likely correspond to genes with
highly conserved 3' UTRs and/or low GC content, properties that favor a gene being
predicted as a target using TargetScan. This result strongly suggests that the degree of
PTS overlap is not sufficiently specific to detect authentic co-targeting relationships,
whereas mirBridge has superior specificity and is thus able to provide biologically
relevant signals, as shown above.
Network analysis of co-targeting interactions
Our co-targeting predictions can naturally be organized as a network in which the nodes
are miRNA families and the directed edges between nodes denote the X->Y predictions.
A network representation enables examination of connectivity patterns beyond pairwise
interactions. We first checked whether the edges in the network are evenly distributed
across nodes or concentrated around a few nodes ("hubs"). Strikingly, the edges
connecting the 10 most highly connected nodes (out of 69 nodes with at least one
adjacent edge) account for more than 55% (123/221) of the edges in the network (Figure
3A and Table S11). While overall the size of a miRNA family's PTS is correlated to its
connectivity ranking (p = 10^-6, Spearman correlation), this correlation becomes
insignificant when restricted to families with at least 900 predicted targets (p > 0.1).
Since only six of the top 40 most-connected families have fewer than 900 predicted
targets, the size of a miRNA family's PTS alone cannot explain the connectivity pattern
among the top 40 families. The hub miRNA families probably have functions in diverse
contexts. For example, some hubs have a large number of members and therefore are
likely to have more diverse functions depending on the spatial-temporal expression of
individual miRNAs (e.g. miR-93.hd/291-3p/294/295/302/372/373/520).
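The hub analysis can be sketched as a degree ranking followed by a count of edges that touch the top-ranked nodes. Because ten nodes can share at most 90 directed edges among themselves, the 123 edges reported above presumably include every edge with at least one hub endpoint; the sketch adopts that reading as an assumption, and the toy network is invented.

```python
from collections import Counter


def hub_edge_fraction(edges, top_k):
    """Return (hub set, fraction of edges with at least one endpoint in a hub)."""
    degree = Counter()
    for x, y in edges:
        degree[x] += 1  # out-edge
        degree[y] += 1  # in-edge
    hubs = {node for node, _ in degree.most_common(top_k)}
    covered = sum(1 for x, y in edges if x in hubs or y in hubs)
    return hubs, covered / len(edges)


toy_network = [("H", "a"), ("H", "b"), ("c", "H"), ("H", "d"), ("a", "b"), ("c", "d")]
hubs, fraction = hub_edge_fraction(toy_network, top_k=1)
```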
We reasoned that groups of tightly interconnected nodes might represent miRNAs that
perform similar functions. To identify such groups we used a graph clustering tool that
ignores edge weights to identify tightly interconnected nodes (Bader and Hogue 2003)
(Figure 3B). We find that subnetwork 1 has four families and is the largest and most
highly interconnected; three of the families (miR-17-5p, -130, -93.hd) are among the most
connected families (Figure 3A). This subnetwork is also well connected to subnetwork 3
(miR-18, -19, -181), probably because miR-17-18-19-20 are co-expressed from a
polycistronic transcript. The miR-17 cluster is known to be overexpressed in a number of
human cancers, including B-cell tumors, whereas miR-142 is also highly expressed in B
cells (Chen and Lodish 2005, Mendell 2008). Their shared PTS is enriched for genes in
developmental processes (p < 3.8 x 10a), consistent with the miR-17 cluster's function
in the development of B cells, the heart, and lungs (Mendell 2008, Ventura et al. 2008).
Our linking of the miR-142 and miR-130/301 families - whose functions are largely
unknown - to the miR-17 cluster suggests that these miRNA families also participate in
similar developmental and oncogenic processes.
Discussion
We have introduced a systematic method for inferring miRNA functions by assessing the
enrichment of likely functional target sites in gene sets. Key features of mirBridge
include combining test metrics that detect different aspects of functional targeting, and a
sampling algorithm for removing gene set biases to improve estimation of statistical
significance. Hundreds of human miRNA-function associations were inferred by
mirBridge; some are reassuringly supported by published experiments, but many are as yet untested and/or provide mechanistic insights beyond published data.
Our results provide hints about the general principles of miRNA-mediated regulation in
networks. While some miRNAs could act as global regulators by repressing up to
thousands of targets genome-wide (Lewis et al. 2005), many appear to have pathway-specific functions, and these miRNAs tend to target multiple genes in the same pathway.
Typically, the predicted targets of the miRNA are genes that drive pathway activity in a
coherent direction (e.g. miR-16 targeting of G1-to-S-promoting genes). Such coordinate
targeting could partially explain how individual miRNAs can be potent effectors of
pathway activity even though the amount of repression conferred by miRNAs tends to be
modest for any single target (Baek et al. 2008, Selbach et al. 2008, Xiao and Rajewsky
2009). As was observed earlier (Martinez et al. 2008, Tsang et al. 2007), some of our
predictions (e.g. miR-1) involve miRNAs mediating feedback and feedforward loops,
whose functions include protein homeostasis and signal amplification, respectively. For
example, miRNAs could be "master" regulators of pathways and thus serve as effective
therapeutic targets because positive feedbacks could amplify small changes in protein
concentration conferred by miRNA targeting of multiple genes. Our analysis also
indicates that miRNAs can function in, and mediate crosstalk among, multiple canonical
pathways, such as miR-16's potential roles across the cell cycle and Wnt pathways to
coordinately regulate cellular growth and proliferation.
mirBridge also facilitates context-specific target prediction: one can first predict which
pathways a miRNA regulates and then compile high-quality putative targets within a
pathway. This strategy may be especially effective for miRNAs that function in only a
few pathways, as targets predicted genome-wide may have low specificity (Lewis et al.
2005). Additional filtering can be used to strengthen the target predictions, for example,
by requiring that the putative target and the miRNA be significantly correlated in their
expression using miRNA-mRNA expression data sets (Lu et al. 2005) (Table S9).
In addition to providing functional links across miRNAs, our human miRNA-miRNA
map provides, to the best of our knowledge, the first genome-wide evidence that miRNA
co-targeting is prevalent, and that a handful of hub miRNA families are involved in a
large fraction of the co-targeting connections. The abundance of co-targeting further
suggests that while individual miRNAs may provide only modest levels of repression,
combinatorial targeting by multiple miRNAs (Krek et al. 2005) can potentially achieve a
wide range of target-level modulations. Given that multiple miRNAs are expressed at
different levels in any given cell type, individual genes can evolve combinations of
miRNA binding sites to optimize expression levels across cell types (Bartel and Chen
2004). miRNA target sites are short and could thus be acquired or lost relatively quickly
over evolution to fine-tune gene expression levels.
Designating a group of miRNAs as "co-targeting" does not necessarily imply that these
miRNAs are co-expressed so as to regulate their common targets at the same time and
place. In fact, the exact opposite is also likely: different miRNAs are responsible for
controlling a given set of targets in different contexts. In general, a combination of the
above scenarios is likely for individual cases, and additional data (e.g. miRNA and target
expression profiles) are needed to further dissect the mechanistic basis of individual co-targeting predictions.
mirBridge is currently limited to assessing enrichment at the level of miRNA families
using seed-matched motifs. But this is largely due to our lack of general understanding of
miRNA-target interaction beyond seed pairing and features captured by the context score.
In principle, the mirBridge methodology is general and can be applied to any
combinations of gene sets, sequence motifs, and site scoring metrics, including non-miRNA motifs, such as those involved in regulating mRNA stability. Given mirBridge's
ability to simultaneously correct for multiple gene set biases, and the increasing
availability of genomes and annotated gene sets, mirBridge is poised to serve as a key
resource for the comprehensive functional dissection of miRNAs and other regulatory
sequence motifs in genomes.
Experimental Procedures
Seed-matched site compilation. miRNA family memberships, 3' UTR sequences, seed-matched sites and their context scores and conservation status were downloaded from
TargetScan (http://www.targetscan.org/vert_40/). For each known human gene, the
number of seed-matched sites for each miRNA family, the number of those that are
conserved, and the context score were computed. Since the context score depends on the
full miRNA sequence, the context score for a miRNA family is defined as the average of
all human members of that family.
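As a small illustration of the family-level context score defined above, the sketch below averages per-member scores for a single site. The member labels and score values are hypothetical.

```python
from statistics import mean


def family_context_score(member_scores):
    """Average the context scores of one site across the human family members."""
    return mean(member_scores.values())


# Hypothetical context scores for one seed-matched site, keyed by family member.
site_scores = {"member_a": 70.0, "member_b": 80.0, "member_c": 90.0}
family_score = family_context_score(site_scores)
```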
mirBridge. The method as described in the text was implemented in Matlab. More details
and related discussions can be found in Supplementary Experimental Procedures.
miRNA function analysis. Canonical signaling pathway and KEGG gene sets were
downloaded from http://www.broad.mit.edu/gsea/msigdb/index.jsp. The cancer,
CORUM, and GO sets were downloaded from
http://robotics.stanford.edu/~erans/cancer/, http://mips.helmholtz-muenchen.de/genre/proj/corum, and NCBI Gene, respectively. To reduce noise and avoid
spurious annotations, we only used GO annotations with experimental and peer-reviewed
evidence. A miRNA-gene set prediction requires at least one of the miRNA seed motifs
(m2-8 and/or m7-A) to test as significant in the gene set. The q value reported for
individual miRNAs corresponds to the q value of the seed motif with the smaller p value.
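The reporting rule above can be written out as follows. This is a sketch under two assumptions: that "significant" means a q value at or below the FDR cutoff, and that each motif carries one p value and one q value per gene set. The motif labels and numbers are placeholders.

```python
def summarize_family(motif_results, fdr_cutoff=0.2):
    """motif_results maps a seed-motif label to a (p_value, q_value) pair.

    Returns (predicted, reported_q): a prediction is made if any motif passes
    the cutoff, and the reported q value belongs to the motif with the smaller p.
    """
    predicted = any(q <= fdr_cutoff for _, q in motif_results.values())
    best = min(motif_results, key=lambda motif: motif_results[motif][0])
    return predicted, motif_results[best][1]


toy_results = {"motif_1": (0.004, 0.15), "motif_2": (0.030, 0.40)}
predicted, reported_q = summarize_family(toy_results)
```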
miRNA family selection. The deeply conserved miRNAs are ones that are conserved
across human, mouse, rat, dog and chicken. We focused on these miRNAs because they
probably have (1) more conserved functions, (2) a larger number of targets compared to
less-conserved miRNAs, and (3) stronger conservation enrichment signals.
Target prediction. Targets were compiled for each miRNA by including genes with at
least one conserved seed-match (across human, mouse, rat and dog) or a seed-match with
a context score of greater than 68 in the 3' UTR (see Supplemental Experimental
Procedures). Predictions based on context score alone were included because functional
target sites can be imperfectly conserved. High-quality putative targets in gene sets
(Tables 1 and S1) were compiled using the same definition.
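The target definition above is a simple disjunction over seed-matched sites. In the sketch below each site is a (gene, conserved, context score) tuple; the gene labels and scores are invented, and the cutoff of 68 is taken from the text on whatever scale the context score is reported.

```python
def predicted_targets(sites, score_cutoff=68):
    """Genes with at least one conserved site or one site scoring above the cutoff."""
    return {
        gene
        for gene, conserved, context_score in sites
        if conserved or context_score > score_cutoff
    }


toy_sites = [
    ("GENE_A", True, 10.0),   # conserved, low score: included
    ("GENE_B", False, 75.0),  # not conserved, score above cutoff: included
    ("GENE_C", False, 68.0),  # not conserved, score not above cutoff: excluded
]
toy_targets = predicted_targets(toy_sites)
```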
X->Y predictions and analysis. mirBridge was applied to the predicted target set of each
miRNA family. Only the seed-matched motifs of the 73 families were scored. When both
seed-matched motifs of a miRNA family test as significant, the smaller q value is used
as the X->Y q value. Human miRNA clusters were obtained from (Yu et al. 2006).
Predicted target set overlap analysis. The number of overlaps between the predicted
target set of each miRNA-family pair was computed. The statistical significance was
computed using Fisher's exact test (see Supplemental Experimental Procedures).
Predicted target set and pathway overlap analysis. Similar to above except that (1) all
genes that are not predicted as a target for any miRNA were removed from the pathway
gene sets; and (2) the population size is taken as the number of genes that are predicted as
a target for at least one miRNA family and belong to at least one pathway.
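The two adjustments described above can be expressed with set operations. The sketch below uses invented gene and family labels and hypothetical function names; it shows only how the filtered pathways and the population size would be derived, not the subsequent test.

```python
def restrict_to_background(pathways, target_sets):
    """Drop non-target genes from each pathway and count the resulting population."""
    all_targets = set().union(*target_sets.values())
    filtered = {name: genes & all_targets for name, genes in pathways.items()}
    # Population: genes that are a predicted target and belong to some pathway.
    population = set().union(*filtered.values())
    return filtered, len(population)


toy_pathways = {"p1": {"G1", "G2", "G9"}, "p2": {"G2", "G3"}}
toy_target_sets = {"family_x": {"G1", "G2"}, "family_y": {"G3", "G4"}}
filtered_pathways, population_size = restrict_to_background(toy_pathways, toy_target_sets)
```

Here G9 is removed because no family targets it, and G4 does not count toward the population because it belongs to no pathway.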
Acknowledgments
We thank H. Fraser, D. Muzzey, M. Narayanan and M. Umbarger for comments on the
manuscript; J. Zhu for discussions; D. Bartel for the suggestion to examine co-targeting
by polycistronic miRNAs; M. Fang for help on importing gene sets. This work was
supported by an NIH Director's Pioneer Award to A.v.O.; J.T. was partially supported by
a doctoral scholarship from the NSERC of Canada; M.S.E. was supported by an HHMI
Predoctoral Scholarship and Paul and Cleo Schimmel Scholarship.
References
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski
K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S,
Matese JC, Richardson JE, Ringwald M, Rubin GM, Sherlock G. Gene ontology: tool for
the unification of biology. The Gene Ontology Consortium. Nat. Genet. 25, 25-29 (2000).
Bader GD, Hogue CW. An automated method for finding molecular complexes in large
protein interaction networks. BMC Bioinformatics 4, 2 (2003).
Baek D, Villén J, Shin C, Camargo FD, Gygi SP, Bartel DP. The impact of microRNAs
on protein output. Nature 455, 64-71 (2008).
Bartel DP, Chen CZ. Micromanagers of gene expression: the potentially widespread
influence of metazoan microRNAs. Nat. Rev. Genet. 5, 396-400 (2004).
Bonci D, Coppola V, Musumeci M, Addario A, Giuffrida R, Memeo L, D'Urso L,
Pagliuca A, Biffoni M, Labbaye C, Bartucci M, Muto G, Peschle C, De Maria R. The
miR-15a-miR-16-1 cluster controls prostate cancer by targeting multiple oncogenic
activities. Nat. Med. 14, 1271-1277 (2008).
Bravo-Egana V, Rosero S, Molano RD, Pileggi A, Ricordi C, Dominguez-Bendala J,
Pastori RL. Quantitative differential expression analysis reveals miR-7 as major islet
microRNA. Biochem. Biophys. Res. Commun. 366, 922-926 (2008).
Burk U, Schubert J, Wellner U, Schmalhofer O, Vincan E, Spaderna S, Brabletz T. A
reciprocal repression between ZEB1 and members of the miR-200 family promotes EMT
and invasion in cancer cells. EMBO Rep. 9, 582-589 (2008).
Bushati N, Cohen SM. microRNA functions. Annu. Rev. Cell Dev. Biol. 23, 175-205
(2007).
Calin GA, Ferracin M, Cimmino A, Di Leva G, Shimizu M, Wojcik SE, Iorio MV,
Visone R, Sever NI, Fabbri M, Iuliano R, Palumbo T, Pichiorri F, Roldo C, Garzon R,
Sevignani C, Rassenti L, Alder H, Volinia S, Liu CG, Kipps TJ, Negrini M, Croce CM. A
MicroRNA signature associated with prognosis and progression in chronic lymphocytic
leukemia. N. Engl. J. Med. 353, 1793-1801 (2005).
Care A, Catalucci D, Felicetti F, Bonci D, Addario A, Gallo P, Bang ML, Segnalini P, Gu
Y, Dalton ND, Elia L, Latronico MV, Hoydal M, Autore C, Russo MA, Dorn GW 2nd,
Ellingsen O, Ruiz-Lozano P, Peterson KL, Croce CM, Peschle C, Condorelli G.
MicroRNA-133 controls cardiac hypertrophy. Nat. Med. 13, 613-618 (2007).
Chan JA, Krichevsky AM, Kosik KS. MicroRNA-21 is an antiapoptotic factor in human
glioblastoma cells. Cancer Res. 65, 6029-6033 (2005).
Chang TC, Wentzel EA, Kent OA, Ramachandran K, Mullendore M, Lee KH, Feldmann
G, Yamakuchi M, Ferlito M, Lowenstein CJ, Arking DE, Beer MA, Maitra A, Mendell
JT. Transactivation of miR-34a by p53 broadly influences gene expression and promotes
apoptosis. Mol. Cell 26, 745-752 (2007).
Chen CZ, Lodish HF. MicroRNAs as regulators of mammalian hematopoiesis. Semin.
Immunol. 17, 155-165 (2005).
Cheng LC, Pastrana E, Tavazoie M, Doetsch F. miR-124 regulates adult neurogenesis in
the subventricular zone stem cell niche. Nat. Neurosci. 12, 399-408 (2009).
Chhabra R, Adlakha YK, Hariharan M, Scaria V, Saini N. Upregulation of miR-23a
approximately 27a approximately 24-2 cluster induces caspase-dependent and -independent apoptosis in human embryonic kidney cells. PLoS One 4, e5848 (2009).
Cloonan N, Brown MK, Steptoe AL, Wani S, Chan WL, Forrest AR, Kolle G, Gabrielli
B, Grimmond SM. The miR-17-5p microRNA is a key regulator of the G1/S phase cell
cycle transition. Genome Biol. 9, R127 (2008).
Correa-Medina M, Bravo-Egana V, Rosero S, Ricordi C, Edlund H, Diez J, Pastori RL.
MicroRNA miR-7 is preferentially expressed in endocrine cells of the developing and
adult human pancreas. Gene Expr. Patterns 9, 193-199 (2009).
Esquela-Kerscher A, Trang P, Wiggins JF, Patrawala L, Cheng A, Ford L, Weidhaas JB,
Brown D, Bader AG, Slack FJ. The let-7 microRNA reduces tumor growth in mouse
models of lung cancer. Cell Cycle 7, 759-764 (2008).
Farh KK, Grimson A, Jan C, Lewis BP, Johnston WK, Lim LP, Burge CB, Bartel DP.
The widespread impact of mammalian MicroRNAs on mRNA repression and evolution.
Science 310, 1817-1821 (2005).
Gregory PA, Bert AG, Paterson EL, Barry SC, Tsykin A, Farshid G, Vadas MA, Khew-Goodall Y, Goodall GJ. The miR-200 family and miR-205 regulate epithelial to
mesenchymal transition by targeting ZEB1 and SIP1. Nat. Cell Biol. 10, 593-601 (2008).
Griffiths-Jones S, Saini HK, van Dongen S, Enright AJ. miRBase: tools for microRNA
genomics. Nucleic Acids Res. 36, D154-158 (2008).
Grimson A, Farh KK, Johnston WK, Garrett-Engele P, Lim LP, Bartel DP. MicroRNA
targeting specificity in mammals: determinants beyond seed pairing. Mol. Cell 27, 91-105 (2007).
He L, He X, Lim LP, de Stanchina E, Xuan Z, Liang Y, Xue W, Zender L, Magnus J,
Ridzon D, Jackson AL, Linsley PS, Chen C, Lowe SW, Cleary MA, Hannon GJ. A
microRNA component of the p53 tumour suppressor network. Nature 447, 1130-1134
(2007).
Heineke J, Molkentin JD. Regulation of cardiac hypertrophy by intracellular signalling
pathways. Nat. Rev. Mol. Cell Biol. 7, 589-600 (2006).
Ji Q, Hao X, Meng Y, Zhang M, Desano J, Fan D, Xu L. Restoration of tumor suppressor
miR-34 inhibits human p53-mutant gastric cancer tumorspheres. BMC Cancer 8, 266
(2008).
Ji Q, Hao X, Zhang M, Tang W, Yang M, Li L, Xiang D, Desano JT, Bommer GT, Fan
D, Fearon ER, Lawrence TS, Xu L. MicroRNA miR-34 inhibits human pancreatic cancer
tumor-initiating cells. PLoS One 4, e6816 (2009).
Hartung J. A Note on Combining Dependent Tests of Significance. Biometrical Journal
41, 849-855 (1999).
Joglekar MV, Joglekar VM, Hardikar AA. Expression of islet-specific microRNAs
during human pancreatic development. Gene Expr. Patterns 9, 109-113 (2009).
Johnnidis JB, Harris MH, Wheeler RT, Stehling-Sun S, Lam MH, Kirak O,
Brummelkamp TR, Fleming MD, Camargo FD. Regulation of progenitor cell
proliferation and granulocyte function by microRNA-223. Nature 451, 1125-1129 (2008).
Johnson CD, Esquela-Kerscher A, Stefani G, Byrom M, Kelnar K, Ovcharenko D,
Wilson M, Wang X, Shelton J, Shingara J, Chin L, Brown D, Slack FJ. The let-7
microRNA represses cell proliferation pathways in human cells. Cancer Res. 67, 7713-7722 (2007).
Jones SW, Watkins G, Le Good N, Roberts S, Murphy CL, Brockbank SM, Needham
MR, Read SJ, Newham P. The identification of differentially expressed microRNA in
osteoarthritic tissue that modulate the production of TNF-alpha and MMP13.
Osteoarthritis Cartilage 17, 464-472 (2009).
Kanehisa M, Goto S. KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids
Res. 28, 27-30 (2000).
Kefas B, Godlewski J, Comeau L, Li Y, Abounader R, Hawkinson M, Lee J, Fine H,
Chiocca EA, Lawler S, Purow B. microRNA-7 inhibits the epidermal growth factor
receptor and the Akt pathway and is down-regulated in glioblastoma. Cancer Res. 68,
3566-3572 (2008).
Korpal M, Lee ES, Hu G, Kang Y. The miR-200 family inhibits epithelial-mesenchymal
transition and cancer cell migration by direct targeting of E-cadherin transcriptional
repressors ZEB1 and ZEB2. J. Biol. Chem. 283, 14910-14914 (2008).
Krek A, Grün D, Poy MN, Wolf R, Rosenberg L, Epstein EJ, MacMenamin P, da
Piedade I, Gunsalus KC, Stoffel M, Rajewsky N. Combinatorial microRNA target
predictions. Nat. Genet. 37, 495-500 (2005).
Krutzfeldt J, Rajewsky N, Braich R, Rajeev KG, Tuschl T, Manoharan M, Stoffel M.
Silencing of microRNAs in vivo with 'antagomirs'. Nature 438, 685-689 (2005).
Kumar MS, Erkeland SJ, Pester RE, Chen CY, Ebert MS, Sharp PA, Jacks T.
Suppression of non-small cell lung tumor development by the let-7 microRNA family.
Proc. Natl Acad. Sci. USA 105, 3903-3908 (2008).
Lewis BP, Burge CB, Bartel DP. Conserved seed pairing, often flanked by adenosines,
indicates that thousands of human genes are microRNA targets. Cell 120, 15-20 (2005).
Li QJ, Chau J, Ebert PJ, Sylvester G, Min H, Liu G, Braich R, Manoharan M, Soutschek
J, Skare P, Klein LO, Davis MM, Chen CZ. miR-181a is an intrinsic modulator of T cell
sensitivity and selection. Cell 129, 147-161 (2007).
Li Z, Hassan MQ, Jafferji M, Aqeilan RI, Garzon R, Croce CM, van Wijnen AJ, Stein JL,
Stein GS, Lian JB. Biological functions of miR-29b contribute to positive regulation of
osteoblast differentiation. J. Biol. Chem. 284, 15676-15684 (2009).
Li Z, Hassan MQ, Volinia S, van Wijnen AJ, Stein JL, Croce CM, Lian JB, Stein GS. A
microRNA signature for a BMP2-induced osteoblast lineage commitment program. Proc. Natl
Acad. Sci. USA 105, 13906-13911 (2008).
Linsley PS, Schelter J, Burchard J, Kibukawa M, Martin MM, Bartz SR, Johnson JM,
Cummins JM, Raymond CK, Dai H, Chau N, Cleary M, Jackson AL, Carleton M, Lim L.
Transcripts targeted by the microRNA-16 family cooperatively regulate cell cycle
progression. Mol. Cell Biol. 27, 2240-2252 (2007).
Liu Q, Fu H, Sun F, Zhang H, Tie Y, Zhu J, Xing R, Sun Z, Zheng X. miR-16 family
induces cell cycle arrest by regulating multiple cell cycle genes. Nucleic Acids Res. 36,
5391-5404 (2008).
Lu J, Getz G, Miska EA, Alvarez-Saavedra E, Lamb J, Peck D, Sweet-Cordero A, Ebert
BL, Mak RH, Ferrando AA, Downing JR, Jacks T, Horvitz HR, Golub TR. MicroRNA
expression profiles classify human cancers. Nature 435, 834-838 (2005).
Lu TX, Munitz A, Rothenberg ME. MicroRNA-21 is up-regulated in allergic airway
inflammation and regulates IL-12p35 expression. J. Immunol. 182, 4994-5002 (2009).
Martinez NJ, Ow MC, Barrasa MI, Hammell M, Sequerra R, Doucette-Stamm L, Roth
FP, Ambros VR, Walhout AJ. A C. elegans genome-scale microRNA network contains
composite feedback motifs with high flux capacity. Genes Dev. 22, 2535-2549 (2008).
Mendell JT. miRiad roles for the miR-17-92 cluster in development and disease. Cell
133, 217-222 (2008).
Miller TE, Ghoshal K, Ramaswamy B, Roy S, Datta J, Shapiro CL, Jacob S, Majumder
S. MicroRNA-221/222 confers tamoxifen resistance in breast cancer by targeting
p27Kip1. J. Biol. Chem. 283, 29897-29903 (2008).
Nagai T, Tanaka-Ishikawa M, Aikawa R, Ishihara H, Zhu W, Yazaki Y, Nagai R,
Komuro I. Cdc42 plays a critical role in assembly of sarcomere units in series of cardiac
myocytes. Biochem. Biophys. Res. Commun. 305, 806-810 (2003).
Park SM, Gaur AB, Lengyel E, Peter ME. The miR-200 family determines the epithelial
phenotype of cancer cells by targeting the E-cadherin repressors ZEB1 and ZEB2. Genes
Dev. 22, 894-907 (2008).
Pickering MT, Stadler BM, Kowalik TF. miR-17 and miR-20a temper an E2F1-induced
G1 checkpoint to regulate cell cycle progression. Oncogene 28, 140-145 (2009).
Rajewsky N. microRNA target predictions in animals. Nat. Genet. 38 Suppl, S8-13
(2006).
Raver-Shapira N, Marciano E, Meiri E, Spector Y, Rosenfeld N, Moskovits N, Bentwich
Z, Oren M. Transcriptional activation of miR-34a contributes to p53-mediated apoptosis.
Mol. Cell 26, 731-743 (2007).
Ruepp A, Brauner B, Dunger-Kaltenbach I, Frishman G, Montrone C, Stransky M,
Waegele B, Schmidt T, Doudieu ON, Stumpflen V, Mewes HW. CORUM: the
comprehensive resource of mammalian protein complexes. Nucleic Acids Res. 36, D646-650 (2008).
Sayed D, Hong C, Chen IY, Lypowy J, Abdellatif M. MicroRNAs play an essential role
in the development of cardiac hypertrophy. Circ. Res. 100, 416-424 (2007).
Schultz J, Lorenz P, Gross G, Ibrahim S, Kunz M. MicroRNA let-7b targets important
cell cycle molecules in malignant melanoma cells and interferes with anchorage-independent growth. Cell Res. 18, 549-557 (2008).
Segal E, Friedman N, Koller D, Regev A. A module map showing conditional activity of
expression modules in cancer. Nat. Genet. 36, 1090-1098 (2004).
Selbach M, Schwanhausser B, Thierfelder N, Fang Z, Khanin R, Rajewsky N.
Widespread changes in protein synthesis induced by microRNAs. Nature 455, 58-63
(2008).
Sempere LF, Freemantle S, Pitha-Rowe I, Moss E, Dmitrovsky E, Ambros V. Expression
profiling of mammalian microRNAs uncovers a subset of brain-expressed microRNAs
with possible roles in murine and human neuronal differentiation. Genome Biol. 5, R13
(2004).
Siegel G, Obernosterer G, Fiore R, Oehmen M, Bicker S, Christensen M, Khudayberdiev
S, Leuschner PF, Busch CJ, Kane C, Hubel K, Dekker F, Hedberg C, Rengarajan B,
Drepper C, Waldmann H, Kauppinen S, Greenberg ME, Draguhn A, Rehmsmeier M,
Martinez J, Schratt GM. A functional screen implicates microRNA-138-dependent
regulation of the depalmitoylation enzyme APT1 in dendritic spine morphogenesis. Nat.
Cell Biol. 11, 705-716 (2009).
Stark A, Brennecke J, Bushati N, Russell RB, Cohen SM. Animal MicroRNAs confer
robustness to gene expression and have a significant impact on 3' UTR evolution. Cell
123, 1133-1146 (2005).
Storey JD, Tibshirani R. Statistical significance for genomewide studies. Proc. Natl Acad.
Sci. USA 100, 9440-9445 (2003).
Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich
A, Pomeroy SL, Golub TR, Lander ES, Mesirov JP. Gene set enrichment analysis: a
knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl
Acad. Sci. USA 102, 15545-15550 (2005).
Sudhof TC. The synaptic vesicle cycle. Annu. Rev. Neurosci. 27, 509-547 (2004).
Taganov KD, Boldin MP, Chang KJ, Baltimore D. NF-kappaB-dependent induction of
microRNA miR-146, an inhibitor targeted to signaling proteins of innate immune
responses. Proc. Natl Acad. Sci. USA 103, 12481-12486 (2006).
Tarasov V, Jung P, Verdoodt B, Lodygin D, Epanchintsev A, Menssen A, Meister G,
Hermeking H. Differential regulation of microRNAs by p53 revealed by massively
parallel sequencing: miR-34a is a p53 target that induces apoptosis and G1-arrest. Cell
Cycle 6, 1586-1593 (2007).
Thai TH, Calado DP, Casola S, Ansel KM, Xiao C, Xue Y, Murphy A, Frendewey D,
Valenzuela D, Kutok JL, Schmidt-Supprian M, Rajewsky N, Yancopoulos G, Rao A,
Rajewsky K. Regulation of the germinal center response by microRNA-155. Science 316,
604-608 (2007).
Tsang J, Zhu J, van Oudenaarden A. MicroRNA-mediated feedback and feedforward
loops are recurrent network motifs in mammals. Mol. Cell 26, 753-767 (2007).
van Rooij E, Sutherland LB, Thatcher JE, DiMaio JM, Naseem RH, Marshall WS, Hill
JA, Olson EN. Dysregulation of microRNAs after myocardial infarction reveals a role of
miR-29 in cardiac fibrosis. Proc. Natl Acad. Sci. USA 105, 13027-13032 (2008).
Ventura A, Young AG, Winslow MM, Lintault L, Meissner A, Erkeland SJ, Newman J,
Bronson RT, Crowley D, Stone JR, Jaenisch R, Sharp PA, Jacks T. Targeted deletion
reveals essential and overlapping functions of the miR-17 through 92 family of miRNA
clusters. Cell 132, 875-886 (2008).
Visvanathan J, Lee S, Lee B, Lee JW, Lee SK. The microRNA miR-124 antagonizes the
anti-neural REST/SCP1 pathway during embryonic CNS development. Genes Dev. 21,
744-749 (2007).
Webster RJ, Giles KM, Price KJ, Zhang PM, Mattick JS, Leedman PJ. Regulation of
epidermal growth factor receptor signaling in human cancer cells by microRNA-7. J.
Biol. Chem. 284, 5731-5741 (2009).
Wegman EJ. Nonparametric Probability Density Estimation: I. A Summary of Available
Methods. Technometrics 14, 533 (1972).
Xiao C, Rajewsky K. MicroRNA control in the immune system: basic principles. Cell
136, 26-36 (2009).
Xie H, Lim B, Lodish HF. MicroRNAs induced during adipogenesis that accelerate fat
cell development are downregulated in obesity. Diabetes 58, 1050-1057 (2009).
Yang Z, Kaye DM. Mechanistic insights into the link between a polymorphism of the 3'
UTR of the SLC7A1 gene and hypertension. Hum. Mutat. 30, 328-333 (2009).
Yu F, Yao H, Zhu P, Zhang X, Pan Q, Gong C, Huang Y, Hu X, Su F, Lieberman J, Song
E. let-7 regulates self renewal and tumorigenicity of breast cancer cells. Cell 131, 1109-1123 (2007).
Yu J, Wang F, Yang GH, Wang FL, Ma YN, Du ZW, Zhang JW. Human microRNA
clusters: genomic organization and expression profile in leukemia cell lines. Biochem.
Biophys. Res. Commun. 349, 59-68 (2006).
Zhao JJ, Lin J, Yang H, Kong W, He L, Ma X, Coppola D, Cheng JQ. MicroRNA-221/222 negatively regulates estrogen receptor alpha and is associated with tamoxifen
resistance in breast cancer. J. Biol. Chem. 283, 31079-31086 (2008).
Figures
Figure 1. mirBridge overview
(A) The input to mirBridge is a set of genes. Red and blue squares denote conserved and
non-conserved seed-matched sites in the 3' UTR, respectively. The number inside each square
denotes the context score. For each miRNA target sequence of interest, mirBridge computes N, K,
H, and T as illustrated. (B) The procedure for evaluating whether N is significantly higher than that
of comparable random gene sets (the OC test). To obtain the null distribution for N, random gene
sets with similar 3' UTR properties were constructed by replacing each gene in the original set
(g1, ..., gn; solid red dots) by a randomly drawn gene (r1, r2, ..., rn). The probability that ri is
drawn to replace gi is inversely proportional to its distance to gi in the 3-D space defined by 3'
UTR length, GC content and general conservation level. The histogram depicts the null
distribution of N for miR-16 in the cell-cycle gene set. (C) The procedure for evaluating whether
K and H are significantly higher than those of random gene sets containing T putative targets with
similar 3' UTR properties as the putative targets in G (the CE and CTX tests, respectively). The
same gene sampling procedure from (B) is used except that only the putative targets in G (empty
red dots) are replaced by random putative targets, so that T is identical across G and the random
gene sets. The histograms depict the null distributions of K and H, respectively, for random gene
sets with T=5 putative targets for miR-16 and the cell-cycle gene set.
[Figure 1 panel labels. (A) 3' UTRs of gene 1 to gene n in gene set G, with N, K, H, and T tallied
over the seed-matched sites. (B) Site Occurrence Signature (OC): draw random gene sets R and
obtain p(N) by drawing 5000 gene sets. Is N significantly higher than in comparable random gene
sets with n genes? (C) Conservation Signature (CE) and Context-score Signature (CTX): draw
random putative target sets and obtain p(K | T) and p(H | T) by drawing 5000 gene sets each. Are
K and H significantly higher than in comparable random putative target sets with T=5 miR-16
putative targets?]
Figure 2. miR-1 and PIP3 signaling in cardiac hypertrophy
The orange repressive arrows depict high-quality putative targets of miR-1 in the PIP3 pathway in
cardiac myocytes (see Experimental Procedures). The rest of the network is based on known
interactions compiled from the literature (Heineke and Molkentin, 2006). See Figure S1 for
network diagrams of other selected predictions discussed in the text.
Figure 3. The miRNA-cotargeting network inferred by mirBridge
The thickness of the edges is proportional to - log (q). (A) The ten most highly connected nodes
and the adjacent edges are highlighted in yellow and red, respectively. (B) Examples of highly
interconnected subnetworks. See also Figure S3.
[Figure 3, panel (A) table: the ten most highly connected miRNA families]
miRNA family | [illegible] | [illegible] | Total | # of predicted targets
miR-93.hd/291-3p/294/295/302/372/373/520 | 7 | 17 | 24 | 1717
miR-17-5p/20/93.mr/106/519.d | 6 | 18 | 24 | 1388
miR-130/301 | 13 | 10 | 23 | 1121
miR-148/152 | 13 | 10 | 23 | 1063
miR-181 | 9 | 11 | 20 | 1322
miR-101 | 7 | 13 | 20 | 1725
miR-34/449 | 10 | 9 | 19 | 1280
miR-26 | 4 | 13 | 17 | 1430
miR-19 | 11 | 5 | 16 | 1332
miR-221/222 | 9 | 6 | 15 | 787
Table 1. Selected mirBridge predictions with published evidence.
Due to space limitations, typically only targets with a conserved and high context-scoring site are
shown (see Table S1 for details). "High-quality putative targets" are ones with either a conserved
or high context-scoring site (see Experimental Procedures).
miRNA
Function
q value
# of highquality putative
targets
Selected targets
Evidence
0
3
TRAF6, IRAK1
(Joneset al,
2009, Taganov
al. 2006)
IL1 receptor,
146
NFKB, Toll Like
Receptor
signaling
signlinget
CCNE1, CCND1, CDC25A,
CCND2
15/16/195/424/497
Cell cycle;
Cl cl
G1 to S
0
29
Collagen
0
7
E1bB signaling,
Eirsnalig
ghioma
7
insulin signaling
CCNE1, CCND1, CDC25A,
CCND2, E2F3, WEE1
7
COL4A1, COL4A2,
COL4A3, COL4A4,
COL4A5
16
PTK2, PIK3CD, RAF1,
ERBB4, RPS6KB1
12
RB1,PIK3CD,RAF1
0
0.000208
18
MKNK1, PTK3CD, RAF1,
D,
RKB1
RPS6KB1I, IRS2
(Linsley et al.
2007, Liu et
al. 2008)
(Li et al. 2009,
van Rooij et
al. 2008)
(Kefas et al.
2008, Webster
et al. 2009)
(Bravo-Egana
et al. 2008,
CorreaMedin ea
Medina et al.
2009, Joglekar
et al. 2009)
15/16/195/424/497
Wnt pathway
0.0356
14
FZD10, CCND1, CCND2,
PAFAHIBI, PPP2R5C
(Bonci et al.
2008)
103/107
TNF pathway
0.0522
6
HRB, MAP3K7, NR2C2
(Xie et al.
2009)
122
NO1 pathway
0.0546
4
CALM3, SLC7A1
Kae 2 09)
15/16/195/424/497
prostate cancer
0.07345
18
PIK3R1, AKT3, CCNE1,
CCND1, FGFR1, E2F3,'
MAP2K1
(Bonci et al
2008)e
SMAD5, FKBPIA,
ROCKI, SMURF2,
ACVR1B, INHBA,
ROCK2, TGFBR1
(Li et al. 2008)
7NUMBL,
JAGI, NOTCHI, NOTCH2,
DLLI
(Ji et al. 2008,
Ji et al. 2009)
0.0865
17
CCL1, IL12A, FASLG
ACVR2A
(Lu et al.
2009)
. .
PIP3 signahng in
cardiac myocytes
0.0977
8
IGF1, CDC42, CREB5,
YWHAZ, PTPN1,
YWHAQ, MET, PREXI
(Care et al.
2007, Sayed et
al. 2007)
cell cycle;
0.122
7
E2FI, CCND2, RBLI
(Cloonan et al.
135
TGF beta
signaling
0.07389
19
34a/449
Notch signaling
0.07389
21
cytokine-cytokine
receptor
interaction
1/206
17-
E2F1, CCND2, RBL1
5p/20/93.mr/106/519.d
Gi to S
0.122
221/222
breast cancer
estrogen signaling
0.1432
7
KIT, CDKN1B, ESRI
34/449
BAD pathway
(apoptosis)
0.1499
5
BCL2, KITLG, KIT, IGF1,
PRKACB
let-7/98
breast cancer
estrogen signaling
0.1595
13
CYP19A1, FASLG
let-7/98
GI to S
0.1871
8
E2F6
CCNG2, E2F1, CCND2,
WEE1, RBL1
2008,
Pickering et al.
2009)
(Miller et al.
2008, Zhao et
al. 2008)
(Chang et al.
2007, Cloonan
et al. 2008, He
et al. 2007)
(Schultz et al.
2008, Yu et al.
2007)
(Schultz et al.
2008, Yu et al.
2007)
Table 2. Selected new mirBridge miRNA function predictions (see Table S1 for details). Same
format as Table 1A.
miRNA | Function | q value | # of high-quality putative targets | Selected targets
33 | statin pathway | 0.00155 | 2 | ABCA1
203 | G alpha i pathway | 0.00532 | 9 | PITX2, SHC1, SRC, ITPR2
23 | apoptosis | 0.00801 | 9 | IRF1, IRF2, BNIP3L, CHUK, CASP7
205 | tight junction | 0.01195 | 16 | CNKSR3, YES1, EPB41, PRKCE, MAGI2, ACTB, CLDN11
187 | antigen processing and presentation | 0.02192 | 9 | KIR2DL2, IFNA2, KIR2DL5A
219 | nuclear receptors | 0.02806 | 6 | THRB, NR2C2
17-5p/20/93.mr/106/519.d | MAPK pathway | 0.0377 | 12 | MAP3K3, MAPK9, DUSP8, MAP3K12, MAP3K5, MAP3K9, NR2C2, GAB1, MAP3K2
124.2/506 | axon guidance | 0.04983 | 24 | SEMA6D, CHP, NFAT5, NRAS, GNAI1, ROCK1, PLXNA3, GNAI3, ITGB1, NFATC1, NRP1, SEMA6A
34a/449 | glycosphingolipid biosynthesis | 0.05144 | 3 | FUT9
128 | GnRH signaling | 0.05144 | 13 | ADCY8, MAP2K7, GRB2, PRKX, PRKY, MAPK14
24 | cytokine-cytokine receptor interaction | 0.05203 | 32 | EDA, PDGFRA, IL1R1, TNFRSF19
375 | purine metabolism | 0.0544 | 11 | PDE4D, PDE5A
141/200a | EGF/PDGF pathway | 0.0637 | 7 | MAP2K4, STAT5A, GRB2
101 | ubiquitin mediated proteolysis | 0.07681 | 9 | UBE2D1, UBE2D2, UBE2D3, FBXW11, FBXW7, UBE2A
regulation of actin
cytoskeleton
0.07816
13
CFL2, ITGB8, ROCK2, CRK, RAC1, APC,
ITGAV
Ca signaling
0.07827
21
GRIN2A, ADRB1, ADCY9, CACNA1C,
ADCY1, ITPR1, CALMI, ADCY7, SLC8A1
19
apoptosis
0.1148
135
integrin pathway
0.122
12
AKT3, PTK2, ROCKI, ROCK2, ANGPTL2,
PLCG1, ARHGEF6, ARHGEF7, PAK7
93.HD/2913P/294/295/302/372/
nuclear receptors
0.1342
8
NR4A2, ESRI, NR2C2
27
statin pathway
0.1396
4
HMGCR, ABCAI
33
cell cycle
0.1555
4
CDK6
insulin
signalingreceptor
0.1934
9
PIK3R1, GRB2, RPS6KB1
BCL,2L11
373/520
153
153
Table 3. Testing mirBridge on several known phenotypes compiled from the literature.
The q values were computed based on simultaneous testing across miRNA seeds for the gene set.
Black: highly significant; blue: marginally significant;
: not significant. See also Table S2.
miRNA | Known function | p | q | rank (out of 143 seed-matched motifs) | References
141/200a | epithelial-mesenchymal transition | 0.0018 | 0.08 | 1 | (Burk et al. 2008, Gregory et al. 2008, Korpal et al. 2008, Park et al. 2008)
21 | apoptosis | 0.006 | 0.39 | 1 | (Chan et al. 2005)
155 | B cell receptor signaling | 0.007 | 0.29 | 5 | (Thai et al. 2007)
181 | T cell receptor signaling | 0.008 | 0.07 | 5 | (Li et al. 2007)
34 | p53 pathway | 0.04 | 0.32 | 14 | (Chang et al. 2007, He et al. 2007, Raver-Shapira et al. 2007, Tarasov et al. 2007)
223 | granulocyte differentiation | 0.07 | 0.62 | 15 | (Johnnidis et al. 2008)
Supplemental Information
Figure S1. Network diagrams of selected mirBridge predictions discussed in the main text
(related to Figure 2). Aside from the miRNA targeting links, the networks are compiled based
on the literature. (A) mirBridge predicts that miR-15/16/195 could regulate several intricately
linked pathways that control cell proliferation and cancer, suggesting that a general function of
the miR-15/16/195 family is to control proliferation and/or growth. Several putative targets have
multiple high-quality seed-matched sites (Table S1a). (B) mirBridge indicates that miR-146
functions in NF-kB, IL4 and TOLL pathways where miR-146 mediates several negative feedback
loops to upstream signaling factors. (C) mirBridge indicates that miR-33 functions in cholesterol
homeostasis. miR-33a is probably co-expressed with SREBP2 because it is embedded in an intron
of SREBP2. miR-33 also putatively regulates the cell cycle network and the PGC1a pathway,
forming a double-negative (i.e. positive) feedback to cholesterol.
Figure S2. (related to Figure 3) Correlation between reciprocal co-targeting predictions. For
each miRNA-family pair (X,Y), the lowest mirBridge p values of X->Y and Y->X are plotted
against each other. The entries were partitioned into 10 bins by the X->Y p values, and the average
Y->X p value was computed and plotted against the average X->Y p value of each bin
(resulting in the blue line). The reciprocal p values are significantly correlated (Spearman
correlation = 0.42, p = 0). It is important to note that while many miRNA families are reciprocal
co-targeting pairs (X<->Y), it is biologically plausible that X->Y need not imply Y->X. For
instance, Y may function in more diverse contexts than X, yet co-targeting may be functionally
important only in the contexts where X functions. A likely example, albeit on the more extreme
end, involves the miR-99/100 and miR-125/351 families with 80 and 1362 predicted targets,
respectively. The PTS of miR-99/100 has a large number of seed-matched sites for miR-125/351,
with a significant fraction of those being conserved and/or having high context scores, yielding a
q-value of 0.03. In contrast, the reciprocal q-value is 0.92 because the larger miR-125/351 PTS
contains only a small number of sites for miR-99/100, and an insignificant fraction of those are
conserved and/or have high context scores, suggesting that most of miR-125/351's functional
contexts are not shared with miR-99/100. A similar example involves the miR-17 and -18 families,
where the latter has a smaller PTS. Individual cases aside, PTS-size difference is not a major
contributing factor: the size-difference distribution between PTSs for miRNA-family pairs having
both X->Y and Y->X q-values of less than 0.2 does not significantly deviate from that of pairs with
a significant p-value in only one direction (p = 0.24, Kolmogorov-Smirnov test).
[Plot title: Correlation between reciprocal predictions. x-axis: X->Y p value.]
Figure S3. General conservation level is predictive of the conservation level of individual motifs.
The distribution of correlations between general and specific conservation rates across 314 seed-matched motifs (i.e. one correlation value for each motif) is shown. The specific conservation rate
was computed based on individual motifs whereas the general conservation rate was computed
across all 7-mers. All but 5 of the correlations have P values less than or equal to 0.01. The results
are similar if only 3' UTRs that are at least 1000 nt long were used.
[Plot title: Distribution of correlation between general and specific conservation rates. x-axis:
Spearman correlation. Series: based on all 3' UTRs; based on 3' UTRs with length > 1000.]
Figure S4. Similar to Fig. S3, but the correlation was computed based on general conservation
rate and the occurrence count of individual motifs. All but 5 of the correlations have P values less
than or equal to 0.01.
[Plot title: Distribution of correlation between general conservation rate and occurrence count of
conserved motifs. x-axis: Spearman correlation.]
Figure S5. The general conservation rate distribution of genes in the PIP3 gene set vs. that of all
genes in the genome. PIP3 genes in general have higher background conservation levels. The two
distributions are significantly different (Kolmogorov-Smirnov Test; the p value is as shown).
[Plot annotation: P = 6.3e-8. Series: All; PIP3. x-axis: log(conservation).]
Figure S6. GC content of a 3' UTR is negatively correlated with the context score.
[Plot title: Distribution of correlation between GC content and context score. x-axis: Spearman
correlation.]
Figure S7. Histograms are examples of kernel-based density estimators. The kernels are constant
functions with a fixed value within a defined neighborhood and zero everywhere else. The red
dots are samples, which were drawn from a normal distribution with mean=10 and standard
deviation=5. The top example uses kernel functions of width=4. The histogram was constructed
by sliding a window of size 4 starting from -10 and counting the number of samples that fall
within the window. The bottom example estimates the density by using kernels of width=2.
[Panel label: Bandwidth=4. x-axis: values.]
Figure S8. Density estimation by using Gaussian kernels. The red dots are samples, which were
drawn from a normal distribution with mean=10 and standard deviation=5. The estimated density
is the sum of normal densities with means set to the values of individual samples; the standard
deviation is specified by the bandwidth parameter. The blue densities are the individual kernels
and the green density is the sum. Note when the number of samples and the bandwidth are both
small, there are lots of local bumps in the resulting density (top plot). A larger bandwidth avoids
such biases and results in a smoother estimate (bottom plot).
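
The Gaussian-kernel estimate described in this legend can be sketched in a few lines of Python. This is a minimal illustration and not code from mirBridge: the function name `gaussian_kde` and the sample values are hypothetical, and the sum of kernels is divided by the number of samples (an assumption) so that the estimate integrates to one.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a density function built from one Gaussian kernel per sample."""
    norm = bandwidth * math.sqrt(2.0 * math.pi)

    def density(x):
        # Each kernel is a normal density centred on one sample, with the
        # bandwidth used as its standard deviation.
        total = sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) / norm
                    for s in samples)
        # Assumption: average the kernels so the estimate integrates to 1.
        return total / len(samples)

    return density

samples = [4.0, 9.0, 12.0, 17.0]               # hypothetical sample values
narrow = gaussian_kde(samples, bandwidth=1.0)  # small bandwidth
wide = gaussian_kde(samples, bandwidth=2.0)    # larger bandwidth
```

Evaluating `narrow` and `wide` on a grid of values shows the effect discussed above: the small bandwidth leaves a separate bump around each sample, whereas the larger bandwidth gives a smoother curve.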
[Panel labels: Bandwidth=1; Bandwidth=2. x-axis: values.]
Figure S9. The input gene set is the PIP3 signaling pathway in cardiac myocytes. For each
bandwidth parameter σ, 100 random gene sets were generated using the algorithm described in
the text. The length, general conservation rate, and GC content distributions of each of the
random gene sets were compared to those of the input gene set by the Kolmogorov-Smirnov (KS)
test. The average KS test p value across the 100 random gene sets is plotted. Note that, as
expected, the higher the bandwidth, the lower the p value. mirBridge uses the largest bandwidth
such that the lowest of the three average p values is higher than a predetermined threshold.
[Series: length; conservation; GC.]
Figure S10. Context score distributions of conserved (green) and non-conserved (red) sites.
[Plot title: Context score distributions of conserved and non-conserved sites. x-axis: context
score.]
Supplemental Experimental Procedures
The mirBridge algorithm
Inputs:
1. A set M of miRNA seed-matched motifs. The motifs can be partitioned into two classes: m2-8 and
   m1-7-A-anchor
2. A gene set G with n genes and their 3' UTRs
3. The context score of all seed-matches (from M) in the 3' UTRs in G
4. A context score threshold (t)
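
As a rough illustration of how these inputs are used, the sketch below tallies, for one motif and one gene set, the four counts on which the OC, CE and CTX tests rely: N (seed-matched sites), T (genes with at least one site), K (conserved sites) and H (high-context-scoring sites). The `Site` record, the function name `motif_counts` and the example values are hypothetical, and the sketch assumes that the threshold t has already been converted into a score cutoff, with "high" meaning a score at or above that cutoff.

```python
from dataclasses import dataclass

@dataclass
class Site:
    """One seed-matched site of a motif in a 3' UTR (hypothetical record)."""
    gene: str
    conserved: bool
    context_score: float

def motif_counts(sites, gene_set, cutoff):
    """Return (N, T, K, H) for one motif over one gene set."""
    in_set = [s for s in sites if s.gene in gene_set]
    n_sites = len(in_set)                          # N: all seed matches
    t_genes = len({s.gene for s in in_set})        # T: genes with >= 1 site
    k_conserved = sum(s.conserved for s in in_set) # K: conserved sites
    # H: sites whose context score reaches the cutoff (assumed convention).
    h_high = sum(s.context_score >= cutoff for s in in_set)
    return n_sites, t_genes, k_conserved, h_high

sites = [
    Site("g1", True, 92.0),
    Site("g1", False, 31.0),
    Site("g3", False, 76.0),
    Site("g9", True, 80.0),   # g9 is outside the gene set and is ignored
]
counts = motif_counts(sites, {"g1", "g2", "g3"}, cutoff=75.0)
```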
Processing:
1. For each motif m in the class m2-8, determine the following statistics in G:
   a. The number of seed matches (N) (for OC)
   b. The number of genes (T) in G with at least one seed-matched site
   c. The number of conserved seed matches (K) (for CE)
   d. The number of seed-matches (H) with context score in the top t-percentile (for CTX)
2. Build the gene neighborhood:
   a. For each gene g in G, build an ordered array A of gene neighbors by sorting all 3' UTRs x
      in the genome by the normalized Euclidean distance between the 3' UTRs of g and x using
      length, GC content, and general conservation. A[1] is the closest, A[2] the next closest,
      and so on.
3. Build the putative target neighborhoods for each miRNA motif:
   a. For each motif m from Step 1, form an ordered array Am (as in Step 2) for each g in G by
      removing the entries in the corresponding A whose 3' UTR does not contain the motif m
      (i.e. genes that are not putative targets of the miRNA)
4. Compute the null distributions for N (the number of seed-matches)
   a. Determine the bandwidth parameter σ
      i. For each σ from a list of possible σ's
         1. For each g in G
            a. draw a random number x from the Gaussian density with mean 0 and variance σ
            b. round x to the nearest integer and take its absolute value
            c. use x to index into g's neighbor list to draw a gene/3' UTR; i.e. A[x]
         2. Compute the Kolmogorov-Smirnov p value between the drawn random gene set and G
            for each of length, GC content, and general conservation
         3. Repeat the above 100 (or more) times
         4. Take the average p value for each of length, GC content, and general conservation
            over the 100 iterations
      ii. Pick the largest σ such that the lowest of the three average p values is greater than a
          given threshold (currently set to 0.67)
   b. Repeat 10,000 times (or more)
      i. Using the σ from Step a, draw a random gene set R as in Step 4-a-i-1
      ii. For each seed-matched motif in Step 1, compute N as in Step 1 for the random gene set
          to obtain the null distribution for N
5. Compute the null distributions for K (the number of conserved sites) and H (the number of
   high-context-scoring sites) conditional on T
   a. For each motif from Step 1
      i. Identify the putative targets in G (i.e. genes in G with at least one site)
      ii. Determine the bandwidth parameter as in Step 4a except: 1) use the putative target
          neighborhood for the motif (from Step 3); 2) only use the putative targets as members
          of G (i.e. ignore/remove genes without sites)
      iii. Using the procedure in Step 4-a-i-1, generate random putative target sets by replacing
           each of the putative targets in G with a randomly sampled putative target from the
           putative target neighborhood array (Am) for the motif and gene (from Step 3). Note that
           each random target set would have exactly T genes with at least one motif site
      iv. For each random target set, compute K and H (note that by design each random target
          set has exactly T putative targets)
      v. Repeat 10,000 times (or more) to obtain the null K and H distributions conditional on T
6. Compute the p value for N by using the null distribution from Step 4: count the percentage of
   random gene sets that have an equal or higher N. Similarly compute the p values for K and H by
   using the null distributions from Step 5: count the percentage of random putative target sets
   that have an equal or higher corresponding statistic.
7. FDR analysis: computing the q values across all m2-8 motifs:
   a. Use $\lambda = 0.5$ to estimate the proportion of null features, $\pi_0$, by counting the
      number of p values that are greater than 0.5 and dividing this count by $n(1 - \lambda)$.
   b. For each motif with p value $p_i$:
      i. Use $\lambda = 0.5$, and estimate the proportion of null features by
         $\hat{\pi}_0(\lambda) = \frac{\#\{p_j > \lambda;\ j = 1, 2, \ldots, n\}}{n(1 - \lambda)}$
      ii. Compute q as Q = [expression missing in the source]
   c. For each $q_i$, set $q_i$ to $q_j$, where $q_j$ is the minimum of all q values for which the
      corresponding p values are greater than $p_i$.
8. Construct the composite test statistics (CE-CTX and OC-CE-CTX) and compute the
   corresponding p and q values (modified inverse-normal method):
   a. For each p value from the basic statistics (i.e. $p_{ce}$, $p_{oc}$, $p_{ctx}$), compute
      $t_{ce} = \Phi^{-1}(1 - p_{ce})$, where $\Phi^{-1}$ is the inverse of the standard normal
      cumulative distribution function (i.e. normal with mean 0 and standard deviation 1).
      Similarly compute $t_{oc}$ and $t_{ctx}$.
   b. Construct the composite statistics for each motif in G:
      $t_{ce\_ctx} = w_{ce} t_{ce} + w_{ctx} t_{ctx}$
      $t_{oc\_ce\_ctx} = w'_{oc} t_{oc} + w'_{ce} t_{ce} + w'_{ctx} t_{ctx}$
      where $w_{ce} + w_{ctx} = 1$ and $w'_{oc} + w'_{ce} + w'_{ctx} = 1$.
      The w's can be adjusted to assign different weights to the basic statistics (currently
      $w_{ce} = 0.5$, $w_{ctx} = 0.5$; the three OC-CE-CTX weights are 0.4, 0.35, and 0.25).
   c. To compute p values, obtain the null distributions of $t_{ce\_ctx}$ and $t_{oc\_ce\_ctx}$ by
      the following method:
      i. Compute the covariance between each pair of $t_{ce}$, $t_{oc}$, $t_{ctx}$ by using values
         across all motifs. (In cases where multiple gene sets are considered, values from all
         motif-gene-set combinations can be used.)
      ii. Compute the variance of $t_{ce\_ctx}$ and $t_{oc\_ce\_ctx}$ by using the formula:
          $\mathrm{var}(a t_1 + b t_2 + c t_3) = a^2\,\mathrm{var}(t_1) + b^2\,\mathrm{var}(t_2) + c^2\,\mathrm{var}(t_3) + 2ab\,\mathrm{cov}(t_1, t_2) + 2ac\,\mathrm{cov}(t_1, t_3) + 2bc\,\mathrm{cov}(t_2, t_3)$
      iii. Compute the means of $t_{ce\_ctx}$ and $t_{oc\_ce\_ctx}$ by using the formula:
           $\mathrm{mean}(a t_1 + b t_2 + c t_3) = a\,\mathrm{mean}(t_1) + b\,\mathrm{mean}(t_2) + c\,\mathrm{mean}(t_3)$
      iv. The null distributions of $t_{ce\_ctx}$ and $t_{oc\_ce\_ctx}$ are normal distributions
          with the means and variances computed above.
      v. The p values of the observed statistics for each motif can be computed from the null
         distributions.
   d. Compute the q values for the composite p values as in Step 7.
9. Repeat steps 1-8 for m1-7-A-anchor motifs
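
The composite test of Step 8 can be sketched as follows for any number of basic statistics. This is an illustrative reading of the step, not the mirBridge implementation: the function names, the clipping constant `eps` (which keeps the inverse normal finite for p values of exactly 0 or 1) and the example p values are assumptions. The variance is computed as the double sum of weight products times covariances, which expands to the squared-weight variance terms plus twice each pairwise covariance term.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def to_t(p, eps=1e-12):
    """Inverse-normal transform t = Phi^{-1}(1 - p), with p clipped (assumption)."""
    p = min(max(p, eps), 1.0 - eps)
    return STD_NORMAL.inv_cdf(1.0 - p)

def mean(xs):
    return sum(xs) / len(xs)

def cov(xs, ys):
    """Sample covariance; cov(xs, xs) is the sample variance."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def composite_p_values(p_columns, weights):
    """p_columns[k][m] is the p value of basic test k for motif m."""
    t_cols = [[to_t(p) for p in col] for col in p_columns]
    k = len(t_cols)
    # Mean and variance of the weighted sum, estimated across all motifs.
    mu = sum(w * mean(col) for w, col in zip(weights, t_cols))
    var = sum(weights[i] * weights[j] * cov(t_cols[i], t_cols[j])
              for i in range(k) for j in range(k))
    null = NormalDist(mu, var ** 0.5)
    n_motifs = len(t_cols[0])
    composite = [sum(w * col[m] for w, col in zip(weights, t_cols))
                 for m in range(n_motifs)]
    # Upper-tail probability under the fitted normal null.
    return [1.0 - null.cdf(t) for t in composite]

# Hypothetical CE and CTX p values for four motifs, weighted 0.5 / 0.5.
p_ce = [0.01, 0.5, 0.9, 0.3]
p_ctx = [0.02, 0.4, 0.8, 0.6]
p_ce_ctx = composite_p_values([p_ce, p_ctx], [0.5, 0.5])
```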
Output:
For each input motif, the q value of each test is provided.
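
For orientation, the empirical p value of Step 6 and the q value computation of Step 7 might look like the sketch below. The q value expression is the standard one from Storey and Tibshirani (2003), cited in the reference list, and is assumed here rather than taken from the text; the function names, the cap of the null proportion at 1 and the example values are likewise assumptions.

```python
def empirical_p(observed, null_values):
    """Fraction of null draws with a statistic equal to or higher than observed."""
    return sum(v >= observed for v in null_values) / len(null_values)

def storey_q_values(p_values, lam=0.5):
    """q values in the style of Storey and Tibshirani (assumed formula)."""
    n = len(p_values)
    # Estimated proportion of null features, capped at 1 (assumption).
    pi0 = min(1.0, sum(p > lam for p in p_values) / (n * (1.0 - lam)))
    order = sorted(range(n), key=lambda i: p_values[i])
    q = [0.0] * n
    running_min = 1.0
    # Walk from the largest p value down so that each q value becomes the
    # minimum over all q values whose p values are at least as large.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        # rank equals the number of p values <= p_i when there are no ties.
        raw = pi0 * n * p_values[i] / rank
        running_min = min(running_min, raw)
        q[i] = running_min
    return q

null_n = [3, 5, 8, 2, 7]                  # hypothetical null values of N
p_oc = empirical_p(7, null_n)
q_values = storey_q_values([0.01, 0.04, 0.6, 0.9])
```

Note that when no p value exceeds `lam`, the estimated null proportion is 0 and every q value is 0 under this formula.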
Note:
If multiple gene sets are being tested simultaneously, the FDR procedure (Step 7) can be adjusted to include
p values from all motif-gene-set combinations. Similarly for Step 8c the t's from all motif-gene-set
combinations can be used to estimate the covariances and to compute the q values (Step 8d).
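
The neighbor-based sampling in Steps 2 and 4-a-i-1 can be sketched as below. Several choices are assumptions made for illustration only: each feature is normalized by its standard deviation across all genes, the bandwidth is passed to the random generator as a standard deviation (the text speaks of a variance), an index of 0 keeps the gene itself, and indices beyond the end of the array are clamped to the last neighbor. The function names and feature values are hypothetical.

```python
import math
import random

def neighbor_array(gene, features):
    """Other genes sorted by normalized Euclidean distance to `gene`.

    features maps gene -> (3' UTR length, GC content, general conservation).
    """
    cols = list(zip(*features.values()))
    means = [sum(c) / len(c) for c in cols]
    # Assumption: normalize each feature by its standard deviation.
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]

    def dist(other):
        return math.sqrt(sum(((features[gene][d] - features[other][d]) / sds[d]) ** 2
                             for d in range(3)))

    return sorted((g for g in features if g != gene), key=dist)

def draw_random_set(gene_set, neighbors, bandwidth, rng):
    """Replace each gene by a neighbor whose rank is drawn from a half-Gaussian."""
    drawn = []
    for g in gene_set:
        x = abs(round(rng.gauss(0.0, bandwidth)))
        arr = neighbors[g]
        # Assumption: x == 0 keeps g itself; otherwise take A[x] (1-based),
        # clamped to the last entry of the neighbor array.
        drawn.append(g if x == 0 else arr[min(x, len(arr)) - 1])
    return drawn

features = {                      # hypothetical 3' UTR properties
    "a": (1000, 0.40, 0.10),
    "b": (1010, 0.41, 0.11),
    "c": (5000, 0.60, 0.50),
    "d": (2000, 0.45, 0.20),
}
neighbors = {g: neighbor_array(g, features) for g in features}
random_set = draw_random_set(["a", "c"], neighbors, bandwidth=2.0,
                             rng=random.Random(0))
```

A small bandwidth keeps the drawn genes very close to the originals, while a large bandwidth approaches uniform sampling; the KS-based selection in Step 4a sits between these extremes.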
The mirBridge null model
The discussion below focuses on defining the appropriate null models for the test statistics used
in mirBridge (i.e. CE, CTX, OC). As discussed, the null model of the CE and CTX tests is based
on randomizing putative target sets while that of OC is based on randomizing the entire gene set
(Fig. 1 in main text). The main task is, however, that of generating a random set of genes that has
similar properties as a particular gene set (i.e. for mirBridge the gene set can be a putative target
set or the input gene set itself). Thus the following discussion revolves around "gene sets," but it
should be understood that it equally applies to "putative target sets."
The simplest null model is to generate size-matched uniformly sampled random gene sets.
However, as discussed in the main text, this can be an inappropriate null model because other
factors, such as general (or non-specific) motif conservation rate, may lead to systematic biases.
Below these key factors are empirically analyzed to show that they can indeed introduce
systematic biases. The analysis of 3' UTR length is omitted because it is obvious that it is
correlated with motif occurrences.
General evolutionary rate For a given 3' UTR, the general (or non-specific) conservation rate is
defined as the number of conserved 7-mers (because seed matches are 7-mers) divided by the
total number of 7-mers (i.e. 3' UTR length - 6). By counting only the occurrences of a particular
motif type, a similar definition is used for the conservation rate of a motif. To investigate whether
non-specific conservation rate can affect the CE statistic, the general conservation rate and
conservation rate of each seed-matched motif were computed for all 3' UTRs. The Spearman
correlation¹ between the general and specific conservation rates for each motif was computed
across all human 3' UTRs, resulting in 314 correlation coefficients (one for each of the
Targetscan seed motifs of conserved miRNAs) (Fig. S3). 309 out of 314 of the motifs exhibit
significant correlations (p < 0.01). To ensure that the correlation is not primarily due to unusually
short 3' UTRs, the correlations were recomputed using only 3' UTRs that are longer than 1000
bp; the same result holds (Fig. S3). The significant correlations persist when the correlations
between general conservation rate and the occurrence count of each motif are computed
(309/314 have p < 0.01), even though the absolute correlation coefficients are lower (Fig. S4).
This analysis strongly indicates that non-specific conservation rate is a strong predictor for the
conservation rate of specific motifs. Therefore, an effective null model has to take the general
conservation level of a gene set into account. For instance, genes in many biological gene sets,
such as the human PIP3 signaling pathway in cardiac myocytes, have significantly higher general
conservation levels than the rest of the genome (Fig. S5).
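The conservation-rate definition and the rank correlation used in this analysis can be illustrated with a short sketch. The function names and the toy inputs are hypothetical; how conserved 7-mers are identified is outside the scope of the example and is taken as a given count.

```python
import math
from statistics import mean

def conservation_rate(n_conserved_7mers, utr_length):
    """Conserved 7-mers divided by the total number of 7-mers (UTR length - 6)."""
    total_7mers = utr_length - 6
    if total_7mers <= 0:
        raise ValueError("3' UTR is shorter than 7 nt")
    return n_conserved_7mers / total_7mers

def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman correlation: the Pearson correlation of the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    mx, my = mean(rx), mean(ry)
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = math.sqrt(sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry))
    return num / den
```

In this setting, `xs` would hold the general conservation rate of each 3' UTR and `ys` the conservation rate of one seed-matched motif in the same UTRs, giving one coefficient per motif.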
GC content A key property used to compute the context score is the GC content around the seed
match: higher GC contents can lead to more stable local secondary structures that block miRNA-RISC access (Grimson et al. 2007). This implies that the overall GC content of the 3' UTR can
have an effect on the context score. To investigate this possibility, the percentage of bases that are
either G or C was computed for each 3' UTR. The Spearman correlation between the percent-GC
and the context score for each type of seed match was computed across all 3' UTRs, resulting in
314 correlation coefficients (Fig. S6). 304 out of 314 motifs exhibit significant negative
correlation at p < 0.01, indicating that the overall GC content of the 3' UTR is a strong predictor
of the context score.
Correlation between different factors Significant pair-wise correlation exists between length
(L), GC-content (GC), and general conservation rate (C) across human 3' UTRs, indicating that
accounting for systematic biases introduced by any one of the factors alone can over- or under-compensate
others (table below). An effective null model needs to consider all factors
simultaneously (see below).
¹ A non-parametric correlation measure is used because the normality assumption does not hold.
Variable pair    Spearman correlation    Simulated p value
L-C               0.185                  0
L-GC             -0.085                  0
C-GC             -0.125                  0
Additional factors So far three gene set properties (length, GC content and general conservation)
that can introduce systematic biases have been discussed. A key question is whether additional
factors need to be considered. In other words, are other factors largely conditionally independent²
of the test statistics given L, GC, and C? This is a difficult question to answer empirically because
there are a large number of possible factors. For instance, can the occurrence rates of certain k-mers (k = 2, 3, 4, ...) affect the context score and/or evolutionary rate of certain seed-matched
motifs? The frequency of a given k-mer can affect the frequency of motifs containing
subsequences that are correlated in frequency to the k-mer. However, aside from OC, our test
statistics are conditional on N, so factors that affect motif frequencies are unlikely to have a
significant effect (as discussed in the main text, OC is only used in the composite score but is not
used alone as an indication of functional targeting). A related concern is that the evolutionary rate
of a subset of the motifs may be dependent upon the frequency of some k-mers, but such
dependencies should be largely captured by the general conservation rate measure, especially if
the number of affected motifs is relatively large. In fact, one would not want to miss the signal if
the differential rate is specific to a small set of motifs, because such signals can reflect constraints
imposed by miRNA-mediated regulation. L, GC, and C are likely the most direct gene-set
properties that affect the test statistics. The p value distributions from the analysis of a large
number of biological gene sets (using OC-CE-CTX) indicate that a null model that accounts for
these three factors is effective (i.e. the distribution is quite uniform). In addition, our formulation
of the null model and our method to compute the null distribution do not preclude the
incorporation of additional factors (see below). In fact, in principle any combination of factors
can be incorporated.
Defining the null model
The above analysis indicates that an effective null model can be defined based on comparable
random gene sets, i.e. ones that have similar L, GC and C distributions as the given gene set (G).
Formally, given a statistic S (e.g. K | N) and a gene set G, whose genes have a joint empirical
(L, GC, C) distribution D (i.e. L, GC, C | G ~ D), the goal is to obtain the distribution of S | D. By
conditioning on D, this model formally requires that the random gene sets have similar properties
as G. Note how this definition allows the incorporation of additional factors by conditioning on a
joint distribution. The p values of the observed statistics of G can be computed from the S | D
distribution.
² A random variable X is conditionally independent of Y given Z if P(X, Y | Z) = P(X | Z) · P(Y | Z).
In other words, all correlation between X and Y is through Z; once Z is fixed, X and Y are no longer correlated.
The advantage of this model is that the joint empirical (L, GC, C) distribution of G is taken into
account, but the computation of the null distributions can be challenging. A simpler alternative is
to only condition on a summary statistic of the empirical distribution, such as the mean or
median, to account for overall trends. However, this is problematic if the higher moments of the
empirical distribution are also significantly different from the genome-wide distribution. Below a
novel sampling scheme is introduced to compute the null distribution of any gene-set based
statistic given the (L, GC, C) distribution of G.
Computing the null distributions
Given G with n genes (or putative targets), a direct way to compute the null distribution is to
generate random gene sets by sampling n genes from the genome according to the empirical
distribution D. One approach to accomplish this is to repeatedly draw a sample from D (i.e. an (l,
gc, c) triple) and pick a gene whose length, GC content, and general conservation are closest to the
drawn sample. This sampling procedure requires that a parametric form be fitted to the empirical
(L,GC,C) distribution; the joint density can also be obtained by techniques such as kernel-based
estimation (Duda et al. 2001). We opted to pursue the latter because it is non-parametric and
purely data driven, and can thus avoid potential biases introduced by parametric models; it also
allows the easy incorporation of additional conditioning factors because different parametric
models are likely needed for different combinations of factors.
A kernel-based estimator fits a given empirical density by a set of parameterized functions called
kernels. The density function is the sum of kernel functions defined over the domain of the
random variable(s). Formally, let $f_i(x \mid \theta_i)$ be the ith kernel with parameter vector $\theta_i$; the
estimated density is $f(x) = \sum_{i=1}^{n_k} f_i(x \mid \theta_i)$, where x can be a vector and $n_k$ is the total number
of kernels.
A simple example of a kernel-based density estimation procedure is the construction of
histograms from data (Fig. S7). The kernels in this case are constant functions in a defined
interval. Each kernel is parameterized by two parameters: location and height. For instance, a
one-dimensional kernel has the form:
$$f(x) = \begin{cases} h & \text{if } x \in [a, b] \\ 0 & \text{otherwise} \end{cases}$$
where [a,b] specifies the location and h specifies the height (or probability mass) in [a,b]. The
location of the kernels is determined by the center of each bin and the height reflects the number
of data points that fall within the bin (Fig. S7). The location parameter in a multidimensional
kernel specifies a hypercube. The size or volume (also called the bandwidth) of the location
parameter (e.g. |b - a| in the 1-d case) is a key factor that determines the performance of the estimator.
Ideally the bandwidth should always be small if sufficient data are available, because if the
bandwidth were too large each data point would exert bias on the density of the nearby points.
However, in practice, data can be limiting and hence the bandwidth parameter needs to be
optimized so that the maximum amount of information can be extracted from the data with
minimum bias (Turlach 1993).
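The histogram view of kernel density estimation can be made concrete with a small sketch. It assumes equal-width bins over a fixed range and normalizes the heights so that the density integrates to one; the function name and arguments are hypothetical.

```python
def histogram_density(data, lo, hi, n_bins):
    """Piecewise-constant density estimate with n_bins equal-width bins on [lo, hi]."""
    width = (hi - lo) / n_bins          # the bandwidth |b - a| of each constant kernel
    counts = [0] * n_bins
    for x in data:
        if lo <= x <= hi:
            # The right edge hi is assigned to the last bin.
            counts[min(int((x - lo) / width), n_bins - 1)] += 1
    n = sum(counts)
    if n == 0:
        raise ValueError("no data points fall inside [lo, hi]")
    # Height h of each kernel: fraction of points in the bin divided by bin width.
    heights = [c / (n * width) for c in counts]

    def density(x):
        if not lo <= x <= hi:
            return 0.0
        return heights[min(int((x - lo) / width), n_bins - 1)]

    return density

f = histogram_density([0.1, 0.2, 0.3, 0.9], lo=0.0, hi=1.0, n_bins=2)
```

With more bins the estimate follows the data more closely but becomes noisier, which is the bandwidth trade-off described above.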
A common approach is to use one kernel per data point and then infer the bandwidth parameter,
either individually for each kernel or one for all kernels. Gaussian kernels are often used because
they have a tractable analytical form and nicely model the intuitive notion that the density
influence of a data point should gradually diminish as one moves away from the data point (rather
than abruptly going to 0 if a constant function is used). For instance, given n one-dimensional
data points $d_i$, the estimated density is $f(x) = \frac{1}{n} \sum_{i=1}^{n} g(x \mid d_i, \sigma^2)$, where $g(\cdot \mid \mu, \sigma^2)$ denotes
the Gaussian density with mean $\mu$ and variance $\sigma^2$ (Fig. S8). Sampling from such kernel-based
densities is straightforward: one can randomly pick one of the kernels and sample according to
the kernel density.
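A one-dimensional Gaussian kernel estimate and the two-stage sampling it permits can be sketched as below. The sketch assumes a single shared bandwidth for all kernels; the function names are hypothetical.

```python
import math
import random

def gaussian_pdf(x, mu, sigma):
    """Density of Normal(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def kde(x, data, sigma):
    """Estimated density at x: the average of one Gaussian kernel per data point."""
    return sum(gaussian_pdf(x, d, sigma) for d in data) / len(data)

def sample_kde(data, sigma, rng):
    """Pick one kernel uniformly at random, then sample from that kernel."""
    center = rng.choice(data)
    return rng.gauss(center, sigma)

rng = random.Random(0)
points = [0.0, 1.0, 2.0]
draws = [sample_kde(points, 0.5, rng) for _ in range(5)]
```

Sampling never requires evaluating the full density: choosing a kernel and then drawing from it yields samples distributed according to the mixture.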
Gene-neighborhood sampling
Multidimensional Gaussian kernels (i.e. in L-GC-C space), one per gene in the input gene set G,
can be used to obtain the empirical (L, GC, C) distribution of G. The following algorithm can be
used to generate a random gene set:
For each gene g in G,
1. Sample an (l, gc, c) triple from the Gaussian kernel of g.
2. Find the gene in the genome whose 3' UTR length, GC content, and general conservation are
the closest to (l, gc, c).
To evaluate "closeness" in the second step, a distance metric is needed in the L-GC-C space. The
Euclidean distance can be used after normalizing each dimension by its mean and standard
deviation³ to ensure that the variables with larger absolute magnitudes (e.g. 3' UTR length) do not
dominate the distance measure.
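A direct version of the two-step procedure, with the normalized Euclidean distance, might look like the sketch below. It assumes an isotropic Gaussian kernel with one shared bandwidth in normalized space, and the toy genome values are invented for illustration.

```python
import math
import random
from statistics import mean, stdev

def normalize(genes):
    """genes: dict name -> (l, gc, c). Returns z-scored triples per gene."""
    columns = list(zip(*genes.values()))
    mus = [mean(col) for col in columns]
    sds = [stdev(col) for col in columns]
    return {
        name: tuple((v - m) / s for v, m, s in zip(triple, mus, sds))
        for name, triple in genes.items()
    }

def closest_gene(point, normalized):
    """Linear scan for the gene nearest to `point` in normalized L-GC-C space."""
    return min(normalized, key=lambda name: math.dist(point, normalized[name]))

def sample_gene_set(gene_set, normalized, sigma, rng):
    """For each gene g: draw a triple from its kernel, return the closest gene."""
    sampled = []
    for g in gene_set:
        point = tuple(rng.gauss(v, sigma) for v in normalized[g])
        sampled.append(closest_gene(point, normalized))
    return sampled

# Toy genome: name -> (3' UTR length, GC fraction, general conservation rate).
genome = {
    "A": (500, 0.40, 0.10),
    "B": (1500, 0.45, 0.20),
    "C": (3000, 0.55, 0.05),
    "D": (800, 0.60, 0.30),
}
normalized = normalize(genome)
random_set = sample_gene_set(["A", "C"], normalized, sigma=0.5, rng=random.Random(1))
```

Each call to `closest_gene` scans the whole genome, which is the cost the indexed variants are meant to avoid.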
A verbatim implementation of this algorithm can be inefficient because locating the closest gene
for any given (l, gc, c) takes time proportional to the number of genes in the genome. However,
note that for each g, the above algorithm is equivalent to sampling from genes that are close to g
in the L-GC-C space (i.e. the neighbors of g), so by indexing the neighbors using their normalized
Euclidean distance to g, the look-up step for the closest gene can be made more efficient:
1. For every gene u in the genome, sort all genes in the genome in the order of normalized Euclidean
distance to u; index them by the distance.
2. For each gene g in G
a. sample an (l, gc, c) triple from the Gaussian kernel of g
b. determine the distance d between (l, gc, c) and g
c. use d to look up the index to obtain the closest gene
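³ For the (l, gc, c) triple associated with each gene, the normalized length, GC content, and general
conservation level are $\left(\frac{l - \langle L \rangle}{\sigma_L}, \frac{gc - \langle GC \rangle}{\sigma_{GC}}, \frac{c - \langle C \rangle}{\sigma_C}\right)$, where $\langle \cdot \rangle$ and $\sigma$ are the mean and standard deviation of
the respective variables.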
Note that in this algorithm the sampling from L-GC-C space essentially reduces down to
sampling from the distance space, i.e. each (l, gc, c) triple sampled was converted to d, which is
the critical parameter for locating which gene to pick. Hence a one-dimensional kernel in distance
space can be defined for each gene in G to replace the three-dimensional L-GC-C kernel. The
distance-space sampling can be further simplified to distance-rank-space sampling:
1. For every gene u in the genome, assign ranks to all genes in the genome based on their
normalized Euclidean distance to u (e.g. the closest gene has rank 1, next has rank 2, and so
on).
2. For each gene g in G
a. sample a rank from the Gaussian kernel of g (draw a sample from the Gaussian, take
the absolute value and round to the nearest integer).
b. return the gene with the sampled rank
Note that the rank is gene-dependent and can correspond to different actual distance units across
genes. A rank-based kernel, such as the one used above, is desirable if one wants to ensure that
every gene has an equal-size sampling neighborhood (i.e. with the same number of genes). This
makes intuitive sense in that if a gene in G resides in a sparse neighborhood in the L-GC-C space,
its effect on the mass of the estimated density in L-GC-C space around the neighborhood should
be broader. This is equivalent to scaling the kernel bandwidth in distance space by the gene
density around the gene (i.e. genes with rare L-GC-C attributes have a kernel with larger
bandwidth).
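The distance-rank-space procedure can be sketched as follows. Several details are assumptions made for the example rather than statements about mirBridge: rank 0 is taken to mean the gene itself, sampled ranks beyond the genome size are clamped to the farthest gene, and the coordinates are toy values that are already normalized.

```python
import math
import random

def rank_neighbors(normalized):
    """For each gene u, all genes sorted by normalized Euclidean distance to u.

    Index 0 holds u itself (distance 0) when no other gene shares its coordinates.
    """
    return {
        u: sorted(normalized, key=lambda v: math.dist(normalized[u], normalized[v]))
        for u in normalized
    }

def sample_by_rank(gene_set, neighbors, sigma, rng):
    """Draw |Normal(0, sigma)|, round to an integer rank, and return that neighbor."""
    sampled = []
    for g in gene_set:
        rank = round(abs(rng.gauss(0.0, sigma)))
        rank = min(rank, len(neighbors[g]) - 1)   # clamp (assumption)
        sampled.append(neighbors[g][rank])
    return sampled

# Toy normalized (l, gc, c) coordinates, invented for illustration.
normalized = {
    "A": (0.0, 0.0, 0.0),
    "B": (1.0, 0.0, 0.0),
    "C": (3.0, 0.0, 0.0),
    "D": (7.0, 0.0, 0.0),
}
neighbors = rank_neighbors(normalized)
random_set = sample_by_rank(["B", "C"], neighbors, sigma=1.5, rng=random.Random(2))
```

The sorted neighbor lists are computed once for the whole genome, after which each draw is a single list lookup.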
The parameter remaining to be specified is the bandwidth of the kernels (i.e. the σ of the Gaussians).
If σ is too large, the L-GC-C distribution of the random gene sets would be significantly different
from G; whereas a small σ can lead to bias as illustrated in Fig. S8. In practice σ is largely a
function of the size of G. To determine a reasonable σ, we use the algorithm above to draw
random gene sets using different σ and compare the L, GC and C distributions of each random
set to the respective L, GC and C distributions of G. For each σ, a large number (>100) of
random gene sets are used so that an average deviation based on the Kolmogorov-Smirnov Test
can be computed. The largest σ that does not result in an average deviation greater than a pre-specified threshold⁴ from the L-GC-C distributions of G can be used as a good bandwidth
estimate. An example can be found in Fig. S9.
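The bandwidth search can be sketched for a single variable as below. The sketch assumes the deviation is the two-sample Kolmogorov-Smirnov statistic (the maximum gap between empirical CDFs); the text does not specify the exact deviation measure, so the threshold used here is hypothetical and not comparable to the value in the thesis. The `draw_random_set` callback is also a placeholder for the gene-neighborhood sampler.

```python
from bisect import bisect_right
from statistics import mean

def ks_statistic(xs, ys):
    """Two-sample Kolmogorov-Smirnov statistic: max |ECDF_x(p) - ECDF_y(p)|."""
    xs, ys = sorted(xs), sorted(ys)
    deviation = 0.0
    for p in sorted(set(xs) | set(ys)):
        fx = bisect_right(xs, p) / len(xs)
        fy = bisect_right(ys, p) / len(ys)
        deviation = max(deviation, abs(fx - fy))
    return deviation

def choose_bandwidth(sigmas, draw_random_set, reference, threshold, n_sets=100):
    """Largest sigma whose mean KS deviation from `reference` is <= threshold.

    draw_random_set(sigma) -- hypothetical callback returning one random set's
                              values for the variable being compared (L, GC, or C)
    """
    best = None
    for sigma in sorted(sigmas):
        mean_deviation = mean(
            ks_statistic(draw_random_set(sigma), reference) for _ in range(n_sets)
        )
        if mean_deviation <= threshold:
            best = sigma
    return best
```

In a full application the comparison would be repeated for each of L, GC, and C, and a candidate σ accepted only if all three deviations stay within the threshold.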
Compiling high-quality putative target sets
To compile high-quality putative target (HPT) sets for co-targeting analysis (and also for
examining HPTs within gene sets), we aim to include Targetscan predictions that have at
least one perfectly conserved seed match and/or at least one seed-matched site
that has a high context score. To infer a good context score cutoff, we examined the context score
distributions of conserved and non-conserved seed-matched sites (Fig. S10). Below (above) a
context score of ~68, non-conserved (conserved) sites are enriched. This suggests that a context
score of 68 is a good cutoff to use for inferring high-quality non-conserved sites if we make the
plausible assumption that conserved sites are enriched with true positives. Thus we defined high-quality targets as ones having at least one conserved seed-matched site and/or ones having at least
one seed-matched site with a context score greater than 68.
⁴ Currently set to 0.67, which was determined based on a simulation experiment.
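The selection rule for high-quality putative targets reduces to a one-line predicate. The data layout (a list of `(conserved, context_score)` pairs per predicted target) is an assumption made for illustration, as is the function name.

```python
def is_high_quality_target(sites, cutoff=68):
    """True if any seed-matched site is conserved or has a context score above cutoff.

    sites -- iterable of (conserved: bool, context_score: float), one per site
    """
    return any(conserved or score > cutoff for conserved, score in sites)

# Hypothetical targets, each with the seed-matched sites found in its 3' UTR.
targets = {
    "gene1": [(False, 40.0), (True, 30.0)],   # one conserved site
    "gene2": [(False, 75.0)],                 # non-conserved, high context score
    "gene3": [(False, 68.0), (False, 10.0)],  # neither criterion met
}
hpt_set = sorted(name for name, sites in targets.items() if is_high_quality_target(sites))
```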
The connection between CE and prior tests that use evolutionary conservation
The CE test is fundamentally different from a couple of seemingly similar tests (Lewis et al.
2005, Stark et al. 2005): CE evaluates the degree of gene set-specific conservation of the miRNA
target sequence above that of the same sequence in comparable random gene sets, whereas the
earlier tests evaluate whether the conservation level of the target sequence is significantly above
that of random sequences in the same gene set. miRNA target sequences are typically
significantly more conserved than random sequences across all genes and gene categories (Stark
et al. 2005, Xie et al. 2005). Thus, merely having higher conservation than random motifs in the
same gene set may not be sufficiently specific to establish functional linkage between a miRNA
and a gene set; the type of conservation enrichment detected by the CE test is more appropriate.
Sensitivity and specificity of the OC-CE-CTX test: Alternative test scores and comparisons
Other combinations of the three basic tests (CE, CTX and OC) are possible. For instance, by
combining the CE and CTX tests one can form the "CE-CTX" score, which can lead to miRNA-gene set predictions solely from known functional targeting signals (i.e. conservation and
favorable 3' UTR sequence context). Comparing the performance of different tests is difficult
because true positives (i.e. known miRNA functions), especially in the context of pathways, are
lacking. Below we discuss several analyses that suggest OC-CE-CTX has the best sensitivity and
specificity among tests that use the three basic scores. Specifically, we will compare two basic
tests (CE and CTX) and the CE-CTX composite test to the OC-CE-CTX test using the pathway
and module gene sets. While other tests are possible, e.g. OC-CTX and OC-CE, their utility is
clearly bested by the OC-CE-CTX test (and the OC test alone is insufficient to suggest functional
targeting as discussed in the main text).
At a global FDR cutoff of 0.2 (across gene-set and seed-motif combinations), the CE, CTX, CE-CTX, and OC-CE-CTX tests predict 7, 1, 37 and 215 miRNA-gene-set associations, respectively,
for the pathway gene sets, and 4, 2, 23, and 186 associations, respectively, for the module gene sets.
The CE and CTX predictions are all in the CE-CTX and OC-CE-CTX lists, indicating that, as
expected, the composite tests are more sensitive. Below we focus on comparing the CE-CTX and
OC-CE-CTX pathway prediction results.
The CE-CTX pathway predictions are largely in the OC-CE-CTX set, except four pathways with
higher (close to 0.2) CE-CTX q values (in the case of modules, only one prediction is in CE-CTX
exclusively; we only focus on the pathway results in the discussion below as the module results
share the same trend). However, the relative rankings of some individual predictions (based on the
q values) differ between the OC-CE-CTX and CE-CTX lists. For example, predictions
ranked near the top of the CE-CTX list but having a low OC score are ranked lower in the OC-CE-CTX predicted list. The miR-1-PIP3 association is such an example: it has a higher
rank (9/37 versus 72/215) and a more significant q value (0.065 versus 0.098) in the CE-CTX list
because the number of putative miR-1 binding sites is not unusually high (p = 0.38) in the PIP3
gene set (even though the proportions of conserved and high-context-scoring sites are unusually
high, which is the basis of the significant CE and CTX scores). The fact that the OC-CE-CTX test only
excludes a few CE-CTX predictions with higher q values is encouraging, as this suggests that OC-CE-CTX achieves higher sensitivity (i.e. a significantly larger number of predictions) without
sacrificing specificity (that is, OC-CE-CTX selectively excludes only the less-confident
predictions in the CE-CTX list; see below).
To infer whether the additional predictions made by OC-CE-CTX are enriched for true positives,
we compare the CE-CTX p-value distribution of miRNA-pathway pairs that are exclusively
predicted by OC-CE-CTX to that of miRNA-pathway pairs not predicted by OC-CE-CTX. If the
use of OC signals by OC-CE-CTX largely results in false positives, we expect the two
distributions to be statistically indistinguishable (they would also have comparable median p
values). However, the two distributions are drastically different ($p < 2.3 \times 10^{-15}$,
Kolmogorov-Smirnov Test) and the median p values are 0.009 and 0.5 respectively (their
difference is highly significant: $p < 4.4 \times 10^{-3}$, Mann-Whitney Test). Reassuringly, the
latter distribution is essentially uniform, as is expected for p values randomly drawn from the
null. Furthermore, if we compute the CE-CTX q values by using only those miRNA-pathway
pairs predicted by OC-CE-CTX exclusively, all CE-CTX pairs would have a q value smaller than
0.2. This suggests that these pairs had insignificant CE-CTX q values (>0.2) only because the
CE-CTX test has insufficient statistical power when many miRNAs and gene sets are tested
simultaneously.
In stark contrast to the analysis of pathway and module gene sets, CE alone gives a much larger
number of miRNA-miRNA co-targeting predictions at an FDR cutoff of 0.2 than both CE-CTX
and OC-CE-CTX (3053, 85, and 221 distinct miRNA-family pairs predicted by CE, CE-CTX,
and OC-CE-CTX tests, respectively). A majority (>75%) of the CE predictions that overlap with
those of OC-CE-CTX have small CE q values (<0.1), while more than 90% of non-overlapping
pairs have CE q values larger than 0.1. This strongly suggests that a large percentage of the non-overlapping predictions are false positives, where OC-CE-CTX excludes them because they are
not simultaneously supported by other tests (CTX and/or OC). This apparent lack of specificity of
CE compared to the composite tests indicates that the non-specific conservation biases in these
predicted target sets are extremely strong; only by combining multiple tests that use different
aspects of functional targeting can we enrich for true positives. Our method for correcting for
non-specific conservation bias has already helped significantly as CE gives a significantly smaller
number of predictions than gene-set overlap analysis using Fisher's Exact Test at the same FDR
cutoff (see main text). Similar to the results in pathway analysis, CE-CTX and OC-CE-CTX
results are largely overlapping (70 out of 85 CE-CTX predictions are in the OC-CE-CTX list).
Taken together, our analyses strongly suggest that the OC-CE-CTX test has significantly better
sensitivity and specificity than other tests.
Supplemental References
Duda RO, Hart PE, Stork DG. Pattern classification, 2nd edn (New York: Wiley) (2001).
Turlach BA. Bandwidth selection in kernel density estimation: A review, Paper presented at:
Discussion Paper 9317 (Voie du Roman Pays 34, B-1348 Louvain-la-Neuve, Belgium: Institut de
Statistique) (1993).
Xie X, Lu J, Kulbokas EJ, Golub TR, Mootha V, Lindblad-Toh K, Lander ES, Kellis M.
Systematic discovery of regulatory motifs in human promoters and 3' UTRs by comparison of
several mammals. Nature 434, 338-345 (2005).
Supplemental Tables are available online at Molecular Cell.
Gene sets used for miR-218 analysis
Glutamate set
Slc1A1
Slc1A2
Slc1A3
Slc1A6
Slc1A7
Slc17A6
Slc17A7
Slc17A8
Grm1
Grm2
Grm3
Grm4
Grm5
Grm6
Grm7
Grm8
Grik1
Grik2
Grik3
Grik4
Gria1
Gria2
Gria3
Gria4
Grin1
Grin2A
Grin2B
Grin2C
Grin2D
Grin3A
Grin3B
GrinA
GrinL1A
Grid1
Grid2
Homer1
Homer2
Homer3
GLS
GAD
GLUL
GABA set
SLC6A1
SLC6A11
SLC6A13
SLC32A1
GABRA1
GABRA2
GABRA3
GABRA4
GABRA5
GABRA6
GABRB1
GABRB2
GABRB3
GABRD
GABRE
GABRG1
GABRG2
GABRG3
GABRP
GABRQ
GABRR1
GABRR2
GAD
ABAT
ALDH5A1
Dopamine set
DRD2
DRD3
DRD4
DRD5
DBH
DDC
COMT
MAOA
SLC6A3
TYR
TH
PAH
SLC29A4
Serotonin set
HTR1A
HTR1B
HTR1D
HTR1E
HTR1F
HTR2A
HTR2C
HTR3A
HTR4
HTR6
HTOR
SLC6A4
HTR5A
5HTT
HTR7
HTR2B
HTR3B
HTR5A
HTR3E
HTR3D
HTR5B
TPH
MAOA
SLC29A4
Adrenaline/epinephrine set
ADRA1A
ADRA1B
ADRA1D
ADRA2A
ADRA2B
ADRA2C
ADRB1
ADRB2
ADRB3
ADRBK1
ADRBK2
COMT
PNMT
TH
DBH
Synaptic vesicle formation set
BSN
RAPGEF4
RIMS1
RIMS2
PCLO
UNC13A
ERC2
SV2A
SV2B
NAPA
STXBP1
SYT1
CPLX1
CPLX2
NSF
Curriculum vitae
Margaret Ebert
Contact Information
Date of birth: June 25, 1981
work phone: (617) 253-6458
email: ebertms@mit.edu
Education
Hopewell Valley Central High School, 1995-1999
Yale University, B.S. in Molecular, Cellular, and Developmental Biology, May 2003
University of Cambridge, M.Phil. in Molecular Biology (Medical Research Council
Laboratory of Molecular Biology), Aug 2004
Massachusetts Institute of Technology, Ph.D. candidate in Biology, Sept 2004-June 2010
Awards and Honors
Beckman Scholarship for undergraduate research, May 2001-Aug 2002
Phi Beta Kappa, fall 2002
Editor-in-Chief, Yale Scientific Magazine, 2002-2003
Churchill Scholarship, 2003-2004
Howard Hughes Medical Institute Predoctoral Fellowship, 2004-2009
Yale College Chittenden Prize, May 2003 (highest academic record among all science
majors in the Class of 2003)
Yale College Belknap Prize for senior research in biology
Paul and Cleo Schimmel Scholarship, 2006-2009
Gene Brown-Merck Teaching Award, 2009
Teaching assistantships
Introduction to Experimental Biology and Communication, spring 2006
Molecular and Engineering Aspects of Biotechnology, spring 2009
Research experience
James Anderson, Yale University, summer 2000, tight junction complexes
Ronald Breaker, Yale University, summer 2001-2003, riboswitches in prokaryotes
Savithramma Dinesh-Kumar, Yale University, summer 2003, RNA silencing in plants
Andrew Griffiths, MRC LMB, 2003-2004, in vitro selection of proteins
Phillip Sharp, MIT, 2005-, microRNAs in mammalian cells
Publications
Nahvi A, Sudarsan N, Ebert MS, Zou X, Brown KL, Breaker RR. Genetic control by a
metabolite binding mRNA. Chem. Biol. 9, 1043-1049 (2002).
Sudarsan N, Wickiser JK, Nakamura S, Ebert MS, Breaker RR. An mRNA structure in
bacteria that controls gene expression by binding lysine. Genes Dev. 17, 2688-2697
(2003).
Ebert MS, Neilson JR, Sharp PA. MicroRNA sponges: competitive inhibitors of small
RNAs in mammalian cells. Nat. Methods. 4, 721-726 (2007).
Kumar MS, Erkeland SJ, Pester RE, Chen CY, Ebert MS, Sharp PA, Jacks T.
Suppression of non-small cell lung tumor development by the let-7 microRNA family.
Proc. Natl Acad. Sci. USA 105, 3903-3908 (2008).
Tsang JS, Ebert MS, van Oudenaarden A. Genome-wide dissection of microRNA
functions and co-targeting networks using gene-set signatures. Mol. Cell 38, 140-53
(2010).
Gatt ME, Ebert MS, Mani M, Zhang Y, Gazit R, Carrasco DE, Dutta J, Adamia S,
Munshi NC, Minvielle S, Avet-Loiseau H, Tai Y-T, Anderson KC, Carrasco DR.
MicroRNAs 15a/16-1 function as tumor suppressor genes in multiple myeloma.
Submitted (2010).
Mukherji S*, Ebert MS*, Zheng GZ, Tsang JS, Sharp PA, van Oudenaarden A.
MicroRNAs generate gene expression thresholds with ultrasensitive transitions.
Submitted (2010). *co-first author
Ebert MS, Sharp PA. MicroRNA sponges: progress and possibilities. Submitted (2010).
Ebert MS, Sharp PA. Roles for microRNAs in conferring robustness to biological
processes. Submitted (2010).
Ebert MS, Sharp PA. Emerging roles for natural microRNA sponges. Submitted (2010).
Patents
Riboswitches, methods for their use, and compositions for use with riboswitches. Breaker
R, Nahvi A, Sudarsan N, Ebert MS, Winkler W, Barrick JE, Wickiser J. (2003).