Computational and experimental analysis of plant microRNAs by Matthew W. Jones-Rhoades

Computational and experimental analysis of plant microRNAs
by
Matthew W. Jones-Rhoades
B.A., Chemistry (1999)
Grinnell College
Submitted to the Department of Biology in Partial
Fulfillment of the Requirement for the Degree of
Doctor of Philosophy in Biology
at the
Massachusetts Institute of Technology
June 2005
© 2005 Massachusetts Institute of Technology
All rights reserved
Signature of Author: Department of Biology, May 20, 2005
Certified by: David P. Bartel, Professor of Biology, Thesis Supervisor
Accepted by: Stephen P. Bell, Professor of Biology, Chairman, Graduate Student Committee
Computational and experimental analysis of plant microRNAs
Matthew W. Jones-Rhoades
Submitted to the Department of Biology on May 20, 2005 in
Partial Fulfillment of the Requirement for the Degree of
Doctor of Philosophy in Biology
ABSTRACT
MicroRNAs (miRNAs) are small, endogenous, non-coding RNAs that mediate gene regulation
in plants and animals.
We demonstrated that Arabidopsis thaliana miRNAs are highly complementary (0-3 mispairs in
an ungapped alignment) to more mRNAs than would be expected by chance. These mRNAs are
therefore putative regulatory targets of their complementary miRNAs. Many miRNA
complementary sites are conserved to the monocot Oryza sativa (rice), implying evolutionary
conservation based on function at the nucleotide level. The majority of predicted miRNA targets
encode transcription factors and other proteins with known or inferred roles in developmental
patterning, implying that the miRNAs themselves are high-level regulators of development. Our
findings indicated that miRNAs are key components of numerous regulatory circuits in plants
and set the stage for numerous additional experiments to investigate in depth the significance of
miRNA-mediated regulation for particular target families and genes.
We developed a comparative genomics approach to identify miRNAs and miRNA targets
conserved between Arabidopsis and Oryza. Seven previously unknown miRNA families were
experimentally verified, bringing the total number of known miRNA genes in Arabidopsis to 92,
representing 22 families. We expanded the range of functionalities known to be regulated by
miRNAs to include F-box proteins, laccases, superoxide dismutases, and ATP-sulfurylases. The
expression of miR395, which targets sulfate metabolizing enzymes, is induced by sulfate starvation, demonstrating that miRNA expression can be responsive to growth conditions.
We investigated the biological role of miR394-mediated regulation of At1g27340, an F-box gene
of previously unknown function. Transgenic plants expressing a miR394-resistant version of
At1g27340 displayed a range of developmental abnormalities, including radialized and fused
cotyledons, absent shoot apical meristems, curled and radialized leaves, and abortive flowers.
The severity of these abnormalities correlated with the overaccumulation of At1g27340 mRNA.
These findings confirm the biological relevance of the interaction between miR394 and
At1g27340, and represent the first insights into the roles of miRNA-mediated regulation of F-box
genes. Our results establish that both MIR394 and At1g27340 are important regulators of
meristem identity, and suggest that At1g27340 targets an activator of class III HD-ZIP function
for ubiquitination and proteolysis.
Thesis Supervisor: David Bartel
Title: Professor of Biology
Acknowledgements
I would like to thank David Bartel for being a tremendous advisor, role model, and friend
throughout my time at MIT. I would like to thank all the past and present members of the Bartel
lab, especially Mike Axtell, Scott Baskerville, Nelson Lau, Ben Lewis, Lee Lim, Allison
Mallory, Ramya Rajagopalan, Brenda Reinhart, I-hung Shih, Hervé Vaucheret, and Soraya
Yekta. You have been a joy to work and play with, and a continual source of reagents, help, and
advice. I would also like to thank Bonnie Bartel, without whom the analysis of plant miRNAs
would have been ten times harder. I would like to thank my family, especially my parents, for
always believing in me and supporting me.
Most importantly, I would like to thank my wife Melinda for being my partner and my
best friend. Thanks for putting up with all the late nights and odd-hours; you were always there
to encourage me when things went well and to give me perspective when they didn't.
Table of contents
Abstract
Acknowledgements
Table of contents
Introduction
Chapter I
Prediction of plant microRNA targets
Chapter I has been published previously as:
Rhoades MW, Reinhart BJ, Lim LP, Burge CB, Bartel B, and Bartel DP,
"Prediction of plant microRNA targets." Cell 110(4): 513-520 (2002) © Cell Press.
Chapter II
Computational identification of plant microRNAs and their targets,
including a stress-induced miRNA
Chapter II has been published previously as:
Jones-Rhoades MW, and Bartel DP, "Computational identification of plant
microRNAs and their targets, including a stress-induced miRNA." Molecular Cell
14(6): 787-799 (2004) © Cell Press.
Chapter III
MicroRNA-mediated regulation of an F-box gene is required for
embryonic, floral, and vegetative development
Appendix A
Arabidopsis, Oryza, and Populus miRNA complementary sites
Appendix B
Appendix B has been published previously as:
Reinhart BJ, Weinstein EG, Rhoades MW, Bartel B, and Bartel DP,
"MicroRNAs in plants." Genes & Development 16(13): 1616-1626 © Cold
Spring Harbor Laboratory Press.
Appendix C
Appendix C has been published previously as:
Lewis BP, Shih IH, Jones-Rhoades MW, Bartel DP, and Burge CB, "Prediction
of mammalian microRNA targets." Cell 115(7): 787-798 (2003) © Cell Press.
Introduction
The biology of multicellular organisms requires a complex network of gene regulatory
pathways. MicroRNAs (miRNAs) are key components of this network which had been
overlooked until recently. Initially discovered as regulators of developmental timing in C.
elegans, miRNAs are now known to serve in a variety of regulatory roles in both plants and
animals.
The hallmark of a miRNA is a short (~20-24 nt), endogenously expressed non-coding
RNA which is processed by RNaseIII proteins such as Dicer from a longer ssRNA precursor that
contains a stem-loop secondary structure (reviewed in (11)). MicroRNAs are chemically and
functionally similar to short interfering RNAs (siRNAs), which are processed by Dicer from long
dsRNA precursors and which are central to the related phenomena of RNA interference (RNAi),
post-transcriptional gene silencing (PTGS), and transcriptional gene silencing (TGS). Mature
miRNAs are incorporated into RNAi-induced silencing complexes (RISCs), in which the
miRNA guides repression of target genes.
Although miRNAs are deeply conserved within both the plant and animal kingdoms, there
are substantial differences in the mechanism and scope of miRNA-mediated gene regulation
between the two kingdoms, several of which have been instrumental in the rapid increase in our
understanding of plant miRNA biology. Plant miRNAs are highly complementary to conserved
target mRNAs, a fact which has allowed for the rapid and confident bioinformatic identification
of plant miRNA targets (46, 113). Plant miRNAs guide the cleavage of their complementary
mRNA targets, an activity which is readily assayed in vitro and in vivo (49, 76, 126). In
addition, Arabidopsis is a genetically tractable model organism, which has enabled the study of
the genetic pathways which underlie miRNA-mediated regulation and the phenotypic
consequences of perturbing miRNA-mediated gene regulation. The picture emerging from this
recent research is that plant miRNAs are master regulators of genetic pathways: the majority of
genes regulated by plant miRNAs are themselves regulators such as transcription factors, F-box
proteins, and RNAi related proteins.
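The complementarity criterion lends itself to a simple scan. The following Python sketch is illustrative only and is not the procedure used in the cited work: it slides the reverse complement of a miRNA along an mRNA without gaps and reports windows with at most three mispairs. The function names, the default threshold argument, and the choice to count G:U wobble pairs as mispairs are assumptions made for this example.

```python
# Illustrative sketch: ungapped search for miRNA complementary sites.
# Assumption: any non-Watson-Crick pair, including G:U wobble, is a mispair.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence (A, C, G, U only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_complementary_sites(mirna, mrna, max_mispairs=3):
    """Return (start, mispairs) for every ungapped window of the mRNA
    that pairs with the miRNA with at most max_mispairs mispairs."""
    mirna = mirna.upper().replace("T", "U")
    mrna = mrna.upper().replace("T", "U")
    ideal_site = reverse_complement(mirna)  # perfectly paired target site
    n = len(ideal_site)
    hits = []
    for start in range(len(mrna) - n + 1):
        window = mrna[start:start + n]
        mispairs = sum(1 for a, b in zip(window, ideal_site) if a != b)
        if mispairs <= max_mispairs:
            hits.append((start, mispairs))
    return hits
```

A call such as find_complementary_sites(mirna, mrna) returns a list of (start, mispairs) tuples, one per qualifying window; an empty list means no window met the threshold.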
MicroRNAs: like siRNAs, but different
Before discussing miRNAs, it is useful to consider a highly similar class of small RNAs,
the small interfering RNAs (siRNAs). In Arabidopsis, siRNAs are the majority of small RNAs
(75, 112, 124, 138), and have been implicated in a variety of pathways, including defense against
viruses, the establishment of heterochromatin, silencing of transposons and transgenes, and the
post-transcriptional regulation of mRNAs (reviewed in (13)). MicroRNAs and siRNAs have
much in common; both types of small RNAs are 20-24 nucleotides long, and both are processed
from longer RNA precursors by Dicer ribonucleases (15, 38, 43, 50). Both are incorporated into
ribonucleoprotein (RNP) complexes in which the small RNAs, through their base pairing
potential, guide repression of target genes, and, as discussed below, the mechanisms by which
they repress target genes are also similar.
The fundamental difference between the two classes, then, is the nature of their precursors;
siRNAs are processed from long, dsRNAs, whereas miRNAs are processed from RNAs that are
single-stranded but contain imperfect stem-loop secondary structures. In addition, there are a
number of general, if not absolute, characteristics that set miRNAs apart from siRNAs. Many
miRNAs are conserved between related organisms, whereas most endogenously expressed
siRNAs are not (57, 61, 63, 112). Many (but not all) siRNAs target the gene from which they are
derived. In contrast, a miRNA regulates genes unrelated to the locus from which the miRNA
was derived. In addition, although the proteins required for siRNA and miRNA biogenesis are
overlapping, in many organisms, including Arabidopsis, the genetic requirements for miRNA
and siRNA function are partially distinct. For example, many Arabidopsis siRNAs require
RNA-dependent RNA polymerases (RDRPs) for their biogenesis, whereas miRNAs do not (14,
27, 95, 138). Conversely, most Arabidopsis miRNAs require processing by DICER-LIKE1
(DCL1), one of four dicer-like genes in Arabidopsis, whereas many siRNAs require DICER-LIKE3 (DCL3) (35, 56, 112, 138).
MicroRNA Biogenesis
Like other types of cellular RNAs, miRNAs must be properly processed and localized in
order to function. The steps through which a plant miRNA must pass include 1) transcription,
2) processing into a miRNA/miRNA* duplex, 3) covalent modification, 4) export from the
nucleus, and 5) selective incorporation of the miRNA into RISC (Figure 1).
Transcription of microRNAs
In most cases, miRNAs have been initially discovered as the mature, 20-24 nucleotide
form. Presumably, these mature miRNAs are initially transcribed as part of longer transcripts
that must minimally include enough additional sequence to generate the stem-loop structures
(typically ~60-300 nucleotides in plants) that are recognized by Dicer. In several cases, miRNA
stem-loops have been shown to be contained within much longer transcripts, termed primary
miRNAs (pri-miRNAs). The overexpression of 0.5 kb and 1.4 kb transcripts that contain
miR319 and miR172 stem-loops, respectively, correlates with overaccumulation of the mature
miRNA (7, 100). miR163 is contained within a 0.7 kb transcript that can be processed into
mature miR163 (56). In addition, numerous miRNA precursors are found within ESTs from
various plant species that contain additional sequence outside of the stem-loop (7, 46, 100). At
least some of these longer pri-miRNA transcripts are spliced and appear to be poly-adenylated
(7, 56). Indeed, two rice miRNAs are contained within transcripts that contain exon junctions
within the presumptive stem-loop precursor, implying that in these cases splicing is a necessary
prerequisite for recognition by Dicer (123).
Because plant miRNAs are primarily found in genomic regions not associated with
protein coding genes (112), it appears that most miRNA genes are their own transcriptional units.
The fact that plant pri-miRNAs can be over 1 kb long, along with the fact that they can undergo
canonical splicing and polyadenylation, strongly suggests that RNA polymerase II is responsible
for transcribing most plant miRNAs, as has been shown to be the case for several animal
miRNAs (66). Relatively little is known about the promoters of plant miRNAs or the regulation
of miRNA transcription in plants.
MicroRNA processing and export
A central step in the maturation of miRNAs is the excision of the mature miRNA from
the pri-miRNA by RNaseIII-type endonucleases such as Dicer. Although the observed sizes of
Dicer products in plants range from around 20-25 nucleotides (126, 138), the plant miRNAs are
primarily 20-21 nt in length (112). In contrast, 24mers are most abundant in the population of
siRNAs cloned from Arabidopsis (126, 138). It has been suggested that different Dicer activities
are responsible for the different sizes of small RNAs observed in Arabidopsis (126). This idea
fits with genetic data, which suggests that the four Dicer-like genes in Arabidopsis have
functionally distinct roles. DICER-LIKE3 (DCL3) and DICER-LIKE2 (DCL2) process certain
endogenous siRNAs and viral derived siRNAs, respectively, but each is dispensable for miRNA
accumulation (138). In contrast, partial loss-of-function alleles of DICER-LIKE1 (DCL1) result
in reduced accumulation of miRNAs and trans-acting siRNAs, without any obvious effect on the
accumulation or function of various other classes of siRNAs (35, 112, 131, 138).
In animals, miRNAs are processed in a stepwise manner. A nuclear localized RNaseIII
enzyme known as Drosha makes the initial cuts (one on each arm of the stem-loop) in the pri-
miRNAs to liberate the miRNA stem-loop, the "pre-miRNA", from the flanking sequence of the
pri-miRNA (65). After export to the cytoplasm, Dicer makes a second set of cuts, separating the
miRNA, duplexed with its near reverse complement, the miRNA*, from the loop region of the
pre-miRNA (65). The resulting miRNA/miRNA* duplex has two-nucleotide 3' overhangs,
similar to the siRNA duplexes produced by Dicer from long double-stranded RNA (15, 31, 32,
55).
The situation in plants appears to be somewhat different, as plants contain no clear
ortholog to Drosha. Whereas most animal Dicers are thought to be localized to the cytoplasm
(65), in Arabidopsis DCL1 is localized to the nucleus, and miRNA/miRNA* duplexes are
excised from the pri-miRNAs within the nucleus (101, 103, 138). RNAs corresponding to the
pre-miRNAs of animals are rarely, if ever, detected in plants (112), and it seems likely that both
sets of cleavage events happen in rapid succession. It is uncertain if DCL1 makes both sets of
cuts, or if additional nucleases are also involved.
One key difference between the biogenesis of siRNAs and miRNA, other than the nature
of the precursor, is the number of small RNA species produced per precursor. A single long,
double-stranded RNA precursor can be processed into multiple siRNA duplexes by Dicer (Figure
1b) (106, 131, 144). However, cloning and expression data show that a miRNA precursor
produces predominately a single small RNA species, the mature miRNA (57, 61, 64, 71, 112,
124). Although there is some heterogeneity at the 5' and 3' ends of plant miRNAs, it is clear that
DCL1 cuts preferentially at specific positions in the miRNA stem-loop precursor that result in
the accumulation of the appropriate mature miRNA (112). The mechanism by which DCL1 knows
where to cut is largely a mystery, although there is evidence for the involvement of the dsRNA-binding domain of DCL1. In plants carrying the dcl1-9 allele, which disrupts the dsRNA-binding domain, the
miR163 stem-loop is cut at aberrant positions (56).
In addition to DCL1, several other genes have been shown genetically to be involved in
miRNA biogenesis. Mutations in HYPONASTIC LEAVES1 (HYL1) or HUA ENHANCER1
(HEN1) result in reduced miRNA accumulation and function (18, 41, 104, 130, 138). HYL1
contains a NLS and a dsRNA binding domain, and has some homology to R2D2 in Drosophila
and RDE-4 in C. elegans, proteins that are thought to function together with Dicer to load
siRNAs into the RISC (74, 125). HEN1 contains a methyltransferase domain, and is capable of
methylating miRNA/miRNA* duplexes in vitro (143). Endogenous miRNAs are methylated on
either the 2' or 3' ribose hydroxyl group of the 3' nucleotide in wild-type plants, but not in hen1
mutants (143). The function of this miRNA methylation is a mystery, and it remains possible
that HEN1 may have additional activities that are important for miRNA biogenesis.
In plants, it appears that most, if not all, processing and modification of miRNAs takes
place in the nucleus. However, the majority of mature miRNAs are located in the cytoplasm
(103), suggesting that a pathway exists for miRNA export. One component of this pathway is
HASTY (HST), a member of the importin β family of nucleocytoplasmic transporters. hst
mutants have reduced accumulation of most, but not all miRNAs, suggesting that HST is an
important part of the miRNA export pathway, but that other components also exist (103). A
similar pathway exists in animals; Exportin-5, the mammalian ortholog of HST, exports pre-miRNA hairpins from the nucleus to the cytoplasm (78, 142). As pre-miRNAs appear to be very
short-lived in plants, it is likely that HST transports either miRNA/miRNA* duplexes or single-stranded miRNAs after they are fully excised by DCL1. Northern blot data suggest that
miRNAs are primarily single-stranded in the nucleus (103), indicating either that a fraction of
functional miRNAs are located within the nucleus or that miRNAs are already single-stranded
before being transported to the cytoplasm by HST. It is unknown whether plant miRNAs are already
associated with components of RISC when transported to the cytoplasm, or if loading into RISC
takes place after transport.
MicroRNA incorporation into RISC
MicroRNAs are processed from their pri-miRNA precursors as duplexes with their
miRNA* sequences. However, cloning and expression data indicate that the miRNA strand of
this duplex accumulates at much higher levels in vivo than does the miRNA* (71, 112). This
asymmetry of accumulation is achieved by the preferential loading of the miRNA strand into
RISC, where it is presumably protected from degradation, whereas the miRNA* strand is
preferentially excluded from RISC and consequentially subject to degradation. The key insight
into understanding this asymmetry of RISC loading came from bioinformatic and biochemical
studies of functional siRNA duplexes: the strand of the siRNA duplex with the less stable
pairing at its 5' end is selectively loaded into RISC, where it is competent to guide silencing,
while the strand with the more stable 5' end is excluded from RISC (51, 116). Most
miRNA/miRNA* duplexes appear to have this energetic asymmetry; the 5' ends of most
miRNAs are less stably paired than are the 5' ends of the corresponding miRNA*s (51, 116).
The exact mechanisms by which siRNA and miRNA duplexes are unwound and asymmetrically
incorporated into RISC are still only partially understood, but they appear to involve R2D2-like
proteins and perhaps an unidentified RNA helicase (reviewed in (128)).
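The asymmetry rule, under which the strand whose 5' end is less stably paired is the one loaded into RISC, can be illustrated with a toy calculation. The Python sketch below is not the energetic model used in the cited studies. As a stated simplification, it scores the first four base pairs at each 5' end with a crude hydrogen-bond count (G:C = 3, A:U = 2, G:U = 1, mispair = 0) rather than nearest-neighbor free energies, and it assumes two strands of equal length with two-nucleotide 3' overhangs. All names and parameters are hypothetical.

```python
# Toy model of RISC strand selection based on 5'-end pairing stability.
# Assumption: a hydrogen-bond count stands in for real thermodynamic energies.

PAIR_SCORE = {
    ("G", "C"): 3, ("C", "G"): 3,
    ("A", "U"): 2, ("U", "A"): 2,
    ("G", "U"): 1, ("U", "G"): 1,
}


def five_prime_stability(strand, partner, n_pairs=4, overhang=2):
    """Score the first n_pairs base pairs at the 5' end of `strand`.

    Both strands are written 5'->3'. With a 3' overhang on each strand,
    position i of `strand` pairs with position (len(partner) - overhang - 1 - i)
    of `partner`.
    """
    last = len(partner) - overhang - 1
    return sum(
        PAIR_SCORE.get((strand[i], partner[last - i]), 0)
        for i in range(n_pairs)
    )


def pick_guide(strand_a, strand_b):
    """Return 'a' or 'b' for the strand with the less stable 5' end, or 'tie'."""
    score_a = five_prime_stability(strand_a, strand_b)
    score_b = five_prime_stability(strand_b, strand_a)
    if score_a < score_b:
        return "a"
    if score_b < score_a:
        return "b"
    return "tie"
```

In this toy model, a duplex whose first strand begins with A:U pairs and whose second strand begins with G:C pairs yields "a", meaning the first strand is the predicted guide.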
The final product of the miRNA/siRNA biogenesis pathway is a single-stranded RNA
incorporated into a RNP complex. There are several varieties of these RNP complexes that vary
at least partially in their composition and function; a RNP that mediates RNA cleavage and
PTGS is generally referred to as a RISC, whereas a RNP that mediates chromatin modification
and TGS is referred to as a RNAi-induced transcriptional silencing (RITS) complex. A central
component of all these RNPs is a member of the Argonaute family of proteins. Argonaute
proteins, which have been implicated in a broad range of RNAi-related mechanisms, contain two
conserved domains, known as the PAZ and PIWI domains (21). The PAZ domain appears to be
an RNA-binding domain (72, 118, 140), and the PIWI domain has structural similarity to RNase
H enzymes (73, 119). Many organisms contain multiple members of the Argonaute family; in
some of these cases, there is evidence for functional specificity of the different Argonautes. For
example, only one of four mammalian Argonautes, Ago2, is capable of mediating RNA cleavage
(73). Arabidopsis contains ten Argonaute proteins, four of which have been investigated
experimentally. AGO4 is involved in the methylation of DNA associated with transposons and
inverted-repeat transgenes (146, 147). PNH/ZLL/AGO10 and ZIP/AGO7 are required for proper
development, but the mechanism by which they act is not known (42, 79, 96, 97). Only one
Argonaute gene, AGO1, has thus far been shown to be required for miRNA function in
Arabidopsis. ago1 mutants have elevated levels of miRNA targets, consistent with AGO1 being
needed for miRNA function (129). A null allele of AGO1 also shows a sharp decrease in
accumulation of most miRNAs compared to wild-type (129). Although this reduction in miRNA
levels may stem from AGO1 playing an early role in miRNA processing, it may also be due to the
loss of the RISC complexes needed to bind, and thus stabilize, the processed miRNAs.
Mechanisms of miRNA function
There are three basic mechanisms by which Dicer-produced small RNAs have been
shown to regulate gene expression: RNA cleavage, translational repression, and transcriptional
silencing.
MicroRNA-mediated RNA cleavage
Directed RNA cleavage is perhaps the best studied mechanism by which small RNAs
regulate gene expression. In this mechanism, an siRNA/miRNA guides RISC to cleave a single
phosphodiester bond within a complementary RNA molecule. This so called "slicer" activity is
thought to reside in the PIWI domains of certain Argonaute proteins (73, 119). Several lines of
evidence indicate that plant miRNAs act to guide the cleavage of complementary mRNAs.
MicroRNA-guided slicer activity is present in wheat germ lysate (126). MicroRNA targets are
generally expressed at higher levels in plants that have impaired miRNA function as the result of
either mutations in the miRNA pathway (e.g. hen1, ago1, and hyl1) (18, 129, 130) or the
expression of certain viral suppressors of RNA silencing (22, 24, 30, 49, 82), implying that
miRNAs negatively regulate the stability of their targets. Moreover, the 3' cleavage products of
many miRNA targets can be detected in vivo, either by Northern blot (49, 76, 80, 120) or by 5'
RACE (46, 49, 76, 80, 81, 83, 100, 123, 139).
MicroRNA-mediated translational repression
The first miRNAs to be identified, the lin-4 and let-7 RNAs, regulate the expression of
heterochronic genes that are critical for the timing of certain cell divisions during larval
development of C. elegans (40, 64, 92, 111, 117, 136). However, the induction of these miRNAs
at specified points in development does not greatly affect the mRNA levels of their targets, but
rather the amount of protein produced from the targeted mRNAs (40, 117, 136). The exact
mechanism by which this occurs is unclear, but it appears that functional translation of the
targeted mRNAs is inhibited at some point after the initiation of translation (99). It is thought
that this mode of target regulation is utilized by the majority of animal miRNAs.
What determines whether a small RNA will guide the cleavage of its target, as opposed to
directing its target for translational repression? To a certain extent, the outcome seems to depend
on the degree of complementarity between the guide RNA and the target. An siRNA or miRNA
that is perfectly complementary to a target RNA will generally lead to cleavage, whereas less
perfect complementarity is generally associated with translational repression (28, 29). Indeed,
the same small RNAs are capable of carrying out either mechanism. In mammalian cell culture,
exogenous siRNAs which are competent to direct cleavage when presented with fully
complementary targets can repress the translation of other targets which contain multiple,
imperfectly complementary sites (28, 29). Conversely, the let-7 miRNA from Drosophila, which
is presumed to regulate its endogenous targets through translational repression, can guide
cleavage of perfectly complementary RNAs in vitro (44). To a large extent, then, the tendency
of plant miRNAs to cleave their targets is probably due to the fact that they are highly,
sometimes perfectly, complementary to them, whereas few mRNAs have extensive
complementarity to animal miRNAs. However, there are exceptions; miR-196 guides the
cleavage of the highly complementary HoxB8 mRNA (84, 141). Furthermore, the expression of
either miR-1 or miR-124 in HeLa cells slightly reduces the levels of over 100 mRNAs with
complementarity to the 5' portion of the miRNA (70). It is unclear if these mRNAs are cleaved
by RISC, albeit at a low efficiency, or if miRNA/RISC binding affects mRNA stability through
some other mechanism. Conversely, one Arabidopsis miRNA, miR172, appears to affect the
accumulation of target protein but not target mRNA, and thus appears to mediate translational
repression (7, 25).
Small RNA directed transcriptional silencing
Sections of transcriptionally silent DNA, known as heterochromatic regions, are
associated with certain covalent modifications of DNA and histones. Evidence from several
organisms now shows that small RNAs are important for the establishment and/or maintenance
of these heterochromatic modifications. In fission yeast, Dicer-produced small RNAs
corresponding to heterochromatic repeats have been identified (110), and deletion of Dicer or
Argonaute disrupts silencing at heterochromatic regions (133, 134). This transcriptional
repression has been shown to involve the RITS complex, which, like the RISC, contains
Argonaute and a single-stranded Dicer-produced siRNA, as well as Chp1 and Tas3, which are
not thought to be present in RISC (93, 98, 132). Small RNAs also guide repressive
modifications of DNA and histones in plants (reviewed in (85)). For example, AGO4 is required
for siRNA-guided transcriptional silencing of the SUPERMAN gene and the maintenance of
transcriptional repression triggered by inverted repeats (146, 147).
Do miRNAs guide transcriptional silencing in plants? Recent evidence suggests that they
might (10). Dominant mutations within the miR166 complementary sites of the PHABULOSA
(PHB) and PHAVOLUTA (PHV) mRNAs result in abnormal leaf development which correlates
with a reduction in miR166-guided mRNA cleavage (83, 88). Curiously, these phb and phv
mutants also correlate with a reduction of DNA methylation within the coding region of the
mutant alleles (10). This reduction of methylation occurs only in cis; in heterozygous plants,
only the mutant copy of PHB is affected, whereas the wild-type copy is not (10). Because the
miRNA complementary site in these mRNAs spans an exon-junction, miR166 is presumably not
able to interact with the genomic DNA, which suggests that interaction between miR166 and the
nascent, but spliced, PHB mRNA somehow results in DNA methylation (10). Although
intriguing, the functional significance of this change in methylation is not yet clear. While
methylated promoter regions are often associated with transcriptional silencing, the observed
methylation in PHB and PHV is near the 3' end of the coding regions (10), and it is unknown
what effect it is having on PHB or PHV transcription. It is not known if a reduction in miRNA
complementarity generally correlates with a reduction in target gene methylation.
Discovery of plant microRNAs
Discovery of plant miRNAs: Cloning
The most direct method of miRNA discovery has been to isolate and clone small cellular
RNAs from biological samples. Quite a few groups have used this approach to identify small
RNAs in animals, plants, and fungi (57, 61, 63, 71, 75, 89, 104, 110, 112, 123, 124) (58, 59, 94,
107-109, 122). Although the specifics of the protocols used by various groups differ in some
details, all essentially involve the isolation of small RNAs, followed by ligation of adaptor
oligos, reverse transcription, amplification, and sequencing. Some of these protocols incorporate
methods to select for RNAs that are products of Dicer cleavage (i.e. that have a 5' phosphate and
3' hydroxyl) and to concatemerize the short cDNAs so that many can be analyzed in a single
sequencing read (61). These cloning methods were first used to identify large numbers of
miRNAs in animals (57, 61, 63). An initial round of cloning experiments in Arabidopsis
identified nineteen miRNAs, as well as hundreds of endogenous siRNAs (75, 89, 104, 112).
Subsequent cloning experiments have expanded our knowledge of both classes of small RNAs in
Arabidopsis (124, 138), and more recently, Oryza sativa (rice) (123). The Carrington lab
maintains an online database of small RNAs cloned from Arabidopsis
(http://asrp.cgrb.oregonstate.edu/db/).
Discovery of plant miRNAs: Forward genetics
Given the abundance of miRNA genes in plants, and the mounting evidence that they are
key regulators of developmental events, it is in some ways surprising that plant miRNAs were
not discovered genetically long ago. Although it is something of a mystery as to why more
miRNAs have not been identified in genetic screens, there are several notable examples where
they have. However, in plants at least, in none of these cases was it realized that miRNAs were
involved until after cloning experiments had established that plant genomes contained numerous
miRNAs. In a sense, the dominant mutations in the HD-ZIP genes PHB, PHV, and REVOLUTA
(REV) in Arabidopsis and ROLLED LEAF1 (RLD1) in maize can be thought of as miRNA-related mutations; all result in adaxialization of leaves and/or vasculature as the result of
mutations within miR166 complementary sites (33, 87, 88, 145). At least three miRNA genes,
miR319 (also known as miR-JAW), miR172 (also known as EAT), and miR166 were isolated as
dominant overexpressors in enhancer trap screens for mutants with developmental abnormalities
(7, 53, 100). To date, only a single loss of function allele at a miRNA gene has been identified
genetically in plants; early extra petals1 is caused by the insertion of a transposon 160 bp
upstream of miR164c, and results in flowers with extra petals (9). The fact that miRNA loss-of-function mutants have been recovered so rarely is perhaps due to redundancy; most miRNAs
exist in multigene families that are likely to have overlapping function, buffering against a loss
of function at any single miRNA gene.
Discovery of plant miRNAs: Bioinformatics
In both plants and animals, cloning has been the initial means of large-scale miRNA
discovery. However, cloning is biased towards RNAs that are highly and broadly expressed.
MicroRNAs that are expressed at low levels, or that are expressed only in specific cell types or in
response to certain environmental stimuli, will be relatively difficult to clone. Any sequence-specific
biases in the cloning procedure might also cause certain miRNAs to be missed. Because
of these limitations, bioinformatic approaches to identify miRNAs have been useful as a
complement to cloning.
A relatively straightforward use of bioinformatics has been to find homologs of cloned
miRNAs, both within the same genome and in the genomes of other species (57, 61, 63, 105). A
more difficult challenge is to identify miRNAs unrelated to previously known miRNAs. This
was first done for animal miRNAs, using algorithms that search for conservation of sequence
and secondary structure (i.e. miRNA stem-loop precursors) between animal species in patterns
that are characteristic of miRNAs (6, 37, 60, 69, 71). Although these methods succeeded in
identifying numerous potential animal miRNAs, many of which were subsequently confirmed
experimentally, they are not directly useful in finding plant miRNAs because of the longer and
more heterogeneous secondary structures of plant miRNA stem-loops.
To address this problem, several groups have devised bioinformatic approaches specific
to the identification of plant miRNAs (2, 17, 46, 135). Like the algorithms for the identification
of animal miRNAs, these approaches all use conservation of secondary structure as a filter, but
are necessarily more relaxed in terms of the allowed structures. Some of these approaches take
advantage of the high complementarity of plant miRNAs to target mRNAs; searching for
conserved stem-loops with conserved complementarity to mRNAs not only helps to distinguish
authentic miRNAs from false positives, but also identifies putative regulatory targets of the
predicted miRNAs (2, 46).
Genomics of plant microRNAs
Taken in aggregate, cloning, genetics, and bioinformatics have identified 114 potential
miRNA genes in Arabidopsis (Table 1, Table 2). These 114 miRNA loci can be grouped into 41
multigene families, with each family comprised of stem-loops with the potential to produce
identical or highly similar mature miRNAs. 21 families are clearly conserved to additional plant
species beyond Arabidopsis (Table 1), whereas for 20 families conservation outside of
Arabidopsis has not been observed or is uncertain (Table 2). The following discussion will focus
primarily on evolutionarily conserved families, as these generally have more reliable evidence
for their expression and regulation of target genes.
Expression of plant microRNAs
Some miRNAs are among the most abundant cellular RNAs in animals, with individual
miRNAs having up to 10,000-50,000 copies per cell (71). Although the expression levels of
plant miRNAs have not been quantified, it is clear that many of them are abundantly expressed.
Certain miRNAs have been cloned hundreds of times, and most miRNAs are readily detectable
by Northern blot (3, 112)(http://asrp.cgrb.oregonstate.edu/db/). More recently, microarray
technology has been adapted to rapidly survey the expression profile of plant miRNAs (8).
Some miRNAs are expressed in a broad range of tissues, whereas others are expressed most
strongly in particular organs or developmental stages (8, 112). More precise data on the
localization of a few miRNAs in plants has come from in situ hybridization to miRNAs (25, 48,
52) or from miRNA-responsive reporter genes (102). Little is known about the transcriptional or
post-transcriptional regulation of miRNA expression. The expression levels of several miRNAs
are responsive to phytohormones or growth conditions; miR159 levels are enhanced by
gibberellin signaling (1), and miR393 levels are increased by a variety of stress conditions (124).
The dependence of miR395 levels on growth conditions is even more striking. A regulator of
sulfate-metabolizing enzymes and sulfate transporters (2, 3, 46), miR395 is undetectable in plants
grown on standard MS media, but is induced over 100-fold in plants that are starved for sulfate
(46).
Conservation of plant microRNAs
Twenty miRNA families have been identified so far that are conserved between all three
sequenced plant genomes: Arabidopsis, Oryza sativa (rice), and Populus trichocarpa (Table 1).
There are also several examples of miRNA families which are conserved within specific
lineages; miR403 is present in the dicots Arabidopsis and Populus but absent from the monocot
Oryza (124). An additional three families identified by cloning in Oryza are conserved to other
monocots such as maize, but are not evident in either sequenced dicot (123). Within each
family, the mature miRNA is always located on the same arm of the stem-loop for each family
member (5' or 3') (Figure 2). Although the sequence of the mature miRNA and, to a lesser
extent, the miRNA*, are highly conserved between members of the same miRNA family (both
within and between species), the sequence, the secondary structure, and even the length of the
intervening "loop" region can be highly divergent between family members (Figure 2). The
pattern of pairing and non-pairing nucleotides within the mature miRNA and miRNA* is often
conserved between homologous miRNA stem-loops from different species (Figure 2). The
significance of these conserved bulges is unknown; perhaps they serve to guide DCL1 cleavage
to the appropriate positions along the stem-loop.
Most small RNA cloning efforts in plants have focused on Arabidopsis, a dicot, or Oryza,
a monocot, and bioinformatic methods have focused on miRNAs conserved between these two
species. Both species are angiosperms (flowering plants), and diverged from each other approximately 145
million years ago (23). Growing evidence shows that many angiosperm miRNA families, and
their complementary sites in target mRNAs, are conserved in more basal land plants. Ten
miRNA families have conserved target sites in ESTs from gymnosperms or more basal plants,
and a miR159 stem-loop is present in an EST from the moss Physcomitrella patens (46). A
cDNA containing a miR166 stem-loop has been cloned from the lycopod Selaginella kraussiana,
and miR166 mediates cleavage within the highly conserved miR166 complementary sites of
HD-ZIP mRNAs from gymnosperms, ferns, lycopods, and mosses (36). A systematic search for
miRNA expression using microarray technology revealed that at least 11 miRNA families have
detectable expression in gymnosperms, and at least 2 (miR160 and miR390) are detectable in
moss (8). Furthermore, a clever approach to experimentally verify miRNA targets in
plants without sequenced genomes found evidence that four miRNA families (miR160, miR167,
miR171, and miR172) cleave target mRNAs in gymnosperms, ferns, or mosses that are
homologous to the verified Arabidopsis miRNA targets (8). Some of these miRNA families
have been shown to regulate development in Arabidopsis, being necessary for processes such as
the proper specification of floral organ identity (miR172) or leaf polarity (miR166). It is curious
then that these miRNA families regulate homologous mRNAs in basal plants that have very
different reproductive structures and leaf morphology. It is tempting to speculate that these
miRNAs are parts of ancient, conserved regulatory pathways which underlie seemingly different
developmental outcomes.
Gene count
Counting only the 21 conserved families, the Arabidopsis genome contains at least 91
potential miRNA genes (http://www.sanger.ac.uk/Software/Rfam/mira/index.shtml, Table 1).
These families are somewhat expanded in Oryza and Populus, containing 116 and 169 potential
miRNA genes, respectively (Table 1). The number of members per family in one genome ranges
from 1 to 32. It is unclear why plant genomes contain so many stem-loops encoding similar
miRNAs. The number of members in each family seems to be correlated between species;
certain families contain numerous members in all three species (e.g. miR156, miR166, miR169),
whereas others consistently contain only a few genes (e.g. miR162, miR168, miR394) (Table 1).
Although it is unclear why a plant would need, for example, 12 copies of miR156, this
correlation suggests a functional significance in the sizes of the various miRNA families.
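The correlation in family size noted here can be checked against the gene counts listed in Table 1. The following Python sketch is illustrative only: the use of the Pearson coefficient and the names FAMILY_SIZES and pearson are assumptions made for this example, and are not part of the analysis described in the text.

```python
from math import sqrt

# Gene counts per family as (A.t., O.s., P.t.), transcribed from Table 1
# for the 21 families conserved beyond Arabidopsis.
FAMILY_SIZES = {
    "miR156": (12, 12, 11), "miR159/319": (6, 8, 15), "miR160": (3, 6, 8),
    "miR162": (2, 2, 3), "miR164": (3, 5, 6), "miR166": (9, 12, 17),
    "miR167": (4, 9, 8), "miR168": (2, 2, 2), "miR169": (14, 17, 32),
    "miR171": (4, 7, 10), "miR172": (5, 3, 9), "miR390": (2, 1, 4),
    "miR393": (2, 2, 4), "miR394": (2, 1, 2), "miR395": (6, 19, 10),
    "miR396": (2, 5, 7), "miR397": (2, 2, 3), "miR398": (3, 2, 3),
    "miR399": (6, 11, 12), "miR408": (1, 1, 1), "miR403": (1, 0, 2),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Split the table into one list of counts per species.
at_counts = [sizes[0] for sizes in FAMILY_SIZES.values()]
os_counts = [sizes[1] for sizes in FAMILY_SIZES.values()]
pt_counts = [sizes[2] for sizes in FAMILY_SIZES.values()]

print("A.t. vs O.s.:", round(pearson(at_counts, os_counts), 2))
print("A.t. vs P.t.:", round(pearson(at_counts, pt_counts), 2))
```

A coefficient close to 1 for both comparisons would be consistent with the observation that large families tend to be large in all three genomes. A rank-based measure could be substituted if the influence of the largest families (e.g. miR169) is a concern.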
Non-conserved microRNAs
Although many miRNA families are conserved widely in plants, others are found only in
a single genome, and thus appear to be of a more recent evolutionary origin (Table 2). Based on
extended homology between non-conserved miRNAs and target genes, it has been proposed that
some of these young miRNAs arose as tandem duplications of target-gene segments (4).
Although several non-conserved miRNAs have been shown to cleave target mRNAs (4, 130), it
is difficult to confidently predict targets for many because it is not possible to use conservation
of complementary sites as a filter against false positives. In fact, it is difficult to be confident
that all annotated non-conserved miRNAs are in fact miRNAs rather than siRNAs. The
established minimal standard is that a small RNA with detectable expression and the potential to
form a stem-loop when joined to flanking genomic sequence can be annotated as a miRNA (5).
In practice, these requirements are too loose to be useful in categorizing small RNAs cloned
from plants. Many plant siRNAs are detectable on blots (138), and hundreds of thousands of
non-miRNA genomic sequences can be predicted to fold into secondary structures that resemble
the structures of plant miRNA precursors (46). Therefore, without the conservation of a
characteristic pattern of sequence and secondary structure, it can be difficult to know if a given
cloned RNA originated from a single-stranded stem-loop (i.e. is a miRNA) or from a
double-stranded RNA (i.e. is a siRNA). In fact, many of the thousands of cloned Arabidopsis siRNAs
(http://asrp.cgrb.oregonstate.edu/) would probably meet the literal requirements for annotation as
miRNAs. A few of these sequences probably are miRNAs, but others that might meet the literal
criteria probably are not. Because of this difficulty in identifying non-conserved miRNAs, it is
not possible to propose a meaningful estimate of the total number of miRNA genes in
Arabidopsis or other plant genomes.
Because stem-loop structures that resemble miRNA precursors are so common in
genomic sequence, bioinformatic searches for plant miRNAs are also prone to identifying false
positives. Nine annotated miRNA families that differ from other miRNAs in several key aspects
were identified in a bioinformatic screen for miRNAs conserved between Arabidopsis and Oryza
(135). Unlike all the other conserved families, each has a single locus in each genome, and none
of these nine families has clearly identifiable homologs in the Populus genome or in ESTs from
other plant species (M.W. Jones-Rhoades, personal communication). Taken together with the
fact that the stem-loops of many of these miRNAs have more unpaired nucleotides within the
miRNA/miRNA* than is typical for miRNAs with more experimental evidence, it appears likely
that these sequences are bioinformatic false positives rather than bona fide miRNAs.
Regulatory roles of plant microRNAs
Regulatory roles of animal microRNAs
As cloning experiments in animals identified large numbers of miRNAs, their functions
remained largely unknown. Experience from the founding miRNAs, the lin-4 and let-7 RNAs,
suggested that many, if not all, cloned miRNAs were also likely to repress the translation of
protein coding genes. However, there was a considerable lag between large scale miRNA
identification in animals and reliable genome-wide prediction of miRNA regulatory targets.
Animal miRNAs are capable of repressing mRNAs to which they have quite limited
complementarity; 7 or 8 adjacent paired nucleotides in the 5' portion of the miRNA are sufficient
for repression in vivo (20, 29). Animal mRNAs contain numerous matches with this degree of
complementarity, not only to miRNAs but also to arbitrary sequences with similar dinucleotide
composition as miRNAs (67, 68). The primary challenge in predicting animal miRNA targets
has been to know which of these numerous potential targets are biologically significant.
Algorithms which search for the conservation of potential target sites across multiple species and
which take into account the pairing requirements for translational repression (e.g. emphasis on
pairing to the 5' portion of the miRNA) have identified thousands of mRNAs as probable targets
of animal miRNAs (20, 34, 45, 54, 67, 68, 121). The limited pairing required for translational
repression in animals, as well as the large number of predicted targets, has led to the
"micromanager model" for animal miRNA-mediated regulation, whereby many, if not most,
animal mRNAs have their expression modulated to a greater or lesser extent through interaction
with miRNAs (12).
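The limited pairing requirement described above explains why candidate sites are so numerous in animal mRNAs. As a minimal sketch, the following Python code locates perfect matches to a miRNA seed in a 3' UTR. It assumes that the seed is miRNA nucleotides 2-8 (the text specifies only 7 or 8 adjacent paired nucleotides in the 5' portion of the miRNA); the function names and the toy sequences are hypothetical and chosen for illustration only.

```python
# Watson-Crick complements for RNA.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed_match_positions(mirna, utr, seed_start=2, seed_length=7):
    """Return 0-based UTR positions that pair perfectly with the miRNA seed.

    seed_start is 1-based and counted from the miRNA 5' end; both defaults
    are assumptions made for this sketch.
    """
    seed = mirna[seed_start - 1:seed_start - 1 + seed_length]
    site = reverse_complement(seed)
    width = len(site)
    return [i for i in range(len(utr) - width + 1) if utr[i:i + width] == site]


# Hypothetical miRNA and UTR, invented for illustration (not real sequences).
toy_mirna = "AUGCAUGGCCAUUAGCGAUCA"
toy_utr = "AAGUU" + reverse_complement(toy_mirna[1:8]) + "UUAGA"
seed_hits = seed_match_positions(toy_mirna, toy_utr)
print(seed_hits)
```

Because a 7-nt match is expected by chance roughly once every 4^7 (about 16,000) positions for a random sequence, a search of this kind over thousands of UTRs returns many hits, which is why the conservation filters described in the text are needed.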
Identification of plant miRNA targets
In contrast to the delay in animals, the high degree of complementarity between
Arabidopsis miRNAs and their target mRNAs allowed for the confident prediction of targets
soon after the discovery of the miRNAs themselves. The first indication of this plant-specific
paradigm for miRNA target recognition came from miR171. miR171 has 4 matches in the
Arabidopsis genome: one is located between protein coding genes and has a stem-loop structure,
whereas the other three are all antisense to SCARECROW-LIKE (SCL) genes and lack stem-loop
structures (75, 112). The intergenic miR171 locus with the stem-loop produces a miRNA that
guides the cleavage of the complementary SCL mRNAs (76).
Although other Arabidopsis miRNAs are not perfectly complementary to mRNAs, most
of them are nearly so. An initial genome-wide screen for miRNA targets searched for mRNAs
containing ungapped, antisense alignments with 0-3 mismatches to miRNAs, a degree of
complementarity highly unlikely to occur by chance (113). Using this cutoff, targets could be
predicted for 11 out of 13 miRNA families known at the time, comprising 49 target genes in total
(113).
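The screen described above can be sketched as a sliding-window comparison between each mRNA and the reverse complement of the miRNA. The Python code below is a minimal illustration under stated assumptions: G:U wobble pairs are counted as mismatches, the function names are hypothetical, and the sequences are invented toy examples rather than real Arabidopsis sequences.

```python
# Watson-Crick complements for RNA; G:U wobble pairs are treated as
# mismatches in this sketch (an assumption).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_complementary_sites(mirna, mrna, max_mismatches=3):
    """Yield (start, mismatches) for each ungapped antisense site in mrna.

    A perfect site is an mRNA window identical to the reverse complement
    of the miRNA; windows with up to max_mismatches differences are kept.
    """
    perfect_site = reverse_complement(mirna)
    width = len(perfect_site)
    for start in range(len(mrna) - width + 1):
        window = mrna[start:start + width]
        mismatches = sum(1 for a, b in zip(window, perfect_site) if a != b)
        if mismatches <= max_mismatches:
            yield start, mismatches


# Hypothetical 21-nt miRNA, invented for illustration.
toy_mirna = "AUGCAUGGCCAUUAGCGAUCA"
perfect = reverse_complement(toy_mirna)
# Build a toy site that differs from the perfect site at two positions.
toy_site = perfect[:4] + "A" + perfect[5:12] + "C" + perfect[13:]
toy_mrna = "GGAACUUC" + toy_site + "AAGGCUUA"

for start, mismatches in find_complementary_sites(toy_mirna, toy_mrna):
    print(start, mismatches)
```

Running the same scan with shuffled versions of each miRNA would give an estimate of how many such sites arise by chance, which is the comparison underlying the claim that this degree of complementarity is unlikely to occur randomly.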
For conserved miRNAs, more sensitive predictions that allow for gaps and more
mismatches can be made by identifying cases where homologous mRNAs in Arabidopsis and
Oryza each have complementarity to the same miRNA family (46).
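The conservation filter described above can be illustrated as a set operation. The sketch below is a simplification under stated assumptions: each Arabidopsis gene is mapped to a single Oryza homolog, relaxed complementarity searches have already produced candidate target sets for each species, and all gene identifiers and names are hypothetical placeholders.

```python
def conserved_targets(at_candidates, os_candidates, homolog_of):
    """Keep Arabidopsis candidates whose Oryza homolog is also a candidate.

    at_candidates and os_candidates map a miRNA family name to a set of
    candidate target genes; homolog_of maps an Arabidopsis gene to its
    Oryza homolog (one-to-one here, which is a simplifying assumption).
    """
    conserved = {}
    for family, at_genes in at_candidates.items():
        os_genes = os_candidates.get(family, set())
        conserved[family] = {
            gene for gene in at_genes if homolog_of.get(gene) in os_genes
        }
    return conserved


# Hypothetical identifiers, invented for illustration only.
at_candidates = {"miR-toy": {"AtGeneA", "AtGeneB", "AtGeneC"}}
os_candidates = {"miR-toy": {"OsGeneA", "OsGeneX"}}
homolog_of = {"AtGeneA": "OsGeneA", "AtGeneB": "OsGeneB"}

kept = conserved_targets(at_candidates, os_candidates, homolog_of)
print(kept)
```

Requiring agreement between the two species is what allows the per-species search to tolerate gaps and additional mismatches without a large increase in false positives.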
Because plant miRNAs affect the stability of their targets, mRNA expression arrays can
be used in experimental genome-wide screens for miRNA targets. For example, expression
array data found that five mRNAs encoding TCP transcription factors are down regulated in
plants over-expressing miR319 (100). Expression arrays may be especially useful in identifying
miRNA targets which have been missed by bioinformatics (i.e. targets with more degenerate or
non-conserved complementarity which are nonetheless subject to miRNA-guided cleavage).
However, analysis of mRNAs down-regulated in plants overexpressing one of four miRNAs
identified only two potentially direct targets not related to those found through bioinformatics
(115). Furthermore, evidence for miRNA-guided cleavage of these targets in wild-type plants
was not detected by 5' RACE, suggesting that these mRNAs may only be cleaved in plants that
ectopically express miRNAs (115).
The scope of miRNA-mediated regulation in plants
The identity of their predicted targets suggests that plant miRNAs are master regulators;
many miRNA targets encode regulatory proteins. The 21 conserved miRNA families have
90 confirmed or predicted conserved regulatory targets in Arabidopsis (Table 3). Sixty-five (72%) of
these encode transcription factors, pointing to a role for miRNAs in control of transcriptional
regulation. Another six (7%) are F-box proteins or E2 ubiquitin conjugation enzymes thought to
be involved in the selective targeting of proteins for degradation by the proteasome, implying a
role for miRNAs in regulating protein stability. DCL1, AGO1, and AGO2 are also miRNA
targets, suggesting that miRNAs regulate their own biogenesis and function. Other conserved
miRNA targets, such as ATP-sulfurylases, superoxide dismutases, and laccases have less clear
roles as regulators; although in vivo miRNA-mediated cleavage has been shown for many of
these targets, the biological significance of their regulation by miRNAs is not known.
All 20 miRNA families that are conserved between Arabidopsis, Populus, and Oryza
have complementary sites in target mRNAs that are also conserved in all three species (Table 3).
Although these miRNAs may also have targets which are not conserved, this conservation of
target sites suggests that miRNAs play similar roles in different plant species. Indeed, mutations
in class III HD-ZIP genes that reduce miR166 complementarity in Arabidopsis and Oryza have
similar phenotypes (48, 88, 113). However, the expansion of certain miRNA families and target
classes in different species suggests that some of these miRNA families may have
species-specific roles. For example, the miR397 family is complementary to mRNAs of 26 putative
laccase genes in Populus, whereas it has comparable complementarity to only three in
Arabidopsis. Although the roles that laccases play in the biology of plants are not well
understood, there is speculation that they may be involved in lignification (86), a process which
may be more critical in a woody plant such as Populus.
Validation of plant miRNA targets
While the majority of plant miRNA targets were initially predicted through
bioinformatics, a growing number have been validated experimentally. One means of target
validation has been to use Agrobacterium infiltration to observe miRNA-dependent cleavage of
targets in Nicotiana benthamiana leaves (49, 76). Another has been to assay the endogenous
miRNA-mediated cleavage activity that is present in wheat germ lysate (83, 126). Perhaps the
most useful method of miRNA target validation has been to use 5' RACE to detect in vivo the
products of miRNA-mediated cleavage reactions (46, 49, 76, 80, 81, 83, 100, 123, 139). An
adaptor oligo is ligated to the 5' end of the uncapped 3' portion of a cleaved miRNA target,
followed by PCR with a gene specific primer (49, 76). Sequencing of the resulting PCR product
maps the precise position of cleavage within the target, usually between the nucleotides that pair
to positions 10 and 11 of the miRNA.
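The expected cleavage position can be computed from the coordinates of the complementary site, which is useful when comparing sequenced 5' RACE products against a predicted target. The following Python sketch assumes an ungapped site and 0-based mRNA coordinates; the function name and the example coordinates are hypothetical.

```python
def predicted_cleavage_site(site_start, site_length):
    """Return the 0-based mRNA index of the first nucleotide 3' of the cut.

    Assumes an ungapped complementary site occupying mRNA indices
    site_start .. site_start + site_length - 1 (written 5'->3'), so that
    miRNA position p (1-based, from the miRNA 5' end) pairs with mRNA
    index site_start + site_length - p.
    """
    partner_of_10 = site_start + site_length - 10
    partner_of_11 = site_start + site_length - 11
    # Cleavage falls between the partners of miRNA positions 11 and 10;
    # the partner of position 10 becomes the 5' end of the 3' fragment.
    assert partner_of_10 == partner_of_11 + 1
    return partner_of_10


# Hypothetical example: a 21-nt site beginning at mRNA index 8.
cut = predicted_cleavage_site(site_start=8, site_length=21)
print(cut)
```

A 5' RACE clone whose first nucleotide maps to the returned index would be consistent with miRNA-guided cleavage, whereas 5' ends scattered across the transcript would be more suggestive of nonspecific degradation.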
A more informative level of target validation is to examine the biological significance of
the miRNA-mediated regulation of that target. As discussed below, reverse genetic approaches
have yielded information about the in vivo relevance of a growing number of miRNA-target
interactions.
Regulatory roles of plant microRNAs
The first evidence that small RNAs play roles in plant development came from mutants
impaired in small RNA biogenesis or function. Indeed, several genes central to miRNA
function, including DCL1, AGO1, and HEN1, were initially identified based on the
developmental consequences of their mutations before they were known to be important for
small RNA biogenesis or function. Multiple groups isolated dcl1 mutants; the most severe
mutations result in early embryonic arrest, and even partial loss-of-function mutations result in
pleiotropic defects, including abnormalities in floral organogenesis, leaf morphology, and
axillary meristem initiation (reviewed in (114)). ago1, hen1, hyl1, and hst mutants all have
pleiotropic developmental defects that overlap with those of dcl1 plants (16, 26, 77, 91, 127). In
addition, plants that express certain viral inhibitors of small RNA processing or function, such as
HC-Pro and P19, also exhibit developmental defects reminiscent of dcl1 mutants (22, 24, 30, 49,
82). Although many or all of these developmental defects may be the result of impaired miRNA
activity, they may also reflect disruption of other pathways in which these genes are active, such
as in the generation and function of siRNAs. However, in contrast to mutations in genes needed
for miRNA biogenesis, mutations in genes required for the accumulation of certain siRNAs, such
as AGO4, RDR6, and DCL3, result in few or mild developmental abnormalities (27, 95, 131,
138, 146).
Mutations that impair a fundamental step in miRNA biogenesis result in the
misregulation of numerous miRNA targets (18, 130), making it difficult to assign the observed
phenotypes to any particular miRNA family. Fortunately, the ease by which transgenic
Arabidopsis can be generated has allowed the investigation of particular miRNA/target
interactions through one of two reverse genetic strategies. The first strategy is to make
transgenic plants that overexpress a miRNA, typically under the control of the strong double 35S
promoter (Table 4). This approach has the potential to downregulate all mRNAs targeted by the
overexpressed miRNA. The second strategy is to make transgenic plants that express a
miRNA-resistant version of a miRNA target, in which silent mutations have been introduced into the
miRNA complementary site that disrupt miRNA-mediated regulation without altering the
encoded protein product (Table 5). In total, eight miRNA families have been investigated in
vivo by these strategies. As might have been expected from the identity of their target mRNAs,
in all eight cases perturbation of miRNA-mediated regulation results in abnormal development.
Taken together, these studies demonstrate that miRNAs are key regulators of many facets of Arabidopsis
development.
One of the better studied families of miRNA targets is the class III HD-ZIP transcription
factors. The importance of miR166-mediated regulation for the proper regulation of this gene
class is underscored by the large number of dominant gain-of-function alleles that map to the
miR166 complementary sites of HD-ZIP mRNAs (33, 48, 87, 88, 145). phb and phv mutants
result in adaxialization of leaves and overexpression of phb/phv mRNA (87, 88), whereas rev
mutants result in radialized vasculature (33, 145). Similarly, mutations within the miR166
complementary site of the maize HD-ZIP gene RLD1 result in adaxialization of leaf primordia
and overaccumulation of rld1 mRNA (48). All of these HD-ZIP gain-of-function mutations
result in a change in the amino acid sequence of the conserved START domain. Before the
discovery of miR166, it was hypothesized that the HD-ZIP mutants resulted from the loss of
negative regulatory interaction mediated by the START domain (88). However, transgenic
plants expressing miR166-resistant versions of PHB, PHV, or REV phenocopy
their respective gain-of-function mutants, whereas transgenic plants containing additional
wild-type copies of these genes have no or mild phenotypes (33, 83). This demonstrates that changes
in the RNA sequence, rather than in the amino acid sequence, are sufficient to account for the
developmental abnormalities observed in HD-ZIP gain-of-function mutants.
miR172-mediated regulation of APETALA2 (AP2) and related AP2-like genes is needed
for the proper specification of organs during flower development (7, 25). Plants that overexpress
miR172 have floral defects, such as the absence of petals and the transformation of sepals to
carpels, which resemble ap2 loss-of-function mutants (7, 25). Curiously, overexpression of
miR172 substantially decreases the protein levels of target AP2-like genes without a
commensurate change in target mRNA levels, suggesting that, unlike other known plant
miRNA-target interactions, miR172 is repressing translation of AP2-like mRNAs in a manner similar to
that employed by animal miRNAs (7, 25). However, the extent of complementarity between
miR172 and the AP2-like mRNAs is high, comparable to that of other plant miRNA targets that
undergo robust miRNA-mediated cleavage, and 3' cleavage fragments of AP2-like mRNAs can
be detected by 5' RACE (7, 49). Indeed, Schwab et al. found that cleavage of miR172 targets is
increased in miR172 overexpressing plants, and postulated a feedback mechanism whereby
AP2-like proteins repress their own transcription, resulting in similar mRNA levels despite an increase
in mRNA cleavage (115). It appears that miR172-mediated regulation of AP2-like genes is
complex, and it is unclear how similar miR172-mediated regulation is to the miRNA-mediated
translational repression observed in animals.
Although most miRNA families are predicted to target a single class of targets, the
miR159/319 family regulates both MYB and TCP transcription factors. Although miR159 and
miR319 differ by only three nucleotides, they appear to be functionally distinct. Overexpression
of miR319, which specifically downregulates TCP mRNAs, results in plants with uneven leaf
shape and delayed flowering time (100). Expression of miR319-resistant TCP4 results in
aberrant seedlings that arrest with fused cotyledons and without forming apical meristems (100).
Overexpression of miR159, which specifically reduces accumulation of MYB mRNAs, results in
male sterility (1, 115), whereas plants that express miR159-resistant MYB33 have upwardly
curled leaves, reduced stature, and shortened petioles (90, 100). Thus miR159 and miR319 are
related miRNAs that regulate unrelated mRNAs.
In addition to the miRNAs that target transcription factors, two miRNA families are
known to target genes central to miRNA biogenesis and function; miR162 targets DCL1 (139)
and miR168 targets AGO1 (113, 129). The targeting of these genes suggests a feedback
mechanism whereby miRNAs negatively regulate their own activity. Curiously, although plants
expressing miR168-resistant AGO1 overaccumulate AGO1 mRNA as expected, they also
overaccumulate numerous other miRNA targets and exhibit developmental defects which
overlap with those of dcl1, hen1, and hyl1 loss-of-function mutants (129). This suggests that an
overabundance of AGO1 inhibits, rather than promotes, RISC activity (129).
MicroRNAs: plants vs. animals
As our understanding of miRNA genomics and function in both plants and animals has
grown, so has the realization that there are numerous differences between the kingdoms in terms
of the ways miRNAs are made and carry out their regulatory roles. Indeed, the evolutionary
relationship between plant and animal miRNAs is unclear. Did the last common ancestor of
plants and animals possess miRNAs from which modern miRNAs are descended, or did the plant
and animal lineages independently adapt conserved RNAi machinery to use endogenously
expressed stem-loop RNAs as trans regulators of other genes? Although miRNAs are deeply
conserved within each kingdom (8, 36, 57, 61, 63, 71, 105), no particular miRNA is known to be
conserved between kingdoms. There are several kingdom-specific differences in miRNA
biogenesis. For one thing, the stem-loop precursors of plant miRNAs are markedly longer and
more variable than their animal counterparts. The cellular localization of processing appears to
differ between plant miRNAs, which are entirely processed within the nucleus (101, 103, 138),
and animal miRNAs, which are processed both in the nucleus and in the cytoplasm (65).
Perhaps more importantly, the scope and mode of regulation carried out by miRNAs appears to
be drastically different between the two kingdoms. Most plant miRNAs guide the cleavage of
target mRNAs (46, 49, 76, 126), and the predicted targets of Arabidopsis miRNAs, which
comprise less than 1% of protein coding genes, are highly biased towards transcription factors
and other regulatory genes (46, 113). Although at least some animal miRNAs guide cleavage of
endogenous targets (84, 141), most appear to act through the repression of translation (19, 20, 68,
117, 136). Furthermore, the identification of conserved reverse complementary matches to the
5' "seed" portions of animal miRNAs suggests that a large percentage (20-30% or more) of
animal protein coding genes are conserved miRNA targets (20, 67, 137). Whatever the
evolutionary relationship is between plant and animal miRNAs, the functional differences are
striking.
Summary
Plant miRNAs were initially identified through cloning, without any indication as to their
biological roles. In chapter one of this thesis, I describe the initial genome-wide bioinformatic
screen for plant miRNA regulatory targets. We show that Arabidopsis miRNAs are
complementary to far more mRNAs than would be expected by chance, and propose that
mRNAs that can pair to miRNAs with three or fewer unpaired nucleotides are likely to be
miRNA targets. Furthermore, many of these miRNA complementary sites are conserved to
orthologous Oryza mRNAs, implying that miRNA-mediated regulation of many targets predates
the divergence of dicots and monocots. In total, we identified 49 predicted targets, of which 34
encode transcription factors. Our findings indicated that miRNAs are key components of
numerous regulatory circuits in plants and set the stage for numerous additional experiments to
investigate in depth the significance of miRNA-mediated regulation for particular target families
and genes.
Cloning is an efficient way to identify abundant miRNAs, but it is likely to miss those
expressed at low levels or under specific conditions. In chapter two, I describe the development
and implementation of a bioinformatic approach to identify conserved miRNAs unrelated to
those discovered by cloning. In conjunction with this, I used the conservation of miRNA target
sites to increase the sensitivity and selectivity of plant miRNA target prediction. Seven
previously unknown families of miRNAs were identified computationally and verified
experimentally. These newly identified families expanded the categories of genes known to be
regulated by miRNAs to include F-box genes, sulfate metabolizing genes, laccases, and
superoxide dismutases.
Bioinformatic approaches have proven effective at identifying targets of plant miRNAs,
and moderately high throughput methods such as 5' RACE can detect evidence for the
interaction of many miRNA-mRNA pairs. However, our understanding of the biological
significance of plant miRNAs has been greatly aided by reverse genetic approaches that allow
for the disruption of miRNA-mediated regulation. In chapter three, I describe the role of
miR394 in the regulation of the F-box gene At1g27340. Expression of miR394-resistant At1g27340
results in numerous developmental abnormalities, including downwardly curved rosette leaves,
radialized cauline leaves, abortive flowers, and arrested seedlings that lack shoot apical
meristems, that correlate with an increase in At1g27340 mRNA levels.
Table 1. Genomic loci of conserved plant miRNA families

miRNA family    A.t.    O.s.    P.t.
miR156          12      12      11
miR159/319      6       8       15
miR160          3       6       8
miR162          2       2       3
miR164          3       5       6
miR166          9       12      17
miR167          4       9       8
miR168          2       2       2
miR169          14      17      32
miR171          4       7       10
miR172          5       3       9
miR390          2       1       4
miR393          2       2       4
miR394          2       1       2
miR395          6       19      10
miR396          2       5       7
miR397          2       2       3
miR398          3       2       3
miR399          6       11      12
miR408          1       1       1
miR403          1       0       2
miR437          0       1+      0
miR444          0       1+      0
miR445          0       9+      0
Total           91      127     169

The number of identified genes in each family of miRNAs is
indicated. Only miRNA families with strong evidence for
conservation are listed. Oryza miRNA families which appear
to be missing from Arabidopsis and Populus but are present
in maize are marked with a plus (+).
Table 2. Genomic loci of non-conserved plant miRNA families

miRNA family    A.t.    O.s.    P.t.
miR158          2       0       0
miR161          1       0       0
miR163          1       0       0
miR173          1       0       0
miR400          1       0       0
miR401          1       0       0
miR402          1       0       0
miR404          1       0       0
miR405          3       0       0
miR406          1       0       0
miR407          1       0       0
miR435          0       1       0
miR436          0       1       0
miR438          0       1       0
miR439          0       10      0
miR440          0       1       0
miR441          0       3       0
miR442          0       1       0
miR443          0       1       0
miR446          0       1       0
miR413          1       1       0
miR414          1       1       0
miR415          1       1       0
miR416          1       1       0
miR417          1       1       0
miR418          1       1       0
miR419          1       1       0
miR420          1       1       0
miR426          1       1       0
Total           23      29      0

The number of identified genes in each family of miRNAs is
indicated. Only miRNA families without strong evidence for
conservation are listed. As discussed in the text, miR413-miR426
were identified bioinformatically as conserved
between Arabidopsis and Oryza, but are not evident in
Populus and it is unclear if they are truly miRNAs.
Table 3. Regulatory targets of plant miRNAs

miRNA family  Target family   Validated targets                                           Validation method                                   A.t.  O.s.  P.t.
miR156        SBP             SPL2, SPL3, SPL4, SPL10 (3, 24, 49, 131)                    5' RACE                                             11    9     16
miR159/319    MYB             MYB33, MYB65 (1, 90, 100)                                   miRNA-resistant target, Agro-infiltration           8     6     5
miR159/319    TCP             TCP2, TCP3, TCP4, TCP10, TCP24 (100)                        5' RACE, miRNA-resistant target                     5     4     7
miR160        ARF             ARF10, ARF16, ARF17 (3, 49, 80)                             5' RACE, miRNA-resistant target                     3     5     9
miR164        NAC             CUC1, CUC2, NAC1, At5g07680, At5g61430 (39, 49, 62, 81)     5' RACE, wheat germ lysate, miRNA-resistant target  6     6     6
miR166        HD-ZIPIII       PHB, PHV, REV, ATHB-8, ATHB-15 (33, 53, 83, 126)            5' RACE, wheat germ lysate, miRNA-resistant target  5     4     9
miR167        ARF             ARF6, ARF8 (3, 49)                                          5' RACE                                             2     4     7
miR169        HAP2            At1g17590, At1g72830, At1g54160, At3g05690, At5g06510 (46)  5' RACE                                             8     7     9
miR171        SCL             SCL6-III, SCL6-IV (49, 76)                                  5' RACE, Agro-infiltration                          3     5     9
miR172        AP2             AP2, TOE1, TOE2, TOE3 (7, 25, 49)                           5' RACE, miRNA-resistant target                     6     5     6
miR393        bZIP*           At1g27340 (46)                                              5' RACE                                             1     1     1
miR396        GRF             GRL1, GRL2, GRL3, GRL7, GRL8, GRL9 (46)                     5' RACE                                             7     9     9
total, transcription factors                                                                                                                  65    65    93

miR161        PPR             At1g06580 (4, 131)                                          5' RACE                                             9     0     0
miR162        Dicer           DCL1 (139)                                                  5' RACE                                             1     1     1
miR163        SAMT            At1g66690, At1g66700, At1g66720, At3g44860 (4)              5' RACE                                             5     0     0
miR168        ARGONAUTE       AGO1 (129, 131)                                             5' RACE, miRNA-resistant target                     1     6     2
miR393        F-box           TIR1, At1g12820, At3g26810, At4g03190, At3g23690 (46)       5' RACE                                             4     2     5
miR394        F-box           At1g27340 (46, 47)                                          5' RACE, miRNA-resistant target                     1     1     2
miR395        APS             APS1, APS4 (46)                                             5' RACE                                             3     1     2
miR395        S transporter   AST68 (3)                                                   5' RACE                                             1     2     3
miR396        Rhodanese       -                                                           -                                                   1     1     1
miR397        Laccase         At2g29130, At2g38080, At5g60020 (46)                        5' RACE                                             3     15    26
miR398        CSD*            CSD1, CSD2 (46)                                             5' RACE                                             2     2     2
miR398        CytC oxidase*   At3g15640 (46)                                              5' RACE                                             1     1     0
miR399        Ph transporter  -                                                           -                                                   1     4     4
miR399        E2-UBC          At2g33770 (3)                                               5' RACE                                             1     1     2
miR403        ARGONAUTE       AGO2 (3)                                                    5' RACE                                             1     0     1
miR408        Laccase         At2g30210 (115)                                             5' RACE                                             3     2     3
miR408        Plantacyanin    7448.m00137 (123)                                           5' RACE                                             1     3     1
total, non-transcription factors                                                                                                              39    42    55

Validated and predicted targets of Arabidopsis miRNAs are listed, grouped into those encoding transcription factors (top) and those
encoding other functionalities (bottom). For each target family, the number of genes predicted to be targets in each of three plant
species with sequenced genomes (A.t., Arabidopsis thaliana; O.s., Oryza sativa; P.t., Populus trichocarpa) is indicated. To be
counted, a potential target must contain a complementary site to at least one member of the indicated miRNA family with a score of
3 or less (as described in (46)), with the exception of the target families marked with an asterisk, for which some targets with more
relaxed complementarity were included. Non-validated target families are listed only if they are present in all three species.
miR408-directed cleavage of plantacyanin mRNAs has been validated only in Oryza. Abbreviations: SBP, SQUAMOSA-promoter
binding protein; ARF, AUXIN RESPONSE FACTOR; SCL, SCARECROW-LIKE; GRF, GROWTH REGULATING FACTOR; SAMT,
SAM-dependent methyl transferase; APS, ATP-sulfurylase; CSD, COPPER SUPEROXIDE DISMUTASE; E2-UBC, E2 ubiquitin-conjugating protein
Table 4. miRNA overexpression affecting development
miRNA | Target family | Consequences of overexpression
miR156 | SPL transcription factors | Increased leaf initiation, decreased apical dominance, delayed flowering time (115)
miR159 | MYB transcription factors | Male sterility, delayed flowering time (1)
miR319 | TCP transcription factors | Uneven leaf shape and curvature, late flowering (100)
miR164 | NAC domain transcription factors | Organ fusion (62, 81)
miR166 | HD-ZIP transcription factors | Seedling arrest, fasciated apical meristems, female sterility (53)
miR172 | AP2-like transcription factors | Early flowering, lack of petals, transformation of sepals to carpels (7, 25)
Table 5. miRNA-resistant targets affecting development
miRNA family | miRNA-resistant target | Promoter | Phenotype
miR159 | MYB33 | 35S | Upwardly curled leaves (100)
miR159 | MYB33 | Endogenous | Upwardly curled leaves, reduced stature, shortened petioles (90)
miR319 | TCP4 | Endogenous and 35S | Arrested seedlings, fused cotyledons, lack of SAM (100)
miR319 | TCP2 | 35S | Longer hypocotyls, reduced stature and apical dominance (100)
miR160 | ARF17 | Endogenous | Extra cotyledons (80)
miR164 | CUC1 | Endogenous | Shortened rosette leaf petioles, aberrant leaf shape, extra petals, missing sepals (81)
miR164 | CUC2 | Inducible and 35S | Aberrant leaf shape, extra petals, increased sepal separation (62)
miR164 | NAC1 | 35S | Increased number of lateral roots (39)
miR166 | REV | Endogenous | Radialized vasculature, strands of leaf tissue attached to stem (33)
miR166 | PHB | 35S | Adaxialized leaves, ectopic meristems (83)
miR168 | AGO1 | Endogenous | Curled leaves, disorganized phyllotaxy, reduced fertility (129)
miR172 | AP2 | 35S | Late flowering, excess of petals and stamens (25)
miR394 | At1g27340 | Endogenous | Curled leaves, lack of SAM, abortive flowers, "spiked" cauline leaves (47)
Figure Legends
Figure 1. Mechanisms of small RNA biogenesis and function
(A) A model for miRNA biogenesis in Arabidopsis. Following transcription (step 1), the
pri-miRNA is processed by DCL1, and perhaps other factors, to a miRNA:miRNA* duplex
(step 2). Pre-miRNAs, which are readily detectable in animals, appear to be very
short-lived in plants. The 3' sugars of the miRNA:miRNA* duplex are methylated by HEN1,
presumably within the nucleus (step 3). The miRNA is exported to the cytoplasm by
HST, probably with the aid of additional factors (step 4). The mature, methylated
miRNA is separated from the miRNA*, perhaps with the aid of a helicase. The miRNA
is incorporated into RISC, an Argonaute-containing ribonucleoprotein complex, while
the miRNA* is degraded (step 5). For plant miRNAs, unwinding of the
miRNA:miRNA* duplex may occur before export to the cytoplasm.
(B) A model for siRNA biogenesis. Long double-stranded RNA, perhaps generated through
the action of an RNA-dependent RNA polymerase (RDRP), is iteratively processed by
Dicer-like proteins to yield multiple siRNA duplexes. One strand from each siRNA
duplex is stably incorporated into RISC, while the other is degraded.
(C) MicroRNAs or siRNAs can guide RISC to cleave mRNAs with extensive
complementarity to the small RNA. The complementarity to the small RNA can occur at
any point within the target RNA.
(D) MicroRNAs or siRNAs can repress functional translation of target mRNAs. In animals,
most known examples of translational repression involve multiple sites in the 3' UTR with
imperfect complementarity to the small RNA. The one plant miRNA that has been
reported to mediate translational repression, miR172, recognizes its targets through a
single site with near-perfect complementarity.
(E) Small RNAs play roles in the establishment of transcriptionally silent heterochromatin.
The exact role played by the small RNAs in this pathway is not clear, nor is it known if
they base pair to DNA or RNA.
Figure 2. Representative miR164 stem-loop precursors from Arabidopsis, Oryza, and
Populus. The mature microRNAs are shown in red.
Figure 1
[Figure not reproduced. Recoverable labels: exogenous RNA, transposon, virus, endogenous RNA; miRNA gene; Pri-miRNA; Pre-miRNA; long dsRNA; miRNA:miRNA* duplex; methylated miRNA:miRNA* duplex; siRNA duplexes; nucleus; mature miRNA within RISC; mature siRNAs within RISC; near-perfect complementarity in coding region or UTR; short complementary segments; active chromatin; histone methylation directed by heterochromatic siRNAs; silent chromatin.]
Figure 2
[Stem-loop structure drawings not reproduced. Panels: MIR164a and MIR164b from Arabidopsis, MIR164a and MIR164b from Oryza, MIR164a and MIR164b from Populus. One precursor is annotated with a 20 nt loop.]
1. Achard P, Herr A, Baulcombe DC, Harberd NP. 2004. Modulation of floral development by a gibberellin-regulated microRNA. Development 131: 3357-65
2. Adai A, Johnson C, Mlotshwa S, Archer-Evans S, Manocha V, et al. 2005. Computational prediction of miRNAs in Arabidopsis thaliana. Genome Res 15: 78-91
3. Allen E, Xie Z, Gustafson AM, Carrington JC. 2005. microRNA-Directed Phasing during Trans-Acting siRNA Biogenesis in Plants. Cell 121: 207-21
4. Allen E, Xie Z, Gustafson AM, Sung GH, Spatafora JW, Carrington JC. 2004. Evolution of microRNA genes by inverted duplication of target gene sequences in Arabidopsis thaliana. Nat Genet 36: 1282-90
5. Ambros V, Bartel B, Bartel DP, Burge CB, Carrington JC, et al. 2003. A uniform system for microRNA annotation. RNA 9: 277-9
6. Ambros V, Lee RC, Lavanway A, Williams PT, Jewell D. 2003. MicroRNAs and other tiny endogenous RNAs in C. elegans. Curr Biol 13: 807-18
7. Aukerman MJ, Sakai H. 2003. Regulation of flowering time and floral organ identity by a MicroRNA and its APETALA2-like target genes. Plant Cell 15: 2730-41
8. Axtell MJ, Bartel DP. 2005. Antiquity of MicroRNAs and Their Targets in Land Plants. Plant Cell 17
9. Baker CC, Sieber P, Wellmer F, Meyerowitz EM. 2005. The early extra petals1 mutant uncovers a role for microRNA miR164c in regulating petal number in Arabidopsis. Curr Biol 15: 303-15
10. Bao N, Lye KW, Barton MK. 2004. MicroRNA binding sites in Arabidopsis class III HD-ZIP mRNAs are required for methylation of the template chromosome. Dev Cell 7: 653-62
11. Bartel DP. 2004. MicroRNAs: genomics, biogenesis, mechanism, and function. Cell 116: 281-97
12. Bartel DP, Chen CZ. 2004. Micromanagers of gene expression: the potentially widespread influence of metazoan microRNAs. Nat Rev Genet 5: 396-400
13. Baulcombe D. 2004. RNA silencing in plants. Nature 431: 356-63
14. Beclin C, Boutet S, Waterhouse P, Vaucheret H. 2002. A branched pathway for transgene-induced RNA silencing in plants. Curr Biol 12: 684-8
15. Bernstein E, Caudy AA, Hammond SM, Hannon GJ. 2001. Role for a bidentate ribonuclease in the initiation step of RNA interference. Nature 409: 363-6
16. Bohmert K, Camus I, Bellini C, Bouchez D, Caboche M, Benning C. 1998. AGO1 defines a novel locus of Arabidopsis controlling leaf development. EMBO J 17: 170-80
17. Bonnet E, Wuyts J, Rouze P, Van de Peer Y. 2004. Detection of 91 potential conserved plant microRNAs in Arabidopsis thaliana and Oryza sativa identifies important target genes. Proc Natl Acad Sci U S A 101: 11511-6
18. Boutet S, Vazquez F, Liu J, Beclin C, Fagard M, et al. 2003. Arabidopsis HEN1: a genetic link between endogenous miRNA controlling development and siRNA controlling transgene silencing and virus resistance. Curr Biol 13: 843-8
19. Brennecke J, Hipfner DR, Stark A, Russell RB, Cohen SM. 2003. bantam encodes a developmentally regulated microRNA that controls cell proliferation and regulates the proapoptotic gene hid in Drosophila. Cell 113: 25-36
20. Brennecke J, Stark A, Russell RB, Cohen SM. 2005. Principles of microRNA-target recognition. PLoS Biol 3: e85
21. Carmell MA, Xuan Z, Zhang MQ, Hannon GJ. 2002. The Argonaute family: tentacles that reach into RNAi, developmental control, stem cell maintenance, and tumorigenesis. Genes Dev 16: 2733-42
22. Chapman EJ, Prokhnevsky AI, Gopinath K, Dolja VV, Carrington JC. 2004. Viral RNA silencing suppressors inhibit the microRNA pathway at an intermediate step. Genes Dev 18: 1179-86
23. Chaw SM, Chang CC, Chen HL, Li WH. 2004. Dating the monocot-dicot divergence and the origin of core eudicots using whole chloroplast genomes. J Mol Evol 58: 424-41
24. Chen J, Li WX, Xie D, Peng JR, Ding SW. 2004. Viral virulence protein suppresses RNA silencing-mediated defense but upregulates the role of microrna in host gene expression. Plant Cell 16: 1302-13
25. Chen X. 2004. A microRNA as a translational repressor of APETALA2 in Arabidopsis flower development. Science 303: 2022-5
26. Chen X, Liu J, Cheng Y, Jia D. 2002. HEN1 functions pleiotropically in Arabidopsis development and acts in C function in the flower. Development 129: 1085-94
27. Dalmay T, Hamilton A, Rudd S, Angell S, Baulcombe DC. 2000. An RNA-dependent RNA polymerase gene in Arabidopsis is required for posttranscriptional gene silencing mediated by a transgene but not by a virus. Cell 101: 543-53
28. Doench JG, Petersen CP, Sharp PA. 2003. siRNAs can function as miRNAs. Genes Dev 17: 438-42
29. Doench JG, Sharp PA. 2004. Specificity of microRNA target selection in translational repression. Genes Dev 18: 504-11
30. Dunoyer P, Lecellier CH, Parizotto EA, Himber C, Voinnet O. 2004. Probing the microRNA and small interfering RNA pathways with virus-encoded suppressors of RNA silencing. Plant Cell 16: 1235-50
31. Elbashir SM, Lendeckel W, Tuschl T. 2001. RNA interference is mediated by 21- and 22-nucleotide RNAs. Genes Dev 15: 188-200
32. Elbashir SM, Martinez J, Patkaniowska A, Lendeckel W, Tuschl T. 2001. Functional anatomy of siRNAs for mediating efficient RNAi in Drosophila melanogaster embryo lysate. EMBO J 20: 6877-88
33. Emery JF, Floyd SK, Alvarez J, Eshed Y, Hawker NP, et al. 2003. Radial patterning of Arabidopsis shoots by class III HD-ZIP and KANADI genes. Curr Biol 13: 1768-74
34. Enright AJ, John B, Gaul U, Tuschl T, Sander C, Marks DS. 2003. MicroRNA targets in Drosophila. Genome Biol 5: R1
35. Finnegan EJ, Margis R, Waterhouse PM. 2003. Posttranscriptional gene silencing is not compromised in the Arabidopsis CARPEL FACTORY (DICER-LIKE1) mutant, a homolog of Dicer-1 from Drosophila. Curr Biol 13: 236-40
36. Floyd SK, Bowman JL. 2004. Gene regulation: ancient microRNA target sequences in plants. Nature 428: 485-6
37. Grad Y, Aach J, Hayes GD, Reinhart BJ, Church GM, et al. 2003. Computational and experimental identification of C. elegans microRNAs. Mol Cell 11: 1253-63
38. Grishok A, Pasquinelli AE, Conte D, Li N, Parrish S, et al. 2001. Genes and mechanisms related to RNA interference regulate expression of the small temporal RNAs that control C. elegans developmental timing. Cell 106: 23-34
39. Guo HS, Xie Q, Fei JF, Chua NH. 2005. microRNA164 Directs NAC1 mRNA Cleavage to Downregulate Auxin Signals for Lateral Root Development. Plant Cell
40. Ha I, Wightman B, Ruvkun G. 1996. A bulged lin-4/lin-14 RNA duplex is sufficient for Caenorhabditis elegans lin-14 temporal gradient formation. Genes Dev 10: 3041-50
41. Han MH, Goud S, Song L, Fedoroff N. 2004. The Arabidopsis double-stranded RNA-binding protein HYL1 plays a role in microRNA-mediated gene regulation. Proc Natl Acad Sci U S A 101: 1093-8
42. Hunter C, Sun H, Poethig RS. 2003. The Arabidopsis heterochronic gene ZIPPY is an ARGONAUTE family member. Curr Biol 13: 1734-9
43. Hutvagner G, McLachlan J, Pasquinelli AE, Balint E, Tuschl T, Zamore PD. 2001. A cellular function for the RNA-interference enzyme Dicer in the maturation of the let-7 small temporal RNA. Science 293: 834-8
44. Hutvagner G, Zamore PD. 2002. A microRNA in a multiple-turnover RNAi enzyme complex. Science 297: 2056-60
45. John B, Enright AJ, Aravin A, Tuschl T, Sander C, Marks DS. 2004. Human MicroRNA targets. PLoS Biol 2: e363
46. Jones-Rhoades MW, Bartel DP. 2004. Computational identification of plant microRNAs and their targets, including a stress-induced miRNA. Mol Cell 14: 787-99
47. Jones-Rhoades MW, Bartel DP. 2005. MicroRNA-mediated regulation of an F-box gene is required for embryonic, floral, and vegetative development. In preparation
48. Juarez MT, Kui JS, Thomas J, Heller BA, Timmermans MC. 2004. microRNA-mediated repression of rolled leaf1 specifies maize leaf polarity. Nature 428: 84-8
49. Kasschau KD, Xie Z, Allen E, Llave C, Chapman EJ, et al. 2003. P1/HC-Pro, a viral suppressor of RNA silencing, interferes with Arabidopsis development and miRNA function. Dev Cell 4: 205-17
50. Ketting RF, Fischer SE, Bernstein E, Sijen T, Hannon GJ, Plasterk RH. 2001. Dicer functions in RNA interference and in synthesis of small RNA involved in developmental timing in C. elegans. Genes Dev 15: 2654-9
51. Khvorova A, Reynolds A, Jayasena SD. 2003. Functional siRNAs and miRNAs exhibit strand bias. Cell 115: 209-16
52. Kidner CA, Martienssen RA. 2004. Spatially restricted microRNA directs leaf polarity through ARGONAUTE. Nature 428: 81-4
53. Kim J, Jung JH, Reyes JL, Kim YS, Kim SY, et al. 2005. microRNA-directed cleavage of ATHB15 mRNA regulates vascular development in Arabidopsis inflorescence stems. Plant J 42: 84-94
54. Kiriakidou M, Nelson PT, Kouranov A, Fitziev P, Bouyioukos C, et al. 2004. A combined computational-experimental approach predicts human microRNA targets. Genes Dev 18: 1165-78
55. Knight SW, Bass BL. 2001. A role for the RNase III enzyme DCR-1 in RNA interference and germ line development in Caenorhabditis elegans. Science 293: 2269-71
56. Kurihara Y, Watanabe Y. 2004. Arabidopsis micro-RNA biogenesis through Dicer-like 1 protein functions. Proc Natl Acad Sci U S A 101: 12753-8
57. Lagos-Quintana M, Rauhut R, Lendeckel W, Tuschl T. 2001. Identification of novel genes coding for small expressed RNAs. Science 294: 853-8
58. Lagos-Quintana M, Rauhut R, Meyer J, Borkhardt A, Tuschl T. 2003. New microRNAs from mouse and human. RNA 9: 175-9
59. Lagos-Quintana M, Rauhut R, Yalcin A, Meyer J, Lendeckel W, Tuschl T. 2002. Identification of tissue-specific microRNAs from mouse. Curr Biol 12: 735-9
60. Lai EC, Tomancak P, Williams RW, Rubin GM. 2003. Computational identification of Drosophila microRNA genes. Genome Biol 4: R42
61. Lau NC, Lim LP, Weinstein EG, Bartel DP. 2001. An abundant class of tiny RNAs with probable regulatory roles in Caenorhabditis elegans. Science 294: 858-62
62. Laufs P, Peaucelle A, Morin H, Traas J. 2004. MicroRNA regulation of the CUC genes is required for boundary size control in Arabidopsis meristems. Development 131: 4311-22
63. Lee RC, Ambros V. 2001. An extensive class of small RNAs in Caenorhabditis elegans. Science 294: 862-4
64. Lee RC, Feinbaum RL, Ambros V. 1993. The C. elegans heterochronic gene lin-4 encodes small RNAs with antisense complementarity to lin-14. Cell 75: 843-54
65. Lee Y, Ahn C, Han J, Choi H, Kim J, et al. 2003. The nuclear RNase III Drosha initiates microRNA processing. Nature 425: 415-9
66. Lee Y, Kim M, Han J, Yeom KH, Lee S, et al. 2004. MicroRNA genes are transcribed by RNA polymerase II. EMBO J 23: 4051-60
67. Lewis BP, Burge CB, Bartel DP. 2005. Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets. Cell 120: 15-20
68. Lewis BP, Shih IH, Jones-Rhoades MW, Bartel DP, Burge CB. 2003. Prediction of mammalian microRNA targets. Cell 115: 787-98
69. Lim LP, Glasner ME, Yekta S, Burge CB, Bartel DP. 2003. Vertebrate microRNA genes. Science 299: 1540
70. Lim LP, Lau NC, Garrett-Engele P, Grimson A, Schelter JM, et al. 2005. Microarray analysis shows that some microRNAs downregulate large numbers of target mRNAs. Nature 433: 769-73
71. Lim LP, Lau NC, Weinstein EG, Abdelhakim A, Yekta S, et al. 2003. The microRNAs of Caenorhabditis elegans. Genes Dev 17: 991-1008
72. Lingel A, Simon B, Izaurralde E, Sattler M. 2003. Structure and nucleic-acid binding of the Drosophila Argonaute 2 PAZ domain. Nature 426: 465-9
73. Liu J, Carmell MA, Rivas FV, Marsden CG, Thomson JM, et al. 2004. Argonaute2 is the catalytic engine of mammalian RNAi. Science 305: 1437-41
74. Liu Q, Rand TA, Kalidas S, Du F, Kim HE, et al. 2003. R2D2, a bridge between the initiation and effector steps of the Drosophila RNAi pathway. Science 301: 1921-5
75. Llave C, Kasschau KD, Rector MA, Carrington JC. 2002. Endogenous and silencing-associated small RNAs in plants. Plant Cell 14: 1605-19
76. Llave C, Xie Z, Kasschau KD, Carrington JC. 2002. Cleavage of Scarecrow-like mRNA targets directed by a class of Arabidopsis miRNA. Science 297: 2053-6
77. Lu C, Fedoroff N. 2000. A mutation in the Arabidopsis HYL1 gene encoding a dsRNA binding protein affects responses to abscisic acid, auxin, and cytokinin. Plant Cell 12: 2351-66
78. Lund E, Guttinger S, Calado A, Dahlberg JE, Kutay U. 2004. Nuclear export of microRNA precursors. Science 303: 95-8
79. Lynn K, Fernandez A, Aida M, Sedbrook J, Tasaka M, et al. 1999. The PINHEAD/ZWILLE gene acts pleiotropically in Arabidopsis development and has overlapping functions with the ARGONAUTE1 gene. Development 126: 469-81
80. Mallory AC, Bartel DP, Bartel B. 2005. microRNA-Directed Regulation of Arabidopsis AUXIN RESPONSE FACTOR17 Is Essential for Proper Development and Modulates Expression of Early Auxin Response Genes. Plant Cell 17
81. Mallory AC, Dugas DV, Bartel DP, Bartel B. 2004. MicroRNA regulation of NAC-domain targets is required for proper formation and separation of adjacent embryonic, vegetative, and floral organs. Curr Biol 14: 1035-46
82. Mallory AC, Reinhart BJ, Bartel D, Vance VB, Bowman LH. 2002. A viral suppressor of RNA silencing differentially regulates the accumulation of short interfering RNAs and micro-RNAs in tobacco. Proc Natl Acad Sci U S A 99: 15228-33
83. Mallory AC, Reinhart BJ, Jones-Rhoades MW, Tang G, Zamore PD, et al. 2004. MicroRNA control of PHABULOSA in leaf development: importance of pairing to the microRNA 5' region. EMBO J 23: 3356-64
84. Mansfield JH, Harfe BD, Nissen R, Obenauer J, Srineel J, et al. 2004. MicroRNA-responsive 'sensor' transgenes uncover Hox-like and other developmentally regulated patterns of vertebrate microRNA expression. Nat Genet 36: 1079-83
85. Matzke M, Aufsatz W, Kanno T, Daxinger L, Papp I, et al. 2004. Genetic analysis of RNA-mediated transcriptional gene silencing. Biochim Biophys Acta 1677: 129-41
86. Mayer AM, Staples RC. 2002. Laccase: new functions for an old enzyme. Phytochemistry 60: 551-65
87. McConnell JR, Barton MK. 1998. Leaf polarity and meristem formation in Arabidopsis. Development 125: 2935-42
88. McConnell JR, Emery J, Eshed Y, Bao N, Bowman J, Barton MK. 2001. Role of PHABULOSA and PHAVOLUTA in determining radial patterning in shoots. Nature 411: 709-13
89. Mette MF, van der Winden J, Matzke M, Matzke AJ. 2002. Short RNAs can identify new candidate transposable element families in Arabidopsis. Plant Physiol 130: 6-9
90. Millar AA, Gubler F. 2005. The Arabidopsis GAMYB-Like Genes, MYB33 and MYB65, Are MicroRNA-Regulated Genes That Redundantly Facilitate Anther Development. Plant Cell 17: 705-21
91. Morel JB, Godon C, Mourrain P, Beclin C, Boutet S, et al. 2002. Fertile hypomorphic ARGONAUTE (ago1) mutants impaired in post-transcriptional gene silencing and virus resistance. Plant Cell 14: 629-39
92. Moss EG, Lee RC, Ambros V. 1997. The cold shock domain protein LIN-28 controls developmental timing in C. elegans and is regulated by the lin-4 RNA. Cell 88: 637-46
93. Motamedi MR, Verdel A, Colmenares SU, Gerber SA, Gygi SP, Moazed D. 2004. Two RNAi complexes, RITS and RDRC, physically interact and localize to noncoding centromeric RNAs. Cell 119: 789-802
94. Mourelatos Z, Dostie J, Paushkin S, Sharma A, Charroux B, et al. 2002. miRNPs: a novel class of ribonucleoproteins containing numerous microRNAs. Genes Dev 16: 720-8
95. Mourrain P, Beclin C, Elmayan T, Feuerbach F, Godon C, et al. 2000. Arabidopsis SGS2 and SGS3 genes are required for posttranscriptional gene silencing and natural virus resistance. Cell 101: 533-42
96. Moussian B, Haecker A, Laux T. 2003. ZWILLE buffers meristem stability in Arabidopsis thaliana. Dev Genes Evol 213: 534-40
97. Moussian B, Schoof H, Haecker A, Jurgens G, Laux T. 1998. Role of the ZWILLE gene in the regulation of central shoot meristem cell fate during Arabidopsis embryogenesis. EMBO J 17: 1799-809
98. Noma K, Sugiyama T, Cam H, Verdel A, Zofall M, et al. 2004. RITS acts in cis to promote RNA interference-mediated transcriptional and post-transcriptional silencing. Nat Genet 36: 1174-80
99. Olsen PH, Ambros V. 1999. The lin-4 regulatory RNA controls developmental timing in Caenorhabditis elegans by blocking LIN-14 protein synthesis after the initiation of translation. Dev Biol 216: 671-80
100. Palatnik JF, Allen E, Wu X, Schommer C, Schwab R, et al. 2003. Control of leaf morphogenesis by microRNAs. Nature 425: 257-63
101. Papp I, Mette MF, Aufsatz W, Daxinger L, Schauer SE, et al. 2003. Evidence for nuclear processing of plant micro RNA and short interfering RNA precursors. Plant Physiol 132: 1382-90
102. Parizotto EA, Dunoyer P, Rahm N, Himber C, Voinnet O. 2004. In vivo investigation of the transcription, processing, endonucleolytic activity, and functional relevance of the spatial distribution of a plant miRNA. Genes Dev 18: 2237-42
103. Park MY, Wu G, Gonzalez-Sulser A, Vaucheret H, Poethig RS. 2005. Nuclear processing and export of microRNAs in Arabidopsis. Proc Natl Acad Sci U S A 102: 3691-6
104. Park W, Li J, Song R, Messing J, Chen X. 2002. CARPEL FACTORY, a Dicer homolog, and HEN1, a novel protein, act in microRNA metabolism in Arabidopsis thaliana. Curr Biol 12: 1484-95
105. Pasquinelli AE, Reinhart BJ, Slack F, Martindale MQ, Kuroda MI, et al. 2000. Conservation of the sequence and temporal expression of let-7 heterochronic regulatory RNA. Nature 408: 86-9
106. Peragine A, Yoshikawa M, Wu G, Albrecht HL, Poethig RS. 2004. SGS3 and SGS2/SDE1/RDR6 are required for juvenile development and the production of trans-acting siRNAs in Arabidopsis. Genes Dev 18: 2368-79
107. Pfeffer S, Sewer A, Lagos-Quintana M, Sheridan R, Sander C, et al. 2005. Identification of microRNAs of the herpesvirus family. Nat Methods 2: 269-76
108. Pfeffer S, Zavolan M, Grasser FA, Chien M, Russo JJ, et al. 2004. Identification of virus-encoded microRNAs. Science 304: 734-6
109. Poy MN, Eliasson L, Krutzfeldt J, Kuwajima S, Ma X, et al. 2004. A pancreatic islet-specific microRNA regulates insulin secretion. Nature 432: 226-30
110. Reinhart BJ, Bartel DP. 2002. Small RNAs correspond to centromere heterochromatic repeats. Science 297: 1831
111. Reinhart BJ, Slack FJ, Basson M, Pasquinelli AE, Bettinger JC, et al. 2000. The 21-nucleotide let-7 RNA regulates developmental timing in Caenorhabditis elegans. Nature 403: 901-6
112. Reinhart BJ, Weinstein EG, Rhoades MW, Bartel B, Bartel DP. 2002. MicroRNAs in plants. Genes Dev 16: 1616-26
113. Rhoades MW, Reinhart BJ, Lim LP, Burge CB, Bartel B, Bartel DP. 2002. Prediction of plant microRNA targets. Cell 110: 513-20
114. Schauer SE, Jacobsen SE, Meinke DW, Ray A. 2002. DICER-LIKE1: blind men and elephants in Arabidopsis development. Trends Plant Sci 7: 487-91
115. Schwab R, Palatnik JF, Riester M, Schommer C, Schmid M, Weigel D. 2005. Specific Effects of MicroRNAs on the Plant Transcriptome. Dev Cell 8: 517-27
116. Schwarz DS, Hutvagner G, Du T, Xu Z, Aronin N, Zamore PD. 2003. Asymmetry in the assembly of the RNAi enzyme complex. Cell 115: 199-208
117. Slack FJ, Basson M, Liu Z, Ambros V, Horvitz HR, Ruvkun G. 2000. The lin-41 RBCC gene acts in the C. elegans heterochronic pathway between the let-7 regulatory RNA and the LIN-29 transcription factor. Mol Cell 5: 659-69
118. Song JJ, Liu J, Tolia NH, Schneiderman J, Smith SK, et al. 2003. The crystal structure of the Argonaute2 PAZ domain reveals an RNA binding motif in RNAi effector complexes. Nat Struct Biol 10: 1026-32
119. Song JJ, Smith SK, Hannon GJ, Joshua-Tor L. 2004. Crystal structure of Argonaute and its implications for RISC slicer activity. Science 305: 1434-7
120. Souret FF, Kastenmayer JP, Green PJ. 2004. AtXRN4 degrades mRNA in Arabidopsis and its substrates include selected miRNA targets. Mol Cell 15: 173-83
121. Stark A, Brennecke J, Russell RB, Cohen SM. 2003. Identification of Drosophila MicroRNA targets. PLoS Biol 1: E60
122. Suh MR, Lee Y, Kim JY, Kim SK, Moon SH, et al. 2004. Human embryonic stem cells express a unique set of microRNAs. Dev Biol 270: 488-98
123. Sunkar R, Girke T, Jain PK, Zhu JK. 2005. Cloning and Characterization of microRNAs from Rice. Plant Cell 17: 666-999
124. Sunkar R, Zhu JK. 2004. Novel and stress-regulated microRNAs and other small RNAs from Arabidopsis. Plant Cell 16: 2001-19
125. Tabara H, Yigit E, Siomi H, Mello CC. 2002. The dsRNA binding protein RDE-4 interacts with RDE-1, DCR-1, and a DExH-box helicase to direct RNAi in C. elegans. Cell 109: 861-71
126. Tang G, Reinhart BJ, Bartel DP, Zamore PD. 2003. A biochemical framework for RNA silencing in plants. Genes Dev 17: 49-63
127. Telfer A, Poethig RS. 1998. HASTY: a gene that regulates the timing of shoot maturation in Arabidopsis thaliana. Development 125: 1889-98
128. Tomari Y, Zamore PD. 2005. MicroRNA biogenesis: drosha can't cut it without a partner. Curr Biol 15: R61-4
129. Vaucheret H, Vazquez F, Crete P, Bartel DP. 2004. The action of ARGONAUTE in the miRNA pathway and its regulation by the miRNA pathway are crucial for plant development. Genes Dev 18: 1187-97
130. Vazquez F, Gasciolli V, Crete P, Vaucheret H. 2004. The nuclear dsRNA binding protein HYL1 is required for microRNA accumulation and plant development, but not posttranscriptional transgene silencing. Curr Biol 14: 346-51
131. Vazquez F, Vaucheret H, Rajagopalan R, Lepers C, Gasciolli V, et al. 2004. Endogenous trans-acting siRNAs regulate the accumulation of Arabidopsis mRNAs. Mol Cell 16: 69-79
132. Verdel A, Jia S, Gerber S, Sugiyama T, Gygi S, et al. 2004. RNAi-mediated targeting of heterochromatin by the RITS complex. Science 303: 672-6
133. Volpe T, Schramke V, Hamilton GL, White SA, Teng G, et al. 2003. RNA interference is required for normal centromere function in fission yeast. Chromosome Res 11: 137-46
134. Volpe TA, Kidner C, Hall IM, Teng G, Grewal SI, Martienssen RA. 2002. Regulation of heterochromatic silencing and histone H3 lysine-9 methylation by RNAi. Science 297: 1833-7
135. Wang XJ, Reyes JL, Chua NH, Gaasterland T. 2004. Prediction and identification of Arabidopsis thaliana microRNAs and their mRNA targets. Genome Biol 5: R65
136. Wightman B, Ha I, Ruvkun G. 1993. Posttranscriptional regulation of the heterochronic gene lin-14 by lin-4 mediates temporal pattern formation in C. elegans. Cell 75: 855-62
137. Xie X, Lu J, Kulbokas EJ, Golub TR, Mootha V, et al. 2005. Systematic discovery of regulatory motifs in human promoters and 3' UTRs by comparison of several mammals. Nature 434: 338-45
138. Xie Z, Johansen LK, Gustafson AM, Kasschau KD, Lellis AD, et al. 2004. Genetic and functional diversification of small RNA pathways in plants. PLoS Biol 2: E104
139. Xie Z, Kasschau KD, Carrington JC. 2003. Negative feedback regulation of Dicer-Like1 in Arabidopsis by microRNA-guided mRNA degradation. Curr Biol 13: 784-9
140. Yan KS, Yan S, Farooq A, Han A, Zeng L, Zhou MM. 2003. Structure and conserved RNA binding of the PAZ domain. Nature 426: 468-74
141. Yekta S, Shih IH, Bartel DP. 2004. MicroRNA-directed cleavage of HOXB8 mRNA. Science 304: 594-6
142. Yi R, Qin Y, Macara IG, Cullen BR. 2003. Exportin-5 mediates the nuclear export of pre-microRNAs and short hairpin RNAs. Genes Dev 17: 3011-6
143. Yu B, Yang Z, Li J, Minakhina S, Yang M, et al. 2005. Methylation as a crucial step in plant microRNA biogenesis. Science 307: 932-5
144. Zamore PD, Tuschl T, Sharp PA, Bartel DP. 2000. RNAi: double-stranded RNA directs the ATP-dependent cleavage of mRNA at 21 to 23 nucleotide intervals. Cell 101: 25-33
145. Zhong R, Ye ZH. 2004. Amphivasal vascular bundle 1, a gain-of-function mutation of the IFL1/REV gene, is associated with alterations in the polarity of leaves, stems and carpels. Plant Cell Physiol 45: 369-85
146. Zilberman D, Cao X, Jacobsen SE. 2003. ARGONAUTE4 control of locus-specific siRNA accumulation and DNA and histone methylation. Science 299: 716-9
147. Zilberman D, Cao X, Johansen LK, Xie Z, Carrington JC, Jacobsen SE. 2004. Role of Arabidopsis ARGONAUTE4 in RNA-directed DNA methylation triggered by inverted repeats. Curr Biol 14: 1214-20
Prediction of Plant MicroRNA Targets
Matthew W. Rhoades (1,2), Brenda J. Reinhart, Lee P. Lim (1,2),
Christopher B. Burge (2), Bonnie Bartel (3,4), and David P. Bartel (1,4)
1 Whitehead Institute for Biomedical Research, 9 Cambridge Center, Cambridge, Massachusetts 02142
2 Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139
3 Department of Biochemistry and Cell Biology, Rice University, 6100 Main St., Houston, Texas 77005
4 To whom correspondence should be addressed.
dbartel@wi.mit.edu, 617-258-5287, fax 617-258-6768
bartel@rice.edu, 713-348-5602
Summary
We predict regulatory targets for 14 Arabidopsis microRNAs (miRNAs) by identifying
mRNAs with near complementarity. Complementary sites within predicted targets are
conserved in rice. Of the 49 predicted targets, 29 are members of transcription factor gene
families involved in developmental patterning or cell differentiation. The near-perfect
complementarity between plant miRNAs and their targets suggests that many plant
miRNAs act similarly to small interfering RNAs and direct mRNA cleavage. The targeting
of developmental transcription factors suggests that many plant miRNAs function during
cellular differentiation to clear maternal regulatory transcripts from daughter cell lineages.
Introduction
Nearly 200 genes for tiny, noncoding RNAs termed microRNAs (miRNAs) have been identified
in animals and plants (Lagos-Quintana et al., 2001; Lau et al., 2001; Lee and Ambros, 2001;
Lagos-Quintana et al., 2002; Llave et al., 2002; Mourelatos et al., 2002; Reinhart et al., 2002).
Two miRNAs, lin-4 and let-7 RNAs, have been studied in detail; both control developmental
timing in C. elegans through a mechanism that involves imperfect base pairing to the 3' UTRs of
target mRNAs (Lee et al., 1993; Wightman et al., 1993; Ha et al., 1996; Moss et al., 1997;
Reinhart et al., 2000; Slack et al., 2000). The remaining miRNAs have unknown functions.
Nonetheless, their sequences are typically conserved among different species, and many have
intriguing expression patterns in different tissues or stages of development, indicating that these
other miRNAs have important functions and might also modulate gene expression. This idea is
supported by the observation that Dicer and Argonaute proteins, which are known to be crucial
for normal plant and animal development, are needed for proper miRNA accumulation
(Robinson-Beers et al., 1992; Ray et al., 1996; Ray et al., 1996; Jacobsen et al., 1999; Grishok et
al., 2001; Hutvágner et al., 2001; Ketting et al., 2001; Knight and Bass, 2001; Reinhart et al.,
2002).
The major challenge in determining miRNA functions is to identify their regulatory
targets. By analogy to lin-4 and let-7 RNAs, it is reasonable to suppose that miRNAs generally
recognize their regulatory targets through base pairing. However, the small size of the mature
miRNAs (20-24 nt) and the imperfect nature of miRNA:mRNA base pairing have hampered the
general prediction of mRNA targets for animal miRNAs. Thus far, prediction of animal miRNA
targets has been achieved only after experimental evidence narrowed the number of candidate
mRNAs to a small set, either by placing the mRNAs within the same regulatory pathway as the
miRNA or by identifying regulatory elements within mRNA 3'-UTRs (Lee et al., 1993;
Wightman et al., 1993; Moss et al., 1997; Reinhart et al., 2000; Slack et al., 2000; Lai, 2002).
An indication that target prediction for certain plant miRNAs might be more straightforward
came with the recent identification of miR171, a plant miRNA with perfect antisense
complementarity to the mRNAs of three SCARECROW-like transcription factors (Llave et al.,
2002; Reinhart et al., 2002).
Here we report that near complementarity to mRNAs, particularly transcription factor
mRNAs, is a general trend for plant miRNAs. We have been able to identify potential regulatory
targets for 14 of the 16 miRNAs studied by searching for mRNAs capable of base pairing with
three or fewer mismatches to one of the miRNAs. The fact that many of these potential targets
are members of gene families with roles in plant development supports the idea that the function
of miRNAs in mediating development is conserved across kingdoms. Particularly compelling
targets include the PHABULOSA and PHAVOLUTA mRNAs, for which the identification of
miRNA complementary sites may explain the ectopic expression previously described for
mutations in these genes (McConnell et al., 2001). Similar analysis of animal miRNAs did not
predict animal regulatory targets, suggesting mechanistic differences between plant and animal
miRNA function.
Results and Discussion
Plant MicroRNAs Have Significant Complementarity to Messenger RNAs
To identify potential regulatory targets, we searched for Arabidopsis mRNAs that were
complementary, with four or fewer mismatches, to at least one of 16 recently identified
Arabidopsis miRNAs (Reinhart et al., 2002). Gaps were not allowed, and G:U and other noncanonical pairs were treated as mismatches. To evaluate the significance of these hits to
annotated mRNAs, parallel analyses were performed using cohorts of randomly permuted
sequences that had identical sizes and base compositions as the set of authentic miRNAs. There
were substantially more antisense hits to the authentic miRNAs than to the randomized
sequences (Figure 1). This difference was especially striking at higher stringency; when
summing the hits with two or fewer mismatches, the number of hits to the authentic miRNA set
outnumbered those to the randomized cohorts by a ratio of 30:0.2 (Figure 1). Considering the
low probability of so many antisense hits occurring by chance, we suggest that these
complementary sites reflect a functional relationship between the miRNAs and the identified
mRNAs, namely that these protein-coding genes are regulatory targets of the miRNAs to which they
can potentially base pair.
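The search just described (ungapped alignment, a fixed mismatch ceiling, G:U pairs scored as mismatches) can be illustrated with a minimal Python sketch. This is not the PatScan search used in this work; the function names and the flanking sequence in the toy example are assumptions introduced only for illustration, and the site itself is the miR171 complementary site listed in Table 2.

```python
# Watson-Crick complements for RNA. G:U wobble pairs are deliberately absent,
# so they are scored as mismatches, as in the search described in the text.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_complementary_sites(mirna: str, mrna: str, max_mismatches: int = 4):
    """Yield (start, mismatches) for every ungapped window of mrna that pairs
    with mirna with at most max_mismatches non-Watson-Crick positions."""
    target = reverse_complement(mirna)  # a perfect site as it reads in the mRNA
    width = len(target)
    for start in range(len(mrna) - width + 1):
        window = mrna[start:start + width]
        mismatches = sum(1 for a, b in zip(window, target) if a != b)
        if mismatches <= max_mismatches:
            yield start, mismatches


# Toy example: the miR171 site from Table 2 inside a made-up flanking sequence.
site = "GAUAUUGGCGCGGCUCAAUCA"
mirna = reverse_complement(site)
mrna = "AAACCC" + site + "GGGUUU"  # hypothetical mRNA fragment
hits = list(find_complementary_sites(mirna, mrna, max_mismatches=3))
```

Lowering `max_mismatches` corresponds to raising the stringency of pairing on the x-axis of Figure 1.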
At lower stringencies, there were also significantly more hits with the authentic set of
miRNAs than with the randomized cohorts. Most of the 31 hits with three mismatches are viable
miRNA target candidates, although a few are likely to be mRNAs with fortuitous
complementarity, as judged by the observation that on average the randomized cohorts hit 4.2
mRNAs when three mismatches were permitted (Figure 1). Some hits with four mismatches
might also be genuine targets. However, they are not included in the present analysis because of
the greater likelihood that their complementarity is fortuitous or occurs because they are targets
of unidentified miRNAs related to our query set of 16 miRNAs.
Potential regulatory targets with three or fewer mismatches were found for 14 of the 16
miRNAs (Table 1). Targets for the other two miRNAs might be identified through slight
changes in the search algorithm. For example, miR163, one of the two miRNAs without
predicted targets in Table 1, has extensive complementarity to members of the AtPP-like gene
family (At1g66690, At1g66700, At1g66720, At3g44860, At3g44870), which have unknown
functions (Cui et al., 1999). All 24 nucleotides of this miRNA paired to complementary sites
within these mRNAs when a single-nucleotide gap was permitted near the 3' terminus of the
miRNA. Nonetheless, when searching for miRNA targets, permitting gaps did not substantially
increase the number of targets predicted for the other miRNAs (data not shown). Perhaps a
bulge is accommodated near the miRNA terminus more readily for miR163 because this miRNA
is 24 nt in length, which is 3 nt longer than the other miRNAs queried.
In all cases where an miRNA was complementary to more than one mRNA, most of the
potential targets were members of the same gene family (Table 1). The fraction of the gene
family members with miRNA complementary sites varied considerably. Of the 16 Squamosa
Promoter Binding Protein (SBP)-like genes in Arabidopsis (Riechmann et al., 2000), 10 have
miR156 complementary sites. In contrast, the MYB and NAC families each have over 100
members in Arabidopsis (Riechmann et al., 2000), of which five in each case have sites
complementary to miR159 or miR164, respectively. As more miRNAs are identified it will be
interesting to learn whether remaining members of these gene families have complementary sites
to other miRNAs. In support of this possibility, unrelated miRNAs can be complementary to
different members of the same gene family, as illustrated by miR160 and miR167, which
apparently target different members of the Auxin Response Factor family (Ulmasov et al., 1999).
When considering the significance of multiple hits to the same gene family, it is
important to address the possibility that these hits are merely the consequence of
complementarity to a nucleotide sequence that encodes a critical protein motif. Indeed, for
miR161, miR165, miR170, and miR171, the miRNA complementary sites were within the
context of a domain strongly conserved among family members, as shown for the miR165
complementary sites (Figure 2A). Therefore, we cannot rule out the possibility that only a
subset of the hits for these miRNAs are authentic targets. This possibility is less likely in the
cases of miR156, miR157, miR159, miR160, miR164, and miR169. The complementary sites
for these miRNAs fell outside the conserved domains that define the families and instead fell
within sequence contexts that were only weakly conserved among the family members, as shown
for the miR156 sites within SBP-like mRNAs (Figure 2B). Indeed, there are examples where the
conservation of the miRNA complementary sites among family members must be independent of
conserved protein function. In the case of the MYB genes with miR159 complementary sites,
four genes translate the complementary site in the same reading frame, while the fifth gene
translates the site in a different reading frame. In four other cases (miR156/157 to At1g53160,
miR156 to At2g33810, and miR169 to At1g17590 and At1g54160), the miRNA complementary
sites are not in the coding regions at all but rather in the 3'-UTRs, as illustrated for miR156 and
its complementary sites (Figure 2B).
MicroRNA Complementary Sites Are Conserved Among Flowering Plants
Many complementary sites observed in Arabidopsis are conserved in rice (Oryza sativa).
Analysis of rice homologs focused on the seven miRNAs perfectly conserved in Oryza (Reinhart
et al., 2002) for which complementary sites had been identified in Arabidopsis (Table 1). When
using a three-mismatch cutoff, six of the seven conserved miRNAs (miR156, miR160, miR164,
miR167, miR169, miR171) have at least one potential target gene in Oryza homologous to a
corresponding Arabidopsis target. In an analogous control study using 44 hits to the randomized
cohorts, there were no miRNA complementary sites in rice homologs of the Arabidopsis hits,
even when four mismatches were allowed.
The location of the miRNA complementary sites within the mRNAs was conserved
between Arabidopsis and rice. Importantly, when there were differences between Arabidopsis
and rice complementary sites within homologous genes, these differences were distributed
evenly across the three codon positions (Table 2). Homologous regions under selection only at
the protein level tend to exhibit a higher frequency of differences at codon position 3. Thus, the
even distribution of mismatches across the codon positions indicates selection occurring at the
nucleic acid level, in addition to any selection at the protein level, as would be expected if these
segments act in miRNA recognition.
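The codon-position argument can be made concrete with a short Python sketch that tallies differences between two aligned complementary sites by codon position. It assumes the sites are written as in Table 2 (space-separated codons in the mRNA reading frame, with partial codons at either end). The function name is hypothetical, and pairing the At1g56010 row with the Os00116 row is an assumption made only to have a worked example.

```python
from collections import Counter


def differences_by_codon_position(site_a: str, site_b: str) -> Counter:
    """Count nucleotide differences between two aligned sites by codon position.

    Sites are written as space-separated codons in the mRNA reading frame,
    with a possibly partial first and last codon.
    """
    codons_a, codons_b = site_a.upper().split(), site_b.upper().split()
    counts = Counter()
    for index, (codon_a, codon_b) in enumerate(zip(codons_a, codons_b)):
        # A partial first codon holds the last bases of its codon, so its
        # positions are right-aligned (for example "AG" covers positions 2 and 3).
        offset = 3 - len(codon_a) if index == 0 else 0
        for i, (base_a, base_b) in enumerate(zip(codon_a, codon_b)):
            if base_a != base_b:
                counts[offset + i + 1] += 1
    return counts


arabidopsis = "aG CAC GUa CCC UGC UUC UCC A"  # At1g56010 row of Table 2
rice = "cG CAC GUG aCC UGC UUC UCC A"         # Os00116 row of Table 2
counts = differences_by_codon_position(arabidopsis, rice)
```

For this pair the three differences fall one each at codon positions 1, 2, and 3. A single pair proves nothing on its own; the argument in the text rests on the distribution summed over all conserved sites.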
Most Predicted MicroRNA Targets Are Members of Transcription Factor Families
Involved in Development
Perhaps the most intriguing evidence that these genes are regulatory targets of the miRNAs is the
identity of the genes themselves. MicroRNA complementary sites were found in 61 mRNAs,
which, due to overlap between similar miRNAs, represent 49 unique genes (Table 1). Of these
49 predicted targets, 29 are known or putative transcription factors (Table 1), even though
transcription factors are thought to represent only 6% of protein-coding genes in Arabidopsis
(Riechmann et al., 2000). Many of these genes specify shoot and floral meristem development
or, for those with unknown functions, are in a family that has members involved in meristem
development. For example, the predicted targets of miR164 include CUP-SHAPED
COTYLEDON2 (CUC2), which is required for shoot apical meristem formation (Aida et al.,
1997), and miR165 predicted targets include PHABULOSA (PHB) and PHAVOLUTA (PHV),
which encode HD-Zip transcription factors that regulate axillary meristem initiation and leaf
development (McConnell et al., 2001). A miR159 predicted target, AtMYB33, can bind to the
promoter of the floral meristem identity gene LEAFY (Gocal et al., 2001). Homologs of the
SBPs, which are thought to regulate the Antirrhinum floral meristem identity gene SQUAMOSA
(Klein et al., 1996), may in turn be regulated by miR156 and miR157.
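To give a rough sense of how unexpected 29 transcription factors among 49 predicted targets would be if targets were drawn at random from a genome in which 6% of genes encode transcription factors, one can compute a binomial tail probability. This calculation is an illustration added here, not an analysis reported in this work, and it assumes that targets are independent draws, which is not strictly true because many targets belong to the same gene family.

```python
from math import comb


def binomial_tail(successes: int, trials: int, p: float) -> float:
    """Return P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


expected = 49 * 0.06                   # transcription factors expected by chance
p_value = binomial_tail(29, 49, 0.06)  # chance of 29 or more under independence
```

Under these assumptions roughly three transcription factors would be expected by chance, so the observed 29 is about a tenfold enrichment.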
Genetic evidence supports the regulatory roles of miR165 complementary sites within
PHB and PHV (Figure 2A). Multiple gain-of-function alleles have been isolated for both genes,
and each of these mutations disrupts the miR165 complementary site, usually as a single-nucleotide
substitution (McConnell et al., 2001). In the mutant examined, phb mRNA
expression extends more broadly than in wild type (McConnell et al., 2001), suggesting that
complementarity to miR165 is required for confining PHB mRNA accumulation to the proper
cell types.
A connection between miRNAs and meristem development is consistent with the
phenotypes of the Arabidopsis carpel factory (caf) mutant. Dicer and CAF are homologous
RNaseIII-domain proteins required for the accumulation of mature miRNAs in animals and
plants, respectively (Hutvágner and Zamore, 2002; Reinhart et al., 2002). Mutant alleles of CAF,
which is also known as SHORT INTEGUMENTS1 (SIN1), delay the meristem switch from
vegetative to floral development and cause over-proliferation of the floral meristem (Ray et al.,
1996; Jacobsen et al., 1999). Other genes required for miRNA accumulation in animals are
homologs of the Arabidopsis gene ARGONAUTE (AGO1), which is required for axillary shoot
meristem formation and leaf development in Arabidopsis (Bohmert et al., 1998). While AGO1
has not yet been reported to influence miRNA accumulation in plants, it is a predicted target of
miR168 (Table 1), suggesting a negative-feedback mechanism for controlling expression of the
AGO1 gene.
Other predicted targets of miRNAs do not have direct roles in meristem identity but
rather could have roles in cell division or differentiation. For example, miR160 and miR167 are
predicted to target auxin response factors, DNA-binding proteins that are thought to control
transcription in response to the phytohormone auxin (Ulmasov et al., 1999). Transcriptional
regulation is important for many of the diverse developmental responses to auxin signals, which
include cell elongation, division, and differentiation in both roots and shoots (Rogg and Bartel,
2001; Liscum and Reed, 2002). The predicted targets of miR170 and miR171 are three
SCARECROW-like proteins, a family of transcription factors whose members have been
implicated in radial patterning in roots, signaling by the phytohormone gibberellin, and light
signaling (Di Laurenzio et al., 1996; Peng et al., 1997; Silverstone et al., 1998; Bolle et al., 2000;
Helariutta et al., 2000). Overall, the high percentage of predicted miRNA targets that act as
developmental regulators suggests that miRNAs are involved in a wide range of cell division and
cell fate decisions throughout the plant.
Mechanistic and Functional Models for Regulation by MicroRNAs in Plants
The success in identifying potential miRNA targets in Arabidopsis prompted us to examine
whether our simple computational approach could also identify miRNA targets in C. elegans and
D. melanogaster. In both organisms, the miRNAs had few mRNA hits with complementary
sites--essentially the same number of hits as seen for randomized cohorts (data not shown).
While the possibility that a few animal miRNAs will recognize their targets with near-perfect
complementarity cannot be excluded, the general phenomenon of near-perfect complementarity
appears to be specific to plants. Two other key differences emerge when comparing the
predicted target sites of plant miRNAs with those of the C. elegans lin-4 and let-7 miRNAs.
First, the plant complementary sites are primarily, though not exclusively, within the ORFs,
whereas the only proposed lin-4 and let-7 sites are within 3' UTRs (Lee et al., 1993; Wightman
et al., 1993; Moss et al., 1997; Reinhart et al., 2000; Slack et al., 2000). Second, multiple sites
within the same target mRNA are not detected in plants, whereas there are typically multiple lin-4 and let-7 sites within each mRNA target (Lee et al., 1993; Wightman et al., 1993; Ha et al.,
1996; Reinhart et al., 2000; Slack et al., 2000).
These differences observed between plant and animal miRNA target recognition have
intriguing mechanistic implications for plant miRNA function (Figure 3A). Namely, plant
miRNA target recognition appears to resemble that of small interfering RNAs (siRNAs) much
more than that of animal miRNAs. During RNA interference (RNAi), long double-stranded
RNA is processed by Dicer into ~22-nt siRNAs, which serve as guide RNAs to target
homologous mRNA sequences for cleavage (Bernstein et al., 2001; Hutvágner and Zamore,
2002). Importantly, targeting either the ORF or the UTRs is effective (McManus et al., 2002),
provided that the siRNA has near-perfect complementarity to the targeted mRNA (Elbashir et al.,
2001). Plants also have siRNAs. Indeed, these tiny RNAs were first observed in plants and are
associated with a process related to RNAi, known as posttranscriptional gene silencing (PTGS),
which leads to the destruction of mRNA from plant viruses and transgenes (Hamilton and
Baulcombe, 1999; Matzke et al., 2001). Plant miRNAs resemble animal miRNAs in their
biogenesis, in that they are derived from endogenous, evolutionarily conserved genes and are
processed from stem-loop precursors by a Dicer homolog, with accumulation of mature miRNA
from only one arm of the precursor stem-loop (Reinhart et al., 2002). However, plant miRNAs
resemble siRNAs in their target recognition, suggesting that they might also resemble siRNAs in
their mechanism of action (Figure 3A). We propose that many plant miRNAs hybridize to
mRNAs with near-perfect complementarity and target the mRNAs for cleavage. A function in
mediating RNA cleavage might allow the plant miRNAs to target any region of the mRNA,
whereas the animal miRNAs that mediate translational attenuation might be relegated to 3' UTRs in order to avoid the mRNA-clearing activity of ribosomes. The efficiency and finality of
mRNA cleavage might require only a single complementary site in each message, whereas the
regulatory mechanism of lin-4 and let-7 miRNAs, which leaves the mRNA intact, might
generally require multiple target sites.
In presenting this hypothesis, we leave open the possibility that some plant miRNAs
might not specify cleavage of their regulatory targets, and some might specify cleavage of some
targets but employ other mechanisms to regulate other targets. Targets with many mismatches,
analogous to the targets of lin-4 and let-7 miRNAs, would not have been detected in our analysis.
Furthermore, some mismatches for the predicted targets are near the center of the complementary
sites (Table 2, data not shown) and might be expected to abrogate siRNA-mediated mRNA
cleavage (Elbashir et al., 2001). However, it is difficult to know whether these mismatches are
incompatible with mRNA cleavage because the types and locations of mismatches permissive for
siRNA-mediated cleavage are still being determined in animals and have not yet been explored
in plants. In those cases where the miRNAs might not be mediating mRNA cleavage, they might
attenuate translation (Olsen and Ambros, 1999), act as guide RNAs for mRNA modifications
(Kiss, 2002), or target DNA for epigenetic modifications, such as methylation (Matzke et al.,
2001). Although DNA targeting cannot be excluded as an additional miRNA function for some
miRNAs, two observations argue strongly for a role in targeting mRNAs in addition to any
possible role in targeting DNA. First, plant miRNAs are complementary to the sense rather than
antisense strands of mRNAs (data not shown). Second, the complementary sites for miR165 and
miR166 span a splice junction within each of the HD-Zip mRNAs.
The observation that many plant miRNAs potentially target the mRNAs of transcription
factors involved in development suggests that some miRNAs might function to clear maternal
regulatory transcripts from certain daughter-cell lineages (Figure 3B). Through the action of
miRNAs, these inherited mRNAs could be eliminated without relying on constitutively unstable
messages. Now that potential miRNA binding sites in some of these developmentally important
transcription factor mRNAs have been identified, it should be possible to test this speculative
model by disrupting the miRNA complementarity site in the mRNA without changing the
protein sequence of the transcription factor.
The miRNAs analyzed here are likely to be only a small fraction of the miRNAs in
Arabidopsis (Llave et al., 2002; Reinhart et al., 2002). Nonetheless, the discovery that so many
of these plant miRNAs appear to have readily identifiable regulatory targets will greatly facilitate
experimental investigation of the functions of these tiny noncoding RNAs and the many other
miRNAs remaining to be found in plants. With the ability to computationally identify candidate
targets, the presumed roles of miRNAs in development can be more readily explored, and roles
of miRNAs in other processes can be more readily uncovered.
Experimental Procedures
Identification of miRNA Complementary Sites in Annotated mRNAs
The set of annotated Arabidopsis mRNA sequences was extracted from the genomic GenBank
files, January 2002 release (Arabidopsis Genome Initiative, 2000). This set was searched for
complementary sites to any of 16 miRNAs (GenBank accession numbers AJ493620-AJ493656)
using PatScan (Dsouza et al., 1997). When the miRNA was cloned as either a 20- or 21-nt RNA,
the 21-nt RNA was used (Reinhart et al., 2002). Thus, the miR158 sequence was 20 nt, the
miR163 sequence was 24 nt, and the remaining 14 miRNA sequences were 21 nt. One mismatch
was added to all miR158 complementary sites to compensate for their smaller size and the
correspondingly greater chance of fortuitous complementarity. Complementary sites were also
found for 10 cohorts of 16 randomly permuted sequences that had identical sizes and base
compositions to the authentic miRNAs. One mismatch was added to the sites complementary to
the randomly permuted versions of miR158. Analogous searches for animal miRNA
complementary sites queried annotated D. melanogaster mRNAs (GenBank October 2000
release) and annotated C. elegans coding regions (GenBank April 1999 release).
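The construction of randomized control cohorts and the one-mismatch adjustment for the 20-nt miR158 can be sketched as follows. This is a minimal illustration, not the scripts used in this work: the function names, the fixed random seed, and the use of a single query sequence (the reverse complement of the miR171 site shown in Table 2) are assumptions made for the example.

```python
import random


def permuted_cohort(mirnas, rng):
    """Return one control cohort: each query shuffled in place of its original,
    which preserves both its length and its base composition."""
    cohort = []
    for seq in mirnas:
        bases = list(seq)
        rng.shuffle(bases)
        cohort.append("".join(bases))
    return cohort


def adjusted_mismatches(raw_mismatches, query_length, full_length=21):
    """Add one mismatch to hits from queries shorter than full_length,
    mirroring the adjustment described for the 20-nt miR158."""
    return raw_mismatches + (1 if query_length < full_length else 0)


rng = random.Random(0)  # fixed seed: an arbitrary choice, for reproducibility only
queries = ["UGAUUGAGCCGCGCCAAUAUC"]  # reverse complement of the miR171 site in Table 2
cohorts = [permuted_cohort(queries, rng) for _ in range(10)]
```

Each cohort would then be searched exactly as the authentic set was, and the hit counts averaged over the 10 cohorts, as plotted in Figure 1.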
Identification of Homologous miRNA Complementary Sites in Oryza mRNAs
For each Arabidopsis target mRNA, the mRNAs of up to 10 homologous Oryza proteins were
predicted from the unannotated Oryza contigs (Yu et al., 2002) by GenomeScan, a program
which identifies genes within genomic sequence using homology to input protein sequences
combined with an ab initio gene-finding algorithm (Yeh et al., 2001). Complementary sites in
this dataset were identified by PatScan searches, and homology to the Arabidopsis targets was
confirmed by alignment of the inferred protein sequences (ClustalX). One additional target
homolog (TC79868) was found by searching the TIGR Rice Gene Index (9.0).
Acknowledgments
We thank Earl Weinstein and Ru-Fang Yeh for computer scripts and helpful discussions. This
work was supported by grants from the David H. Koch Cancer Research Fund (DBP), the Alexander
and Margaret Stewart Trust (DBP), the Robert A. Welch Foundation (BB), and the NIH (CBB).
Table 1. Potential Regulatory Targets of Arabidopsis miRNAs

MicroRNA | Target protein family | Target gene names (number of mismatches)
miR156 | SQUAMOSA-PROMOTER BINDING PROTEIN (SBP)-like proteins | At3g57920 (1), At2g42200/SPL9 (1), At5g50570 (1), At5g50670 (1), At1g53160/SPL4 (2), At2g33810/SPL3 (2), At1g27370/SPL10 (2), At5g43270/SPL2 (2), At1g69170/SPL6 (2), At1g27360/SPL11 (2)
miR157 | SQUAMOSA-PROMOTER BINDING PROTEIN (SBP)-like proteins | At1g27370/SPL10 (1), At3g57920 (1), At2g42200/SPL9 (1), At5g43270/SPL2 (1), At1g27360/SPL11 (1), At1g69170/SPL6 (2), At5g50570 (2), At5g50670 (2), At1g53160/SPL4 (3)
miR157 | Putative RNA helicase | At5g08620 (3)
miR157 | Unknown proteins | At3g47170 (3), At1g22000 (3)
miR158 | Unknown protein | At1g64100 (3)
miR159 | MYB proteins | At2g32460/AtMYB101 (2), At3g60460 (3), At2g26950/AtMYB104 (3), At5g06100/AtMYB33 (3), At3g11440/AtMYB65 (3)
miR159 | Unknown protein | At1g29010 (3)
miR160 | Auxin Response Factors | At1g77850/ARF17 (1), At2g28350/ARF10 (2), At4g30080 (3)
miR161 | PPR repeat proteins | At1g63150 (3), At1g63400 (3), At1g06580 (3), At1g64580 (3), At5g16640 (3), At1g62670 (3), At1g62720 (3), At5g41170 (3), At1g63080 (3)
miR164 | NAC domain proteins | At5g61430 (2), At5g07680 (2), At1g56010/NAC1 (2), At3g15170 (3), At5g53950/CUC2 (3)
miR165 | HD-Zip transcription factors | At5g60690/REV (3), At3g34710/PHB (3), At4g32880/ATHB-8 (3), At1g30490/PHV (3)
miR166 | HD-Zip transcription factor | At1g52150/ATHB-15 (3)
miR167 | Auxin Response Factor | At5g37020/ARF8 (3)
miR168 | ARGONAUTE | At1g48410/AGO1 (3)
miR169 | CCAAT Binding Factor (CBF)-HAP2-like proteins | At1g17590 (3), At1g54160 (3)
miR170 | GRAS domain proteins (SCARECROW-like) | At2g45160 (2), At3g60630 (2), At4g00150/SCL6 (2)
miR171 | GRAS domain proteins (SCARECROW-like) | At2g45160 (0), At3g60630 (0), At4g00150/SCL6 (0)

For each gene, the number of mismatches between the miRNA and the mRNA is indicated in parentheses. The sequences of three
pairs of miRNAs (miR156/miR157, miR165/miR166, and miR170/miR171) are closely related and therefore sometimes complementary
to the same sites within the target mRNAs. Sites complementary to miR158 had an additional mismatch added to compensate for the
fact that miR158 is at least 1 nt shorter than the other miRNAs.
Table 2. MicroRNA Complementary Sites in Potential mRNA Targets Conserved Between Arabidopsis and Oryza

Target gene | RNA sequence of complementary site | Peptide sequence
miR156 | UGU GCU CAC UCU CUU CUG UCA |
At5g50570 (1) | UGU GCU CuC UCU CUU CUG UCA | CALSLLS
At5g50670 (1) | UGU GCU CuC UCU CUU CUG UCA | CALSLLS
At3g57920 (1) | UGU GCU CuC UCU CUU CUG UCA | CALSLLS
At2g42200 (1) | UGU GCU CuC UCU CUU CUG UCA | CALSLLS
At1g27370 (2) | aGU GCU CuC UCU CUU CUG UCA | SALSLLS
At1g27360 (2) | cGU GCU CuC UCU CUU CUG UCA | RALSLLS
At5g43270 (2) | gGU GCU CuC UCU CUU CUG UCA | GALSLLS
At1g69170 (2) | cGU GCU CuC UCU CUU CUG UCA | RALSLLS
At2g33810 (2) | UuU GCU uAC UCU CUU CUG UCA | 3' UTR
At1g53160 (2) | UcU GCU CuC UCU CUU CUG UCA | 3' UTR
Os20095 (1) | UGU GCU CuC UCU CUU CUG UCA | CALSLLS
Os06618 (1) | UGU GCU CuC UCU CUU CUG UCA | CALSLLS
Os02878 (1) | UGU GCU CuC UCU CUU CUG UCA | CALSLLS
Os25470 (2) | gGU GCU CuC UCU CUU CUG UCA | GALSLLS
miR160 | U GGC AUA CAG GGA GCC AGG CA |
At1g77850 (1) | U GGC AUg CAG GGA GCC AGG CA | AGMQGARQ
At2g28350 (2) | a GGa AUA CAG GGA GCC AGG CA | AGIQGARQ
At4g30080 (3) | g GGu uUA CAG GGA GCC AGG CA | VGLQGARH
OsTC73519 (1) | a GGC AUA CAG GGA GCC AGG CA | AGIQGARH
OsTC70631 (1) | a GGC AUA CAG GGA GCC AGG CA | AGIQGARH
Os17478 (1) | a GGC AUA CAG GGA GCC AGG CA | AGIQGARH
Os02679 (1) | a GGC AUA CAG GGA GCC AGG CA | AGIQGARH
miR164 | UG CAC GUG CCC UGC UUC UCC A |
At1g56010 (2) | aG CAC GUa CCC UGC UUC UCC A | EHVPCFSN
At5g07680 (2) | Uu uAC GUG CCC UGC UUC UCC A | VYVPCFSN
At5g61430 (2) | Uc uAC GUG CCC UGC UUC UCC A | VYVPCFSN
At3g15170 (3) | aG CAC GUG uCC UGu UUC UCC A | EHVSCFSN
At5g53950 (3) | aG CAC GUG uCC UGu UUC UCC A | EHVSCFST
Os00116 (2) | cG CAC GUG aCC UGC UUC UCC A | AHVTCFSN
miR167 | U AGA UCA UGC UGG CAG CUU CA |
At5g37020 (3) | U AGA UCA gGC UGG CAG CUU gu | LRSGWQLV
OsTC79868 (3) | U AGA UCA gGC UGG CAG CUU gu | DRSGWQLV
miR169 | UCG GCA AGU CAU CCU UGG CUG |
At1g17590 (3) | aaG GgA AGU CAU CCU UGG CUG | 3' UTR
At1g54160 (3) | aCG GgA AGU CAU CCU UGG CUa | 3' UTR
Os04048 (3) | UaG GCA AcU CAU uCU UGG CUG | 3' UTR
Os09843 (3) | UaG GCA AuU CAU CCU UGG CUu | 3' UTR
miR171 | G AUA UUG GCG CGG CUC AAU CA |
At2g45160 (0) | G AUA UUG GCG CGG CUC AAU CA | GILARLNH
At3g60630 (0) | G AUA UUG GCG CGG CUC AAU CA | GILARLNH
At4g00150 (0) | G AUA UUG GCG CGG CUC AAU CA | GILARLNQ
OsTC76755 (0) | G AUA UUG GCG CGG CUC AAU CA | EILARLNQ
OsTC81772 (0) | G AUA UUG GCG CGG CUC AAU CA | EILARLNH
Os00711 (0) | G AUA UUG GCG CGG CUC AAU CA | EILARLNQ
Os12185 (0) | G AUA UUG GCG CGG CUC AAU CA | EILARLNQ
OsTC75254 (1) | G AUA UUG GCG CGG CUC AAU uA | EILARLNY

For each gene, the nucleotide sequence of the miRNA complementary site is broken into codons corresponding to the
reading frame of the mRNA. The reverse complement is shown for each miRNA, and for each complementary site, mismatches are
shown in lowercase. The peptide sequence of the miRNA complementary site is shown. Oryza genes are labeled either by
their tentative consensus (TC) numbers from the TIGR rice gene index (version 9.0) or by the genomic contig of the mRNA predicted
by GenomeScan.
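The row notation used in Table 2 (mismatches in lowercase, sequence split into codons in the mRNA reading frame) can be reproduced with a small Python sketch. The helper names are hypothetical and were introduced for this illustration. The reference is the miR160 reverse complement from Table 2 and the site is the At2g28350 row.

```python
def mark_mismatches(reference: str, site: str) -> str:
    """Return site with every base that differs from reference in lowercase."""
    if len(reference) != len(site):
        raise ValueError("reference and site must have the same length")
    return "".join(
        s.upper() if s.upper() == r.upper() else s.lower()
        for r, s in zip(reference, site)
    )


def split_codons(sequence: str, first_codon_length: int) -> str:
    """Insert spaces at codon boundaries. first_codon_length (1 to 3) is the
    number of bases that belong to the partial leading codon."""
    parts = [sequence[:first_codon_length]]
    parts += [
        sequence[i:i + 3] for i in range(first_codon_length, len(sequence), 3)
    ]
    return " ".join(parts)


mir160_rc = "UGGCAUACAGGGAGCCAGGCA"  # reverse complement of miR160, from Table 2
site = "AGGAAUACAGGGAGCCAGGCA"       # At2g28350 complementary site, from Table 2
row = split_codons(mark_mismatches(mir160_rc, site), first_codon_length=1)
```

Here `row` reproduces the At2g28350 entry, with its two mismatches in lowercase.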
Figure Legends
Figure 1. Antisense Hits Between Arabidopsis miRNAs and Annotated mRNAs
Annotated Arabidopsis mRNAs were searched for sites complementary to 16 Arabidopsis
miRNAs with 0-4 mismatches (solid bars). Identical searches with cohorts of 16 randomized
RNAs were also performed (open bars, mean values from 10 cohorts; error bars, one standard
deviation). Note that two hits by similar miRNAs to the same complementary site within an
mRNA were counted as separate hits (Table 1).
Figure 2. Sequence Context of miRNA Complementary Sites
(A) The four miR165 complementary sites. These complementary sites lie within the START
domain present in a subfamily of HD-Zip transcription factors. The altered protein sequences of
the reported phv and phb gain-of-function alleles are indicated (McConnell et al., 2001). Each of
these lesions also disrupts the miR165 complementary site. Amino acids conserved in a majority
of the proteins are highlighted.
(B) The miR156 complementary sites. All ten predicted targets contain the Squamosa Promoter
Binding (SBP) box, but the complementary sites are downstream of this conserved domain,
within a poorly conserved protein-coding context or the 3'-UTR. Amino acids conserved in a
majority of the proteins are highlighted.
Figure 3. Models for the Biogenesis, Action, and Roles of miRNAs in Plants
(A) Although plant miRNAs are apparently generated through the classical miRNA pathway
(Reinhart et al., 2002), we propose that many act as classical siRNAs, pairing with near-perfect
complementarity to their mRNA targets to specify mRNA cleavage.
(B) Plant miRNAs might target transcription factor mRNAs for cleavage following cell
divisions that require rapid implementation of new transcription factor programs. Following cell
division, the daughter cells inherit transcription factor mRNAs from the precursor cell. At the
onset of differentiation, one daughter might express not only new transcription factor mRNAs
(green) but also miRNAs (red) complementary to mRNAs of key maternal transcription factors
(blue). The miRNAs might direct the cleavage of the inherited transcription factor mRNA,
preventing the inappropriate expression of the transcription factor protein, thus enabling the
rapid differentiation of the daughter cell.
Rhoades et al., Figure 1 [bar chart; x-axis: Stringency of pairing (# of mismatches), 0 to 4; y-axis scale: 0 to 100]
Rhoades et al., Figure 2 [(A) alignment of the miR165 complementary site in REV, ATHB-8, PHV, and PHB, annotated with an 11 aa insertion in a phb gain-of-function allele and a G to D mutation in phv and phb gain-of-function alleles; (B) alignment of the miR156 complementary site in At5g50570, At5g50670, At1g27370, At1g27360, At5g43270, At3g57920, At2g42200, and At1g69170, with the miR156 complementary sites of At2g33810 and At1g53160 shown in the 3'-UTR]
Rhoades et al., Figure 3 [(A) schematic comparing the MicroRNA Pathway (miRNA precursor; miRNAs; short segments of complementarity in 3'-UTR; attenuated translation) with the PTGS/RNAi Pathway (long double-stranded RNA; siRNAs; near-perfect complementarity in coding region or UTR; cleaved mRNA), with many plant miRNAs placed on the cleavage route; (B) precursor cell expressing transcription factor mRNAs, giving rise to daughter cells with distinct transcription factor profiles]
References
Aida, M., Ishida, T., Fukaki, H., Fujisawa, H., and Tasaka, M. (1997). Genes involved in organ
separation in Arabidopsis: an analysis of the cup-shaped cotyledon mutant. Plant Cell 9, 841-857.
The Arabidopsis Genome Initiative. (2000). Analysis of the genome sequence of the flowering
plant Arabidopsis thaliana. Nature 408, 796-815.
Bernstein, E., Denli, A. M., and Hannon, G. J. (2001). The rest is silence. RNA 7, 1509-1521.
Bohmert, K., Camus, I., Bellini, C., Bouchez, D., Caboche, M., and Benning, C. (1998). AGO1
defines a novel locus of Arabidopsis controlling leaf development. EMBO J. 17, 170-180.
Bolle, C., Koncz, C., and Chua, N. H. (2000). PAT1, a new member of the GRAS family, is
involved in phytochrome A signal transduction. Genes Dev. 14, 1269-78.
Cui, Y., Brugiere, N., Jackman, L., Bi, Y.-M., and Rothstein, S. (1999). Structural and
transcriptional comparative analysis of the S locus regions in two self-incompatible Brassica
napus lines. Plant Cell 11, 2217-2231.
Di Laurenzio, L., Wysocka-Diller, J., Malamy, J. E., Pysh, L., Helariutta, Y., Freshour, G., Hahn,
M. G., Feldmann, K. A., and Benfey, P. N. (1996). The SCARECROW gene regulates an
asymmetric cell division that is essential for generating the radial organization of the Arabidopsis
root. Cell 86, 423-33.
Dsouza, M., Larsen, N., and Overbeek, R. (1997). Searching for patterns in genomic data. Trends
in Genetics 13, 497-498.
Elbashir, S., Martinez, J., Patkaniowska, A., Lendeckel, W., and Tuschl, T. (2001). Functional
anatomy of siRNAs for mediating efficient RNAi in Drosophila melanogaster embryo lysate.
EMBO J. 20, 6877-6888.
Gocal, G. F., Sheldon, C. C., Gubler, F., Moritz, T., Bagnall, D. J., MacMillan, C. P., Li, S. F.,
Parish, R. W., Dennis, E. S., Weigel, D., and King, R. W. (2001). GAMYB-like genes,
flowering, and gibberellin signalling in Arabidopsis. Plant Physiol. 127, 1682-1693.
Grishok, A., Pasquinelli, A. E., Conte, D., Li, N., Parrish, S., Ha, I., Baillie, D. L., Fire, A.,
Ruvkun, G., and Mello, C. C. (2001). Genes and mechanisms related to RNA interference
regulate expression of the small temporal RNAs that control C. elegans developmental timing.
Cell 106, 23-34.
Ha, I., Wightman, B., and Ruvkun, G. (1996). A bulged lin-4/lin-14 RNA duplex is sufficient for
Caenorhabditis elegans lin-14 temporal gradient formation. Genes Dev. 10, 3041-3050.
Hamilton, A. J., and Baulcombe, D. C. (1999). A novel species of small antisense RNA in
posttranscriptional gene silencing. Science 286, 950-952.
Helariutta, Y., Fukaki, H., Wysocka-Diller, J., Nakajima, K., Jung, J., Sena, G., Hauser, M. T.,
and Benfey, P. N. (2000). The SHORT-ROOT gene controls radial patterning of the Arabidopsis
root through radial signaling. Cell 101, 555-67.
Hutvágner, G., McLachlan, J., Pasquinelli, A. E., Balint, E., Tuschl, T., and Zamore, P. D.
(2001). A cellular function for the RNA-interference enzyme Dicer in the maturation of the let-7
small temporal RNA. Science 293, 834-838.
Hutvágner, G., and Zamore, P. D. (2002). RNAi: nature abhors a double-strand. Curr. Opin.
Genet. Dev. 12, 225-232.
Jacobsen, S. E., Running, M. P., and Meyerowitz, E. M. (1999). Disruption of an RNA
helicase/RNAseIII gene in Arabidopsis causes unregulated cell division in floral meristems.
Development 126, 5231-5243.
Ketting, R. F., Fischer, S. E. J., Bernstein, E., Sijen, T., Hannon, G. J., and Plasterk, R. H. A.
(2001). Dicer functions in RNA interference and in synthesis of small RNA involved in
developmental timing in C. elegans. Genes Dev. 15, 2654-2659.
Kiss, T. (2002). Small nucleolar RNAs: an abundant group of noncoding RNAs with diverse
cellular functions. Cell 109, 145-148.
Klein, J., Saedler, H., and Huijser, P. (1996). A new family of DNA binding proteins includes
putative transcriptional regulators of the Antirrhinum majus floral meristem identity gene
SQUAMOSA. Mol. Gen. Genet. 250, 7-16.
Knight, S., and Bass, B. (2001). A Role for the RNase III Enzyme DCR-1 in RNA Interference
and Germ Line Development in Caenorhabditis elegans. Science 293, 2269-2271.
Lagos-Quintana, M., Rauhut, R., Lendeckel, W., and Tuschl, T. (2001). Identification of novel
genes coding for small expressed RNAs. Science 294, 853-858.
Lagos-Quintana, M., Rauhut, R., Yalcin, A., Meyer, J., Lendeckel, W., and Tuschl, T. (2002).
Identification of tissue-specific microRNAs from mouse. Curr. Biol. 12, 735-739.
Lai, E. C. (2002). MicroRNAs are complementary to 3'UTR motifs that mediate negative post-transcriptional regulation. Nat. Genet. 30, 363-364.
Lau, N. C., Lim, L. P., Weinstein, E. G., and Bartel, D. P. (2001). An abundant class of tiny
RNAs with probable regulatory roles in Caenorhabditis elegans. Science 294, 858-862.
Lee, R. C., and Ambros, V. (2001). An extensive class of small RNAs in Caenorhabditis
elegans. Science 294, 862-864.
Lee, R. C., Feinbaum, R. L., and Ambros, V. (1993). The C. elegans heterochronic gene lin-4
encodes small RNAs with antisense complementarity to lin-14. Cell 75, 843-854.
Liscum, E., and Reed, J. W. (2002). Genetics of Aux/IAA and ARF action in plant growth and
development. Plant Mol. Biol. 49, 387-400.
Llave, C., Kasschau, K., Rector, M., and Carrington, J. (2002). Endogenous and silencing-associated small RNAs in plants. Plant Cell 14, 1-15.
Matzke, M. A., Matzke, A. J., Pruss, G. J., and Vance, V. B. (2001). RNA-based silencing
strategies in plants. Curr. Opin. Genet. Dev. 11, 221-227.
McConnell, J. R., Emery, J., Eshed, Y., Bao, N., Bowman, J., and Barton, M. K. (2001). Role of
PHABULOSA and PHAVOLUTA in determining radial patterning in shoots. Nature 411, 709-713.
McManus, M. T., Petersen, C. P., Haines, B. B., Chen, J., and Sharp, P. A. (2002). Gene
silencing using micro-RNA designed hairpins. RNA 8, 842-850.
Moss, E., Lee, R., and Ambros, V. (1997). The cold shock domain protein LIN-28 controls
developmental timing in C. elegans and is regulated by the lin-4 RNA. Cell 88, 637-646.
Mourelatos, Z., Dostie, J., Paushkin, S., Sharma, A., Charroux, B., Abel, L., Rappsilber, J.,
Mann, M., and Dreyfuss, G. (2002). miRNPs: a novel class of ribonucleoproteins containing
numerous microRNAs. Genes Dev. 16, 720-728.
Olsen, P. H., and Ambros, V. (1999). The lin-4 regulatory RNA controls developmental timing
in Caenorhabditis elegans by blocking LIN-14 protein synthesis after the initiation of
translation. Dev. Biol. 216, 671-680.
Peng, J., Carol, P., Richards, D. E., King, K. E., Cowling, R. J., Murphy, G. P., and Harberd, N.
P. (1997). The Arabidopsis GAI gene defines a signaling pathway that negatively regulates
gibberellin responses. Genes Dev. 11, 3194-205.
Ray, A., Lang, J. D., Golden, T., and Ray, S. (1996). SHORT INTEGUMENT (SIN1), a gene
required for ovule development in Arabidopsis, also controls flowering time. Development 122,
2631-2638.
Ray, S., Golden, T., and Ray, A. (1996). Maternal effects of the short integument mutation on
embryo development. Dev. Biol. 180, 365-369.
Reinhart, B., Weinstein, E., Rhoades, M., Bartel, B., and Bartel, D. (2002). MicroRNAs in
plants. Genes Dev. 16, 1616-1626.
Reinhart, B. J., Slack, F. J., Basson, M., Bettinger, J. C., Pasquinelli, A. E., Rougvie, A. E.,
Horvitz, H. R., and Ruvkun, G. (2000). The 21 nucleotide let-7 RNA regulates developmental
timing in Caenorhabditis elegans. Nature 403, 901-906.
Riechmann, J. L., Heard, J., Martin, G., Reuber, L., Jiang, C.-Z., Keddie, J., Adam, L., Pineda,
O., Ratcliffe, O. J., Samaha, R. R., Creelman, R., Pilgrim, M., Broun, P., Zhang, J. Z.,
Ghandehari, D., Sherman, B. K., and Yu, G.-L. (2000). Arabidopsis transcription factors:
genome-wide comparative analysis among eukaryotes. Science 290, 2105-2110.
Robinson-Beers, K., Pruitt, R. E., and Gasser, C. S. (1992). Ovule development in wild-type
Arabidopsis and two female-sterile mutants. Plant Cell 4, 1237-1249.
Rogg, L. E., and Bartel, B. (2001). Auxin signaling: derepression through regulated proteolysis.
Dev. Cell 1, 595-604.
Silverstone, A. L., Ciampaglio, C. N., and Sun, T. (1998). The Arabidopsis RGA gene encodes a
transcriptional regulator repressing the gibberellin signal transduction pathway. Plant Cell 10,
155-69.
Slack, F. J., Basson, M., Liu, Z., Ambros, V., Horvitz, H. R., and Ruvkun, G. (2000). The lin-41
RBCC gene acts in the C. elegans heterochronic pathway between the let-7 regulatory RNA and
the LIN-29 transcription factor. Mol. Cell 5, 659-669.
Ulmasov, T., Hagen, G., and Guilfoyle, T. (1999). Dimerization and DNA binding of auxin
response factors. Plant J. 19, 309-319.
Wightman, B., Ha, I., and Ruvkun, G. (1993). Posttranscriptional regulation of the heterochronic
gene lin-14 by lin-4 mediates temporal pattern formation in C. elegans. Cell 75, 855-862.
Yeh, R., Lim, L., and Burge, C. (2001). Computational inference of homologous gene structures
in the human genome. Genome Res. 11, 803-16.
Yu, J., Hu, S., Wang, J., Wong, G. K., Li, S., Liu, B., Deng, Y., Dai, L., Zhou, Y., Zhang, X., et
al. (2002). A draft sequence of the rice genome (Oryza sativa L. ssp. indica). Science 296, 79-92.
Computational identification of plant microRNAs and their targets,
including a stress-induced miRNA
Matthew W. Jones-Rhoades¹ and David P. Bartel¹,²
¹Whitehead Institute for Biomedical Research and Department of Biology, Massachusetts
Institute of Technology, 9 Cambridge Center, Cambridge, Massachusetts 02142
²Correspondence: dbartel@wi.mit.edu
Running title: Plant microRNAs
Key words: computational gene prediction, microRNA regulatory targets, noncoding RNAs
Summary
MicroRNAs (miRNAs) are ~21-nucleotide RNAs, some of which have been shown to play
important gene-regulatory roles during plant development. We developed comparative
genomic approaches to systematically identify both miRNAs and their targets that are
conserved in Arabidopsis thaliana and rice (Oryza sativa). Twenty-three miRNA
candidates, representing seven newly identified gene families, were experimentally
validated in Arabidopsis, bringing the total number of reported miRNA genes to 92,
representing 22 families. Nineteen newly identified target candidates were confirmed by
detecting mRNA fragments diagnostic of miRNA-directed cleavage in plants. Overall,
plant miRNAs have a strong propensity to target genes controlling development,
particularly those of transcription factors and F-box proteins. However, plant miRNAs
have conserved regulatory functions extending beyond development, in that they also
target superoxide dismutases, laccases, and ATP sulfurylases. The expression of miR395,
the sulfurylase-targeting miRNA, increases upon sulfate starvation, showing that miRNAs
can be induced by environmental stress.
Introduction
MicroRNAs are endogenous 20- to 24-nucleotide RNAs, some of which are known to play
important post-transcriptional regulatory roles in plants and animals (Bartel and Bartel, 2003;
Lai, 2003; Bartel, 2004). MicroRNAs are initially transcribed as much longer RNAs that contain
imperfect hairpins, from which the mature miRNA is excised by Dicer-like enzymes (Grishok et
al., 2001; Hutvagner et al., 2001; Ketting et al., 2001; Lee et al., 2002; Park et al., 2002; Reinhart
et al., 2002; Lee et al., 2003). The mature miRNA derives from the double-stranded portion of
the hairpin and is initially excised as a duplex comprising two ~22-nt RNAs, one of which is the
mature miRNA while the other, known as the miRNA*, comes from the opposite arm of the
hairpin (Lau et al., 2001; Reinhart et al., 2002; Khvorova et al., 2003; Lim et al., 2003b; Schwarz
et al., 2003). The miRNA of this miRNA:miRNA* duplex is preferentially loaded into the
RNA-induced silencing complex (RISC; Hammond et al., 2000), where it functions as a guide
RNA to direct the posttranscriptional repression of mRNA targets, while the miRNA* is
degraded (Hutvagner and Zamore, 2002; Mourelatos et al., 2002; Khvorova et al., 2003; Schwarz
et al., 2003).
The primary method of identifying miRNA genes has been to isolate, reverse transcribe, clone,
and sequence small cellular RNAs (Lagos-Quintana et al., 2001; Lau et al., 2001; Lee and
Ambros, 2001; Llave et al., 2002a; Park et al., 2002; Reinhart et al., 2002). However, molecular
cloning is biased towards finding miRNAs that are relatively abundant. In animals, miRNA gene
discovery by molecular cloning has been supplemented by systematic computational approaches
that identify evolutionarily conserved miRNA genes by searching for patterns of sequence and
secondary structure conservation that are characteristic of metazoan miRNA hairpin precursors
(Ambros et al., 2003; Grad et al., 2003; Lai et al., 2003; Lim et al., 2003a; Lim et al., 2003b).
The most sensitive of these methods indicate that miRNAs constitute nearly 1% of all predicted
genes in nematodes, flies, and mammals (Lai et al., 2003; Lim et al., 2003a; Lim et al., 2003b).
Methods developed in one animal lineage work well when extended to another animal lineage
(Lim et al., 2003a), but cannot be directly applied to plants because the hairpins of plant
miRNAs are more heterogeneous than those of animal miRNAs (Reinhart et al., 2002).
Because the miRNAs recognize their regulatory targets through base pairing, computational
methods have been invaluable for identifying these targets. The extensive complementarity
between plant miRNAs and mRNAs makes systematic target identification easier in plants than
in animals (Rhoades et al., 2002). A search for targets of 13 Arabidopsis miRNA families
predicted 49 unique targets, with a signal-to-noise ratio exceeding 10:1, simply by looking for
Arabidopsis messages with three or fewer mismatches (Rhoades et al., 2002). Evolutionary
conservation of the miRNA:mRNA pairing in rice (Rhoades et al., 2002), together with
experimental evidence showing that these miRNAs direct cleavage of their predicted mRNA
targets (Llave et al., 2002b; Kasschau et al., 2003; Palatnik et al., 2003; Tang et al., 2003;
Mallory et al., 2004; Vazquez et al., 2004) supports the validity of these predictions. Because
metazoan miRNAs only rarely recognize their targets with such extensive complementarity
(Yekta et al., 2004), more sophisticated methods that search for short segments of conserved
complementarity to the miRNAs are required to identify metazoan miRNA targets (Enright et al.,
2003; Lewis et al., 2003; Stark et al., 2003).
The previously identified plant miRNAs have a remarkable propensity to target genes involved
in development, particularly those of transcription factors (Rhoades et al., 2002). In all cases
where disruption of plant miRNA regulation has been reported, striking developmental
abnormalities are observed. Dominant gain-of-function mutations in HD-ZIP transcription factor
genes PHABULOSA, PHAVOLUTA, and REVOLUTA that destabilize pairing to miR165/miR166
cause loss of adaxial/abaxial polarity in developing leaves (McConnell et al., 2001; Rhoades et
al., 2002; Emery et al., 2003; Kidner and Martienssen, 2003). In maize, similar mutations in the
HD-ZIP gene ROLLED LEAF1 also cause adaxialization of the abaxial surface of leaves,
indicating that the miR165/miR166 family has a conserved role in determining leaf polarity
despite the morphological differences between Arabidopsis and maize leaves (Juarez et al.,
2004). Transgenic plants with silent mutations in the miR-JAW complementary sites of TCP
transcription factors arrest as seedlings with fused cotyledons and lack shoot apical meristems,
while those with mutations in the miR159 complementary site of MYB33 have upwardly curled
leaves (Palatnik et al., 2003). Plants deficient in miR172-mediated regulation of APETALA2
have altered patterns of floral organ development (Chen, 2004). Plants deficient in miR164-mediated regulation of CUP-SHAPED COTYLEDON1 have altered patterns of embryonic,
vegetative, and floral development (Mallory et al., 2004). Finally, silent mutations in the
miR168 complementary site of ARGONAUTE1 lead to misregulation of miRNA targets and
numerous developmental defects (Vaucheret et al., 2004).
To gain a more complete understanding of plant miRNAs and their regulatory targets, we
devised a computational procedure to identify conserved miRNA genes that were missed in
previous cloning efforts, and we refined our computational method for identifying mRNA targets
to increase its sensitivity. Using criteria that retain all 11 of the previously identified miRNA
gene families conserved between Arabidopsis thaliana and Oryza sativa, we found 13 additional
families of candidates. Molecular evidence showed that at least seven of these newly identified
families of candidate miRNAs are authentic, and that at least six out of the seven mediate the
cleavage of their predicted mRNA targets. These seven newly identified families are represented
by 23 loci. When these are added to those identified by cloning, we count 92 miRNA loci in the
Arabidopsis genome. Our updated analysis of the plant miRNA targets indicates a continued
very strong overall bias toward transcription factors and genes involved in development. Some
targets of the newly identified miRNAs, such as F-box proteins and GRL transcription factors,
represent genes with demonstrated or probable roles in controlling developmental processes.
Nonetheless, other newly identified miRNA targets, such as ATP sulfurylases, laccases, and
superoxide dismutases, show that the range of functionalities regulated by miRNAs is broader
than previously known. Furthermore, the expression of miR395, which targets genes involved in
sulfate assimilation, is responsive to the sulfate concentration of the growth media,
demonstrating that miRNA expression can be modulated by levels of external metabolites.
Results
Identification of 20mers in conserved miRNA-like hairpins
Our computational approach to identify plant miRNAs was based upon six characteristics that
describe previously known plant miRNAs. 1) The base pairing of the mature miRNA to its
miRNA* within the hairpin precursors is relatively consistent. In contrast, both the size of the
foldback and the extent of base pairing outside of the immediate vicinity of the miRNA are
highly variable among the hairpins of plant miRNAs, even among those of miRNAs from the
same gene family. 2) The majority of known Arabidopsis miRNAs have identifiable homologs
in the Oryza sativa genome, in which the predicted mature Oryza miRNAs have 0-2 base
substitutions relative to their Arabidopsis homologs. 3) The secondary structures of known
miRNA hairpins are robustly predicted by RNAfold if given a sequence sufficiently long to
contain both the miRNA and the miRNA*. 4) The sequences of the Arabidopsis and Oryza
hairpins are generally more conserved in the miRNA and miRNA* than in the segment joining
the miRNA and miRNA*. 5) All matches to known miRNAs in the Arabidopsis genome, with
the exception of those antisense to coding regions, have potential miRNA-like hairpins and are
thus annotated as miRNA genes. 6) Most known Arabidopsis miRNAs are highly
complementary to target mRNAs, and this complementarity is conserved to Oryza.
As the first step to identifying miRNAs in the genomes of Arabidopsis thaliana and Oryza
sativa, we considered only those genomic portions contained in imperfect inverted repeats as
defined by EINVERTED (Figure 1a, step 1). Within these 133,864 Arabidopsis and 410,167
Oryza inverted repeats were 73 of 86 reference set loci corresponding to the 24 previously
reported miRNAs (refset1, Table S1). Secondary structures for the inverted repeats were
predicted with RNAfold, and all 20mers within the inverted repeats were checked against
MIRcheck, an algorithm written to identify 20mers with the potential to encode miRNAs (Figure
1a, step 2). MIRcheck takes as input a) the sequence of a putative miRNA hairpin, b) a
secondary structure of the putative hairpin, and c) a 20mer sequence within the hairpin to be
considered as a potential miRNA. MIRcheck takes into account the total number of unpaired
nucleotides (no more than 4 in the putative miRNA), the number of bulged or asymmetrically
unpaired nucleotides (no more than 1 in the putative miRNA), the number of consecutive
unpaired nucleotides (no more than 2 in the putative miRNA) and the length of the hairpin (at
least 60 nucleotides inclusive of the putative miRNA and miRNA*). In contrast to the
algorithms designed to identify metazoan miRNAs, MIRcheck has no requirements pertaining to
the pattern or extent of base pairing in other parts of the predicted secondary structure. Even
though these parameters were chosen to be relatively stringent, only 7 of the 73 remaining
Arabidopsis and Oryza refset1 loci were lost at this step.
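
The MIRcheck criteria described above can be illustrated with a minimal Python sketch. It is not the MIRcheck implementation: it assumes the secondary structure is given in dot-bracket notation (as RNAfold prints it), it ignores the nucleotide sequence, and the way it counts "asymmetrically unpaired" nucleotides (the difference between the unpaired gaps on the two strands) is our assumption. The names `pair_table` and `mircheck_like` are hypothetical.

```python
def pair_table(structure):
    """Map each paired position to its partner index (dot-bracket notation)."""
    stack, partner = [], {}
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            j = stack.pop()
            partner[i], partner[j] = j, i
    return partner


def mircheck_like(structure, start, length=20, max_unpaired=4,
                  max_asymmetric=1, max_run=2, min_hairpin=60):
    """Apply the four cutoffs from the text to one candidate 20mer."""
    partner = pair_table(structure)
    window = range(start, start + length)

    # Total unpaired nucleotides and longest run of consecutive unpaired ones.
    unpaired = sum(1 for i in window if i not in partner)
    run = longest = 0
    for i in window:
        run = run + 1 if i not in partner else 0
        longest = max(longest, run)

    paired = [i for i in window if i in partner]
    if not paired:
        return False

    # Assumed definition of asymmetry: for neighbouring paired positions,
    # compare the unpaired gap on this strand with the gap on the other strand.
    asymmetric = 0
    for a, b in zip(paired, paired[1:]):
        own_gap = b - a - 1
        opposite_gap = abs(partner[a] - partner[b]) - 1
        asymmetric += abs(own_gap - opposite_gap)

    # Hairpin length, inclusive of the candidate and its paired partner strand.
    ends = list(window) + [partner[i] for i in paired]
    hairpin_len = max(ends) - min(ends) + 1

    return (unpaired <= max_unpaired and asymmetric <= max_asymmetric
            and longest <= max_run and hairpin_len >= min_hairpin)
```

Note that the sketch, like the procedure in the text, places no requirement on base pairing outside the candidate 20mer and its partner strand.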
After removal of 20mers that overlap with repetitive elements, or which have highly biased
sequence compositions, 389,648 Arabidopsis 20mers (AtSet1) and 1,721,759 Oryza 20mers
(OsSet1) had at least 1 locus that passed MIRcheck. We used Patscan to identify 20mers in
AtSet1 that matched at least one 20mer in OsSet1 with 0-2 base substitutions, considering only
20mers on the same arm of their putative hairpins (Figure 1a, step 3). 3,851 Arabidopsis 20mers
had at least 1 Oryza match (AtSet2), and 5,438 Oryza 20mers were matched at least once
(OsSet2).
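
The cross-species matching in step 3 amounts to a substitution-only comparison restricted to 20mers on the same hairpin arm. The sketch below is a naive stand-in for the Patscan search, not the procedure actually run; the tuple layout `(sequence, arm)` and the function names are assumptions made for illustration.

```python
def substitutions(a, b):
    """Count base substitutions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def conserved_20mers(at_set, os_set, max_subs=2):
    """Return Arabidopsis 20mers with at least one Oryza match.

    at_set, os_set: lists of (sequence, arm) tuples, where arm is a label
    such as "5p" or "3p" for the hairpin arm carrying the 20mer.
    """
    matches = {}
    for at_seq, at_arm in at_set:
        hits = [os_seq for os_seq, os_arm in os_set
                if os_arm == at_arm
                and substitutions(at_seq, os_seq) <= max_subs]
        if hits:
            matches[(at_seq, at_arm)] = hits
    return matches
```

An all-against-all loop like this would be far too slow at the scale of AtSet1 and OsSet1; it only shows the matching rule.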
For the previously known plant miRNAs, RNAfold predicts a secondary structure in
which the miRNA is paired to the miRNA*, provided that the flanking sequence is sufficiently
long to contain the miRNA*. The presence of additional flanking sequence does not interfere
with the prediction of a miRNA-like secondary structure. This robustly predicted folding is
observed for all of the loci of each cloned miRNA, even though they have widely divergent
flanking sequences. While recognizing that the predicted folds are unlikely to be correct in all
their details, it is reasonable to propose that the overall robustness of the predicted folding might
reflect an evolutionary optimization for defined folding in the plant. To eliminate candidates that
do not fold as robustly as the previously known miRNAs, we required AtSet2 and OsSet2 20mers
to pass MIRcheck a second time after being computationally folded in the context of sequences
flanking the hairpin. Patscan was used to find all matches of AtSet2 and OsSet2 to their
respective genomes, RNAfold was used to predict the secondary structure of each match in the
context of a 500 nt genomic sequence centered on the 20mer, and each match was evaluated by
MIRcheck (Figure 1a, step 4). 2,588 Arabidopsis 20mers (AtSet3) and 3,083 Oryza 20mers
(OsSet3) had at least one locus that passed MIRcheck. Because EINVERTED misses some
hairpins and because this second MIRcheck evaluation used more relaxed cutoffs (up to 6
unpaired nt each in the putative miRNA and miRNA*), this step also recovered paralogs that
were missed in steps 1 or 2.
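
As a small illustration of the refolding step, the sketch below extracts the 500-nt genomic context centered on a 20mer match, which would then be passed to a folding program. How the window is clipped at the ends of a chromosome is our assumption, and `centered_window` is a hypothetical helper name.

```python
def centered_window(genome, start, kmer_len=20, window=500):
    """Return (subsequence, offset) for a window centered on a k-mer.

    genome: chromosome sequence as a string
    start:  0-based start of the k-mer in the genome
    offset: position of the k-mer inside the returned subsequence
    """
    flank = (window - kmer_len) // 2
    lo = max(0, start - flank)                        # clip at chromosome start
    hi = min(len(genome), start + kmer_len + flank)   # clip at chromosome end
    return genome[lo:hi], start - lo
```

The returned offset lets the caller locate the 20mer within the folded sequence when evaluating it with the relaxed cutoffs.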
The genomic matches to known Arabidopsis miRNAs are all either in hairpins or antisense to
coding regions. To ensure that computationally identified miRNAs met this criterion,
Arabidopsis 20mers were removed from the analysis if less than 50% of intergenic matches
passed MIRcheck, or if more than 50% of genomic matches overlapped with repetitive sequence
elements (Figure 1a, step 5), resulting in 2,506 20mers (AtSet4). Because gene annotation in
Oryza is poor, we could not reliably define matches as genic or intergenic. The 2,780 Oryza
20mers that had at least 1 locus pass MIRcheck and had no more than 50% of genomic matches
in repetitive sequence elements were included in OsSet4.
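
The two 50% cutoffs applied to Arabidopsis 20mers in step 5 can be written as a short filter. This is a sketch under stated assumptions: each genomic match is represented by a dictionary with hypothetical boolean keys, and a 20mer with no intergenic matches is kept, a case the text does not address.

```python
def keep_arabidopsis_20mer(matches):
    """Decide whether a 20mer survives the step 5 filter.

    matches: list of dicts, one per genomic match, with boolean keys
    "intergenic", "passes_mircheck", and "in_repeat" (hypothetical names).
    """
    if not matches:
        return False

    # Remove if fewer than 50% of intergenic matches pass MIRcheck.
    intergenic = [m for m in matches if m["intergenic"]]
    if intergenic:
        frac_pass = sum(m["passes_mircheck"] for m in intergenic) / len(intergenic)
        if frac_pass < 0.5:
            return False

    # Remove if more than 50% of all genomic matches overlap repeats.
    frac_repeat = sum(m["in_repeat"] for m in matches) / len(matches)
    return frac_repeat <= 0.5
```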
The next step in our analysis was to identify pairs of Arabidopsis and Oryza hairpins that
have miRNA-like patterns of sequence conservation (Figure 1a, step 6). MicroRNA precursors
are generally most conserved in the miRNA:miRNA* portion of the hairpin, a characteristic that
has been used to help identify insect miRNA genes (Lai et al., 2003). In our procedure, we
retained homologous pairs for which both the miRNA and miRNA* 20mers were more
conserved than any 20mer from the loop regions. Doing pairwise comparisons of the hairpins of
AtSet4 against those of OsSet4 resulted in 1,145 20mers (AtSet5) with at least 1 acceptable
Oryza homolog.
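
The conservation criterion of step 6 can be sketched as a comparison of per-window identities. The code below assumes the two hairpins have already been aligned to equal length (with "-" for gaps, which never count as identities) and that the coordinates of the miRNA, the miRNA*, and the loop are known; the alignment method and the function names are assumptions, not details given in the text.

```python
def window_identity(a, b, start, k=20):
    """Number of identical, non-gap columns in a k-column window."""
    return sum(x == y and x != "-"
               for x, y in zip(a[start:start + k], b[start:start + k]))


def mirna_like_conservation(at_aln, os_aln, mirna_start, star_start,
                            loop_start, loop_end, k=20):
    """True if both the miRNA and miRNA* windows are more conserved than
    every k-column window lying entirely within the loop region
    [loop_start, loop_end)."""
    mirna = window_identity(at_aln, os_aln, mirna_start, k)
    star = window_identity(at_aln, os_aln, star_start, k)
    loop_best = max(
        (window_identity(at_aln, os_aln, s, k)
         for s in range(loop_start, loop_end - k + 1)),
        default=0,  # assumption: a loop shorter than k contributes no window
    )
    return min(mirna, star) > loop_best
```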
AtSet5 was mapped to the Arabidopsis genome, and overlapping 20mers were joined together to
form 379 sequences with miRNA encoding potential. A single miRNA gene could be
represented by up to four of these potential miRNA sequences, representing the miRNA, the
miRNA*, the antisense miRNA, and the antisense miRNA*. After accounting for multiple
potential miRNAs mapping to a single locus, the 379 potential miRNAs represented 228
potential miRNA loci. These 228 loci were grouped into 118 families of potential miRNA loci
based on sequence similarity as determined by blastn. Many of these newly identified
miRNA candidates had patterns of secondary structure conservation resembling those of
previously known plant miRNAs (Figure 1b,c). For many of the miRNA loci corresponding to
previously reported miRNAs, the computationally identified sequences extended 1-9 nt on either
side of the cloned miRNAs, although in a few cases the actual miRNA overlapped with but
extended beyond the predicted sequence.
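
Joining overlapping 20mers into longer candidate sequences is an interval merge. The following sketch assumes each mapped 20mer is described by a chromosome, a strand, and a 0-based start, and that only 20mers on the same chromosome and strand are joined; these representation choices and the name `join_overlapping` are ours.

```python
def join_overlapping(hits, k=20):
    """Merge overlapping k-mers into (chrom, strand, start, end) intervals.

    hits: iterable of (chrom, strand, start) tuples; end is exclusive.
    """
    merged = []
    for chrom, strand, start in sorted(hits):
        end = start + k
        last = merged[-1] if merged else None
        if last and last[0] == chrom and last[1] == strand and start < last[3]:
            # Overlaps the previous interval: extend it.
            merged[-1] = (chrom, strand, last[2], max(last[3], end))
        else:
            merged.append((chrom, strand, start, end))
    return merged
```

Because the two strands are kept separate, a single hairpin can still yield several merged sequences, consistent with the up to four potential miRNA sequences per gene noted in the text.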
A refined procedure for predicting miRNA targets
We previously identified mRNAs containing ungapped, antisense matches to miRNAs with 0-3
mispairs (counting G:U pairs as mispairs) as probable miRNA targets (Rhoades et al., 2002).
Although the majority of validated plant miRNA targets are captured by this cutoff, there are
several authentic targets which are missed. For example, miR162 has a bulged nucleotide as it
basepairs to the mRNA of DCL1, and miR-JAW has 4-5 mispairs to the mRNAs of several TCP
transcription factors (Palatnik et al., 2003; Xie et al., 2003). In order to more thoroughly assess
the mRNA targeting potential of both known and predicted miRNAs, we developed a more
sensitive computational approach to identify target candidates. It allows for gaps and more
mismatches in the mRNA:miRNA duplex but requires that the miRNA complementarity be
conserved between homologous Arabidopsis and Oryza mRNAs. Each miRNA complementary
site was scored, with perfect matches given a score of 0, and points were added for each G:U
wobble (0.5 points), each non-G:U mismatch (1 point) and each bulged nucleotide in the miRNA
or target strand (2 points). To allow the same cutoffs to be applied more evenly to miRNAs of
different lengths and to avoid penalizing mismatches at the ends of longer miRNAs, those
miRNAs that were longer than 20 nt were broken into overlapping 20mers, with the
mRNA:miRNA pair receiving the score of the most favorable 20mer.
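
The scoring scheme can be made concrete with a short sketch. It scores a duplex that has already been aligned, rather than searching for sites, and it assumes the miRNA is written 5' to 3', the target site 3' to 5', and "-" marks a column where the other strand has a bulged nucleotide. The input format, the handling of windows for miRNAs longer than 20 nt, and the function names are assumptions made for illustration.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def column_penalty(m, t):
    """Penalty for one alignment column (miRNA base m, target base t)."""
    if m == "-" or t == "-":
        return 2.0   # bulged nucleotide in either strand
    if (m, t) in WATSON_CRICK:
        return 0.0
    if (m, t) in WOBBLE:
        return 0.5   # G:U wobble
    return 1.0       # any other mismatch


def site_score(mirna_aln, target_aln, k=20):
    """Score an aligned miRNA:mRNA duplex; lower is better, 0 is perfect.

    For miRNAs longer than k nt, every window spanning k consecutive
    miRNA nucleotides is scored and the most favorable score is returned.
    """
    cols = list(zip(mirna_aln, target_aln))
    nt_cols = [i for i, (m, _) in enumerate(cols) if m != "-"]
    if len(nt_cols) <= k:
        return sum(column_penalty(m, t) for m, t in cols)
    best = None
    for first in range(len(nt_cols) - k + 1):
        lo, hi = nt_cols[first], nt_cols[first + k - 1]
        score = sum(column_penalty(m, t) for m, t in cols[lo:hi + 1])
        best = score if best is None else min(best, score)
    return best
```

With this scheme, a 21-nt miRNA whose only mismatch is at its last nucleotide scores 0, which is the stated purpose of taking the most favorable 20mer.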
This scoring was tested using a set of 10 unrelated miRNAs that are highly conserved (0-1
substitutions) between Arabidopsis and Oryza (refset2, Table S1). As a control, we generated 5
cohorts of permuted miRNAs, in which each permuted miRNA has the same dinucleotide
composition as the corresponding miRNA in refset2. For all 20mers from the sets of real and
permuted miRNAs we searched for complementary sites in Arabidopsis and Oryza mRNAs.
Compared to their shuffled cohorts, the real miRNAs had many more complementary
Arabidopsis mRNAs with scores < 2 (Figure 2a), which was in agreement with our previous
results (Rhoades et al., 2002). Filtering the miRNA-complementary mRNAs to include only
those conserved to Oryza showed that nearly all the complementary sites to authentic miRNAs
with scores of < 2 are conserved (Figure 2b). For the permuted miRNAs, requiring conservation
reduced to nearly zero the number of complementary sites with scores of 2-3.5, whereas for the
authentic miRNAs a small but significant number of sites scoring in this range were conserved
(Figure 2b). Thus, adding a requirement for conservation raised the threshold at which spurious
matches were found, thereby enabling confident prediction of targets that were less extensively
paired to the miRNAs - in some cases forming Watson-Crick pairs to only 15 of 20 miRNA
nucleotides.
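
The comparison between authentic and permuted miRNAs can be summarized as a signal-to-noise ratio at a given score cutoff. The sketch below assumes that "noise" is the mean number of sites found per permuted cohort and that a site counts when its score is at or below the cutoff; both are our assumptions, and the generation of dinucleotide-preserving permutations is not shown.

```python
def signal_to_noise(real_scores, cohort_scores, cutoff):
    """Ratio of sites found for real miRNAs to the mean found per cohort.

    real_scores:   list of site scores for the authentic miRNAs
    cohort_scores: list of lists, one list of site scores per permuted cohort
    cutoff:        maximum score for a site to be counted
    """
    signal = sum(s <= cutoff for s in real_scores)
    noise = sum(sum(s <= cutoff for s in cohort)
                for cohort in cohort_scores) / len(cohort_scores)
    return signal / noise if noise else float("inf")
```

Applying the same function to scores retained after the conservation filter would show how the usable cutoff shifts once conservation is required.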
Each of the conserved miRNAs had at least one predicted target with score < 3.0,
suggesting that the possession of predicted targets could be a criterion for screening the newly
identified miRNA candidates. For each 20mer in AtSet5 and OsSet5, miRNA complementary
sites were found and scored (Figure 1a, step 7). As would be expected even for permuted
sequences, nearly all of the AtSet5 20mers (1,124 out of 1,145) had a complementarity score
within the 3.0 cutoff for at least 1 Arabidopsis mRNA. Of these, 278 20mers (AtSet6) had at least one homologous
Oryza 20mer with complementarity to a homologous Oryza mRNA. AtSet6 represented 24
families of potential miRNAs, which account for 100 potential miRNA loci. Eleven of these
families, represented by 60 loci (including 41 refset1 loci), corresponded to all previously known
miRNA families with identifiable Oryza homologs, suggesting that our method also identified
most of the previously unknown families that have extensive conserved complementarity in
Oryza.
Newly identified miRNAs are expressed
Our computational screen identified 13 previously unreported families of conserved miRNA
candidates with conserved complementarity to mRNAs. To determine which of these putative
miRNAs are expressed, we used a PCR based assay (Lim et al., 2003a; Lim et al., 2003b) to
search for the predicted miRNAs in a library of small cDNAs (Reinhart et al., 2002). In addition
to verifying the expression of the miRNAs, this assay maps the 5' ends of the miRNAs (Table 2).
Each PCR reaction used one common primer corresponding to the adaptor oligo attached to the
5' end of all members of the library and one primer specific to the 3' portion of the predicted
miRNA. For seven miRNA families, PCR reactions resulted in products in which the specific
primer was extended by at least 3 nucleotides that matched the predicted miRNA sequence. In
sum, the seven newly identified miRNA families comprised 23 genomic loci in Arabidopsis
(Table 2). All clones for families 393, 396, 397, and 398 had the same 5' end, while for families
394, 395, and 399 miRNAs were detected with differing 5' ends that could result from
inconsistent processing of precursor transcripts from a single locus, or from differential
processing of precursors from different loci. Several of these miRNA families include loci that
would encode distinct but highly similar miRNAs (Table 2). Because the PCR primers
overlapped with the residues that differ, it is not possible to know which variants were detected.
Six families of putative miRNAs passed all computational checks but were not validated by the
PCR assay. Five of these families had a single locus in Arabidopsis, whereas the sixth had 14
Arabidopsis loci and 52 Oryza loci and likely represented a repetitive element not identified by
RepeatMasker. Although the possibility that some of these non-validated predicted candidates
are authentic cannot be ruled out, we consider it unlikely that they represent miRNA sequences.
The expression of newly identified miRNAs was also tested by Northern blot analysis.
Hybridization probes were designed for representative members of the 7 miRNA families
detected by the PCR assay. Probes complementary to miR393, miR394, miR396a, miR398b
detected 20-21 nt RNAs in samples from wild-type, soil-grown Columbia plants (Figure 3a),
whereas probes complementary to miR395a, miR397b, and miR399b did not detect expressed
small RNAs in these samples. These miRNAs that are difficult to detect on a Northern blot are
likely to be expressed only at low levels or only in a subset of tissues or growth conditions.
Because miR395 is complementary to mRNAs of ATP sulfurylase (APS) proteins (Figure 5),
and because the expression levels of numerous sulfate metabolizing genes are responsive to
sulfate levels (Takahashi et al., 1997; Lappartient et al., 1999; Maruyama-Nakashita et al., 2003),
we hypothesized that the expression of miR395 might be dependent on cellular sulfate levels. To
test this, we probed RNA samples from plants grown in modified MS media containing various
amounts of sulfate. As seen for plants grown in soil, miR395 was not detected in the samples
from plants grown in 2 mM SO₄²⁻. However, miR395 was readily detected in the samples from plants
grown in very low sulfate (Figure 3b, 0.2 or 0.02 mM SO₄²⁻). Induction of miR395 by low external
sulfate concentrations is somewhat reminiscent of the starvation-associated miR-234 increase
that has been observed in nematodes (Lim et al., 2003b), although the miR395 induction (greater
than 100 fold) is much more striking than that of miR-234 (twofold). We examined whether
APS1 expression changed in the conditions that induced miR395, and found that its expression
decreased when miR395 increased, as would be expected if APS1 was a cleavage target of
miR395 (Figure 3c).
Experimental verification of miRNA targets
MicroRNAs, like small interfering RNAs (siRNAs (Elbashir et al., 2001)), can direct the
cleavage of their mRNA targets when these messages have extensive complementarity to the
miRNAs (Hutvagner and Zamore, 2002; Llave et al., 2002b; Tang et al., 2003; Yekta et al.,
2004). This miRNA-directed cleavage can be detected by using a modified form of 5'-RACE
(rapid amplification of cDNA ends) because the 3' product of the cleavage has two diagnostic
properties: 1) a 5' terminal phosphate, making it a suitable substrate for ligation to an RNA
adaptor using T4 RNA ligase, and 2) a 5' terminus that maps precisely to the nucleotide that
pairs with the tenth nucleotide of the miRNA (Llave et al., 2002b; Kasschau et al., 2003). To
examine whether any of the newly identified miRNAs can direct cleavage of their predicted
targets in vivo, we isolated RNA from vegetative and floral tissues and performed the 5'-RACE
procedure using primers specific to the predicted targets. For 19 predicted targets the 5'-RACE
PCR yielded a distinct band of the predicted size on an agarose gel, which was isolated, cloned
and sequenced. In all 19 cases the most common 5' end of the mRNA fragment mapped to the
nucleotide that pairs to the tenth nucleotide of one of the miRNAs validated by PCR (Figure 4),
indicating cleavage at sites precisely analogous to those seen for other miRNA targets (Llave et
al., 2002b; Aukerman and Sakai, 2003; Kasschau et al., 2003; Palatnik et al., 2003; Xie et al.,
2003; Vazquez et al., 2004; Mallory et al., 2004), as well as for RNAs complementary to
siRNAs and metazoan miRNAs (Elbashir et al., 2001; Hutvagner and Zamore, 2002; Yekta et
al., 2004). These observations also corroborate the 5' ends of the miRNAs as mapped by PCR
(Table 2).
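The mapping rule behind this assay can be written out explicitly. The following is a minimal sketch, assuming an ungapped, antiparallel miRNA:mRNA duplex; the function names and the toy mRNA (a perfectly complementary miR393 site embedded in poly(A)) are hypothetical illustrations, not part of the analysis described here.

```python
# Minimal sketch: where should the 5' end of a miRNA-directed cleavage
# fragment map? Assumes an ungapped, antiparallel miRNA:mRNA duplex.

RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence (Watson-Crick only)."""
    return rna.upper().translate(RNA_COMPLEMENT)[::-1]


def expected_fragment_start(site_start: int, mirna_len: int) -> int:
    """1-based mRNA coordinate expected for the 5' end of the 3' cleavage product.

    site_start is the 1-based mRNA position of the 5'-most nucleotide of the
    complementary site. In an ungapped duplex, miRNA nucleotide i pairs with
    site nucleotide (mirna_len - i + 1), so miRNA nucleotide 10 pairs with
    mRNA position site_start + mirna_len - 10.
    """
    return site_start + mirna_len - 10


def clone_supports_cleavage(clone_5p_end: int, site_start: int, mirna_len: int) -> bool:
    """True if a cloned 5'-RACE end maps exactly to the expected position."""
    return clone_5p_end == expected_fragment_start(site_start, mirna_len)


# Hypothetical example: a perfect miR393 site placed at mRNA position 100.
mirna = "UCCAAAGGGAUCGCAUUGAUC"
site = reverse_complement(mirna)
mrna = "A" * 99 + site + "A" * 50
site_start = mrna.index(site) + 1
cut = expected_fragment_start(site_start, len(mirna))
print(site_start, cut, mrna[cut - 1])
```

In this toy case the expected fragment begins at the mRNA nucleotide that pairs with miRNA nucleotide 10; real sites with mismatches or bulges would need an explicit alignment rather than this offset arithmetic.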
Identification of miRNA paralogs
Our computational approach found 81 miRNA loci from 18 miRNA families (Table 1, Table
S2). We searched for additional members of these families by searching the Arabidopsis
genome for near matches (0-3 mismatches) to the miRNAs of these 81 loci (Figure 1a, step 9). After manual
inspection for potential hairpin-like secondary structures, this identified six additional loci in
miRNA families that are conserved to Oryza. Together with the five loci in miRNA families
without apparent Oryza homologs, this brings to 92 the total number of Arabidopsis loci that
meet the criteria for designation as miRNA genes (Ambros et al., 2003) (Table S2). As is
generally the case with computational gene prediction, some of these might be pseudogenes.
Our de-novo miRNA-finding algorithm found 88% of these, and 93% of those with Oryza
homologs. These Arabidopsis genes correspond to 122 Oryza miRNA genes, of which 111
(91%) were found de-novo by our algorithm (Figure 1a, step 9; Table S3).
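A near-match search of this kind can be illustrated with a brute-force scan. The sketch below assumes ungapped matches with up to three mismatches on either strand; the function names and the toy "genome" are hypothetical, and the manual inspection of hairpin structure that followed the search is not modeled.

```python
# Minimal sketch: find ungapped near matches (<= 3 mismatches) to a miRNA
# on both strands of a DNA sequence. Brute force, for illustration only.

DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.upper().translate(DNA_COMPLEMENT)[::-1]


def hamming(a: str, b: str) -> int:
    """Number of mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def near_matches(genome: str, mirna_rna: str, max_mismatches: int = 3):
    """Return (0-based start, strand, mismatches) for every near match."""
    query = mirna_rna.upper().replace("U", "T")
    genome = genome.upper()
    k = len(query)
    hits = []
    for strand, q in (("+", query), ("-", reverse_complement(query))):
        for i in range(len(genome) - k + 1):
            d = hamming(genome[i:i + k], q)
            if d <= max_mismatches:
                hits.append((i, strand, d))
    return hits


# Hypothetical toy sequence: one exact copy on the plus strand and one copy
# with two substitutions on the minus strand.
mir393 = "UCCAAAGGGAUCGCAUUGAUC"
exact = "TCCAAAGGGATCGCATTGATC"
variant = "ACCAAAGGGATCGCATTGATG"
genome = "GGGG" + exact + "A" * 30 + reverse_complement(variant) + "GGGG"
hits = near_matches(genome, mir393)
print(hits)
```

A genome-scale search would use an indexed aligner rather than this quadratic scan; the sketch only makes the matching criterion concrete.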
As has been previously observed for numerous animal miRNAs (Lagos-Quintana et al., 2001;
Lau et al., 2001), we find that some plant miRNA genes are clustered in the genome, most
strikingly the genes of the 395 family. In Arabidopsis, miRNAs of the 395 family are located in
two clusters, each containing three hairpins within 4 kb (Figure S1a). In each cluster, two
MIR395 hairpins are on one strand while the third is on the opposite. Thus each cluster could
not be expressed as a single primary transcript, but could be expressed as two transcripts sharing
common regulatory elements. The Oryza MIR395 hairpins are also clustered, but with a
different arrangement than in Arabidopsis. The two largest Oryza MIR395 clusters contain seven
and six hairpins, respectively, within 1 kb, with all hairpins encoded on the same strand of DNA
(Figure S1b). These clusters are likely expressed as transcripts containing multiple miRNAs, an
idea supported by Oryza EST CA764701, which contains four miR395 hairpins.
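The reasoning about clusters and strands can be sketched as a simple grouping step. In the example below the coordinates and strand assignments are invented placeholders (only the chromosome and the two-plus-one strand pattern follow the text), and the 4 kb window and function names are assumptions made for illustration.

```python
# Minimal sketch: group hairpin loci into clusters by distance, then ask
# whether each cluster could be a single primary transcript (one strand).
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Locus:
    name: str
    chrom: str
    start: int   # hypothetical coordinate
    strand: str  # "+" or "-"


def cluster_loci(loci: List[Locus], max_span: int = 4000) -> List[List[Locus]]:
    """Group loci on one chromosome that start within max_span of the cluster's first locus."""
    clusters: List[List[Locus]] = []
    for locus in sorted(loci, key=lambda l: (l.chrom, l.start)):
        first = clusters[-1][0] if clusters else None
        if first and first.chrom == locus.chrom and locus.start - first.start <= max_span:
            clusters[-1].append(locus)
        else:
            clusters.append([locus])
    return clusters


def single_transcript_possible(cluster: List[Locus]) -> bool:
    """A single primary transcript requires all hairpins on the same strand."""
    return len({l.strand for l in cluster}) == 1


# Placeholder coordinates and strands, chosen only to mimic the arrangement
# described for the Arabidopsis MIR395 clusters.
loci = [
    Locus("MIR395a", "1", 1000, "+"),
    Locus("MIR395b", "1", 2500, "+"),
    Locus("MIR395c", "1", 3800, "-"),
    Locus("MIR395d", "1", 900000, "-"),
    Locus("MIR395e", "1", 901200, "+"),
    Locus("MIR395f", "1", 903500, "+"),
]
clusters = cluster_loci(loci)
for c in clusters:
    print([l.name for l in c], single_transcript_possible(c))
```

With all hairpins of a cluster on one strand, as described for the Oryza clusters, the same check would return True.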
Prediction of conserved miRNA targets
Having refined our computational method to more sensitively predict plant miRNA targets, we
applied it to the prediction of conserved mRNA targets of all known Arabidopsis and Oryza
miRNAs (Figure 5). Control experiments with refset2 and 5 sets of permuted miRNAs
suggested that a score cutoff of ≤ 3.5 was appropriate to identify conserved miRNA targets with
high sensitivity and selectivity. However, when searching for targets of the entire set of
miRNAs, this cutoff identified a number of mRNAs for which miRNA-mediated cleavage
products could not be found by 5'-RACE. Thus, a cutoff of ≤ 3.0 was chosen to minimize the
number of non-authentic targets. All previously validated miRNA targets are identified at
this level of sensitivity, although several newly validated targets have scores of 3.5 in one or both
species and are not retained using this cutoff. Thus, there is still a threshold at which it is
difficult to distinguish authentic targets from potentially spurious complementarity without
experimental verification. Nonetheless, a score of ≤ 3.0 in our refined method identifies targets
with very high confidence (Figure 2).
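The conservation filter can be summarized in a few lines. This sketch assumes that each candidate already has a complementarity score in both species, that the conserved score is the higher (less complementary) of the two, and that targets are retained at a score of 3.0 or less; the gene labels and scores are hypothetical, and the scoring of individual mismatches is not reproduced here.

```python
# Minimal sketch of the conserved-target cutoff. Scores are assumed to be
# precomputed; lower scores mean fewer mismatches.
from typing import Dict, Tuple


def conserved_score(at_score: float, os_score: float) -> float:
    """A conserved site takes the higher (less complementary) of the two scores."""
    return max(at_score, os_score)


def predicted_conserved_targets(
    candidates: Dict[str, Tuple[float, float]], cutoff: float = 3.0
) -> Dict[str, float]:
    """Keep candidates whose conserved score is at or below the cutoff."""
    kept = {}
    for gene, (at_score, os_score) in candidates.items():
        score = conserved_score(at_score, os_score)
        if score <= cutoff:
            kept[gene] = score
    return kept


# Hypothetical (Arabidopsis score, Oryza score) pairs.
candidates = {
    "geneA": (1.5, 3.0),
    "geneB": (2.0, 3.5),
    "geneC": (0.0, 0.5),
}
kept = predicted_conserved_targets(candidates)
print(kept)
```

Under these assumptions a candidate such as geneB, with a score of 3.5 in one species, is dropped even though it scores well in the other, which mirrors the trade-off described above.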
Plant miRNAs are deeply conserved
MicroRNAs conserved between the dicot Arabidopsis thaliana and the monocot Oryza sativa are
likely to be found in most flowering plants. Homologs of miR-JAW and miR-JAW
complementary sites have been found in ESTs from numerous angiosperms (Palatnik et al.,
2003). To look for evidence of other miRNAs in additional plant species, we searched for ESTs
representing potential homologs of Arabidopsis and Oryza miRNAs, defined here as having
at least 19/20 nt matches and a predicted foldback that passes MIRcheck. This search identified 187
putative miRNA homologs in the ESTs (Table S4). A large majority of these appear to be
authentic, in that the 10 miRNAs in refset2 each had on average 9.7 EST matches that passed
MIRcheck, whereas the set of 50 permuted miRNAs averaged only 0.04 matches that passed
MIRcheck. For all 18 miRNA families that are conserved between Arabidopsis and Oryza,
potential miRNA precursors were found in at least one additional angiosperm species (Table S4).
For miRNAs that are not conserved between Arabidopsis and Oryza, no homologous miRNAs in
additional species were identified, suggesting that the lack of conservation in Oryza is a
consequence of recent emergence rather than loss in the Oryza lineage. We also searched for
matches to experimentally confirmed miRNA complementary sites in ESTs encoding proteins
homologous to Arabidopsis targets (blastx score >10⁻⁶). For all miRNA families with validated
miRNA targets, conserved miRNA complementary sites (19/20 nt matches) were found in at
least one additional angiosperm (Table S5). On average, the miRNA complementary sites from
17 unrelated Arabidopsis miRNA targets were each conserved in 191 homologous ESTs,
representing 14 species. This is far more than would be expected by chance; when repeating the
analysis using 170 sites chosen at random from the same Arabidopsis mRNAs, the average
number of ESTs and species were 2.6 and 0.5, respectively.
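The strength of these comparisons is easier to see as ratios. The short calculation below uses only the averages quoted above; treating the permuted or random-site average as the expected number of spurious matches is an interpretive assumption, and the function names are illustrative.

```python
# Minimal sketch: fold enrichment of authentic over control matches, using
# the averages reported in the text.

def fold_enrichment(observed: float, control: float) -> float:
    """Ratio of the authentic average to the control average."""
    return observed / control


def implied_spurious_fraction(observed: float, control: float) -> float:
    """Fraction of observed matches expected by chance, if the control
    average estimates the number of spurious matches (an assumption)."""
    return control / observed


# EST matches passing MIRcheck: refset2 miRNAs vs. permuted miRNAs.
mircheck_fold = fold_enrichment(9.7, 0.04)
mircheck_spurious = implied_spurious_fraction(9.7, 0.04)

# Conserved complementary sites: authentic sites vs. random sites.
est_fold = fold_enrichment(191, 2.6)
species_fold = fold_enrichment(14, 0.5)

print(f"MIRcheck-passing EST matches: {mircheck_fold:.0f}-fold over permuted")
print(f"Implied spurious fraction: {mircheck_spurious:.2%}")
print(f"ESTs per site: {est_fold:.0f}-fold; species per site: {species_fold:.0f}-fold")
```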
MicroRNAs of the 166 family, as well as their binding sites in mRNAs of HD-ZIP proteins,
predate the emergence of seed plants (Floyd and Bowman, 2004). We found nine miRNA
families (156, 160, 166, 167, 393, 395, 396, 397 and 398) that had complementary sites
conserved in gymnosperms, while a miR171 complementary site was conserved in a SCL mRNA
from a fern (Ceratopteris richardii). In addition, a potential miRNA hairpin of the 159/JAW
family was present in an EST from moss (Physcomitrella patens). These data suggest that
multiple miRNAs have deep origins in plant phylogeny.
Discussion
The scope of miRNAs conserved between dicots and monocots
A combination of computational prediction and experimental verification identified seven
families of sequences that had not previously been identified as miRNAs. A set of 2088 small
RNAs from Arabidopsis was recently reported (Xie et al., 2004)
(http://gac.bcc.orst.edu/smallRNA/). Sequences corresponding to miR397a, miR398b and
miR399b were contained in this dataset, each having been cloned a single time, although none
were annotated as miRNAs. The cloning of miR397 and miR399, which were not detected by
Northern blot, corroborates their expression as determined by PCR.
Families 393, 394, 395 and 396 are absent from the reported sets of cloned, sequenced small
RNAs. These are each detectable by Northern analysis, and as with families 397, 398 and 399
were detected by PCR in our library of small cDNAs used for cloning. Therefore they would
have been found eventually by sequencing enough small cDNAs. However, given that other
miRNAs have been cloned hundreds of times (Xie et al., 2004), it seems that all seven newly
identified miRNA families are relatively rare in the tissues and growth conditions from which
small RNAs have been cloned. They may represent miRNAs that are needed at low levels, or
whose expression is limited to rare cell types or particular growth conditions. The expression of
miR395 is greatly increased by sulfate starvation; other miRNAs with seemingly low expression
may also be inducible by metabolite levels or environmental stimuli. It is the identification of
these difficult to clone but potentially important miRNAs that makes computational prediction a
useful complement to cloning of small RNAs.
The sensitivity of our computational approach, which found all 11 conserved miRNA families
previously identified through cloning, suggests that most plant miRNAs with properties similar
to previously cloned miRNAs have been identified. MicroRNA genes not found by our analysis
are likely to fall into several categories. One set will be those without apparent conservation to
Oryza. This describes four families of currently known Arabidopsis miRNAs (158, 161, 163,
and 173). It is difficult to estimate how many additional non-conserved miRNA families exist in
either species, but the observation that most of the cloned plant miRNAs have readily identified
Oryza homologs indicates either that there are no more than a handful of non-conserved miRNAs
remaining to be identified or that non-conserved miRNAs are disproportionally poorly expressed
in plants.
Another set of false negatives will be miRNA families that are conserved between
Arabidopsis and Oryza but were missed by our analysis. Most steps in our analysis have the
potential to lose authentic miRNA genes. The parameters and cutoffs we used were chosen to be
slightly more relaxed than what was needed to retain most loci corresponding to the 11
previously known miRNA families with Oryza homologs in refset1. They found at least one
member of each family and 92% (59/64) of all loci in these families. A similar percentage of
loci, 96%, were correctly identified for newly discovered miRNA families (22/23), suggesting
that our parameters are not over-fitted. Relaxing the parameters of MIRcheck (Figure 1, steps 2
& 4) to allow up to two asymmetric bulges, shorter hairpins (as short as 54 nt), and an additional
mismatch did not identify any additional verifiable miRNAs (data not shown). Nonetheless, the
low number of previously identified Arabidopsis miRNA gene families (15) precluded splitting
the miRNAs into a training set and test set, as was done in our metazoan analysis to evaluate the
degree of overtraining and enable firm estimates of the number of genes remaining to be
identified (Lim et al., 2003b). MicroRNA families with few members would be more prone to
being missed. For example, MIR393 and MIR394 each have only one identified locus in Oryza;
either would have been missed if their Oryza locus had been among the fraction of authentic
miRNA loci not identified as an inverted repeat or that did not pass MIRcheck, whereas miRNAs
that were members of larger gene families that have multiple Oryza homologs were identified
even though some Oryza homologs were missed. The observation that some miRNA primary
transcripts are spliced (Aukerman and Sakai, 2003) raises the possibility that some miRNA
transcripts might have an intron within the hairpin precursor, which could prevent their
identification in our analysis of genomic DNA. Furthermore, any unknown miRNA family that
systematically had a pattern of base pairing that failed MIRcheck would also have been lost, but
there is no reason to suspect that this was a widespread problem.
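The sensitivity figures quoted in this chapter can be re-derived from the locus counts. The sketch below uses the Arabidopsis counts for the newly identified families from Table 1, the 59/64 figure for the previously known conserved families, and the five loci of the non-conserved families; the variable names are illustrative.

```python
# Minimal sketch: recompute de novo sensitivity from locus counts.
# Each entry is (loci found de novo, total loci) in Arabidopsis.

new_families = {
    "393": (2, 2), "394": (2, 2), "395": (6, 6), "396": (2, 2),
    "397": (1, 2), "398": (3, 3), "399": (6, 6),
}
known_conserved = (59, 64)   # 11 previously known families with Oryza homologs
non_conserved = (0, 5)       # families 158, 161, 163, and 173


def percent(found: int, total: int) -> int:
    """Sensitivity as a whole-number percentage."""
    return round(100 * found / total)


new_found = sum(f for f, _ in new_families.values())
new_total = sum(t for _, t in new_families.values())

conserved_found = new_found + known_conserved[0]
conserved_total = new_total + known_conserved[1]
all_found = conserved_found + non_conserved[0]
all_total = conserved_total + non_conserved[1]

print(f"new families: {new_found}/{new_total} = {percent(new_found, new_total)}%")
print(f"known conserved: {percent(*known_conserved)}%")
print(f"with Oryza homologs: {conserved_found}/{conserved_total} = "
      f"{percent(conserved_found, conserved_total)}%")
print(f"all loci: {all_found}/{all_total} = {percent(all_found, all_total)}%")
```

These totals agree with the 81 loci found computationally, the 92 loci overall, and the 88% and 93% sensitivities reported earlier.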
More significant uncertainty in plant miRNA gene number arises from the 94 families of
candidate miRNAs that had conserved miRNA-like hairpins but lacked extensive and conserved
complementarity to mRNAs. Some of these candidates may be authentic miRNAs with different
modes of target recognition. For example, any plant miRNA that recognizes all its target
mRNAs in a manner similar to that of most animal miRNAs, that is, by recognizing its targets
predominantly through "seed matches" (Lewis et al., 2003), would have been missed. Therefore,
further analysis will be required before a meaningful upper bound on the number of plant
miRNA genes can be estimated. The 92 loci tabulated to date, when considered together with
the assumption that a few others might remain undetected because they are refractory to both
cloning and computation, place a lower bound on the number of Arabidopsis miRNA genes at
~100, or ~0.4% of the predicted Arabidopsis genes, a percentage somewhat lower than that of
animals. The plant miRNAs are generally in larger, more highly related families, further
reducing the relative complexity of known miRNA sequences when compared to those of
animals. Of course, when considering the vast number of distinct ~22 nt RNAs that have been
cloned from plants, which might be endogenous siRNAs but are not miRNAs, the diversity of
small RNA silencing in plants could exceed that in animals.
The targets of newly identified miRNAs
The detection of the RNA fragments diagnostic of miRNA-directed cleavage confirms in planta
these 19 newly identified miRNA-target interactions. However, these 5'-RACE results do not
rule out the possibility that the predominant mode of silencing is translational inhibition.
5'-RACE experiments demonstrate that miR172 directs the cleavage of some APETALA2 mRNA
molecules, even though the predominant mode of repression appears to be translational inhibition
(Aukerman and Sakai, 2003; Kasschau et al., 2003; Chen, 2004). Nonetheless, for all the other
plant miRNA targets examined, inhibition of the miRNA pathway leads to increased
accumulation of target mRNA (Kasschau et al., 2003; Vaucheret et al., 2004; Vazquez et al.,
2004), suggesting that mRNA cleavage typically plays a significant regulatory role, although in
these cases augmentation by translational repression cannot be ruled out. The same is likely to
be true for our newly identified targets.
Some of the newly identified targets resemble those of previous predictions with regard to their
proven or inferred roles in regulating developmental processes (Figure 5). miR396 targets seven
Growth Regulating Factor genes, which are putative transcription factors that regulate cell
expansion in leaf and cotyledon (Kim et al., 2003). miR393 and miR394 both target the
messages of F-box proteins, which in turn target specific proteins for proteolysis by making them
substrates for ubiquitination by SCF E3 ubiquitin ligases (Vierstra, 2003). At1g27340, targeted
by miR394, is in the same subfamily of F-box genes as UNUSUAL FLORAL ORGANS (UFO)
(Gagne et al., 2002), which is involved in floral initiation and development (Wilkinson and
Haughn, 1995; Samach et al., 1999). miR393 targets four closely related F-box genes, including
TRANSPORT INHIBITOR RESPONSE1 (TIR1), which targets AUX/IAA proteins for proteolysis
in an auxin-dependent manner and is necessary for auxin-induced growth processes (Ruegger et
al., 1998; Gray et al., 2001). These five F-box genes constitute a newly identified biochemical
class of miRNA targets.
The identification of TIR1 as a miRNA target implies that miRNAs regulate auxin
responsiveness at multiple points. Other auxin-related miRNA targets include Auxin Response
Factors (miR160 and miR167) (Rhoades et al., 2002; Kasschau et al., 2003), which are thought
to regulate transcription in response to auxin (Ulmasov et al., 1999), and NAC1 (miR164)
(Rhoades et al., 2002; Mallory et al., 2004), which promotes auxin-induced lateral root growth
downstream of TIR1 (Xie et al., 2000). Finally, in addition to targeting F-box genes, miR393
also targets At3g23690, a basic helix-loop-helix transcription factor with homology to GBOF-1
from tulip, which Genbank annotates as auxin-inducible.
Other newly identified miRNA targets have less obvious connections to the control of
developmental patterning (Figure 5). miR397 targets putative laccases, members of a family of
enzymes with numerous described roles in fungal biology but without well defined roles in plant
biology (Mayer and Staples, 2002). miR398 targets two copper superoxide dismutases, CSD1
and CSD2, enzymes which protect the cell against radicals and whose expression patterns
respond to oxidative stress (Kliebenstein et al., 1998).
The most definitive example of a plant miRNA operating outside the gene regulatory circuitry
controlling development is miR395. miR395 targets the ATP sulfurylases, APS1, APS3 and
APS4, enzymes that catalyze the first step of inorganic sulfate assimilation (Leustek, 2002). The
observations that the expression of miR395 depends on sulfate concentration and that APS1
expression declines with increasing miR395 corroborate the idea that this miRNA regulates
sulfate metabolism (Figure 3).
Our systematic analysis, which probably has identified most plant miRNAs with
conserved and extensive complementarity to plant messages, including those that are expressed
at very low levels during lab growth conditions, allows us to revisit the question of what this
class of tiny regulatory RNAs is generally doing in plants. As before (Rhoades et al., 2002), we
find an overwhelming propensity for targeting messages of known or suspected plant
transcription factors (63 of 83, or 76% of genes in Figure 5) and similar propensity for targeting
messages of genes with known or suspected roles in plant development (70 of 83, or 84% of
genes in Figure 5). A propensity to target developmental regulators differs from what has been
seen in mammals (Lewis et al., 2003). Nonetheless, the conserved targets of plant miRNAs
extend beyond the regulatory circuitry of development. The discovery that miRNAs regulate
genes such as ATP sulfurylases, laccases, and superoxide dismutases shows that miRNAs also
have an ancient role in regulating other aspects of plant biology.
Experimental procedures
Details of the computational miRNA prediction method and sequences of primers used are
available online at http://www.molecule.org/.
PCR validation of miRNAs
We used a PCR-based assay to detect expression and map the 5' ends of predicted miRNAs (Lim
et al., 2003b). miRNAs were PCR amplified out of a library of small cDNAs from leaf, flower,
and seedling flanked by 5' and 3' adaptor oligos (Reinhart et al., 2002). Each PCR reaction used
one common primer corresponding to the 5' adaptor oligo and one specific primer antisense to the
3' portion of the predicted miRNA.
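The design of the specific primer can be illustrated as follows. This is a sketch only: the length of the 3' portion covered by the primer (12 nt here) and the function name are assumptions, and the resulting sequence is an example rather than one of the primers actually used.

```python
# Minimal sketch: a DNA primer antisense to the 3' portion of a predicted
# miRNA. The covered length (n_3p) is a hypothetical parameter.

DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def specific_primer(mirna_rna: str, n_3p: int = 12) -> str:
    """Return the reverse complement (as DNA, 5' to 3') of the last n_3p nt."""
    dna = mirna_rna.upper().replace("U", "T")
    if not 0 < n_3p <= len(dna):
        raise ValueError("n_3p must be between 1 and the miRNA length")
    three_prime_portion = dna[-n_3p:]
    return three_prime_portion.translate(DNA_COMPLEMENT)[::-1]


# Example with the miR393 sequence from Table 2.
primer = specific_primer("UCCAAAGGGAUCGCAUUGAUC")
print(primer)
```

Because such a primer leaves the 5' portion of the miRNA to be read from the amplified product, sequencing the clones reveals the 5' end, while variation within the primer-covered 3' residues cannot be distinguished.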
RNA purification and Northern hybridization
RNA was isolated as previously described (Vance, 1991). For developmental Northerns, 30 µg
per lane of total RNA from soil-grown Columbia plants was separated by 15% polyacrylamide
electrophoresis and blotted to a nylon membrane.
For plants grown on media, Columbia plants were grown in long-day conditions on modified
MS/agarose media, containing 0.8% Agarose-LE (USBiochem), in which the SO₄²⁻-containing
salts of minimal MS media were replaced with their chloride counterparts and the media
supplemented with 20 µM to 2 mM (NH4)2SO4. RNA was harvested from 2-week-old plants.
For miRNA Northerns, 40 µg per lane was used in Northern blots as above. For miR393,
miR394, miR396a and miR398b, end-labeled antisense DNA probes were used. For miR395a,
miR397b, and miR399b, higher specific activity Starfire (Integrated DNA technologies) probes
were used. MicroRNA Northerns were hybridized and washed as previously described (Lau et
al., 2001). For mRNA Northerns, 10 µg per lane was separated by agarose electrophoresis and
blotted as described (Mallory et al., 2001). Probes to exon 1 of APS1 were made using the
Megaprime DNA labeling system (Amersham).
5'-RACE analysis
5'-RACE was performed on poly(A)-selected RNA from Columbia inflorescences and rosette
leaves using the GeneRacer Kit (Invitrogen) as described (Kasschau et al., 2003), except that
nested PCR was done for each gene, with each round of PCR using one gene-specific primer and
the GeneRacer 5' Nested Primer. For each gene we designed gene-specific primers that were
180-450 bp away from the predicted miRNA binding site. PCR reactions were separated by
agarose gel electrophoresis, and distinct bands of the appropriate size for miRNA-mediated
cleavage were purified (excised gel slices corresponded to a size range of ~100 base pairs),
cloned, and sequenced.
Acknowledgements
We thank M. Axtell for the 5'-RACE library, R. Rajagopalan for the library of 18-28 nt cDNAs
and Allison Mallory and other Bartel lab members for helpful discussions. This work was
supported by grants from the NIH.
Table 1. Sensitivity of computational identification of plant miRNA loci

Family              At loci    Os loci
Newly identified families
393                 2/2        1/1
394                 2/2        1/1
395                 6/6        16/19
396                 2/2        3/3
397                 1/2        1/2
398                 3/3        2/2
399                 6/6        10/11
Previously identified conserved families
156 a               12/12      12/12
159/JAW a,b,c       3/6        7/8
160 a               3/3        6/6
162 ad
164 a
166 a
167 a,b,d
2/2
3/3
8/9
3/4
2/2
5/5
10/12
168 a
2/2
1/2
169 a
14/14
15/17
7/7
3/3
171 ad
4/4
172 b
5/5
9/9
Previously identified non-conserved families
158 a               0/2        0
161 a               0/1        0
163 a               0/1        0
173 b               0/1        0

All newly identified and previously known miRNA families are tallied. The number of
loci found by de novo computational prediction (Figure 1a, through step 8) is
shown (numerator) as a fraction of the total found by searching for near paralogs to miRNAs
with verified expression (denominator). Additional details regarding the miRNA loci
are reported in Tables S2 and S3 (Arabidopsis and Oryza loci, respectively).
Citations for previously identified families: a, Reinhart et al. (2002); b, Park et al. (2002);
c, Mette et al. (2002); d, Llave et al. (2002b).
Table 2. Newly identified miRNA gene families in Arabidopsis

miRNA family     miRNA gene   Chr.   Arm   miRNA sequence
393 (PCR,N,R)    MIR393a      2      5'    UCCAAAGGGAUCGCAUUGAUC
                 MIR393b      3      5'    "
394 (PCR,N,R)    MIR394a      1      5'    uUCUUUGGCAUUCUGUCCACC
                 MIR394b      1      5'    "
395 (PCR,N,R)    MIR395a      1      3'    cUGAAGUGUUUGGGGGAACUC
                 MIR395b      1      3'    "
                 MIR395c      1      3'
                 MIR395d      1      3'    cUGAAGUGUUUGGGGGGACUC
                 MIR395e      1      3'    "
                 MIR395f      1      3'
396 (PCR,N,R)    MIR396a      2      5'    UUCCACAGCUUUCUUGAACUG
                 MIR396b      5      5'    UUCCACAGCUUUCUUGAACUU
397 (PCR,R)      MIR397a      4      5'    UCAUUGAGUGCAGCGUUGAUG
                 MIR397b      4      5'    UCAUUGAGUGCAUCGUUGAUG
398 (PCR,N,R)    MIR398a      2      3'    UGUGUUCUCAGGUCACCCCUU
                 MIR398b      5      3'    UGUGUUCUCAGGUCACCCCUG
                 MIR398c      5      3'    "
399 (PCR)        MIR399a      1      3'    UGCCAAAGGAGAUUUGCCCUG
                 MIR399b      1      3'    ccUGCCAAAGGAGAGUUGCCCUG
                 MIR399c      5      3'
                 MIR399d      2      3'    UGCCAAAGGAGAUUUGCCCCG
                 MIR399e      2      3'    UGCCAAAGGAGAUUUGCCUCG
                 MIR399f      2      3'    UGCCAAAGGAGAUUUGCCCGG

Newly identified miRNA families are listed with a summary of experimental
validation (PCR, PCR validation of miRNA; N, Northern blot of miRNA; R, 5'-RACE
of target mRNA). The chromosome of each locus is indicated (Chr.), as is
the arm of the predicted stem-loop that contains the miRNA (Arm). 5' ends of
miRNAs were determined from PCR of small cDNAs, and lengths of miRNAs were
inferred from mobility on Northern blots. For miRNAs not detected on Northern
blots (families 397 and 399), lengths of 21 nt were assumed. For miRNA families
for which multiple 5' ends were detected by PCR, nucleotides present in some but
not all clones are listed in lower case.
Figure legends
Figure 1. Prediction of conserved plant miRNAs.
(A) Outline of the computational approach used to identify conserved plant miRNAs.
See text for description. In steps 1-8, the sensitivity is reported (blue) as the fraction of miRNA
loci retained with perfect matches to previously identified miRNAs (refset1). In step 9, this
fraction extends to imperfect matches to previously identified miRNAs. In the later steps, the
total numbers of predicted miRNA loci are also reported (red).
(B,C) Predicted hairpin secondary structures of two newly identified miRNA families, 393 (B)
and 394 (C) that target mRNAs of F-box proteins. Nucleotides in red comprise the sequence of
the most common mature miRNA as deduced from PCR validation and Northern hybridization.
Nucleotides in blue indicate additional portions of the hairpins predicted to have
miRNA-encoding potential after identification of conserved 20mers in miRNA-like hairpins (Figure 1a,
step 6), but before identification of conserved complementarity to mRNAs or experimental
evaluation. For all three MIR393 loci, sequences antisense to the validated miRNA were also
identified as potentially miRNA-encoding, but the miRNA* segments were not.
Figure 2. The utility of incorporating evolutionary conservation when predicting plant miRNA
targets.
(A) Arabidopsis mRNAs with sites complementary to a set of 10 diverse miRNAs conserved
between Arabidopsis and Oryza (refset2) were found and scored such that lower scores indicate
fewer mismatches (see text for details). The number of mRNAs with each of the indicated
scores is graphed (solid bars). Complementary sites were found and scored in the same manner
for 5 cohorts of permuted miRNAs with the same dinucleotide composition as the authentic
miRNAs (open bars, average number of complementary mRNAs per cohort; error bars, 2
standard deviations).
(B) mRNAs complementary to 10 miRNAs were found as in (A), with the additional
requirement that at least one homologous Oryza mRNA be complementary to the same miRNA
(solid bars). Each conserved miRNA complementary site is counted as having either the
Arabidopsis or Oryza score, whichever is higher (i.e. less complementary). Messenger RNAs
with conserved complementarity to cohorts of dinucleotide shuffled miRNAs were found in the
same manner (open bars, average number of complementary mRNAs; error bars, 2 standard
deviations).
Figure 3. Expression of newly identified miRNAs.
(A) Total RNA (30 µg) from seedlings (S), rosette leaves (L), flowers (F), and roots (R) was
analyzed on a Northern blot, successively using radio-labeled DNA probes complementary to
newly identified miRNAs. The lengths of 5'-phosphorylated radio-labeled RNA size markers
(M) are indicated. As a loading control, the blot was probed for the U6 snRNA.
(B) miR395 is induced with low sulfate. Total RNA (40 µg) from 2-week-old Columbia plants
grown on modified MS media containing the indicated concentrations of SO₄²⁻ was analyzed by
Northern blot, probing for the indicated miRNAs as in (A).
(C) APS1 mRNA decreases in low sulfate. Total RNA (10 µg) from 2-week-old plants grown on
modified MS media containing the indicated concentrations of SO₄²⁻ was analyzed by Northern
hybridization using randomly primed body-labeled DNA probes corresponding to exon 1 of the
APS1 mRNA. Normalized ratios of APS1 mRNA to U6 spliceosomal RNA are indicated.
Figure 4. Experimental verification of predicted miRNA targets.
Each top strand (black) depicts a miRNA complementary site, and each bottom strand depicts the
miRNA (red). Watson-Crick pairing (vertical dashes) and G:U wobble pairing (circles) are
indicated. Arrows indicate the 5' termini of mRNA fragments isolated from plants, as identified
by cloned 5'-RACE products, with the frequency of clones shown. Only cloned sequences that
matched the correct gene and had 5' ends within a 100 nt window centered on the miRNA
complementary site are counted. The miRNA sequence shown corresponds to the most common
miRNA suggested by miRNA PCR validation (Table 2). For miR394, the 5' end of a less
common variant (1 out of 4 PCR clones) is indicated in lower case and corresponds to the most
commonly cloned cleavage product.
Figure 5. Conserved predicted miRNA targets.
All predicted miRNA targets with scores of 3.0 or less in both Arabidopsis and Oryza are listed.
The score of the best scoring 20mer from any member of the miRNA family to each gene is
given in parentheses. Predicted targets with scores greater than 3.0 in either Arabidopsis or
Oryza that have been validated by 5'-RACE are also listed and marked with an asterisk. Genes in
red were validated as miRNA targets by 5'-RACE experiments in this work. Genes in blue are
validated as miRNA targets by previous work. Additional information on these genes can be
found at www.arabidopsis.org.
a Vazquez et al. (2004). b Kasschau et al. (2003). c Palatnik et al. (2003). d Xie et al. (2003).
e A. Mallory et al. (2004). f Tang et al. (2003). g Emery et al. (2003). h Vaucheret et al. (2004).
i Llave et al. (2002a). j Aukerman and Sakai (2003). k Chen (2003).
Jones-Rhoades and Bartel Figure 1
[Figure 1 image. (A) Flowchart of the computational approach. (B) Predicted hairpins for MIR393a (Chr 2, Arabidopsis), MIR393b (Chr 3, Arabidopsis), and MIR393 (Contig 4493, Oryza). (C) Predicted hairpins for MIR394a (Chr 1, Arabidopsis), MIR394b (Chr 1, Arabidopsis), and MIR394 (Contig 15318, Oryza).]
Jones-Rhoades and Bartel Figure 2
[Figure 2 image. Bar charts, panels A and B. X-axis: Score (0 = 20 contiguous complementary nucleotides).]
Jones-Rhoades and Bartel Figure 3
[Figure 3 image. Northern blots, panels A to C. Lanes S, L, F, R, and M; probes miR393, miR394, miR395, miR396, miR398, miR156, miR159, and U6; size-marker positions 24, 21, and 18. SO₄²⁻ (mM): 0.02, 0.2, 2.0. Panel C, APS1 and U6, with APS1/U6 ratios of 0.33, 0.63, and 1.0 at 0.02, 0.2, and 2.0 mM SO₄²⁻, respectively.]
Jones-Rhoades and Bartel figure 4
[Alignment residue: miRNAs paired to complementary sites in target mRNAs. miR393a: TIR1, At1g12820, At3g26810, At4g03190, At3g23690. miR394a: At1g27340. miR395a: APS4. miR396a: GRL1, GRL2, GRL3, GRL7, GRL8, GRL9. miR397a: At2g29130, At2g38080, At5g60020. miR398a: CSD1, CSD2, At3g15640.]
Jones-Rhoades and Bartel Figure 5
miRNA family | Target protein class | Target genes
393 | F-box proteins | At1g12820(1), At3g26810(1), At3g62980/TIR1(1.5), At4g03190(2.5)
393 | bHLH transcription factor | At3g23690(2)*
394 | F-box protein | At1g27340(1)
395 | ATP sulfurylases | At3g22890/APS1(1.5), At4g14680/APS3(1.5), At5g43780/APS4(0.5)
396 | Growth Regulating Factor (GRL) transcription factors | At2g22840/GRL1(3), At2g36400/GRL3(3), At2g45480/GRL9(3), At3g52910/GRL4(3), At4g24150/GRL8(3), At4g37740/GRL2(3), At5g53660/GRL7(3)
396 | Rhodanese-like protein | At2g40760(2.5)
396 | Kinesin-like protein B | At4g27180/ATK2(3)
397 | Laccases | At2g29130(0.5), At2g38080(1), At5g60020(1)
397 | Beta-6 tubulin | At5g12250(3)
398 | Copper superoxide dismutases | At1g08830/CSD1(3), At2g28190/CSD2(3.5)*
398 | Cytochrome C oxidase subunit V | At3g15640(3)*
399 | Phosphate transporter | At3g54700(2)
156 | Squamosa-promoter Binding Protein (SBP)-like transcription factors | At1g27360/SPL11(1), At1g27370/SPL10(1)a, At1g53160/SPL4(2), At1g69170/SPL4(1), At2g33810/SPL3(1.5), At2g42200/SPL9(1), At3g15270/SPL5(3), At3g57920(1), At5g43270/SPL2(1)b, At5g50570(1), At5g50670(1)
159/JAW | MYB transcription factors | At2g26950/MYB104(1.5), At2g26960/MYB81(2.5), At2g32460/MYB101(1.5), At3g11440/MYB65(1.5)a, At3g60460(1.5), At4g26930/MYB97(2.5), At5g06100/MYB33(1.5)c, At5g55020/MYB120(2)
159/JAW | TCP transcription factors | At1g30210/TCP24(2.5)c, At1g53230/TCP3(3), At2g31070/TCP10(2.5)c, At3g15030/TCP4(2.5)c, At4g18390/TCP2(2.5)c
160 | Auxin Response Factors (ARF transcription factors) | At1g77850/ARF17(0.5)b, At2g28350/ARF10(1)b, At4g30080/ARF16(1.5)
162 | DICER-LIKE 1 | At1g01040/DCL1(2)d
164 | NAC domain transcription factors | At1g56010/NAC1(1)e, At3g15170/CUC1(1)b,e, At5g07680(1.5)e, At5g39610(2), At5g53950/CUC2()b, At5g61430(1.5)e
166 | HD-Zip transcription factors | At1g30490/PHV(1.5)f, At1g52150/ATHB-15(1.5), At2g34710/PHB(1.5)f, At4g32880/ATHB-8(1.5), At5g60690/REV(1.5)g
167 | Auxin Response Factors (ARF transcription factors) | At1g30330/ARF6(2), At5g37020/ARF8(2)a
168 | ARGONAUTE | At1g48410/AGO1(2.5)a,h
169 | CCAAT Binding Factor (CBF) HAP2-like transcription factors | At1g17590(1.5), At1g54160(2), At1g72830(1.5), At3g05690(1.5), At3g20910(2), At5g06510(1.5), At5g12840(1.5)
171 | SCARECROW-like transcription factors | At2g45160(0), At3g60630(0)a,i, At4g00150/SCL6(0)
172 | APETALA2-like transcription factors | At2g28550/TOE1(1.5)b,j, At2g39250(1), At4g36920/AP2(0.5)b,k, At5g60120/TOE2(0.5)b,j, At5g67180/TOE3(1.5)b
Ambros, V., Lee, R. C., Lavanway, A., Williams, P. T., and Jewell, D. (2003). MicroRNAs and
other tiny endogenous RNAs in C. elegans. Curr Biol 13, 807-818.
Aukerman, M. J., and Sakai, H. (2003). Regulation of flowering time and floral organ identity by
a MicroRNA and its APETALA2-like target genes. Plant Cell 15, 2730-2741.
Bartel, B., and Bartel, D. P. (2003). MicroRNAs: at the root of plant development? Plant Physiol
132, 709-717.
Bartel, D. P. (2004). MicroRNAs: genomics, biogenesis, mechanism, and function. Cell 116,
281-297.
Chen, X. (2004). A MicroRNA as a Translational Repressor of APETALA2 in Arabidopsis
Flower Development. Science 303, 2022-2025.
Elbashir, S. M., Lendeckel, W., and Tuschl, T. (2001). RNA interference is mediated by 21- and
22-nucleotide RNAs. Genes Dev 15, 188-200.
Emery, J. F., Floyd, S. K., Alvarez, J., Eshed, Y., Hawker, N. P., Izhaki, A., Baum, S. F., and
Bowman, J. L. (2003). Radial patterning of Arabidopsis shoots by class III HD-ZIP and KANADI
genes. Curr Biol 13, 1768-1774.
Enright, A. J., John, B., Gaul, U., Tuschl, T., Sander, C., and Marks, D. S. (2003). MicroRNA
targets in Drosophila. Genome Biol 5, R1.
Floyd, S. K., and Bowman, J. L. (2004). Gene regulation: ancient microRNA target sequences in
plants. Nature 428, 485-486.
Gagne, J. M., Downes, B. P., Shiu, S. H., Durski, A. M., and Vierstra, R. D. (2002). The F-box
subunit of the SCF E3 complex is encoded by a diverse superfamily of genes in Arabidopsis.
Proc Natl Acad Sci U S A 99, 11519-11524.
Grad, Y., Aach, J., Hayes, G. D., Reinhart, B. J., Church, G. M., Ruvkun, G., and Kim, J. (2003).
Computational and experimental identification of C. elegans microRNAs. Mol Cell 11, 1253-1263.
Gray, W. M., Kepinski, S., Rouse, D., Leyser, O., and Estelle, M. (2001). Auxin regulates
SCF(TIR1)-dependent degradation of AUX/IAA proteins. Nature 414, 271-276.
Grishok, A., Pasquinelli, A. E., Conte, D., Li, N., Parrish, S., Ha, I., Baillie, D. L., Fire, A.,
Ruvkun, G., and Mello, C. C. (2001). Genes and mechanisms related to RNA interference
regulate expression of the small temporal RNAs that control C. elegans developmental timing.
Cell 106, 23-34.
Hammond, S. M., Bernstein, E., Beach, D., and Hannon, G. J. (2000). An RNA-directed
nuclease mediates post-transcriptional gene silencing in Drosophila cells. Nature 404, 293-296.
Hutvagner, G., McLachlan, J., Pasquinelli, A. E., Balint, E., Tuschl, T., and Zamore, P. D.
(2001). A cellular function for the RNA-interference enzyme Dicer in the maturation of the let-7
small temporal RNA. Science 293, 834-838.
Hutvagner, G., and Zamore, P. D. (2002). A microRNA in a multiple-turnover RNAi enzyme
complex. Science 297, 2056-2060.
Juarez, M. T., Kui, J. S., Thomas, J., Heller, B. A., and Timmermans, M. C. (2004). microRNA-mediated
repression of rolled leaf1 specifies maize leaf polarity. Nature 428, 84-88.
Kasschau, K. D., Xie, Z., Allen, E., Llave, C., Chapman, E. J., Krizan, K. A., and Carrington, J.
C. (2003). P1/HC-Pro, a viral suppressor of RNA silencing, interferes with Arabidopsis
development and miRNA function. Dev Cell 4, 205-217.
Ketting, R. F., Fischer, S. E., Bernstein, E., Sijen, T., Hannon, G. J., and Plasterk, R. H. (2001).
Dicer functions in RNA interference and in synthesis of small RNA involved in developmental
timing in C. elegans. Genes Dev 15, 2654-2659.
Khvorova, A., Reynolds, A., and Jayasena, S. D. (2003). Functional siRNAs and miRNAs
exhibit strand bias. Cell 115, 209-216.
Kidner, C. A., and Martienssen, R. A. (2003). Macro effects of microRNAs in plants. Trends
Genet 19, 13-16.
Kim, J. H., Choi, D., and Kende, H. (2003). The AtGRF family of putative transcription factors
is involved in leaf and cotyledon growth in Arabidopsis. Plant J 36, 94-104.
Kliebenstein, D. J., Monde, R. A., and Last, R. L. (1998). Superoxide dismutase in Arabidopsis:
an eclectic enzyme family with disparate regulation and protein localization. Plant Physiol 118,
637-650.
Lagos-Quintana, M., Rauhut, R., Lendeckel, W., and Tuschl, T. (2001). Identification of novel
genes coding for small expressed RNAs. Science 294, 853-858.
Lai, E. C. (2003). microRNAs: runts of the genome assert themselves. Curr Biol 13, R925-936.
Lai, E. C., Tomancak, P., Williams, R. W., and Rubin, G. M. (2003). Computational
identification of Drosophila microRNA genes. Genome Biol 4, R42.
Lappartient, A. G., Vidmar, J. J., Leustek, T., Glass, A. D., and Touraine, B. (1999). Inter-organ
signaling in plants: regulation of ATP sulfurylase and sulfate transporter genes expression in
roots mediated by phloem-translocated compound. Plant J 18, 89-95.
Lau, N. C., Lim, L. P., Weinstein, E. G., and Bartel, D. P. (2001). An abundant class of tiny
RNAs with probable regulatory roles in Caenorhabditis elegans. Science 294, 858-862.
Lee, R. C., and Ambros, V. (2001). An extensive class of small RNAs in Caenorhabditis
elegans. Science 294, 862-864.
Lee, Y., Ahn, C., Han, J., Choi, H., Kim, J., Yim, J., Lee, J., Provost, P., Radmark, O., Kim, S.,
and Kim, V. N. (2003). The nuclear RNase III Drosha initiates microRNA processing. Nature
425, 415-419.
Lee, Y., Jeon, K., Lee, J. T., Kim, S., and Kim, V. N. (2002). MicroRNA maturation: stepwise
processing and subcellular localization. Embo J 21, 4663-4670.
Leustek, T. (2002). Sulfate Metabolism. The Arabidopsis Book, 1-16.
Lewis, B. P., Shih, I. H., Jones-Rhoades, M. W., Bartel, D. P., and Burge, C. B. (2003).
Prediction of mammalian microRNA targets. Cell 115, 787-798.
Lim, L. P., Glasner, M. E., Yekta, S., Burge, C. B., and Bartel, D. P. (2003a). Vertebrate
microRNA genes. Science 299, 1540.
Lim, L. P., Lau, N. C., Weinstein, E. G., Abdelhakim, A., Yekta, S., Rhoades, M. W., Burge, C.
B., and Bartel, D. P. (2003b). The microRNAs of Caenorhabditis elegans. Genes Dev 17, 991-1008.
Llave, C., Kasschau, K. D., Rector, M. A., and Carrington, J. C. (2002a). Endogenous and
silencing-associated small RNAs in plants. Plant Cell 14, 1605-1619.
Llave, C., Xie, Z., Kasschau, K. D., and Carrington, J. C. (2002b). Cleavage of Scarecrow-like
mRNA targets directed by a class of Arabidopsis miRNA. Science 297, 2053-2056.
Mallory, A. C., Dugas, D. V., Bartel, D. P., and Bartel, B. (2004). MicroRNA regulation of
NAC-domain targets is required for proper formation and separation of adjacent embryonic,
vegetative, and floral organs. Curr Biol In press.
Mallory, A. C., Ely, L., Smith, T. H., Marathe, R., Anandalakshmi, R., Fagard, M., Vaucheret,
H., Pruss, G., Bowman, L., and Vance, V. B. (2001). HC-Pro suppression of transgene silencing
eliminates the small RNAs but not transgene methylation or the mobile signal. Plant Cell 13,
571-583.
Maruyama-Nakashita, A., Inoue, E., Watanabe-Takahashi, A., Yamaya, T., and Takahashi, H.
(2003). Transcriptome profiling of sulfur-responsive genes in Arabidopsis reveals global effects
of sulfur nutrition on multiple metabolic pathways. Plant Physiol 132, 597-605.
Mayer, A. M., and Staples, R. C. (2002). Laccase: new functions for an old enzyme.
Phytochemistry 60, 551-565.
McConnell, J. R., Emery, J., Eshed, Y., Bao, N., Bowman, J., and Barton, M. K. (2001). Role of
PHABULOSA and PHAVOLUTA in determining radial patterning in shoots. Nature 411, 709-713.
Mette, M. F., van der Winden, J., Matzke, M., and Matzke, A. J. (2002). Short RNAs can
identify new candidate transposable element families in Arabidopsis. Plant Physiol 130, 6-9.
Mourelatos, Z., Dostie, J., Paushkin, S., Sharma, A., Charroux, B., Abel, L., Rappsilber, J.,
Mann, M., and Dreyfuss, G. (2002). miRNPs: a novel class of ribonucleoproteins containing
numerous microRNAs. Genes Dev 16, 720-728.
Palatnik, J. F., Allen, E., Wu, X., Schommer, C., Schwab, R., Carrington, J. C., and Weigel, D.
(2003). Control of leaf morphogenesis by microRNAs. Nature 425, 257-263.
Park, W., Li, J., Song, R., Messing, J., and Chen, X. (2002). CARPEL FACTORY, a Dicer
homolog, and HEN1, a novel protein, act in microRNA metabolism in Arabidopsis thaliana.
Curr Biol 12, 1484-1495.
Reinhart, B. J., Weinstein, E. G., Rhoades, M. W., Bartel, B., and Bartel, D. P. (2002).
MicroRNAs in plants. Genes Dev 16, 1616-1626.
Rhoades, M. W., Reinhart, B. J., Lim, L. P., Burge, C. B., Bartel, B., and Bartel, D. P. (2002).
Prediction of plant microRNA targets. Cell 110, 513-520.
Ruegger, M., Dewey, E., Gray, W. M., Hobbie, L., Turner, J., and Estelle, M. (1998). The TIR1
protein of Arabidopsis functions in auxin response and is related to human SKP2 and yeast
Grr1p. Genes Dev 12, 198-207.
Samach, A., Klenz, J. E., Kohalmi, S. E., Risseeuw, E., Haughn, G. W., and Crosby, W. L.
(1999). The UNUSUAL FLORAL ORGANS gene of Arabidopsis thaliana is an F-box protein
required for normal patterning and growth in the floral meristem. Plant J 20, 433-445.
Schwarz, D. S., Hutvagner, G., Du, T., Xu, Z., Aronin, N., and Zamore, P. D. (2003).
Asymmetry in the assembly of the RNAi enzyme complex. Cell 115, 199-208.
Stark, A., Brennecke, J., Russell, R. B., and Cohen, S. M. (2003). Identification of Drosophila
MicroRNA Targets. PLoS Biol 1, E60.
Takahashi, H., Yamazaki, M., Sasakura, N., Watanabe, A., Leustek, T., Engler, J. A., Engler, G.,
Van Montagu, M., and Saito, K. (1997). Regulation of sulfur assimilation in higher plants: a
sulfate transporter induced in sulfate-starved roots plays a central role in Arabidopsis thaliana.
Proc Natl Acad Sci U S A 94, 11102-11107.
Tang, G., Reinhart, B. J., Bartel, D. P., and Zamore, P. D. (2003). A biochemical framework for
RNA silencing in plants. Genes Dev 17, 49-63.
Ulmasov, T., Hagen, G., and Guilfoyle, T. J. (1999). Dimerization and DNA binding of auxin
response factors. Plant J 19, 309-319.
Vance, V. B. (1991). Replication of potato virus X RNA is altered in coinfections with potato
virus Y. Virology 182, 486-494.
Vaucheret, H., Vazquez, F., Crete, P., and Bartel, D. P. (2004). The action of ARGONAUTE1 in
the miRNA pathway and its regulation by the miRNA pathway are crucial for plant
development. Genes Dev In Press.
Vazquez, F., Gasciolli, V., Crete, P., and Vaucheret, H. (2004). The nuclear dsRNA binding
protein HYL1 is required for microRNA accumulation and plant development, but not
posttranscriptional transgene silencing. Curr Biol 14, 346-351.
Vierstra, R. D. (2003). The ubiquitin/26S proteasome pathway, the complex last chapter in the
life of many plant proteins. Trends Plant Sci 8, 135-142.
Wilkinson, M. D., and Haughn, G. W. (1995). UNUSUAL FLORAL ORGANS Controls Meristem
Identity and Organ Primordia Fate in Arabidopsis. Plant Cell 7, 1485-1499.
Xie, Q., Frugis, G., Colgan, D., and Chua, N. H. (2000). Arabidopsis NAC1 transduces auxin
signal downstream of TIR1 to promote lateral root development. Genes Dev 14, 3024-3036.
Xie, Z., Johansen, L. K., Gustafson, A. M., Kasschau, K. D., Lellis, A. D., Zilberman, D.,
Jacobsen, S. E., and Carrington, J. C. (2004). Genetic and Functional Diversification of Small
RNA Pathways in Plants. PLoS Biol 2, E104.
Xie, Z., Kasschau, K. D., and Carrington, J. C. (2003). Negative feedback regulation of Dicer-Like1
in Arabidopsis by microRNA-guided mRNA degradation. Curr Biol 13, 784-789.
Yekta, S., Shih, I. H., and Bartel, D. P. (2004). MicroRNA-directed cleavage of HOXB8 mRNA.
Science In press.
MicroRNA-mediated regulation of an F-box gene is required for
embryonic, floral, and vegetative development
Matthew W. Jones-Rhoades1 and David P. Bartel1,2
1Whitehead Institute for Biomedical Research and Department of Biology, Massachusetts
Institute of Technology, 9 Cambridge Center, Cambridge, Massachusetts 02142
2Correspondence: dbartel@wi.mit.edu
Abstract
MicroRNAs are endogenous ~21 nt RNAs that function as post-transcriptional regulators
in both plants and animals. miR394, and conserved miR394-complementary sites in F-box
mRNAs, were previously identified in a bioinformatic screen for unknown miRNAs. Here
we show that miR394-mediated regulation of the F-box gene At1g27340 is required at multiple
stages of Arabidopsis development. Transgenic plants expressing a miR394-resistant
version of At1g27340 display a range of developmental abnormalities, including radialized
and fused cotyledons, absent shoot apical meristems, curled and radialized leaves, and
abortive flowers. The severity of these abnormalities correlates with the overaccumulation
of At1g27340 mRNA, suggesting that an SCF(At1g27340) complex ubiquitinates an activator of
class III HD-ZIP function.
Introduction
MicroRNAs (miRNAs) are endogenous ~21 nucleotide non-coding RNAs that regulate
gene expression in both plants and animals (reviewed in Bartel, 2004). Initially expressed as
single-stranded stem-loop precursor RNAs, miRNAs require the RNase III enzyme DICER-LIKE1
(DCL1), as well as HEN1, HYL1, HST, and AGO1, for proper processing and
accumulation (Park et al., 2002; Reinhart et al., 2002; Boutet et al., 2003; Han et al., 2004;
Vaucheret et al., 2004; Vazquez et al., 2004; Park et al., 2005). Many miRNAs isolated from
Arabidopsis are conserved in Oryza (rice) and other plant species (Reinhart et al., 2002; Floyd
and Bowman, 2004; Jones-Rhoades and Bartel, 2004; Sunkar and Zhu, 2004; Axtell and Bartel,
2005), suggesting that miRNAs have evolutionarily conserved roles in land plants. Regulatory
targets have been confidently predicted for most Arabidopsis miRNAs based on the high degree
of complementarity between the miRNAs and their target mRNAs (Rhoades et al., 2002;
Jones-Rhoades and Bartel, 2004). Like the miRNAs themselves, many of these miRNA target sites are
broadly conserved in plant species (Rhoades et al., 2002; Jones-Rhoades and Bartel, 2004).
Perhaps because of the extensive complementarity of plant miRNA-target duplexes, most
Arabidopsis miRNAs guide the cleavage of target mRNAs (Llave et al., 2002; Kasschau et al.,
2003; Tang et al., 2003).
Several lines of evidence indicate that plant miRNAs play key roles in a broad range of
developmental processes. Plants with dcl1, hen1, ago1, hst, or hyl1 mutations have severe and
pleiotropic developmental abnormalities (Bohmert et al., 1998; Telfer and Poethig, 1998; Lu and
Fedoroff, 2000; Chen et al., 2002; Morel et al., 2002; Schauer et al., 2002) that correlate with
the impairment of miRNA activity (Park et al., 2002; Reinhart et al., 2002; Boutet et al., 2003;
Han et al., 2004; Vaucheret et al., 2004; Vazquez et al., 2004; Park et al., 2005), as do plants
that express certain viral suppressors of RNA-mediated silencing (Mallory et al., 2002;
Kasschau et al., 2003; Chapman et al., 2004; Chen et al., 2004; Dunoyer et al., 2004). The
majority of confirmed and predicted evolutionarily conserved miRNA targets are mRNAs that
encode transcription factors and other regulatory proteins, such as F-box proteins and
components of the miRNA pathway itself (Jones-Rhoades and Bartel, 2004). Plants with
impaired miRNA-mediated regulation of particular transcription factor mRNAs have been shown
to have various developmental phenotypes. For example, plants expressing miR166-resistant
versions of the HD-ZIP genes PHABULOSA, PHAVOLUTA, and REVOLUTA have radialized leaves
or vasculature (Emery et al., 2003; Kidner and Martienssen, 2004; Mallory et al., 2004b; Zhong
and Ye, 2004), and miRNA-resistant copies of certain TCP or ARF transcription factors result in
seedlings that arrest or that have extra cotyledons, respectively (Palatnik et al., 2003; Mallory et
al., 2005).
A recent bioinformatic screen for conserved plant miRNAs and targets identified two
miRNA families that guide the cleavage of mRNAs that encode F-box proteins (Jones-Rhoades
and Bartel, 2004). F-box proteins are specificity determinants of SCF E3 ubiquitin
ligases, which facilitate the transfer of ubiquitin from E2 ubiquitin-conjugating proteins to
specific target proteins, thereby marking them for degradation by the 26S proteasome (reviewed
in Deshaies, 1999; Smalle and Vierstra, 2004). In Saccharomyces cerevisiae, SCF complexes
are composed of four primary subunits: Cullin1, Rbx1, and Skp1 are thought to comprise the
core ubiquitin ligase activity, and an F-box protein is thought to serve as a bridge between the
SCF complex and the target protein (Deshaies, 1999; Smalle and Vierstra, 2004). The ~60
amino acid N-terminal F-box domain interacts with the rest of the SCF complex, and the
C-terminal portion, which is highly divergent between different F-box proteins, is thought to
interact with the target protein and thus confer specificity of ubiquitination (Zheng et al., 2002;
Willems et al., 2004).
The Arabidopsis genome encodes nearly 700 F-box proteins (Gagne et al., 2002), several
of which have been shown to be important for diverse aspects of plant biology such as hormone
signaling, response to the environment, and developmental patterning. TRANSPORT
INHIBITOR RESPONSE1 (TIR1) targets AUX/IAA proteins for degradation in an auxin-dependent
manner, and is needed for auxin-induced developmental processes (Ruegger et al.,
1998; Gray et al., 2001). The F-box proteins EBF1/EBF2, GID2, and COI1 mediate ethylene,
gibberellin, and jasmonate signaling, respectively (Xie et al., 1998; Guo and Ecker, 2003;
Potuschak et al., 2003; Sasaki et al., 2003; Gagne et al., 2004). UNUSUAL FLORAL ORGANS
(UFO) is required for proper floral development (Wilkinson and Haughn, 1995; Samach et al.,
1999), and ORE1 regulates leaf senescence and axillary shoot growth (Woo et al., 2001;
Stirnberg et al., 2002). However, the majority of Arabidopsis F-box genes have no known
function. It is likely that many of these F-box proteins (or subclades of F-box proteins) each
target specific proteins for ubiquitination and proteolysis.
Here we show that miR394-mediated regulation of At1g27340, an F-box gene related to
UFO, is required for proper development. Seedlings expressing 5mAt1g27340, a miR394-resistant
version of At1g27340, frequently arrest without forming shoot apical meristems
(SAMs), often with fused and/or radialized cotyledons. 5mAt1g27340-expressing plants that do
form SAMs have pleiotropic defects in vegetative and floral development, including downwardly
curled leaves and abortive flowers. These developmental abnormalities correlate with the
overaccumulation of At1g27340 mRNA, suggesting that they are the result of overexpression of
an At1g27340-directed SCF ubiquitin ligase.
Results
At1g27340 defines a conserved class of miR394-regulated F-box genes with homology to UFO
In a phylogenetic tree of 694 F-box genes, At1g27340 falls in a subclade of five genes
that contains UFO (Gagne et al., 2002). Although At1g27340 is the second-best blastp hit to
UFO in the Arabidopsis genome (E value 6.7e-21), the two proteins have only ~30% similarity at
the amino acid level. UFO is unlikely to be regulated by miR394: whereas miR394 can pair to
At1g27340 at 19 out of 20 nucleotides, only 12 out of 20 miR394 nucleotides can pair to the
corresponding section of UFO (Figure 1a).
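
The pairing counts quoted above (19 of 20 versus 12 of 20) are counts over an ungapped, antiparallel alignment of the miRNA to a candidate site. A minimal sketch of such a count is given below. The 20-nt sequences are hypothetical toys, not miR394 or At1g27340, and the function names are illustrative rather than taken from the original analysis.

```python
# Watson-Crick partners for RNA bases, plus the G:U wobble pairs
# that can optionally be counted as paired.
WC_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE_PAIRS = {("G", "U"), ("U", "G")}
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def count_paired(mirna: str, site: str, allow_wobble: bool = False) -> int:
    """Count paired positions in an ungapped miRNA:site duplex.

    Both sequences are written 5'->3'; the site is reversed so that each
    miRNA position faces its antiparallel partner.
    """
    if len(mirna) != len(site):
        raise ValueError("an ungapped alignment needs equal-length sequences")
    allowed = (WC_PAIRS | WOBBLE_PAIRS) if allow_wobble else WC_PAIRS
    return sum((m, s) in allowed for m, s in zip(mirna, reversed(site)))


# Hypothetical 20-nt miRNA and three hypothetical target sites.
toy_mirna = "AUGGCUAGCUUACGGAUCCA"
perfect_site = reverse_complement(toy_mirna)
# Replace the base opposite the miRNA 3'-terminal A (U -> C): an A:C mismatch.
mismatch_site = "C" + perfect_site[1:]
# Replace the base opposite miRNA position 3 (a G) with U: a G:U wobble.
wobble_site = perfect_site[:17] + "U" + perfect_site[18:]

print(count_paired(toy_mirna, perfect_site))
print(count_paired(toy_mirna, mismatch_site))
print(count_paired(toy_mirna, wobble_site, allow_wobble=True))
```

Whether G:U wobbles count as paired is a design choice; the `allow_wobble` flag keeps both conventions available so that a reported count can be reproduced under either one.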
Although the similarity between At1g27340 and UFO is limited, F-box genes in other
plant species are highly similar to At1g27340 and contain conserved miR394 complementary
sites. Two Populus F-box proteins and one Oryza F-box protein have At1g27340 as their best
Arabidopsis blastp hit (Figure 1a). All three of these proteins are at least 75% similar to At1g27340 at
the amino acid level, including extensive identity in the C-terminal region that is likely to specify
substrate recognition, and miR394 can pair to the mRNA encoding each with 0-1 unpaired
nucleotides (Figure 1a). In addition to these At1g27340-like genes in plants with sequenced
genomes, numerous plant species have ESTs that (a) have At1g27340 as their best Arabidopsis
blastx hit and (b) can pair to miR394 with 0-1 mismatches (Figure 1). These miR394-complementary,
At1g27340-like ESTs are found in both monocots and dicots, as well as in
conifers (genus Picea). This conservation implies that the divergence of At1g27340 from UFO
and the regulation of At1g27340-like genes by miR394 predate the divergence of gymnosperms
and angiosperms.
miR394 regulation of At1g27340 is required for normal development
In vivo miR394-directed cleavage of At1g27340 can be detected by 5' RACE (Jones-Rhoades
and Bartel, 2004). To investigate the biological significance of miR394-mediated
regulation, we constructed a mutant version of At1g27340 with reduced complementarity to
miR394. This 5mAt1g27340 construct encodes the same amino acid sequence as At1g27340
but has five silent mutations within the miR394 complementary site, and contains 1.6 kb of
putative promoter sequence upstream of At1g27340 (Figure 1b). We transformed Arabidopsis
thaliana separately with this 5mAt1g27340 construct and with an unmutated At1g27340
control construct. Only 1 out of 91 control At1g27340 primary transformants (~1%) had any
developmental abnormalities (small outgrowths from the midveins of a few cauline leaves on
one plant). In contrast, 65 of 105 5mAt1g27340 primary transformants (62%) displayed various
vegetative and floral phenotypes (Table 1). Most noticeably, 51 5mAt1g27340 transformants
(49%) had moderately to severely downwardly curled rosette leaves (Figure 2a,b). Fifty-one
5mAt1g27340 transformants (49%) also had cauline leaf abnormalities. Most commonly, cauline
leaves had a spiked outgrowth protruding from the abaxial midvein (Figure 2b,c2). In other
cases, the entire cauline leaf was replaced by a radialized, spiked structure (Figure 2c3-6). In
some cases, these radialized cauline leaves subtended approximately wild-type axillary
inflorescences (Figure 2c6), whereas in other cases radialized cauline leaves subtended axillary
inflorescences that themselves produced aberrant cauline leaves and flowers (Figure 2c3,4). The
number and severity of abnormal cauline leaves generally correlated with the extent of rosette
leaf curling.
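
The design of the miRNA-resistant construct described above rests on two properties holding at once: the mutant site must encode the same amino acids, and it must differ from the wild-type site at the intended number of nucleotides. A minimal sketch of that check follows. The 18-nt in-frame sequences are hypothetical toys, not the actual At1g27340 site, and the helper names are illustrative.

```python
from itertools import product

# Standard genetic code, built in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame coding-strand DNA sequence."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def count_differences(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


# Hypothetical in-frame site and a version with five third-position changes.
toy_wild_type = "CTTGGTTCTCGTGCTGAA"
toy_mutant = "CTCGGATCACGAGCAGAA"

print(translate(toy_wild_type), translate(toy_mutant))
print(count_differences(toy_wild_type, toy_mutant))
```

In this toy case every change falls on a codon third position, where synonymous substitutions are most often available; a real design would also need to confirm that the changes actually reduce pairing to the miRNA.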
5mAt1g27340 transformants also exhibited various floral abnormalities. Most of the
flowers that were produced had the expected numbers of organs and were fertile, although
flowers on 13 of the plants (12%) with stronger phenotypes were generally missing 1-4 petals
and had reduced fertility. Twenty-seven 5mAt1g27340 transformants (26%) sporadically
produced abortive flowers that consisted of only a filamentous structure in some cases (Figure 2d
inset, Figure 2e), whereas in other cases flowers consisted of two sepals without any other floral
organs (Figure 2e). The percentage of abortive flowers produced per plant varied from ~1% to
~40%. In most cases, a single inflorescence would alternate between producing fertile and
abortive flowers in a seemingly stochastic pattern (Figure 2d,e). In extreme cases, inflorescences
of 5mAt1g27340 plants produced a proliferation of determinate filaments in place of floral buds
(Figure 2e). Shoots of 5mAt1g27340-expressing plants often had a seemingly stochastic
phyllotaxy of maturing siliques, with the locations of the missing siliques marked by abortive
filaments or empty flowers (Figure 2d).
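
The percentages reported for primary transformants can be recomputed directly from the counts given in the text (91 control and 105 5mAt1g27340 transformants). The short sketch below does only that arithmetic; the dictionary keys are illustrative labels, not terms from the original tables.

```python
def percent(count: int, total: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * count / total)


CONTROL_TOTAL = 91
MUTANT_TOTAL = 105

# Counts of primary transformants showing each phenotype, as stated in the text.
mutant_counts = {
    "any phenotype": 65,
    "curled rosette leaves": 51,
    "cauline leaf abnormalities": 51,
    "missing petals": 13,
    "abortive flowers": 27,
}

control_pct = percent(1, CONTROL_TOTAL)
mutant_pct = {name: percent(n, MUTANT_TOTAL) for name, n in mutant_counts.items()}

print("control, any abnormality:", control_pct)
for name, value in mutant_pct.items():
    print(f"5mAt1g27340, {name}: {value}")
```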
Approximately 10% of 5mAt1g27340 T1 transformants failed to develop a SAM and
never formed any true leaves. Analysis of T2 seeds for several 5mAt1g27340 lines revealed that
seedling arrest occurred in 0-55% of T2 seedlings, with the percentage of arrested seedlings
correlating with the severity of the T1 phenotype, whereas the remainder of Basta-resistant
seedlings did form SAMs and recapitulated the vegetative and floral abnormalities observed in
their T1 parents (Table 2). The arrested seedlings displayed a range of different phenotypes
(Figure 3a,b). Some seedlings had only one cotyledon (Figure 3b1,4), whereas others had two
(Figure 3b2,3). In some cases the cotyledons were radialized (Figure 3b1,2), whereas in other
cases seedlings had cotyledons that approached wild-type size and shape but did not form
functional SAMs (Figure 3b3,4). In some of these cases, one or two determinate, spike-like
structures eventually emerged from the region where the SAM should have been (data not
shown).
5mAt1g27340 plants overaccumulate At1g27340 mRNA
Many plant miRNAs guide the cleavage of target mRNAs. Because of this, mRNAs
targeted by miRNAs generally overaccumulate in plants impaired in miRNA function, and the
expression of a miRNA-resistant version of a miRNA target can similarly result in
overaccumulation of the miRNA-resistant mRNA. We find that this is the case with
5mAt1g27340-expressing plants; normalized At1g27340 mRNA levels are 1.7-, 2.8-, and 2.2-fold
higher in leaves, inflorescences, and seedlings, respectively, in T2 5mAt1g27340 plants
compared to control At1g27340 plants (Figure 3b). At1g27340 mRNA levels are highest in
5mAt1g27340 T2 seedlings that lack SAMs; these arrested seedlings accumulate At1g27340
transcripts at levels 1.8-fold higher than 5mAt1g27340 seedlings with functional SAMs and
3.9-fold higher than control At1g27340 seedlings.
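
The two ratios reported for arrested seedlings imply a third: dividing the fold change relative to control by the fold change relative to SAM-forming 5mAt1g27340 seedlings gives the level of SAM-forming seedlings relative to control. The sketch below works through that arithmetic, under the assumption (not stated in the text) that all three ratios were normalized in the same way; variable names are illustrative.

```python
# Fold changes in At1g27340 mRNA for arrested 5mAt1g27340 seedlings,
# as reported in the text.
arrested_vs_control = 3.9
arrested_vs_sam_forming = 1.8

# Implied fold change for SAM-forming 5mAt1g27340 seedlings over control:
# (arrested / control) / (arrested / SAM-forming) = SAM-forming / control.
sam_forming_vs_control = arrested_vs_control / arrested_vs_sam_forming

print(round(sam_forming_vs_control, 1))
```

The implied value rounds to the 2.2-fold increase reported for 5mAt1g27340 seedlings overall, so the three figures are mutually consistent if that seedling measurement reflects mainly SAM-forming seedlings.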
Discussion
We find that expression of a miR394-resistant version of At1g27340 has broad-ranging
effects on Arabidopsis development, whereas expression of an additional wild-type copy does
not. These results confirm the biological relevance of the interaction between miR394 and
At1g27340, and represent the first insights into the roles of miRNA-mediated regulation of F-box
genes. Our finding that At1g27340 mRNA levels are increased in plants expressing
5mAt1g27340 is consistent with the idea that miR394 exerts its influence over At1g27340
primarily through guided RNA cleavage. Indeed, the extent of developmental abnormalities
correlates with the level of At1g27340 mRNA, in that At1g27340 transcript levels are highest in
seedlings that fail to develop shoot apical meristems.
The Arabidopsis shoot apical meristem is a small group of pluripotent cells that gives
rise to all aerial tissues and organs (reviewed in Baurle and Laux, 2003). The proper initiation
and maintenance of SAM pluripotency requires a complex interplay of gene interactions, and
is critical to all stages of vegetative and floral development. SHOOTMERISTEMLESS (STM)
and WUSCHEL (WUS), which encode homeodomain transcription factors, act in parallel to
initiate and maintain meristem identity (Endrizzi et al., 1996; Laux et al., 1996; Long et al.,
1996; Mayer et al., 1998). The embryonic expression of STM, and hence the embryonic
establishment of SAM identity, is dependent on the proper development of the cotyledons.
Embryos with double homozygous mutations in the NAC domain transcription factors
CUP-SHAPED COTYLEDONS1 (CUC1) and CUC2 have fused cotyledons and fail to initiate STM
expression during embryogenesis (Aida et al., 1997; Aida et al., 1999). Similarly, the correct
balance between the antagonistic activities of class III HD-ZIP and KANADI transcription
factors is essential for proper cotyledon development and SAM formation. Seedlings that are
either homozygous for loss-of-function mutations in three partially redundant HD-ZIP genes
(phb/phv/rev), or that overexpress KANADI genes, have one or two radialized cotyledons and
fail to initiate SAMs (Eshed et al., 2001; Kerstetter et al., 2001; Emery et al., 2003).
Our results establish that both MIR394 and At1g27340 are also important regulators of
meristem identity. MIR394 is expressed highly in inflorescences (Jones-Rhoades and Bartel,
2004), and the relative increase of At1g27340 mRNA in 5mAt1g27340 plants was greatest in
inflorescences. This increase in At1g27340 mRNA levels is likely associated with the
overexpression of At1g27340 protein and/or the accumulation of At1g27340 protein in cells in
which miR394 would block wild-type At1g27340 expression. If At1g27340 functions in SCF E3
ubiquitin ligases as do other F-box proteins, then the observed 5mAt1g27340 phenotypes are
likely to be the result of increased ubiquitination and proteolysis of unknown factors targeted by
the putative SCF(At1g27340) ubiquitin ligase. Because many 5mAt1g27340 seedlings have abnormal
cotyledons, the target of SCF(At1g27340) is likely to be upstream of STM, which is dispensable for
cotyledon development (Endrizzi et al., 1996; Long et al., 1996). The 5mAt1g27340 seedlings
with one or two fused cotyledons are reminiscent of homozygous phb/phv/rev triple mutants and
KANADI overexpressors (Eshed et al., 2001; Kerstetter et al., 2001; Emery et al., 2003),
suggesting that the targets of SCF(At1g27340) may be activators of HD-ZIP activity or repressors
of KANADI function. Because these genes are also important for the proper initiation and
patterning of lateral organs and meristems post-embryonically, the vegetative and floral phenotypes
observed in 5mAt1g27340 plants might also be related to a misregulation of HD-ZIP or
KANADI activities. Indeed, plants homozygous for loss-of-function alleles of multiple class III
HD-ZIP genes sporadically initiate abortive flowers (Prigge et al., 2005) in a manner reminiscent
of 5mAt1g27340-expressing plants.
Experimental Procedures
DNA constructs and transgenic plants
BAC clone F17L21 was digested with SpeI and NsiI to yield a 5.1 kb fragment
containing At1g27340, as well as 1.6 kb of upstream sequence and 0.9 kb of downstream sequence,
which was ligated into SpeI- and PstI-cut pBluescriptIISK+ (Stratagene). Site-directed
mutagenesis was performed by PCR with PfuUltra polymerase and the primers
GCACCATATGTTCGGCATGCGATCAACTTCCTTCCACAACAGTGT and
ACACTGTTGTGGAAGGAAGTTGATCGCATGCCGAACATATGGTGC, followed by DpnI
digestion. Following mutagenesis, a 2.5 kb HindIII-BamHI fragment of the original At1g27340
clone was replaced with the corresponding fragment containing the mutagenized miR394
complementary site, which was sequenced to ensure that no additional mutations had occurred
during PCR. Wild-type and mutant At1g27340 5.1 kb SpeI-HindIII fragments were subcloned
into the binary vector pGreenII0229 and electroporated into Agrobacterium tumefaciens strain
GV3101::pMP90. Arabidopsis thaliana (Columbia accession) was transformed by the floral dip
method (Clough and Bent, 1998), and the collected seeds were surface sterilized and plated on
Bouterage No.2 media (Duchefa Biochemie) containing 10 µg/ml Basta. Seedlings were grown
under long day conditions (20° C, 16 hr light, 8 hr dark) for about 10 days before transfer to soil
consisting of 50% promix (Premier Horticulture) and 50% redi-earth (Scotts).
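The two mutagenesis primers listed above appear to be exact reverse complements of one another, as is usual for PCR-based site-directed mutagenesis on a circular template. The following minimal sketch checks that property; the function and variable names are illustrative and are not taken from the original work.

```python
# Check that the two site-directed mutagenesis primers are reverse complements.
# Primer sequences are copied from the text; all names here are illustrative.

DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(dna))


FORWARD_PRIMER = "GCACCATATGTTCGGCATGCGATCAACTTCCTTCCACAACAGTGT"
REVERSE_PRIMER = "ACACTGTTGTGGAAGGAAGTTGATCGCATGCCGAACATATGGTGC"

primers_are_complementary = reverse_complement(FORWARD_PRIMER) == REVERSE_PRIMER

if __name__ == "__main__":
    print("primer length:", len(FORWARD_PRIMER))
    print("reverse complements of each other:", primers_are_complementary)
```

A check of this kind is a cheap way to catch transcription errors in primer sequences before ordering oligonucleotides.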
RNA Isolation and Northern blot analysis
Total RNA was isolated as described (Mallory et al., 2001). For mRNA northerns, 12 µg
of total RNA was size fractionated on a 1% agarose/formaldehyde gel and transferred to a
nitrocellulose membrane as described (Mallory et al., 2005). 1.4 kb of exon 2 of At1g27340 was
PCR amplified with primers AGTCTCTAGAATGGTGTTGCCCTGTATTGAGGA and
CAGTAAGCTTAAGAGGTTCCACACAACCCA, and directionally cloned into
pBluescriptIISK+ (Stratagene). Following XbaI digestion, this template was used to generate
At1g27340 antisense RNA probe by T7 transcription in the presence of α-32P UTP. Blots were
hybridized at 68° C in Ultrahyb buffer overnight, and washed successively with 2X SSC, 0.1%
SDS (two times) and 0.1X SSC, 0.1% SDS (two times). For miRNA northerns, 30 µg total RNA
was fractionated on a 15% polyacrylamide gel, transferred to a nitrocellulose membrane,
hybridized, and washed as described, using the 5' 32P-labeled DNA oligo
AGGAGGTGGACAGAATGCCAA as a probe for miR394.
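As a sanity check on the oligonucleotides listed above, the sketch below derives the RNA sequence that the DNA probe would base-pair with, and locates the XbaI (TCTAGA) and HindIII (AAGCTT) recognition sequences in the two PCR primers. That the primer tails carry exactly these two sites is an inference from the primer sequences and the mention of XbaI digestion, not a statement made in the text; all names are illustrative.

```python
# Derive the RNA detected by the DNA oligo probe and locate restriction sites
# in the cloning primers. Sequences are copied from the text.

DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

MIR394_PROBE = "AGGAGGTGGACAGAATGCCAA"
EXON2_FORWARD = "AGTCTCTAGAATGGTGTTGCCCTGTATTGAGGA"
EXON2_REVERSE = "CAGTAAGCTTAAGAGGTTCCACACAACCCA"

XBAI_SITE = "TCTAGA"     # standard XbaI recognition sequence
HINDIII_SITE = "AAGCTT"  # standard HindIII recognition sequence


def probe_target_rna(probe_dna: str) -> str:
    """RNA (5' to 3') that is perfectly complementary to a DNA probe."""
    rev_comp = "".join(DNA_COMPLEMENT[b] for b in reversed(probe_dna))
    return rev_comp.replace("T", "U")


detected_rna = probe_target_rna(MIR394_PROBE)
xbai_position = EXON2_FORWARD.find(XBAI_SITE)        # 0-based index, -1 if absent
hindiii_position = EXON2_REVERSE.find(HINDIII_SITE)  # 0-based index, -1 if absent

if __name__ == "__main__":
    print("RNA complementary to probe:", detected_rna)
    print("XbaI site offset in forward primer:", xbai_position)
    print("HindIII site offset in reverse primer:", hindiii_position)
```

The derived RNA is 21 nucleotides long, matching the length of the probe.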
Scanning Electron Microscopy
Plant tissues were fixed, dehydrated, critical point dried, and coated with gold and
palladium as described (Mallory et al., 2004a). Samples were imaged on a Jeol 5600LV
scanning electron microscope.
Table 1. Observed phenotypes of T1 transformant plants

phenotype                    At1g27340    5mAt1g27340
total                        91           105
wild type development        90 (99%)     40 (38%)
abnormal development         1 (1%)       65 (62%)
curled rosette leaves        0 (0%)       51 (49%)
spikes on cauline leaves     1 (1%)       51 (49%)
radialized cauline leaves    1 (1%)       23 (22%)
missing petals               0 (0%)       13 (12%)
abortive flowers             0 (0%)       27 (26%)
no SAM                       0 (0%)       11 (10%)

The number and percentage of T1 plants with various developmental abnormalities are indicated. See text for details.
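The percentages in Table 1 appear to be each count divided by the total number of T1 plants for that construct, rounded to the nearest whole percent. The sketch below recomputes them for the 5mAt1g27340 column; the dictionary layout and names are illustrative, and the assignment of counts to phenotype columns follows the order in which they are printed.

```python
# Recompute Table 1 percentages from the printed counts (5mAt1g27340 column).
# The mapping of counts to phenotypes assumes the printed column order.

T1_TOTAL_5M = 105

T1_COUNTS_5M = {
    "wild type development": 40,
    "abnormal development": 65,
    "curled rosette leaves": 51,
    "spikes on cauline leaves": 51,
    "radialized cauline leaves": 23,
    "missing petals": 13,
    "abortive flowers": 27,
    "no SAM": 11,
}


def percent(count: int, total: int) -> int:
    """Percentage of total, rounded to the nearest integer."""
    return round(100 * count / total)


t1_percentages_5m = {
    phenotype: percent(count, T1_TOTAL_5M)
    for phenotype, count in T1_COUNTS_5M.items()
}

if __name__ == "__main__":
    for phenotype, pct in t1_percentages_5m.items():
        print(f"{phenotype}: {T1_COUNTS_5M[phenotype]} ({pct}%)")
```

Because a single plant can show several abnormalities, the individual phenotype counts are not expected to sum to the number of abnormal plants.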
Table 2. Observed phenotypes of 5mAt1g27340 T2 transformant plants

Line              total   Basta       no SAM   like T1   T1
                          sensitive                      phenotype
5mAt1g27340-16    36      27%         56%      18%       severe
5mAt1g27340-1     36      23%         33%      44%       strong
5mAt1g27340-23    37      30%         32%      38%       strong
5mAt1g27340-6     66      21%         30%      49%       severe
5mAt1g27340-30    45      27%         26%      47%       severe
5mAt1g27340-44    95      25%         25%      50%       severe
5mAt1g27340-33    32      25%         20%      55%       strong
5mAt1g27340-3     33      20%         9%       71%       mild
5mAt1g27340-18    43      36%         6%       58%       mild
5mAt1g27340-24    22      24%         3%       73%       slight
5mAt1g27340-27    35      23%         0%       61%       slight

The observed frequencies of T2 phenotypes for 5mAt1g27340 lines are indicated, as is
the severity of developmental defects observed in the T1 parent of each line.
Figure Legends
Figure 1. At1g27340 is complementary to miR394.
(A) miR394 complementary sites in F-box genes from different plant genera are depicted.
Nucleotides that can form Watson-Crick base pairs with miR394 are in upper case and
highlighted, whereas nucleotides that are mismatched or can form G:U wobble pairs are in
lower case. For each F-box gene, the At1g27340 blastp (for Arabidopsis, Oryza, and Populus
proteins) or blastx (for ESTs from other genera) E value and rank (out of all Arabidopsis
proteins) are indicated. (B) The At1g27340 genomic clone used to transform Arabidopsis is
depicted. Intergenic regions are shown as solid lines, UTR sequence as shaded boxes, coding
sequence as open boxes, and intronic sequence as a dashed line. The restriction sites used to
isolate the genomic clone from BAC F17L21 are indicated. Within the At1g27340 coding
region, the positions of the F-box domain ("F") and miR394 complementary site ("*") are shown.
The amino acid sequence, nucleotide sequence, and miR394-complementarity of the wild-type
and mutated miR394 complementary sites are shown.
Figure 2. Vegetative and floral phenotypes of 5mAt1g27340 plants.
(A) Three-week-old wild-type plant with broad, flat rosette leaves and T1 5mAt1g27340 plant
with downwardly curled rosette leaves. (B) Close-up views of flat wild-type rosette and cauline
leaves and curled T1 5mAt1g27340 rosette and cauline leaves (right). 5mAt1g27340 has a spiked
outgrowth from the abaxial midvein (arrow). (C) Control T2 At1g27340 (1) and 5mAt1g27340
cauline leaves and axillary shoots (2-6). (D) Shoots of T2 control At1g27340 and 5mAt1g27340
plants. At right are close-up views of inflorescences showing reduction in silique number in
5mAt1g27340 plants. The inset shows the presence of filaments on 5mAt1g27340 shoots where
phyllotaxy suggests siliques should be. (E) Inflorescences of control At1g27340 containing
flowers in various developmental stages and inflorescences of 5mAt1g27340 plants containing
numerous abortive filaments, a few empty flowers (arrows), as well as some reproductively
functional flowers.
Figure 3. Seedling phenotypes of 5mAt1g27340 plants.
(A) Six-day-old control T2 At1g27340 seedlings have the first pair of true leaves emerging from
the shoot apical meristem. (B) Some T2 5mAt1g27340 seedlings display a variety of
developmental abnormalities, including having one radialized cotyledon (1), two radialized
cotyledons (2), one flat cotyledon (3), and two flat cotyledons but no apparent true leaves (4).
(C) At1g27340 mRNA overaccumulates in 5mAt1g27340 plants. 12 µg of total RNA from
control At1g27340 and 5mAt1g27340 rosette leaves (L), inflorescences (Inf), and seedlings (Se,
SAM-) was analyzed by Northern blot using a body-labeled RNA probe complementary to most
of exon 2. For 5mAt1g27340, RNA was isolated separately from seedlings with (Se) and without
(SAM-) evident shoot apical meristems. The levels of At1g27340 mRNA were quantified
relative to the ethidium bromide staining of the 25S ribosomal RNA.
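The quantification described in this legend can be turned into per-tissue fold changes. The sketch below uses the seven At1g27340/25S ratios printed beneath the blot in Figure 3 (3.3, 1.0, 2.0, 5.6, 2.8, 4.3, 7.8) and assumes they follow the lane order L, Inf, Se for control At1g27340 and L, Inf, Se, SAM- for 5mAt1g27340. That lane assignment is an assumption made for illustration, and comparing the SAM- lane with control seedlings is likewise only one possible choice.

```python
# Fold change of At1g27340 mRNA (normalized to 25S rRNA) in 5mAt1g27340 plants
# relative to control plants. Lane assignment of the ratios is an assumption.

CONTROL_RATIOS = {"L": 3.3, "Inf": 1.0, "Se": 2.0}
MUTANT_RATIOS = {"L": 5.6, "Inf": 2.8, "Se": 4.3, "SAM-": 7.8}

# SAM- seedlings have no separate control lane; compare them with control seedlings.
CONTROL_FOR = {"L": "L", "Inf": "Inf", "Se": "Se", "SAM-": "Se"}


def fold_changes(control: dict, mutant: dict, control_for: dict) -> dict:
    """Return mutant/control ratio for each mutant sample."""
    return {
        sample: value / control[control_for[sample]]
        for sample, value in mutant.items()
    }


fold = fold_changes(CONTROL_RATIOS, MUTANT_RATIOS, CONTROL_FOR)

# Tissue with the largest relative increase among those with a matched control.
matched = {tissue: fold[tissue] for tissue in CONTROL_RATIOS}
largest_increase = max(matched, key=matched.get)

if __name__ == "__main__":
    for sample, value in fold.items():
        print(f"{sample}: {value:.2f}-fold")
    print("largest relative increase among matched tissues:", largest_increase)
```

Under these assumptions the matched tissue with the largest relative increase is the inflorescence, in line with the statement in the Discussion.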
[Figure 1. (A) Alignment of miR394 complementary sites in F-box genes from Arabidopsis, Populus, Oryza, and ESTs from other genera, with blast E values and ranks. (B) Map of the At1g27340 genomic clone (SpeI-NsiI fragment with 1.6 kb of 5' and 0.9 kb of 3' flanking sequence) and the wild-type and 5m mutant miR394 complementary sites. Image not reproduced.]
[Figure 2. Photographs of Col, control At1g27340, and 5mAt1g27340 plants. Image not reproduced.]
[Figure 3. Seedling images and Northern blots probed for At1g27340 mRNA and miR394, with U6 and 25S loading controls. Lanes: L, Inf, Se (At1g27340); L, Inf, Se, SAM- (5mAt1g27340). At1g27340/25S values printed beneath the lanes: 3.3, 1.0, 2.0, 5.6, 2.8, 4.3, 7.8. Image not reproduced.]
Aida, M., Ishida, T., and Tasaka, M. (1999). Shoot apical meristem and cotyledon formation
during Arabidopsis embryogenesis: interaction among the CUP-SHAPED COTYLEDON
and SHOOT MERISTEMLESS genes. Development 126, 1563-1570.
Aida, M., Ishida, T., Fukaki, H., Fujisawa, H., and Tasaka, M. (1997). Genes involved in
organ separation in Arabidopsis: an analysis of the cup-shaped cotyledon mutant. Plant
Cell 9, 841-857.
Axtell, M.J., and Bartel, D.P. (2005). Antiquity of MicroRNAs and Their Targets in Land
Plants. Plant Cell 17, 666-99999.
Bartel, D.P. (2004). MicroRNAs: genomics, biogenesis, mechanism, and function. Cell 116,
281-297.
Baurle, I., and Laux, T. (2003). Apical meristems: the plant's fountain of youth. Bioessays 25,
961-970.
Bohmert, K., Camus, I., Bellini, C., Bouchez, D., Caboche, M., and Benning, C. (1998).
AGO1 defines a novel locus of Arabidopsis controlling leaf development. Embo J 17,
170-180.
Boutet, S., Vazquez, F., Liu, J., Beclin, C., Fagard, M., Gratias, A., Morel, J.B., Crete, P.,
Chen, X., and Vaucheret, H. (2003). Arabidopsis HEN1: a genetic link between
endogenous miRNA controlling development and siRNA controlling transgene silencing
and virus resistance. Curr Biol 13, 843-848.
Chapman, E.J., Prokhnevsky, A.I., Gopinath, K., Dolja, V.V., and Carrington, J.C. (2004).
Viral RNA silencing suppressors inhibit the microRNA pathway at an intermediate step.
Genes Dev 18, 1179-1186.
Chen, J., Li, W.X., Xie, D., Peng, J.R., and Ding, S.W. (2004). Viral virulence protein
suppresses RNA silencing-mediated defense but upregulates the role of microrna in host
gene expression. Plant Cell 16, 1302-1313.
Chen, X., Liu, J., Cheng, Y., and Jia, D. (2002). HEN1 functions pleiotropically in Arabidopsis
development and acts in C function in the flower. Development 129, 1085-1094.
Clough, S.J., and Bent, A.F. (1998). Floral dip: a simplified method for Agrobacterium-mediated
transformation of Arabidopsis thaliana. Plant J 16, 735-743.
Deshaies, R.J. (1999). SCF and Cullin/Ring H2-based ubiquitin ligases. Annu Rev Cell Dev
Biol 15, 435-467.
Dunoyer, P., Lecellier, C.H., Parizotto, E.A., Himber, C., and Voinnet, O. (2004). Probing
the microRNA and small interfering RNA pathways with virus-encoded suppressors of
RNA silencing. Plant Cell 16, 1235-1250.
Emery, J.F., Floyd, S.K., Alvarez, J., Eshed, Y., Hawker, N.P., Izhaki, A., Baum, S.F., and
Bowman, J.L. (2003). Radial patterning of Arabidopsis shoots by class III HD-ZIP and
KANADI genes. Curr Biol 13, 1768-1774.
Endrizzi, K., Moussian, B., Haecker, A., Levin, J.Z., and Laux, T. (1996). The SHOOT
MERISTEMLESS gene is required for maintenance of undifferentiated cells in
Arabidopsis shoot and floral meristems and acts at a different regulatory level than the
meristem genes WUSCHEL and ZWILLE. Plant J 10, 967-979.
Eshed, Y., Baum, S.F., Perea, J.V., and Bowman, J.L. (2001). Establishment of polarity in
lateral organs of plants. Curr Biol 11, 1251-1260.
Floyd, S.K., and Bowman, J.L. (2004). Gene regulation: ancient microRNA target sequences in
plants. Nature 428, 485-486.
Gagne, J.M., Downes, B.P., Shiu, S.H., Durski, A.M., and Vierstra, R.D. (2002). The F-box
subunit of the SCF E3 complex is encoded by a diverse superfamily of genes in
Arabidopsis. Proc Natl Acad Sci U S A 99, 11519-11524.
Gagne, J.M., Smalle, J., Gingerich, D.J., Walker, J.M., Yoo, S.D., Yanagisawa, S., and
Vierstra, R.D. (2004). Arabidopsis EIN3-binding F-box 1 and 2 form ubiquitin-protein
ligases that repress ethylene action and promote growth by directing EIN3 degradation.
Proc Natl Acad Sci U S A 101, 6803-6808.
Gray, W.M., Kepinski, S., Rouse, D., Leyser, O., and Estelle, M. (2001). Auxin regulates
SCF(TIR1)-dependent degradation of AUX/IAA proteins. Nature 414, 271-276.
Guo, H., and Ecker, J.R. (2003). Plant responses to ethylene gas are mediated by
SCF(EBF1/EBF2)-dependent proteolysis of EIN3 transcription factor. Cell 115, 667-677.
Han, M.H., Goud, S., Song, L., and Fedoroff, N. (2004). The Arabidopsis double-stranded
RNA-binding protein HYL1 plays a role in microRNA-mediated gene regulation. Proc
Natl Acad Sci U S A 101, 1093-1098.
Jones-Rhoades, M.W., and Bartel, D.P. (2004). Computational identification of plant
microRNAs and their targets, including a stress-induced miRNA. Mol Cell 14, 787-799.
Kasschau, K.D., Xie, Z., Allen, E., Llave, C., Chapman, E.J., Krizan, K.A., and Carrington,
J.C. (2003). P1/HC-Pro, a viral suppressor of RNA silencing, interferes with Arabidopsis
development and miRNA function. Dev Cell 4, 205-217.
Kerstetter, R.A., Bollman, K., Taylor, R.A., Bomblies, K., and Poethig, R.S. (2001).
KANADI regulates organ polarity in Arabidopsis. Nature 411, 706-709.
Kidner, C.A., and Martienssen, R.A. (2004). Spatially restricted microRNA directs leaf
polarity through ARGONAUTE1. Nature 428, 81-84.
Laux, T., Mayer, K.F., Berger, J., and Jurgens, G. (1996). The WUSCHEL gene is required
for shoot and floral meristem integrity in Arabidopsis. Development 122, 87-96.
Llave, C., Xie, Z., Kasschau, K.D., and Carrington, J.C. (2002). Cleavage of Scarecrow-like
mRNA targets directed by a class of Arabidopsis miRNA. Science 297, 2053-2056.
Long, J.A., Moan, E.I., Medford, J.I., and Barton, M.K. (1996). A member of the KNOTTED
class of homeodomain proteins encoded by the STM gene of Arabidopsis. Nature 379,
66-69.
Lu, C., and Fedoroff, N. (2000). A mutation in the Arabidopsis HYL1 gene encoding a dsRNA
binding protein affects responses to abscisic acid, auxin, and cytokinin. Plant Cell 12,
2351-2366.
Mallory, A.C., Bartel, D.P., and Bartel, B. (2005). microRNA-Directed Regulation of
Arabidopsis AUXIN RESPONSE FACTOR17 Is Essential for Proper Development and
Modulates Expression of Early Auxin Response Genes. Plant Cell 17.
Mallory, A.C., Dugas, D.V., Bartel, D.P., and Bartel, B. (2004a). MicroRNA regulation of
NAC-domain targets is required for proper formation and separation of adjacent
embryonic, vegetative, and floral organs. Curr Biol 14, 1035-1046.
Mallory, A.C., Reinhart, B.J., Bartel, D., Vance, V.B., and Bowman, L.H. (2002). A viral
suppressor of RNA silencing differentially regulates the accumulation of short interfering
RNAs and micro-RNAs in tobacco. Proc Natl Acad Sci U S A 99, 15228-15233.
Mallory, A.C., Reinhart, B.J., Jones-Rhoades, M.W., Tang, G., Zamore, P.D., Barton,
M.K., and Bartel, D.P. (2004b). MicroRNA control of PHABULOSA in leaf
development: importance of pairing to the microRNA 5' region. Embo J 23, 3356-3364.
Mallory, A.C., Ely, L., Smith, T.H., Marathe, R., Anandalakshmi, R., Fagard, M.,
Vaucheret, H., Pruss, G., Bowman, L., and Vance, V.B. (2001). HC-Pro suppression
of transgene silencing eliminates the small RNAs but not transgene methylation or the
mobile signal. Plant Cell 13, 571-583.
Mayer, K.F., Schoof, H., Haecker, A., Lenhard, M., Jurgens, G., and Laux, T. (1998). Role
of WUSCHEL in regulating stem cell fate in the Arabidopsis shoot meristem. Cell 95,
805-815.
Morel, J.B., Godon, C., Mourrain, P., Beclin, C., Boutet, S., Feuerbach, F., Proux, F., and
Vaucheret, H. (2002). Fertile hypomorphic ARGONAUTE (ago1) mutants impaired in
post-transcriptional gene silencing and virus resistance. Plant Cell 14, 629-639.
Palatnik, J.F., Allen, E., Wu, X., Schommer, C., Schwab, R., Carrington, J.C., and Weigel,
D. (2003). Control of leaf morphogenesis by microRNAs. Nature 425, 257-263.
Park, M.Y., Wu, G., Gonzalez-Sulser, A., Vaucheret, H., and Poethig, R.S. (2005). Nuclear
processing and export of microRNAs in Arabidopsis. Proc Natl Acad Sci U S A 102,
3691-3696.
Park, W., Li, J., Song, R., Messing, J., and Chen, X. (2002). CARPEL FACTORY, a Dicer
homolog, and HEN1, a novel protein, act in microRNA metabolism in Arabidopsis
thaliana. Curr Biol 12, 1484-1495.
Potuschak, T., Lechner, E., Parmentier, Y., Yanagisawa, S., Grava, S., Koncz, C., and
Genschik, P. (2003). EIN3-dependent regulation of plant ethylene hormone signaling by
two arabidopsis F box proteins: EBF1 and EBF2. Cell 115, 679-689.
Prigge, M.J., Otsuga, D., Alonso, J.M., Ecker, J.R., Drews, G.N., and Clark, S.E. (2005).
Class III Homeodomain-Leucine Zipper Gene Family Members Have Overlapping,
Antagonistic, and Distinct Roles in Arabidopsis Development. Plant Cell 17, 61-76.
Reinhart, B.J., Weinstein, E.G., Rhoades, M.W., Bartel, B., and Bartel, D.P. (2002).
MicroRNAs in plants. Genes Dev 16, 1616-1626.
Rhoades, M.W., Reinhart, B.J., Lim, L.P., Burge, C.B., Bartel, B., and Bartel, D.P. (2002).
Prediction of plant microRNA targets. Cell 110, 513-520.
Ruegger, M., Dewey, E., Gray, W.M., Hobbie, L., Turner, J., and Estelle, M. (1998). The
TIR1 protein of Arabidopsis functions in auxin response and is related to human SKP2
and yeast grr1p. Genes Dev 12, 198-207.
Samach, A., Klenz, J.E., Kohalmi, S.E., Risseeuw, E., Haughn, G.W., and Crosby, W.L.
(1999). The UNUSUAL FLORAL ORGANS gene of Arabidopsis thaliana is an F-box
protein required for normal patterning and growth in the floral meristem. Plant J 20, 433-445.
Sasaki, A., Itoh, H., Gomi, K., Ueguchi-Tanaka, M., Ishiyama, K., Kobayashi, M., Jeong,
D.H., An, G., Kitano, H., Ashikari, M., and Matsuoka, M. (2003). Accumulation of
phosphorylated repressor for gibberellin signaling in an F-box mutant. Science 299,
1896-1898.
Schauer, S.E., Jacobsen, S.E., Meinke, D.W., and Ray, A. (2002). DICER-LIKE1: blind men
and elephants in Arabidopsis development. Trends Plant Sci 7, 487-491.
Smalle, J., and Vierstra, R.D. (2004). The ubiquitin 26S proteasome proteolytic pathway. Annu
Rev Plant Biol 55, 555-590.
Stirnberg, P., van De Sande, K., and Leyser, H.M. (2002). MAX1 and MAX2 control shoot
lateral branching in Arabidopsis. Development 129, 1131-1141.
Sunkar, R., and Zhu, J.K. (2004). Novel and stress-regulated microRNAs and other small
RNAs from Arabidopsis. Plant Cell 16, 2001-2019.
Tang, G., Reinhart, B.J., Bartel, D.P., and Zamore, P.D. (2003). A biochemical framework
for RNA silencing in plants. Genes Dev 17, 49-63.
Telfer, A., and Poethig, R.S. (1998). HASTY: a gene that regulates the timing of shoot
maturation in Arabidopsis thaliana. Development 125, 1889-1898.
Vaucheret, H., Vazquez, F., Crete, P., and Bartel, D.P. (2004). The action of ARGONAUTE1
in the miRNA pathway and its regulation by the miRNA pathway are crucial for plant
development. Genes Dev 18, 1187-1197.
Vazquez, F., Gasciolli, V., Crete, P., and Vaucheret, H. (2004). The nuclear dsRNA binding
protein HYL1 is required for microRNA accumulation and plant development, but not
posttranscriptional transgene silencing. Curr Biol 14, 346-351.
Wilkinson, M.D., and Haughn, G.W. (1995). UNUSUAL FLORAL ORGANS Controls
Meristem Identity and Organ Primordia Fate in Arabidopsis. Plant Cell 7, 1485-1499.
Willems, A.R., Schwab, M., and Tyers, M. (2004). A hitchhiker's guide to the cullin ubiquitin
ligases: SCF and its kin. Biochim Biophys Acta 1695, 133-170.
Woo, H.R., Chung, K.M., Park, J.H., Oh, S.A., Ahn, T., Hong, S.H., Jang, S.K., and Nam,
H.G. (2001). ORE9, an F-box protein that regulates leaf senescence in Arabidopsis. Plant
Cell 13, 1779-1790.
Xie, D.X., Feys, B.F., James, S., Nieto-Rostro, M., and Turner, J.G. (1998). COI1: an
Arabidopsis gene required for jasmonate-regulated defense and fertility. Science 280,
1091-1094.
Zheng, N., Schulman, B.A., Song, L., Miller, J.J., Jeffrey, P.D., Wang, P., Chu, C., Koepp,
D.M., Elledge, S.J., Pagano, M., Conaway, R.C., Conaway, J.W., Harper, J.W., and
Pavletich, N.P. (2002). Structure of the Cul1-Rbx1-Skp1-F boxSkp2 SCF ubiquitin
ligase complex. Nature 416, 703-709.
Zhong, R., and Ye, Z.H. (2004). Amphivasal vascular bundle 1, a gain-of-function mutation of
the IFL1/REV gene, is associated with alterations in the polarity of leaves, stems and
carpels. Plant Cell Physiol 45, 369-385.
Appendix 1. Conserved miRNA target sites in Arabidopsis, Oryza and Populus
The sequence and score (see Jones-Rhoades & Bartel, 2004 Molecular Cell 14(6):787-99)
are listed for miRNA complementary sites within predicted miRNA targets of three plant species.
Some complementary sites occur adjacent to annotated gene models, especially in Populus; this
is indicated as "In annotation" (Y or N).
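To make the Score column easier to interpret, the sketch below scores an ungapped miRNA:target pairing. It assumes a penalty of 1 for each mismatch and 0.5 for each G:U wobble, which is consistent with the half-integer scores listed in this appendix but should be checked against Jones-Rhoades & Bartel (2004); bulges are not handled. Because the miRNA sequences themselves are not listed here, the reverse complement of a site with score 0 (a miR171 site from this appendix) is used as a stand-in miRNA. The two variant sites are hypothetical and serve only to exercise the scoring.

```python
# Ungapped complementarity scoring sketch (assumed weights: mismatch 1, G:U 0.5).

WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Reverse complement of an upper-case RNA string."""
    return "".join(RNA_COMPLEMENT[b] for b in reversed(rna))


def site_score(mirna: str, site: str) -> float:
    """Score an antiparallel, ungapped pairing of miRNA (5'->3') with site (5'->3')."""
    if len(mirna) != len(site):
        raise ValueError("ungapped scoring requires sequences of equal length")
    score = 0.0
    # miRNA position 1 pairs with the last nucleotide of the target site.
    for m, t in zip(mirna, reversed(site)):
        if (m, t) in WATSON_CRICK:
            continue
        score += 0.5 if (m, t) in WOBBLE else 1.0
    return score


# Site listed with score 0 for miR171 in this appendix.
MIR171_SITE = "GAUAUUGGCGCGGCUCAAUCA"
# Stand-in miRNA: assumed to be the perfect complement of a score-0 site.
MIR171_STANDIN = reverse_complement(MIR171_SITE)

# Hypothetical variants, not taken from the appendix.
WOBBLE_VARIANT = MIR171_SITE[:-1] + "G"            # U:G wobble at miRNA position 1
MIXED_VARIANT = "A" + MIR171_SITE[1:-1] + "G"      # adds a C:A mismatch at the 3' end

if __name__ == "__main__":
    for name, site in [("listed site", MIR171_SITE),
                       ("wobble variant", WOBBLE_VARIANT),
                       ("mixed variant", MIXED_VARIANT)]:
        print(name, site_score(MIR171_STANDIN, site))
```

Restricting the sketch to ungapped alignments keeps it short; a fuller treatment would also need a rule for bulged nucleotides.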
miRNA   Target gene                        Target  Score  miRNA complementary         Species      In
family                                     family         sequence                                 annotation?
miR156  At1g27360.1                        SBP     1      GUGCUCUCUCUCUUCUGUCA        Arabidopsis  Y
miR156  At1g27370.1                        SBP     1      GUGCUCUCUCUCUUCUGUCA        Arabidopsis  Y
miR156  At1g53160.1                        SBP     2      CUGCUCUCUCUCUUCUGUCA        Arabidopsis  Y
miR156  At1g69170.1                        SBP     1      GUGCUCUCUCUCUUCUGUCA        Arabidopsis  Y
miR156  At2g33810.1                        SBP     1.5    UUGCUUACUCUCUUCUGUCA        Arabidopsis  Y
miR156  At2g42200.1                        SBP     1      GUGCUCUCUCUCUUCUGUCA        Arabidopsis  Y
miR156  At3g15270.1                        SBP     3      CCGCUCUCUCUCUUCUGUCA        Arabidopsis  Y
miR156  At3g57920.1                        SBP     1      GUGCUCUCUCUCUUCUGUCA        Arabidopsis  Y
miR156  At5g43270.1                        SBP     1      GUGCUCUCUCUCUUCUGUCA        Arabidopsis  Y
miR156  At5g50570.1                        SBP     1      GUGCUCUCUCUCUUCUGUCA        Arabidopsis  Y
miR156  At5g50670.1                        SBP     1      GUGCUCUCUCUCUUCUGUCA        Arabidopsis  Y
miR156  Os01g69830                         SBP     0      UGUGCUCUCUCUCUUCUGUCA       Oryza        Y
miR156  Os02g04680                         SBP     1      AUGCUCUCUCUCUUCUGUCA        Oryza        Y
miR156  Os02g07780                         SBP     0      GUGCUCUCUCUCUUCUGUCA        Oryza        Y
miR156  Os04g46580                         SBP     0      GUGCUCUCUCUCUUCUGUCA        Oryza        Y
miR156  Os06g45310                         SBP     0      GUGCUCUCUCUCUUCUGUCA        Oryza        Y
miR156  Os06g49010                         SBP     0      GUGCUCUCUCUCUUCUGUCA        Oryza        Y
miR156  Os07g32170                         SBP     2      AUGCUCCCUCUCUUCUGUCA        Oryza        Y
miR156  Os08g39890                         SBP     0      UGUGCUCUCUCUCUUCUGUCA       Oryza        Y
miR156  Os08g41940                         SBP     0      UGUGCUCUCUCUCUUCUGUCA       Oryza        Y
miR156  estExt_Genewise1_v1.C_1240186      SBP     1      AUGCUCUCUCUCUUCUGUCA        Populus      Y
miR156  estExt_Genewise1_v1.C_LG_XV2187    SBP     1      GUGCUCUCUCUCUUCUGUCA        Populus      Y
miR156  eugene3.00120942                   SBP     2      GCGCUCUCUCUCUUCUGUCA        Populus      Y
miR156  eugene3.00160416                   SBP     1      GUGCUCUCUCUCUUCUGUCA        Populus      Y
miR156  fgenesh4_pg.C_LG_11001303          SBP     1      GUGCUCUCUCUCUUCUGUCA        Populus      Y
miR156  fgenesh4_pg.C_LG_X001404           SBP     1      GUGCUCUCUCUCUCUGUCA         Populus      Y
miR156  grail3.0010026801                  SBP     1      GUGCUCUCUCUCUUCUGUCA        Populus      Y
miR156  gw1.107.39.1                       SBP     1.5    AUGCUCCCUCUCUUCUGUCA        Populus      N
miR156  gw1.129.152.1                      SBP     0.5    GUGCUCGCUCUCUUCUGUCA        Populus      N
miR156  gw1.164.76.1                       SBP     1      GUGCUCUCUCUCUUCUGUCA        Populus      N
miR156  gw1.40.76.1                        SBP     1      GUGCUCUCUCUCUUCUGUCA        Populus      N
miR156  gw1.1.7783.1                       SBP     1      GUGCUCUCUCUCUUCUGUCA        Populus      N
miR156  gw1.111.2396.1                     SBP     1      GUGCUCUCUCUCUUCUGUCA        Populus      N
miR156  gw1.IV.3037.1                      SBP     1.5    AUGCUCUCUCUCUWCUGUCA        Populus      N
miR156  gw1.VII.548.1                      SBP     2      UUGCUCUCUCUCUUCUGUCA        Populus      N
miR156  gw1.XI.3794.1                      SBP     1.5    AUGCUCCCUCUCUUCUGUCA        Populus      N
miR159  At2g26950.1                        MYB     1.5    UGGAGCUCCCUUCAUUCCAAG       Arabidopsis  Y
miR159  At2g26960.1                        MYB     2.5    UCGAGUUCCCUUCAUUCCAAU       Arabidopsis  Y
miR159  At2g32460.1                        MYB     1.5    UAGAGCUUCCUUCAAACCAAA       Arabidopsis  Y
miR159  At3g11440.1                        MYB     1.5    UGGAGCUCCCUUCAUUCCAA        Arabidopsis  Y
miR159  At3g60460.1                        MYB     1.5    UGGAGCUCCAUUCGAUCCAAA       Arabidopsis  Y
miR159  At4g26930.1                        MYB     2.5    AUGAGCUCUCUUCAAACCAAA       Arabidopsis  Y
miR159  At5g06100.1                        MYB     1.5    UGGAGCUCCCUUCAUUCCAA        Arabidopsis  Y
miR159  At5g55020.1                        MYB     2      AGCAGCUCCCUUCAAACCAAA       Arabidopsis  Y
miR159
Os01g59660
MYB
1
UGGAGCUCCCUUCACUCCAAG
miR159
Os03g38210
MYB
2
CCGAGCUCCCUUCAAGCCAAU Oryza
miR159
Os04g46390
MYB
1.5
UGGAGCUCCAUUCGAUCCAAA Oryza
miR159
Os5g41 170
MYB
0.5
miR159
0s06g40330
MYB
1
miR159
Os06g46560
MYB
2.5
GCGAGCUCCCUUCGAACCAAU Oryza
miR159
fgenesh4_pm.
C_LG_I
11000641
MYB
2.5
UGGAGCUCUAUUCGGUCCAAA Populus
miR159 fgenesh4_pm.C_scaffold_40000020 MYB
1.5
Oryza
UGGAGCUCCCUUUAAUCCAAU Oryza
UAGAGCUCCCUUCACUCCAAU
Oryza
UGGAGCUCCAUUCGAUCCAAA Populus
miR159
gwl. 1.6885.1
MYB
1
UGGAGCUCCCUUCACUCCAAU
Populus
miR159
gwl .1.9701.1
MYB
1
UAGAGCUCCCUUCACUCCAAU
Populus
miR159
gwl.111.41.1
MYB
0
UUGAGCUCCCUUCACUCCAAU Populus
miR159
Atlg30210.1
TCP
2.5
AGGGGGACCCUUCAGUCCAA
Arabidopsis
miR159 At1g53230.1
TCP
3
AGGGGUCCCCUUCAGUCCAU
Arabidopsis
miR159
At2g31070.1
TCP
2.5
AGGGGUACCCUUCAGUCCAG
Arabidopsis
miR159 At3g15030.1
TCP
2.5
AGGGGUCCCCUUCAGUCCAG
Arabidopsis
miR159 At4g18390.1
TCP
2.5
AGGGGGACCCUUCAGUCCAA
Arabidopsis
miR159
TCP
3.5
AGGGGACCCCUUCAGUCCAGU
Oryza
miR159 0s03g57190
TCP
2.5
AGGGGGACCCUUCAGUCCAA
miR159 Os07g05720
TCP
2.5
AGGGGGACCCUUCAGUCCAA
Oryza
miR159
TCP
2.5
CGGGGCACACUUCAGUCCAA
Oryza
miR159 eugene3.0011
0429
TCP
2.5
AGGGGGACCCUUCAGUCCAA
Populus
miR159 eugene3.00110631
TCP
3
AGGGGAACCCUUCAGUCCAG
Populus
miR159
eugene3.00121020
TCP
2.5
AGGGGGACCCUUCAGUCCAA
Populus
miR159
eugene3.00190830
TCP
3
AGGGGGCCCCUUCAGUCCAG
miR159
eugene3.00410019
TCP
3
AGGGGAACCCUUCAGUCCAG
Populus
Populus
miR159
grail3.0032015302
TCP
3
AGGGGACCCCUUCAGUCCAG
miR159
gwl .IV.2486.1
TCP
3
miR160
At1g77850.1
ARF
0.5
miR160
At2g28350.1
ARF
1
miR160
At4g30080.1
ARF
1.5
miR160
0s02g41800
ARF
0
AGGCAUACAGGGAGCCAGGCA
Oryza
miR160
0s04g43910
ARF
0
AGGCAUACAGGGAGCCAGGCA
Oryza
miR160
Os04g59430
ARF
1
UGACAUUCAGGGAGCCAGGCA
Oryza
miR160
0s06g47150
ARF
0
AGGCAUACAGGGAGCCAGGCA Oryza
miR160
Os10g33940
ARF
0
AGGCAUACAGGGAGCCAGGCA Oryza
miR160
estExtfgenesh4pg.C_LG_V0901
ARF
0.5
miR160
0
miR160
estExt_fgenesh4_pm.C_LG_X0888 ARF
estExt_fgenesh4_pm.C_LG_XVI0323ARF
miR160
Os01g11550
Os12g42190
Oryza
Populus
AUGAGCUCCCUCCACUCAACPopulus
UGGCAUGCAGGGAGCCAGGCA
Arabidopsis
AGGAAUACAGGGAGCCAGGCA
Arabidopsis
GGGUUUACAGGGAGCCAGGCA Arabidopsis
UGGCAUGCAGGGAGCCAGGCA Populus
AGGCAUACAGGGAGCCAGGCA
Populus
0.5
UGGCAUGCAGGGAGCCAGGCA Populus
eugene3.00660262
ARF
0
AGGCAUACAGGGAGCCAGGCA Populus
miR160 fgenesh4_pg.C_LG_11000830
ARF
0.5
miR160 fgenesh4_pg.C_LG_X001411
ARF
0
miR160
fgenesh4_pg.C_LG_VI11000301
ARF
0
UGGCAUGCAGGGAGCCAGGCA Populus
AGGCAUACAGGGAGCCAGGCA Populus
AGGCAUACAGGGAGCCAGGCA Populus
miR160
gw1.28.631.1
ARF
0.5
UGGCAUGCAGGGAGCCAGGCA
Populus
miR160
gw1.28.632.1
ARF
0.5
UGGCAUGCAGGGAGCCAGGCA
Populus
miR161
At1g06580.1
PPR
2
CCCGGAUGUAAUCACUUUCAG Arabidopsis
miR161
CCCUGAUGUAUUCACUUUCAG Arabidopsis
Atl g62670.1
PPR
1.5
rniR161 At1g62720.1
PPR
2.5
CCCCGAUGUAGUGACUUAUAA Arabidopsis
rniR161 At1g63080.1
PPR
2
UCCAAAUGUAGUCACUUUCAA Arabidopsis
rniR161 At1g63150.1
PPR
2.5
rniR161 At1g63400.1
PPR
2
CCCCAAUGUUGUUACUUUCAA Arabidopsis
UCCAAAUGUAGUCACUUUCAA
Arabidopsis
At1g64580.1
PPR
miR161 At5g16640.1
miR161
1.5
CCCUGAUGUUGUCACUUUCAC Arabidopsis
Y
PPR
2
CCCUGAUGUAUUUACUUUCAA Arabidopsis
Y
miR161
At5g41170.1
PPR
1.5
ACCUGAUGUAAUCACUUUCAA Arabidopsis
Y
miR162
At gOl
01040.1
DCL
2
CUGGAUGCAGAGGUAUUAUCGAArabidopsis
Y
Os03g02970
DCL
2
CUGGAUGCAGAGGUUUUAUCG Oryza
Y
miR162
eugene3.00021687
DCL
2
CUGGAUGCAGAGGUCUUAUCG Populus
miR163
At1g66690.1
SAMT
0.5
At1966700.1
SAMT
0.5
miR163 Atlg66720.1
SAMT
1
miR162
miR163
Arabidopsis
AUCGAGUUCCAAGUCCUCUUCAA
Arabidopsis
AUCGAGUUCCAAGUCCUCUUCAA
AUCGAGUUCCAGGUCCUCUUCAA
Arabidopsis
Arabidopsis
AUCGAGUUCCAAGUUUUCUUCAA
y
y
y
y
y
yY
miR163
At3g44860.1
SAMT
1.5
miR163
At3g44870.1
SAMT
1.5
miR164
At1g56010.1
NAC
1
AGCACGUACCCUGCUUCUCCA
Arabidopsis
miR164
At3g15170.1
NAC
1
AGCACGUGUCCUGUUCUCCA
Arabidopsis
miR164
At5g07680.1
NAC
1.5
miR164
At5g39610.1
NAC
2
CUCACGUGACCUGCUUCUCCG Arabidopsis
miR164
At5g53950.1
NAC
1
AGCACGUGUCCUGUUUCUCCA
At5g61430.1
NAC
1.5
miR164
Os02g36880
NAC
1
CGCACGUGACCUGCUUCUCCA
Oryza
miR164
Os04g38720
NAC
1
CGCACGUGACCUGCUUCUCCA
Oryza
miR164
0s06g23650
NAC
1
AGCUCGUGCCCUGCUUCUCCA
Oryza
miR164
Os06g46270
NAC
1
AGCAAGUGCCCUGCUUCUCCA
Oryza
miR164
Os08g10080
NAC
1.5
miR164
Os12g41680
NAC
1
miR164
eugene3.00150202
NAC
1.5
CCUACGUGCCCUGCUUCUCCA Populus
1.5
CCUACGUGCCCUGCUUCUCCA Populus
y
y
N
Y
miR164
Arabidopsis
AUCGAGUUCCAAGUUUUCUUCAA
UUUACGUGCCCUGCUUCUCCA Arabidopsis
Arabidopsis
UCUACGUGCCCUGCUUCUCCA Arabidopsis
AGCAAGUGUCCUGCUUCUCCG Oryza
AGCAAGUGCCCUGCUUCUCCA
Oryza
C_LG_XI1000069
miR164 fgenesh4_pm.
NAC
miR164
gw1.107.10.1
NAC
1
AGCACGUGUCCUGUUUCUCCA
Populus
miR164
gwl.V.3536.1
NAC
1
AGCAAGUGCCCUGCUUCUCCA
Populus
miR164
gwl .VII1.2722.1
NAC
1
AGCAAGUGCCCUGCUUCUCCA
Populus
miR164
gwl .XI.3766.1
NAC
1
AGCACGUGUCCUGUUUCUCCA
Populus
miR166
At1g30490.1
HD-ZIP
1.5
UUGGGAUGAAGCCUGGUCCGG Arabidopsis
miR166
At1g52150.1
HD-ZIP
1.5
CUGGAAUGAAGCCUGGUCCGG Arabidopsis
miR166
At2g3471
0.1
HD-ZIP
1.5
UUGGGAUGAAGCCUGGUCCGG Arabidopsis
miR166
At4g32880.1
HD-ZIP
1.5
CUGGGAUGAAGCCUGGUCCGG
miR166
At5g60690.1
HD-ZIP
1.5
CUGGGAUGAAGCCUGGUCCGG
miR166
Os03g01890
HD-ZIP
2
CUGGGAUGAAGCCUGGUCCGG
miR166
Os03g43930
HD-ZIP
2
UUGGGAUGAAGCCUGGUCCGG Oryza
HD-ZIP
2
CUGGGAUGAAGCCUGGUCCGG Oryza
*miR166 Osl2g41860
HD-ZIP
2
UUGGGAUGAAGCCUGGUCCGG Oryza
miR166
estExt_fgenesh4_pg.C_2360002
HD-ZIP
3
UUGGGAUGAAGCCUGGUCCAG Populus
miR166
estExt_fgenesh4_pg.C_LG_12905 HD-ZIP
2.5
UUGGUAUGAAGCCUGGUCCGG Populus
miR166
estExtfgenesh4_pg.C_LG_1110436 HD-ZIP
1.5
CUGGAAUGAAUGAAGCCUGGUCCGG
Populus
miR166
HD-ZIP
3
estExt_fgenesh4_pm.C_LG_V1071
2
CUGGGAUGAAGCCUGGUCCGG Populus
miR166
estExt_Genewisel
_vl .C_660759
HD-ZIP
1.5
miR166 fgenesh4_pg.C_LG_XVI11000250
HD-ZIP
2
CUGGGAUGAAGCCUGGUCCGG Populus
miR166
C_LG_1000560
fgenesh4_pm.
HD-ZIP
1.5
CUGGAAUGAAGCCUGGUCCGG Populus
miR166
Populus
gw1.6326.1.1
HD-ZIP
2
CUGGGAUGAAGCCUGGUCCGG Populus
rniR166 gwl .IX.4748.1
HD-ZIP
2
CUGGGAUGAAGCCUGGUCCGG Populus
rniR167 Atlg30330.1
ARF
2
GAGAUCAGGCUGGCAGCUUGU Arabidopsis
rniR167 At5g37020.1
ARF
2
UAGAUCAGGCUGGCAGCUUGU Arabidopsis
rniR167 Os02g06910
ARF
3
GAGAUCAGGCUGGCAGCUUGU Oryza
y
yN
y
y
y
Arabidopsis y
Arabidopsis y
y
Oryza
Y
miR166 Os109g33960
CUGGAAUGAAGCCU
GGUCCGG
yY
Y
yY
yY
y
y
y
y
yY
Y
yY
yY
Y
y
Y
y
y
Y
miR167
Os04g57610
ARF
2
UAGAUCAGGCUGGCAGCUUGU Oryza
Y
miR167 Os06g46410
ARF
3
GAGAUCAGGCUGGCAGCUUGU Oryza
Y
miR167
Os12941950
ARF
3
AAGAUCAGGCUGGCAGCUUGU Oryza
Y
miR167
estExt_Genewisel
_vl .C_LG_110777 ARF
3
GAGAUCAGGCUGGCAGCUUGU Populus
Y
miR167
estExt_Genewiselvl .C_LG_XI2869
ARF
3
GAGAUCAGGCUGGCAGCUUGU Populus
Y
miR167
fgenesh4_pg.C_LG_1002802
ARF
3
GAGAUCAGGCUGGCAGCUUGU Populus
Y
miR167
fgenesh4_pg.C_scaffold_1
006000001
ARF
3
GAGAUCAGGCUGGCAGCUUGU Populus
Y
miR167
gw1.44.432.1
ARF
2
UAGAUCAGGCUGGCAGCUUGU Populus
Y
miR167
gw1.IV.3880.1
ARF
2
UAGAUAG
GGCUGGCAGCUUGU
Populus
Y
miR167
gwl .V.806.1
ARF
3
GAGAUCAGGCUGGCAGCUUGU Populus
Y
miR168
At1g48410.1
AGO
2.5
UUCCCGAGCUGCAUCAAGCUA Arabidopsis
Y
miR168
Os02g45070
AGO
0
UUCCCGAGCUGCACCAAGCCU Oryza
N
miR168
Os02g58490
AGO
2.5
CUCCCGAGCUGCGCCAAGCAA Oryza
Y
miR168
Os04g47870
AGO
0
UUCCCGAGCUGCACCAAGCCC
Oryza
Y
miR168
Os04g52540
AGO
3
UUCGCCCGCUGCACCAAGCCG Oryza
Y
miR168
Os04g52550
AGO
3
UUCGCCCGCUGCACCAAGCCG Oryza
Y
miR168
Os06g51310
AGO
3
CUCCCGAGCUGCUCCAAGCAA Oryza
Y
miR168
grail3.0031006602
AGO
3
CACCCGAGCUGCACCAAGCUA Populus
N
miR168
grail3.0122002801
AGO
3
CACCCGAGCUGCACCAAGCUA Populus
N
miR169
At1g917590.1
CCAAT
1.5
AAGGGAAGUCAUCCUUGGCUG Arabidopsis
Y
miR169
Atlg54160.1
CCAAT
2
ACGGGAAGUCAUCCUUGGCUA Arabidopsis
Y
miR169
Atlg72830.1
CCAAT
1.5
AGGGGAAGUCAUCCUUGGCUA Arabidopsis
Y
miR169
At3g05690.1
CCAAT
1.5
AGGCAAAUCAUCUUUGGCUCA Arabidopsis
Y
miR169
At3g14020.1
CCAAT
2.5
UAGCCAAGGAUGACuUCCCU
Arabidopsis
Y
miR169
At3g20910.1
CCAAT
2
CGGCAAUUCAUUCUUGGCUUU Arabidopsis
N
miR169
At5g06510.1
CCAAT
1.5
AGGCAAAUCAUCUUUGGCUCA Arabidopsis
Y
miR169
At5g12840.1
CCAAT
1.5
CCGGCAAAUCAUUCUUGGCUU Arabidopsis
Y
miR169
Os03g07880
CCAAT
2.5
AUGGCAAAUCAUCCUUGGCUU Oryza
Y
miR169
Os03g29760
CCAAT
1.5
GUGGCAAUUCAUCCUUGGCUU Oryza
Y
miR169
Os03g44540
CCAAT
1
Oryza
Y
miR169
Os03g48970
CCAAT
1.5
CAGGCAAUUCAUUCUUGGCUU Oryza
Y
miR169
Os07g06470
CCAAT
1
miR169
Os07g41720
CCAAT
1.5
miR169
1
UAGGCAACUCAUUCUUGGCUG
1
CAGGCAAUUCAUCCUUGGCUU Populus
Y
miR169
Os12g42400
CCAAT
estExt_fgenesh4pg.C_LG_XVI110020
CCAAT
eugene3.00011755
CCAAT
1.5
CAGGCAAUUCAUUCUUGGCUU Populus
Y
miR169
eugene3.00060980
CCAAT
1
CAGGCAAUUCAUCCUUGGCUU
N
miR169
eugene3.00061121
CCAAT
3
AGGGCAAGUCGUUCUUGGCUC Populus
N
miR169
eugene3.00091116
CCAAT
2
GCGGCAAAUCAUUCUUGGCUU Populus
Y
miR169
eugene3.00160615
CCAAT
2.5
AGGGCAAGUCGUUCUUGGCUC Populus
N
miR169
fgenesh4_pg.C_LG_IX000987
CCAAT
1.5
CAGGCAAUUCAUUCUUGGCUU Populus
Y
miR169
grail3.0024038301
CCAAT
2.5
UUGGCAAAUCAUUCUUGGCUU Populus
N
miR169
gw1..1522.1
CCAAT
2.5
GCGGCAAAUCAUUCUUGGCUU Populus
N
miR171
At2g45160.1
SCL
0
GAUAUUGGCGCGGCUCAAUCA Arabidopsis
Y
miR171
At3g60630.1
SCL
0
GAUAUUGGCGCGGCUCAAUCA Arabidopsis
Y
miR171 At4g00150.1
miR171 Os02g44360
SCL
0
GAUAUUGGCGCGGCUCAAUCA Arabidopsis
Y
SCL
0
GAUAUUGGCGCGGCGCGGCUCAAUCA
Oryza
Y
miR171 Os02g44370
miR171 Os04g46860
SCL
0
GAUAUUGGCGCGGCUCAAUCA Oryza
Y
SCL
0
GAUAUUGGCGCGGCUCAAUCA Oryza
Y
miR171 Os06g01620
SCL
0
GAUAUUGGCGCGGCUCAAUCA Oryza
Y
miR169
UAGGCAAAUCAUUCUUGGCUC
Oryza
Y
GUGGCAAUUCAUCCUUGGCUU Oryza
Y
Oryza
Y
GUGGCAAUUCAUCCUUGGCUG
Populus
miR171
Os10g40390
SCL
0.5
miR171
estExt_Genewisel
_vl .CLG_113184
SCL
0
GAUAUUGGCGCGGCUCAAUCA Populus
miR171
eugene3.44860001
SCL
0
GAUAUUGGCGCGGCUCAAUCA Populus
miR171
fgenesh4_pg.C_LG_11000787
SCL
1.5
miR171
gw1.127.243.1
SCL
0
miR171
gwl.40.23.1
SCL
0
GAUAUUGGCGCGGCUCAAUCA Populus
miR171
gw1.57.294.1
SCL
1
GAUAUUGGAACGGCUCAACGGC
UCA
miR171
gw1.11.1043.1
SCL
0
GAUAUUGGCGCGGCUCAAUCA Populus
miR171
gw1.111.2060.1
SCL
0
GAUAUUGGCGCGGCUCAAUCA Populus
miR171
gwl .VI1.3405.1
SCL
2.5
GAUACUGGAACGGCUCAAUCA Populus
miR172
At2g28550.1
AP2
1.5
miR172
At2g39250.1
AP2
1
UUGUAGCAUCAUCAGGAUUCC
Arabidopsis
miR172
At3g54990.1
AP2
1
UGCAGCAUCAUCAGGAUUCC
Arabidopsis
miR172
At4g36920.1
AP2
0.5
CUGCAGCAUCAUCAGGAUUCU Arabidopsis
miR172
At5g60120.1
AP2
0.5
AUGCAGCAUCAUCAGGAUUCU Arabidopsis
miR172
At5g67180.1
AP2
1.5
UGGCAGCAUCAUCAGGAUUCU Arabidopsis
miR172
Os03g60430
AP2
0.5
CUGCAGCAUCAUCAGGAUUCU Oryza
miR172
Os04g55560
AP2
1
miR172
Os05g03040
AP2
0.5
CUGCAGCAUCAUCAGGAUUCU Oryza
miR172
Os06g43220
AP2
0.5
CUGCAGCAUCAUCAGGAUUCC Oryza
miR172
Os07g13170
AP2
0.5
CUGCAGCAUCAUCAGGAUUCU Oryza
miR172
grail3.001
9003502
AP2
0.5
CUGCAGCAUCAUCAGGAUUCC Populus
AP2
0.5
CUGCAGCAUCAUCAGGAUUCG Populus
miR172 gw1.28.415.1
GAUAUUGGCGCGGCUCAAUUA Oryza
GGUGAUAUUGG
GGCGGCUCAA
Populus
GAUAUUGGCGCGGCUCAAUCA Populus
Populus
CAGCAGCAUCAUCAGGAUUCU Arabidopsis
CUGCAGCAUCAUCACGAUUCC
Oryza
miR172
gwl .V.4061.1
AP2
0.5
miR172
gw1.VII.1637.1
AP2
1
miR172
gwl .X.2501.1
AP2
0.5
UUGCAGCAUCAUCAGGAUUCU Populus
miR172
gwl .XVI.2655.1
AP2
0.5
CUGCAGCAUCAUCAGGAUUCG Populus
miR393
Os08g41320
bHLH
3.5
ACCAAAAGAAUCACAUCGCCC Oryza
miR393
At3g23690.1
bZIP
2
miR393
eugene3.00140963
bZIP
2.5
miR393
At1g12820.1
Fbox
1
AAACAAUGCGAUCCCUUUGGA
Arabidopsis
miR393
At3g26810.1
Fbox
1
AAACAAUGCGAUCCCUUUGGA
Arabidopsis
miR393
At3g62980.1
Fbox
1.5
AGACAAUGCGAUCCCUUUGGA Arabidopsis
miR393
At4g03190.1
Fbox
2.5
AGACCAUGCGAUCCCUUUGGA Arabidopsis
miR393
Os04g32460
Fbox
1.5
AGACAAUGCGAUCCCUUUGGA Oryza
miR393
Os05g05800
Fbox
1.5
AGACAAUGCGAUCCCUUUGGA Oryza
miR393
estExt_Genewisel_vl.C_880149
F-box
1
AAACAAUGCGAUCCCUUUGGA
Populus
miR393
eugene3.00012208
F-box
1
AAACAAUGCGAUCCCUUUGGA
Populus
miR393 eugene3.00110318
F-box
3
AGUCAAUGAGGUCACUUUGGA Populus
miR393
eugene3.00140791
F-box
1.5
AGACAAUGCGAUCCCUUUGGA Populus
miR393
eugene3.00141554
F-box
1.5
miR394
At1g27340.1
Fbox
1
GGAGGUUGACAGAAUGCCAA
Arabidopsis
miR394
Os01g69940
Fbox
0
GGAGGUGGACAGAAUGCCAA
Oryza
miR394
estExtGenewisel_vl .C_LG_17715
F-box
1
GGAGGUUGACAGAAUGCCAA
Populus
miR394
fgenesh4_pm.
C_LG_111000589
F-box
1
GGAGGUUGACAGAAUGCCAA
Populus
miR395
At3g22890.1
APS
1.5
GAGUUCCUCCAAACUCUUCAU Arabidopsis
miR395
At4g14680.1
APS
1.5
GAGUUCCUCCAAACUCUUCAU Arabidopsis
miR395
At5g43780.1
APS
0.5
APS
0.5
GAGUUCCUCCAAACACUUCAU
Arabidopsis
GAGUUCCUCCAAGCACUUCAU
Oryza
miR395 estExtGenewisel_vl .C_LG_VI112439APS
1.5
GAGUUCCUCCAAACUCUUCAU Populus
miR395 Os03g53230
CUGCAGCAUCAUCAGGAUUCU Populus
UUGCAGCAUCAUCAGGAUUCU
Populus
GGUCAGAGCGAUCCCUUUGGC Arabidopsis
GAUCAGAGCGAUCCCUUUGAG Populus
AGACAAUGCGAUCCCUUUGGA Populus
miR395
grail3.0175000802
APS
0.5
GAGUUCCUCCAAACACUUCAU Populus
miR395
At5g10180.1
S transporter
1.5
AAGUUCUCCCAAACACUUCAA Arabidopsis
miR395 Os03g09930
S transporter
1
GAGUUCACCCAAACACUUCAG
miR395 Os03g09940
S transporter
0
GAGUUCCCCCAAACACUUCAG Oryza
Oryza
GAGUUCCCUCAAGCACUUCAA Populus
miR395 estExt_fgenesh4_pm.C_LG_110422 S Transporter
2.5
eugene3.00070572
S Transporter
1
GAGUUUUCCCAAACACUUCAA
miR395 fgenesh4_pm.C_LG_V000080
S Transporter
3
UAUUUCCCCUGAACACUUCAA Populus
miR396 At2g22840.1
GRF
3
UCGUUCAAGAAAGCCUGUGGAA Arabidopsis
miR396
At2g36400.1
GRF
3
CCGUUCAAGAAAGAAAGCCUGUGGAA
Arabidopsis
miR396
At2g45480.1
GRF
3
ACGUUCAAGAAAGCUUGUGGAA Arabidopsis
miR396
At3g52910.1
GRF
3
CCGUUCAAGAAAGCCUGUGGAA Arabidopsis
miR396
At4g24150.1
GRF
3
UCGUUCAAGAAAGCAUGUGGAA Arabidopsis
miR396
At4g37740.1
GRF
3
UCGUUCAAGAAAGCCUGUGGAA Arabidopsis
miR396
At5g53660.1
GRF
3
UCGUUCAAGAAAGCAUGUGGAA Arabidopsis
miR396
Os02g45570
GRF
3
CCGUUCAAGAAAGAAAGCCUGUGGA
Oryza
miR396
Os02g47280
GRF
3
CCGUUCAAGAAAGCCUGUGGA Oryza
miR396
Os02g53690
GRF
3
CCGUUCAAGAAAGAAAGCCUGUGGA
Oryza
miR396
Os03g47140
GRF
3
CCGUUCAAGAAAGCCUGUGGA Oryza
miR396
Os03g51970
GRF
3
CCGUUCAAGAAAGCAUGUGGA Oryza
miR396
Os04g51190
GRF
3
CCGUUCAAGAAAGCCUGUGGA Oryza
miR396
Os06g02560
GRF
3
CCGUUCAAGAAAGCCUGUGGA Oryza
miR396
Os 1g35030
GRF
3
UCGUUCAAGAAAGAAAGCAUGUGGA
Oryza
miR396
Os12g29980
GRF
3
CCGUUCAAGAAAGCAUGUGGA Oryza
miR396
estExt_Genewisel_v.C_290455
GRF
3
CCGUUCAAGAAAGCCUGUGGA Populus
miR396
eugene3.00010995
GRF
3
GCGUUCAAGAAAGCUUGUGGA Populus
miR396
eugene3.00011018
GRF
3
CCGUUCAAGAAAGAAAGCCUGUGGA
Populus
miR396
eugene3.00021070
GRF
3
UCGUUCAAGAAAGAAAGCCUGUGGA
Populus
miR396
fgenesh4_pg.
C_LG_1000725
GRF
3
CCGUUCAAGAAAGCCUGUGGA Populus
miR396
fgenesh4_pg.C_LG_XI1000270
GRF
3
CCGUUCAAGAAAGAAAGCAUGUGGA
Populus
miR396
fgenesh4_pg.
C_LG_XIV000034
GRF
3
UCGUUCAAGAAAGCCUGUGGA Populus
miR396
fgenesh4_pm.C_scaffold_28000142 GRF
3
CCGUUCAAGAAAGAAAGCCUGUGGA
Populus
miR396
gwl .XIV.854.1
GRF
3
ACGUUCAGAAAGAAAGCUGUGGA
Populus
miR396
At2g40760.1
Rhodenase
2.5
miR396
Os05g25780
Rhodenase
3
AAAUUUAAGAGAGCUGUUGAU Oryza
miR396
gwl.XIX.1660.1
Rhodenase
3
AAGUUCAAAGGAGCUGUUGAU Populus
miR397
At2g29130.1
Laccase
0.5
miR397
At2g38080.1
Laccase
1
AGUCAACGCUGCACUUAAUGA
Arabidopsis
miR397
At5g60020.1
Laccase
1
AAUCAAUGCUGCACUUAAUGA
Arabidopsis
miR397
Os1g44330
Laccase
2
CAUCAACGCUGCAGUCAACGA Oryza
miR397
Os01g61160
Laccase
3
CAUCAACGCGGCACUCAACCA Oryza
miR397
OsO1
g62480
Laccase
2.5
CAUCAACGCCGCGCUCAACGA
Oryza
miR397
Os01g62490
Laccase
0.5
CAUCAACGCUGCGCUCAAUGA Oryza
miR397
Os01g63180
Laccase
1.5
CAUCAACGCUGCGCUCAACAC Oryza
miR397
Os2g51440
Laccase
3
CAUCAACGCUGGACUCACCAA Oryza
miR395
Populus
AAGUUUAAAGGAGCUGUGGAU Arabidopsis
AAUCAAUGCUGCACUCAAUGA Arabidopsis
miR397 OsO3g16610
Laccase
1.5
GAUCAACGCUGCGCUCAACGA Oryza
miR397 Os05g38390
miR397 Os05g38410
Laccase
2.5
GAUCAACGCGGCGCUCAACGA Oryza
Laccase
1
CAUCAACGCUGCACUCAACGA
Oryza
miR397
Os05g38420
Laccase
1
CAUCAACGCUGCACUCAACGA
Oryza
miR397
OslgO01730
Laccase
2.5
miR397
Osl 1g48060
Laccase
1
CAUCAACGCCGCGCUCAACAC Oryza
CAUCAACGCUGCACUGAAUGA
Oryza
miR397
Os12g01730
Laccase
2.5
CAUCAACGCCGCGCUCAACAC Oryza
miR397
Os12g15530
Laccase
2.5
CAUCAACGCCGCGCUCAACAC Oryza
miR397
Os12g15680
Laccase
1.5
CAUCAACGCUGCGCUCAACAC Oryza
miR397
estExtfgenesh4_pg.C_LG_X1
635
Laccase
1.5
miR397
estExtfgenesh4_pm.C_LG_V10293 Laccase
1
miR397
estExt_fgenesh4pm.C_LG_VII
10291
Laccase
1.5
CAUCAAUGCUGCACUCAAUCA Populus
miR397
estExtGenewisel_v .C_LG_XV13501 Laccase
0.5
GAUCAAUGCUGCACUCAAUGA Populus
miR397
eugene3.0001
0449
Laccase
0.5
miR397
eugene3.00060812
Laccase
1
AAUCAACGCUGCACUCAAUAA
Populus
miR397
eugene3.00091222
Laccase
1
CAUCAACGCUGCACUAAAUGA
Populus
miR397
eugene3.00161066
Laccase
1.5
miR397
eugene3.01070064
Laccase
1
GAUCAACGCCGCACUCAAUGA
Populus
miR397
eugene3.04340001
Laccase
1
AAUCAACGCUGCACUCAAUAA
Populus
miR397
fgenesh4_pg.C_LG_IV001314
Laccase
1.5
CAUCAAUGCUGCACUCAACGA Populus
miR397
fgenesh4_pg.C_LG_IX000614
Laccase
1.5
CAUCAAUGCUGCACUCAACGA Populus
miR397
fgenesh4_pg.C_LG_IX001
228
Laccase
1.5
GAUCAAUGCUGCACUCAACGA Populus
miR397
fgenesh4_pg.C_LG_VI000783
Laccase
1.5
GAUCAAUGCAGCACUCAAUGA Populus
miR397
fgenesh4_pg.C_LG_XVI000990
Laccase
2.5
AAUCAACGCUGCUCUCGAUAA Populus
miR397
fgenesh4_pg.C_scaffold_107000055
Laccase
1
GAUCAACGCCGCACUCAAUGA
Populus
miR397
fgenesh4_pm.C_LG_1000649
Laccase
1
UAUCAACGCUGCACUAAAUGA
Populus
miR397
fgenesh4_pm.C_LG_1000891
Laccase
2
AAUCAACGCAGCACUAAAUGA Populus
miR397 grail3.0023027201
Laccase
1.5
GAUCAAUGCAGCACUCAAUGA Populus
miR397 gw1.4300.5.1
Laccase
1.5
GAUCAAUGCUGCACUCAACGA Populus
miR397 gwl..1 184.1
Laccase
1.5
GAUCAAUGCUGCACUCAACGA Populus
miR397
gw1.1.247.1
Laccase
0.5
GAUCAAUGCUGCACUCAAUGA Populus
miR397
gwl.VII.3595.1
Laccase
2.5
CAUCAAUGCUGCCCUCAACGA Populus
miR397
gwl .V11.21
00.1
Laccase
3
GGUCAAUUCUGCACUCAAUCA Populus
miR397
gwl.XI.3910.1
Laccase
1.5
GAUCAAUGCCGCACUCAAUGA Populus
miR397
gwl.XI.3915.1
Laccase
1.5
GAUCAAUGCUGCCCUCAAUGA Populus
miR398
At1g08830.1
CSD
3
AAGGGGUUUCCUGAGAUCACA Arabidopsis
miR398
At2g28190.1
CSD
4
UGCGGGUGACCUGGGAAACA
Arabidopsis
miR398
Os03g11960
CSD
4
UGUGGGCGACCUGGGAAACA
Oryza
miR398
Os08g44770
CSD
4
UGCGGGUGACCUGGGAAACA
Oryza
miR398
fgenesh4_pm.C_scaffold_1
63000009
CSD
4
UGCGGGUGACCUGGGAAACAU Populus
miR398
gwl .IX.5030.1
CSD
4
UGCGGGUGACCUGGGAAACAU Populus
miR398
Atlg15640.1
CytC oxidase
3
AAGGUGUGACCUGAGAAUCACA Arabidopsis
miR398
Os01g42650
CytC oxidase
4
GCGCCGCGACCUGAGAGCACA Oryza
miR399
At3g54700.1
P transporter
2
CAGGCCAGCUCUUCUUUGGCU Arabidopsis
miR399
Os03g04360
P transporter
3
CGGGGCAGCUCUUCUUCGGGU Oryza
miR399
Os08g45000
P transporter
0.5
CAGGGCAACUCUUCUUUGGCU Oryza
miR399
Os10g30770
P transporter
3
CGGGGCAGCUCUUCUUCGGGU Oryza
miR399 Os10g30790
P transporter
3
CGGGGCAGCUCUUCUUCGGGU Oryza
miR399
estExt_fgenesh4_pm.C_LG_V0552 P transporter
2.5
CGGGCCAGCUCUUCUUUGGCU Populus
miR399
eugene3.00051302
P transporter
2.5
CGGGCCAGCUCUUCUUUGGCU Populus
miR399 eugene3.186960001
P transporter
2.5
CGGGCCAGCUCUUCUUUGGCU Populus
miR399
fgenesh4_pg.C_scaffold_125000020
P transporter
1.5
CAGGGCAACUCUUCUUUGGGU Populus
miR399
At2g33770.1
Ub
0.5
UAGAGCAAAUCUCCUUUGGCA Arabidopsis
miR399
At2g33770.1
Ub
0.5
UAGGGCAAAUCUUCUUUGGCA Arabidopsis
miR399 At2g33770.1
Ub
0.5
UAGGGCAUAUCUCCUUUGGCA Arabidopsis
miR399
Ub
0.5
UCGAGCAAAUCUCCUUUGGCA Arabidopsis
At2g33770.1
CAUCAAUGCUGCACUCAAUCA Populus
AAUCAACGCUGCACUCAACGA
Populus
GAUCAAUGCUGCACUCAAUGA Populus
GAUCAAUGCUGCACUCAACGA Populus
UUGGGCAAAUCUCCUUUGGCA Arabidopsis
miR399
At2g33770.1
Ub
0.5
miR399
Os05g48390
Ub
1
CCGGGCAAAUCUCCUUUGGCA
Oryza
miR399
Os05g48390
Ub
1
CGUGGUAAUUCUCCUUUGGCA
Oryza
miR399 Os05g48390
Ub
0
CUGGGCAAAUCUCCUUUGGCA Oryza
miR399
Os05g48390
Ub
0
UAGGGCAAAUCUCCUUUGGCA Oryza
miR399
Os05g48390
Ub
2
UCGGGCAAAUCUCCUUUGGCA Oryza
miR399
Os05g48390
Ub
2
UUGGGCAAAUCUCCUUUGGCA Oryza
miR399
eugene3.00040513
Ub
0.5
CAGGGCAAAUCUUCUUUGGCA Populus
miR399
eugene3.00040513
Ub
1.5
UAGGGCAAAUCUCUUUUGGCU Populus
miR399
eugene3.00040513
Ub
3
AAGGAAAGAUCUUCUUUGGCA Populus
miR399
eugene3.00040513
Ub
0.5
UUGGGCAAAUCUCCUUUGGCA Populus
miR399
eugene3.00040513
Ub
1
miR399
240047
eugene3.01
Ub
0.5
miR399
eugene3.01240047
Ub
1
miR399
eugene3.01240047
Ub
2.5
miR399
240047
eugene3.01
Ub
1
miR399
eugene3.01240047
Ub
1.5
miR403
At1g31280.1
AGO2
0
GGAGUUUGUGCGUGAAUCUAA Arabidopsis
miR403
gw1.200.30.1
AGO2
0
GGAGUUUGUGCGUGAAUCUAA Populus
miR408
At2g30210.1
Laccase
3
ACCAGUGAAGAGGCUGUGCAG Arabidopsis
miR408
At5g05390.1
Laccase
2.5
GCCGGUGAAGAGGCUGUGCAA Arabidopsis
miR408
At5g07130.1
Laccase
2.5
GCCGGUGAAGAGGCUGUGCAG Arabidopsis
miR408
Os01g61160
Laccase
2.5
GCCGGUGAAGAGGCUGUGCAA Oryza
miR408
Os03g18640
Laccase
2.5
GCUAGUGAAGAGGCUGUGCAA Oryza
miR408
eugene3.00131222
Laccase
3
ACCAGUGAAGAGGCUGUGCAG Populus
miR408
eugene3.00191007
Laccase
2.5
GCCAGUGAGGAGGCUGUGCAG Populus
miR408
gwl .VIII.2100.1
Laccase
3
UCCAGUGAAGAGGCUGUGCAA Populus
miR408
At2g02850.1
Plantacyanin
1
CCAAGGGAAGAGGCAGUGCAU
Arabidopsis
miR408
Os02g49850
Plantacyanin
1
CUCGGGGAAGAGGCAGUGCAU
Oryza
miR408
Os03g15340
Plantacyanin
1
CCCAGGGAAGAGGCAGUGCAG
Oryza
miR408
Os6g15600
Plantacyanin
0.5
GCCGGGGAAGAGGCAGUGCAA Oryza
miR408
estExt_fgenesh4_pm.C_LG_11I1
18
Plantacyanin
1.5
GCCAGGGAAGAUGCAGUGCGA Populus
UAGGGAAAAUCUCCUUUGGCA
Populus
UAGGGCAAAUCUCCUUUGGCA Populus
UUGGGCAAAUCUCCUUUGGCA
Populus
AAGGGCAGAUCUUCUUUGGCA Populus
UUGGGCAAAUCUCCUUUGGCA
Populus
CAGGGCAAAUCUUCUUUGGCG Populus
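The half-integer scores in the listing above are consistent with a scheme that charges nothing for a Watson-Crick pair, 0.5 for a G:U wobble, and 1 for any other mispair in an ungapped alignment. The following is a minimal sketch under that assumption; the function name `mispair_score` and the exact weights are illustrative and are not taken from the text, and bulged nucleotides are not modelled. The miR170 and miR171 sequences are those given in Table 1 of the reprinted article, and the two sites are entries from the listing.

```python
# Assumed pairing costs (illustrative): Watson-Crick = 0, G:U wobble = 0.5,
# any other mispair = 1. Gaps and bulges are deliberately not handled.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def mispair_score(mirna: str, site: str) -> float:
    """Score an ungapped antiparallel alignment; both strands are written 5'->3'."""
    if len(mirna) != len(site):
        raise ValueError("an ungapped alignment needs sequences of equal length")
    score = 0.0
    # miRNA position i (from its 5' end) pairs with the site read from its 3' end.
    for m, t in zip(mirna, reversed(site)):
        if (m, t) in WATSON_CRICK:
            continue
        score += 0.5 if (m, t) in WOBBLE else 1.0
    return score


MIR170 = "UGAUUGAGCCGUGUCAAUAUC"
MIR171 = "UGAUUGAGCCGCGCCAAUAUC"
SCL_SITE = "GAUAUUGGCGCGGCUCAAUCA"       # site listed for At2g45160.1
POPULUS_SITE = "GAUACUGGAACGGCUCAAUCA"   # site listed with a score of 2.5

print(mispair_score(MIR171, SCL_SITE))       # 0.0 (perfect complement)
print(mispair_score(MIR170, POPULUS_SITE))   # 2.5 (two mispairs and one G:U wobble)
```

Scoring a site against every member of a miRNA family and keeping the lowest value is one plausible way to arrive at a single per-site score; whether the listing was produced exactly this way is not stated here.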
MicroRNAs in plants
Brenda J. Reinhart,1 Earl G. Weinstein,1 Matthew W. Rhoades,1 Bonnie Bartel,2,3 and David P. Bartel1,3
1Whitehead Institute for Biomedical Research, and Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts 02142, USA; 2Department of Biochemistry and Cell Biology, Rice University, Houston, Texas 77005, USA
MicroRNAs (miRNAs) are an extensive class of ~22-nucleotide noncoding RNAs thought to regulate gene
expression in metazoans. We find that miRNAs are also present in plants, indicating that this class of
noncoding RNA arose early in eukaryotic evolution. In this paper 16 Arabidopsis miRNAs are described,
many of which have differential expression patterns in development. Eight are absolutely conserved in the rice
genome. The plant miRNA loci potentially encode stem-loop precursors similar to those processed by Dicer
(a ribonuclease III) in animals. Mutation of an Arabidopsis Dicer homolog, CARPEL FACTORY, prevents the
accumulation of miRNAs, showing that similar mechanisms direct miRNA processing in plants and animals.
The previously described roles of CARPEL FACTORY in the development of Arabidopsis embryos, leaves,
and floral meristems suggest that the miRNAs could play regulatory roles in the development of plants as
well as animals.
[Key Words: miRNA; siRNA; ncRNA; Dicer; CARPEL FACTORY]
Received May 6, 2002; revised version accepted May 22, 2002.
3Corresponding authors.
E-MAIL bartel@rice.edu; FAX (713) 348-5154.
E-MAIL dbartel@wi.mit.edu; FAX (617) 258-6768.
Article and publication are at http://www.genesdev.org/cgi/doi/10.1101/gad.1004402.

A growing body of evidence suggests that ~22-nucleotide (nt) noncoding RNA molecules play crucial roles as regulators of gene expression in eukaryotes. The first endogenous ~22-nt RNAs to be identified were lin-4 RNA and let-7 RNA, both of which are key regulatory molecules in the pathway controlling the timing of larval development in the nematode Caenorhabditis elegans (Lee et al. 1993; Reinhart et al. 2000). When these RNAs are expressed, they pair to sites within the 3' untranslated region (UTR) of target mRNAs, triggering the translational repression of the mRNA targets (Lee et al. 1993; Wightman et al. 1993; Reinhart et al. 2000; Slack et al. 2000). The mature lin-4 and let-7 RNAs are processed from the double-stranded region of RNA precursor transcripts by Dicer, a molecule with an N-terminal helicase and tandem C-terminal ribonuclease III domains (Bernstein et al. 2001; Grishok et al. 2001; Hutvágner et al. 2001; Ketting et al. 2001). Argonaute homologs also influence the accumulation of the lin-4 and let-7 RNAs, but their biochemical roles are unclear (Grishok et al. 2001). Argonaute family members have a PAZ domain, which may allow protein-protein interaction with Dicer, as well as a Piwi domain, whose function is unknown (Cerutti et al. 2000).

The lin-4 and let-7 regulatory RNAs are now recognized as the founding members of a large class of ~22-nt noncoding RNAs termed microRNAs (miRNAs), several of which are conserved from worms to humans (Pasquinelli et al. 2000; Lagos-Quintana et al. 2001; Lau et al. 2001; Lee and Ambros 2001). RNAs are classified as miRNAs if they share the following features with lin-4 and let-7 RNAs: (1) The mature form of the RNA is a 20-nt to 24-nt species that is usually detectable on Northern blots. (2) The RNA has the potential to pair to flanking genomic sequences, placing the mature miRNA within an imperfect RNA duplex thought to be needed for its processing from a longer precursor transcript. In addition, miRNAs are typically derived from a segment of the genome that is distinct from predicted protein-coding regions. Thus far, >150 tiny RNAs that satisfy these criteria have been identified in animals (Lagos-Quintana et al. 2001, 2002; Lau et al. 2001; Lee and Ambros 2001; Mourelatos et al. 2002). The abundance of the miRNA genes, their intriguing expression patterns in different tissues or in different stages of development, and their evolutionary conservation imply that, as a class, miRNAs have broad regulatory functions in addition to the known roles of lin-4 and let-7 RNAs in the temporal control of developmental events. In support of this idea, six of the recently identified Drosophila miRNAs are complementary to 3'-UTR elements known to confer posttranscriptional regulation in this species (Lai 2002).
MicroRNAs are not the only small RNAs processed by Dicer. Dicer was originally identified as a nuclease involved in the RNA interference (RNAi) pathway of animals (Bernstein et al. 2001). This method of RNA silencing is triggered by long double-stranded RNA (dsRNA), typically introduced by injection or expression from a transgene (Fire et al. 1998). The dsRNA trigger is cleaved by Dicer into ~22-nt RNAs (Bernstein et al. 2001). These ~22-nt RNAs, known as small interfering RNAs (siRNAs), act as guide RNAs to target homologous mRNA sequences for destruction (Hammond et al. 2000; Zamore et al. 2000; Elbashir et al. 2001). RNAs ~25 nt in length are also associated with posttranscriptional gene silencing (PTGS) in plants, and it has been suggested that a Dicer-like activity also produces these small RNAs (Hamilton and Baulcombe 1999; Matzke et al. 2001; Vance and Vaucheret 2001). RNAi, PTGS, and quelling of Neurospora are related pathways that require a conserved set of proteins (Hutvágner and Zamore 2002). For example, PTGS requires ARGONAUTE (Fagard et al. 2000), the RNA-directed RNA polymerase SDE1/SGS2, which may amplify dsRNA used as a trigger for silencing (Dalmay et al. 2000; Mourrain et al. 2000), and the RNA helicase SDE3 (Dalmay et al. 2001). Some aspects of RNA silencing may be species-specific, such as the RNA-directed DNA methylation required to maintain transgene silencing in plants (Morel et al. 2000; Bender 2001). Although RNA silencing has been proposed to have evolved as a viral defense mechanism (Vance and Vaucheret 2001), it can clearly be used by organisms for the regulation of endogenous genes. The Drosophila Argonaute family member aubergine is involved in the endogenous RNAi-like silencing of Stellate by dsRNA produced from both DNA strands of the Suppressor of Stellate locus (Aravin et al. 2001). It is possible that other animals or plants also generate endogenous siRNAs for gene regulation in development.

To further examine the roles of small RNAs in the regulation of plant gene expression, we cloned endogenous RNAs from Arabidopsis. Here we describe 16 plant RNAs that have the defining features of miRNAs. The presence of miRNAs in plants greatly expands the known phylogenetic distribution of this class of tiny noncoding RNAs and indicates that miRNAs arose early in eukaryotic evolution, before the last common ancestor of plants and animals. The presence of miRNAs in plants also suggests that the developmental defects of carpel factory (caf), a mutation in a Dicer homolog (Jacobsen et al. 1999), and mutations in ARGONAUTE family proteins (Bohmert et al. 1998; Moussian et al. 1998) could result from miRNA processing defects. In fact, we find that the accumulation of plant miRNAs is substantially reduced in the caf mutant. The ancient origin of miRNAs, together with the potential link between miRNAs and development, implies that miRNAs might have played roles during the origins and evolution of both plant and animal multicellular life.

GENES & DEVELOPMENT 16:1616-1626 © 2002 by Cold Spring Harbor Laboratory Press ISSN 0890-9369/02 $5.00; www.genesdev.org

Results

Identification of Arabidopsis miRNAs

Using methods designed to clone Dicer cleavage products, which are 20-nt to 24-nt RNAs with 5'-phosphate and 3'-hydroxyl groups (Bernstein et al. 2001; Elbashir et al. 2001; Lau et al. 2001), ~200 tiny RNAs were cloned from Arabidopsis seedlings and ~100 were cloned from flowers. Of these, 18 sequences were represented by more than one clone and were the subject of further analysis. Of these 18 RNAs, 16 had striking similarities to the miRNAs of animals and have therefore been named miR156 through miR171, with genes designated MIR156 through MIR171 (Table 1). Six of the miRNAs represent three pairs of closely related RNA sequences differing only by one or two nucleotides. Interestingly, most of the plant miRNAs begin with a U, a trend previously observed in animal miRNAs (Lagos-Quintana et al. 2001; Lau et al. 2001).

Five of the plant miRNA sequences have a single copy in the Arabidopsis genome, whereas each of the other 11 sequences corresponds to multiple (2-7) loci (Table 1), most likely because of duplications in the Arabidopsis genome (The Arabidopsis Genome Initiative 2000). As expected for miRNA loci, nearly all (37 of 40) of the genomic loci lie outside of annotated segments of the genome, and thus do not correspond to previously identified genes. The three exceptions are for a single miRNA, miR171. Furthermore, each of these 37 loci places the cloned RNA sequence in a context where it can pair with a nearby genomic segment to form a dsRNA hairpin structure resembling those thought to be required for Dicer processing of miRNAs (Fig. 1; Supplemental data available online at http://www.genesdev.org). As with metazoans, the mature miRNA can be processed from either the 5' or the 3' arm of the fold-back precursor. Nevertheless, each miRNA with multiple matches to the genome is always present on the same arm of its potential precursors, suggesting that these loci share a common ancestry (see Supplemental data available online at http://www.genesdev.org). We do not know whether all of these loci are transcriptionally active or whether some might be pseudogenes.

The sizes of the predicted Arabidopsis hairpins are more variable than those of animals. For example, Caenorhabditis elegans miRNAs tend to be cleaved from precursors ~70 nt in length, with the mature miRNA located only ~2-10 bp from the terminal loop of the stem-loop (Lau et al. 2001). Although some of the Arabidopsis precursor predictions resemble those of C. elegans (Fig. 1), others are larger, as seen for the ~190-nt predicted precursor of miR169 (Fig. 1).

In other systems, only one of the RNA strands accumulates following Dicer processing of miRNAs from the double-stranded region of the precursor, while the remainder of the precursor quickly degrades (Hutvágner et al. 2001). As a result, RNA from only one side of the miRNA precursor is typically cloned or detected on Northern blots, although on rare occasions RNA from the other side of the precursor is identified (Lau et al. 2001; Mourelatos et al. 2002), particularly if many clones are sequenced (E.G. Weinstein and D.P. Bartel, unpubl.). In contrast, Dicer processing of perfectly complementary dsRNA molecules in the RNAi pathway is thought to produce two stable overlapping ~21-nt RNA molecules that pair to each other with ~2-nt 3' overhangs (Elbashir et al. 2001; Hutvágner et al. 2001; Nykänen et al. 2001). As expected, for most (14/16) of the plant miRNAs, we cloned sequences from
Table 1. MicroRNAs cloned from Arabidopsis

miRNA gene | No. of clones | miRNA sequence | miRNA length (nt) | Oryza matches | Fold-back arm | Fold-back length | Chr. | Distance to nearest gene
MIR156a | 16 | UGACAGAAGAGAGUGAGCAC | 20-21 | 10 | 5' | 82 | 2 | 3.2 kb downstream of At2g25100 (s)
MIR156b | | | | | 5' | 80 | 4 | 0.36 kb upstream of At4g30970 (a)
MIR156c | | | | | 5' | 83 | 4 | 3.2 kb downstream of At4g31875 (s)
MIR156d | | | | | 5' | 86 | 5 | 2.6 kb upstream of At5g10940 (s)
MIR156e | | | | | 5' | 96 | 5 | 1.6 kb downstream of At5g11980 (s)
MIR156f | | | | | 5' | 90 | 5 | 1.3 kb downstream of At5g26150 (a)
MIR157a | 9 | UUGACAGAAGAUAGAGAGCAC | 20-21 | | 5' | 91 | 1 | 1.8 kb downstream of At1g66780 (a)
MIR157b | | | | | 5' | 91 | 1 | 2.7 kb downstream of At1g66790 (a)
MIR157c | | | | | 5' | 165 | 3 | 2.3 kb downstream of At3g18215 (a)
MIR157d | | | | | 5' | 173 | 1 | 1.0 kb upstream of At1g48470 (s)
MIR158 | 8 | UCCCAAAUGUAGACAAAGCA | 20 | | 3' | 64 | 3 | 0.6 kb upstream of At3g10750 (s)
MIR159 | 8 | UUUGGAUUGAAGGGAGCUCUA | 21 | | 3' | 182 | 1 | 1.9 kb upstream of At1g73690 (s)
MIR160a | 4 | UGCCUGGCUCCCUGUAUGCCA | 21 | 4 | 5' | 78 | 2 | 4.0 kb downstream of At2g39180 (a)
MIR160b | | | | | 5' | 80 | 4 | 2.4 kb upstream of At4g17790 (a)
MIR160c | | | | | 5' | 81 | 5 | 1.5 kb upstream of At5g46850 (a)
MIR161 | 16 | UUGAAAGUGACUACAUCGGGG | 20-21 | - | 5' | 90 | 1 | 2.6 kb downstream of At1g48270 (a)
MIR162a | 3 | UCGAUAAACCUCUGCAUCCAG | 21 | 1 | 3' | 85 | 5 | 1.2 kb upstream of At5g08190 (s)
MIR162b | | | | | 3' | 88 | 5 | 1.4 kb upstream of At5g23070 (s)
MIR163 | 24 | UUGAAGAGGACUUGGAACUUCGAU | 24 | | 3' | 303 | 1 | 0.6 kb upstream of At1g66730 (s)
MIR164a | 21 | UGGAGAAGCAGGGCACGUGCA | 21 | 2 | 5' | 78 | 2 | 1.1 kb upstream of At2g47590 (s)
MIR164b | | | | | 5' | 149 | 5 | 2.4 kb upstream of At5g01750 (s)
MIR165a | 2 | UCGGACCAGGCUUCAUCCCCC | 20-21 | | 3' | 101 | 1 | 1.5 kb downstream of At1g01180 (a)
MIR165b | | | | | 3' | 136 | 4 | 2.8 kb upstream of At4g00880 (s)
MIR166a | 5 | UCGGACCAGGCUUCAUUCCCC | 21 | 6 | 3' | 136 | 2 | 4.7 kb upstream of At2g46690 (a)
MIR166b | | | | | 3' | 112 | 3 | 3.5 kb upstream of At3g61900 (a)
MIR166c | | | | | 3' | 108 | 5 | 10 kb downstream of At5g08690 (s)
MIR166d | | | | | 3' | 101 | 5 | 22 kb downstream of At5g08740 (a)
MIR166e | | | | | 3' | 135 | 5 | 2.6 kb downstream of At5g41910 (a)
MIR166f | | | | | 3' | 91 | 5 | 1.1 kb downstream of At5g43600 (s)
MIR166g | | | | | 3' | 90 | 5 | 1.5 kb upstream of At5g63720 (s)
MIR167a | 19 | UGAAGCUGCCAGCAUGAUCUA | 21 | 3 | 5' | 101 | 3 | 4.7 kb upstream of At3g22890 (a)
MIR167b | | | | | 5' | 90 | 3 | 0.19 kb downstream of At3g63370 (s)
MIR168a | 3 | UCGCUUGGUGCAGGUCGGGGA | 21 | | 5' | 104 | 4 | 2.3 kb upstream of At4g19390 (a)
MIR168b | | | | | 5' | 89 | 5 | 0.5 kb downstream of At5g45310 (s)
MIR169 | 3 | CAGCCAAGGAUGACUUGCCGA | 21 | 2a | 5' | 190 | 3 | 1.9 kb downstream of At3g13400 (a)
MIR170 | 3 | UGAUUGAGCCGUGUCAAUAUC | 21 | | 3' | 64 | 5 | 0.5 kb downstream of At5g66040 (s)
MIR171 | 10 | UGAUUGAGCCGCGCCAAUAUC | 21 | 5b | 3' | 92 | 3 | 0.5 kb downstream of At3g51380 (a)
 | | | | | - | - | 2 | in At2g45160 SCARECROW-like (a)
 | | | | | - | - | 3 | in At3g60630 SCARECROW-like (a)
 | | | | | - | - | 4 | in At4g00150 SCARECROW-like6 (a)
Some miRNAs are represented by clones of different lengths due to heterogeneity of the RNA ends. The sequence of the most abundant clone is shown. Both miR156 and miR161 clones were found with 5' or 3' heterogeneity. MIR160b and MIR161 each had one clone of the same size but in a register shifted 5' of the sequence shown by 2 and 8 nucleotides, respectively. The number of perfect matches to the available rice genomic sequence (Oryza matches) is indicated, as is the arm of the predicted stem-loop precursor that contains the miRNA (Fold-back arm) and the minimum number of nt that would be required to form a fold-back structure bounded by the miRNA and the segment of the predicted precursor that pairs to the miRNA (Fold-back length). Oryza fold-backs have the miRNA in the same arm as their Arabidopsis homologs (Supplemental data available online at http://www.genesdev.org). Chromosomal (Chr.) positions, distance to the nearest annotated gene, and the position of the miRNA, sense (s) and antisense (a), relative to the nearest gene are noted for all matches in the Arabidopsis genome.
a One of the miR169 Oryza matches is at the end of a contig, precluding prediction of a fold-back precursor structure.
b As with Arabidopsis, only one of the miR171 Oryza matches has a predicted fold-back characteristic of miRNAs.
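Two properties stated for the cloned miRNAs, a length of 20 to 24 nt and a tendency to begin with U, can be checked directly against the sequences in Table 1. The snippet below is a small illustrative tally rather than part of the original analysis; the dictionary keys use miRNA names rather than gene names, and the helper name `summarize` is hypothetical.

```python
from collections import Counter

# Most abundant cloned sequence for each miRNA, copied from Table 1.
MIRNAS = {
    "miR156": "UGACAGAAGAGAGUGAGCAC",
    "miR157": "UUGACAGAAGAUAGAGAGCAC",
    "miR158": "UCCCAAAUGUAGACAAAGCA",
    "miR159": "UUUGGAUUGAAGGGAGCUCUA",
    "miR160": "UGCCUGGCUCCCUGUAUGCCA",
    "miR161": "UUGAAAGUGACUACAUCGGGG",
    "miR162": "UCGAUAAACCUCUGCAUCCAG",
    "miR163": "UUGAAGAGGACUUGGAACUUCGAU",
    "miR164": "UGGAGAAGCAGGGCACGUGCA",
    "miR165": "UCGGACCAGGCUUCAUCCCCC",
    "miR166": "UCGGACCAGGCUUCAUUCCCC",
    "miR167": "UGAAGCUGCCAGCAUGAUCUA",
    "miR168": "UCGCUUGGUGCAGGUCGGGGA",
    "miR169": "CAGCCAAGGAUGACUUGCCGA",
    "miR170": "UGAUUGAGCCGUGUCAAUAUC",
    "miR171": "UGAUUGAGCCGCGCCAAUAUC",
}


def summarize(seqs):
    """Return counts of 5'-terminal nucleotides and of sequence lengths."""
    first = Counter(s[0] for s in seqs.values())
    lengths = Counter(len(s) for s in seqs.values())
    return first, lengths


first, lengths = summarize(MIRNAS)
print(dict(first))                  # {'U': 15, 'C': 1}
print(sorted(lengths.items()))      # [(20, 2), (21, 13), (24, 1)]
```

Only miR169 starts with a nucleotide other than U, and every listed sequence falls inside the 20-nt to 24-nt window used to define miRNAs.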
[Figure 1 image: predicted fold-back structures drawn as base-paired stems with 5' and 3' ends marked; labeled panels include MIR156, MIR169, and MIR170.]
Figure 1. Fold-back secondary structures of Arabidopsis miRNA predicted precursors as determined by the RNAfold program. The miRNA sequences are in red. For miR156 and miR169, RNAs from the other side of the fold-back (boxed in blue) were each cloned once. The duplexes that could form between these RNAs and the miRNA from the other strand have ~2-nt 3' overhangs characteristic of Dicer cleavage (Elbashir et al. 2001).
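The overhang geometry mentioned in the caption can be made concrete with a short sketch. It assumes the simplest case, a perfectly complementary dsRNA as in the RNAi pathway, and cuts a pair of 21-nt strands that share 19 paired positions and each carry a 2-nt 3' overhang. The function names and the example sequence are hypothetical, and real miRNA precursors form imperfect duplexes that this sketch does not model.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return rna.translate(COMPLEMENT)[::-1]


def dicer_like_duplex(sense: str, start: int, length: int = 21, overhang: int = 2):
    """Return (top, bottom) strands, both 5'->3', with 3' overhangs on each.

    `sense` is one strand of a perfectly paired dsRNA (an assumption of this
    sketch); `start` is the 0-based position of the top strand's 5' end.
    """
    if start < overhang or start + length > len(sense):
        raise ValueError("window does not fit inside the duplex")
    top = sense[start:start + length]
    # The bottom strand is shifted by `overhang` so that its 3' end extends
    # past the 5' end of the top strand, and vice versa.
    bottom = reverse_complement(sense[start - overhang:start + length - overhang])
    return top, bottom


sense = "ACGGAUCUAGCUAGGCAUUCGAUCCGAUAG"  # arbitrary 30-nt example sequence
top, bottom = dicer_like_duplex(sense, start=4)
paired = len(top) - 2
print(top, bottom)
print(reverse_complement(bottom[:paired]) == top[:paired])  # True: 19 paired positions
```

The last two nucleotides of each strand have no partner in the other strand, which is the 2-nt 3' overhang pattern the text treats as a signature of Dicer-like cleavage.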
only one arm of the fold-back precursor. For two loci, we also cloned a single 21-nt sequence from the other arm of the fold-back (Fig. 1). The disparity in cloning frequency between the two sides, 16:1 in the case of MIR156, was similar to that seen for metazoan miRNAs (E.G. Weinstein and D.P. Bartel, unpubl.). The isolation of these two sequences generated from the opposite arm of the predicted fold-back supports the existence of these stem-loops as miRNA precursors. Furthermore, the duplexes that could be formed between the sequences isolated from both sides of the stems have 2-nt 3' overhangs (Fig. 1), suggesting that they are products of a Dicer-like activity similar to that which processes the metazoan miRNAs (E.G. Weinstein and D.P. Bartel, unpubl.).

The Arabidopsis miRNAs display developmental expression differences

Northern analysis confirmed that the 16 miRNAs were stably expressed as ~21-nt RNAs (Fig. 2). All are expressed at some level in seedlings, leaves, stems, flowers, and siliques (seed pods). Whereas miR163 accumulates in all tissues, with only slightly lower levels in seedlings and siliques, other miRNAs have quite variable levels among the tissues tested. For example, miR157 is most highly expressed in seedlings, and miR171 is most highly expressed in flowers, suggesting that they might play roles in the development of these stages/organs. The size of the RNAs detected approximately matches those that were cloned. In some cases, RNAs of two sizes can be detected, reflecting the heterogeneity of the cloned sequences (Table 1). For example, a probe to miR156 detects both 20-nt and 21-nt RNAs, and the miR156 clones were of both sizes. In another case, miR167, a 21-nt RNA accumulates in all tissues except stem, where a 22-nt RNA accumulates instead. This might reflect either differential transcription of the two MIR167 genes that have differently processed precursors or tissue-specific differences in the Arabidopsis miRNA processing machinery. We have not been able to reliably detect expression of RNAs in the size range of 60-200 nt that might correspond to the stem-loop precursors cleaved by Dicer.

Arabidopsis miRNAs are produced by CARPEL FACTORY

Although the presence of precursors in Arabidopsis was not detected on Northern blots, the potential for their production prompted us to investigate whether the ~21-nt miRNAs might be processed from a longer dsRNA by proteins homologous to those that generate metazoan miRNAs. Dicer is thought to cleave the double-stranded region of the miRNA precursors in Drosophila, C. elegans, and humans (Grishok et al. 2001; Hutvágner et al. 2001; Ketting et al. 2001; Lee and Ambros 2001). Mutations have been isolated in only one of the four Dicer homologs in Arabidopsis, CARPEL FACTORY (CAF; also named SHORT INTEGUMENT [SIN1]; GenBank accession no. AAG38019). The pleiotropic phenotypes associated with loss of CAF/SIN1 function, such as floral meristem proliferation defects, floral organ morphogenesis defects, and altered ovule development, emphasize the critical developmental role of RNAs processed by CAF (Robinson-Beers et al. 1992; Ray et al. 1996a,b; Jacobsen et al. 1999). Northern analysis showed that the expression level of the three miRNAs tested is signifi-
M Se
L St
F
M Se
Si
L
St
F
Si
24miR158
Figure 2.
Developmental expression of Ara-
bidopsis miRNAs. Total RNA from Columbia
seedlings (Se), leaves (L), stems (St), flowers (F),
and siliques (Si) was analyzed on Northern
blots by hybridization to end-labeled DNA oligonucleotide probes complementary to the
miRNA. The lengths of end-labeled RNA oligonucleotides run as a size marker (M) are
noted to the left of each panel. Although
miR165 and miR166 sequences and miR170
and miR171 sequences are too closely related
to be reliably distinguished by hybridization
probes, miR156 and miR157 should be specifically recognized (Lau et al. 2001), as reflected
in their different levels of expression in seedlings and siliques. miR159 and miR164 show a
similar expression profile to miR165, whereas
miR160, miR162, and miR168 have similar
profiles to miR158 (data not shown). The low
expression level of most miRNAs in leaves
and siliques might reflect a difference in the
efficiency of small RNA recovery with the
RNA isolation method used for these two tissues (see Materials and Methods). Blots were
stripped and reprobed with an oligonucleotide
probe complementary to U6 as a loading control.
1620
24-
(1)
2118-
miRll
46,'
miR163
I-,
miR157
21-0
18-
ft
242118miR156
oi
2
miR169
U6
7824miR165
18-
Sr
miR171
i
miR167
w
2421-
mIR170
18-
*aSu
78
GENES & DEVELOPMENT
)3K
U6
microRNAs in plants
cantly reduced in carpel factory homozygotes (Fig. 3).
Although the level of miRNA precursors is increased
when Dicer function is reduced in metazoans (Grishok
et al. 2001; Hutvigner et al. 2001; Ketting et al. 2001; Lee
and Ambros 2001), we have not detected precursor accumulation in caf mutants (Fig. 3; data not shown).
Figure 3. Expression of miR169 is dependent on CARPEL FACTORY. Total RNA from wild-type
Landsberg erecta (CAF/CAF), heterozygous (CAF/caf), and homozygous (caf/caf) carpel factory
leaves (L), stems (St), and flowers (F) was analyzed on a Northern blot. RNA size markers (M)
are noted to the left. The blot probed for miR158 was stripped and reprobed with a U6
end-labeled DNA probe as a loading control.
[Northern blot panels: miR169, miR156, miR158, and U6.]
Evolutionary conservation of Arabidopsis miRNAs in Oryza
The evolutionary conservation of miRNA sequences in different species indicates that they have
important biological functions (Pasquinelli et al. 2000; Lagos-Quintana et al. 2001; Lau et al.
2001; Lee and Ambros 2001). Eight Arabidopsis miRNAs have sets of identical matches in the
genome of the rice Oryza sativa L. ssp. indica (Table 1), which was estimated to have 92%
functional coverage at the time of our analysis (Yu et al. 2002). With rare exceptions (noted
in Table 1), these sets of Oryza homologs have adjacent sequences that could form stem-loop
precursors analogous to those of Arabidopsis, with the miRNA sequence invariably on the same
arm of the precursor in both species (see Supplemental data available online at
http://www.genesdev.org). The Arabidopsis and Oryza sequences have drifted considerably in
regions outside the miRNA sequence, but selective pressure can be seen in the segments
predicted to base-pair with the miRNAs, resulting in only a few base changes in these segments
and a conserved overall propensity for dsRNA formation (Fig. 4). For each set of related loci,
the precursor duplexes extend beyond the length of the miRNA, but the sequence of the flanking
duplex RNA is variable (see Supplemental data available online at http://www.genesdev.org).
This conservation in secondary structure accompanied by variability in sequence provides added
evidence that the secondary structural context of these RNAs is important, presumably for their
processing from stem-loop precursors.
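As a rough illustration of how pairing can be conserved while sequence drifts, the following
Python sketch counts the positions of one arm of a fold-back that can pair (Watson-Crick or G-U
wobble) with the opposite arm in an ungapped, antiparallel alignment. The two pairs of short
arms are invented for illustration and are not taken from the MIR162 or MIR164 loci, and the
function names are hypothetical. Real precursors contain bulges and loops that this ungapped
comparison ignores.

```python
# Base pairs allowed in an RNA stem: Watson-Crick pairs plus the G-U wobble.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def paired_fraction(arm_5p, arm_3p):
    """Fraction of positions that pair when two equal-length arms face each other."""
    if len(arm_5p) != len(arm_3p):
        raise ValueError("arms must have equal length for an ungapped comparison")
    partner = arm_3p[::-1]  # read the 3' arm backwards so it faces the 5' arm
    paired = sum((a, b) in PAIRS for a, b in zip(arm_5p, partner))
    return paired / len(arm_5p)


def identity(seq_a, seq_b):
    """Fraction of identical positions between two equal-length sequences."""
    return sum(a == b for a, b in zip(seq_a, seq_b)) / len(seq_a)


# Hypothetical flanking-duplex segments from two species (invented sequences).
species_1 = ("GGAUCA", "UGAUCU")
species_2 = ("GCGUUA", "UAGCGC")

print(paired_fraction(*species_1))            # pairing within species 1
print(paired_fraction(*species_2))            # pairing within species 2
print(identity(species_1[0], species_2[0]))   # sequence identity of the two 5' arms
```

In this toy case both stems are fully paired although the two 5' arms share only half of their
positions, which is the pattern described above for the flanking duplex RNA.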
An miRNA complementary to three related mRNAs
In nematodes, lin-4 and let-7 RNA recognize their target
mRNAs through limited base-pairing to complementary
sites within the 3' UTR of their targets. The largest regions of uninterrupted complementarity are only ~8 nt
(Lee et al. 1993; Wightman et al. 1993; Reinhart et al.
2000; Slack et al. 2000). Consistent with this precedent,
the plant miRNA sequences do not perfectly match coding regions, with the exception of miR171, which has
four matches to the genome. One locus is 0.5 kb from
the nearest predicted coding region and adjacent to genomic sequence that can form a classical miRNA precursor, consistent with the idea that it is a true miRNA.
Further supporting this idea is the observation that a
closely related sequence, miR170, was also cloned multiple times and has all the characteristics of the other
plant miRNAs. However, the other three MIR171 loci
differ from those of the other miRNAs (Table 1). They
are anti-sense to the coding region of three SCARECROW-like genes of the GRAS family of putative transcription factors (DiLaurenzio et al. 1996; Pysh et al.
1999). This is the first example of a convincing miRNA
candidate that is also the perfect anti-sense match to a
coding region. Although this miR171 sequence identity
might be a coincidence, the targets of this 21-nt RNA
could include these three SCARECROW-like genes.
miR171 (and perhaps the related miRNA, miR170)
might act like a translational regulator similar to the
lin-4 and let-7 RNAs, or it might pair with these three
genes for a very different type of regulatory interaction.
miR171 could direct cleavage of the messages as if it
were an siRNA of the RNAi pathway, or it could direct
a nucleic acid modification such as the methylation of
genomic DNA seen in PTGS and transcriptional gene
silencing of plants. Interestingly, the five perfect
matches to miR171 in Oryza also include one miRNA
homolog and four anti-sense matches to SCARECROW
family members. This observation raises the possibility
that these SCARECROW segments might be conserved
based on their function as miRNA targets in addition to
their function in coding proteins.
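The kind of search that reveals such anti-sense matches can be sketched in a few lines of
Python: convert the small RNA to the DNA alphabet, then scan a genomic sequence for exact
matches to the query and to its reverse complement. This is a minimal sketch, not the procedure
used in this study; the query, the toy genome, and the function names are all invented for
illustration, and a real analysis would use an indexed search tool rather than repeated string
scans.

```python
def reverse_complement(dna):
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_perfect_matches(small_rna, genome):
    """Return sorted (position, strand) pairs for every exact match on either strand.

    Positions are 0-based offsets on the top strand of `genome`.
    """
    query = small_rna.upper().replace("U", "T")
    hits = []
    for strand, pattern in (("+", query), ("-", reverse_complement(query))):
        start = genome.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = genome.find(pattern, start + 1)  # allow overlapping matches
    return sorted(hits)


# Hypothetical 9-nt query and toy genome (not real miR171 or Arabidopsis sequence).
toy_rna = "UGAUUGAGC"
toy_genome = "AAAA" + "TGATTGAGC" + "CCCC" + "GCTCAATCA" + "GG"

print(find_perfect_matches(toy_rna, toy_genome))
```

A hit on the "-" strand means the RNA is identical to the bottom strand and is therefore
perfectly complementary to any transcript copied from the top strand at that position, which is
the situation described for the SCARECROW-like genes.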
[Figure 4 stem-loop diagrams not reproduced. Panel A: MIR162a and MIR162b (Arabidopsis), MIR162 (Oryza). Panel B: MIR164a and MIR164b (Arabidopsis), MIR164a and MIR164b (Oryza).]
Figure 4. Conservation between the Arabidopsis and Oryza predicted stem-loop precursors. (A) miR162 homologs. (B) miR164
homologs. Sequence homology is seen within the miRNA (in red), its paired sequences, and a few base pairs adjacent to the miRNA.
The remainder of the sequence has drifted considerably, with the main constraint being the formation of a stem-loop structure.
Other endogenous small RNAs
The other two RNAs cloned multiple times, Seq C and
Seq F in Figure 5, are not likely to be miRNAs. Expression of Seq F but not Seq C can be detected on Northern
blots (data not shown). Nonetheless, neither appears to
have the potential to form extended pairing with the
adjoining sequence like that seen for the other 16 sequences.
Interestingly, both of these sequences match
single loci in the same 2.3-kb region of Chromosome 2
that is also the source of four other ~22-nt RNAs that we
cloned once (Fig. 5). These RNAs are unlikely to be simply degradation products of mRNAs. Only two of these
six sequences correspond to the same DNA strand as the
two predicted protein-coding genes in this 2.3-kb region.
Moreover, one of the single-clone RNAs (Fig. 5, Seq B) is a 2-nt-offset reverse-complement of
Seq C. A duplex formed between them would have 1-nt and 2-nt 3' overhangs, reminiscent of Dicer
cleavage products during RNAi (Elbashir et al. 2001). The high density of 21-nt to 22-nt RNAs
cloned from this region implicates either endogenous RNAi or some other, unknown Dicer-mediated
event.
Figure 5. A cluster of small RNAs derived from Chromosome 2. Arrows represent the two predicted
genes in this region (At2g39680 and At2g39670), and vertical lines represent the genomic
positions of the six cloned RNAs. Sequences of the RNAs are listed, with cloning frequencies in
parentheses.
[Genomic map with 1-kb scale bar not reproduced.]
A   UUCAAUAAAU AAUUGGUUCU A    (1)
B   GAACUAGAAA AGACAUUGGA C    (1)
C   UCCAAUGUCU UUUCUAGUUC GU   (3)
D   AGAGUAAGAU GGAUCUUGAU AA   (1)
E   UAUAUCCCAU UUCUACCAUC UG   (1)
F   UCCAAGCGAA UGAUGAUACU U    (3)
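The relationship between Seq B and Seq C can be checked directly from the sequences listed in
Figure 5. The Python sketch below takes the reverse complement of Seq C, finds the longest
stretch it shares with Seq B, and reports the 3' overhang left on each strand. It assumes a
single contiguous, perfectly complementary duplex; the function names are our own and are not
part of any published tool.

```python
def reverse_complement_rna(seq):
    """Reverse complement of an RNA string (A, C, G, U only)."""
    return seq.translate(str.maketrans("ACGU", "UGCA"))[::-1]


def longest_common_substring(a, b):
    """Return (length, start_in_a, start_in_b) of the longest shared substring."""
    best = (0, 0, 0)
    for i in range(len(a)):
        for j in range(len(b)):
            k = 0
            while i + k < len(a) and j + k < len(b) and a[i + k] == b[j + k]:
                k += 1
            if k > best[0]:
                best = (k, i, j)
    return best


def duplex_overhangs(top, bottom):
    """Return (paired_length, top_3p_overhang, bottom_3p_overhang).

    The paired region is the longest stretch of `bottom` that matches the
    reverse complement of `top`, so pairing is perfect and ungapped.
    """
    rc_top = reverse_complement_rna(top)
    length, bottom_start, rc_start = longest_common_substring(bottom, rc_top)
    # Bases at the start of rc_top that lie outside the match correspond to
    # unpaired bases at the 3' end of `top`.
    top_3p = rc_start
    bottom_3p = len(bottom) - (bottom_start + length)
    return length, top_3p, bottom_3p


# Sequences as listed in Figure 5 (spaces removed).
SEQ_B = "GAACUAGAAAAGACAUUGGAC"
SEQ_C = "UCCAAUGUCUUUUCUAGUUCGU"

print(duplex_overhangs(SEQ_C, SEQ_B))
```

For these two sequences the sketch reports a 20-bp paired region with a 2-nt 3' overhang on
Seq C and a 1-nt 3' overhang on Seq B, consistent with the description above.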
Discussion
We have described 16 plant miRNAs that have the
characteristic features of metazoan miRNAs. Like the
miRNAs of animals, the plant miRNAs are 20-nt to 24-nt endogenous RNAs detectable on Northern blots and
are derived from one arm of an apparent stem-loop precursor through the action of Dicer. As with most of the
metazoan miRNAs, most plant miRNAs begin with a U,
are transcribed from independent genes, and are evolutionarily conserved. The discovery that the phylogenetic
distribution of miRNAs extends to plants indicates that
miRNAs arose early in eukaryotic evolution and suggests that they have been shaping gene expression since
the emergence of multicellular life. Although the evolution of the RNAi and PTGS pathways and their related
proteins has been attributed to defense against viruses
and transposons (Ketting and Plasterk 2000; Vance and
Vaucheret 2001), the presence of miRNAs in plants suggests that Dicer and Argonaute proteins also have ancient roles in miRNA processing and function.
One difference between plant and animal miRNAs is
the dsRNA precursor from which the mature miRNAs
are cleaved. Based on the length of RNA that would be
necessary to allow the miRNA to be incorporated into an
RNA duplex suitable for Dicer cleavage, we predict that
plant miRNA precursors can be more than three times as
large as those of animals (Table 1). However, we have not
detected plant precursor molecules during our Northern
analysis of wild-type or caf RNA. Our method may not
be sufficiently sensitive to detect very low levels of
precursors. Perhaps precursor transcripts are more rapidly cleaved and turned over in Arabidopsis than in
metazoans, or plant precursors might be too large or diffuse in size for Northern analysis techniques maximized
for the resolution of the ~21-nt mature RNAs. For instance, plant miRNAs might be processed cotranscriptionally, directly from transient primary transcripts.
This would be in contrast to metazoan miRNAs, which
often appear to be processed from metastable stem-loop
precursors that have been preprocessed from a primary
transcript (Lau et al. 2001). Although the common role of
Dicer homologs in the production of plant and animal
miRNAs highlights the similarities between their
mechanisms of production, there might be differences in
the structure and production of precursors, cellular compartmentalization, timing of precursor processing, or
types of cofactors involved in processing.
The increasing number of miRNAs being identified
raises the question of what their cellular functions are.
Although some might regulate translation via base-pairing to target gene 3' UTRs in a manner similar to regulation by lin-4 and let-7 RNAs, it is not clear whether all
will be found to perform similar biochemical functions.
One hint that miRNAs could perform other types of
RNA-mediated gene regulation is our finding that
miR171 could interact with the coding region of three
GRAS family transcription factors through perfect
complementarity rather than the limited base-pairing
seen between lin-4 and let-7 and the 3' UTRs of their
targets. If these genes are regulatory targets of miR171,
the miRNA could act like other ~21-nt regulatory RNAs
and direct mRNA degradation or epigenetic modification
of the genomic sequence.
A role for the miRNAs in development of both plants
and animals is suggested by the phenotypes of Dicer and
Argonaute family mutants. In C. elegans, developmental
defects resulting from reduction of function of dcr-1
(Dicer) and alg-1/alg-2 (Argonaute-like gene) have been
attributed to the improper processing of miRNA precursors and a reduction in mature miRNA expression
(Grishok et al. 2001). The mutant animals essentially
reiterate stem-cell-like divisions and delay the switch to
a later-stage developmental program. An intriguing parallel in Arabidopsis is that mutant
alleles of caf/sin1 delay the meristem switch from vegetative to floral development (Ray et al.
1996a) and cause overproliferation of the floral meristem (Jacobsen et al. 1999), which
suggests a distant link between the pathways affected by Dicer mutants in plants and animals.
Mutations in two Arabidopsis Argonaute family genes also alter meristem development. The
argonaute mutants disrupt axillary shoot meristem formation and leaf development (Bohmert et
al. 1998), and ZWILLE/PINHEAD is required for shoot meristem maintenance and floral development
(Moussian et al. 1998; Lynn et al. 1999). The existence of miRNAs in plants suggests that
aberrant processing of miRNAs could be responsible for some if not all of the developmental
defects in caf mutants, and it is possible that the same will be true for argonaute or
zwille/pinhead mutants. However, ARGONAUTE is also required for PTGS (Fagard et al. 2000; Morel
et al. 2002), and a related protein is required for RNAi in animals (Tabara et al. 1999;
Hammond et al. 2001; Williams and Rubin 2002). In fact, the Drosophila Argonaute family member
aubergine, a gene required for oogenesis (Schupbach and Wieschaus 1991), is involved in the
endogenous RNAi-like silencing of Stellate by dsRNA produced from both DNA strands of the
Suppressor of Stellate locus (Aravin et al. 2001), raising the possibility that the Arabidopsis
argonaute or caf phenotypes reflect the role of these proteins in the production of endogenous
siRNAs that control gene expression. Further investigation of the roles of small RNAs such as
those from the Chromosome 2 cluster (Fig. 5) will address this possibility.
Finally, we suspect that other classes of Dicer- and Argonaute-dependent small RNAs are present
in Arabidopsis. Noncoding RNAs continue to be discovered in a wide range of organisms, and the
roles they play in the cell are only beginning to be understood (Eddy 2001). In many ways, the
most interesting possibility is that no one class of RNAs can be responsible for the phenotypes
of Dicer and Argonaute family mutations because organisms use such a rich variety of
RNA-mediated gene regulation in their development.
Materials and methods
Plant growth and RNA isolation
Total RNA from wild-type Arabidopsis thaliana (Columbia accession) was isolated from 6-day-old
seedlings grown on agar-based medium overlaid with filter paper and from flowers and stems of
4-week-old plants grown in soil using Trizol (GIBCO BRL). Total RNA was prepared from leaves
and siliques using a modification of the method described in Nagy et al. (1988), in which the
LiCl precipitation was replaced by ethanol precipitation. For isolation of RNA from carpel
factory plants, progeny of CAF/caf heterozygous plants (in the Landsberg erecta accession) were
grown on medium supplemented with 12 µg/mL kanamycin for 8 d, after which kanamycin-resistant
individuals were transferred to soil and grown for an additional 24 d under continuous
illumination. Plants were then scored as having (caf/caf) or lacking (CAF/caf) the carpel
factory phenotype (Jacobsen et al. 1999), and RNA was prepared from leaves, stems, and flowers
using a modification of the Nagy et al. (1988) method (see above). Wild-type plants (Landsberg
erecta accession) were processed similarly, except that seeds were originally sown on medium
lacking kanamycin.
RNA analysis
Endogenous 18-nt to 26-nt RNAs from seedlings and flowers were isolated from total RNA by 15%
PAGE and cloned as described (Lau et al. 2001). The laboratory protocol is available at
http://web.wi.mit.edu/bartel/pub/. For Northern analysis, 20 µg of total RNA per lane was
separated on a 15% polyacrylamide gel, electroblotted to a nylon membrane, and hybridized to
end-labeled anti-sense DNA probes (Lee et al. 1993).
Sequence analysis
Sequences of RNA clones were compared with the Arabidopsis genome downloaded from
ftp://ncbi.nlm.nih.gov/genbank/genomes/A_thaliana/ (13-Aug-2001). Predicted secondary
structures were generated using the Zuker folding algorithm and manually inspected for
fold-backs with the RNA sequence in the stem as is characteristic of metazoan miRNAs (Lau et
al. 2001). To identify Oryza sativa homologs, the miRNAs were compared with the rice genome
sequence downloaded from the Beijing Genomics Institute Web site at
http://btn.genomics.org.cn/rice (first draft) using the BLAST algorithm, and the adjoining
sequences were analyzed for fold-back secondary structures as described above.
Acknowledgments
We thank the Arabidopsis Biological Resource Center at Ohio State University for seeds
segregating for the caf mutation, Nelson Lau for reagents and advice in the cloning of
endogenous Dicer products, Lee Lim for advice on bioinformatics, and Phil Zamore and members of
the Bartel laboratories for comments on the manuscript. This research was supported in part by
the Robert A. Welch Foundation (C-1309).
The publication costs of this article were defrayed in part by payment of page charges. This
article must therefore be hereby marked "advertisement" in accordance with 18 USC section 1734
solely to indicate this fact.
References
The Arabidopsis Genome Initiative. 2000. Analysis of the genome sequence of the flowering plant
Arabidopsis thaliana. Nature 408: 796-815.
Aravin, A.A., Naumova, N.M., Tulin, A.A., Rozovsky, Y.M., and Gvozdev, V.A. 2001.
Double-stranded RNA-mediated silencing of genomic tandem repeats and transposable elements in
Drosophila melanogaster germline. Curr. Biol. 11: 1017-1027.
Bender, J. 2001. A vicious cycle: RNA silencing and DNA methylation in plants. Cell 106:
129-132.
Bernstein, E., Caudy, A.A., Hammond, S.M., and Hannon, G.J. 2001. Role for a bidentate
ribonuclease in the initiation step of RNA interference. Nature 409: 295-296.
Bohmert, K., Camus, I., Bellini, C., Bouchez, D., Caboche, M., and Benning, C. 1998. AGO1
defines a novel locus of Arabidopsis controlling leaf development. EMBO J. 17: 170-180.
Cerutti, L., Mian, N., and Bateman, A. 2000. Domains in gene silencing and cell differentiation
proteins: The novel PAZ domain and redefinition of the Piwi domain. Trends Biochem. Sci. 25:
481-482.
Dalmay, T., Hamilton, A., Rudd, S., Angell, S., and Baulcombe, D.C. 2000. An RNA-dependent RNA
polymerase in Arabidopsis is required for posttranscriptional gene silencing mediated by a
transgene but not by a virus. Cell 101: 543-553.
Dalmay, T., Horsefield, R., Braunstein, T.H., and Baulcombe, D.C. 2001. SDE3 encodes an RNA
helicase required for posttranscriptional gene silencing in Arabidopsis. EMBO J. 20: 2069-2078.
DiLaurenzio, L., Wysocka-Diller, J., Malamy, J.E., Pysh, L., Helariutta, Y., Freshour, G.,
Hahn, M.G., Feldmann, K.A., and Benfey, P.N. 1996. The SCARECROW gene regulates an asymmetric
cell division that is essential for generating the radial organization of the Arabidopsis root.
Cell 86: 423-433.
Eddy, S.R. 2001. Non-coding RNA genes and the modern RNA world. Nat. Rev. Genet. 2: 919-929.
Elbashir, S.M., Lendeckel, W., and Tuschl, T. 2001. RNA interference is mediated by 21- and
22-nucleotide RNAs. Genes & Dev. 15: 188-200.
Fagard, M., Boutet, S., Morel, J.-B., Bellini, C., and Vaucheret, H. 2000. AGO1, QDE-2, and
RDE-1 are related proteins required for post-transcriptional gene silencing in plants, quelling
in fungi, and RNA interference in animals. Proc. Natl. Acad. Sci. 97: 11650-11654.
Fire, A., Xu, S., Montgomery, M.K., Kostas, S.A., Driver, S.E., and Mello, C.C. 1998. Potent
and specific genetic interference by double-stranded RNA in Caenorhabditis elegans. Nature 391:
806-811.
Grishok, A., Pasquinelli, A.E., Conte, D., Li, N., Parrish, S., Ha, I., Baillie, D.L., Fire,
A., Ruvkun, G., and Mello, C.C. 2001. Genes and mechanisms related to RNA interference regulate
expression of the small temporal RNAs that control C. elegans developmental timing. Cell 106:
23-34.
Hamilton, A.J. and Baulcombe, D.C. 1999. A novel species of small antisense RNA in
posttranscriptional gene silencing. Science 286: 950-952.
Hammond, S.M., Bernstein, E., Beach, D., and Hannon, G.J. 2000. An RNA-directed nuclease
mediates posttranscriptional gene silencing in Drosophila cells. Nature 404: 293-296.
Hammond, S.M., Boettcher, S., Caudy, A.A., Kobayashi, R., and Hannon, G.J. 2001. Argonaute2, a
link between genetic and biochemical analyses of RNAi. Science 293: 1146-1150.
Hutvágner, G. and Zamore, P.D. 2002. RNAi: Nature abhors a double-strand. Curr. Opin. Genet.
Dev. 12: 225-232.
Hutvágner, G., McLachlan, J., Pasquinelli, A.E., Balint, E., Tuschl, T., and Zamore, P.D. 2001.
A cellular function for the RNA-interference enzyme Dicer in the maturation of the let-7 small
temporal RNA. Science 293: 834-838.
Jacobsen, S.E., Running, M.P., and Meyerowitz, E.M. 1999. Disruption of an RNA helicase/RNAseIII
gene in Arabidopsis causes unregulated cell division in floral meristems. Development 126:
5231-5243.
Ketting, R.F. and Plasterk, R.H.A. 2000. A genetic link between co-suppression and RNA
interference in C. elegans. Nature 404: 296-298.
Ketting, R.F., Fischer, S.E.J., Bernstein, E., Sijen, T., Hannon, G.J., and Plasterk, R.H.A.
2001. Dicer functions in RNA interference and in synthesis of small RNA involved in
developmental timing in C. elegans. Genes & Dev. 15: 2654-2659.
Lagos-Quintana, M., Rauhut, R., Lendeckel, W., and Tuschl, T. 2001. Identification of novel
genes coding for small expressed RNAs. Science 294: 853-858.
Lagos-Quintana, M., Rauhut, R., Yalcin, A., Meyer, J., Lendeckel, W., and Tuschl, T. 2002.
Identification of tissue-specific microRNAs from mouse. Curr. Biol. 12: 735-739.
Lai, E.C. 2002. MicroRNAs are complementary to 3' UTR motifs that mediate negative
post-transcriptional regulation. Nat. Genet. 30: 363-364.
Lau, N.C., Lim, L.P., Weinstein, E.G., and Bartel, D.P. 2001. An abundant class of tiny RNAs
with probable regulatory roles in Caenorhabditis elegans. Science 294: 858-862.
Lee, R.C. and Ambros, V. 2001. An extensive class of small RNAs in Caenorhabditis elegans.
Science 294: 862-864.
Lee, R.C., Feinbaum, R.L., and Ambros, V. 1993. The C. elegans heterochronic gene lin-4 encodes
small RNAs with antisense complementarity to lin-14. Cell 75: 843-854.
Lynn, K., Fernandez, A., Aida, M., Sedbrook, J., Tasaka, M., Masson, P., and Barton, M.K. 1999.
The PINHEAD/ZWILLE gene acts pleiotropically in Arabidopsis development and has overlapping
functions with the ARGONAUTE1 gene. Development 126: 469-481.
Matzke, M.A., Matzke, A.J., Pruss, G.J., and Vance, V.B. 2001. RNA-based silencing strategies
in plants. Curr. Opin. Genet. Dev. 11: 221-227.
Morel, J., Mourrain, P., Béclin, C., and Vaucheret, H. 2000. DNA methylation and chromatin
structure affect transcriptional and posttranscriptional transgene silencing in Arabidopsis.
Curr. Biol. 10: 1591-1594.
Morel, J.B., Godon, C., Mourrain, P., Béclin, C., Boutet, S., Feuerbach, F., Proux, F., and
Vaucheret, H. 2002. Fertile hypomorphic ARGONAUTE (ago1) mutants impaired in
posttranscriptional gene silencing and virus resistance. Plant Cell 14: 629-639.
Mourelatos, Z., Dostie, J., Paushkin, S., Sharma, A., Charroux, B., Abel, L., Rappsilber, J.,
Mann, M., and Dreyfuss, G. 2002. miRNPs: A novel class of ribonucleoproteins containing
numerous microRNAs. Genes & Dev. 16: 720-728.
Mourrain, P., Béclin, C., Elmayan, T., Feuerbach, F., Godon, C., Morel, J.B., Jouette, D.,
Lacombe, A.M., Nikic, S., Picault, N., et al. 2000. Arabidopsis SGS2 and SGS3 genes are
required for posttranscriptional gene silencing and natural virus resistance. Cell 101: 533-542.
Moussian, B., Schoof, H., Haecker, A., Jurgens, G., and Laux, T. 1998. Role of the ZWILLE gene
in the regulation of central shoot meristem cell fate during Arabidopsis embryogenesis. EMBO J.
17: 1799-1809.
Nagy, F., Kay, S.A., and Chua, N.-H. 1988. Analysis of gene expression in transgenic plants. In
Plant molecular biology manual (ed. S.B. Gelvin and R.A. Schilperoort), Part B4, pp. 1-29.
Kluwer, Dordrecht.
Nykänen, A., Haley, B., and Zamore, P.D. 2001. ATP requirements and small interfering RNA
structure in the RNA interference pathway. Cell 107: 309-321.
Pasquinelli, A.E., Reinhart, B.J., Slack, F., Martindale, M.Q., Kuroda, M., Maller, B.,
Srinivasan, A., Fishman, M., Hayward, D., Ball, E., et al. 2000. Conservation across animal
phylogeny of the sequence and temporal regulation of the 21 nucleotide let-7 heterochronic
regulatory RNA. Nature 408: 86-89.
Pysh, L.D., Wysocka-Diller, J.W., Camilleri, C., Bouchez, D., and Benfey, P.N. 1999. The GRAS
gene family in Arabidopsis: Sequence characterization and basic expression analysis of the
SCARECROW-LIKE genes. Plant J. 18: 111-119.
Ray, A., Lang, J.D., Golden, T., and Ray, S. 1996a. SHORT INTEGUMENT (SIN1), a gene required
for ovule development in Arabidopsis, also controls flowering time. Development 122: 2631-2638.
Ray, S., Golden, T., and Ray, A. 1996b. Maternal effects of the short integument mutation on
embryo development. Dev. Biol. 180: 365-369.
Reinhart, B.J., Slack, F.J., Basson, M., Bettinger, J.C., Pasquinelli, A.E., Rougvie, A.E.,
Horvitz, H.R., and Ruvkun, G. 2000. The 21 nucleotide let-7 RNA regulates developmental timing
in Caenorhabditis elegans. Nature 403: 901-906.
Robinson-Beers, K., Pruitt, R.E., and Gasser, C.S. 1992. Ovule development in wild-type
Arabidopsis and two female-sterile mutants. Plant Cell 4: 1237-1249.
Schupbach, T. and Wieschaus, E. 1991. Female sterile mutations on the second chromosome of
Drosophila melanogaster II. Mutations blocking oogenesis or altering egg morphology. Genetics
129: 1119-1136.
Slack, F.J., Basson, M., Liu, Z., Ambros, V., Horvitz, H.R., and Ruvkun, G. 2000. The lin-41
RBCC gene acts in the C. elegans heterochronic pathway between the let-7 regulatory RNA and the
LIN-29 transcription factor. Mol. Cell 5: 659-669.
Tabara, H., Sarkissian, M., Kelly, W.G., Fleenor, J., Grishok, A., Timmons, L., Fire, A., and
Mello, C.C. 1999. The rde-1 gene, RNA interference, and transposon silencing in C. elegans.
Cell 99: 123-132.
Vance, V. and Vaucheret, H. 2001. RNA silencing in plants: Defense and counterdefense. Science
292: 2277-2280.
Wightman, B., Ha, I., and Ruvkun, G. 1993. Posttranscriptional regulation of the heterochronic
gene lin-14 by lin-4 mediates temporal pattern formation in C. elegans. Cell 75: 855-862.
Williams, R.W. and Rubin, G.M. 2002. ARGONAUTE1 is required for efficient RNA interference in
Drosophila embryos. Proc. Natl. Acad. Sci. 99: 6889-6894.
Yu, J., Hu, S., Wang, J., Wong, G.K., Li, S., Liu, B., Deng, Y., Dai, L., Zhou, Y., Zhang, X.,
et al. 2002. A draft sequence of the rice genome (Oryza sativa L. ssp. indica). Science 296:
79-92.
Zamore, P.D., Tuschl, T., Sharp, P.A., and Bartel, D.P. 2000. RNAi: Double-stranded RNA directs
the ATP-dependent cleavage of mRNA at 21 to 23 nucleotide intervals. Cell 101: 25-33.
Cell, Vol. 115,787-798,December26, 2003,Copyright©2003 by CellPress
Prediction of Mammalian MicroRNA Targets
4
4
Benjamin P. Lewis,', I-hung Shih,2,
Matthew W. Jones-Rhoades, 1' 2 David P. Bartel,' 2.*
and Christopher B. Burge'l*
'Department of Biology
Massachusetts Institute of Technology
Cambridge, Massachusetts 02139
2
Whitehead Institute for Biomedical Research
9 Cambridge Center
Cambridge, Massachusetts 02142
*Correspondence: dbartel@wi.mit.edu (D.P.B.), cburge@mit.edu (C.B.B.)
4 These authors contributed equally to this work.
Summary
MicroRNAs (miRNAs) can play important gene regulatory roles in nematodes, insects, and plants by basepairing to mRNAs to specify posttranscriptional repression of these messages. However, the mRNAs regulated by vertebrate miRNAs are all unknown. Here we predict more than 400 regulatory target genes for the conserved vertebrate miRNAs by identifying mRNAs with conserved pairing to the 5' region of the miRNA and evaluating the number and quality of these complementary sites. Rigorous tests using shuffled miRNA controls supported a majority of these predictions, with the fraction of false positives estimated at 31% for targets identified in human, mouse, and rat and 22% for targets identified in pufferfish as well as mammals. Eleven predicted targets (out of 15 tested) were supported experimentally using a HeLa cell reporter system. The predicted regulatory targets of mammalian miRNAs were enriched for genes involved in transcriptional regulation but also encompassed an unexpectedly broad range of other functions.
Introduction
MicroRNAs are endogenous ~22 nt RNAs that can play important gene regulatory roles by pairing to the messages of protein-coding genes to specify mRNA cleavage or repression of productive translation (Lai, 2003; Bartel, 2004). The first to be discovered were the lin-4 and let-7 miRNAs, which are components of the gene regulatory network that controls the timing of C. elegans larval development (Lee et al., 1993; Wightman et al., 1993; Moss et al., 1997; Reinhart et al., 2000; Abrahante et al., 2003; Lin et al., 2003). More recently discovered miRNA functions include the control of cell proliferation, cell death, and fat metabolism in flies (Brennecke et al., 2003; Xu et al., 2003) and the control of leaf and flower development in plants (Aukerman and Sakai, 2003; Chen, 2003; Emery et al., 2003; Palatnik et al., 2003).
MicroRNA genes are one of the more abundant classes of regulatory genes in animals, estimated to comprise between 0.5 and 1 percent of the predicted genes in worms, flies, and humans, raising the prospect that they could have many more regulatory functions than those uncovered to date (Lagos-Quintana et al., 2001; Lau et al., 2001; Lee and Ambros, 2001; Lai et al., 2003; Lim et al., 2003a, 2003b). The regulatory roles of the vertebrate miRNAs in particular remain unknown. The possibility that many mammalian miRNAs play important roles during development and other processes is supported by their tissue-specific or developmental stage-specific expression patterns as well as their evolutionary conservation, which is very strong within mammals and often extends to invertebrate homologs (Pasquinelli et al., 2000; Aravin et al., 2001; Lagos-Quintana et al., 2001, 2002, 2003; Lau et al., 2001; Lee and Ambros, 2001; Ambros et al., 2003b; Dostie et al., 2003; Houbaviy et al., 2003; Krichevsky et al., 2003; Lai et al., 2003; Lim et al., 2003a, 2003b; Moss and Tang, 2003). Indeed, miR-181, one of the many miRNAs conserved among vertebrates, is preferentially expressed in the B lymphocytes of mouse bone marrow, and the ectopic expression of this miRNA in hematopoietic stem/progenitor cells modulates blood cell development such that the proportion of B lymphocytes increases (Chen et al., 2003). However, regulatory targets have not been established or even confidently predicted for any of the vertebrate miRNAs, which has slowed progress toward understanding the functions of these tiny noncoding RNAs in humans and other vertebrates.
Finding regulatory targets is much easier for the plant miRNAs. In a systematic search for the targets of 13 Arabidopsis miRNA families, 49 unique targets were found with a signal-to-noise ratio exceeding 10:1, simply by looking for Arabidopsis messages with near-perfect complementarity to the miRNAs (Rhoades et al., 2002). Confidence in many of these predictions was bolstered by the observation that the complementarity is conserved among rice orthologs of the miRNAs and messages (Rhoades et al., 2002), and many of the 49 have since been confirmed experimentally (Llave et al., 2002; Emery et al., 2003; Kasschau et al., 2003; Tang et al., 2003). These predicted targets were greatly enriched in transcription factors involved in developmental patterning or stem cell maintenance and identity, suggesting that many plant miRNAs function during cellular differentiation to clear regulatory gene transcripts from daughter cell lineages, perhaps enabling more rapid differentiation without having to depend on regulatory genes having constitutively unstable messages (Rhoades et al., 2002). An analogous search for near-perfect pairing between the miRNAs and messages of C. elegans and Drosophila genes did not uncover more hits than would be expected by chance (Rhoades et al., 2002).
More sophisticated methods for predicting targets of insect miRNAs have recently been published (Stark et al., 2003) or submitted (Enright et al. http://genomebiology.com/2003/4/11/P8). The method of Stark et al. (2003) provides lists of candidate target genes that, when used in combination with additional biological criteria, including functional relationships shared among predicted targets of individual miRNAs, led to validation of six targets for two Drosophila miRNAs (Stark et al., 2003). The current Drosophila analyses do not include estimates of false positive rates, leaving open the question of the accuracy of these methods in cases where predicted targets of a miRNA do not have clear functional relatedness.
In the present study, we describe an approach that predicts hundreds of mammalian miRNA targets and provide computational and experimental evidence that most are authentic, allowing us to begin to explore fundamental questions about miRNA:target relationships in animals. Pairing to the 5' portion of the miRNA, particularly nucleotides 2-8, appears to be most important for target recognition by vertebrate miRNAs. As seen previously for plant miRNAs, the predicted regulatory targets of mammalian miRNAs are enriched for genes involved in transcriptional regulation. In addition, the predicted mammalian regulatory targets encompass an unexpectedly broad range of other functions. Indeed, several lines of evidence imply that the targets identified in this initial analysis are only a fraction of the total, supporting the possibility that miRNAs regulate the expression of a large portion of the mammalian transcriptome.
Results and Discussion
An Algorithm for Predicting Vertebrate MicroRNA Targets
To identify the targets of vertebrate miRNAs, we developed an algorithm called TargetScan (the TargetScan software is available for download at http://genes.mit.edu/targetscan), which combines thermodynamics-based modeling of RNA:RNA duplex interactions with comparative sequence analysis to predict miRNA targets conserved across multiple genomes (Figure 1). Given a miRNA that is conserved in multiple organisms and a set of orthologous 3' UTR sequences from these organisms, TargetScan (1) searches the UTRs in the first organism for segments of perfect Watson-Crick complementarity to bases 2-8 of the miRNA (numbered from the 5' end), where we refer to this 7 nt segment of the miRNA as the "miRNA seed" and UTR heptamers with perfect Watson-Crick complementarity to the seed as "seed matches"; (2) extends each seed match with additional base pairs to the miRNA as far as possible in each direction, allowing G:U pairs, but stopping at mismatches; (3) optimizes basepairing of the remaining 3' portion of the miRNA to the 35 bases of the UTR immediately 5' of each seed match using the RNAfold program (Hofacker et al., 1994), thus extending each seed match to a longer "target site"; (4) assigns a folding free energy G to each such miRNA:target site interaction (ignoring initiation free energy) using RNAeval (Hofacker et al., 1994); (5) assigns a Z score to each UTR, defined as $Z = \sum_{k=1}^{n} e^{-G_k/T}$, where n is the number of seed matches in the UTR, Gk is the free energy of the miRNA:target site interaction (kcal/mol) for the kth target site evaluated in the previous step, and T is a parameter described below (UTRs that have no seed match are assigned a Z score of 1.0); (6) sorts the UTRs in this organism by Z score and assigns a rank Ri to each; (7) repeats this process for the set of UTRs from each organism; and (8) predicts as targets those genes for which both Zi ≥ Zc and Ri ≤ Rc for an orthologous UTR sequence in each organism, where Zc and Rc are pre-chosen Z score and rank cutoffs.
Figure 1. Prediction of miRNA Targets
(A) Structures, energies, and scoring for predicted RNA duplexes involving human miR-26a and two target sites in the 3' UTR of the human SMAD-1 gene, with seeds and seed matches in red and seed extension in blue.
[Panel A graphic not reproduced. Recoverable values: duplex free energies of -17.0 and -21.8 kcal/mol, giving Z = e^(17.0/20) + e^(21.8/20) = 5.3.]
(B) Schematic for identification of targets conserved across mammals (upper) and targets conserved in mammals and fish (lower). The number of genes from each organism with identified orthologs in every other organism is indicated.
(C) Positions of two target sites for miR-26a (blue) in orthologous SMAD-1 3' UTR sequences from human (Hs), mouse (Mm), rat (Rn), and Fugu (Fr), with the Z score and rank of each miRNA:UTR pair, with T = 20.
[Panel C graphic not reproduced. Recoverable values: position axis 0 to 600; Z scores 5.3, 4.8, 4.9, 5.2; ranks 45, 72, 76, 16.]
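Steps 1 and 5 of the procedure above can be sketched in a few lines of Python. This is a minimal illustration, not the TargetScan implementation: the function names and the toy UTR are hypothetical, the folding steps (2-4, which rely on RNAfold and RNAeval) are omitted, and the free energies are simply passed in. The miR-26a sequence is read off the Figure 1A duplex, and the two energies and T = 20 are the values given there for the human SMAD-1 example.

```python
import math

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_match_site(mirna: str) -> str:
    """UTR heptamer (5'->3') with perfect Watson-Crick pairing to miRNA bases 2-8."""
    seed = mirna[1:8]  # bases 2-8, numbered from the 5' end
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def find_seed_matches(utr: str, mirna: str) -> list[int]:
    """0-based start positions of every seed match in the UTR (overlaps allowed)."""
    site = seed_match_site(mirna)
    return [i for i in range(len(utr) - 6) if utr[i:i + 7] == site]


def z_score(free_energies: list[float], T: float) -> float:
    """Z = sum over target sites of exp(-G_k / T); no seed match gives Z = 1.0."""
    if not free_energies:
        return 1.0
    return sum(math.exp(-g / T) for g in free_energies)


# miR-26a, written 5'->3' (taken from the Figure 1A duplex).
MIR_26A = "UUCAAGUAAUCCAGGAUAGGCU"

# Hypothetical toy UTR containing two seed matches; not a real SMAD-1 sequence.
toy_utr = "GGCAUGUACUUGAAGCCAAUAAUACUUGACGG"

positions = find_seed_matches(toy_utr, MIR_26A)

# Free energies (kcal/mol) of the two SMAD-1 sites shown in Figure 1A, with T = 20.
z = z_score([-17.0, -21.8], T=20.0)
print(seed_match_site(MIR_26A), positions, round(z, 1))
```

With the Figure 1 inputs the sum evaluates to about 5.3, which matches the Z score shown for the human SMAD-1 UTR. Because T divides every energy before exponentiation, a smaller T lets a single strong site dominate the sum, whereas a larger T gives more weight to the number of sites.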
The only free parameters in this protocol are Rc and Zc, and the T parameter in the formula relating predicted
Mammalian microRNA Targets
789
free energy to Z score. The value of the T parameter influences the relative weighting of UTRs with fewer high-affinity target sites to those with larger numbers of low-affinity target sites, and in this sense is analogous to temperature. However, there is no thermodynamic meaning to the T parameter or the Z scores used in this analysis; they merely provide a convenient means of weighting and summing predicted folding free energies. Suitable values for Rc, Zc, and T were assigned by optimization over a range of reasonable values using separate training and test sets of miRNAs.
TargetScan was initially applied using two sets of miRNAs: a nonredundant pan-mammalian set of 79 miRNAs that have homologs in human, mouse, and pufferfish and identical sequence in human and mouse, but not necessarily pufferfish, and a nonredundant pan-vertebrate set of 55 miRNAs that have identical sequence in human, mouse, and pufferfish (Lagos-Quintana et al., 2001, 2002, 2003; Mourelatos et al., 2002; Dostie et al., 2003; Lim et al., 2003a). These sets, referred to as nrMamm and nrVert, respectively (Supplemental Table S1 at http://www.cell.com/cgi/content/full/115/7/787/DC1), are nonredundant in that when multiple miRNAs had identical seed heptamers, a single representative was chosen. The initial use of miRNAs that were both nonredundant and perfectly conserved among the queried species simplified the analysis of signal to noise.
Prediction of 400 Targets of Mammalian MicroRNAs at a Signal:Noise Ratio of 3.2:1
To predict mammalian miRNA targets, the nrMamm set of miRNAs was searched against orthologous human, mouse, and rat 3' UTRs derived from the Ensembl classification of orthologous genes. Using Rc = 200, Zc = 4.5, and T = 20, TargetScan identified 451 putative miRNA:target interactions (representing 400 distinct genes), an average of 5.7 targets per miRNA (Figure 2A). This number of predicted targets (the "signal") was compared to the number of targets predicted for cohorts of shuffled (i.e., randomly permuted) miRNAs (the "noise"). As described below, these shuffled sequences were carefully screened to ensure that our estimates of noise were as accurate as possible and not artifactually low. An average of only 1.8 targets were identified per shuffled miRNA sequence, for a signal:noise ratio of 3.2:1. This ratio was higher than the roughly 2:1 ratio observed for targets of the nrMamm miRNA set predicted using only the human and mouse UTRs (Figure 2A), underscoring the importance of evolutionary conservation across multiple genomes in our approach. The signal:noise ratio improved to 4.6:1 when conservation was required additionally in the fourth and most divergent species, Fugu rubripes, using the nrVert set of miRNAs (Figure 2A).
Although the signal:noise ratio improved as more genomes were included, the number of predicted targets per miRNA decreased, even though Rc and Zc were relaxed to 350 and 4.5, respectively, and the value T = 10 was used for the four-species analysis (Figure 2A). Several factors might contribute to this effect, including the increased chance that an orthologous gene will be missing from the annotations of one genome as the number of organisms is increased. For example, the number of ortholog pairs available in human-mouse, 17166, decreased to 14539 ortholog sets in human-mouse-rat and 10276 ortholog sets in human-mouse-rat-Fugu. In addition, some miRNA:target interactions might not be conserved between mammals and fish. Another likely factor is that some features used by TargetScan to achieve an acceptable signal:noise ratio might not be strictly required for miRNA regulation. For example, although most known invertebrate miRNA target sites have 7 nt Watson-Crick seed matches (or longer matches), some do not, such as lin-41, a target of the C. elegans let-7 miRNA (Lee et al., 1993; Wightman et al., 1993; Moss et al., 1997; Reinhart et al., 2000; Abrahante et al., 2003; Brennecke et al., 2003; Lin et al., 2003). Thus, increasing the number of species increases the probability that the orthologous UTR of one or more species harbors functional sites that fail to satisfy the criteria required for TargetScan detection. Nonetheless, in 115 cases involving the UTRs of 107 genes, the predicted target sites were sufficiently conserved to be detected by TargetScan in orthologous UTRs from all four vertebrates (details of these predictions are given in Supplemental Table S5 and Figure S1A on the Cell website).
It is of utmost importance in this type of bioinformatic analysis to ensure that the shuffled control sequences preserve all relevant compositional features of the authentic miRNAs. For example, when compared to the seeds of shuffled cohorts that had not been screened to control for the expected number of target sites and the expected strength of miRNA:target site interactions, the seeds of vertebrate miRNAs have approximately 1.4 times as many seed matches in vertebrate UTRs. Specifically, the seeds of vertebrate miRNAs each had an average of about 2100 perfect-complement matches in masked vertebrate UTR regions whereas random heptamers with the same base composition averaged only about 1500 matches. The high number of additional matches seen for the miRNA seed (and also for the antisense of the seed) argues strongly against the biological significance of most of these matches. Instead, these excess matches appear to be the consequence of dinucleotide composition biases shared between vertebrate miRNAs and UTRs, which must be controlled for in order to avoid artificially high estimates of TargetScan signal:noise ratios (particularly in an algorithm that looks for multiple matches). Therefore, it was important to ensure that the shuffled miRNA controls matched the corresponding miRNAs closely in all sequence properties that impact the expected number and quality of TargetScan target sites. The properties we considered were (1) the expected frequency of seed matches in the UTR dataset; (2) the expected frequency of matching to the 3' end of the miRNA; (3) the observed count of seed matches in the UTR dataset; and (4) the predicted free energy of a seed:seed match duplex. A miRNA shuffling protocol, MiRshuffle, was developed to generate randomized control sequences that possess all of these properties. For a given miRNA sequence, MiRshuffle generates a series of random permutations with the same length and base composition as the miRNA, until a shuffled sequence is found that matches the parent miRNA closely in each of the four criteria listed above. The MiRshuffle procedure calculated expected frequencies using a first-order Markov model of 3' UTR
Cell
790
[Figure 2 graphic not reproduced. Panel A x axis: genomes used (human/mouse; human/mouse/rat; human/mouse/rat/Fugu). Panels B and C x axes: position of miRNA seed and position of heptamer, from 1..7 through 7..13 relative to the 5' end and from -13..-7 through -7..-1 relative to the 3' end.]
Figure 2. Predicted miRNA Targets Conserved in Multiple Genomes
(A) Mean number of predicted targets per miRNA for authentic miRNAs (filled bars) and mean and standard error of number of predicted targets per shuffled sequence for four cohorts of randomized miRNAs (open bars). Genomes used for identification of targets are listed below corresponding bars. The nrMamm set of 79 miRNAs was used for human/mouse and human/mouse/rat; the nrVert set of 55 miRNAs was used for human/mouse/rat/Fugu.
(B) Mean number of targets per miRNA using the human/mouse/rat UTR set and alternative miRNA seed positions for the nrVert miRNAs (filled bars) and for cohorts of shuffled controls (open bars). Positions of seed heptamer are indicated under bars; positive numbers indicate position relative to 5' end of miRNA, negative numbers indicate positions relative to 3' end of miRNA. Note that the signal:noise for the seed at 2..8 differs slightly from that of the human/mouse/rat analysis in (A) because a different set of miRNAs was used.
(C) Conserved heptamers among paralogous human miRNAs. For each position, the number of different heptamers that are perfectly conserved across multiple miRNAs in rMamm is shown.
composition that accounts for the long-recognized impact of dinucleotide frequency biases on the counts of longer oligonucleotides (Nussinov, 1981). As an additional control, another shuffling protocol was developed, DiMiRshuffle, which preserves the precise dinucleotide composition of both the seed and the 3' end of the miRNA, as well as the seed match count and seed:seed match folding free energy. This protocol is less general than MiRshuffle in that not every oligonucleotide can be randomized while preserving exact dinucleotide composition; e.g., the only heptamer with the same dinucleotide composition as the miR-100 seed, ACCCGUA, is ACCCGUA itself. Nevertheless, it was possible to generate DiMiRshuffled controls for 47 of the 79 nrMamm miRNAs, and a signal:noise ratio of 3.5 was observed using this control in the three-mammal analysis (data not shown), comparable to the value obtained for MiRshuffled controls. Because of its wider applicability, MiRshuffle was used in all reported experiments.
In summary, even when the shuffled control sequences were carefully selected to closely match the corresponding miRNAs in all sequence properties expected to influence the number and quality of target sites, these shuffled controls yielded far fewer targets than did the authentic miRNA sequences. This difference results from an increased propensity of vertebrate UTRs to contain multiple conserved regions of complementarity to authentic miRNAs. We conclude that this propensity reflects a functional relationship between the miRNAs and the identified UTRs; that is, to the extent that the signal exceeds the noise, these identified UTRs are the regulatory targets of the miRNAs.
Correcting for the estimated rate of false positives, TargetScan appears to have identified an average of 5.7 - 1.8 = 3.9 true targets conserved across mammals per miRNA (Figure 2A). A number of factors limit the sensitivity of our method, including (1) the incompleteness of orthologous gene annotations; (2) the possibility that some targets do not meet our stringent seed matching, Z score, or rank criteria; (3) the possibility that some mammalian target sites lie outside the 3' UTR, as often observed for plant miRNAs (Rhoades et al., 2002); (4) the requirement that targets be conserved in the complete set of organisms; and (5) the limitation that our method does not model the simultaneous interaction of multiple miRNA species with the same UTR. Thus, the actual number of target genes regulated by each miRNA is likely to be substantially higher.
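The dinucleotide-preserving constraint and the false-positive correction discussed in this section can be checked with a short sketch. The code below is an illustration under stated assumptions rather than the authors' protocol: the function names are hypothetical, the brute-force enumeration is only practical for short sequences such as a heptamer, and the false-positive fraction is assumed to be noise divided by signal, which is consistent with the figures reported in the text.

```python
from collections import Counter
from itertools import permutations


def dinucleotide_counts(seq: str) -> Counter:
    """Count every overlapping dinucleotide in seq."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def dinucleotide_preserving_shuffles(seq: str) -> set[str]:
    """All distinct rearrangements of seq with identical dinucleotide counts.

    Brute force over permutations, so only suitable for short sequences.
    """
    target = dinucleotide_counts(seq)
    candidates = {"".join(p) for p in permutations(seq)}
    return {c for c in candidates if dinucleotide_counts(c) == target}


def signal_to_noise(signal: float, noise: float) -> float:
    """Targets per authentic miRNA divided by targets per shuffled control."""
    return signal / noise


def false_positive_fraction(signal: float, noise: float) -> float:
    """Assumed estimate: share of predictions expected from shuffled controls."""
    return noise / signal


# The miR-100 seed cannot be rearranged without changing its dinucleotides.
print(dinucleotide_preserving_shuffles("ACCCGUA"))

# Three-mammal analysis: 5.7 targets per miRNA versus 1.8 per shuffled control.
print(signal_to_noise(5.7, 1.8), false_positive_fraction(5.7, 1.8))

# Four-species analysis, using only the reported 4.6:1 ratio.
print(false_positive_fraction(4.6, 1.0))
```

Enumerating the rearrangements of ACCCGUA returns only ACCCGUA itself, in agreement with the miR-100 example. The noise/signal ratios come out near 0.32 for the three-mammal analysis and 0.22 for the four-species analysis, close to the 31% and 22% given in the Summary; the small difference in the first value presumably reflects rounding of the per-miRNA averages.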
The Conserved 5' Region of Mammalian MicroRNAs Is Most Important for Target Identification
TargetScan treats the 5' and 3' ends of miRNAs differently, with perfect basepairing required for the seed at the 5' end, but no such requirement at the 3' end. The importance of complementarity to the 5' portion of invertebrate miRNAs has been suspected since the observation that complementary sites within the lin-14 mRNA have "core elements" of complementarity to the 5' segment of the lin-4 miRNA (Wightman et al., 1993) and has been corroborated with the observation that the 5' segments of numerous invertebrate miRNAs are perfectly complementary to 3' UTR elements that mediate posttranscriptional regulation or are known miRNA targets (Lai, 2002; Stark et al., 2003). Moreover, the 5' ends of related miRNAs tend to be better conserved than the 3' ends (Lim et al., 2003b), further supporting the
hypothesis that these segments are most critical for mRNA recognition.
To explore this hypothesis, TargetScan was applied to predict targets of the nrVert miRNA set conserved between human, mouse, and rat using versions of the algorithm differing in the miRNA heptamer defined as the seed in step 1 (Figure 2B). Consistent with residues at the 5' end of miRNAs being most important for target recognition, the highest signal:noise ratio was observed when the seed was positioned at or near the extreme 5' end of the miRNA, with signal:noise values of 2.7, 3.4, and 1.6 observed for seeds at segments 1..7, 2..8, and 3..9, respectively, and signal:noise ratios of 1.3 or less at other seed positions. We suggest that the critical importance of pairing to segment 2..8 for target identification in silico reflects its importance for target recognition in vivo and speculate that this segment nucleates pairing between miRNAs and mRNAs.
Those seed positions that had the highest signal:noise ratios in the sliding seed analysis (Figure 2B) also had the highest degree of heptamer conservation in paralogous human miRNAs (Figure 2C). This observation strengthens the assertion that the signal seen above noise in our analysis reflects a functional relationship between the miRNAs and the identified UTRs because otherwise it would be difficult to explain why the most conserved portions of the miRNA and not other miRNA segments have the greatest propensity to match multiple conserved segments in UTRs.
The Number of Predicted Targets Is Greatest for the Most Highly Conserved MicroRNAs
The set of target genes predicted using conservation of miRNA complementarity across the three mammals was most suitable in size and quality for systematic analysis of gene function. To obtain as large a set of targets as possible, we searched our set of orthologous mammalian 3' UTRs using an expanded set of 121 conserved mammalian miRNAs (rMamm, Supplemental Table S1 on Cell website) that includes miRNAs that were excluded from the nrMamm set because they had redundant seeds, yielding a total of 854 predicted miRNA:UTR pairs conserved across human, mouse, and rat (Supplemental Figure S1B). This number of predicted targets (854) represents an 89% increase over the 451 targets predicted for the nrMamm miRNAs, even though the number of miRNAs used increased by only 53% from 79 to 121. This discrepancy prompted us to ask whether membership in a multi-miRNA gene family influenced the abundance of targets. Indeed, we found that the 27 miRNAs in nrMamm that were members of paralogous miRNA families, i.e., families with variant miRNAs that have the same seed, had an average of 8.7 predicted targets per miRNA, more than twice the average of 4.2 seen for the remaining 52 nrMamm miRNAs, although the difference in signal:noise between these two sets was not as pronounced.
When initially expanding our list of mammalian miRNAs, we found that the set of 19 mammalian miRNAs that were conserved between human and rodents but for which a Fugu homolog was not found gave an unacceptably low signal:noise ratio of 1.2:1, even though the analysis did not extend to the Fugu UTRs. Accordingly, the rMamm set was restricted to those miRNAs with recognized Fugu homologs. The higher signal seen for the more broadly conserved miRNAs can be explained by the idea that miRNAs with larger numbers of targets would be under greater selective constraint, and therefore less likely to change during the course of evolution. Thus, more broadly conserved miRNAs would be likely to have more targets and consequently a higher TargetScan signal. This observation again supports the conclusion that TargetScan is detecting authentic targets because otherwise it would be difficult to explain the observed difference in signal:noise for broadly conserved miRNAs relative to that of less broadly conserved miRNAs.
The 854 miRNA:UTR pairs represented UTRs of just 442 distinct genes because many genes were hit by multiple miRNAs. In these cases, the miRNAs were usually, but not always, from the same paralogous miRNA family, often with the same seed heptamer. In those cases where the same UTR was hit by multiple miRNAs from different families (54 genes), the target sites generally did not overlap, consistent with simultaneous binding and regulation of some target genes by combinations of miRNAs. A complete list of the 442 target genes and the corresponding miRNAs is provided (Supplemental Figure S1B and Table S2 on the Cell website). An abbreviated list appears as Table 1, where genes were chosen on the basis of high biological interest. Genes involved in transcription, signal transduction, and cell-cell signaling dominate this list, including a number of human disease genes such as the tumor suppressor gene PTEN and the protooncogenes E2F-1, N-MYC, C-KIT, FLI-1, and LIF.
Experimental Support for 11 Predicted Regulatory Targets
Reporter assays were used to test 15 predicted targets of mammalian miRNAs in HeLa cells. The 15 targets selected for these experiments all had known biological functions but resembled the complete set of predictions in other respects, e.g., there was no significant difference in the average Z score, rank, or number of target sites per mRNA between the tested targets and the complete set of predicted targets. In only one case did the tested targets of a miRNA have obvious functional relatedness (NOTCH1, a receptor for DELTA1, both predicted targets of miR-34). Three of the 15 genes, SMAD-1, BRN-3b, and Notch1, were also in the set of predicted targets conserved to Fugu. Eight genes were predicted targets of miRNAs that had been cloned from HeLa cells (Lagos-Quintana et al., 2001; Mourelatos et al., 2002), and three genes were predicted targets of miR-34, which is also expressed in HeLa cells, based on Northern analysis (data not shown). For these 11 genes, a 100 to 1200 nt 3' UTR segment that included miRNA target sites was inserted downstream of a firefly luciferase ORF, and luciferase activity was compared to that of an analogous reporter with point substitutions disrupting the target sites (as illustrated for SMAD-1, Figure 3A). Of these 11 UTRs, mutations in eight (SMAD-1, SDF-1, BRN-3b, ENX-1, N-MYC, PTEN, Delta1, and Notch1, but not HOX-A5, MECP-2, or VAMP-2) significantly enhanced expression (p < 0.001), as expected if
Table 1. Highly Cited Predicted Targets of Mammalian miRNAs
Category headings used in the table: Regulation of transcription/DNA binding; Signal transduction/cell-cell signaling; Other.
Seed | miRNAs | Ensembl ID | Gene Name
AGUGCAA | miR-130,-130b | 169057 | Methyl-CpG-binding protein 2 (MECP2)
GUGCAAA | miR-19a | 169057 | Methyl-CpG-binding protein 2 (MECP2)
AAAGUGC | miR-20,-106 | 101412 | Transcription factor E2F1
GAGGUAG | let-7(a-g,i),miR-98 | 100823 | DNA-(apurinic or apyrimidinic site) lyase (APEN)
GAAAUGU | miR-203 | 125347 | Interferon regulatory factor 1 (IRF-1)
ACAGUAC | miR-101 | 134323 | N-MYC protooncogene protein
GAGGUAU | miR-202 | 134323 | N-MYC protooncogene protein
AAUCUCA | miR-216 | 065978 | Nuclease sensitive element binding protein 1 (YB-1)
UAAGGCA | miR-124a | 163403 | Microphthalmia-associated transcription factor
GCUGGUG | miR-138 | 054598 | Forkhead box protein C1 (FKHL7)
AAAGUGC | miR-20,-106 | 103479 | Retinoblastoma-like protein 2 (RBR-2)
UCCAGUU | miR-145 | 151702 | Friend leukemia integration 1 transcription factor (FLI-1)
GCAGCAU | miR-103,-107 | 137309 | High mobility group protein HMG-I/HMG-Y (HMG-I(Y))
GGAAGAC | miR-7 | 136826 | Kruppel-like factor 4 (EZF)
UAAGGCA | miR-124a | 168610 | Signal transducer and act. of transcription 3 (STAT3)
UGGUCCC | miR-133,-133b | 010610 | T cell surface glycoprotein CD4 precursor
UCACAUU | miR-23a,-23b | 107562 | Stromal cell-derived factor 1 precursor (SDF-1)
GCUACAU | miR-221,-222 | 157404 | Mast/stem cell growth factor receptor precursor (C-KIT)
GGAAUGU | miR-1,-206 | 176697 | Brain-derived neurotrophic factor precursor (BDNF)
UAAGGCA | miR-124a | 154188 | Angiopoietin-1 precursor (ANG-1)
GGCAGUG | miR-34 | 148400 | Notch homolog protein 1 precursor (HN1)
CCCUGAG | miR-125a,-125b | 128342 | Leukemia inhibitory factor precursor (LIF)
AGUGCAA | miR-130,-130b | 184371 | Macrophage colony stimulating factor-1 precursor (MCSF)
UCACAGU | miR-27a | 184371 | Macrophage colony stimulating factor-1 precursor (MCSF)
AAUACUG | miR-200b | 008710 | Polycystin 1 precursor
GAAAUGU | miR-203 | 122641 | Inhibin beta A chain precursor (EDF)
AUUGCAC | miR-25,-92 | 065559 | Dual spec. mitogen-activated protein kinase kinase 4
GCUGGUG | miR-138 | 070886 | Ephrin type-A receptor 8 precursor (HEK3)
GUAAACA | miR-30(a-e) | 156052 | Guanine nucleotide-binding protein G(i), alpha-2 subunit
AUUGCAC | miR-25,-92 | 156052 | Guanine nucleotide-binding protein G(i), alpha-2 subunit
GAGAACU | miR-146 | 175104 | TNF receptor-associated factor 6 (TRAF6)
GGCUCAG | miR-24 | 166484 | Mitogen-activated protein kinase 7 (ERK4)
GAGAUGA | miR-143 | 166484 | Mitogen-activated protein kinase 7 (ERK4)
AGCUGCC | miR-22 | 166484 | Mitogen-activated protein kinase 7 (ERK4)
GCAGCAU | miR-103,-107 | 141433 | Pituitary adenylate cyclase act. polypeptide precursor
GUGCAAA | miR-19a,-19b | 171862 | Phosphatidylinositol-3,4,5-trisphos. 3-phosphatase (PTEN)
AGUGCAA | miR-130,-130b | 130164 | Low-density lipoprotein receptor precursor (LDLR)
GGAAUGU | miR-1,-206 | 160211 | Glucose-6-phosphate 1-dehydrogenase (G6PD)
UUGGCAC | miR-96 | 101986 | Adrenoleukodystrophy protein (ALDP)
AGCACCA | miR-29b,-29c | 168542 | Collagen alpha 1(III) chain precursor
AGCACCA | miR-29b,-29c | 114270 | Collagen alpha 1(VII) chain precursor
AUUGCAC | miR-25,-92 | 168090 | COP9 subunit 6
AAGUGCU | miR-93 | 168090 | COP9 subunit 6
AAAGUGC | miR-20,-106 | 168090 | COP9 subunit 6
CCCUGAG | miR-125a,-125b | 160613 | Proprotein convertase subtilisin/kexin type 7 precursor
The 442 predicted targets conserved between human, mouse and rat were ranked based on the number of references listed in the RefSeq GenBank flatfiles (11/10/03 download). The top 37 most referenced predicted targets are shown, grouped on the basis of Gene Ontology annotations. The last six digits of the Ensembl ID are shown (ENSG00000#). MicroRNAs with different seeds that target the same UTR are listed on separate lines.
the endogenous miRNAs in the HeLa cells were specifying the repression of reporter gene expression by pairing to the predicted target sites (Figure 3B). Significantly enhanced expression was also observed when the analogous experiment was performed using either the full-length C. elegans lin-41 3' UTR or a 124 nt segment of the UTR containing the two previously proposed let-7 miRNA target sites (Reinhart et al., 2000), indicating that at least some of the repression of lin-41 observed in C. elegans can be recapitulated by HeLa let-7 miRNA in this heterologous reporter assay (Figure 3B). For all eight predicted human targets of endogenous HeLa miRNAs that responded to mutations, the increase in expression seen when disrupting the pairing to the miRNA seed was at least as high as that seen for mutations in the let-7 target sites of lin-41 (Figure 3B).
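The comparison of wild-type and mutant reporters can be sketched as follows, using the normalization and test named in the Figure 3 legend (firefly/Renilla ratio, scaling to the wild-type median, Mann-Whitney test). Everything specific here is an assumption for illustration: the function names are hypothetical, the readings are made-up numbers rather than data from the study, and the p value uses the large-sample normal approximation without tie correction, which may differ from the exact procedure the authors used.

```python
import math
from statistics import median


def relative_activity(firefly, renilla):
    """Firefly signal divided by the Renilla transfection-control signal."""
    return [f / r for f, r in zip(firefly, renilla)]


def scale_to_wt_median(wt, mutant):
    """Express both groups relative to the median wild-type activity."""
    m = median(wt)
    return [x / m for x in wt], [x / m for x in mutant]


def mann_whitney(x, y):
    """Return (U statistic for x, two-sided p) via the normal approximation."""
    n1, n2 = len(x), len(y)
    # U counts the (x, y) pairs in which x is larger; ties count one half.
    u = sum(1.0 if xi > yj else 0.5 if xi == yj else 0.0 for xi in x for yj in y)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    return u, math.erfc(abs(z) / math.sqrt(2))


# Hypothetical readings for 12 wild-type and 12 mutant transfections.
renilla = [100.0] * 12
wt_firefly = [80.0 + 3.0 * i for i in range(12)]
mut_firefly = [150.0 + 10.0 * i for i in range(12)]

wt, mut = scale_to_wt_median(
    relative_activity(wt_firefly, renilla),
    relative_activity(mut_firefly, renilla),
)
u, p = mann_whitney(mut, wt)
print(f"median mutant activity = {median(mut):.2f}, U = {u}, p = {p:.1e}")
```

In this toy data set every mutant reading exceeds every wild-type reading, so U takes its maximum value (12 x 12) and the approximate p value falls below the 0.001 threshold used in the text. A rank-based test is a natural choice for samples of 12 to 15 transfections because it does not assume normally distributed luciferase ratios.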
Four tested genes (G6PD, BDNF, MCSF, and LDLR) were predicted targets of miR-1 and miR-130, two miRNAs that had not been cloned from HeLa cells and were not detected by Northern analysis. Initially, reporters containing UTR segments from these four genes were examined for response to transfected miRNAs (Doench et al., 2003) (data not shown). Of the four, G6PD, BDNF, and MCSF responded to the transfected miRNAs. To further validate these targets, we used a second assay resembling the one described for targets of miRNAs expressed in HeLa cells, except that it took advantage of HeLa cell lines ectopically expressing either human miR-1 or human miR-130. Mutations in the miRNA target sites of all three of the genes that had responded to transfected miRNAs led to significantly increased reporter output in the lines expressing the cognate miRNAs, but not
[Figure 3 graphic not reproduced. Panel A labels: firefly Luc+ gene (ORF), SMAD-1 3' UTR insert, WT and mutant duplexes with miR-26a. Panel B labels: let-7 miRNA, + miR-1, - miR-1, + miR-130, - miR-130.]
Figure 3. Experimental Support for Predicted Targets
(A) Schematic of a reporter construct used to evaluate the role of complementarity between miR-26a and the SMAD-1 3' UTR. The wild-type
(WT) construct had a 106 nt fragment of the SMAD-1 UTR (green) containing two miR-26a target sites (blue) inserted within the firefly luciferase
3' UTR. The mutant construct was identical to the WT construct except that it had three point substitutions (red) disrupting pairing to each
miR-26a seed.
(B) Box plots showing the luciferase activity after reporter plasmids were transfected into HeLa cells. Reporters analogous to those depicted
for SMAD-1 were constructed for the indicated target genes (Supplemental Figure S2 on Cell website). The UTR fragments often had two
target sites to the indicated miRNA, and both were disrupted in the mutant reporters (exceptions were SDF-1, BRN-3b, G6PD, Delta1, Notch1,
and BDNF, which each had three target sites, two of which were disrupted, and N-MYC, which had one of its two miR-101 sites disrupted).
Firefly luciferase activity was normalized to Renilla luciferase activity of the transfection control plasmid and then normalized to the median
activity of the corresponding WT reporter. Each box represents the distribution of activity measured for each WT (blue) and mutant (red)
reporter (n = 12-15; ends of the boxes define the 25th and 75th percentiles, a line indicates the median, bars define the 10th and 90th percentiles,
and the number indicates the median activity of the mutant reporter). Asterisks (*) denote instances in which differences between the WT and
mutant were statistically significant (p < 0.001; Mann-Whitney test). Two pairs of constructs for C. elegans lin-41, a previously known target
of let-7, were tested, one with a full-length and the other with a 124 nt UTR segment (f and s, respectively). Except for miR-1 and miR-130,
the miRNAs were all endogenously expressed in the HeLa cells. Reporters corresponding to predicted targets of miR-1 and miR-130 (G6PD,
BDNF, and MCSF) were each examined in a HeLa cell line stably expressing the relevant miRNA (+ miR-1 or + miR-130) and the parental
cell line (- miR-1 or - miR-130).
in the parental lines lacking the miRNAs (Figure 3B), as
expected if these genes were authentic targets of the
respective miRNAs. The levels of ectopically expressed
miR-1 and miR-130 were comparable to those of endogenous miRNAs, as judged by Northern blot analysis (Lim
et al., 2003b). For miR-1, Northern analysis with a synthetic miR-1 standard allowed accurate quantitation,
revealing an average expression of 500 miR-1 molecules
per cell.
In sum, for 11 of the 15 cases tested, the sites identified
by TargetScan influenced expression of an upstream
ORF when expressed in the same cells as the
corresponding miRNAs. Additional experiments in animals
will be needed to address the particular biological
consequences of these regulatory interactions, but the
evolutionary conservation of the pairings suggests that
they are important. All four of the remaining genes might
not be true targets; our statistical analysis using shuffled
controls indicated that about 30% of predicted mammalian
targets are likely to be false positives (Figure 2).
Alternatively, some might still be authentic targets
whose regulation was not detected in our assays. Regulation
would be missed in cases for which cell type-specific
factors were required that were not expressed
in HeLa cells, or in cases for which additional mRNA
elements were required but were not included in the
UTR segments used in our reporters.
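The "about 30%" figure can be related to the signal:noise ratios quoted in the Experimental Procedures. The following is a minimal sketch, assuming that signal:noise is the ratio of targets predicted for real miRNAs to the mean number predicted for shuffled controls, so that its reciprocal approximates the fraction of predictions expected by chance. The function name is hypothetical and is not part of TargetScan.

```python
def false_positive_fraction(signal_to_noise: float) -> float:
    """Approximate fraction of predictions expected by chance.

    Assumption (not stated as a formula in the text): signal:noise is
    real-miRNA targets divided by the mean for shuffled controls, so
    noise/signal estimates the false-positive fraction.
    """
    if signal_to_noise <= 0:
        raise ValueError("signal_to_noise must be positive")
    return 1.0 / signal_to_noise


# 3.4:1 is the ratio reported for the nrMamm training set.
estimate = false_positive_fraction(3.4)
print(f"estimated false-positive fraction: {estimate:.2f}")
```

Under this reading, a ratio of 3.4:1 corresponds to roughly three chance predictions in every ten, which matches the stated estimate.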
One limitation of the existing sequence databases
that complicates the systematic identification of miRNA
targets is that UTR annotations are often absent or incomplete.
In order to compensate for this limitation, we
had extended each annotated 3' UTR with 2 kb of 3'
flanking sequence. Using extended UTRs substantially
increased the number of predicted targets, with signal-to-noise
ratios at least as high as they were for unextended
UTRs, suggesting that extension of the annotated UTRs
allows detection of many additional authentic target
genes. One consequence of using this UTR-extension
protocol is that for some genes, all predicted target sites
will fall outside of annotated UTRs. Manual inspection
of the 15 UTR regions tested in our reporter assays
revealed that in all but one of these cases the tested
target sites were contained within regions whose status
as UTRs was supported by known ESTs and predicted
polyadenylation sites, even though some of these regions
are not yet annotated as human UTRs. For the
single exception, the Notch1 gene, the tested target
sites were all located downstream of the annotated 3'
UTR of the human gene, and the end of the annotated
Notch1 3' UTR was supported by a predicted polyadenylation
site and alignment of multiple ESTs. However,
Notch1 might have additional 3' UTR isoforms; many
human genes, perhaps as many as 50% or more of the
genes in the genome, have alternative polyadenylation
sites (Iseli et al., 2002). In order to investigate the potential
expression of the tested Notch1 target sites, which
gave a positive result in our assay for miRNA regulation
(Figure 3), an RT-PCR assay was used with polyA-selected
RNA from a pool of human tissues. Consistent
with the possibility that these sites lie within an alternative
UTR isoform of Notch1, an RT-dependent product
of the correct size and sequence was observed (data
not shown). The TargetScan set of predicted mammalian
target genes (Supplemental Table S1B on the Cell website)
undoubtedly contains other examples for which the
target sites all lie outside of the UTR regions supported
by available data; some of these will be false positives,
but others might point to the miRNA regulation of alternative
mRNA isoforms.
Human miRNAs Predominantly Are Negative
Regulators of Gene Expression
The finding that a sizable fraction of the tested UTR
segments were sensitive to mutations disrupting their
target sites supports the assertion that most of the predicted
targets are authentic. For many, the pairing outside
the seed was less extensive than that previously
proposed for miRNA targets (Supplemental Figures S1A
and S1B). Perhaps TargetScan is identifying mRNA elements
that are necessary but not sufficient for miRNA
regulation. Alternatively, these elements might be sufficient,
in which case their low information content raises
the possibility that miRNAs modulate the utilization of
a substantial fraction of the mammalian mRNAs.
In none of the 15 cases tested was there evidence
of miRNA-mediated activation of reporter expression;
changes either were not statistically significant or were
in the direction of miRNA-directed repression. This result
suggests that mammalian miRNAs are generally
negative regulators of gene expression, as has been
observed for the known examples in invertebrates and
plants (Lai, 2003; Bartel, 2004).
Predicted Mammalian MicroRNA Targets
Have Diverse Functions
To assess target gene functions, we evaluated the frequency
of specific gene ontology (GO) molecular function
classifications (Gene Ontology Consortium, 2001)
among the predicted targets of the nrMamm miRNAs
and their shuffled control sequences (Table 2). Predicted
miRNA targets populated many major GO functional categories,
and for each of these categories, the number of
targets for the real miRNAs greatly exceeded the average
for the shuffled cohorts. Therefore, despite the presence
of false positives among our predictions, the data in
Table 2 strongly indicate that mammalian miRNAs are
involved in regulation of target genes with a wide spectrum
of molecular functions.
We also compared the proportion of genes that fell
in each of the GO molecular function and GO biological
process categories for the predicted targets of miRNAs,
for targets of shuffled control sequences, and for the
initial set of orthologous genes (Table 2 and Supplemental
Table S4 on Cell website). The targets of the shuffled
cohorts were enriched relative to the initial set of orthologous
genes in certain GO biological process categories
such as development (14% versus 8%) and transcription
(13% versus 9%) (Table S4) and in molecular function
categories such as nucleic acid binding (21% versus
14%), DNA binding (15% versus 10%), and transcriptional
regulator activity (10% versus 6%) (Table 2). The
biases seen for the shuffled cohorts are likely to result
primarily from the TargetScan requirement for conserved
segments in the 3' UTRs of predicted targets
and may reflect differences in the occurrence of 3' UTR
regulatory elements in different classes of genes.
In the GO biological process classifications, the predicted
regulatory targets of authentic miRNA genes
were enriched in the development category but no more
than the targets of shuffled controls and were substantially
more enriched for genes involved in transcription
(21% of miRNA targets versus 13% of shuffled targets
versus 9% of the initial dataset) and regulation of transcription
(21% versus 12% versus 8%) (Supplemental
Table S4). In terms of the GO molecular function classifications,
targets of authentic miRNAs were enriched in
the categories DNA binding (20% versus 15% versus
Table 2. Molecular Function Classification of Predicted miRNA Targets

GO ID        Molecular Function          miRNAs       Mean of Shuffled Cohorts   All Orthologous Genes
             None/unknown                115 (29%)    45 (37%)                   5131 (35%)
             Known function              285 (71%)    77 (63%)                   9408 (65%)
GO:0005215   Transporter activity        36 (9%)      14 (12%)                   1441 (10%)
GO:0005515   Protein binding             37 (9%)      11 (9%)                    1005 (7%)
GO:0016787   Hydrolase activity          36 (9%)      12 (9%)                    1502 (10%)
GO:0016740   Transferase activity        39 (10%)     10 (8%)                    1104 (8%)
GO:0016301     Kinase activity           29 (7%)      6 (5%)                     624 (4%)
GO:0046872   Metal ion binding           27 (7%)      5 (4%)                     952 (7%)
GO:0003676   Nucleic acid binding        101 (25%)    26 (21%)                   2072 (14%)
GO:0003677     DNA binding               80 (20%)     18 (15%)                   1431 (10%)
GO:0030528   Transcription reg. act.     56 (14%)     12 (10%)                   879 (6%)
GO:0000166   Nucleotide binding          52 (13%)     10 (8%)                    1172 (8%)
GO:0004871   Signal transducer act.      55 (14%)     12 (10%)                   1959 (13%)
GO:0004872     Receptor activity         29 (7%)      5 (4%)                     1351 (9%)

The number and percentage of genes annotated with various Gene Ontology molecular function categories are shown for targets of nrMamm
miRNAs, targets of shuffled control miRNAs (mean of four cohorts), and for the initial set of orthologous human-mouse-rat genes. If GO
categories have a parent-child relationship, the child is indented. Because one gene can belong to multiple GO categories, the sum of the
percentages in each column is not interpretable.
10%), transcription regulatory activity (14% versus 10%
versus 6%), and nucleotide binding (13% versus 8%
versus 8%) (Table 2). The differing numbers of predicted
targets in the similar-sounding categories "regulation of
transcription" (GO biological process classification) and
"transcription regulatory activity" (GO molecular function
classification) prompted us to investigate the gene
content of these two categories. Inspection of the lists
of genes showed that all but two of the predicted target
genes in the "transcription regulatory activity" category
were also included in the larger "regulation of transcription"
category, but that the latter category also contained
more than two dozen additional target genes, the
annotation of which generally supported a role in control
of transcription. The GO process category "regulation
of transcription" (Supplemental Table S4) therefore appears
to provide a more complete listing of known and
putative transcription factors.
The proportion of the predicted mammalian miRNA
target genes involved in the GO process categories
"transcription" and "regulation of transcription" was significantly
higher than that seen for either shuffled targets
or for the initial gene set (p < 0.001). Nonetheless, this
bias was much lower in magnitude than that seen in
plants: of the 49 targets predicted in a systematic search
for complementarity to plant miRNAs, 69% were members
of transcription factor gene families (Rhoades et al.,
2002). Examples of other types of predicted mammalian
targets include translational regulators (e.g., COP9 subunit
6, ERF1), regulators of mRNA stability (e.g., HU-Antigen
D), structural proteins (e.g., collagen), and enzymes
(e.g., G6PD). The set of predicted miRNA targets
conserved across all four vertebrates (Supplemental Table
S5 online) was also somewhat biased toward genes
involved in transcription, but had annotated functions
consistent with the broad array of biological activities
seen for the larger mammalian target set. We conclude
that although mammalian miRNAs are sometimes at the
center of gene regulatory networks, where they regulate
genes, such as transcription factors, that regulate other
genes, they are more likely than plant miRNAs to be at
the periphery of the regulatory networks, where they
regulate genes with a variety of molecular functions.
The predicted mammalian targets also differ from the
plant targets with respect to biological function. Nearly
all of the transcription factors (TFs) predicted to be plant
miRNA targets have known or implied roles in plant development,
as do several of the other predicted plant targets
(Rhoades et al., 2002). By comparison, only ~13%
of predicted mammalian miRNA targets were involved
in development according to the GO biological process
categories (Supplemental Table S4). An important caveat
to this analysis is that gene annotation and GO
categories are still evolving. Nonetheless, our data suggest
that mammalian miRNAs are not exclusively, or
even primarily, involved in the traditional miRNA role of
developmental control. Instead, we find evidence for
miRNA regulation of a very broad diversity of biological processes.
Experimental Procedures
MicroRNA Datasets
Human and mouse miRNA sequences that satisfy established criteria (Ambros et al., 2003a) were downloaded from the Rfam website
(http://www.sanger.ac.uk/Software/Rfam). Human miRNAs that
lacked annotated mouse orthologs and mouse miRNAs that lacked
annotated human orthologs were searched against the mouse and
human genomes, respectively, with BLASTN (Altschul et al., 1997)
and MiRscan (Lim et al., 2003a, 2003b). To identify Fugu homologs,
the human miRNAs were searched against the Fugu genome using
BLASTN and MiRscan, and the 121 human miRNAs with perfectly
homologous miRNAs in mouse and clear homologous miRNAs in
Fugu were assigned to rMamm. For sets of human miRNAs in
rMamm with identical seed heptamers, a single representative was
chosen, yielding 79 human miRNAs (nrMamm). The choice was
based on conservation to Fugu and C. elegans miRNAs when possible
(i.e., the sequence most broadly conserved was chosen), but
was otherwise essentially arbitrary (the miRNA with the lowest mir-#
was generally chosen). The subset of 55 miRNAs from nrMamm that
had perfect conservation to Fugu were assigned to nrVert.
3' UTR Datasets
3' UTR sequences for all human genes, and all mouse, rat, and
Fugu genes associated with a human ortholog, were retrieved using
EnsMart version 15.1 (http://www.ensembl.org/EnsMart). Annotated
3' UTR sequences were available for only 45% of rat genes in this
set and for none of the Fugu genes. Moreover, 14% of annotated
rat 3' UTR sequences were less than 50 nucleotides in length. Therefore,
we extended each annotated 3' UTR with 2 kb of 3' flanking
sequence. Repetitive elements were masked in these sequences
using RepeatMasker (Smit, A.F.A. and Green, P.,
http://repeatmasker.genome.washington.edu/cgi-bin/RM2_req.pl)
with repeat libraries for primates, rodents, or vertebrates, as appropriate.
Identification of miRNA Target Sites
The 3' UTR sequences were searched for antisense matches to the
designated seed region of each miRNA (e.g., bases 2..8 starting
from the 5' end). Our choice of a 7 nt seed was motivated by the
observation that shorter seeds gave substantially lower signal:noise
ratios, while longer seeds reduced the number of predicted targets
at comparable signal:noise ratios. Because changing the size of the
seed has a large effect on the noise as well as the signal, these
observations are much more difficult to interpret in terms of potential
mechanistic implications than the "sliding seed" data of Figure 2B.
For seeds located on the 5' portion of the miRNA, 35 nt flanking
the seed match on the 5' end and 5 nt flanking the seed match on
the 3' end were retrieved (a "mirror" version of this algorithm was
used for 3' seeds in the experiment described in Figure 2B). Target
sites in which the 35 nt flanking region contained masked bases or
the seed match occurred less than 20 nt downstream of a previous
seed match were discarded. Base pairing between the miRNA seed
and UTR was extended with additional flanking base pairs as far as
possible in both directions, allowing G:U pairs but disallowing gaps.
The base pairing pattern of the remaining 3' end (or in the case of
a 3' seed, the remaining 5' end) was predicted by running RNAfold
on a foldback sequence consisting of an artificial stemloop
(5'-GGGCCCGGGULLLLLLACCCGGGCCC-3', where "L" is an anonymous
unpaired loop character, and all other bases are paired to a
complementary base on the opposite side of the stem) attached to
the extended seed match. RNAfold optimization was constrained
so that all base pairs found in previous steps were fixed, the structure
of the artificial stem was fixed, and bases in the miRNA and UTR
were allowed to pair only with bases in the UTR and miRNA, respectively.
The stemloop was removed, and RNAeval was used to estimate
the energy of the miRNA:UTR duplex formed by the base pairs
determined in the previous steps.
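The first step of this procedure, locating antisense matches to miRNA bases 2..8 and discarding matches that fall too close to a preceding one, can be sketched as follows. This is only an illustration under stated assumptions: it covers Watson-Crick seed matching and the 20 nt spacing rule, and omits the masked-flank check, the G:U extension, and the RNAfold/RNAeval steps. The function names are hypothetical, the UTR string is made up, and "previous seed match" is interpreted here as the immediately preceding match whether or not it was kept.

```python
# Watson-Crick complement table for RNA.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_match_site(mirna: str, start: int = 2, end: int = 8) -> str:
    """UTR sequence (5'->3') antisense to miRNA bases start..end (1-based)."""
    seed = mirna[start - 1:end]
    return seed.translate(COMPLEMENT)[::-1]


def find_seed_matches(utr: str, mirna: str, min_spacing: int = 20) -> list:
    """Return 0-based UTR positions of retained seed matches.

    A match is discarded when it starts fewer than min_spacing nt
    downstream of the preceding match (an assumed reading of the rule).
    """
    site = seed_match_site(mirna)
    kept, previous = [], None
    pos = utr.find(site)
    while pos != -1:
        if previous is None or pos - previous >= min_spacing:
            kept.append(pos)
        previous = pos
        pos = utr.find(site, pos + 1)
    return kept


let7a = "UGAGGUAGUAGGUUGUAUAGUU"  # human let-7a
toy_utr = "AAAA" + "CUACCUC" + "A" * 5 + "CUACCUC" + "A" * 30 + "CUACCUC"
print(seed_match_site(let7a))             # seed match heptamer
print(find_seed_matches(toy_utr, let7a))  # positions of retained matches
```

In the toy UTR the second match lies 12 nt downstream of the first and is dropped, while the third is far enough from the second to be retained.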
Parameter Optimization
Training sets were constructed with 40 randomly chosen miRNAs
from nrMamm and 27 randomly chosen miRNAs from nrVert. The
remaining microRNAs were assigned to the nrMamm and nrVert
reference sets. TargetScan was tested on the training sets with
various parameter values: T was varied from 5 to 25 in increments
of 5, Zc was varied between 1 and 10 in increments of 0.5, and Rc
was varied between 50 and 1000 in increments of 50. The parameters
T = 20, Zc = 4.5, Rc = 200 were found to give an optimal signal:noise
of 3.4:1 for the nrMamm training set. When Rc was raised to 300 or
Zc was lowered to 4, the signal:noise decreased only moderately to
~3:1. The parameters T = 10, Zc = 4.5, Rc = 350 were found to
give an optimal signal:noise of 4.6:1 for the nrVert training set used
with UTR sets from all four genomes. For both the nrMamm and
nrVert sets, the signal:noise ratios obtained using the training sets
did not differ significantly from the corresponding signal:noise ratios
obtained using the reference sets, and thus results from the two
sets were merged.
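The parameter sweep described above amounts to an exhaustive grid search. The sketch below enumerates the stated ranges for T, Zc, and Rc and picks the combination that maximizes a scoring function. The scoring function is a placeholder: in the real analysis it would be the signal:noise ratio computed by running the predictions on a training set, which is not reproduced here. All names are hypothetical.

```python
from itertools import product


def inclusive_range(start: float, stop: float, step: float) -> list:
    """Evenly spaced values from start to stop, both ends included."""
    n = int(round((stop - start) / step)) + 1
    return [round(start + i * step, 10) for i in range(n)]


T_VALUES = list(range(5, 30, 5))               # 5, 10, ..., 25
ZC_VALUES = inclusive_range(1.0, 10.0, 0.5)    # 1.0, 1.5, ..., 10.0
RC_VALUES = list(range(50, 1050, 50))          # 50, 100, ..., 1000


def grid_search(score):
    """Return the (T, Zc, Rc) tuple with the highest score."""
    return max(product(T_VALUES, ZC_VALUES, RC_VALUES),
               key=lambda params: score(*params))


def toy_score(t, zc, rc):
    """Placeholder objective peaked at the reported nrMamm optimum."""
    return -(abs(t - 20) + abs(zc - 4.5) + abs(rc - 200))


grid_size = len(T_VALUES) * len(ZC_VALUES) * len(RC_VALUES)
print(grid_size, grid_search(toy_score))
```

With these ranges the grid has 5 x 19 x 20 combinations, small enough that every point can be evaluated directly.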
Generation of Randomly Permuted Sequences
For each miRNA in nrMamm, randomly permuted sequences with
the same starting base, length, and base composition as the real
miRNA were generated until four sequences were found that deviated
from the original miRNA by less than 15% in the following
properties: (1) E(SM), the first-order Markov probability of the seed
match, (2) E(TM), the first-order Markov probability of the antisense
of the 3' end of the miRNA (or the 5' end in the case of a 3' miRNA
seed), (3) O(SM), the observed count of seed matches in the UTR
dataset, and (4) the predicted folding free energy of a seed:seed
match duplex. For a miRNA (or shuffled miRNA) with the initial sequence
$S_1, S_2, S_3, S_4, S_5, S_6, S_7, S_8$, and the seed designated as bases
2..8, E(SM) was equal to
$$E(SM) = P_{S_2} \cdot P_{S_2 S_3} \cdot P_{S_3 S_4} \cdot P_{S_4 S_5} \cdot P_{S_5 S_6} \cdot P_{S_6 S_7} \cdot P_{S_7 S_8},$$
where $P_{S_k S_{k+1}}$ was the conditional frequency of the nucleotide $S_{k+1}$ given $S_k$
at the previous position in the set of inverse complements of the
UTRs in the UTR database. E(TM) was the analogous quantity calculated
for the remainder of the sequence (i.e., for bases 9, 10, 11, ... to
the end of the miRNA or shuffled miRNA). O(SM) was determined
directly from heptamer counts in the UTR dataset. The predicted
folding free energy of a seed:seed match duplex was determined
using RNAeval. The DiMirShuffle program generated shuffled controls
for a given miRNA sequence by shuffling the dinucleotides of
the specified miRNA seed (e.g., bases 2..8 of the miRNA).
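The first-order Markov quantity E(SM) can be computed from nucleotide and dinucleotide counts. The sketch below is one possible reading, assuming that the leading factor is the overall frequency of the first seed base and that each later factor is a dinucleotide count divided by the count of its leading base, all tallied over the inverse complements of the UTRs. The function names and the toy UTR are hypothetical and are not taken from the authors' software.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGU", "UGCA")


def inverse_complement(seq: str) -> str:
    """Reverse complement of an RNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def markov_model(sequences):
    """Return (base frequencies, conditional frequencies P(next | current))."""
    mono, di = Counter(), Counter()
    for s in sequences:
        mono.update(s)
        di.update(s[i:i + 2] for i in range(len(s) - 1))
    total = sum(mono.values())
    freq = {base: n / total for base, n in mono.items()}
    lead = Counter()
    for pair, n in di.items():
        lead[pair[0]] += n
    cond = {pair: n / lead[pair[0]] for pair, n in di.items()}
    return freq, cond


def expected_seed_probability(mirna, utrs, start=2, end=8):
    """First-order Markov probability of miRNA bases start..end (1-based)."""
    freq, cond = markov_model([inverse_complement(u) for u in utrs])
    seed = mirna[start - 1:end]
    prob = freq.get(seed[0], 0.0)
    for current, following in zip(seed, seed[1:]):
        prob *= cond.get(current + following, 0.0)
    return prob


toy_utrs = ["ACGU" * 50]                # made-up, perfectly periodic UTR
toy_mirna = "ACGUACGUACGUACGUACGUAC"    # bases 2..8 are CGUACGU
print(expected_seed_probability(toy_mirna, toy_utrs))
```

In the periodic toy UTR every transition is deterministic, so the probability reduces to the frequency of the first seed base; a seed containing a transition that never occurs in the dataset receives probability zero.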
DNA Constructs
The firefly luciferase vector was modified from pGL3 Control Vector
(Promega), such that a short sequence containing multiple cloning
sites (5'-AGCTCTATACGCGTCTCAAGCTTACTGCTAGCGT-3') was
inserted into the XbaI site immediately downstream from the stop
codon. 3' UTR segments of the target genes were amplified by PCR
from human genomic DNA and inserted into the modified pGL3
vector between SacI and NheI sites. PCR with the appropriate primers
also generated inserts with point substitutions in the miRNA
complementary sites. Wild-type and mutant inserts were confirmed
by sequencing and are listed (Supplemental Figure S2 online).
Transfections and Assays
Adherent HeLa S3 cells were grown in 10% FBS in DMEM, supplemented
with glutamine in the presence of antibiotics, to 80%-90%
confluency in 24-well plates. Cells were transfected with 0.4 µg of
the firefly luciferase reporter vector and 0.08 µg of the control vector
containing Renilla luciferase, pRL-TK (Promega), in a final volume
of 0.5 ml using Lipofectamine 2000 (Invitrogen). Firefly and Renilla
luciferase activities were measured consecutively using the Dual-luciferase
assays (Promega) 30 hr after transfection. Each firefly
plasmid was tested in 12-15 transfections (four or five independent
experiments, each with three culture replicates) involving two independent
plasmid preparations (six to nine transfections each). A
HeLa cell line that constitutively expressed miR-1 from a pol II promoter
was created using a derivative of the retroviral vector pRevTRE
(Clontech) containing a 500 bp fragment of human mir-1d gene.
A HeLa S3 cell line that constitutively expressed miR-130 from the
H1 pol III promoter was constructed using a retroviral vector containing
a 330 nt fragment of the human mir-130 gene and a GFP
gene under the murine 3-phosphoglycerate kinase promoter, which
served as an infection marker (Chen et al., 2003). Cells expressing
GFP following infection were enriched to 95% purity by FACS.
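The two-step normalization described in the Figure 3 legend (firefly activity divided by Renilla activity, then scaled to the median of the wild-type reporter) can be sketched as follows. The readings below are invented purely for illustration and the function names are hypothetical; the statistical comparison (Mann-Whitney test) is not included.

```python
from statistics import median


def firefly_to_renilla(firefly, renilla):
    """Per-transfection ratio of firefly to Renilla luciferase activity."""
    return [f / r for f, r in zip(firefly, renilla)]


def relative_to_wt_median(wt_ratios, mutant_ratios):
    """Scale both WT and mutant ratios by the median WT ratio."""
    wt_median = median(wt_ratios)
    return ([x / wt_median for x in wt_ratios],
            [x / wt_median for x in mutant_ratios])


# Made-up readings (arbitrary units), not data from the paper.
wt = firefly_to_renilla([10.0, 20.0, 30.0], [10.0, 10.0, 10.0])
mut = firefly_to_renilla([40.0, 60.0, 80.0], [10.0, 10.0, 10.0])
wt_rel, mut_rel = relative_to_wt_median(wt, mut)
print(median(wt_rel), median(mut_rel))
```

After this scaling the WT reporter has a median of 1 by construction, so the median of the mutant values reads directly as the fold change reported next to each box in the plot.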
Analysis of Gene Ontologies
Gene ontologies were assigned to human genes from the Ensembl
database by cross-referencing Ensembl identifiers with GO identifiers
using EnsMart version 15.1 (http://www.ensembl.org/EnsMart).
The Gene Ontology Consortium database was retrieved from
http://www.geneontology.org and function and process ontologies were
compiled for all predicted target genes. In addition to the assigned
categories, each gene was considered as having all more general
("parent") categories within the "Molecular Function" and "Biological
Process" ontologies. In Tables 2 and S4, sets of GO categories
were selected that were both broad enough to contain a significant
fraction of the predicted targets and specific enough to be meaningful.
Because the GO descriptions are not mutually exclusive, the
sum of the percentages in these tables is not interpretable. GO
categories were also used to produce the categories in Table 1. To
be included in a category, a gene had to be annotated with at least
one out of a set of GO categories. The sets of GO categories used
were: regulation of transcription/DNA binding (GO:0003700,
GO:0003713, GO:0003714, GO:0016563, or GO:0045449) and signal
transduction/cell-cell signaling (GO:0004871, GO:0004872,
GO:0007154, GO:0007165, GO:0007267, or GO:0008083).
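The rule that each gene inherits all more general ("parent") categories, and the rule that a gene joins a Table 1 category when it carries at least one GO term from a set, can be sketched as follows. The hierarchy fragment and the gene name are toy inputs chosen for illustration; they are assumptions and do not reproduce the GO release the authors used.

```python
def ancestors(term, parents):
    """All terms reachable from term by following parent links."""
    seen, stack = set(), [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def expand_annotations(gene_terms, parents):
    """Add every ancestor term to each gene's assigned GO terms."""
    expanded = {}
    for gene, terms in gene_terms.items():
        full = set(terms)
        for term in terms:
            full |= ancestors(term, parents)
        expanded[gene] = full
    return expanded


def in_category(terms, category_terms):
    """True if the gene has at least one term from the category set."""
    return bool(set(terms) & set(category_terms))


# Toy hierarchy fragment (child -> parents); an assumption for illustration.
TOY_PARENTS = {
    "GO:0003677": ["GO:0003676"],   # DNA binding -> nucleic acid binding
    "GO:0016301": ["GO:0016740"],   # kinase activity -> transferase activity
}
annotations = expand_annotations({"geneA": {"GO:0003677"}}, TOY_PARENTS)
print(sorted(annotations["geneA"]))
print(in_category(annotations["geneA"], {"GO:0003676", "GO:0000166"}))
```

Propagating annotations upward before counting is what allows a broad category such as nucleic acid binding to include genes that were annotated only with a more specific child term.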
Acknowledgments
We thank W.K. Johnston for technical assistance, C-Z. Chen and
L.P. Lim for helpful discussions, H.F. Lodish for use of facilities
and equipment, N.C. Lau for the miR-1-expressing cell line, and G.
Ruvkun for plasmids used to construct the lin-41 reporters. Supported
by grants from the N.I.H. (D.P.B. and C.B.B.), the Searle Scholars
Program (C.B.B.), and the Alexander and Margaret Stewart Trust
(D.P.B.), and fellowships from the DOE (B.P.L.) and the Cancer Research Institute (I.S.).
Received: November 18, 2003
Revised: December 3, 2003
Accepted: December 4, 2003
Published: December 24, 2003
References
Abrahante, J.E., Daul, A.L., Li, M., Volk, M.L., Tennessen, J.M., Miller, E.A., and Rougvie, A.E. (2003). The Caenorhabditis elegans hunchback-like gene lin-57/hbl-1 controls developmental time and is regulated by microRNAs. Dev. Cell 4, 625-637.
Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389-3402.
Ambros, V., Bartel, B., Bartel, D.P., Burge, C.B., Carrington, J.C., Chen, X., Dreyfuss, G., Eddy, S., Griffiths-Jones, S., Matzke, M., et al. (2003a). A uniform system for microRNA annotation. RNA 9, 277-279.
Ambros, V., Lee, R.C., Lavanway, A., Williams, P.T., and Jewell, D. (2003b). MicroRNAs and other tiny endogenous RNAs in C. elegans. Curr. Biol. 13, 807-818.
Aravin, A.A., Naumova, N.M., Tulin, A.A., Rozovsky, Y.M., and Gvozdev, V.A. (2001). Double-stranded RNA-mediated silencing of genomic tandem repeats and transposable elements in Drosophila melanogaster germline. Curr. Biol. 11, 1017-1027.
Aukerman, M.J., and Sakai, H. (2003). Regulation of flowering time and floral organ identity by a MicroRNA and its APETALA2-like target genes. Plant Cell 15, 2730-2741.
Bartel, D.P. (2004). MicroRNAs: genomics, biogenesis, mechanism, and function. Cell, in press.
Brennecke, J., Hipfner, D.R., Stark, A., Russell, R.B., and Cohen, S.M. (2003). bantam encodes a developmentally regulated microRNA that controls cell proliferation and regulates the proapoptotic gene hid in Drosophila. Cell 113, 25-36.
Chen, C.-Z., Li, L., Lodish, H.F., and Bartel, D.P. (2003). MicroRNAs modulate hematopoietic lineage differentiation. Science, in press. Published online December 4, 2003. 10.1126/science.1091903.
Chen, X. (2003). A MicroRNA as a translational repressor of APETALA2 in Arabidopsis flower development. Science. Published online September 11, 2003. 10.1126/science.1088060.
Consortium, The Gene Ontology. (2001). Creating the gene ontology resource: design and implementation. Genome Res. 11, 1425-1433.
Doench, J.G., Peterson, C.P., and Sharp, P.A. (2003). siRNAs can function as miRNAs. Genes Dev. 17, 438-442.
Dostie, J., Mourelatos, Z., Yang, M., Sharma, A., and Dreyfuss, G. (2003). Numerous microRNPs in neuronal cells containing novel microRNAs. RNA 9, 631-632.
Emery, J.F., Floyd, S.K., Alvarez, J., Eshed, Y., Hawker, N.P., Izhaki, A., Baum, S.F., and Bowman, J.L. (2003). Radial patterning of Arabidopsis shoots by class III HD-ZIP and KANADI genes. Curr. Biol. 13, 1768-1774.
Hofacker, I.L., Fontana, W., Stadler, P.F., Bonhoeffer, S., Tacker, M., and Schuster, P. (1994). Fast folding and comparison of RNA secondary structures. Monatshefte fur Chemie 125, 167-188.
Houbaviy, H.B., Murray, M.F., and Sharp, P.A. (2003). Embryonic stem cell-specific MicroRNAs. Dev. Cell 5, 351-358.
Iseli, C., Stevenson, B.J., de Souza, S.J., Samaia, H.B., Camargo, A.A., Buetow, K.H., Strausberg, R.L., Simpson, A.J., Bucher, P., and Jongeneel, C.V. (2002). Long-range heterogeneity at the 3' ends of human mRNAs. Genome Res. 12, 1068-1074.
Kasschau, K.D., Xie, Z., Allen, E., Llave, C., Chapman, E.J., Krizan, K.A., and Carrington, J.C. (2003). P1/HC-Pro, a viral suppressor of RNA silencing, interferes with Arabidopsis development and miRNA function. Dev. Cell 4, 205-217.
Krichevsky, A.M., King, K.S., Donahue, C.P., Khrapko, K., and Kosik, K.S. (2003). A microRNA array reveals extensive regulation of microRNAs during brain development. RNA 9, 1274-1281.
Lagos-Quintana, M., Rauhut, R., Lendeckel, W., and Tuschl, T. (2001). Identification of novel genes coding for small expressed RNAs. Science 294, 853-858.
Lagos-Quintana, M., Rauhut, R., Yalcin, A., Meyer, J., Lendeckel, W., and Tuschl, T. (2002). Identification of tissue-specific microRNAs from mouse. Curr. Biol. 12, 735-739.
Lagos-Quintana, M., Rauhut, R., Meyer, J., Borkhardt, A., and Tuschl, T. (2003). New microRNAs from mouse and human. RNA 9, 175-179.
Lai, E.C. (2002). MicroRNAs are complementary to 3' UTR motifs that mediate negative post-transcriptional regulation. Nat. Genet. 30, 363-364.
Lai, E.C. (2003). MicroRNAs: runts of the genome assert themselves. Curr. Biol. 13, R925-R936.
Lai, E.C., Tomancak, P., Williams, R.W., and Rubin, G.M. (2003). Computational identification of Drosophila microRNA genes. Genome Biol. 4:R42, 1-20.
Lau, N.C., Lim, L.P., Weinstein, E.G., and Bartel, D.P. (2001). An abundant class of tiny RNAs with probable regulatory roles in Caenorhabditis elegans. Science 294, 858-862.
Lee, R.C., and Ambros, V. (2001). An extensive class of small RNAs in Caenorhabditis elegans. Science 294, 862-864.
Lee, R.C., Feinbaum, R.L., and Ambros, V. (1993). The C. elegans heterochronic gene lin-4 encodes small RNAs with antisense complementarity to lin-14. Cell 75, 843-854.
Lim, L.P., Glasner, M.E., Yekta, S., Burge, C.B., and Bartel, D.P. (2003a). Vertebrate microRNA genes. Science 299, 1540.
Lim, L.P., Lau, N.C., Weinstein, E.G., Abdelhakim, A., Yekta, S., Rhoades, M.W., Burge, C.B., and Bartel, D.P. (2003b). The microRNAs of Caenorhabditis elegans. Genes Dev. 17, 991-1008.
Lin, S.Y., Johnson, S.M., Abraham, M., Vella, M.C., Pasquinelli, A., Gamberi, C., Gottlieb, E., and Slack, F.J. (2003). The C. elegans hunchback homolog, hbl-1, controls temporal patterning and is a probable microRNA target. Dev. Cell 4, 639-650.
Llave, C., Xie, Z., Kasschau, K.D., and Carrington, J.C. (2002). Cleavage of scarecrow-like mRNA targets directed by a class of Arabidopsis miRNA. Science 297, 2053-2056.
Moss, E.G., Lee, R.C., and Ambros, V. (1997). The cold shock domain protein LIN-28 controls developmental timing in C. elegans and is regulated by the lin-4 RNA. Cell 88, 637-646.
Moss, E.G., and Tang, L. (2003). Conservation of the heterochronic regulator Lin-28, its developmental expression and microRNA complementary sites. Dev. Biol. 258, 432-442.
Mourelatos, Z., Dostie, J., Paushkin, S., Sharma, A., Charroux, B., Abel, L., Rappsilber, J., Mann, M., and Dreyfuss, G. (2002). miRNPs: a novel class of ribonucleoproteins containing numerous microRNAs. Genes Dev. 16, 720-728.
Nussinov, R. (1981). Nearest neighbor nucleotide patterns. Structural and biological implications. J. Biol. Chem. 256, 8458-8462.
Palatnik, J.F., Allen, E., Wu, X., Schommer, C., Schwab, R., Carrington, J.C., and Weigel, D. (2003). Control of leaf morphogenesis by microRNAs. Nature 425, 257-263. Published online August 20, 2003. 10.1038/nature01958.
Pasquinelli, A.E., Reinhart, B.J., Slack, F., Martindale, M.Q., Kuroda, M., Maller, B., Srinivasan, A., Fishman, M., Hayward, D., Ball, E., et al. (2000). Conservation across animal phylogeny of the sequence and temporal regulation of the 21 nucleotide let-7 heterochronic regulatory RNA. Nature 408, 86-89.
Reinhart, B.J., Slack, F.J., Basson, M., Bettinger, J.C., Pasquinelli, A.E., Rougvie, A.E., Horvitz, H.R., and Ruvkun, G. (2000). The 21 nucleotide let-7 RNA regulates developmental timing in Caenorhabditis elegans. Nature 403, 901-906.
Rhoades, M.W., Reinhart, B.J., Lim, L.P., Burge, C.B., Bartel, B., and Bartel, D.P. (2002). Prediction of plant microRNA targets. Cell 110, 513-520.
Stark, A., Brennecke, J., Russell, R.B., and Cohen, S.M. (2003). Identification of Drosophila microRNA targets. PLOS Biol., in press. Published online October 13, 2003. 10.1371/journal.pbio.0000060.
Tang, G., Reinhart, B.J., Bartel, D.P., and Zamore, P.D. (2003). A biochemical framework for RNA silencing in plants. Genes Dev. 17, 49-63.
Wightman, B., Ha, I., and Ruvkun, G. (1993). Posttranscriptional regulation of the heterochronic gene lin-14 by lin-4 mediates temporal pattern formation in C. elegans. Cell 75, 855-862.
Xu, P., Vernooy, S.Y., Guo, M., and Hay, B.A. (2003). The Drosophila MicroRNA Mir-14 suppresses cell death and is required for normal fat metabolism. Curr. Biol. 13, 790-795.