Non-coding RNA

What is non-coding RNA?
Non-coding RNA (ncRNA) is an RNA molecule that functions without being translated into a protein.

How many RNAs are there in cells?
Besides mRNA, which is translated into protein: rRNA, tRNA, gRNA, ribozymes, antisense RNA, SRP RNA, telomerase RNA, snRNA, snoRNA, pRNA, microRNA, and others still unknown (?).

Functional diversity of ncRNAs
The function of different small non-coding RNAs that target mRNAs or pre-mRNAs
[Figure: scheme for the function of different sncRNAs that target bacterial or eukaryal mRNAs or pre-mRNAs, leading to regulation of gene expression]

RNA
• mRNA
• non-coding RNA (20-20,000 nt)
  • small non-coding RNA (sncRNA) (20-500 nt)
  • long non-coding RNA (lncRNA) (>500-20,000 nt)

Non-coding RNA: versatility in form and function
• Non-coding RNA genes are surprisingly numerous.
• Non-coding RNAs have very diverse functions.

Time for RNomics
• Cell, 89: 669–672, May 30, 1997
• PNAS, 97(26): 14035-14037, Dec 19, 2000
Understanding RNomics from an expanding snoRNA world
Couzin J. Breakthrough of the year. Small RNAs make big splash. Science. 2002 Dec 20;298(5602):2296-7.

DNA -> RNA -> Protein
Study non-coding RNAs on the genomic scale
RNomics: the study of the identification, expression, biogenesis, structure, regulation of expression, targets, and biological functions of non-coding RNAs on the genomic scale.

How to identify ncRNA genes in genomic studies?
• sncRNAs are very small.
• sncRNAs contain no specific features at their 5’ and 3’ ends.

Methods for finding novel non-coding RNA genes: computational RNomics
• Searching for conserved intronic sequences by comparative analysis of introns
• Searching for conserved intergenic sequences
• Searching for well-defined sequence elements or characteristics (box C/D, functional regions, complementary and other conserved sequences, etc.)
• Novel algorithms that take the folding parameters of the RNA molecule into account
All predictions of novel ncRNA genes need to be confirmed by direct detection of the transcripts!
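The idea of searching for well-defined sequence elements such as box C/D can be sketched in a few lines. This is only a toy motif scan, assuming the commonly cited consensus motifs RUGAUGA (box C) and CUGA (box D); the function name and the length limits `min_len` and `max_len` are hypothetical choices made for illustration. Published screens are considerably more elaborate (they also score the guide region and the terminal stem), so hits from a scan like this are candidates only.

```python
import re

# Assumed consensus motifs in the RNA alphabet; R = A or G.
BOX_C = re.compile(r"[AG]UGAUGA")
BOX_D = re.compile(r"CUGA")


def find_box_cd_candidates(seq, min_len=50, max_len=120):
    """Return (start, end) spans running from a box C to a downstream box D.

    min_len and max_len are illustrative limits on the candidate length,
    not values taken from the slides.
    """
    rna = seq.upper().replace("T", "U")  # accept DNA or RNA input
    hits = []
    for c in BOX_C.finditer(rna):
        # Look for box D motifs only downstream of this box C.
        for d in BOX_D.finditer(rna, c.end()):
            length = d.end() - c.start()
            if min_len <= length <= max_len:
                hits.append((c.start(), d.end()))
    return hits


if __name__ == "__main__":
    # Hypothetical test sequence: box C, a 60-nt spacer, then box D.
    toy = "GG" + "AUGAUGA" + "A" * 60 + "CUGA" + "GG"
    print(find_box_cd_candidates(toy))
```

As the slide stresses, any candidate found this way still has to be confirmed by direct detection of the transcript.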
An example of a computational approach for screening box C/D snoRNAs
A Computational Screen for Methylation Guide snoRNAs. Science, 283: 1168-1171, February 19, 1999.

Similarity searching
• Proteins: BLAST, sequence alignment. Genes that code for proteins are conserved across genomes (e.g. low rate of mutation).
• ncRNA: secondary structure is usually conserved, so alignment scoring based on structure is imperative.

Orthologous and paralogous genes
• Orthologous: a1 in species I and a1 in species II
• Paralogous: a1 and a2 in species I

Repeat sequences
• Inverted repeat (palindromic sequence):
  5’ GAATTC 3’
  3’ CTTAAG 5’
• Mirror repeat
• Triple helix

ncRNA: sequence vs structure
The specificity of RNA search: an ncRNA is defined by both its primary and its secondary structure.

RNA structure
• Base pairing defines the secondary structure.
• Tertiary structures are much less well understood.
• RNA is extremely difficult to crystallize:
  • RNA is an enzymatically unstable molecule (RNases are everywhere!).
  • RNA is a conformationally flexible molecule.
Thus the bioinformatic approach, RNA structure prediction, is very important!
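To make "base pairing defines a secondary structure" concrete, here is a minimal sketch of the classic Nussinov-style dynamic programming idea: find the maximum number of nested base pairs a sequence can form. It is not the method of any tool named in these slides, and it ignores thermodynamics entirely (real predictors minimize free energy). The function name and the `min_loop` parameter (minimum number of unpaired bases in a hairpin loop) are assumptions made for illustration.

```python
# Allowed pairs: Watson-Crick plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def nussinov_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in seq (base-pair maximization)."""
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] = best pair count for the subsequence seq[i..j], inclusive.
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: base j stays unpaired
            # case 2: base j pairs with some k, leaving >= min_loop bases between
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


if __name__ == "__main__":
    # Hypothetical hairpin: a 3-bp stem closing a 4-nt loop.
    print(nussinov_pairs("GGGAAAUCCC"))
```

The table is filled from short subsequences to long ones, so every value needed for `dp[i][j]` is already available when it is computed.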
[Figure: L-shaped tRNA molecule]

Methods for finding novel non-coding RNA genes: experimental RNomics
Traditional methods
• PAGE separation of non-coding RNAs and sequencing
• Immunoprecipitation of specific RNPs
• Non-coding RNA-enriched cDNA libraries and sequencing
• Microarray analysis
New method
• Non-coding RNA libraries and deep sequencing

Functional analysis
Combination of bioinformatic and experimental methods in ncRNA structural and functional analysis
• Computational analysis: structural prediction, functional prediction
• Functional analysis by experimental methods

Nomenclature of non-coding RNA
• Bacterial RNAs: small RNA (sRNA)
• Eukaryotic RNAs: non-coding RNA (ncRNA), functional RNA (fRNA), small non-messenger RNA (snmRNA)
• Based on subcellular localization: small nucleolar RNA (snoRNA)
• Based on size: microRNA (miRNA), small interfering RNA (siRNA), long non-coding RNA (lncRNA)

snoRNA
Box C/D and box H/ACA guide snoRNAs and the core associated proteins

RNA processing and modification
Methylation and pseudouridylation guided by snoRNAs (addition of methyl groups or pseudouridine groups)

Box C/D snoRNA
[Figure, panels (a) and (b): box C/D snoRNAs direct rRNA methylation]

Box H/ACA snoRNA
[Figure, panels (a) and (b): box H/ACA snoRNAs direct rRNA pseudouridylation]

Box C/D-H/ACA snoRNA (scaRNA)

snoRNA targets
  snoRNA                              rRNA, U6
  scaRNA                              snRNA
  Imprinted snoRNA                    mRNA
  Homologs of snoRNAs in Archaea      rRNA and tRNA
  Orphan guide snoRNAs                no target

Diversity of genomic organization of ncRNAs
• snoRNA gene organization, types (1)-(4) (Trends in Plant Science, 8(1): 42-49, 2003)
• microRNA gene organization
• snoRNA and microRNA gene clusters

Processing of polycistronic and intronic pre-snoRNA transcripts
Polycistronic and intronic pre-snoRNA transcripts are processed by either a splicing or a non-splicing pathway.

Non-coding RNA host genes
• Protein-coding genes: most intronic snoRNA genes of vertebrates and yeast are nested in genes encoding proteins involved in ribosome biogenesis.
• Non-coding RNA genes: these "host" genes harbour snoRNAs in multiple introns, but their exons do not code for proteins.
[Figure: processing of snR80 and snR90 from the SPAC1B3.05 host gene: transcription -> polycistronic precursor (Exon 1, intron, Exon 2) -> splicing (intron lariat) and nucleases -> snR90 precursor -> exonuclease trimming -> snR90]

microRNA
The discovery of miRNAs
Victor Ambros, Gary Ruvkun
• miRNA was first discovered in 1993 by Victor Ambros at Harvard (lin-4).
• The second miRNA, let-7, was discovered in 2000 by Frank Slack as a postdoc at Harvard (Ruvkun lab).

The first miRNA discovered: lin-4, in 1993
• Ruvkun G, Wightman B, Ha I. The 20 years it took to recognize the importance of tiny RNAs. Cell. 2004 Jan 23;116(2 Suppl):S93-6.
• Lee R, Feinbaum R, Ambros V. A short history of a short RNA. Cell. 2004 Jan 23;116(2 Suppl):S89-92.
At first lin-4 was thought to be an oddity, not a general phenomenon.

Breakthrough with a BLASTN search of the second miRNA (stRNA), let-7
Pasquinelli AE, Reinhart BJ, Slack F, Martindale MQ, Kuroda MI, Maller B, Hayward DC, Ball EE, Degnan B, Muller P, Spring J, Srinivasan A, Fishman M, Finnerty J, Corbo J, Levine M, Leahy P, Davidson E, Ruvkun G. Conservation of the sequence and temporal expression of let-7 heterochronic regulatory RNA. Nature. 2000 Nov 2;408(6808):86-9.

MicroRNAs: 22-25 nt non-coding RNAs
The founding members in animals and plants (Bartel, Cell 116: 281-297, 2004)
MicroRNAs were neglected for many years because of their small size. The underlying reason: nobody imagined that such small RNAs could have important biological roles.
miRNA biogenesis
pri-miRNA (primary miRNA) -> (1) Drosha -> pre-miRNA (precursor miRNA) -> (2) Dicer -> mature miRNA
Exportin 5 (Exp5) transports the pre-miRNA to the cytoplasm. (Cell 125, 887–901, 2006)

Another view: the Microprocessor complex

Differences in miRNA mode of action

microRNA nomenclature
• Experimentally confirmed microRNAs are given a number attached to the prefix "mir" followed by a dash, e.g. mir-123.
• miRNAs whose sequences differ at only 1 or 2 nucleotides are annotated with an added lower-case letter to show their similarity, e.g. miR-1a and miR-1b.
• Different loci that produce the same mature miRNA are shown with an additional number, e.g. miR-1-1 and miR-1-2.
• The name should also be preceded by the annotation for the species in which the miRNA is observed, e.g. Homo sapiens = hsa-miR-xxx.

Discovery of siRNA
In 1998, the American scientists Andrew Fire and Craig Mello published their discovery: RNA interference.
The Nobel Prize in Physiology or Medicine 2006: Andrew Z. Fire and Craig C. Mello

siRNA-mediated gene silencing

What is the difference between miRNA and siRNA?
• Origin: siRNA originates from dsRNA; miRNA originates from ssRNA that forms a hairpin secondary structure.
• Complementarity: siRNA is often 100% complementary to its target; miRNA is often not 100% complementary to its target.
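The complementarity distinction above can be illustrated with a small calculation. The sketch below scores the fraction of Watson-Crick pairs between a small RNA and an equally long target site, both written 5’ to 3’ and paired antiparallel. It is a deliberate simplification: it ignores G-U wobble pairs, bulges, and any seed-region rules, and the sequences in the usage example are hypothetical, not real miRNAs or siRNAs. The function names are made up for illustration.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the perfectly complementary site for rna, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def complementarity(small_rna, target_site):
    """Fraction of positions that form Watson-Crick pairs (antiparallel).

    Both sequences are given 5'->3' and must have the same length.
    """
    if not small_rna or len(small_rna) != len(target_site):
        raise ValueError("sequences must be non-empty and of equal length")
    # Reverse the target so position i of the small RNA faces its partner.
    facing = target_site[::-1]
    matches = sum(COMPLEMENT.get(a) == b for a, b in zip(small_rna, facing))
    return matches / len(small_rna)


if __name__ == "__main__":
    guide = "AUGGCUA"                        # hypothetical small RNA
    perfect_site = reverse_complement(guide)  # siRNA-like: fully complementary
    imperfect_site = "UAGGCAU"                # miRNA-like: one mismatch
    print(complementarity(guide, perfect_site))
    print(complementarity(guide, imperfect_site))
```

A score of 1.0 corresponds to the siRNA-like case of full complementarity; anything lower corresponds to the partial pairing typical of miRNAs in this simplified picture.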
A comparison between miRNA and siRNA
[Figure: RNAi by siRNAs: processing -> ~22 nt siRNAs -> target recognition -> mRNA degradation. Developmental regulation by microRNAs: processing -> ~22 nt lin-4 and let-7 -> target recognition in the 3’ UTR of lin-14 and lin-41 mRNAs -> translational repression]

Base-pairing differences between miRNAs and siRNAs

Transcriptional gene silencing by directing chromatin modification

RNA silencing in different organisms

RNA-mediated gene silencing
• Post-transcriptional gene silencing (PTGS) or RNA interference (RNAi)
• Transcriptional gene silencing (TGS) (RNA-dependent DNA methylation)
• Gene silencing by microRNAs

Expression of short hairpin RNA (shRNA) using a Pol III promoter
• Transcription from the RNAP III promoters of U6 and H1 is well characterized.
• RNAP III transcription uses a well-defined termination signal (TTTTT), and the products have no extra sequence.
• Transcription from these promoters is very efficient in various tissues.

Vector-based siRNA
Plasmid and viral vectors establish long-term RNAi: let the cell make the siRNA for you!
Example of an expression vector: a lentiviral construct for siRNAs

siRNA delivery and processing

RNA research on the rise at the start of the 21st century
• 2000: second item among the world's top ten scientific breakthroughs
• 2001: second item among the world's top ten scientific breakthroughs
• 2002: first item among the world's top ten scientific breakthroughs
• 2004: RNA research again among the world's top ten scientific breakthroughs

snRNA (small nuclear RNA)
A class of RNAs that are stably expressed in the cell; after transcription they must associate with multiple proteins to form snRNPs (small nuclear ribonucleoprotein particles).
• Types: five main species, U1, U2, U4, U5 and U6; others include U11 and U12.
• Functions:
  • Recognize and bind the splice sites
  • Form the three-dimensional structure of the spliceosome, which helps the reaction proceed
  • May catalyze the transesterification reactions

The spliceosome assembly pathway
[Figure: pre-mRNA (Exon 1, Exon 2) -> E complex (commitment complex, U1) -> A complex (pre-spliceosome, U1 and U2; requires ATP) -> B complex (spliceosome, with U4, U5 and U6) -> C complex (activated spliceosome, U2, U5 and U6) -> spliced mRNA (Exon 1 joined to Exon 2)]

gRNA
RNA editing
• In RNA editing, the coding sequence of an mRNA molecule is altered after transcription, so the protein has an amino acid sequence that differs from the one encoded by the gene.
• It is observed in mRNAs, tRNAs and rRNAs from a wide range of organisms.
• It includes the insertion and deletion of nucleotides and the conversion of one base into another.

T. brucei (Trypanosoma brucei) gCYB gRNA, 68 nt: directs the insertion and removal of U during RNA editing.

Nucleotide positions 480-510:
  mRNA sequence      UUA GGU AUA AAA GUA GAU UGU AUA CCU GGU AGG UGU AAU
  Protein sequence    L   G   I   K   V   D   C   I   P   G   R   C   N
  DNA plus strand    TTA GGT ATA AAA GTA GA- -G- A-A CCT GGT AGG TGT AAT
Comparison of a fragment of the trypanosome COII gene with the sequences of its expression products. Nucleotides are numbered starting from the A of the start codon AUG (ATG). Dashes in the DNA mark the four positions where a U is present in the mRNA but not encoded in the gene.

Xist RNA
The Xist RNA is a large non-coding RNA that has been shown to be necessary for developmentally regulated chromosomal silencing in females.
Human Xist RNA: 16,500 nt
[Figure: Xist RNA (red) on the inactive X chromosome (blue) during mitosis. Cell, 93, 309-312 (1998)]

pRNA
During the replication and maturation of double-stranded DNA viruses, a very long progeny DNA has to be packed into a newly formed viral capsid with extremely limited space. As early as 1987, Guo P X et al., while studying DNA translocation in bacteriophage φ29, discovered an RNA molecule with a translocation function. This RNA plays an important role in φ29 DNA packaging and is called pRNA (packaging RNA).

Human telomerase RNA (451 nt)
Telomeres are the physiological ends of eukaryotic chromosomes and consist of G-rich DNA sequences and associated proteins. Telomere maintenance requires the activation of telomerase. Telomerase is a ribonucleoprotein complex in which both the RNA and the protein are required for telomeric DNA synthesis. Unlike classical DNA polymerases, it is a specialized reverse transcriptase that uses its own RNA as a template to synthesize telomeric DNA, compensating for the shortening of chromosome ends during cell division.

Telomerase RNA
• Component of telomerase
• Provides the template for telomere synthesis
• Role in cancer and aging

Telomerase: a reverse transcriptase that elongates telomeric DNA
[Figure, animation frames: the telomerase RNA template (3’-CAAUCCCAAUC-5’), held by the telomerase protein, pairs with the 3’ end of the G-rich (TTAGGG)n strand; successive AGGGTT repeats are added as the enzyme translocates; finally a primer and DNA polymerase fill in the complementary (AATCCC)n strand]

A myriad of RNAs and functional diversity
• mRNA, tRNA, rRNA: protein biosynthesis
• gRNA: mRNA editing
• snRNA: mRNA processing (splicing and maturation)
• snoRNA: rRNA processing (cleavage and modification)
• RNase P RNA: tRNA processing
• Telomerase RNA: DNA replication and life span
• SRP RNA: transport
• miRNA: regulation of gene expression at the transcriptional and post-transcriptional levels
• siRNA: gene silencing
• Xist and Tsix: X chromosome inactivation
• ……
A hidden "RNA world" within the modern DNA world