ncRNAs: What Genomes are Telling Us

ncRNA genes are difficult to discover!
• small
• no ORFs and no polyadenylation
• must be identified by paralogy/orthology
  • novel members not readily discovered from sequence
• Draft or complete; no structural information; underrepresentation
• draft sequence biased against tandem ncRNA gene units
  • an annotational and statistical concern
  • “low complexity” BACs not sequenced for draft
• many pseudogenes and retropseudogenes
  • difficult to distinguish gene from pseudogene, except for tRNA genes

The ncRNAs We Know and Love
• snRNAs
• snoRNAs
• rRNAs
• Telomerase RNA
• tRNAs
• Xist RNA
• Snurp RNAs
• Vault RNA
• 7SL RNA
• miRNAs

Transfer RNAs (tRNAs)
• About 500 genes, about 300 pseudogenes
• Includes tRNASeCys (UGA)
• Gene number: Fly < Human < Worm
  • Related to developmental and tissue-specific needs, not organismal complexity
• Gene number for each tRNAaa roughly correlates with amino acid frequency and codon bias

Genomic distribution of tRNA genes in human
• Nonrandom dispersal (clustering)
• 25% (140) located in 4 Mb of HSA6
  • 0.1% of the genome has a near-complete set of anticodons
• 18 of 30 tRNAcys genes located in 0.5 Mb of HSA7
• Many tRNAarg and tRNAglu genes are loosely clustered on HSA1
• Over 50% (280) of genes located on HSA1 and HSA6
• HSA3, 4, 8, 9, 10, 12, 18, 20, 21 and X have <10 tRNA genes each
• HSA22 and Y have one pseudogene each and no genes

Ribosomal RNAs (rRNAs)
• Four rRNA molecules across the two ribosomal subunits
• LSU and SSU rRNA genes occur as a 44 kb tandem repeat unit
  • LSU rRNA = 28S, 5.8S
  • SSU rRNA = 18S
  • 5S rRNA (also part of LSU, but from a separate gene)
  • 150-200 copies on the short arms of acrocentric chromosomes 13, 14, 15, 21, 22
• 5S rRNA gene occurs in several 200-300-unit tandem arrays
  • Largest at 1q41.11-1q42.13
  • 2000 copies predicted; 520 pseudogenes likely
• LSU-SSU rRNA gene repeat: 150-200 copies on HSA 13p, 14p, 15p, 21p, 22p

Small nucleolar RNAs (snoRNAs)
• Direct post-transcriptional modification and processing of rRNAs in the nucleolus
• Two families of snoRNA genes
• 97 snoRNA genes
• C/D-Box snoRNAs direct 2’-O-ribose methylations (105-107 instances)
• H/ACA-Box snoRNAs direct pseudouridylation (95 instances)
• Distributed across chromosomes, nearly all as single copies
  • 5-10 copies of C/D-Box snoRNA gene inverted repeats at 17q21
• More predicted
  • Sequences diverse; cannot depend on paralogy to predict

Spliceosomal snRNAs (snurps)
• Ten known RNAs (U1-U12) responsible for hnRNA splicing
• Snurp RNA genes either clustered or dispersed:
  • 44 dispersed genes for U6 RNA
  • 16 dispersed genes for U1 RNA
  • 10-20 tandem copies of U2 RNA genes (6.1 kb units) at 17q21
  • 30-copy loose cluster of U1 RNA genes at 1p36
• More predicted
  • Tandem-arrayed clusters underrepresented in draft

ncRNA pseudogenes
• 100s-1000s of pseudogene copies of ncRNA genes
• More copies from ncRNAs transcribed by RNA polymerase III
• Most presumed to have arisen by reverse transcription and retroposition
  • Including snurp U6, 7SL RNA, and hY RNA
  • Like Alu and tRNA-family repeats
• Analytical comparison with Alus may help explain requirements for SINE proliferation in genomes
• Dude!

Small interfering RNAs (siRNAs)
• siRNA-containing transcripts exhibit extensive folding
• dsRNA folds recognized by DICER enzyme
• siRNA molecules excised from folds in nucleus or cytoplasm
• siRNAs perform multiple functions
  • Directed degradation of specific mRNAs
  • Maintenance of heterochromatin
• Analogous to RNAi experiments, but siRNA is endogenous
• RISC: RNA-induced silencing complex

microRNAs (miRNAs)
• ~21 nt long
• miRNAs derive from 60-80 nt dsRNA hairpins
• Hairpins excised from primary transcripts in nucleus
• Some transcripts have exon-intron structure
• miRNAs can derive from intron or exon sequence
• Some transcripts contain clusters of miRNAs
• Conserved sequence
• miRNAs excised from hairpins in cytoplasm
  • Excised by DICER
• Hairpins derive from long primary transcripts
• “mature” miRNA
• 50% have Fugu and Danio homologues
• 25% have C. elegans homologues
• Primary function believed to be translational suppression
  • Binds to target mRNAs at 3’ ends
  • Suppresses, slows, or eliminates specific protein synthesis
• Some act as siRNAs

A Few Parting Words on ncRNAs and their Genes