Gene

Molecular definition: the entire nucleic acid sequence necessary for the synthesis of a functional polypeptide (protein chain) or functional RNA.

Genes in the genome:
• Protein-coding genes (mRNA): around 20,500 (as of 10/2012)
• Non-coding RNAs
  - Ribosomal RNA (rRNA)
  - Transfer RNA (tRNA)
  - Small nuclear RNA (snRNA)
  - Small nucleolar RNA (snoRNA)
  - microRNA (miRNA)
  - Other non-coding RNAs (Xist, 7SK, etc.)
• Pseudogenes

Ratio of non-coding to protein-coding DNA
Rises as a function of developmental complexity.

The genetic basis of human complexity and variation
• ~98% of the transcriptional output in humans is noncoding RNA
  - 95-97% of the primary transcript of protein-coding genes is intronic
  - there are enormous numbers of noncoding RNA genes in the mammalian genome, which are only now beginning to be recognized, and which appear to account for between 1/2 and 3/4 of all transcripts
• The majority of the human genome is transcribed
  - 1.9% exonic (1.2% protein-coding) x 20 (exons make up only about 5% of a primary transcript) indicates that 30-40% of the genome is transcribed, just to account for protein-coding genes
  - if there is an equal number of noncoding RNA transcripts, then >60% is transcribed
  - a direct summation of 'known genes', mRNAs, and spliced ESTs from the UCSC database shows that (a minimum of) 58% of the human genome is transcribed, 24% from both strands (total 2.3 Gb)
• Either the genome is replete with useless transcription or these nonprotein-coding RNAs are fulfilling some unexpected function

[Figure: known or predicted transcribed fraction of the genome and noncoding percent of transcription, by genome size, for human, mouse, fruit fly, worm and yeast]

Non-Coding RNAs: ‘RiboRegulators’
(~97% of RNAs present in human cells are non-coding)
• rRNA
• tRNA
• Vault
• Y RNAs
• 7SK, 7SL
• Xist, H19
• snRNAs
• snoRNAs
• Guide RNA
• Introns
• 5’ UTR
• 3’ UTR
• Antisense RNAs
• Catalytic: ribozymes
• Telomerase
• MicroRNAs
• Viral RNAs
• Retrotransposons
• Many pseudogenes

Processes affected by ncRNAs.

| Process | Example | Function |
|---|---|---|
| Transcription | 184-nt E. coli 6S | Modulates promoter use |
| | 331-nt human 7SK | Inhibits transcription factor P-TEFb |
| | 875-nt human SRA | Steroid receptor coactivator |
| Gene silencing | 16,500-nt human Xist | Required for X-chromosome inactivation |
| | 100,000-nt human Air | Required for autosomal gene imprinting |
| Replication | 451-nt human telomerase RNA | Core of telomerase and telomere template |
| RNA processing | 377-nt E. coli RNase P | Catalytic core of RNase P |
| | 186-nt human U2 snRNA | Core of spliceosome |
| Translation… | 28S + 18S RNAs | Ribosomes |

RNA-Mediated Gene Silencing
Post-transcriptional Gene Silencing (PTGS) or RNA Interference (RNAi)
Gene Silencing by MicroRNAs
Other Gene Silencing Mechanisms Mediated by Non-Coding RNAs

RNA Silencing: The Genome’s Immune System
Ronald H. A. Plasterk, Science vol. 296, 2002

Genomes are databases sensitive to invasion by viruses (foreign nucleic acids). In recent years, a defense mechanism has been discovered that turns out to be conserved among eukaryotes. The system can be compared to the immune system in several ways: it has specificity against foreign elements and the ability to amplify and raise a massive response against an invading nucleic acid. The latter property is beginning to be understood at the molecular level.

RNA-Mediated Gene Silencing
Science 2002 296:1263-1265

Remarkable Properties of RNAi
• dsRNA (not ssRNA) is the interfering agent
• Sequence-specific loss of mRNA and protein
• Effective against exons, not introns
• Potent (a few dsRNA molecules per cell are effective)
• Persistent (affects the next generation)
• Effects can cross cell barriers (feed, soak)

Exogenous or endogenous dsRNA

MicroRNAs: Expanding Family of ‘RiboRegulators’
• lin-4 and let-7 RNAs (from worm) were the first examples
• Also known as stRNAs (small temporal RNAs)
• Regulate expression of proteins and developmental timing
• Tip of the iceberg... MicroRNAs are everywhere!

Science Vol. 297, Sept. 13, 2002
• RNAi and Heterochromatin – a Hushed-Up Affair. R. Allshire, 297:1818-1819
• Regulation of Heterochromatic Silencing and Histone H3 Lysine-9 Methylation by RNAi. Volpe et al., 297:1833-1837
• Small RNAs Correspond to Centromere Heterochromatic Repeats. Reinhart & Bartel, 297:1831

siRNA and Silent Chromatin - Model
• RNAs homologous to centromeric repeats are processed into siRNAs
• siRNAs may recruit the Clr4 histone H3 methylase
• This results in methylation of H3 Lys9
• Swi6 binds the chromatin
• Gene silencing

Other Silencing Small Non-Coding RNAs… (piRNAs, rasiRNAs…)
Piwi RNAs

Long non-coding RNAs
Paradigms for how long ncRNAs function. Wilusz J E et al. Genes Dev. 2009;23:1494-1504
©2009 by Cold Spring Harbor Laboratory Press

Other cell processes…

Genomic organization of the transcription of short and large ncRNAs.
A. Small non-coding RNAs (sRNAs) are transcribed from the 5′ nucleosome-depleted region (5′-NDR), i.e. PASRs (Promoter-Associated Small RNAs, brownish-red arrows), tiRNAs (transcription initiation-associated RNAs, orange arrows), TSSa-RNAs (Transcription Start Site-associated RNAs, red arrows) and unstable PROMPTs (PROMoter uPstream Transcripts; black, dotted arrows), and from the 3′-NDR, i.e. TASRs (Terminator-Associated Short RNAs, in blue). These ncRNAs are transcribed in both senses. During gene looping, RNAPII can possibly swap genomic regions (depicted with a dotted double-sense arrow) and in consequence transcribe different coding or non-coding regions.
B. Large non-coding RNAs (lncRNAs) depicted here are: lincRNAs (long intervening ncRNAs, green arrows), PALRs (Promoter-Associated Long ncRNAs, magenta), lancRNAs (long antisense non-coding RNAs/NAT, blue) and eRNAs (enhancer-associated ncRNAs, yellow). lncRNAs are transcribed in both senses from promoters, enhancers or intergenic regions.

Mammalian X inactivation
In female somatic cells, one X chromosome becomes inactive and is cytologically detected as a Barr body. The inactive X chromosome in female cells is more heavily methylated and later replicating than the active X chromosome.
Consequence: one allele is expressed in some areas of the body and the other allele is expressed in other areas of the body (cells are functionally hemizygous and all females are mosaics).
There are pseudoautosomal regions of the X chromosome that are transcriptionally active on both the active and the inactive X chromosome.

XIST: the first discovered long ncRNA (X-Inactive Specific Transcript)
- Identified in 1991 (Willard, Brockdorff)
- 17 kb spliced, noncoding RNA
- Stable expression only from the inactive X
- “Paints” the inactive X chromosome
- Required to initiate silencing
The Xist RNA has an important role in X chromosome inactivation.
Avner and Heard, Nat. Rev. Genetics 2001 2(1):59-67

Random parental X inactivation in somatic cells
[Figure: X inactivation; labels: maternal chromosome, paternal chromosome, embryo, dermal dysplasia]

Duchenne Muscular Dystrophy (DMD)
• X-linked recessive
  - progressive weakness and loss of muscle
  - symptoms by 5 yrs, in wheelchair by 11 yrs, death in late teens or early 20s
• 5-10% of carrier females have muscle weakness, a few have severe disease
• very large gene (>2.5 Mb)
  - protein is called dystrophin
  - most mutations are deletions
  - Becker Muscular Dystrophy is allelic

X-linked diseases: females can show intermediate phenotypes (e.g. color blindness)

Pseudogenes
• Two types: processed and non-processed
• 70% processed vs 30% non-processed
• ~20,000
Torrents et al. Genome Res. 2003 13: 2559-67.

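As a rough worked example of the split quoted above, the slide's figures (~20,000 pseudogenes, 70% processed) imply about 14,000 processed and 6,000 non-processed pseudogenes. The sketch below only does that arithmetic; the function name `split_pseudogenes` is hypothetical, and the inputs are the approximate slide values rather than measured counts.

```python
def split_pseudogenes(total: int, processed_percent: int) -> tuple[int, int]:
    """Split a total pseudogene count into (processed, non_processed).

    Both arguments are illustrative assumptions taken from the slide
    (~20,000 pseudogenes, 70% processed). Integer arithmetic is used so
    the two parts always add up to the total.
    """
    processed = total * processed_percent // 100
    non_processed = total - processed
    return processed, non_processed


if __name__ == "__main__":
    processed, non_processed = split_pseudogenes(20_000, 70)
    # prints: processed=14000 non_processed=6000
    print(f"processed={processed} non_processed={non_processed}")
```

Changing the two inputs shows how sensitive the absolute counts are to the estimated total, which differs between sources.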
Human pseudogenes

Non-processed pseudogenes
• Contain introns
• Arise by duplications
• Frequency of transfer depends on chromosomal context (pericentromeric fragments are transferred more often)

Processed pseudogenes
• Do not contain introns
• Arise by retrotransposition
• Frequency of transfer depends on the initial level of gene expression (highly expressed genes are transferred more often)

Pseudogene copies can be complete or partial.
Both types of pseudogenes are raw material for evolution.

NF1 gene and its pseudogenes on different chromosomes
All NF1 pseudogenes are partial; 11 of them are found in the genome.

Mechanism of processed pseudogene transfer into a new location
Could be very prolific: there are 95 functional ribosomal genes and 2090 pseudogenes.

Retropseudogenes
• Reverse-transcribed genes are (1) widely expressed, (2) highly conserved, (3) short, and (4) GC-poor.
• The human genome is estimated to contain 23,000 to 33,000 retropseudogenes.

PROMOTER ??

ncRNAs and transcription…
Other cell processes…
Triple helix formation