Aliens? Oddities? Or misunderstood?
Transposons and miRNAs

Genome sizes (haploid)

Organism            Genome size          Chromosomes
Wheat               16 Gb (hexaploid)    ~7
Human               3.3 Gb               23
Mouse               2.5 Gb               19
Dog                 2.4 Gb               38
Chicken             1.2 Gb               38 plus microchromosomes
Drosophila          ~180 Mb              4-5
C. elegans          100 Mb               5
E. coli             5.2 Mb               1
Carsonella ruddii   160 kb               1 (182 ORFs)

[Figure: Number of genes in different organisms. Bar chart, axis 0 to 25,000 genes: Human, Rice, Mouse, Arabidopsis, Chicken, C. elegans, Dog, Drosophila, E. coli]

What is a transposon?
• Contiguous piece of DNA of varying length (300 bp to 6.5 kb or so)
• Repeated with minor variations throughout the host genome
• Can replicate itself by cut and paste or copy and paste mechanisms (can move around!)
• No known function; most synthetic genome projects aim to remove them
• Structural and functional analogies to viruses
  – Much of the terminology reflects this

Barbara McClintock, 1940s
Discovered transposons and characterized their effects on their hosts
She was ostracized for her ideas but won the Nobel Prize in 1983.

Types of transposons
• Cut and paste
  – DNA transposons
• Copy and paste
  – Autonomous retrotransposons
    • ERVs *possibly active in human genome
    • L1 & relatives *active in human genome
  – Nonautonomous retrotransposons
    • SINEs (Alu) *active in human genome
    • SVA *active in human genome
      – Composite element (SINE, VNTR, Alu)
    • Processed pseudogenes

Transposons comprise ~45% of the human genome
• DNA transposons 3%
• Autonomous retrotransposons
  – ERVs (LTR retrotransposons)
  – L1 18% (500,000 copies)
  – L2 3%
  – L3 & relatives 1%
• Nonautonomous retrotransposons
  – SINEs (Alu) 15% (1 million+ copies)
  – SVA (3000 copies)
  – Processed pseudogenes (>8000)
(Simple repeats occupy almost another 10%)

“Junk DNA”?
• What do transposons do?
  – Make more of themselves
  – Move genes around
  – Serve as reservoirs of new sequence
  – Cause genetic instability (repeats stimulate translocation; L1 causes chromosome breakage)
• Can contribute to genes and gene expression
  – 5% of alternatively spliced internal human exons come from Alus
  – 80% of genes have some L1 sequence in noncoding portion
  – 1-4% of coding sequence is L1-derived
  – Act as methylation centers

Importance in genomics
• Transposons are a source of human variability
  – Roughly 5% of people have a transposon not found in either parent (not due to nonpaternity!)
  – Overall polymorphism variable but remarkable (40-50% of youngest elements are polymorphic)
• Transposons can be useful in medicine
  – Occasionally cause disease (de novo insertion in the factor VIII clotting gene led to L1 discovery in the 1980s)
  – May often be linked to disease loci

Importance in genomics (continued)
• Transposons in introns may disrupt gene expression
  – Mechanism depends on whether they are on the sense or antisense strand
  – (+) strand orientation: transcription stalling
  – (-) strand orientation: premature polyadenylation, gene splitting

Importance in genomics (continued)
• Can have huge effects, through chromosomal translocation, inversion, breakage

Transposon domestication
• Overly active transposons will kill a cell (and then the organism)
• Transposons have tempered their activity
  – active almost exclusively in germ line
  – also in cancer cells and neuronal cell precursors

Transposon domestication (continued)
• Host cells use many mechanisms to control transposons
  – Methylation (original role?)
  – miRNA defense
  – Sequestered in stress granules
  – Nucleic acid editing
    • APOBEC family of proteins edits cytosines to uracils
    • ADARs edit dsRNA adenosine to inosine

What to do with transposons?
• Study them
• Work around them (be aware)
  – RepeatMasker (Smit & Jurka)
  – Problem: each element is at least in part unique, and RepeatMasker will mask that too

Another old element, new to science: microRNAs

RNA world hypothesis: the first “organism” was a strand of RNA that could somehow replicate itself. Eventually RNA used DNA as a more stable storage for genetic material.
1982: Tom Cech reported self-splicing RNAs

microRNA
• 21-25 nucleotide small RNAs
• Discovered in a C. elegans screen
• Alter gene expression at the posttranscriptional level (precise mechanism unknown)
• Tend to be high-level regulators (>100 targets each)
• Percentage of human genes under miRNA control is unknown but possibly 20-30%
• Often specific to a developmental stage or cell state

miRNA
Two mechanisms:
• Perfect match to target leads to mRNA cleavage
• Imperfect match leads to translational repression
Neither is well understood, but both likely involve the dsRNA recognition system

Another role?
• Under conditions of cell stress, a miRNA may act as an activator instead, as responding regulatory proteins interpret the signal differently

Seems odd . . .
• Why would a cell use this sort of mechanism? It’s making an mRNA and then degrading it. Should be easier to just not make it . . .
• But what if the cell is not in control of that RNA, for example if it’s coming from an invasive nucleic acid species under its own promoter?
  – Transposon control!!!
  – piRNA (piwi RNA) are a whole class of small RNAs that control transposons
  – Invasive RNA was a big problem in the RNA world!

Occam’s razor
All other things being equal, the simplest solution is the best
My alternative: If a biological principle is simple, it’s probably wrong. Evolution tends to higher complexity, as old mechanisms are reused and there’s little incentive to clean up.
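
As a rough illustration of the two miRNA mechanisms listed earlier (perfect match leads to cleavage, imperfect match leads to translational repression), here is a toy Python sketch. It is not how any real prediction tool works: the function names, the example sequence, and the `max_mismatches` cutoff are all assumptions made up for illustration, and it ignores bulges, G:U wobble pairs, and noncontiguous binding.

```python
# Toy model only: classify a candidate mRNA site by how well it pairs
# with a miRNA. Names and the mismatch cutoff are hypothetical.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def count_mismatches(mirna, site):
    """Count positions where the site fails to pair with the miRNA.

    Both strings are written 5'->3' and must have the same length.
    A perfectly paired site equals the reverse complement of the miRNA.
    """
    if len(mirna) != len(site):
        raise ValueError("miRNA and site must have the same length")
    expected = reverse_complement(mirna)
    return sum(1 for a, b in zip(expected, site) if a != b)


def predicted_outcome(mirna, site, max_mismatches=6):
    """Map pairing quality onto the two mechanisms from the slide.

    max_mismatches is an arbitrary, assumed cutoff for "still binds".
    """
    mismatches = count_mismatches(mirna, site)
    if mismatches == 0:
        return "cleavage"
    if mismatches <= max_mismatches:
        return "translational repression"
    return "no effect"
```

Calling `predicted_outcome(m, reverse_complement(m))` for any miRNA string `m` returns `"cleavage"`; changing a single base in the site moves it to `"translational repression"`.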

Looking for new miRNAs
• Often found within stem-loop precursor structures (hairpins)
• Associated (in the cell) with polysomes and other structures
• Bioinformatics: unexpected sequence conservation in a noncoding region, or homology to a miRNA in a closely related species (works less often than you would think)
• Identify candidate miRNA targets (TargetScan, by Chris Burge’s group)
  – A target mRNA usually has multiple target sites

Problems with miRNAs
• Small! Unstable, hard to get large quantities
• Binding is degenerate, noncontiguous, and includes not only mismatches but bulges
• Actual sequence recognition is only 15 or so nucleotides (noncontiguous) and varies by target
• Essential “seed” element not well characterized
• Sequences not well conserved across species
• miRNA microarrays: statistics problematic
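
To make the idea of scanning for candidate target sites concrete, here is a minimal Python sketch. It assumes the seed is miRNA nucleotides 2-8, a common convention in target prediction, although as noted above the seed is not well characterized. It only finds exact seed matches, so it misses the degenerate, bulged binding described in the problems list. It is not TargetScan's algorithm; `seed_site`, `find_seed_sites`, and the example sequences are hypothetical names and data for illustration.

```python
# Minimal sketch: find exact seed-match sites in an mRNA 3' UTR.
# Assumption: the "seed" is miRNA positions 2-8 (1-based, inclusive).

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed_site(mirna, start=2, end=8):
    """Return the mRNA sequence (5'->3') that pairs with the miRNA seed."""
    seed = mirna[start - 1:end]
    return reverse_complement(seed)


def find_seed_sites(mirna, utr):
    """Return 0-based start positions of every exact seed match in utr.

    Overlapping matches are reported, since the search restarts one
    base after each hit.
    """
    site = seed_site(mirna)
    positions = []
    index = utr.find(site)
    while index != -1:
        positions.append(index)
        index = utr.find(site, index + 1)
    return positions


# Example with made-up sequences: a UTR carrying two seed matches.
example_mirna = "UGAGGUAGUAGGUUGUAUAGU"
example_utr = "AAACUACCUCGGGCUACCUCUU"
example_hits = find_seed_sites(example_mirna, example_utr)
```

Because a target usually carries multiple sites, the length of the returned list is one crude way to rank candidates; real tools also weigh conservation and site context.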