Research progress of piRNA evolution
Xuedong Pan
2012-3-28

What is piRNA
• Small RNAs: piRNA, siRNA, miRNA.
• piRNA: derived from repetitive genomic elements; interacts with PIWI-clade AGO family proteins.
• Guide RNA; RISC (RNA-induced silencing complex)
• Germline-specific AGO: PIWI family (PIWI, AUB, AGO3)
• piRNA, piRISC: recognize and silence complementary RNA

Biogenesis of small RNAs
• Unknown; important
• Dicer independent
• Proved; 5' U and 10 A

piRNA biogenesis
• A long precursor can generate many piRNAs; such a locus is a piRNA cluster.
• Not only Piwi: Piwi interacts with Tudor-domain proteins directly.
• Tudor-domain proteins recognize methylated arginines in Piwi, a modification that distinguishes Piwi from other AGOs.
• Piwi also needs Armi.

Putative roles for piRNA during silencing of protein-coding genes
• In Drosophila testis, if Su on ChrY does not work, Ste on ChrX is derepressed, which leads to infertility.
• Piwi expression is driven by TJ. piR from the 3' UTR of tj, together with Piwi, suppresses Fas3.
• In Drosophila testis, AT-chrX encodes piR; the piR interacts with AUB, and to a small extent AGO3, to suppress vasa.
• Failure of nos mRNA deadenylation and translational repression by piR

piRNA at different developmental stages in mouse
• Conclusion: different developmental stages have different proteins and piR.
• piR enriched in TE: binds MILI and MIWI2.
• Pachytene (crossing over happens): piR not enriched in TE; binds MIWI and MILI.
• Piwi family in mouse: MIWI2, MIWI, MILI
• TDR: Tudor-domain-containing proteins. Bind Piwi directly.

Current evolutionary studies demonstrate
• piRNA strongly represses retrotransposons; do they coevolve?
• piRNA evolves very fast among species
• piRNA loci are located in low-recombination regions
• piRNA shows a signature of selective constraint in African populations

Molecular evolution: piRNA clusters evolve very fast and are not lost
• Rapid repetitive element-mediated expansion of piRNA clusters in mammalian evolution (PNAS 2009, Assis et al., U Michigan)
1) 43% of all rodent piR clusters arose after rodent-primate divergence.
By comparison, the highest known expansion rate for olfactory receptors is 33%.
2) Olfactory receptors and miRNAs are lost at the same rate at which they are acquired. However, not a single cluster loss was observed for piRNA.

Positive selection most likely
• RE (repetitive element) usually increases deletions more than insertions.
• 60 cluster acquisitions without a single loss
• Long insertions are unlikely to be neutral
• piRNAs are involved in transposon silencing.
• So, an arms race between expanding families of mammalian TE and piRNA clusters?

piRNA clusters evolve by ectopic recombination
[Figure labels: protein-coding gene, repetitive element, piRNA cluster, protein-coding gene]
(A) Architecture of a typical cluster-harboring genomic region. The piRNA cluster lies in the intergenic region between two protein-coding genes. The inserted segment is depicted in brackets.
(B) The inserted segment is scanned against the genome to locate the source paralog.
(C) Similarity between cluster- and source paralog-harboring regions includes preceding REs.
(D) A double-stranded break leads to extension and reannealing of the broken strand.

Population dynamics of piRNA and TE (transposable element) in Drosophila
• Jian Lu and Andrew G. Clark, PNAS 2010
• Activities of a large number of retrotransposons are severely silenced by piRNAs.
1) quantified expression levels of 32 TE families
2) in ovaries of one wild-type and three piRNA mutants
• Other reports show that piRNAs silence transposons.

piRNA represses TEs at the RNA level

Life cycles of transposable elements (TE)
a: IR/DR: inverted/direct repeats bound by transposases
b: Reverse transcription in cytoplasm, integration in nucleus by integrases
   LTR: long terminal repeat
   GAG protein: forms virus-like particles
c: ORF1 and ORF2; ORF2 has a reverse transcriptase domain
   mRNAs are integrated into the genome by target-primed reverse transcription

Forward simulation of piRTs and targetRTs
• Focus on retrotransposons only
• Retrotransposons are divided into piRTs (located inside piRNA loci) and targetRTs (all remaining).
• Model: largely taken from Dolgin and Charlesworth, Genetics, 2008

Simulation cycle
• Start parameters
• For 15000 generations:
  Selection: fitness is an exponential quadratic, decreasing function of TE copy number.
  Recombination: uniform distribution of crossover positions
  Transposition and excision (Poisson process throughout genome)

Parameters
• Ne = 10e6, constant-sized
• One chromosome, 40 Mb (close to chr2, 3)
• One crossover per chromosome per generation, r = 2.5e-8
• Fitness of chr: $w = e^{-(an + bn^{2}/2)}$; n: the number of retrotransposons; a = 10e-5; b = 5e-6
• Excision rate is v; v = 0
• For targetRTs, the retrotransposition rate is u1 if piRNA is not expressed in the cell, u2 if piRNA is expressed.

Model piRNA's suppression effect
• Four scenarios:
1) piRNAs have no repression effect: u1 = u2
2) piRNAs reduce the retrotransposition rate to 10% (u2 = 0.1u1)
3) piRNAs reduce the retrotransposition rate to 1% (u2 = 0.01u1)
4) piRNAs reduce the retrotransposition rate to 0.1% (u2 = 0.001u1)

Model's other assumptions
• Retrotransposons inserted into piRNA-generating regions will be suppressed.
• No sequence divergence between paralogous copies of retrotransposons, so that one piRNA can potentially repress all retrotransposons.
• Retrotransposons located inside piRNA loci lose the ability to retrotranspose
• Ectopic recombination is not allowed

Forward simulation for 15000 generations: active piR represses retrotransposons and increases fitness
A) Number of retrotransposons
B) Fitness costs to the host
* Scenarios 1, 2, 3, 4: piRNA's repressing capability increases

Forward simulation for 15000 generations: if piR is active, piRTs increase with time
[Figure: panels Scenario IV and Scenario I; y-axis: proportion (%) of all retrotransposons that are piRTs]
Scaled parameters: Ne = 500, a = 0.001, b = 0.0005, r = 2.5e-8, u1 = 0.01, v = 0

Frequency spectra of piRT insertions
When piRNA takes effect, piRTs have a higher probability of being fixed.

Frequency spectra of targetRT insertions
targetRTs also have a higher probability of being fixed, because their deleterious effects are alleviated by piRNAs.

Frequency spectra of piRT and targetRT from published genomic data: recombination matters
A vs. B: TE longer than 500 bp
No recombination occurred
C vs. D: subset of A vs. B, recombination occurred
Combination of 2 datasets

TE and piR prefer low-recombination regions
• piR loci are enriched in pericentromeric or telomeric heterochromatin
• TEs are significantly enriched in low-recombination regions, even in euchromatin
• When recombination occurs, piRTs are deleterious because they can mediate ectopic recombination
** piRNA is also enriched in centromeric and telomeric regions. A telomeric lacZ insertion produces abundant piRNAs. Is the telomere a piRNA cluster?

Reducing the recombination rate of piRNA loci greatly reduces retrotransposons
[Figure: panels Recombination rate = 2.5e-8 and Recombination rate = 0]

Human piRNAs are under selection in Africans and repress TEs
• Sergio Lukic and Kevin Chen, MBE, 2011
• Materials: human piR sequences (Girard et al. 2006), mouse piR sequences (Lau et al. 2006), HapMap phase 3 data
• Methods: derived allele frequency spectrum; BWA tool for mapping reads

piR evolved rapidly between human and chimp
• piR region vs.
piR flanking region
• piR flanking region: 1000 bp on each side of piR
• Nucleotide substitution between human and chimp is not significantly different between the piR region and the piR flanking region.

Selective constraint in African populations
• ASW, YRI vs. CHB, CEU
• Selective constraint in Africans is consistent with a much higher rate of transposon insertions in African compared with non-African populations (Ewing A, Kazazian H, Genome Res, 2010)

Interspecies: no constraint; intraspecies: constraint in Africans, mild constraint in non-Africans
• Explanation 1: the strength of selective constraint may simply differ between these two time scales.
• Explanation 2: the interspecies substitution rate, but not the derived allele frequency distribution, is affected by mutation rate biases.

Younger TE, more piR targets
• Younger TEs are more active, so more of the piRNAs that target them remain in the current human genome; the pattern is the same for mouse.
• The pattern is the same in our data.

ORF2 of LINE1 is depleted of piRNA matches
• L1-ORF2 functions as reverse transcriptase
• Red line: number of G/C nucleotides per base on L1 mRNA
• Blue line: density of sense piRNA matches to L1
• Green line: density of antisense piRNA matches to L1

THANKS
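
Backup: a minimal Python sketch of two ingredients of the forward-simulation model, namely the chromosome fitness function and the four piRNA repression scenarios. It assumes the fitness form w = exp(-(a*n + b*n^2/2)), which is what "exponential quadratic, decreasing function of TE copy number" suggests, and it uses the scaled parameters from the slides (a = 0.001, b = 0.0005, u1 = 0.01). The function names and the scenario table are illustrative only and are not taken from the original simulation code.

```python
import math

# Scaled parameters quoted on the simulation slides.
A = 0.001    # linear selection coefficient a
B = 0.0005   # quadratic selection coefficient b
U1 = 0.01    # retrotransposition rate of targetRTs without piRNA expression

# Scenario number -> u2 / u1 (assumed encoding of the four scenarios).
SCENARIO_FACTOR = {1: 1.0, 2: 0.1, 3: 0.01, 4: 0.001}


def fitness(n, a=A, b=B):
    """Fitness of a chromosome carrying n retrotransposons.

    Assumed form: w = exp(-(a*n + b*n^2 / 2)), decreasing in n.
    """
    return math.exp(-(a * n + 0.5 * b * n * n))


def retrotransposition_rate(scenario, pirna_expressed, u1=U1):
    """Per-copy retrotransposition rate of a targetRT.

    Returns u1 when piRNA is not expressed, otherwise u2 for the scenario.
    """
    if not pirna_expressed:
        return u1
    return u1 * SCENARIO_FACTOR[scenario]


if __name__ == "__main__":
    for n in (0, 10, 50):
        print(f"n = {n:3d}  w = {fitness(n):.4f}")
    for s in sorted(SCENARIO_FACTOR):
        print(f"scenario {s}: u2 = {retrotransposition_rate(s, True):g}")
```

Under these assumptions, fitness equals 1 for a TE-free chromosome and drops faster than exponentially as copies accumulate, which is why lowering u2 (scenarios 2 to 4) can raise mean fitness in the simulations.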