Detection and Characterization of Gene Conversion in Mouse and Human Recombination Hotspots
Yves Clément
IMPRS Colloquium, 25/11/2011

Why is meiotic recombination important? (1)
[Figure: 1 diploid cell, recombination, 4 haploid cells (sperm, egg)]

Why is meiotic recombination important? (2)
[Figure: haplotypes A B and a b; recombinant haplotypes A b and a B]

Meiotic recombination
[Figure: paternal and maternal chromosomes with 5’ and 3’ strands; DNA repair and synthesis; gene conversion]
[Figure labels: double-strand break; 5’ to 3’ resection; synthesis of new DNA; invasion of DNA with sequence similarity; Holliday junction formation]
[Duret & Galtier 2009]

Questions
• What is the length distribution of gene conversion tracts?
• Can we detect other characteristics of meiotic recombination?

GC-Biased Gene Conversion (gBGC)
[Figure: G/T mismatch at a site carrying A:T (W) and G:C (S) alleles; biased mismatch repair; fixation bias]
[Meunier & Duret, 2004] [Duret & Arndt, 2008]

[Smagulova et al, 2011]

[Figure: DSB hotspot, Mus m. musculus and Mus m. castaneus]
[Smagulova et al, 2011]

Methods
[Figure: DSB “hotspots”, their middle point and a 100 bp scale, in Mus m. musculus, Mus m. castaneus and Mus spretus]
[Smagulova et al, 2011] [Keane et al, 2011]

Inferring substitution rates
• Estimation of nucleotide substitution frequencies in lineages from multiple alignments
• On each branch:
  – 12 X→Y substitutions
  – 2 CpG methylation deamination processes: CpG → CpA / TpG
  – GC* (equilibrium or “future” GC-content)
[Figure: tree with branches M1, M2, M3 and M4 relating Mus m. musculus, Mus m. castaneus and Mus spretus, with example nucleotides C and T]
[Duret & Arndt, 2008]

[Figure annotation: ≈ 1.5 kbp]

Control (1)
[Figure: DSB “coldspots” and DSB “hotspots”, Mus m. musculus]

Control (2)
[Figure: DSB “hotspots”, Mus m. musculus and Mus m. castaneus]

Gene conversion (1)
• Centered on DSB hotspot middle points.
• Affects a region of approximately 1.5 kbp
• Gene conversion tracts have variable length

Base composition skews
[Smagulova et al, 2011]

Strand specific mutations
[Figure: base pairs on complementary 5’/3’ strands illustrating strand-specific mutations; GA ≠ CT]

Gene conversion (2)
• No evidence of strand specific mutations caused by meiotic recombination.

[Figure: DSB hotspots (✔, ?) and recombination hotspots (✔); genome markers; PRDM9 motif CCNCCNTNNCCNC; PRDM9 middle points]

Gene conversion in human
• Gene conversion centered around PRDM9 binding sites:
  – DSBs occur in very close proximity to PRDM9 binding sites.
• PRDM9 binding sites directly affected by gene conversion
  – “hotspot paradox”

Acknowledgements
• Peter F Arndt
• Brian Cusack
• Barbara Wilhelm
• Laurent Duret
• Martin Vingron

Thank you for your attention!

Recombination hotspots
[Figure: recombination rate along the genome between markers; hotspot: recombination >= 10 cM/Mb; PRDM9 motif CCNCCNTNNCCNC on both 5’/3’ strands; 100 bp scale centered on 0]

Substitution patterns…

Inferring substitution rates
• Estimation of nucleotide substitution frequencies in lineages from multiple alignments
• On each branch:
  – 12 X→Y substitutions
  – 2 CpG methylation deamination processes: CpG → CpA / TpG
  – GC* (equilibrium or “future” GC-content)
[Figure: tree with branches M1, M2, M3 and M4 relating human, chimp and gorilla, with example nucleotides C and T]
[Duret & Arndt, 2008]

[Figure annotation: ≈ 1.5 kbp]

Gene conversion tract regulation?
[Figure: GC* around middle points under loose regulation and under tight regulation]
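
Sketch: GC* from substitution rates
The “Inferring substitution rates” slides name GC*, the equilibrium or “future” GC-content. As a minimal sketch, assuming a simplified two-state model in which every site is either W (A/T) or S (G/C) and the CpG deamination processes are ignored, GC* is the W→S rate divided by the sum of the W→S and S→W rates. This is not the full model of [Duret & Arndt, 2008]; the function names, the window layout and the rate values below are hypothetical and serve only as illustration.

```python
from typing import List, Tuple


def equilibrium_gc(rate_w_to_s: float, rate_s_to_w: float) -> float:
    """Equilibrium GC-content (GC*) under a two-state W/S model.

    Assumption: sites are only W (A/T) or S (G/C); CpG effects are ignored.
    """
    if rate_w_to_s < 0 or rate_s_to_w < 0:
        raise ValueError("rates must be non-negative")
    total = rate_w_to_s + rate_s_to_w
    if total == 0:
        raise ValueError("at least one rate must be positive")
    return rate_w_to_s / total


def gc_star_profile(
    windows: List[Tuple[int, float, float]]
) -> List[Tuple[int, float]]:
    """Compute GC* per window.

    Each input tuple is (offset from hotspot middle point in bp,
    W->S rate, S->W rate).
    """
    return [(offset, equilibrium_gc(u, v)) for offset, u, v in windows]


# Hypothetical rates, chosen for illustration only (not measured data).
example_windows = [(-200, 1.0, 1.0), (0, 3.0, 1.0), (200, 1.0, 1.0)]
profile = gc_star_profile(example_windows)

for offset, gc_star in profile:
    print(f"{offset:>5} bp  GC* = {gc_star:.2f}")
```

With rates estimated per window around the hotspot middle points, a profile like this is one way to look for a local rise in GC* of the kind expected under gBGC.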