Why this paper
• Causal genetic variants at loci contributing to complex phenotypes are unknown
• Rat/mouse model organisms in physiology and disease
• Relevant to our work
  – Integration of GWAS of different traits
  – Interpretation of human GWAS

Advantages of genetic mapping using heterogeneous stocks
• Accuracy of QTL mapping to Mb resolution
• WGS imputation from progenitor genomes
• Haplotypes well defined
  – Single SNP vs haplotype (spatial) association
  – Difficult in humans: large number of rare/unknown haplotypes

Design
[Figure: founder strains AJ, AKR, Balb, C3H, C57, DBA, IS, RIII; sequencing; random breeding; HS, generation > 60]
• Reconstruction of rat genomes as mosaic of founder haplotypes based on 265,551 SNPs (“sequence imputation”)

Genotypes
• 1,407 phenotyped NIH-HS animals
• 198 parents (~14.2 litter size)
• RATDIV genotyping array (13 inbred strains)
  – 803,485 SNPs
  – 560,000 segregating in NIH-HS
  – 265,551 used for haplotype reconstruction
• Sequencing of founder samples
  – Number ?
  – 22x coverage

Phenotypes
• 160 measurements

Sequencing
• 7.2M SNPs
• 633,000 indels
• 44,000 structural variants

Sequencing
• False positives
  – 2.7% SNPs
  – 2.2% indels
  – 16.7% structural variants
• False negatives
  – 17.2% SNPs
  – 41.4% indels
  – 65% structural variants

Nucleotide diversity in NIH-HS progenitors
• Similar diversity between strains
• 29% of SNPs private to a particular strain
  – Unique haplotypes relatively common
• Regions of low diversity are small (~400 kb)

Genotyping

QTL mapping (Svenson K L et al. Genetics 2012;190:437-447)
• Reconstruction of rat genomes as mosaics of founder haplotypes
  – R HAPPY
  – Mixed linear model (EMMA, normal phenotypes)
    [Equation labels: expected number of haplotypes; random effect; haplotype from strain s at locus l]
  – Resample model averaging (BAGPHENOTYPE, non-normal phenotypes)
    • Non-parametric bootstrap aggregation (bagging)

QTL mapping
Haplotype
Strain   A   B   C
------------------
y1 =     2   0   0
y2 =     0   2   0
y3 =     0   1   1

QTL results
• 355 QTLs for 122 phenotypes (avg. 2.9)

Merge analyses
Haplotype (1)
Strain   A   B   C
------------------
y1 =     2   0   0
y2 =     0   2   0
y3 =     0   1   1

Strain distribution pattern (SDP)
ABC = 0 0 1
ABC = 1 0 0

Sequence variants
Strain     A   B   C
Genotype   CC  CC  TT
--------------------
SDP        0   0   1

Merge analyses
Sequence variants (haplotype (1) columns labeled by genotype)
Strain   CC  CC  TT
-------------------
y1 =     2   0   0
y2 =     0   2   0
y3 =     0   1   1

Merge model (2)
Strain   C   T
--------------
y1 =     2   0
y2 =     2   0
y3 =     1   1

• Model (2) is a sub-model of (1)
• If the QTL is a single variant:
  – R2(2) ~ R2(1)
  – [logPmerge – logPhaplotype] > 0

Merge analyses
• 343 QTLs
  – 131 (38%) with at least 1 candidate variant
• Increased resolution
  – 90% of variants ruled out, d < 0
  – Candidates in coding regions affecting protein structure are more likely to be causal
  – Eliminates candidate genes that are distant from the candidate variant

Merge analyses (examples)
• 3 QTLs for platelet aggregation
• Candidate variant in a single gene
• Candidate variant in a coding region

Merge analysis
• Single variants rarely account for QTL effects
  – 212 (62%) QTLs had no candidate variant
• Possible reasons
  – Causative variants missed in sequencing
  – QTL mapping biased towards QTLs without candidate variants
  – Merge underestimates statistical significance
  – Multiple causal variants

Merge analysis
– Causative variants missed in sequencing
  • Simulation of all possible SDPs for di- and tri-allelic SNPs, followed by merge analysis
  • 168 (49%) would still have no causative variant
– Simulation of different QTL architectures
  • Single variants
  • Multiple variants within a gene, multiple variants at linked loci
  • Haplotype effects / no individual variants

Merge analysis
– Simulation of causal variants

Merge analysis
• Haplotype mapping overestimates the number of QTLs without a causative variant (?)
• Merge analysis underestimates the number of QTLs without a causative variant (?)
  – Multiple causative variants

Concordance between species
• 38 measures common between NIH-HS rats and mouse HS
• Orthologous genes rarely contribute to the same phenotype
• KEGG pathways for QTL-associated genes in rat and in mice: significant enrichment only for “proportion of B cells”

Discussion
• Combining sequence with mapping data can identify candidate loci
• 50% of QTLs cannot be attributed to a single causal variant
  – Multiple causal variants, more complex models required
  – Rat QTLs similar to trans-eQTLs
• Not possible to accurately assess overlap between species
  – Limited power of pathway analysis
  – Limited power from comparing phenotypes (within species?)
  – Variants in orthologous genes rarely contribute to the same phenotype
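
Appendix: merge step, sketched
A minimal sketch of the collapse shown on the "Merge analyses" slides, not the authors' implementation. It assumes the haplotype model is a matrix of founder-haplotype dosages per animal and that a variant is described by the allele each founder strain carries. The function names `merge_design` and `r_squared` are hypothetical, and the plain least-squares R2 stands in for the mixed model (EMMA) named on the slides.

```python
import numpy as np

def merge_design(hap_dosage, strain_alleles):
    """Collapse founder-haplotype dosages into allele dosages for one variant.

    hap_dosage: animals x strains matrix (haplotype model 1).
    strain_alleles: allele carried by each founder strain, e.g. ["C", "C", "T"].
    Returns the sorted allele labels and the animals x alleles matrix (merge model 2).
    """
    hap_dosage = np.asarray(hap_dosage, dtype=float)
    alleles = sorted(set(strain_alleles))
    # indicator[s, a] = 1 if founder strain s carries allele a, else 0
    indicator = np.array([[1.0 if s == a else 0.0 for a in alleles]
                          for s in strain_alleles])
    return alleles, hap_dosage @ indicator

def r_squared(design, y):
    """R^2 of an ordinary least-squares fit (assumption: no random effects)."""
    design = np.asarray(design, dtype=float)
    y = np.asarray(y, dtype=float)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

# Slide example: strains A, B, C carry CC, CC, TT
hap = [[2, 0, 0], [0, 2, 0], [0, 1, 1]]
alleles, merged = merge_design(hap, ["C", "C", "T"])
print(alleles)
print(merged)
```

With the slide's three animals this reproduces the merge model table (columns C and T). Because the merged columns are sums of haplotype columns, model (2) can never fit better than model (1); a variant stays a candidate when its fit is close to the haplotype fit.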