Banff.Zhao

Statistical Methods to Prioritize
GWAS Results by Integrating
Pleiotropy and Annotation
Hongyu Zhao
Yale School of Public Health
June 25, 2014
Joint work with Min Chen, Lin Hou, Tianzhou Ma, Can Yang,
Dong-Jun Chung, Cong Li, Judy Cho, Joel Gelernter
What we have learned from GWAS
• Genes/Variants associated with phenotypes
• Genetic risk prediction
• Genetic architecture
Crohn’s Disease
jewish
pathway_name: IL22 Soluble Receptor Signaling Pathway pathway (same value in all 53 rows)

gene_symbol   pvalue
IL23R         0.002297
SOCS1         0.010415
IL2RA         0.017337
PRLR          0.019376
STAT2         0.033827
TYK2          0.052902
IL10RB        0.060543
CNTFR         0.068332
IL12RB2       0.072698
IL20RA        0.077203
IFNAR2        0.085782
IL22          0.10299
IL22RA2       0.113906
IL6ST         0.124483
IL21R         0.125142
IL6R          0.125529
SOCS2         0.131336
IL13RA2       0.142406
IL7R          0.146245
JAK2          0.166414
IL11RA        0.16868
GHR           0.191144
CSF3R         0.191723
IFNGR2        0.208994
IL12RB1       0.267659
IL28RA        0.294141
JAK1          0.317088
STAT6         0.349177
LEPR          0.391859
IFNAR1        0.392715
IL15RA        0.414013
SOCS6         0.442633
SOCS3         0.444405
IL22RA1       0.469906
STAT1         0.503734
STAT4         0.504923
EPOR          0.553102
SOCS4         0.556056
IL2RB         0.61677
STAT5A        0.661919
IL2RG         0.672769
IFNGR1        0.676117
JAK3          0.702464
IL4R          0.746998
STAT3         0.780401
IL5RA         0.78238
LIFR          0.803115
SOCS5         0.807055
CSF2RB        0.903223
STAT5B        0.906422
IL10RA        0.924236
OSMR          0.928906
IL13RA1       0.973552
Network-Based Analysis
• Start from a known interaction/co-expression network [N:
assumed to be known]
• Each gene is either associated or not associated with a
phenotype [D: unknown]
• Each gene has observed statistical evidence for
association [Z: observed]
• Goal: Infer D conditional on N and Z
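The inference goal above can be illustrated on a toy network. The following is a minimal sketch, not the model from the cited paper: it assumes an Ising-type prior on D given N and normal distributions for Z given D, and the parameters h, tau and mu are hypothetical values chosen for illustration. With only four genes, P(D_i = 1 | N, Z) can be computed exactly by enumerating all labelings.

```python
import itertools
import math


def normal_pdf(x, mean, sd=1.0):
    """Density of a normal distribution with the given mean and sd."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def posterior_association(edges, z, h=-1.0, tau=1.0, mu=2.5):
    """Exact P(D_i = 1 | N, Z) by enumerating all 2^n labelings (tiny n only)."""
    n = len(z)
    weights = {}
    for d in itertools.product([0, 1], repeat=n):
        # Assumed Ising-type prior: h penalises associated genes,
        # tau rewards neighbouring genes that share the same label.
        log_prior = h * sum(d) + tau * sum(d[i] == d[j] for i, j in edges)
        # Assumed likelihood: Z ~ N(0, 1) if not associated, N(mu, 1) if associated.
        log_lik = sum(
            math.log(normal_pdf(z[i], mu if d[i] else 0.0)) for i in range(n)
        )
        weights[d] = math.exp(log_prior + log_lik)
    total = sum(weights.values())
    return [sum(w for d, w in weights.items() if d[i]) / total for i in range(n)]


# Toy network N: gene 0 - gene 1 - gene 2, with gene 3 isolated.
edges = [(0, 1), (1, 2)]
# Genes 1 and 3 carry identical, modest evidence; genes 0 and 2 carry strong evidence.
z = [3.0, 1.2, 2.8, 1.2]
post = posterior_association(edges, z)
```

In this toy setting gene 1 receives a higher posterior probability than gene 3 even though both have the same Z, because its neighbours are likely to be associated. Enumeration is only feasible for very small networks; realistic networks need approximate inference.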
Chen, Cho, Zhao (2011) PLoS Genetics
Application to CD GWAS
Chen, Cho, Zhao (2011) PLoS Genetics
Co-Expression Networks
Zhou et al. (2002) PNAS
Guilt by Rewiring: Motivation
• Gene networks are different between healthy controls and
diseased individuals.
• The differences are as important or even more important
than their commonalities.
[Figure: gene networks over genes A, B, C, D in Control and in Disease, and the derived rewiring network]
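One simple way to make the rewiring idea concrete is to score each gene pair by how much its co-expression changes between conditions. The sketch below is an assumption for illustration, not the procedure of the cited paper: it uses the absolute difference in Pearson correlation, and the simulated genes A to D and their edges are hypothetical.

```python
import numpy as np


def rewiring_scores(expr_control, expr_disease):
    """Absolute change in pairwise gene-gene correlation between two conditions.

    Each input is a (samples x genes) array with genes in the same column order.
    """
    corr_control = np.corrcoef(expr_control, rowvar=False)
    corr_disease = np.corrcoef(expr_disease, rowvar=False)
    scores = np.abs(corr_disease - corr_control)
    np.fill_diagonal(scores, 0.0)  # a gene is not rewired with itself
    return scores


rng = np.random.default_rng(0)
n = 200

# Simulated controls: A and B are co-expressed; C and D are independent.
a = rng.normal(size=n)
control = np.column_stack(
    [a, a + 0.1 * rng.normal(size=n), rng.normal(size=n), rng.normal(size=n)]
)

# Simulated cases: the A-B edge is lost and a C-D edge is gained.
c = rng.normal(size=n)
disease = np.column_stack(
    [rng.normal(size=n), rng.normal(size=n), c, c + 0.1 * rng.normal(size=n)]
)

scores = rewiring_scores(control, disease)  # 4 x 4 matrix, order A, B, C, D
```

Pairs whose relationship changes (A-B and C-D here) get large scores, while pairs that are unrelated in both conditions stay near zero.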
Hou et al. (2014) Human Molecular Genetics
MRF model leads to better replication
rates between independent studies
• Negative control:
– Non-specific microarray dataset (brown line, left figure)
Hou et al. (2014) Human Molecular Genetics
Signal enrichments in DHS sites
Hou, Ma, Zhao (2014)
Better replication rates at DHS sites
Hou, Ma, Zhao (2014)
Weighted scheme to integrate DHS site
information to prioritize SNPs
http://dongjunchung.github.io/GPA/
GPA formulation
GPA: Single GWAS
Chung et al. (2014) PLoS Genetics, under revision
GPA: Modeling Pleiotropy
GPA: Modeling Annotation Data
Modeling Pleiotropy and Annotation
Key Assumptions for GPA
Simulations
Comparisons with conditional FDR approach
GPA: Enrichment Testing
• Pleiotropy & enrichment for annotation can be
checked conveniently using the hypothesis
testing procedure incorporated into the GPA
framework.
• Null hypothesis for pleiotropy:
H0: ( π10 + π11 ) ( π01 + π11 ) = π11

G1/G2     Null    Assoc.
Null      π00     π01
Assoc.    π10     π11

• Hypothesis testing for annotation enrichment:
H0: q0 = q1
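The pleiotropy null states that association with the two phenotypes is independent, so π11 equals the product of the two marginal association proportions. The sketch below only evaluates that identity for a given set of proportions; it is not the GPA testing procedure itself, and the function name and example values are hypothetical.

```python
def pleiotropy_gap(pi00, pi01, pi10, pi11):
    """Return pi11 minus the product of the marginal association proportions.

    The gap is zero under H0 (independence) and positive when joint
    association is more common than independence would predict.
    """
    total = pi00 + pi01 + pi10 + pi11
    if abs(total - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")
    marginal_g1 = pi10 + pi11  # associated with the first phenotype
    marginal_g2 = pi01 + pi11  # associated with the second phenotype
    return pi11 - marginal_g1 * marginal_g2


# Marginals 0.10 and 0.20 with pi11 = 0.02: consistent with H0.
independent = pleiotropy_gap(0.72, 0.18, 0.08, 0.02)

# Marginals 0.15 and 0.15 with pi11 = 0.10: far above the 0.0225 expected under H0.
pleiotropic = pleiotropy_gap(0.80, 0.05, 0.05, 0.10)
```

In practice the proportions are estimated from data, so a formal test is needed to decide whether a nonzero gap is more than sampling noise.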
GPA: Hypothesis Testing
Comparisons with GSEA
Five Psychiatric Disorders
• Five psychiatric disorders:
– ADHD
– Autism spectrum disorder
– Bipolar disorder
– Major depressive disorder
– Schizophrenia
• Strong pleiotropy exists
for BIP-SCZ, MDD-SCZ,
ASD-SCZ, & BIP-MDD.
Five Psychiatric Disorders
BIP: separate analysis
BIP: joint analysis
Five Psychiatric Disorders
SCZ: separate analysis
SCZ: joint analysis
Comparisons with Linear Mixed Models
• Integration of bladder
cancer GWAS data with
ENCODE DNase-seq
data from 125 cell lines.
• Annotations from 11 cell lines are
significantly enriched at α = 0.01
after Bonferroni correction.
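As a worked illustration of the correction, assuming one enrichment test per cell line (125 tests in total), the per-test threshold is 0.01 / 125 = 8e-5. The p-values in this sketch are hypothetical placeholders, not results from the analysis.

```python
def bonferroni_significant(pvalues, alpha=0.01):
    """Flag p-values that pass a Bonferroni-corrected threshold."""
    threshold = alpha / len(pvalues)
    return [p <= threshold for p in pvalues], threshold


# Hypothetical p-values for 125 tests: two pass the corrected threshold, one narrowly misses.
pvals = [1e-6, 5e-5, 1e-4] + [0.5] * 122
flags, threshold = bonferroni_significant(pvals)
```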
Acknowledgements
Medicine: Judy Cho (Mount Sinai)
Psychiatry: Joel Gelernter
Yale Center for Statistical Genomics and Proteomics:
Min Chen (UT Dallas), Lin Hou, Tianzhou Ma (U.
Pittsburgh), Can Yang (HKBU), Dong-Jun Chung
(MUSC), Cong Li
Various NIH and NSF grants