In search of susceptibility genes for psychiatric illness Andrea Christoforou Medical Genetics Section

In search of susceptibility genes for
psychiatric illness
Andrea Christoforou
Medical Genetics Section
Molecular Medicine Centre
MVM Research Symposium: November 3, 2008
Introduction
Two devastating psychiatric illnesses:
Bipolar Disorder (BP)
→ Mania
→ Depression
Schizophrenia (SCZ)
→ Positive (delusions, hallucinations)
→ Negative (lack of affect)
→ Cognitive (speech poverty, disorganised thought)
Each affects 1% of the general population
Complex Genetics
Adapted from Chpts 9&10 of Psychiatric Genetics and Genomics
[Figure: % concordance (0 to 50) for SCZ and BPAD by class of relative: general population, first cousins, uncles/aunts, nephews/nieces, grandchildren, half-siblings, parents, siblings, children, DZ twins, MZ twins]
Complex pattern of inheritance
Difficult to identify causative genetic factors
Gender
Environment
Variable age at onset
Genetic heterogeneity
Poly/oligo-genic inheritance
Unclear diagnostic boundaries (BP, SCZ)
Single LARGE Family with BP
[Figure: F22, Chr 4; LODs 4.4, 0.88, 1.97, 3.24]
A single LARGE family
→ Reduce genetic heterogeneity
Linkage analysis
(e.g. MERLIN, SUPERLINK, …)
→ identify large segments of DNA
that segregate with a particular
phenotype/illness within a family
→ genome-wide
→ no a priori information
→ Significant LOD score = gene of
major effect
Blackwood et al., 1996
Le Hellard et al., 2007
Houlihan, 2008
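To make the LOD score on this slide concrete, here is a minimal sketch of a two-point LOD calculation for phase-known meioses. It is a textbook simplification, not what MERLIN or SUPERLINK do on a full pedigree, and the function names and the example counts (1 recombinant in 10 meioses) are illustrative assumptions rather than F22 data.

```python
import math


def two_point_lod(recombinants, meioses, theta):
    """LOD at recombination fraction theta versus free recombination (0.5).

    Assumes fully informative, phase-known meioses (a simplification).
    """
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must be between 0 and meioses")
    non_recombinants = meioses - recombinants
    log_l_theta = (recombinants * math.log10(theta)
                   + non_recombinants * math.log10(1.0 - theta))
    log_l_null = meioses * math.log10(0.5)
    return log_l_theta - log_l_null


def max_lod(recombinants, meioses, steps=500):
    """Grid search over theta in (0, 0.5] for the highest LOD."""
    thetas = [k * 0.5 / steps for k in range(1, steps + 1)]
    best = max(thetas, key=lambda t: two_point_lod(recombinants, meioses, t))
    return best, two_point_lod(recombinants, meioses, best)


# Hypothetical example: 1 recombinant observed in 10 informative meioses.
theta_hat, lod = max_lod(1, 10)
print(f"theta = {theta_hat:.3f}, LOD = {lod:.2f}")
```

The LOD is the base-10 log of the likelihood ratio between linkage at the estimated recombination fraction and no linkage, so larger families with more informative meioses can reach higher scores.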
Association Analysis of the Chromosome 4p15-p16
Candidate Region for Bipolar Disorder and Schizophrenia.
[Figure: chromosome 4 linkage region (LODs 4.4, 0.88, 1.97, 3.24) divided into Region A, Region B, Region C and Region D]
Refining the locus
“First linkage, then association”
✓ In unrelated cases and controls
✓ More powerful
✓ Better resolution
(Risch & Merikangas, Science 1996)
Hypothesis: Gene X
Prioritised: 8.3Mb of 20Mb F22 linkage region
Analysis is underway for remaining region
Outline of Methods
Marker Selection
Genotyping
Data processing
Data Analysis
Replication
SNP Selection
Catalogue of common genetic variants
(single nucleotide polymorphisms, SNPs)
in humans
http://www.hapmap.org/
→ Downloaded all available SNP genotype data for Regions B and D
→ CEU sample data
→ Selected based on linkage disequilibrium (LD) ~ correlation between SNPs
→ Using Haploview
→ SNPs selected to tag haplotypes of SNPs → htSNPs
UNDERLYING THEORY
Haplotypes: G T C A A A G T / C T C C A A A T
htSNPs: G A G / C C A
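The idea of LD as correlation between SNPs can be illustrated with a small r² calculation on phased haplotypes. This is only a sketch under stated assumptions: Haploview works from genotype data and a real reference panel, whereas the function below (a hypothetical helper, `ld_r2`) takes a toy list of phased haplotype strings, here the two example haplotypes from the slide.

```python
def ld_r2(haplotypes, i, j):
    """Squared correlation (r^2) between biallelic SNPs at positions i and j.

    haplotypes: list of equal-length strings, one character per SNP allele.
    The allele carried by the first haplotype is used as the reference.
    """
    n = len(haplotypes)
    ref_i = haplotypes[0][i]
    ref_j = haplotypes[0][j]
    p_a = sum(h[i] == ref_i for h in haplotypes) / n
    p_b = sum(h[j] == ref_j for h in haplotypes) / n
    p_ab = sum(h[i] == ref_i and h[j] == ref_j for h in haplotypes) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic in the sample")
    d = p_ab - p_a * p_b  # classical disequilibrium coefficient D
    return d * d / denom


# Toy data: the two example haplotypes shown on the slide.
slide_haplotypes = ["GTCAAAGT", "CTCCAAAT"]
# Positions 0, 3 and 6 are the only sites where the two haplotypes differ.
print(ld_r2(slide_haplotypes, 0, 3))
print(ld_r2(slide_haplotypes, 0, 6))
```

SNPs in perfect LD (r² = 1) carry the same information, which is why one SNP of such a group can be genotyped as a tag for the others.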
Summary of SNPs Genotyped
Chromosome 4    htSNPs
Region B        149
Region D        259
Total           408
Samples: 368 BP, 386 SCZ, 458 Controls, from the Scottish population
Methods
→ Marker Selection (HapMap)
→ Genotyping (Illumina BeadArray)
→ Data processing & QC (~700,000 genotypes)
→ Data Analysis
[Figure: SNP1 to SNP4 in cases and controls; allele counts (A, T) compared with a χ² test on 1 df, genotype counts (AA, AT, TT) with a χ² test on 2 df]
Single marker
→ χ² test
→ Allele and genotype P-values to determine whether cases and controls differ
Haplotype (Cocaphase v2.4)
→ Sliding windows: 2-5 SNPs
→ Global and individual (EM)
→ Likelihood ratio test (LRT)
By Diagnosis and Gender
→ BP, SCZ and All cases
→ Male, Female and Both
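The single-marker allele test can be sketched as a Pearson χ² test on a 2×2 table of allele counts. The counts below are invented for illustration, not study data, and the function name is an assumption. For 1 degree of freedom the P-value can be obtained from the standard library alone, since the χ² survival function equals erfc(√(χ²/2)).

```python
import math


def allelic_chi2(case_a, case_t, control_a, control_t):
    """Pearson chi-square (1 df, no continuity correction) on allele counts.

    Returns (chi2, p_value).
    """
    n = case_a + case_t + control_a + control_t
    row_case = case_a + case_t
    row_control = control_a + control_t
    col_a = case_a + control_a
    col_t = case_t + control_t
    denom = row_case * row_control * col_a * col_t
    if denom == 0:
        raise ValueError("every row and column total must be non-zero")
    chi2 = n * (case_a * control_t - case_t * control_a) ** 2 / denom
    # Survival function of chi-square with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical allele counts: cases 60 A / 40 T, controls 40 A / 60 T.
chi2, p = allelic_chi2(60, 40, 40, 60)
print(f"chi2 = {chi2:.2f}, P = {p:.4f}")
```

A genotype test works the same way on a 2×3 table (AA, AT, TT) with 2 degrees of freedom, which needs a different P-value formula than the 1 df shortcut used here.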
Single-marker analysis
[Figure: distribution of allele P-values (-log scale) by position along chromosome 4, Region B and Region D; ○ All, ■ SCZ, ● BP; reference lines at 0.05 and 0.01]
What is an appropriate significance threshold?
→ P ≤ 0.05?
→ P ≤ 0.01?
How do we account for multiple testing?
Bonferroni: ~ 0.05/# tests
→ inappropriate: too conservative when SNPs are correlated through LD
Permutation: shuffle case/control status
→ GOLD STANDARD
→ BUT computational and time constraints due to software
Single-marker analysis
[Figure: distribution of allele P-values (-log scale) by position along chromosome 4, Region B and Region D; ○ All, ■ SCZ, ● BP; reference lines at 0.05, 0.01 and the corrected thresholds 0.0005 (Region B) and 0.0003 (Region D)]
SNPSpD (D. Nyholt)
→ Meff (effective number of independent tests)
“Nyholt-corrected” significance thresholds
→ Region B: 149 SNPs → 108 Meff, P ≤ 0.0005
→ Region D: 259 SNPs → 191 Meff, P ≤ 0.0003
✓ Supported by permutation correction
FOR HAPLOTYPE ANALYSIS
→ Permuted with Cocaphase 2.4 (1000 permutations): haplotypes with global P ≤ threshold and ≤ 3 SNPs
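The arithmetic behind the corrected thresholds can be checked with a few lines. The slide does not state exactly how the thresholds were derived from Meff, so this sketch assumes the simplest form, α divided by the number of tests with α = 0.05, and compares a plain Bonferroni correction (all SNPs) with the Meff-based one.

```python
def corrected_threshold(alpha, n_tests):
    """Per-test significance threshold for n_tests (assumed alpha / n_tests)."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# (number of htSNPs genotyped, effective number of tests Meff) from the slide.
regions = {"Region B": (149, 108), "Region D": (259, 191)}

for name, (n_snps, m_eff) in regions.items():
    bonferroni = corrected_threshold(0.05, n_snps)
    nyholt = corrected_threshold(0.05, m_eff)
    print(f"{name}: Bonferroni {bonferroni:.6f}, Meff-based {nyholt:.6f}")
```

Under this assumption the Meff-based values round to the 0.0005 and 0.0003 quoted for Regions B and D, and both are less stringent than the corresponding Bonferroni thresholds because correlated SNPs are not counted as fully independent tests.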
Haplotype Analysis
Region B: P ≤ 0.0005
[Figure: P-value (-log scale) against Region B project SNP number (0 to 140); global and individual haplotypes for BP M and BP F; highlighted intervals B-1 and B-4 (577kb, 239kb)]
Most of these haplotypes survived the permutation correction.
Haplotype Analysis
Region D: P ≤ 0.0003
[Figure: P-value (-log scale) against Region D project SNP number (0 to 250); global and individual haplotypes for BP F, BP MF, SCZ M, SCZ MF and All F; highlighted intervals D-2 and D-7 (17kb, 25kb)]
Most of these haplotypes survived the permutation correction.
REPLICATION IS ESSENTIAL!
→ SCOTTISH Sample
→ GERMAN Sample
Permutation analysis
General Principle
→ Disrupt the relationship being tested (e.g. frequencies in cases vs controls) multiple times to create an empirical null distribution, then see where the actual result falls within it.
“Gold standard” for multiple-testing correction
→ no assumptions
→ corrects for hidden correlation
BUT time consuming
→ in Cocaphase, run serially: Perm1 -> Perm2 -> Perm3 -> …
→ because of this, we only permuted
→ SNPs with P≤Nyholt threshold
→ Haplotypes with global P≤Nyholt threshold and ≤3 SNPs in size
Solution
→ parallelize Cocaphase so that permutations are run in parallel
→ problem tackled by student Omer Jilani, MSc in Computer Science
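The permutation principle, and why it parallelizes well, can be sketched in a few functions. This is not Cocaphase and not the grid implementation described on the slide: the statistic (absolute allele-frequency difference), the function names, the batch layout and the toy genotypes are all assumptions chosen for illustration. Each batch uses its own seeded random generator, so batches are independent and could equally be submitted as separate jobs.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def allele_freq_diff(genotypes, is_case):
    """Absolute case/control difference in test-allele frequency.

    genotypes: copies of the test allele (0, 1 or 2) per individual.
    """
    case = [g for g, c in zip(genotypes, is_case) if c]
    ctrl = [g for g, c in zip(genotypes, is_case) if not c]
    return abs(sum(case) / (2 * len(case)) - sum(ctrl) / (2 * len(ctrl)))


def permutation_batch(genotypes, is_case, observed, n_perm, seed):
    """Count permutations whose statistic is at least the observed one."""
    rng = random.Random(seed)  # private generator: batches stay independent
    labels = list(is_case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # break the genotype/status relationship
        if allele_freq_diff(genotypes, labels) >= observed:
            hits += 1
    return hits


def permutation_p_value(genotypes, is_case, n_perm=1000, n_batches=4, seed=0):
    """Empirical P-value from n_perm shuffles split into independent batches."""
    observed = allele_freq_diff(genotypes, is_case)
    per_batch = n_perm // n_batches
    seeds = [seed + b for b in range(n_batches)]
    # Threads only illustrate the structure; CPU-bound work in CPython would
    # need separate processes or grid jobs to gain real speed.
    with ThreadPoolExecutor(max_workers=n_batches) as pool:
        hits = list(pool.map(
            lambda s: permutation_batch(genotypes, is_case, observed,
                                        per_batch, s),
            seeds))
    total = per_batch * n_batches
    return (sum(hits) + 1) / (total + 1)  # add-one avoids reporting P = 0


# Toy data: 10 cases homozygous for the test allele, 10 controls without it.
toy_genotypes = [2] * 10 + [0] * 10
toy_status = [True] * 10 + [False] * 10
print(permutation_p_value(toy_genotypes, toy_status))
```

Because the batches share nothing except the input data and the observed statistic, the total run time falls roughly in proportion to the number of workers, which is the motivation for moving from a serial Perm1 → Perm2 → Perm3 chain to parallel execution.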
Cocaphase on the Grid
(ECDF)
5-SNP haplotypes
259 SNPs
Full dataset
Figures provided by Omer Jilani
A further approach…
[Figure: F22; LODs 4.4, 0.88, 1.97, 3.24]
Linkage Analysis
Association Analysis
Microarray Analysis
Microarray Expression Study
[Figure: pedigree labelled with individual sample IDs; legend: Affected w/ “disease haplotype”, Affected w/o “disease haplotype”, Unaffected w/ “disease haplotype”, Married-in Controls]
Differentially expressed genes
Expression QTLs (eQTLs)
→ SNP loci that control expression of genes (cis/trans)
Available Genotype Data
Conclusion
Acknowledgments
David Porteous
Kathy Evans
Albert Tenesa
CRF
Pippa Thomson
Stewart Morris
Naomi Wray
Ian White
Dave Liewald
Steve Cass
Psychiatric Genetics Section
Helen Torrance
Susan Anderson
Lorna Houlihan
Douglas Blackwood
Walter Muir
Sven Cichon
Lee Murphy
Angie Fawkes
Alison Condie
Medical Genetics Section
Varrie Ogilvie
Laura Hyndman
ECDF
Omer Jilani
Jon Weiss
Sam Skipsey
Mike Baker
John Blair-Fish