Homogenizing data using ENIGMA-MEGA analysis
Peter Kochunov
University of Maryland, School of Medicine
Introduction
•  What is mega-analysis?
•  MEGA-Analysis algorithm developed by ENIGMA
•  Examples:
–  MEGA analysis of additive genetic effects
–  MEGA analysis of SCZ effects on white matter integrity
Mega-analysis
•  Combining "raw" data from multiple studies
•  Pros:
–  Additive increase in degrees of freedom
–  Simplified analysis structure
–  Uniform weighting by subject
•  Removes the weighting uncertainty of meta-analysis of genetic data
–  Ideal for familial genetics studies where analysis is performed per family
Mega-Analysis: Cons
•  Data may have a site-specific bias
[Figure: histograms (%) of average FA values per cohort: TAOS (ages 13-15), UCLA-QTIM (ages 20-30), GOBS (ages 18-85)]
Mega-Analysis Algorithm
•  Developed by Neda Jahanshad and me
•  Coded in SOLAR-Eclipse
•  Tried in the following analyses:
–  Additive genetic analysis (heritability)
–  Additive genetic analysis (genetic correlation)
–  Association analyses (GWA)
–  Disorder effect analyses: effects of schizophrenia on white matter
Step 1. Regression of nuisance covariates is performed per site
Remove the effects of covariates that do not act as the "contrast" of interest, so that the data become "equivalent" across sites
[Figure: histograms (%) of average FA values per cohort (TAOS, ages 13-15; UCLA-QTIM, ages 20-30; GOBS, ages 18-85), shown as raw values and as Z-scores]
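A minimal sketch of Step 1, assuming a single nuisance covariate (age) and ordinary least squares fitted separately within each site. The record layout, the field names (`id`, `site`, `age`, `fa`) and the function name `residualize_per_site` are hypothetical and are not taken from the SOLAR-Eclipse implementation; the data values are made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def residualize_per_site(records, covariate="age", trait="fa"):
    """Regress `trait` on one covariate within each site; return residuals by subject id."""
    by_site = defaultdict(list)
    for rec in records:
        by_site[rec["site"]].append(rec)

    residuals = {}
    for rows in by_site.values():
        xs = [r[covariate] for r in rows]
        ys = [r[trait] for r in rows]
        x_bar, y_bar = mean(xs), mean(ys)
        sxx = sum((x - x_bar) ** 2 for x in xs)
        sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
        # If the covariate is constant within a site, only the site mean is removed.
        slope = sxy / sxx if sxx > 0 else 0.0
        intercept = y_bar - slope * x_bar
        for r, x, y in zip(rows, xs, ys):
            residuals[r["id"]] = y - (intercept + slope * x)
    return residuals


# Hypothetical records: two sites with different age ranges and FA offsets.
records = [
    {"id": "a1", "site": "A", "age": 13, "fa": 0.40},
    {"id": "a2", "site": "A", "age": 14, "fa": 0.42},
    {"id": "a3", "site": "A", "age": 15, "fa": 0.43},
    {"id": "a4", "site": "A", "age": 14, "fa": 0.41},
    {"id": "b1", "site": "B", "age": 20, "fa": 0.50},
    {"id": "b2", "site": "B", "age": 25, "fa": 0.49},
    {"id": "b3", "site": "B", "age": 30, "fa": 0.47},
    {"id": "b4", "site": "B", "age": 60, "fa": 0.40},
]
residuals = residualize_per_site(records)
```

Because the fit includes an intercept per site, the residuals of every site are centred on zero, which removes the site-specific offset along with the covariate effect. Several covariates would need a multiple-regression fit in place of the single slope used here.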
Step 2 → Inverse normalization
[Figure: histogram of the data after inverse normalization, plotted against Z-score]
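A minimal sketch of a rank-based inverse normal transformation, one common way to implement Step 2. The slides do not state the exact formula, so the rank offset of 0.5, the averaging of tied ranks and the function name `inverse_normal_transform` are assumptions. The function transforms whatever list it receives; whether it should be applied per site or to the pooled residuals is not specified in the slides.

```python
from statistics import NormalDist


def inverse_normal_transform(values, offset=0.5):
    """Map values to Z-scores through their ranks: z = Phi^-1((rank - offset) / n)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based rank shared by the tied values
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf((r - offset) / n) for r in ranks]


# Hypothetical covariate-adjusted values for four subjects.
z_scores = inverse_normal_transform([0.31, 0.45, 0.38, 0.52])
```

The transformation keeps the ordering of subjects and discards the original scale, which is why differently distributed site data end up on a common Z-score axis.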
Step 3 → Testing for Stratification
ANOVA of heritability estimates
Test of homogeneity of the effect per group
Measure h² per sample, then perform ANOVA:
UCLA = 0.56±0.25; p=0.0001
TAOS = 0.49±0.23; p=0.04
GOBS = 0.45±0.07; p=10⁻⁸
No difference among groups
We can combine them into a single pedigree, with the weight assigned based on relatedness and pedigree strength.
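A minimal sketch of a homogeneity check on the three heritability estimates. It uses Cochran's Q, a standard summary-statistic test of homogeneity, as a stand-in for the ANOVA named on the slide, because only the estimates and their uncertainties are available here. It assumes the ± values are standard errors; the function names are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable (series for the incomplete gamma)."""
    a, z = df / 2.0, x / 2.0
    if z <= 0:
        return 1.0
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total):
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def cochran_q(estimates, std_errors):
    """Cochran's Q: weighted squared deviations from the inverse-variance pooled estimate."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, estimates))
    return q, chi2_sf(q, len(estimates) - 1)


# h2 estimates and uncertainties quoted on the slide (UCLA, TAOS, GOBS),
# with the uncertainties treated as standard errors (an assumption).
q_stat, p_value = cochran_q([0.56, 0.49, 0.45], [0.25, 0.23, 0.07])
```

A large `p_value` means the estimates are compatible with a single common heritability, which is the condition the slide gives for combining the samples into one pedigree.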
Significance of additive effects: Mega vs. Meta
•  Mega-analysis: lowest SE and highest significance
–  h²=0.47±0.02; p=10⁻¹⁶
•  Meta-analysis, standard-error-weighted
–  Greatly influenced by the small samples
–  h²=0.48±0.09; p=0.004
•  Meta-analysis, N-weighted
–  h²=0.44±0.03; p=10⁻⁶
–  Difficult to justify given that subjects don't contribute equally
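A minimal sketch of the two meta-analytic weighting schemes compared above, using generic textbook formulas: a fixed-effect inverse-variance (standard-error-weighted) average and a sample-size-weighted average. The slides do not give the exact formulas or the per-cohort sample sizes, so this is not expected to reproduce the figures quoted on the slide. The `sample_sizes` values are hypothetical, and the ± values are treated as standard errors.

```python
import math


def inverse_variance_meta(estimates, std_errors):
    """Fixed-effect pooled estimate and its standard error, with weights 1 / SE^2."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    return pooled, math.sqrt(1.0 / sum(weights))


def n_weighted_meta(estimates, sample_sizes):
    """Pooled estimate with each study weighted by its number of subjects."""
    return sum(n * e for n, e in zip(sample_sizes, estimates)) / sum(sample_sizes)


estimates = [0.56, 0.49, 0.45]    # h2 per sample, from the slides (UCLA, TAOS, GOBS)
std_errors = [0.25, 0.23, 0.07]   # reported uncertainties, treated here as SEs
sample_sizes = [100, 50, 850]     # HYPOTHETICAL: per-cohort N is not given in the slides

ivw_estimate, ivw_se = inverse_variance_meta(estimates, std_errors)
n_estimate = n_weighted_meta(estimates, sample_sizes)
```

Both schemes reduce each cohort to a single number and a weight. A mega-analysis instead fits one model to the combined subject-level data, so each subject's contribution follows from the pedigree structure and not from a chosen weighting rule.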
Similar trends in voxel-wise data
P-values for heritability estimates (-log10)
Effects of SCZ on white matter integrity
•  Apply mega-analysis to study effects of disorder on FA values
•  Use three samples collected on three scanners
•  Some cross-over of subjects to directly study effects of data transform
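A minimal sketch of a pooled patient-control comparison across sites, assuming the FA values have already been covariate-adjusted per site and inverse-normal transformed (Steps 1 and 2 of the algorithm). It uses a Welch-type two-sample statistic with a normal approximation for the p-value, which is only reasonable for large samples; the slides do not state which test was used. The function name, the dictionary layout and all data values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def pooled_group_effect(controls_by_site, patients_by_site):
    """Pool transformed values over sites and compare patients with controls."""
    controls = [z for site in controls_by_site.values() for z in site]
    patients = [z for site in patients_by_site.values() for z in site]
    diff = mean(patients) - mean(controls)
    # Welch-type standard error: the two groups may have unequal variances.
    se = sqrt(variance(patients) / len(patients) + variance(controls) / len(controls))
    z_stat = diff / se
    # Two-sided p-value from the standard normal (large-sample approximation).
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z_stat)))
    return diff, z_stat, p_value


# Hypothetical transformed values (Z-scores) for three sites.
controls = {"site1": [0.4, 0.9, -0.1, 0.6], "site2": [0.2, 0.7, 0.5], "site3": [0.3, 0.8]}
patients = {"site1": [-0.6, -0.2, -0.9, 0.1], "site2": [-0.4, -0.7, 0.0], "site3": [-0.5, -0.3]}
diff, z_stat, p_value = pooled_group_effect(controls, patients)
```

Pooling is only meaningful once the per-site transformation has put all sites on the same scale; with raw FA values, scanner offsets would be confounded with the group difference.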
Effects per Site: Site 1 (N=350)
Raw significance p=2×10⁻⁶
Transformed significance p=10⁻⁶
[Figure: histograms for controls and patients, before and after transformation to Z-scores]
Effects per Site: Site 2 (N=220)
Raw significance p=0.03
Transformed significance p=0.01
[Figure: histograms before and after transformation to Z-scores]
Effects per Site: Site 3 (N=120)
Raw significance p=0.03
Transformed significance p=0.03
[Figure: histograms before and after transformation to Z-scores]
Homogeneity of effect per site
Mega-analysis
Combined Mega significance p=6×10⁻⁹
[Figure: Z-score histograms for Site 1, Site 2 and Site 3, and for the combined mega-analysis sample]
Regional Specificity?
[Figure: pairwise scatter plots of regional -log(p) values between Site 1, Site 2 and Site 3, each with a linear fit. Fits shown: y = 0.4117x + 0.5196 (R² = 0.23196); y = 0.2826x + 0.1678 (R² = 0.37474); y = 0.0173x + 1.0281 (R² = 0.00192)]
Mega-analysis results of regional effects
Greatest impact of schizophrenia
•  Anterior corona radiata p<10⁻¹¹
•  Genu of corpus callosum p<10⁻⁶
•  Inferior frontal occipital p<10⁻⁵
•  Superior corona radiata p<10⁻⁵
Mega-analysis results of regional effects
Least impact of schizophrenia
•  Cortico-spinal tract p=0.2
•  Superior frontal occipital p=0.05
•  Uncinate fasciculus p=0.02
Regional MEGA vs site
[Figure: three scatter plots, one per site (Site 1, N=350; Site 2, N=220; Site 3, N=120), of per-site regional significance against mega-analysis p-values, with the anterior corona radiata (ACR) labelled in each panel. Linear fits shown: y = 0.3448x + 3.122 (R² = 0.162); y = 0.1886x + 0.7025 (R² = 0.2275); y = 0.1108x + 0.6827 (R² = 0.10733)]
How does this work on individual subjects?
N=35 subjects were imaged at Sites 1 and 2 in studies 5 years apart
R=0.55
R=0.43
[Figure: scatter plots of Site 2 against Site 1 values for the same subjects, one on Z-score axes and one on raw-value axes]
Limitations: Normality
•  Data for patients and controls have to be transformable to a "normal" state
–  Violated if patients have a bimodal distribution
Excessive Kurtosis
[Figure: two histograms on Z-score axes illustrating excess kurtosis]
•  Caused by bi-modal distribution of FLAIR lesions in patients
•  Use inverse normal mapping parameters from the Controls
•  Use bi-Gaussian fit to probabilistically separate patients
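A minimal sketch of the two remedies listed above, under stated assumptions. `control_referenced_z` interprets "mapping parameters from the Controls" as mapping every value through the empirical distribution of the controls only. `fit_two_gaussians` fits a two-component Gaussian mixture by expectation-maximization as one possible "bi-Gaussian fit". Both interpretations, all function names and all data values are hypothetical; the slides do not describe the actual procedure.

```python
import math
from bisect import bisect_left, bisect_right
from statistics import NormalDist


def control_referenced_z(values, controls):
    """Map values to Z-scores using the empirical distribution of the controls only."""
    ref = sorted(controls)
    standard_normal = NormalDist()
    z_scores = []
    for v in values:
        below = bisect_left(ref, v)
        ties = bisect_right(ref, v) - below
        p = (below + 0.5 * ties + 0.5) / (len(ref) + 1)  # stays strictly inside (0, 1)
        z_scores.append(standard_normal.inv_cdf(p))
    return z_scores


def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def prob_upper_mode(x, mu, sigma, weight):
    """Posterior probability that x belongs to component 1 (initialised on the upper half)."""
    p0 = weight[0] * normal_pdf(x, mu[0], sigma[0])
    p1 = weight[1] * normal_pdf(x, mu[1], sigma[1])
    return p1 / (p0 + p1) if p0 + p1 > 0 else 0.5


def fit_two_gaussians(values, n_iter=200, min_sigma=1e-3):
    """EM fit of a two-component 1-D Gaussian mixture (needs at least two values)."""
    xs = sorted(values)
    n, half = len(xs), len(xs) // 2
    mu = [sum(xs[:half]) / half, sum(xs[half:]) / (n - half)]
    grand = sum(xs) / n
    spread = max(math.sqrt(sum((x - grand) ** 2 for x in xs) / n), min_sigma)
    sigma, weight = [spread, spread], [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of component 1 for every value.
        resp = [prob_upper_mode(x, mu, sigma, weight) for x in values]
        n1 = sum(resp)
        n0 = n - n1
        if min(n0, n1) < 1e-9:
            break  # one component has emptied; keep the last usable parameters
        # M-step: responsibility-weighted means, standard deviations and mixing weights.
        mu = [sum((1 - r) * x for r, x in zip(resp, values)) / n0,
              sum(r * x for r, x in zip(resp, values)) / n1]
        var0 = sum((1 - r) * (x - mu[0]) ** 2 for r, x in zip(resp, values)) / n0
        var1 = sum(r * (x - mu[1]) ** 2 for r, x in zip(resp, values)) / n1
        sigma = [max(math.sqrt(var0), min_sigma), max(math.sqrt(var1), min_sigma)]
        weight = [n0 / n, n1 / n]
    return mu, sigma, weight


# Hypothetical bimodal patient values (for example, two lesion-load subgroups).
patient_values = [-1.2, -0.9, -1.0, -1.1, -0.8, 2.9, 3.1, 3.0, 3.2, 2.8]
mu, sigma, weight = fit_two_gaussians(patient_values)
membership = [prob_upper_mode(x, mu, sigma, weight) for x in patient_values]
```

The membership probabilities give a soft assignment of each patient to one of the two modes, which can then be analysed separately or used as a covariate.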
Acknowledgement
•  ENIGMA Team
–  Paul Thompson, Neda Jahanshad, Sinead Kelly, Jessica Turner
•  The PIs of the GOBS project: John Blangero and David Glahn
•  NIH
–  R01s MH085646, R01DA027680, R01EB015611, MH078111, MH0708143 and MH083824
–  U54EB020403 and P50MH103222