Deborah Blacker

PHENOTYPE DEFINITION
AND ANALYTIC STRATEGIES
FOR LATE-ONSET AD
Deborah Blacker, MD, ScD
Director, Gerontology Research Unit,
Mass General Hospital
Associate Professor of Psychiatry,
Harvard Medical School
Associate Professor in Epidemiology,
Harvard School of Public Health
OUTLINE:
Sequence of gene discovery
•Take stock
•Gather families
•Genotyping and analysis
•Confirmation and understanding
•Potential clinical utility
SEQUENCE OF GENE DISCOVERY:
Step 1: Taking stock
• Clinical features (diagnostic reliability and
validity, subtypes, boundary issues)
• Descriptive epidemiology (incidence,
prevalence, onset distribution, survival)
• Risk and protective factors
• Genetic epidemiology (familial aggregation,
heritability)
• Biology
• Known genes and previous reports
Clinical features
Late-life dementia
Insidious onset, prodromal cases designated
Mild Cognitive Impairment
Definitive diagnosis only by autopsy; clinical diagnosis
about 90% accurate in academic centers
Diagnosis probably optimal early in the course,
when symptoms are definitive but relatively
mild
Diagnosis excellent in academic centers, highly
variable in the community
Descriptive epidemiology
Common disease: prevalence at least 5-10%
for >65-year-olds, at least 25% for >85-year-olds
Incidence and prevalence rise steeply with age
Age of onset varies widely
Accounts for about 70% of dementia
Putative environmental risk and
protective factors
Increased risk:
Longevity
Female gender
Atherosclerosis
Diabetes
Hypertension
Elevated homocysteine
Elevated cholesterol
Head trauma
Decreased risk:
Education
Estrogen
NSAIDs
Anti-oxidants
Statins
Exercise
Genetic epidemiology
2-3 fold increased risk in 1st degree
relatives (λ = 2-3)
Age of onset correlated in families
Mostly complex inheritance, but rare
families autosomal dominant
MZ > DZ twin concordance, MZ < 1, MZ
age of onset may differ
Family and twin findings hold in early- and late-onset cases
Biology
Extensive knowledge of AD biology facilitates
selection of candidate genes
Options include genes for homologs,
substrates, and ligands related to:
Known genes (e.g., presenilins, APP)
Neuropathologic lesions (e.g., amyloid and
tau metabolism)
Broader theories of pathophysiology (e.g.,
cholesterol metabolism, inflammation,
clotting cascade)
[Figure: Aβ pathway diagram. Labels: Aβ production (APP*, PSEN1*, PSEN2*, BACE); APP; plasma membrane; Aβ aggregation; Aβ clearance; endocytosis; LRP; α2M; apoE; lysosomal degradation; Aβ degradation; free protease (IDE, plasmin)]
Known genes and
previous reports
Early onset gene mutations (APP, PSEN1,
PSEN2) generally fully penetrant, rare but
important for understanding pathophysiology
APOE a susceptibility gene with a complex but
substantial impact on risk and age of onset
Many other reported genetic associations, most
inconsistently replicated, if at all
Many AD genes remain
to be discovered
Family history is a risk factor even at
advanced ages, and controlling for APOE
APOE ε4 effect attenuated at the ages when AD
incidence is highest
Segregation analysis predicts 3-4 additional
AD genes for late onset AD (age of onset)
Segregation analysis predicts
4-7 additional AD genes
Daw et al., 2000
Taking stock:
Constraints and opportunities
High prevalence
Easy to ascertain large samples
High likelihood of within-family heterogeneity
Late-onset
Complex survival issues and competing risks
Parental genotypes rarely available
Diagnostic ambiguities
Need to minimize error, follow for autopsies
Option for age of onset and other quantitative
phenotypes
Constraints and opportunities (continued)
Complex inheritance
Reality far more complex than available
models
Linkage peaks extremely broad
Known gene with substantial but variable effect
Stratification vs. controlling for APOE
Accumulating knowledge of pathophysiology
Extensive source of candidate genes
Low prior probability for any one candidate
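The low prior probability for any one candidate can be made concrete with Bayes' rule. The sketch below is illustrative only: the prior, power, and significance level are hypothetical inputs, not values from this talk, and the function name is made up for the example.

```python
def posterior_true_association(prior: float, power: float, alpha: float) -> float:
    """Probability that a 'significant' candidate-gene result is real.

    prior: assumed probability that the candidate is truly associated
    power: probability the study detects a true association
    alpha: significance threshold (false-positive rate under the null)
    """
    true_pos = prior * power
    false_pos = (1.0 - prior) * alpha
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to show the effect of a low prior.
    for prior in (0.001, 0.01, 0.1):
        post = posterior_true_association(prior, power=0.8, alpha=0.05)
        print(f"prior={prior:<6} posterior={post:.3f}")
```

With a small prior, most nominally significant findings are expected to be false positives, which is one reading of why replication matters so much for candidate genes.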
SEQUENCE OF GENE DISCOVERY:
Step 2: Gather families
• Sample size
• Ascertainment criteria
• Diagnostic methods
• Additional phenotypic information allowing
for alternate phenotype definition
• Databasing and cell-banking
Sample size
“More is better”: more families means more
power
“Better is better”: power also increases with:
More individuals per family
Greater accuracy of diagnosis
More homogeneous samples
Series of trade-offs to obtain optimal families in
large enough numbers
Multi-site and pooled samples common
Ascertainment criteria
Implementation of tradeoffs between optimal
family structure and ease of ascertainment
Issues to consider:
Number of affecteds per family
Number of unaffecteds per family
Number of additional family members
Diagnostic threshold (AD vs. MCI)
Age of onset of affecteds
Age of unaffecteds
Complete information available
LOAD study: 2 affecteds with onset >60, one
additional family member (affected with onset
>50 or unaffected age >60)
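The LOAD study rule above can be written as a small eligibility check. This is a sketch of one reading of the slide, not the study's actual protocol: the `Member` record, its field names, and the function name are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Member:
    """Hypothetical family-member record used only for this illustration."""
    affected: bool
    onset_age: Optional[int] = None    # age at onset (affecteds only)
    current_age: Optional[int] = None  # age at last evaluation


def meets_load_criteria(family: List[Member]) -> bool:
    """Two affecteds with onset >60, plus one additional member who is
    either affected with onset >50 or unaffected at age >60."""
    core = [m for m in family
            if m.affected and m.onset_age is not None and m.onset_age > 60]
    if len(core) < 2:
        return False
    # Set aside two core affecteds; anyone else may serve as the third member.
    others = [m for m in family if m is not core[0] and m is not core[1]]
    for m in others:
        if m.affected and m.onset_age is not None and m.onset_age > 50:
            return True
        if not m.affected and m.current_age is not None and m.current_age > 60:
            return True
    return False


if __name__ == "__main__":
    family = [Member(True, onset_age=72), Member(True, onset_age=65),
              Member(False, current_age=70)]
    print(meets_load_criteria(family))
```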
Diagnostic methods: Affecteds
Accurate phenotyping is critical for success, as
errors can cause substantial reduction in power
To increase diagnostic accuracy:
Raise diagnostic threshold (AD vs. MCI)
Operationalize diagnostic criteria
Raise level of certainty (Definite > Probable >
Possible)
Require in-person evaluation
Facilitate autopsy confirmation
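A rough way to see why diagnostic error costs power: if some labelled cases are not true cases, the allele frequency among "cases" is pulled toward the control frequency. The sketch below uses a simplified case-control comparison with a normal approximation, which is an assumption made for illustration; the talk itself concerns family samples, and all frequencies, error rates, and sample sizes here are hypothetical.

```python
from statistics import NormalDist


def observed_case_freq(p_case: float, p_control: float, error_rate: float) -> float:
    """Allele frequency among labelled cases when a fraction are misdiagnosed."""
    return (1.0 - error_rate) * p_case + error_rate * p_control


def approx_power(p1: float, p0: float, n_alleles: int, alpha: float = 0.05) -> float:
    """Normal-approximation power of a two-sided two-proportion test,
    with n_alleles sampled in each group."""
    nd = NormalDist()
    se = (p1 * (1 - p1) / n_alleles + p0 * (1 - p0) / n_alleles) ** 0.5
    z_crit = nd.inv_cdf(1 - alpha / 2)
    z_effect = abs(p1 - p0) / se
    # Probability of rejecting in either tail.
    return nd.cdf(z_effect - z_crit) + nd.cdf(-z_effect - z_crit)


if __name__ == "__main__":
    # Hypothetical true frequencies: 0.40 in cases, 0.30 in controls.
    for err in (0.0, 0.1, 0.2, 0.3):
        p_obs = observed_case_freq(0.40, 0.30, err)
        print(f"error={err:.1f} power={approx_power(p_obs, 0.30, 500):.2f}")
```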
In person vs. remote evaluation
Based on autopsies to date from MGH site of
NIMH Genetics Initiative
Subjects evaluated at MGH or an affiliated center
vs. those evaluated by medical records +
telephone interview
              Probable AD              Possible AD
              Correct/total   PV+      Correct/total   PV+
In person     22/24           91.7%    --              --
Remote        61/68           89.7%    10/15           66.7%
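The PV+ (positive predictive value) figures above are simply correct/total from the autopsy counts. The sketch below reproduces them and adds a Wilson confidence interval; the interval is an addition for illustration and was not part of the original slide.

```python
from statistics import NormalDist


def ppv(correct: int, total: int) -> float:
    """Positive predictive value: autopsy-confirmed diagnoses / all diagnoses."""
    return correct / total


def wilson_interval(correct: int, total: int, level: float = 0.95):
    """Wilson score interval for a binomial proportion."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p = correct / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * (p * (1 - p) / total + z ** 2 / (4 * total ** 2)) ** 0.5 / denom
    return centre - half, centre + half


if __name__ == "__main__":
    # Counts taken from the slide (correct, total).
    counts = {
        ("In person", "Probable AD"): (22, 24),
        ("Remote", "Probable AD"): (61, 68),
        ("Remote", "Possible AD"): (10, 15),
    }
    for (mode, dx), (correct, total) in counts.items():
        lo, hi = wilson_interval(correct, total)
        print(f"{mode:9} {dx:11} PV+={100 * ppv(correct, total):.1f}% "
              f"(95% CI {100 * lo:.1f}-{100 * hi:.1f}%)")
```

The small denominators give wide intervals, so the in-person and remote figures for Probable AD should be compared with caution.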
Assessment methods: Unaffecteds
For late-onset AD, unaffected status inherently
provisional
To improve accuracy of current designation:
Use formal assessments of cognitive and
functional status
Include follow-up: older unaffecteds have
passed through a greater fraction of the age
of risk
Additional phenotypic information
Allows alternate phenotype definition
Difficult to anticipate what will be useful later on;
trade-off between completeness and efficiency
Quantitative phenotypes particularly appealing
given potential for increased power and control
of multiple other genetic and non-genetic risk
factors
Subtyping offers chance to analyze a more
homogeneous sample
Proposed alternate phenotypes
Quantitative traits:
Age at onset
Memory function
Plasma A-beta levels
Subtypes:
Onset age
Psychotic features
Parkinsonian features
Databasing and cell-banking
Database requirements:
Accuracy, accessibility, security, ease of use,
back-ups, documentation, updates
Cell banking requirements:
Sample safety and security, ease and
reliability of distribution, back-ups, stability
SEQUENCE OF GENE DISCOVERY:
Step 3: Genotyping and analysis
• Overall strategy
• Overarching considerations
• Linkage analysis
• Association analysis
Overall strategy: Mendelian disorders
Genome screen identifies a chromosomal
region of interest
Fine mapping with more closely spaced
markers, often with enlarged or extended
sample, narrows this region
Positional cloning used to identify disease
genes within the narrowed region
Highly effective: virtually all Mendelian
disorders have been mapped
Overall strategy: complex diseases
Linkage peaks broad, often span much of a
major chromosome
Tend to remain broad even with increases in
sample size or addition of more markers
Require more complex strategies, typically
linkage and then association analysis
Association analysis used to test specific
candidate genes
Growing interest in association for narrowing
the linked region after a linkage screen, or for
initial screen itself
Positional Candidate Approach
Genome Screen
Genetic Linkage Analysis
Chromosomal Regions of Interest
Gene and EST Database Searches
Candidate Gene Assessment
Family-based Association Tests
Putative Disease-Associated Gene
Demonstration of
Functional Effects
Confirmation in
Population Sample
Established Disease-Associated Gene
Overarching considerations: Trait
Trait analyzed in most studies is AD, but there
is growing interest in alternative phenotypes
Quantitative trait analysis offers greater
power in theory, and more flexible analytic
methods, but need to address scaling issues
Subtyping may offer greater homogeneity, but
difficult to address variability within families
Overarching considerations: Other
How to handle unaffecteds
Many approaches use only genotypic
information from unaffecteds
Including phenotypic information optimal if
age effects can be modeled accurately
How to handle APOE and other modifiers of
risk (age, gender, education)
Stratification
Covariate-based methods
Linkage studies
Based on alleles traveling with the disease
within families
Aimed at identifying a narrow region in which a
disease gene may reside
Extremely successful for Mendelian disorders
(and Mendelian subforms of complex disorders)
Able to identify broad regions where complex
disease genes may reside, but rarely able to
narrow these regions enough to find genes
Typically use highly polymorphic markers with
modest spacing for a screen, and tighter spacing
for follow-up
Basis for linkage analysis
Chromosomal segments cross over and
recombine during meiosis
Genes infrequently separated across
generations are said to be genetically linked
Tighter linkage suggests greater proximity
Trait that travels with a genetic marker
suggests a disease gene in the region
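The idea that tighter linkage means fewer recombinants can be illustrated with a two-point LOD score. The sketch assumes phase-known, fully informative meioses, which is a textbook simplification and not the method used in the genome screen described in this talk; the counts are hypothetical.

```python
import math


def _k_log10(k: int, p: float) -> float:
    """k * log10(p), with 0 * log(0) taken as 0 and k * log(0) as -inf."""
    if k == 0:
        return 0.0
    if p == 0.0:
        return float("-inf")
    return k * math.log10(p)


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """log10 of L(theta) / L(theta = 0.5) for counted recombinants."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    nonrec = meioses - recombinants
    log_l_theta = _k_log10(recombinants, theta) + _k_log10(nonrec, 1.0 - theta)
    log_l_null = meioses * math.log10(0.5)
    return log_l_theta - log_l_null


def max_lod(recombinants: int, meioses: int):
    """Maximum-likelihood theta (capped at 0.5) and the LOD at that value."""
    theta_hat = min(recombinants / meioses, 0.5)
    return theta_hat, lod_score(recombinants, meioses, theta_hat)


if __name__ == "__main__":
    # Hypothetical data: 1 recombinant in 10 informative meioses.
    theta_hat, lod = max_lod(1, 10)
    print(f"theta_hat={theta_hat:.2f} LOD={lod:.2f}")
```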
Types of linkage analysis
Extended families vs. sibpairs
Depends on sample available, analytic
tools to be used
Parametric vs. non-parametric (the extent to
which a genetic model for the trait is used)
Parametric more powerful if the model is
correct—unlikely for complex diseases
Single point vs. multipoint
Multipoint more powerful in theory, but
computationally intensive, and susceptible
to genotyping and phenotyping errors
Results of an AD Genome Screen (Blacker et al., Hum Molec Genet 2003)
[Figure: linkage results by chromosome, in panels for chromosomes 1-3, 4-7, 8-11, 12-15, 16-19, and 20-22 plus X]
Association studies
Based on alleles traveling with the disease
across families
Two main uses:
To narrow the linked region
To test specific candidate genes (based on
position and/or biology)
Can be done in family, case-control, or
population-based samples
Generally uses single nucleotide polymorphisms
(SNPs), often within candidate genes
Basis for association studies
Association can occur for two reasons:
Causal association: an allele is associated
with disease because it increases risk
Linkage disequilibrium (LD): an allele is
associated with disease because it is so close
to a risk allele, or arose so recently, that
recombination hasn't separated them
Unlike linkage, LD depends on both distance and
history, so it is not a monotonic function of distance
Associated allele may vary across populations
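The dependence of LD on both distance and history can be sketched with the standard pairwise measures D, D', and r², and with the textbook decay of D by a factor (1 - θ) per generation. The haplotype frequencies and generation counts below are hypothetical.

```python
def ld_measures(p_ab: float, p_a: float, p_b: float):
    """Return (D, |D'|, r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


def decayed_d(d0: float, theta: float, generations: int) -> float:
    """Expected D after a number of generations of recombination."""
    return d0 * (1.0 - theta) ** generations


if __name__ == "__main__":
    print(ld_measures(p_ab=0.30, p_a=0.40, p_b=0.50))
    # Same distance (theta = 0.01), different histories.
    for gens in (10, 100, 1000):
        print(gens, round(decayed_d(0.25, 0.01, gens), 4))
```

Two marker pairs at the same distance can therefore show very different LD if the risk allele arose at different times, which is one reason association signals can look patchy.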
Haplotypes and multiple SNPs
Because association can be “patchy,” need to
test multiple SNPs to fully evaluate a given
candidate gene
Haplotype analysis, which incorporates multiple
closely spaced SNPs, can increase ability to
detect association
Limited programs available at present, but this is
an area of intensive methods development
Development of the “hapmap” will also facilitate
these analyses
Types of association analysis:
Family samples
Not susceptible to bias due to admixture and
population stratification
In theory less powerful because there is less
variability within families
Limited options to test for association when
parents not available, as is the case for AD
Effect size estimates for candidate genes
available, with or without controlling for
covariates, using conditional logistic regression
Conditional ORs cannot be pooled across
studies
Types of association analysis:
Unrelated individuals
Sampling: cases and controls, or population
based sample
More powerful under certain assumptions
Bias less of a concern if population based
Chi square test provides a simple test of genetic
association
For candidate genes, OR provides effect size
estimates, with or without controlling for
covariates
Crude ORs can be pooled across studies
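The chi-square test, the odds ratio, and pooling of crude ORs can be sketched from 2x2 allele-count tables. The counts are hypothetical, and inverse-variance fixed-effect pooling is assumed here as one common choice; it is not necessarily the method behind any meta-analysis cited in this talk. Tables with zero cells are not handled.

```python
import math
from statistics import NormalDist


def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square (1 df, no continuity correction) and p-value.

    a, b: cases with / without the allele; c, d: controls with / without.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 df, chi2 is the square of a standard normal deviate.
    p = 2 * (1 - NormalDist().cdf(math.sqrt(chi2)))
    return chi2, p


def log_odds_ratio(a: int, b: int, c: int, d: int):
    """Crude log OR and its standard error (Woolf's method)."""
    return math.log((a * d) / (b * c)), math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)


def pool_fixed_effect(tables):
    """Inverse-variance pooled OR with a 95% confidence interval."""
    num = den = 0.0
    for table in tables:
        log_or, se = log_odds_ratio(*table)
        weight = 1.0 / se ** 2
        num += weight * log_or
        den += weight
    pooled, se_pooled = num / den, math.sqrt(1.0 / den)
    z = NormalDist().inv_cdf(0.975)
    return (math.exp(pooled),
            math.exp(pooled - z * se_pooled),
            math.exp(pooled + z * se_pooled))


if __name__ == "__main__":
    studies = [(60, 140, 40, 160), (45, 155, 38, 162)]  # hypothetical counts
    for s in studies:
        print(chi_square_2x2(*s), math.exp(log_odds_ratio(*s)[0]))
    print(pool_fixed_effect(studies))
```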
Meta analysis: LRP1*
* Alzheimer Research Forum Gene Database: www.alzgene.org
SEQUENCE OF GENE DISCOVERY:
Step 4: Confirmation and
understanding
•Replicate in an additional sample, ideally
differently ascertained
•Assess impact in clinical and general
population samples
•Assess functional effects (critical but very
difficult to demonstrate)
Prevalence of AD by age, sex,
and APOE genotype
[Figure: prevalence (vertical axis, 0 to 0.45) by age (horizontal axis, 65 to 104), plotted separately for females (F) and males (M) and for APOE genotypes ε4/ε4, ε4/εx, and εx/εx]
apoE and A-β accumulation
Puglielli et al, 2003
SEQUENCE OF GENE DISCOVERY:
Step 5: Clinical utility
Improved understanding of pathophysiology
Progress in genetics and epidemiology using
knowledge of known genes
Potential for rational drug development
Potential for pharmacogenomic effects, targeted
treatments and preventive strategies
Potential role of genes in early detection and
early intervention
Potential role in predictive risk assessment and
prophylaxis
Caveats
AD is a test case for the genetics of complex
diseases
Current expectations for genetic progress may be
overly optimistic, especially for rational drug
design
Pre-symptomatic knowledge, whether definitive or
probabilistic, may cause more harm than good
Limited knowledge of genetics in the general
population and among physicians compromises
ability to use and understand genetically based
strategies for treatment and prevention
ACKNOWLEDGEMENTS
Gerontology Research Unit, MGH
M.S. Albert, PhD
T.J. Moscarillo, BA
Lynelle Cortellini, BA
Alzheimer’s Disease Research Center, MGH
J. Growdon, MD
L. Yap, PhD
C. Crosby, BA
Genetics and Aging Research Unit, MGH
R.E. Tanzi, PhD
M.J. Kim, PhD
M. Parkinson, BS
L. Bertram, MD
R. Menon, BS
A.J. Sampson, BA
M. Hiltunen, PhD
K. Mullin, BS
M. Hsiao, BS
Depts. of Epidemiology and Biostatistics, HSPH
N.M. Laird, PhD
C. Lange, PhD
M.B. McQueen, MS