Approaches to complex genetic disease

Disease and Gene Associations
North Carolina Genomics Symposium
March 18, 2005
Elizabeth R. Hauser
Duke University
Center for Human Genetics
Breaking News- March 14, 2005
3 Research Teams independently identify a gene for macular
degeneration
Significance of AMD result
• Affects 1 in 5 people over age 65
• Complex disease
– Clearly a genetic component
– Important environmental risk (e.g. Smoking)
• Multiple groups identified the same polymorphism
– accounts for 20-50% of the overall risk in these studies
• Each group used a slightly different approach
• Genetic analysis was an important step
– Started with linkage and proceeded to association
• Human Genome Project provided key information
The CFH gene for Age-related Macular Degeneration is the
most recent example of a gene for a common disease,
whose identification was greatly enhanced by the Human
Genome Project.
These are exciting times in which to be doing research on
genetic determinants of disease!
Outline
• Types of genetic disease
• Evidence for the involvement of genes
• Study designs and analysis methods
• Taking complexity into account
• Coronary artery disease example
Simple vs complex disease
• Disease definition=phenotype
– Simple traits=> Easily defined
– Complex traits=> Difficult to determine
– How is the diagnosis made?
• Measurements
• Instruments
• May be expensive to collect
– There may be ambiguity in the definition of disease
• Affected well defined, Unaffected ?
• Coronary artery disease requires specialized procedures
– Coronary catheterization
– Stress tests
– Clinical event, such as heart attack or bypass operation
Genes and Disease
Monogenic Diseases
• Huntington Disease
• Spinocerebellar Ataxia
• Spastic Paraplegia
• Tuberous Sclerosis
Complex Diseases
• Alzheimer disease
• Cardiovascular Disease
• Autism
• Parkinson Disease
Environmental Diseases
• Influenza
• Hepatitis
• Measles
[Figure legend: Environment, Genes]
Causative or
Mendelian Gene
• Gene directly leads to
disorder
• Recognizable inheritance
patterns
• One gene per family
• Less common diseases
– Cystic fibrosis, muscular
dystrophies
Complex or
Susceptibility Gene
• Gene confers an increased
risk, but does not directly
cause disorder
• No clear inheritance pattern
• Involves many genes or
genes and environment
• Common in population
– cancer, heart disease,
dementia
Defining what to study
• As in any biomedical study, need to precisely define the
disease under study
• Define primary phenotype and secondary phenotypes
• Validity and Reliability
• Understanding risk factors
– Genetic or Environmental?
• Ethnic differences
• Age/gender distribution
• Use epidemiologic information to refine the phenotype
definition
• Consider comparability to other studies
Refining the phenotype-genes
• Idea: Make the effect of certain genes in the
sample more easily detectable
• Genetic effects may be stronger for extremes of
the risk factor distribution
– restrict sample to people with onset at a very young or
very old age
• Genetic effects may be stronger for unusual
presentations
– restrict sample to individuals with coronary artery
disease (CAD) without lipid abnormalities
– restrict sample to diabetics with nephropathy
Refining the phenotype-environment
• Minimize effect of known environmental
confounders
– restrict sample to nonsmokers
– restrict sample to unmedicated people, e.g. in
hypertension studies
• Collect data in a genetically homogeneous
population such as a particular ethnic
group or genetically isolated population
– Reduce the number of genes contributing to
the phenotype
But, How do You Know
Your Trait is Genetic?
Familial aggregation
Familial aggregation is the clustering of affected
individuals within families. Documenting the
familial aggregation is often the first step in
characterizing the genetic basis for a trait.
Major questions to ask yourself:
• Is there heterogeneity?
• Is it possible that there is a Mendelian
subset of families?
Oftentimes, Mendelian subsets of complex
disease are characterized by early age of onset
or increased severity.
Follow Disease as it is Passed from
Parents to Children
Twin Studies
Purpose: Estimate the genetic component of
a disease or associated phenotype
• Usually assume that twins share a common
environment, which lessens the impact of
environmental influences (although this may
not be true for studies of adult twins)
• Usually compare twins of the same sex (especially
useful if there are known differences in disease
frequency between males and females)
• Twins are the same age, so age-dependency is
not a problem
Twin Studies
One twin is affected, how often is the other?
MZ      DZ      Type of Disease
90%     90%     Probably environmental
100%    25%     Mendelian recessive; deviation from the expected frequencies may be due to incomplete penetrance
80%     16%     ???
72%     35%     ???
7%      7%      May be the same as population frequency
Review article: Martin et al. “A twin-pronged attack on
complex traits”
Nature Genetics 17: 387-392 (1997).
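As a rough illustration of the MZ vs. DZ comparison above, the sketch below computes pairwise concordance (one common definition: the share of twin pairs with at least one affected twin in which both are affected). The function name and all pair counts are hypothetical, chosen only to reproduce the 80% vs. 16% row; they are not data from any study.

```python
def pairwise_concordance(concordant_pairs: int, discordant_pairs: int) -> float:
    """Fraction of twin pairs (with at least one affected twin) in which both are affected."""
    total = concordant_pairs + discordant_pairs
    if total == 0:
        raise ValueError("need at least one twin pair")
    return concordant_pairs / total


# Hypothetical counts, for illustration only.
mz_concordance = pairwise_concordance(concordant_pairs=40, discordant_pairs=10)
dz_concordance = pairwise_concordance(concordant_pairs=8, discordant_pairs=42)

print(f"MZ concordance: {mz_concordance:.0%}")
print(f"DZ concordance: {dz_concordance:.0%}")
print(f"MZ/DZ ratio:    {mz_concordance / dz_concordance:.1f}")
```

An MZ concordance well above the DZ concordance is the pattern that points toward a genetic contribution; equal concordances point toward shared environment or chance.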
Twin Studies: Adoption
Comparison of disease frequency in adoptees
with their biological vs. their adopted parents
(or siblings). Given the adoptee is affected,
what percent of parents have the disease?
Biologic   Adoptive   Type of disease
85%        5%         Suggests strong genetic component; frequency in adoptive parents may reflect risk in the general population
5%         85%        Suggests strong environmental component; frequency in biologic parents may reflect risk in the general population
Segregation Analysis
• Test the disease distribution in families for
concordance with specific genetic transmission
models
• Very difficult studies to perform
– Families need to be collected in a very precise way
• Works best for single gene disorders
• Not terribly successful for common diseases
Recurrence Risk to Relatives: λ_I
A measure of how “genetic” a trait or
disease is: What is the rate of disease
in relatives of a proband with the disease
vs. the frequency of the disease in the
general population?
λ_I = (recurrence rate in relative of proband) / (rate in general population)
where ‘I’ indicates the degree of relationship
Risch N. Am J Hum Genet (1990): 46 pp. 222 - 253.
Recurrence Risk to Relatives: λ_s
• Values > 1.0 are generally taken to indicate
evidence in favor of a genetic component. In
general, the higher the value, the stronger
the genetic component.
• Values can be used to estimate the number
of genes under different genetic models.
• Note that the magnitude of the estimate is
very dependent on the frequency in the
population. For example, a common disorder
may have frequency estimates of 3-6%
depending on how a given study was
performed, and such a high population frequency
results in a small λ_s.
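To make the dependence on population frequency concrete, here is a minimal sketch of the recurrence risk ratio (risk to a relative divided by population frequency). The function name and the 9% sibling risk are hypothetical values chosen for illustration; only the 3-6% frequency range comes from the slide.

```python
def recurrence_risk_ratio(risk_to_relative: float, population_frequency: float) -> float:
    """Recurrence risk ratio: risk in a relative of a proband / population frequency."""
    if not 0.0 <= risk_to_relative <= 1.0:
        raise ValueError("risk_to_relative must be a proportion between 0 and 1")
    if not 0.0 < population_frequency <= 1.0:
        raise ValueError("population_frequency must be a proportion in (0, 1]")
    return risk_to_relative / population_frequency


sibling_risk = 0.09  # hypothetical risk to siblings of an affected person

for frequency in (0.03, 0.06):
    ratio = recurrence_risk_ratio(sibling_risk, frequency)
    print(f"population frequency {frequency:.0%}: ratio = {ratio:.1f}")
# population frequency 3%: ratio = 3.0
# population frequency 6%: ratio = 1.5
```

With the same sibling risk, doubling the assumed population frequency halves the ratio, which is why estimates for common disorders are both small and sensitive to study design.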
Recurrence Risk to Relatives: λ_s
Disease               λ_s
Alzheimer             4-5
Neural tube defects   25-50
Obesity               1.8
Autism                100-150
Cystic fibrosis       1000
Best Proof of All?
Connect genetic variation to the disease!
But, How Do We Find the
Gene?
Locating a Variation
30,000 Genes on
46 chromosomes
[Figure: a chromosomal region containing Gene 1 through Gene 6 is identified as the region carrying the variation; the variation is then found in one gene.]
The process of recombination in
meiosis creates a relationship
between two genes that is a function
of the distance between them.
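The slide does not say which function relates distance to recombination. One common textbook choice is Haldane's map function, which assumes crossovers occur independently along the chromosome (no interference). The sketch below uses it purely as an illustration; the function name is hypothetical.

```python
import math


def haldane_recombination_fraction(distance_morgans: float) -> float:
    """Haldane's map function: recombination fraction for a given map distance in Morgans.

    Assumes no crossover interference. The result rises from 0 (same position)
    toward 0.5 (effectively unlinked) as distance grows.
    """
    if distance_morgans < 0:
        raise ValueError("map distance cannot be negative")
    return 0.5 * (1.0 - math.exp(-2.0 * distance_morgans))


for centimorgans in (1, 10, 50, 200):
    theta = haldane_recombination_fraction(centimorgans / 100.0)
    print(f"{centimorgans:>3} cM -> recombination fraction {theta:.3f}")
```

At short distances the recombination fraction is nearly equal to the map distance; at long distances it levels off at 0.5, which is what independent assortment looks like.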
Genetic Markers
• To use recombination, we need
genetic markers throughout the genome
• Know where the markers are in the genome
– Human Genome Project tells us precisely where the
markers are
• Unchanged from generation to generation
• Follow transmission from parents to offspring
• Be able to distinguish alleles
– Polymorphic- having more than one state (alleles)
– Can follow markers and alleles from one generation to
the next
Observe Disease and Markers of
Genes Passed Together from
Parents to Children
[Figure: a series of pedigree slides showing marker genotypes (Aa or aa) for parents and children in several families.]
The phenomenon of the cotransmission of disease and marker
alleles within a given family is called
LINKAGE.
No recombination is taking place
between the disease and marker
suggesting that they are close
together on the same chromosome.
Suppose that we see linkage and we
can follow transmission of marker
alleles from parents to offspring.
Suppose that in comparing many
families, all diseased people in all
families get the A allele.
Now we have ASSOCIATION too.
Allelic Association
[Figure: gametes carrying A B, A B, a b, a b, A B, A B, a b]
• Alleles A and B at two loci are associated if the event that
a gamete carries A is not independent of the event that
the gamete carries allele B.
• Alleles are not associated if they occur together in the
same gamete randomly.
[Figure: gametes carrying A B, A b, a B, a b, A b]
• Association is population-specific.
[Figure: gametes carrying A B and a b]
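The independence definition above can be checked numerically. A minimal sketch, assuming haplotypes are written as two-letter strings such as "AB": it computes the standard disequilibrium coefficient D = P(AB) - P(A)P(B), which is zero when the two alleles occur together at random. The function name and both gamete samples are hypothetical.

```python
def disequilibrium(haplotypes: list[str]) -> float:
    """Return D = P(AB) - P(A) * P(B) for two-letter haplotype strings such as "AB"."""
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")
    p_a = sum(h[0] == "A" for h in haplotypes) / n   # frequency of allele A at locus 1
    p_b = sum(h[1] == "B" for h in haplotypes) / n   # frequency of allele B at locus 2
    p_ab = sum(h == "AB" for h in haplotypes) / n    # frequency of the A-B haplotype
    return p_ab - p_a * p_b


# Hypothetical gamete samples, for illustration only.
associated = ["AB", "AB", "ab", "ab"]    # A always travels with B
independent = ["AB", "Ab", "aB", "ab"]   # every combination equally often

print(disequilibrium(associated))    # 0.25
print(disequilibrium(independent))   # 0.0
```

Because D depends on the haplotype frequencies in the sample, the same pair of loci can show association in one population and none in another.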
We can test for genetic association in
families or in unrelated people.
Many genetic association studies are
performed as case control studies.
Information is gained when we can
combine evidence for genetic linkage
with evidence for genetic association.
How do we apply these
ideas?
Coronary Artery Disease
• Major cause of death and disability
throughout the world
• 12 million Americans have coronary
artery disease
– 7 million with myocardial infarctions
– 6.2 million with angina pectoris
• Well-defined risk factors: smoking,
high cholesterol, physical inactivity,
overweight, family history
What is the evidence that CAD is
genetic?
Family history is a strong risk factor.
Evidence for Genes in CAD
• Familial aggregation
– The clustering of affected
individuals in families
• Twin studies
– If one twin is affected, the other twin is affected
more often than by chance
– A monozygous (identical) twin has higher risk than
a dizygous twin (Marenberg et al. 1994)
– Relative risk to co-twins is increased at young age
– In old twins, rates are similar in MZ and DZ twins
Estimation of relative risk (λ) for CAD
• Shea et al. (1984): relative risk to sibs 2-3.9
– Controlled for known risk factors
– Risk to relatives higher in low risk factor group
– Suggests risk due to family history may be
independent of other known risk factors, especially at
young ages
• Risanen (1989) risk increased in first degree
relatives
– Risk to brothers <55 was 6.7, sisters <55 2.8
Characteristics of Familial CAD
• Many family members affected, especially female
relatives
• Early onset <55 in men; <65 in women
• Multi-vessel disease
• Multiple risk factors
• Refractory to conventional therapy
• Family history of related conditions (e.g., stroke,
diabetes, hypertension, cholesterol abnormalities)
Complex or Multifactorial Inheritance
[Figure: Family History, Smoking, Exercise, Age and Diet all contribute to DISEASE]
CAD is a multi-factorial condition
                     "good" genes          "bad" genes
"good" environment   low risk of CAD       medium risk of CAD
"bad" environment    medium risk of CAD    high risk of CAD
Genetics and Age of Onset
                     "good" genes           "bad" genes
"good" environment   CAD at very old age    CAD at young age
"bad" environment    CAD at average age     CAD at very young age
The GENECARD Study
• Goal: Identify genes predisposing to early onset CAD
• Use genome screen approach with a very large sample
size (950 families) to map genes for early-onset CAD
• Ascertain siblings with CAD verified by medical record
review
• Age of onset is the key feature
– Males < 50 at diagnosis
– Females < 55 at diagnosis
GENECARD Criteria
• Inclusion Criteria
– Men who have had coronary atherosclerotic heart disease
diagnosed at or before age 50, and women with a diagnosis at or
before age 55, using any of the following criteria
• Angina or myocardial ischemia
• Cardiac catheterization indicating a blockage in at least one vessel of
50% or greater
• An acute myocardial infarction diagnosed by enzymes or
electrocardiogram
• Unstable angina
• Coronary Artery Bypass Graft (CABG)
• Percutaneous Transluminal Coronary Angioplasty (PTCA)
GENECARD
• Exclusion Criteria
– Substance abuse in the absence of diagnosed
coronary stenosis
– Congenital heart disease
– History of chest irradiation
– End stage renal disease
– Myocarditis as a primary etiology of chest pain
GENECARD: Study Requirements
• Family history-at least 2 siblings with early CAD
• Blood sample
• Medical history with medical record confirmation
• Measurement of hips and waist
• Risk factor interview
• Measurement of blood pressure
Linkage Analysis
• Assume the affected people in the same family
have the disease because of the same gene.
• Idea: If the gene causing the disease in this
family is close to a genetic marker (linked), then
we should see less recombination than we would
expect under the hypothesis of no linkage.
• Genotype markers across the genome.
• Look for markers that are shared more often by
family members with CAD.
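The "less recombination than expected" idea is classically quantified with a LOD score. The sketch below is a simplified two-point version that assumes every meiosis can be scored unambiguously as recombinant or non-recombinant; it is a textbook illustration, not the method used in GENECARD, and the function name and counts are hypothetical.

```python
import math


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """log10 of the likelihood at recombination fraction theta vs. no linkage (theta = 0.5)."""
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must be between 0 and meioses")
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must be between 0 and 0.5")
    non_recombinants = meioses - recombinants
    likelihood_ratio = (theta ** recombinants) * ((1.0 - theta) ** non_recombinants) / (0.5 ** meioses)
    # theta = 0 is impossible once a recombinant has been observed.
    return math.log10(likelihood_ratio) if likelihood_ratio > 0 else float("-inf")


# Hypothetical example: 1 recombinant observed in 10 informative meioses.
for theta in (0.01, 0.1, 0.2, 0.5):
    print(f"theta = {theta:.2f}: LOD = {lod_score(1, 10, theta):.2f}")
```

The LOD is zero at theta = 0.5 by construction and peaks near the observed recombinant fraction; a maximum of 3 or more is the traditional threshold for declaring linkage in this simple setting.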
Several Intervals Are Linked to CAD
[Figure: genome-wide linkage results for chromosomes 1-22]
Hauser ER et al., Am J Hum Genet 2004 Sep;75(3):436-47
GENECARD: Chromosome 3
CAD can have different clinical
characteristics in different people.
What if we divide our families into
subsets based on presence or
absence of additional conditions:
Acute Coronary Syndrome, Diabetes,
Metabolic Syndrome.
GENECARD: Chromosome 1
Different facets of the disease may
have different genetic contributions.
It is often useful to consider disease
subtypes or other clinical covariates
to develop more genetically similar
sets of families.
CATHGEN: CAD Association Study
• Identify cases and controls from the Duke
Coronary Catheterization Lab
– Cases have significant atherosclerosis
– Controls have minimal atherosclerosis
• Genotype markers in regions of linkage
• Look for alleles that appear more often in cases
than in controls
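One simple way to compare allele frequencies between cases and controls is a Pearson chi-square test on a 2x2 table of allele counts, together with an odds ratio. The sketch below is a generic illustration, not the CATHGEN analysis; the function name and all counts are hypothetical.

```python
import math


def allele_association(case_a: int, case_other: int, control_a: int, control_other: int):
    """Pearson chi-square (1 df), p-value and odds ratio for a 2x2 table of allele counts."""
    table = [[case_a, case_other], [control_a, control_other]]
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_totals)
    if min(row_totals + col_totals) == 0 or case_other == 0 or control_a == 0:
        raise ValueError("table has an empty margin or a zero cell needed for the odds ratio")
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    odds_ratio = (case_a * control_other) / (case_other * control_a)
    return chi2, p_value, odds_ratio


# Hypothetical allele counts: allele "A" vs. any other allele.
chi2, p_value, odds_ratio = allele_association(120, 80, 90, 110)
print(f"chi-square = {chi2:.2f}, p = {p_value:.4f}, odds ratio = {odds_ratio:.2f}")
```

An odds ratio above 1 with a small p-value means the allele is over-represented in cases. When many markers are tested, as in a region-wide scan, the threshold for significance needs a multiple-testing correction.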
Preliminary Candidate Gene
Association Study in CATHGEN
[Figure: Genotype and Allele Comparisons. Y axis: Logist P-values (-log10), ticks from 0 to 4 including 1.3. X axis: Map Position (cM), 0 to 3500, spanning chromosomes 1-22 and X.]
Series plotted:
• Young Affecteds vs. Old Normals - Genotype
• Young Affecteds vs. Old Normals - Allele
• Young Affecteds vs. Old Affecteds - Genotype
• Young Affecteds vs. Old Affecteds - Allele
• Old Affecteds vs. Old Normals - Genotype
• Old Affecteds vs. Old Normals - Allele
• GC Affecteds vs. Old Normals - Genotype
• GC Affecteds vs. Old Normals - Allele
Conclusions
• Gene identification studies of complex disease
are fun, exciting and challenging.
• These studies require input from individuals with
different expertise: Clinicians, Epidemiologists
Molecular Biologists, Bioinformaticians,
Statisticians, Geneticists.
• The Human Genome Project has accelerated our
understanding of genetic architecture.
• Genes for complex disease will be discovered at
a fast rate.
• Next steps are studies that identify the gene
function as it relates to disease.
GENECARD Collaborators
DUCCS Network
William Kraus
Christopher Granger
Elaine Dowdy
Susan Estabrooks
Liling Huang
Stephanie Decker
Teresa Peace
Jerome Anderson
Sherry Jameson
Alan Bartel
Cathy Garvey
Paul Campbell
Janet Patterson
Brian Crenshaw
Teresa Schrader
Charlie Dennis
Kim DeRosa
James Heinsimer
Nancy Howald
William Herzog
Tania Geshoff
Micheal Hindman
Jennifer Kane
Mike Rotman
Virginia Remeny
Kent Salisbury
Dianne Oskins
Charise Patten
Alan Wiseman
Mary Duquette
Brent Muhlestein
Chloe Maycock
Sandra Reyna
Richard Goulah
Gina Kavanaugh
Sebastian Palmeri
Casey Casazza
Fred McNeer
Susan Marple
Jeff Michel
Steve Royal
Brian Hilbourn
Duke CHG
Elizabeth Hauser
Margaret Pericak-Vance
Jeffery Vance
Michael Hauser
Silke Schmidt
Margaret Jamison
Sandra West
Donny Asper
Kruti Desai
Jason Flor
Jason Gibson
Adam MacLaurin
George Ward
George Willis
Carol Haynes
Colette Blach
Rodney Jones
Lin Hu
International Network
David Crossman
Sheila Francis
Karen Eggleston
Jonathan Haines
Douglas Vaughn
Brendan McAdam
William Hillegas
Paula Clevenger
Chris Jones
Kath Roche
Vincent Mooser
Vincent Jomini
Nicolas Redondi
Bernhard Winkelmann
Glaxo-Smith-Kline
Julia Perry
Sanjay Sharma
Scott Sundseth
Lefkos Middleton
Allen Roses
Vincent Mooser