“Forward Genetics” and Toxicology

Role of Genetic Polymorphisms
in Responses to Toxic Agents
• Definitions
• “Forward genetics” and toxicology
• “Reverse genetics” and toxicology
• Genetic markers
• SNPs and their use in toxicology
• Ethical, Legal and Social Issues (ELSI)
“Toxicology is concerned with the interaction between xenobiotics
and biological molecules directly or indirectly coded in the DNA, and
can be regarded as a branch of GENETICS.”
Michael F.W. Festing (2001)
Gregor Mendel (1822 – 1884)
TERMINOLOGY
Gene: A sequence of DNA bases that encodes a protein
Allele: A sequence of DNA bases
Locus: Physical location of an allele on a chromosome
Linkage: Proximity of two alleles on a chromosome
Marker: An allele of known position on a chromosome
Distance: Number of base-pairs between two alleles
centiMorgan: Probabilistic distance of two alleles
Phenotype: An outward, observable character (trait)
Genotype: The internally coded, inheritable information
Penetrance: No. with phenotype / No. with allele
Modified from M.F. Ramoni, Harvard Medical School
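The penetrance ratio defined above can be illustrated with a short calculation. This is a minimal sketch; the function name and the cohort numbers are hypothetical and are not taken from the slides.

```python
def penetrance(n_with_phenotype: int, n_with_allele: int) -> float:
    """Fraction of allele carriers who actually show the phenotype."""
    if n_with_allele <= 0:
        raise ValueError("at least one carrier is required")
    if not 0 <= n_with_phenotype <= n_with_allele:
        raise ValueError("affected carriers must lie between 0 and the number of carriers")
    return n_with_phenotype / n_with_allele


# Hypothetical cohort: 200 carriers of an allele, 150 of whom show the trait.
example_penetrance = penetrance(150, 200)
print(example_penetrance)
```

A value of 1.0 would mean complete penetrance; anything lower means some carriers never show the trait.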
The 80s Revolution and
the Human Genome Project
Genetic Polymorphisms: naturally occurring DNA markers that
identify regions of the genome and vary among individuals
The intuition that polymorphisms could be used as markers sparked
the revolution
On February 12, 2001 the Human Genome
Project announced the completion of a first
draft of the human genome and declared:
“A SNP map promises to revolutionize both
mapping diseases and tracing human history”
SNPs are Single Nucleotide Polymorphisms – subtle
variations of the human genome across individuals
Modified from M.F. Ramoni, Harvard Medical School
DISTANCES ON A GENETIC MAP
• Physical distances between alleles are measured in base pairs
• But the recombination frequency is not constant along a chromosome
• A useful measure of distance is based on the
probability of recombination: the Morgan
• A distance of 1 centiMorgan (cM) between two alleles
means that they have a 1% chance of being separated
by recombination in a single meiosis
• A genetic distance of 1 cM is roughly equal to a
physical distance of 1 million base pairs (1Mb)
Modified from M.F. Ramoni, Harvard Medical School
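The relation between map distance and recombination probability can be sketched in a few lines. The linear rule (1 cM = 1% recombination) is the one stated on the slide. The Haldane map function is not mentioned in the slides; it is added here only as an assumed refinement for longer distances, where the linear rule overestimates recombination. Function names are illustrative.

```python
import math


def recombination_fraction_linear(distance_cm: float) -> float:
    """Slide's rule of thumb: 1 cM corresponds to a 1% chance of recombination."""
    return distance_cm / 100.0


def recombination_fraction_haldane(distance_cm: float) -> float:
    """Haldane map function (an assumption, not from the slides).

    r = 0.5 * (1 - exp(-2d)), with d expressed in Morgans.
    """
    d_morgans = distance_cm / 100.0
    return 0.5 * (1.0 - math.exp(-2.0 * d_morgans))


def approx_physical_distance_mb(distance_cm: float, mb_per_cm: float = 1.0) -> float:
    """Rough physical distance using the slide's 1 cM ~ 1 Mb approximation."""
    return distance_cm * mb_per_cm


for cm in (1, 10, 50):
    print(cm, recombination_fraction_linear(cm), round(recombination_fraction_haldane(cm), 4))
```

For short distances the two estimates nearly coincide; for long distances the Haldane value stays below 0.5, the fraction expected for unlinked loci.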
MORE TERMINOLOGY
Physical Maps: maps in base-pairs
Human physical map: 3000Mb (Mega-bases)
Genetic Maps: maps in centiMorgan
Human Male Map Length: 2851cM
Human Female Map Length: 4296cM
Correspondence between maps:
Male cM ~ 1.05 Mb; Female cM ~ 0.88Mb
Modified from M.F. Ramoni, Harvard Medical School
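The correspondence between the two kinds of map can be checked by dividing the physical map length by each genetic map length. The sketch below uses only the figures quoted on this slide; variable names are illustrative. With these inputs the male value comes out near 1.05 Mb per cM, while the female value comes out near 0.70 Mb per cM rather than the quoted 0.88, so the quoted female figure presumably rests on a different estimate.

```python
PHYSICAL_MAP_MB = 3000                             # human physical map, from the slide
MAP_LENGTH_CM = {"male": 2851, "female": 4296}     # genetic map lengths, from the slide

# Average physical distance covered by one centiMorgan, per sex.
mb_per_cm = {sex: PHYSICAL_MAP_MB / length for sex, length in MAP_LENGTH_CM.items()}

for sex, value in mb_per_cm.items():
    print(f"{sex}: 1 cM ~ {value:.2f} Mb")
```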
Simple and Complex Traits
Single Gene (Mendelian) diseases:
Autosomal dominant (Huntington)
Autosomal recessive (Cystic Fibrosis)
X-linked dominant (Rett)
X-linked recessive (Lesch-Nyhan)
Today, over 400 single-gene diseases have been identified
Problem: traits don’t always follow single-gene models
Complex Trait: phenotype/genotype interaction
Multiple cause: multiple genes in several loci determine
a phenotype in conjunction with non-genetic factors
(accidents of development, social factors, environment,
infections, other factors)
Multiple effect: gene causes more than one phenotype
Modified from M.F. Ramoni, Harvard Medical School
Genetic Markers
Even though we share most DNA, there are variations (polymorphisms)
Polymorphic: two or more forms of the same gene, or genetic marker
exist with each form being too common in a population to be merely
attributable to a new mutation
Classes of polymorphic genetic markers:
Single Nucleotide Polymorphisms (SNP): single base differences in population
Microsatellites: short tandem repeat (e.g. GATA, 2 – 6 bp long)
Minisatellites: simple sequence repeats (10 – 40 bp long)
Variable Number of Tandem Repeats: the number of repeats may vary
Restriction Fragment Length Polymorphisms: presence/absence of a site
Deletions, Duplications, Insertions: alterations on a chromosome level
Complex haplotypes: combinations of the above
Genetic Markers
Coding:
Single Nucleotide Polymorphisms
Restriction Fragment Length Polymorphisms
Deletions, Duplications, Insertions
Non-coding:
Microsatellites
Minisatellites
Variable Number of Tandem Repeats
Restriction Fragment Length Polymorphisms
Single Nucleotide Polymorphisms
Deletions, Duplications, Insertions
Genetic Markers
• Polymorphisms (allelic variations) are essential to:
– Study inheritance patterns
– Map phenotypes and anchor genes to the genetic map by cosegregation analysis
– Determine change in function: resistant/sensitive populations
• Genetically determined variability among humans is due to
a difference in 0.1% of the genomic sequence!
• Polymorphisms can be silent, or be exhibited at levels of:
– Morphology
– Protein
– DNA
Chromosomal rearrangements:
Deletions, Duplications, Insertions
Deletions: a certain part is lost, for example abc → ac
Insertions: a part is added, for example ac → abc
Duplications: can be tandem, for example abc → abbc, or not, for example abc → abcabc
Reversals: a part is turned around, head to tail, for example abc → cba
Transpositions: two parts change places, for example abcd → acbd
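The letter-string examples of rearrangements can be reproduced with simple string slicing. This is a toy sketch on letters standing for chromosome segments; the function names are hypothetical, and the reversal here flips segment order only (it is not a reverse complement of DNA bases).

```python
def delete(seq: str, start: int, end: int) -> str:
    """Remove seq[start:end]."""
    return seq[:start] + seq[end:]


def insert(seq: str, pos: int, segment: str) -> str:
    """Add a segment at position pos."""
    return seq[:pos] + segment + seq[pos:]


def duplicate(seq: str, start: int, end: int) -> str:
    """Copy seq[start:end] and place the copy directly after the original."""
    return seq[:end] + seq[start:end] + seq[end:]


def reverse(seq: str, start: int, end: int) -> str:
    """Turn seq[start:end] around, head to tail."""
    return seq[:start] + seq[start:end][::-1] + seq[end:]


def transpose(seq: str, a: int, b: int, c: int) -> str:
    """Swap the adjacent parts seq[a:b] and seq[b:c]."""
    return seq[:a] + seq[b:c] + seq[a:b] + seq[c:]


print(delete("abc", 1, 2))           # abc -> ac
print(insert("ac", 1, "b"))          # ac -> abc
print(duplicate("abc", 1, 2))        # abc -> abbc
print(duplicate("abc", 0, 3))        # abc -> abcabc
print(reverse("abc", 0, 3))          # abc -> cba
print(transpose("abcd", 1, 2, 3))    # abcd -> acbd
```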
Copy Number Variability (CNVs)
• CNVs are DNA segments of 1 kb or larger with a variable number of copies in comparison with a reference
genome. CNVs can have dramatic phenotypic consequences as a result of altering gene dosage, disrupting
coding sequences, or perturbing long-range gene regulation.
• There are several well-known examples of CNVs, including CYP2A6, CYP2D6, GSTM1, GSTT1, SULT1A1,
SULT1A3, UGT2B17, and also the nearby UGT2B7, UGT2B10 and UGT2B11 genes. All these genes are
deleted at a relatively high frequency in at least one ethnic group. In addition, CYP2A6, CYP2D6,
SULT1A1 and SULT1A3 can also present duplications and even multiduplications.
Pharmacology & Therapeutics 116, Issue 3, 2007, Pages 496–526
Minisatellites
• Original DNA fingerprinting technique
• Relies on stretches of tandemly repeated sequences
(usually 15 - 100bp)
• Alleles show high variability in numbers of repeats
Genotyping using minisatellites:
• Digest genomic DNA
• Run out on gel
• Southern blot and probe with radiolabelled repeat DNA
• Individuals appear with a set of bands unique to them,
although each band is shared with one of their parents
Microsatellites
• Number of repeats varies greatly between individuals
• Make up to 10-15% of the mammalian genome
• Believed to have no function
• Have high mutation rates
• Used in forensic analysis
• Can be amplified by PCR – fragments that are generated
have different length due to different number of repeats
Microsatellites are highly polymorphic due to potential for
“skipping” during DNA replication
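Why PCR fragments differ in length between microsatellite alleles can be shown with a small sketch. The GATA motif comes from the earlier marker list; the flanking sequences, repeat counts, and function names below are hypothetical.

```python
import re


def longest_tandem_run(sequence: str, motif: str) -> int:
    """Largest number of consecutive copies of motif found in sequence."""
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = [len(match.group(0)) // len(motif) for match in pattern.finditer(sequence)]
    return max(runs, default=0)


# Two hypothetical alleles: same flanking sequence, different repeat number.
FLANK_5, FLANK_3, MOTIF = "CCTG", "TTCA", "GATA"
allele_a = FLANK_5 + MOTIF * 8 + FLANK_3
allele_b = FLANK_5 + MOTIF * 11 + FLANK_3

repeats_a = longest_tandem_run(allele_a, MOTIF)
repeats_b = longest_tandem_run(allele_b, MOTIF)

# Primers sit in the flanks, so amplicon length tracks the repeat number.
print(repeats_a, len(allele_a))
print(repeats_b, len(allele_b))
```

Each extra repeat adds one motif length (4 bp here) to the amplified fragment, which is what a sizing gel or capillary run resolves.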
Restriction Fragment Length Polymorphisms (RFLPs)
• Consider two alleles having slightly different sequences
GAATTC
CTTAAG
GCATTC
CGTAAG
EcoRI will cut the first but not the second
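The RFLP idea can be sketched by searching each allele for the EcoRI recognition site GAATTC and listing the fragment lengths a digest would give. EcoRI cuts between G and AATTC on the strand shown. The flanking bases and function names are hypothetical; only the six-base sites come from the slide.

```python
ECORI_SITE = "GAATTC"
ECORI_CUT_OFFSET = 1  # EcoRI cuts after the first G: G^AATTC


def cut_positions(sequence: str, site: str = ECORI_SITE, offset: int = ECORI_CUT_OFFSET) -> list:
    """Indices at which the top strand is cut."""
    positions = []
    index = sequence.find(site)
    while index != -1:
        positions.append(index + offset)
        index = sequence.find(site, index + 1)
    return positions


def fragment_lengths(sequence: str) -> list:
    """Lengths of the fragments produced by a complete digest."""
    bounds = [0] + cut_positions(sequence) + [len(sequence)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Hypothetical flanks around the two six-base sequences shown on the slide.
allele_1 = "TTAGC" + "GAATTC" + "CGGATA"   # contains the EcoRI site
allele_2 = "TTAGC" + "GCATTC" + "CGGATA"   # single-base change destroys the site

print(fragment_lengths(allele_1))  # two fragments
print(fragment_lengths(allele_2))  # one uncut fragment
```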
Single nucleotide polymorphisms (SNPs)
Variations of a single base between individuals:
The most common form of genetic variation in humans
Thought to be a major cause of genetic diversity among
different individuals in drug response, disease susceptibility...
A SNP must occur in at least 1% of the population
Occur every 500-1000 bp
About 50,000 – 100,000 SNPs in coding sequences
SNPs are classified by where they occur and what they change:
cSNP: SNP occurring in a coding region
rSNP: SNP occurring in a regulatory region
sSNP: Coding SNP with no change on amino acid
Modified from M.F. Ramoni, Harvard Medical School
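A quick back-of-the-envelope check of the SNP density figure: if a SNP occurs every 500-1000 bp and the physical map is about 3000 Mb (the figure quoted earlier in these slides), the genome-wide count is in the millions. Variable names are illustrative.

```python
GENOME_BP = 3000 * 10**6          # ~3000 Mb physical map, from the earlier slide
SPACING_BP = (500, 1000)          # "occur every 500-1000 bp"

# Expected number of SNPs for each spacing assumption.
expected_snps = {spacing: GENOME_BP // spacing for spacing in SPACING_BP}

for spacing, count in expected_snps.items():
    print(f"one SNP per {spacing} bp -> about {count:,} SNPs")
```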
Single nucleotide polymorphisms (SNPs)
• Two bases (one per chromosome) for each locus
• Because of the A-T C-G complement, a SNP can have
only two variants: (AT) or (CG)
• A SNP is a variable with two states:
Major allele: Allele (AT) or (CG) more frequent
Minor allele: Allele (AT) or (CG) less frequent
• An individual can be, for each polymorphic locus:
Homozygous on major allele
Heterozygous on major/minor allele
Homozygous on minor allele
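The three genotype states can be derived from raw genotype calls once the major and minor alleles are known. The sketch below estimates allele frequencies from a small hypothetical sample and labels each individual; the sample, allele letters, and function names are assumptions for illustration.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Return {allele: frequency} from two-letter genotype strings such as 'AG'."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def classify(genotype, major, minor):
    """Label one individual's genotype at a biallelic SNP."""
    if len(genotype) != 2:
        raise ValueError("a genotype needs exactly two alleles")
    alleles = set(genotype)
    if alleles == {major}:
        return "homozygous major"
    if alleles == {minor}:
        return "homozygous minor"
    if alleles == {major, minor}:
        return "heterozygous"
    raise ValueError(f"unexpected allele in genotype {genotype!r}")


# Hypothetical genotype calls for six individuals at one SNP.
sample = ["AA", "AG", "AA", "GG", "AG", "AA"]
freqs = allele_frequencies(sample)
major = max(freqs, key=freqs.get)
minor = min(freqs, key=freqs.get)
calls = [classify(g, major, minor) for g in sample]

print(freqs)
print(calls)
```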
The role of SNP analysis through all stages of
drug development
Target Identification: disease association studies identify SNPs in candidate genes.
The proteins encoded by such genes may represent novel
drug targets
Target Validation:
population analysis determines the level of variation within a
candidate gene. The presence of several SNPs will generate a
large number of potential variants and such candidates can be
eliminated
Lead Identification: screens can be developed to identify lead compounds that
interact with each variant of the drug target
Lead Validation:
biological assays can be performed that incorporate different
lead compounds and all variants of the target protein
Lead Optimization: knowledge of polymorphisms affecting the target can be used
to develop drugs that work more efficiently over a broader
group of patients or to identify drugs that work more efficiently
in specific genotypes
Preclinical Testing: animal models can be developed incorporating all known
variants of the target to provide more accurate predictions of
drug efficacy in humans
Clinical Trials:
trials can be carried out with groups of patients selected on
the basis of genotype, to specifically test for adverse drug
reactions at particular doses
SNP discovery and SNP genotyping
SNP discovery: detection of novel polymorphisms
• DNA sequencing
• In silico: comparing the sequences of genomic clones or ESTs
deposited in public and proprietary databases
• Single strand conformational polymorphisms
SNP genotyping: identification of specific alleles in a known
polymorphism
1. Allele discrimination:
allele-specific PCR, allele-specific single-base primer extension (minisequencing), allele-specific ligation, allele-specific enzymatic cleavage, etc.
2. Presence of allele(s) of interest in a given DNA sample:
Fluorescence detection, fluorescence resonance energy transfer, fluorescence
polarization, mass spectrometry, etc.
See details in: Twyman RM & Primrose SB, Techniques patents for SNP genotyping. Pharmacogenomics 4:67-79 (2003)
Toxicology ≈ Genetics
There is substantial polymorphism in genes that determine the
response to xenobiotics both in humans and animals
This has important implications for toxicology and pharmacology:
• adverse reactions to drugs cause thousands of deaths each year and
many of those are associated with susceptible phenotypes
• are we protecting the most sensitive in human population when
occupational/environmental limits of exposure are established?
• how to account for strain differences in susceptibility in animal studies
(1000-fold differences have been reported for TCDD LD50 in rats)?
• genotyping of individuals from a sample of blood DNA is becoming
increasingly easy so it is possible to genotype people for loci that are
thought to control susceptibility to certain drugs/xenobiotics
Adapted, in part, from M.F.W. Festing, Tox. Lett. 120:293-300 (2001)
…loci that are thought to control susceptibility to
certain drugs/xenobiotics:
Before we can correctly interpret genotyping results we need to:
• gain a much better understanding of the genetics of susceptibility
• know the mode of action of xenobiotics
Problem: relatively little research is done on the genetics of
susceptibility and toxicologists in general seem to be
unaware of the extent of genetic variation in response
among the experimental animals that are being used
Problem: modes of action of an overwhelming majority of
established toxic substances are still largely
unknown (not even worth mentioning scores of
compounds that are being newly developed)
Adapted, in part, from M.F.W. Festing, Tox. Lett. 120:293-300 (2001)
Genotype-Phenotype Interactions in
Complex Biological Systems
[Figure: genotype-phenotype interactions, modulated by age and environment. Adapted from: Huang, 2002]
“The classical interaction of exposure with phase I and phase II XME metabolism, and risk of
developing cancer. High exposure to a foreign chemical, combined with rapid metabolic activation
and slow conjugation, should put an individual at a high risk of developing cancer. Low or negligible
exposure, in combination with slow rates of activation and rapid rates of conjugation, should lead
to a low risk of developing environmentally caused cancer.”
[Figure: aromatic amines and heterocyclic amines undergo N-oxidation and then O-acetylation to reactive metabolites (acetoxy-derivatives), which lead to cancer. Individuals are classified as rapid, intermediate, or slow acetylators.]
From: Hulla et al. Toxicol. Sci. (1999)
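The three acetylator classes can be read as a count of "rapid" alleles at one locus. The sketch below assumes the common convention that two rapid alleles give a rapid acetylator, one gives an intermediate acetylator, and none gives a slow acetylator; the slide itself lists only the three phenotypes, and the allele labels and function name are hypothetical.

```python
PHENOTYPE_BY_RAPID_ALLELES = {
    2: "rapid acetylator",
    1: "intermediate acetylator",
    0: "slow acetylator",
}


def acetylator_phenotype(allele_1: str, allele_2: str) -> str:
    """Map a pair of alleles, each labelled 'rapid' or 'slow', to a phenotype class."""
    alleles = [allele_1, allele_2]
    if any(allele not in ("rapid", "slow") for allele in alleles):
        raise ValueError("alleles must be labelled 'rapid' or 'slow'")
    return PHENOTYPE_BY_RAPID_ALLELES[alleles.count("rapid")]


print(acetylator_phenotype("rapid", "rapid"))
print(acetylator_phenotype("rapid", "slow"))
print(acetylator_phenotype("slow", "slow"))
```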
NAT1 and NAT2
J Cancer Res Clin Oncol. 2011 Nov;137(11):1661-7
Lung Cancer. 2011 Aug;73(2):153-7.
Survival in women with epithelial ovarian cancer
From: Introduction to Biochemical Toxicology 3rd Edition (2001) p. 128
From: Strange et al. Toxc. Lett. (2000)
• Several GST gene families have been identified
• Null-phenotypes are detoxification-deficient and more likely to suffer
formation of carcinogen-DNA adducts and/or mutations
• In general, GSTM1- and GSTT1-null are considered high-risk
Genetics in Toxicology
“Forward Genetics”: Phenotype (e.g., toxic symptoms, cancer) → studying mechanisms of action → genes that control susceptibility/resistance
“Reverse Genetics”: Genotype (gene knockout, polymorphism, etc.) → studying mechanisms of action → phenotype
Adapted, in part, from M.F.W. Festing, Tox. Lett. 120:293-300 (2001)
“Forward Genetics” and Toxicology
Different animal strains nearly always respond differently to the
same agent/dose unless the toxic insult is so dramatic that all the
animals die very quickly
Examples of strain differences (rats) in response to xenobiotics:
3,2’-dimethyl-4-aminobiphenyl → prostate tumors
48% F344, 41% ACI, 13% LEW, 7% CD, 0% Wistar
N-methyl-N’-nitro-N-nitrosoguanidine (MNNG) → stomach adenocarcinomas
67% WKY, 60% S-D, 53% LEW, 23% Wistar, 6% F344
There is no such thing as an “animal strain that is particularly
susceptible/resistant to carcinogenesis” !
Adapted, in part, from M.F.W. Festing, Tox. Lett. 120:293-300 (2001)
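The point that no strain is universally susceptible can be checked directly from the two incidence lists above. This sketch ranks the strains that appear in both lists; the strain labels are kept exactly as given (CD and S-D are not merged), and variable names are illustrative.

```python
# Tumor incidence (%) by rat strain, as quoted on the slide.
prostate_incidence = {"F344": 48, "ACI": 41, "LEW": 13, "CD": 7, "Wistar": 0}
stomach_incidence = {"WKY": 67, "S-D": 60, "LEW": 53, "Wistar": 23, "F344": 6}

# Only strains tested with both carcinogens can be compared.
shared = set(prostate_incidence) & set(stomach_incidence)

rank_prostate = sorted(shared, key=prostate_incidence.get, reverse=True)
rank_stomach = sorted(shared, key=stomach_incidence.get, reverse=True)

print("prostate tumors:", rank_prostate)
print("stomach adenocarcinomas:", rank_stomach)
```

F344 is the most susceptible of the shared strains for one agent and the least susceptible for the other, so susceptibility depends on the agent as well as the strain.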
Current Approach:
Animal studies → Human population
Single genome-based risk prediction: CD-1 (Pharmaceutical Industry), B6C3F1 (National Toxicology Program)
→ Genetically Diverse Human Population
“Forward Genetics” and Toxicology
Designing an IDEAL “forward genetics” animal study for
investigating genetic variability in response to a toxic agent:
• Survey the known facts about susceptibility in different strains of rodents
• Small numbers of animals (4-6 per strain) of several strains should be
used to characterize the response to the toxic agent “X”
• At least 5 strains should be studied
• Dose levels should be selected to elicit a suitable response
• Endpoints should be quantitative (e.g., number of tumors)
Adapted, in part, from M.F.W. Festing, Tox. Lett. 120:293-300 (2001)
Parental strains and derivation of five major
types of mouse genetic resources
Each of the sequenced strains is shown in a
different color depending on the origin. The four
wild-derived strains, denoted by asterisks, are
CAST/EiJ (M. m. castaneus) in red, PWD/PhJ (M. m.
musculus) in blue, MOLF/EiJ (M. m. molossinus) in
purple, and WSB/EiJ (M. m. domesticus) in green.
The remaining 12 classical laboratory strains are
shown in green reflecting the predominant
contribution of the M. m. domesticus subspecies to
these strains. The shade of green denotes the
different origin of the classical strains, with the
darker shades denoting strains of Swiss origin
(FVB/NJ and NOD/LtJ), the yellow-green denoting a
strain of Asian origin (KK/HlJ), and intermediate
shade denoting Castle or C57-related strains
(129S1/SvImJ, A/J, AKR/J, BALB/cBy, C3H/HeJ,
DBA/2J, BTBR T+tf/J, and NZW/LacJ).
The figure also shows schematically the
derivation process for five types of resources,
recombinant inbred lines (BXD);
chromosome substitution strains (B.P),
Collaborative Cross (CC), heterogeneous
stocks (Northport HS), and laboratory strain
diversity panel (LSDP)
Mamm Genome. 2007 July; 18(6): 473–481
Recombinant inbred strains (RIs)
[Figure: derivation of the BXD recombinant inbred strain set. A fully inbred C57BL/6J (B) female is crossed with a fully inbred DBA/2J (D) male. The F1 animals are isogenic and the F2 animals are heterogeneous. About 20 generations of brother-sister matings then yield inbred, isogenic strains: BXD1, BXD2, ... BXD80. Recombined chromosomes are needed for mapping.]
Image Credit: genenetwork.org
• Once susceptible and resistant strains have been identified, loci can be mapped
• In mice, Recombinant Inbred strains (susceptible x resistant) can be
generated
• A set of RI strains can be tested for the susceptibility to agent “X”
• Once the phenotype has been established, mice can be genotyped to
determine which loci segregated with susceptibility/resistance
From Zhou et al. (2005)
Problems:
• large number of animals (100-300, or more)
• resolution of the genetic mapping is only about ± 20 cM (mouse
genome is ~50K genes and 1900 cM → 1 cM ≈ 0.5 Mb), so the
identified locus can contain ~500 genes
Adapted, in part, from M.F.W. Festing, Tox. Lett. 120:293-300 (2001)
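The gene-count estimate for a mapped locus follows from the figures on this slide. The sketch below assumes genes are spread evenly along the genetic map, which is a simplification; names are illustrative. A 20 cM window gives roughly 500 genes, matching the slide, and a full ± 20 cM interval (40 cM) would hold about twice as many.

```python
MOUSE_GENES = 50_000      # ~50K genes, from the slide
MOUSE_MAP_CM = 1900       # genetic map length, from the slide

genes_per_cm = MOUSE_GENES / MOUSE_MAP_CM


def genes_in_window(window_cm: float) -> int:
    """Expected number of genes in a window, assuming uniform gene density."""
    return round(genes_per_cm * window_cm)


print(round(genes_per_cm, 1))
print(genes_in_window(20))
print(genes_in_window(40))
```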
“Collaborative Cross” The Resource for Forward Genetics Research
Images from Threadgill DW
Single Strain: Constant Genotype
Control, 10 mg, 25 mg, 50 mg, 100 mg
Vary the environment (e.g., treatment)
Many Strains: Varied Genotype
Strain 1, Strain 2, Strain 3, Strain 4, Strain 5, Strain 6, Strain 7
Fix the environment (same treatment), vary the genotype
Total mouse SNPs = ~40M (M.m.musculus, M.m.domesticus, M.m.castaneus)
Total human SNPs = ~20M
Profiling Liver Toxicity to APAP in a Genetically Diverse Population
Dose response to liver injury: ALT (24 h)
Multi-strain profiling of APAP-induced liver injury:
% liver necrosis (24h), reduced GSH (4h), ALT (24h), ALT (4h)
Dose response to liver injury (4 h) vs survival (24 h)
“Reverse Genetics” and Toxicology
A knockout or over-expressor animal strain, or animals with a
known polymorphism(s) in important genetic regions
Dose with a chemical(s)
Evaluate the phenotype
Looks MUCH easier than “Forward Genetics” experiment! Let’s do it!
Problems:
• If a mutant to non-mutant comparison is being made, the genetic
backgrounds MUST be identical!
• If the strains have been crossed, care is needed to ensure that
the observed differences are not due to a gene closely linked to
the gene of interest
• Genes do not act alone! Several alleles may be important, their
effects can be additive or epistatic
Adapted, in part, from M.F.W. Festing, Tox. Lett. 120:293-300 (2001)
PPARa (+/+)
+ WY-14,643 (11 months)
PPARa (-/-)
+ WY-14,643 (11 months)
Peters et al., Carcinogenesis, 1997
Peroxisome Proliferators: Species Differences
• Mouse and rat: highly responsive
• Marmoset: does not respond
• Guinea pig: no peroxisome proliferation, but have hypolipidaemia
• Humans: believed to be unresponsive, but have hypolipidaemia
• PPARα exists in mouse, rat, guinea pig and human
• In humans:
– Lower hepatic levels of PPARα
– Lower ligand binding activity
– Different structure (polymorphisms)
– Different PP Response Elements in DNA
– Presence of competing proteins for PPRE
– Expression of dominant-negative form of PPARα
across mouse inbred strains
Palmer et al., Molecular Pharmacology, 1998
Untreated (6, 24 hr in culture)
Activation of PPARα in mouse and
human hepatocytes
Treated (6, 24 hr in culture)
Limited overlap in response to Wy-14,643 at individual
gene level but major overlap at pathway level
Upregulated genes
Wy-14,643 treatment causes major
changes in gene expression in human
and mouse hepatocytes
Gene Ontology
Downregulated genes
Gene Set Analysis
Well-studied genetic variants in human disease
From Taylor et al. Trends Mol Med 7:507-512 (2001)
Most drug-metabolizing enzymes exhibit clinically relevant genetic polymorphisms. Essentially all of the major
human enzymes responsible for modification of functional groups [phase I reactions (left)] or conjugation with
endogenous substituents [phase II reactions (right)] exhibit common polymorphisms at the genomic level;
those enzyme polymorphisms that have already been associated with changes in drug effects are separated
from the corresponding pie charts. The percentage of phase I and phase II metabolism of drugs that each
enzyme contributes is estimated by the relative size of each section of the corresponding chart. ADH, alcohol
dehydrogenase; ALDH, aldehyde dehydrogenase; CYP, cytochrome P450; DPD, dihydropyrimidine
dehydrogenase; NQO1, NADPH:quinone oxidoreductase or DT diaphorase; COMT, catechol O-methyltransferase; GST, glutathione S-transferase; HMT, histamine methyltransferase; NAT, N-acetyltransferase; STs, sulfotransferases; TPMT, thiopurine methyltransferase; UGTs, uridine 5'-triphosphate
glucuronosyltransferases. From Evans WE and Relling MV Science 286:487 (1999).
Cytochrome P450 genotyping
From: Flockhart DA and Webb DJ. Lancet (1998)
FDA OKs Genetic Test Linked to Warfarin
Sep 17 2007
WASHINGTON (AP) - A genetic test that can reveal what patients are especially sensitive to the blood-thinner warfarin won
federal approval Monday. Such screenings could prevent thousands of complications each year, health officials estimate.
The approval of the test comes a month after warfarin, sold under the brand name Coumadin and in generic forms, became
the first widely used drug to include genetic testing information on its label. The information can help doctors
determine how best to prescribe the drug.
An estimated one-third of patients process the drug differently than do most others, exposing them to a higher risk of
bleeding. Research suggests that most of that sensitivity is due to variations in two genes. The new test, made by Nanosphere
Inc. of Northbrook, Ill., can detect some of those variants.
One of the genes produces an enzyme that helps the body metabolize warfarin and other medicines; the second produces the
blood-clotting protein that warfarin blocks.
Human Cytochrome P450 2C9
with bound Warfarin
Nature 424, 464-468 (2003)
Image Source: www.pharmgkb.org
POPULATION-BASED GWAS AND TOXICOLOGY:
DRUG-INDUCED ADVERSE EFFECT STUDIES
866,399 markers
51 cases of flucloxacillin DILI
282 matched controls
Daly et al. HLA-B*5701 genotype is a major determinant of drug-induced
liver injury due to flucloxacillin. Nat Genet. 2009 Jul;41(7):816-9.
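With this many markers, multiple testing dominates the analysis. The slide does not say how significance was judged; one common convention, assumed here, is a Bonferroni correction that divides the overall error rate by the number of markers tested. The 0.05 level and the variable names are assumptions; the marker, case, and control counts come from the slide.

```python
N_MARKERS = 866_399       # markers tested, from the slide
N_CASES = 51              # flucloxacillin DILI cases, from the slide
N_CONTROLS = 282          # matched controls, from the slide
FAMILY_WISE_ALPHA = 0.05  # assumed overall error rate

# Per-marker p-value threshold under a Bonferroni correction.
bonferroni_threshold = FAMILY_WISE_ALPHA / N_MARKERS
controls_per_case = N_CONTROLS / N_CASES

print(f"per-marker threshold: {bonferroni_threshold:.2e}")
print(f"controls per case: {controls_per_case:.1f}")
```

A threshold this strict means that, with only 51 cases, only variants with very large effects can reach genome-wide significance.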
The FDA Abacavir Warning (July 24, 2008)
Abacavir (marketed as Ziagen) and Abacavir-containing Medications
FDA reviewed data from two studies that support a recommendation for pre-therapy
screening for the presence of the HLA-B*5701 allele and the selection of alternative therapy
in positive subjects.
Genetic tests for HLA-B*5701 are available and all patients should be screened for the HLA-B*5701 allele before starting or restarting treatment with abacavir or abacavir-containing
medications. Development of clinically suspected abacavir HSR requires immediate and
permanent discontinuation of abacavir therapy in all patients, including patients negative for
HLA-B*5701.
Genomenewsnetworks.org
The genomes of more than 180 organisms have been sequenced since 1995. The Quick Guide includes
descriptions of these organisms and has links to sequencing centers and scientific abstracts.
Ultra High Throughput Sequencing – Towards the “$1,000 Genome”
Illumina® “SOLEXA” Genome Analyzer
Roche® 454 Genome Sequencer
Seqanswer.com
Roche.com & Nature Biotechnology
Illumina.com
DNA Sequencing
Transcriptome
analysis
Gene regulation
and control
Ultra High Throughput Sequencing – Enabling GWAS Studies
amateurbrainsurgery.com
Genome-wide plots of available GWAS results for all
associations P = 0.0001. (BMC Medical Genetics 2009)
compgen.unc.edu
www.niehs.nih.gov/crg/
ornl.gov
From: Strange et al. Toxc. Lett. (2000)
Toxicogenetics: what’s next?
Goal:
When we find all polymorphisms in genes important
for metabolism/detoxification of xenobiotics, we can
link them to particular drug or chemical toxicity and
identify susceptible populations
Problem: Simple research questions generate erroneous
results (e.g. CYP2D6 polymorphisms and lung
cancer, CYP2E1 polymorphisms and alcoholic liver
disease)
Problem: Biological complexity of mechanisms, ethnic variation,
clinical heterogeneity, etc. → both positive and
negative results are true?
Linking complex trait diseases to genetic polymorphisms requires (Todd, 1999):
• large sample sizes and small p-values
• Initial study + several replications
• Genetic associations should make biological sense
• Physiologically meaningful data should support a functional role of the
polymorphism in question
Ethical, Legal and Social Issues in toxicogenetics are as
complex as the studies of polymorphisms themselves
http://genomics.unc.edu/articles/elsi_article.htm