Case Study #1 Use of bioinformatics in drug development

Use of bioinformatics in drug
development and diagnostics
Bringing a New Drug to Market
Discovery and preclinical testing: Compounds are identified and evaluated in laboratory and animal studies for safety, biological activity, and formulation. (5,000 compounds evaluated)
Phase I: Evaluates safety and dosage in 20 to 100 healthy human volunteers. (5 compounds enter clinical trials)
Phase II: Assesses effectiveness and looks for side effects in 100 to 500 patient volunteers.
Phase III: Confirms effectiveness and monitors adverse reactions from long-term use in 1,000 to 5,000 patient volunteers.
Review and approval by Food & Drug Administration. (1 compound approved)
[Figure: development timeline; x-axis in years, 0 to 16]
Source: Tufts Center for the Study of Drug Development
Biological Research in 21st
Century
“The new paradigm, now emerging, is that all the 'genes' will be known (in the sense of being resident in databases available electronically), and that the starting point of a biological investigation will be theoretical.”
- Walter Gilbert
Rational Approach to
Drug Discovery
Identify target
Clone gene encoding target
Express target in recombinant form
Crystal structures of target and target/inhibitor complexes
Synthesize modifications of lead compounds
Screen recombinant target with available inhibitors
Identify lead compounds
Synthesize modifications of lead compounds
Identify lead compounds
Toxicity & pharmacokinetic studies
Preclinical trials
An Ideal Target
• Is generally an enzyme/receptor in a pathway whose inhibition leads either to killing a pathogenic organism (e.g., the malarial parasite) or to modifying some aspect of body metabolism that is functioning abnormally.
• An ideal target…
– Is essential for the survival of the organism.
– Is located at a critical step in the metabolic pathway.
– Makes the organism vulnerable.
– Has a low concentration of target gene product.
– Is an enzyme amenable to simple HTS assays.
How Bioinformatics can help in
Target Identification?
• Homologous & Orthologous genes
• Gene Order
• Gene Clusters
• Molecular Pathways & Wire diagrams
• Gene Ontology
Identification of Unique Genes of Parasite
as potential drug target.
Comparative Genomics
Malarial Parasites: Source for
identification of new target molecules.
• Genome comparisons of malarial parasites of
human.
• Genome comparisons of malarial parasites of
human and rodent.
• Comparison of genomes of –
– Human
– Malarial parasite
– Mosquito
What one should look for?
[Figure: Venn diagram of three proteomes: Human, P.f., Mosquito]
Proteins that are shared by –
• All genomes
• Exclusively by Human & P.f.
• Exclusively by Human & Mosquito
• Exclusively by P.f. & Mosquito
Unique proteins in –
• Human
• P.f. (targets for anti-malarial drugs)
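The three-way comparison above amounts to set operations on protein families. The following is a minimal sketch, assuming each proteome has already been reduced to a set of family identifiers (for example by an orthology clustering step that the slides do not specify); the function name and the identifiers are hypothetical.

```python
def partition_proteomes(human, parasite, mosquito):
    """Split protein-family IDs into the seven regions of a three-way Venn diagram."""
    human, parasite, mosquito = set(human), set(parasite), set(mosquito)
    return {
        "all_three": human & parasite & mosquito,
        "human_parasite_only": (human & parasite) - mosquito,
        "human_mosquito_only": (human & mosquito) - parasite,
        "parasite_mosquito_only": (parasite & mosquito) - human,
        "human_unique": human - parasite - mosquito,
        "parasite_unique": parasite - human - mosquito,
        "mosquito_unique": mosquito - human - parasite,
    }


# Hypothetical family identifiers, for illustration only.
human = {"fam1", "fam2", "fam3", "fam5"}
parasite = {"fam1", "fam2", "fam4", "fam6"}
mosquito = {"fam1", "fam3", "fam4", "fam7"}

regions = partition_proteomes(human, parasite, mosquito)
# Families found only in the parasite are the candidate drug targets.
print(sorted(regions["parasite_unique"]))
```

In practice the hard part is deciding when two proteins belong to the same family; the set arithmetic is only the last step.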
Impact of Structural Genomics on
Drug Discovery
Dry, S. et al. (2000) Nat. Struct. Biol. 7:946-949.
Drug Development Flowchart
• Check if structure is known
• If unknown, model it using KNOWLEDGE-BASED HOMOLOGY MODELING APPROACH.
• Search for small molecules/ inhibitors
• Structure-based Drug Design
• Drug-Protein Interactions
• Docking
Why Modeling?
• Experimental determination of structure is still a time-consuming and expensive process.
• The number of known sequences exceeds the number of known structures.
• Structure information is essential in
understanding function.
Sequence identities &
Molecular Modeling methods
Method: Sequence identity with known structures
• ab initio: 0-20%
• Fold recognition: 20-35%
• Homology Modeling: >35%
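The thresholds above can be read as a simple decision rule. The sketch below is illustrative only: the function name is hypothetical, and assigning the boundary values (exactly 20% and exactly 35%) to fold recognition is an assumption, since the slide does not say which side they fall on.

```python
def modeling_method(percent_identity):
    """Suggest a modeling approach from sequence identity to the best known structure."""
    if not 0 <= percent_identity <= 100:
        raise ValueError("percent identity must be between 0 and 100")
    if percent_identity > 35:
        return "homology modeling"
    if percent_identity >= 20:  # assumption: boundary values go to fold recognition
        return "fold recognition"
    return "ab initio"


for identity in (12, 28, 60):
    print(identity, modeling_method(identity))
```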
STRUCTURE-BASED DRUG
DESIGN
Compound databases, microbial broths, plant extracts, combinatorial libraries
Random screening, synthesis
Lead molecule
3-D ligand databases
Docking, linking or binding
Receptor-ligand complex
Target enzyme OR receptor
3-D structure by crystallography, NMR, electron microscopy OR homology modeling
Testing
Redesign to improve affinity, specificity etc.
Binding Site Analysis
• In the absence of a structure of the target-ligand complex, it is not a trivial exercise to locate the binding site!
• This is followed by Lead optimization.
Lead Optimization
Active site
Lead
Lead Optimization
Compounds that are weak inhibitors may be modified by combinatorial chemistry in silico if the three-dimensional target structure is known, minimizing the number of potential test compounds.
[Figure: target structure with a lead compound in the active site]
Factors Affecting The Affinity Of A Small
Molecule For A Target Protein
$\mathrm{LIGAND{\cdot}wat}_n + \mathrm{PROTEIN{\cdot}wat}_m \rightleftharpoons \mathrm{LIGAND{\cdot}PROTEIN{\cdot}wat}_p + (n+m-p)\,\mathrm{wat}$
• HYDROGEN BONDING
• HYDROPHOBIC EFFECT
• ELECTROSTATIC INTERACTIONS
• VAN DER WAALS INTERACTIONS
DIFFERENCE BETWEEN AN INHIBITOR AND DRUG
Extra requirement of a drug compared to an inhibitor
•Selectivity
•Less Toxicity
•Bioavailability
•Slow Clearance
•Reach The Target
•Ease Of Synthesis
•Low Price
•Slow Or No Development Of Resistance
•Stability Upon Storage As Tablet Or Solution
•Pharmacokinetic Parameters
•No Allergies
LIPINSKI’S RULE OF FIVE
Poor absorption or permeation is more likely when:
-There are more than five H-bond donors
-The mol. wt is over 500 Da
-The MlogP is over 4.15 (or CLogP > 5)
-The sum of N’s and O’s is over 10
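The rule of five can be applied as a simple filter once the four descriptors are available. The sketch below assumes the descriptors have already been computed elsewhere (the slides do not say with which tool); the class, the function and the example values are hypothetical, while the thresholds are the ones listed on the slide.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    h_bond_donors: int   # number of H-bond donors
    mol_weight: float    # molecular weight in Da
    mlogp: float         # Moriguchi logP
    n_plus_o: int        # total count of N and O atoms


def lipinski_violations(d):
    """Return the list of rule-of-five criteria that the compound violates."""
    violations = []
    if d.h_bond_donors > 5:
        violations.append("more than five H-bond donors")
    if d.mol_weight > 500:
        violations.append("molecular weight over 500 Da")
    if d.mlogp > 4.15:
        violations.append("MlogP over 4.15")
    if d.n_plus_o > 10:
        violations.append("sum of N and O atoms over 10")
    return violations


# Made-up descriptor values, for illustration only.
candidate = Descriptors(h_bond_donors=2, mol_weight=612.4, mlogp=4.6, n_plus_o=8)
print(lipinski_violations(candidate))
```

A compound with no violations is not thereby a drug; the filter only flags likely absorption or permeation problems.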
Antibacterial mechanism of PZA (pyrazinamide): a prodrug
THERMODYNAMICS OF RECEPTOR-LIGAND BINDING
•Proteins that interact with drugs are typically enzymes or
receptors.
•Drugs may be classified as substrates/inhibitors (for enzymes) or agonists/antagonists (for receptors).
•Ligands for receptors normally bind via non-covalent, reversible binding.
•Enzyme inhibitors have a wide range of modes: non-covalent reversible, covalent reversible/irreversible, or suicide inhibition.
•Inhibitors are designed to bind with higher affinity: their affinities
often exceed the corresponding substrate affinities by several
orders of magnitude!
•Agonists are analogous to enzyme substrates: part of the binding
energy may be used for signal transduction, inducing a
conformation or aggregation shift.
•To understand ‘what forces’ are responsible for ligands binding to
Receptors/Enzymes,
•The observed structure of Protein is generally a consequence of the
hydrophobic effect!
•Proteins generally bury hydrophobic residues inside the core, while exposing hydrophilic residues to the exterior.
Salt-bridges inside
•Ligand-binding clefts in proteins often expose hydrophobic residues to solvent and may contain partially desolvated hydrophilic groups that are not paired.
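To put a number on "several orders of magnitude" of affinity, a dissociation (or inhibition) constant can be converted to a standard binding free energy. The slides give no formula; the sketch below assumes the standard relation ΔG° = RT ln(Kd), with Kd in mol/L relative to a 1 M standard state and a temperature of 298.15 K. The function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar, temperature=298.15):
    """Standard binding free energy (kcal/mol) from a dissociation constant in mol/L."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature * math.log(kd_molar)


# A micromolar substrate versus a nanomolar inhibitor (illustrative values).
dg_substrate = binding_free_energy(1e-6)
dg_inhibitor = binding_free_energy(1e-9)
print(round(dg_substrate, 2), round(dg_inhibitor, 2))
print(round(dg_substrate - dg_inhibitor, 2))  # free-energy gain of the tighter binder
```

Under these assumptions each tenfold improvement in affinity corresponds to roughly 1.4 kcal/mol at room temperature, which is comparable to a single additional hydrogen bond or hydrophobic contact.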
Docking Methods
• Docking of ligands to proteins is a
formidable problem since it entails
optimization of the 6 positional degrees of
freedom.
• Rigid vs Flexible
• Manual Interactive Docking
Automated Docking Methods
• Speed vs Reliability
• Basic Idea is to fill the active site of the
Target protein with a set of spheres.
• Match the centres of these spheres as closely as possible with the atoms of small molecules in a database of known 3-D structures.
• Examples:
– DOCK, CAVEAT, AUTODOCK, LEGEND,
ADAM, LINKOR, LUDI.
GRID Based Docking Methods
• Grid Based methods
– GRID (Goodford, 1985, J. Med. Chem. 28:849)
– GREEN (Tomioka & Itai, 1994, J. Comp.
Aided. Mol. Des. 8:347)
– MCSS (Miranker & Karplus, 1991, Proteins, 11:29).
• Functional groups are placed at regularly spaced (0.3-0.5 Å) lattice points in the active site and their interaction energies are evaluated.
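The lattice idea can be sketched in a few lines. This is not the energy function of GRID, GREEN or MCSS; as a stand-in it assumes a Lennard-Jones 12-6 term plus a Coulomb term with a distance-dependent dielectric of 4r. All parameter values, atom coordinates and function names are hypothetical.

```python
import itertools
import math


def lattice(origin, size, spacing):
    """Yield the points of a cubic lattice covering a box of edge `size` (in angstroms)."""
    steps = int(round(size / spacing)) + 1
    for i, j, k in itertools.product(range(steps), repeat=3):
        yield (origin[0] + i * spacing, origin[1] + j * spacing, origin[2] + k * spacing)


def probe_energy(point, atoms, probe_charge=0.0, epsilon=0.1, r_min=3.5, min_dist=0.5):
    """Energy (kcal/mol) of a probe at `point`; atoms are (x, y, z, partial_charge)."""
    total = 0.0
    for x, y, z, charge in atoms:
        r = max(math.dist(point, (x, y, z)), min_dist)  # clamp to avoid division by zero
        lennard_jones = epsilon * ((r_min / r) ** 12 - 2 * (r_min / r) ** 6)
        coulomb = 332.0 * probe_charge * charge / (4.0 * r * r)  # dielectric = 4r
        total += lennard_jones + coulomb
    return total


# Two made-up "active site" atoms and a positively charged probe.
site_atoms = [(0.0, 0.0, 0.0, -0.5), (4.0, 0.0, 0.0, 0.3)]
points = list(lattice(origin=(-2.0, -2.0, 2.0), size=4.0, spacing=0.5))
best = min(points, key=lambda p: probe_energy(p, site_atoms, probe_charge=1.0))
print(len(points), best)
```

Halving the spacing multiplies the number of lattice points by eight, which is one source of the speed versus reliability trade-off.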
Folate Biosynthetic
pathway
DHFR
CLUSTAL W (1.81) multiple sequence alignment

chabaudi    -----------------------E--KAGCFSNKTFKGLGNEGGLPWKCNSVDMKHFSSV 35
vinckei     -----------AICACCKVLNSNE--KASCFSNKTFKGLGNAGGLPWKCNSVDMKHFVSV 47
berghei     MEDLSETFDIYAICACCKVLNDDE--KVRCFNNKTFKGIGNAGVLPWKCNLIDMKYFSSV 58
yoelii      -----------AICACCKVINNNE--KSGSFNNKTFNGLGNAGMLPWKYNLVDMNYFSSV 47
vivax       MEDLSDVFDIYAICACCKVAPTSEGTKNEPFSPRTFRGLGNKGTLPWKCNSVDMKYFSSV 60
falciparum  -------------------------KKNEVFNNYTFRGLGNKGVLPWKCNSLDMKYFCAV 35

chabaudi    TSYVNETNYMRLKWKRDRYMEK---------NNVKLNTDGIPSVDKLQNIVVMGKASWES 86
vinckei     TSYVNENNYIRLKWKRDKYIKE---------NNVKVNTDGIPSIDKLQNIVVMGKTSWES 98
berghei     TSYINENNYIRLKWKRDKYMEKHNLK-----NNVELNTNIISSTNNLQNIVVMGKKSWES 113
yoelii      TSYVNENNYIRLQWKRDKYMGKNNLK-----NNAELNNGELN--NNLQNVVVMGKRNWDS 100
vivax       TTYVDESKYEKLKWKRERYLRMEASQGGGDNTSGGDNTHGGDNADKLQNVVVMGRSSWES 120
falciparum  TTYVNESKYEKLKYKRCKYLNKET----------VDNVNDMPNSKKLQNVVVMGRTNWES 85

chabaudi    IPSKFKPLQNRINIILSRTLKKEDLAKEYN------NVIIINSVDDLFPILKCIKYYKCF 140
vinckei     IPSKFKPLENRINIILSRTLKKENLAKEYS------NVIIIKSVDELFPILKCIKYYKCF 152
berghei     IPKKFKPLQNRINIILSRTLKKEDIVNENN--NENNNVIIIKSVDDLFPILKCTKYYKCF 171
yoelii      IPPKFKPLQNRINIILSRTLKKEDIANEDNKNNENGTVMIIKSVDDLFPILKAIKYYKCF 160
vivax       IPKQYKPLPNRINVVLSKTLTKEDVK---------EKVFIIDSIDDLLLLLKKLKYYKCF 171
falciparum  IPKKFKPLSNRINVILSRTLKKEDFD---------EDVYIINKVEDLIVLLGKLNYYKCF 136

chabaudi    I----------------------------------------------------------- 141
vinckei     IIGGASVYKEFLDRNLIKKIYFTRINNAYT------------------------------ 182
berghei     IIGGSSVYKEFLDRNLIKKIYFTRINNSYNCDVLFPEINENLFKITSISDVYYSNNTTLD 231
yoelii      IIGGSYVYKEFLDRNLIKKIYFTRINNSYN------------------------------ 190
vivax       IIGGAQVYRECLSRNLIKQIYFTRINGAYPCDVFFPEFDESQFRVTSVSEVYNSKGTTLD 231
falciparum  I----------------------------------------------------------- 137

chabaudi    ---------
vinckei     ---------
berghei     FIIYSKTKE 240
yoelii      ---------
vivax       FLVYSKVGG 240
falciparum  ---------
Multiple alignment of DHFR of
Plasmodium species
Drug binding pocket of L. casei DHFR
Antifolate drugs in the active site of L. casei DHFR, showing hydrogen bonding with surrounding residues
[Figure panels: MTX, TMP, PYR, SO3]
How molecular modeling could be
used in identifying new leads
• These two compounds, a triazinobenzimidazole and a pyridoindole, were found to be active with high Ki against recombinant wild-type DHFR.
• This demonstrates the use of molecular modeling in malarial drug design.
Active site of pyrazinamidase
Docking of P. horikoshii pyrazinamidase with PZA in the presence of Zn
Additional Drug Target: glutathione-GR
Glutathione-GR
Additional Drug Target: Thioredoxin reductase (TrxR)
How Bioinformatics Aids in Vaccine Development /
Peptide Vaccine Development Using Bioinformatics Approaches
Emerging and re-emerging infectious diseases threats, 1980-2001
• Viral
- Bolivian hemorrhagic fever-1994, Latin America
- Bovine spongiform encephalopathy-1986, United Kingdom
- Creutzfeldt-Jakob disease (a new variant, vCJD)/mad cow disease-1995-96, UK/France
- Dengue fever-1994-97, Africa/Asia/Latin America/USA
- Ebola virus-1994, Gabon; 1995, Zaire; 1996, United States (monkey)
- Hantavirus-1993, United States; 1997, Argentina
- HIV subtype O-1994, Africa
- Influenza A/Beijing/32/92, A/Wuhan/359/95, H5N1-1993, United States; 1995, China; 1997, Hong Kong
- Japanese encephalitis-1995, Australia
- Lassa fever-1992, Nigeria
- Measles-1997, Brazil
- Monkeypox-1997, Congo
- Morbillivirus-1994, Australia
- O’nyong-nyong fever-1996, Uganda
- Polio-1996, Albania
- Rift Valley fever-1993, Sudan
- Venezuelan equine encephalitis-1995-96, Venezuela/Colombia
- West Nile virus-1996, Romania
- Yellow fever-1993, Kenya; 1995, Peru
Emerging and re-emerging infectious diseases threats contd.,
• Parasitic
- African trypanosomiasis-1997, Sudan
- Ancylostoma caninum (eosinophilic enteritis)-1990s, Australia
- Cryptosporidiosis-1993+, United States
- Malaria-1995-97, Africa/Asia/Latin America/United States
- Metorchis-1996, Canada
- Microsporidiosis-Worldwide
• Fungal
- Coccidioidomycosis-1993, United States
- Penicillium marneffei
Emerging and re-emerging infectious diseases threats contd.
• Bacterial
– Anthrax-1993, Caribbean
– Cat scratch disease/Bacillary angiomatosis (Bartonella henselae)-1990s, USA
– Chlamydia pneumoniae (pneumonia/coronary artery disease?)-1990s, USA (discovered 1983)
– Cholera-1991, Latin America
– Diphtheria-1993, Former Soviet Union
– Ehrlichia chaffeensis, human monocytic ehrlichiosis (HME)-United States
– Ehrlichia phagocytophila, human granulocytic ehrlichiosis (HGE)-United States
– Escherichia coli O157-1982-1997, United States; 1996, Japan
– Gonorrhea (drug resistant)-1995, United States
– Helicobacter pylori (ulcers/cancer)-worldwide (discovered 1983)
– Leptospirosis-1995, Nicaragua
– Lyme disease (Borrelia burgdorferi)-1990s, United States
– Meningococcal meningitis (serogroup A)-1995-1997, West Africa
– Pertussis-1994, UK/Netherlands; 1996, USA
– Plague-1994, India
– Salmonella typhimurium DT104 (drug resistant)-1995, USA
– Staphylococcus aureus (drug resistant)-1997, United States/Japan
– Toxic strep-United States
– Trench fever (Bartonella quintana)-1990s, United States
– Tuberculosis (highly transmissible)-1995, United States
– Vibrio cholerae O139-1992, Southern Asia
Types of Vaccines
• Killed virus vaccines
• Live-attenuated vaccines
• Recombinant DNA vaccines
• Genetic vaccines
• Subunit vaccines
• Polytope/multi-epitope vaccines
• Synthetic peptide vaccines
Systems with potential use as T-cell vaccines
CD4+ T-cell vaccines
- Killed microbe
- Live attenuated microbe
- Synthetic peptide coupled to protein
- Recombinant microbial protein bearing CD4+ T-cell epitope
- Chimeric virus expressing CD4+ T-cell epitope
- Chimeric Ig
- Chimeric peptide-MHC class II complex
- Receptor-linked peptide
- Naked DNA expressing CD4+ T-cell epitope
CD8+ T-cell vaccines
- Live attenuated microbe
- Synthetic peptide delivered in liposomes or ISCOMs
- Chimeric virus expressing CD8+ T-cell epitope
- Self-molecule expressing CD8+ T-cell epitope
- Chimeric peptide-MHC class I complex
- Naked DNA expressing CD8+ T-cell epitope
Abbreviations: Ig, immunoglobulin; ISCOM, immune-stimulating complex; MHC, major histocompatibility complex.
Why Synthetic Peptide Vaccines?
• Chemically well defined, selective and safe.
• Stable at ambient temperature.
• No cold chain requirement, hence cost effective in tropical countries.
• Simple and standardised production facility.
What Are Epitopes?
Antigenic determinants or Epitopes are the
portions of the antigen molecules which are
responsible for specificity of the antigens in
antigen-antibody (Ag-Ab) reactions and
that combine with the antigen binding site
of Ab, to which they are complementary.
Epitopes could be contiguous (when Ab binds to a contiguous sequence of amino acids) or non-contiguous (when Ab binds to non-contiguous residues, brought together by folding).
Sequential epitopes are contiguous epitopes.
Conformational epitopes are non-contiguous antigenic determinants.
Epitopes …
B-cell epitopes
Th-cell epitopes
Properties of Amino Acids: predictors for
Epitopes
Sequential epitope prediction methods
Theoretical methods are based on properties of amino
acids and their propensity scales.
Hopp & Woods, 1981.
Parker et al., 1986
Kolaskar & Tongaonkar, 1990.
The accuracy of prediction: 50-75%.
Conformational epitope prediction method
Kolaskar & Kulkarni-Kale, 1999.
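Propensity-scale methods of this kind share one core step: average a per-residue value over a sliding window and look for peaks. The sketch below illustrates that step in the style of Hopp & Woods, using hydrophilicity values as they are commonly tabulated (they should be checked against the original paper) and a window of six residues. It is an illustration of the general technique, not a reimplementation of any of the cited methods.

```python
# Hopp-Woods hydrophilicity values as commonly tabulated (verify before real use).
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": 0.0, "T": -0.4, "A": -0.5, "H": -0.5, "C": -1.0, "M": -1.3,
    "V": -1.5, "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5, "W": -3.4,
}


def hydrophilicity_profile(sequence, window=6):
    """Return (1-based window start, mean hydrophilicity) for every window."""
    scores = [HOPP_WOODS[aa] for aa in sequence.upper()]
    return [
        (i + 1, sum(scores[i:i + window]) / window)
        for i in range(len(scores) - window + 1)
    ]


# The 20-residue peptide from the later JEV slides, used here only as sample input.
profile = hydrophilicity_profile("SENHGNYSAQVGASQAAKFT")
start, score = max(profile, key=lambda item: item[1])
print(start, round(score, 2))
```

The highest-scoring window is only a candidate; with the 50-75% accuracy quoted above, predictions still need experimental confirmation.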
Identified antigens must be checked for strain-varying polymorphisms; these polymorphisms must be represented in an anti-blood-stage vaccine.
[Figure: candidate protein X with a protective epitope and its variants in strains A, B, C and D]
Antigenic determinants of Egp of JEV
Kolaskar & Tongaonkar approach
Peptide vaccines to be launched in
near future
• Foot & Mouth Disease Virus (FMDV)
• Human Immunodeficiency Virus (HIV)
• Metastatic Breast Cancer
• Pancreatic Cancer
• Melanoma
• Malaria
• T. solium cysticercosis
Various transformations on side-chain
orientation in a model tetrapeptide
Reverse Vaccinology
• Advantages
– Fast access to virtually every antigen.
– Non-cultivable microorganisms can be approached.
– Non-abundant antigens can be identified.
– Antigens not expressed in vitro can be identified.
– Non-structural proteins can be used.
• Disadvantages
– Non-proteinaceous antigens such as polysaccharides and glycolipids cannot be used.
Rappuoli 2001
Curr. Opin. Microbiol.
Vaccine development in the post-genomic era: Reverse Vaccinology approach
Genome sequence
Proteomics technologies
In silico analysis
IVET, STM, DNA microarrays
High-throughput cloning and expression
In vitro and in vivo assays for vaccine candidate identification
Global genomic approach to identify new vaccine candidates
In Silico Analysis
Peptide
Multitope
vaccines
VACCINOME
Candidate Epitope DB
Epitope prediction
Disease related protein DB
Gene/Protein Sequence Database
Synthetic Peptide Vaccine
Design and Development of
Synthetic Peptide vaccine
against Japanese encephalitis
virus
Egp of JEV as an Antigen
• Is a major structural antigen.
• Responsible for viral haemagglutination.
• Elicits neutralising antibodies.
• ~ 500 amino acids long.
• Structure of extra-cellular domain (399) was predicted using knowledge-based homology modeling approach.
Model Refinement
PARAMETERS USED
• Force field: AMBER all atom
• Dielectric const: Distance dependent
• Optimisation: Steepest Descents & Conjugate Gradients
• rms derivative 0.1 kcal/mol/Å for SD
• rms derivative 0.001 kcal/mol/Å for CG
• Software: InsightII (Biosym/MSI) and modules therein
Model For Solvated Protein
• Egp of JEV molecule was soaked in a water layer of 10 Å.
• 4867 water molecules were added.
• The system size was increased to 20,648 atoms from 6047.
Model Evaluation II:
Ramachandran Plot
An Algorithm to Identify
Conformational Epitopes
Calculate the percent accessible surface area (ASA) of the amino acid residues.
If ASA ≥ 30%, then the residue was termed an accessible residue.
A contiguous stretch of more than three accessible residues was termed the antigenic determinant.
…Cont.
A determinant is extended towards the N- and C-termini only if accessible amino acid(s) are present after an inaccessible amino acid residue.
A list of sequential antigenic determinants was prepared.
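The steps above can be sketched as follows. This is one possible reading, not the published implementation: it assumes the cutoff is inclusive (ASA ≥ 30%), that "more than three" means at least four residues, and that a determinant may be extended across exactly one inaccessible residue on each side when accessible residues follow it. The function names and the ASA values are hypothetical.

```python
def accessible_runs(flags):
    """Return (start, end) index pairs, end inclusive, of consecutive True values."""
    runs, start = [], None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(flags) - 1))
    return runs


def antigenic_determinants(asa_percent, cutoff=30.0, min_len=4):
    """Find determinants as 0-based (start, end) pairs from per-residue percent ASA."""
    flags = [asa >= cutoff for asa in asa_percent]
    found = []
    for start, end in accessible_runs(flags):
        if end - start + 1 < min_len:
            continue
        # Extend towards the N-terminus across a single inaccessible residue.
        if start >= 2 and flags[start - 2]:
            start -= 2
            while start > 0 and flags[start - 1]:
                start -= 1
        # Extend towards the C-terminus across a single inaccessible residue.
        if end + 2 < len(flags) and flags[end + 2]:
            end += 2
            while end + 1 < len(flags) and flags[end + 1]:
                end += 1
        found.append((start, end))
    # Merge determinants that overlap after extension.
    merged = []
    for start, end in sorted(found):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Made-up percent ASA values for a 13-residue stretch.
asa = [50, 50, 50, 50, 10, 60, 60, 5, 5, 40, 40, 40, 5]
print(antigenic_determinants(asa))
```

In the example the four accessible residues at the start seed a determinant, which is then extended across the buried fifth residue to include the two accessible residues that follow; the three-residue stretch near the end is too short to seed one.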
Peptide Modeling
Initial random conformation
Force field: Amber
Distance dependent dielectric constant 4*rij
Geometry optimization: Steepest descents & Conjugate gradients
Molecular dynamics at 400 K for 1ns
Peptides (within residues 149-168) are:
SENHGNYSAQVGASQ
NHGNYSAQVGASQ
YSAQVGASQ
YSAQVGASQAAKFT
NHGNYSAQVGASQAAKFT
SENHGNYSAQVGASQAAKFT (149-168)
Prediction of conformations of the antigenic peptides
Lowest energy Allowed conformations were obtained
using multiple MD simulations:
– Initial conformation: random, allowed
– Amber force field with distance dependent dielectric
constant of 4*rij
– Geometry optimization using Steepest descents &
Conjugate gradient
– 10 cycles of molecular dynamics at 400 K; each of 1ns
duration, with an equilibration for 500 ps
– Conformations captured at 10ps intervals, followed by
energy minimization of each
– Analysis of resulting conformations to identify the
lowest energy, geometrically and stereochemically
allowed conformations
MD simulations of following peptides were carried out
B Cell Epitopes (within residues 149-168):
SENHGNYSAQVGASQ
NHGNYSAQVGASQ
YSAQVGASQ
YSAQVGASQAAKFT
NHGNYSAQVGASQAAKFT
SENHGNYSAQVGASQAAKFT (149-168)
T-helper Cell Epitope (436-445):
SIGKAVHQVF
Chimeric B+Th Cell Epitope With Spacer:
SENHGNYSAQVGASQAAKFTSIGKAVHQVF
Structural comparison of Egps of Nakayama and Sri Lanka
strains of JEV.
Single amino acid differences are highlighted.
Ts18 epitope mapping
[Figure: six panels of A650 (y-axis, 0.0 to 1.6) against peptide number (x-axis, 1 to 19); 13-mer window skipping 3 amino acids]
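An overlapping peptide library like the one used for this mapping can be generated with a sliding window. The sketch below assumes that "13-mers window skipping 3 amino acids" means 13-residue peptides whose start positions advance by three residues; that reading, the function name and the sample input are assumptions. The Ts18 sequence is not given in the slides, so the chimeric JEV peptide from an earlier slide is used purely as sample input.

```python
def overlapping_peptides(sequence, window=13, step=3):
    """Return (1-based start, peptide) for windows of length `window` advanced by `step`."""
    last_start = len(sequence) - window
    return [
        (start + 1, sequence[start:start + window])
        for start in range(0, last_start + 1, step)
    ]


# Sample input only; not the Ts18 sequence.
sample = "SENHGNYSAQVGASQAAKFTSIGKAVHQVF"
for position, peptide in overlapping_peptides(sample):
    print(position, peptide)
```

With a step that does not divide the sequence evenly, the last few residues fall outside every window; a real library would usually add one final peptide aligned to the C-terminus.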
Ts18 MHC II epitope profiles for different
alleles
Ts18 MHC I and MHC II consensus
profile
[Figure: consensus profile; y-axis 0 to 45, x-axis residue position 1 to 73]
Ts18 modeled 3D structure