Insilico drug designing
Dinesh Gupta
Structural and Computational Biology Group
ICGEB
Modern drug discovery process
[Figure: pipeline stages: target identification, target validation, lead identification, lead optimization, preclinical phase. Figure labels: "Drug discovery"; durations of 2-5 years and 6-9 years.]
• Drug discovery is an expensive process involving high R & D cost and
extensive clinical testing
• A typical development time is estimated to be 10-15 years.
Drug discovery technologies
• Target identification
– Genomics, gene expression profiling and proteomics
• Target Validation
– Gene knock-out, inhibition assay
• Lead Identification
– High throughput screening, fragment based screening, combinatorial
libraries
• Lead Optimization
– Medicinal chemistry driven optimization, X-ray crystallography, QSAR,
ADME profiling (bioavailability)
• Preclinical Phase
– Pharmacodynamics (PD), Pharmacokinetics (PK), ADME, and toxicity
testing in animals
• Clinical Phase
– Human trials
Rational Approach to Drug Discovery
Identify and validate target
Clone gene encoding target
Express target
Crystal structures/MM of target and target/inhibitor complexes
Identify lead compounds
Synthesize modified lead compounds
Toxicity & pharmacokinetic studies
Preclinical trials
Bioinformatics tools in DD
• Comparison of sequences: identify targets
• Homology modelling: active site prediction
• Systems biology: identify targets
• Databases: manage information
• In silico screening (ligand based, receptor based): iterative steps of
molecular docking
• Pharmacogenomic databases: assist with safety-related issues
Currently used drug targets
J. Drews Science 287, 1960 -1964 (2000)
This information is used by bioinformaticians to narrow the search in the groups
Published by AAAS
In silico methods in Drug Discovery
• Molecular docking
• Virtual high-throughput screening (vHTS)
• QSAR (Quantitative structure-activity relationship)
• Pharmacophore mapping
• Fragment based screening
Molecular Docking
• Docking is the computational determination of binding
affinity between molecules (protein structure and ligand).
• Given a protein and a ligand, find the binding free
energy of the complex formed by docking them.
Molecular Docking: classification
• Docking, or computer-aided drug design, can be
broadly classified into:
– Receptor-based methods: make use of the structure of the target
protein.
– Ligand-based methods: based on the known inhibitors.
Receptor based methods
• Uses the 3D structure of the target receptor to search for
the potential candidate compounds that can modulate
the target function.
• These involve molecular docking of each compound in
the chemical database into the binding site of the target
and predicting the electrostatic fit between them.
• The compounds are ranked using an appropriate scoring
function such that the scores correlate with the binding
affinity.
• Receptor-based methods have been successfully applied
to many targets.
Ligand based strategy
• In the absence of structural information on the target,
ligand-based methods make use of the information
provided by known inhibitors of the target receptor.
• Structures similar to the known inhibitors are identified
from chemical databases by a variety of methods.
• Some widely used methods are similarity and
substructure searching, pharmacophore matching and 3D
shape matching.
• Numerous successful applications of ligand-based
methods have been reported.
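A minimal sketch of the similarity search idea, assuming each compound is reduced to a set of structural features and compared with the Tanimoto coefficient. The feature names, compound names and the 0.5 threshold are hypothetical; real workflows would use cheminformatics fingerprints rather than hand-written sets.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) coefficient between two feature sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity_search(query, database, threshold=0.5):
    """Return (name, score) pairs at or above the threshold, best first."""
    scored = [(name, tanimoto(query, features)) for name, features in database.items()]
    kept = [(name, score) for name, score in scored if score >= threshold]
    return sorted(kept, key=lambda hit: hit[1], reverse=True)


# Hypothetical known active and a tiny hypothetical database.
known_active = {"aromatic_ring", "amide", "hydroxyl", "tertiary_amine"}
database = {
    "cmpd_A": {"aromatic_ring", "amide", "hydroxyl"},
    "cmpd_B": {"aromatic_ring", "nitro"},
    "cmpd_C": {"aromatic_ring", "amide", "hydroxyl", "tertiary_amine", "halogen"},
}

hits = similarity_search(known_active, database)
for name, score in hits:
    print(f"{name}: {score:.2f}")
```

Compounds sharing most features with the known active are retained; the threshold controls how strict the search is.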
[Figure: ligand based strategy. A database is searched for compounds similar to the known actives, yielding the structures found.]
Binding free energy
• Binding free energy is calculated as the sum of the
following energies:
- Electrostatic energy
- van der Waals energy
- Internal energy change due to flexible deformations
- Translational and rotational energy
• The lower the binding free energy of a complex, the more
stable it is.
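The sum above can be sketched in a few lines. The term values below are hypothetical, in kcal/mol. The conversion to a dissociation constant uses the standard relation ΔG = RT ln(Kd), which is not on the slide and is added here only to show why a lower free energy means tighter binding.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(electrostatic, van_der_waals, internal, trans_rot):
    """Sum the four energy terms listed above (all in kcal/mol)."""
    return electrostatic + van_der_waals + internal + trans_rot


def dissociation_constant(delta_g, temperature=298.15):
    """Kd in mol/L from delta_g = R * T * ln(Kd)."""
    return math.exp(delta_g / (R_KCAL * temperature))


# Hypothetical energy terms for two complexes.
dg_a = binding_free_energy(electrostatic=-4.0, van_der_waals=-6.5, internal=1.5, trans_rot=1.0)
dg_b = binding_free_energy(electrostatic=-2.0, van_der_waals=-5.0, internal=1.0, trans_rot=1.0)

print(f"complex A: dG = {dg_a:.1f} kcal/mol, Kd = {dissociation_constant(dg_a):.2e} M")
print(f"complex B: dG = {dg_b:.1f} kcal/mol, Kd = {dissociation_constant(dg_b):.2e} M")
```

Complex A has the lower free energy and therefore the smaller Kd, i.e. it is the more stable complex.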
Basic binding mechanism
Complementarity between the ligand and the
binding site:
• Steric complementarity, i.e. the shape of the
ligand is mirrored in the shape of the binding site.
• Physicochemical complementarity
Components of molecular docking
A) Search algorithm
• To find the best conformation of the ligand
and the protein system.
• Rigid and flexible docking
B) Scoring function
• Rank the ligands according to the interaction energy.
• Based on the energy force-field function.
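The two components can be illustrated with a toy 2D example: a Lennard-Jones-like scoring function and an exhaustive grid search over ligand translations (rigid docking only). The coordinates, the 12-6 functional form and the grid settings are assumptions for illustration; real docking programs also search rotations and torsions and use richer force fields.

```python
import math
from itertools import product


def lj_score(ligand, receptor, r0=1.0):
    """Scoring function: sum of 12-6 terms, each with a minimum of -1 at distance r0."""
    total = 0.0
    for lx, ly in ligand:
        for rx, ry in receptor:
            # Clamp very short distances so the repulsive term stays finite.
            d = max(math.hypot(lx - rx, ly - ry), 0.3)
            x = (r0 / d) ** 6
            total += x * x - 2.0 * x
    return total


def translate(points, dx, dy):
    return [(x + dx, y + dy) for x, y in points]


def grid_search(ligand, receptor, span=3.0, step=0.25):
    """Search algorithm: try every translation on a grid, keep the lowest score."""
    n = int(round(span / step))
    best = None
    for i, j in product(range(-n, n + 1), repeat=2):
        dx, dy = i * step, j * step
        score = lj_score(translate(ligand, dx, dy), receptor)
        if best is None or score < best[0]:
            best = (score, dx, dy)
    return best


# Hypothetical pocket of three receptor atoms and a one-atom ligand.
receptor = [(0.0, 0.0), (2.0, 0.0), (1.0, 1.7)]
ligand = [(2.0, 2.6)]

start_score = lj_score(ligand, receptor)
best_score, best_dx, best_dy = grid_search(ligand, receptor)
print(f"start score {start_score:.3f}, best score {best_score:.3f} at shift ({best_dx}, {best_dy})")
```

The search moves the ligand into the pocket, where it contacts all three receptor atoms at a favourable distance.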
Success with vHTS
• Dihydrofolate reductase inhibitor (1992)
• HIV-protease (1992)
• Phospholipase A2 (1994)
• Thrombin (1996)
• Carbonic anhydrase inhibitors (2002)
Virtual High Throughput Screening
• Less expensive than high-throughput screening (HTS)
• Faster than conventional screening
• Scans a large number of potential drug-like
molecules in very little time.
• HTS itself is a trial-and-error approach, but it can be
usefully complemented by virtual screening.
QSAR
• QSAR is a statistical approach that attempts to relate
physical and chemical properties of molecules to their
biological activities.
• Various descriptors, such as molecular weight, number of
rotatable bonds and LogP, are commonly used.
• Many QSAR approaches are in practice, classified by the
dimensionality of the data.
• They range from 1D QSAR to 6D QSAR.
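As a minimal sketch of the statistical idea, the example below fits a straight line relating a single descriptor (LogP) to activity by ordinary least squares. The data points are hypothetical, and a real QSAR model would use many descriptors and proper validation.

```python
def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical training set: LogP descriptor versus measured activity (pIC50).
logp = [1.0, 2.0, 3.0, 4.0]
pic50 = [5.1, 5.9, 7.1, 7.9]

slope, intercept = fit_line(logp, pic50)
predicted = slope * 2.5 + intercept  # predict activity for a new compound with LogP 2.5
print(f"pIC50 = {slope:.2f} * LogP + {intercept:.2f}; prediction at LogP 2.5: {predicted:.2f}")
```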
Pharmacophore mapping
• It is a 3D description of a pharmacophore, developed by
specifying the nature of the key pharmacophoric features
and the 3D distance map among all the key features.
• A Pharmacophore map can be generated by
superposition of active compounds to identify their
common features.
• Based on the pharmacophore map either de novo design
or 3D database searching can be carried out.
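The 3D distance map can be sketched as the set of pairwise distances between named features, with a database candidate accepted when all its distances agree within a tolerance. The feature names, coordinates and the 1.0 Å tolerance are hypothetical.

```python
import math
from itertools import combinations


def distance_map(features):
    """Map each pair of feature names to the distance between their 3D positions."""
    return {
        (a, b): math.dist(features[a], features[b])
        for a, b in combinations(sorted(features), 2)
    }


def matches(query_map, candidate_map, tolerance=1.0):
    """True if every query distance is reproduced by the candidate within the tolerance."""
    return all(
        pair in candidate_map and abs(candidate_map[pair] - d) <= tolerance
        for pair, d in query_map.items()
    )


# Hypothetical pharmacophore and two hypothetical database candidates.
pharmacophore = distance_map({"donor": (0, 0, 0), "acceptor": (3, 0, 0), "aromatic": (0, 4, 0)})
candidate_1 = distance_map({"donor": (1, 1, 1), "acceptor": (4.2, 1, 1), "aromatic": (1, 4.7, 1)})
candidate_2 = distance_map({"donor": (0, 0, 0), "acceptor": (7, 0, 0), "aromatic": (0, 4, 0)})

print(matches(pharmacophore, candidate_1))
print(matches(pharmacophore, candidate_2))
```

Because only inter-feature distances are compared, the match does not depend on how each molecule is positioned in space.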
Modeling and informatics in drug design
Increased application of structure-based drug
design is facilitated by:
• Growth in the number of targets
• Growth in 3D structure determination (PDB
database)
• Growth in computing power
• Growth in the prediction quality of protein-compound interactions
Summary: role of Bioinformatics?
• Identification of homologs of functional
proteins (motif, protein families, domains)
• Identification of targets by cross species
examination
• Visualization of molecular models
• Docking, vHTS
• QSAR, Pharmacophore mapping
Example: use of Bioinformatics in
Drug discovery
Identification of novel drug targets
against human malaria
Malaria – A global problem!
• Malaria causes at least 500 million clinical cases and
more than one million deaths each year.
• A child dies of malaria every 30 seconds.
• Of the four Plasmodium species causing human malaria,
P. falciparum poses the most serious threat because of its
virulence, prevalence and drug resistance.
• Malaria takes an economic toll - cutting economic growth
rates by as much as 1.3% in countries with high disease
rates.
• There are four types of human malaria:
– Plasmodium falciparum
– Plasmodium vivax
– Plasmodium malariae
– Plasmodium ovale
• Approximately half of the world's population is at risk of malaria,
particularly those living in lower-income countries.
• Today, there are 109 malaria-affected countries in 4 regions.
Chemical structures of drugs widely used for treatment of malaria
a) Chloroquine
b) Quinine
c) Artemether
d) Sodium artesunate
e) Dihydroartemisinin
f) Pyrimethamine
g) Sulfadoxine
h) Mefloquine
i) Halofantrine
j) Primaquine
k) Tafenoquine
l) Chlorproguanil
m) Dapsone
http://malaria.who.int/docs/adpolicy_tg2003.pdf
Problems with the existing drugs
• Drug resistance is the most common problem
• Adverse effects (shock and cardiac arrhythmias
caused by chloroquine)
• Poor patient compliance (quinine tastes very
unpleasant and causes dizziness, nausea etc.)
• High cost of production for some effective drugs
(atovaquone).
• Urgent need for identification of novel drug
targets leading to effective and affordable drugs.
Strategies for drug target identification in P. falciparum
• Parasite culture for functional assays is difficult and expensive,
making computational approaches more relevant.
• Malaria remains a neglected disease, with very few stakeholders.
• Availability of the genomic data of P. falciparum and H. sapiens has
facilitated the effective application of comparative genomics.
• Comparative genomics helps in the identification and exploitation of
characteristic features that differ between the host and the parasite.
• Identifying specific metabolic pathways in P. falciparum and targeting
their crucial proteins is an attractive approach to target-based drug
discovery.
Comparison of proteomes helps in identifying
important indispensable parasite proteins
[Figure: comparison of the predicted proteomes of A. gambiae, P. falciparum and H. sapiens.]
• Out of 5334 predicted proteins in P. falciparum, 60% didn’t show any
similarity to known proteins.
• Hence, assigning a physiological functional role to these hypothetical
proteins using bioinformatics approaches still remains a challenge.
Novel drug target identification in P. falciparum
[Flowchart: comparative genomics studies. Elements shown: BlastP against the human proteome; relational database of homology models (~40% identity threshold for three-dimensional modeling); 476 P. falciparum proteins; large set of proteins with no/low similarity; literature search for all these proteins; check for physiological and biochemical functions, etc.; putative drug targets in P. falciparum; proteasome machinery (ClpQY and ClpAP) in P. falciparum.]
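A rough sketch of the filtering step in such a comparison, assuming BlastP results are available in the standard 12-column tabular format (query, subject, percent identity, ...). The protein identifiers, hit values and the 30% cutoff for "no/low similarity" are hypothetical; the slide does not state which cutoff was used for this step.

```python
def best_identity(blast_tabular):
    """Return the highest percent identity seen for each query protein."""
    best = {}
    for line in blast_tabular.strip().splitlines():
        fields = line.split("\t")
        query, pident = fields[0], float(fields[2])
        best[query] = max(best.get(query, 0.0), pident)
    return best


def low_similarity(all_queries, best, cutoff):
    """Queries with no hit at all, or whose best hit is below the cutoff."""
    return [q for q in all_queries if best.get(q, 0.0) < cutoff]


# Hypothetical BlastP output: parasite proteins searched against the human proteome.
sample = "\n".join([
    "parasite_prot_1\thuman_prot_A\t62.5\t200\t75\t0\t1\t200\t5\t204\t1e-80\t250",
    "parasite_prot_1\thuman_prot_B\t35.0\t180\t117\t2\t10\t189\t3\t180\t1e-20\t90",
    "parasite_prot_2\thuman_prot_C\t24.1\t120\t91\t3\t5\t124\t40\t156\t0.002\t38",
])
queries = ["parasite_prot_1", "parasite_prot_2", "parasite_prot_3"]

best = best_identity(sample)
candidates = low_similarity(queries, best, cutoff=30.0)  # assumed cutoff
print(candidates)
```

Proteins that pass this filter would still need the literature and function checks shown in the flowchart before being called putative targets.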
Targets identified by comparison of
protein models
• Identification of two proteasomal proteins
of prokaryotic origin, not present in hosts.
• Protein degradation is an important
process in parasite development inside
host RBCs.
Eukaryotic and prokaryotic proteasome machinery
26S proteasome: eukaryotic type
• 19S regulatory + 20S proteolytic particle
• Present only in eukaryotes and archaea
• Degrades ubiquitinated proteins
• > 20 different proteins involved
ClpQY system: prokaryotic type
• ClpY cap + ClpQ core particle
• Present only in prokaryotes
• No ubiquitination in prokaryotes
• Substrate specificity is not known
• Only two proteins, ClpQ and ClpY
[Figure: a substrate protein enters the ClpY-ClpQ-ClpY complex and is released as peptides; the 20S proteasome is shown for comparison.]
ATP Dependent Protease Machinery
ClpQY (PfHslUV system)
• The HslUV complex in prokaryotes is composed of HslV, a
threonine protease, and HslU, the ATP-dependent
component, a chaperone of the Clp/Hsp100 family.
• HslV (ClpQ) subunits are arranged in two stacked
hexameric rings, capped by HslU (ClpY)
hexamers at both ends.
• The HslU (ClpY) hexamer recognizes and unfolds peptide
substrates in an ATP-dependent process and
translocates them into HslV for degradation.
Crystal structure of HslUV complex
in H. influenzae
PfClpQY complex model in
P. falciparum
PfClpQ component
MFIRNFVNIIGSQKSITKTIARNYFSDNSKLIIPRHGTTILCVRKNN
EVCLIGDGMVSQGTMIVKGNAKKIRRLKDNILMGFAGATADCFTLLDKFETKIDEYPNQL
LRSCVELAKLWRTDRYLRHLEAVLIVADKDILLEVTGNGDVLEPSGNVLGTGSGGPYAMA
AARALYDVENLSAKDIAYKAMNIAADMCCHTNNNFICETL
Full-length and mature active protein
Length: 207 aa (mature: 170 aa)
Pro domain: 37 aa
Important motifs found:
• TT at the N terminus of the mature protein
• GSGG, a common chymotrypsin protease signal
• Lys(28) and Arg(35) are two conserved amino acids that play some
role in the activity.
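The figures above can be checked directly against the sequence shown on the slide. This is a small sketch; the variable and function names are illustrative, and residue numbers are taken to refer to the mature protein.

```python
# PfClpQ sequence as printed above.
FULL_LENGTH = (
    "MFIRNFVNIIGSQKSITKTIARNYFSDNSKLIIPRHGTTILCVRKNN"
    "EVCLIGDGMVSQGTMIVKGNAKKIRRLKDNILMGFAGATADCFTLLDKFETKIDEYPNQL"
    "LRSCVELAKLWRTDRYLRHLEAVLIVADKDILLEVTGNGDVLEPSGNVLGTGSGGPYAMA"
    "AARALYDVENLSAKDIAYKAMNIAADMCCHTNNNFICETL"
)
PRO_DOMAIN_LENGTH = 37

pro_domain = FULL_LENGTH[:PRO_DOMAIN_LENGTH]
mature = FULL_LENGTH[PRO_DOMAIN_LENGTH:]


def residue(seq, position):
    """Residue at a 1-based position."""
    return seq[position - 1]


print(len(FULL_LENGTH), len(pro_domain), len(mature))
print("N-terminal residues of mature protein:", mature[:2])
print("GSGG motif present:", "GSGG" in mature)
print("Residue 28:", residue(mature, 28), "Residue 35:", residue(mature, 35))
```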
Homologs of PfClpQ protein in other Plasmodium spp
PK_ClpQ
PV_ClpQ
PF_ClpQ
PY_ClpQ
PB_ClpQ
TTILCVRKNNEVCLIGDGMVSQGTMIVKGNAKKIRRLKDNILMGFAGATADCFTLLDKFE
TTILCVRKNNEVCLIGDGMVSQGTMIVKGNAKKIRRLKDNILMGFAGATADCFTLLDKFE
TTILCVRKNNEVCLIGDGMVSQGTMIVKGNAKKIRRLKDNILMGFAGATADCFTLLDKFE
TTILCVRKNNEVCLIGDGMVSQGTMIVKGNAKKIRRLKDNILMGFAGATADCFTLLDKFE
TTILCVRKNNEVCLIGDGMVSQGTMIVKGNAKKIRRLKDNILMGFAGATADCFTLLDKFE
************************************************************
PK_ClpQ
PV_ClpQ
PF_ClpQ
PY_ClpQ
PB_ClpQ
TKIDEYPDQLLRSCVELAKLWRTDRYLRHLEAVLIVADKDVLLEVTGNGDVLEPSGNVLG
TKIDEYPDQLLRSCVELAKLWRTDRYLRHLEAVLIVADKDVLLEVTGNGDVLEPSGNVLG
TKIDEYPNQLLRSCVELAKLWRTDRYLRHLEAVLIVADKDILLEVTGNGDVLEPSGNVLG
TKIDEYPDQLLRSCVELAKLWRTDRYLRHLEAVLIVADKDTLLEVTGNGDVLEPSGNVLG
TKIDEYPDQLLRSCVELAKLWRTDRYLRHLEAVLIVADKDTLLEVTGNGDVLEPSGNVLG
*******:******************************** *******************
PK_ClpQ
PV_ClpQ
PF_ClpQ
PY_ClpQ
PB_ClpQ
TGSGGPYAIAAARALYDVENLSAKDIAYKAMNIAADMCCHTNNNFICETL
TGSGGPYAIAAARALYDVENLSAKDIAYKAMNIAADMCCHTNNNFICETL
TGSGGPYAMAAARALYDVENLSAKDIAYKAMNIAADMCCHTNNNFICETL
TGSGGPYAMAAARALYDIENLSAKDIAYKAMNIAADMCCHTNHNFICETL
TGSGGPYAIAAARALYDIENLSAKDIAYKAMNIAADMCCHTNHNFICETL
********:********:************************:*******
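As a small illustration of how such an alignment is summarized, the sketch below takes the last alignment block shown above and computes the fully conserved columns and pairwise percent identity. It marks only identical columns with "*"; it does not reproduce the ":" symbol used for similar residues.

```python
# Last alignment block, copied from the alignment above.
block = {
    "PK_ClpQ": "TGSGGPYAIAAARALYDVENLSAKDIAYKAMNIAADMCCHTNNNFICETL",
    "PV_ClpQ": "TGSGGPYAIAAARALYDVENLSAKDIAYKAMNIAADMCCHTNNNFICETL",
    "PF_ClpQ": "TGSGGPYAMAAARALYDVENLSAKDIAYKAMNIAADMCCHTNNNFICETL",
    "PY_ClpQ": "TGSGGPYAMAAARALYDIENLSAKDIAYKAMNIAADMCCHTNHNFICETL",
    "PB_ClpQ": "TGSGGPYAIAAARALYDIENLSAKDIAYKAMNIAADMCCHTNHNFICETL",
}


def conservation_line(seqs):
    """'*' where every sequence has the same residue, ' ' elsewhere."""
    return "".join("*" if len(set(column)) == 1 else " " for column in zip(*seqs))


def percent_identity(a, b):
    """Percentage of aligned positions with identical residues."""
    same = sum(x == y for x, y in zip(a, b))
    return 100.0 * same / len(a)


line = conservation_line(list(block.values()))
print(line)
print(percent_identity(block["PF_ClpQ"], block["PK_ClpQ"]))
print(percent_identity(block["PF_ClpQ"], block["PY_ClpQ"]))
```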
Homology modeling of PfClpQ
PfClpQ
1kyi
Conservation of catalytic residues
S125-G45-T1-K33
Structural alignment of PfClpQ and HslV
(H.influenzae)
Homology Modeling of PfClpQ
[Figure: multiple sequence alignment across E. coli, S. enterica, H. influenzae, X. campestris, W. pipientis, P. falciparum, T. brucei, T. cruzi and L. infantum.]
• Most of the residues conserved across the different bacterial species
were either identical or similar in PfClpQ
Biochemical characterization of PfClpQ protein
Protease activity assay for PfClpQ protein, using fluorogenic peptide substrates and measuring fluorescence
• Chymotrypsin like: substrate Suc-LLVY-AMC, inhibitor chymostatin
• Threonine protease like: substrate Cbz-GGL-AMC, inhibitor lactacystin
• Peptidyl glutamyl hydrolase: substrate Z-LLE-AMC, inhibitor MG132
[Figure: plots of AMC released (m moles) against time and against substrate conc (mM) for the assays; Km values shown on the plots: 19.18 mM, 58.22 mM and 37.79 mM.]
In silico identification of novel inhibitors against PfClpQ, a novel drug target of P. falciparum, by high-throughput docking
[Workflow: drug-like compound library (1,000,00) and PfClpQ → molecular docking → ligand docked into the protein’s active site → top 100 solutions]
Out of the top 40, only 10 compounds were available for purchase
ClpQ interaction with ligand identified by virtual screening
[Figure: ligand in the ClpQ binding site; labeled residues: Phe46, Gly49, Gly48, Arg36, Thr2, Thr50, Val21, Ser22. Shown alongside the crystal structure of HslV complexed with a vinyl sulfone inhibitor.]
Compound   Gold score   FlexX score
1          52.54        -25.14
2          54.76        -17.37
3          54.66        -24.43
4          52.84        -24.47
[Chemical structure column: images not recovered.]
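When two scoring functions disagree, one common option is a consensus score. The sketch below combines the values in the table by summing z-scores. This is an illustrative choice, not necessarily what was done in the study, and it assumes that a higher Gold score and a more negative FlexX score both indicate a better pose.

```python
from statistics import mean, pstdev

# (Gold score, FlexX score) per compound, from the table above.
scores = {
    1: (52.54, -25.14),
    2: (54.76, -17.37),
    3: (54.66, -24.43),
    4: (52.84, -24.47),
}


def zscores(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


ids = sorted(scores)
gold_z = zscores([scores[i][0] for i in ids])
# Assumption: FlexX scores are negated so that "larger is better" for both programs.
flexx_z = zscores([-scores[i][1] for i in ids])

consensus = {i: g + f for i, g, f in zip(ids, gold_z, flexx_z)}
ranking = sorted(ids, key=consensus.get, reverse=True)
for i in ranking:
    print(f"compound {i}: consensus z = {consensus[i]:+.2f}")
```

A simple rank sum would tie all four compounds for this particular table, which is why z-scores are used in the sketch.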
Identification of P. falciparum ClpY (PfClpY) gene
A regulatory component of the ClpQY system
ClpY:
• Recognizes the substrate, unfolds it, and feeds it
into the degradation machine (ClpQ)
• Belongs to the AAA+ family of proteins
PfClpY:
• ~1.3 kb
• Contains all three ClpY domains: N, I and C
[Figure: PfClpY domain organization showing the N-domain, I-domain and C-domain, with the ATPase domain and the Walker A and Walker B motifs.]
Homology of PfClpY protein with homologs in other organisms
Variation in the I domain plays a role in recognition of
different substrates
Targeting the ClpQY interaction
Crystal structure of HslUV in H. influenzae
Modeled ClpQY interaction in P.falciparum
J Biomol Struct Dyn. 2009 Feb;26(4):473-9
Identification of drug targets using interaction networks
• Extract the microarray data from NCBI GEO
• Normalize if necessary; otherwise prepare Excel files for WGCNA analysis
(an Excel sheet of normalized data and gene significance)
• Analyse these files in the R language, running them in the R package "WGCNA"
• Visualize the networks with different graphs and software in the R package
• Find hub genes and modules that can be used as drug targets by referring
to these networks
• The principle behind constructing the network is that co-expressed genes
are related and can be connected to make a network, using the Pearson
correlation coefficient
• These networks can be used for finding drug targets
• They can also be used for annotation of proteins and genes by comparing
them through interactome studies
• They can be used for pathway annotation, better than other studies, as
they are based on the microarray data
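The principle can be sketched outside R as well. The example below is a simplified Python illustration, not the WGCNA package: it connects genes whose expression profiles have an absolute Pearson correlation above a hard threshold and reports the most connected gene as the hub. The gene names, expression values and the 0.9 threshold are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def coexpression_network(expression, threshold=0.9):
    """Edges between gene pairs whose |r| is at or above the threshold."""
    edges = []
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, r))
    return edges


def hub_genes(edges):
    """Genes with the highest number of connections."""
    degree = {}
    for g1, g2, _ in edges:
        degree[g1] = degree.get(g1, 0) + 1
        degree[g2] = degree.get(g2, 0) + 1
    top = max(degree.values())
    return sorted(g for g, d in degree.items() if d == top)


# Hypothetical expression values across five microarray samples.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [1.4, 1.2, 3.0, 4.8, 4.6],
    "geneC": [0.6, 2.8, 3.0, 3.2, 5.4],
    "geneD": [3.0, 1.0, 4.0, 1.0, 5.0],
}

edges = coexpression_network(expression)
hubs = hub_genes(edges)
print(edges)
print("hub genes:", hubs)
```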
Tools used:
• Sequence analysis: pairwise and multiple
sequence alignments, Pfam.
• Molecular modelling: Modeller
• Docking: Tripos FlexX, GOLD, ArgusLab
• PP network: R package and VisANT
Molecular docking hands on
• Download and install ArgusLab on Windows
• Load a PDB file and practice the ArgusLab tools
• Follow the tutorial at
http://www.arguslab.com/tutorials/tutorial_docking_1.htm
Molecular Docking using Argus lab:
Ex : Benzamidine inhibitor docked into Beta Trypsin
Create a binding site from bound ligand
Setting docking parameters
Analyzing docking results
Polypeptide builder.