Quantum Pharmaceuticals software and services


QUANTUM Drug Discovery Software

QUANTUM employs quantum mechanics, thermodynamics, and an advanced continuous water model for solvation effects to calculate ligand binding affinities. This approach differs dramatically from the scoring functions that are commonly used for binding affinity predictions. By including the entropy and aqueous electrostatics contributions in the calculations directly, QUANTUM algorithms produce much more accurate and robust values of binding free energies.

The interaction of a ligand with a protein is characterized by the binding free energy. The free energy (F) is the thermodynamic quantity that is directly related to the experimentally measurable inhibition constant (IC50); it depends on electrostatic, quantum, and aqueous solvation forces, as well as on the statistical properties of the interacting molecules. Two major contributions lead to non-additivity in F: 1) the electrostatic and solvation energy, and 2) the entropy.
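To make the link between free energy and a measured constant concrete, here is a minimal sketch of the standard thermodynamic relation ΔG = RT ln(Kd / c0). It assumes the measured constant can be treated as an equilibrium dissociation constant Kd with a 1 mol/L standard state; an IC50 only approximates Kd under particular assay conditions. The function names and the default temperature are illustrative and are not part of QUANTUM.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol*K)


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy (kJ/mol) from a dissociation constant in mol/L."""
    if kd_molar <= 0:
        raise ValueError("dissociation constant must be positive")
    # Dividing by the 1 mol/L standard state makes the logarithm dimensionless.
    return R_KJ_PER_MOL_K * temperature_k * math.log(kd_molar / 1.0)


def dissociation_constant(delta_g_kj_mol, temperature_k=298.15):
    """Inverse relation: dissociation constant (mol/L) from a free energy in kJ/mol."""
    return math.exp(delta_g_kj_mol / (R_KJ_PER_MOL_K * temperature_k))
```

Under these assumptions, each tenfold change in Kd at room temperature shifts the free energy by RT ln 10, a little under 6 kJ/mol, which gives a sense of scale for errors quoted in kJ/mol.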

Most popular scoring functions (e.g. AutoDock, Dock 5, XSCORE, GOLD etc.) employ a reasonable approximation for short-range quantum interactions, but do not perform a detailed calculation of aqueous electrostatics and entropy. Both the solvation energy and the entropy are difficult to compute, so instead of exact computations, scoring functions use an approximation. In this approximation, the contributions of non-additive properties are estimated as fractions of easier-to-calculate pairwise interactions: electrostatics and van der Waals forces.

Figure 1. Docking example: binding affinities for 220 protein-ligand complexes, calculation vs. experiment.

Such an approximation works to some extent because vacuum electrostatics is nearly canceled by solvation energies; at the same time the enthalpy of binding is approximately compensated by entropy. Therefore, the calculations of solvation energies and entropy can seemingly be avoided by combining molecular mechanics, van der Waals and electrostatic forces linearly with usually small numerical coefficients (of the order of 10%).
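The kind of linear score described above can be sketched as follows. This is an illustration only: the class name, the field names and the default weights of 0.1 (chosen to match the "order of 10%" mentioned in the text) are assumptions, not the coefficients of any particular scoring function.

```python
from dataclasses import dataclass


@dataclass
class PairwiseTerms:
    """Easy-to-calculate energy terms for one binding pose, all in kJ/mol."""
    van_der_waals: float
    electrostatic: float        # vacuum Coulomb energy
    molecular_mechanics: float  # internal strain of the ligand and site


def linear_score(terms, w_vdw=0.1, w_elec=0.1, w_mm=0.1):
    """Weighted sum of pairwise terms; solvation and entropy are never computed.

    The small weights stand in for the assumed near-cancellation of vacuum
    electrostatics by solvation and of enthalpy by entropy.
    """
    return (w_vdw * terms.van_der_waals
            + w_elec * terms.electrostatic
            + w_mm * terms.molecular_mechanics)
```

Because the weights are fitted to statistics rather than derived from the physics, two poses with very different solvation behavior can receive nearly the same score.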

However, a potential energy surface given by such linear combinations of unrelated quantities with statistics-based coefficients is not necessarily related to the true interaction profile. That is why such a simple score frequently fails to reproduce unique binding modes and hence gives docking false negatives. At the same time, this approximation tends to overestimate the affinity of weak binders, producing docking false positives. Thus, in spite of the reasonable accuracy of such predictions, the selectivity of a scoring function is low. This means that scoring functions frequently fail to identify a really strong binder among a pool of similar weak binders.

Moreover, affinities of weak binders may be overestimated.

QUANTUM software does not rely on approximate cancellations of important physical quantities. Instead, we employ our original continuous water model to compute both the vacuum and aqueous electrostatic energies, use quantum mechanics to calculate the short-range forces, and use thermodynamic sampling to obtain the value of the free energy (entropy). As a result, we can not only observe the necessary cancellations of individual energy components, but also perform molecular modeling in more realistic and physically justified potentials. Fig. 1 shows the results of a docking run on a single rigid protein structure for 220 different ligands. The r.m.s. error in free energy is 2 kJ/mol, and the correlation coefficient is 0.7.
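For readers who want to reproduce this kind of comparison on their own data, the two statistics quoted here can be computed as in the sketch below. It assumes paired lists of predicted and experimental free energies in the same units and uses the ordinary Pearson correlation coefficient; the source does not say which correlation measure was used, so that choice is an assumption.

```python
import math


def rms_error(predicted, experimental):
    """Root-mean-square difference between paired values."""
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(p - e) ** 2 for p, e in zip(predicted, experimental)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```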

The selectivity improvement in our approach is illustrated by the following model calculation. First, we derived a simple linear model directly from our QUANTUM vacuum force field. The short-range interactions were variationally adjusted (to allow for empirical hydrogen bonds). The van der Waals "scaling factors" and "protein dielectric constant" were found by correlating the suggested score with experimental binding affinities for a number of known complexes. Both the linear score and the complete QUANTUM force field were tested using 300 protein-ligand pairs and showed comparable accuracy.

Fig. 2 shows docking funnels (the energy difference between the conformers plotted as a function of r.m.s. deviation from the known crystallographic position of a ligand). Both full QUANTUM free energies (red lines) and the scoring function (blue lines) were used to calculate, in the case of QUANTUM, or estimate, in the case of the linear score, the free energies for numerous binding modes (conformers). The conformers were generated by the QUANTUM 3.3 docking program. The resulting energies were averaged over the conformers with similar r.m.s. values and plotted on the same graph.

Figure 2. Selectivity of the full thermodynamic model vs. simplified statistical score.
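The averaging step used to build a docking funnel can be sketched as below. The sketch assumes each conformer is given as an (r.m.s. deviation, energy) pair and groups conformers into fixed-width bins; the bin width of 0.5 and the function name are illustrative choices, since the source does not state how "similar r.m.s. values" were defined.

```python
from collections import defaultdict


def docking_funnel(conformers, bin_width=0.5):
    """Average energies over conformers with similar r.m.s. deviation.

    conformers: iterable of (rmsd, energy) pairs.
    Returns a list of (bin_center, mean_energy) sorted by increasing rmsd.
    """
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    bins = defaultdict(list)
    for rmsd, energy in conformers:
        bins[int(rmsd // bin_width)].append(energy)
    return [((index + 0.5) * bin_width, sum(values) / len(values))
            for index, values in sorted(bins.items())]
```

A steeper funnel shows up as a faster rise of the mean energy from the first bin outward.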

The model calculation shows that the QUANTUM force field has a much steeper docking funnel, i.e. it is more sensitive to misplacements of a ligand than a scoring function. Therefore the complete QUANTUM force field can be used to distinguish similar binding modes and hence obtain much more accurate docking positions. As a result, QUANTUM shows a dramatically lower rate of false positives and false negatives compared with a scoring-function-based method.

In fact, non-additive interactions (especially solvation effects) play a key role in molecular recognition of small molecules by proteins. The QUANTUM approach is practically the only accurate and highly sensitive method available to a broad audience of researchers that is capable of realistically modeling intermolecular interactions.

Quantum Pharmaceuticals software and services

Quantum Pharmaceuticals develops and markets a number of software tools and services for molecular modeling and drug discovery.

Compound profiling in silico

Compound profiling provides a comprehensive prediction of a compound's effects: identification of potential targets and prediction of therapeutic and adverse effects by in silico screening of the compound against a diverse protein set representing the human proteome.

A compound profile is predicted by virtual screening of a drug candidate molecule against hundreds of diverse proteins representing different active site types of the human proteome. As a result we obtain the Kd of the tested compound for every protein from our carefully preselected protein set. Active site types that show high affinity for the compound are considered sensitive. In the second round of screening, the tested compound is screened against all biologically important proteins with sensitive active site types and solved 3D structures.
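The two-round procedure can be outlined in code as follows. This is a sketch under stated assumptions: `predict_kd` is a hypothetical callable standing in for a docking and free energy calculation, the dictionaries mapping active site types to protein names are hypothetical inputs, and the 1 µM threshold for "high affinity" is an illustrative value that the source does not specify.

```python
def two_round_screen(predict_kd, representatives, proteins_by_site_type,
                     kd_threshold=1e-6):
    """Screen one compound in two rounds.

    predict_kd: callable(protein_name) -> predicted Kd in mol/L (hypothetical).
    representatives: dict site_type -> list of representative proteins.
    proteins_by_site_type: dict site_type -> all proteins with solved structures.
    Returns (sensitive_site_types, profile) where profile maps protein -> Kd.
    """
    # Round 1: a site type is sensitive if any representative binds tightly.
    sensitive = {
        site_type
        for site_type, proteins in representatives.items()
        if any(predict_kd(p) <= kd_threshold for p in proteins)
    }
    # Round 2: screen every protein that shares a sensitive site type.
    profile = {}
    for site_type in sorted(sensitive):
        for protein in proteins_by_site_type.get(site_type, []):
            profile[protein] = predict_kd(protein)
    return sensitive, profile
```

The point of the first round is economy: the expensive calculation is run on the full protein list only for site types that already show a response.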

Optionally, the screening can be run against specialized protein assays such as kinase assays. As opposed to compound profiling in vitro/vivo, compound profiling in silico does not require synthesizing or purchasing chemicals. In addition, experimental determination of cumulative effects takes from several months to several years, while virtual compound profiling does this within several hours or days. Finally, compound profiling in silico saves the lives of many animals.

Kinase Profiling assays - a unique in silico platform

Kinase profiling identifies prospective safe, effective and selective kinase inhibitors by virtual screening of a compound against our kinase profiling platform. Kinase profiling in silico does not require the sometimes complex and expensive synthesis of drug candidates or the acquisition of kinase assays and kits (and the associated time delays).

Our kinase assays represent practically all major kinase groups of the human kinome. The table below shows the number of kinase families from each group represented in our kinase platform.

Kinase group    Number of families    Families represented in our kinase platform    Number of kinases in our kinase platform
AGC             14                    7                                              8
CAMK            18                    8                                              11
CMGC            9                     6                                              8
CK1             3                     1                                              1
RGS             1                     0                                              0
STE             4                     3                                              5
TK              31                    17                                             24
TKL             8                     3                                              3
Other           32                    6                                              6
Atypical        12                    1                                              1
Total           132                   52                                             67

Figure: Prediction of pKd for kinase-inhibitor complexes, plotted against Experimental pKd.

The assays are constantly revised and expanded with newly solved kinase structures.

Quantitative Drug toxicity prediction (including LD50)

The service predicts drug toxicity and LD50. Toxicity prediction is provided both as an individual service and as an integral part of the compound profiling service. First, a toxicological profile is predicted by in silico screening of a drug candidate molecule against hundreds of diverse proteins representing different active site types of the human proteome. The active site types that show high affinity for a compound are considered sensitive. In the second round of the toxicity assessment procedure, all proteins similar to those with sensitive active site types are included in the screening. As a result, we obtain the IC50 (inhibition constant) of the tested compound for every protein from our toxicological protein platform. On the basis of this information we draw a conclusion about the adverse reaction targets of the compound and its possible adverse effects.

In the third step, our expert system calculates LD50 on the basis of the predicted affinities (the figure shows calculated values of LD50 plotted against experimental ones for 100 drug molecules taken from Drug Bank).

In contrast to toxicity testing in vitro/vivo, compound profiling in silico does not require synthesizing or purchasing chemicals. In addition, determination of chronic toxicity and carcinogenicity takes from several months to several years, while toxicity prediction in silico takes several hours or days. Finally, toxicity prediction in silico saves the lives of animals.

Figure: LD50 Prediction, plotted against Experiment (logarithmic units).

Virtual high throughput screening

Virtual screening identifies prospective strong inhibitors for a protein target specified by the customer. Hit identification is conducted by docking huge virtual libraries of compounds to the target protein.

Screening in silico dramatically decreases the number of compounds that require experimental assessment of their activity (thus cutting time and financial expenses) and increases the success rate of in vitro experiments. Virtual screening is an effective way to rationalize drug design. We are sure that Quantum virtual high throughput screening can successfully substitute for HTS and UHTS for protein targets with solved 3D structures.

The hit identification procedure consists of four steps. First, the library is divided into clusters according to the structural similarity of the compounds. Second, the representatives of every cluster are docked to a rigid structure model of the protein target. Third, the whole clusters containing identified hits are docked to the rigid structure model of the protein target. Fourth, the best docked molecules, with IC50 > 10 µM, are sent to a refinement calculation. Our refinement procedure is a complete Free Energy Perturbation molecular dynamics run for the whole protein-ligand complex in an aqueous environment, which takes protein flexibility into account.
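The first three steps, and the selection of candidates for the fourth, can be sketched as below. Everything named here is an assumption made for illustration: `similarity` and `dock` are hypothetical callables, a simple leader clustering stands in for whatever clustering method is actually used, docking scores are treated as energies where lower means stronger binding, and the cutoffs are placeholders. The refinement run itself is not shown.

```python
def leader_cluster(compounds, similarity, threshold=0.7):
    """Step 1: group compounds; the first member of each cluster represents it."""
    clusters = []
    for compound in compounds:
        for cluster in clusters:
            if similarity(compound, cluster[0]) >= threshold:
                cluster.append(compound)
                break
        else:
            clusters.append([compound])
    return clusters


def select_for_refinement(compounds, similarity, dock, hit_cutoff,
                          threshold=0.7, top_n=10):
    """Steps 2-4: dock representatives, expand hit clusters, pick the best.

    dock: callable(compound) -> score, lower is better (hypothetical).
    Returns up to top_n compounds to pass on to the refinement calculation.
    """
    clusters = leader_cluster(compounds, similarity, threshold)
    # Step 2: only clusters whose representative scores well are kept.
    hit_clusters = [c for c in clusters if dock(c[0]) <= hit_cutoff]
    # Step 3: dock every member of the surviving clusters.
    scored = [(dock(compound), compound)
              for cluster in hit_clusters for compound in cluster]
    scored.sort(key=lambda pair: pair[0])
    # Step 4: the best-scoring molecules go on to refinement.
    return [compound for _, compound in scored[:top_n]]
```

Docking only the representatives first keeps the cost roughly proportional to the number of clusters rather than the size of the library, at the price of missing a strong binder whose cluster representative happens to score poorly.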

Our hit rate for predicted strong inhibitors is close to 50% (see the FtsZ inhibitors search example).

Targeted Compound Libraries Design

Each targeted compound library is a set of compounds with high affinity, predicted in silico, for a given protein target. Focused compound libraries dramatically reduce the number of substances for experimental activity assessment and increase the number of really strong inhibitors found among experimentally tested compounds.

Our focused compound libraries have outstanding quality due to the high accuracy of docking and energy calculation. This became possible because our drug discovery software was developed using a new paradigm in molecular modeling: applying quantum and molecular physics instead of statistical approaches such as scoring-function-like and QSAR-like methods. This approach makes it possible to identify novel chemical classes of strong inhibitors.

Our hit rate for predicted strong inhibitors is close to 50%. This means that about half of the molecules from our compound library submitted for experimental activity assessment are real potent inhibitors.

Features of our Focused Compound Libraries

High Affinity

Compliance with Lipinski’s drug-like parameters

High Selectivity*

Low Toxicity*

Patentability*

*Optional features. Not included in the base price.

Our source compound library

>3 000 000 compounds.

LogP and LogD calculation

The main advantage of QUANTUM is the quality of the underlying physical models. Most competing approaches use different kinds of fragment-based descriptors to calculate molecular properties from the known properties of similar compounds (QSAR). Such models rely heavily on the additivity of molecular properties, are often overparametrized and lack direct physical justification. As a result, the predictive power of the models may be very good for structures similar to those used in the training set but may not be sufficient for absolutely novel compounds. QUANTUM derives molecular properties directly from first-principles-based models using advanced quantum mechanical analysis of molecular interactions and thermodynamics.

The service predicts the water/octanol partition coefficient, LogP, for any low-molecular-weight organic molecule (both charged and non-charged), calculates LogD for dissociative systems at a given pH, and evaluates drug-likeness according to Lipinski's rule. Parameters such as the solvent temperature and ionic strength are adjustable.

Mean square deviation between calculated and experimental values is 0.7 Log P units, which is of the same order as the accuracy of the experimental techniques currently used for determining LogP!

The figure shows calculated Log P values plotted against experimental ones for over 900 organic molecules (drugs from the Drug Bank database). The calculation showed outstanding correlation with experimental values: R² = 0.94.

Figure: Log P Prediction, plotted against Experimental Log P.

The ligands can be supplied in one of the commonly used chemical file formats such as .hin, .pdb, .sdf, .mol2. For LogD calculation a pKa value is necessary. If its experimental value is unavailable, you can order our pKa calculation service first to predict this value and then calculate LogD.
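Why LogD needs a pKa can be seen from the textbook relation between LogD and LogP for a compound with a single ionizable group. The sketch below assumes that only the neutral form partitions into octanol, which is a common simplification and not a description of how QUANTUM computes LogD; the function name and the `acidic` flag are illustrative.

```python
import math


def log_d(log_p, pka, ph, acidic=True):
    """LogD of a monoprotic compound, assuming only the neutral form partitions.

    For an acid the ionized fraction grows with pH - pKa;
    for a base it grows with pKa - pH.
    """
    exponent = (ph - pka) if acidic else (pka - ph)
    return log_p - math.log10(1.0 + 10.0 ** exponent)
```

At pH equal to pKa half of the compound is ionized, so under this assumption LogD is lower than LogP by log10(2), about 0.3 units.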

The output data includes LogP and LogD in logarithmic units and an estimate of the compliance of the drugs with Lipinski's rule of 5.
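For reference, a compliance check against Lipinski's rule of 5 can be written in a few lines. The thresholds below are the commonly quoted ones (molecular weight at most 500, LogP at most 5, at most 5 hydrogen bond donors and at most 10 acceptors); treating a compound with no more than one violation as compliant is a common convention and an assumption here, as the source does not say how the service reports compliance.

```python
def lipinski_violations(mol_weight, log_p, h_bond_donors, h_bond_acceptors):
    """Count how many of the four rule-of-5 criteria a compound violates."""
    checks = [
        mol_weight > 500.0,
        log_p > 5.0,
        h_bond_donors > 5,
        h_bond_acceptors > 10,
    ]
    return sum(checks)


def is_rule_of_five_compliant(mol_weight, log_p, h_bond_donors,
                              h_bond_acceptors, max_violations=1):
    """True when the number of violations does not exceed max_violations."""
    count = lipinski_violations(mol_weight, log_p, h_bond_donors, h_bond_acceptors)
    return count <= max_violations
```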

Water solubility/DMSO solubility prediction

The main advantage of QUANTUM is the quality of the underlying physical models. Most competing approaches use different kinds of fragment-based descriptors to calculate molecular properties from the known properties of similar compounds (QSAR). Such models rely heavily on the additivity of molecular properties, are often overparametrized and lack direct physical justification. As a result, the predictive power of the models may be very good for structures similar to those used in the training set but may not be sufficient for absolutely novel compounds. QUANTUM derives molecular properties directly from first-principles-based models using advanced quantum mechanical analysis of molecular interactions and thermodynamics.

The service predicts both water and dimethyl sulfoxide (DMSO) solubility of organic compounds at various temperatures and pH values (from 0.0 to 14.0). The accuracy of calculations for most structures is usually better than 0.2-0.5 logS units; for more complicated molecules the error can be up to 0.8-1.1 logS units. Parameters such as the solvent temperature, pH and ionic strength are adjustable. The results of the calculations are reported in both logarithmic (LogS) and absolute (g/l) units.
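The relation between the two reported units can be illustrated with a short conversion. It assumes that LogS is the base-10 logarithm of the molar solubility in mol/l, which is the usual convention but is not stated explicitly in the source; the function name is illustrative.

```python
def log_s_to_grams_per_litre(log_s, molar_mass_g_mol):
    """Convert LogS (log10 of solubility in mol/l) to solubility in g/l."""
    if molar_mass_g_mol <= 0:
        raise ValueError("molar mass must be positive")
    return (10.0 ** log_s) * molar_mass_g_mol
```

Under this convention an error of 0.5 logS units corresponds to roughly a threefold error in the absolute solubility.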

The figures below show calculated solubility in water and DMSO for over 1300 and 60 compounds, respectively, plotted against their experimental values. The white points on the water solubility graph represent the LogS values calculated for a number of commercially available drugs taken from the Drug Bank database, whereas the blue points show our performance for a set of generic molecules taken from the Virtual Computational Chemistry Lab. The points on the DMSO solubility graph represent the LogS values calculated for a number of chemicals taken from the Gaylord Chemical Corporation database.

RMSD = 0.81 LogS units; R² = 0.84

RMSD = 0.70 LogS units; R² = 0.72

The compound structures can be supplied in one of the commonly used chemical file formats such as .hin, .pdb, .sdf, .mol2. The final report includes aqueous and DMSO solubility in logarithmic and absolute units (g/l). The results of the calculations can be saved in Excel-readable (.csv) format.

In silico identification of a protein target for Low Molecular Weight (LMW) organic compounds

The service identifies a target protein for biologically active LMW organic compounds (e.g. compounds from natural substrates: herbs, venoms etc). Suppose you have already experimentally identified biologically active molecules with unknown targets and mechanisms of action. We offer a fast and cost-saving way to identify the targets of such active compounds.

The flow chart shows how we identify a target. Initially, a compound is screened against 2-3 representative proteins of every active site pattern (where solved 3D structures are available). The active sites of the proteins with high affinity for the compound are considered sensitive. Then the compound is screened against all proteins with sensitive active site patterns and solved 3D structures. As a result, sensitive active site patterns and proteins are identified.

If the active compound in a complex mixture (e.g. a natural substrate) has not been identified, the procedure described above is run for every component of the mixture. The ingredients that show high affinity for any protein are considered active.
