Molecular Docking Using GOLD

Tommi Suvitaival
Seppo Virtanen

S-114.2500 Basics for Biosystems of the Cell
Fall 2006

Table of Contents

Introduction
Software and Virtual Screening
Fitness Functions
GoldScore
ChemScore
Combined Fitness Functions
Genetic Algorithm
Overview
The Genetic Algorithm in GOLD
Algorithm Efficiency
References

Introduction

Computational molecular docking is a research technique for predicting whether one molecule will bind to another, usually a protein. A ligand is a molecule that is small compared to a protein and that binds to a macromolecule. In biochemistry, the macromolecule is usually a protein. When binding to a protein, the ligand changes the conformation of the larger molecule and thereby affects how the protein operates. In cell biology, the ligand is usually a signal molecule. For example, a cell can obtain information about its surroundings via receptors embedded in its lipid bilayer and then adjust its physiology accordingly.

In protein-ligand docking the goal is to predict the position and orientation of a ligand when it is bound to a protein receptor or enzyme. The starting point is that the structures of the protein under study and of the ligand are known. Such information can be obtained by experimental methods such as X-ray crystallography.

Because macromolecules have a huge number of degrees of freedom, the number of possible conformations is far too large for all of them to be compared, so the problem must be limited in some way. The need for computational power can be reduced by simplifying the model. The active site of the protein-ligand interaction must of course be modeled as precisely as possible, but regions of the macromolecule further away can be modeled less precisely because their interaction with the active region is much weaker.

Software and Virtual Screening

There are a few programs centered on predicting protein-ligand docking.
The most important property of such software is the ability to reproduce the experimental binding modes of ligands found by crystallographic methods. To test a program, an imaged docked ligand is taken out of the protein-ligand complex. The algorithm then evaluates the best conformations for docking the two molecules, and the computed result is compared with the real-world conformation by calculating the root-mean-square deviation (RMSD). The RMSD, also called the root-mean-square error (RMSE), measures the effective difference between expected and predicted values. In this case, the imaged positions of the atoms are taken as the expected values. The RMSD can be calculated from the equation

\[ \mathrm{RMSD} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \lVert \mathbf{x}_i - \mathbf{y}_i \rVert^2}, \]

where \mathbf{x}_i is the predicted position of atom i, \mathbf{y}_i its imaged position, and n the number of atoms. A docking is usually considered successful when the RMSD is below 2.0 Å.

The greatest interest in docking is in the life sciences, so it is no wonder that computational methods concentrate on drug-like ligands. The functionality of docking software is also usually tested with molecules that have the features of a typical medicine. Lipinski's Rule of Five is a rule of thumb for such features: generally, an orally active drug has a molecular weight under 500 u, at most 5 hydrogen bond donors and 10 hydrogen bond acceptors, and an octanol-water partition coefficient log P below 5.

Virtual screening, in which large numbers of compounds are docked one after another, relies on this kind of testing. The test library consists of compounds with well-known dockings and is obtained by imaging protein-ligand complexes. Accuracy of the predicted conformation is not the only property sought after. The efficiency of the algorithm also plays a critical role when large numbers of molecules are evaluated one after another. An easy way of speeding up an algorithm is to go through fewer steps or to take fewer samples in the genetic algorithm.
There is, however, a danger of losing accuracy.

A large and carefully constructed set of protein-ligand complexes is required for estimating the success rate of a docking program. In the library, which is a data bank of protein structures, the complexes should represent the usual features of protein-ligand docking. A validation set of complexes should not contain protein-ligand clashes, crystallographic contacts, or unlikely ligand geometries. Diversity among the complexes should still be preserved to get as broad a view as possible of the quality of an algorithm.

Fitness Functions

A good docking program also takes binding affinity into account. In a scoring function, the atom-scale electromagnetic forces have to be taken into consideration. The results of fast-working algorithms show a trend of disagreement between model and real world where binding affinity is concerned.

GOLD offers a choice of fitness functions: GoldScore, ChemScore, and a user-defined score. GoldScore and ChemScore are equally reliable, but they may give different predictions depending on the problem.

GoldScore

The GoldScore fitness function is the original GOLD scoring function and is selected by default. It is made up of four components: protein-ligand hydrogen bond energy, protein-ligand van der Waals energy, ligand internal van der Waals energy, and ligand torsional strain energy. Optionally, a fifth component, the ligand intramolecular hydrogen bond energy, may be added. Empirical parameters used in the fitness function, such as hydrogen bond energies, atom radii and polarizabilities, and hydrogen bond directionalities, are taken from a parameter file.

The GoldScore function uses bond strengths in the fitness function, which is of the form

\[ f = S_{hb\_ext} + S_{vdw\_ext} + S_{hb\_int} + S_{vdw\_int}, \]

where S_hb_ext is the protein-ligand hydrogen bonding score and S_hb_int the internal hydrogen bonding score of the ligand. Usually, the best result is obtained by letting the internal hydrogen bonding term tend to zero.
S_vdw_ext and S_vdw_int are the scores arising from the weak van der Waals forces between protein and ligand and within the ligand, respectively.

GoldScore has a mechanism for placing the ligand in the binding site that is based on fitting points. The program adds hydrogen-bonding fitting points to the protein and the ligand. It then maps acceptor points on the ligand onto donor points in the protein and vice versa. Additionally, it generates hydrophobic fitting points in the protein cavity, onto which ligand CH groups are mapped.

The GoldScore fitness function is optimized for the prediction of binding positions rather than binding affinities. The actual search algorithm is a genetic algorithm that optimizes several parameters, one of which is the fitting point mapping described above. The other parameters are the dihedrals of ligand rotatable bonds, ligand ring geometries, and the dihedrals of protein OH and NH3+ groups. All of these variables arise from the multiplicity of possible conformations the molecules can adopt.

ChemScore

ChemScore was derived empirically from a set of 82 protein-ligand complexes for which measured binding affinities were available. Unlike GoldScore, ChemScore was trained by regression against measured affinity data, although there is no clear indication that it is superior to GoldScore in predicting affinities. ChemScore estimates the total free energy change that occurs on ligand binding as

\[ \Delta G_{binding} = \Delta G_0 + \Delta G_{hbond} + \Delta G_{metal} + \Delta G_{rot} + \Delta G_{lipo}, \]

where

\[ \Delta G_0 = v_0, \quad \Delta G_{hbond} = v_1 P_{hbond}, \quad \Delta G_{metal} = v_2 P_{metal}, \quad \Delta G_{lipo} = v_3 P_{lipo}, \quad \Delta G_{rot} = v_4 P_{rot}. \]

Here the v terms are regression coefficients and the P terms represent the various types of physical contributions to binding. The final ChemScore value is obtained by adding a clash penalty and internal torsion terms, which militate against close contacts in docking and poor internal conformations. Covalent and constraint scores may also be included.
\[ \mathrm{ChemScore} = \Delta G_{binding} + P_{clash} + c_{internal} P_{internal} \; (+ \, c_{covalent} P_{covalent} + P_{constraint}) \]

The hydrogen-bond term is computed as a sum over all possible acceptor-donor pairs in which one atom belongs to the protein and the other to the ligand. Each term in the summation is the product of three Gaussian-smoothed block functions. The purpose of the block functions is to reduce the contribution of a hydrogen bond according to how much its geometry deviates from (a) the ideal H…A distance (where 'H' is the hydrogen atom linked to the donor atom 'D', '…' the hydrogen bond, and 'A' the acceptor atom), (b) the ideal D-H…A angle (where D-H is the covalent bond between the donor and the hydrogen atom), and (c) the ideal directionality with respect to the acceptor atom. The maximum contribution of a given acceptor-donor pair to the summation is 1, which occurs if the pair forms a hydrogen bond of "ideal" geometry.

The block function is of the form

[equation missing from the source]

and the Gaussian-smoothed block function of the form

[equation missing from the source]

The summation function for hydrogen bond strengths is

[equation missing from the source]

where r is the distance and α the angle described above. In ChemScore the block function is convolved with a Gaussian function, and σ represents the smearing sigma for each term. The third block function in the hydrogen-bond equation, B'*, is the product of all possible values for a given hydrogen bond. For example, a tertiary amine acceptor has three covalently bound atoms that could be deemed the 'X' atom. In this case, the term added for a hydrogen bond to the amine is the product of the block function values for all three possible H…A-X angles.

The metal-binding term in ChemScore is computed as a sum over all possible metal-ion-acceptor pairs, where the acceptor is an atom in the ligand that is capable of binding to a metal. Again a Gaussian-smoothed block function is used, whose purpose is to reduce the contribution of the metal-acceptor interaction if the geometry is not ideal.
In the metal-binding term, r_aM is the actual acceptor-metal (A-M) distance, R_ideal is the ideal A-M distance, R_max is the maximum A-M distance to be considered a binding interaction, and σ_metal is the Gaussian smearing sigma associated with this term.

The lipophilic term is defined in a similar way. In it, r_ll is the actual distance between a pair of lipophilic atoms, R_ideal is the ideal atom-atom distance, R_max is the maximum separation beyond which no interaction is deemed to occur, and σ_lipo is the Gaussian smearing sigma associated with this term. Lipophilic atoms are defined as non-accepting sulphurs, non-polar carbon atoms, and non-ionic chlorine, bromine and iodine atoms.

The following formula is used to estimate the entropic loss that occurs when single, acyclic bonds in the ligand become non-rotatable upon binding:

[equation missing from the source]

N_rot is the number of frozen rotatable bonds in the ligand (a bond is considered frozen if one or more atoms on both sides of the rotatable bond are in contact with the protein). The expression is deemed to have a value of zero if there are no rotatable bonds in the ligand. P_nl(r) and P'_nl(r) are the percentages of non-hydrogen atoms on either side of the rotatable bond that are not lipophilic. For example, if there are 10 non-hydrogen atoms on one side of the bond, of which 3 are not lipophilic, and 20 non-hydrogen atoms on the other side, of which 2 are not lipophilic, then P_nl(r) and P'_nl(r) are 30% and 10%, respectively.

In addition, the final ChemScore fitness function contains a clash penalty term and an internal torsion term. Clashes between protein and ligand atoms and internal torsional strain in the ligand are handled by penalty terms in order to prevent poor geometries in docking. The clash penalty depends on the nature of the contact: whether it is a hydrogen-bonding contact, a metal-binding contact, or neither of these.
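The Gaussian-smoothed block function that appears in the hydrogen-bond, metal-binding and lipophilic terms can be illustrated with a minimal Python sketch. It assumes the piecewise-linear block function commonly quoted for ChemScore (1 up to an ideal value, falling linearly to 0 at a maximum value) and smooths it by numerical convolution with a Gaussian. The function names and all parameter values are hypothetical and are not taken from the GOLD parameter file.

```python
import math

def block(x, x_ideal, x_max):
    """Assumed block function: 1 up to x_ideal, linear fall to 0 at x_max."""
    if x <= x_ideal:
        return 1.0
    if x >= x_max:
        return 0.0
    return 1.0 - (x - x_ideal) / (x_max - x_ideal)

def smoothed_block(x, x_ideal, x_max, sigma, n=2001, width=6.0):
    """Convolve block() with a normalised Gaussian of width sigma.

    The integral is approximated on a symmetric grid of n points
    spanning +/- width * sigma around x.
    """
    if sigma <= 0:
        return block(x, x_ideal, x_max)
    lo = -width * sigma
    step = 2.0 * width * sigma / (n - 1)
    total = 0.0
    norm = 0.0
    for i in range(n):
        t = lo + i * step
        w = math.exp(-0.5 * (t / sigma) ** 2)
        total += w * block(x - t, x_ideal, x_max)
        norm += w
    return total / norm

def pair_contribution(terms):
    """Product of smoothed block values, one per geometric criterion.

    Each element of terms is (deviation, ideal, maximum, sigma).
    """
    result = 1.0
    for deviation, ideal, maximum, sigma in terms:
        result *= smoothed_block(deviation, ideal, maximum, sigma)
    return result

# Hypothetical deviations of one donor-acceptor pair from ideal geometry:
# a distance deviation in angstroms and two angle deviations in degrees.
near_ideal = [(0.05, 0.25, 0.65, 0.1), (5.0, 30.0, 80.0, 10.0), (5.0, 70.0, 80.0, 10.0)]
distorted = [(2.00, 0.25, 0.65, 0.1), (5.0, 30.0, 80.0, 10.0), (5.0, 70.0, 80.0, 10.0)]
print(pair_contribution(near_ideal), pair_contribution(distorted))
```

A pair close to ideal geometry contributes a value near 1, while a pair whose distance deviation lies well beyond the maximum contributes 0, which is the behaviour the text describes for the individual terms of the summation.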
Combined Fitness Functions

In the Goldscore-CS protocol, dockings are produced with the Goldscore function and then ranked by Chemscore. Conversely, in Chemscore-GS, dockings are produced with Chemscore and ranked by Goldscore. Docking with Chemscore is up to three times faster, but with larger ligands Goldscore gives more accurate results. With small ligands, no such difference appears. The difference therefore seems to arise from the number of degrees of freedom in the molecules. A combination of both functions, as in Goldscore-CS and Chemscore-GS, gives improved results. Goldscore-CS gives success rates of up to 81 %, where success means that the top-ranked GOLD solution is within 2.0 Å (the usual root-mean-square distance regarded as a successful docking prediction) of the experimental binding mode. A longer search time is the cost of combining the methods. In terms of producing binding-energy estimates, the Goldscore function appears to perform better than the Chemscore function and the two consensus protocols, particularly for faster search settings.

Verdonk et al. compared results from the Goldscore and Chemscore functions and concluded that Goldscore outperforms Chemscore on larger molecules. Usually Goldscore was better, but there were also cases where Chemscore was the winner. The interesting finding was the scarcity of cases in which both functions favoured the same incorrect conformation. In all cases, a combination of the two functions gave better results than either function alone. The reason is that errors are more probable when only one function is used. The "hard failures", incorrect conformations that rank highly under one function, do not score well under the other function, which uses different parameters for the scoring. Although an incorrect conformation might receive a good score from one function, the other function can be used as a filter for these failures.
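The rescoring idea behind the consensus protocols can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Docking` record, the `rerank` helper and all score values are made up, and both scores are assumed to be "higher is better". It is not GOLD's own implementation.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Docking:
    name: str
    goldscore: float  # score from the function that produced the docking
    chemscore: float  # score from the function used for re-ranking

def rerank(dockings: List[Docking], rescoring: Callable[[Docking], float]) -> List[Docking]:
    """Order dockings from best to worst according to a second scoring function."""
    return sorted(dockings, key=rescoring, reverse=True)

# Made-up dockings produced with the first function. "A" scores best under
# the first function but poorly under the second, as a hard failure would.
produced = [
    Docking("A", goldscore=62.0, chemscore=11.0),
    Docking("B", goldscore=58.0, chemscore=31.0),
    Docking("C", goldscore=40.0, chemscore=20.0),
]

reranked = rerank(produced, rescoring=lambda d: d.chemscore)
print([d.name for d in reranked])
```

In this toy data the docking that the first function ranks highest is pushed to the bottom by the second function, which is how the second function acts as a filter.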
Goldscore-CS, where the conformations are first ranked by Goldscore and then verified by Chemscore, gives better results than Chemscore-GS, where the functions are used in the reverse order. Accordingly, Goldscore, which gives better results than Chemscore on its own, also gives better results when used as the primary scoring function (as in Goldscore-CS).

As mentioned above, the advantage of the Chemscore function is its efficiency. There are two main reasons for this difference. Firstly, Chemscore does not take hydrogen atoms into account in the lipophilic and clash terms. Therefore the external van der Waals term, S_vdw_ext, can be precalculated. Secondly, the functional form of the ligand intramolecular energy is simpler in Chemscore.

Consensus docking, where several functions are used, can also be extended to using several algorithms. DOCK, FlexX, and GOLD have been used together to get better results.

Verdonk et al. also noted that the Goldscore function correlates with binding affinity as well as ΔG_binding does, which is surprising because the Goldscore function was not fitted to binding affinity data. To investigate this, though, the intramolecular terms of the Goldscore function must be subtracted, because they cannot be compared between different complexes. The correlation between the real and modeled values deteriorates rapidly with faster search settings. This means that a correctly predicted binding mode is essential for obtaining reasonable estimates of the binding energy.

Genetic Algorithm

Overview

A genetic algorithm can be used to evolve the pose of the molecule in the search for the optimum state. A genetic algorithm requires the definition of a fitness function. The function must emphasize the properties of the evolving system that are being optimized. In the case of protein structure and docking, the natural measure of the quality of a conformation is the overall energy of the molecule. The energy of the molecule varies as a function of the positions of its components.
There are therefore one or more conformational states towards which the geometry of the molecule tends to converge. The task of the algorithm is to find these few states among the multitude of all states by changing the values of the variables.

The algorithm is initialized by producing multiple conformations at random. From these candidates, the several conformations with the most favorable values of the fitness function are chosen for the next step. The properties of these conformations are then recombined with each other. This recombination is analogous to the recombination of DNA in chromosomes. The conformations are also mutated to bring new properties into the system. From these recombined and mutated conformations a new generation is then chosen in a similar way. After several steps the system finds its optimum, so that the best conformation is among the results.

The Genetic Algorithm in GOLD

GOLD optimizes the fitness score by using a genetic algorithm. A population of potential solutions (in this case, possible docked orientations of the ligand) is set up at random. Each member of the population is encoded as a chromosome, which contains information about the mapping of ligand H-bond atoms onto complementary protein H-bond atoms, the mapping of hydrophobic points on the ligand onto protein hydrophobic points, and the conformation around flexible ligand bonds and protein OH groups. Each chromosome is assigned a fitness score based on its predicted binding affinity, and the chromosomes within the population are ranked according to fitness. The population of chromosomes is optimized iteratively. At each step, a point mutation may occur in a chromosome, or two chromosomes may mate to give a child. The selection of parent chromosomes is biased towards fitter members of the population.
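The loop described above can be sketched as a small steady-state genetic algorithm in Python. This is a toy illustration, not GOLD's implementation: the chromosome is simply a list of torsion-like angles, the fitness function is a made-up stand-in for a docking score, parent selection uses rank-based weights, and every name and default value (`run_ga`, `p_mutate`, the target angles) is hypothetical.

```python
import random

def run_ga(fitness, n_genes, pop_size=20, n_ops=2000, p_mutate=0.5, seed=0):
    """Steady-state GA: each operation creates one child that replaces the worst member.

    Assumes pop_size >= 2. Returns the best chromosome found and the best
    fitness of the initial random population.
    """
    rng = random.Random(seed)
    # Random initial population; each gene is an angle in degrees.
    pop = [[rng.uniform(-180.0, 180.0) for _ in range(n_genes)] for _ in range(pop_size)]
    pop.sort(key=fitness, reverse=True)  # fittest first
    initial_best = fitness(pop[0])
    # Rank-based weights bias parent selection towards fitter members.
    weights = [pop_size - rank for rank in range(pop_size)]
    for _ in range(n_ops):
        if rng.random() < p_mutate:
            # Point mutation: perturb one gene of a single parent.
            child = list(rng.choices(pop, weights=weights, k=1)[0])
            child[rng.randrange(n_genes)] += rng.gauss(0.0, 10.0)
        else:
            # Crossover: two parents mate to give one child.
            a, b = rng.choices(pop, weights=weights, k=2)
            cut = rng.randrange(1, n_genes) if n_genes > 1 else 0
            child = a[:cut] + b[cut:]
        pop[-1] = child  # the child replaces the worst member
        pop.sort(key=fitness, reverse=True)
    return pop[0], initial_best

# Made-up fitness: higher is better, with the maximum (0) at the target angles.
TARGET = [60.0, -45.0, 120.0, 0.0]

def toy_fitness(chromosome):
    return -sum((g - t) ** 2 for g, t in zip(chromosome, TARGET))

best, initial_best = run_ga(toy_fitness, n_genes=len(TARGET))
print(toy_fitness(best), initial_best)
```

Because a child only ever replaces the worst member, the best chromosome is never lost, so the best fitness cannot decrease over the run. The number of operations `n_ops` plays the role of the run-length parameter discussed in the text.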
A number of parameters control the precise operation of the genetic algorithm: population size, selection pressure, number of operations, number of islands, niche size, operator weights, and the van der Waals and hydrogen bonding annealing parameters. Changing the default values of these parameters is not recommended.

Population size refers to the number of chromosomes on one island. It is possible to have two or more islands, each with a specific population size.

Each of the genetic operations (crossover, migration and mutation) takes information from parent chromosomes and assembles this information in child chromosomes. The child chromosomes then replace the worst members of the population. Again, the selection of parent chromosomes is biased towards those of high fitness. The selection pressure is defined as the ratio between the probability that the fittest member is selected as a parent and the probability that an average member is selected as a parent. Too high a selection pressure will result in the population converging too early.

The genetic algorithm starts off with a random population. Genetic operations are then applied iteratively to the population. The parameter Number of operations is the number of operations that are applied over the course of a run. It is the key parameter in determining how long a run will take.

Rather than maintaining a single population, the genetic algorithm can maintain a number of populations that are arranged as a ring of islands. Individuals can migrate between adjacent islands using the migration operation. The effect of the number of islands on the efficiency of the algorithm is uncertain.

Niching is a common technique used in genetic algorithms to preserve diversity within the population. In GOLD two individuals share the same niche if the RMSD (deviation) of the coordinates of their donor and acceptor atoms is less than 1 Å.
When a new individual is added to the population, a count is made of the number of individuals in the population that inhabit the same niche as the new chromosome. If there are more than the configured number of individuals (the niche size) in the niche, the new member replaces the worst member of the niche rather than the worst member of the total population.

Operator weights are the parameters mutate, crossover and migrate. They govern the relative frequencies of the three types of operations that can occur during a genetic optimization: point mutation of a chromosome, migration of a population member from one island to another, and crossover of two chromosomes.

Algorithm Efficiency

The efficiency of the algorithm depends on how many steps the algorithm takes. If the optimal state is found more quickly than expected, the algorithm can be stopped to save time. If the algorithm converges so that the optimum state is found repeatedly within a certain margin of error, no more steps are needed. By default, GOLD terminates when the top three dockings are within 1.5 Å of each other.

More efficiency can be gained by fixing the large protein almost completely at its own energy optimum. The attaching ligand affects only the receptor site of the protein, leaving the rest of the protein unchanged. The ligand conformation can also be considered constant except for its docking groups (OH and NH3+). In this way, a lot of computational power can be saved without considerable loss of accuracy. Of course, there are cases where the ligand affects the protein conformation so much that the geometry of the complex no longer resembles that of the unbound protein.

The total time spent docking a ligand obviously depends on the number of docking runs, which by default is set to 10 for each ligand. Reducing the number of docking runs makes GOLD faster. However, it is useful to perform at least a few docking runs on each ligand. This increases the chances of getting the right result.
If the same answer is found in several different runs, it is usually a strong indicator that the answer is correct. The early termination option can be used to save time. This option instructs GOLD to terminate docking runs on a given ligand as soon as a specified number of runs have given essentially the same answer.

The time taken by GOLD to dock ligands can be controlled by altering the values of the genetic algorithm parameters. The easiest way to make GOLD faster is to reduce the number of genetic algorithm operations performed in the course of a run. GOLD manipulates a pool of chromosomes of size (population size) × (number of islands). The size of this pool should be such that the optimization converges within the specified number of operations. If the pool size is too small for a given number of operations, the algorithm will converge prematurely.

References

Paul, N., Rognan, D. ConsDock: A new program for the consensus analysis of protein-ligand interactions. Proteins, 47, 521-533, 2002.

Jones, G., Willett, P., Glen, R. C., Leach, A. R., Taylor, R. Development and validation of a genetic algorithm for flexible docking. J. Mol. Biol., 267, 727-748, 1997.

Verdonk, M. L., Cole, J. C., Hartshorn, M. J., Murray, C. W., Taylor, R. D. Improved protein-ligand docking using GOLD. Proteins, 52, 609-623, 2003.