GLOSSARY

Complex: Two or more molecules bound together.
Interactor or component: One of the constituents of a complex.
Ligand: A drug-like small molecule, or the smaller of two interacting proteins.
Receptor: The larger interactor.
Binding sites: Place on the molecular surface of each interactor where it contacts the other(s).
Interface: A part of the complex where two interactors contact each other.
Docking: Identifying the structure of a complex by using knowledge of the interactors.
Rigid-body docking: A simple model which only permits wholesale translations and rotations of each interactor relative to the other. Distances between all pairs of atoms within each interactor are not varied.
Flexible docking: A model which accommodates change in the torsion angles of the interactors. More rarely, bond angles or bond lengths may also be varied.
Sidechain flexible docking: A model which accommodates change in the torsion angles, bond angles or bond lengths of the sidechains of protein interactors, but not their backbones.
Protein backbone: The chain of atoms in a protein formed by the peptide bonds and the alpha-carbon atoms between neighbouring residues.
Protein sidechains: The branches of atoms in a protein which have an alpha-carbon atom at their base.
Bond torsion: Rotating atoms downstream of a chemical bond about an axis defined by the bond.
Torsion angle or dihedral angle: The angle of bond torsion necessary to line up two reference atoms.
Bond flexing: Changing the angle between two chemical bonds formed by a given atom.
Bond angle: The angle between two chemical bonds formed by a given atom.
Native: The structure of a molecular complex as it exists in vivo. More loosely can refer to an in vitro experimental determination by X-ray crystallography, nuclear magnetic resonance, electron microscopy etc.
Configurations: A set of models for a molecular complex.
Decoys: A set of incorrect models for a molecular complex, constructed for the purpose of developing an effective scoring protocol.
Double blind trial for docking (cf. CAPRI): A trial where ‘modellers’ submit models of a given complex to ‘assessors’. Modellers do not have access to knowledge of native structure; assessors do not have access to the identity of modellers.
Coordinate file: A computer file containing 3-dimensional coordinates for centres of the atoms of a molecule.
PDB file: The dominant file format for coordinates of proteins (‘Protein Data Bank’).
Scoring function or fitness function: A scheme for associating a number with each model of a complex, aiming to reward models near to the native complex structure.
Energy function: A scoring function which has a physical interpretation as energy.
Pairwise energy: An energy function which has a contribution from each possible pair of atoms in the model.
Electrostatics: A pairwise energy function encoding Coulomb’s law, $E_{\mathrm{es}} = \frac{q_1 q_2}{\varepsilon(r_{12})\, r_{12}}$.
Dielectric: The function $\varepsilon(r_{12})$ in electrostatics. The choice $\varepsilon(r_{12}) = r_{12}$ is common.
Stereochemistry or Van der Waals: Penalizing geometric clashes between atoms, and rewarding glancing contact.
Lennard-Jones: A stereochemical pairwise energy function of the form $E_{\mathrm{vdw}} = \frac{A_{12}}{r_{12}^{12}} - \frac{B_{12}}{r_{12}^{6}}$.
Force field: A prescription for the values $q_1$, $q_2$, $\varepsilon(r_{12})$, $A_{12}$, $B_{12}$ for all atoms. Examples are CHARMM and AMBER.
Solvation: Rewarding the exposure of polar residues to water, and penalizing their burial at the interface.
Convolution: A combination of two mathematical functions based on superimposing one of them at every place where the other is non-zero.
Fourier series: A mathematical technique based on a global sum or integral weighted by a sine or cosine. Useful for evaluating scores which have the form of ‘convolutions’.
Monte Carlo: A generic term for mathematical techniques based on random numbers. In the Monte Carlo approach to docking, new configurations are generated by applying random displacements to existing ones.
Normal modes: Patterns of perturbation of the atomic coordinates for which the molecule will behave like a simple spring (with linear restoring force) in response. A protein’s normal modes with the lowest frequency of oscillation can be identified, and tend to be similar to frequently observed patterns of large conformational change on docking, for instance pincer and hinge movements.
Affinity: The likelihood that the interactors will associate to form the complex in the cellular environment.
Mutagenesis: An experimental technique to identify binding sites in protein complexes, by replacing each residue by alanine and re-measuring the affinity.
Correlated mutations: Pairs of residues, one from each interactor, which are either both mutated or both not mutated across a range of species. Binding sites are suggested by the presence of correlated mutations.
Molecular surface: The locus of points on the boundary of at least one atom and not on the inside of any atoms of the molecule. May be artificially smoothed to remove internal voids and channels.
Solvent-accessible surface: The locus of the centre of a water molecule which is in glancing contact with at least one atom of the molecule but does not penetrate any atoms.
Normal: The normal at a point on a surface is the direction in which the surface is facing at that point.
Marching cubes: Method for generating a simplicial complex which approximates a molecular surface or the solvent-accessible surface of a molecule.
Voxel: A 3-dimensional pixel.
High-performance computing (HPC): Dividing a single large task between multiple processors.
Experimental screening: Automated lab testing of a large variety of compounds for their suitability as drug leads.
Virtual screening: Using computers to score a large library of compounds for their suitability as drug leads.
ADME(T): Acronym of the important criteria for suitability of a molecule as a drug: Absorption, Distribution, Metabolism, Excretion, (Toxicity).
Lipinski’s rule of 5: Heuristic rules for identifying potentially ADME-compliant drugs: not more than 5 H-bond donors, not more than 10 H-bond acceptors, molecular weight less than 500, log partition coefficient less than 5.
QSAR: Informatics methods for inferring the suitability of a molecule as a drug (Quantitative Structure-Activity Relationship). The next step in sophistication from Lipinski’s rule.
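
The four thresholds in Lipinski’s rule of 5 are simple enough to write down as a filter. The following is a minimal sketch in Python, assuming the four descriptors have already been computed by some other tool. The names `Descriptors` and `lipinski_violations` are hypothetical and chosen here for illustration only, and the thresholds are taken exactly as worded in the entry above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Descriptors:
    """Precomputed properties of one candidate molecule (hypothetical container)."""
    h_bond_donors: int
    h_bond_acceptors: int
    molecular_weight: float  # in daltons
    log_p: float             # log partition coefficient


def lipinski_violations(d: Descriptors) -> List[str]:
    """Return the list of rule-of-5 criteria that the molecule fails."""
    violations = []
    if d.h_bond_donors > 5:           # "not more than 5 H-bond donors"
        violations.append("more than 5 H-bond donors")
    if d.h_bond_acceptors > 10:       # "not more than 10 H-bond acceptors"
        violations.append("more than 10 H-bond acceptors")
    if not d.molecular_weight < 500:  # "molecular weight less than 500"
        violations.append("molecular weight not less than 500")
    if not d.log_p < 5:               # "log partition coefficient less than 5"
        violations.append("log partition coefficient not less than 5")
    return violations


if __name__ == "__main__":
    # Illustrative, made-up descriptor values, not data for a real compound.
    candidate = Descriptors(h_bond_donors=2, h_bond_acceptors=5,
                            molecular_weight=350.0, log_p=2.5)
    print(lipinski_violations(candidate) or "passes all four criteria")
```

Returning the failed criteria rather than a single yes/no keeps the filter useful in virtual screening, where one may wish to tolerate a single violation rather than reject a compound outright.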
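
The Electrostatics, Dielectric and Lennard-Jones entries together describe a pairwise energy, which can be sketched as a short Python function. This is an illustration under stated assumptions, not a real force field: energies are in arbitrary units, the dielectric is the distance-dependent choice in which it equals the separation $r_{12}$, a single pair of coefficients `a12` and `b12` is used for every atom pair, and only pairs with one atom from each interactor are summed. The function names and the default coefficients are hypothetical.

```python
import math
from itertools import product
from typing import Sequence, Tuple

# An atom is (position, partial charge); a hypothetical minimal representation.
Atom = Tuple[Tuple[float, float, float], float]


def pair_energy(r: float, q1: float, q2: float, a12: float, b12: float) -> float:
    """Electrostatic plus Lennard-Jones energy for one atom pair at separation r."""
    dielectric = r                          # assumed choice: dielectric equal to r
    e_es = q1 * q2 / (dielectric * r)       # Coulomb term q1*q2 / (dielectric * r)
    e_vdw = a12 / r ** 12 - b12 / r ** 6    # Lennard-Jones term A/r^12 - B/r^6
    return e_es + e_vdw


def interaction_energy(receptor: Sequence[Atom], ligand: Sequence[Atom],
                       a12: float = 1.0, b12: float = 2.0) -> float:
    """Sum pair_energy over every receptor-ligand atom pair.

    a12 and b12 are placeholder coefficients shared by all pairs; a real
    force field prescribes them (and the charges) per atom type.
    """
    total = 0.0
    for (p1, q1), (p2, q2) in product(receptor, ligand):
        r = math.dist(p1, p2)               # Euclidean distance between centres
        total += pair_energy(r, q1, q2, a12, b12)
    return total


if __name__ == "__main__":
    receptor = [((0.0, 0.0, 0.0), 1.0)]
    ligand = [((2.0, 0.0, 0.0), -1.0)]
    print(interaction_energy(receptor, ligand))
```

Because the loop visits every receptor-ligand pair, the cost grows with the product of the two atom counts, which is one reason grid-based and Fourier methods are attractive when many configurations must be scored.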