Considerations in Multi-molecular Simulations

One of the strengths of molecular mechanics is its ability to approximate the interactions among a large number of atoms, and this extends naturally to interactions between multiple molecules. In biomolecular systems, the molecules never exist in an isolated gas phase. Including solvent molecules and counter ions explicitly in a simulation can therefore lead to a more realistic model of the molecule and its environment. Several issues need to be considered before and during a simulation of such a complex system. These include:

  Initial Configuration
  Initial Velocities
  Solvent Model
  Boundary Conditions
  Non-Bonded Interaction Cutoffs
  Group-based Cutoffs
  Updating Neighbour Lists
  Temperature/Pressure/Volume Control
  Measuring Equilibration

Initial Configuration

Before beginning a simulation it is necessary to have an initial 3-D structure for the molecule. In the case of a protein, the initial conformation may be one for which a crystallographically determined structure has been reported. In that case it is straightforward to download the Cartesian coordinates from the Protein Data Bank (PDB) at Brookhaven National Laboratory:

http://www.pdb.bnl.gov/

The coordinates are stored in a "standard" format known as PDB format. A faster site for local access is maintained in the Department of Biochemistry and Molecular Biology at UGA:

http://www.uga.edu/~biocryst/

At the PDB web site it is also possible to search for and retrieve structures from NMR and theoretical studies, although by far the majority of structures are from X-ray diffraction. Alternatively, the initial structure may have been obtained through an earlier experimental (NMR) or modeling (homology modeling) study. For more flexible molecules, such as carbohydrates, the initial structure may be more hypothetical, since it is expected to change and "converge" to a realistic ensemble of structures during the simulation.
Once the structure is obtained there are still a few details to address, in particular:

1) Are all of the residues recognized by the force field of choice? That is, are all of the atom names in the PDB file the same as those that the force field expects? If not, they may have to be corrected manually. Does the structure contain structurally important metal ions, and are these parameterized in the force field? Does the structure contain any counter ions (SO₄²⁻, Ca²⁺, Na⁺, etc.), and are they treated properly by the force field?

2) Does the structure contain hydrogen atoms? Most X-ray determined protein structures do not, and they must be added. This is usually an automated procedure based on simple valence geometry rules. For example, if the atom is sp3 hybridized (such as the CA in an amino acid), the hydrogen is tetrahedrally positioned with respect to the CA atom. But what about charged groups? Most amino acids that have ionizable side chains (Asp, Glu, Lys, Arg) and the C- and N-termini are ionized at physiological pH (i.e. 6 - 8). Note that the imidazole side chain of histidine may be neutral or charged (its observed pKa = 6 - 7), so its ionization state must be specified and a hydrogen atom added as necessary. Further, in the neutral state the side chain must carry a hydrogen atom on one of the nitrogen atoms (either ND1 or NE2). This is usually NE2, but the choice depends on the local environment.

[Figure: protonation states of the histidine (HIS) imidazole side chain: "HID" (neutral, hydrogen on ND1), "HIE" (neutral, hydrogen on NE2) and "HIP" (protonated on both nitrogens, positively charged); pKa = 6.5.]

3) Does the structure contain any "waters of crystallization"? If so, they should be retained if they appear to be filling any surface or interior cavities. Otherwise they may be deleted. Note that the names for the waters (atomic and residue) must agree with the water model used in the simulation, and that these individual waters must be treated the same as the rest of the solvent waters. That is, they must be treated as part of the solvent, not part of the solute.
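The three checks above lend themselves to a quick automated pass over the coordinate file before any force-field setup. The following is a minimal sketch, not a replacement for the preparation tools that ship with a simulation package. It assumes the fixed-column layout of PDB ATOM/HETATM records; the function name summarize_pdb and the set WATER_NAMES are hypothetical names introduced here for illustration.

```python
# Assumption: residue names commonly used for water in PDB-style files.
WATER_NAMES = {"HOH", "WAT", "H2O"}


def summarize_pdb(lines):
    """Collect the facts needed for checks 1) to 3) from PDB-format lines."""
    has_hydrogens = False
    histidines, waters, hetero_groups = set(), set(), set()
    for line in lines:
        record = line[0:6].strip()
        if record not in ("ATOM", "HETATM"):
            continue  # skip headers, remarks, connectivity records, etc.
        atom_name = line[12:16].strip()   # columns 13-16
        res_name = line[17:20].strip()    # columns 18-20
        res_id = (line[21:22], line[22:26].strip())  # chain ID, residue number
        element = line[76:78].strip().upper()        # columns 77-78, if present
        if not element:
            # Fallback heuristic when the element column is missing:
            # first letter of the atom name after any leading digits ("1HB" -> "H").
            element = atom_name.lstrip("0123456789")[:1].upper()
        if element == "H":
            has_hydrogens = True
        if res_name == "HIS":
            histidines.add(res_id)        # protonation state must be chosen
        elif res_name in WATER_NAMES:
            waters.add(res_id)            # waters of crystallization
        elif record == "HETATM":
            hetero_groups.add(res_name)   # ions, metals, ligands to parameterize
    return {
        "has_hydrogens": has_hydrogens,
        "histidines": sorted(histidines),
        "waters": sorted(waters),
        "hetero_groups": sorted(hetero_groups),
    }
```

Calling summarize_pdb(open("structure.pdb")) on a file of your own (the file name here is only a placeholder) returns a dictionary listing the histidines whose state (HID, HIE or HIP) still has to be assigned, the crystallographic waters, and the hetero groups that the force field must be able to describe.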
Initial Velocities

To begin an MD simulation, initial velocities must be assigned to each of the atoms. Since the initial temperature of the simulation is very low, the initial velocities are very small. The velocities are usually assigned randomly from a Maxwell-Boltzmann probability distribution at the initial temperature (typically 5 K):

$$ p(v_{ix}) = \left( \frac{m_i}{2 \pi k_B T} \right)^{1/2} \exp\left( -\frac{m_i v_{ix}^2}{2 k_B T} \right) $$

Here m_i is the mass of atom i, v_ix is the x component of its velocity, k_B is the Boltzmann constant and T is the temperature. That is:

1) Select the temperature (T).
2) For each atom (i), choose a random number ρ between 0 and 1 and use it to calculate a velocity component (vx, vy or vz) for that atom, such that the resulting values follow the Maxwell-Boltzmann probability above. Remember that velocity is a vector property.
3) Repeat for each component and each atom.

Water Model

The choice of solvent model depends on which properties are important in the simulation. Generally, the more sophisticated the water model, the slower the calculation. Therefore you must decide on a suitable level of accuracy. Are you more interested in the solute or the solvent? Does a rigid water model that displays the correct bulk water behavior (density, radial distribution) suffice? Or is a flexible model, which allows the O–H bonds to stretch and the valence angle to bend, necessary?

Water Model: General Considerations

Regardless of the geometry of the model, the electrostatic interactions between the solvent and the solute will be important. It is good practice to model the electrostatic interactions between the water and the solute in the same way that the water model itself was designed to treat them. For example, a poor approach is to combine a protein model whose partial atomic charges were derived from one approximation with a water model whose partial atomic charges were derived from a different approximation.
As an extreme example, an unbalanced model would be expected from using MM3 (which does not use partial atomic charges) to model a solute together with a water model (TIP3P) that does incorporate partial atomic charges. This sort of apples-and-oranges mixture of models sometimes results when an investigator creates a new force field for a particular class of solute. The solute force field may be (indeed should be!) internally consistent, but it may not have been derived with attention to its use with a given water model.

Water Model: Validation

How is a water (or any solvent) model judged? One obvious criterion is the density of the simulated solvent, but that says nothing about the dynamics of the model. Other experimentally observable properties include the diffusion coefficient and the viscosity. The diffusion coefficient (D) can be calculated in a straightforward way from a simulation by knowing the initial and final positions (r) of a molecule after a time t, during which the molecule has diffused through the solvent:

$$ D = \frac{1}{6t} \left\langle \left| \mathbf{r}_i(t) - \mathbf{r}_i(0) \right|^2 \right\rangle $$

where the angle brackets denote an average over the molecules i.

Information about the detailed "structure" of a liquid may be obtained from the radial distribution function (rdf), also known as g(r). The rdf is a measure of the number of molecules at a given distance from a central molecule, relative to the number expected at the bulk density. Since liquids are dynamic, the rdf gives a characteristic average structure. X-ray diffraction can be used to measure the rdf of liquids. In practice, the environment of each molecule is divided into thin shells of thickness dr. The number of molecules in each shell is then counted and averaged over the course of the simulation. For short distances (r < the molecular radius) the rdf is zero, since there can be no other molecules within the molecular surface. Thereafter, the rdf exhibits ripples corresponding to solvation shells.
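The shell-counting procedure just described can be sketched for a single stored configuration as follows. This is an illustration under stated assumptions, not a production analysis routine: it assumes one site per molecule (for water, the oxygen), a cubic box of edge box_length with the periodic boundaries discussed under Periodic Boundary Conditions, and r_max no larger than half the box edge. The function name radial_distribution and its parameters are hypothetical.

```python
import math


def radial_distribution(positions, box_length, r_max, n_bins):
    """Return [(r_mid, g)] for one configuration in a cubic periodic box."""
    if r_max > box_length / 2.0:
        raise ValueError("r_max must not exceed half the box length")
    n = len(positions)
    dr = r_max / n_bins                      # shell thickness
    counts = [0] * n_bins
    for i in range(n - 1):
        for j in range(i + 1, n):
            r2 = 0.0
            for a, b in zip(positions[i], positions[j]):
                d = a - b
                d -= box_length * round(d / box_length)  # nearest periodic image
                r2 += d * d
            k = int(math.sqrt(r2) / dr)      # index of the shell containing r
            if k < n_bins:
                counts[k] += 2               # the pair counts for both molecules
    density = n / box_length ** 3            # bulk number density
    result = []
    for k in range(n_bins):
        r_in, r_out = k * dr, (k + 1) * dr
        shell_volume = 4.0 / 3.0 * math.pi * (r_out ** 3 - r_in ** 3)
        ideal = n * density * shell_volume   # count expected at uniform density
        result.append(((k + 0.5) * dr, counts[k] / ideal))
    return result
```

To average over the course of a simulation, the counts from each stored configuration would be accumulated before the final normalization. For uniformly random positions g(r) should scatter around 1 at all distances, which makes a convenient sanity check.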
The first peak is the largest, indicating a high probability of finding another molecule at that intermolecular separation. As the separation increases, there is less order and the peak intensities decrease. The area under a peak, weighted by 4πr²ρ (where ρ is the bulk number density), gives the number of molecules in that solvation shell. For water, there is a high probability of finding another water molecule ~3 Å away, corresponding to a water-water hydrogen bond.

Boundary Conditions: Molecular Clusters or Droplets

If a molecule is simply surrounded by a droplet of other molecules (perhaps solvent), there will exist a boundary between the droplet and the vacuum around it. It then becomes difficult to prevent the molecules from diffusing into the vacuum. An artificial restraint may be applied to force the molecules to stay within the boundary, but this does not correspond to a traditional thermodynamic ensemble.

[Figure: droplet model, with a region of variable density at the vacuum boundary.]

Periodic Boundary Conditions (PBC)

An alternative to the droplet model is to arrange the molecules into a regular lattice structure. By replicating the contents (positions and velocities) of the central "box" in every direction, a periodic system is generated. This periodic boundary system avoids edge effects. When a molecule diffuses out of one side of the box, it reenters on the other. Thus a constant density can be maintained. If the box dimensions are allowed to change, it is possible to maintain a constant internal pressure (NPT ensemble). Alternatively, if the box dimensions are kept frozen, the internal pressure will fluctuate with temperature (NVT ensemble).
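The re-entry rule, and the closely related question of which periodic copy of a neighbour an atom actually interacts with, can be sketched in a few lines. This is a minimal illustration that assumes a rectangular box with one corner at the origin and edge lengths box = (Lx, Ly, Lz). The function names are hypothetical, and real MD codes implement the same idea in optimized form.

```python
import math


def wrap_position(position, box):
    """Map a position back into the central box, 0 <= x < L along each axis."""
    # Python's % gives a non-negative result for a positive box length,
    # so an atom leaving through one face re-enters through the opposite one.
    return tuple(x % length for x, length in zip(position, box))


def minimum_image_vector(r_i, r_j, box):
    """Vector from atom j to the nearest periodic image of atom i."""
    return tuple(
        (a - b) - length * round((a - b) / length)
        for a, b, length in zip(r_i, r_j, box)
    )


def minimum_image_distance(r_i, r_j, box):
    """Distance between atom j and the nearest periodic image of atom i."""
    return math.sqrt(sum(d * d for d in minimum_image_vector(r_i, r_j, box)))
```

For example, in a 10 Å cubic box, atoms at x = 0.5 Å and x = 9.5 Å are 1 Å apart through the box face rather than 9 Å apart through the box interior, and minimum_image_distance returns the former.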