jcc23673-sup-0001-suppinfo

Supplementary Information:
Exploring Chemical Reactivity of Complex Systems
with Path-Based Coordinates. Role of the Distance
Metric
Kirill Zinovjev, Iñaki Tuñón*
Departament de Química Física, Universitat de València, 46100 Burjassot, Spain
*to whom correspondence should be addressed
ignacio.tunon@uv.es
Distance metrics used in Figure 1.
Figure 1 was obtained using an arbitrary reference path in a two-dimensional space and three
different metric tensors, the Euclidean metric, a constant metric and a variable metric, which
were defined as:
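a) Euclidean metric:
$$\mathbf{M} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$
b) Constant metric:
$$\mathbf{M} = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}$$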
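c) Variable metric. The metric tensor was smoothly interpolated between the following three
matrices:
$$\mathbf{M}_{\mathrm{Reactants}} = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}; \quad
\mathbf{M}_{\mathrm{TS}} = \begin{pmatrix} 1 & -0.5 \\ -0.5 & 2 \end{pmatrix}; \quad
\mathbf{M}_{\mathrm{Products}} = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}$$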
Note that this is a toy model introduced to illustrate the role of the metric and that does not
correspond to any real system.
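As a rough illustration of how the choice of metric changes measured lengths, the sketch below evaluates the length of one displacement under the Euclidean, constant and TS matrices listed above. It assumes that lengths are taken in the norm weighted by the inverse metric tensor, the convention of eq. (S1); the helper names (`inv2`, `metric_norm`) and the test displacement are hypothetical and not part of the original work.

```python
import math

def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def metric_norm(v, m):
    """Length of the 2D vector v in the norm weighted by the inverse of m
    (assumed convention, following eq. S1)."""
    w = inv2(m)
    q = sum(v[i] * w[i][j] * v[j] for i in range(2) for j in range(2))
    return math.sqrt(q)

# Metric tensors of the toy model
METRICS = {
    "euclidean": [[1.0, 0.0], [0.0, 1.0]],
    "constant": [[1.0, 0.0], [0.0, 2.0]],
    "ts": [[1.0, -0.5], [-0.5, 2.0]],
}

step = (1.0, 1.0)  # arbitrary displacement chosen for illustration
lengths = {name: metric_norm(step, m) for name, m in METRICS.items()}

for name, value in lengths.items():
    print(f"{name:10s} {value:.5f}")
```

The same displacement has a different length under each tensor, which is why the isosurfaces of a path coordinate depend on the metric.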
Convergence of the string
Figure S1. Convergence of the string is monitored from the Root-Mean-Square-Displacements (RMSD) of the
nodes versus the simulation time.
Definition of the s coordinate
The s coordinate is defined from the converged MFEP in the following steps. First, the positions
of the string nodes are averaged over the time range 10-50 ps of the string simulation. Then a
least squares cubic spline interpolation of the string is done. Having the path defined with cubic
splines has two advantages: i) least squares interpolations flatten any roughness that might lead
to numerical problems and ii) it is more convenient to work with an analytic continuous
description of the path than with a set of discrete points.
The next step is to calculate the value of the average distance metric tensor at different points on
the path. During the string calculations, values of the local metric tensor (𝑴̃) and the current
position in the CV space were saved at each step of the dynamics and for each of the walkers.
This results in W×T observations of 𝑴̃ (W is the number of walkers, T the number of time steps),
roughly equally distributed along the path. To calculate the average metric tensor as a function of
the position on the path (𝑴(𝜏), where 𝜏 is some initial arbitrary parameterization), the path is split
into 100 bins. Each 𝑴̃ is then assigned to the closest bin in the CV space. The average 𝑴(𝜏) is then
obtained in each bin and a least squares cubic spline interpolation is performed for each element of
the tensor. Components of the 9x9 metric tensor are provided in Table S1.
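The binning and averaging step can be sketched as follows. This is a minimal illustration rather than the code used in this work: it assumes that "closest bin" means the bin whose center is nearest in Euclidean distance in the CV space, it uses three bins and 2x2 tensors instead of 100 bins and 9x9 tensors, and the function names are hypothetical. The subsequent least squares spline fit of each tensor element is not shown.

```python
def closest_bin(z, centers):
    """Index of the bin center nearest to z (Euclidean distance, an assumption)."""
    return min(
        range(len(centers)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(z, centers[k])),
    )

def average_metric(centers, observations):
    """Average the metric tensor observations assigned to each bin.

    observations is a list of (position, tensor) pairs saved along the dynamics.
    Bins that receive no observation are reported as None.
    """
    dim = len(observations[0][1])
    sums = [[[0.0] * dim for _ in range(dim)] for _ in centers]
    counts = [0] * len(centers)
    for z, m in observations:
        k = closest_bin(z, centers)
        counts[k] += 1
        for i in range(dim):
            for j in range(dim):
                sums[k][i][j] += m[i][j]
    averaged = [
        [[x / counts[k] for x in row] for row in total] if counts[k] else None
        for k, total in enumerate(sums)
    ]
    return averaged, counts

# Hypothetical data: three bin centers on a path and three saved observations
centers = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
observations = [
    ((0.1, 0.0), [[1.0, 0.0], [0.0, 2.0]]),
    ((-0.1, 0.1), [[3.0, 0.0], [0.0, 2.0]]),
    ((1.9, 0.2), [[1.0, -0.5], [-0.5, 2.0]]),
]
averaged, counts = average_metric(centers, observations)
print(counts)
print(averaged)
```

Returning `None` for empty bins is a design choice of this sketch; with W×T observations spread roughly evenly along the path, empty bins are not expected in practice.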
Once the path and the metric have been fitted to analytic functions, the path has to be
reparameterized so that the value of the s coordinate corresponds to the arc-length of the path
under the non-constant distance metric. To do this, the arc-length as a function of 𝜏 is obtained:
$$t(\tau') = \int_0^{\tau'} \frac{dt}{d\tau}\, d\tau = \int_0^{\tau'} \left| \frac{d\mathbf{z}(\tau)}{d\tau} \right|_{\mathbf{M}^{-1}(\tau)} d\tau \tag{S1}$$
Finally, having a one-to-one mapping between the initial parameterization 𝜏 and the arc-length t,
it is straightforward to construct the inverse function 𝜏(t) and then to obtain equidistant
points along the path and the corresponding value of the metric tensor at each point. Also, as was
done in Refs. 6 and 7, the path is additionally extrapolated into the reactants and products basins
to avoid numerical instabilities. The metric tensor in the extrapolated tails is constant and equal
to the metric tensor at the corresponding endpoint of the path. The free energy profile
corresponding to the converged string is plotted in Figure S2 as a function of 𝜏, the arc-length t
and the coordinate s.
For a constant distance metric the procedure is the same, except that the 𝑴(𝜏) calculation
step is omitted and the constant value is used directly.
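The reparameterization of eq. (S1) can be sketched numerically as shown below. This is only an illustration under stated assumptions: the path and metric are hypothetical two-dimensional analytic functions instead of the fitted splines, the integral is evaluated with the trapezoidal rule, the derivative of the path is taken by central differences, and 𝜏(t) is inverted by linear interpolation. All function names are invented for this example.

```python
import math

def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def speed(path, metric, tau, h=1e-6):
    """|dz/dtau| in the norm weighted by the inverse metric (integrand of eq. S1)."""
    ahead, behind = path(tau + h), path(tau - h)
    dz = [(p - q) / (2 * h) for p, q in zip(ahead, behind)]
    w = inv2(metric(tau))
    return math.sqrt(sum(dz[i] * w[i][j] * dz[j] for i in range(2) for j in range(2)))

def arc_length_table(path, metric, n_steps=1000):
    """Tabulate t(tau) on [0, 1] with the trapezoidal rule."""
    taus = [k / n_steps for k in range(n_steps + 1)]
    f = [speed(path, metric, tau) for tau in taus]
    t = [0.0]
    for k in range(1, len(taus)):
        t.append(t[-1] + 0.5 * (f[k - 1] + f[k]) * (taus[k] - taus[k - 1]))
    return taus, t

def equidistant_taus(taus, t, n_points):
    """Invert t(tau) by linear interpolation at equally spaced arc-lengths."""
    out, k = [], 0
    for n in range(n_points):
        target = t[-1] * n / (n_points - 1)
        while k < len(t) - 2 and t[k + 1] < target:
            k += 1
        frac = (target - t[k]) / (t[k + 1] - t[k])
        out.append(taus[k] + frac * (taus[k + 1] - taus[k]))
    return out

# Hypothetical toy inputs: a straight path and two metrics
def straight_path(tau):
    return (tau, tau)

def constant_metric(tau):
    return [[1.0, 0.0], [0.0, 2.0]]

def variable_metric(tau):
    return [[1.0, 0.0], [0.0, 1.0 + 3.0 * tau]]

taus, t_const = arc_length_table(straight_path, constant_metric)
nodes_const = equidistant_taus(taus, t_const, 5)
_, t_var = arc_length_table(straight_path, variable_metric)
nodes_var = equidistant_taus(taus, t_var, 5)
print(nodes_const)
print(nodes_var)
```

With the constant metric the equidistant points stay evenly spaced in 𝜏, whereas with the variable metric they shift toward the region where the path is longer under that metric.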
Table S1. Components of the 9x9 metric tensors used in s0, sav and sM coordinates
The coordinates are in the following order: d(C1-H2), d(H2-C3), d(O5-C6), d(C1-C6),
d(C4-O5), d(C3-C4), hyb(C1), hyb(C3), hyb(C6)
S0:
 1.00000   0.00000   0.00000   0.00000   0.00000   0.00000   0.00000   0.00000   0.00000
 0.00000   1.00000   0.00000   0.00000   0.00000   0.00000   0.00000   0.00000   0.00000
 0.00000   0.00000   1.00000   0.00000   0.00000   0.00000   0.00000   0.00000   0.00000
 0.00000   0.00000   0.00000   1.00000   0.00000   0.00000   0.00000   0.00000   0.00000
 0.00000   0.00000   0.00000   0.00000   1.00000   0.00000   0.00000   0.00000   0.00000
 0.00000   0.00000   0.00000   0.00000   0.00000   1.00000   0.00000   0.00000   0.00000
 0.00000   0.00000   0.00000   0.00000   0.00000   0.00000   1.00000   0.00000   0.00000
 0.00000   0.00000   0.00000   0.00000   0.00000   0.00000   0.00000   1.00000   0.00000
 0.00000   0.00000   0.00000   0.00000   0.00000   0.00000   0.00000   0.00000   1.00000
Sav:
 1.07538  -0.03902   0.00000  -0.02161   0.00000   0.00000  -0.07991   0.00000   0.02149
-0.03902   1.07538   0.00000   0.00000   0.00000  -0.00284   0.00000  -0.07263   0.00000
 0.00000   0.00000   0.14576  -0.01770  -0.00755   0.00000   0.02218   0.00000  -0.07892
-0.02161   0.00000  -0.01770   0.16651   0.00000   0.00000   0.01516   0.00000   0.01505
 0.00000   0.00000  -0.00755   0.00000   0.14576  -0.04354   0.00000   0.00289   0.00000
 0.00000  -0.00284   0.00000   0.00000  -0.04354   0.16651   0.00000   0.01472   0.00000
-0.07991   0.00000   0.02218   0.01516   0.00000   0.00000   0.10873   0.00000  -0.04667
 0.00000  -0.07263   0.00000   0.00000   0.00289   0.01472   0.00000   0.34662   0.00000
 0.02149   0.00000  -0.07892   0.01505   0.00000   0.00000  -0.04667   0.00000   0.25172
SM (at TS):
 1.07538  -0.78394   0.00000  -0.02214   0.00000   0.00000  -0.08237   0.00000   0.02175
-0.78394   1.07538   0.00000   0.00000   0.00000  -0.00848   0.00000  -0.08122   0.00000
 0.00000   0.00000   0.14576  -0.02824  -0.02616   0.00000   0.02209   0.00000  -0.08135
-0.02214   0.00000  -0.02824   0.16651   0.00000   0.00000   0.01708   0.00000   0.02259
 0.00000   0.00000  -0.02616   0.00000   0.14576  -0.03990   0.00000   0.00703   0.00000
 0.00000  -0.00848   0.00000   0.00000  -0.03990   0.16651   0.00000   0.01888   0.00000
-0.08237   0.00000   0.02209   0.01708   0.00000   0.00000   0.10871   0.00000  -0.04729
 0.00000  -0.08122   0.00000   0.00000   0.00703   0.01888   0.00000   0.34244   0.00000
 0.02175   0.00000  -0.08135   0.02259   0.00000   0.00000  -0.04729   0.00000   0.25596
Figure S2. Free energy profile (red line) corresponding to the converged string for the isochorismate reaction as a
function of t0 (a), the arc-length L (b) and the coordinate s (c). In this last case the PMF (green line) is also
presented.
Parameters for Umbrella Sampling Simulations
One of the results of the string method calculation is the free energy profile along the converged
path (see Figure S2). Although this profile is not a PMF, it is expected to be a good
approximation to it (as was discussed in Ref. 7 and observed in Figure S2c). Therefore, one
can use this profile to guess the shape of the PMF. This information can
be used to define the biasing potentials for the Umbrella Sampling (US) windows so that they flatten
the underlying free energy as much as possible. Usually, the biasing potential is harmonic:
$$V_i^b(s) = \frac{K_i}{2}\left(s - s_i^0\right)^2 \tag{S3}$$
The force constants ($K_i$) and reference positions ($s_i^0$) of the biasing potentials in window i should
be chosen in such a way that the sampling accumulated from the whole set of US windows is
as uniform as possible. If the number of simulation windows is sufficiently large, one can assume
that the density of states within each window is normally distributed. Therefore, to obtain
uniform sampling after summing the histograms of all windows, the corresponding Gaussians
should be equally spaced, with a standard deviation equal to the spacing between windows:
$$\mu_i = \frac{i-1}{N-1}\,L; \qquad \sigma_i = \frac{L}{N-1} \tag{S4}$$
where N is the number of windows, L is the whole range of the RC, and μi and σi are the mean and
standard deviation of the distribution in window i, respectively. If the underlying free energy can be
approximated by the free energy profile (A), then the force constants ($K_i$) and reference positions
($s_i^0$) of the US windows can be estimated as follows:
π‘˜π‘‡
𝑑2
𝐾𝑖 = 𝜎2 − 𝑑𝑠2 𝐴(πœ‡π‘– );
𝑑
𝐴(πœ‡π‘– )
𝑠𝑖0 = πœ‡π‘– + 𝑑𝑠 𝐾
𝑖
(S5)
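One way to see where eq. (S5) comes from is sketched below. It assumes that the biased free energy in window i is expanded to second order around the target mean, so that the biased distribution is approximately Gaussian; the symbol $A_i^b$ for the biased free energy is introduced here only for this derivation.

```latex
% Biased free energy in window i: profile plus the harmonic bias of eq. (S3)
A_i^b(s) = A(s) + \frac{K_i}{2}\left(s - s_i^0\right)^2

% Target distribution: Gaussian with mean \mu_i and variance \sigma_i^2,
% i.e. A_i^b(s) \approx \mathrm{const} + \frac{kT}{2\sigma_i^2}\left(s - \mu_i\right)^2

% Condition 1: the minimum of A_i^b lies at \mu_i
\frac{d}{ds}A(\mu_i) + K_i\left(\mu_i - s_i^0\right) = 0
\;\Rightarrow\;
s_i^0 = \mu_i + \frac{\frac{d}{ds}A(\mu_i)}{K_i}

% Condition 2: the curvature at \mu_i matches the target Gaussian
\frac{d^2}{ds^2}A(\mu_i) + K_i = \frac{kT}{\sigma_i^2}
\;\Rightarrow\;
K_i = \frac{kT}{\sigma_i^2} - \frac{d^2}{ds^2}A(\mu_i)
```

The two conditions reproduce both expressions of eq. (S5). The approximation is expected to hold as long as the profile varies smoothly over the width of a window.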
Figure S3 shows the expected and observed probability density obtained during the US
simulations performed to obtain the PMF corresponding to the isochorismate reaction. It can be
concluded that the use of the harmonic biasing potentials given by eq. (S5) produces a
homogeneous sampling of the whole range of possible values of the reaction coordinate.
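The recipe of eqs. (S4) and (S5) can be sketched in a few lines. The example is illustrative only: the free energy profile is a hypothetical inverted parabola rather than the profile of Figure S2, derivatives are taken by finite differences instead of from a spline, and the value of kT (0.593 kcal/mol, roughly 298 K) is an assumption, since the temperature and units are not restated here. The function names are invented for this sketch.

```python
KT = 0.593  # kcal/mol at about 298 K (assumed; not taken from the source)

def window_targets(n_windows, length):
    """Means and common standard deviation of the target Gaussians (eq. S4)."""
    sigma = length / (n_windows - 1)
    mus = [(i - 1) / (n_windows - 1) * length for i in range(1, n_windows + 1)]
    return mus, sigma

def derivatives(profile, s, h=1e-3):
    """First and second derivative of the profile by central differences."""
    first = (profile(s + h) - profile(s - h)) / (2 * h)
    second = (profile(s + h) - 2 * profile(s) + profile(s - h)) / (h * h)
    return first, second

def umbrella_parameters(profile, n_windows, length, kt=KT):
    """Force constant and reference position for each window (eq. S5)."""
    mus, sigma = window_targets(n_windows, length)
    params = []
    for mu in mus:
        d1, d2 = derivatives(profile, mu)
        k = kt / sigma ** 2 - d2
        if k <= 0:
            # Design choice of this sketch: the profile is too sharply curved
            # for the requested width, so more windows would be needed.
            raise ValueError(f"non-positive force constant at s = {mu}")
        params.append((k, mu + d1 / k))
    return params

def toy_profile(s):
    """Hypothetical barrier centered at s = 0.5 (kcal/mol)."""
    return -20.0 * (s - 0.5) ** 2

params = umbrella_parameters(toy_profile, n_windows=11, length=1.0)
for k, s0 in params:
    print(f"K = {k:8.3f}   s0 = {s0:7.4f}")
```

For windows on either side of the barrier the reference position is shifted toward the barrier top, which compensates for the slope of the profile that would otherwise push the sampling away from the target mean.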
Figure S3. Expected and observed normalized probability density during US simulations.