Determination of force field parameters

a) Van der Waals (vdw) parameters
As described in the Methods section, a reasonable distance between OG and CT in the
productive docking geometry of the substrate is 1.5 Å. In practice, we applied an
enhanced vdw interaction to enforce this inter-atomic distance. The potential energy
arising from the vdw interaction is calculated in AD4 using Equations (S1)-(S3).
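
$$V_{vdw} = 4\varepsilon\left[\left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6}\right] \tag{S1}$$

$$\sigma = \frac{1}{2}\left(\sigma_A + \sigma_B\right) \tag{S2}$$

$$\varepsilon = \sqrt{\varepsilon_A \varepsilon_B} \tag{S3}$$
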
According to Equation (S2), the original σOG (3.2 Å) and σCT (4.0 Å) values should be
reduced so that the atomic σ values sum to 3.0 Å, which gives the ideal equilibrium
OG-CT distance of 1.5 Å. On the other hand, the vdw radii must be chosen carefully to
avoid side effects. For instance, if σCT is reduced to 0.5 Å while σOG remains
relatively large (2.5 Å), the atoms neighbouring CT strongly repel the OG atom when
the substrate is docked in the ground state (Figure S1). This raises the docking
energy of the productive docking conformation and makes it unfavorable in an
energy-based docking program.

We also note that σCT should be less than 2.0 Å, which is equal to the vdw radius of
the hydrogen atom. Otherwise, the hydroxyl hydrogen atom of the substrate will dock
next to OG in preference to CT because of its smaller radius and higher positive
charge (Figure S2b). In practice, a σCT value of 1.0 Å and a σOG value of 2.0 Å were
applied in our docking process.

In addition, the original non-bonded interaction between CT and OG is relatively weak
and cannot produce a productive docking geometry with a reasonable OG-CT distance
(1.5-1.7 Å). Consequently, we increased the ε values of CT and OG (to 30 kcal/mol and
70 kcal/mol, respectively) to overcome the high system energy of the TI docking model
(Figure S3). As a result, the OG-CT distance ranged from 1.5 to 1.7 Å in most docking
solutions, and the potential energy increased rapidly when the inter-atomic distance
deviated substantially (e.g. the potential energy increased by approximately 3.5
kcal/mol when the distance rose from 1.7 Å to 2.0 Å).

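The combination rules and the 12-6 term described in this subsection can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it evaluates the raw form of Equation (S1) with the pair parameters from Equations (S2) and (S3), without the term weighting or smoothing that a docking program may apply, so the printed energies are not expected to reproduce the docking energies quoted in the text. The function and variable names are hypothetical.

```python
import math


def combine(sigma_a, eps_a, sigma_b, eps_b):
    """Pair parameters from Equations (S2) and (S3)."""
    sigma = 0.5 * (sigma_a + sigma_b)  # arithmetic mean of the atomic sigma values
    eps = math.sqrt(eps_a * eps_b)     # geometric mean of the atomic well depths
    return sigma, eps


def v_vdw(r, sigma, eps):
    """Unweighted 12-6 energy of Equation (S1) at distance r (in Å)."""
    x = (sigma / r) ** 6
    return 4.0 * eps * (x * x - x)


# Pair sigma from the original atomic values quoted in the text (OG 3.2 Å, CT 4.0 Å).
sigma_orig = 0.5 * (3.2 + 4.0)

# Modified parameters from the text: OG (sigma 2.0 Å, eps 70), CT (sigma 1.0 Å, eps 30).
sigma_mod, eps_mod = combine(2.0, 70.0, 1.0, 30.0)

print(f"original pair sigma: {sigma_orig:.2f} Å")
print(f"modified pair sigma: {sigma_mod:.2f} Å, pair eps: {eps_mod:.2f} kcal/mol")
for r in (1.5, 1.7, 2.0):
    print(f"r = {r:.1f} Å  V = {v_vdw(r, sigma_mod, eps_mod):8.2f} kcal/mol")
```

The modified atomic σ values sum to 3.0 Å, so the pair σ from Equation (S2) is 1.5 Å, the OG-CT distance targeted in the text.
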
b) Parameters for catalytic hydrogen bond
Another important feature of the productive docking geometry is that the three
catalytically important hydrogen bonds (H-bonds) between the substrate and the active
site of the enzyme must be present (Figure 2). However, these H-bonds, especially the
one between the O2 atom of the substrate and the HE atom of the protonated His
residue, were usually absent from docking poses obtained with the original AD4 force
field settings. This may be due to the inherently high system energy of the catalytic
transition state, similar to that shown in Figure S3. Accordingly, we strengthened
these catalytic H-bond interactions and tested a range of ε values (H-bond well depth)
to find the most appropriate parameter. In practice, an ε value of 30 kcal/mol was
applied to obtain the productive docking geometry. Under this setting, the docking
energy increases by approximately 5.0 kcal/mol when one catalytic H-bond is absent.
This penalty is high enough to downgrade or eliminate non-productive docking poses in
the docking ranks.

Receptor preparation
The lipase structures in the active open conformation were obtained from the Protein
Data Bank (PDB) as suggested by Tyagi and Pleiss (2006), and the esterase structures
were built by homology modeling with SWISS-MODEL (Arnold et al., 2006).

First, the ligand and other non-protein molecules were removed from the protein
structure. Missing hydrogen atoms were then added and the atomic partial charges of
the protein were calculated using the Dock Prep utility in Chimera (Pettersen et al.,
2004). The catalytic histidine residue was protonated and carried a total charge of
+1. The hydrogen of the serine hydroxyl group was removed manually and the partial
charges of the residue were recalculated using the restrained electrostatic potential
(RESP) method (Bayly et al., 1993) based on electrostatic potentials calculated with
GAUSSIAN 03 (Frisch et al., 2003), giving a total residue charge of -1. Finally, the
receptor files were prepared in pdbqt format with the ADT program (Sanner, 1999). A
special atom type with modified vdw parameters (ε = 70 kcal/mol, σ = 2.0 Å) was
assigned to the serine OG atom. This atom type was also set as non-hydrogen-bonding
in order to eliminate possible hydrogen bonds to the polar hydrogens of substrates
(Figure S2b).

Ligand preparation
All substrates were prepared in their ester forms to facilitate comparison of the
results (Table 1). The 3D structures of the substrates were generated with the
ChemOffice program (CambridgeSoft Corporation). Both the ground state (GS) form and
the high-energy tetrahedral intermediate (TI) of each substrate were included. The
substrate structures were then minimized and the atomic partial charges were computed
using Chimera. The molecular charge of the substrate was neutral in the GS form and
-1 in the TI form. The final ligand files were prepared in pdbqt format with ADT.

Each chiral substrate had two enantiomers in the ground state and four stereoisomeric
forms in the TI state owing to the additional asymmetric center in the molecule
(Figure S4). As a result, each substrate had 6 different structures to be docked and
the whole substrate library contained 486 molecules in total. These ligands could be
classified into GS, (R/S,R)-TI and (R/S,S)-TI forms (Figure S4). In the TI model, the
first R/S is the original substrate enantioform and the second descriptor indicates
the configuration of the substrate tetrahedral carbon atom (CT).

For all ligands, the vdw well depth (ε) of the CT atom was adjusted to 30 kcal/mol and
the collision radius (σ) was set to 1.0 Å. In addition, the ε value of the catalytic
hydrogen bonds was increased to 30 kcal/mol as described above.

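The bookkeeping behind the ligand library can be illustrated with a short enumeration. This is a minimal sketch: the label strings and function names are hypothetical and only mirror the GS, (R/S,R)-TI and (R/S,S)-TI classification described above.

```python
from itertools import product


def ligand_forms():
    """Return labels for the six structures docked per chiral substrate."""
    gs = [f"({e})-GS" for e in "RS"]                            # two GS enantiomers
    ti = [f"({e},{c})-TI" for e, c in product("RS", repeat=2)]  # four TI forms
    return gs + ti


def ti_class(label):
    """Group a TI label by the configuration at the tetrahedral carbon (CT)."""
    second = label.split(",")[1][0]  # second descriptor inside the parentheses
    return f"(R/S,{second})-TI"


forms = ligand_forms()
print(forms)
print([ti_class(f) for f in forms if f.endswith("-TI")])
print("substrate entries implied by 486 molecules:", 486 // len(forms))
```

With six structures per substrate, a library of 486 molecules corresponds to 81 substrate entries.
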
Docking parameters
A search grid box of appropriate size was set prior to docking (Table S1). The box
center was placed exactly at the OG atom of the catalytic serine. For each substrate,
100 possible docking conformations in the receptor were obtained with the Lamarckian
genetic algorithm (LGA), using the system time as the random seed. The population
size was 150 and the maximum number of energy evaluations was 25,000,000. The maximum
number of generations was 27,000, the gene mutation rate was 0.02 and the crossover
rate was 0.8 for the LGA search. Default settings were used for the other search
parameters (step size, energy outside the grid and so on). Finally, the docking free
energy (ΔGdocking) was calculated as the sum of the vdw, hydrogen bond, electrostatic,
desolvation and torsion terms (Equation S4). The resulting docking solutions were
ranked by docking energy from high to low.

$$\Delta G_{docking} = \Delta G_{vdw} + \Delta G_{hb} + \Delta G_{elec} + \Delta G_{desolv} + \Delta G_{tor} \tag{S4}$$

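The LGA settings listed in this section can be collected in one place and rendered as keyword lines. The sketch below assumes that each setting maps onto the AutoDock 4 docking parameter file (DPF) keyword of the same meaning (ga_run, ga_pop_size, ga_num_evals, ga_num_generations, ga_mutation_rate, ga_crossover_rate); the dictionary and helper names are hypothetical, and the seed setting is left out because its exact form is not given in the text.

```python
# LGA settings stated in the text, keyed by the DPF keyword assumed to
# correspond to each one.
LGA_SETTINGS = {
    "ga_run": 100,                   # docking conformations per substrate
    "ga_pop_size": 150,              # population size
    "ga_num_evals": 25_000_000,      # maximum number of energy evaluations
    "ga_num_generations": 27_000,    # maximum number of generations
    "ga_mutation_rate": 0.02,        # gene mutation rate
    "ga_crossover_rate": 0.8,        # crossover rate
}


def dpf_lines(settings):
    """Format each setting as a 'keyword value' line."""
    return [f"{key} {value}" for key, value in settings.items()]


for line in dpf_lines(LGA_SETTINGS):
    print(line)
```

Keeping the values in a single mapping makes it easy to check that every enzyme/substrate pair was docked with the same search settings.
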
Screening for the productive docking geometries
As described in the Methods section, the productive docking geometry of a substrate
must meet the following criteria: 1) the distance between the serine OG atom and the
substrate CT atom must not exceed 1.7 Å; 2) all catalytic hydrogen bonds (H-bonds)
must be formed between enzyme and substrate (Figure 2).

In practice, we used a Perl script to handle the extensive screening work. For each
candidate docking pose, the coordinates of the substrate CT atom and of the enzyme OG
atom were extracted to calculate the OG-CT distance. The presence of each
catalytically important H-bond was then checked using a distance constraint (the
distance between the hydrogen atom and the H-bond acceptor must lie between 1.2 and
2.7 Å) and an angle constraint (the donor-hydrogen-acceptor angle must be at least
120°), as shown in Figure S5.

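The two screening criteria can be expressed compactly in code. The following is a minimal Python sketch of the geometric tests only, not the Perl script used in this work; the function names are hypothetical, coordinates are assumed to be (x, y, z) tuples in Å, and the thresholds are the ones stated above.

```python
import math


def angle_deg(donor, hydrogen, acceptor):
    """Donor-hydrogen-acceptor angle in degrees, measured at the hydrogen."""
    v1 = [d - h for d, h in zip(donor, hydrogen)]
    v2 = [a - h for a, h in zip(acceptor, hydrogen)]
    dot = sum(x * y for x, y in zip(v1, v2))
    cos_t = dot / (math.hypot(*v1) * math.hypot(*v2))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))


def has_hbond(donor, hydrogen, acceptor, d_min=1.2, d_max=2.7, min_angle=120.0):
    """Apply the distance and angle constraints to one candidate H-bond."""
    d = math.dist(hydrogen, acceptor)
    return d_min <= d <= d_max and angle_deg(donor, hydrogen, acceptor) >= min_angle


def is_productive(og, ct, hbonds, max_og_ct=1.7):
    """A pose is productive if OG-CT is short enough and all H-bonds are present.

    hbonds is a list of (donor, hydrogen, acceptor) coordinate triples.
    """
    return math.dist(og, ct) <= max_og_ct and all(has_hbond(*t) for t in hbonds)


# Toy coordinates for illustration only (not taken from any docking result).
linear = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0))  # 2.0 Å, 180 degrees
bent = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 2.0, 0.0))    # 2.0 Å, 90 degrees
print(is_productive((0.0, 0.0, 0.0), (1.6, 0.0, 0.0), [linear]))
print(is_productive((0.0, 0.0, 0.0), (1.6, 0.0, 0.0), [linear, bent]))
```

Requiring every listed H-bond to pass keeps the filter strict, which matches the criterion that all catalytic H-bonds must be formed.
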
Statistical analysis
The statistical analysis was performed with SPSS 10.0 (SPSS Inc.). The prediction
error distributions of the GS and TI models were analyzed with nonparametric tests
(K-S test and two independent samples test). The influence of the substrate torsion
level and of the E value of the enzyme/substrate pair on the prediction was evaluated
by analysis of variance (ANOVA). Correlation analysis was performed with the Pearson
correlation method.

Results and discussions
Ground state versus tetrahedral intermediates
In the search for productive substrate poses, the GS substrates showed better docking
performance than the substrates in TI form. Most of the well-oriented GSs (127/138)
had the highest score in the docking energy ranks. In the TI model, 95% of the
correctly docked (R/S,S)-TIs ranked among the top 5 docking solutions, whereas only
43% of the (R/S,R)-TIs scored within this range and approximately 30% (43/138) of the
(R/S,R)-TIs ranked outside the top 50. In addition, (R/S,R)-TI and (R/S,S)-TI were
present in different proportions among the scoring conformations used to calculate
the modified docking energy difference (ΔΔG’docking). Most of these scoring
conformations were (R/S,S)-TI (119/138), in good agreement with previous structural
studies of the conformation of substrate analogues in enzymes (Derewenda et al.,
1992; Uppenberg et al., 1995).

To represent enzyme enantioselectivity, the ΔΔG’docking values were calculated with
Equation (2) and compared with the activation free energy difference (ΔΔG≠). Among
all 69 enzyme/substrate docking pairs (Table 1), the experimental ΔΔG≠ values ranged
from -4.02 kcal/mol to 4.09 kcal/mol. The predicted ΔΔG’docking ranged from -7.36
kcal/mol to 2.74 kcal/mol in the GS model (K=1.03) and from -4.8 kcal/mol to 2.8
kcal/mol in the TI model (K=1.22). The overall prediction error (EP) distributions of
the GS and TI models are shown in Figure S6. The errors of both models were normally
distributed and no significant difference between them was observed at the 0.05 level
(two independent samples test). However, the TI model showed a slightly narrower
error range (-4 to 4.5 kcal/mol) than the GS model (±5 kcal/mol) and gave more
accurate predictions. In the TI model, 48% of the predictions had an error below 1
kcal/mol and the average error was 1.2 kcal/mol. In the GS model, 29% of the
predictions had an error below 1 kcal/mol and the average error was 2.3 kcal/mol.

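For orientation, the magnitude of an activation free energy difference can be related to the enantiomeric ratio E through the commonly used relation |ΔΔG≠| = RT ln E. The sketch below assumes this relation and a temperature of 298.15 K, neither of which is stated in this section; the function names are hypothetical and the sign convention of Equation (2) is not reproduced here.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def ddg_from_e(e_value, temperature=298.15):
    """Magnitude of the free energy difference (kcal/mol) for a given E value."""
    return R_KCAL * temperature * math.log(e_value)


def e_from_ddg(ddg, temperature=298.15):
    """E value corresponding to the magnitude of a free energy difference."""
    return math.exp(abs(ddg) / (R_KCAL * temperature))


# Experimental extremes and the mean TI/GS errors quoted in the text.
for ddg in (4.09, -4.02, 1.2, 2.3):
    print(f"|ddG| = {abs(ddg):.2f} kcal/mol  ->  E ~ {e_from_ddg(ddg):.0f}")
```

Because E depends exponentially on the free energy difference, an error of about 1 kcal/mol already changes the implied E value several-fold, which is why the error ranges of the two models matter.
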
It has been widely reported that docking high-energy substrate intermediates
represents enzyme activity and enantioselectivity markedly better than docking in the
GS model (Hermann et al., 2006; Tyagi and Pleiss, 2006; Juhl et al., 2009). This may
be because the intermediate structure adopts a docking geometry most similar to the
naturally occurring one.

As shown in Figure S7, the high-energy TI usually had its chiral groups located
exactly in the so-called specificity pockets that are responsible for enantiomeric
recognition (Orrenius et al., 1998). The docked GS substrate, by comparison, had its
alcohol and acid moieties oriented differently. Consequently, the chiral groups might
be docked outside the binding pockets and give a different docking energy. The
docking results also revealed that a substrate docked as (R/S,S)-TI usually had a
lower docking free energy than the corresponding (R/S,R)-TI, in agreement with
previous structural and simulation studies (Uppenberg et al., 1995; Orrenius et al.,
1998). As shown in Figure S7, the docked (R/S,S)-TI had both its acyl and alcohol
moieties well oriented towards the specificity pockets. By comparison, the acid and
alcohol parts of (R/S,R)-TI adopted the opposite orientation, which could lead to
severe steric repulsion from the protein, as observed in other studies (Uppenberg et
al., 1995; Orrenius et al., 1998; Ema, 2004).

References:
Arnold K, Bordoli L, Kopp J, Schwede T. 2006. The SWISS-MODEL workspace: a web-based
environment for protein structure homology modelling. Bioinformatics 22:195-201.

Bayly CI, Cieplak P, Cornell WD, Kollman PA. 1993. A well-behaved electrostatic
potential based method using charge restraints for deriving atomic charges: the RESP
model. J Phys Chem 97:10269-10280.

Derewenda U, Brzozowski AM, Lawson DM, Derewenda ZS. 1992. Catalysis at the
interface: the anatomy of a conformational change in a triglyceride lipase. Biochem
31:1532-1541.

Ema T. 2004. Mechanism of enantioselectivity of lipases and other synthetically
useful hydrolases. Curr Org Chem 8:1009-1025.

Frisch MJ, Trucks GW, Schlegel HB, Scuseria GE. 2003. Gaussian 03, Revision B.04,
Gaussian, Inc., Pittsburgh PA. Revision C.02 was used for the O3LYP, TPSS and B1B95
calculations.

Orrenius C, Hæffner F, Rotticci D, Öhrner N, Norin T, Hult K. 1998. Chiral
recognition of alcohol enantiomers in acyl transfer reactions catalysed by Candida
antarctica lipase B. Biocatal Biotransfor 16:1-15.

Pettersen EF, Goddard TD, Huang CC, Couch GS, Greenblatt DM, Meng EC, Ferrin TE.
2004. UCSF Chimera - A visualization system for exploratory research and analysis. J
Comput Chem 25:1605-1612.

Sanner MF. 1999. Python: A programming language for software integration and
development. J Mol Graphics Mod 17:57-61.

Tyagi S, Pleiss J. 2006. Biochemical profiling in silico - Predicting substrate
specificities of large enzyme families. J Biotech 124:108-116.

Uppenberg J, Öhrner N, Norin M, Hult K, Kleywegt GJ, Patkar S, Waagen V, Anthonsen T,
Jones TA. 1995. Crystallographic and molecular-modeling studies of lipase B from
Candida antarctica reveal a stereospecificity pocket for secondary alcohols. Biochem
34:16838-16851.

Table:
Table S1. Grid sizes used in the docking conformation search for the 7 enzymes

Enzyme   Grid points (X, Y, Z)   Grid spacing (Å)
BTL      50, 60, 50              0.375
CALB     50, 60, 50              0.375
CRL      50, 60, 60              0.375
HLL      60, 60, 50              0.375
RML      60, 60, 60              0.375
PFE      50, 60, 50              0.375
PPE      50, 60, 50              0.375

Figure legends:
Figure S1. Atoms near the carbonyl carbon (CT) strongly repel the OG atom if OG
retains a large vdw radius.

Figure S2. Schematic representation of incorrect TI docking geometries in the ELE.
a) A long distance between the serine OG atom and the substrate CT atom disrupts one
catalytic hydrogen bond; b) the serine OG atom can form a hydrogen bond to a polar
hydrogen of the substrate.

Figure S3. Increasing the vdw interaction between OG and CT makes the substrate dock
productively. In this figure, “A” (green arrow) is the vdw interaction between OG and
CT and “B” (red arrow) indicates the steric repulsion between the protein and the
substrate molecule. In the productive docking geometry, the substrate molecule is
closer to the protein than in the non-productive one and hence suffers higher
repulsion (B). This docking pose therefore has a high energy and is excluded by a
docking process that prefers low-energy poses. Increasing the attractive vdw
interaction (A) reduces the system energy and gives the productive docking geometry a
higher rank among the docking solutions.

Figure S4. Schematic representation of the substrate GS, (R/S,R)-TI and (R/S,S)-TI
forms used in docking. R* indicates the chiral group of the substrate.

Figure S5. Distance and angle constraints used to confirm H-bonds (modified from the
Deep View software manual, point 101).

Figure S6. Prediction error (EP) distribution for docking-based prediction with the
substrate in ground state (GS) and tetrahedral intermediate (TI) form. The errors
were calculated with Equation (4). The y axis shows the number of enzyme/substrate
pairs within a given error range (x axis). Values of zero correspond to the pairs
that are accurately predicted.

Figure S7. Productive docking geometry of the substrate in GS, (R/S,S)-TI and
(R/S,R)-TI form. The catalytic residues (Ser105 and His224) and the oxyanion hole
(Thr40 and Gln106) of CALB are shown to illustrate the substrate orientation. In the
enzyme, the GS substrate docks in two distinct conformations (GS1 and GS2); GS1 has a
lower docking energy than GS2 and dominates the scoring conformations. In the TI
model, the first R/S is the original substrate enantioform and the second indicates
the configuration of the substrate tetrahedral carbon atom (CT).