PHASING BY MULTIPLE ISOMORPHOUS REPLACEMENT (MIR) PART II
We will now finish up by phasing the structure using our U and Hg derivatives, calculating a map,
improving the quality of the map, and then viewing it. This encompasses steps 8 through 12
below.
The flow chart of what we will be doing is as follows, along with the programs in parentheses:
1. Convert the data to CCP4 format (denzo2mtz)
2. Combine these three data sets into one file (cad)
3. Scaling the data sets together (scaleit)
4. Calculating a difference Patterson for the U derivative (fft_dp)
5. Viewing the Harker section of the difference Patterson map (npo)
6. Solving for a consistent set of heavy atom positions for U (rsps)
7. Using the U phases to solve for the Hg positions (ddf)
8. Refining the heavy atom positions (mlphare)
9. Calculating phases (mlphare)
10. Calculating a map (fft)
11. Improving the map using density modification (dm)
12. Viewing the map (O)
8 and 9. Refining U and Hg heavy atom positions together and calculating MIR phases
Using the positions of both the U and Hg atoms (the first determined by Patterson methods, the
second by cross-difference Fourier), we will refine the positions, occupancies and B-factors of each
atom.
mlphare_mir.com
#!/bin/sh -f
#
$CCP4/bin/mlphare HKLIN lyso_nat_U_Hg_sc.mtz \
HKLOUT lyso_nat_U_Hg_mlph.mtz <<eof-f> mlphare_nat_U_Hg.log
TITLE refining U and Hg position(s)
CYCLE 20
THRES 2.5 0.5
ANGLE 10
PRINT AVE AVF
LABIN FP=FP SIGFP=SIGFP FPH1=FPU SIGFPH1=SIGFPU FPH2=FPHg SIGFPH2=SIGFPHg
LABOUT ALLIN PHIB=PHImir FOM=FOMmir
RESO 20 3.2
EXCLUDE SIGFP 2.0
HLOUT
APPLY
DERIV U
DCYCLE PHASE ALL REFCYC ALL KBOV ALL
RESO 20 3.2
EXCLUDE DISO 100 SIGFPH1 2.0
ATOM U 0.583 0.818 0.042 0.153 BFAC 12.687
ATREF X ALL Y ALL Z ALL OCC 1 3 5 7 9 11 13 15 17 19 B 2 4 6 8 10 12 14 16 18 20
DERIV Hg
DCYCLE PHASE ALL REFCYC ALL KBOV ALL
RESO 20 3.2
EXCLUDE DISO 200 SIGFPH2 2.0
ATOM Hg 0.913 0.308 0.009 0.284 BFAC 15.776
ATREF X ALL Y ALL Z ALL OCC 1 3 5 7 9 11 13 15 17 19 B 2 4 6 8 10 12 14 16 18 20
NOHARVEST
END
eof-f
We use the FP, FPU and FPHg columns, along with their SIG values. We will refine to 3.2 Å
resolution because we'd like to have a decent resolution map to look at (but this means we cannot
directly compare these numbers with the 4 Å refinement of U alone that we did previously). Output
labels are now PHImir and FOMmir. Instead of just the U section with its heavy atom position, etc.,
we now have a second one for the Hg atom position.
Run this by typing "mlphare_mir.com".
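The ATOM and ATREF lines in the script above follow a regular pattern: as we read the keywords, the numbers after OCC and B are the cycles on which that parameter is refined, so occupancy moves on odd cycles and B on even cycles of the 20 requested by CYCLE. The sketch below only rebuilds those two text lines; the helper names atom_line and atref_line are our own and are not part of CCP4.

```python
def atom_line(element, x, y, z, occ, bfac):
    """Format an ATOM line in the layout used in mlphare_mir.com."""
    return f"ATOM {element} {x:.3f} {y:.3f} {z:.3f} {occ:.3f} BFAC {bfac:.3f}"


def atref_line(n_cycles, refine_xyz=True):
    """Build an ATREF line that alternates OCC (odd) and B (even) cycles."""
    occ_cycles = range(1, n_cycles + 1, 2)  # 1, 3, 5, ...
    b_cycles = range(2, n_cycles + 1, 2)    # 2, 4, 6, ...
    parts = ["ATREF"]
    if refine_xyz:
        parts.append("X ALL Y ALL Z ALL")
    parts.append("OCC " + " ".join(str(c) for c in occ_cycles))
    parts.append("B " + " ".join(str(c) for c in b_cycles))
    return " ".join(parts)


# Reproduce the U entry of the script for CYCLE 20.
print(atom_line("U", 0.583, 0.818, 0.042, 0.153, 12.687))
print(atref_line(20))
```

Alternating the two parameters is a common way to keep occupancy and B, which are strongly correlated, from being adjusted in the same cycle.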
The output file contains sections for both U and Hg refined positions:
DERIV U
DCYCLE PHASE ALL REFCYC ALL KBOV ALL
RESO 20.00 3.20
SCALE FPH1 0.9954 -0.3958
ISOE 23.29 19.54 17.77 19.01 25.42 29.34 29.76 28.23
ATOM1 U 0.583 0.817 0.043 0.154 BFAC 13.764
ATREF X ALL Y ALL Z ALL OCC ALL AOCC ALL B ALL
DERIV Hg
DCYCLE PHASE ALL REFCYC ALL KBOV ALL
RESO 20.00 3.20
SCALE FPH2 0.9938 -1.2428
ISOE 37.13 33.86 34.70 41.76 51.83 67.69 71.56 66.67
ATOM1 Hg 0.914 0.307 0.013 0.284 BFAC 12.336
ATREF X ALL Y ALL Z ALL OCC ALL AOCC ALL B ALL
The Hg position refined to a decent occupancy (0.28) and B (12.3). The U position behaved
essentially as it had previously.
The phasing power and other stats are given for each derivative separately. Deriv #2 is Hg:
**********************************************************
*** Analysis of Derivative 2 Last Phasing cycle : ***
**********************************************************
Columns: 1/resol^2 and resolution (Angstroms), then for acentric (_a) and centric (_c) reflections:
Nref = number of reflections, DISO = isomorphous difference, LOC = lack of closure,
PhP = phasing power, CullR = Cullis R (below 1.0 is wanted).
1/resol^2  Resol  Nref_a  DISO_a  LOC_a  PhP_a  CullR_a  Nref_c  DISO_c  LOC_c  PhP_c  CullR_c
    0.007  11.96      13    49.1   29.4   1.91     0.60      20    55.4   26.1   1.73     0.47
    0.014   8.60      40    43.7   27.6   1.78     0.63      31    56.5   27.4   1.97     0.48
    0.022   6.71      74    45.1   27.3   1.85     0.60      45    48.1   28.8   1.69     0.60
    0.033   5.50     127    42.5   31.1   1.44     0.73      62    57.3   38.9   1.30     0.68
    0.046   4.66     184    47.8   38.6   1.20     0.81      69    57.3   48.8   0.80     0.85
    0.061   4.05     261    58.5   53.4   0.79     0.91      83    66.4   57.1   0.70     0.86
    0.078   3.57     345    60.6   53.3   0.74     0.88      94    80.2   67.7   0.61     0.84
    0.098   3.20     448    54.9   50.1   0.74     0.91     114    72.4   60.3   0.54     0.83
    TOTAL           1492    54.1   46.5   0.89     0.86     518    65.3   51.0   0.82     0.78
We can now look at both acentric and centric reflections. In each case, the phasing power is quite
good up to about 4.5 Å resolution and then falls off at higher resolution. This is a hallmark of a
relatively non-isomorphous derivative: the Hg atom partially distorts the structure, which shows up
at high resolution and interferes with good statistics. The same behavior is seen with the Cullis R.
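As a quick check on how the columns in the Hg table relate, the Cullis R values printed there agree with the ratio of lack of closure to isomorphous difference (LOC / DISO) to within rounding. This is an observation about the numbers in this log, not a statement of how mlphare computes the statistic internally; the sketch below simply repeats the arithmetic for every shell.

```python
# (resolution, DISO_a, LOC_a, CullR_a, DISO_c, LOC_c, CullR_c) from the Hg table
HG_SHELLS = [
    (11.96, 49.1, 29.4, 0.60, 55.4, 26.1, 0.47),
    (8.60, 43.7, 27.6, 0.63, 56.5, 27.4, 0.48),
    (6.71, 45.1, 27.3, 0.60, 48.1, 28.8, 0.60),
    (5.50, 42.5, 31.1, 0.73, 57.3, 38.9, 0.68),
    (4.66, 47.8, 38.6, 0.81, 57.3, 48.8, 0.85),
    (4.05, 58.5, 53.4, 0.91, 66.4, 57.1, 0.86),
    (3.57, 60.6, 53.3, 0.88, 80.2, 67.7, 0.84),
    (3.20, 54.9, 50.1, 0.91, 72.4, 60.3, 0.83),
]


def cullis_r(loc, diso):
    """Ratio of mean lack of closure to mean isomorphous difference."""
    return loc / diso


for res, diso_a, loc_a, r_a, diso_c, loc_c, r_c in HG_SHELLS:
    print(
        f"{res:6.2f} A  acentric {cullis_r(loc_a, diso_a):.2f} (log {r_a:.2f})"
        f"  centric {cullis_r(loc_c, diso_c):.2f} (log {r_c:.2f})"
    )
```

Because the table values are rounded to one decimal place, the recomputed ratios can differ from the logged ones in the second decimal.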
Deriv #1 is U:
**********************************************************
*** Analysis of Derivative 1 Last Phasing cycle : ***
**********************************************************
Columns: 1/resol^2 and resolution (Angstroms), then for acentric (_a) and centric (_c) reflections:
Nref = number of reflections, DISO = isomorphous difference, LOC = lack of closure,
PhP = phasing power, CullR = Cullis R (below 1.0 is wanted).
1/resol^2  Resol  Nref_a  DISO_a  LOC_a  PhP_a  CullR_a  Nref_c  DISO_c  LOC_c  PhP_c  CullR_c
    0.007  11.96      13    25.1   17.0   1.74     0.68      22    43.0   22.0   1.49     0.51
    0.014   8.60      42    26.5   15.6   2.05     0.59      34    38.0   17.6   1.87     0.46
    0.022   6.71      76    24.9   14.4   2.25     0.58      47    34.9   14.9   1.78     0.43
    0.033   5.50     131    22.3   14.0   2.12     0.63      64    30.4   19.3   1.42     0.64
    0.046   4.66     187    27.2   19.8   1.38     0.73      72    32.0   22.3   1.00     0.70
    0.061   4.05     264    29.5   23.8   1.07     0.81      88    32.9   25.9   0.91     0.79
    0.078   3.57     349    29.5   23.7   1.01     0.80     101    35.2   26.9   0.85     0.76
    0.098   3.20     449    25.3   21.5   1.04     0.85     116    35.1   27.0   0.65     0.77
    TOTAL           1511    27.0   21.0   1.21     0.78     544    34.3   23.4   1.01     0.68
In this case, the U derivative shows good phasing power from 10 to 3.2 Å resolution; in addition,
its Cullis R values are quite good across the resolution ranges.
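One way to put a number on "good up to about 4.5 Å" versus "good across the range" is to find the last shell, going from low to high resolution, before the acentric phasing power first drops below a cutoff. The cutoff of 1.0 used here is a rule-of-thumb assumption on our part, and useful_resolution is a hypothetical helper; the (resolution, PhP_a) pairs are taken from the two derivative tables.

```python
HG_ACENTRIC_PHP = [(11.96, 1.91), (8.60, 1.78), (6.71, 1.85), (5.50, 1.44),
                   (4.66, 1.20), (4.05, 0.79), (3.57, 0.74), (3.20, 0.74)]
U_ACENTRIC_PHP = [(11.96, 1.74), (8.60, 2.05), (6.71, 2.25), (5.50, 2.12),
                  (4.66, 1.38), (4.05, 1.07), (3.57, 1.01), (3.20, 1.04)]


def useful_resolution(shells, threshold=1.0):
    """Return the resolution of the last shell before PhP first falls
    below the threshold (shells ordered from low to high resolution)."""
    limit = None
    for resolution, php in shells:
        if php < threshold:
            break
        limit = resolution
    return limit


print("Hg useful to", useful_resolution(HG_ACENTRIC_PHP), "A")
print("U  useful to", useful_resolution(U_ACENTRIC_PHP), "A")
```

With these numbers the Hg derivative stops at the 4.66 Å shell while U carries through to 3.20 Å, which matches the description of the two derivatives given in the text.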
Estimations of Figure of Merit (FOM) are given for the phasing experiment as a whole:
Nmeas = number of measurements phased; FOM = mean figure of merit.
Resolution (Å)     11.96    8.60    6.71    5.50    4.66    4.05    3.57    3.20   TOTAL
ACENTRIC  Nmeas       13      42      76     131     187     264     349     449    1511
          FOM     0.5358  0.6081  0.6014  0.5221  0.4692  0.3993  0.3964  0.3593  0.4232
CENTRIC   Nmeas       22      34      47      64      72      88     101     116     544
          FOM     0.8349  0.8760  0.7757  0.7817  0.6433  0.5897  0.6560  0.6098  0.6799
ALL       Nmeas       35      76     123     195     259     352     450     565    2055
          FOM     0.7238  0.7279  0.6680  0.6073  0.5176  0.4469  0.4547  0.4107  0.4911
The overall figure of merit to 3.2 Å is good (0.49). However, in the shells out to about 4.5 Å it is
very good (0.52 to 0.73 per shell).
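The overall figure of merit can be recovered from the per-shell values as a mean weighted by the number of measurements in each shell. The sketch below uses the "ALL" rows of the FOM table and also shows the same weighted mean restricted to the shells down to 4.66 Å; weighted_fom is our own helper name.

```python
N_ALL = [35, 76, 123, 195, 259, 352, 450, 565]
FOM_ALL = [0.7238, 0.7279, 0.6680, 0.6073, 0.5176, 0.4469, 0.4547, 0.4107]


def weighted_fom(counts, foms):
    """Mean figure of merit weighted by the number of measurements per shell."""
    return sum(n * f for n, f in zip(counts, foms)) / sum(counts)


overall = weighted_fom(N_ALL, FOM_ALL)           # all eight shells, 20 to 3.2 A
low_res = weighted_fom(N_ALL[:5], FOM_ALL[:5])   # first five shells, to 4.66 A

print(f"overall FOM     {overall:.4f}")
print(f"FOM to 4.66 A   {low_res:.4f}")
```

The first number reproduces the logged total to within rounding, and the second makes the "very good to about 4.5 Å" statement concrete.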
Now we check both derivatives for additional sites of heavy atom substitution that may not have
appeared on the Patterson maps or in the initial cross-difference Fourier map. We use the same
technique as a cross difference Fourier; the only difference is we now use MIR phases rather than
the SIR U phases.
fft_cdf_U.com
$CCP4/bin/fft HKLIN lyso_nat_U_Hg_mlph.mtz \
MAPOUT temp.map <<eof-f> fft_cdf.log
TITLE
LABIN F1=FPU SIG1=SIGFPU F2=FP SIG2=SIGFP PHI=PHImir W=FOMmir
RESO 20 3.2
EXCLUDE sig1 3 sig2 3 diff 100
END
eof-f
$CCP4/bin/peakmax mapin temp.map << eof > lyso_MIR_toget_U_cdf.peaks
threshold rms 3.0 #negatives
output peaks
eof
rm -f TO
#rm -f temp.map
Gives this output from the peaksearch:
Order  No.  Site  Height/Rms        Grid       Fractional coordinates   Orthogonal coordinates
    1    5     1       38.17    42  59   1    0.5827  0.8177  0.0414     45.77  64.23   1.53
    2    4     2        8.81    66  22   1    0.9139  0.3058  0.0174     71.78  24.02   0.64
    3    6     4        3.18    24  14   3    0.3305  0.1975  0.0962     25.96  15.51   3.55
    4    3     3        3.15    42  20   1    0.5818  0.2803  0.0359     45.70  22.02   1.32
    5    9     6        3.14    45  59   4    0.6291  0.8228  0.1250     49.41  64.63   4.60
    6    7     5        3.04    41  55   3    0.5688  0.7654  0.0921     44.68  60.12   3.39
The existing U site and the existing Hg site (recall ghost peaks?) are present here. This site looks
possible: 0.33, 0.20, 0.10 (peak 3). Add it as a new U site to the mlphare_mir.com run and see if it
helps. All parts of the input file are the same except for the U sites:
DERIV U
DCYCLE PHASE ALL REFCYC ALL KBOV ALL
RESO 20 3.2
EXCLUDE DISO 100 SIGFPH1 2.0
ATOM U 0.583 0.818 0.042 0.153 BFAC 12.687
ATREF X ALL Y ALL Z ALL OCC 1 3 5 7 9 11 13 15 17 19 B 2 4 6 8 10 12 14 16 18 20
ATOM U 0.3309 0.1972 0.0962 0.153 BFAC 12.687
ATREF X ALL Y ALL Z ALL OCC 1 3 5 7 9 11 13 15 17 19 B 2 4 6 8 10 12 14 16 18 20
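Picking the candidate out of the peak list amounts to setting aside peaks that coincide with sites already in the model. A minimal sketch of that bookkeeping follows, using the fractional coordinates from the peak search for U and the refined U and Hg positions. The 0.01 tolerance and the helper names are assumptions, and the comparison does not apply any symmetry operators, so symmetry-related or ghost peaks would still need to be judged by eye.

```python
PEAKS = [  # (order, height/rms, x, y, z) from the peak search above
    (1, 38.17, 0.5827, 0.8177, 0.0414),
    (2, 8.81, 0.9139, 0.3058, 0.0174),
    (3, 3.18, 0.3305, 0.1975, 0.0962),
    (4, 3.15, 0.5818, 0.2803, 0.0359),
    (5, 3.14, 0.6291, 0.8228, 0.1250),
    (6, 3.04, 0.5688, 0.7654, 0.0921),
]
KNOWN_SITES = {"U": (0.583, 0.817, 0.043), "Hg": (0.914, 0.307, 0.013)}


def frac_delta(a, b):
    """Shortest distance between two fractional coordinates, allowing wrap-around."""
    d = abs(a - b) % 1.0
    return min(d, 1.0 - d)


def match_site(xyz, known, tol=0.01):
    """Return the name of a known site within tol on every axis, else None."""
    for name, site in known.items():
        if all(frac_delta(p, s) <= tol for p, s in zip(xyz, site)):
            return name
    return None


labels = [match_site((x, y, z), KNOWN_SITES) for _, _, x, y, z in PEAKS]
for (order, height, *_), label in zip(PEAKS, labels):
    print(order, height, label or "candidate")
```

Peaks 1 and 2 are labelled as the existing U and Hg sites; the strongest unlabelled peak is the one added to the input file above.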
This additional site refines to the following parameters:
DERIV U
DCYCLE PHASE ALL REFCYC ALL KBOV ALL
RESO 20.00 3.20
SCALE FPH1 0.9967 -0.4601
ISOE 20.64 18.69 17.98 17.50 24.77 29.33 29.26 27.75
ATOM1 U 0.582 0.817 0.043 0.153 BFAC 12.444
ATREF X ALL Y ALL Z ALL OCC ALL AOCC ALL B ALL
ATOM2 U 0.334 0.201 0.111 0.048 BFAC 58.199
ATREF X ALL Y ALL Z ALL OCC ALL AOCC ALL B ALL
1/resol^2  Resol  Nref_a  DISO_a  LOC_a  PhP_a  CullR_a  Nref_c  DISO_c  LOC_c  PhP_c  CullR_c
    0.007  11.96      13    25.1   17.5   1.92     0.70      22    43.0   18.9   1.79     0.44
    0.014   8.60      42    26.5   14.0   2.41     0.53      34    38.0   17.4   1.85     0.46
    0.022   6.71      76    24.8   14.1   2.38     0.57      47    34.9   15.2   1.89     0.44
    0.033   5.50     131    22.3   13.3   2.24     0.60      64    30.3   16.9   1.63     0.56
    0.046   4.66     187    27.2   19.3   1.46     0.71      72    32.0   21.1   1.08     0.66
    0.061   4.05     264    29.5   23.8   1.10     0.81      88    32.9   25.6   0.94     0.78
    0.078   3.57     349    29.5   23.3   1.06     0.79     101    35.2   26.6   0.88     0.75
    0.098   3.20     449    25.3   21.3   1.09     0.84     116    35.1   26.3   0.71     0.75
    TOTAL           1511    27.0   20.7   1.27     0.77     544    34.3   22.7   1.07     0.66
Nmeas = number of measurements phased; FOM = mean figure of merit.
Resolution (Å)     11.96    8.60    6.71    5.50    4.66    4.05    3.57    3.20   TOTAL
ACENTRIC  Nmeas       13      42      76     131     187     264     349     449    1511
          FOM     0.5718  0.6361  0.6039  0.5449  0.4803  0.4012  0.4011  0.3663  0.4312
CENTRIC   Nmeas       22      34      47      64      72      88     101     116     544
          FOM     0.9023  0.8903  0.8073  0.8240  0.6598  0.5900  0.6631  0.6236  0.6977
ALL       Nmeas       35      76     123     195     259     352     450     565    2055
          FOM     0.7796  0.7498  0.6816  0.6365  0.5302  0.4484  0.4599  0.4191  0.5018
This is some improvement over the previous refinement with one U site (overall FOM 0.50 versus
0.49; acentric phasing power for U 1.27 versus 1.21). We will keep it. A site not worth keeping has
little or no impact on the phasing stats (PhP, FOM) and refines to a very poor occ or B.
fft_cdf_Hg.com
$CCP4/bin/fft HKLIN lyso_nat_U_Hg_mlph.mtz \
MAPOUT temp.map <<eof-f> fft_cdf.log
TITLE
LABIN F1=FPHg SIG1=SIGFPHg F2=FP SIG2=SIGFP PHI=PHImir W=FOMmir
RESO 20 3.2
EXCLUDE sig1 3 sig2 3 diff 200
END
eof-f
$CCP4/bin/peakmax mapin temp.map << eof > lyso_MIR_toget_Hg_cdf.peaks
threshold rms 3.0 #negatives
output peaks
eof
rm -f TO
#rm -f temp.map
From the peaksearch file:
Order  No.  Site  Height/Rms        Grid       Fractional coordinates   Orthogonal coordinates
    1    2     2       28.44    66  22   0    0.9140  0.3068  0.0000     71.80  24.10   0.00
    2    8     3       12.07    42  59   1    0.5835  0.8171  0.0396     45.83  64.18   1.46
    3    9     6        4.04    19  63   1    0.2666  0.8757  0.0353     20.94  68.78   1.30
    4    7     5        3.71    66  53   1    0.9172  0.7388  0.0332     72.05  58.03   1.22
    5    6     4        3.29     9  52   1    0.1236  0.7248  0.0347      9.71  56.93   1.28
    6    3     1        3.17    14  30   0    0.1912  0.4160  0.0000     15.02  32.68   0.00
    7   10     7        3.10    66  18   2    0.9180  0.2503  0.0575     72.11  19.66   2.12
    8   11     0        3.03    50   6   3    0.6969  0.0836  0.0991     54.74   6.57   3.65
Looking past the ghost peaks, we see an additional site at 0.27, 0.88, 0.04 (peak 3) that might be
possible. Adding this site to the mlphare_mir refinement we get:
DERIV Hg
DCYCLE PHASE ALL REFCYC ALL KBOV ALL
RESO 20.00 3.20
SCALE FPH2 0.9954 -1.3833
ISOE 28.10 32.86 31.00 40.13 50.78 66.98 71.03 66.32
ATOM1 Hg 0.914 0.306 0.013 0.281 BFAC 9.637
ATREF X ALL Y ALL Z ALL OCC ALL AOCC ALL B ALL
ATOM2 Hg 0.257 0.870 0.066 0.101 BFAC 82.845
ATREF X ALL Y ALL Z ALL OCC ALL AOCC ALL B ALL
1/resol^2  Resol  Nref_a  DISO_a  LOC_a  PhP_a  CullR_a  Nref_c  DISO_c  LOC_c  PhP_c  CullR_c
    0.007  11.96      13    48.8   22.0   2.92     0.45      20    55.4   23.1   2.02     0.42
    0.014   8.60      40    43.7   28.0   1.77     0.64      31    56.5   25.5   2.10     0.45
    0.022   6.71      74    45.0   24.4   2.15     0.54      45    48.1   26.1   1.78     0.54
    0.033   5.50     127    42.5   30.0   1.55     0.71      62    57.2   37.0   1.42     0.65
    0.046   4.66     184    47.8   37.6   1.30     0.79      69    57.3   48.2   0.90     0.84
    0.061   4.05     261    58.5   52.7   0.83     0.90      83    66.5   56.8   0.76     0.85
    0.078   3.57     345    60.6   53.0   0.77     0.87      94    80.2   67.0   0.65     0.83
    0.098   3.20     448    55.0   49.9   0.78     0.91     114    72.4   59.7   0.57     0.82
    TOTAL           1492    54.1   45.8   0.95     0.85     518    65.3   49.9   0.87     0.76
Nmeas = number of measurements phased; FOM = mean figure of merit.
Resolution (Å)     11.96    8.60    6.71    5.50    4.66    4.05    3.57    3.20   TOTAL
ACENTRIC  Nmeas       13      42      76     131     187     264     349     449    1511
          FOM     0.6128  0.6551  0.6188  0.5539  0.4962  0.4084  0.4085  0.3723  0.4404
CENTRIC   Nmeas       22      34      47      64      72      88     101     116     544
          FOM     0.8853  0.8897  0.8284  0.8383  0.6728  0.5970  0.6693  0.6255  0.7049
ALL       Nmeas       35      76     123     195     259     352     450     565    2055
          FOM     0.7841  0.7600  0.6989  0.6473  0.5453  0.4556  0.4670  0.4243  0.5104
Thus, this additional Hg site (despite its relatively high B factor) helps with the overall phasing.
We will keep it.
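To see where the extra sites help, the "ALL" mean FOM values from the three mlphare runs can be compared shell by shell. The sketch below tabulates the gain of the final run (second U and second Hg site included, as we read the sequence of runs) over the first run with one site per derivative. The variable names are ours.

```python
RESOLUTION = [11.96, 8.60, 6.71, 5.50, 4.66, 4.05, 3.57, 3.20]
FOM_ONE_SITE_EACH = [0.7238, 0.7279, 0.6680, 0.6073, 0.5176, 0.4469, 0.4547, 0.4107]
FOM_SECOND_U = [0.7796, 0.7498, 0.6816, 0.6365, 0.5302, 0.4484, 0.4599, 0.4191]
FOM_SECOND_HG = [0.7841, 0.7600, 0.6989, 0.6473, 0.5453, 0.4556, 0.4670, 0.4243]
OVERALL = [0.4911, 0.5018, 0.5104]  # logged totals for the three runs, in order


def shell_gain(before, after):
    """Per-shell change in mean FOM between two runs."""
    return [a - b for a, b in zip(after, before)]


gains = shell_gain(FOM_ONE_SITE_EACH, FOM_SECOND_HG)
for res, gain in zip(RESOLUTION, gains):
    print(f"{res:6.2f} A  {gain:+.4f}")
print(f"overall   {OVERALL[-1] - OVERALL[0]:+.4f}")
```

Every shell improves, with the largest gains at low resolution, which is where a weakly occupied, high-B site would be expected to contribute most.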
10. Calculating an electron density map
We use the F's from the protein and our MIR phases and figures of merit. In this case we only
need to do a sigma cut-off on the FP's (we use a 2 sigma cut-off in this case).
fft_f.com
$CCP4/bin/fft HKLIN lyso_nat_U_Hg_mlph.mtz \
MAPOUT lyso_MIR.map <<eof-f> fft_f.log
TITLE
LABIN F1=FP SIG1=SIGFP PHI=PHImir W=FOMmir
RESO 20 4.5
EXCLUDE sig1 2
XYZLIM ASU
END
eof-f
#
$CCP4/bin/mapmask MAPIN lyso_MIR.map \
MAPOUT lyso_MIR_ex.map << EOF > mapmask.log
XYZLIM -0.9 1.1 -0.9 1.1 -0.9 1.1
END
EOF
#
This generates the map lyso_MIR.map; mapmask then extends it to cover the complete unit cell
(lyso_MIR_ex.map).
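For each reflection, the map calculation in fft_f.com combines three columns: the native amplitude FP, the MIR phase PHImir, and the figure of merit FOMmir used as a weight. A minimal sketch of that per-reflection coefficient is given below, together with the 2 sigma rejection as we understand the EXCLUDE keyword (a reflection is dropped when FP is less than 2 times SIGFP). The function name and the example reflections are hypothetical and are not taken from the data set.

```python
import cmath
import math


def map_coefficient(fp, sigfp, phi_deg, fom, nsig=2.0):
    """Return FOM * FP * exp(i * phi), or None if FP < nsig * SIGFP."""
    if fp < nsig * sigfp:
        return None  # reflection fails the sigma cut-off
    return cmath.rect(fom * fp, math.radians(phi_deg))


# Hypothetical reflections: (FP, SIGFP, phase in degrees, FOM)
examples = [
    (100.0, 5.0, 0.0, 0.50),    # well measured, moderately well phased
    (100.0, 5.0, 90.0, 0.90),   # well measured, well phased
    (100.0, 60.0, 90.0, 0.90),  # rejected: FP is below 2 sigma
]
for fp, sigfp, phi, fom in examples:
    print(fp, sigfp, phi, fom, "->", map_coefficient(fp, sigfp, phi, fom))
```

Weighting by the figure of merit scales down reflections whose phases are poorly determined, so they contribute less to the resulting map.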