PHASING BY MULTIPLE ISOMORPHOUS REPLACEMENT (MIR) PART II

We will now finish up by phasing the structure using our U and Hg derivatives, calculating a map, improving the quality of the map, and then viewing it. This encompasses steps 8 through 12 below. The flow chart of what we will be doing is as follows, with the programs in parentheses:

1. Converting the data to CCP4 format (denzo2mtz)
2. Combining the three data sets into one file (cad)
3. Scaling the data sets together (scaleit)
4. Calculating a difference Patterson for the U derivative (fft_dp)
5. Viewing the Harker section of the difference Patterson map (npo)
6. Solving for a consistent set of heavy atom positions for U (rsps)
7. Using the U phases to solve for the Hg positions (ddf)
8. Refining the heavy atom positions (mlphare)
9. Calculating phases (mlphare)
10. Calculating a map (fft)
11. Improving the map using density modification (dm)
12. Viewing the map (O)

8 and 9. Refining U and Hg heavy atom positions together and calculating MIR phases

Using the positions of both the U and Hg atoms (the first determined by Patterson methods, the second by cross-difference Fourier), we will refine the positions, occupancies and B-factors of each atom.
mlphare_mir.com

#!/bin/sh -f
#
$CCP4/bin/mlphare HKLIN lyso_nat_U_Hg_sc.mtz \
HKLOUT lyso_nat_U_Hg_mlph.mtz << eof-f > mlphare_nat_U_Hg.log
TITLE refining U and Hg position(s)
CYCLE 20
THRES 2.5 0.5
ANGLE 10
PRINT AVE AVF
LABIN FP=FP SIGFP=SIGFP FPH1=FPU SIGFPH1=SIGFPU FPH2=FPHg SIGFPH2=SIGFPHg
LABOUT ALLIN PHIB=PHImir FOM=FOMmir
RESO 20 3.2
EXCLUDE SIGFP 2.0
HLOUT
APPLY
DERIV U
DCYCLE PHASE ALL REFCYC ALL KBOV ALL
RESO 20 3.2
EXCLUDE DISO 100 SIGFPH1 2.0
ATOM U 0.583 0.818 0.042 0.153 BFAC 12.687
ATREF X ALL Y ALL Z ALL OCC 1 3 5 7 9 11 13 15 17 19 B 2 4 6 8 10 12 14 16 18 20
DERIV Hg
DCYCLE PHASE ALL REFCYC ALL KBOV ALL
RESO 20 3.2
EXCLUDE DISO 200 SIGFPH2 2.0
ATOM Hg 0.913 0.308 0.009 0.284 BFAC 15.776
ATREF X ALL Y ALL Z ALL OCC 1 3 5 7 9 11 13 15 17 19 B 2 4 6 8 10 12 14 16 18 20
NOHARVEST
END
eof-f

We use the FP, FPU and FPHg columns, along with their SIG values. We will refine to 3.2 Å resolution because we'd like to have a decent-resolution map to look at (note that these numbers cannot be compared directly with the 4 Å refinement of U alone that we did previously). The output labels are now PHImir and FOMmir. In addition to the DERIV section for U, with its heavy atom position, etc., there is now a second DERIV section for the Hg atom position.

Run this by typing 'mlphare_mir.com'. The output file contains sections for both the U and Hg refined positions:

DERIV U
DCYCLE PHASE ALL REFCYC ALL KBOV ALL
RESO 20.00 3.20
SCALE FPH1 0.9954 -0.3958
ISOE 23.29 19.54 17.77 19.01 25.42 29.34 29.76 28.23
ATOM1 U 0.583 0.817 0.043 0.154 BFAC 13.764
ATREF X ALL Y ALL Z ALL OCC ALL AOCC ALL B ALL
DERIV Hg
DCYCLE PHASE ALL REFCYC ALL KBOV ALL
RESO 20.00 3.20
SCALE FPH2 0.9938 -1.2428
ISOE 37.13 33.86 34.70 41.76 51.83 67.69 71.56 66.67
ATOM1 Hg 0.914 0.307 0.013 0.284 BFAC 12.336
ATREF X ALL Y ALL Z ALL OCC ALL AOCC ALL B ALL

The Hg position refined to a decent occupancy (0.28) and B (12.3). The U position behaved essentially as it had previously.
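When several sites are being refined, it can help to pull just the refined atom lines out of the log. The following is a minimal sketch, not part of the original scripts: refined_atoms is a hypothetical helper, and it assumes the log contains DERIV and ATOMn lines laid out exactly as in the listing above (label, x, y, z, occupancy, the word BFAC, then B).

```shell
#!/bin/sh
# refined_atoms (hypothetical helper): read an mlphare log on stdin and print
# one summary line per refined atom.
# Assumption: lines look like "ATOM1 U 0.583 0.817 0.043 0.154 BFAC 13.764",
# preceded by a "DERIV <name>" line, as in the listing above.
refined_atoms() {
    awk '
    $1 == "DERIV" { deriv = $2 }                  # remember the current derivative
    $1 ~ /^ATOM[0-9]+$/ && $7 == "BFAC" {         # numbered ATOM lines only
        printf "%s %s x=%s y=%s z=%s occ=%s B=%s\n", deriv, $1, $3, $4, $5, $6, $8
    }'
}
```

Typical use would be something like: refined_atoms < mlphare_nat_U_Hg.log. The unnumbered ATOM lines of the input deck are skipped on purpose, so only refined values are reported.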
The phasing power and other statistics are given for each derivative separately. Deriv #2 is Hg:

**********************************************************
*** Analysis of Derivative 2 Last Phasing cycle : ***
**********************************************************
$$ 1/resol^2 Resolution(Angstroms) Number_acentric_reflections Isomorphous_difference_acentric
   Lack_of_closure_acentric Phasing_power_acentric Cullis_R_acentric(?<1.0) Number_centric_reflections
   Isomorphous_difference_centric Lack_of_closure_centric Phasing_power_centric Cullis_R_centric(?<1.0) $$
1/resol^2  Resol  Nref_a  DISO_a  LOC_a  PhP_a  CullR_a  Nref_c  DISO_c  LOC_c  PhP_c  CullR_c
$$
    0.007  11.96      13    49.1   29.4   1.91     0.60      20    55.4   26.1   1.73     0.47
    0.014   8.60      40    43.7   27.6   1.78     0.63      31    56.5   27.4   1.97     0.48
    0.022   6.71      74    45.1   27.3   1.85     0.60      45    48.1   28.8   1.69     0.60
    0.033   5.50     127    42.5   31.1   1.44     0.73      62    57.3   38.9   1.30     0.68
    0.046   4.66     184    47.8   38.6   1.20     0.81      69    57.3   48.8   0.80     0.85
    0.061   4.05     261    58.5   53.4   0.79     0.91      83    66.4   57.1   0.70     0.86
    0.078   3.57     345    60.6   53.3   0.74     0.88      94    80.2   67.7   0.61     0.84
    0.098   3.20     448    54.9   50.1   0.74     0.91     114    72.4   60.3   0.54     0.83
$$
    TOTAL           1492    54.1   46.5   0.89     0.86     518    65.3   51.0   0.82     0.78

We can now look at both acentric and centric reflections. In each case, the phasing power is quite good out to about 4.5 Å resolution and then falls off at higher resolution. This is a hallmark of a relatively non-isomorphous derivative: the Hg atom partially distorts the structure, which degrades the statistics at high resolution. The same behavior is seen in the Cullis R.
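As a quick check on how the columns relate, the tabulated Cullis R values agree, to within rounding, with the ratio LOC/DISO (mean lack of closure over mean isomorphous difference) in each shell. The sketch below recomputes that ratio from rows of the table; cullis_from_table is a hypothetical helper, and it assumes 12-column rows in the same order as the listing above.

```shell
#!/bin/sh
# cullis_from_table (hypothetical helper): read shell rows of the mlphare
# statistics table on stdin and print "Resol LOC_a/DISO_a LOC_c/DISO_c".
# Assumed column order, as in the listing above:
#   1/resol^2 Resol Nref_a DISO_a LOC_a PhP_a CullR_a Nref_c DISO_c LOC_c PhP_c CullR_c
cullis_from_table() {
    awk 'NF == 12 && $1 ~ /^[0-9.]+$/ {           # skip header and TOTAL rows
        printf "%s %.2f %.2f\n", $2, $5 / $4, $10 / $9
    }'
}
```

For the 4.05 Å shell of the Hg derivative this gives 0.91 (acentric) and 0.86 (centric), matching the CullR_a and CullR_c columns. A ratio approaching 1.0 means the lack of closure is as large as the isomorphous difference itself, which is what happens to the Hg derivative beyond about 4.5 Å.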
Deriv #1 is U:

**********************************************************
*** Analysis of Derivative 1 Last Phasing cycle : ***
**********************************************************
$$ 1/resol^2 Resolution(Angstroms) Number_acentric_reflections Isomorphous_difference_acentric
   Lack_of_closure_acentric Phasing_power_acentric Cullis_R_acentric(?<1.0) Number_centric_reflections
   Isomorphous_difference_centric Lack_of_closure_centric Phasing_power_centric Cullis_R_centric(?<1.0) $$
1/resol^2  Resol  Nref_a  DISO_a  LOC_a  PhP_a  CullR_a  Nref_c  DISO_c  LOC_c  PhP_c  CullR_c
$$
    0.007  11.96      13    25.1   17.0   1.74     0.68      22    43.0   22.0   1.49     0.51
    0.014   8.60      42    26.5   15.6   2.05     0.59      34    38.0   17.6   1.87     0.46
    0.022   6.71      76    24.9   14.4   2.25     0.58      47    34.9   14.9   1.78     0.43
    0.033   5.50     131    22.3   14.0   2.12     0.63      64    30.4   19.3   1.42     0.64
    0.046   4.66     187    27.2   19.8   1.38     0.73      72    32.0   22.3   1.00     0.70
    0.061   4.05     264    29.5   23.8   1.07     0.81      88    32.9   25.9   0.91     0.79
    0.078   3.57     349    29.5   23.7   1.01     0.80     101    35.2   26.9   0.85     0.76
    0.098   3.20     449    25.3   21.5   1.04     0.85     116    35.1   27.0   0.65     0.77
$$
    TOTAL           1511    27.0   21.0   1.21     0.78     544    34.3   23.4   1.01     0.68

In this case, the U derivative shows good phasing power (above 1.0 for the acentric reflections) all the way to 3.2 Å resolution; in addition, its Cullis R values are quite good across the resolution ranges.

Estimates of the figure of merit (FOM) are given for the phasing experiment as a whole:

Resolution (Å)             11.96    8.60    6.71    5.50    4.66    4.05    3.57    3.20   TOTAL
Acentric, number phased       13      42      76     131     187     264     349     449    1511
Acentric, mean FOM        0.5358  0.6081  0.6014  0.5221  0.4692  0.3993  0.3964  0.3593  0.4232
Centric, number phased        22      34      47      64      72      88     101     116     544
Centric, mean FOM         0.8349  0.8760  0.7757  0.7817  0.6433  0.5897  0.6560  0.6098  0.6799
All, number phased            35      76     123     195     259     352     450     565    2055
All, mean FOM             0.7238  0.7279  0.6680  0.6073  0.5176  0.4469  0.4547  0.4107  0.4911

The overall figure of merit is good (0.49), and out to about 4.5 Å it is very good (above 0.5 in every shell).
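To put a number on "to 4.5 Å it is very good", the per-shell values can be combined into a mean weighted by the number of reflections in each shell. This is a minimal sketch under that assumption; mean_fom is a hypothetical helper, not a CCP4 program. Weighting all eight shells of the "All" rows this way reproduces the tabulated overall value of 0.4911.

```shell
#!/bin/sh
# mean_fom (hypothetical helper): read "count fom" pairs on stdin, one shell
# per line, and print the reflection-weighted mean figure of merit.
mean_fom() {
    awk 'NF == 2 { n += $1; s += $1 * $2 }        # accumulate counts and count*FOM
         END     { if (n > 0) printf "%.4f\n", s / n }'
}

# The five lowest-resolution shells (11.96 to 4.66 Å) of the "All" rows:
printf '%s\n' '35 0.7238' '76 0.7279' '123 0.6680' '195 0.6073' '259 0.5176' | mean_fom
```

The five shells down to 4.66 Å come out at 0.6036, compared with 0.4911 for the full 20-3.2 Å range.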
Now we check both derivatives for additional sites of heavy atom substitution that may not have appeared in the Patterson maps or in the initial cross-difference Fourier map. We use the same technique as for a cross-difference Fourier; the only difference is that we now use the MIR phases rather than the SIR U phases.

fft_cdf_U.com

$CCP4/bin/fft HKLIN lyso_nat_U_Hg_mlph.mtz \
MAPOUT temp.map << eof-f > fft_cdf.log
TITLE
LABIN F1=FPU SIG1=SIGFPU F2=FP SIG2=SIGFP PHI=PHImir W=FOMmir
RESO 20 3.2
EXCLUDE sig1 3 sig2 3 diff 100
END
eof-f
$CCP4/bin/peakmax mapin temp.map << eof > lyso_MIR_toget_U_cdf.peaks
threshold rms 3.0
#negatives
output peaks
eof
rm -f TO
#rm -f temp.map

This gives the following output from the peak search:

Order No. Site Height/Rms     Grid      Fractional coordinates    Orthogonal coords
    1   5    1      38.17   42 59  1   0.5827  0.8177  0.0414   45.77  64.23   1.53
    2   4    2       8.81   66 22  1   0.9139  0.3058  0.0174   71.78  24.02   0.64
    3   6    4       3.18   24 14  3   0.3305  0.1975  0.0962   25.96  15.51   3.55
    4   3    3       3.15   42 20  1   0.5818  0.2803  0.0359   45.70  22.02   1.32
    5   9    6       3.14   45 59  4   0.6291  0.8228  0.1250   49.41  64.63   4.60
    6   7    5       3.04   41 55  3   0.5688  0.7654  0.0921   44.68  60.12   3.39

The existing U site (peak 1) and the existing Hg site (peak 2; recall ghost peaks?) are present here. The third peak looks like a possible new site: 0.33, 0.20, 0.10. Add it as a new site to the mlphare_mir.com run and see if it helps.
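The first pass over a peak list like the one above, setting aside peaks that sit on sites already in the refinement, can be sketched as a small filter. This is a hypothetical helper under loud assumptions: new_peaks is not a CCP4 program, it expects 13-column rows as in the listing above, the default tolerance of 0.02 in fractional units is arbitrary, and it compares coordinates directly without applying any symmetry operators, so symmetry-related copies and ghost peaks still have to be judged by eye.

```shell
#!/bin/sh
# new_peaks (hypothetical helper): print peaks that do not coincide with a
# known site. No symmetry is applied; this is only a first-pass filter.
# Usage: new_peaks "x1 y1 z1; x2 y2 z2" [tol] < peak_rows
# Assumes rows as in the listing above: fractional x y z in columns 8-10.
new_peaks() {
    awk -v sites="$1" -v tol="${2:-0.02}" '
    function fdiff(a, b,    d) {                  # periodic fractional difference
        d = a - b
        if (d < 0) d = -d
        if (d > 0.5) d = 1 - d
        return d
    }
    BEGIN { n = split(sites, s, ";") }
    NF == 13 && $1 ~ /^[0-9]+$/ {                 # data rows only, skip the header
        known = 0
        for (i = 1; i <= n; i++) {
            split(s[i], c, " ")
            if (fdiff($8, c[1]) <= tol && fdiff($9, c[2]) <= tol && fdiff($10, c[3]) <= tol)
                known = 1
        }
        if (!known) print $1, $4, $8, $9, $10     # order, height/rms, x, y, z
    }'
}
```

Called with the two refined sites, for example new_peaks "0.583 0.818 0.042; 0.913 0.308 0.009", it drops peaks 1 and 2 of the listing and leaves the remaining candidates, headed by the peak at 0.3305 0.1975 0.0962. Whether any survivor is a real site is still decided by adding it to mlphare and watching the statistics.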
All parts of the input file are the same except for the U sites:

DERIV U
DCYCLE PHASE ALL REFCYC ALL KBOV ALL
RESO 20 3.2
EXCLUDE DISO 100 SIGFPH1 2.0
ATOM U 0.583 0.818 0.042 0.153 BFAC 12.687
ATREF X ALL Y ALL Z ALL OCC 1 3 5 7 9 11 13 15 17 19 B 2 4 6 8 10 12 14 16 18 20
ATOM U 0.3309 0.1972 0.0962 0.153 BFAC 12.687
ATREF X ALL Y ALL Z ALL OCC 1 3 5 7 9 11 13 15 17 19 B 2 4 6 8 10 12 14 16 18 20

This additional site refines to the following parameters:

DERIV U
DCYCLE PHASE ALL REFCYC ALL KBOV ALL
RESO 20.00 3.20
SCALE FPH1 0.9967 -0.4601
ISOE 20.64 18.69 17.98 17.50 24.77 29.33 29.26 27.75
ATOM1 U 0.582 0.817 0.043 0.153 BFAC 12.444
ATREF X ALL Y ALL Z ALL OCC ALL AOCC ALL B ALL
ATOM2 U 0.334 0.201 0.111 0.048 BFAC 58.199
ATREF X ALL Y ALL Z ALL OCC ALL AOCC ALL B ALL

1/resol^2  Resol  Nref_a  DISO_a  LOC_a  PhP_a  CullR_a  Nref_c  DISO_c  LOC_c  PhP_c  CullR_c
$$
    0.007  11.96      13    25.1   17.5   1.92     0.70      22    43.0   18.9   1.79     0.44
    0.014   8.60      42    26.5   14.0   2.41     0.53      34    38.0   17.4   1.85     0.46
    0.022   6.71      76    24.8   14.1   2.38     0.57      47    34.9   15.2   1.89     0.44
    0.033   5.50     131    22.3   13.3   2.24     0.60      64    30.3   16.9   1.63     0.56
    0.046   4.66     187    27.2   19.3   1.46     0.71      72    32.0   21.1   1.08     0.66
    0.061   4.05     264    29.5   23.8   1.10     0.81      88    32.9   25.6   0.94     0.78
    0.078   3.57     349    29.5   23.3   1.06     0.79     101    35.2   26.6   0.88     0.75
    0.098   3.20     449    25.3   21.3   1.09     0.84     116    35.1   26.3   0.71     0.75
$$
    TOTAL           1511    27.0   20.7   1.27     0.77     544    34.3   22.7   1.07     0.66

Resolution (Å)             11.96    8.60    6.71    5.50    4.66    4.05    3.57    3.20   TOTAL
Acentric, number phased       13      42      76     131     187     264     349     449    1511
Acentric, mean FOM        0.5718  0.6361  0.6039  0.5449  0.4803  0.4012  0.4011  0.3663  0.4312
Centric, number phased        22      34      47      64      72      88     101     116     544
Centric, mean FOM         0.9023  0.8903  0.8073  0.8240  0.6598  0.5900  0.6631  0.6236  0.6977
All, number phased            35      76     123     195     259     352     450     565    2055
All, mean FOM             0.7796  0.7498  0.6816  0.6365  0.5302  0.4484  0.4599  0.4191  0.5018

This is some improvement over the previous refinement with one U site: the overall FOM rises from 0.491 to 0.502, and the acentric phasing power for U from 1.21 to 1.27. We will keep it.
A site not worth keeping has little or no impact on the phasing statistics (PhP, FOM) and refines to a very low occupancy or a very high B.

fft_cdf_Hg.com

$CCP4/bin/fft HKLIN lyso_nat_U_Hg_mlph.mtz \
MAPOUT temp.map << eof-f > fft_cdf.log
TITLE
LABIN F1=FPHg SIG1=SIGFPHg F2=FP SIG2=SIGFP PHI=PHImir W=FOMmir
RESO 20 3.2
EXCLUDE sig1 3 sig2 3 diff 200
END
eof-f
$CCP4/bin/peakmax mapin temp.map << eof > lyso_MIR_toget_Hg_cdf.peaks
threshold rms 3.0
#negatives
output peaks
eof
rm -f TO
#rm -f temp.map

From the peak search file:

Order No. Site Height/Rms     Grid      Fractional coordinates    Orthogonal coordinates
    1   2    2      28.44   66 22  0   0.9140  0.3068  0.0000   71.80  24.10   0.00
    2   8    3      12.07   42 59  1   0.5835  0.8171  0.0396   45.83  64.18   1.46
    3   9    6       4.04   19 63  1   0.2666  0.8757  0.0353   20.94  68.78   1.30
    4   7    5       3.71   66 53  1   0.9172  0.7388  0.0332   72.05  58.03   1.22
    5   6    4       3.29    9 52  1   0.1236  0.7248  0.0347    9.71  56.93   1.28
    6   3    1       3.17   14 30  0   0.1912  0.4160  0.0000   15.02  32.68   0.00
    7  10    7       3.10   66 18  2   0.9180  0.2503  0.0575   72.11  19.66   2.12
    8  11    0       3.03   50  6  3   0.6969  0.0836  0.0991   54.74   6.57   3.65

Looking past the existing Hg site (peak 1) and the ghost peak at the U position (peak 2), we see an additional peak at 0.27, 0.88, 0.04 that might be a real site.
Adding this site to the mlphare_mir refinement we get:

DERIV Hg
DCYCLE PHASE ALL REFCYC ALL KBOV ALL
RESO 20.00 3.20
SCALE FPH2 0.9954 -1.3833
ISOE 28.10 32.86 31.00 40.13 50.78 66.98 71.03 66.32
ATOM1 Hg 0.914 0.306 0.013 0.281 BFAC 9.637
ATREF X ALL Y ALL Z ALL OCC ALL AOCC ALL B ALL
ATOM2 Hg 0.257 0.870 0.066 0.101 BFAC 82.845
ATREF X ALL Y ALL Z ALL OCC ALL AOCC ALL B ALL

1/resol^2  Resol  Nref_a  DISO_a  LOC_a  PhP_a  CullR_a  Nref_c  DISO_c  LOC_c  PhP_c  CullR_c
$$
    0.007  11.96      13    48.8   22.0   2.92     0.45      20    55.4   23.1   2.02     0.42
    0.014   8.60      40    43.7   28.0   1.77     0.64      31    56.5   25.5   2.10     0.45
    0.022   6.71      74    45.0   24.4   2.15     0.54      45    48.1   26.1   1.78     0.54
    0.033   5.50     127    42.5   30.0   1.55     0.71      62    57.2   37.0   1.42     0.65
    0.046   4.66     184    47.8   37.6   1.30     0.79      69    57.3   48.2   0.90     0.84
    0.061   4.05     261    58.5   52.7   0.83     0.90      83    66.5   56.8   0.76     0.85
    0.078   3.57     345    60.6   53.0   0.77     0.87      94    80.2   67.0   0.65     0.83
    0.098   3.20     448    55.0   49.9   0.78     0.91     114    72.4   59.7   0.57     0.82
$$
    TOTAL           1492    54.1   45.8   0.95     0.85     518    65.3   49.9   0.87     0.76

Resolution (Å)             11.96    8.60    6.71    5.50    4.66    4.05    3.57    3.20   TOTAL
Acentric, number phased       13      42      76     131     187     264     349     449    1511
Acentric, mean FOM        0.6128  0.6551  0.6188  0.5539  0.4962  0.4084  0.4085  0.3723  0.4404
Centric, number phased        22      34      47      64      72      88     101     116     544
Centric, mean FOM         0.8853  0.8897  0.8284  0.8383  0.6728  0.5970  0.6693  0.6255  0.7049
All, number phased            35      76     123     195     259     352     450     565    2055
All, mean FOM             0.7841  0.7600  0.6989  0.6473  0.5453  0.4556  0.4670  0.4243  0.5104

Thus, this additional Hg site (despite its relatively high B factor) helps with the overall phasing: the overall FOM rises from 0.502 to 0.510, and the acentric phasing power for Hg from 0.89 to 0.95. We will keep it.

10. Calculating an electron density map

We use the F's from the protein (FP) together with our MIR phases and figures of merit. In this case we only need to apply a sigma cut-off to the FP's (we use a 2 sigma cut-off here).
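To make concrete how the figure of merit enters the map, each reflection can be thought of as contributing a Fourier coefficient of amplitude FOM x FP with the MIR phase, so poorly phased reflections are down-weighted. The sketch below works this out for a single reflection. It is an illustration of the weighting idea only, not a reproduction of fft internals; map_coef and its example numbers are hypothetical.

```shell
#!/bin/sh
# map_coef (hypothetical helper): print the real and imaginary parts of a
# figure-of-merit weighted coefficient m * F * exp(i * phi).
# Arguments: F (amplitude), m (figure of merit, 0 to 1), phi (phase, degrees).
map_coef() {
    awk -v f="$1" -v m="$2" -v phi="$3" 'BEGIN {
        rad = phi * atan2(0, -1) / 180            # degrees to radians
        printf "%.2f %.2f\n", m * f * cos(rad), m * f * sin(rad)
    }'
}

# Made-up example: amplitude 100, FOM 0.5, phase 60 degrees
map_coef 100 0.5 60
```

With FOM 0.5 the reflection enters the map at half its measured amplitude (25.00 and 43.30 for the real and imaginary parts in this example), while a perfectly phased reflection (FOM 1.0) would enter at full weight.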
fft_f.com

$CCP4/bin/fft HKLIN lyso_nat_U_Hg_mlph.mtz \
MAPOUT lyso_MIR.map << eof-f > fft_f.log
TITLE
LABIN F1=FP SIG1=SIGFP PHI=PHImir W=FOMmir
RESO 20 4.5
EXCLUDE sig1 2
XYZLIM ASU
END
eof-f
#
$CCP4/bin/mapmask MAPIN lyso_MIR.map \
MAPOUT lyso_MIR_ex.map << EOF > mapmask.log
XYZLIM -0.9 1.1 -0.9 1.1 -0.9 1.1
END
EOF
#

The fft step generates the map lyso_MIR.map, calculated over the asymmetric unit. The mapmask step then extends it so that it covers the complete unit cell, writing lyso_MIR_ex.map.