BSTR521 Winter 2011
SOLVING THE PHASE PROBLEM
2011 11 BSTR521 MIR-I V01 Page 1 Wim Hol

BIOMACROMOLECULAR CRYSTALLOGRAPHY
NATURALLY OCCURRING SOURCES; GENE CLONING & EXPRESSION
PURIFICATION
CRYSTALLIZATION
DATA COLLECTION
INITIAL PHASE DETERMINATION: MOLECULAR REPLACEMENT (MR); MULTI-WAVELENGTH ANOMALOUS DISPERSION (MAD); ISOMORPHOUS REPLACEMENT (MIR)
ELECTRON DENSITY IMPROVEMENT
MODEL BUILDING
STRUCTURE REFINEMENT
STRUCTURE ANALYSIS: EVOLUTION; VACCINE DESIGN; MECHANISMS; PROTEIN ENGINEERING; DRUG DESIGN

Solving the phase problem in protein crystallography by de novo methods
1A. The isomorphous replacement method
   a) Multiple isomorphous replacement (MIR)
   b) Single isomorphous replacement (SIR)
1B. Isomorphous replacement enhanced by anomalous scattering information
   a) Multiple isomorphous replacement and anomalous scattering (MIRAS)
   b) Single isomorphous replacement and anomalous scattering (SIRAS)
2. Multi-wavelength anomalous dispersion (MAD) methods & single-wavelength anomalous dispersion (SAD) methods
   a) Use of “intrinsic” anomalous scatterers, such as occur in metalloproteins and in SelenoMet proteins, and even sulfurs!!
   b) Use of “rational” “non-intrinsic” anomalous scatterers such as brominated nucleotides
   c) Use of heavy atom derivatives, but only one crystal needed!
3. Direct methods
   Have so far been successful "only" for small proteins with data extending beyond 1.2 Å resolution.
4. “Umweg reflections” or “Renninger effect”
   Keep one reflection permanently in the reflecting condition while rotating the crystal about its reciprocal lattice vector S. In favorable cases its intensity then varies as other reflections pass through the Ewald sphere. The variation in intensity provides information about the phase difference between the two reflections that diffract simultaneously. (See Giacovazzo, “Fundamentals of Crystallography”, pp. 191-195 (1st Ed., 1991).)
Multiple Isomorphous Replacement (MIR)
Basic idea: use a simpler problem (= finding the heavy atom sites) as a stepping stone towards solving a very complicated one (= finding all protein atoms).
Step 1. Prepare heavy atom (HA) derivatives
Step 2. Measure $|F_{PH}|$
Step 3. Scale $|F_{PH}|$ vs. $|F_P|$ (assuming $|F_P|$ has already been measured)
Step 4. Find heavy atom positions by: difference Pattersons; direct methods; difference Fouriers
Step 5. Optional: refine HA parameters without phase information
Step 6. Calculate phase probability
Step 7. Calculate $\alpha_{best}$ and figure of merit, $m$
Step 8. Optimize HA parameters and go back to Step 6 until convergence is reached
Step 9. Calculate the difference Fourier, $m\,(|F_{PH}| - |F_P|)\exp(i\alpha_{best})$, and the residual Fourier, $m\,(|F_{PH}^{obs}| - |F_{PH}^{calc}|)\exp(i\alpha_{PH}^{calc})$, to complete the HA parameter set, and optimize the phases again
Step 10. Calculate the ‘best’ Fourier:
\[ \rho(\mathbf{x}) = \frac{1}{V}\sum_{hkl} m_{hkl}\,|F_{hkl}^{obs}|\,\exp(i\alpha_{hkl}^{best})\,\exp(-2\pi i\,\mathbf{h}\cdot\mathbf{x}) \]
Note: $|F_P|$ = the structure factor amplitude of the native protein; $|F_{PH}|$ = the structure factor amplitude of the HA derivative.

MIR: CAN A HEAVY ATOM COMPOUND GIVE A MEASURABLE SIGNAL?
Crick & Magdoff derived in 1956:
\[ \frac{\Delta I}{I} = \left(\frac{2 N_H}{N_P}\right)^{1/2} \frac{f_H}{f_P} \]
where:
$N_H$ = number of heavy atoms
$N_P$ = number of protein atoms
$f_H$ = atomic scattering factor of the heavy atom
$f_P$ = (average) atomic scattering factor of a protein atom
The effect of a single mercury is surprisingly large:

Mr of protein    ΔI/I due to one Hg    ΔI/I due to four Hg
14,000           0.51                  1.02
28,000           0.36                  0.72
56,000           0.25                  0.50
112,000          0.18                  0.36
224,000          0.13                  0.26
448,000          0.09                  0.18
896,000          0.06                  0.13
1,792,000        0.045                 0.09

So even multi-million Dalton proteins are easily doable, in principle...

Intensity differences in diffraction patterns of native and derivative crystals of papain: one half of the reflections is from a native papain crystal; the other half is from a papain crystal soaked in a mercury compound.
Notice that the left-right mirror symmetry is broken but that the upper-lower symmetry is maintained, since the upper and lower reflections are crystallographically equivalent and come from the same crystal.
From: http://www.esi.umontreal.ca/~syguschj/cours/BCM6200/BCM6200_Isomorphous%20Replacement.pdf
But: the original combination of diffraction images is from Jan Drenth, University of Groningen, The Netherlands.

MIR Step 1 : Preparation of Heavy Atom Derivatives
General comments:
- Classical “Medium Long Soaks”: see two pages below.
- Modern “Quick Soaks”: see two pages below. (Even thawing a frozen crystal, soaking it in HA solution and refreezing it is a worthwhile idea: you might get an annealing benefit coupled with the HA benefit, but in many cases the HA compound damages your crystal.)
- To backsoak or not to backsoak is a delicate question.
- Co-crystallization with heavy atom compounds is sometimes spectacularly successful.
- K2PtCl4 single most successful HA compound?
- Change mother liquor, or pH, where needed or sensible.
- Beware of complexation of your beloved heavy atom compound by your beloved additive, e.g. EDTA, azide, DTT, metal ions, phosphate ions, etc.
- Know your protein perfectly. For instance, if it has a putative Ca-site, try lanthanides or barium. If there are no or very few Cys, use site-directed mutagenesis to introduce extra Cys to increase the probability of binding a Hg, Au or other HA. If there are no His, introduce a few His to increase the probability of HA binding.
Basic texts:
- Blundell & Johnson - chpt. 8 - still full of insight.
- Bernard Rupp “Biomolecular Crystallography” (2010)

MIR Step 1 : Preparation of heavy atom derivatives
TABLE 1. MOST COMMONLY CITED HEAVY-ATOM DERIVATIZING REAGENTS AS COMPILED FROM MACROMOLECULAR STRUCTURES FOR 1991-1994 (a)

Reagent                                Citations
K2PtCl4                                73
KAu(CN)2                               29
Hg(CH3COO)2                            29
Pt(NH3)2Cl2 (b)                        26
UO2(CH3COO)2                           25
HgCl2                                  25
K3UO2F5                                23
Ethyl mercurithiosalicylate (c)        22
(K/Na)AuCl4                            22
(Na/K)3IrCl6                           21
CH3CH2HgPO4                            20
K2PtCl6                                19
UO2(NO3)2                              17
K2Pt(NO2)4                             17
(CH3)3Pb(CH3COO)                       14
CH3HgCl                                13
p-Chloromercuribenzene sulfonate       13
K2Pt(CN)4                              12
PIP (d)                                12
Pb(CH3COO)2                            12
K2HgI4                                 12
Mersalyl                               12
p-Chloromercuribenzoate                11
CH3Hg(CH3COO)                          11
TAMM (e)                               10
SmCl3                                  8
K2OsO4                                 8
(K/Na)2OsCl6                           7
UO2SO4                                 6
Baker's dimercurial (f)                6
2-Chloromercuri-4-nitrophenol          6
AgNO3                                  5
CH3CH2HgCl                             5
p-Hydroxymercuribenzoate               5

(a) The reagents are ranked by the number of times they were used in MIR or SIR structure determinations, and do not necessarily reflect the quality of the derivatives. From Ref. 10.
(b) When specified, the cis isomer was used most often.
(c) Thimerosal.
(d) Di-μ-iodobis(ethylenediamine)diplatinum.
(e) Tetrakis(mercuriacetoxy)methane; C(HgOOCCH3)4.
(f) 1,4-Diacetoxymercuri-2,3-dimethoxybutane.

Reagents are listed only with regard to their frequency of use, and not by any measure of their usefulness in phasing. Some combinations of heavy atom reagents and stabilizing solution are not recommended, because the heavy atom is bound by or reacts with the stabilizing solution; these considerations are thoroughly discussed elsewhere.
From: Mark A. Rould, "Screening for Heavy-Atom Derivatives and Obtaining Accurate Isomorphous Differences", Methods in Enzymology, Vol. 276, p. 465 (1997) (Carter and Sweet, Eds.)

Long, Medium and Quick Heavy Atom soaks.
In addition to the question of which heavy-atom compounds (HACs) to use, there is the question of which concentrations to use, and how long a soak should last. The traditional procedure was to use 1 to 3 mM HAC concentrations for 1-3 days at room temperature. This “medium” approach is still a very useful initial guide. Some protein and nucleic acid crystals are very sensitive to certain HACs, in which case shorter times and lower concentrations are worth trying. If no trace of a bound HAC is seen in difference Pattersons, or in difference Fouriers, then higher concentrations should be considered. For certain Pb-containing HACs soaking times should be much longer, on the order of a few weeks.
With the advent of flash-cooling of crystals in liquid nitrogen, a new procedure is becoming popular: the “quick soak” method. Use high concentrations of HACs but only for a minute or a few minutes, or in some cases only a few seconds. There is ample evidence that small compounds can reach full occupancy in binding sites after soaking for only 10 seconds (e.g. Bosch et al., J. Med. Chem. 49, 5939-5946 (2006)). Of course, if a slow conformational change, or a slow chemical reaction, has to occur during HAC binding, then such short times might not work. The HAC concentrations in the quick-soak method are usually quite high and would often destroy the crystals if they were soaked for days, as in the traditional medium procedure. But with flash freezing, crystals can often be rescued before they are destroyed, and they then give reasonable diffraction patterns and yield good derivatives.
Below are some papers which describe initial successes with the quick-soak method, which was originally developed specifically with anomalous diffraction difference-phasing (SAD and MAD) in mind.
Dauter, Z., Dauter, M. & Rajashankar, K. R. (2000). Novel approach to phasing proteins: derivatization by short cryo-soaking with halides. Acta Crystallogr D Biol Crystallogr 56, 232-7.
Dauter, Z. & Dauter, M. (2001). Entering a new phase: using solvent halide ions in protein structure determination. Structure 9, R21-6.
Nagem, R. A., Dauter, Z. & Polikarpov, I. (2001). Protein crystal structure solution by fast incorporation of negatively and positively charged anomalous scatterers. Acta Crystallogr D Biol Crystallogr 57, 996-1002.
Nagem, R. A., Polikarpov, I. & Dauter, Z. (2003). Phasing on rapidly soaked ions. Methods Enzymol 374, 120-37.
Dauter, M. & Dauter, Z. (2006). Phase determination using halide ions. Methods Mol Biol 364, 149-58.

Noble Gases as Heavy Atom Derivatives
Although already tried out by Schoenborn, Watson & Kendrew, Nature 207, 28-30 (1965), with crystals mounted at room temperature in capillaries, the use of xenon as a heavy atom derivative has become quite popular in recent years, even at cryotemperatures. Xenon derivatives tend to be very isomorphous to the native crystals! The success rate claimed is on the order of 50%! (Maybe better: was once on the order of 50%...)
A nice example of xenon SIRAS phasing is described by Machius et al., PNAS 96, 11717-11722 (1999). In this particular case the derivative was prepared by pressurizing the crystal with 500 psi of xenon gas for 15 minutes at room temperature. The chamber was then decompressed within 15 seconds and the crystal flash-frozen in liquid propane within another 5 seconds. The xenon-derivatized crystals diffracted to 1.9 Å.
Papers describing devices to prepare xenon-derivatized crystals are:
A simple device for studying macromolecular crystals under moderate gas pressures (0.1-10 MPa). Stowell, M. H. B., Soltis, S. M., Kisker, C., Peters, J. W., Schindelin, H., Rees, D. C., Cascio, D., Beamer, L., Hart, P. J., Wiener, M. C. & Whitby, F. G. J. Appl. Cryst. 29, 608-613 (1996).
Freeze-trapping isomorphous xenon derivatives of protein crystals. Sauer, O., Schmidt, A. & Kratky, C. J. Appl. Cryst. 30, 476-486 (1997).
A cell for producing xenon-derivative crystals for cryocrystallographic analysis. Djinovic-Carugo, K., Everitt, P. & Tucker, P. A. J. Appl. Cryst. 31, 812-814 (1998).
Note: Xenon has an interesting but not large anomalous signal, which is nevertheless worth measuring. Krypton generally binds at the same sites as xenon, but has a significantly larger anomalous signal at λ = 0.87 Å and is therefore very interesting from a MAD-phasing perspective, even though krypton seems to require higher pressures than xenon. Relevant krypton papers are:
High-pressure krypton gas and statistical heavy-atom refinement: a successful combination of tools for macromolecular structure determination. Schiltz, M., Shepard, W., Fourme, R., Prangé, T., De La Fortelle, E. & Bricogne, G. Acta Cryst. D53, 78-92 (1997).
MAD phasing with krypton. Cohen, A., Ellis, P., Kresge, N. & Soltis, S. M. Acta Cryst. D57, 233-238 (2001).

MIR Step 2 : MEASURE DERIVATIVE DATA
MIR Step 3 : SCALING DERIVATIVE AND NATIVE DATA
* Must be done with very great care and with thorough analysis of the results, since everything depends on small differences between $F_P$ and $F_{PH}$.
* Analysis of the fall-off of $F_P$ and $F_{PH}$ in 3 (almost) perpendicular directions is useful to detect suspicious differences in fall-off. (Look e.g. at the TRUNCATE output in the CCP4 suite of programs.)
* Anisotropic scaling of $F_{PH}$ versus $F_P$ is the minimum one should do.
* Local scaling can sometimes do wonders.
* Scan for differences as a function of $F_P$: saturation effects might cause problems.
* Be aware that one or two outrageous differences can make your difference Patterson look like a checkerboard or a mountain ridge: always check for outliers in the $(F_{PH} - F_P)^2$ list.

MIR Step 4 : LOCATING HEAVY ATOM POSITIONS
This is of course a crucial next step. If this fails, none of the later steps apply.
Be aware that in tough cases one needs to exploit small differences in intensities, and hence great care in the measurement step is essential. The highest resolution might not be best, since radiation damage then takes its toll. High resolution is not really required for this particular step since in general the sites are far apart (of course not always!). For a 480 kDa case, 4.5 Å resolution was OK to find 30 and 70 sites in two derivatives.
Currently there are several powerful computational procedures to find at least a subset of the HA sites, including:
- Direct methods
  Shake-and-Bake (or SnB) is among the most powerful of these. A similar method, combined with some Patterson information, is used by SHELXD.
- Patterson procedures
  - By “hand”: sometimes works like a dream, in simple cases...
  - Superposition methods
  - Vector search methods
  The program SOLVE uses Patterson methods based on a previous program, HASSP.
Distinguish:
- No local symmetry present; e.g., only one subunit per asymmetric unit.
- Local (non-crystallographic) symmetry present. Distinguish:
  - polar local symmetry (e.g. “cyclic” such as 3, 5, 6, also called C3, C5, C6, etc.)
  - non-polar local symmetry (e.g. “dihedral” 222, 32, also called D2, D3, etc.; or “cubic”, i.e. tetrahedral (T), octahedral (O), or icosahedral (I))
  - helical symmetry
  - irregular (or indecent) symmetry where e.g. you have several trimers but with non-intersecting axes... can be very tricky.

MIR Step 4 : DIFFERENCE PATTERSON
The ideal Patterson from which to derive heavy atom positions would have coefficients $F_H^2$. However, the only measurements available are $F_P$ and $F_{PH}$. Therefore take $|F_{PH} - F_P|$ as an estimate of $F_H$. In many cases this will be a poor estimate, i.e. in cases when $\mathbf{F}_H \perp \mathbf{F}_P$. However, the estimate is excellent when $\mathbf{F}_H \parallel \mathbf{F}_P$, i.e. when $\alpha_H = \alpha_P$ and also when $\alpha_H = \alpha_P + \pi$.
How well will $(F_{PH} - F_P)^2$ approximate $F_H^2$ on average?
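A quick numerical check of this question is possible before working through the algebra. The sketch below averages $(F_{PH} - F_P)^2$ over evenly spaced phase differences between $\mathbf{F}_H$ and $\mathbf{F}_P$. It assumes error-free, perfectly isomorphous data; the amplitudes (100 and 10) and the function name are arbitrary choices made for illustration only.

```python
import cmath
import math


def mean_square_iso_difference(f_p, f_h, n_angles=3600):
    """Average (|F_PH| - |F_P|)^2 over evenly spaced values of alpha_H - alpha_P."""
    total = 0.0
    for i in range(n_angles):
        delta = 2.0 * math.pi * i / n_angles
        # Put F_P on the real axis; F_PH is the vector sum F_P + F_H.
        f_ph = abs(f_p + f_h * cmath.exp(1j * delta))
        total += (f_ph - f_p) ** 2
    return total / n_angles


# Assumed amplitudes: a heavy-atom contribution that is small compared with F_P.
f_p_amplitude, f_h_amplitude = 100.0, 10.0
ratio = mean_square_iso_difference(f_p_amplitude, f_h_amplitude) / f_h_amplitude ** 2
print(f"<(F_PH - F_P)^2> / F_H^2 = {ratio:.3f}")
```

For a heavy-atom contribution much smaller than $F_P$ the ratio comes out close to 1/2, so on average the difference Patterson coefficients carry roughly half of $F_H^2$ in this idealized setting.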
MIR Step 4 : DIFFERENCE PATTERSON
(Points A, C, D and E refer to the vector diagram on the slide: AC = $F_{PH}$, AE = $F_P$, and AD and DC are the projections of $\mathbf{F}_P$ and $\mathbf{F}_H$ onto $\mathbf{F}_{PH}$.)
\[ F_{PH} - F_P = CE = AC - AE = AD + DC - AE \tag{1} \]
\[ = F_P\cos(\alpha_P - \alpha_{PH}) + F_H\cos(\alpha_{PH} - \alpha_H) - F_P \tag{2} \]
\[ = F_H\cos(\alpha_{PH} - \alpha_H) - F_P\,[1 - \cos(\alpha_{PH} - \alpha_P)] \tag{3} \]
\[ = F_H\cos(\alpha_{PH} - \alpha_H) - 2F_P\sin^2\!\left(\frac{\alpha_{PH} - \alpha_P}{2}\right) \tag{4} \]
\[ (F_{PH} - F_P)^2 = F_H^2\cos^2(\alpha_{PH} - \alpha_H) + 4F_P^2\sin^4\!\left(\frac{\alpha_{PH} - \alpha_P}{2}\right) - 4F_PF_H\sin^2\!\left(\frac{\alpha_{PH} - \alpha_P}{2}\right)\cos(\alpha_{PH} - \alpha_H) \tag{5} \]
\[ = F_H^2\cos^2(\alpha_{PH} - \alpha_H) + \text{noise} \tag{6} \]
\[ = \tfrac{1}{2}F_H^2 + \tfrac{1}{2}F_H^2\cos 2(\alpha_{PH} - \alpha_H) + \text{noise} \tag{7} \]
\[ = \tfrac{1}{2}F_H^2 + \text{theoretical noise} \tag{8} \]
In practice, add measuring errors & non-isomorphism:
\[ (F_{PH} - F_P)^2 \equiv \Delta F_{iso}^2 = \tfrac{1}{2}F_H^2 + \text{theoretical noise} + \text{measurement noise} + \text{non-isomorphous noise} \]

MIR Step 4
Fig. 3. The Harker section at w = 1/3 from the difference Patterson function for the K2OsCl6 single-site derivative of porcine growth hormone (16). Peaks are contoured at equal intervals with the first contour at one standard deviation of the entire map.

Table 2. Position of Interatomic Vectors Representing Symmetry-Related Heavy Atoms in a Unit Cell of Space Group P3221

Number   Vector position (u, v, w)     Symmetry operations used
1        (x+y, -x+2y, 1/3)             (x, y, z) and (-y, x-y, z+2/3)
2 (a)    (2x-y, x+y, 2/3)              (x, y, z) and (-x+y, -x, z+1/3)
3        (x-y, -x+y, 2z)               (x, y, z) and (y, x, -z)
4        (y, 2y, 2z+2/3)               (x, y, z) and (x-y, -y, -z+1/3)
5        (2x, x, 2z+1/3)               (x, y, z) and (-x, -x+y, -z+2/3)

(a) Vector position number 2 is related to that of number 1 by symmetry, thus it is not unique.
Ref: Sherin S. Abdel-Meguid (1996) “Structure Determination Using Isomorphous Replacement”. In: C. Jones, B. Mulloy, and M. R. Sanderson (Eds.) Methods in Molecular Biology, Volume 56: Crystallographic Methods and Protocols, Humana Press, Totowa, New Jersey, pp. 153-171.

MIR Step 4 : Solving a Patterson by “hand”
Space group P21, with equivalent positions (x, y, z) and (-x, y+1/2, -z). I.e. in Harker section v = 1/2:
u = 2x, so x = u/2
w = 2z, so z = w/2
But x & z with respect to which two-fold screw axis?
For the first position: take your choice, and thereby define the origin not only in x & z but also in y.
For the second position: use cross vectors (1-2) to obtain coordinates relative to the same origin as the first position.
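The P21 Harker relations u = 2x and w = 2z can be illustrated with a short sketch. It is only an illustration under stated assumptions: the Harker peak coordinates are made up, the function names are hypothetical, and only the x and z components are treated (the y component of the second site would follow from the v coordinate of the cross vector).

```python
from itertools import product


def harker_candidates_p21(u, w):
    """Candidate (x, z) pairs for a Harker peak (u, 1/2, w) in P2_1.

    From u = 2x and w = 2z, x and z are each fixed only modulo 1/2,
    which corresponds to the choice among the 2_1 axes in the cell.
    """
    xs = ((u / 2.0) % 1.0, (u / 2.0 + 0.5) % 1.0)
    zs = ((w / 2.0) % 1.0, (w / 2.0 + 0.5) % 1.0)
    return list(product(xs, zs))


def cross_vector_xz(site1, site2):
    """(u, w) components of the vector from site 1 to site 2, reduced to [0, 1)."""
    return tuple(round((b - a) % 1.0, 4) for a, b in zip(site1, site2))


# Hypothetical Harker peaks (u, w) on section v = 1/2 for two sites.
site1 = harker_candidates_p21(0.30, 0.44)[0]   # free choice: fixes the origin in x and z
site2_options = harker_candidates_p21(0.62, 0.10)
cross_vectors = [cross_vector_xz(site1, option) for option in site2_options]

for option, vector in zip(site2_options, cross_vectors):
    print("site 2 candidate", option, "-> expected cross vector (u, w)", vector)
```

Each candidate for the second site predicts a different cross vector, so checking which predicted vector is actually present in the Patterson selects the candidate that shares the origin of the first site.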
For each of the potential positions for site #2, the cross vector #1 to #2 is different. Hence the cross vectors determine the position of site #2, once site #1 is placed.

MIR Step 4 : Solving a Patterson by “hand”
Space group P212121. Patterson symmetry: Pmmm.
Positions:
1 : x,      y,      z
2 : 1/2-x,  -y,     z+1/2
3 : -x,     y+1/2,  -z+1/2
4 : x+1/2,  -y+1/2, -z
Vectors:
2 - 1 : 1/2-2x,  -2y,      1/2        Harker w = 1/2
3 - 1 : -2x,     1/2,      -2z+1/2    Harker v = 1/2
4 - 1 : 1/2,     -2y+1/2,  -2z        Harker u = 1/2
Therefore, if Harker section w = 1/2 contains a peak at (u1, v1, 1/2), and Harker section v = 1/2 contains a peak at (1/2-u1, 1/2, w1), and Harker section u = 1/2 contains a peak at (1/2, 1/2-v1, 1/2-w1), then (u1, v1, w1) gives (x1, y1, z1) of position 1, by using the 2-1, 3-1 and 4-1 equations for the vectors given above, but going from u, v, w to x, y, z.
Again: define the origin with this first position & use cross vectors to set the other positions relative to the same origin.
The figure on the slide shows the symmetry operations of the frequently occurring space group P212121 (see the International Tables for Crystallography for a full explanation). This space group is surprisingly complex.

MIR Step 5 : Refinement of Heavy Atom Parameters without phase information.
\[ Q_R = \sum_{hkl} W_{hkl}\,\bigl(\,|F_{P,obs} - F_{PH,obs}| - |F_{H,calc}|\,\bigr)^2 \]
with:
\[ F_{H,calc} = \sum_j f_j\,Z_j\,\exp\!\left(-B_j\,\frac{\sin^2\theta}{\lambda^2}\right)\exp\bigl[2\pi i\,(hx_j + ky_j + lz_j)\bigr] \]
and:
$x_j, y_j, z_j$ = positional parameters of atom j;
$B_j$ = isotropic temperature factor (sometimes anisotropic);
$Z_j$ = occupancy (usually highly correlated with $B_j$);
$W_{hkl} = (F_{PH} - F_P)^2$, to give large differences most weight.
$Q_R$ is quite weak in removing erroneous positions (the dreadful model-bias problem).
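To make the quantities in Step 5 concrete, here is a minimal sketch of $F_{H,calc}$ and $Q_R$. It is not the refinement program used in the course: it assumes space group P1 (so no symmetry mates are generated), a constant scattering factor per site instead of an angle-dependent one, and idealized synthetic data in which $\mathbf{F}_H$ is parallel to $\mathbf{F}_P$. All names and numerical values are hypothetical.

```python
import cmath
import math


def fh_calc(hkl, sites, s2):
    """Complex F_H,calc for one reflection.

    sites: list of (x, y, z, B, occupancy, f) tuples; s2 = sin^2(theta) / lambda^2.
    """
    h, k, l = hkl
    total = 0j
    for x, y, z, b_iso, occ, f in sites:
        phase = 2.0 * math.pi * (h * x + k * y + l * z)
        total += f * occ * math.exp(-b_iso * s2) * cmath.exp(1j * phase)
    return total


def q_r(reflections, sites):
    """Q_R = sum of W * (|F_P - F_PH| - |F_H,calc|)^2 with W = (F_PH - F_P)^2."""
    q = 0.0
    for hkl, s2, f_p, f_ph in reflections:
        diff = abs(f_ph - f_p)
        q += diff ** 2 * (diff - abs(fh_calc(hkl, sites, s2))) ** 2
    return q


# Two hypothetical heavy-atom sites and a copy with one site displaced in x.
true_sites = [(0.10, 0.20, 0.30, 20.0, 1.0, 78.0), (0.40, 0.15, 0.35, 20.0, 1.0, 78.0)]
shifted_sites = [true_sites[0], (0.45, 0.15, 0.35, 20.0, 1.0, 78.0)]

# Idealized "observations": F_PH = F_P + |F_H| for every reflection.
reflections = []
for hkl, s2 in [((1, 0, 0), 0.002), ((0, 2, 1), 0.006), ((3, 1, 2), 0.015)]:
    f_p = 100.0
    f_ph = f_p + abs(fh_calc(hkl, true_sites, s2))
    reflections.append((hkl, s2, f_p, f_ph))

q_true = q_r(reflections, true_sites)
q_shifted = q_r(reflections, shifted_sites)
print(q_true, q_shifted)
```

With these synthetic data the correct sites give a residual of essentially zero and the displaced site raises it, which is the behaviour a least-squares refinement of positions, occupancies and B-factors relies on.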
Hence Terwilliger & Eisenberg (1983) introduced the origin-removed version of $Q_R$, called here $Q_{TE}$:
\[ Q_{TE} = \sum_{hkl} W_{hkl}\,\Bigl\{\bigl[(F_{P,obs} - F_{PH,obs})^2 - \langle (F_{P,obs} - F_{PH,obs})^2 \rangle\bigr] - k\,\bigl[\,|F_{H,calc}|^2 - \langle |F_{H,calc}|^2 \rangle\bigr]\Bigr\}^2 \]
k = 1 for centric, 1/2 for acentric reflections
$W_{hkl}$ = weight factor, quite complex in $Q_{TE}$.
Note: $Q_R$ can incorporate anomalous dispersion information, and is then called “FHLE-refinement.”

MIR I - APPENDIX
Solving a Patterson by “hand” in the presence of local symmetry.
Hemocyanin: space group P21. Particle point group 32, also called D3. The rotation function showed that the local 3-fold runs approximately parallel to the crystallographic 2-fold screw axis.
This was one of the derivatives in the structure determination of Panulirus interruptus hemocyanin, which is a hexamer with 6 subunits of ~75 kDa each, about 450 kDa in total. The K2PtCl4 difference Patterson yielded 4 sites found by hand. The total set was eventually 30 sites. JMB 158, 457-483 (1982).

MIR Step 4 : Hemocyanin (ctd.)
(Hexamer, with point group 32, in the asymmetric unit of space group P21, with the local 3-fold parallel to 21.)
In the top figure, left: a horizontal line means a triangle of HA sites. Δ is the distance between two planes of triangles with HA sites. On the right, a horizontal line means a plane with the endpoints of vectors between two triangles of HA sites shown at the left.
[Figure: heavy atom arrangement viewed perpendicular to the 21 (= b) axis, with the corresponding Patterson showing the Harker, pseudo-Harker and local Harker sections; below, heavy atom sites viewed parallel to the 2-fold screw axis.]

MIR Step 4 : Hemocyanin (ctd.)
* In the $(F_{K_2PtCl_4} - F_{native})^2$ Patterson a major peak occurred at (u = 0, v = 0.14, w = 0)!
* This suggested that two “heavy atom triangles” would be eclipsed!
* If this were the case then:
  (1) the vector set of the “local Harker” v = 0.14 should be similar to the “local Harker” v = 0;
  (2) the “pseudo Harker” v = 1/2 - 0.14 = 0.36 should share features with the true Harker v = 1/2.
[Figure: view of the arrangement of the major Pt-sites in P. interruptus hemocyanin down the crystallographic two-fold.]

MIR Step 4 : What section v = 0.14 of the K2PtCl4 difference Patterson of hemocyanin actually looked like: (a)

MIR Step 4 : Hemocyanin - Section v = 0.00 of the K2PtCl4 difference Patterson. (b)

MIR Step 4 : Hemocyanin - K2PtCl4 difference Patterson. Harker section v = 1/2. (c)

MIR Step 4 : Hemocyanin - K2PtCl4 difference Patterson, section v = 0.36. In this special case this is a “pseudo Harker section”, since 0.36 = 0.50 - 0.14. (d)
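The relation between the peak at v = 0.14 and the four sections shown above can be illustrated with a small sketch. It places two hypothetical, eclipsed triangles of sites 0.14 apart along b in P21 and lists the v-sections on which interatomic vectors fall. The (x, z) values and the starting height along y are invented for illustration and are not the hemocyanin coordinates; the function names are likewise hypothetical.

```python
from itertools import product


def p21_mate(site):
    """Symmetry mate of (x, y, z) in P2_1 with b unique: (-x, y + 1/2, -z)."""
    x, y, z = site
    return (-x, y + 0.5, -z)


def fold_v(v):
    """Map a Patterson v coordinate into [0, 1/2] using translation and the mirror v -> -v."""
    v = v % 1.0
    return round(min(v, 1.0 - v), 2)


# Hypothetical triangle of sites, repeated 0.14 higher along y (eclipsed triangles).
triangle_xz = [(0.10, 0.20), (0.30, 0.05), (0.25, 0.35)]
y_low, delta = 0.10, 0.14
sites = [(x, y_low, z) for x, z in triangle_xz]
sites += [(x, y_low + delta, z) for x, z in triangle_xz]

# All atoms in the cell, then the v coordinate of every interatomic vector.
all_atoms = sites + [p21_mate(site) for site in sites]
v_sections = sorted({fold_v(b[1] - a[1]) for a, b in product(all_atoms, repeat=2)})
print(v_sections)
```

Under these assumptions the vectors fall only on the sections v = 0 and v = 0.14 (vectors within and between the eclipsed triangles), v = 0.5 (true Harker vectors) and v = 0.36 (cross vectors between one triangle and the screw-related copy of the other), which is the pattern described for the K2PtCl4 derivative.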