Bio-SAXS: Absolute Units, Mass Retrieval, Globularity, Distribution Function
Javier Pérez, Beamline SWING, Synchrotron SOLEIL, Saint-Aubin, France
BioSAXS Workshop, Honolulu, July 2013

Mass retrieval from Guinier analysis
Prof. André Guinier, 1911-2000, Orsay, France
Guinier law: $I(Q) \approx I(0)\,\exp\left(-\frac{Q^2 R_g^2}{3}\right)$
Intensity at the origin, in absolute units (cm-1): $I(0) = \frac{c\,M\,r_0^2}{N_A}\left[v_p\,(\rho_{prot} - \rho_{buf})\right]^2$
Radius of gyration: $R_g^2 = \frac{\int_V \left(\rho_{prot}(\mathbf{r}) - \rho_{buf}\right) r^2\,d\mathbf{r}}{\int_V \left(\rho_{prot}(\mathbf{r}) - \rho_{buf}\right) d\mathbf{r}}$
Here r0 is the classical electron radius, c the mass concentration, NA Avogadro's number, (ρprot - ρbuf) the electronic density contrast and vp the protein specific volume.
• I(0) gives an independent estimate of the molar mass of the protein (only if the mass concentration, c, is precisely known).
• Rg depends on the volume AND on the shape of the particle.
• For globular proteins: Rg (Å) ≈ 6.25 × M^(1/3), M in kDa.
• For unfolded proteins: Rg (Å) ≈ 8.05 × M^0.522, M in kDa. Bernadó et al. (2009), Biophys. J., 97 (10), 2839-2845.
• Typically: M (kDa) = 1500 × I0 (cm-1) / C (mg/ml).

Calibration of the set-up using water scattering (SWING)
Liquid scattering (theory): I(Q) is constant at small Q.
$I_{H2O,theory} = r_0^2\,\rho^2\,Z^2\,k T\,\chi_T = 0.0163$ cm-1
with ρ the molecular density, Z the number of electrons per molecule and χT the isothermal compressibility.
Water is used as the primary reference to set the absolute intensity scale.
• Capillary diameter = 1.6 mm
• Average of 2 frames of 2 s
• Empty capillary subtracted
• Normalized by solid angle
• Normalized by transmitted intensity
Example: IH2O,exp = 0.042 exp. units
IH2O,exp = Kexp × IH2O,theory
Here: Kexp = 2.56 exp. units / cm-1
For any sample in that capillary: Iabs (cm-1) = Iexp / Kexp = Iexp / 2.56

Example of mass retrieval from Guinier analysis
Hen egg-white lysozyme, M = 14.3 kDa
• C = 5.6 g/l
• Average of 8 frames of 2 s
• Buffer subtracted
• Normalized by solid angle
• Normalized by transmitted intensity
[Figure: Guinier plot, ln I(Q) versus Q²]
$\ln I(Q) = \ln I(0) - \frac{R_g^2}{3}\,Q^2$
Rg = 15.1 ± 0.03 Å
Iexp(0) = 0.0543 cm-1
The mass follows from I(0), provided the set-up was calibrated to give I(Q) in absolute units (cm-1).
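The water calibration and the Guinier extrapolation described above can be sketched numerically. This is a minimal sketch, not the beamline's reduction code: the function names are hypothetical, the scattering curve is synthetic and noise-free, and the fitting range Q·Rg ≤ 1.3 is an assumed, commonly used limit rather than a value from the slides. Only the numbers 0.0163, 0.042, 15.1, 0.0543, 5.6 and 1500 come from the slides.

```python
import numpy as np

I_WATER_THEORY = 0.0163  # cm^-1, theoretical water plateau quoted on the slide


def calibration_factor(i_water_exp, i_water_theory=I_WATER_THEORY):
    """Return K_exp (experimental units per cm^-1) from the measured water plateau."""
    return i_water_exp / i_water_theory


def guinier_fit(q, intensity):
    """Fit ln I(Q) = ln I(0) - (Rg^2 / 3) Q^2 by linear least squares.

    Returns (Rg, I0) in the units of 1/q and of `intensity`.
    """
    slope, intercept = np.polyfit(q ** 2, np.log(intensity), 1)
    if slope >= 0:
        raise ValueError("Guinier slope is not negative; Rg is undefined")
    return float(np.sqrt(-3.0 * slope)), float(np.exp(intercept))


def mass_from_i0(i0_abs, conc_mg_per_ml, factor=1500.0):
    """Rule of thumb from the slide: M (kDa) = 1500 * I0 (cm^-1) / C (mg/ml)."""
    return factor * i0_abs / conc_mg_per_ml


# Water calibration with the rounded numbers printed on the slide.
# (0.042 / 0.0163 is about 2.58; the slide quotes 2.56.)
k_exp = calibration_factor(0.042)

# Synthetic, noise-free lysozyme-like curve in experimental units (assumption).
rg_true, i0_true = 15.1, 0.0543            # Angstrom, cm^-1
q = np.linspace(0.01, 1.3 / rg_true, 40)   # Angstrom^-1, keeps Q*Rg <= 1.3 (assumed limit)
i_exp = k_exp * i0_true * np.exp(-(q * rg_true) ** 2 / 3.0)

# Convert to absolute units, then run the Guinier fit.
i_abs = i_exp / k_exp
rg_fit, i0_fit = guinier_fit(q, i_abs)
mass_kda = mass_from_i0(i0_fit, conc_mg_per_ml=5.6)

print(f"K_exp = {k_exp:.2f} exp. units per cm^-1")
print(f"Rg = {rg_fit:.1f} A, I(0) = {i0_fit:.4f} cm^-1, M = {mass_kda:.1f} kDa")
```

With real data, the fit would be restricted to the points that actually satisfy the Guinier condition, and the uncertainty on c would carry over directly to the mass.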
Mexp (kDa) = Iexp(0) × 1500 / c, Mexp = 14.6 kDa
From Rg, supposing the protein is globular: MRg (kDa) = (Rg / 6.3)^3, MRg = 13.8 kDa

Kratky Plot
Prof. Otto Kratky, 1902-1995, Graz, Austria
SAXS provides a sensitive means to evaluate the degree of compactness of a protein:
o To determine whether a protein is globular, extended or unfolded
o To monitor the folding or unfolding transition of a protein
This is most conveniently represented using the so-called Kratky plot: Q² I(Q) versus Q
• Folded particle: bell-shaped curve (asymptotic behaviour I(Q) ~ Q^-4)
• Random polymer chain: plateau at large Q values (asymptotic behaviour I(Q) ~ Q^-2)
• Extended polymer chain: increase at large Q values (asymptotic behaviour I(Q) ~ Q^-1.x)

Kratky Plots of folded proteins
[Figure: Kratky plots, Q² I(Q) / I(0) versus Q, for G-Actin, ASNP, ASDG, CDA2 and BCDA3]
Folded proteins display a bell shape. Can we go further?

Dimensionless Kratky Plots of folded proteins
Introduced for biology in Durand et al. (2010), J. Struct. Biol. 169, 45-53.
The relation MRg (kDa) ≈ (Rg / 6.3)^3 only works for globular structures, not for elongated ones.
For globular structures, the dimensionless Kratky plots (DLKPs) collapse onto the same maximum.
[Figure: dimensionless Kratky plots, (QRg)² I(Q) / I(0) versus QRg, with the common maximum marked at QRg = 1.75 and (QRg)² I(Q) / I(0) = 1.1]
Protein    Rg (Å)    Mass (kDa)
G-Actin    23.2      41.7
ASNP       26.0      71.4
ASDG       35.6      146.6
CDA2       39.1      98.9
BCDA3      51.7      144.4
The maximum value of the dimensionless bell shape tells whether the protein is globular.

Dimensionless Kratky Plots of (partially) unfolded proteins
Receveur-Bréchot V. and Durand D. (2012), Curr. Protein Pept. Sci., 13, 55-75.
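The dimensionless Kratky transform used in these plots can be sketched in a few lines. This is an illustration under stated assumptions, not the analysis code behind the figures: the function names are hypothetical, a compact particle is represented by the Guinier form (applied well beyond its range of validity, only to locate the maximum) and an unfolded chain by the Debye function of an ideal Gaussian chain. For the Guinier form the maximum falls at Q·Rg = √3 ≈ 1.73 with height 3/e ≈ 1.10, consistent with the 1.75 and 1.1 marks on the plots.

```python
import numpy as np


def dimensionless_kratky(q, intensity, rg, i0):
    """Return (Q*Rg, (Q*Rg)^2 * I(Q) / I(0)) for a scattering curve."""
    x = np.asarray(q) * rg
    y = x ** 2 * np.asarray(intensity) / i0
    return x, y


def guinier_globule(q, rg, i0=1.0):
    """Guinier form, used here as a stand-in for a compact particle (assumption)."""
    return i0 * np.exp(-(q * rg) ** 2 / 3.0)


def debye_chain(q, rg, i0=1.0):
    """Debye function of an ideal Gaussian chain (assumed model for 'unfolded')."""
    u = (q * rg) ** 2
    return i0 * 2.0 * (np.exp(-u) + u - 1.0) / u ** 2


rg = 23.2                                  # Angstrom, G-Actin value from the slide
q = np.linspace(0.01, 10.0, 2000) / rg     # covers Q*Rg from 0.01 to 10

x_glob, y_glob = dimensionless_kratky(q, guinier_globule(q, rg), rg, 1.0)
x_chain, y_chain = dimensionless_kratky(q, debye_chain(q, rg), rg, 1.0)

peak = np.argmax(y_glob)
print(f"globule: maximum {y_glob[peak]:.2f} at Q*Rg = {x_glob[peak]:.2f}")
print(f"chain:   value {y_chain[-1]:.2f} at Q*Rg = {x_chain[-1]:.0f}")
```

Because both axes are normalized by Rg and I(0), curves from proteins of different sizes can be overlaid directly, which is what makes the position and height of the maximum a usable globularity criterion.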
[Figure: dimensionless Kratky plots, (qRg)² I(q) / I(0) versus qRg, for PolX, p47, p67, XPC and IB5, ranging from globular to unfolded; reference marks at qRg = 1.75 and (qRg)² I(q) / I(0) = 1.1]
The bell shape vanishes as folded domains disappear and flexibility increases. The curve increases at large Q as the structure extends.

Kratky Plot: NCS heat unfolding
Caution: in practice, thin Gaussian chains do not exist. In spite of the plateau at T = 76 °C, unfolded NCS is not a Gaussian chain but a thick chain with persistence length.
Pérez et al., J. Mol. Biol. (2001), 308, 721-743.

Cytochrome c folding kinetics
[Figure: panels recorded 160 µs and 44 ms after mixing]
S. Akiyama et al. (2002), PNAS, 99, 1329-1334.
ApoMb: T. Uzawa et al. (2004), PNAS, 101, 1171-1176.

Porod Invariant and Porod Volume
Porod law: for a homogeneous object, I(Q) ~ Q^-4 at high Q values.
Porod invariant: $\mathcal{Q} = \int_0^\infty I(Q)\,Q^2\,dQ = 2\pi^2\,r_e^2\,(\Delta\rho)^2\,\varphi\,(1 - \varphi)$
with Δρ = ρprot - ρbuf the electronic density contrast.
• calculated from experimental data in absolute units
• does not depend on shape, only on contrast
Protein volume fraction: $\varphi = N\,V_{obj} / V$, with N the number of proteins, Vobj the protein volume and V the solution volume.
Since $I(0) = r_e^2\,(\Delta\rho)^2\,\varphi\,V_{obj}$, then $V_{obj} = 2\pi^2\,\frac{I(0)}{\mathcal{Q}}$
• valid for dilute systems
• does not require absolute units
$I(0) / \mathcal{Q}$ therefore gives an independent estimate of the volume of the protein.
But:
• requires that the Porod law is fulfilled
• not valid for unfolded proteins

Porod Invariant
$\mathcal{Q} = \int_0^\infty I(Q)\,Q^2\,dQ$
Kratky representation: I(Q)·Q² versus Q. The Porod invariant is the integral of this curve.
Program Primus, ATSAS suite of programs: www.embl-hamburg.de/biosaxs/software.html

Molecular Weight estimation based on Porod invariant
http://www.ifsc.usp.br/~saxs/saxsmow.html
• does not require knowledge of concentration
• relies on Porod volume theory + structural database
• does not work for proteins with unfolded domains
Recent methods for MW estimation, based on related but distinct grounds, were
developed. Track B. Rambo R. and Tainer J. (2013), Nature, 496, 477-481.

Distance Distribution Function p(r)
The distance distribution function p(r) is proportional to the average number of atoms at a given distance, r, from any given atom within the macromolecule.
[Figure: p(r) versus r, up to Dmax, for a cylinder, a solid sphere, a disc, domains and a protein]
The pair distribution function characterises the shape of the particle in real space.

Relation between p(r) and I(Q)
Intensity is the Fourier transform of the self-correlation function γobj(r) (Fourier transform for isotropic samples):
$I(Q) = 4\pi\,r_e^2\,\varphi \int_{V_{obj}} \gamma_{obj}(r)\,r^2\,\frac{\sin Qr}{Qr}\,dr$
It can be shown that: $p(r) = \gamma_{obj}(r)\,r^2$
Then: $I(Q) = 4\pi\,r_e^2\,\varphi \int_0^{D_{max}} p(r)\,\frac{\sin Qr}{Qr}\,dr$
And: $p(r) = \frac{r^2}{2\pi^2\,\varphi\,r_e^2} \int_0^\infty Q^2\,I(Q)\,\frac{\sin Qr}{Qr}\,dQ$
p(r) could be directly derived from I(Q). Both curves contain the same information.
However, direct calculation of p(r) from I(Q) is made difficult and risky by the truncation to [Qmin, Qmax] and by data noise.

Back-calculation of the Distance Distribution Function
Glatter, O., J. Appl. Cryst. (1977), 10, 415-421. Prof. Otto Glatter, Guinier Prize 2012, Graz, Austria
Main hypothesis: the particle has a "finite" size, characterised by Dmax.
• Dmax is proposed by the user
• p(r) is expressed over [0, Dmax] as a linear combination of orthogonal functions: $p_{theoret}(r) = \sum_{n=1}^{M} c_n\,\phi_n(r)$
• I(Q) is calculated by Fourier transform of ptheoret(r): $I(Q) = 4\pi\,r_e^2\,\varphi \int_0^{D_{max}} p_{theoret}(r)\,\frac{\sin(Qr)}{Qr}\,dr$
Svergun (1988): program "GNOM". Dr. Dmitri Svergun, Hamburg, Germany
• M ~ 30 - 100: ill-posed least-squares problem, handled by a regularisation method
• + "Perceptual criteria": smoothness, stability, absence of systematic deviations
• Each criterion has a predefined weight
• The solution is given a score calculated by comparison with "ideal values"

Distance Distribution Function: experimental examples
Heat denaturation of Neocarzinostatin; GBP1
Pérez et al., J.
Mol. Biol. (2001) 308, 721-743

Distance Distribution Function: experimental examples
Bimodal distribution: Topoisomerase VI
[Figure: P(r) / I(0) versus r (Å), from 0 to 250 Å, with a distance of 70 Å marked]
M. Graille et al., Structure (2008), 16, 360-370.

Distance Distribution Function
Scattering curves obtained on different Spire-Actin complexes and on Actin alone
• Organization of actin oligomers
Complexes:
Radius of gyration    Maximum diameter
75.5 Å                285 Å
55.5 Å                210 Å
38.9 Å                130 Å
25 Å                  75 Å
23.1 Å                70 Å
Histograms of intramolecular distances and ab initio molecular envelopes determined using DAMMIF
[Figure: P(r) versus r (in Å) panels: KindABCD-A4 (Dmax = 285), BCD-A3 (Dmax = 210), CD-A2 (Dmax = 130), D1-A1 (Dmax = 75)]

Distance Distribution Function
The radius of gyration and the intensity at the origin can be derived from p(r) using the following expressions:
$R_g^2 = \frac{\int_0^{D_{max}} r^2\,p(r)\,dr}{2 \int_0^{D_{max}} p(r)\,dr}$ and $I(0) = 4\pi\,r_e^2\,\varphi \int_0^{D_{max}} p(r)\,dr$
This alternative estimate of Rg makes use of the whole scattering curve, and is less sensitive to interactions or to the presence of a small fraction of oligomers.
Comparing the estimates from Guinier analysis and from p(r) is a useful cross-check.

To what extent does SAXS give information on flexible proteins?
The example of IB5
• Salivary, proline-rich protein, 70 residues (pink)
• Intrinsically Disordered Protein (IDP)
First analyzed as a thick worm-like polymer chain: L = 190 Å, b = 30 Å, Rc = 2.7 Å, Rg = 30 Å
Rg is larger than for a usual IDP: Rg = 2.54 × n^0.522 Å gives Rg = 23 Å for n = 70 residues
H. Boze et al., Biophys. J. (2010), 99, 656-665
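The comparison made for IB5, between the measured Rg and the value expected for a typical unfolded chain of the same length, is simple arithmetic and can be checked as follows. This is only a sketch: the function names are hypothetical, and the average residue mass of 0.11 kDa used to compare the residue-based law with the mass-based law quoted earlier in the talk (Rg ≈ 8.05 × M^0.522) is an assumption, not a value from the slides.

```python
def rg_unfolded_from_residues(n_residues, prefactor=2.54, exponent=0.522):
    """Scaling law quoted on the slide: Rg (A) = 2.54 * n^0.522."""
    return prefactor * n_residues ** exponent


def rg_unfolded_from_mass(mass_kda, prefactor=8.05, exponent=0.522):
    """Mass form quoted earlier in the talk: Rg (A) = 8.05 * M^0.522, M in kDa."""
    return prefactor * mass_kda ** exponent


n_residues = 70        # IB5, from the slide
rg_measured = 30.0     # Angstrom, from the worm-like chain analysis on the slide

rg_idp = rg_unfolded_from_residues(n_residues)

# Assumption: about 0.11 kDa per residue on average (not a value from the slides).
rg_idp_mass = rg_unfolded_from_mass(0.11 * n_residues)

print(f"expected for a usual IDP (residue law): {rg_idp:.1f} A")
print(f"expected for a usual IDP (mass law):    {rg_idp_mass:.1f} A")
print(f"measured / expected: {rg_measured / rg_idp:.2f}")
```

Under that assumed residue mass the two forms of the scaling law agree closely, and the measured Rg of IB5 sits roughly 30 % above either estimate.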
The example of IB5 (continued)
• Salivary, proline-rich protein, 70 residues (pink)
• Intrinsically Disordered Protein (IDP)
• Data-compatible average structure
• Model-dependent structure distribution
How best to present the results is an open question.

Comments
Analysis and modeling require a monodisperse and ideal solution, which has to be checked independently.
SAXS is at its best when it is used to distinguish between several preconceived hypotheses.