Protein crystallography in practice
MCB, 10 Dec 2015
Daved H. Fremont
fremont@pathology.wustl.edu
Department of Pathology and Immunology
Washington University School of Medicine

A 7-step program for protein structure determination by X-ray crystallography
1. Produce monodisperse protein either alone or as relevant complexes
2. Grow and characterize crystals
3. Collect X-ray diffraction data
4. Solve the phase problem either experimentally or computationally
5. Build and refine an atomic model using the electron density map
6. Validation: How do you know if a crystal structure is right?
7. Develop structure-based hypothesis

1. Produce monodisperse protein either alone or as relevant complexes
Methods to determine protein purity, heterogeneity, and monodispersity:
  Gel electrophoresis (native, isoelectric focusing, and SDS-PAGE)
  Size exclusion chromatography
  Dynamic light scattering (http://www.protein-solutions.com/)
  Circular dichroism spectroscopy (http://www-structure.llnl.gov/cd/cdtutorial.htm)
Characterize your protein using a number of biophysical methods
Establish the binding stoichiometry of interacting partners

2. Grow and characterize crystals
Hanging drop vapor diffusion
Sitting drop, dialysis, or under oil
Macro-seeding or micro-seeding
Sparse matrix screening methods
Random thinking processes, talisman, and luck
The optimum conditions for crystal nucleation are not necessarily the optimum for diffraction-quality crystal growth

Example crystal forms (photo panels: hanging drop, sitting drop):
  Space group P21, 4 M3/ASU, diffraction >2.3 Å: 14.4% PEG 6K, Na cacodylate pH 7.0, 200 mM CaCl2
  Space group C2, 2 M3/ASU, diffraction >2.1 Å: 18% PEG 4K, malic acid/imidazole pH 5.1, 100 mM CaCl2
  Space group P3121, 3 M3 + 3 MCP-1/ASU, diffraction >2.3 Å: 18% PEG 4K, Na acetate pH 4.1, 100 mM MgCl2
Commercial screening kits available from http://www.hamptonresearch.com; http://www.emeraldbiostructures.com

No Xtals?
Decrease protein heterogeneity
Remove purification tags and other artifacts of protein production
Remove carbohydrate residues or consensus sites (i.e., N-x-S/T)
Determine domain boundaries by limited proteolysis followed by mass spectrometry or amino-terminal sequencing. Make new expression constructs if necessary.
Think about the biochemistry of the system!
Does your protein have cofactors, accessory proteins, or interacting partners to prepare as complexes?
Is there an inhibitor available?
Are kinases or phosphatases available that will allow for the preparation of a homogeneous sample?
Get a better talisman

Building a crystal
The unit cell: edges a, b, c and angles α, β, γ

Crystal symmetries
A triclinic lattice (no symmetry)
Introducing a twofold axis produces a monoclinic lattice (P2)
The threefold axis generates a trigonal crystal, but now α = β = 90°, γ = 120°, and a = b
We cannot fill space with a fivefold arrangement, although the asymmetric unit can contain a fivefold axis (e.g., virus capsids)
These restrictions give rise to 7 crystal systems in 3 dimensions
The seven crystal systems: triclinic, monoclinic, orthorhombic, tetragonal, trigonal, hexagonal, cubic

3. Collect X-ray diffraction data
Initiate experiments using a home-source X-ray generator and detector
Determine liquid nitrogen cryo-protection conditions to reduce crystal decay
While home X-rays are sufficient for some questions, synchrotron radiation is preferred
Anywhere from one to hundreds of crystals and diffraction experiments may be required
Argonne National Laboratory: Structural Biology Center beamline ID19 at the Advanced Photon Source (http://www.sbc.anl.gov)
Lawrence Berkeley National Laboratory: ALS Beamline 4.2.2

4. Solve the phase problem either experimentally or computationally
Structure factor equation: $F(h,k,l) = \sum_j f_j \exp[2\pi i (h x_j + k y_j + l z_j)]$
By Fourier transform we can obtain the electron density: $\rho(x,y,z) = \frac{1}{V} \sum_{h,k,l} |F(h,k,l)| \exp[i\alpha(h,k,l)] \exp[-2\pi i (hx + ky + lz)]$
We know the structure factor amplitudes |F(h,k,l)| after successful data collection.
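To make the role of the phases concrete, here is a minimal sketch in Python of a one-dimensional toy "crystal". It is an illustration under stated assumptions, not a crystallographic program: the two atom positions and scattering weights, the index cutoff H_MAX, the grid size N_GRID, and all function names are hypothetical choices made for this example. It computes structure factors from the atoms, then compares a Fourier synthesis that uses amplitudes and phases with one that uses amplitudes alone.

```python
import cmath
import math

# Hypothetical 1D "crystal": (fractional coordinate, scattering weight) pairs.
ATOMS = [(0.20, 8.0), (0.65, 6.0)]
H_MAX = 10     # highest reflection index included in the synthesis
N_GRID = 200   # sampling points across the unit cell


def structure_factors(atoms, h_max):
    """F(h) = sum_j f_j * exp(2*pi*i*h*x_j) for h = -h_max..h_max."""
    return {
        h: sum(f * cmath.exp(2j * math.pi * h * x) for x, f in atoms)
        for h in range(-h_max, h_max + 1)
    }


def density(factors, x):
    """rho(x) = sum_h F(h) * exp(-2*pi*i*h*x), in arbitrary units."""
    total = sum(Fh * cmath.exp(-2j * math.pi * h * x) for h, Fh in factors.items())
    return total.real


def highest_peak(factors, n_grid):
    """Fractional coordinate of the strongest density peak on a regular grid."""
    grid = [i / n_grid for i in range(n_grid)]
    return max(grid, key=lambda x: density(factors, x))


F = structure_factors(ATOMS, H_MAX)

# With amplitudes AND phases, the synthesis peaks at the heavier atom.
peak_with_phases = highest_peak(F, N_GRID)

# A diffraction experiment records only |F(h)|. Setting every phase to zero
# gives a synthesis whose strongest peak sits at the origin, not at an atom.
amplitudes_only = {h: complex(abs(Fh), 0.0) for h, Fh in F.items()}
peak_without_phases = highest_peak(amplitudes_only, N_GRID)

print(peak_with_phases, peak_without_phases)
```

In this toy case the phased synthesis recovers the position of the heavier atom, while the zero-phase synthesis does not, which is the practical content of the phase problem.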
Unfortunately, conventional X-ray diffraction doesn't allow for direct phase measurement. This is known as the crystallographic phase problem.
Luckily, there are a few tricks that can be used to obtain estimates of the phase α(h,k,l)

Experimental phasing methods
MIR - multiple isomorphous replacement - needs heavy atom incorporation
MAD - multiwavelength anomalous dispersion - typically done with SeMet replacement
MIRAS - multiple isomorphous replacement with anomalous scattering
SIRAS - single isomorphous replacement with anomalous scattering

Computational methods
MR - molecular replacement - needs a related structure
Direct and ab initio methods - not yet useful for most protein crystals

MAD phasing statistics for the AP-2 α-appendage

5. Build an atomic model using the electron density map
Electron density for the AP-2 α-appendage
Initial bones trace for the AP-2 α-appendage
Final trace for the AP-2 α-appendage

What does a good map look like?
Before computers, maps were contoured on stacked pieces of plexiglass. A "Richards box" was used to build the model.
[Figure labels: plexiglass stack, brass parts model, half-silvered mirror]

Low resolution: At 4-6 Å resolution, alpha helices look like sausages.
Medium resolution: ~3 Å data is good enough to see the backbone with space in between.
Holes in rings are a good thing: Seeing a hole in a tyrosine or phenylalanine ring is universally accepted as proof of good phases. You need at least 2 Å data.

The resolution of the electron-density map and the amount of detail that can be seen
Resolution   Structural features observed
5.0 Å        Overall shape of the molecule
3.5 Å        Cα trace
3.0 Å        Side chains
2.8 Å        Carbonyl oxygens (bulges)
2.5 Å        Side chains well resolved, peptide bond plane resolved
1.5 Å        Holes in Phe, Tyr rings
0.8 Å        Current limit for best protein crystals

The 2.8 Å density of SrfTE
The 2.8 Å density of SrfTE could be skeletonised and traced

6. Validation: How do you know if a crystal structure is right?
The R-factor
$R = \sum |F_o - F_c| / \sum F_o$, where $F_o$ is the observed structure factor amplitude and $F_c$ is calculated using the atomic model.

R-free
An unbiased cross-validation of the R-factor. The R-free value is calculated with typically 5-10% of the observed reflections, which are set aside from atomic refinement calculations.

Main-chain torsions: the Ramachandran plot
Geometric distortions in bond lengths and angles
Favorable van der Waals packing interactions
Chemical environment of individual amino acids
Location of insertion and deletion positions in related sequences

6. Validation: Mapping of sequence conservation in AP-2 α-subunit appendages
Traub LM, Downs MA, Westrich JL, and Fremont DH. (1999) Crystal structure of the α-appendage of AP-2 reveals a recruitment platform for clathrin-coat assembly. Proc. Natl. Acad. Sci. U.S.A. 96:8907-8912.

7. Develop structure-based hypothesis
Structure-based mutagenesis of the α-appendage (Traub et al., 1999)

Example: West Nile Virus
Flaviviruses: about 70 members, half of which are associated with human disease (yellow fever, Japanese encephalitis)
Enveloped, spherical virion, 40-50 nm in size
Three structural proteins: C, M (prM), and E; seven non-structural proteins (NS1-5)
ssRNA genome, linear, positive polarity, 11 kb, infectious
Genome organization: 5'UTR, structural proteins (C, M, E), non-structural proteins (NS1, NS2a, NS2b, NS3, NS4a, NS4b, NS5), 3'UTR

Production of soluble E proteins and ectodomain fragments
Immunize mice with soluble E (25 μg x 3)
Fuse splenocytes with myeloma line
Large panels of flavivirus mAbs

Structure Determination of WNV Envelope Protein

Table 1.
Summary of Data Collection and Refinement

Data collection for West Nile Virus Envelope (a)
  Space group                          P41212
  Unit cell (Å)                        a=89.6 b=89.6 c=154.0
  Wavelength (Å)                       0.90
  X-ray source                         APS-BM 14
  Resolution (Å) (outer shell)         20-2.9 (3.08-2.90)
  Observations/Unique                  62790/14408
  Completeness (%)                     98.5 (99.5)
  Rsym (%)                             5.7 (52.4)
  I/σ                                  16.9 (2.05)

Refinement statistics (b)
  Resolution (Å) (outer shell)         20-3.0 (3.19-3.00)
  Reflections Rwork/Rfree              11506/607
  # Protein atoms/Solvent/Heterogen    3031/28/38
  Rwork overall (outer shell) (%)      26.2 (35.6)
  Rfree overall (outer shell) (%)      30.8 (34.1)
  Rmsd bond lengths (Å)/angles (°)     0.008/1.6
  Rmsd dihedral/improper (°)           24.9/0.84
  Ramachandran plot
    Most favored/Additional (%)        78.2/21.8
    Generous/Disallowed (%)            0.0/0.0
  Average B-values                     92.0
  Est. coordinate error (Å)            0.47

(a) Values as defined in SCALEPACK (Otwinowski and Minor, 1997).
(b) Values as defined in CNS (Brunger, AT).

Envelope Protein and the Flavivirus virion
X-ray crystal structure of E: domains DI, DII, DIII
Cryo-EM model of WNV: immature virion (60 trimers of prM/E heterodimers), prM cleavage, mature virion (180 E monomers); 5-, 3-, and 2-fold axes labeled

E16 is a potent neutralizing mAb with therapeutic activity against WNV in mice
Single dose of mAb at day 5 post-infection

Humanized E16 binds WNV DIII with similar affinities and kinetics as E16
Summary of Surface Plasmon Resonance (SPR) studies
[Figure: SPR sensorgrams, Response (RU) vs. Time (s); panels: DIII binding E16, DIII binding Hm-E16.3]

Antibody    ka (1/Ms)     kd (1/s)   Rmax   KD (nM)   Chi2
E16         1.1 x 10^6    0.0118     39.5   10.8      0.33
Hm-E16.1    9.6 x 10^5    0.0201     32.8   21.0      0.16
Hm-E16.2    1.0 x 10^6    0.0092     24.7   9.2       0.13
Hm-E16.3    9.9 x 10^5    0.0070     24.1   7.1       0.16

Production and purification of DIII in complex with E16 Fab
Bacterial expression of WNV E Domain 3
Refolding of DIII
Hybridoma expression of E16 mAb
mAb capture by Protein A
E16 Fab by papain cleavage
Complex purification by size exclusion chromatography
[Figure: chromatogram, Abs280 (mAU) vs. Elution Volume (ml); labels: DIII, DIII-E16 Fab complex, DIII alone]
Structure determination of DIII-E16 complex by X-ray crystallography

Structure of the DIII-E16 Fab complex
[Figure labels: E16 Fab domains CH, CL, VH, VL; CDRs L1, L2, L3, H1, H2, H3; DIII N-terminal region, BC loop, DE loop, FG loop]
Nybakken et al., Nature 2005

Selection of E16-specific epitope variants of DIII
Yeast library of DIII variants created by error-prone PCR
[Figure labels: E-DIII, pooled DIII mAbs, E16 staining]
DIII mutations at Ser306, Lys307, Thr330, and Thr332 significantly diminish E16 binding

DIII yeast display mutations are centrally located at the E16 interface
[Figure labels: E16 Fab domains CL, CH, VH, VL; CDRs H1, H2, H3; Fab residues TrpH33, ArgH58, SerH95, AspH100; DIII residues SerE306, LysE307, ThrE330, ThrE332; DIII N-terminus; DIII BC loop]
[Figure: Fab complexes compared: E16 Fab with WNV E DIII; 1A1D-2 Fab with DV2 E DIII; E53 Fab with WNV E DII (fusion loop). Labels: N and C termini; CL, CH1, VH, VL; loops AB, BC, DE, FG; residues Thr 332, Thr 330, Ser 306, Lys 306, Lys 305, Arg 99, Lys 310, Gly 106, Lys 307, Thr 76, Pro 75, Leu 107; legend: yeast display, ≤ 4.5 Å contacts]

E16 Fab could potentially bind 120/180 E protein DIII sites on WNV
E16 binding to 2- and 3-fold clustered DIIIs appears permissive, while 5-fold clustered DIII binding appears sterically non-permissive
Zhang et al., Nat Structural Biology, 2003; Mukhopadhyay et al., Science, 2003

Combination of crystallographic and cryo-EM data - the E16 Fab/WNV complex
Model of E16 Fab complex with WNV
Cross-section of cryo-EM reconstruction
Cryo-EM reconstruction of E16 Fab complex with WNV
Fitting E16 Fab complex into cryo-EM reconstruction of WNV
Cryo-EM work done in collaboration with Rossmann and Kuhn groups at Purdue University

E16 internalizes with the virus during infection of Vero cells
Pre-bind virus + Alexa-Ab (E53, E16)
Add to cells at 4 or 37 °C
15 minutes.
Fix, add LysoTracker
Confocal microscopy
[Figure panels: DIC / bright field, fluorescent, merge]
Alexa 488-labeled WNV mAbs and LysoTracker Red (acidified endosomes)

E16 Fab decoration appears to trap WNV particles - a fusion intermediate?
pH 8: nucleocapsid core (~154 Å), outer lipid layer (~200 Å), outer glycoprotein layer (~245 Å)
pH 6: nucleocapsid core (~158 Å), outer lipid layer (~205 Å), outer density layer (~340 Å)

When things go wrong:
Chang G, Roth CB. (2001) Structure of MsbA from E. coli: a homolog of the multidrug resistance ATP binding cassette (ABC) transporters. Science 293(5536):1793-800. PMID 11546864
Pornillos O, Chen YJ, Chen AP, Chang G. (2005) X-ray structure of the EmrE multidrug transporter in complex with a substrate. Science 310(5756):1950-3. PMID 16373573
Reyes CL, Chang G. (2005) Structure of the ABC transporter MsbA in complex with ADP.vanadate and lipopolysaccharide. Science 308(5724):1028-31. PMID 15890884
Chang G. (2003) Structure of MsbA from Vibrio cholera: a multidrug resistance ABC transporter homolog in a closed conformation. J Mol Biol 330(2):419-30. PMID 12823979
Ma C, Chang G. (2004) Structure of the multidrug resistance efflux transporter EmrE from Escherichia coli. Proc Natl Acad Sci USA 101(9):2852-7. PMID 14970332

When things go wrong:
Figure 2. Tertiary structure of the N-domain of STAT-4. (A) Overall representation of two monomers (green and gray) in the crystallographic dimer, viewed approximately orthogonal to the molecular twofold axis, which is vertical. The ring-shaped NH2-terminal element is colored red in one monomer. (B) Orthogonal view of one of the N-domains shown in (A), depicting details of the architecture of the ring-shaped element. Side chains that participate in a charge-stabilized hydrogen-bond network are shown in a ball-and-stick representation. The side chain and backbone carbonyl of buried R31 are shown in magenta.
For clarity, the indole ring of the invariant residue W4 that seals off this arrangement on the proximal side is drawn with thinner bonds. The blue sphere denotes a buried water molecule. Hydrogen bonds are indicated by dotted lines. Oxygen, nitrogen, and carbon atoms are red, blue, and yellow, respectively. Q3-N marks the position of the backbone amide group of residue Q3. The light red segment of helix 2 highlights its 3₁₀ helical conformation. Fig. 2 and Fig. 3, B and C were created with the program RIBBONS, version 2.0 (28).
Vinkemeier U, Moarefi I, Darnell JE Jr, Kuriyan J. (1998) Structure of the amino-terminal protein interaction domain of STAT-4. Science 279(5353):1048-52.

Figure 1. Analysis of STAT4 dimers produced by crystallographic symmetry to identify the physiologic dimer. (a) Dimer A (produced by the fractional transformation -Y, -X, -Z+1/6 with translation 1, 1, 1) represents the dimer implied previously (22). Dimer B (produced by the fractional transformation X, X-Y, -Z+5/6 with translation 0, 1, 0) represents an alternative interface recently suggested (25). Highlighted residues were targeted for mutational studies. Residues W37, T40, and E66 (magenta) are located in the dimer A interface, whereas residues D19 and L78 (cyan) are located in the dimer B interface. (b) Surface analysis of the two dimers. According to this analysis, dimer B is a statistically better molecular interface (as compared to dimer A) and is more likely to represent a physiologically relevant dimer.
Ota N, Brett TJ, Murphy TL, Fremont DH, Murphy KM. (2004) N-domain-dependent nonphosphorylated STAT4 dimers required for cytokine-driven activation. Nature Immunology 5:208-215.
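
Returning to the validation step (6): the R-factor and R-free defined there can be written out in a few lines. The following is a minimal sketch in Python, not the procedure implemented in refinement programs such as CNS. The function names (r_factor, pick_free_set, r_work_and_r_free) and the amplitude values are hypothetical and chosen only for illustration. For scale, the WNV envelope refinement in Table 1 set aside 607 of 12113 reflections for Rfree, about 5%.

```python
import random


def r_factor(f_obs, f_calc):
    """R = sum(|Fo - Fc|) / sum(Fo) over paired structure factor amplitudes."""
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    return sum(abs(o - c) for o, c in zip(f_obs, f_calc)) / sum(f_obs)


def pick_free_set(n_reflections, free_fraction=0.05, seed=0):
    """Randomly flag a fraction of reflection indices as the R-free test set."""
    rng = random.Random(seed)
    n_free = max(1, round(n_reflections * free_fraction))
    return set(rng.sample(range(n_reflections), n_free))


def r_work_and_r_free(f_obs, f_calc, free_set):
    """R over the working reflections and over the set-aside reflections."""
    pairs = list(zip(f_obs, f_calc))
    work = [p for i, p in enumerate(pairs) if i not in free_set]
    free = [p for i, p in enumerate(pairs) if i in free_set]
    # zip(*pairs) splits the (Fo, Fc) pairs back into two sequences.
    return r_factor(*zip(*work)), r_factor(*zip(*free))


# Made-up amplitudes, for illustration only.
f_obs = [100.0, 50.0, 25.0, 80.0, 40.0, 60.0]
f_calc = [90.0, 55.0, 25.0, 84.0, 38.0, 57.0]

print(round(r_factor(f_obs, f_calc), 4))
print(r_work_and_r_free(f_obs, f_calc, free_set={1, 4}))
```

In a real refinement the free set is chosen once, before any model fitting, and kept fixed. Because the model is never fitted to those reflections, a large gap between the two values is a warning sign of overfitting.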