SSNMR studies of intact bacteriophage virus
Assignment of the coat protein of Pf1 filamentous phage
A. Goldbourt1, B.J. Gross1, L. Day2, A.E. McDermott1
1Chemistry Department, Columbia University, New York
2The Public Health Research Institute, Newark, NJ

Filamentous bacteriophage
[Figure: Pf1 cartoon model with dimensions marked 2000 nm and 6 nm. G. Stubbs, Rep. Prog. Phys. (2001) 64, 1389]
Pf1 belongs to the filamentous bacteriophage (Inovirus) family, viruses that attack bacteria. Members of the family include Pf1, Pf3 and Xf (Class-II) as well as M13, fd, f1, If1, and IKe (Class-I). The host bacteria for Pf1 (and Pf3) are Pseudomonas aeruginosa of different strains (strain K for Pf1).
Most virions consist of a long single-stranded circular DNA encapsulated by multiple copies of protein subunits. The DNA loop is stretched from one end of the virus to the other in an unknown conformation.
Several additional functional proteins are docked on the surface of the protein coat. These proteins are crucial for bacterial infection and for the reassembly of the virion.
Pf1 is the longest known virion.

The CTXφ filamentous bacteriophage codes for cholera toxin
Waldor and Mekalanos, Science (1996) 272, 1910.
The ctxAB genes, which code for cholera toxin, were discovered in the genome of the filamentous phage CTXφ. Vibrio cholerae becomes toxic only upon infection by CTXφ.
– The first gene product, Cep, closely resembles the capsid proteins of M13 and Pf1. The alignment of the three coat proteins is shown below.

Alignment of Pf1, M13 and Cep
M13: --AEGDDPAKAAFNSLQASATEYIGYAWAMVVVIVGATIGIKLFKKFTSKAS
PF1: -GVIDTSAVESAITDGQGDMKAIGGYIVGALVILAVAGLIYSMLRKA----
cep: DAGLVTEVTKTLGTSKDTVIALGPLIMGVVGAIVLIVTVIGLIRKAK----
Regions marked, from the N-terminus: surface-exposed N-terminus; hydrophobic region; positively charged region (DNA contact region?).
[Figure: EM of partially purified CTXφ]

Uses of filamentous phages
Pf1 readily aligns in a magnetic field and is regularly used to partially align proteins in order to obtain RDC constraints.
M13 and T7 are used as sequencing primers. Virions like fd, f1 and M13 are used for peptide phage display.

Pf1 virion
Coat protein sequence: GVIDTSAVESAITDGQGDMKAIGGYIVGALVILAVAGLIYSMLRKA
– Longest known phage, 2000 nm.
– Coat protein with 46 amino acids, all-helical.
– Nucleotide/subunit ratio of 1:1, the smallest known.
– Undergoes a phase transition at 10 °C.
[Figure: side view of 1PJF showing the repeating subunits; top view of 1PFI with the DNA in the center]

Known models for Pf1
Models for the structure of Pf1 were published by 3 groups:
– X-ray powder diffraction: PDB codes (1-4)IFM and 1QL1 for the low-temperature form, 2IFN and 1QL2 for the high-temperature form (Marvin et al.).
– 1PFI: a new calculation from existing data that incorporates the DNA and predicts a different symmetry than Marvin's models (Day et al.).
– 1PJF: SSNMR structure of the coat protein backbone from an aligned sample; the model of the phage was built with the symmetry from 1QL1 (Opella et al.).

Structure of Pf1
– The helix is assumed to have a kink in the center, or to be gently curved.
– The N-terminus can be a loop (1PJF, 1QL1, 4IFM) or helical (1PFI, 1IFM) and probably depends on the solvent (it is surface exposed).
– The locations of the sidechains are unresolved in the X-ray data and are partially obtained from alternative spectroscopic experiments.
– The helical bundle symmetry around the DNA is under debate: 27 subunits in 5 turns (1QL1/2), 71 subunits in 13 turns (1PFI) or, alternatively, 71 dimers in 26 turns (Day, unpublished). Perhaps the ratio is non-rational.
[Figure: SSNMR dipolar waves (Opella, 1PJF), experimental 15N-H dipolar coupling; gently curved helix of 1QL1]

Advantages of MAS NMR
What is the coat protein's exact structure?
– Refinement of X-ray diffraction data requires fitting the protein structure together with the whole helical bundle.
– Different coat protein models can give different results for the helical bundle arrangement.
– A more accurate structure of the protein will directly contribute to the solution of the whole virus structure by reducing the number of fit parameters for the X-ray powder diffraction data.
Is it an inside-out DNA?
– Model 1PFI suggests that the structure is 'inside out': the bases point outward towards the coat protein and the phosphates point inward.
– Hopefully, 31P/13C NMR experiments …

Site-specific information
What is the nature of the phase transition?
– The Pf1 virus undergoes a reversible phase transition at 10 °C. The overall length of the virus is known to increase slightly (~100 nm; 15 turns). A model assumes reorganization of the helix bundle.
– Site-specific information will be obtained with ssNMR experiments.
What kind of contacts exist with the DNA?
– Mainly qualitative data exist: quenching of the Tyr40 fluorescence signal, Lys45 and Arg44 compensating for phosphate charges, etc.
– With site-specific information, contacts to the DNA can be obtained.

Where is Pf1 coming from?
The virus was prepared in the laboratory of Loren Day at the Public Health Research Institute in Newark, NJ.
The Pseudomonas aeruginosa host was grown on 13C-glucose/15N-ammonium chloride M9 medium, and Pf1 was purified and isolated.
Precipitation was done in the McDermott lab (a protocol developed by L. Day) using PEG, 5 mM MgCl2 and ethylene glycol as a cryo-protectant.

Assignment summary for Pf1
REMARK: All assignments shown are site specific.
Data for 'residue type' assignments are not included. Next: sidechains.
[Table: assignment status of atoms N, CO, CA, CB, CG and CD for each residue: G1 V2 I3 D4 T5 S6 A7 V8 E9 S10 A11 I12 T13 D14 G15 Q16 G17 D18 M19 K20 A21 I22 G23 G24 Y25 I26 V27 G28 A29 L30 V31 I32 L33 A34 V35 A36 G37 L38 I39 Y40 S41 M42 L43 R44 K45 A46. Legend: site-specifically assigned; multiple conformations; not assigned; not in sidechain (-); unresolved.]
Backbone: assigned 42/46 (91%), well resolved 38/46 (83%).
Sidechains: assigned 35/40 (85%).

13C-13C correlation: sidechain of Lys20
[Spectrum: labeled cross peaks Cα-CO, Cα-Cβ, Cα-Cγ, Cβ-CO, Cβ-Cγ, Cγ-Cβ, Cγ-Cδ, Cδ-Cγ]
• Red color on the peaks serves as a guide to the eye.
• Negative peaks were beyond the contour level threshold.

13C-13C correlation: Glu9
[Spectrum: labeled cross peaks Cα-CO, Cα-Cβ, Cβ-Cγ, Cβ-Cδ, Cγ-Cβ, Cγ-Cδ]
• One-bond contacts are underlined.

13C-13C correlation: Val2
[Spectrum: labeled cross peaks Cα-CO, Cα-Cβ, (Cβ-CO), Cγ1/2-Cβ, Cγ2/1-Cβ]

Strip plots from 750 MHz 3D experiments
[Figure: strip plots, red: NCACX, blue: NCOCX, Cα axis in ppm. Strips are labeled G15, Q16, G17 and D18 with i and i-1 markers. Annotations: similar 15N shifts (121.9, 121.8, 121.7); similar 13CO/13CA shifts, sequential 15N (176.9, 177.0).]

Assignment of alanine residues: the advantage of two-bond contacts
[Figure: DCP NCA (750) and RAD, 6 ms (750 and 600) spectra of the alanine Cα region, with peaks labeled A7, A11, A21, A34, A36 and A46 and sequential cross peaks V8Cα-A7CO, I12Cα-A11CO and V35Cα-A34CO.]

Initial structural information: the TALOS-derived secondary structure
TALOS: 'Torsion Angle Likelihood Obtained from Shifts and sequence similarity'. Cornilescu, Delaglio, Bax, J. Biomol. NMR (1999) 13, 289.
Database of ~80 proteins with
– known X-ray structure at < 2 Å resolution
– known NMR chemical shifts
For every 3-amino-acid sequence (e.g. GQG), TALOS searches the database for matching chemical shifts and scores each match by (i) the N, HA, CA, CB, CO secondary shift difference and (ii) amino acid identity. The best 10 matches are used to derive the dihedral angles of the central amino acid (Q).
Remark: (i) About 3% of TALOS predictions are in error!!
(ii) The database is derived from proteins in solution, whereas we look at a well-organized helical bundle of proteins!
Weights: HA > CO > CA ~ CB > N

Example: Ser6
[Figure: TALOS output for Ser6 on a φ/ψ map, showing the 10 best predictions, the dihedral angle in PDB file 1PFI, the protein matches from the database, the best matching triplets and the TALOS average score. The prediction is 'Good' if 9/10 results agree.]

Example: Ser6, TALOS results in a text file
#   code   φ         ψ         Δφ       Δψ       score    matches   prediction result
4   D      -63.000   -40.000   8.000    7.000    50.130   10        Good
5   T      -67.000   -42.000   9.000    8.000    57.810   10        Good
6   S      -62.000   -40.000   5.000    6.000    51.160   10        Good
7   A      -65.000   -38.000   4.000    8.000    41.410   10        Good
8   V      -67.000   -44.000   9.000    7.000    33.250   10        Good
9   E      -64.000   -36.000   6.000    10.000   32.960   10        Good

TALOS results: comparison to the 1PFI model (Day)
[Plot: φ and ψ per residue along the sequence GVIDTSAVESAITDGQGDMKAIGGYIVGALVILAVAGLIYSMLRKA. Series: ψ NMR/TALOS; φ NMR/TALOS; ψ 1PFI model; φ 1PFI model.]

TALOS results: comparison to the 1PJF model (Opella)
[Plot: φ and ψ per residue along the same sequence. Series: ψ NMR/TALOS; φ NMR/TALOS; ψ 1PJF model; φ 1PJF model.]

TALOS results: comparison to the 1QL1 model (Marvin)
[Plot: φ and ψ per residue along the same sequence, on an extended angle axis. Series: ψ NMR/TALOS; φ NMR/TALOS; ψ 1QL1 model; φ 1QL1 model.]

Summary
Almost complete assignment of Pf1, a filamentous bacteriophage virus with a molecular weight of 35 MDa, has been obtained.
Weak DNA sugar peaks have been observed.
TALOS results suggest a completely helical structure. No significant loop/kink region has been predicted.
Current work in progress:
– Structural distance constraints will be obtained, with emphasis on intermolecular contacts.
– Site-specific information will make it possible to probe the phase transition.
– DNA-protein contacts will be obtained by low-temperature experiments.

Thanks to
$$$ Columbia University $$$
Loren Day: initiated the project, prepares a constant flow of samples and enlightens us with all we know about Pf1!!
Ben Gross: from him I inherited this project, thanks and good luck!! (Of course he also worked hard: 600 3D's, 400 experiments, assignments!!!)
NYSBC (and Boris Itin!!)
2D NCA/NCO experiments: J. Lorieau
Ann McDermott and the wonderful group at Havemeyer 3rd floor.

DNA signals in the 2D 13C-13C spectra
[Spectrum: RAD mixing, 6 ms, obtained with 60 Hz line broadening in both dimensions. DNA sugar peaks are labeled C1/4, C3 and C5; additional expected regions are marked for C5 (65 ppm) and C2 (41 ppm).]

Information from XPD
Powder diffraction layer lines encompass the symmetry of the whole helical virion and fit a simple expression:
– l = tn + um, where l is the layer-line index, t the number of turns, u the number of subunits, m an integer and n the order of the Bessel function Jn(R).

A top view of the helical bundle
[Figure: top view of 1PJF]
• DNA is in the center and was not included in the models, except for 1PFI.
• Does the helix have a 'kink'? What is the structure of the N-terminus? Where do the subunits make contacts?

What do we have so far?
Almost complete assignment has been achieved (for the high-temperature form) with 2D and 3D experiments on 600 and 750 MHz spectrometers.
DNA signals from the sugars have been detected, but they are weak. Low-temperature experiments are underway to try to obtain the specific DNA-coat protein contacts.
31P-31P DQ and 31P/13C REDOR experiments failed to produce any correlations, but 1D spectra suggest strong DNA dynamics. These will also require low temperatures.
The low-temperature form was precipitated, and its spectra will be used to probe the site-specific changes of the phase transition.
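
Worked example: the layer-line expression from the XPD slide, l = tn + um, can be tried out on the two rational symmetries quoted earlier in the talk (27 subunits in 5 turns, 71 subunits in 13 turns). The sketch below is only an illustration and not the fitting procedure used for any of the published models. The function names and the cutoff n_max are hypothetical choices made here. It lists, for a given layer line, the Bessel orders n for which an integer m exists.

```python
from fractions import Fraction


def subunits_per_turn(u: int, t: int) -> Fraction:
    """Average number of subunits per helical turn for u subunits in t turns."""
    return Fraction(u, t)


def allowed_bessel_orders(l: int, u: int, t: int, n_max: int = 30) -> list[int]:
    """Bessel orders n with |n| <= n_max that satisfy l = t*n + u*m for an integer m.

    n_max is an arbitrary illustrative cutoff, not a value taken from the talk.
    """
    return [n for n in range(-n_max, n_max + 1) if (l - t * n) % u == 0]


if __name__ == "__main__":
    # Symmetries quoted in the talk, as (subunits u, turns t).
    symmetries = {"1QL1/1QL2": (27, 5), "1PFI": (71, 13)}
    for model, (u, t) in symmetries.items():
        print(model, "subunits per turn:", float(subunits_per_turn(u, t)))
        for layer_line in range(3):
            orders = allowed_bessel_orders(layer_line, u, t)
            print("  layer line", layer_line, "-> allowed n:", orders)
```

The two symmetries differ only slightly in subunits per turn (27/5 = 5.4 against 71/13, about 5.46), so the low-order Bessel terms on the first few layer lines are similar and the repeat length is what separates them.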