Crystal Structure of Plant Photosystem I

Adam Ben-Shem +, Felix Frolow # and Nathan Nelson +*

Department of Biochemistry + and Molecular Microbiology and Biotechnology #, The George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv 69978, Israel

Supplementary information

Methods

Structure determination

Complete PSI isolated from pea seedlings (Pisum sativum var. Alaska) was purified in active form and crystallized (ref. 11). Diffraction data for native and several heavy atom derivative crystals were collected under cryogenic conditions (100 K) at ESRF beamlines ID14-1 and ID14-4 and processed with the HKL suite (ref. 41) and the CCP4 package (ref. 42) (Table 1). Most of the 8 heavy atom derivative data sets (ethylmercurithiosalicylate: three data sets denoted EMTS1-3; Hg-acetate: two data sets denoted Hg-acetate 1-2; PtCl4: two data sets; and uranyl acetate) were not isomorphous with the native data set. This obstacle was resolved by a pair-wise check of isomorphism among all the derivative data sets, which revealed that the two putative Pt derivative data sets could serve as the native data set for those Hg derivatives that were not isomorphous with the native crystal (pseudo-native 1 for EMTS1, pseudo-native 2 for EMTS2 and Hg-acetate 1-2, and native for EMTS3). Heavy atom positions for every derivative were determined from isomorphous difference or anomalous difference Fourier maps using external phases obtained by molecular replacement with the coordinates of cyanobacterial PSI (ref. 10) (data not shown).

For phase determination, 5 Hg derivatives were used. They share the same 8 heavy atom sites but differ in heavy atom substitution level owing to variations in the concentration of the soaking solutions (0.1-0.25 mM) and in soaking time (1-2 h). In addition, the anomalous signals from a uranyl derivative, for which no suitable native or pseudo-native was found, and from the 6 intrinsic iron-sulfur clusters (regarded as single scatterers) were exploited (Table 1).
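A pair-wise isomorphism check of the kind described above can be illustrated with a minimal sketch. It assumes one common definition of the fractional isomorphous difference, R_iso = sum|F_a - k F_b| / sum F_a over common reflections, with a single overall scale factor k. The function names `r_iso` and `pairwise_r_iso` and the dictionary layout are hypothetical and chosen for illustration only; the criterion and software actually used for the check are not specified in the text.

```python
from itertools import combinations


def r_iso(f_a, f_b):
    """Fractional isomorphous difference between two data sets.

    f_a, f_b: dicts mapping a Miller index (h, k, l) to a structure
    factor amplitude. Only reflections present in both sets are compared,
    and f_b is placed on the scale of f_a with one overall scale factor
    (a simplification; real programs usually scale by resolution shell).
    """
    common = set(f_a) & set(f_b)
    if not common:
        raise ValueError("data sets share no reflections")
    scale = sum(f_a[hkl] for hkl in common) / sum(f_b[hkl] for hkl in common)
    numerator = sum(abs(f_a[hkl] - scale * f_b[hkl]) for hkl in common)
    denominator = sum(f_a[hkl] for hkl in common)
    return numerator / denominator


def pairwise_r_iso(datasets):
    """Return R_iso for every unordered pair of named data sets."""
    return {
        (a, b): r_iso(datasets[a], datasets[b])
        for a, b in combinations(sorted(datasets), 2)
    }
```

With such a table of pair-wise values, a low value for a given pair suggests that one data set of the pair could act as the reference ("pseudo-native") for the other.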
Phases were calculated to 5 Å using SHARP (ref. 43) and extended to 4.44 Å using two-fold non-crystallographic symmetry averaging, solvent flattening and histogram matching as implemented in DM (refs 42, 44). Electron density map visualization and model building were performed in O (ref. 45). The Cα backbone of the cyanobacterial reaction center (without subunits X and M) could be fitted very well into the electron density map derived by MIRAS (Multiple Isomorphous Replacement Anomalous Signal). It required no modification in the membrane section and served as an initial model for the core moiety. Modifications of the core model in solvent exposed regions were made only if they could be unambiguously determined in the MIRAS map and where sequence alignment indicated the addition or deletion of residues. The 15 additional transmembrane helices that could be found in the MIRAS map were first modeled as Cα backbones of idealized alpha helical geometry and then adjusted to better fit the map. Some secondary structure elements in solvent exposed regions could also be added to the model, which was then completed by identifying 167 chlorophylls, 3 Fe-S centers and two phylloquinones (per PSI monomer) in the MIRAS map. Sections of somewhat diffuse electron density situated within the membrane may correspond to 8 additional chlorophylls that are not included in the model.

Rigid body refinement (and TLS refinement of the core region) was performed in REFMAC 5 (ref. 42). No refinement of individual residues was undertaken, hence the near absence of a difference between the R-factor (41%) and R-free (42%). The structure solution is based solely on electron density maps calculated using the experimental MIRAS phases after density modification. Figures were generated by O (ref. 45) and rendered by Molray (ref. 46).
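The idea of starting a transmembrane helix from a Cα backbone of idealized alpha helical geometry can be sketched as follows. The parameters are textbook values for an alpha helix (1.5 Å rise and 100° rotation per residue, Cα radius of about 2.3 Å) and are assumptions of this illustration, not values taken from the text; the function name `ideal_helix_ca` is hypothetical. The actual model building was done interactively in O.

```python
import math

RISE = 1.5     # Å travelled along the helix axis per residue (assumed)
TWIST = 100.0  # degrees of rotation per residue, i.e. 3.6 residues per turn
RADIUS = 2.3   # Å, approximate distance of each Cα atom from the axis


def ideal_helix_ca(n_residues):
    """Return Cα coordinates (x, y, z) of a right-handed ideal alpha helix.

    The helix axis runs along +z and the first Cα lies on the x axis.
    The result would still need to be rotated and translated into density.
    """
    coords = []
    for i in range(n_residues):
        angle = math.radians(TWIST * i)
        coords.append(
            (RADIUS * math.cos(angle), RADIUS * math.sin(angle), RISE * i)
        )
    return coords
```

With these parameters, consecutive Cα atoms are about 3.8 Å apart, which matches the usual Cα-Cα spacing and gives a quick sanity check on the generated trace.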
Completeness of the Cα backbone model

In all four LHCI proteins, we were able to trace all transmembrane domains and virtually all the lumenal exposed regions (except for the loop connecting transmembrane helices B and C in Lhca1 and very few disordered residues). We could only partially trace the stromal exposed domains, i.e. the loop connecting transmembrane helices A and C and the N-terminal tail. Electron densities corresponding to non-traced stromal regions are detected at the interface between LHCI and the core: between Lhca4 and PsaF close to the attachment site of the stromal region of helix B to the core (Fig. 1a), between Lhca2 and the stromal region of PsaJ, and between Lhca3 and the stromal region of helix b in PsaA. Deciding whether to assign these densities to an A-C loop or to an N-terminus is hampered by ambiguous discontinuities and bifurcations in the map in those areas.

Our model of the reaction center moiety (the core) is missing only very few solvent exposed regions, most notably the stromal loop connecting the two transmembrane helices of PsaK (also missing in the cyanobacterial PSI structure). We find an electron density that probably corresponds to this region but cannot make a definite assignment. Our conclusions regarding the asymmetric nature of LHCI binding to the core also took unassigned electron densities into account. Furthermore, since the transmembrane regions are completely traced, it can be asserted that Lhca1 closely interacts with the core and that only this Lhca protein binds the core within the membrane.

Assigning the four LHCI monomers and subunits K and G

SDS-PAGE and mass-spectrometry analysis confirm that plant PSI crystals contain four different LHCI proteins, namely Lhca1-4 (refs 11, 40). Lhca1 and Lhca4 are known to form a heterodimer that can be reconstituted in vitro (ref. 47), while Lhca2 and Lhca3 assemble into either homodimers or heterodimers (refs 9, 13, 14).
The two heterodimers present in the structure are therefore assigned to Lhca1-Lhca4 and Lhca2-Lhca3. Since the association of PsaK with Lhca2-Lhca3, and in particular with Lhca3, is well documented (refs 13, 26, 48), we assigned the heterodimer near subunit K to Lhca2-Lhca3 (with Lhca3 closer to PsaK) and consequently the other dimer, near PsaG, to Lhca1-Lhca4. The dimerization mode described in the text and in ref. 16 dictates that Lhca1 is the monomer tightly bound to PsaG and PsaB, with protruding N and C termini attached to Lhca4. In this arrangement Lhca4 is bound to PsaF, which is in agreement with ref. 49. Antisense inhibition studies of individual LHCI subunits and the recorded variations in LHCI composition due to changes in environmental conditions also lend credence to this assignment (refs 7, 17, 26, 50).

Owing to their sequence and structure similarity, the assignment of subunits G and K to either of the two "poles" of plant PSI could not be based solely on backbone structure. The proposed arrangement, with PsaK retaining its cyanobacterial position (and hence PsaG occupying the opposite pole), makes use of the marked difference in the number of chlorophylls coordinated by these subunits in our structure: PsaG binds zero or at most one chlorophyll, whereas PsaK binds four. Biochemical data show that PsaG indeed binds 0 or 1 chlorophylls and certainly not four (ref. 15).

It follows from this proposed arrangement that Lhca3, and the Lhca2-Lhca3 dimer as a whole, are bound much more loosely to the core than Lhca1-Lhca4. This is supported by experimental data. Bassi and Simpson (ref. 51) could prepare PSI depleted of Lhca2-Lhca3 (then collectively named LHC-680), but not of Lhca1-Lhca4, by moderate detergent treatment of the holo-complex. Ref. 13 and B. Andersen (unpublished results cited in ref. 13) show that Lhca3 is the LHCI protein most easily lost during preparation of PSI from barley.
Therefore Lhca3 is probably not the LHCI protein whose helix C forms a helix bundle with one of the core subunits, and PsaK, which is known to be adjacent to Lhca3, is not this core subunit.

References

41. Otwinowski, Z. & Minor, W. Processing of X-ray diffraction data collected in oscillation mode. Methods in Enzymology 276, 307-326 (1997).
42. Bailey, S. The CCP4 Suite - Programs for Protein Crystallography. Acta Cryst. D 50, 760-763 (1994).
43. de La Fortelle, E. & Bricogne, G. Maximum-likelihood heavy-atom parameter refinement for multiple isomorphous replacement and multiwavelength anomalous diffraction methods. Methods in Enzymology 276, 472-494 (1997).
44. Cowtan, K. & Main, P. Miscellaneous algorithms for density modification. Acta Cryst. D 54, 487-493 (1998).
45. Jones, T.A., Zou, J.Y., Cowan, S.W. & Kjeldgaard, M. Improved methods for building protein models in electron-density maps and the location of errors in these models. Acta Cryst. A 47, 110-119 (1991).
46. Harris, M. & Jones, T.A. Molray - a web interface between O and the POV-Ray ray tracer. Acta Cryst. D 57, 1201-1203 (2001).
47. Schmid, V.H., Cammarata, K.V., Bruns, B.U. & Schmidt, G.W. In vitro reconstitution of the photosystem I light-harvesting complex LHCI-730: Heterodimerization is required for antenna pigment organization. Proc. Natl. Acad. Sci. USA 94, 7667-7672 (1997).
48. Jensen, P.E., Gilpin, M., Knoetzel, J. & Scheller, H.V. The PSI-K subunit of photosystem I is involved in the interaction between light-harvesting complex I and the photosystem I reaction center core. J. Biol. Chem. 275, 24701-24708 (2000).
49. Haldrup, A., Simpson, D.J. & Scheller, H.V. Down-regulation of the PSI-F subunit of photosystem I (PSI) in Arabidopsis thaliana. The PSI-F subunit is essential for photoautotrophic growth and contributes to antenna function. J. Biol. Chem. 275, 31211-31218 (2000).
50. Zhang, H., Goodman, H.M. & Jansson, S. Antisense inhibition of the photosystem I antenna protein Lhca4 in Arabidopsis thaliana. Plant Physiol. 115, 1525-1531 (1997).
51. Bassi, R. & Simpson, D. Chlorophyll-protein complexes of barley photosystem I. Eur. J. Biochem. 163, 221-230 (1987).

Table 1 Statistics for data collection and phase determination

Data collection (numbers in parentheses refer to the highest resolution shell)

Crystal          Resolution limit (Å)  Rmerge         Completeness (%)  I/σ(I)      No. of reflections  Redundancy
Native           4.44                  0.099 (0.864)  99.6 (99.5)       18.7 (2.0)  862552              9.3
Pseudo-native 1  4.95                  0.077 (0.646)  99.2 (99.9)       19.4 (2.0)  367121              5.4
Pseudo-native 2  5.20                  0.074 (0.841)  99.0 (100.0)      18.5 (2.1)  309528              5.3
EMTS1            5.04                  0.058 (0.435)  69.4 (67.7)       14.1 (2.1)  125210              2.8
EMTS2            5.20                  0.063 (0.590)  98.0 (96.8)       18.5 (2.0)  268246              4.6
Hg Acetate 1     5.24                  0.067 (0.624)  98.7 (100.0)      17.7 (2.0)  219386              3.7
Hg Acetate 2     5.80                  0.069 (0.648)  98.6 (100.0)      17.6 (2.0)  155485              3.6
EMTS3            4.94                  0.069 (0.476)  97.1 (89.4)       15.3 (2.0)  215201              3.3
Uranyl Acetate   4.98                  0.067 (0.447)  85.1 (85.0)       12.9 (2.1)  181347              3.1

Phase determination

Derivative       No. of heavy atom sites  Total occupancy  Phasing power (iso/ano)  Rcullis (centric reflections)
Native                                                     -/0.81
Pseudo-native 1                                            -/0.73
Pseudo-native 2                                            -/0.76
EMTS1            8                        2.36             0.93/0.73                0.92
EMTS2            7                        1.25             1.47/0.85                0.80
Hg Acetate 1     8                        2.80             1.29/0.85                0.87
Hg Acetate 2     8                        2.87             0.47/0.78                0.92
EMTS3            8                        2.35             0.64/0.65                0.97
Uranyl Acetate   2                        1.11             -/0.67                   -

FOM (to 5 Å resolution): 0.43
FOM after density modification (phases extended to 4.44 Å resolution): 0.71

Refinement (rigid body and TLS)

R-factor: 0.41
R-free: 0.42

FOM (figure of merit), phasing power and Rcullis were determined by the programs SHARP or DM.
Total occupancy: the sum of refined occupancies of the sites, normalized to the occupancy of the iron-sulfur sites.
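For readers unfamiliar with the data collection statistics in Table 1, the following minimal sketch shows how Rmerge and redundancy are conventionally computed from repeated intensity measurements. It assumes the standard definition Rmerge = sum over reflections and observations of |I_i - <I>|, divided by the sum of all I_i, and treats redundancy as the number of measurements per unique reflection. The function names and the input layout are hypothetical; the values in Table 1 were produced by the HKL suite, not by this code.

```python
def r_merge(observations):
    """Rmerge for a set of repeatedly measured reflections.

    observations: dict mapping a Miller index (h, k, l) to the list of
    intensities measured for that reflection (including symmetry mates).
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        mean_i = sum(intensities) / len(intensities)
        # Spread of the individual measurements around their mean
        numerator += sum(abs(i - mean_i) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


def redundancy(observations):
    """Average number of measurements per unique reflection."""
    n_measured = sum(len(v) for v in observations.values())
    return n_measured / len(observations)
```

A toy input such as `{(1, 0, 0): [100.0, 110.0, 90.0], (0, 1, 0): [50.0, 50.0]}` has five measurements of two unique reflections, so its redundancy is 2.5.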