X-ray structures of coronavirus N protein domains: Structural basis of nucleocapsid formation

Hariharan Jayaram1%, Hui Fan2, Brian R. Bowman1&, Amy Ooi2, Jyothi Jayaram3, Ellen W. Collisson3, Julian Lescar2, B. V. Venkataram Prasad1*

1 Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas 77030, U.S.A.
2 School of Biological Sciences, Nanyang Technological University, Singapore 637551
3 Department of Veterinary Pathobiology, Texas A&M University, College Station, Texas 77843, U.S.A.

% Present address: Howard Hughes Medical Institute and the Department of Biochemistry, Brandeis University, Waltham, MA 02454
& Present address: Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA 02138
* Corresponding author (Ph: 713-798-5686; Fax: 713-798-1625; e-mail: vprasad@bcm.tmc.edu)

Abstract

Coronaviruses cause a variety of respiratory and enteric diseases in animals and humans including SARS, a disease with emerging global impact. A loosely helical nucleocapsid formed by an association of a viral N protein with the genomic RNA is a common feature in many of the enveloped ssRNA viruses. As yet, there is no X-ray crystallographic structure of the full length N protein of any of these viruses. The coronavirus N protein is highly protease sensitive and polydisperse, making it a difficult protein for X-ray crystallographic analysis. Using limited proteolysis, we identified two major stable domains, NTD and CTD, and determined their X-ray structures to 1.3 Å and 2.2 Å resolution respectively. Our analysis presented here represents the most detailed structural characterization of an N protein to date.
Structural analysis of these two domains in various crystal forms has provided insights into how the modular organization of the domain-swapped N protein dimer may not only facilitate the formation of the nucleocapsid but also enable intermittent interactions with the M protein in the viral envelope.

Coronaviridae, within the order Nidovirales, is a family of viruses that are notable causative agents of human upper respiratory infections, including common colds, and of severe illnesses such as SARS (severe acute respiratory syndrome). The coronaviruses are enveloped viruses with a diameter ranging from 80 to 160 nm. The viral genome consists of a single 30 kb long segment of positive sense single-stranded RNA (Siddell, 1995). Upon infection the genomic RNA encodes a 3′ co-terminal set of four or more subgenomic mRNAs with a common leader sequence at their 5′-ends. These subgenomic RNAs encode various viral structural and non-structural proteins required to replicate the virus and produce progeny virions. The enveloped capsid of the virus is predominantly made up of the membrane glycoprotein (M), another small transmembrane protein (E), and an array of spikes composed of a glycoprotein (S), which gives the roughly spherical particles a corona. A necessary protein component of the capsid is the nucleocapsid (N) protein, which interacts with the genomic RNA, forming the central core of the virion. Electron microscopic studies of detergent-permeabilized transmissible gastroenteritis virus capsids (TGEV, a prototype coronavirus) revealed that the internal nucleocapsid is helical and is composed of the ssRNA genome tightly associated with N protein (Risco et al., 1996; Risco et al., 1998). The N protein is typically a multifunctional basic phosphoprotein (50-60 kDa), which, along with its coding RNA, is synthesized in large amounts during infection (Lai and Cavanagh, 1997; Stohlman and Lai, 1979).
The highly basic N protein exhibits non-specific ssRNA binding ability with an increased affinity for genomic RNA (Cologna and Hogue, 1998) and binds consensus sequences at the 5′ and 3′ termini of the genome. Biochemical studies of coronaviruses such as mouse hepatitis virus (MHV), infectious bronchitis virus (IBV), and SARS coronavirus (SARS-CoV) have mapped the RNA binding function to a minimal 55 residue segment in the N-terminal half of the N protein and the dimerization function to the C-terminal half (Fan et al., 2005; Nelson et al., 2000; Yu et al., 2005). During the virus life cycle, multiple copies of the N protein interact extensively with the genomic as well as the synthesized subgenomic RNA (Baric et al., 1988; Narayanan et al., 2003) and possibly participate in genome packaging, which is initiated by recognition of a packaging signal by the M protein. The M and N proteins also interact closely via their C termini, an interaction which is important for genome encapsidation and nucleocapsid formation (Kuo and Masters, 2002). In addition, N protein has been shown to play a role in controlling mRNA transcription, translation and replication (Lai and Cavanagh, 1997; Schelle et al., 2005; Tahara et al., 1998). The abundance of N protein produced during an infection results in this protein playing an important role in host modulation. Accordingly, the N protein has been shown to interact with cyclophilin, an immuno-modulator, to activate the AP1 pathway involved in cell cycle control, to enter the nucleus, and to induce apoptosis in certain cell types (He et al., 2003; Luo et al., 2004; Surjit et al., 2004a; Wurm et al., 2001). The N protein is also a major immunogen and an important diagnostic marker for coronavirus disease (Leung et al., 2004) and has been shown to help improve the efficacy of avian coronavirus vaccines (Cavanagh, 2003; Zhao et al., 2005).
We present in this paper a structural analysis of the N protein of infectious bronchitis virus (IBV), a member of the Coronaviridae family. The recombinant N protein of coronavirus is highly susceptible to proteolysis, making the structural analysis of the full length protein difficult. To date, there is only limited structural information on the coronavirus N protein, which includes an NMR structural analysis of the N-terminal domain of SARS-CoV N protein (Huang et al., 2004) and our previously published crystallographic studies on the N-terminal domain of the IBV nucleocapsid protein at 2.8 Å resolution (Fan et al., 2005). As yet, there is no X-ray crystallographic structure of the C-terminal domain of the coronavirus N protein. In the present study, using limited proteolysis, we have been able to identify two stable domains of the N protein, which represent the N- and C-terminal domains respectively (NTD and CTD), and determine their crystal structures. Each of these domains crystallized in multiple crystal forms, allowing us to study their packing interactions and gain structural insights into nucleocapsid formation. With one of the crystal forms of the NTD, we have been able to determine the structure of the NTD to 1.3 Å resolution, a significantly higher resolution than our previous studies, and that of the CTD to 2.2 Å resolution, which represents the first crystal structure in this region of a coronavirus N protein.

Results

Limited proteolysis yields two stable independent domains. Noting that full length protein aggregated and degraded under a variety of conditions, we sought to identify stable domains that were resistant to proteolysis under limiting amounts of trypsin and V8 protease. The digestion pattern with V8 protease was not very distinct and yielded several diffuse bands. However, with trypsin the full length protein was cleaved to a “single”, stable ~17 kDa band within 15 minutes of trypsinization.
N-terminal sequencing identified this band to be composed of four tryptic fragments with two major cleavage sites at residues 19 and 219 and two secondary cleavage sites at residues 27 and 226 (Fig. 1a). The optimized domain constructs, termed NTD (residues 19-162) and CTD (residues 219-349), were then cloned, expressed and purified to homogeneity. The NTD was monomeric at moderate concentrations, whereas the CTD was a dimer even at very low concentrations, as assayed by gel-filtration chromatography. The NTD and CTD proteins tended to aggregate during purification and thus were purified at very low concentrations and concentrated only prior to crystallization screening.

NTD and CTD crystallized in multiple crystal forms. In contrast to the recently reported structure of NTD, which corresponded to the Beaudette strain and crystallized in the P1 space group, this structural analysis used the IBV-Gray strain, which crystallized in a different space group (C2) and diffracted to 1.3 Å resolution. The CTD also crystallized in different forms, as needles, rods, flat sheets or hexagonal crystals, under different conditions (Table I, crystal forms CTD1, CTD2 and CTD3). Rod-shaped CTD1 crystals of Se-Met substituted protein that diffracted to 2.0 Å in the P2₁2₁2₁ space group were used for the structure determination. The structures of CTD in the other crystal forms (CTD2 at 2.2 Å and CTD3 at 2.6 Å resolution) were determined by molecular replacement. The different packing arrangements in these crystal forms revealed multiple modes of self-interaction for these domains of the N protein and help suggest a plausible model for nucleocapsid organization in coronaviruses.

High resolution structure of NTD. The NTD in this study crystallized as a dimer formed by two interlocking monomers in the crystallographic asymmetric unit (ASU) arranged in a head-to-tail fashion (Fig. 1b). The structure of NTD (IBV-Gray strain) is almost identical to the structure of the NTD of the IBV Beaudette strain (Fig.
1c) reported previously (Fan et al., 2005), with the exception of five additional residues discernible at the N-terminus in the present structure. Briefly, the structure is composed of a relatively acidic globular core of twisted anti-parallel β-sheet that is surrounded by a number of loop regions. Prominent among the loop regions are two long loops corresponding to the N-terminal 12 amino acids (residues 22 to 34) and a loop region from residues 74 to 86 that constitutes an internal arm. These loops extend outward like long tethers from the globular core, resulting in a U-shaped monomer (Fig. 1b).

NTD exhibits a novel dimeric arrangement. The dimer in the ASU of the present structure is formed by interactions between the protruding basic arms of one U-shaped monomer (molecule A) and the acidic base of the other U-shaped monomer (molecule B, Fig. 1b). The two monomers are rotated with respect to each other by about 90°. The main difference between these two molecules, which are related by non-crystallographic symmetry, is that in molecule B one of the arms of the “U” (the internal arm) is disordered. The dimeric interaction has a buried surface area (BSA) of ~2150 Å², indicating a strong interaction between the two monomers. In addition, the dimers from neighboring unit cells, related by a translation, interact with each other with a BSA of 1082 Å², using the N-terminal loop (residues 22 to 29), which contacts an acidic groove in the neighboring NTD molecule to form a linear array (Fig. 1d). By contrast, in the previous structure of NTD by Fan et al. (2005), the dimer in the ASU was formed by “U”-shaped monomers interacting through the bases of their globular cores, such that the arms of the U-shaped monomers faced away from each other, with a BSA of only ~590 Å² (Fig. 1b).
The dramatic difference in packing of the NTD dimers in these two crystal forms is possibly due to differences in ionic strength and pH between the two crystallization conditions.

CTD forms a domain-swapped dimer. The CTD in all three crystal forms exists as an intimate domain-swapped dimer (Fig. 2) formed by monomers related by non-crystallographic symmetry in the ASUs of these crystals. The domain swapping in the CTD dimers is brought about by exchange of β-strands from one monomer to the other. The overall topology of the CTD dimer can be described as a concave floor of ~400 Å² area consisting of an anti-parallel β-sheet (β1B-β2B-β2A-β1A) surrounded by helices and loops (Fig. 2). Helices 3 and 4 are connected by a loop and, with their dimeric partners, form a groove; they arch inward over this floor and constitute the other face of the dimer. Another α-helix, α5, at the extreme C-terminus forms an angled wall that flanks either side of the dimer (Fig. 2). The structure-sequence relationships within the CTD are summarized in Fig. 1 of the Supplementary material. Recent biochemical and mass spectrometric studies on the IBV N protein of the Beaudette strain have suggested the possibility of disulfide bridges in the CTD (Chen et al., 2005). However, in the CTD structure of either strain there are no intra- or intermolecular disulfide bridges. The integrity of the domain-swapped dimer, with a large BSA of ~5000 Å², is consistent with the observation that CTD is a dimer in solution, and with several biochemical studies which map the dimerization domain of the full length protein to the C-terminal domain (Surjit et al., 2004b; Yu et al., 2005).

Multiple packing modes of CTD dimers. Although the structure of the dimer remains strikingly invariant, its molecular packing is considerably different in the three crystal forms we have studied.
The presence of one dimer in the ASU in two crystal forms (CTD1 and CTD3) and four dimers in the ASU of the other crystal form (CTD2) allowed the analysis of dimer-dimer interactions not only at different pHs and crystallization conditions, but also in the presence and absence of any constraints imposed by crystal packing. We have focused on those interactions with a BSA of more than 1000 Å², which typically signifies strong intermolecular interactions. Such an analysis of these inter-dimeric interactions is of relevance, considering the primary role of N protein in nucleocapsid formation. CTD1, which crystallized at pH 4.5, has one dimer in the ASU. The dimers related by the crystallographic 2₁ screw axes along orthogonal directions display three kinds of inter-dimeric interactions. In one of these interactions, dimers interact in a tail-to-tail fashion along one of the axes with a BSA of ~1100 Å² (referred to as type S hereafter), while the other two have a considerably smaller BSA of 400-800 Å² (Fig. 3a). In the CTD2 form, which crystallized at pH 8.5, the ASU has four dimers (Fig. 3b). The interaction between three of these dimers, although unrelated by any crystallographic symmetry, is very similar to the type S interactions seen in CTD1. However, unlike in CTD1, where the dimers form an infinitely long linear array, in CTD2 a small swivel between the three dimers (dimers 1, 2 and 3) gives the array a slight curvature. The type S interaction in both crystal forms is mediated by the C-terminal residues between 308 and 328, which include α-helix α5 and a type II turn. The two dimers are held together by a network of water-bridged polar interactions and a salt bridge between residues Arg 308 and Asp 314 (Fig. 3a, bottom). Despite significant differences in pH (pH 4.5 vs. pH 8.5), this salt bridge is preserved in both CTD1 and CTD2. In addition to the type S interactions, there is a lateral interaction between dimer 2 and dimer 4 (Fig.
3b) with a BSA of ~1250 Å² (type L). Dimer 4 is also involved in bridging the neighboring ASUs through a type S-like interaction (type S′) with dimer 1 across the ASUs. Such an inter-ASU interaction extends the helical array formed by dimers 1, 2 and 3 in either direction (Fig. 3b). In the case of the type L interaction, the CTD dimers interact predominantly via their N-terminal residues (residues 221 to 230). Here the α-helix-lined grooves in the CTD interact with each other, with the N-terminal loop serving as the interface. Consequently, the N-termini of molecule 4 and molecule 2 are more ordered in the electron density map than the termini of the other dimers. The CTD from the Beaudette strain crystallized in a completely different space group, P4₃, with one dimer in the ASU. In this crystal form, the interacting dimers, with a BSA of ~1085 Å², are related by crystallographic 4₃ screw symmetry (type F). These interactions are quite different from the type S interaction but bear some resemblance to the type L interactions seen in CTD1 and CTD2. The type F interaction is mediated by hydrogen bonding between Arg 230 of one monomer and the backbone carbonyls in the loop formed by residues 263 to 266 in the other monomer. The large BSA of ~1085 Å² along the 4₃ screw axis also results in an infinitely propagating 54 Å wide columnar array (Fig. 3c and 4b).

Discussion

A loosely helical, non-rigid nucleocapsid formed by a close association of a virus-encoded protein, commonly referred to as N protein, with the genomic RNA is a common feature in many of the enveloped ssRNA viruses, including coronaviruses. Structural information on the N protein and the molecular understanding of how this protein facilitates the formation of the nucleocapsid is limited. From the biochemical characterization of the N protein of IBV, a prototypical coronavirus, presented here, it is apparent that this protein has two major protease-resistant domains.
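As a compact summary of the interfaces reported in the Results, the buried surface areas can be checked against the 1000 Å² cutoff used there. The following is only a minimal sketch: the dictionary labels and the function name `strong_interfaces` are illustrative and not part of the original analysis, the values are the approximate ("~") figures quoted in the text, and the 400-800 Å² range for the minor CTD1 contacts is represented by its upper bound.

```python
# Buried surface areas (in Å^2) as quoted in the text; values marked "~" there
# are approximate. Labels are illustrative names chosen for this sketch.
BSA_CUTOFF = 1000.0  # cutoff used in the text for strong interactions

interfaces = {
    "NTD dimer in ASU (Gray strain)": 2150.0,
    "NTD dimer-dimer, linear array (Gray strain)": 1082.0,
    "NTD dimer in ASU (Beaudette strain)": 590.0,
    "CTD domain-swapped dimer": 5000.0,
    "CTD type S (CTD1)": 1100.0,
    "CTD minor contacts (CTD1, upper bound)": 800.0,
    "CTD type L (CTD2)": 1250.0,
    "CTD type F (CTD3)": 1085.0,
}


def strong_interfaces(table, cutoff=BSA_CUTOFF):
    """Return (label, BSA) pairs with BSA above the cutoff, largest first."""
    hits = [(label, bsa) for label, bsa in table.items() if bsa > cutoff]
    return sorted(hits, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for label, bsa in strong_interfaces(interfaces):
        print(f"{label}: ~{bsa:.0f} Å^2")
```

Under this cutoff, the Beaudette NTD dimer and the minor CTD1 contacts drop out, while the type S, L and F interfaces are retained alongside the two NTD interfaces of the Gray strain and the domain-swapped CTD dimer itself.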
Our X-ray crystallographic analysis of these two domains, NTD and CTD, provides some insights into how the two-domain organization of the N protein may coordinate nucleocapsid assembly.

NTD and CTD interact with RNA. Several biochemical studies have shown that determinants for RNA binding reside in the N-terminal region, with the minimal region being mapped to residues 177 to 231 in MHV (corresponding to 136 to 190 in IBV). In addition to NTD, the involvement of CTD in RNA binding has been shown by Fan et al. using gel-shift assays (Fan et al., 2005). Based on their recent structure of the NTD (IBV Beaudette strain), Fan et al. have proposed that the arms of the “U” shaped monomer, which are quite basic in nature, are likely the regions of the N protein that bind RNA. This is also consistent with NMR-NOE analysis of NTD-RNA interactions in the SARS-coronavirus N protein (Huang et al., 2004). A novel finding in our crystal structure analysis of the NTD (IBV-Gray strain) is that it can form a strong interlocking dimer, in contrast to the weak dimeric interaction observed in the NTD of the IBV Beaudette strain (Fan et al., 2005). These interlocking dimers associate to form a linear fiber with the basic tethers exposed along the surface. Such a fiber could provide for closely packed interactions of NTD with the genomic RNA. Analysis of N protein-RNA interactions in MHV at different stages of the virus life cycle revealed that these interactions progress from an RNase-sensitive complex involving subgenomic RNA to an RNase-resistant complex involving genomic RNA (Narayanan et al., 2003). The strong and weak NTD dimer interactions seen in the two structures possibly correspond to these different states of N protein-RNA association. The electrostatic potential surface of the CTD dimer is significantly polarized, with one of its faces being acidic and the other basic (Fig. 4a and 4b).
This basic face, made up of the α-helix-lined groove, is a likely candidate for its interactions with RNA.

Plausible model for nucleocapsid formation. In the formation of the nucleocapsid, the N protein has to self-associate tightly and interact with RNA such that the resulting structure is RNase resistant. Our crystal structure analysis of CTD indicates a tight dimer mediated by a domain-swapped interaction, suggesting that the full length N protein very likely functions as a dimer, with the CTD providing a structural scaffold while the NTD serves as a module for RNA interaction. The orientation of the NTD with respect to the CTD in the N protein is not certain from our crystal structure analysis of these two independent domains, because in the full length protein they are connected by a 47 residue protease-sensitive loop. It is possible that the RNA binding regions of these two domains face each other, engulfing the RNA between them and thus conferring resistance to RNase. In the various crystal forms of the CTD, we have seen the ability of CTD to self-associate in multiple modes with BSAs of greater than 1000 Å². Thus self-association of the full length N protein is very likely to be nucleated by the CTD. A relevant question is which of the inter-dimeric interactions seen in the multiple crystal forms of the CTD is used in the formation of the nucleocapsid. Both the type S and type F interactions are conducive to forming fibril structures. Propagation of any single type of interaction, however, would lead to a rigid, strictly helical nucleocapsid. Considering that the nucleocapsid is not a rigid rod-like structure in coronaviruses, nucleocapsid assembly may involve a combination of the various inter-dimeric interactions observed in our studies. It is possible that the type S interaction is primarily used, given that it is observed over a wider range of pH and seems to form independently of the constraints of crystal packing (as in CTD2 crystals).
A combination of the type S interactions with types L and F would appropriately modulate the curvature and change the direction of the nucleocapsid in the virion (Fig. 4d).

N protein interactions with M protein. In addition to its interactions with RNA, N protein is also known to interact with the M protein, which is an integral part of the viral membrane. Based on reverse genetic complementation assays, the interaction region between these two proteins has been mapped to their C-termini (Kuo and Masters, 2002). The C-terminus of the M protein is significantly basic, and recent mutational studies on the M protein have demonstrated that its interaction with the N protein is predominantly electrostatic in nature (Luo et al., 2005). The exposed acidic β-sheet floor of the CTD dimer, on the opposite side from the proposed RNA-binding region, may serve as a suitable site for interaction with the M protein. Thus the CTD may serve a dual purpose, not only mediating the self-association of the N protein in nucleocapsid formation but also providing a complementary surface for intermittent interactions with the M protein in the virus envelope.

Similarity with other coronaviral N proteins. Coronaviruses are classified into four groups, with SARS-CoV being an independent group. The N protein sequences are more similar within each group (~40%) than across groups (20-30%). The only X-ray structure of a coronaviral N protein available to date is that of the IBV N protein as described here and by Fan et al. (2005). However, NMR structures of the N- and C-terminal domains of the SARS N protein have been reported (Chang et al., 2005; Huang et al., 2004). Despite very low sequence similarity between the IBV and SARS N proteins, their NTD and CTD structures show the same general polypeptide fold, suggesting that these folds are conserved across the Coronaviridae N proteins. The polypeptide fold of the NTD is novel and is observed only in the coronavirus N protein.
Although domain-swapping has been observed in a variety of proteins (Liu and Eisenberg, 2002), the nature of the domain-swapping observed in the CTD appears to be rather unique, as indicated by a DALI (Holm and Sander, 1998) search, which revealed a striking similarity only to the 73 amino acid capsid-forming domain of PRRSV (porcine reproductive and respiratory syndrome virus), a corona-like virus that is a member of the Arteriviridae family. In the PRRSV N protein structure, this fragment also forms a domain-swapped dimer very similar to that seen in our CTD structure and exhibits self-association involving a salt bridge, as seen in the type S inter-dimeric interactions of the IBV CTD (Doan and Dokland, 2003) (Fig. 4c). Based on the similarity between the IBV CTD and a distantly related arterivirus N protein, it is tempting to speculate that this type of domain-swapped dimer, capable of self-association, may indeed be common in other enveloped viruses with a non-rigid helical nucleocapsid, such as orthomyxovirus, paramyxovirus, bunyavirus and arenavirus, all of which contain genomic ssRNA associated with their respective nucleocapsid proteins.

Materials and Methods

Purification of full length nucleocapsid protein and limited proteolysis. Full length N protein was expressed as described before (Zhou et al., 1996). The protein was further purified by heparin affinity chromatography, concentrated to 1-2 mg/ml and checked for monodispersity by dynamic light scattering (Dynapro) and negative-stain electron microscopy. Limited proteolytic cleavage of full length N protein (1-2 mg/ml) was carried out with 2% (wt trypsin/wt protein) sequencing grade trypsin (Roche) to identify tryptically stable domains. The identity of the amino termini of the proteolytic product(s) was ascertained by N-terminal amino acid sequencing of the band following gel electrophoresis and blotting onto a polyvinylidene fluoride membrane (PVDF-Immobilon-PSQ, Millipore).
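The 2% (wt trypsin/wt protein) ratio quoted above can be turned into a protease mass for a given digest. The sketch below is only illustrative: the reaction volume is not stated in the text and is a hypothetical input, and the function name `protease_mass_ug` is introduced here for convenience.

```python
def protease_mass_ug(protein_mg_per_ml, volume_ml, ratio_wt_wt=0.02):
    """Protease mass (micrograms) for a weight/weight protease:protein ratio.

    protein_mg_per_ml: substrate concentration (the text quotes 1-2 mg/ml)
    volume_ml:         digest volume (hypothetical; not given in the text)
    ratio_wt_wt:       protease-to-protein mass ratio (0.02 corresponds to 2%)
    """
    if protein_mg_per_ml <= 0 or volume_ml <= 0 or ratio_wt_wt <= 0:
        raise ValueError("concentration, volume and ratio must be positive")
    protein_mg = protein_mg_per_ml * volume_ml   # total substrate mass
    return protein_mg * ratio_wt_wt * 1000.0     # convert mg to micrograms


if __name__ == "__main__":
    # Assumed 1 ml digest at the two ends of the quoted concentration range.
    for conc in (1.0, 2.0):
        print(conc, "mg/ml ->", protease_mass_ug(conc, 1.0), "ug trypsin")
```

For an assumed 1 ml digest, the quoted 1-2 mg/ml range corresponds to roughly 20-40 µg of trypsin.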
For construct optimization, the carboxy termini were estimated based on the predicted secondary structure in the terminal region and mass spectrometric characterization of the proteolyzed protein.

Cloning, expression, purification and crystallization of the tryptic fragments of N protein. NTD and CTD proteins from two strains were employed in this study: IBV-Gray (CTD1, CTD2 and NTD1) and IBV-Beaudette (CTD3). The proteins were cloned and expressed as GST fusion proteins using the pET41 Ek-LIC vector (Novagen) for the Gray strain, or as detailed previously for the Beaudette strain (Fan et al., 2005). The expressed protein was purified using glutathione Sepharose (Pharmacia) columns followed by on-bead cleavage with enterokinase (EK-Max, Invitrogen). The cleavage reaction was performed by suspending 1 ml of beads in 40 ml of cutting buffer (250 mM NaCl, 50 mM Tris-HCl pH 8.0) with 10 units of protease. Following proteolysis, the dilute supernatant was purified further by gel filtration chromatography on a Superdex 75 16/60 column (Pharmacia). The purified N- and C-terminal domains were concentrated to 5-8 mg/ml and used for crystallization.

Data collection and phasing. Data were collected at various synchrotron beam lines as indicated in Table I (Supplementary material). For each crystal, the diffraction data were collected with a 1° oscillation angle and integrated and scaled using HKL2000 (Otwinowski and Minor, 1997). For the NTD, the diffraction data to 1.3 Å were phased using molecular replacement (MR) procedures in PHASER (Storoni et al., 2004) with the previously published NTD structure (PDB ID: 2BTL) at 2.8 Å resolution (Fan et al., 2005) as the search model. Following MR, further model building and refinement were performed in a similar manner to the CTD, as described below. The CTD crystallized in three crystal forms (Table I, Supplementary material).
The structure of the CTD was determined from selenomethionine (Se-Met) substituted protein (crystal form CTD1, Table I) to 2.0 Å resolution using MAD (multi-wavelength anomalous dispersion) datasets collected at two different wavelengths (Se-peak, 0.9734 Å; Se-inflection, 0.9748 Å). Positions of the four Se atoms were located using the SnB program (Weeks and Miller, 1999) and refined using SHARP (figure of merit of 0.65) (Bricogne et al., 2003). An electron density map was calculated following density modification using CCP4 (Collaborative Computational Project, 1994). An initial model was built using ARP/WARP (Lamzin et al., 2001), followed by manual model building using COOT (Emsley and Cowtan, 2004). Model refinement was performed using CNS (Brunger et al., 1998) for the initial rounds of simulated annealing, followed by refinement using REFMAC5 (Pannu et al., 1998). The structures of CTD in the two other crystal forms (CTD2 and CTD3, Table I, Supplementary material) were phased using MR procedures implemented in PHASER (Storoni et al., 2004). Model bias in both NTD and CTD structures was reduced by using the prime-and-switch technique implemented in SOLVE/RESOLVE (Terwilliger and Berendzen, 1999). The stereochemistry of the structures was checked by PROCHECK (Laskowski et al., 1993) during the course of model building and refinement. Electrostatic potentials were calculated using DELPHI (Nicholls and Honig, 1991). All figures were generated using PyMOL (DeLano, 2002) and ESPript (Gouet et al., 1999).

Acknowledgements

This work was supported by grants from the NIH (AI36040) and the Robert Welch Foundation to BVVP, and grants from the Singapore Biomedical Research Council and the Academic Research Fund to JL. We thank Jennifer Falon and Florante Quiocho for use of the in-house X-ray diffraction facility at BCM. HJ wishes to thank Chris Miller and HHMI for support during the latter half of this project.
We acknowledge use of the SBC-CAT 19ID and BIOCARS BM14 beam line and its staff for their help during data collection at the Advanced Photon Source supported by the U.S. Department of Energy, Basic Energy Sciences, Office of Science, under Contract No. W-31-109-Eng-38. 17 References "Collaborative Computational Project, n. (1994) The CCP4 suite: programs for protein crystallography. Acta Crystallogr D Biol Crystallogr, 50, 760-763. Baric, R.S., Nelson, G.W., Fleming, J.O., Deans, R.J., Keck, J.G., Casteel, N. and Stohlman, S.A. (1988) Interactions between coronavirus nucleocapsid protein and viral RNAs: implications for viral transcription. J Virol, 62, 4280-4287. Bricogne, G., Vonrhein, C., Flensburg, C., Schiltz, M. and Paciorek, W. (2003) Generation, representation and flow of phase information in structure determination: recent developments in and around SHARP 2.0. Acta Crystallogr D Biol Crystallogr, 59, 2023-2030. Brunger, A.T., Adams, P.D., Clore, G.M., DeLano, W.L., Gros, P., Grosse-Kunstleve, R.W., Jiang, J.S., Kuszewski, J., Nilges, M., Pannu, N.S., Read, R.J., Rice, L.M., Simonson, T. and Warren, G.L. (1998) Crystallography & NMR system: A new software suite for macromolecular structure determination. Acta Crystallogr D Biol Crystallogr, 54, 905-921. Cavanagh, D. (2003) Severe acute respiratory syndrome vaccine development: experiences of vaccination against avian infectious bronchitis coronavirus. Avian Pathol, 32, 567-582. Chang, C.K., Sue, S.C., Yu, T.H., Hsieh, C.M., Tsai, C.K., Chiang, Y.C., Lee, S.J., Hsiao, H.H., Wu, W.J., Chang, C.F. and Huang, T.H. (2005) The dimer interface of the SARS coronavirus nucleocapsid protein adapts a porcine respiratory and reproductive syndrome virus-like structure. FEBS Lett, 579, 5663-5668. Chen, H., Gill, A., Dove, B.K., Emmett, S.R., Kemp, C.F., Ritchie, M.A., Dee, M. and Hiscox, J.A. 
(2005) Mass spectroscopic characterization of the coronavirus infectious bronchitis virus nucleoprotein and elucidation of the role of phosphorylation in RNA binding by using surface plasmon resonance. J Virol, 79, 1164-1179. Cologna, R. and Hogue, B.G. (1998) Coronavirus nucleocapsid protein. RNA interactions. Adv Exp Med Biol, 440, 355-359. DeLano, W.L. (2002) The PyMOL Molecular Graphics System (2002) on World Wide Web http://www.pymol.org. Doan, D.N. and Dokland, T. (2003) Structure of the nucleocapsid protein of porcine reproductive and respiratory syndrome virus. Structure (Camb), 11, 1445-1451. Emsley, P. and Cowtan, K. (2004) Coot: model-building tools for molecular graphics. Acta Crystallogr D Biol Crystallogr, 60, 2126-2132. Fan, H., Ooi, A., Tan, Y.W., Wang, S., Fang, S., Liu, D.X. and Lescar, J. (2005) The nucleocapsid protein of coronavirus infectious bronchitis virus: crystal structure of its N-terminal domain and multimerization properties. Structure (Camb), 13, 1859-1868. Gouet, P., Courcelle, E., Stuart, D.I. and Metoz, F. (1999) ESPript: analysis of multiple sequence alignments in PostScript. Bioinformatics, 15, 305-308. He, R., Leeson, A., Andonov, A., Li, Y., Bastien, N., Cao, J., Osiowy, C., Dobie, F., Cutts, T., Ballantine, M. and Li, X. (2003) Activation of AP-1 signal transduction 18 pathway by SARS coronavirus nucleocapsid protein. Biochem Biophys Res Commun, 311, 870-876. Holm, L. and Sander, C. (1998) Touring protein fold space with Dali/FSSP. Nucleic Acids Res, 26, 316-319. Huang, Q., Yu, L., Petros, A.M., Gunasekera, A., Liu, Z., Xu, N., Hajduk, P., Mack, J., Fesik, S.W. and Olejniczak, E.T. (2004) Structure of the N-terminal RNA-binding domain of the SARS CoV nucleocapsid protein. Biochemistry, 43, 6059-6063. Kuo, L. and Masters, P.S. (2002) Genetic evidence for a structural interaction between the carboxy termini of the membrane and nucleocapsid proteins of mouse hepatitis virus. J Virol, 76, 4987-4999. Lai, M.M. and Cavanagh, D. 
(1997) The molecular biology of coronaviruses. Adv Virus Res, 48, 1-100.
Lamzin, V.S., Perrakis, A. and Wilson, K.S. (2001) The ARP/WARP suite for automated construction and refinement of protein models. In Rossmann, M.G. and Arnold, E. (eds.), International Tables for Crystallography. Vol. F: Crystallography of biological macromolecules. Dordrecht, Kluwer Academic Publishers, The Netherlands, pp. 720-722.
Laskowski, R.A., MacArthur, M.W., Moss, D.S. and Thornton, J.M. (1993) PROCHECK: a program to check the stereochemical quality of protein structures. J. Appl. Crystallogr., 26, 283-291.
Leung, D.T., Tam, F.C., Ma, C.H., Chan, P.K., Cheung, J.L., Niu, H., Tam, J.S. and Lim, P.L. (2004) Antibody response of patients with severe acute respiratory syndrome (SARS) targets the viral nucleocapsid. J Infect Dis, 190, 379-386.
Liu, Y. and Eisenberg, D. (2002) 3D domain swapping: as domains continue to swap. Protein Sci, 11, 1285-1299.
Luo, C., Luo, H., Zheng, S., Gui, C., Yue, L., Yu, C., Sun, T., He, P., Chen, J., Shen, J., Luo, X., Li, Y., Liu, H., Bai, D., Shen, J., Yang, Y., Li, F., Zuo, J., Hilgenfeld, R., Pei, G., Chen, K., Shen, X. and Jiang, H. (2004) Nucleocapsid protein of SARS coronavirus tightly binds to human cyclophilin A. Biochem Biophys Res Commun, 321, 557-565.
Luo, H., Wu, D., Shen, C., Chen, K., Shen, X. and Jiang, H. (2005) Severe acute respiratory syndrome coronavirus membrane protein interacts with nucleocapsid protein mostly through their carboxyl termini by electrostatic attraction. Int J Biochem Cell Biol.
Narayanan, K., Kim, K.H. and Makino, S. (2003) Characterization of N protein self-association in coronavirus ribonucleoprotein complexes. Virus Res, 98, 131-140.
Nelson, G.W., Stohlman, S.A. and Tahara, S.M. (2000) High affinity interaction between nucleocapsid protein and leader/intergenic sequence of mouse hepatitis virus RNA. J Gen Virol, 81, 181-188.
Nicholls, A. and Honig, B.
(1991) A Rapid Finite Difference Algorithm, to Solve the Poisson-Boltzmann Equation. J. Comput. Chem., 12, 435-445.
Otwinowski, Z. and Minor, W. (1997) Processing of X-ray Diffraction Data Collected in Oscillation Mode. In Carter, C.W.J. and Sweet, R.M. (eds.), Methods in Enzymology. Academic Press, New York, N.Y., Vol. 276, pp. 307-326.
Pannu, N.S., Murshudov, G.N., Dodson, E.J. and Read, R.J. (1998) Incorporation of prior phase information strengthens maximum-likelihood structure refinement. Acta Crystallogr D Biol Crystallogr, 54, 1285-1294.
Risco, C., Anton, I.M., Enjuanes, L. and Carrascosa, J.L. (1996) The transmissible gastroenteritis coronavirus contains a spherical core shell consisting of M and N proteins. J Virol, 70, 4773-4777.
Risco, C., Muntion, M., Enjuanes, L. and Carrascosa, J.L. (1998) Two types of virus-related particles are found during transmissible gastroenteritis virus morphogenesis. J Virol, 72, 4022-4031.
Schelle, B., Karl, N., Ludewig, B., Siddell, S.G. and Thiel, V. (2005) Selective replication of coronavirus genomes that express nucleocapsid protein. J Virol, 79, 6620-6630.
Siddell, S.G. (1995) The Coronaviridae. Plenum Press, New York, N.Y.
Stohlman, S.A. and Lai, M.M. (1979) Phosphoproteins of murine hepatitis viruses. J Virol, 32, 672-675.
Storoni, L.C., McCoy, A.J. and Read, R.J. (2004) Likelihood-enhanced fast rotation functions. Acta Crystallogr D Biol Crystallogr, 60, 432-438.
Surjit, M., Liu, B., Jameel, S., Chow, V.T. and Lal, S.K. (2004a) The SARS coronavirus nucleocapsid protein induces actin reorganization and apoptosis in COS-1 cells in the absence of growth factors. Biochem J, 383, 13-18.
Surjit, M., Liu, B., Kumar, P., Chow, V.T. and Lal, S.K. (2004b) The nucleocapsid protein of the SARS coronavirus is capable of self-association through a C-terminal 209 amino acid interaction domain. Biochem Biophys Res Commun, 317, 1030-1036.
Tahara, S.M., Dietlin, T.A., Nelson, G.W., Stohlman, S.A. and Manno, D.J.
(1998) Mouse hepatitis virus nucleocapsid protein as a translational effector of viral mRNAs. Adv Exp Med Biol, 440, 313-318.
Terwilliger, T.C. and Berendzen, J. (1999) Automated MAD and MIR structure solution. Acta Crystallogr D Biol Crystallogr, 55, 849-861.
Weeks, C.M. and Miller, R. (1999) Optimizing Shake-and-Bake for proteins. Acta Crystallogr D Biol Crystallogr, 55, 492-500.
Wurm, T., Chen, H., Hodgson, T., Britton, P., Brooks, G. and Hiscox, J.A. (2001) Localization to the nucleolus is a common feature of coronavirus nucleoproteins, and the protein may disrupt host cell division. J Virol, 75, 9345-9356.
Yu, I.M., Gustafson, C.L., Diao, J., Burgner, J.W., 2nd, Li, Z., Zhang, J. and Chen, J. (2005) Recombinant severe acute respiratory syndrome (SARS) coronavirus nucleocapsid protein forms a dimer through its C-terminal domain. J Biol Chem, 280, 23280-23286.
Zhao, P., Cao, J., Zhao, L.J., Qin, Z.L., Ke, J.S., Pan, W., Ren, H., Yu, J.G. and Qi, Z.T. (2005) Immune responses against SARS-coronavirus nucleocapsid protein induced by DNA vaccine. Virology, 331, 128-135.
Zhou, M., Williams, A.K., Chung, S.I., Wang, L. and Collisson, E.W. (1996) The infectious bronchitis virus nucleocapsid protein binds RNA sequences in the 3' terminus of the genome. Virology, 217, 191-199.

Figure Legends

Fig. 1. Structure of the NTD. (a) Schematic diagram of limited proteolysis data showing major (arrow) and minor (short line) trypsinization sites in the full-length IBV N protein. The positions of the N- and C-terminal domains (NTD and CTD) are indicated by black rectangles. (b) Cartoon ribbon representation of the 1.3 Å structure of the NTD (Gray strain) asymmetric homodimer (molecules A and B as indicated). (c) NTD (Beaudette strain) dimer determined by Fan et al. (2005). The region corresponding to the disordered internal arm is colored orange. (d) Electrostatic potential surface of the linear array of NTD dimers formed by crystallographic translation.
Molecules A and B that constitute the dimer are indicated. The N-terminal arm and the region corresponding to the internal arm, both rich in basic residues, are indicated by black and cyan arrows respectively. The disordered loop in the B molecule is indicated by a dotted line.

Fig. 2. Structure of the CTD. Stereo view of the CTD dimer. The domain-swapped CTD dimer is formed by exchanging the β2-strand between the two monomers (in magenta and yellow), which are related by non-crystallographic symmetry. The β-strands from both monomers form an extended antiparallel β-sheet floor lined by α-helices.

Fig. 3. Interdimer interactions in CTD. (a) Crystal packing interactions in CTD1 (pH 4.5) crystals with one dimer in the ASU. Three consecutive dimers from neighboring ASUs (numbered n, n+1 and n-1) are related by one of the three orthogonal 2₁ screw axes (type S interaction). The monomers are colored differently; the N and C termini of the n+1 dimer are indicated. The salt-bridge interaction seen in the type S dimer-dimer interactions is circled (between the n and n-1 dimers). A close-up view of the salt-bridge interaction with the electron density map is shown in the inset below. (b) The ASU of CTD2 (pH 8.5) crystals has four dimers, each shown in a different color and numbered 1 through 4. The two classes of dimer-dimer interactions are indicated by S (between molecules 1 and 2, and 2 and 3), and L (between molecules 2 and 4). The bridging type S′ interactions are shown with molecule 4 from two adjacent ASUs and molecules 1, 2 and 3 from the bottom ASU (all colored gray). (c) Dimer-dimer interactions in the CTD3 crystals with one dimer in the ASU (type F). Each dimer related by the crystallographic 4₃ screw axis is shown in a different color. The columnar nature of the packing interactions is shown in red as a projection along the fiber axis below.

Fig. 4. Possible model for helical nucleocapsid formation.
(a) Electrostatic potential surface of the CTD fiber formed by S and S′ interactions in the CTD2 crystals, with a similar scale for basic (blue) and acidic (red) patches. (b) Electrostatic potential surface of the fiber formed by the close association of the five 4₃-related dimers from the adjoining unit cells in the CTD3 crystals, using the same scale and color representation as in (a) for basic and acidic patches. The basic α-helix-lined grooves are well exposed. (c) Linear array formed by the PRRSV capsid-forming domain (PDB ID: 1P65; Doan and Dokland, 2003). The domain-swapped dimers interact via their terminal helices and a conserved salt bridge (circled). (d) A possible model for nucleocapsid formation based on the protein-protein interactions observed in our crystallographic structures of the IBV NTD and CTD domains. The NTD dimers (gray spheres) possibly bind the genomic RNA (black line), which makes secondary contacts with the CTD fiber, and together they enclose the genome. The domain-swapped CTD dimers (red and green) interact via the type S interaction, with the S′ interaction used to introduce a slight bend in the direction of the fiber. Any changes in the curvature or the direction of the nucleocapsid are facilitated by incorporating type L or type F interactions.