BIOLOGICAL SCIENCES
X-ray structures of the N- and C-terminal domains of a coronavirus nucleocapsid protein: structural basis of helical nucleocapsid formation
Hariharan Jayaram, Hui Fan&, Brian R. Bowman, Amy Ooi&, Jyothi Jayaram, Ellen W. Collison, Julien Lescar, B. V. Venkataram Prasad
Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas 77030, U.S.A.; Department of Veterinary Pathobiology, Texas A&M University, College Station, Texas 77843, U.S.A.; School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, Singapore 637551
Abstract: Coronaviruses cause a variety of respiratory and enteric diseases in animals and humans, including SARS, a disease with emerging global impact. The enveloped capsids of these viruses enclose a single-stranded RNA genome that is tightly associated with the nucleocapsid protein (N protein), a highly immunogenic protein implicated in genome packaging, replication, and apoptosis in certain virus-infected cells. We present here the X-ray structures of two protease-resistant domains, NTD and CTD, which together comprise 66% of the N protein, determined to 1.3 Å and 2.2 Å resolution, respectively. Our studies indicate that the N protein forms stable dimers through domain-swapped interactions in the CTD, and that these dimers facilitate the presentation of RNA-binding regions in the U-shaped NTD, which has two long arms rich in basic residues. In our studies of these two domains, which crystallized in multiple forms under different conditions, we observed a variety of inter-dimeric interactions, including those that promote fiber formation. Analysis of these interactions has provided structural insights into how the modular organization of the N protein may facilitate the formation of the non-rigid helical nucleocapsid with closely packed N proteins.
The fold of the CTD is similar to that observed in a distantly related Nidovirales member and in the nucleocapsid protein of an unrelated flavivirus, indicating that these viruses may share common structural principles of nucleocapsid formation involving a conserved domain, which provides a structural scaffold and allows concerted interaction between both NTD and CTD to package the genomic RNA in the virus.
Coronaviridae, a family in the order Nidovirales, comprises viruses that are significant causative agents of human upper respiratory infections, including common colds, and of more severe illnesses such as SARS (severe acute respiratory syndrome). The coronaviruses are enveloped viruses with diameters ranging from 80 to 160 nm. The viral genome consists of a single 30 kb segment of positive-sense single-stranded RNA (Siddell 1995). Upon infection, the genomic RNA gives rise to a 3'-coterminal set of four or more subgenomic mRNAs with a common leader sequence at their 5' ends. These subgenomic RNAs encode the various viral structural and non-structural proteins required to replicate the virus and produce progeny virions. The enveloped capsid of the virus is made up predominantly of the membrane glycoprotein (M), a small transmembrane protein (E), and an array of spikes composed of the spike glycoprotein (S), which gives the roughly spherical particles their corona. A significant protein component of the capsid is the nucleocapsid protein (N), which interacts with the genomic RNA to form the central core of the virion. Electron microscopic studies of detergent-permeabilized capsids of transmissible gastroenteritis virus (TGEV, a prototype coronavirus) revealed that the internal nucleocapsid is helical and is composed of the ssRNA genome tightly associated with the N (nucleocapsid) protein (Risco, Anton et al. 1996; Risco, Muntion et al. 1998).
The N protein is typically a multifunctional basic phosphoprotein of molecular weight 50-60 kDa, which is synthesized in large amounts along with its coding RNA during infection (Stohlman and Lai 1979; Lai and Cavanagh 1997). The highly basic N protein exhibits non-specific ssRNA binding ability with an increased affinity for genomic RNA (Cologna and Hogue 1998) and binds consensus sequences at the 5' and 3' termini of the genome. Biochemical studies on coronaviruses such as mouse hepatitis virus (MHV), infectious bronchitis virus (IBV), and SARS virus have mapped the RNA-binding function to a minimal 55-residue segment in the N-terminal half of the N protein and the dimerization function to the C-terminal half (Nelson, Stohlman et al. 2000; Hui Fan 2005; Yu, Gustafson et al. 2005). During the virus life cycle, multiple copies of the N protein interact extensively with the genomic as well as the subgenomic RNAs that are synthesized (Baric, Nelson et al. 1988; Narayanan, Kim et al. 2003) and possibly participate in genome packaging, which is initiated by recognition of a packaging signal by the M protein. The M and N proteins also interact closely via their C-termini, an interaction that is important for genome encapsidation and nucleocapsid formation (Kuo and Masters 2002). In addition, the N protein has been shown to play a role in controlling mRNA transcription, translation and replication (Lai and Cavanagh 1997; Tahara, Dietlin et al. 1998; Schelle, Karl et al. 2005). Because the N protein is produced in abundance during infection, it also plays an important role in host modulation. Accordingly, the N protein has been shown to interact with cyclophilin, an immuno-modulator, to activate the AP-1 pathway involved in cell cycle control, to enter the nucleus, and to induce apoptosis in certain cell types (He, Leeson et al. 2003; Luo, Luo et al. 2004; Surjit, Liu et al. 2004).
The N protein is also a major immunogen and an important diagnostic marker for coronavirus disease (Leung, Tam et al. 2004) and has been shown to help improve the efficacy of avian coronavirus vaccines (Cavanagh 2003; Zhao, Cao et al. 2005).
We present in this paper a structural analysis of the N protein of infectious bronchitis virus (IBV), a member of the Coronaviridae family. The recombinant N protein of coronavirus is highly susceptible to proteolysis, making structural analysis of the full-length protein difficult. To date, there is only limited structural information on the coronavirus N protein: an NMR structural analysis of the N-terminal domain of the SARS nucleocapsid protein (Huang, Yu et al. 2004) and our previously published crystallographic studies on the N-terminal domain of the IBV nucleocapsid protein at 2.8 Å resolution (Hui Fan 2005). As yet there is no X-ray crystallographic structure of the C-terminal domain of the coronavirus N protein. In the present study, using limited proteolysis, we have identified two stable domains of the N protein, which represent its N- and C-terminal domains (NTD and CTD, respectively), and determined their crystal structures. Each of these domains crystallized in multiple crystal forms, allowing us to study their packing interactions and gain structural insights into nucleocapsid formation. With one of the crystal forms of the NTD, we have determined the structure of the NTD to 1.3 Å resolution, a significantly higher resolution than in our previous studies. We have also determined the structure of the CTD, which represents the first crystal structure of this region of the coronavirus N protein, to 2.2 Å resolution.
Materials and Methods
Purification of full length nucleocapsid protein and limited proteolysis: Full-length nucleocapsid protein was expressed as before (Zhou, Williams et al. 1996).
The protein was further purified by heparin affinity chromatography, concentrated to 1-2 mg/ml, and checked for monodispersity by dynamic light scattering (Dynapro) and negative-stain electron microscopy. Limited proteolytic cleavage of full-length N protein (1-2 mg/ml) was carried out with 2% (wt trypsin/wt protein) sequencing-grade trypsin (Roche) to identify tryptically stable domains. The amino termini of the proteolytic products were identified by N-terminal amino acid sequencing of the band following gel electrophoresis and blotting onto a polyvinylidene fluoride membrane (PVDF, Immobilon-PSQ, Millipore). For construct optimization, the carboxy termini were estimated from the predicted secondary structure in the terminal region and from mass spectrometric characterization of the proteolyzed protein.
Cloning, expression, purification and crystallization of the tryptic fragments of N protein: NTD and CTD proteins from two strains were employed in this study: the IBV-Gray strain (CTD1, CTD2 and NTD1) and the IBV-Beaudette strain (CTD3 and NTD2). The Gray strain proteins were cloned and expressed as GST fusion proteins using the pET-41 Ek/LIC vector (Novagen); the Beaudette strain proteins were prepared as detailed previously (Hui Fan 2005). The expressed protein was purified using glutathione Sepharose (Pharmacia) columns followed by on-bead cleavage with enterokinase (EK-Max, Invitrogen). The cleavage reaction was performed by suspending 1 ml of beads in 40 ml of cutting buffer (250 mM NaCl, 50 mM Tris-HCl pH 8.0) with 10 units of protease. Following proteolysis, the dilute supernatant was purified further by gel filtration chromatography on a Superdex 75 16/60 column (Pharmacia). The purified N- and C-terminal domains were concentrated to 5-8 mg/ml and used for crystallization.
Data Collection and phasing: Data were collected at various synchrotron beam lines as indicated in Table I (supplementary material).
For each crystal, the diffraction data were collected with a 1 degree oscillation angle and integrated and scaled using HKL2000 (reference). For the NTD, the diffraction data to 1.3 Å were phased by molecular replacement in PHASER (Storoni, McCoy et al. 2004) using the previously published NTD structure at 2.8 Å resolution (Hui Fan 2005). Following molecular replacement, further model building and refinement were performed in the same manner as for the CTD, described below. The CTD crystallized in three crystal forms (Table I, supplementary material). The structure of the CTD was determined from selenomethionine-substituted protein (pH 4.5 crystal in Table I) to 2.0 Å resolution using MAD (multiwavelength anomalous diffraction) datasets collected at two different wavelengths (Se-peak, 0.9734 Å; Se-inflection, 0.9748 Å). The positions of the four selenium atoms were located using the SnB program (Weeks and Miller 1999). The initial solution, with a figure of merit of 0.65, was then refined, and an electron density map was calculated following a density modification procedure using SHARP (Bricogne 1997). An initial model was built using ARP/WARP (Lamzin 2001), followed by manual model building using COOT (Emsley and Cowtan 2004). Refinement was performed using CNS (Brunger, Adams et al. 1998) for the initial rounds of simulated annealing, followed by REFMAC5 (Pannu, Murshudov et al. 1998). The structures of the CTD in the two other crystal forms were phased by molecular replacement in PHASER. Model bias in both the NTD and CTD structures was reduced using the prime-and-switch methodology implemented in SOLVE/RESOLVE (Terwilliger and Berendzen 1999). The stereochemistry of the structures was checked with PROCHECK (Roman A Laskowski 1993) during the course of model building and refinement. All figures were generated using PyMOL (DeLano 2002) and ESPript (Gouet, Courcelle et al. 1999).
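As a quick check on the MAD data collection described under Data Collection and phasing, the two selenium wavelengths can be converted to photon energies with E = hc/λ. The short Python sketch below does this, with the quoted wavelengths taken in ångströms; the function and variable names are illustrative only and do not come from any crystallographic package.

```python
# Convert X-ray wavelengths (taken to be in angstroms) to photon energies.
HC_KEV_ANGSTROM = 12.398419843  # Planck constant times speed of light, in keV*angstrom


def wavelength_to_energy_kev(wavelength_angstrom):
    """Return the photon energy in keV for a wavelength given in angstroms."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_KEV_ANGSTROM / wavelength_angstrom


# Wavelengths quoted in the text for the two MAD datasets.
mad_wavelengths = {"Se-peak": 0.9734, "Se-inflection": 0.9748}
mad_energies_kev = {
    label: wavelength_to_energy_kev(wavelength)
    for label, wavelength in mad_wavelengths.items()
}

for label, energy in mad_energies_kev.items():
    print(f"{label}: {energy:.3f} keV")
```

With these inputs the two energies come out near 12.74 keV and 12.72 keV, roughly 18 eV apart, the shorter (peak) wavelength giving the higher energy.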
Results and Discussion:
Limited proteolysis yields two stable independent domains. Our structural characterization of purified full-length N protein was impeded by its aggregation and degradation on storage under a variety of conditions. The full-length N protein was also extremely polydisperse in solution, as characterized by dynamic light scattering, and was not amenable to X-ray crystallographic analysis. Using limited proteolysis, we sought to identify regions of the protein that represent stable domains resistant to proteolysis by limiting amounts of trypsin and V8 protease. The digestion pattern with V8 protease was not very distinct and yielded several diffuse bands. With trypsin, however, the full-length protein was cleaved to a “single”, stable ~17 kDa band within 15 minutes. N-terminal sequencing showed this band to be composed of four tryptic fragments, with two major cleavage sites at residues 19 and 219 and two secondary cleavage sites at residues 27 and 226 (Figure 1a). The optimized domain constructs, termed NTD (residues 19-162) and CTD (residues 219-349), were then cloned, expressed and purified to homogeneity. The NTD was monomeric at moderate concentrations, whereas the CTD was a dimer even at very low concentrations, as assayed by gel-filtration chromatography. The NTD and CTD proteins tended to aggregate during purification and were therefore purified at very low concentrations and concentrated only prior to crystallization screening. The NTD and CTD proteins failed to interact at a variety of salt and protein concentrations, as assayed by gel-filtration co-fractionation and pull-down experiments (data not shown). NTD and CTD therefore represent independent, non-interacting domains of the full-length protein and were suitable for separate X-ray crystallographic analysis.
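The tryptic mapping above relies on the sequence specificity of trypsin, which cleaves on the C-terminal side of lysine and arginine and generally not when the next residue is proline. The Python sketch below illustrates that rule only. The peptide is a made-up toy sequence, not the IBV N protein, and the function names are hypothetical. The rule lists every potential site, whereas limited proteolysis of a folded protein cuts only the sites that are structurally accessible, which is why the surviving fragments are informative about domain boundaries.

```python
def tryptic_sites(sequence):
    """Return 1-based positions of residues after which trypsin may cleave.

    Simplified rule: cleave after K or R unless the following residue is P.
    The final residue is skipped because nothing follows it.
    """
    sites = []
    for index, residue in enumerate(sequence[:-1]):
        next_residue = sequence[index + 1]
        if residue in "KR" and next_residue != "P":
            sites.append(index + 1)
    return sites


def tryptic_fragments(sequence):
    """Split a sequence into the fragments produced by a complete digest."""
    fragments = []
    start = 0
    for site in tryptic_sites(sequence):
        fragments.append(sequence[start:site])
        start = site
    fragments.append(sequence[start:])
    return fragments


toy_peptide = "MAKRPGSKLLR"  # hypothetical sequence, for illustration only
toy_sites = tryptic_sites(toy_peptide)
toy_fragments = tryptic_fragments(toy_peptide)
print(toy_sites)
print(toy_fragments)
```

In the toy peptide the arginine that precedes proline is not treated as a cleavage site, so the digest yields three fragments rather than four.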
NTD and CTD crystallized in multiple crystal forms: In contrast to the recently reported structure of the NTD, which corresponded to the Beaudette strain and crystallized in the P1 space group, this structural analysis employs the IBV-Gray strain, which crystallized in a different space group and diffracted to 1.3 Å resolution. The CTD also crystallized in different forms, as needles, rods, flat sheets or hexagonal crystals, under different conditions (Table I, crystal forms I, II and III). Rod-shaped CTD1 crystals of SeMet-substituted protein, which diffracted to 2.0 Å in the P2₁2₁2₁ space group, were used for structure determination. The structures of the CTD in the two other crystal forms (P2₁2₁2 at 2.2 Å resolution and P4₃ at 2.6 Å resolution) were determined by molecular replacement. In all, we have determined the structure of the NTD in two space groups and of the CTD in three space groups. The different packing arrangements in these crystal forms reveal multiple modes of self-interaction for these domains of the nucleocapsid protein and help suggest a plausible model for nucleocapsid organization in coronaviruses.
High resolution structure of NTD
The NTD in this study crystallized as an asymmetric dimer of two interlocking monomers arranged head to tail in the crystallographic asymmetric unit (Figure 1a). The secondary structure and fold of the NTD (IBV-Gray strain) are almost identical to those of the NTD of the IBV N protein from the Beaudette strain reported previously (Hui Fan 2005), with the exception of five additional residues discernible at the N-terminus in the present structure. Briefly, the structure is composed of a relatively acidic globular core of twisted anti-parallel β-sheet surrounded by a number of loop regions. Prominent among these are two long loops: one corresponding to the N-terminal 12 amino acids (residues 22 to 34) and one from residues 74 to 86 that constitutes an internal arm.
These loops extend outward like long tethers from the globular core, resulting in a U-shaped monomer (cyan and black arrow, Figure 1b).
NTD dimer exhibits novel dimeric arrangement
The dimer in the ASU of the present structure is formed by interactions between the protruding basic arms of one U-shaped monomer (molecule A) and the acidic base of the other U-shaped monomer (molecule B, Figure 1b). The two monomers are rotated with respect to each other by about 90 degrees. The main difference between these two molecules, which are related by non-crystallographic symmetry, is that in molecule B one of the arms of the “U” (the internal arm) is disordered. The dimeric interaction buries a surface area of 2168 Å², indicating a strong interaction between the two monomers. In addition to this dimeric arrangement, in the NTD1 crystals dimers from neighboring unit cells related by translation interact with each other through the N-terminal loop (residues 22 to 29), which binds an acidic groove in the neighboring NTD molecule. This forms a linear array in which consecutive NTD dimers bury a surface area of 1082 Å². By comparison, the previous structure of the NTD by Hui Fan et al. also contained a dimer in the asymmetric unit, but there the “U”-shaped monomers interacted through the bases of their globular cores, so that the arms of the U-shaped monomers faced away from each other, with a buried surface area of 596 Å² (Figure 1b). The dramatic difference in packing of the NTD dimers in these two crystal forms possibly results from differences in ionic strength and pH between the two crystallization conditions.
CTD forms a domain-swapped dimer
The CTD in all three crystal forms exists as an intimate domain-swapped dimer (Figure 2) formed by monomers related by non-crystallographic symmetry in the asymmetric unit of these crystals.
The domain swapping in the CTD dimers is brought about by the exchange of β-strands from one monomer to the other. The overall topology of the CTD dimer can be described as a concave floor of ~400 Å² area consisting of an anti-parallel β-sheet (β1B-β2B-β2A-β1A) surrounded by helices and loops (Figure 2B). Helices 3 and 4, connected by a loop region, arch inward over this floor and constitute the other face of the dimer. The dimer is thus bounded on two faces by a curved β-sheet floor and a curved α-helical groove. A 12-residue α-helix, α5, located at the extreme C-terminus, forms an angled wall that flanks either side of the dimer and is held in place by a tight turn made by residues 307 to 310 (Figure 2). The integrity of the dimer observed in the crystal, with a buried surface area of ~5000 Å², is consistent with the observation that the CTD is a dimer in solution and with several biochemical studies that map the dimerization domain of the full-length protein to the C-terminal domain.
Packing interactions in CTD crystals. Although the structure of the dimer remains invariant, the molecular packing of the dimers differs considerably among the three crystal forms we have studied. The presence of one dimer in the ASU in two crystal forms (CTD1 and CTD3) and of four dimers in the ASU in the other crystal form (CTD2) allowed the analysis of dimer-dimer interactions not only at different pH values and crystallization conditions, but also in the presence and absence of constraints imposed by crystal packing. Such an analysis of the inter-dimeric interactions is relevant given the primary role of the N protein in nucleocapsid formation. In two of the crystal forms, CTD1 (at pH 4.5) and CTD2 (at pH 8.5), the dimers form a linear array similar to that seen in the NTD.
This “fibre”-forming, end-on-end dimer-dimer interaction (designated Type I) takes place between dimers from neighboring unit cells in CTD1 and between three dimers related by non-crystallographic symmetry in CTD2 (Figures 3a and 3b). Despite the differences in crystal form and pH, the inter-dimeric interaction is well preserved between the two crystal forms, with an overall rmsd of 1.0 Å between the “fibres” and a similar buried surface area of ~1100 Å². This “fibre”-forming interaction is mediated by the extreme C-terminal residues between 308 and 328, which constitute a type II turn (TT in Figure 5) and the terminal α-helix 5 (Figure 5, lilac boxes). The two dimers are held together by a network of water-bridged polar interactions and a salt bridge between residues Arg 308 and Asp 314 (Figure 3b, right).
CTD-CTD interactions mediated by residues from the N-terminus of CTD: The mode of interaction of the fourth dimer in the asymmetric unit of the CTD2 structure (molecule 4, Figure 3a) with the other dimers constitutes a second type of dimer-dimer interaction, in which two dimers interact via the N-terminal residues of this domain (residues 221 to 230). This is a side-to-side dimer interaction in which the α-helix-lined grooves mesh with each other, with the N-terminal loop serving as the glue. Consequently, the N-termini of molecule 4 and of its interacting partner from the fibre are more ordered in the electron density map than the termini of the other dimers. This Type II interaction occurs predominantly through van der Waals contacts and buries a surface area of 1385 Å². Interestingly, this Type II interaction seen at pH 8.5 closely resembles the side-to-side crystal packing interaction between perpendicular fibres in the pH 4.5 structure. There, the long fibres formed by a 2₁ screw operation (there are three perpendicular fibres in this P2₁2₁2₁ space group) interact with one another via a similar interaction mediated by the N-terminal residues.
Molecule 4 is thus related to its counterpart in the pH 4.5 structure by ~90 degrees. This flexibility between the two Type II interactions identifies the Type II interactions as key to flexible and branched dimer-dimer interactions, in contrast to the rigid linear array formation mediated by Type I interactions.
Alternate modes of dimer-dimer interactions in CTD
A third crystal form, CTD3, in the P4₃ space group was also obtained for the IBV-Beaudette strain, with one dimer in the ASU. The CTD3 crystal form, also at pH 8.5, revealed two modes of dimer-dimer interaction mediated by salt bridges on opposite ends of each dimer with dimers from the neighboring unit cell, such that the combined buried surface area between three dimers is 1892 Å² (Figure 5a). The inter-dimeric interactions in CTD3, designated Type III and Type IV, differ slightly from the Type I and Type II interactions described above and represent slight variations in the mode of dimer-dimer packing (Figure 5b). The Type IV interaction, which involves residues around the N-terminal region, resembles the Type II interaction to some extent, while the Type III interaction is closely related to another fibre-fibre interaction seen in the pH 4.5 CTD structure. Together these side-to-side dimer-dimer interactions result in the formation of columnar arrays and thus reflect the greater helicity that forms when the constraints of a linear 2₁ fibre are removed. The side-to-side dimeric interactions seen in these various space groups thus hold the key to nucleocapsid formation mediated by multiple CTD-CTD interactions.
Discussion:
One of the primary functions of the N protein in coronaviruses is the formation of the nucleocapsid through its interactions with the genomic RNA. From the biochemical characterization of the IBV N protein presented here, and from similar characterization of the N protein from other coronaviruses, it is apparent that this protein has two major protease-resistant domains.
Our X-ray crystallographic structural characterization of these two domains provides some insights into how the two-domain organization of the N protein may coordinate nucleocapsid assembly.
NTD and CTD interact with the genome: Several biochemical studies have shown that determinants for RNA binding reside in the N-terminal region, with the minimal region mapped to residues 177 to 231 in MHV (corresponding to 136 to 190 in IBV). RNA binding by the IBV NTD and CTD has been shown by Hui Fan et al. using gel-shift assays. Based on their structure of the NTD, Hui Fan et al. proposed that the arms of the “U”-shaped monomer, which are quite basic in nature, are likely the regions of the N protein that bind RNA. This is also consistent with NMR-NOE analysis of NTD-RNA interactions in the SARS coronavirus N protein. A novel finding of our crystal structure analysis of the NTD is that NTD dimers can associate to form linear arrays with these basic tethers exposed along the surface of the resulting fibre. Such a fibre, with its exposed basic tethers, could provide for closely packed interactions of the NTD with the genomic RNA. The electrostatic potential surfaces of the CTD dimer and of the CTD dimer fibres are significantly basic along the face formed by the α-helices, opposite to the β-sheet floor. These basic residues in the CTD arrays thus provide a suitable surface for concerted binding of genomic RNA by both the NTD and the CTD.
CTD may provide a structural scaffold for helical nucleocapsid formation
Our crystal structure analysis clearly indicates a tight dimer mediated by domain-swapped interactions. The fibre-forming CTD inter-dimeric interactions, with their significant buried surface area, are preserved over a wide range of pH (between CTD1 at pH 4.5 and CTD2 at pH 8.5). This stability of the CTD dimer-dimer interactions makes the CTD well suited to serve as a structural scaffold around which the helical nucleocapsid is organized.
The electrostatic potential distribution of the CTD is also highly polarized, with the β-sheet floor being predominantly acidic and the α-helical roof being predominantly basic. The acidic β-sheet floor may serve as a suitable region for interaction with the predominantly basic M protein. This possible surface complementarity is in agreement with studies that mapped the interaction regions between the N and M proteins to their C-termini. The multiple dimer-dimer interaction modes in the CTD also allow for the formation of tightly helical arrays like that seen in the P4₃ crystal form. The Type II, III and IV interactions may be brought into play during spherical shell formation, after the linear fibre that provides the helicity for the NTD interaction with RNA is already in place. The bridging nature of these non-fibre dimer-dimer interactions may be important during compaction of the N protein-RNA complex into the nucleocapsid.
Plausible model for structural organization of the coronavirus genome. The NTD, with its demonstrated RNA-binding activity (Hui Fan et al.), and the clearly dimeric CTD are two highly adaptable modules on an otherwise largely flexible and possibly disordered protein (Wang, Wu et al. 2004). The two basic tethers of the NTD are possibly held alongside a fibre mediated by the C-terminal domain, with the tethers grabbing onto and sequestering RNA. This NTD-CTD-RNA superstructure possibly then packs to form a highly compacted ribonucleoprotein complex, through secondary interactions made possible by RNA interacting with the CTD, by the Type II CTD-CTD dimer-dimer interactions, and possibly also by the NTD-NTD dimeric interactions. Together, these observations suggest a model for genome organization in which the CTDs form a helical template with extending NTD arms that bind and organize the genomic RNA. In this model the RNA is recruited through the interaction of its consensus packaging signal with the M protein, which nucleates along the CTD fibre by interacting with it.
The CTD fibre thus serves as a structural template for the NTD-RNA complex to wind around, with intermittent interactions between M and the CTD. Once assembled, this complex is not prone to disruption by treatment with RNase A, as observed by Narayanan et al. (Narayanan, Kim et al. 2003).
Similarity to other coronaviral nucleocapsid proteins and evolutionary implications for viral architecture. A DALI search of the PDB revealed a very striking similarity to the 73 amino acid capsid-forming domain of PRRSV, a corona-like virus that is a member of the order Nidovirales. This match had a high similarity Z-score with a corresponding RMS deviation of 2.8 Å. PRRSV is also a positive-sense single-stranded RNA virus with a similarly large genome. PRRSV also forms a helical nucleocapsid, and its full-length N protein was shown to form fibres in solution (Doan and Dokland 2003). Similar helical nucleocapsids have been observed in the orthomyxovirus, paramyxovirus, filovirus, rhabdovirus, bunyavirus and arenavirus families, all of which contain genomic RNA associated with their respective nucleocapsid proteins (Narayanan, Kim et al. 2003). The capsid-forming domain of PRRSV also packed into helical arrays through crystal contacts in the crystal studied. The arrangements of the CTD, PRRSV and MS2 coat protein all show a similar feature: an anti-parallel β-strand floor with flanking helices and loops. The major difference between the CTD and PRRSV structures lies in the fact that the CTD floor is more concave while the PRRSV floor is perfectly flat. In addition, the CTD has a greater number of surrounding loops and helical regions, considering that it is almost 120 residues long compared to the 90-residue length of the PRRSV capsid-forming domain.
This fact, taken together with the interaction seen in the PRRSV crystal packing, which is similarly mediated by helix-helix van der Waals stacking and a similar salt bridge between Arg 65 and Asp 43 in PRRSV, suggests a common theme in helical fibre formation across the viruses of the order Nidovirales, to which PRRSV and IBV both belong. This strengthens the suggestion that this fold is commonly employed in viruses with helical nucleocapsids.
Similarity with SARS N protein. Despite the very low sequence homology between IBV-N and SARS-N (25%), the predicted secondary structure of the SARS-N CTD matches the observed secondary structure of IBV-N very closely (Figure 5, black topology diagram, top). The NMR structure of the N-terminal domain of SARS-N clearly shows that the N-terminal domain is largely composed of coil structure and interacts with RNA in solution (Huang, Yu et al. 2004). A similar solution structure by NMR of a part of the dimerization domain of the SARS coronavirus reveals a similarity to the PRRSV capsid protein, as reported here, but differs from the present study in the arrangement of the C-terminal helix, which is the key mediator of the dimer-dimer interactions that we hypothesize are the determinants of helical nucleocapsid formation (Chang, Sue et al. 2005). The helix corresponding to helix α5 in the SARS structure packs against the helix from the same dimer, forming an asymmetric homodimer (Chang, Sue et al. 2005). The IBV and PRRSV dimers are highly symmetric homodimers in which the same helix mediates the dimer-dimer interactions that are the main determinant of strand formation. It is quite likely that the absence of the residues N-terminal to the β-stranded floor allowed the C-terminal helix in SARS to move dramatically and interact with itself within the dimer.
This interaction might not be possible in the context of the whole protein and the nucleocapsid, as seen in the structures of the IBV N protein and the PRRSV N protein.
Concluding paragraph: Here we have shown that the N protein of coronavirus is organized into two modular domains, with the NTD serving as the predominant RNA-binding module and the CTD providing a possible structural scaffold. The overall structural similarity between the coronavirus CTD and the PRRSV capsid-forming domain indicates that the general principle seen here, and in other viruses such as in the HIV nucleocapsid protein, might have evolved to allow a single protein to function efficiently in forming a structural scaffold and binding the genome while also interacting with multiple components of the virus capsid. The flexible linking of these multiple domains by predominantly disordered loops facilitates the multi-modal interactions that together form the compact ribonucleoprotein complex that is the nucleocapsid in these viruses.
Baric, R. S., G. W. Nelson, et al. (1988). "Interactions between coronavirus nucleocapsid protein and viral RNAs: implications for viral transcription." J Virol 62(11): 4280-7.
Bricogne, E. d. L. F. G. (1997). Maximum-Likelihood Heavy-Atom Parameter Refinement for the Multiple Isomorphous Replacement and Multiwavelength Anomalous Diffraction Methods. Methods in Enzymology, New York: Academic Press. 276: 472-494.
Brunger, A. T., P. D. Adams, et al. (1998). "Crystallography & NMR system: A new software suite for macromolecular structure determination." Acta Crystallogr D Biol Crystallogr 54(Pt 5): 905-21.
Cavanagh, D. (2003). "Severe acute respiratory syndrome vaccine development: experiences of vaccination against avian infectious bronchitis coronavirus." Avian Pathol 32(6): 567-82.
Chang, C. K., S. C. Sue, et al. (2005). "The dimer interface of the SARS coronavirus nucleocapsid protein adapts a porcine respiratory and reproductive syndrome virus-like structure." FEBS Lett 579(25): 5663-8.
Cologna, R. and B. G. Hogue (1998). "Coronavirus nucleocapsid protein. RNA interactions." Adv Exp Med Biol 440: 355-9.
DeLano, W. L. (2002). The PyMOL Molecular Graphics System (2002) on World Wide Web http://www.pymol.org.
Doan, D. N. and T. Dokland (2003). "Structure of the nucleocapsid protein of porcine reproductive and respiratory syndrome virus." Structure (Camb) 11(11): 1445-51.
Emsley, P. and K. Cowtan (2004). "Coot: model-building tools for molecular graphics." Acta Crystallogr D Biol Crystallogr 60(Pt 12 Pt 1): 2126-32.
Gouet, P., E. Courcelle, et al. (1999). "ESPript: analysis of multiple sequence alignments in PostScript." Bioinformatics 15(4): 305-8.
He, R., A. Leeson, et al. (2003). "Activation of AP-1 signal transduction pathway by SARS coronavirus nucleocapsid protein." Biochem Biophys Res Commun 311(4): 870-6.
Huang, Q., L. Yu, et al. (2004). "Structure of the N-terminal RNA-binding domain of the SARS CoV nucleocapsid protein." Biochemistry 43(20): 6059-63.
Hui Fan, A. O., Yong Wah Tan, Sifang Wang, Shouguo Fang, Ding Xiang Liu & Julien Lescar (2005). "The Nucleocapsid Protein of Coronavirus Infectious Bronchitis Virus: Crystal structure of its N-terminal domain and multimerization properties." Structure (Camb).
Kuo, L. and P. S. Masters (2002). "Genetic evidence for a structural interaction between the carboxy termini of the membrane and nucleocapsid proteins of mouse hepatitis virus." J Virol 76(10): 4987-99.
Lai, M. M. and D. Cavanagh (1997). "The molecular biology of coronaviruses." Adv Virus Res 48: 1-100.
Lamzin, V. S., Perrakis, A. & Wilson, K. S. (2001). The ARP/WARP suite for automated construction and refinement of protein models. International Tables for Crystallography. Vol. F: Crystallography of biological macromolecules. M. G. Rossmann and E. Arnold, eds. Dordrecht, The Netherlands, Kluwer Academic Publishers: 720-722.
Leung, D. T., F. C. Tam, et al. (2004). "Antibody response of patients with severe acute respiratory syndrome (SARS) targets the viral nucleocapsid." J Infect Dis 190(2): 379-86.
Luo, C., H. Luo, et al. (2004). "Nucleocapsid protein of SARS coronavirus tightly binds to human cyclophilin A." Biochem Biophys Res Commun 321(3): 557-65.
Narayanan, K., K. H. Kim, et al. (2003). "Characterization of N protein self-association in coronavirus ribonucleoprotein complexes." Virus Res 98(2): 131-40.
Nelson, G. W., S. A. Stohlman, et al. (2000). "High affinity interaction between nucleocapsid protein and leader/intergenic sequence of mouse hepatitis virus RNA." J Gen Virol 81(Pt 1): 181-8.
Pannu, N. S., G. N. Murshudov, et al. (1998). "Incorporation of prior phase information strengthens maximum-likelihood structure refinement." Acta Crystallogr D Biol Crystallogr 54(Pt 6 Pt 2): 1285-94.
Risco, C., I. M. Anton, et al. (1996). "The transmissible gastroenteritis coronavirus contains a spherical core shell consisting of M and N proteins." J Virol 70(7): 4773-7.
Risco, C., M. Muntion, et al. (1998). "Two types of virus-related particles are found during transmissible gastroenteritis virus morphogenesis." J Virol 72(5): 4022-31.
Roman A Laskowski, M. W. M., David S Moss and Janet M Thornton (1993). "PROCHECK: a program to check the stereochemical quality of protein structures." Journal of Applied Crystallography 26: 283-291.
Schelle, B., N. Karl, et al. (2005). "Selective replication of coronavirus genomes that express nucleocapsid protein." J Virol 79(11): 6620-30.
Siddell, S. G. (1995). The Coronaviridae: an introduction, Plenum Press, New York, N.Y.
Stohlman, S. A. and M. M. Lai (1979). "Phosphoproteins of murine hepatitis viruses." J Virol 32(2): 672-5.
Storoni, L. C., A. J. McCoy, et al. (2004). "Likelihood-enhanced fast rotation functions." Acta Crystallogr D Biol Crystallogr 60(Pt 3): 432-8.
Surjit, M., B. Liu, et al. (2004).
"The SARS coronavirus nucleocapsid protein induces actin reorganization and apoptosis in COS-1 cells in the absence of growth factors." Biochem J 383(Pt 1): 13-8. Tahara, S. M., T. A. Dietlin, et al. (1998). "Mouse hepatitis virus nucleocapsid protein as a translational effector of viral mRNAs." Adv Exp Med Biol 440: 313-8. Terwilliger, T. C. and J. Berendzen (1999). "Automated MAD and MIR structure solution." Acta Crystallogr D Biol Crystallogr 55 (Pt 4): 849-61. Wang, Y., X. Wu, et al. (2004). "Low stability of nucleocapsid protein in SARS virus." Biochemistry 43(34): 11103-8. Weeks, C. M. and R. Miller (1999). "Optimizing Shake-and-Bake for proteins." Acta Crystallogr D Biol Crystallogr 55 (Pt 2): 492-500. Yu, I. M., C. L. Gustafson, et al. (2005). "Recombinant severe acute respiratory syndrome (SARS) coronavirus nucleocapsid protein forms a dimer through its Cterminal domain." J Biol Chem 280(24): 23280-6. Zhao, P., J. Cao, et al. (2005). "Immune responses against SARS-coronavirus nucleocapsid protein induced by DNA vaccine." Virology 331(1): 128-35. Zhou, M., A. K. Williams, et al. (1996). "The infectious bronchitis virus nucleocapsid protein binds RNA sequences in the 3' terminus of the genome." Virology 217(1): 191-9. 2 2