structureseed12

BIOLOGICAL SCIENCES
X-ray structures of the N- and C-terminal domains of a coronavirus nucleocapsid
protein: structural basis of helical nucleocapsid formation
Hariharan Jayaram, Hui Fan&, Brian R. Bowman, Amy Ooi&, Jyothi Jayaram, Ellen W.
Collisson, Lescar Julian, B.V. Venkataram Prasad
Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor
College of Medicine, Houston, Texas 77030, U.S.A.; Department of Veterinary
Pathobiology, Texas A&M University, College Station, Texas 77843, U.S.A.; School of
Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, Singapore
637551
Get buried surface area for NTD dimer-done
Make packing diagram for NTD-did not do -required
Figure out the details of lescar c2 –did send email
Abstract (250 words allowed ..page 2-Current 202):
Coronaviridae cause a variety of respiratory and enteric diseases in animals and humans,
including SARS, a disease with emerging global impact. The enveloped capsid of the
virus encloses the single-stranded genome associated with the nucleocapsid protein (N
protein). Using limited proteolysis, we identified two stable domains of the
nucleocapsid protein from infectious bronchitis virus (IBV). We present here the crystal
structures of the N- and C-terminal domains (NTD and CTD) of the IBV N protein. The NTD,
with basic residues concentrated on two long tethers, constitutes
an RNA-interacting module. The CTD exists as an intimate domain-swapped dimer that
tends to organize into helical arrays. From crystal packing interactions
observed at different pHs for the NTD and CTD, we hypothesize that the CTD is the
key determinant of helical nucleocapsid formation in the virus. Similarity between the
CTD and the capsid-forming domain of a related virus family suggests that this fold
constitutes a new class of viral capsid folds employed in viruses with helical
nucleocapsids. The coronavirus nucleocapsid is thus made up of an N-terminal RNA-binding
core connected to a C-terminal capsid-forming domain, which together organize
the helical nucleocapsid in the virus.
Introduction
Global importance:
Coronaviridae, a family within the order Nidovirales, comprises viruses with ssRNA
genomes that are significant causative agents of human upper respiratory infections
such as the common cold, and of more severe illnesses such as SARS (severe acute respiratory
syndrome). The brief SARS outbreak has become an important model for
evaluating scientific and social preparedness for, and responses to, future global
pandemics (Hufnagel, Brockmann et al. 2004). Following the SARS outbreak there has
been an explosion of research into coronavirus pathogenesis and biology.
Structural studies have yielded structures of four of the SARS proteins, including some
bound to potential drug leads. We present here the structure of the
nucleocapsid protein from infectious bronchitis virus (IBV), a group III coronavirus.
The structure provides insight into the molecular architecture of the genome-containing core
of the virus and into the evolutionary relatedness between
coronaviruses and other closely related positive-sense RNA viruses.
Coronavirus Background:
The coronaviruses are a family of enveloped positive-strand RNA viruses. Their capsids
range in diameter from 80 to 160 nm and enclose a single 30 kb long segment of positive-sense
ssRNA (Siddell 1995). Upon infection and cell entry, the genomic RNA encodes a
3’ co-terminal set of four or more subgenomic mRNAs with a common leader sequence
of 60-100 nucleotides attached post-transcriptionally at their 5’ ends. These subgenomic
RNAs, with their consensus 5’ and 3’ termini, encode the various viral structural and
non-structural proteins required to replicate the virus and produce progeny virions.
Coronavirus General capsid architecture:
The enveloped capsid of the virus is predominantly made up of the membrane
glycoprotein (M), a small transmembrane protein (E), and an array of spikes
composed of the spike glycoprotein (S), which gives the spherical particles a
corona. A significant protein component of the capsid is the nucleocapsid protein (N),
which interacts with the genomic ssRNA to form the central core of the virion.
Capsid architecture and N protein:
Electron microscopy of detergent-permeabilized transmissible gastroenteritis
virus (TGEV, a prototype coronavirus) capsids revealed that the internal nucleocapsid is
helical and is composed of the ssRNA genome tightly associated with the nucleocapsid (N)
protein (Risco, Anton et al. 1996; Risco, Muntion et al. 1998).
The N protein is typically a multifunctional basic phosphoprotein with a molecular weight
of 50 to 60 kDa, and its coding RNA is the major subgenomic RNA produced during an
infection. Consequently, N protein is synthesized in large amounts and performs multiple
functions during the virus life cycle (Stohlman and Lai 1979; Lai and Cavanagh 1997). N
proteins from coronaviruses are divided into four groups (Group I through IV) and
share sequence homology ranging from about 70% with other N proteins within the same
group to about 25% with N proteins from other groups.
Because it is a basic protein, one of the major functions of the N protein is binding
ssRNA. The N protein has been shown to have a general RNA-binding ability with an
increased affinity for the corresponding viral RNA (Cologna and Hogue 1998). During the
virus life cycle, the N protein interacts extensively with the genomic as well as the
subgenomic RNAs that are synthesized. Both of these RNA species have multiple copies
of the N protein tightly associated with them (Baric, Nelson et al. 1988; Narayanan, Kim et
al. 2003). This specific binding is mediated by consensus sequences common to all viral
RNAs at their 5’ and 3’ ends. Specifically, it was shown that the IBV and MHV N proteins
have an affinity for sequences at the 5’ and 3’ ends of viral RNA (Nelson, Stohlman et al.
2000; Zhou and Collisson 2000). By virtue of these interactions with the consensus
RNA sequences, the N protein plays a role in controlling mRNA transcription,
translation, and replication (Lai and Cavanagh 1997; Tahara, Dietlin et al. 1998; Schelle,
Karl et al. 2005).
The genomic RNA replicated during the virus life cycle is selectively packaged
through recognition by the M protein of a packaging signal, which has been identified for
mouse hepatitis virus (MHV) and bovine coronavirus (Fosmire, Hwang et al. 1992; Cologna and
Hogue 2000). Although genome incorporation is driven by M protein
recognition of the packaging signal, the function of the N protein may be more to
serve as a structural template guiding genome packaging (Narayanan, Maeda et al. 2000;
Narayanan, Kim et al. 2003; He, Leeson et al. 2004). Accordingly, several M protein
mutations that are defective in virus formation could be rescued by compensatory mutations
in the N protein (Kuo and Masters 2002). The M and N proteins thus interact closely via their
C termini, an interaction that is very important for proper genome encapsidation and
nucleocapsid formation.
Host Interactions and Nucleocapsid protein:
The abundance of N protein produced during an infection results in N playing an important role
in host modulation during coronavirus infection. The N protein is primarily cytoplasmic
but has been reported to enter the nucleus in some coronaviruses (Wurm, Chen et al.
2001). The N protein from SARS has been shown to interact with cyclophilin (an
immunomodulator), activate the AP1 pathway (an important pathway involved in cell cycle control),
as well as induce apoptosis in certain cell types (He, Leeson et al. 2003; Luo, Luo et al.
2004; Surjit, Liu et al. 2004). The N protein is also a major immunogen and an important
diagnostic marker for coronavirus disease (Leung, Tam et al. 2004) and can help improve
the efficacy of coronavirus vaccines (Cavanagh 2003; Zhao, Cao et al. 2005).
Biochemistry background:
Considerable biochemical information has become available on the in vitro behavior of N
protein, especially with regard to its oligomerization and its interaction with RNA.
The full-length N protein is prone to disorder and aggregation in solution, and its
instability has been suggested to be important for its role in virus capsid formation (Wang, Wu et
al. 2004). Early analysis of multiple N protein sequences suggested that the
MHV N protein is made up of three domains: two basic domains followed by an acidic
domain at the extreme C-terminus (Parker and Masters 1990). The RNA-binding domain
of nucleocapsid protein has generally been localized to the first 300 residues of the
N protein for various homologs. The minimal core region required to bind RNA by itself
was mapped to a 55 aa stretch in the N-terminus of MHV N protein (sequence 177 to
321 in MHV corresponds to 144 to 198 in IBV-N) (Ziebuhr 2004). (He, Dobie et al.
2004; Surjit, Liu et al. 2004; Yu, Gustafson et al. 2005). An NMR structure of the
N-terminal domain of SARS-N shows that the N-terminal domain is largely
composed of coiled structure and interacts with RNA in solution (Huang, Yu et al. 2004).
The dimerization domain of N protein has been localized by several approaches in
nucleocapsid proteins from multiple groups to the C-terminal 200 residues. These studies
identified N-protein dimers both in the context of the isolated domain, using appropriate
recombinant constructs, and in the full-length protein, using cross-linking and yeast
two-hybrid approaches.
These in vitro and in vivo studies therefore identified two broad functional domains for
the nucleocapsid protein: an N-terminal RNA-binding domain and a C-terminal
dimerization domain. We chose to characterize the structure of the nucleocapsid protein from
infectious bronchitis virus, a prototype group III coronavirus that causes
respiratory illness in chickens.
Results and Discussion: The full-length N protein from infectious bronchitis virus has
been purified and characterized previously (Zhou and Collisson 2000). The N protein
interacts strongly with the conserved 5’ and 3’ sequences of IBV RNA and also undergoes
phosphorylation in infected cells to generate multiple isoforms. Our structural
characterization of full-length N protein was impeded by its aggregation and degradation
on storage under a variety of conditions (lane zero, Figure 0b). Purified full-length N
protein was also extremely polydisperse in solution, as judged by dynamic light
scattering, and was not amenable to detailed structural characterization by protein
X-ray crystallography.
We therefore employed a divide-and-conquer approach to study the protein structurally. Using
limited proteolysis, we sought to identify regions of the protein that represent stable
domains resistant to proteolysis under limiting amounts of the proteases trypsin
(which cleaves after the basic residues Arg and Lys) and V8 protease (which cleaves after
the acidic residues Glu and Asp). The digestion pattern with V8 protease was not very distinct and
yielded several diffuse bands (data not shown). Trypsin proteolysed the full-length
protein to a single ~17 kDa band on a 17% denaturing SDS-PAGE gel within 15 minutes
of trypsinization (Figure 0b). The “single” band thus observed was resistant to further
degradation even upon trypsinization for several hours and represented a stable region of
the protein. Using N-terminal sequencing of the cleavage fragments, we identified four
tryptic fragments: two major cleavage sites corresponding to cleavage at residues 19
and 219, and two secondary cleavage sites at residues 27 and 226; these fragments
co-migrated. The optimized
domain constructs, termed NTD (N-terminal domain) and CTD (C-terminal domain), were
then cloned, expressed and purified to homogeneity. The N-terminal domain thus
identified was monomeric at moderate concentrations, while the C-terminal domain protein
was a dimer even at very low concentrations, as assayed by gel-filtration
chromatography (Figure 0c). The C-terminal protein tended to aggregate during
purification and was therefore purified at very low concentrations and concentrated only prior
to crystallization screening. The NTD and CTD proteins also failed to interact at a variety
of salt and protein concentrations, as assayed by gel-filtration co-fractionation and
pull-down experiments (Figure 0c and data not shown). NTD and CTD therefore represent
independent domains of the full-length protein and were suitable for separate structure
determination.
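The cleavage specificities used above can be illustrated with a minimal sketch. It assumes the simplified rules stated in the text (trypsin after Arg/Lys, V8 protease after Glu/Asp). The function names and the toy peptide are hypothetical; the peptide is not the IBV N protein sequence. The sketch only enumerates candidate sites: which of them are actually cut in a limited proteolysis experiment depends on their accessibility in the folded protein, which is not modeled here.

```python
def cleavage_sites(sequence, residues):
    """Return 1-based positions after which a protease could cleave.

    `residues` lists the amino acids the protease cuts after,
    e.g. "KR" for trypsin or "DE" for V8 protease (simplified rules).
    """
    return [i + 1 for i, aa in enumerate(sequence) if aa in residues]


def digest(sequence, sites):
    """Split `sequence` after each 1-based position in `sites`."""
    pieces, start = [], 0
    for site in sorted(sites):
        pieces.append(sequence[start:site])
        start = site
    if start < len(sequence):
        pieces.append(sequence[start:])
    return pieces


# Hypothetical toy peptide, used only for illustration.
toy = "MASGKAADRPLLEKT"
trypsin_sites = cleavage_sites(toy, "KR")
v8_sites = cleavage_sites(toy, "DE")
print(trypsin_sites, digest(toy, trypsin_sites))
print(v8_sites, digest(toy, v8_sites))
```

In a complete digest every candidate site would be cut; the domain boundaries reported above correspond to the few sites that remain accessible in the folded protein.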
Crystals of both the N-terminal and C-terminal domains were obtained in a variety of
conditions. Although diffraction data were obtained for both domains, we were
successful in phasing only the CTD data using MAD techniques. The wild-type NTD
from IBV contains no cysteines or methionines, and we therefore engineered several
mutations at leucine and phenylalanine residues to allow phasing by MAD
methods and multiple isomorphous replacement. These mutant proteins failed to yield a
structure owing to pseudo-body centering in the crystals coupled with poor diffraction
quality. The crystal structure of a similar construct of IBV-N was then solved by some of
the co-authors and was used as a model to phase the high-resolution 1.3 Å data
we obtained for wild-type N protein. Interestingly, the N-terminal construct used in
that study was obtained by accidental proteolysis of full-length IBV N protein during
the crystallization experiment and yielded the structure in a dimeric conformation. Our
structure in space group C2 reveals a novel dimeric packing interaction and provides
insight into the adaptability of the N-terminal domain to protein-protein and protein-RNA
interactions, which would be important during virion nucleocapsid assembly.
We present here the novel crystal structure of the C-terminal domain of IBV N protein
and the NTD structure phased by molecular replacement.
High resolution structure of the N-terminal domain: The structure of the N-terminal
domain solved at 1.3 Å is almost identical to the structure of the N protein reported
before, with the exception of five additional residues discernible at the N-terminus.
Briefly, the structure is composed of a relatively acidic globular core made up of a twisted
antiparallel β-sheet surrounded by a number of loop regions. Prominent among the
loop regions are two long loops, corresponding to the N-terminal 12 amino acids of this
domain (residues 22 to 34) and a loop region from residues 74 to 86, that extend
outward like long tethers from this globular base.
The structure obtained by Hui et al. at pH 6.5 in the absence of Mg2+ (in
sodium phosphate, using PEG 4000 as a precipitant) showed side-to-side dimers in the
asymmetric unit with a rather limited buried surface area between them (Figure -1 panel
b). The N-terminal dimer seen in this study was crystallized at a similar precipitant
concentration but at a lower pH of 5.5, and magnesium was absolutely required to obtain
crystals. It consists of an interlocked pair of monomers forming an asymmetric
dimer with a buried surface area of ~1580 Å2.
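Buried surface areas such as the ~1580 Å2 quoted above are conventionally derived from the solvent-accessible surface areas (SASA) of the isolated monomers A and B and of the complex AB. The relation below is the usual definition and is given here as an assumption, since the program and convention used for these values are not stated; some authors report the total shown, others half of it (the area buried per monomer).

```latex
\mathrm{BSA} = \mathrm{SASA}_{A} + \mathrm{SASA}_{B} - \mathrm{SASA}_{AB}
```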
This interlocking of the two “U”-shaped monomers is prominently
aided by electrostatic interactions, such that the acidic base of one monomer interacts with
the basic floor at the base of the other “U”-shaped monomer (Figure -1 panel c). Most of the
contacts between the two monomers in this study consist of water-mediated salt bridges.
The two tethers, which have a predominance of basic residues, are firmly wrapped
around the partner monomer for one of the monomers in the ASU
but are largely disordered, and consequently probably floppy, in the other monomer (Figure -1
panel a). Thus the loop between residues 74 and 86 is well ordered in the A molecule
but almost entirely disordered in the B molecule, indicating its inherently floppy nature
in the absence of an interacting partner under our crystallization conditions. Interestingly,
the structures solved by Hui et al. in two conditions and two space groups (P1 and C2) had
the same loop ordered despite the absence of any ligand or interacting partner. The lower
pH and Mg2+-containing crystallization conditions may therefore help stabilize these loops in our
structure. Despite the requirement for Mg2+, no bound magnesium ion could be discerned in
our structure. These basic loops, with their dynamic and pH-dependent protein-protein
interactions, may play an important role in nucleocapsid assembly and disassembly.
As hypothesized before, these basic tethers, with a very high percentage of conserved
basic residues, possibly represent the RNA-interaction regions of the N-terminal
domain and may alternate between interacting with other monomers, as seen in our
dimer, and interacting with RNA.
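One rough way to quantify "a predominance of basic residues" in a segment such as the tethers (residues 22 to 34 and 74 to 86) is to count Lys and Arg within a residue range. The sketch below assumes 1-based inclusive numbering that starts at the first residue of the supplied sequence. The function name and the toy sequence are hypothetical; the IBV sequence is not reproduced here.

```python
def basic_fraction(sequence, start, end, basic="KR"):
    """Fraction of residues in the 1-based inclusive range [start, end]
    that are basic (Lys or Arg by default)."""
    if start < 1 or end > len(sequence) or start > end:
        raise ValueError("residue range lies outside the sequence")
    segment = sequence[start - 1:end]
    return sum(aa in basic for aa in segment) / len(segment)


# Hypothetical toy sequence, used only for illustration.
toy = "AKRGA"
print(basic_fraction(toy, 2, 4))  # residues K, R, G
```

Whether histidine should be counted as basic depends on the pH of interest; it is left out of the default set here.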
Interestingly, the extreme N-terminus (residues 22-29) in this structure is the only region
that cannot be well superimposed on the conformation seen in the structure by Hui et al.
The packing interaction seen in our structure in space group C2 is also mediated by the
same N-terminal extension, which interacts, in the same manner as seen within the ASU, with a
groove in the neighboring dimer in the next ASU (Figure -1 panel d). This flexible
N-terminal loop therefore possibly represents a highly basic loop structure adapted to mediating
either protein-protein or protein-RNA interactions required for its role in nucleocapsid
formation. The 63 residues that link the NTD to the CTD possibly help
to ensure that the NTD functions independently of the CTD.
Structure of the CTD:
For the CTD, of the three different space groups in which we obtained
diffraction data, we solved the structure in two different
conditions (Table 1). One of these crystals grows at an extremely low pH of 4.5, where the
crystals have a distinct rod-like appearance in rare cases but form large needles or flat
sheets in most cases. The other condition (Table 1, Crystal II) yielded crystals that
grew as flat sheets after several weeks. We obtained two-wavelength
anomalous data with selenomethionine-substituted protein for Crystal I and native data
for Crystal II. Crystal I and Crystal II represent two different pHs and two different
ionic strengths and had widely differing unit cell sizes (Table 1). The crystal
morphology of both crystals, i.e. rods or needles at acidic pH and flat sheets at basic
pH, indicated a tendency of the protein to pack very well in two dimensions. In addition,
a third, three-dimensional hexagonal-bipyramidal crystal form was obtained under conditions
similar to those for Crystal I but at slightly elevated pH (pH 5.2) and in the absence of citrate or
acetate. This crystal, despite its seemingly three-dimensional appearance,
diffracted in an extremely anisotropic manner, with almost no diffraction perpendicular to
the principal long axis of the pyramid. This behavior, also characteristic of organization
along only two dimensions, prevented us from obtaining any meaningful data for this crystal
form.
We report the pH 4.5 structure of CTD with a dimer in the asymmetric unit and a pH 8.5
structure with four dimers (eight molecules) in the asymmetric unit. The observation of dimers
as the building block of both crystals at these widely different pHs, coupled with the
dimer observed on gel filtration under extremely dilute conditions, indicates that the dimer
is the physiologically relevant form of this domain.
Structure of the CTD dimer: The CTD exists in both crystal forms as an intimate
domain-swapped dimer (Figure 2). The domain swapping is brought about by interaction
of β-strands of one monomer with surrounding helices and loops from the other
monomer to form a reciprocal, closed domain-swapped dimer akin to those seen in
crystal structures of cystatin A and RNase A (Janowski, Kozak et al. 2001; Newcomer
2001). Accordingly, a 12-residue-long β-strand, β2 (residues 295 to 307), constitutes the interface
between the two monomers (Figure 2 bottom). The overall topology of the dimer of
IBV-N can be described as a concave β-stranded floor of ~400 Å2 area with the strand order
β1B-β2B-β2A-β1A, surrounded by helices and loops. Helices 3 and 4, connected by a
loop region, arch over this floor and constitute the roof of the dimer. A 12-residue-long
α-helix, α5, located at the extreme C-terminus of the CTD, forms an angled wall that flanks
either side of the dimer and is held in place by a tight turn made up of residues 307 to
310 (purple boxed residues in Figure 1 and Figure 2).
The dimerization interactions are very tight and bury a surface area of 5780 Å2. Neither
the serine-rich domain (161 to 191, Figure 0) nor disulfide bonding is important in
protein oligomerization as was expected based on previous biochemical data. The two
cysteine residues C228 and C281 lie in close proximity in the interior of the dimer but
are not disulfide bonded to each other in this structure. Protein preparation and
crystallization were performed in the absence of reducing agent, so the non-disulfide-mediated
interaction seen here is probably identical to that in the virus nucleocapsid. The
integrity of the dimer observed in solution is readily understood when one considers the ~5000 Å2
buried surface area involved in the dimerization.
The dimeric structure observed at pH 4.5 is almost identical to all four dimers observed
at pH 8.5, with an rmsd for Cα atoms in the core region (233 to 328) of ~0.3 Å.
The N and C termini in the five dimers differ from dimer to dimer depending on
their stabilizing interactions with dimers in neighboring asymmetric units in the crystal (pH 4.5
case, Figure 4 panel b) or within the asymmetric unit (pH 8.5 case, Figure 4 panel a).
The presence of one dimer in the ASU in one crystal form and four dimers in the ASU in the
other crystal form allowed the analysis of dimer-dimer interactions not only at different
pHs but also in the presence and absence of constraints imposed by crystal packing.
Crystal packing interactions in CTD: insights into stability of helical packing
interactions: The two crystal structures presented here contain five kinds of inter-dimer
interactions. Crystal packing in Crystal I is brought about by dimer-dimer interactions
in which the nth dimer interacts with the (n-1)th and (n+1)th dimers from neighboring ASUs
(Figure 4b), burying a surface area of 1182 Å2. In Crystal II, with four dimers in the ASU,
inter-dimer interactions are responsible for keeping the four dimers in the ASU together
as well as for mediating crystal packing (Figure 4a). This gives rise to four
classes of dimer-dimer interactions. Two of them (termed class I dimer-dimer
interactions), i.e. AB-CD and CD-EF, and the crystal packing interaction in which the GH dimer
from one ASU interacts with the AB dimer from the neighboring ASU (i.e. GH:ABn+1),
belong to the same class as that seen in the pH 4.5 crystal form and bury a similar
surface area of 1122 Å2.
This high-pH crystal form also displays a new class of “dimer-dimer” interaction, which
involves the interaction of the GH dimer with an interface formed by the CD-EF
dimers (Figure 4a). This tri-dimeric interaction buries a surface area of 1385 Å2.
The uniformity of all but the last kind of dimer-dimer interaction observed in the two
crystals is apparent from a superposition of all four types of dimer-dimer interactions
observed in the two crystals, in which the dimers superpose with a minimum rmsd of
0.3 Å and a maximum rmsd of 0.8 Å (yellow dimer in Figure 4c and Figure 4d).
When the three dimers (Dn+1-D-Dn-1) from three neighboring ASUs of Crystal I are
superposed on the three dimers from within the ASU of Crystal II, the rmsd between
them is ~1.0 Å (blue tri-dimer vs. yellow tri-dimer, Figure 4c).
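The rmsd values quoted above compare equivalent Cα atoms after superposition. A minimal sketch of the final step is given here, assuming the two coordinate sets have already been optimally superposed and paired atom by atom. The superposition itself (for example by a least-squares fit) is not shown, and the coordinates are made up for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of
    (x, y, z) tuples that are already superposed and paired by atom."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical coordinates (in Å): every atom is displaced by 0.3 Å along z.
set_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
set_b = [(0.0, 0.0, 0.3), (3.8, 0.0, 0.3)]
print(round(rmsd(set_a, set_b), 3))
```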
These interactions primarily involve residues between 308 and 328, which constitute a
type II turn (TT in Figure 1), α5, and the terminal loop of the CTD (Figure 1 lilac boxes).
Apart from the class I dimer interactions, the crystal packing interactions in Crystal II
(dimer GH interacting with dimer ABn+1) bury a surface area of only 600 Å2 and are
brought about by a swiveling away of the GH dimer, possibly prompted by its strong
interaction with the CD-EF dimers within the ASU. This indicates that the
dimers tend to swivel only slightly with respect to each other and constitute a subtle module that is
well suited to interacting with itself (arrow, Figure 4b).
Although there is no significant surface complementarity between the two molecules, the
predominant interaction between dimers is a salt bridge between Arg-308 from one dimer
and Asp-314 from a neighboring dimer (Figure 4d). The salt bridge and the orientation
of the dimers remain almost identical between the structures at pH 4.5 and pH 8.5. The
inter-dimer interactions other than the salt bridge are strictly van der Waals interactions.
The multimerization interaction, in addition to the dimerization interactions seen in CTD, is
very well maintained over this wide range of pH. The ionic strength of the two crystallization
conditions is also different, providing further evidence of the stability of the
dimer-dimer packing interactions.
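A salt bridge such as the Arg-Asp contact described above is commonly identified by a distance criterion between side-chain nitrogen and oxygen atoms. The sketch below is one minimal way to express such a check. The 4.0 Å cutoff is a frequently used convention and is an assumption here, since the criterion applied in this study is not stated. The atom names follow the standard PDB nomenclature for Arg and Asp side chains, and the function names and coordinates are hypothetical.

```python
import math

ARG_NITROGENS = ("NE", "NH1", "NH2")  # guanidinium nitrogens of Arg
ASP_OXYGENS = ("OD1", "OD2")          # carboxylate oxygens of Asp


def shortest_n_o_distance(arg_atoms, asp_atoms):
    """Shortest Arg N to Asp O distance, or None if no atom pair is present.

    Each argument maps an atom name to an (x, y, z) tuple in Å.
    """
    distances = [
        math.dist(arg_atoms[n], asp_atoms[o])
        for n in ARG_NITROGENS if n in arg_atoms
        for o in ASP_OXYGENS if o in asp_atoms
    ]
    return min(distances) if distances else None


def is_salt_bridge(arg_atoms, asp_atoms, cutoff=4.0):
    """True if any Arg N and Asp O lie within `cutoff` Å (assumed criterion)."""
    d = shortest_n_o_distance(arg_atoms, asp_atoms)
    return d is not None and d <= cutoff


# Made-up coordinates for illustration only.
arg = {"NH1": (0.0, 0.0, 0.0), "NH2": (2.0, 0.0, 0.0)}
asp_near = {"OD1": (0.0, 0.0, 2.8), "OD2": (0.0, 0.0, 5.0)}
asp_far = {"OD1": (0.0, 0.0, 10.0)}
print(is_salt_bridge(arg, asp_near), is_salt_bridge(arg, asp_far))
```

Applying the same check to both crystal forms would be one way to confirm that the contact persists at pH 4.5 and pH 8.5.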
The additional dimer (GH) is clearly auxiliary (and not part of the primary fibre, see
below) and reveals a higher-order mode of interaction between CTD dimers. The interacting
surface comprises residues from all over the dimers (underlined residues in Figure 1).
Since this interaction involves three different molecules and yet the buried surface area is
similar (~1200 Å2) to that of the primary crystal-packing (or fiber-forming) interaction, we
hypothesize that it is less likely than, and therefore secondary to, the primary interaction seen
for the other dimers. Because this dimer also mediates crystal packing in this space group through
the same region on its other face, the tight salt bridge observed between R308 and D315
is preserved in only one of the cases and disrupted in the two-fold related case. Despite
this skewing, the overall rmsd is only 0.8 Å, indicating the extreme adaptability of the
dimer, with α5 and the preceding loop mediating these interactions.
This additional interaction also raises the possibility that the fibre hexamer made up of
ABCDEF with the GH appendage could circularize or form planar triangles under certain
conditions, with the GH dimer serving as a bridge that brings the otherwise rigid ABCDEF
fibres together. Such bridging interactions may indeed be necessary for spherical particle
formation driven by triangularization of three hexamers, with the fourth dimer serving as
the linker.
In addition, the greater flexibility of various regions of the protein at alkaline pH (Figure
6a), coupled with the swiveling seen in the GH-AB interaction, could represent a snapshot
of the disassembly of dimer-dimer interactions, which may be important
for nucleocapsid disassembly and genome release (NEED TO ELABORATE).
Electrostatic surface, conservation of surface residues and interaction with other
capsid components: Analysis of the GRASP surface of the octamer further reveals
that the surface is primarily acidic, with a swath of basic residues running in an
expectedly helical fashion through the fibre (Figure 6b). Although the primary
interactions with RNA are conferred by the N terminus, secondary interactions may be
facilitated by this basic stretch, which is clearly solvent exposed.
Fibre formation: The clear tendency of the dimer-dimer interaction to promote fibre
formation is evident from superposition of three dimers from both space groups (Figure
4c). The relevance of this interaction is underscored by the fact that, as
discussed above, it occurs at both pHs and also occurs free of crystal-packing forces at
alkaline pH. The dimer-induced fibre formation is even more striking in the
context of the relatedness of the protein to the capsid-forming domain of the N protein from
a related virus.
Similarity to other nucleocapsid proteins and evolutionary implications for viral
architecture: A DALI search of the PDB revealed a very striking similarity to the 12X
amino acid capsid-forming domain of PRRSV, a corona-like virus that is a member of
the order Nidovirales. This match had a high similarity Z-score with a corresponding
RMS deviation of 2.8 Å. PRRSV is also a positive-sense single-stranded RNA virus with a
similarly large genome. PRRSV also forms a helical nucleocapsid, and the
full-length N protein was shown to form fibers in solution
(Doan and Dokland 2003). Similar helical nucleocapsids have been observed in the
orthomyxovirus, paramyxovirus, filovirus, rhabdovirus, bunyavirus and arenavirus
families, all of which contain genomic RNA associated with their respective
nucleocapsid proteins (Narayanan, Kim et al. 2003).
The capsid-forming domain in PRRSV also packed into helical arrays through crystal
contacts in the crystal studied. The arrangements of CTD, PRRSV and MS2 coat protein
all show a similar feature: an antiparallel β-strand floor with flanking helices and
loops. The major difference between the CTD and PRRSV structures lies in the fact that the CTD floor
is more concave, while the PRRSV floor is perfectly flat. In addition, the number of
surrounding loops and helical regions is greater for CTD, considering that it is almost
120 residues long compared with the 90-residue length of the PRRSV capsid-forming domain.
These observations, taken together with the PRRSV crystal packing
interaction, which is similarly mediated by helix-helix van der Waals stacking and a similar
salt bridge between ArgX and AspX in PRRSV, suggest a common theme in helical fibre
formation across the viruses in the order Nidovirales, to which PRRSV and IBV both
belong. This strengthens the suggestion that this fold is commonly employed in viruses
with helical nucleocapsids.
Also, despite the very low sequence homology between SARS-N and IBV-N (25%), the
predicted secondary structure of the SARS-N CTD matches the observed
secondary structure of IBV-N very closely (Figure 1 black topology diagram, top), and
thus SARS-N probably has the same fold as IBV-N. The NMR structure of SARS-N is
also identical in topology to that of IBV-N; thus these virus folds possibly share a high level
of homology to each other.
The overall structural similarity between PRRSV and CTD indicates that
these viruses within the order Nidovirales are more similar than previously thought and
hints that this architecture is a characteristic fold adopted by helical nucleocapsid
viruses.
Genome organization in coronaviruses as suggested from the structure of NTD and
CTD: The NTD, with its demonstrated RNA-binding activity (Hui et al.), and the clearly
dimeric CTD are two highly adaptable modules on an otherwise largely flexible and
possibly disordered protein (Wang, Wu et al. 2004). The two basic tethers in the NTD
are possibly held alongside a CTD-mediated fibre, with the tethers grabbing onto and
sequestering RNA. This NTD-CTD-RNA superstructure possibly then packs via
secondary interactions, made possible by RNA interacting with the CTD, by the CTD-CTD
class II dimer-dimer interactions, and possibly also by the NTD-NTD dimeric
interactions, to form a highly compacted ribonucleoprotein complex. The C-terminus of
N protein is also known to interact with the CTD of M protein, which is predominantly
“BASIC?ACIDIC?”. This may be possible through interactions of M with the
ACIDIC?BASIC? patches on the CTD-mediated fibre. These interactions possibly
explain the rescue of unstable M-protein mutants by compensatory mutations in the CTD
region of MHV-N (residues XXX XXXX in MHV-N protein (Kuo and Masters 2002)),
indicating a strong interaction mediated by this region. Together, this suggests a model
for genome organization in which the CTDs form a helical template with
extended NTD RNA-grabbers that organize the genomic RNA, which is brought along
by interaction of its consensus packaging signal with M protein, which nucleates
along the CTD fibre by interacting with it. The CTD fibre thus serves as a structural
template for the NTD-RNA complex to wind around, with intermittent interactions
between M and CTD. Once assembled, this complex is not prone to disruption by
treatment with RNase A, as observed by Narayanan et al. (Narayanan, Kim et al. 2003).
Together these constitute a highly organized super-complex of protein and RNA
comprising the inner core of the virion.
Materials and methods:
Purification of full length nucleocapsid protein and identification of tryptically
stable fragments: Full length nucleocapsid protein was expressed as before. The
expressed protein was purified by Ni-NTA agarose affinity chromatography followed by
heparin affinity chromatography to almost 95% purity (as assessed by denaturing
SDS-PAGE followed by Coomassie staining). The protein was checked for monodispersity
by dynamic light scattering (Dynapro) and negative stain electron microscopy. Cleavage
of full length N protein was carried out at 1-2 mg/ml concentration with 2% (wt trypsin
/ wt protein) sequencing grade trypsin (Roche) to identify tryptically stable fragments.
Following trypsinization the protein was run on a denaturing SDS-PAGE gel, and the
protein band that resulted was blotted onto a PVDF (polyvinylidene fluoride) membrane
and subjected to N-terminal amino acid sequencing. For construct optimization the
carboxy termini were estimated based on predicted secondary structure in the terminal
region and mass spectrometric characterization of purified protein.
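As a worked example of the 2% (wt/wt) trypsin ratio above, the short sketch below computes the mass of trypsin needed for a given digest. The 0.5 ml reaction volume is a hypothetical value chosen only for illustration; the text specifies the protein concentration range and the ratio but not the volume.

```python
def trypsin_mass_ug(protein_conc_mg_per_ml, volume_ml, ratio_wt_wt=0.02):
    """Mass of trypsin in micrograms for a given wt/wt trypsin:protein ratio."""
    protein_mass_mg = protein_conc_mg_per_ml * volume_ml
    return protein_mass_mg * ratio_wt_wt * 1000.0  # convert mg to micrograms


# Hypothetical 0.5 ml digest at the two ends of the stated 1-2 mg/ml range.
for conc in (1.0, 2.0):
    mass = trypsin_mass_ug(conc, volume_ml=0.5)
    print(f"{conc} mg/ml protein, 0.5 ml: {mass:.1f} ug trypsin")
```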
Cloning, expression, purification and crystallization of the tryptic fragments of
nucleocapsid protein: The two major and minor bands identified were expressed as GST
fusion proteins using the pET-41 Ek/LIC vector (Novagen), with inserts cloned into the
LIC site. The expressed protein was purified by affinity chromatography on glutathione
Sepharose (Pharmacia) followed by on-bead cleavage with enterokinase (EK-Max,
Invitrogen). The cleavage reaction was performed by suspending 1 ml of beads in 40 ml
of cutting buffer (250 mM NaCl, 50 mM Tris-HCl pH 8.0) with 10 units of protease per
1 ml of beads. Following proteolysis the dilute supernatant was purified further by gel
filtration chromatography on a Superdex 75 16/60 column (Pharmacia). The N-terminal
domain protein migrated as a monomer while the C-terminal domain protein migrated as a
dimer and was concentrated to 5-8 mg/ml and used for crystallization trials. Initial
crystallization trials were carried out using Crystal Screen I (Hampton Research).
Following several leads in conditions with PEG 4000, the Index screens 2 and 3 (Jena
Biosciences) were used to design an optimization strategy. Crystals of the N-terminal
domain grew as rhomboid plates in 25% PEG 4000, 0.1 M MES pH 5.5, 0.2 M MgCl2.
Crystals of the C-terminal dimer grew in three to ten days and were mostly
needle-shaped, thin plates or hexagonal three-dimensional bipyramidal crystals that grew
around two base conditions: one with citrate, i.e. 100 mM trisodium citrate pH 4.5-5.2,
0.1 M MgCl2, 25-30% PEG 4000, and the other with 32% PEG 4000, 0.8 M Li2SO4,
0.1 M Tris-HCl pH 8.5.
Data Collection and phasing: Data were collected at the beamlines indicated in Table
I. For each crystal 180 or 360 oscillation images with a 1° oscillation angle were
collected, using the inverse beam approach with a wedge size of 30° in the case of MAD
data sets and a continuous wedge of 180° for the native data sets. For the NTD the data
were phased using molecular replacement in Phaser with the NTD coordinates as
deposited in the PDB. Following molecular replacement, further model building and
refinement were performed in a similar manner to that described for the CTD below.
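The inverse beam strategy described above can be sketched as a frame schedule. This is an illustrative reconstruction, not the beamline software actually used: it assumes that the 360-image MAD data sets correspond to a 180 degree sweep collected in 30 degree wedges, each wedge followed immediately by the matching wedge 180 degrees away, with 1 degree per image. The start angle of 0 and the function name are assumptions made for the example.

```python
def inverse_beam_schedule(total_range_deg=180, wedge_deg=30, osc_deg=1, start_deg=0):
    """Return (phi_start, phi_end) tuples for each image, in collection order.

    Every wedge is followed by its inverse wedge (phi + 180 degrees), so that
    Friedel mates are recorded close together in time.
    """
    if total_range_deg % wedge_deg or wedge_deg % osc_deg:
        raise ValueError("range must divide into wedges, and wedges into images")
    frames = []
    for wedge_start in range(start_deg, start_deg + total_range_deg, wedge_deg):
        for offset in (0, 180):  # direct wedge first, then the inverse wedge
            first = wedge_start + offset
            for phi in range(first, first + wedge_deg, osc_deg):
                frames.append((phi % 360, phi % 360 + osc_deg))
    return frames


schedule = inverse_beam_schedule()
print(len(schedule), "images")
print("first wedge starts at", schedule[0], "its inverse at", schedule[30])
```

Under these assumptions the schedule contains 360 images, consistent with the image counts quoted for the MAD data sets, while a continuous 180 degree native sweep at 1 degree per image gives 180 images.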
For the CTD the entire dataset was integrated and scaled using the HKL2000 suite and
SCALEPACK. Four methionine positions were located using Shake-and-Bake. The
solutions were then refined, phases calculated and density modified using SHARP. The
final FOM after structure solution and phasing in SHARP was 0.65, which yielded maps
of excellent quality to 2.2 Å. Although almost 80% of the model could be traced using
automated tracing in ARP/wARP, manual building of the dimer in the asymmetric unit
was performed using the program COOT. Refinement was carried out in CNS or
REFMAC5. Refined coordinates for the dimer were used to phase data obtained in the
P21212 space group by molecular replacement in the program Phaser. Phaser was able to
correctly identify the positions of all 4 dimers. Model bias was reduced during
refinement following molecular replacement by using the prime-and-switch methodology
implemented in SOLVE/RESOLVE. All figures were generated using ESPript in
combination with Adobe Illustrator or PyMOL.
Figure 1
Baric, R. S., G. W. Nelson, et al. (1988). "Interactions between coronavirus nucleocapsid
protein and viral RNAs: implications for viral transcription." J Virol 62(11):
4280-7.
Cavanagh, D. (2003). "Severe acute respiratory syndrome vaccine development:
experiences of vaccination against avian infectious bronchitis coronavirus." Avian
Pathol 32(6): 567-82.
Cologna, R. and B. G. Hogue (1998). "Coronavirus nucleocapsid protein. RNA
interactions." Adv Exp Med Biol 440: 355-9.
Cologna, R. and B. G. Hogue (2000). "Identification of a bovine coronavirus packaging
signal." J Virol 74(1): 580-3.
Doan, D. N. and T. Dokland (2003). "Structure of the nucleocapsid protein of porcine
reproductive and respiratory syndrome virus." Structure (Camb) 11(11): 1445-51.
Fosmire, J. A., K. Hwang, et al. (1992). "Identification and characterization of a
coronavirus packaging signal." J Virol 66(6): 3522-30.
He, R., F. Dobie, et al. (2004). "Analysis of multimerization of the SARS coronavirus
nucleocapsid protein." Biochem Biophys Res Commun 316(2): 476-83.
He, R., A. Leeson, et al. (2003). "Activation of AP-1 signal transduction pathway by
SARS coronavirus nucleocapsid protein." Biochem Biophys Res Commun
311(4): 870-6.
He, R., A. Leeson, et al. (2004). "Characterization of protein-protein interactions between
the nucleocapsid protein and membrane protein of the SARS coronavirus." Virus
Res 105(2): 121-5.
Huang, Q., L. Yu, et al. (2004). "Structure of the N-terminal RNA-binding domain of the
SARS CoV nucleocapsid protein." Biochemistry 43(20): 6059-63.
Hufnagel, L., D. Brockmann, et al. (2004). "Forecast and control of epidemics in a
globalized world." Proc Natl Acad Sci U S A 101(42): 15124-9.
Janowski, R., M. Kozak, et al. (2001). "Human cystatin C, an amyloidogenic protein,
dimerizes through three-dimensional domain swapping." Nat Struct Biol 8(4):
316-20.
Kuo, L. and P. S. Masters (2002). "Genetic evidence for a structural interaction between
the carboxy termini of the membrane and nucleocapsid proteins of mouse
hepatitis virus." J Virol 76(10): 4987-99.
Lai, M. M. and D. Cavanagh (1997). "The molecular biology of coronaviruses." Adv
Virus Res 48: 1-100.
Leung, D. T., F. C. Tam, et al. (2004). "Antibody response of patients with severe acute
respiratory syndrome (SARS) targets the viral nucleocapsid." J Infect Dis 190(2):
379-86.
Luo, C., H. Luo, et al. (2004). "Nucleocapsid protein of SARS coronavirus tightly binds
to human cyclophilin A." Biochem Biophys Res Commun 321(3): 557-65.
Narayanan, K., K. H. Kim, et al. (2003). "Characterization of N protein self-association
in coronavirus ribonucleoprotein complexes." Virus Res 98(2): 131-40.
Narayanan, K., A. Maeda, et al. (2000). "Characterization of the coronavirus M protein
and nucleocapsid interaction in infected cells." J Virol 74(17): 8127-34.
Nelson, G. W., S. A. Stohlman, et al. (2000). "High affinity interaction between
nucleocapsid protein and leader/intergenic sequence of mouse hepatitis virus
RNA." J Gen Virol 81(Pt 1): 181-8.
Newcomer, M. E. (2001). "Trading places." Nat Struct Biol 8(4): 282-4.
Parker, M. M. and P. S. Masters (1990). "Sequence comparison of the N genes of five
strains of the coronavirus mouse hepatitis virus suggests a three domain structure
for the nucleocapsid protein." Virology 179(1): 463-8.
Risco, C., I. M. Anton, et al. (1996). "The transmissible gastroenteritis coronavirus
contains a spherical core shell consisting of M and N proteins." J Virol 70(7):
4773-7.
Risco, C., M. Muntion, et al. (1998). "Two types of virus-related particles are found
during transmissible gastroenteritis virus morphogenesis." J Virol 72(5): 4022-31.
Schelle, B., N. Karl, et al. (2005). "Selective replication of coronavirus genomes that
express nucleocapsid protein." J Virol 79(11): 6620-30.
Siddell, S. G. (1995). The Coronaviridae: an introduction, Plenum Press, New York, N.Y.
Stohlman, S. A. and M. M. Lai (1979). "Phosphoproteins of murine hepatitis viruses." J
Virol 32(2): 672-5.
Surjit, M., B. Liu, et al. (2004). "The SARS coronavirus nucleocapsid protein induces
actin reorganization and apoptosis in COS-1 cells in the absence of growth
factors." Biochem J 383(Pt 1): 13-8.
Surjit, M., B. Liu, et al. (2004). "The nucleocapsid protein of the SARS coronavirus is
capable of self-association through a C-terminal 209 amino acid interaction
domain." Biochem Biophys Res Commun 317(4): 1030-6.
Tahara, S. M., T. A. Dietlin, et al. (1998). "Mouse hepatitis virus nucleocapsid protein as
a translational effector of viral mRNAs." Adv Exp Med Biol 440: 313-8.
Wang, Y., X. Wu, et al. (2004). "Low stability of nucleocapsid protein in SARS virus."
Biochemistry 43(34): 11103-8.
Wurm, T., H. Chen, et al. (2001). "Localization to the nucleolus is a common feature of
coronavirus nucleoproteins, and the protein may disrupt host cell division." J
Virol 75(19): 9345-56.
Yu, I. M., C. L. Gustafson, et al. (2005). "Recombinant severe acute respiratory
syndrome (SARS) coronavirus nucleocapsid protein forms a dimer through its
C-terminal domain." J Biol Chem 280(24): 23280-6.
Zhao, P., J. Cao, et al. (2005). "Immune responses against SARS-coronavirus
nucleocapsid protein induced by DNA vaccine." Virology 331(1): 128-35.
Zhou, M. and E. W. Collisson (2000). "The amino and carboxyl domains of the infectious
bronchitis virus nucleocapsid protein interact with 3' genomic RNA." Virus Res
67(1): 31-9.
Ziebuhr, J. (2004). "Molecular biology of severe acute respiratory syndrome
coronavirus." Curr Opin Microbiol 7(4): 412-9.