X-ray structure of the C-terminal domain of a coronavirus nucleocapsid protein;
structural basis of helical nucleocapsid formation
Hariharan Jayaram, Jyothi Jayaram, Brian R. Bowman, Ellen W. Collison,
B. V. Venkataram Prasad
Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor
College of Medicine, Houston, Texas 77030, U.S.A.; Department of Veterinary
Pathobiology, Texas A&M University, College Station, Texas 77843, U.S.A.
Coronaviridae cause a variety of respiratory and enteric diseases in animals and man,
including SARS, a disease with emerging global impact. Enveloped capsids of the virus
enclose the single-stranded genome associated with the nucleocapsid protein (N protein).
Using limited proteolysis we identified two stable globular domains of the nucleocapsid
protein from infectious bronchitis virus (IBV). We present here the crystal structure of the
C-terminal domain (CTD) of IBV-N protein. The CTD exists as intimate domain-swapped
dimers that tend to organize into helical arrays. Inferring from interactions observed in
crystals at different pHs, we hypothesize that the CTD is the key determinant of helical
nucleocapsid formation in the virus. Similarity between the CTD and the capsid-forming
domain of a related virus family reveals that this fold constitutes a new class of viral
capsid folds employed in viruses with helical nucleocapsids.
Coronaviridae, a family within the order Nidovirales, comprises viruses with ssRNA
genomes that are a significant cause of common colds and of severe respiratory
illnesses such as SARS. The coronaviruses have enveloped, non-icosahedral,
pleomorphic capsids with diameters ranging from 80 to 160 nm. The capsid encloses the
viral genome, which consists of a single 30 kb segment of positive-sense ssRNA. Upon
infection the genomic RNA encodes a 3’ co-terminal set of four or more subgenomic
mRNAs that code for both structural and non-structural proteins. The enveloped capsid of
the virus is predominantly made up of the membrane glycoprotein (M), another small
transmembrane protein (E) and an array of spikes composed of the spike
glycoprotein (S). A significant protein component of the capsid is the nucleocapsid
protein (N), which interacts with the genomic ssRNA to form a helical nucleocapsid that
comprises the central core of the virion.
Electron microscopic studies of TGEV, a prototype coronavirus, revealed that the internal
nucleocapsid is possibly helical and is composed of the ssRNA genome tightly associated
with the N (nucleocapsid) protein (Risco, Anton et al. 1996). The coronavirus N protein
typically has a molecular weight of 50 to 60 kDa and is synthesized
in large amounts in an infected cell. The protein binds the genomic RNA as well as the
subgenomic RNAs that are synthesized during a virus infection. Interactions with
conserved sequences in genomic RNA are hypothesized to mediate incorporation of RNA
into nucleocapsid cores. Proper assembly of capsids in reverse genetic systems also
requires complementary interactions between N protein and the major membrane protein
M.
The N protein from SARS coronavirus has been shown to interact with cyclophilin, an
immunomodulator, and with RNase H, and has also been found to activate the AP1 pathway,
which plays a role in cell cycle control. In MHV the N protein was shown to enter the
nucleus, while similar localization was observed for a fragment of the protein in SARS
coronavirus. These observations suggest a possible role for N protein in host modulation
and control of host processes during a coronavirus infection.
The N protein is also a major antigen and is one of the diagnostic markers used for
coronavirus infection. The N protein also enhances the protection conferred by the vaccine
in birds.
The N protein displays a non-specific affinity for ssRNA in coronaviruses and can also
recognize, with increased affinity, the consensus packaging signal of MHV; in addition,
the N-terminal region of SARS-N protein interacts with the consensus leader sequence in
RNA. The N protein also has a role in modulating viral subgenomic RNA transcription
and mRNA translation, along with control of packaging of genomic RNA. These activities
have led to the suggestion that N protein functions to coordinate the involvement of
subgenomic and genomic RNA in various stages of the virus life cycle and to ensure its
packaging into a nucleocapsid.
Considerable biochemical information has become available on the in vitro behavior of N
protein, especially with regard to its oligomerization behavior and interaction with
RNA. The full length N protein is prone to disorder and aggregation in solution, and its
instability is suggested to be important for its role in virus capsid formation (Wang, Wu et
al. 2004). The dimerization domain of N protein has been localized to the C-terminal 200
residues by several studies, which identified N protein dimers both in the context of the
domain by itself and of the full length protein (Surjit, Liu et al. 2004; Tang, Wu et al. 2005;
Yu, Gustafson et al. 2005). The N-terminal domain has been shown to be predominantly
monomeric, with an affinity for ssRNA. An NMR structure of the N-terminal domain of
SARS-N shows that the N-terminal domain is largely composed of coiled
structure and interacts with RNA in solution. The N protein therefore comprises two
functional domains, an RNA binding N-terminal domain (Tang, Wu et al. 2005; Yu,
Gustafson et al. 2005) and a C-terminal dimerization domain.
Biochemical characterization of IBV-N protein domains: The full length N protein
from infectious bronchitis virus has been purified and characterized previously. The N
protein interacts strongly with the 5’ and 3’ conserved sequences of IBV RNA and also
undergoes phosphorylation during an infection to generate multiple isoforms. Our
structural characterization of full length N protein was impeded by its aggregation and
degradation on storage under a variety of conditions (lane zero, Figure 0b). Purified full
length N protein was also extremely polydisperse in solution and not amenable to
detailed structural characterization.
We employed a divide-and-conquer approach to study the protein structurally. Using
limited proteolysis, we sought to identify regions of the protein that represented stable
domains resistant to proteolysis under limiting amounts of the proteases trypsin
(which cleaves after the basic residues Arg and Lys) and V8 protease (which cleaves after
the acidic residues Glu and Asp). The digestion pattern with V8 protease was not very
distinct and yielded several diffuse bands (data not shown). Trypsin proteolysed the full
length protein to a single ~17 kD band on a 17% denaturing SDS-PAGE gel within 15 minutes
of trypsinization (Figure 0b). The “single” band thus observed was resistant to further
degradation even upon trypsinization for several hours and represented a stable region of
the protein. Using N-terminal sequencing of the cleavage fragment we identified four
tryptic fragments: two major cleavage sites that corresponded to cleavage at residues 19
and 219, and two secondary cleavage sites at residues 27 and 226. The optimized
domain constructs, termed NTD (N-terminal domain) and CTD (C-terminal domain), were
then cloned, expressed and purified to homogeneity. The N-terminal domain thus
identified was monomeric at moderate concentrations, while the C-terminal domain protein
was a dimer even at very low concentrations (Figure 0c). The C-terminal protein tended to
aggregate during purification and thus was purified at very low
concentrations and concentrated only prior to crystallization screening. The NTD and
CTD proteins thus expressed failed to interact at a variety of salt and protein
concentrations as assayed by gel-filtration co-fractionation and pull-down experiments
(Figure 0c and data not shown). NTD and CTD therefore represent independent domains
of the full length protein and were suitable for separate structure determination.
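The cleavage rules used in the limited proteolysis experiment (trypsin after Arg and Lys, V8 protease after Glu and Asp) can be illustrated with a minimal sketch. The peptide below is a made-up toy sequence, not the IBV-N sequence, and the function names are hypothetical; the simple rule also ignores sequence-context effects such as reduced cleavage before proline.

```python
# Minimal sketch: predict protease cleavage positions from simple residue rules.
# The toy sequence is hypothetical and is NOT taken from IBV-N.

TRYPSIN = set("KR")   # cleaves after basic residues Lys and Arg
V8 = set("DE")        # cleaves after acidic residues Asp and Glu


def cleavage_sites(sequence, residues):
    """Return 1-based positions of residues after which the protease cleaves."""
    seq = sequence.upper()
    # The last residue is skipped: cleaving after it would yield no new fragment.
    return [i + 1 for i, aa in enumerate(seq[:-1]) if aa in residues]


def fragments(sequence, residues):
    """Split the sequence at every predicted cleavage site (complete digest)."""
    bounds = [0] + cleavage_sites(sequence, residues) + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]


toy = "MASGKAAEVRLLDGSK"
print(cleavage_sites(toy, TRYPSIN))  # positions after which trypsin would cut
print(fragments(toy, V8))            # fragments from a complete V8 digest
```

A complete digest like this is the opposite extreme from limited proteolysis: in the experiment only the sites exposed in flexible regions are cut, which is why a folded domain survives as a stable band.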
Crystals of both the N-terminal and C-terminal domains were obtained in a variety of
conditions. Although diffraction data were obtained for both domains, we were
successful in phasing only the CTD data. We present here the crystal structure of the
C-terminal domain of IBV-N protein. Of the three different space groups in which we
obtained diffraction data, we solved the structure of CTD in two (Table 1). One of these,
Crystal I, grew at an extremely low pH of 4.5, where the crystals have a distinct rod-like
appearance in rare cases but form large needles or flat sheets in most cases. The other
condition (Table 1, Crystal II) yielded crystals that were flat sheets after several weeks.
We obtained two-wavelength anomalous data with selenomethionine-substituted protein for
Crystal I and native data for Crystal II. Crystal I and Crystal II represented two
different pHs and two different ionic strengths and had widely differing unit cell sizes
(Table 1). The crystal morphology of both crystals, i.e. rods or needles at acidic pH or
flat sheets at basic pH, indicated a tendency of the protein to pack very well in two
dimensions. Besides these, a third, three-dimensional hexagonal-bipyramidal crystal form
was optimized; it grew under conditions similar to those of Crystal I but at a slightly
elevated pH (pH 5.2) and in the absence of citrate or acetate. Despite the seemingly
three-dimensional appearance of this crystal form, the diffraction pattern was extremely
anisotropic, with almost no diffraction perpendicular to the principal long axis of the
pyramid. This anisotropy, also characteristic of organization along only two dimensions,
prevented us from solving the CTD structure under these conditions. We report the pH 4.5
structure of CTD with a dimer in the asymmetric unit and a pH 8.5 structure with 4 dimers,
or 8 molecules, in the asymmetric unit. The observation of dimers as the building block of
both crystals at these widely different pHs, coupled with the dimer observed on gel
filtration under extremely dilute conditions, indicates that the dimer is the
physiologically relevant form of this domain.
Structure of the CTD dimer: The CTD exists in both crystal forms as an intimate
domain-swapped dimer. The domain swapping is brought about by interaction between
β-strands of one monomer and surrounding helices and loops from the other monomer to
form a reciprocated, closed domain-swapped dimer akin to that seen in crystal structures
of cystatin C and RNase A (Janowski, Kozak et al. 2001; Newcomer 2001). Accordingly,
a 12 residue long β-strand, β2 (residues 295 to 307), constitutes the interface between
the two monomers (Figure 2 bottom). The overall topology of the dimer of IBV-N can be
described as a concave β-stranded floor of ~400 Å² area with the strand order
β1B-β2B-β2A-β1A, surrounded by helices and loops. Helices α3 and α4, connected by a loop
region, arch over this floor and constitute the roof of the dimer. The 12 residue long
α-helix α5, located at the extreme C-terminus of CTD, forms an angled wall that flanks
either side of the dimer and is held in place by a tight turn made up of residues 307 to
310 (Figure 1 and Figure 2).
The dimerization interactions are very tight and bury a surface area of 5780 Å². Neither
the serine rich domain (161 to 191, Figure 0) nor disulfide bonding is important in
protein oligomerization, although previous biochemical data had suggested a role for
both. The two cysteine residues C228 and C281 lie in close proximity in the interior of
the dimer but are not disulfide bonded to each other in this structure. Protein
preparation and crystallization were performed in the absence of reducing agent, so the
non-disulfide-mediated interaction seen here is probably identical to that in the virus
nucleocapsid. The integrity of the dimer observed in solution is readily understood when
one considers the ~5000 Å² of buried surface area involved in the dimerization.
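Buried surface area is conventionally obtained as the solvent-accessible area of the two isolated partners minus that of the complex. The sketch below illustrates that bookkeeping with a small Shrake-Rupley style calculation on toy spheres. It is not the program used for the values quoted above; the atom radii, the 1.4 Å probe and the number of test points are assumptions chosen for illustration.

```python
import math

# Minimal sketch of a Shrake-Rupley style accessible-surface calculation.
# Atoms are (x, y, z, radius) tuples; all numbers used here are illustrative.


def sphere_points(n):
    """Roughly even points on a unit sphere (golden-spiral construction)."""
    golden = math.pi * (3.0 - math.sqrt(5.0))
    pts = []
    for k in range(n):
        z = 1.0 - 2.0 * (k + 0.5) / n
        r = math.sqrt(1.0 - z * z)
        pts.append((r * math.cos(golden * k), r * math.sin(golden * k), z))
    return pts


def sasa(atoms, probe=1.4, n_points=200):
    """Solvent-accessible surface area of a set of atoms, in square angstroms."""
    unit = sphere_points(n_points)
    total = 0.0
    for i, (xi, yi, zi, ri) in enumerate(atoms):
        big_r = ri + probe
        exposed = 0
        for ux, uy, uz in unit:
            px, py, pz = xi + big_r * ux, yi + big_r * uy, zi + big_r * uz
            # A test point is buried if it lies inside any other expanded atom.
            buried = any(
                j != i
                and (px - xj) ** 2 + (py - yj) ** 2 + (pz - zj) ** 2 < (rj + probe) ** 2
                for j, (xj, yj, zj, rj) in enumerate(atoms)
            )
            if not buried:
                exposed += 1
        total += 4.0 * math.pi * big_r * big_r * exposed / n_points
    return total


def buried_surface_area(chain_a, chain_b, probe=1.4, n_points=200):
    """Total area buried on both partners when they form a complex."""
    separate = sasa(chain_a, probe, n_points) + sasa(chain_b, probe, n_points)
    return separate - sasa(chain_a + chain_b, probe, n_points)


# Toy example: two single-atom "chains" in contact, then far apart.
a = [(0.0, 0.0, 0.0, 1.8)]
b_near = [(3.0, 0.0, 0.0, 1.8)]
b_far = [(50.0, 0.0, 0.0, 1.8)]
print(buried_surface_area(a, b_near), buried_surface_area(a, b_far))
```

Note that some programs report the total area buried on both partners, as here, while others report half of that value as the interface area per partner, so the convention should be stated whenever such numbers are compared.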
The dimeric structure observed at pH 4.5 was almost identical to all four dimers observed
at pH 8.5, with the rmsd for Cα atoms in the core region (233 to 328) being ~0.3 Å.
The N and C termini of the five dimers differed from dimer to dimer depending on
their stabilizing interactions with neighboring dimers in the crystal (pH 4.5 case) or
within the asymmetric unit (pH 8.5 case). Further insight into the nature of the CTD in
the capsid, or in the context of the virus, can be gained from the crystal packing
interactions in both space groups. The presence of one dimer in the asymmetric unit (ASU)
in one crystal form and four dimers in the ASU in the other allowed the analysis of
dimer-dimer interactions not only at different pHs but also in the presence and absence
of constraints imposed by crystal packing.
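The rmsd values quoted for Cα atoms are minimum values after optimal rigid-body superposition. A minimal sketch of such a calculation is given below, using Horn's quaternion formulation with a simple power iteration so that it needs only the standard library. It is an illustration under stated assumptions, not the superposition program used for this work: the function names and the toy coordinates are hypothetical, the two coordinate lists are assumed to be already paired atom by atom, and the power iteration can converge slowly for nearly collinear point sets.

```python
import math

# Minimal sketch: minimum RMSD between two paired coordinate sets after
# optimal rigid superposition (Horn's quaternion method, power iteration).


def _centered(coords):
    n = len(coords)
    c = [sum(p[k] for p in coords) / n for k in range(3)]
    return [(p[0] - c[0], p[1] - c[1], p[2] - c[2]) for p in coords]


def superposed_rmsd(mobile, target, iterations=500):
    if len(mobile) != len(target) or len(mobile) < 3:
        raise ValueError("need two equally long lists of at least 3 points")
    a, b = _centered(mobile), _centered(target)
    n = len(a)
    ga = sum(x * x + y * y + z * z for x, y, z in a)
    gb = sum(x * x + y * y + z * z for x, y, z in b)
    if ga + gb == 0.0:
        return 0.0
    # Correlation matrix s[i][j] = sum over atoms of a_i * b_j.
    s = [[sum(p[i] * q[j] for p, q in zip(a, b)) for j in range(3)] for i in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = s
    nmat = [
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]
    # Shifting by (ga + gb) / 2 makes every eigenvalue non-negative, so power
    # iteration converges to the eigenvector of the largest eigenvalue.
    shift = 0.5 * (ga + gb)
    q = [1.0, 0.5, 0.25, 0.125]  # arbitrary generic starting vector
    for _ in range(iterations):
        q = [sum(nmat[r][c] * q[c] for c in range(4)) + shift * q[r] for r in range(4)]
        norm = math.sqrt(sum(v * v for v in q))
        q = [v / norm for v in q]
    lam = sum(q[r] * nmat[r][c] * q[c] for r in range(4) for c in range(4))
    return math.sqrt(max(0.0, (ga + gb - 2.0 * lam) / n))


# Toy check: a rotated and translated copy superposes with an rmsd near zero.
pts = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 2.5), (1.0, 1.0, 1.0)]
t = math.radians(30.0)
moved = [(x * math.cos(t) - y * math.sin(t) + 4.0,
          x * math.sin(t) + y * math.cos(t) - 2.0,
          z + 7.0) for x, y, z in pts]
print(superposed_rmsd(pts, moved))
```

In practice the core residues of each dimer would first be paired by residue number, and only then superposed.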
Crystal packing interactions in CTD and insights into the stability of helical packing
interactions: The two structures presented here result in five kinds of inter-dimer
interactions. Crystal packing in Crystal I is brought about by dimer-dimer interactions,
with the nth dimer interacting with the (n-1)th and (n+1)th dimers from neighboring ASUs
(Figure 4b). In Crystal II, with 4 dimers in the ASU, inter-dimer interactions are
responsible for keeping the four dimers in the ASU together as well as for mediating
crystal packing (Figure 4a), giving rise to four further dimer-dimer interactions.
Three of them, i.e. AB-CD, CD-EF and the crystal packing interaction of GH with AB(n+1),
belong to one class, and a new class of “dimer-dimer” interactions involves the
interaction between the GH dimer and a different interface formed by the CD-EF dimers
(Figure 4a).
The uniformity of all but the last kind of dimer-dimer interactions observed in two
crystals is apparent from a superposition of all four types of dimer-dimer interactions
observed between the two crystals, whereby the dimers all superpose with a minimum of
0.3 XXX Å rmsd and a maximum of 0.8 XXX Å rmsd (Figure 4c and Figure 4d). When
the three dimers from Crystal I are superposed on the three dimers from Crystal II, the
rmsd between them is ~1.0 Å (Figure 4c). This clearly indicates that the dimers tend to
swivel only slightly with respect to each other and constitute a subtle module that is
very prone to interacting with itself.
These interactions primarily involve residues between 308 and 328, which constitute the
XXX type turn (TT in Figure 1), α5 and the terminal loop in CTD (Figure 1, lilac
boxes). The dimers interact such that they bury a surface area of ~1200 Å² between them
in all cases except the packing interaction in Crystal II (dimer GH interacting with dimer
AB(n+1)), where the buried surface area is only 600 Å² owing to a swiveling away of the GH
dimer, possibly prompted by its strong interaction with the CD-EF dimers from within the
ASU.
Although there is no significant surface complementarity between the two molecules, the
predominant interaction between dimers is a salt bridge between Arg-308 from one dimer
and Asp-314 from a neighboring dimer (Figure 5). The salt bridge and the orientation of
the dimers remain almost identical between the structures at pH 4.5 and pH 8.5. The
inter-dimer interactions other than the salt bridge are strictly van der Waals
interactions. The multimerization interaction, in addition to the dimerization
interactions seen in CTD, is very well maintained over this wide range of pHs. The ionic
strength of the two crystal conditions is also different, thereby providing further
evidence of the stability of dimer-dimer packing interactions.
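A salt bridge such as the Arg-Asp pair described above is usually identified from coordinates by a distance criterion between the charged nitrogen and oxygen atoms. The sketch below shows one way this could be checked. The 4.0 Å cutoff is a commonly used convention rather than a value taken from this work, the dictionary layout is hypothetical, and the coordinates are invented toy values, not atoms from the CTD structures.

```python
import math

# Minimal sketch: flag a salt bridge from side-chain N...O distances.
# Atom names follow PDB conventions; coordinates below are invented.

CATIONIC = {"ARG": ("NE", "NH1", "NH2"), "LYS": ("NZ",)}
ANIONIC = {"ASP": ("OD1", "OD2"), "GLU": ("OE1", "OE2")}


def salt_bridge_distance(res_a, res_b):
    """Shortest charged N...O distance, or None if charges are not opposite."""
    for pos, neg in ((res_a, res_b), (res_b, res_a)):
        n_names = CATIONIC.get(pos["name"])
        o_names = ANIONIC.get(neg["name"])
        if n_names and o_names:
            dists = [
                math.dist(pos["atoms"][n], neg["atoms"][o])
                for n in n_names if n in pos["atoms"]
                for o in o_names if o in neg["atoms"]
            ]
            return min(dists) if dists else None
    return None


def is_salt_bridge(res_a, res_b, cutoff=4.0):
    """True if any charged N...O pair lies within the cutoff (assumed 4.0 A)."""
    d = salt_bridge_distance(res_a, res_b)
    return d is not None and d <= cutoff


arg = {"name": "ARG", "atoms": {"NE": (1.1, 1.9, 0.0), "NH1": (0.0, 0.0, 0.0), "NH2": (2.2, 0.0, 0.0)}}
asp = {"name": "ASP", "atoms": {"OD1": (0.0, 0.0, 2.9), "OD2": (2.2, 0.0, 3.0)}}
print(salt_bridge_distance(arg, asp), is_salt_bridge(arg, asp))
```

Running the same check on each dimer-dimer interface in both crystal forms would show directly in which interfaces the bridge is preserved.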
The additional dimer (GH) is clearly auxiliary (and not part of the primary fibre, see
below) and reveals a higher-order mode of interaction with CTD dimers. The GH dimer
interacts with residues from two neighboring dimers in Crystal II such that the total
buried surface area is ~1200 Å². The interacting surface comprises residues from all over
the dimers (underlined residues, Figure 1). Since this interaction involves three
different molecules and yet the buried surface area is similar (~1200 Å²) to that of the
primary crystal-packing (or fibre-forming) interaction, we hypothesize that it is less
likely and therefore secondary to the primary interaction seen for the other dimers.
Because this dimer also mediates crystal packing in this space group through the same
region on its other face, the tight salt bridge observed between R308 and D315 is
preserved in only one of the cases and disrupted in the two-fold related one. Despite
this skewing the overall rmsd is only 0.8 XXX Å, indicating the extreme adaptability of
the dimer, with α5 and the preceding loop mediating these interactions.
This additional interaction also raises the possibility that the fibre hexamer made up of
ABCDEF, with its GH appendage, could circularize or form planar triangles under certain
conditions, with the GH dimer serving as a bridge that brings the otherwise rigid ABCDEF
fibres together. Such bridging interactions may indeed be necessary for spherical particle
formation driven by triangularization of three hexamers, with the fourth dimer serving as
the linker.
In addition, the greater flexibility of various regions of the protein at alkaline pH
(Figure 6a), coupled with the swiveling seen in the GH-AB interaction, could represent a
snapshot of the disassembly of dimer-dimer interactions, which may be important
for nucleocapsid disassembly and genome release (NEED TO ELABORATE).
NEED TO ADD A BIG SECTION ON THE FIBRE AND HOW THIS SUGGESTS
HELICAL NUCLEOCAPSID FORMATION>>AND IMPLICATIONS FOR M
INTERACTION ETC
Electrostatic surface, conservation of surface residues and interaction with other
capsid components: Analysis of the GRASP surface of the octamer further reveals
that the surface is primarily acidic, with a swath of basic residues running in an
expectedly helical fashion throughout the fibre (Figure 6b). Although the primary
interactions with RNA are conferred by the N terminus, secondary interactions may be
facilitated by this basic stretch, which is clearly solvent exposed.
Fibre formation: The clear tendency of the dimer-dimer interaction to promote fibre
formation is evident from superposition of three dimers from both space groups (Figure
4c). The relevance of this interaction is greater when one considers that it occurs, as
discussed above, at both pHs and also occurs free of crystal-packing-induced forces at
alkaline pH. The dimer-induced fibre formation is even more striking when put in the
context of the relatedness of the protein to the capsid-forming domain of the N protein
from a related virus.
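The geometric reason that one repeated dimer-dimer interface produces a fibre is that applying the same rigid-body transform again and again places successive copies on a helix (a screw axis). The sketch below illustrates this with a screw operation about the z axis. The twist of 60 degrees, the rise of 5 Å and the 10 Å radius are purely hypothetical numbers chosen for the example and are not parameters measured from the CTD crystals.

```python
import math

# Minimal sketch: repeating one screw operation generates a helical array.
# Twist, rise and starting point are hypothetical illustration values.


def screw_copy(point, n, twist_deg, rise):
    """Position of a reference point after n successive screw operations."""
    t = math.radians(twist_deg * n)
    x, y, z = point
    return (x * math.cos(t) - y * math.sin(t),
            x * math.sin(t) + y * math.cos(t),
            z + rise * n)


def fibre(point, copies, twist_deg, rise):
    """Positions of the same reference point in successive subunits."""
    return [screw_copy(point, n, twist_deg, rise) for n in range(copies)]


centres = fibre((10.0, 0.0, 0.0), 7, twist_deg=60.0, rise=5.0)
for c in centres:
    print(tuple(round(v, 2) for v in c))
```

Every neighbouring pair in such an array is related by the identical transform, which mirrors the observation that the nth dimer contacts the (n-1)th and (n+1)th dimers through the same interface.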
Similarity to other nucleocapsid proteins and evolutionary implications for viral
architecture: A DALI search of the PDB revealed a very striking similarity to the 12X
amino acid capsid-forming domain of PRRSV, a corona-like virus that is a member of
the order Nidovirales. This match had a similarity Z-score of XXX with a corresponding
RMS deviation of 2.8 Å. PRRSV is also a positive-sense single-stranded RNA virus with a
similarly large genome. PRRSV also forms a helical nucleocapsid, and its
full length N protein was shown to form fibres in solution. The
capsid-forming domain also packed into helical arrays using crystal contacts in the
crystal studied. The arrangements of CTD, PRRSV and MS2 coat protein all show a similar
feature of an anti-parallel β-strand floor with flanking helices and loops. The similarity
between PRRSV and CTD here clearly indicates that these viruses are more similar than
previously thought and hints at this architecture being a characteristic fold adopted by
helical nucleocapsid viruses.
This fact, taken together with the PRRSV crystal packing interaction, which is similarly
mediated by helix-helix van der Waals stacking and a similar salt bridge
between ArgX and AspX in PRRSV, suggests a common theme in helical fibre
formation across the viruses in the order Nidovirales, to which PRRSV and IBV both
belong. This strengthens the suggestion that this fold is commonly employed in viruses
with helical nucleocapsids.
Materials and methods:
Purification of full length nucleocapsid protein and identification of tryptically
stable fragments: Full length nucleocapsid protein was expressed as before. The
expressed protein was purified by Ni-NTA agarose affinity followed by heparin affinity
chromatography to almost 95% purity (as assessed by denaturing SDS-PAGE followed by
Coomassie staining). The protein was checked for monodispersity by dynamic light
scattering (DynaPro) and negative stain electron microscopy. Cleavage of full length N
protein was carried out at 1-2 mg/ml with 2% (wt trypsin/wt protein) sequencing grade
trypsin (Roche) to identify tryptically stable fragments. Following trypsinization the
protein was run on a denaturing SDS-PAGE gel, and the resulting protein band was
blotted onto a PVDF (polyvinylidene fluoride) membrane and subjected to N-terminal
amino acid sequencing. For construct optimization the carboxy termini were estimated
from the predicted secondary structure in the terminal region and from mass spectrometric
characterization of the purified protein.
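As a worked example of the 2% (wt/wt) trypsin ratio quoted above, the small helper below converts a protein concentration and reaction volume into the mass of protease to add. The 0.5 ml reaction volume is a hypothetical value chosen only for the arithmetic; the text specifies the concentration range and the ratio but not the volume.

```python
# Minimal sketch: protease mass for a weight/weight digestion ratio.
# The reaction volume used in the example is hypothetical.


def protease_mass_ug(protein_mg_per_ml, volume_ml, ratio_wt_wt=0.02):
    """Micrograms of protease for a given protein amount and wt/wt ratio."""
    protein_mg = protein_mg_per_ml * volume_ml
    return protein_mg * ratio_wt_wt * 1000.0  # mg -> micrograms


# 1.5 mg/ml protein in 0.5 ml contains 0.75 mg protein, so 2% is 15 ug trypsin.
print(protease_mass_ug(1.5, 0.5))
```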
Cloning, expression, purification and crystallization of the tryptic fragments of
nucleocapsid protein: The two major and minor bands identified were expressed as GST
fusion proteins by cloning into the LIC site of the pET-41 Ek/LIC vector (Novagen). The
expressed protein was purified by affinity chromatography on glutathione Sepharose
(Pharmacia) followed by on-bead cleavage with enterokinase (EK-Max, Invitrogen). The
cleavage reaction was performed by suspending 1 ml of beads in 40 ml of cutting buffer
(250 mM NaCl, 50 mM Tris-HCl pH 8.0) with 10 units of protease per 1 ml of beads.
Following proteolysis the dilute supernatant was purified further by gel filtration
chromatography on a Superdex 75 16/60 column (Pharmacia). The protein migrated as a dimer
and was concentrated to 5-8 mg/ml and used for crystallization trials. Initial
crystallization trials were carried out using Crystal Screen I (Hampton Research).
Following several leads in conditions with PEG 4000, the Index screens 2 and 3 (Jena
Biosciences) were used to design an optimization strategy. Crystals of the C-terminal
dimer grew in three to ten days and were mostly needle shaped, thin plates or hexagonal
three-dimensional bipyramidal crystals that grew around two base conditions: one with
citrate, i.e. 100 mM trisodium citrate pH 4.5-5.2, 0.1 M MgCl2, 25-30% PEG 4000, and the
other with 32% PEG 4000, 0.8 M Li2SO4,
0.1 M Tris-HCl pH 8.5.
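Optimization around the citrate base condition amounts to a two-dimensional grid over pH and PEG 4000 concentration within the quoted ranges. The sketch below enumerates such a grid. The particular step values (four pH points and six PEG concentrations, giving a 24-condition layout) are assumptions for illustration; the text gives only the ranges, not the grid actually used.

```python
from itertools import product

# Minimal sketch: enumerate a pH x PEG optimization grid around the citrate
# base condition. Step values are assumed; only the ranges come from the text.


def citrate_peg_grid(ph_values, peg_percents):
    """One dictionary per well, holding the fixed and varied components."""
    return [
        {"trisodium_citrate_mM": 100, "citrate_pH": ph,
         "MgCl2_M": 0.1, "PEG4000_percent": peg}
        for ph, peg in product(ph_values, peg_percents)
    ]


ph_steps = [4.5, 4.7, 4.9, 5.2]        # assumed sampling of pH 4.5-5.2
peg_steps = [25, 26, 27, 28, 29, 30]   # assumed sampling of 25-30% PEG 4000
conditions = citrate_peg_grid(ph_steps, peg_steps)
print(len(conditions), conditions[0], conditions[-1])
```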
Data collection and phasing: Data were collected at the beamlines indicated in Table 1.
180 or 360 oscillation images with a 1° oscillation angle were collected using the
inverse beam approach with a wedge size of 30°. The entire dataset was integrated and
scaled using the HKL2000 suite and SCALEPACK. Four methionine positions were located using
Shake-and-Bake. The solutions were then refined, phases calculated and density modified
using SHARP. The final FOM after structure solution and phasing in SHARP was 0.65,
which yielded maps of excellent quality to 2.2 Å. Although almost 80% of the model
could be traced using automated tracing in ARP/wARP, manual building of the dimer in
the asymmetric unit was performed using the program COOT. Refinement was carried
out in CNS or REFMAC5. Refined coordinates for the dimer were used to phase data
obtained in the P21212 space group by molecular replacement in the program Phaser.
Phaser was able to correctly identify the positions of all 4 dimers. Model bias was
avoided during refinement by using the prime-and-switch methodology implemented in
SOLVE/RESOLVE. All figures were generated using ESPript in combination with Adobe
Illustrator or PyMOL.
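In the inverse beam approach, each wedge of images is followed immediately by the wedge 180° away, so that Friedel-related reflections are recorded close together in time. The sketch below generates such a frame order for 1° images and 30° wedges. It is an illustration of the general strategy, not the beamline control software; the function name is hypothetical, and treating the first argument as the extent of the direct sweep is an assumption.

```python
# Minimal sketch: frame order for an inverse-beam data collection.
# sweep_deg is taken to be the extent of the direct sweep (an assumption).


def inverse_beam_schedule(sweep_deg, wedge_deg=30, osc_deg=1, start_deg=0):
    """Return (phi_start, is_inverse) for every frame, in collection order."""
    if wedge_deg % osc_deg != 0:
        raise ValueError("wedge must be a whole number of oscillation steps")
    frames = []
    for wedge_start in range(0, sweep_deg, wedge_deg):
        wedge_len = min(wedge_deg, sweep_deg - wedge_start)
        for inverse in (False, True):
            offset = 180 if inverse else 0  # inverse wedge is 180 degrees away
            for step in range(0, wedge_len, osc_deg):
                phi = (start_deg + wedge_start + step + offset) % 360
                frames.append((phi, inverse))
    return frames


frames = inverse_beam_schedule(60)
print(len(frames), frames[0], frames[30], frames[60])
```

With a 60° direct sweep this yields two direct wedges interleaved with their two inverse wedges, 120 frames in total.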
Janowski, R., M. Kozak, et al. (2001). "Human cystatin C, an amyloidogenic protein,
dimerizes through three-dimensional domain swapping." Nat Struct Biol 8(4):
316-20.
Newcomer, M. E. (2001). "Trading places." Nat Struct Biol 8(4): 282-4.
Risco, C., I. M. Anton, et al. (1996). "The transmissible gastroenteritis coronavirus
contains a spherical core shell consisting of M and N proteins." J Virol 70(7):
4773-7.
Surjit, M., B. Liu, et al. (2004). "The nucleocapsid protein of the SARS coronavirus is
capable of self-association through a C-terminal 209 amino acid interaction
domain." Biochem Biophys Res Commun 317(4): 1030-6.
Tang, T. K., M. P. Wu, et al. (2005). "Biochemical and immunological studies of
nucleocapsid proteins of severe acute respiratory syndrome and 229E human
coronaviruses." Proteomics 5(4): 925-37.
Wang, Y., X. Wu, et al. (2004). "Low stability of nucleocapsid protein in SARS virus."
Biochemistry 43(34): 11103-8.
Yu, I. M., C. L. Gustafson, et al. (2005). "Recombinant severe acute respiratory
syndrome (SARS) coronavirus nucleocapsid protein forms a dimer through its
C-terminal domain." J Biol Chem 280(24): 23280-6.