X-ray Crystallographic Determination of a Collagen-like Peptide with the Repeating Sequence (Pro-Pro-Gly)

Article No. mb981881
J. Mol. Biol. (1998) 280, 623–638
Rachel Z. Kramer¹, Luigi Vitagliano⁴, Jordi Bella¹, Rita Berisio⁴, Lelio Mazzarella⁴, Barbara Brodsky³, Adriana Zagari⁴ and Helen M. Berman¹,²*
¹Department of Chemistry, Rutgers University, 610 Taylor Rd, Piscataway, NJ 08854-8087, USA
²Waksman Institute, Piscataway, NJ 08855, USA
³Department of Biochemistry, Robert Wood Johnson Medical School, Piscataway, NJ 08855, USA
⁴Centro di Studio di Biocristallografia, CNR and Dipartimento di Chimica, Università di Napoli, via Mezzocannone 4, 80134 Napoli, Italy
The crystal structure of the triple-helical peptide (Pro-Pro-Gly)10 has been re-determined to obtain a more accurate description for this widely studied collagen model and to provide a comparison with the recent high-resolution crystal structure of a collagen-like peptide containing Pro-Hyp-Gly regions. This structure demonstrated that hydroxyproline participates extensively in a repetitive hydrogen-bonded assembly between the peptide and the solvent molecules. Two separate structural studies of the peptide (Pro-Pro-Gly)10 were performed with different crystallization conditions, data collection temperatures, and X-ray sources. The polymer-like structure of one triple-helical repeat of Pro-Pro-Gly has been determined to 2.0 Å resolution in one case and 1.7 Å resolution in the other. The solvent structures of the two peptides were independently determined specifically for validation purposes. The two structures display a reverse chain trace compared with the original structure determination. In comparison with the Hyp-containing peptide, the two Pro-Pro-Gly structures demonstrate very similar molecular conformation and analogous hydration patterns involving carbonyl groups, but have different crystal packing. This difference in crystal packing indicates that the involvement of hydroxyproline in an extended hydration network is critical for the lateral assembly and supermolecular structure of collagen.
© 1998 Academic Press
*Corresponding author
Keywords: collagen; triple helix; hydration; supermolecular structure;
hydroxyproline
Introduction
The triple helix is the primary structural element in collagen and is an important component of various proteins such as the serum complement protein C1q and the macrophage scavenger receptor. In both cases a triple-helical domain has been found to be the site responsible for binding interactions (Acton et al., 1993; Doi et al., 1993; Hoppe & Reid, 1994). Much of what is currently known about the structure of the triple helix is the result of fiber diffraction studies on actual collagen and collagen-like peptides (Fraser et al., 1979; Rich & Crick, 1961; Yonath & Traub, 1969). The similarity between fiber diffraction patterns from synthetic polypeptides and those from native collagen confirmed their utility as good models for collagen. Recent crystal structures of collagen-like peptides (Bella et al., 1994; Okuyama et al., 1981) have corroborated and expanded what was known from fiber models.
R.Z.K. and L.V. contributed equally to this work.
Present address: J. Bella, Purdue University, Department of Biological Sciences, West Lafayette, IN 47907, USA
Abbreviations used: PPG 0, structure of (Pro-Pro-Gly)10 determined by Okuyama et al. (1981); PPG 1, structure of (Pro-Pro-Gly)10 determined to 2.0 Å; PPG 2, structure of (Pro-Pro-Gly)10 determined to 1.7 Å resolution; PPG, the PPG 1 and PPG 2 structures collectively; Gly→Ala, structure determined by Bella et al. (1994); Hyp, hydroxyproline; rms deviation, root-mean-square deviation.
0022–2836/98/290623–16 $30.00/0
The collagen molecule is known to be a triple-helical coiled-coil, in which each of the three strands has a left-handed, extended polyproline II helical conformation. The three strands then wrap around a common helical axis in a right-handed fashion. The three strands are held together with interchain hydrogen bonds in the Rich & Crick (1961) collagen II pattern, with a one residue stagger between adjacent chains. The extended, close-packed nature of the triple helix requires a glycine residue in every third position. This creates the repetitive sequence (Gly-X-Y), in which the X and Y positions are frequently occupied by the imino acids proline and hydroxyproline, respectively. Hydroxyproline is formed by the post-translational modification of proline by prolyl hydroxylase, which places a hydroxyl group at the 4-position. Hydroxyproline is common in collagens and in proteins containing collagen-like regions, though it is a rare amino acid in proteins overall. In general, only proline residues on the amino side of a glycine residue are hydroxylated.
Hydroxyproline appears to play an important role in the stability of the triple helix. Melting studies of the two triple-helical synthetic peptides (Pro-Pro-Gly)10 and (Pro-Hyp-Gly)10 demonstrated a distinct disparity in their melting temperatures† (tm). The tm for (Pro-Hyp-Gly)10 is about 58 °C in aqueous solution with 10% (v/v) acetic acid, while (Pro-Pro-Gly)10 has a tm value of about 24 °C (Sakakibara et al., 1973). Analogous experiments using collagen yielded similar results. Rosenbloom et al. (1973) demonstrated that collagen lacking hydroxyproline is unstable at biological temperatures. The disease scurvy is the result of the improper functioning of prolyl hydroxylase in the absence of its cofactor ascorbate.
The repetitive nature of the collagen sequence, (Gly-X-Y)n, has offered unique opportunities for fiber diffraction, model peptide, and theoretical studies. Over the years, (Pro-Pro-Gly)-based collagen models have been widely investigated as the simplest models to describe collagen triple helices, neglecting the subtle effects introduced by Hyp residues. Yonath & Traub (1969) reported a detailed conformational analysis of the sequential polypeptide poly(Pro-Pro-Gly). Its fiber diffraction pattern was quite similar to that observed for collagen, and showed significantly higher definition that allowed for an improved structure. This model exhibited the basic characteristics of the model previously proposed for collagen by Rich & Crick (1961), and for several years was considered to be the best available for the triple-helical conformation of collagen. Theoretical studies of poly(Pro-Pro-Gly) by Miller & Scheraga (1976) and Némethy et al. (1992) indicated good agreement between the lowest-energy model and experimental data.
† Specifically, this is the midpoint of the thermal denaturation from the triple-helical state to the non-triple-helical state.
‡ The use of the 7₅ and 10₇ notation is intended to be consistent with crystallographic screw symmetry nomenclature, which indicates the handedness of the helix. The 7₅ and 10₇ helices are equivalent to 7/2 and 10/3 helices, respectively.
X-ray Structure of Repeating (Pro-Pro-Gly) Peptide
Sakakibara et al. (1968) utilized solid-phase synthesis methods to obtain collagen-like oligopeptides of defined molecular mass. Single crystals of (Pro-Pro-Gly)10 were obtained (Sakakibara et al., 1972), and the crystal structure was determined (Okuyama et al., 1981). Although the end-to-end stacking of the molecules in the crystal structure of (Pro-Pro-Gly)10 was not well determined, the structure exhibited many of the main features of the commonly accepted model for collagen (Fraser et al., 1979). The structure did, however, display subtle differences in helical parameters yielding a triple helix with 7₅ screw symmetry in contrast to the 10₇ screw symmetry observed for native collagen‡. Helical twist discrepancies notwithstanding, the crystal structure of (Pro-Pro-Gly)10 has been considered by many to be a good high-resolution picture of the molecular conformation of the collagen triple helix.
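The difference between the 7/2 and 10/3 descriptions can be made concrete with a little arithmetic. The sketch below assumes the usual fiber-diffraction reading of a u/t helix (u residues in t turns, with a negative sign for the left-handed residue-to-residue twist) and a three-residue repeat; the function name and sign convention are illustrative choices, not taken from the paper. Under those assumptions the twist between successive triplets along one chain comes out at about 51.4° for the 7/2 helix and 36° for the 10/3 helix, which agrees with the helix twist angles listed in Table 4.

```python
def helix_twists(units, turns, residues_per_repeat=3):
    """Return (unit_twist, repeat_twist) in degrees for a left-handed units/turns helix.

    unit_twist   : rotation relating successive residues (negative = left-handed).
    repeat_twist : net rotation relating successive repeats (e.g. Gly-X-Y triplets)
                   along one chain, reduced to the range (-180, 180].
    """
    unit_twist = -360.0 * turns / units
    repeat_twist = (residues_per_repeat * unit_twist) % 360.0
    if repeat_twist > 180.0:
        repeat_twist -= 360.0
    return unit_twist, repeat_twist


if __name__ == "__main__":
    for label, (u, t) in {"7/2": (7, 2), "10/3": (10, 3)}.items():
        unit, triplet = helix_twists(u, t)
        print(f"{label}: unit twist {unit:.1f} deg, triplet twist {triplet:.1f} deg")
```

The positive triplet twist reflects the right-handed sense in which each left-handed strand winds around the common axis.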
A recent crystal structure determination of a triple-helical designed peptide of sequence (Pro-Hyp-Gly)4-Pro-Hyp-Ala-(Pro-Hyp-Gly)5, termed Gly→Ala (Bella et al., 1994, 1995), has provided a high-resolution picture of a triple helix. This structure displays regular triple-helical conformation at the Pro-Hyp-Gly repeats at both ends and a bulging in the center of the molecule where one alanine residue in each chain is substituted for a glycine residue. This results in the untwisting of one end of the helix with respect to the other. The Gly→Ala structure demonstrated that there is a delicate and repetitive hydrogen-bonded assembly between the triple-helical peptide molecules and the solvent molecules surrounding them, and that Hyp residues participate extensively in the building of the water network, providing extra anchoring points for hydrogen bonding to or from the peptide surface. Analysis of the hydration patterns in (Pro-Pro-Gly)10 in the absence of Hyp residues can provide clues to understand why and how these residues contribute to triple-helical stability and what role they may play in the proper assembly of collagen in vivo. The packing of collagen-like peptides in crystals is directly analogous to the native biological situation in which one molecule must interact with others. This is demonstrated by the similarity of the pseudo-hexagonal lateral packing of the Gly→Ala structure (Bella et al., 1994) to that observed for collagen (Fraser et al., 1983).
With this consideration, a re-determination of the crystal structure of the peptide (Pro-Pro-Gly)10 was undertaken. The increased resolution of the data, 1.7 Å, allows for the comparison of the hydration patterns of this peptide with those observed in the Gly→Ala structure that does contain Hyp residues.
Results and Discussion
Two independent crystallization experiments (PPG 1 and PPG 2) were performed with the peptide (Pro-Pro-Gly)10. The structure of the 21 residue asymmetric unit (Figure 1(a)) was determined using molecular replacement and an idealized 7-fold triple helix. The reduced size of the asymmetric unit compared with the entire molecule is a consequence of translational disorder along the triple-helical axis, which leads to a molecule that behaves as a quasi-infinite chain. The crystallographic asymmetric unit consists of 21 residues arranged in three chains of different length; one with sequence Pro-Pro-Gly-Pro-Pro-Gly-Pro-Pro-Gly and the other two with the shorter sequence Pro-Pro-Gly-Pro-Pro-Gly. Because of this particular arrangement, a given continuous peptide chain runs throughout symmetry-generated mates of all three chains in the asymmetric unit (Figure 1(b)). The final PPG 1 model, refined to a resolution of 2.0 Å, contains 21 peptide residues, 37 water molecules, and two acetic acid molecules. The final PPG 2 model, refined to a resolution of 1.7 Å, contains 21 peptide residues and 40 water molecules.
Figure 1. (a) The 21 residue asymmetric unit of (Pro-Pro-Gly)10. One chain has a length of nine residues with sequence Pro-Pro-Gly-Pro-Pro-Gly-Pro-Pro-Gly (dark gray). The other two chains are each six residues long with the sequence Pro-Pro-Gly-Pro-Pro-Gly (medium and light gray). This results in a model with the length of one triple-helical repeat. Interchain hydrogen bonds are shown with broken lines. It should be noted that this representation of the asymmetric unit is arbitrary. The repeating unit could be similarly represented by three chains of seven residues each or by one chain of 21 residues. Whichever representation is chosen, the c-axis unit translation generates the entire triple helix. The Figure was generated with MOLSCRIPT (Kraulis, 1991). (b) Line diagram showing the numbering scheme of the molecular replacement model. The first chain is numbered from one to nine, the second from 31 to 36, and the third from 61 to 66. A cylindrical projection is shown with the first chain repeated on the right-hand-side of the diagram for clarity. Due to the quasi-infinite nature of the triple helix, covalent bonds are necessary to join the molecule with its symmetry mates both above it and below it along the helical axis. These connections are displayed as well (symmetry-related residues are indicated with #). For example, the N terminus of the first chain is contiguous with the C terminus of the second chain of a symmetry-related molecule. This connects residue 1 with residue #36. Interchain hydrogen bonds are shown with thick diagonal lines. (c) A 2Fo − Fc electron density map with one Pro-Pro-Gly tripeptide displayed from the PPG 2 structure. The map is contoured at 1σ and was generated with SETOR (Evans, 1993). Hydrogen atoms are not shown.
Table 1. Data collection parameters and refinement statistics

                                          PPG 0 (a)        PPG 1                 PPG 2
A. Data collection
Data detection device                     Diffractometer   CAD4 diffractometer   CCD detector
Data collection temp. (°C)                12               −14                   20
High resolution limit (Å)                 2.2              1.97                  1.6
No. of unique reflections                 787              1136                  1836
Overall completeness (%)                  94               100                   86
Completeness (top shell) (%)              -                98 (2.2–1.97 Å)       60 (1.8–1.6 Å)
Rmerge (based on I) (%) (c)               -                -                     4.9 (d)
Space group                               P2₁2₁2₁          P2₁2₁2₁               P2₁2₁2₁
Unit cell dimensions (Å)   a              26.93            26.82                 27.01
                           b              26.42            26.29                 26.42
                           c              20.08            20.18                 20.42
B. Refinement
Resolution (Å)                            Up to 2.2        8–1.97                8–1.6
No. of reflections (F > 2σF)              401 (b)          861                   1736
Rcryst (%)                                30               18.1                  21.3
Peptide non-hydrogen atoms                126              126                   126
Water sites                               21               37                    40
Acetic acid molecules                     0                2                     0
rms deviations from standard geometries
  Bonds (Å)                               -                0.011                 0.009
  Angles (deg.)                           -                2.07                  1.81
  Impropers (deg.)                        -                2.11                  1.99
Average temperature factors (Å²)
  All atoms                               -                15.90                 21.10
  Peptide atoms                           -                13.20                 15.62
  Solvent                                 -                23.44                 38.34

(a) Okuyama et al. (1981).
(b) A different criterion for reflection selection (F ≥ 90) was used.
(c) Rmerge = Σ|Iobs − ⟨I⟩| / ΣI.
(d) Rmerge between PPG 1 and PPG 2 data sets is 15.9% for 967 reflections between 8 and 1.97 Å. Rmerge = Σ|I1 − I2| / ΣI2.
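The agreement statistic quoted in the last footnote of Table 1 compares the PPG 1 and PPG 2 intensities reflection by reflection. A minimal sketch of that calculation follows, assuming the summation signs that are conventional for this statistic, that the two lists hold intensities for the same reflections in the same order, and that they have already been placed on a common scale. The function name and the example numbers are hypothetical.

```python
def r_merge_pair(i1, i2):
    """Pairwise merging R between two data sets: sum|I1 - I2| / sum(I2).

    i1, i2 : sequences of intensities for the same reflections, in the same order,
             assumed to be on a common scale already.
    """
    if len(i1) != len(i2):
        raise ValueError("both data sets must list the same reflections")
    numerator = sum(abs(a - b) for a, b in zip(i1, i2))
    denominator = sum(i2)
    return numerator / denominator


if __name__ == "__main__":
    # Two made-up reflections, purely for illustration.
    print(f"Rmerge = {100 * r_merge_pair([100.0, 50.0], [90.0, 60.0]):.1f}%")
```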
Table 1 gives overall statistics for both models. The two models, obtained following significantly different crystallization and data collection conditions, possess very similar molecular conformation and will be referred to collectively as PPG. The final models show good agreement with data, have R-factors of 18.1% and 21.3% for the PPG 1 and PPG 2 models, respectively, and fit electron density maps well (Figure 1(c)). In addition, the two hydration networks are very similar, though they were determined completely independently. The structure of (Pro-Pro-Gly)10 determined by Okuyama et al. (1981) will be referred to as PPG 0.
Structural description
Because of the greater number of reflections used in the structure determination and refinement of PPG 1 and PPG 2, it was possible to remove 7-fold symmetric non-crystallographic restraints and thus obtain a more accurate structure than was obtained for PPG 0. The PPG 1 and PPG 2 structures differ from that of PPG 0 in the direction of the chain trace (Figure 2). The intrinsic symmetry of the Pro-Pro-Gly sequence and the quasi-infinite nature of the helix yield two very similar possible models, differing primarily by the directionality along the helical axis. By using higher-resolution data and a less-constrained model, discrimination between the two was possible. The reverse model (similar to PPG 0) was investigated but was found to be incorrect (see Materials and Methods). Rigid-body refinement trials using the PPG 0 model (not including water molecules) and the PPG 0 model with a reverse chain trace against PPG 0 data gave further evidence that the reversed model is correct. After several cycles of rigid body refinement using X-PLOR (Brünger, 1992) and the data selection criteria of the original determination, i.e. reflections up to 2.2 Å with F ≥ 90 (399 reflections), the R-factors for the PPG 0 model and the reversed PPG 0 model are 42.0% and 39.1%, respectively. This pattern is observed also if weak data (697 reflections) are included; rigid body refinement yields R-factors of 46.6% for the original model and 42.6% for the reversed model. Given the similarities between the two models, a larger disparity in R-factors would not be expected.

Figure 2. PPG 1 and PPG 2 (a) differ from PPG 0 (b) in the directionality of the chain trace. Considering the molecule in the lower left-hand corner of the unit cell as a reference, in the PPG 1 and PPG 2 structures, this molecule is oriented C→N towards the positive crystallographic c axis. In the PPG 0 structure this molecule is oriented in a reverse way with N→C along the positive c axis. The Figure was generated with MOLSCRIPT (Kraulis, 1991).

Table 2. The rms deviations (Å) between the PPG 1, PPG 2, PPG 0, Gly→Ala and PPG COMP triple-helical structures

                 PPG 1         PPG 0 (a,b)   Gly→Ala (c)   PPG COMP (d)
A. Triple helix (21 residues)
PPG 2            0.23 (0.21)   0.38 (0.28)   0.29 (0.26)   0.51 (0.41)
PPG 1            -             0.43 (0.35)   0.33 (0.29)   0.56 (0.43)
PPG 0            -             -             0.45 (0.35)   0.64 (0.49)
Gly→Ala          -             -             -             0.52 (0.43)
B. One chain (three Pro-Pro-Gly repeat units)
PPG 2            0.23 (0.21)   0.34 (0.27)   0.29 (0.26)   0.29 (0.20)
PPG 1            -             0.41 (0.37)   0.36 (0.31)   0.39 (0.29)
PPG 0            -             -             0.35 (0.30)   0.37 (0.25)
Gly→Ala          -             -             -             0.33 (0.20)
C. One Pro-Pro-Gly repeat unit
PPG 2            0.16 (0.17)   0.23 (0.18)   0.15 (0.12)   0.24 (0.08)
PPG 1            -             0.30 (0.28)   0.22 (0.22)   0.33 (0.19)
PPG 0            -             -             0.14 (0.12)   0.31 (0.21)
Gly→Ala          -             -             -             0.30 (0.14)

The rms deviations have been computed using all non-hydrogen atoms of the structures. The rms deviations computed on backbone atoms (N, Cα, C and O) alone are given in parentheses.
(a) Okuyama et al. (1981).
(b) The PPG 0 model was transformed to the correct chain directionality prior to computation of rms deviations.
(c) Bella et al. (1994), a segment from the regular, Pro-Hyp-Gly portion of the molecule was used in the calculations.
(d) Computationally derived model of Némethy et al. (1992).
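The comparison of the original and reversed PPG 0 models above rests on the crystallographic R-factor. The paper does not spell out the formula, so the sketch below uses the conventional definition, R = Σ| |Fo| − k|Fc| | / Σ|Fo|, with a single overall scale factor k taken as Σ|Fo| / Σ|Fc| when none is supplied. That scaling choice, the function name, and the example amplitudes are assumptions for illustration and do not reproduce the X-PLOR procedure.

```python
def r_factor(f_obs, f_calc, scale=None):
    """Conventional crystallographic R-factor for two lists of structure-factor amplitudes.

    f_obs, f_calc : observed and calculated amplitudes for the same reflections.
    scale         : overall scale k applied to f_calc; if None, sum(f_obs) / sum(f_calc).
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("amplitude lists must have the same length")
    k = sum(f_obs) / sum(f_calc) if scale is None else scale
    numerator = sum(abs(fo - k * fc) for fo, fc in zip(f_obs, f_calc))
    return numerator / sum(f_obs)


if __name__ == "__main__":
    # Made-up amplitudes; a lower R indicates better agreement with the data.
    observed = [10.0, 20.0, 30.0]
    model_a = [12.0, 18.0, 30.0]
    model_b = [15.0, 15.0, 30.0]
    print(f"R(model A) = {100 * r_factor(observed, model_a):.1f}%")
    print(f"R(model B) = {100 * r_factor(observed, model_b):.1f}%")
```

When two candidate models are as similar as the forward and reversed chain traces, only a modest gap between their R-factors is to be expected, as the text notes.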
The main conformational characteristics of the polymer-crystal model for (Pro-Pro-Gly)10 are very similar to those reported by Okuyama et al. (1981) and to those exhibited by the (Pro-Hyp-Gly)n regions of the crystal structure of the Gly→Ala peptide (Bella et al., 1994). The rms deviations among various structures are given in Table 2. In the polymer-crystal model, three identical chains in polyproline II conformation are aligned in parallel and wrap around the triple-helical axis with a stagger of one residue between adjacent chains. The three chains are held together through hydrogen bonds following the Rich & Crick II pattern (Rich & Crick, 1961) between glycyl N-H groups and C=O groups of the proline residues in the X position of the neighboring chain, as was observed in the PPG 0 structure (Figure 1(b)).
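The rms deviations in Table 2 summarize how far equivalent atoms lie from one another in two structures. As a rough sketch of the final step of such a comparison, the function below computes the rms deviation between two coordinate lists, assuming the atoms are already paired one-to-one and the structures have already been superimposed; the least-squares superposition itself is not shown. The function name and example coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (same units as the input) between paired 3D points.

    coords_a, coords_b : sequences of (x, y, z) tuples for equivalent atoms,
                         assumed to be superimposed already.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    b = [(0.0, 0.0, 0.3), (1.0, 0.4, 0.0)]
    print(f"rmsd = {rmsd(a, b):.3f}")
```

Restricting the input lists to N, Cα, C and O atoms would give the backbone-only values of the kind shown in parentheses in Table 2.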
The φ and ψ conformational angles of the final structures are typical of a polyproline II conformation, and are very close to those reported in previous studies of model polypeptides with collagen-like sequences (Table 3). The helical symmetry of the models is almost exactly 7₅. Helical twist parameters of the PPG 1 and PPG 2 structures are given in Table 4. Although non-crystallographic symmetry restraints were applied during the first part of the refinement, their removal in later stages did not produce significant changes in the agreement with the X-ray data, indicating that the final models do not deviate significantly from the symmetrical one. Accordingly, the rms deviations between the final unrestrained models and an idealized 7-fold model are 0.32 Å (0.24 Å) and 0.23 Å (0.18 Å) for PPG 1 and PPG 2, respectively; rms deviations computed on backbone atoms alone are given in parentheses.

Table 3. Averaged values of PPG main chain dihedral angles. The values are compared with those of Gly→Ala, PPG 0 and native collagen structures. Standard deviations are given in parentheses.

Torsion angle   PPG 0 (a)   PPG 1 (this work)   PPG 2 (this work)   Gly→Ala (b)     Collagen fiber (c)
ω ProX          178.2       178.6 (0.6)         177.8 (0.7)         179.9 (1.8)     180.0
φ ProX          −75.5       −73.1 (8.8)         −75.0 (2.7)         −72.6 (7.6)     −72.1
ψ ProX          152.0       159.7 (3.7)         161.4 (3.1)         163.8 (8.8)     164.3
ω ProY          −176.8      179.1 (1.2)         176.7 (2.1)         178.5 (1.5)     180.0
φ ProY          −62.6       −58.7 (8.1)         −61.2 (1.1)         −59.6 (7.3)     −75.0
ψ ProY          147.2       161.0 (12.4)        153.3 (2.2)         149.8 (8.8)     155.8
ω Gly           178.2       179.6 (0.5)         −179.9 (0.2)        177.3 (3.1)     180.0
φ Gly           −70.2       −83.7 (11.2)        −75.8 (2.0)         −71.9 (9.6)     −67.6
ψ Gly           175.4       179.8 (5.8)         179.5 (3.5)         174.1 (11.9)    151.4

(a) Okuyama et al. (1981).
(b) Bella et al. (1994), Ala residues are classified with Gly residues.
(c) Fraser et al. (1979).

Table 4. Helical parameters of PPG 1, PPG 2 and PPG 0, Gly→Ala and native collagen (standard deviations in parentheses)

                               PPG 0 (a)   PPG 1 (b) (this work)   PPG 2 (b) (this work)   Gly→Ala (c)   Collagen fiber (d)
Helix twist height D (Å)       8.6         8.65 (0.08)             8.75 (0.03)             8.4           8.6
Helix twist angle θ (deg.)     51.4        51.4 (2.6)              51.4 (1.9)              60            36

(a) Okuyama et al. (1981).
(b) Calculated by placing the triple-helical axis parallel with the crystallographic c axis and measuring the rotation (θ) and translation (D) needed to superimpose each triplet with the next along the chain.
(c) Bella et al. (1994); values include the alanine substitution zone as well as the Pro-Hyp-Gly regions of the structure.
(d) Fraser et al. (1979).
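The main-chain angles in Table 3 are torsion angles, each defined by four consecutive atoms (for example C(i−1), N, Cα, C for φ). The following is a minimal sketch of how such an angle can be computed from Cartesian coordinates using a standard vector-projection formula; the function names are illustrative, and the sign convention assumed here is the usual one in which a clockwise rotation of the far bond, viewed along the central bond, is positive.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Torsion angle p0-p1-p2-p3 in degrees, in the range (-180, 180]."""
    b0 = _sub(p0, p1)          # points back from the central bond to p0
    b1 = _sub(p2, p1)          # central bond
    b2 = _sub(p3, p2)          # points forward from the central bond to p3
    n = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / n, b1[1] / n, b1[2] / n)
    # Remove the components along the central bond, leaving the projections
    # of the two outer bonds onto the plane perpendicular to it.
    v = tuple(b0[i] - _dot(b0, b1) * b1[i] for i in range(3))
    w = tuple(b2[i] - _dot(b2, b1) * b1[i] for i in range(3))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # A planar zig-zag (trans) arrangement, as for an omega angle near 180 degrees.
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1)))
```

Averaging such values over all ProX, ProY and Gly residues of a model would give per-position means and standard deviations of the kind tabulated above.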
The average geometrical parameters for the interchain hydrogen bonds are similar to those observed for the Gly→Ala peptide. The average Gly N to Pro C=O distances are 3.01 Å and 2.96 Å for the PPG 1 and PPG 2 structures, respectively. These compare well with 2.94 Å for the Gly→Ala structure. The average N···O=C angles are 165° and 166° for the PPG 1 and PPG 2 structures, respectively. Again, these compare well with the Gly→Ala value of 163°.
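The two quantities quoted for each interchain hydrogen bond, the N to O distance and the N···O=C angle, follow directly from three atomic positions. A small sketch is given below, assuming Cartesian coordinates in Å for the glycine N, the carbonyl O and the carbonyl C; the function names and the example coordinates are hypothetical and are not taken from the deposited models.

```python
import math


def angle_deg(a, b, c):
    """Angle a-b-c in degrees, measured at the middle point b."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.sqrt(sum(x * x for x in v1)) * math.sqrt(sum(y * y for y in v2))
    cos_t = max(-1.0, min(1.0, dot / norm))   # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(cos_t))


def hbond_geometry(n_atom, o_atom, c_atom):
    """Return (N...O distance, N...O=C angle) for a donor N and an acceptor C=O group."""
    return math.dist(n_atom, o_atom), angle_deg(n_atom, o_atom, c_atom)


if __name__ == "__main__":
    d, ang = hbond_geometry((0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 1.2, 0.0))
    print(f"N...O = {d:.2f} A, N...O=C = {ang:.0f} deg")
```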
In addition to the Rich & Crick II interchain hydrogen bonds, the Gly→Ala structure showed evidence of Cα-H···O=C hydrogen bonds (Bella & Berman, 1996). With the exception of the central disruption zone of the molecule, both Hα1 and Hα2 of Gly residues interact with the Gly C=O group of a neighboring chain, creating a bifurcated hydrogen bond. In addition, Hα1 makes a hydrogen bonded interaction with the Pro C=O, thus forming a three-centered hydrogen bond. An additional pattern was observed involving the Hα from the Hyp residue, which interacts with the Pro C=O. Hydrogen atoms from the Pro residues are directed into the solvent and were not considered. The PPG 1 and PPG 2 structures show analogous patterns. Hydrogen bonding distances and angles are given in Table 5.
The puckering of the imino acid rings is dependent on the position of the Pro residue. In general, those in the X position show downward puckering, whereas those in the Y position display upward puckering. The geometries of the upward and downward conformations were described by Momany et al. (1975). This puckering behavior was previously reported for (Pro-Pro-Gly)10 (Okuyama et al., 1981), although the resolution of the data did not allow for discrimination between these two conformations other than by R-factor. This pattern of downward puckering in the X position and upward puckering in the Y position has been described for collagen or collagen-like peptides in various other experiments: the fiber X-ray diffraction patterns of native collagen (Fraser et al., 1979); the 2D-NMR measurements of the collagen-like peptide (Pro-Pro-Gly)10 (Li et al., 1993); and the crystal structure of the Gly→Ala peptide (Bella et al., 1994). In contrast to Hyp-containing triple helices, where direct water interactions may also play a role in the conformation of the Y position imino acid, the pattern observed for (Pro-Pro-Gly)10 must necessarily arise from either conformational effects derived from the different favored values of the backbone and/or side-chain torsion angles for Pro in the X or Y position, from local steric effects, or from indirect hydration effects.

Table 5. Average selected hydrogen bonding parameters for PPG 1 and PPG 2 compared with Gly→Ala (Bella & Berman, 1996)

Interatomic distances (Å)          PPG 1         PPG 2         Gly→Ala (a)
A. N-H···O=C hydrogen bonds
HN Gly···O Pro X                   2.14 (0.11)   2.05 (0.07)   2.06 (0.07)
N Gly···O Pro X                    3.01 (0.05)   2.96 (0.07)   2.94 (0.08)
B. Cα-H···O=C hydrogen bonds
Hα1 Gly···O Gly                    2.65 (0.14)   2.56 (0.05)   2.63 (0.20)
Hα2 Gly···O Gly                    2.91 (0.14)   2.85 (0.08)   2.79 (0.16)
Cα Gly···O Gly                     3.21 (0.12)   3.13 (0.05)   3.15 (0.15)
Hα1 Gly···O Pro X                  2.51 (0.17)   2.45 (0.07)   2.41 (0.18)
Cα Gly···O Pro X                   3.55 (0.16)   3.49 (0.07)   3.46 (0.18)
Hα Pro Y···O Pro X                 2.48 (0.10)   2.39 (0.04)   2.52 (0.19)
Cα Pro Y···O Pro X                 3.36 (0.12)   3.29 (0.04)   3.41 (0.16)

Interatomic angles (deg.)          PPG 1         PPG 2         Gly→Ala (a)
A. N-H···O=C hydrogen bonds
N-H Gly···O Pro X                  147 (9)       153 (4)       150 (4)
H Gly···O=C Pro X                  156 (7)       157 (3)       154 (5)
N Gly···O=C Pro X                  165 (5)       166 (2)       163 (5)
B. Cα-H···O=C hydrogen bonds
Cα-Hα1 Gly···O Gly                 112 (6)       112 (3)       109 (7)
Cα-Hα2 Gly···O Gly                 96 (5)        94 (2)        100 (8)
Hα1 Gly···O=C Gly                  95 (5)        97 (1)        91 (5)
Hα2 Gly···O=C Gly                  116 (6)       118 (1)       110 (6)
Cα Gly···O=C Gly                   103 (5)       104 (1)       99 (5)
Cα-Hα1 Gly···O Pro X               164 (6)       161 (3)       165 (6)
Hα1 Gly···O=C Pro X                111 (6)       109 (3)       113 (8)
Cα Gly···O=C Pro X                 115 (5)       114 (3)       117 (8)
Cα-Hα Pro Y···O Pro X              138 (6)       140 (1)       140 (5)
Hα Pro Y···O=C Pro X               129 (4)       130 (2)       126 (5)
Cα Pro Y···O=C Pro X               139 (5)       140 (2)       136 (5)

Hydrogen atoms have been placed based on the crystal coordinates of the heavier atoms, using X-PLOR default parameters (Brünger, 1991). Standard deviations are shown in parentheses.
(a) The proline residue in the Y position is hydroxyproline.
In the PPG 1 structure, there appears to be one exception to the general puckering preference. This occurs at residue 65. This Y position proline ring puckers in the down conformation rather than in the expected up conformation. Efforts to model this ring with the reverse pucker only raised the R-factor and the model refined back to the original down conformation. Electron density maps also seem to confirm that this residue is in the down orientation. The proline ring at the same position in the PPG 2 structure is puckered in the up conformation. The observation of two different puckering conformations in two similar structures indicates the potential flexibility of the proline ring. Recent theoretical studies corroborate this observation (Némethy et al., 1992). Their results suggest that ring puckering is not immutable and that such interchanges may be more readily accomplished in the Y position than in the X position.
Crystal packing
The distribution of the triple helices follows a pattern that can be envisioned by the position of the intersections of their helical axes with the 001 plane. The intersection points display a tiling made of regular squares and triangles, as was first suggested by Okuyama et al. (1981). In this fashion, every helix is five-coordinated and two different kinds of clusters appear: a square cluster in which helices placed diagonally run in parallel and are antiparallel with those of the other diagonal, and a triangle cluster in which two helices run in parallel and opposite to the third one, no matter which triangle or square is considered (Figure 2(a)). Because of this mixed-parallel nature of the molecules, there is only quasi-tetragonal symmetry and the structure falls instead into the P2₁2₁2₁ space group with 2-fold rather than 4-fold symmetry. Aperiodic lattices made of squares and triangles have been invoked to account for the pseudo-hexagonal pattern of lateral packing between collagen triple helices (Sasisekharan & Bansal, 1990). This tiling provides a useful classification tool for the analysis of the water distribution.
Hydration analysis
From the early stages of the refinement, electron density maps displayed a considerable number of maxima that by their shape, distance to main-chain atoms and orientation, were good candidates for water molecules. Non-crystallographic symmetry restraints have not been applied to the water molecules, in contrast to the procedure utilized for the PPG 0 structure. As solvent molecules are more dependent on local environments and the true crystallographic packing symmetry is incompatible with 7-fold symmetry, water molecules must be distributed in a non-symmetric way that is dependent on the packing arrangement of the triple-helical molecules.
Prior to the addition of any water molecules, significant density appeared in the Fourier maps in the "triangle" regions, but very little density was apparent in the "square" regions. After 28 water molecules had been included in the PPG 1 model, density for reasonably well-defined water molecules in the square regions became evident (Figure 3(a)). In this way it is clear that the addition of the initial water molecules enhances the phasing of the entire structure. The situation was similar for the PPG 2 structure, in which an automated water-picking procedure was used and the first water molecules chosen were in the triangle regions. In both models, the density in the square regions remained much more diffuse than in the regions of closer intermolecular contact; water molecules in the square regions correspond to lower peaks in the electron density maps and do not participate as readily in discernible interwater links. This suggests that the water in the square regions is somewhat less ordered than that in other areas, leading possibly to a bulk solvent channel. This may lead to increased disorder of the solvent structure in this area as well as of the portion of the molecule contacting this region. For example, the reverse-puckered prolyl ring at position 65 in the PPG 1 structure occurs in the square region of the packing.
The final PPG 1 polymer-crystal model for (Pro-Pro-Gly)10 contains 37 water molecules and two acetic acid molecules. The PPG 2 model contains 40 water molecules. In both structures these represent average solvation positions along the extended unit cell. Because of the differences in the crystallization conditions, data collection techniques and resolution, small differences in the solvent distribution would be expected between the two models. The majority of the water molecules participate in extensive hydrogen bonding with peptide carbonyl groups and/or other water molecules, in a way that is clearly reminiscent of what has been observed for the Gly→Ala peptide (Bella et al., 1995) and comprises a coherent water network around the triple helix that can be divided into multiple hydration shells.
The first hydration shell contains 20 water molecules that are directly bound to the peptide chain.
Figure 3. (a) Molecular packing looking down the helical axis of PPG. The packing can be viewed as having two general regions, one triangular and the other square (shown with broken purple lines). The water molecules in the square regions appear to be less ordered and more diffuse. The hydration pattern in the center of the square regions maintains a pseudo-tetragonal distribution. Helices are surrounded by five nearest-neighbors that vary in distance from 13.5 Å to 13.9 Å. Two other neighbors, across the square regions, are 19.4 Å and 19.6 Å away. The unit cell is shown with thin, broken lines. (b) The packing of Gly→Ala is hexagonally closest packed with the six generally similar interhelical distances ranging from 14.0 Å to 14.9 Å. The molecules of Gly→Ala appear thicker than those of PPG because of the additional length of the molecule (90 amino acid residues compared with 21) and the unwinding of one end of the Gly→Ala helix with respect to the other. All distances are in Å. The Figure was generated with CHAIN (Sack, 1988).
On average, the water molecules are positioned 2.97 Å and 2.81 Å from carbonyl groups for PPG 1 and PPG 2, respectively. This shell is characterized by a repetitive pattern in which one water molecule is bound to the glycyl carbonyl group (Figure 4(a)) and two are bound to the prolyl carbonyl group in the Y position (Figure 4(b)). As was observed in the Gly→Ala structure, the Gly carbonyl group points slightly more towards the molecule than the Pro carbonyl group in the Y position and the second water position is occupied by the Cα of a glycine residue from a neighboring chain. As a result, Hα1 and Hα2 from the neighboring Cα make hydrogen bonding contacts to the Gly carbonyl group (Table 5). The two positions on the Y prolyl carbonyl group can be termed WN and WA according to their proximity to the nitrogen and the α-carbon, respectively†. These water bridges satisfy all backbone polar groups, since the carbonyl groups of the proline residues in the X position participate in interchain hydrogen bonds with glycine N-H groups. In this way, the first shell hydration pattern involving carbonyl groups is identical with that reported in the Gly→Ala structure. This demonstrates that this portion of the pattern is sequence/hydroxyproline-independent and can be a general feature of the triple-helical motif.
† A numbering scheme has been developed for these water molecules in which those attached to the first chain are numbered 101 to 109, the second chain 111 to 119 and third chain 121 to 129. They then form groups of three in which the first is in the WA position on the prolyl carbonyl group, the second is that in the WN position, and the third is attached to the glycyl carbonyl group. All second and third shell water molecules are given numbers in the 200s.
The sole exception in this pattern occurs in both
the PPG 1 and the PPG 2 models at residue 35.
Here, at this Y position proline residue, one of the
two expected water molecules is missing; the
water molecule in the WN position is present and
that in the WA position is absent. This can be
explained by the proximity of the carbonyl group
to the less-dense square region of the molecular
packing (Figure 3(a)). A water molecule in this position would fall into the square region.
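The water numbering scheme given in the footnote can be read as a small decoding rule. The sketch below is one possible reading, not the authors' code: it assumes that the groups of three cycle in the order WA, WN, Gly within each chain's block of nine numbers, and the function name `classify_water` and the labels "first" and "outer" are hypothetical.

```python
def classify_water(number):
    """Decode a water number into (shell, chain, site).

    Assumption: first-shell waters run 101-109, 111-119 and 121-129, and
    within each chain they cycle through WA, WN and Gly in that order.
    Numbers in the 200s are second or third shell waters.
    """
    if 200 <= number <= 299:
        return ("outer", None, None)
    chain_offset, index = divmod(number - 100, 10)
    if chain_offset not in (0, 1, 2) or index == 0:
        raise ValueError(f"{number} is outside the numbering scheme")
    sites = ("WA", "WN", "Gly")
    return ("first", chain_offset + 1, sites[(index - 1) % 3])


# Waters named in the Figure 7 legend and in the text
for n in (106, 115, 102, 121, 201):
    print(n, classify_water(n))
```

Under this reading, waters 115 and 102 sit in WN positions and 106 and 109 are attached to glycyl carbonyl groups, which agrees with the description of interchain bridges as linking a glycyl carbonyl group to the WN position of a Y position proline residue.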
The repetitive regularity of the first hydration shell is further demonstrated by the similarity between the PPG 1 and PPG 2 structures. The rms deviation between the water molecules of the first hydration shell of these two structures is 0.85 Å. The long distance between one of these water molecules in the PPG 1 structure (water 121) and the carbonyl group can be explained by its proximity to the square region, where the water seems to be generally much less ordered. As a result, the position of this water molecule may not be well defined. The uniformity between the two structures is particularly interesting in that these water molecules represent average positions along the triple helix (because of the reduced size of the asymmetric unit), indicating that these positions are extremely well conserved.
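For reference, an rms deviation between two sets of matched water positions can be computed as in the minimal sketch below. It assumes that the two structures are already in a common frame and that the water molecules have been paired one-to-one; the text does not describe how the pairing or superposition was done, and the function name `rmsd` is illustrative.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (in the units of the input, e.g. Å)
    between two equally long lists of paired (x, y, z) positions."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical coordinates, for illustration only
set_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
set_2 = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)]
print(round(rmsd(set_1, set_2), 3))
```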
Proline residues in the Y position do not contain a hydrophilic hydroxyl group, but still are surrounded by ordered water that would fall within a reasonable hydrogen bonding distance from a simulated hydroxyl group (Figure 4(c)). When the averaged water positions surrounding such a hypothetical hydroxyl group are superimposed with those found for the Gly→Ala peptide, the averaged positions from PPG are close but not exactly coincident with those of Gly→Ala. However, diffuse portions of the PPG averaged water density fall on both the WB and WD2 positions (Bella et al., 1995) of the Gly→Ala structure. It can be proposed that the water surrounding the Y position proline, once in the presence of a hydroxyproline residue, becomes better localized and is thereby shifted into the correct bridge-building geometry. This would be accomplished without much loss in entropy, since the water molecules already occupy ordered positions, but with a gain in enthalpy through the formation of hydrogen bonded contacts. These observations indicate that the existence of localized water surrounding the Y position imino acid is itself not dependent on hydroxyproline, but the presence of hydroxyproline induces the formation of additional water bridges, greater localization, and a more extensive hydration network. X position proline rings are not similarly surrounded by ordered water positions.
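The selection of water molecules around a modeled hydroxyl oxygen (3.5 Å in the legend to Figure 4) amounts to a simple distance cutoff followed by averaging. The sketch below illustrates that step only; it is not the contouring method of Schneider et al. (1993), and the coordinates and function names are hypothetical.

```python
import math


def waters_within_cutoff(reference, waters, cutoff):
    """Return the water positions lying within `cutoff` Å of `reference`."""
    return [w for w in waters if math.dist(reference, w) <= cutoff]


def mean_position(points):
    """Average (x, y, z) of a non-empty list of positions."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


# Hypothetical positions (Å) around a simulated hydroxyl oxygen at the origin
hydroxyl_o = (0.0, 0.0, 0.0)
waters = [(0.0, 0.0, 3.0), (0.0, 0.0, 3.6), (2.0, 2.0, 2.0)]
selected = waters_within_cutoff(hydroxyl_o, waters, cutoff=3.5)
print(selected, mean_position(selected))
```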
Additional water molecules form a second hydration shell, that is, water molecules that are bound to those bound to the peptide chain. These water molecules form repetitive bridges that are similar to the α (intrachain), β (interchain) and ω (intermolecular) bridges connecting carbonyl groups as described for the Gly→Ala structure (Bella et al., 1995; and see Figure 5(a)). In the PPG 1 structure, acetic acid molecules take the positions of water molecules and participate in bridges. The second hydration shell, in general, forms three-water molecule intra- and interchain bridges (α3 and β3 bridges)† connecting water molecules of the first hydration shell. In some cases, a water molecule from the first hydration shell of one peptide molecule also belongs to the second hydration shell of another symmetry-related peptide molecule. These bridges are quite repetitive and fundamentally pentagonal in shape, with carbonyl groups occupying the additional two apices of the pentagon (Figure 5(b)). Five of the six potential hydration positions occupied by acetic acid in the PPG 1 structure are occupied by water molecules in the PPG 2 structure. The second hydration shells of PPG 1 and PPG 2 have 18 hydration positions in common, including the five positions from acetic acid molecules. The rms deviation of these 18 water positions between the PPG 1 and PPG 2 structures is 0.86 Å. Included in these 18 water molecules are the pseudo 4-fold water positions that can be seen in the center of the square region in Figure 3(a). Thus, even within this region of decreased order there is still a large degree of structural similarity.
† When considering bridges, the usual requirements for hydrogen bonding distances have been taken rather loosely because of the averaged nature of the structure. Long and short distances can be considered to be an effect of the averaged nature of the structure. In general, the overall appearance of the bridge was considered.
Figure 4. Water distribution diagrams around the carbonyl groups of (a) glycine and (b) the Y position proline residues of PPG. Water molecules were chosen using a 3.25 Å cutoff from the carbonyl group. The method of Schneider et al. (1993) was used to calculate three-dimensional contours. These positions are very similar to those shown for the Gly→Ala peptide by Bella et al. (1995), wherein the glycine carbonyl group has one water bonding position and the proline carbonyl group has two. The water positions are labeled WN or WA according to their proximity to the N or Cα atoms, respectively. (c) Water positions surrounding the ring of proline in the Y position. A hydroxyl group (shown in red) was modeled at the Cγ position of the proline ring. Water molecules within 3.5 Å of the simulated Oδ were selected (red) and contours were calculated with these water molecules (also shown in red). The resulting contours and averaged water positions were superimposed on those from hydroxyproline from the Gly→Ala structure. The contours and water molecules from the Gly→Ala structure are shown in blue, and are labeled WD1, WD2 and WB according to their respective proximity to Cδ and Cβ. The superimposition demonstrates that while the general positions between the two structures are different, it is conceivable that a hydroxyproline residue could direct the water molecules around the Y proline residue into the positions from the Gly→Ala structure. The Figure was generated with CHAIN (Sack, 1988).
Figure 5. Examples of hydration structure in the (a) Gly→Ala structure (b) PPG 1 or PPG 2 and (c) PPG 0 structures. In (a) and (b) two water molecules are bound to the carbonyl group of the Y position proline residue (WA and WN) and one water molecule is bound to the glycine carbonyl group (WN). In (a) two additional water molecules are bound to Oδ of the hydroxyproline residue. Inter- and intrachain water bridges are then formed by interconnecting water molecules. In general, the water structure of PPG shows repetitive pentagonal-like inter- and intrachain bridges between carbonyl groups. In (c) two water molecules, W1 and W2, are bound to the carbonyl group of the Y position proline residue. W1 is also bound to the glycine carbonyl group, forming a one water molecule intrachain bridge. W1 and W2 are connected by W3, forming a three water molecule interchain bridge similar to that seen in (a) and (b). The Figure was generated with MOLSCRIPT (Kraulis, 1991).
The α bridges (Figure 6(a)) connect the Y position prolyl carbonyl group with the immediately following glycyl carbonyl group, utilizing the water molecule in the WA position on the proline carbonyl group. The β bridges (Figure 6(b)) connect the glycyl carbonyl group in one chain with the Y position prolyl carbonyl group in the adjacent chain, utilizing the WN position of the proline carbonyl group. Thus, the water molecule that is attached to the glycyl carbonyl group participates in two bridges (one inter- and one intrachain). Interchain β3 bridges are made between the same chains, as are the interchain hydrogen bonds, thus reinforcing the triple-helical structure. In a few cases, the α-bridge pentagons are distorted by proline rings from neighboring helices that occupy the position in which the water molecule would seem most naturally to fit and two water bridges can be envisioned. In other cases, interchain distances are bridged by four water β bridges or have one particularly long leg. These perturbations can be seen as a consequence of the proximity of the bridge to the square region or an interfering proline ring.
A variety of ω bridges are also observed, i.e. bridges between different neighboring helices. Figure 7 shows an example of an ω bridge formed by the intersection of two interchain bridges. The length of these bridges is dependent on their location with regard to interhelical packing, and may include two, three or four water molecules. As in the Gly→Ala structure, there are no direct contacts between the peptide molecules themselves. Any intermolecular interactions occur through water molecules and ω bridges. In contrast to Gly→Ala, few water molecules in a third hydration shell are seen. In the regions of close packing (triangular and interhelical areas), the peptide molecules are too close to allow a third layer. In this region, water molecules in the second hydration shell from one helix may become the first shell water molecules of an adjacent symmetry-related helix or there may be interaction between second shell water molecules making the third shell water molecules from one helix the second shell water molecules from another. In the regions of less dense packing (square areas) the distances between helices are greater. While ordered water molecules do appear in these regions, there are overall fewer and the bridging patterns are less distinct. Hence most of the ordered water molecules lie in the triangular or interhelical regions.
The PPG 1 and PPG 2 determinations differ
essentially by four water positions in the PPG 1
structure and two in the PPG 2 structure. Three of
the four water molecules from the PPG 1 structure
lie in the square region.
Comparison of packing with that of the Gly→Ala structure
While the Gly→Ala and PPG structures have similar molecular conformation (in terms of φ/ψ angles, puckering, and interchain hydrogen bonding) and first hydration shell patterns where carbonyl groups are involved, the supermolecular arrangements of the molecules are different. In the Gly→Ala structure, the triple helices pack in a way that is reminiscent of the putative quasi-hexagonal closest packing of collagen fibrils (Figure 3(b)). In PPG, the distribution of the molecules is different (Figure 3(a)). As mentioned above, the packing can be described by a series of intersecting triangles and squares in which each triple helix is surrounded by five close neighbors and two neighbors that are further away across the square region. In this way, the molecules are not as equivalently or symmetrically placed as they are in the Gly→Ala structure. Comparing interhelical distances, the molecules of Gly→Ala are all separated by about 14 to 15 Å, producing six essentially equivalent interactions for any one helix, whereas in PPG, the interhelical distances are more varied. In the triangle regions, the five helices are about 13 to 14 Å apart, slightly smaller, but similar to Gly→Ala. However, across the square region the helices are about 19 Å apart.
These two distinctly different forms of packing indicate that the extensive water network that hydroxyproline induces is related to the determination of lateral molecular packing and therefore supermolecular structure. As collagen is required to form fibrils and other higher-order structures, the interaction with other molecules is critical.
Figure 6. Water bridging patterns connecting carbonyl groups along the chain. (a) Intrachain α bridges may utilize either two or three water molecules. In one case an α bridge is missing due to the absence of a water molecule in the WA position on residue 35. (b) Interchain β bridges may incorporate three or four water molecules. In the PPG 1 structure an acetate molecule participates in the β bridge connecting residue 33 with residue 62, occupying two of the four hydration positions. Long bridging distances are marked with an asterisk (*). These bridges were included as they give the general appearance of a β bridge. Glycine Cα atoms have been omitted for clarity.
Comparison with the PPG 0 structure
Although the interchain hydrogen bonds in PPG 0 maintain a normal distance (2.86 Å between glycyl N-H groups and C=O groups of the proline residues in the X position), the average N···O=C angles for the PPG 1 and PPG 2 structures (165° and 166°, respectively) are different when compared with those of PPG 0 (152°). Overall, the rms deviations demonstrate that PPG 1 and PPG 2 are closer in form to Gly→Ala than to PPG 0 (Table 2).
While the first hydration shell that is observed in PPG 1 and PPG 2 is very similar to that of Gly→Ala, it is significantly different from that reported for PPG 0 (Okuyama et al., 1981), in which one water molecule (W1) links the carbonyl group of the Y proline residue and the following glycine residue (Figure 5(c)). It was proposed that this water molecule could stabilize the glycine conformation and therefore the triple-helical structure. The angle that the hydrogen bonds make in this case is 68°. Such a small hydrogen bonding angle seems unlikely to be stable, and it is likely that there is a lower-energy way to make an intrachain water bridge. The PPG 1 and PPG 2 structures demonstrate that this direct link does not exist, and the stabilization of the triple helix must arise from more extensive water bridges. A second water molecule (W2) in PPG 0 attached to the Y prolyl carbonyl group is analogous to the water molecule that has been observed on the Y position proline residue in the WN position. An interchain water bridge was observed involving three water molecules wherein W1 and W2 are connected by a third water molecule (W3). This bridge is nearly identical with the interchain β3 bridges observed in the PPG 1 and PPG 2 structures.
Figure 7. An example of interhelical ω bridges. The symmetry-related helix is shown in dark gray; symmetry-related water molecules are marked with an (*). This bridge pattern can be viewed as having ω2 (through water molecules 106 and *102) and several ω4 (through water molecules 115, 201, *204 and *109, for example) connections. The pattern is formed by the intersection of two β3 bridges (the first through water molecules 115, 201 and 106, and the second through water molecules *109, *204 and *102). This bridge occurs in an interhelical triangular region. This Figure demonstrates how interchain bridges span about one-seventh of the way around the helix. Water molecules are shown in light gray and carbonyl oxygen atoms involved in the bridges are shown in black. The Figure was generated with MOLSCRIPT (Kraulis, 1991).
Conclusions
We present two high-resolution structures of a long-studied collagen-like polypeptide that improve a previous structural determination (Okuyama et al., 1981). In comparison, the structures presented here display a reversal in chain direction and a different hydration pattern. Although the molecules in the crystals retain a polymer-like organization, a high-resolution averaged model for a triple-helical structure with a Pro-Pro-Gly sequence can be described.
Two separate structural determinations were made using different crystallization conditions, data collection temperatures, and X-ray sources; yet the results are essentially the same. This serves to demonstrate that the findings are not condition-dependent, but rather are representative of the sequence. The failure to get completely ordered crystals, despite varied conditions, shows that the non-specific packing observed can be a consequence of the regularity of this sequence. This can be indirect evidence of the importance of some sequence variety and specificity for correct lateral assembly in native collagen.
In the two determinations, the peptide structures are quite similar to each other and show close agreement with the first atomic resolution structure of a triple helix (Bella et al., 1994). The present model shows a clear pattern for the puckering of the imino acids (Pro in X = down, Pro in Y = up) consistent with previous findings (Fraser et al., 1979) and is similar to that observed in the Pro-Hyp-Gly regions of Gly→Ala but demonstrates that the Y position has the potential to be flexible, and may adopt the alternative pucker. The close similarity of molecular structure between these two high-resolution structures confirms that the presence of hydroxyproline does not directly affect the molecular structure in an imino acid-rich region of collagen and therefore the structural stability of the triple helix related to hydroxyproline arises solely from protein-water interactions.
The first hydration shells of PPG 1 and PPG 2 also display a high level of agreement with each other and with the Gly→Ala structure. Differences among the structures occur primarily in the extended water structure. Further, the ordered hydration found around the proline ring in the Y position demonstrates that even in strongly hydrophobic regions, the triple helix maintains extensive hydration. This indicates that while hydroxyproline is not necessary for hydration, its presence adds stability and interconnectivity to the water network that may be necessary for the functioning of native collagen. This involvement of hydroxyproline was suggested by the Gly→Ala structure (Bella et al., 1994, 1995) but the dissimilarity of its packing with that of the (Pro-Pro-Gly)10 structures demonstrates that this role for the extended hydration network induced by hydroxyproline is more extensive and directly related to lateral assembly and supermolecular structure.
Materials and Methods
Crystallization experiments
Two separate, independent sets of crystallization experiments were performed using the peptide (Pro-Pro-Gly)10, yielding crystals grown under different conditions. In both sets of trials (PPG 1 and PPG 2) the hanging-drop vapor diffusion technique was employed.
In the first set of experiments (PPG 1), (Pro-Pro-Gly)10 was purchased from Peptides International. X-ray diffraction quality crystals were obtained at 4 °C from 10 µl drops containing initial concentrations of 4.0 mg/ml of peptide dissolved in 10% (v/v) acetic acid, 0.1% (w/v) sodium azide, and 3.0% (w/v) PEG 400, equilibrated against a reservoir containing 1 ml of 6.0% PEG 400. The crystals were orthorhombic in shape with typical dimensions of approximately 0.2 mm × 0.2 mm × 0.1 mm. In this particular setting, acetic acid migration from the drop to the reservoir produced a gradual increase in the pH value of the drop, which resulted in nucleation processes and eventually in the appearance of single crystals.
The second set of crystallization experiments (PPG 2) utilized peptide purchased from Peninsula Laboratories Europe LTD. Small square plates of lengths ranging from 0.01 to 0.20 mm were grown within one to two weeks. Drops (10 µl) containing 7.5 mg/ml peptide (dissolved in 5% (v/v) aqueous acetic acid) and 0.05 M sodium acetate were equilibrated at room temperature against 1.0 ml reservoirs of 0.1 M acetate buffer at pH 5.5. A mass spectroscopic analysis of dissolved crystals indicated that they were composed entirely of chains that were ten triplets long.
Diffraction experiments on a PPG 1 crystal with a maximum dimension of 0.2 mm were carried out at −14 °C on an Enraf Nonius CAD4 diffractometer using CuKα radiation. Data up to a maximum resolution of 1.97 Å were collected. The majority of the observed diffraction data could be indexed in an orthorhombic unit cell with dimensions a = 26.82 Å, b = 26.29 Å and c = 20.18 Å. The space group was determined to be P2₁2₁2₁, with one triple-helical molecule in the asymmetric unit. Intensity measurements were corrected for Lorentz-polarization and absorption with MOLEN (Fair, 1990; and see Table 1).
Intensity data from a PPG 2 crystal were collected at room temperature at the A1 beamline of the Cornell High Energy Synchrotron Source (CHESS) using the oscillation method and a wavelength of 0.91 Å. In all, 118 images were recorded on a CCD detector at a distance of 47 mm, with an oscillation angle of 1° from a single crystal with a maximum length of 0.05 mm. The effective resolution of the data was 1.7 Å, although partially complete data up to 1.6 Å were also collected and used in refinement. The images were indexed and integrated using DENZO (Otwinowski, 1993) and merged with SCALEPACK (Minor, 1993). The overall Rmerge (on I) is 0.049 with a completeness of 86% and a mosaicity of 0.8°. Again, the space group was determined to be P2₁2₁2₁ with cell dimensions a = 27.01 Å, b = 26.42 Å and c = 20.42 Å (Table 1). A complete data set was also collected on a flash-frozen crystal. However, the freezing increased the crystal disorder. These data were therefore not used.
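As a reminder of what the quoted Rmerge (on I) measures, the sketch below evaluates the conventional definition, the summed absolute deviations of repeated intensity measurements from their mean divided by the summed intensities. It is an illustration under stated assumptions, not a description of what SCALEPACK does internally: scaling, rejection and symmetry reduction are assumed to have been applied already, and the intensities are hypothetical.

```python
def r_merge(measurements):
    """Conventional Rmerge on intensities.

    `measurements` maps each unique reflection (h, k, l) to the list of
    its repeated intensity observations. Reflections measured only once
    contribute to the denominator but not to the numerator.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in measurements.values():
        mean_i = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_i) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


# Hypothetical observations, for illustration only
observations = {
    (1, 0, 0): [100.0, 110.0],
    (0, 1, 0): [50.0, 50.0],
}
print(round(r_merge(observations), 4))
```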
Figure 8. The subcell of (Pro-Pro-Gly)10. The entire 86 Å model is shown in dark gray and the seven residue model used in molecular replacement is shown in light gray. The subcell is clearly not large enough to hold the entire molecule and is exactly the height of one triple-helical repeat. The Figure was generated with MOLSCRIPT (Kraulis, 1991).
The predicted length of a 30 residue collagen triple helix is about 86 Å. Thus, the observed unit cell represents a subcell (Figure 8). Both diffraction sets also show evidence of a longer unit cell with identical a and b axes, and a c axis five times as long: c′ = 5c = 100.9 Å (PPG 1 set) or 102.1 Å (PPG 2 set). Identical findings have been reported previously for this peptide (Okuyama et al., 1981). The data of the dominant subcell correspond to those reflections with l′ = 5n. Reflections with l′ = 5n + m (m = 1, 2, 3, 4) were observed as well, especially with the synchrotron data, but their average intensities were much lower than those from the subcell. Of the 1193 5n + 4 reflections collected, only 58% had intensities greater than 1σ(I). The rise per tripeptide in collagen triple helices is known to be 2.9 Å, and strong reflections (0 1 7) corresponding to that spacing appear in the PPG 1 data near the c axis. These reflections were not measured in the PPG 2 data collection. The 20 Å dimension of the short c axis corresponds to a complete turn of a 7-fold triple helix aligned along the c axis. The dominance of this reduced cell can be interpreted in terms of a structure partially disordered along the helical axis, in which the peptide molecules stack on top of each other to form a columnar structure. Because of crystalline disorder, this columnar structure resembles an infinite chain in which an individual triple helix cannot be discriminated from that above or below it. During the course of the work reported here, several attempts were made to model this disorder using the PPG 2 data from the l′ = 5n + m reflections, but the data proved to be insufficient to discriminate among individual molecules along the helical axis in the extended cell. Consequently, the average crystal structure corresponding to an infinite polymer crystal, in which the c axis is the helical repeat of a 7-fold collagen triple helix, was solved and refined using an asymmetric unit of 21 residues.
Figure 9. Comparing model A with model B from initial molecular replacement. (a) Model A. Presumably, the carbonyl group of the proline residue in the X position should make a hydrogen bonding contact with the N atom of the glycine residue in the neighboring chain. Model A displays "normal" N(Gly) to O(Pro)X hydrogen bonds in terms of length and orientation (broken single line). The distance to the Cα of the third chain is longer and the orientation is not as appropriate for a hydrogen bond (broken double line). (b) Model B. Interchain hydrogen bonding geometry is perturbed. In model B, the oxygen atom appears to be somewhat pointed toward the Cα (broken double line), rather than toward the N(Gly), and this length is reasonable for a hydrogen bond while the distance to the N(Gly) becomes longer (broken single line). (c) The high-symmetry sequence and quasi-infinite helical nature of Pro-Pro-Gly implies that an end-to-end rotation of the peptide chain would give analogous models with only the N and Cα positions transposed. Parts (a) and (b) of the Figure were generated with MOLSCRIPT (Kraulis, 1991).
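The relation between the (0 1 7) reflections, the 2.9 Å spacing and the five-fold supercell can be checked with the standard interplanar spacing formula for an orthorhombic cell, 1/d² = h²/a² + k²/b² + l²/c². The sketch below is illustrative only and uses the PPG 1 subcell dimensions quoted above; the function names are hypothetical.

```python
import math


def d_spacing_orthorhombic(h, k, l, a, b, c):
    """Interplanar spacing (Å) for reflection (h k l) in an orthorhombic cell."""
    inv_d_squared = (h / a) ** 2 + (k / b) ** 2 + (l / c) ** 2
    return 1.0 / math.sqrt(inv_d_squared)


def split_supercell_index(l_super, multiple=5):
    """Write a supercell index l' as l' = multiple * n + m, returning (n, m).

    m == 0 marks a reflection of the dominant subcell; m = 1..4 marks the
    weak reflections that arise only from the longer c' axis.
    """
    return divmod(l_super, multiple)


# PPG 1 subcell: a = 26.82 Å, b = 26.29 Å, c = 20.18 Å
d_017 = d_spacing_orthorhombic(0, 1, 7, 26.82, 26.29, 20.18)
print(f"d(0 1 7) = {d_017:.2f} Å")
print("c' =", round(5 * 20.18, 1), "Å (PPG 1),", round(5 * 20.42, 1), "Å (PPG 2)")
print(split_supercell_index(35), split_supercell_index(7))
```

With these cell dimensions the (0 1 7) spacing comes out at about 2.87 Å, in line with the 2.9 Å rise quoted in the text, and the subcell index l = 7 corresponds to l′ = 35 in the extended cell.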
Structure determination and refinement
A simplified molecular replacement search was performed using the LALS program (Campbell-Smith & Arnott, 1978) and an idealized 21 residue fragment of a 7-fold triple helix. The polymer-crystal nature of this structure reduces the number of search variables from six to four. Since the helix axis can be aligned with the crystallographic c axis, the orientation search is reduced to the azimuth angle μ. Three translational variables, u, v and w, are still required to place the model correctly with respect to the unit cell origin. To simplify the search further, a first solution for the (u, v, μ) variables was obtained based on the 31 equatorial-like hk0 reflections up to 4 Å. Then, the search was extended (using all the data up to 4 Å, 127 reflections) into the third dimension by varying the vertical displacement w and the azimuth angle μ.
† Given the small number of reflections, the discriminatory power of the free R-factor was significantly reduced, so other validation criteria were taken into account subsequently.
Two independent solutions (models A and B) were found differing in the u translation and the μ rotation, which showed similar agreement with the X-ray data; R = 40.75% and 41.15% after rigid-body refinement. Torsion-angle refinement of both models in parallel using LALS (Campbell-Smith & Arnott, 1978) produced non-discriminative results.
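The agreement index used to compare models A and B is the conventional crystallographic R-factor, R = Σ||Fo| − k|Fc|| / Σ|Fo|. The sketch below shows one way to evaluate it; the single overall scale factor k = Σ|Fo| / Σ|Fc| is a simplifying assumption made here, not a statement of how LALS or X-PLOR scale the data, and the amplitudes are hypothetical.

```python
def r_factor(f_obs, f_calc):
    """Conventional R-factor between observed and calculated amplitudes.

    Assumption: one overall scale factor, k = sum(|Fo|) / sum(|Fc|),
    is applied to the calculated amplitudes before comparison.
    """
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty lists of equal length")
    scale = sum(f_obs) / sum(f_calc)
    disagreement = sum(abs(fo - scale * fc) for fo, fc in zip(f_obs, f_calc))
    return disagreement / sum(f_obs)


# Hypothetical structure-factor amplitudes, for illustration only
f_observed = [10.0, 20.0]
f_model = [12.0, 18.0]
print(f"R = {100 * r_factor(f_observed, f_model):.2f} %")
```

Two models whose R values differ by well under one percentage point, as for models A and B here, cannot be separated on this index alone, which is why the text turns to hydrogen bonding geometry.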
Positional and overall B-factor refinements along with simulated annealing were performed on both models with X-PLOR (Brünger, 1991). Data between 8 and 2.0 Å were used and 7-fold non-crystallographic symmetry restraints were maintained. As the refinements proceeded, model B appeared to give somewhat higher R and R-free values†, higher rms deviations against standard geometries, as well as somewhat higher constraint energies (bonds and angles). However, neither set of Fourier maps (2Fo − Fc) was significantly better or worse in terms of chain continuity or coverage.
An investigation of the hydrogen bonding geometries in both models provided the final answer. While model A has interchain N-H···O=C hydrogen bonds with reasonable geometry, model B did not. In model B, interchain N···O distances are longer than Cα···O distances, as if the primary hydrogen bond donors were the alpha carbon instead of amide nitrogen atoms (Figure 9(a) and (b)). This is consistent with the effect of an end-to-end rotation of the triple helix. Because there is so much inherent symmetry in the Pro-Pro-Gly sequence and the quasi-infinite helix eliminates end-effects, an end-to-end rotation of the model leads to differences in just a few places. While the carbonyl oxygen atoms and proline rings remain in the same location, only Cα is substituted for N and vice versa (Figure 9(c)). Inverted modeling of the peptide helix imposes incorrect stereochemical restraints and yields inverted non-bonded geometry between the N-H···O=C hydrogen bonds and the Cα···O=C non-bonded interactions. A review of the 2Fo − Fc maps at this point further confirmed this notion, and consequently only model A was kept for subsequent rounds of refinement.
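The criterion that separated the two models can be expressed as a comparison of two distances from each X position prolyl carbonyl oxygen atom: one to the glycine N and one to the glycine Cα of the neighboring chain. The sketch below illustrates that comparison only; the coordinates are hypothetical and are not taken from either model.

```python
import math


def apparent_donor(carbonyl_o, gly_n, gly_ca):
    """Report which glycine atom lies closer to the carbonyl oxygen.

    Returns ("N" or "CA", N...O distance, CA...O distance). A chain traced
    in the correct direction is expected to give "N"; a reversed chain,
    with N and CA transposed, would give "CA".
    """
    d_n = math.dist(carbonyl_o, gly_n)
    d_ca = math.dist(carbonyl_o, gly_ca)
    return ("N" if d_n < d_ca else "CA"), d_n, d_ca


# Hypothetical coordinates (Å), for illustration only
oxygen = (0.0, 0.0, 0.0)
print(apparent_donor(oxygen, gly_n=(2.9, 0.0, 0.0), gly_ca=(3.4, 0.5, 0.0)))
print(apparent_donor(oxygen, gly_n=(3.4, 0.5, 0.0), gly_ca=(2.9, 0.0, 0.0)))
```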
Several constraints were imposed in X-PLOR (Brünger, 1991) to ensure the matching of the end of one helix with the beginning of the next. The nature of the infinite triple helix requires covalent bonds among symmetry molecules related along the helical axis. Non-crystallographic symmetry restrictions were removed and the resulting model underwent many rounds of simulated annealing, positional refinement, and manual water molecule fitting. At a point near the end of manual water fitting using the PPG 1 data set, the coordinates were used to refine against the PPG 2 set. The initial R-factor upon placing the model (not including water molecules) against the PPG 2 set was about 27% for data between 8 and 1.8 Å. This and the preservation of electron density connectivity along the c axis indicated consistency between the two data sets. Refinement of the hydration structure continued against the two sets in parallel, including positional refinement, simulated annealing, as well as group (PPG 1) and individual (PPG 2) B-factor refinement.
Independent refinement of the PPG 2 structure was performed using both X-PLOR (Brünger, 1991) and PROLSQ (Hendrickson & Konnert, 1981). At the start of the refinement of the PPG 2 structure, only the peptide portion of the PPG 1 structure was used; the Fourier was investigated independently for hydration peaks.
The final PPG 1 model contains 37 water and two acetic acid molecules and the final PPG 2 model contains 40 water molecules. The final R-factor of the PPG 1 model against the PPG 1 set of data (for reflections in the 8.0 to 1.97 Å range using a 2σ on F cutoff) is 18.1% and for the PPG 2 model against the PPG 2 set is 21.3% (for reflections in the 8 to 1.6 Å range using a 2σ on F cutoff; Table 1). The final models fit the 2Fo − Fc maps well and show no significant chain discontinuities. The rms deviations for both models against standard geometries are given in Table 1. Coordinates for both the PPG 1 model and PPG 2 model have been deposited in the Brookhaven Protein Data Bank as 1a3i and 1a3j respectively. Structure factors have been deposited as well with the codes r1a3isf and r1a3jsf for PPG 1 and PPG 2, respectively.
Acknowledgments
Overall support for this project was received from grants GM 21589 to H.M.B. and AR19626 to B.B. from the National Institutes of Health as well as a grant from the Pittsburgh Supercomputing Center. The research of L.V. has been partially supported by an International Exchange Program award from the University of Naples "Federico II". Financial support was provided by the Italian CNR (National Research Council, Progetto Strategico "Biologia Strutturale") and ASI (Italian Space Agency). Computers and graphic facilities were made available by Ceinge (Naples). The research of R.Z.K. has been supported by the National Institutes of Health Molecular Biophysics Training Grant and the Department of Education's Graduate Assistance in Areas of National Need Grant. A.Z. is grateful to Professor H. A. Scheraga for his interest and encouragement. We are indebted to the CHESS staff and particularly to R. Walter for his constant support and assistance during data collection, and to G. Sorrentino and P. Occorsio for their technical assistance. R.Z.K. and L.V. contributed equally to this work.
References
Acton, S., Resnick, D., Freeman, M., Ekkel, Y.,
Ashkenas, J. & Krieger, M. (1993). The collagenous
domains of macrophage scavenger receptors and
complement component C1q mediate their similar,
but not identical, binding specificities for polyanionic ligands. J. Biol. Chem. 268, 3530–3537.
Bella, J. & Berman, H. M. (1996). Crystallographic evidence for Cα-H···O=C hydrogen bonds in a collagen triple helix. J. Mol. Biol. 264, 734–742.
Bella, J., Eaton, M., Brodsky, B. & Berman, H. M. (1994).
Crystal and molecular structure of a collagen-like
peptide at 1.9 Å resolution. Science, 266, 75–81.
Bella, J., Brodsky, B. & Berman, H. M. (1995). Hydration
structure of a collagen peptide. Structure, 3, 893–906.
Brünger, A. T. (1992). X-PLOR, Version 3.1, A System for
X-ray Crystallography and NMR. Yale University
Press, New Haven, CT.
Campbell-Smith, P. J. & Arnott, S. (1978). LALS: a
linked-atom least-squares reciprocal-space refinement system incorporating stereochemical restraints
to supplement sparse diffraction data. Acta Crystallog. sect. A, 34, 3–11.
Doi, T., Higashino, K.-i., Kurihara, Y., Wada, Y.,
Miyazaki, T., Nakamura, H., Uesugi, S., Imanishi,
T., Kawabe, Y. & Itakura, H. (1993). Charged collagen structure mediates the recognition of negatively charged macromolecules by macrophage
scavenger receptors. J. Biol. Chem. 268, 2126–2133.
Evans, S. V. (1993). SETOR: hardware lighted three-dimensional solid model representations of macromolecules. J. Mol. Graph. 11, 134–138.
Fair, C. K. (1992). MOLEN: An Interactive Structure
Solution Procedure. Enraf-Nonius, Delft, Netherlands.
Fraser, R. D. B., MacRae, T. P. & Suzuki, E. (1979).
Chain conformation in the collagen molecule. J. Mol.
Biol. 129, 463–481.
Fraser, R. D. B., MacRae, T. P., Miller, A. & Suzuki, E.
(1983). Molecular conformation and packing in collagen fibrils. J. Mol. Biol. 167, 497–521.
Hendrickson, W. A. & Konnert, J. H. (1981). PROLSQ.
In Biomolecular Structure, Conformation, Function and
Evolution. (Srinivasan, R., Subramanian, E. &
Yathindra, N., eds.), pp. 43–57, Pergamon Press,
Oxford.
Hoppe, H.-J. & Reid, K. B. M. (1994). Collectins: soluble
proteins containing collagenous regions and lectin
domains and their roles in innate immunity. Protein
Sci. 3, 1143–1158.
Kraulis, P. (1991). MOLSCRIPT: a program to produce
both detailed and schematic plots of protein structures. J. Appl. Crystallog. 24, 946–950.
Li, M.-H., Fan, P., Brodsky, B. & Baum, J. (1993). Two-dimensional NMR assignments and conformation of
(Pro-Hyp-Gly)10 and a designed triple helical peptide. Biochemistry, 32, 7377–7387.
Miller, M. H. & Scheraga, H. A. (1976). Calculation of
the structures of collagen models. Role of interchain
interactions in determining the triple-helical coiled-coil conformation. I. Poly(glycyl-prolyl-prolyl).
J. Polym. Sci. Symp. 54, 171–200.
Minor, W. (1993). XDISPLAYF program. Purdue University.
Momany, F. A., McGuire, R. F., Burgess, A. W. &
Scheraga, H. A. (1975). Energy parameters in polypeptides. VII. Geometric parameters, partial atomic
charges, nonbonded interactions, hydrogen bond
interactions, and intrinsic torsional potentials for
the naturally occurring amino acids. J. Phys. Chem.
79, 2361–2381.
Némethy, G., Gibson, K. D., Palmer, K. A., Yoon, C. N.,
Paterlini, G., Zagari, A., Rumsey, S. & Scheraga,
H. A. (1992). Energy parameters in polypeptides.
10. Improved geometrical parameters and nonbonded interactions for use in the ECEPP/3 algorithm, with application to proline-containing
peptides. J. Phys. Chem. 96, 6472.
Okuyama, K., Okuyama, K., Arnott, S., Takayanagi, M.
& Kakudo, M. (1981). Crystal and molecular structure of a collagen-like polypeptide (Pro-Pro-Gly)10.
J. Mol. Biol. 152, 427–443.
Otwinowski, Z. (1993). Oscillation data reduction program. In Proceedings of the CCP4 Study Weekend:
Data collection and Processing. (Sawyer, L., Isaacs, N.
& Bailey, S., eds.), pp. 56–62, Warrington, UK,
SERC Daresbury Laboratory.
Rich, A. & Crick, F. H. C. (1961). The molecular structure of collagen. J. Mol. Biol. 3, 483–506.
Rosenbloom, J., Harsch, M. & Jimenez, S. (1973). Hydroxyproline content determines the denaturation temperature of chick tendon collagen. Arch. Biochem.
Biophys. 158, 478–484.
Sack, J. S. (1988). CHAIN: a crystallographic modeling
program. J. Mol. Graph. 6, 224–225.
Sakakibara, S., Kishida, Y., Kikuchi, Y., Sakai, R. &
Kakiuchi, K. (1968). Synthesis of poly-(L-prolyl-L-prolylglycyl) of defined molecular weights. Bull.
Chem. Soc. Jpn. 41, 1273.
Sakakibara, S., Kishida, Y., Okuyama, K., Tanaka, N.,
Ashida, T. & Kakudo, M. (1972). Single crystals of
(Pro-Pro-Gly)10 a synthetic polypeptide model of
collagen. J. Mol. Biol. 65, 371–373.
Sakakibara, S., Inouye, K., Shudo, K., Kishida, Y.,
Kobayashi, Y. & Prockop, D. J. (1973). Synthesis of
(Pro-Hyp-Gly)n of defined molecular weights. Evidence for the stabilization of collagen triple helix by
hydroxyproline. Biochim. Biophys. Acta, 303, 198–202.
Sasisekharan, V. & Bansal, M. (1990). Self-similarity and
the assembly of collagen molecules. Curr. Sci. 1990,
863–866.
Schneider, B., Cohen, D. M., Schleifer, L., Srinivasan,
A. R., Olson, W. K. & Berman, H. M. (1993). A systematic method for studying the spatial distribution
of water molecules around nucleic acid bases. Biophys. J. 65, 2291–2303.
Yonath, A. & Traub, W. (1969). Polymers of tripeptides
as collagen models. J. Mol. Biol. 43, 461–477.
Edited by D. Rees
(Received 23 January 1998; received in revised form 8 April 1998; accepted 9 April 1998)