New Strategies for Protein Folding Materials and Process Simulation Center

New Strategies for Protein Folding
Joseph F. Danzer, Derek A. Debe, Matt J. Carlson, William A. Goddard III
Materials and Process Simulation Center
California Institute of Technology
Protein Tertiary Structure Prediction
Given a Protein’s Primary Structure -- Amino Acid Sequence
…-HIS-CYS-ALA-ALA-GLY-GLU-ASP-...
Can We Determine Its 3D Structure?
What Local Structural Units Does It Form?
•α-Helix (Cylinder)
•β-Sheets (Ribbon)
How Do Those Structural Units Pack Together?
Structure Prediction is a Two-Fold Problem
With a 6-state (φ,ψ) representation,
6^50, or about 10^38, states for a 50-residue protein
Assuming the protein may sample 1 state/ps,
about 10^19 years to fold
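The estimate above can be checked with a few lines of arithmetic. This is only a sketch of the counting argument on the slide; the constant and function names are illustrative and not part of the original work.

```python
import math

STATES_PER_RESIDUE = 6        # from the slide: 6 (phi, psi) states per residue
N_RESIDUES = 50               # from the slide: 50-residue protein
SAMPLING_RATE_PER_S = 1e12    # from the slide: 1 state per picosecond
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def conformations(states_per_residue, n_residues):
    """Number of distinct chain conformations in the discrete model."""
    return states_per_residue ** n_residues

def exhaustive_search_years(n_states, rate_per_s=SAMPLING_RATE_PER_S):
    """Years needed to visit every state once at the given sampling rate."""
    return n_states / rate_per_s / SECONDS_PER_YEAR

n_states = conformations(STATES_PER_RESIDUE, N_RESIDUES)
years = exhaustive_search_years(n_states)
print(f"log10(states) = {math.log10(n_states):.1f}")
print(f"log10(years)  = {math.log10(years):.1f}")
```

The slide's figures are order-of-magnitude values, so the printed exponents should be read as landing in the same range rather than matching exactly.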
•Conformational Search Problem
–Given the exponentially large number of possible states,
how do we generate a correct state?
•Recognition Problem
–How do we differentiate correct from incorrect folds?
Restrained Generic Protein (RGP)
Direct Monte Carlo
Highly efficient, off-lattice residue buildup procedure
for generating ensembles of protein conformations that
comply with a set of user-defined distance restraints.
Generic Protein Model
•Each residue is a 5.5 Å sphere
•Fixed geometry connects residues
l = 3.8 Å; θ = 120°
Typically φ = 0°, 60°, 120°, 180°, 240°, 300° (6 states per residue)
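A minimal sketch of residue buildup under the fixed geometry described here (3.8 Å between consecutive residues, a 120° angle, six allowed dihedral values). The placement routine is a generic internal-to-Cartesian conversion, not the authors' code; the function names and the dihedral sign convention are assumptions, and sphere overlap checks are left out.

```python
import math
import random

BOND_LENGTH = 3.8                  # Å, from the slide
BOND_ANGLE = math.radians(120.0)   # from the slide
DIHEDRALS = [math.radians(a) for a in (0, 60, 120, 180, 240, 300)]

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(a):
    n = math.sqrt(a[0] ** 2 + a[1] ** 2 + a[2] ** 2)
    return (a[0] / n, a[1] / n, a[2] / n)

def place_next(a, b, c, dihedral):
    """Place residue d after a, b, c with |cd| = BOND_LENGTH,
    angle b-c-d = BOND_ANGLE and torsion a-b-c-d = dihedral."""
    bc = unit(sub(c, b))
    n = unit(cross(sub(b, a), bc))   # normal to the a-b-c plane
    m = cross(n, bc)                 # completes the local frame
    x = -BOND_LENGTH * math.cos(BOND_ANGLE)
    y = BOND_LENGTH * math.sin(BOND_ANGLE) * math.cos(dihedral)
    z = BOND_LENGTH * math.sin(BOND_ANGLE) * math.sin(dihedral)
    return tuple(c[i] + x * bc[i] + y * m[i] + z * n[i] for i in range(3))

def build_chain(dihedrals):
    """Build 3 + len(dihedrals) residue positions from a dihedral sequence."""
    a = (0.0, 0.0, 0.0)
    b = (BOND_LENGTH, 0.0, 0.0)
    c = (b[0] - BOND_LENGTH * math.cos(BOND_ANGLE),
         BOND_LENGTH * math.sin(BOND_ANGLE), 0.0)
    chain = [a, b, c]
    for phi in dihedrals:
        chain.append(place_next(chain[-3], chain[-2], chain[-1], phi))
    return chain

def random_chain(n_residues, rng=random):
    """Sample one conformation by picking each dihedral uniformly at random."""
    return build_chain([rng.choice(DIHEDRALS) for _ in range(n_residues - 3)])
```

For example, `random_chain(50)` returns 50 positions in which every consecutive pair is 3.8 Å apart; a real buildup procedure would also reject positions that overlap earlier residues.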
Restraint Implementation
At residue addition step i, the maximal position of
residue i+n in the (z,r) plane is known.
[Figure: reachable positions of residue i+4 in the (z,r) plane, shown relative to residues i-1 and i]
Leads to a simple set of trigonometric
conditions for restraint satisfaction.
Satisfies pairwise restraints with >90% efficiency
at negligible computational cost.
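The idea of checking a restraint before the restrained residue has been placed can be illustrated with a much cruder test than the (z,r) conditions mentioned above. The sketch below swaps in a plain triangle-inequality bound: a chain of n links of 3.8 Å can never span more than n × 3.8 Å. It is an assumption-laden stand-in, not the authors' trigonometric conditions, and all names are hypothetical.

```python
import math

BOND_LENGTH = 3.8  # Å, distance between consecutive residues (from the slides)

def max_reach(n_links, bond_length=BOND_LENGTH):
    """Loose upper bound on the distance spanned by n_links consecutive links."""
    return n_links * bond_length

def restraint_still_satisfiable(pos_i, pos_partner, n_links, d_max):
    """Can residue i+n_links still end up within d_max of an already placed partner?

    pos_i       -- position just proposed for residue i
    pos_partner -- position of the already placed residue in the restraint
    n_links     -- number of links between residue i and the restrained residue
    d_max       -- upper distance bound of the restraint, in Å
    """
    closest_possible = max(0.0, math.dist(pos_i, pos_partner) - max_reach(n_links))
    return closest_possible <= d_max

# A candidate position 30 Å from the partner cannot satisfy a 10 Å restraint
# that applies four residues later, so it can be discarded immediately.
print(restraint_still_satisfiable((0, 0, 0), (30, 0, 0), 4, 10.0))
print(restraint_still_satisfiable((0, 0, 0), (20, 0, 0), 4, 10.0))
```

Because the bound ignores the fixed 120° angle, it prunes fewer candidates than an exact geometric condition would; it never rejects a position that could still satisfy the restraint.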
Generate-and-Select Hierarchy
Inputs: Amino Acid Sequence; Secondary structure prediction; Inter-residue restraints
RGP Ensemble Generation: <10^4 topologies
Static Residue Burial Selection: <500 topologies
Intact Peptide Backbone
Dynamic Residue Burial Selection: <20 topologies
Local Structure Refinement, Additional Restraints: <10 topologies
Additional Refinement: <5 topologies
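The generate-and-select idea, in which each stage scores the surviving candidates and keeps only the best few, can be sketched as a chain of filters. The scoring functions below are toy placeholders, not the burial or refinement criteria used in the hierarchy, and the function names are hypothetical.

```python
def select_best(candidates, score, keep):
    """Keep the `keep` candidates with the lowest score."""
    return sorted(candidates, key=score)[:keep]

def run_hierarchy(candidates, stages):
    """Apply (score, keep) stages in order, shrinking the candidate set each time."""
    for score, keep in stages:
        candidates = select_best(candidates, score, keep)
    return candidates

# Toy demonstration: integers stand in for topologies and simple distance
# functions stand in for the (unspecified) selection energies.
toy_stages = [
    (lambda x: abs(x - 499.5), 500),   # stand-in for a cheap first-pass filter
    (lambda x: abs(x - 490.5), 20),    # stand-in for a costlier second filter
    (lambda x: x, 5),                  # stand-in for a final refinement ranking
]
survivors = run_hierarchy(list(range(1000)), toy_stages)
print(survivors)
```

The design point is that cheap criteria are applied to the large ensemble and expensive ones only to the handful of candidates that survive.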
LexA Repressor
        RGP Ensemble         Selected Set                 Sec. Prediction
        S(a)      CRMS(b)    s(c)   Rank(d)   CRMS(e)     Rank(f)   CRMS(g)
N/36    30,0000   6.85 Å     395    24t       7.46 Å      14t       6.67 Å
N/24    5,000     6.57 Å     209    6t        6.76 Å      2t        6.11 Å
N/12    500       6.28 Å     271    1         6.43 Å      7t        4.45 Å
N/6     -         -          44     2         6.13 Å      1t        5.76 Å
Secondary Structure Prediction-PHD
Burkhard Rost & Chris Sander, J. Mol. Biol. 232, 584 (1993).
Myoglobin
        RGP Ensemble         Selected Set                 Sec. Prediction
        S         CRMS       s      Rank      CRMS        Rank      CRMS
N/12    50,000    8.95 Å     117    11        8.77 Å      5         7.01 Å
N/6     -         -          23     1         9.28 Å      1         6.30 Å
Inter-Residue Restraints
If the tertiary structure is unknown, how can we generate distance restraints?
•Experimentally determined disulfide bond connectivity
•Use the PHD prediction algorithm to generate loose restraints¹
PHD predicts whether each residue will be buried or exposed to solvent
•Assume the residues with greatest burial form a hydrophobic core
•Generate a few loose restraints (4-10 Å) between these residues
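The last two bullets can be sketched as follows, assuming a per-residue burial score is already available from a predictor such as PHD. The function name, the score scale, the choice of pairing every core residue with every other, and the reading of "4-10 Å" as lower and upper bounds are all assumptions made for illustration.

```python
from itertools import combinations

def core_restraints(burial, n_core=4, d_min=4.0, d_max=10.0):
    """Build loose pairwise distance restraints between the most buried residues.

    burial -- predicted burial score per residue (higher means more buried)
    n_core -- how many of the most buried residues to treat as the core
    Returns a list of (i, j, d_min, d_max) tuples with i < j.
    """
    ranked = sorted(range(len(burial)), key=lambda i: burial[i], reverse=True)
    core = sorted(ranked[:n_core])
    return [(i, j, d_min, d_max) for i, j in combinations(core, 2)]

# Hypothetical burial scores for a six-residue toy sequence.
toy_burial = [0.1, 0.9, 0.2, 0.8, 0.3, 0.95]
for restraint in core_restraints(toy_burial, n_core=3):
    print(restraint)
```

A practical version would probably also skip pairs that are close in sequence, since those are near each other regardless of the fold.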
Tests using loose restraints were run on two proteins (3icb, 1lea).
Protein: 3icb, 1lea
# Restraints: 3, 1, 3, 8*, 7**
Energy Cut-Off: -26, -23, -27, -18, -30, -27
# Selected Structures: 463, 460, 172, 2242, 110, 330
# Near Native: 4, 2, 1, 1, 3, 8
Best CRMS: 7.787, 7.827, 8.300, 8.484, 7.001, 7.001
*All restraints were picked so that they were incorrect
**All restraints were picked so that they were correct
1. Burkhard Rost & Chris Sander, J. Mol. Biol. 232, 584 (1993).
Local Structure Refinement
•Dynamic Monte Carlo
–Make small local deformations to the backbone structure
–Overall topology must be kept intact
–Use simple energy function to determine if deformation is
accepted or rejected
•Fragment Sewing
–The Isites¹ library is a database of structural fragments widely
observed in the Protein Data Bank.
–Based on sequence homology, Isites will generate a list of
fragments whose structures are likely to be found in the protein
–Local structure can be refined by sewing these fragments into the
overall structure
1. C. Bystroff & D. Baker, J. Mol. Biol. 281, 565 (1998).
Dynamic Monte Carlo
Local deformations are made by modifying the position of a single residue.
Axis of rotation
Circle defines allowed movement
based on fixed geometry of model
Energy function properly orients side chains. Hydrophilic groups point outward
and hydrophobic groups point inward.
C- Atoms
Hydrophilic Side Chain
Hydrophobic Side Chain
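A single-residue move of this kind can be sketched as a rotation of residue i about the axis through its two neighbors, which keeps it on a circle and leaves its distances to both neighbors unchanged. The acceptance rule below is the standard Metropolis criterion, which is an assumption; the slides only say that a simple energy function decides acceptance. The `energy` callable, the temperature scale and the maximum rotation angle are hypothetical.

```python
import math
import random

def rotate_about_axis(p, a, b, angle):
    """Rotate point p by `angle` about the axis through a and b (Rodrigues formula)."""
    axis = [b[i] - a[i] for i in range(3)]
    length = math.sqrt(sum(c * c for c in axis))
    k = [c / length for c in axis]
    v = [p[i] - a[i] for i in range(3)]
    k_cross_v = (k[1] * v[2] - k[2] * v[1],
                 k[2] * v[0] - k[0] * v[2],
                 k[0] * v[1] - k[1] * v[0])
    k_dot_v = sum(k[i] * v[i] for i in range(3))
    cos_t, sin_t = math.cos(angle), math.sin(angle)
    return tuple(a[i] + v[i] * cos_t + k_cross_v[i] * sin_t
                 + k[i] * k_dot_v * (1.0 - cos_t) for i in range(3))

def metropolis_accept(delta_e, temperature, rng=random):
    """Always accept downhill moves; accept uphill moves with Boltzmann probability."""
    if delta_e <= 0.0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)

def try_move(chain, i, energy, temperature, max_angle=math.radians(30), rng=random):
    """Attempt one local deformation of interior residue i.

    Returns (chain, accepted); the input chain is returned unchanged on rejection.
    """
    angle = rng.uniform(-max_angle, max_angle)
    trial = list(chain)
    trial[i] = rotate_about_axis(chain[i], chain[i - 1], chain[i + 1], angle)
    if metropolis_accept(energy(trial) - energy(chain), temperature, rng):
        return trial, True
    return chain, False
```

Note that this move changes the angles at residues i-1 and i+1, so a model with strictly fixed angles would need either a tolerance or a more elaborate move set.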
Fragment Sewing
Segment’s original structure
New structure after sewing
Rest of protein
Overall topology is still intact, but the local structure is now α-helical
rather than a random coil.