Computational Method for Predicting Amyloidogenic Sequences

Bill Welsh
UMDNJ-Robert Wood Johnson Medical School
welshwj@umdnj.edu
Amyloid Fibril Formation
A Common Mechanism for Protein Misfolding Diseases
• Numerous amyloid & misfolding diseases
• All of them are incurable at present
• Short list of more familiar examples
– Alzheimer’s disease
– Parkinson’s disease
– Huntington’s disease
– Creutzfeldt-Jakob disease (“Mad Cow”)
– Familial Amyloidosis
– Type II Diabetes
• Triggered by short sequences that convert from native α-helix or coil to β-strand
• We call this trait ‘hidden β-strand propensity’
Problems
1. No sequence specificities
2. Absence of detailed structural information on misfolded proteins (amyloid fibrils)
Our Solution
1. Misfolding process is triggered by short (5-7 residue) sequences
2. Redefine sequence-structure relationships in terms of tertiary context
3. Identify short sequences that exhibit non-native (hidden) β-strand propensity [HβP].
Intriguing Relationship Between Tertiary Contacts and Secondary Structure
Relative Occurrence of Secondary Structure Elements in Different Tertiary Contact States
Tertiary Contact (TC): two non-H atoms 4 Å apart, separated by more than 4 residues in sequence

Tertiary contacts | Coil | α    | β    | Total sequences
Low               | 38 % | 59 % | 3 %  | 191,300
Medium            | 47 % | 37 % | 16 % | 112,199
High              | 39 % | 11 % | 50 % | 150,288
All               | 41 % | 38 % | 21 % | 453,787

Based on SCOP20 v1.57
Striking Conclusion
α-helix dominates in low-TC regions
β-sheet dominates in high-TC regions
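The tertiary-contact definition given with the table can be sketched in code. This is a minimal illustration, not the authors' implementation. It assumes that "4 Å apart" means two atoms lying within a 4 Å cutoff, that atoms arrive as (residue index, x, y, z) tuples for non-hydrogen atoms only, and that contacts are tallied per residue. The function name, input layout, and toy coordinates are hypothetical.

```python
import math
from collections import defaultdict


def count_tertiary_contacts(atoms, cutoff=4.0, min_separation=4):
    """Count tertiary contacts per residue.

    atoms: list of (residue_index, x, y, z) tuples for non-hydrogen atoms.
    An atom pair counts as a contact when the two atoms lie within `cutoff`
    angstroms (assumed reading of "4 A apart") and their residues are more
    than `min_separation` positions apart in sequence.
    """
    contacts = defaultdict(int)
    for i in range(len(atoms)):
        res_i, *xyz_i = atoms[i]
        for j in range(i + 1, len(atoms)):
            res_j, *xyz_j = atoms[j]
            if abs(res_i - res_j) <= min_separation:
                continue  # too close in sequence: a local, not tertiary, contact
            if math.dist(xyz_i, xyz_j) <= cutoff:
                contacts[res_i] += 1
                contacts[res_j] += 1
    return dict(contacts)


# Toy coordinates (made up for illustration only).
toy_atoms = [
    (1, 0.0, 0.0, 0.0),
    (2, 1.5, 0.0, 0.0),
    (8, 0.0, 3.0, 0.0),
    (9, 10.0, 10.0, 10.0),
]
print(count_tertiary_contacts(toy_atoms))
```

With the toy coordinates, residues 1 and 2 each make one contact with residue 8, and residue 9 makes none. The all-pairs loop is quadratic in the number of atoms; a spatial index would be the usual choice for whole proteins.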
TC Influence on Secondary Structure Propensity
[Figure: two plots of secondary structure propensity (vertical axis, 0 to 0.9) versus TC (horizontal axis, 0 to 2).
Panel 1: helix propensity of fragments predicted as helix; legend: Helix, Coil.
Panel 2: β-strand propensity of fragments predicted as β-strand; legend: β-strand, Coil.]
α-helix propensity of β-strands increases sharply at low TCs
β-strand propensity of helices increases sharply at high TCs
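Average Tertiary Contacts (TCs) in SCOP20

Residue | L    | S   | A    | Q    | E   | K    | D    | N    | R    | I
Avg TC  | 9.9  | 8.3 | 6.4  | 10.2 | 8.6 | 8.0  | 9.1  | 10.6 | 15.9 | 11.0

Residue | F    | T   | M    | P    | G   | W    | C    | V    | Y    | H
Avg TC  | 18.5 | 9.3 | 11.9 | 7.0  | 6.5 | 25.9 | 12.7 | 10.1 | 21.7 | 16.2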
The CSSP Algorithm: Locating Sequences Exhibiting HβP
[Diagram: CSSP workflow]
Database construction
• SCOP20 sequences and 3D structures from PDB
• DSSP secondary structure and tertiary contacts (TC) assigned to each fragment
• Database of >450,000 7-residue sequences with secondary structure & TCs
Query
• Query sequence: …AGHGQEVLIRLFTGHPETL…
• Sliding 7-residue window, e.g. -Q-E-V-L-I-R-L-
• Similar sequences retrieved from the database
• Low TC: P(α|low); High TC: P(β|high)
• Result: sequence of hidden β-propensity
Example: amyloid fibrils from myoglobin (Fandrich et al., 2001 Nature)
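The sliding-window lookup in the CSSP workflow can be sketched as follows. This is a toy illustration under stated assumptions, not the published algorithm. The slides do not say how "similar sequences" are chosen or where the low/high TC boundaries lie, so the identity-count rule (`min_identity`), the cutoffs (`low_tc`, `high_tc`), and every entry of `TOY_DB` are hypothetical.

```python
def windows(sequence, size=7):
    """Yield (start, fragment) for every sliding window of `size` residues."""
    for start in range(len(sequence) - size + 1):
        yield start, sequence[start:start + size]


def identity(a, b):
    """Number of positions at which two equal-length fragments agree."""
    return sum(x == y for x, y in zip(a, b))


def contact_dependent_propensity(fragment, database, min_identity=5,
                                 low_tc=0.5, high_tc=1.5):
    """Estimate P(alpha | low TC) and P(beta | high TC) for one fragment.

    database: iterable of (fragment, secondary_structure, tc), where
    secondary_structure is 'H' (helix), 'E' (strand) or 'C' (coil).
    The similarity rule and the TC cutoffs are illustrative assumptions.
    Returns None for a propensity when no similar fragment falls in that
    TC range.
    """
    similar = [(ss, tc) for frag, ss, tc in database
               if identity(fragment, frag) >= min_identity]
    low = [ss for ss, tc in similar if tc <= low_tc]
    high = [ss for ss, tc in similar if tc >= high_tc]
    p_alpha_low = low.count("H") / len(low) if low else None
    p_beta_high = high.count("E") / len(high) if high else None
    return p_alpha_low, p_beta_high


def scan(sequence, database, size=7, **kwargs):
    """Apply the lookup to every window of the query sequence."""
    return [(start, frag,
             *contact_dependent_propensity(frag, database, **kwargs))
            for start, frag in windows(sequence, size)]


# Made-up database entries, for illustration only.
TOY_DB = [
    ("QEVLIRL", "H", 0.3),
    ("QEVLIKL", "H", 0.4),
    ("QEVLIRL", "E", 1.8),
    ("AEVLIRL", "E", 1.6),
    ("QDVLIRL", "C", 1.7),
]

for row in scan("AGHGQEVLIRL", TOY_DB):
    print(row)
```

A window scores as a hidden β-strand candidate when it is helical in similar low-TC fragments but strand-like in similar high-TC fragments, which is the pattern the toy entries for QEVLIRL are built to show.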
PHD prediction of secondary structure
Amino acid |…AGHGQEVLIRLFTGHPETL…|
PHD        |…HHHHHHHHHHHHHH HHH…|
P(α)       |…7756899999999623469…|
P(β)       |…0000000000000000000…|
Sensitivity of the CSSP Method
Chameleon Sequences
ASVKQVS in β-sheet: 1AMP (Aminopeptidase)
ASVKQVS in α-helix: 1GKY (Guanylate kinase)

HβP prediction (0-10 scale) for the query local sequence ASVKQVS

PDB ID | Name             | Native secondary structure | Tertiary Contacts (TC) | P(α) | P(β) | P(Coil)
1AMP   | Aminopeptidase   | β strand                   | 1.3                    | 2    | 7    | 1
1GKY   | Guanylate kinase | α helix                    | 0.4                    | 8    | 1    | 1
Hidden β-propensity in Alzheimer’s Disease
KLVFF are key residues in amyloid fibril polymerization (Tjernberg et al., JBC 1996)
• Amyloidogenic wild type Aβ fragment
• Non-amyloidogenic mutant Aβ fragment
[Figure: Helix, Beta, and Coil propensity profiles for each fragment; propensity legend: Strong, Moderate, Weak, Very weak]
Yoon and Welsh, Protein Science (2004); Proteins (2005)
hIAPP sequence (Type II Diabetes)
-NFLVH-, -FLVHS-: Mazor et al., JMB (2002)
-NFGAIL-: Zanuy, Nussinov, et al., Biophysical Journal (2003)
• hIAPP sequence (4-34) associated with type II diabetes
NAC sequence (Parkinson’s disease)
VTNVGGAVVTGVTAVA
VTGVTAVAQKTV
GAVVTGVTAVA
Bodles et al., J Neurochem (2001)
• NAC sequence of α-synuclein associated with Parkinson’s disease
Beta propensity of acetylcholinesterase (AChE) and its homolog butyrylcholinesterase (BuChE)
Cottingham et al., Biochemistry (2002); ibid. (2003): AChE586-599 and BuChE573-586
• Amyloidogenic AChE586-599 fragment
• Nonamyloidogenic BuChE573-586 fragment
Amyloid Formation by G334V Mutant p53 Associated with Lung Cancer
Higashimoto et al., Biochemistry 45, 1608-1619 (2006)
Amyloidogenic Sequence Knowledge Base (ASKB)
• CSSP Algorithm that predicts “Hidden” β-Strand Propensity in Proteins & Polypeptides
• Searchable peptide database
http://askb.umdnj.edu/askb/welcome.html
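Estimating Free Energies
[Diagram: free-energy levels between Unfolded and Partially Folded states: α-helix (ΔG_α), β-strand (ΔG_β), random coil (ΔG_coil), and β-rich amyloid (ΔG_amyloid), with transitions ΔG_{α→β} and ΔG_{coil→β}]

\begin{align*}
\Delta G_{\text{hidden }\beta} &= \Delta G_{\alpha\to\beta} + \Delta G_{\text{coil}\to\beta} \\
\Delta G_{\text{amyloid}} &\sim \Delta G_{\text{hidden }\beta} \\
\Delta G_{\text{hidden }\beta} &= -RT \log K_{\beta} \\
&= -RT \left( \log K_{\alpha\to\beta} + \log K_{\text{coil}\to\beta} \right) \\
&= -RT \left( \log \frac{P_{\beta}}{P_{\alpha}} + \log \frac{P_{\beta}}{P_{\text{coil}}} \right) \\
&= -RT \log \left( \frac{P_{\beta}^{2}}{P_{\alpha} P_{\text{coil}}} \right)
\end{align*}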
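The final expression of the derivation, ΔG_hidden β = −RT log(Pβ² / (Pα Pcoil)), can be evaluated directly from the three propensities. The sketch below assumes a natural logarithm, kJ/mol units, and a temperature of 298.15 K, none of which are stated on the slides; the function name is hypothetical. Because the ratio Pβ² / (Pα Pcoil) is unchanged when all three propensities are scaled by the same factor, the 0-10 scores from the chameleon-sequence table can be used as they are.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol*K); unit choice is an assumption


def hidden_beta_free_energy(p_alpha, p_beta, p_coil, temperature=298.15):
    """Return -RT * ln(P_beta**2 / (P_alpha * P_coil)).

    Natural logarithm, kJ/mol, and the default temperature are assumptions
    of this sketch. Negative values favor the hidden beta-strand state.
    """
    if min(p_alpha, p_beta, p_coil) <= 0:
        raise ValueError("propensities must be positive")
    return -R * temperature * math.log(p_beta ** 2 / (p_alpha * p_coil))


# ASVKQVS scores (P(alpha), P(beta), P(coil)) from the chameleon-sequence table.
print(hidden_beta_free_energy(2, 7, 1))  # 1AMP context, native beta strand
print(hidden_beta_free_energy(8, 1, 1))  # 1GKY context, native alpha helix
```

With these inputs the 1AMP context gives a negative value (ratio 49/2) and the 1GKY context a positive one (ratio 1/8), matching the native β-strand and α-helix assignments in the table.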
Predicted vs. Experimental β-Sheet Structure of Prion Protein Peptide
• Decatur and coworkers employed FTIR spectroscopy to determine % β-sheet structure for peptides based on residues 109-122 of the Syrian hamster prion protein (H1) substituted at position 117.
• We plotted our calculated HβP metrics for the sequences H1, A117G, A117V, A117L, and A117I vs. Decatur’s experimental values.
• Strong correlation (R² = 0.96) suggests that calculated HβP profiles are excellent predictors of β-sheet content.
SA Petty, T Adalsteinsson, & SM Decatur, Biochemistry 44:4720-4726 (2005)
General Observations and Implications
• The CSSP algorithm successfully pinpoints amyloidogenic sequences in numerous examples where experimental data are available
• These sequences possess hidden β-strand propensity
– generally short sequences (4-7 residues) that serve as ‘core nucleation motifs’ to trigger amyloid fibril formation
– adopt α-helix in low contact regions (low TC) and β-strand in high contact regions (high TC)
• These sequences are conformationally ambivalent
– interconvertible between α-helix and β-strand
– highly sensitive to tertiary environment
– generally contain hydrophobic, aromatic residues (Phe, Trp, Tyr)
– consistent with recent findings: Rojas Quijano et al., Biochemistry (2006)
• Ability to form amyloid is a generic trait of all proteins
Thank You!
welshwj@umdnj.edu