DSSP and STRIDE
Chapter 19, Gu and Bourne, “Structural Bioinformatics”
Topic 9

Why Secondary Structure Assignments?
• A key step in protein classification -- class, fold, ...
• It has functional implications
• Useful in protein structure comparison and protein structure prediction
  -- Some protein structure alignment programs use SSEs (secondary structure elements)
  -- In protein threading, secondary structures are used to define “cores” (more later)
• It is an intuitive means of visualizing and understanding protein structures
[Figure: lysozyme]
CAF Andersen and B Rost (2002) “Secondary structure assignment”

How do we extract 2° structure info?

Secondary Structure Annotations from PDB File
Types of helices and sheets:
Helix (class number):
  1  Right-handed alpha (default)
  2  Right-handed omega
  3  Right-handed pi
  4  Right-handed gamma
  5  Right-handed 3₁₀
Sheets (sense of strand with respect to previous strand in the sheet):
   0  first strand
   1  parallel
  -1  anti-parallel
  ...
They are assigned by crystallographers, but how? We will come back to this later.

Secondary Structure Assignments
What are the two main structural properties when we talk about secondary structures?
1. Hydrogen bond patterns
2. Backbone geometry (main-chain dihedral angles)
“Knowledge-based protein secondary structure assignment”. Frishman D, Argos P. (1995). Proteins 23(4):566-579.

Hydrogen Bond Identification
1. A simple way: angle-distance hydrogen bond assignment. A hydrogen bond is assigned when:
   1. θ > 120° (hydrogen-bond angle) AND
   2. r_HO < 2.5 Å
   Baker, E. N. & Hubbard, R. E. (1984).

A better H-bond potential
[Figure: hydrogen-bond potential curve with repulsive and attractive regions]
Dahiyat BI, Gordon DB, and Mayo SL (1997). Automated design of the surface positions of protein helices. Protein Science 6:1333-1337.
But when used in practice, it isn’t without problems.
Allosteric response is both conserved and variable across three CheY orthologs. Mottonen JM, Jacobs DJ, Livesay DR (2010). Biophysical Journal 99:2245-2254.

Hydrogen Bond Identification-DSSP
2.
Hydrogen bond identification is based on Coulomb energy
DSSP: Definition (Dictionary) of Secondary Structure of Proteins

$$E = f\,q_1 q_2 \left( \frac{1}{r_{NO}} + \frac{1}{r_{HC}} - \frac{1}{r_{HO}} - \frac{1}{r_{NC}} \right)$$

Where: f = 332 kcal/mol (distances in Å), q_1 = 0.42, q_2 = 0.20
** Hydrogen issue: the energy needs the position of the amide hydrogen, which X-ray structures usually do not contain.
A hydrogen bond is identified if this energy E is less than -0.5 kcal/mol.
Kabsch W, Sander C (1983). Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 22(12):2577-2637.

Secondary Structure Assignments-DSSP
H: α-helix, G: 3₁₀ helix, I: π-helix, E: β-strand, B: bridge, T: turn, S: bend, C (space): coil
• H: a helix starts where two consecutive residues each have an i → i+4 hydrogen bond, and ends likewise with two consecutive i-4 → i hydrogen bonds.
  ** Similarly for G and I assignments (i+3 and i+5 hydrogen bonds, respectively).
  ** The helix definition does not assign the edge residues that carry the initial and final hydrogen bonds of the helix.
• T: a single helix-type hydrogen bond.
[Figure: H, G and I helices (Kabsch & Sander, 1983)]
[Figure: T, turn (Kabsch & Sander, 1983)]

Secondary Structure Assignments-DSSP
Beta Structure Definitions:
• Kabsch and Sander define all beta structure in terms of `bridges', which are either parallel or antiparallel.
• Where two or more bridges of the same type are consecutive, the structure is termed a ladder.
• Finally, overlapping ladders are amalgamated into sheets.
Additional complications arise because ladders may have discontinuities in them, and ladders may consist of just a single bridge.
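The bridge-to-ladder step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the DSSP implementation: the function name `group_ladders`, the `(i, j, kind)` tuple layout and the `"P"`/`"A"` labels are hypothetical, and bridges are assumed to be already detected and listed once each with i < j. Two bridges are treated as consecutive when (i, j) is followed by (i+1, j+1) for parallel or (i+1, j-1) for antiparallel pairs.

```python
def group_ladders(bridges):
    """Group bridges into ladders (maximal runs of consecutive same-type bridges).

    bridges: iterable of (i, j, kind) tuples, kind "P" (parallel) or
    "A" (antiparallel). Hypothetical input format chosen for this sketch.
    """
    remaining = set(bridges)
    ladders = []
    # Visiting bridges in order of increasing i guarantees that the first
    # unvisited bridge we meet is the start of its ladder.
    for bridge in sorted(remaining):
        if bridge not in remaining:
            continue  # already absorbed into an earlier ladder
        i, j, kind = bridge
        step = 1 if kind == "P" else -1  # partner index moves with or against i
        ladder = []
        while (i, j, kind) in remaining:
            remaining.discard((i, j, kind))
            ladder.append((i, j, kind))
            i, j = i + 1, j + step
        ladders.append(ladder)
    return ladders


# Illustrative bridges: an antiparallel run with a gap at residue 13,
# a two-bridge parallel ladder, and one isolated parallel bridge.
bridges = [
    (10, 30, "A"), (11, 29, "A"), (12, 28, "A"), (14, 26, "A"),
    (40, 60, "P"), (41, 61, "P"),
    (70, 90, "P"),
]
for ladder in group_ladders(bridges):
    label = "single bridge" if len(ladder) == 1 else "multi-bridge ladder"
    print(len(ladder), label, ladder)
```

The gap at residue 13 splits what looks like one strand pairing into a three-bridge ladder and a single-bridge ladder. This sketch makes no attempt to reconnect ladders across such discontinuities or to merge overlapping ladders into sheets.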
• Discontinuous ladders and single-bridge ladders make the coding of beta structure less straightforward than for helices.
(Kabsch & Sander, 1983)

Secondary Structure Assignments-DSSP
• E: sheet, which is composed of overlapping ladders
• B: ladders of length 1
• S: indicates a bend in the chain
(Kabsch & Sander, 1983)

DSSP Sample Output
• Note: lower-case letters denote SS-bridged CYS residues.
• N-H-->O etc.: hydrogen bonds. For example, -3,-1.4 means: if this residue is residue i, then the N-H of i is hydrogen-bonded to the C=O of i-3 with an electrostatic H-bond energy of -1.4 kcal/mol. There are two columns for each type of H-bond, to allow for bifurcated H-bonds.
  [Figure: bifurcated hydrogen bonds]
• TCO: cosine of the angle between the C=O of residue i and the C=O of residue i-1. α-helices: near +1; β-sheets: near -1. (Not used for structure definition.)
• KAPPA: virtual bond angle (bend angle) defined by the three Cα atoms of residues i-2, i, i+2. Used to define bend (structure code 'S').
• ALPHA: virtual torsion angle (dihedral angle) defined by the four Cα atoms of residues i-1, i, i+1, i+2. Used to define chirality (structure code '+' or '-').
• PHI, PSI: backbone dihedral angles.
• X, Y, Z: coordinates of Cα.

Kappa and Alpha-DSSP
[Figure: the κ angle]
Kabsch W, Sander C (1983). Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers.
1983 Dec;22(12):2577-637

SS Assignment of PDB Files using DSSP
[Figure: DSSP assignment for PDB entry 1ATP; legend: α, 3₁₀, β, turn]

An aside: ACC from DSSP
ACC = water-exposed surface (Å²). But what is the problem with doing this?
Relative solvent accessibility (range = 0-1), with reference = G-X-G (e.g. X = Lys):

$$R_{acc} = \frac{A_{res}^{protein}}{A_{res}^{ref}}$$

http://www.cmbi.ru.nl/hsspsoap/

STRIDE: secondary STRuctural IDEntification
STRIDE uses two criteria:
1. Hydrogen bond energy
2. Dihedral angle probabilities
(Frishman & Argos, 1995)

Hydrogen Bond Identification-STRIDE
Empirical hydrogen bond calculation:

$$E_{hb} = E_{distance} \times E_{directional} = E_r \times E_t \times E_p$$

Distance-dependent term (r is the N-O distance in Å):

$$E_r = \frac{C}{r^8} - \frac{D}{r^6}$$

Two angular-dependent terms:

$$E_p = \cos^2 t_p$$

$$E_t = \begin{cases} [0.9 + 0.1\sin(2t_i)]\cos(t_o), & 0^\circ < t_i \le 90^\circ \\ K_1\,[K_2 - \cos^2(t_i)]^3 \cos(t_o), & 90^\circ < t_i \le 110^\circ \\ 0, & t_i > 110^\circ \end{cases}$$

where $K_1 = 0.9 / \cos^6(110^\circ)$ and $K_2 = \cos^2(110^\circ)$.
(Frishman & Argos, 1995)

Dihedral Angle Probabilities-STRIDE
[Figure: torsion angle propensities for alpha-helix and beta-sheet (Frishman & Argos, 1995)]

Secondary Structure Assignments-STRIDE
Recognition of α-helices: similar to DSSP -- two consecutive hydrogen bonds between k and k+4, each satisfying

$$E_{hb}^{k,k+4}\left[1 + W_1 + W_2\,\frac{P_k + P_{k+4}}{2}\right] < T_1$$

For edge residues, the propensities P_k and P_{k+5} are compared with the thresholds T_2 and T_3.
Five parameters: W_1, W_2, T_1, T_2, T_3.

Recognition of β-sheets: similar to DSSP -- two consecutive hydrogen bonds, each satisfying

$$E_{hb1}\,(1 + W_1 + W_2\,CONF_{antiparallel}) < T_{antiparallel}, \qquad E_{hb2}\,(1 + W_1 + W_2\,CONF_{antiparallel}) < T_{antiparallel}$$

$$E_{hb1}\,(1 + W_1 + W_2\,CONF_{parallel}) < T_{parallel}, \qquad E_{hb2}\,(1 + W_1 + W_2\,CONF_{parallel}) < T_{parallel}$$

$$CONF = \frac{P_{Int1} + P_{Int2}}{2} \quad \text{or} \quad P_{Int}$$

Four parameters: W_1, W_2, T_antiparallel, T_parallel.
Parameters were optimized on a dataset with the authors’ assignments.
Knowledge-based protein secondary structure assignment. Frishman D, Argos P. (1995).
Proteins 23(4):566-79

STRIDE Sample Output
http://webclu.bio.wzw.tum.de/cgi-bin/stride/stridecgi.py

DSSP vs STRIDE
Comparison on 226 chains, judged against the authors’ three-state assignments (helix, extended, coil):
• STRIDE better: 58%
• DSSP better: 31%
• Same assignment: 11%
** <14% difference for individual proteins
(Frishman & Argos, 1995)

DSSP vs STRIDE
Although DSSP is the older method and continues to be the most commonly used, the original STRIDE paper reported that STRIDE gives a more satisfactory structural assignment in at least 70% of cases. In particular, STRIDE was observed to correct for the propensity of DSSP to assign shorter secondary structures than an expert crystallographer would, usually because of the minor local variations in structure that are most common near the termini of secondary structure elements.
Knowledge-based protein secondary structure assignment. Frishman D, Argos P. (1995). Proteins 23(4):566-579.

When a sliding-window method is used to smooth variations in the assignment of single terminal residues, current implementations of STRIDE and DSSP are reported to agree in up to 95.4% of cases.
Protein secondary structure assignment revisited: a detailed analysis of different assignment methods. Martin J, et al. (2005). BMC Structural Biology 5:17.

Both STRIDE and DSSP, among other common secondary structure assignment methods, are believed to underpredict π-helices.
Occurrence, conformational features and amino acid propensities for the pi-helix. Fodje MN, Al-Karadaghi S (2002). Protein Engineering 15(5):353-358.

Comparison of Methods for Secondary Structure Assignment
Other programs:
• DEFINE: uses distance criteria between Cα atoms that vary slightly for each secondary structure type; allows modifications for curvature
DSSP is widely used and is a generally accepted method.
Fourrier et al.
BMC Bioinformatics 2004 5:58

Secondary Structure Annotations from PDB File
Types of helices and sheets:
Helix (class number):
  1  Right-handed alpha (default)
  2  Right-handed omega
  3  Right-handed pi
  4  Right-handed gamma
  5  Right-handed 3₁₀
  ...
Sheets (sense of strand with respect to previous strand in the sheet):
   0  first strand
   1  parallel
  -1  anti-parallel
Crystallographers’ assignments:
-- angle-distance simple hydrogen bonding pattern
-- more complex distance and geometric
-- hydrogen pattern + mainchain dihedral angles
-- mainchain dihedral angles only
-- DSSP algorithm
-- a combination of several methods
-- visual inspection
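The helix classes and strand senses listed above can be read straight from the HELIX and SHEET records of a PDB file. The following Python sketch is a minimal illustration, not a full parser. The function name `parse_ss_records` and the dictionary layout are hypothetical. The column positions (helix class and strand sense both in columns 39-40) follow the fixed-column PDB format as I understand it, and are worth checking against the format description for your files. Treating a blank class field as class 1 is an assumption based on alpha being marked as the default.

```python
# Helix classes and strand senses as listed in the slides.
HELIX_CLASS = {
    1: "right-handed alpha",
    2: "right-handed omega",
    3: "right-handed pi",
    4: "right-handed gamma",
    5: "right-handed 3-10",
}
STRAND_SENSE = {0: "first strand", 1: "parallel", -1: "anti-parallel"}


def parse_ss_records(lines):
    """Collect HELIX and SHEET annotations from PDB-format lines.

    Assumes fixed-column records; class/sense sit in columns 39-40,
    i.e. Python slice [38:40].
    """
    helices, strands = [], []
    for line in lines:
        if line.startswith("HELIX "):
            field = line[38:40].strip()
            # Assumption: a blank class field means the default (class 1).
            cls = int(field) if field else 1
            helices.append({
                "chain": line[19],
                "start": int(line[21:25]),
                "end": int(line[33:37]),
                "type": HELIX_CLASS.get(cls, f"class {cls}"),
            })
        elif line.startswith("SHEET "):
            sense = int(line[38:40])
            strands.append({
                "chain": line[21],
                "start": int(line[22:26]),
                "end": int(line[33:37]),
                "sense": STRAND_SENSE.get(sense, f"sense {sense}"),
            })
    return helices, strands


# Illustrative records laid out in the fixed columns.
example = [
    "HELIX    1  HA GLY A   86  GLY A   94  1",
    "SHEET    1   A 5 THR A 107  ARG A 110  0",
    "SHEET    2   A 5 ILE A  96  THR A  99 -1",
]
helices, strands = parse_ss_records(example)
for h in helices:
    print("helix ", h["chain"], h["start"], h["end"], h["type"])
for s in strands:
    print("strand", s["chain"], s["start"], s["end"], s["sense"])
```

These records only report what the depositors chose to annotate. They do not say which of the methods in the list above produced the assignment, which is one reason to recompute assignments with DSSP or STRIDE when a uniform definition is needed.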