Intrinsically Disordered Proteins: from lack of structure to pleiotropy of functions
Lilia Iakoucheva
University of California, San Diego

[Figure: Ordered Proteins vs. Disordered Proteins. Uversky and Dunker, 2012, Anal Chem]

Outline
• Characterization and properties of IDPs
• Functional repertoire of IDPs
• Post-translational modifications and disorder
• Importance for molecular recognition
• Disorder and diseases

Structure is required for function
• 1894: “Lock-and-key” (Emil Fischer)
• 1950: “Configurational adaptability” (Fred Karush)
• 1958: “Induced fit” (Daniel Koshland)
• 1965: “Conformational selection” (Monod-Wyman-Changeux)

Protein structure-function paradigm
Amino Acid Sequence → 3D Structure → Function

Examples of disordered proteins
Some proteins/regions could function without being folded…
• Tail of histone H5 (Aviles et al, Eur. J. Biochem. 1978) … and later tails of other histones
• 95-residue long disordered segment of calcineurin (Kissinger et al, Nature, 1995)
• Cyclin-dependent kinase inhibitor p21Waf1/Cip1/Sdi1 (Kriwacki et al, PNAS, 1996)

Re-assessing structure-function paradigm
Amino Acid Sequence → 3D Structure (Order) → Function
Amino Acid Sequence → Disorder → Function

What is disorder?
Protein regions (or entire proteins) lacking stable secondary and tertiary structure and existing as an ensemble of conformations with dynamically changing Ramachandran angles

Disorder is experimentally detected by
• X-ray crystallography
• NMR spectroscopy
• Circular Dichroism (CD)
• Limited proteolysis (LP)
• Hydrodynamic methods
Bracken et al, Curr Opin Struct Biol. 
2004, 570; Receveur-Bréchot et al, Proteins, 2006, 24

Compositional bias
DisProt – database of disordered proteins
[Figure: (DisProt − Order)/Order compositional difference per residue, DisProt 3.4 (2006) and DisProt 4.9 (2009); y-axis from −0.8 to 0.6; x-axis residues: C W Y I F V L H T N A G D M K R S Q P E]
↓ Aromatic, hydrophobic: order-promoting residues
↑ Charged, hydrophilic: disorder-promoting residues
Dunker et al, 2001, JMGM; Radivojac et al, 2007, Biophys J

Charge-hydrophobicity bias
• Disordered: ↑ Net charge, ↓ Hydrophobicity
• Ordered: ↓ Net charge, ↑ Hydrophobicity
Uversky et al, 2000, Proteins 41:415-427

Disorder prediction
Amino acid sequence codes for protein structure
Does amino acid sequence code for the lack of structure?
Keith Dunker group – first Predictor Of Naturally Disordered Regions (PONDR)
Nature, 2011

Protein Disorder Predictors
The PONDR-FIT meta-predictor combines several methods. Use it and other predictors here.
PONDR-FIT™: Xue, B., R. L. Dunbrack, R. W. Williams, A. K. Dunker, and V. N. Uversky (2010) "PONDR-Fit: A meta-predictor of intrinsically disordered amino acids," Biochim. Biophys. Acta (in press) doi:10.1016/j.bbapap.2010.01.011
DisEMBL™: Linding R, Jensen LJ, Diella F, Bork P, Gibson TJ, Russell RB. "Protein disorder prediction: implications for structural proteomics." Structure. 2003;11(11):1453-9, PMID: 14604535
DISOPRED2: Ward JJ, Sodhi JS, McGuffin LJ, Buxton BF, Jones DT. "Prediction and functional analysis of native disorder in proteins from the three kingdoms of life." J Mol Biol. 2004;337(3):635-45, PMID: 15019783
DRIPPRED: MacCallum B. "Order/Disorder Prediction With Self Organising Maps." CASP 6 meeting, Online paper
DISpro: Cheng J, Sweredoski M, Baldi P. "Accurate Prediction of Protein Disordered Regions by Mining Protein Structure Data" Data Mining and Knowledge Discovery. 2005; 11(3):213-222, Online Paper
FoldIndex©: Prilusky J, Felder CE, Zeev-Ben-Mordehai T, Rydberg EH, Man O, Beckmann JS, Silman I, Sussman JL. "FoldIndex: a simple tool to predict whether a given protein sequence is intrinsically unfolded." Bioinformatics. 
2005;21(16):3435-8, PMID: 15955783
GlobPlot 2: Linding R, Russell RB, Neduva V, Gibson TJ. "GlobPlot: Exploring protein sequences for globularity and disorder." Nucleic Acids Res. 2003;31(13):3701-8, PMID: 12824398
IUPred: Dosztanyi Z, Csizmok V, Tompa P, Simon I. "IUPred: web server for the prediction of intrinsically unstructured regions of proteins based on estimated energy content." Bioinformatics. 2005;21(16):3433-4, PMID: 15955779
PONDR®: Romero P, Obradovic Z, Li X, Garner EC, Brown CJ, Dunker AK. "Sequence complexity of disordered protein." Proteins. 2001;42(1):38-48, PMID: 11093259
PreLink: Coeytaux K, Poupon A. "Prediction of unfolded segments in a protein sequence based on amino acid composition." Bioinformatics. 2005;21(9):1891-900, PMID: 15657106
RONN: Yang ZR, Thomson R, McNeil P, Esnouf RM. "RONN: the bio-basis function neural network technique applied to the detection of natively disordered regions in proteins."
http://www.disprot.org/predictors.php

Eukaryotic proteomes are more disordered
Dunker et al, Gen Inf, 2000
IUPred; Pancsa et al, PLoS One, 2012
“This large jump in putatively disordered proteins in multicelled, rather than single-celled, organisms is both remarkable and unexpected”
194 eukaryotes, 69 bacteria, 18 archaea

Disorder and Functions
Function | Description | Examples
Protein modification | Phosphorylation, acetylation, glycosylation, methylation, ubiquitination, fatty acylation | histones, 4E-BP, CFTR, Bcl-2, neuromodulin, HMG-I(Y), p53
Molecular recognition | Protein-DNA, protein-RNA, protein-protein, protein-ligand interactions | p53, max, fos, jun, myc, α-synuclein, CDK inhibitors p21, p57, p27, TF
Macromolecular assembly | Phages, viruses, bacterial flagellum, ribosome, spliceosome, nuclear pore | flagellin, SR proteins, ribosomal prot, Nups
Entropic chains | Flexible linkers, entropic springs, bristles | fd g3p, RPA, titin, neurofilament H
Dunker et al, 2002, Biochemistry

Protein modifications - phosphorylation
• Reversible PTM of phosphate transfer from 
ATP to S, T or Y
• ~ ⅓ of eukaryotic proteins are phosphorylated
• Disordered regions often carry phosphorylation sites: charged, exposed, hydrophilic, flexible

Phos-sites are enriched in IDRs
DisPhos http://www.dabi.temple.edu/disphos/

KINASES & TARGETS
• More kinases that target IDPs
• More kinase targets are IDPs
S – structured, M – moderately structured, U – unstructured
Iakoucheva et al, NAR, 2004
Gsponer et al, Science, 2008

Ub substrates are disordered
• β-catenin peptide: 15 out of 26 aa are disordered (Wu et al, Mol Cell, 2003)
• p27 peptide: 14 out of 24 aa are disordered (Hao et al, Mol Cell, 2005)
• pSic1 protein: Sic1 is disordered even in the complex with Cdc4 (Mittag et al, Structure, 2010)

Molecular recognition
Disordered regions are commonly used for binding to multiple partners
• C-terminus of p53 (Oldfield et al, BMC Genomics, 2008)
• NCBD domain of CBP/p300 (Wright and Dyson, Curr Opin Struct Biol, 2009)

Mechanisms of binding for IDPs
How do disordered proteins bind to their targets?
• Induced folding: first binding, then folding
• Conformational selection: first folding, then binding
• Coupled/synergistic: simultaneous folding and binding, or even binding without folding (Sic1)

Molecular Recognition Features (MoRFs)
[Figure: p53, Antigen CD2. Dunker, Structure, 2007]
MoRFs - short disorder-to-order transition binding regions
MoRFpred http://biomine.ece.ualberta.ca/MoRFpred/

Summary
• Proteins can carry intrinsically disordered regions
• These regions can be predicted from sequence
• Eukaryotic proteins are more disordered
• IDRs perform important functional roles: post-translational modifications, molecular recognition etc
• Disordered regions can undergo disorder-to-order transition and form MoRFs

Disorder and disease
[Figure: protein sets compared: Cancer, Signaling, Swiss-Prot, PDB]

Disorder and disease
Individual examples of IDPs/IDRs involved in human diseases: p53 (cancer), BRCA1 (cancer), α-synuclein (PD, AD, dementia, Down syndrome), amyloid β (AD), tau (AD), prion (TSEs), amylin (Type II diabetes), hirudin and thrombin (CVD), HPV 
(cancer) etc

Disease-associated mutations
Disease mutations impact protein
Structure:
- Folding
- Oligomerization
- Stability
…
Function:
- Post-translational modifications
- Binding to partners
- Intracellular localization
- Activity
…

Disease-associated mutations
• Many predictors of the functional impact of SNPs are available (SIFT, PolyPhen, SNPs3D etc)
• Majority rely on known protein 3D structure and evolutionary conservation
• Do disease mutations occur in the regions of disorder?

Disease mutations and disorder
Disease mutations are enriched in ordered regions
[Figure: mutations, % in IDR vs. OR for DM, Poly, and NES; y-axis 0 to 100; ***]
Disease mutations cause disorder-to-order transition
[Figure: mutations, % with D->O vs. O->D transitions for DM, Poly, and NES; y-axis 0 to 25; ***; labels OR, IDR]
Vacic et al, PLoS CB, 2012

Disease mutations, sec structure and MoRFs
• Transitions from helix/strand to loop and vice versa are enriched in disease
• DMs cause loss of MoRFs

D→O and O→D
D→O
Substitution | D→O disease mutations, %
R→W | 13.1
R→C | 10.3
R→H | 7.6
E→K | 6.7
R→Q | 6.3
Combined | 44%
O→D
Substitution | O→D disease mutations, %
L→P | 11.9
C→R | 6.6
G→R | 6.1
W→R | 4.1
F→S | 3.6
Combined | 32.2%

Hypothetical mechanism? 
Codons for Arginine | CGG | CGT | CGC | CGA | AGA | AGG
CpG methylation | TGG | TGT | TGC | TGA | AGA | AGG
Substitution | R→W | R→C | R→C | R→Stop | N/A | N/A
R→W and R→C are among the most frequent mutations in the disease dataset

AMD simulations of D->O mutation
Disorder predictions for p63 DBD
[Figure legend: Red – more heavily sampled by mutant; Black – by WT]
AMD simulations for p63 DBD

Disease Models
Disorder-centric view of disease mutations complements structure-centric view

Acknowledgements
Rockefeller University, Indiana University, Columbia University, UCSD
Jurg Ott, Chad Haynes, Fei Ji, Vladimir Vacic, Keith Dunker, Predrag Radivojac, Vladimir Uversky, Phineus Markwick, Andy McCammon
Funding:

Disordered Proteins Database DisProt http://www.DisProt.org
List of Disorder Predictors http://www.disprot.org/predictors.php
Phos Sites Predictor DisPhos http://www.dabi.temple.edu/disphos/
Ub Sites Predictor UbPred http://www.ubpred.org/
MoRF predictor http://biomine.ece.ualberta.ca/MoRFpred/
lilyak@ucsd.edu – Lilia Iakoucheva, http://psychiatry.ucsd.edu/faculty/lIakoucheva.html

Advantages of being disordered
• Low-affinity/high-specificity binding
• Broad binding diversity
• Ability to form large interaction surfaces
• Greater capture radius (“fly-casting” mechanism)
• Facilitate alternative splicing
• Facilitate post-translational modifications

Current predictors and IDR mutations
PolyPhen (and SIFT) under-predict disease relevance of IDR mutations

Disorder in interaction networks
Are network hubs disordered?
[Figure: Yeast interactome; series: hubs, ends, order; y-axis: proteins, % (0 to 80); x-axis: length of predicted disordered region, aa (>=30 to >=100). Haynes et al, 2006, PLoS CB]

Ordered hubs – disordered partners
• 14-3-3 proteins – >200 binding targets
• 14-3-3 TARGETS predicted as disordered (Bustos et al, Proteins, 2006)
• All targets bind to the same region of 14-3-3
• Differences in 14-3-3 side-chain conformations in different structures
• Peptides are highly hydrated in the bound state (i.e. 
likely disordered in the unbound state) Oldfield et al, BMC Genomics, 2008
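
Worked example for the "Hypothetical mechanism?" slide: the arginine-codon table shown there can be reproduced with a few lines of Python. This is only an illustrative sketch, not code from the talk. It assumes the relevant change is a C→T transition at the first position of an arginine codon that starts with CG (the usual outcome when a methylated cytosine in a CpG is deaminated). The names CODON_TABLE, ARG_CODONS, cpg_transition, and arginine_outcomes are hypothetical, and the codon table holds only the standard-genetic-code entries this example needs.

```python
# Minimal slice of the standard genetic code: the six arginine codons
# plus the four codons that a leading C->T change can produce.
CODON_TABLE = {
    "CGG": "R", "CGT": "R", "CGC": "R", "CGA": "R", "AGA": "R", "AGG": "R",
    "TGG": "W", "TGT": "C", "TGC": "C", "TGA": "Stop",
}

# Same order as on the slide.
ARG_CODONS = ["CGG", "CGT", "CGC", "CGA", "AGA", "AGG"]


def cpg_transition(codon):
    """Return the codon after C->T at a leading CpG, or None if the codon
    does not start with CG (assumption: only this position is considered)."""
    if codon.startswith("CG"):
        return "T" + codon[1:]
    return None


def arginine_outcomes():
    """List (original codon, mutated codon or None, effect) per Arg codon."""
    rows = []
    for codon in ARG_CODONS:
        mutated = cpg_transition(codon)
        if mutated is None:
            # AGA and AGG contain no CpG, so this mechanism does not apply.
            rows.append((codon, None, "N/A"))
        else:
            rows.append((codon, mutated, "R->" + CODON_TABLE[mutated]))
    return rows


if __name__ == "__main__":
    for codon, mutated, effect in arginine_outcomes():
        print(codon, mutated or "-", effect)
```

Four of the six arginine codons begin with a CpG, and under this assumption they give R->W, R->C, R->C, and R->Stop, matching the slide. AGA and AGG are unaffected.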