Hydrophobic Residue Patterning in β-Strands and Implications for β-Sheet Nucleation
Brent Wathen
Dept. of Biochemistry, Queen’s University

Outline
• Part I: Introduction
  • Proteins
  • Protein Folding
• Part II: Protein Structure Prediction
  • Goals, Challenges
  • Techniques
  • State of the Art
• Part III: Residue Patterning on β-Strands
  • β-Sheet Nucleation
  • Hydrophobic/Hydrophilic Patterning

Part I: Introduction

Proteins: Some Basics
• What is a protein?
  • A linear sequence of amino acids...
• What is an amino acid?
• How many types of amino acids?
  • 20 naturally occurring amino acids
  • They differ only in their SIDE CHAINS (examples: isoleucine, arginine, tyrosine)
• Amino acids connect via PEPTIDE BONDS
• The backbone can swivel: DIHEDRAL ANGLES
  • 2 per amino acid
  • Proteins can be hundreds of amino acids in length!
  • Lots of freedom of movement

Protein Functions
• What do proteins do?
  • Enzymes
  • Cellular signaling
  • Antibodies
  • WHAT DON’T THEY DO!
• The name comes from the Greek word proteios: PRIMARY
• Fundamental to virtually all cellular processes
• How do proteins do so much?
  • Proteins FOLD spontaneously
  • They assume a characteristic 3D SHAPE
  • Shape depends on the particular amino acid sequence
  • Shape gives SPECIFIC function

Protein Structure
• STRUCTURE-FUNCTION relationship
• Determining structure is often critical to understanding what a protein does
• 2 main techniques
  • X-ray crystallography
  • NMR
  • 0.5Å RMSD accuracy
• Both are very challenging
  • Months to years of work
  • Many proteins don’t yield to these methods
• Levels of organization
  • Primary sequence
  • Secondary structure (modular building blocks)
    • α-helices
    • β-sheets
  • Tertiary structure
  • Quaternary structure
• Hydrophobic/hydrophilic organization
  • Hydrophobics ON INSIDE
  • Hydrophobic cores

Protein Folding
• What we DO know...
  • Protein folding is FAST!!
    • Typically a couple of seconds
  • Folding is CONSISTENT!!
  • Involves weak, non-covalent forces
    • Hydrogen bonding, van der Waals, salt bridges
  • Mostly 2-STATE systems
    • VERY FEW INTERMEDIATES
    • Makes it hard to study: BLACK BOX
• What we DON’T know...
  • Mechanism...?
  • Forces...?
  • Relative contributions?
  • Hydrophobic force thought to be critical

Intro Summary
• Proteins are central to all living things
  • Critical to all biological studies
• The folding process is largely unknown
  • Sequence-to-structure mapping
  • Structure-function relationship
• Determining protein structure experimentally is HARD WORK

Part II: Structure Prediction

The Prediction Problem
Can we predict the final 3D protein structure knowing only its amino acid sequence?
• Studied for 4 decades
• “Holy Grail” of the biological sciences
• Primary motivation for bioinformatics
• Based on this 1-to-1 mapping of sequence to structure
• Still very much an OPEN PROBLEM

PSP: Goals
• Accurate 3D structures. But not there yet.
• Good “guesses”
  • Working models for researchers
• Understand the FOLDING PROCESS
  • Get into the black box
• Only hope for some proteins
  • 25% won’t crystallize, too big for NMR
• Best hope for novel protein engineering
  • Drug design, etc.

PSP: Major Hurdles
• Energetics
  • We don’t know all the forces involved in detail
  • Too computationally expensive BY FAR!
• Conformational search is impossibly large
  • 100 a.a. protein, 2 moving dihedrals per residue, 2 possible positions for each dihedral: 2^200 conformations!
  • Levinthal’s Paradox
    • Longer than the age of the universe to search
    • Yet proteins fold in a couple of seconds??
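
The conformational-search arithmetic on the Major Hurdles slide can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the residue count, dihedrals per residue and positions per dihedral are the slide's figures, while the sampling rate (`samples_per_second`) and the approximate age of the universe are assumptions added here for comparison and do not come from the talk.

```python
def levinthal_estimate(n_residues=100, dihedrals_per_residue=2,
                       states_per_dihedral=2, samples_per_second=1e13):
    """Return (number of conformations, years needed to enumerate them all)."""
    # Slide figures: 100 residues x 2 dihedrals, 2 positions each -> 2**200.
    conformations = states_per_dihedral ** (n_residues * dihedrals_per_residue)
    # ASSUMPTION (not from the slides): 10^13 conformations sampled per second.
    seconds = conformations / samples_per_second
    years = seconds / (365.25 * 24 * 3600)
    return conformations, years


# Approximate value, used only as a yardstick for the comparison below.
AGE_OF_UNIVERSE_YEARS = 1.38e10

conformations, years = levinthal_estimate()
print(f"{conformations:.2e} conformations")
print(f"{years:.2e} years to enumerate all of them")
print(f"age of the universe: about {AGE_OF_UNIVERSE_YEARS:.1e} years")
```

Even with this generous assumed sampling rate, exhaustive enumeration is out of reach, which is the point of Levinthal's Paradox: folding cannot be a random search.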
• Multiple-minima problem

Tertiary Structure Prediction
• Major techniques
  • Template Modeling
    • Homology modeling
    • Threading
  • Template-Free Modeling
    • ab initio methods
      • Physics-based
      • Knowledge-based

Template Modeling
• Homology modeling
  • Works with HOMOLOGS
  • ~50% of new sequences have HOMOLOGS
  • BLAST or PSI-BLAST search to find good models
  • Refine:
    • Molecular dynamics
    • Energy minimization

Template-Free Modeling
• Modeling based primarily on sequence
  • May also use secondary structure prediction, analysis of residue contacts in the PDB, etc.
• Advantages:
  • Can give insights into FOLDING MECHANISMS
  • Adaptable: prions, membrane, natively unfolded
  • Doesn’t require homologs
  • Only way to model NEW FOLDS
  • Useful for de novo protein design
• Disadvantages: HARD!
• Physics-based
  • Use ONLY the PRIMARY SEQUENCE
  • Try to model ALL FORCES
  • EXTREMELY EXPENSIVE computationally
• Knowledge-based
  • Include other knowledge: secondary structure prediction (SSP), PDB analysis
  • Statistical energy potentials
  • Not so interested in the folding process
  • “Hot” area of research
• All methods SIMPLIFY the problem
  • Reduced atomic representations
    • C-α’s only; C-α + C-β; etc.
  • Simplified force fields
    • Only van der Waals; only 2-body interactions
  • Reduced conformational searches
    • Lattice models
    • Dihedral angle restrictions
• Basic approach:
  1. Begin with an unfolded conformation
  2. Make a small conformational change
  3. Measure the energy of the new conformation; accept or reject it based on a heuristic: simulated annealing (SA), Monte Carlo (MC), etc.
  4. Repeat until an ending criterion is reached
• Underlying assumption: the correct conformation has the LOWEST ENERGY
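
The basic template-free search loop (start unfolded, perturb, score, accept or reject, repeat) can be sketched as a Metropolis Monte Carlo search with a simulated-annealing temperature schedule. Everything concrete below is an illustrative assumption rather than any group's actual method: the conformation is just a list of dihedral angles, `toy_energy` is a made-up function with a single minimum, and the step size, temperatures and step count are arbitrary.

```python
import math
import random

TARGET_ANGLE = -120.0  # hypothetical "native" value for every dihedral (degrees)


def toy_energy(angles):
    """Made-up energy: zero when every angle equals TARGET_ANGLE, higher otherwise."""
    return sum(1.0 - math.cos(math.radians(a - TARGET_ANGLE)) for a in angles)


def anneal(n_angles=20, n_steps=20000, t_start=2.0, t_end=0.01, seed=0):
    """Simulated annealing over dihedral angles using the Metropolis criterion."""
    rng = random.Random(seed)
    # Step 1: begin with an "unfolded" (random) conformation.
    angles = [rng.uniform(-180.0, 180.0) for _ in range(n_angles)]
    energy = toy_energy(angles)
    for step in range(n_steps):
        # Geometric cooling from t_start down towards t_end.
        temperature = t_start * (t_end / t_start) ** (step / n_steps)
        # Step 2: make a small conformational change to one dihedral.
        i = rng.randrange(n_angles)
        old_angle = angles[i]
        angles[i] = old_angle + rng.gauss(0.0, 15.0)
        # Step 3: measure the energy of the new conformation.
        new_energy = toy_energy(angles)
        delta = new_energy - energy
        # Metropolis rule: always accept downhill moves, sometimes accept uphill.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            energy = new_energy
        else:
            angles[i] = old_angle  # reject: restore the previous conformation
        # Step 4: repeat until the step budget (the ending criterion) runs out.
    return angles, energy


final_angles, final_energy = anneal()
print(f"final toy energy: {final_energy:.3f}")
```

Occasionally accepting uphill moves is what lets the search escape local minima, which is the multiple-minima problem listed among the hurdles. Real methods differ mainly in the representation, the energy function and the move set.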
ROSETTA
• Pioneered by the Baker group (U. of Washington)
• Fragment-based method
• Guiding assumption:
  • Fragment conformations in the PDB approximate their structural preferences
• Pre-build a fragment library
  • 3-mers and 9-mers
  • 200 structural possibilities for each
  • Alleviates the need to do local energy calculations
  • Lowest-energy conformations should already be in the library
• Build conformations from the library
  • Randomly assign 3-mers and 9-mers along the chain
  • During the conformational search, reassign a 3-mer or a 9-mer to a new conformation at random
• Score using an energy function
  • Adaptive: coarse-grained at first, detailed at the end
• Accept changes based on the Monte Carlo method

Diverse Efforts
• Data mining
• Pattern classification
  • Neural networks, HMMs, nearest neighbour, etc.
• Packing algorithms
• Search optimization
  • Traveling Salesman Problem
• Contact maps, contact order
• Constraint logic, etc.
• Combinations of the above!

State of the Art
• CASP competition
  • Critical Assessment of Structure Prediction
  • Blind competition every 2 years
  • CASP6 in 2004; CASP7 just completed
  • ~75 proteins whose structures have not yet been published
    • Examples with easy homologs
    • Distant homologs available
    • De novo structures: no homologs known
• Template Modeling
  • Figure: CASP6 target 266 (green) and best model (blue). Moult, J. (2005) Curr. Opin. Struct. Biol. 15:285-289.
  • Alignment is still not easy, and often requires multiple templates
  • Accurate core models (within 2-3Å RMSD)
  • Still not good at modeling regions missing from the template
  • Side-chain modeling is not very good
  • Molecular dynamics has not been able to improve models as hoped
• Template-Free Modeling
  • Figure: CASP6 target 201 and best model. Vincent, J.J. et al. (2005) Proteins 7:67-83.
  • Figure: CASP6 target 241 and 3 best models. Vincent, J.J. et al. (2005) Proteins 7:67-83.
• How good are current techniques?
  • CASP6 summary: “The disappointing results for [hard new fold] targets suggest that the prediction community as a whole has learned to copy well but has not really learned how proteins fold.” Vincent, J.J. et al. (2005) Proteins 7:67-83.

PSP Summary
• Many diverse, creative efforts
• Progress IS being made in finding final 3D structures
  • Less so with regard to understanding folding mechanisms
• NEEDED:
  • Marriage of creative ideas and increased resources

Part III: β-Strand Patterning

β-Sheet Basics
• Made up of β-strands
• Diverse:
  • Parallel/antiparallel
  • Edge/interior strands
  • Typically twisted
• Many forms
  • β-sandwiches, β-barrels, β-helices, β-propellers, etc.
  • 2D? 3D?
• Less studied than helices
• Examples: Internalin A, Narbonin, Polygalacturonase, Galactose Oxidase
• What do we know?
  • Residues:
    • V, I, F, Y, W, T, C, L
    • Found largely in protein cores
  • Amphipathic nature

Theory of β-Sheet Nucleation
• Hydrophobic Zipper (HZ)
  • Dill, K.A. et al. (1993) Proc. Natl. Acad. Sci. USA 90: 1942-1946.
  • Hydrophobic residues from different parts of the chain make initial contact
  • Correct alignment of backbones
  • Hydrogen bonding
  • Subsequent growth via “zipping up”
  • Once a hydrophobic “seed” is established, it can grow out in 2 directions

Thought Experiment...
• What would a beta seed look like?
  • Contains hydrophobics
    • On both strands
  • How many?
    • Will a single hydrophobic on each strand be sufficient?
    • Single unlikely:
      • 1 hydrophobic residue is NOT SPECIFIC ENOUGH
      • Too many possible combinations
    • At least 1 strand must have >1 hydrophobic
• What hydrophobic arrangement would lead to beta sheet nucleation?
  • i,i+1? No, not likely: amphipathic.
  • i,i+2? Most likely.
  • i,i+3? No... amphipathic.
  • i,i+4? Seems too far apart... Chain loop?

Hypothesis
Assuming:
• Secondary structures contain their nucleating residues
• Beta sheets nucleate by hydrophobics (HZ)
• i,i+2 hydrophobic pairings on beta strands are necessary for nucleation
Then: beta strands contain an increased frequency of i,i+2 hydrophobic residue pairings.

Technique
• Looking for statistically significant patterns
• For any particular pattern:
  1. Count how often it occurs in the database
  2. Randomly shuffle the residues in the sheets
  3. Re-count how often the pattern occurs
  4. Repeat the random shuffle and counting 1000 times
  5. Compare the initial count with the average random count, expressing the difference in units of the standard deviation σ of the random counts; a difference of more than 3.0σ is treated as statistically significant
• Patterns of interest:
  • Hydrophobic patterning (V L I F M)
  • Hydrophilic patterning (K R E D S T N Q)
• Positions: i,i+1; i,i+2; i,i+3; i,i+4
• Consider only strands of length >= 5 residues

Results: Hydrophilics
• i,i+1: strongly disfavoured, -20.5σ
• i,i+2: strongly favoured, 13.0σ
• i,i+3: strongly disfavoured, -6.1σ
• i,i+4: strongly favoured, 5.7σ
• Summary (chart: z-score by pattern, i,i+1 to i,i+4)
  • Demonstrates amphipathic separation
  • Suggests residues help guide tertiary formation
  • Moral support: the technique seems sound

Results: Hydrophobics
• i,i+1: strongly disfavoured, -16.8σ
• i,i+3: strongly disfavoured, -16.6σ
• i,i+2: barely favoured!, 3.5σ
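
The σ values quoted on the Results slides come from the shuffle test described on the Technique slide. A minimal sketch of that test follows, under stated assumptions: strands are plain one-letter-code strings, residues are shuffled within each strand (the slides say only that residues are shuffled "in sheets", so the exact shuffling scope is a guess), and the function names `count_pairs` and `shuffle_z_score` are hypothetical. The toy strands are invented for illustration and are not data from the talk.

```python
import random
import statistics

HYDROPHOBIC = set("VLIFM")      # hydrophobic class from the Technique slide
HYDROPHILIC = set("KREDSTNQ")   # hydrophilic class from the Technique slide


def count_pairs(strands, residue_class, offset, min_len=5):
    """Count positions i where residues i and i+offset both belong to residue_class."""
    total = 0
    for strand in strands:
        if len(strand) < min_len:
            continue  # the slides consider only strands of >= 5 residues
        for i in range(len(strand) - offset):
            if strand[i] in residue_class and strand[i + offset] in residue_class:
                total += 1
    return total


def shuffle_z_score(strands, residue_class, offset, n_shuffles=1000, seed=0):
    """Observed count minus mean shuffled count, in units of the shuffled std dev."""
    rng = random.Random(seed)
    observed = count_pairs(strands, residue_class, offset)
    shuffled_counts = []
    for _ in range(n_shuffles):
        shuffled = []
        for strand in strands:
            residues = list(strand)
            rng.shuffle(residues)  # ASSUMPTION: shuffle within each strand
            shuffled.append("".join(residues))
        shuffled_counts.append(count_pairs(shuffled, residue_class, offset))
    mean = statistics.mean(shuffled_counts)
    spread = statistics.pstdev(shuffled_counts)
    if spread == 0:
        return 0.0  # no variation between shuffles, so no meaningful score
    return (observed - mean) / spread


# Invented, perfectly alternating strands: i,i+2 pairs should score as favoured
# and i,i+1 pairs as disfavoured.
toy_strands = ["VKVKVKVK"] * 50
for offset in (1, 2, 3, 4):
    z = shuffle_z_score(toy_strands, HYDROPHOBIC, offset, n_shuffles=200)
    print(f"i,i+{offset}: {z:+.1f} sigma")
```

Shuffling within a strand keeps each strand's composition fixed, so a large score reflects residue ordering rather than how many hydrophobics the strand contains. Shuffling across the whole database would test a different null model.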
Results: Hydrophobics
• i,i+4: strongly disfavoured, -19.6σ
• Summary (chart: z-score by pattern, i,i+1 to i,i+4)
  • Clearly amphipathic: i,i+1 and i,i+3 disfavoured
  • NOT particularly favoured at i,i+2
  • Unexpectedly, i,i+4 strongly disfavoured
• Where are the hydrophobic pairings??
  • Not at i,i+1, i,i+3 or i,i+4
  • Barely at i,i+2
• Note:
  • Moderate i,i+2 pairing: no strong aggregation
  • Very low i,i+4 pairing: not dispersed! Isolated

Results: Localized Hydrophobic Pairings
• Examine localized hydrophobic pairings (NT = N-terminal end of the strand, CT = C-terminal end)...
  • i,i+2 @ NT: only slightly favoured, 2.5σ
  • i,i+2 @ NT+1: strongly favoured!!, 9.3σ
  • i,i+2 @ NT+2: indifferent, 0.8σ
  • i,i+2 @ CT: favoured!, 5.7σ
  • i,i+2 @ CT-1: only slightly favoured, 3.4σ
  • i,i+2 @ CT-2: only slightly favoured, 3.9σ
  • i,i+2 @ interior positions: actually disfavoured!!, -3.0σ
• Summary (chart: z-score by pattern location: NT, NT+1, NT+2, Central, CT-2, CT-1, CT, Avg)
  • Localized i,i+2 hydrophobic pairing at NT and CT
  • Disfavoured at interior positions
• Are these patterns sense-specific?
  • @ NT+1 (chart: z-score by strand type: Parallel, Antiparallel, Mixed, Edge)
    • Favoured for parallel, antiparallel
  • @ CT (chart: z-score by strand type: Parallel, Antiparallel, Mixed, Edge)
    • Favoured for antiparallel, mixed
    • NOT PARALLEL!

Conclusions
• Hydrophobic patterning suggests:
  • Hydrophobics are located on one side of beta sheets: AMPHIPATHIC
  • Hydrophobics are CLUSTERED
  • Hydrophobics aggregate at NT, CT
    • Parallel strands: @ NT only
    • Antiparallel strands: @ NT & CT
• Supports the HYDROPHOBIC ZIPPER theory of sheet nucleation

Implications
• How do beta sheets nucleate?
  • Parallel
    • Nucleate at NT
    • Growth is unidirectional: NT → CT
  • Antiparallel
    • Nucleate at edge
    • Growth is unidirectional

Future Work
1. Extend this work to 2D
   • Both intra- and inter-strand patterning
2. Consider more complex patterning
   • 3 residues on one strand? NT position? Specific residue combinations?
3. Consider patterning by beta-sheet type
   • Beta helices, barrels, sandwiches, etc.

Acknowledgements
• Dr. Jia
• Lab members
  • Dr. Qilu Ye
  • Dr. Vinay Singh
  • Dr. Susan Yates
  • Daniel Lee
  • Jimmy Zheng
  • Neilin Jaffer
  • Andrew Wong
  • Michael Suits
  • Laura van Staalduinen
  • Mark Currie
  • Kateryna Podzelinska
• NSERC