Structural Biology: What does 3D tell us?
Stephen J Everse
University of Vermont

Outline
• Determining a 3D structure
  – X-ray crystallography
• Structural elements
• Modeling a 3D structure

Protein Structures
• Primary: Amino acid sequence.
• Secondary: Alpha helices & Beta sheets, Loops.
• Tertiary: Arrangement of secondary elements in 3D space.
• Quaternary: Packing of several polypeptide chains.
Given an amino acid sequence, we are interested in its secondary structures, and how they are arranged in higher structures.

Secondary Structural Elements
• Alpha-helix
• Beta-strand
• Beta-turns

Viewing Structures
[Figure panels: Cα or CA; Ball-and-stick; CPK]
• It’s often as important to decide what to omit as it is to decide what to include
• What you omit depends on what you want to emphasize

Ribbon and Topology Diagrams
Representations of Secondary Structures
[Figure labels: N, C, α-helix, β-strand]

GRASP
Graphical Representation and Analysis of Structural Properties
Red = negative surface charge
Blue = positive surface charge

Consurf
• The ConSurf server enables the identification of functionally important regions on the surface of a protein or domain, of known three-dimensional (3D) structure, based on the phylogenetic relations between its close sequence homologues.
• The server uses a multiple sequence alignment (MSA) to build a phylogenetic tree consistent with the MSA, and calculates conservation scores with either an empirical Bayesian or a Maximum Likelihood method.
http://consurf.tau.ac.il/

Tools for Viewing Structures
• Jmol
  – http://jmol.sourceforge.net
• PyMOL
  – http://pymol.sourceforge.net
• Swiss PDB viewer
  – http://www.expasy.ch/spdbv
• Mage/KiNG
  – http://kinemage.biochem.duke.edu/software/mage.php
  – http://kinemage.biochem.duke.edu/software/king.php
• Rasmol
  – http://www.umass.edu/microbio/rasmol/

Where can you learn about protein structures?
• RCSB (PDB)
  – Lots of hyperlinks out
  – Educational info (proteins of the month)
• Proteopedia

RCSB
http://www.rcsb.org/

PDB – View of Biology

Proteopedia

How do we show 3-D?
• Stereo pairs
  – Rely on the way the brain processes left- and right-eye images
  – If we allow our eyes to go slightly walleyed or crossed, the image appears three-dimensional
• Dynamics: rotation of flat image
• Movies

Stereo pair: Release factor 2/3
Klaholz et al, Nature (2004) 427:862

Protein structures in the PDB
The last ~15 years have witnessed an explosion in the number of known protein structures. How do we make sense of all this information?
[Figure: N = 78,477; blue bars: yearly total; red bars: cumulative total]
Non-redundant ~ 44,706

Classification of Protein Structures
The explosion of protein structures has led to the development of hierarchical systems for comparing and classifying them.
Effective protein classification systems allow us to address several fundamental and important questions:
If two proteins have similar structures, are they related by common ancestry, or did they converge on a common theme from two different starting points?
How likely is it that two proteins with similar structures have the same function?
Put another way, if I have experimental knowledge of, or can somehow predict, a protein’s structure, I can fit it into known classification systems. How much do I then know about that protein? Do I know what other proteins it is homologous to? Do I know what its function is?

Definition of Domain
• “A polypeptide or part of a polypeptide chain that can independently fold into a stable tertiary structure...” from Introduction to Protein Structure, by Branden & Tooze
• “Compact units within the folding pattern of a single chain that look as if they should have independent stability.” from Introduction to Protein Architecture, by Lesk
• Thus, domains:
  – can be built from structural motifs;
  – are independently folding elements;
  – are functional units;
  – are separable by proteases.
Two domains of a bifunctional enzyme

Proteins Can Be Made From One or More Domains
• Proteins often have a modular organization
• A single polypeptide chain may be divisible into smaller independent units of tertiary structure called domains
• Domains are the fundamental units of structure classification
• Different domains in a protein are also often associated with different functions carried out by the protein, though some functions occur at the interface between domains

Domain organization of p53 tumor suppressor
• Residues 1–60: activation domain
• Residues 100–300: sequence-specific DNA-binding domain
• Residues 324–355: tetramerization domain
• Residues 363–393: non-specific DNA-binding domain

Rates of Change
• Not all proteins change at the same rate;
• Why?
• Functional pressures
  – Surface residues are observed to change most frequently;
  – Interior residues change less frequently;

Sequence → Structure → Function
Many sequences can give the same structure
Side chain pattern more important than sequence
When sequence identity is high (>50%), proteins are likely to have the same structure and function (Structural Genomics)
Cores conserved
Surfaces and loops more variable
*3-D shape more conserved than sequence*
*There are a limited number of structural frameworks*
W. Chazin © 2003

Degree of Evolutionary Conservation
[Figure: DNA seq → Protein seq → Structure → Function, running from less conserved (information poor) to more conserved (information rich). Example DNA seq: ACAGTTACAC CGGCTATGTA CTATACTTTG. Example protein seq: HDSFKLPVMS KFDWEMFKPC GKFLDSGKLG.]
S. Lovell © 2002

How is a 3D structure determined?
1. Experimental methods (Best approach):
   • X-ray crystallography - stable fold, good quality crystals.
   • NMR - stable fold, not suitable for large molecules.
2. In-silico methods (partial solutions based on similarity):
   • Sequence or profile alignment - uses similar sequences, limited use of 3D information.
   • Threading - needs 3D structure, combinatorial complexity.
   • Ab-initio structure prediction - not always successful.

What information does structure give you?
3-D view of macromolecules at near atomic resolution.
The result of a successful structural project is a “structure” or model of the macromolecule in the crystal.
You can assign:
- secondary structure elements
- position and conformation of side chains
- position of ligands, inhibitors, metals etc.
A model allows you:
- to understand biochemical and genetic data (i.e., structural basis of functional changes in mutant or modified macromolecule).
- to generate hypotheses regarding the roles of particular residues or domains
Sylvie Doublié © 2000

What did I just say????!!!
• A structure is a “MODEL”!!
• What does that mean?
  – It is someone’s interpretation of the primary data!!!

So what happens when we can’t get an NMR or X-ray structure?

2° & 3° Structure Prediction

Structure Prediction
• Threading
  – A protein fold recognition technique that involves incrementally replacing the sequence of a known protein structure with a query sequence of unknown structure.
• Why threading?
  – Secondary structure is more conserved than primary structure
  – Tertiary structure is more conserved than secondary structure

3D Threading Servers
Generate 3D models or coordinates of possible models based on input sequence
• PredictProtein-PHDacc
  – http://www.predictprotein.org
• PredAcc
  – http://mobyle.rpbs.univ-paris-diderot.fr/cgibin/portal.py?form=PredAcc
• Loopp (version 2)
  – http://cbsuapps.tc.cornell.edu/loopp.aspx
• Phyre
  – http://www.sbg.bio.ic.ac.uk/~phyre/
• SwissModel
  – http://swissmodel.expasy.org/
• All require email addresses since the process may take hours to complete

Ab Initio Folding
• Two Central Problems
  – Sampling conformational space (10^100)
  – The energy minimum problem
• The Sampling Problem (Solutions)
  – Lattice models, off-lattice models, simplified chain methods, parallelism
• The Energy Problem (Solutions)
  – Threading energies, packing assessment, topology assessment

Lattice Folding

Critical Assessment of protein Structure Prediction (CASP)
http://predictioncenter.org/

http://folding.stanford.edu/

For the gamers
out there… http://fold.it/portal/

Print & Online Resources
Crystallography Made Crystal Clear, by Gale Rhodes
http://www.usm.maine.edu/~rhodes/CMCC/index.html

http://ruppweb.dyndns.org/Xray/101index.html
Online tutorial with interactive applets and quizzes.

http://www.ysbl.york.ac.uk/~cowtan/fourier/fourier.html
Nice pictures demonstrating Fourier transforms

http://ucxray.berkeley.edu/~jamesh/movies/
Cool movies demonstrating key points about diffraction, resolution, data quality, and refinement.

http://www-structmed.cimr.cam.ac.uk/course.html
Notes from a macromolecular crystallography course taught in Cambridge

ENGAGE….
• At the PDB, search for something you are interested in … transferrin, transporter, etc.
• Take the PDB ID and use it in the command line: fetch PDBID
• Now you can display it any way you want… see any of this session’s links for details.
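
For anyone who prefers a script to a viewer, the fetch step above can be roughed out in Python. This is only a sketch under stated assumptions: the download URL pattern on files.rcsb.org is assumed rather than taken from these slides, the names rcsb_url, fetch_pdb_text, atom_line and ca_trace are hypothetical, and the sample coordinates are invented. It pulls out the Cα atoms, the same "Cα or CA" trace shown under Viewing Structures.

```python
from urllib.request import urlopen


def rcsb_url(pdb_id):
    """Build a download URL for a PDB ID (the URL pattern is an assumption)."""
    return f"https://files.rcsb.org/download/{pdb_id.upper()}.pdb"


def fetch_pdb_text(pdb_id):
    """Download the entry as PDB-format text; needs network access."""
    with urlopen(rcsb_url(pdb_id)) as response:
        return response.read().decode("utf-8")


def atom_line(serial, name, res, chain, resseq, x, y, z):
    """Hypothetical helper: format one ATOM record in PDB fixed columns."""
    return (f"ATOM  {serial:>5}  {name:<3} {res:>3} {chain}{resseq:>4}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


def ca_trace(pdb_text):
    """Return (chain, residue number, residue name, x, y, z) per CA atom."""
    trace = []
    for line in pdb_text.splitlines():
        # Columns 13-16 hold the atom name; ATOM records named CA are alpha carbons.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            trace.append((line[21], int(line[22:26]), line[17:20].strip(),
                          float(line[30:38]), float(line[38:46]),
                          float(line[46:54])))
    return trace


# Invented coordinates for two residues, for illustration only.
SAMPLE = "\n".join([
    atom_line(1, "N", "MET", "A", 1, 27.340, 24.430, 2.614),
    atom_line(2, "CA", "MET", "A", 1, 26.266, 25.413, 2.842),
    atom_line(9, "N", "GLN", "A", 2, 26.913, 26.639, 3.531),
    atom_line(10, "CA", "GLN", "A", 2, 26.850, 29.021, 3.898),
])

if __name__ == "__main__":
    # Offline demo on the invented sample; swap in fetch_pdb_text("XXXX")
    # with a real four-character PDB ID when a network connection is available.
    for row in ca_trace(SAMPLE):
        print(row)
```

Keeping only the CA records is the scripted version of deciding what to omit: one point per residue is enough to follow the backbone.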