Basic structures of proteins

Structural Hierarchy of Protein
[Figure: structural hierarchy from primary structure through secondary structure and motif to fold (scaffold); labels: functional elements A, B, and C]
• Functional elements: α-helix, strands, β-sheet, loops
  - Structure, affinity, activity, specificity, stability, etc.

Structural motifs: super secondary structure
• Commonly recurring substructures
• Connectivity between secondary structural elements
• An individual motif usually consists of only a few elements
• Motifs do not allow us to predict the biological functions: they are found in proteins and enzymes with dissimilar functions
- Fold: Arrangement of secondary structure elements of the structure (a total of 1282 folds)
- Loops: Irregularly folded segments of polypeptide chain that connect the helices and sheets
  - Usually short and exposed to solvent
• β hairpin:
  - Extremely common
  - Two antiparallel β strands connected by a tight turn of a few amino acids
• Greek key:
  - Named after a pattern common to Greek ornamental artwork: a decorative border constructed from a continuous line, shaped into a repeated motif
  - 4 β strands folded over into a sandwich shape
• Omega loop:
  - A loop in which the residues that make up the beginning and end of the loop are very close together
• Helix-loop-helix:
  - Consists of α helices bound by a looping stretch of amino acids. This motif is seen in transcription factors.
• Zinc finger:
  - Two β strands with an α helix end folded over to bind a zinc ion. Important in DNA-binding proteins.
• Helix-turn-helix:
  - Two α helices joined by a short strand of amino acids. Found in many proteins that regulate gene expression.
• Nest:
  - Extremely common. Just three consecutive amino acid residues form an anion-binding concavity.
• Niche:
  - Extremely common. Just three consecutive amino acid residues form a cation-binding feature.

Domains
- A part of a protein that can exist independently of the rest of the protein chain
- Functional aspect

Assembly of proteins from building blocks
Layered sandwich structures, with each layer consisting of either α helices or β sheets
• Classification by the packing
  - α/α (all α)
  - β/β (all β)
  - α/β: α and β elements are in a mixed order in the sequence
  - α + β: α and β elements are segregated in the sequence

Packing of secondary structure
• Hydrophobic effect: Major driving force for the folding of proteins
  - Burying and clustering of hydrophobic side chains to minimize their contacts with water
• Basic requirements for folding:
  1. Compact structure and minimization of hydrophobic surface area
  2. Buried hydrogen-bonding groups are all paired
  - Formation of α helices and β sheets maximizes the pairing of the hydrogen-bonding groups of the backbone
  - Packing of α helices and β sheets by stacking their amino acid side chains
• Packing density:
  - Protein: ~0.75
  - Crystals: 0.70 ~ 0.78
  - Close-packed spheres: 0.74
  - Infinite cylinders: 0.91
• Quaternary structure: Overall organization of subunits
  - Contact interfaces of the subunits are as closely packed as the protein interiors
  - Charged and hydrogen-bonding groups on the surface are paired with complementary partners

Definition of the terms
• Homologue: Evolved from a common ancestral protein. The evolutionary relationship is evident from similarities in sequence, structure, and/or function
• Analogue: Proteins that are similar in some way, yet show no evidence of a common ancestry.
  - Structural analogues share the same fold, and functional analogues perform the same function
• Paralogue: Evolved by gene duplication within a genome and has a distinct, but usually related, function
• Orthologue: Equivalent genes in different species that evolved from a common ancestor by speciation

Protein evolution and diversity
- ~50,000 proteins
- Structural analysis: ~1,000 scaffolds
- Use of a limited repertoire of scaffolds for a large diversity of functions
[Figure: diversification and fine-tuning of proteins/enzymes E1 to E5. Labels: progenitor, undifferentiated scaffold; duplication & modification; drastic gene rearrangement (insertion, deletion, and substitution of gene segments, point mutations); primitive proteins/enzymes; superfamily proteins/enzymes; fine-tuning (accumulation of point mutations)]

Sequence space and evolution of proteins
[Figure: fitness plotted over sequence space, with regions numbered ① to ⑥. Labels: same family, same superfamily, different family, different superfamily, same fold]
• Incremental improvement of protein property: specificity, activity, stability, expression
• Divergent evolution within family: substrate/cofactor specificity, enantio-selectivity
• Divergent evolution within superfamily: αβ hydrolase, enolase, crotonase superfamily
• Divergent evolution within fold: alteration of sub-binding/catalytic machineries
• Convergent evolution between folds: grafting sub-binding/catalytic machineries into a different fold
• Directed evolution: find optimum fitness

Divergent and convergent evolution of proteins: serine proteases
Mammalian serine proteases:
• Common tertiary structure and function
• Superimposable polypeptide backbones
• About 60 % of the amino acids in the interior, but only 10 % of the surface residues, are conserved
• Catalytic triad of residues: Asp-His-Ser
• Different substrate specificity
[Figure panels: Chymotrypsin, Trypsin, Elastase, Subtilisin]

Catalytic property
• Nucleophile: hydroxyl oxygen of Ser
• Formation of acyl-enzyme through esterification of the hydroxyl of the reactive serine by the carboxyl portion of the substrate
• Major difference in substrate specificity arises from changes in three loops forming the lining of the binding pocket
  - Chymotrypsin: small residues at the binding pocket; suitable for large hydrophobic side chains of Phe, Tyr, and Trp
  - Trypsin: negatively charged aspartate at the bottom of the pocket forms a salt linkage with the positively charged ammonium or guanidinium group of Lys or Arg
  - Elastase: bulky Val and Thr at the entry of the pocket prevent the entry of large side chains; suitable for small hydrophobic residues like Ala
• Catalytic triad: Asp-His-Ser

Catalytic mechanism of serine protease

Structural and mechanistic information
• Binding sites of the enzymes are approximately complementary to the structures of the substrates
• Interactions: non-polar parts of the substrate match up with non-polar side chains of the amino acids
• Hydrogen-bonding sites on the substrates bind to the backbone NH and CO groups of the protein
• The reactive part of the substrate is firmly held by this binding next to acidic, basic, or nucleophilic groups on the enzyme
Provides a strategy and insight for the engineering and design of enzymes

Convergent or divergent evolution?
• Criteria for evolution from a common ancestor, in descending order of strength:
  - DNA sequences coding for the enzymes are similar?
  - Amino acid sequences are similar?
  - Three-dimensional structures are similar?
  - Enzyme-substrate interactions are similar?
  - Catalytic mechanisms are similar?
  - Segments of polypeptide chain essential for catalysis are in the same sequence?
• Mammalian serine proteases: Divergent evolution
• Catalytic mechanism shared with subtilisin: Convergent evolution

Three-dimensional structure is more conserved than primary structure, but function has changed
α/β barrel protein (or TIM barrel): Convergent evolution
- Eight parallel β strands connected by eight helices
- Strands form the staves of the barrel, while the helices are on the outside and parallel
- Hydrophobic core composed of side chains from the strands (Val, Ile, Leu)
- Catalyze a variety of reactions and have diverse subunit compositions
- No homology with the enzymes that catalyze the same reaction in different organisms
- Active site: eight loops at the carboxyl end of each strand
- Little sequence identity, and active sites use different regions of the loops

Relationship between three-dimensional structure and similarity of sequence
Dependent on the length of the protein
- The longer the protein, the lower the percent identity that implies similar structure
- The higher the sequence identity, the lower the RMSD between proteins
- For a protein of 85 residues, 25 ~ 30 % sequence identity implies an identical fold
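The two quantities compared above, percent sequence identity and RMSD, can be sketched in a few lines. This is a minimal illustration and not a method taken from the slides. The function names are hypothetical, the sequences are assumed to be already aligned (gaps written as "-"), identity is counted over gap-free columns only (other denominators are also in use), and the coordinates are assumed to be already superimposed, so no fitting step is performed.

```python
import math


def percent_identity(aligned_a, aligned_b):
    """Percent identity over gap-free columns of two pre-aligned sequences.

    Assumption: gaps are written as '-' and both strings have equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where neither sequence has a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z).

    Assumption: the structures are already superimposed; this function does
    not perform any optimal rotation or translation.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    # Toy inputs, invented purely for illustration.
    print(percent_identity("ACDE-F", "ACDQGF"))
    print(rmsd([(0, 0, 0), (1, 0, 0)], [(0, 0, 1), (1, 0, 1)]))
```

In practice the RMSD is usually computed over Cα atoms after an optimal superposition, which this sketch deliberately leaves out.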
Multi-enzyme complexes
• Encoded in a single polypeptide chain
• Involved in sequential steps in a biosynthetic pathway or complex biochemical process
  - Tryptophan synthase: tetramer (α2β2)
  - Polyketide synthase
  - Fatty acid synthase
• Advantages of multi-enzyme complexes
  - Enhanced catalytic efficiency: reduced diffusion time of an intermediate
  - Substrate channeling: ex) Tryptophan synthase: the reaction intermediate (indole) is not released, but shuttled directly between the subunits through a 20 ~ 30 Å long channel and directed to the next reaction
  - Sequestration of reactive intermediates: protection of chemically unstable intermediates from water
  - Easy coordination for regulating the reaction
  - Easy coordination of expression during biosynthesis

Polyketide synthases (PKSs)
• Polyketide: a large class of diverse compounds characterized by more than two carbonyl groups connected by single intervening carbon atoms
• PKSs: a family of multi-domain enzymes or enzyme complexes that produce polyketides, a large class of secondary metabolites
  ex) Antibiotics (tetracycline; macrolides, e.g. erythromycin), anticholesterol drug (lovastatin), immunosuppressant (sirolimus), anticancer drug (epothilone B)
• Share striking similarities with fatty acid biosynthesis
• The PKS genes are usually organized in one operon in bacteria and in gene clusters in eukaryotes

The order of modules and domains of a complete polyketide synthase
• Starting or loading module: AT-ACP
  - The starter group, usually acetyl-CoA or malonyl-CoA, is loaded onto the ACP domain of the starter module, catalyzed by the starter module's AT domain
• Elongation or extending modules: KS-AT-[DH-ER-KR]-ACP
  - The polyketide chain is handed over from the ACP domain of the previous module to the KS domain of the current module, catalyzed by the KS domain
• Termination or releasing module: TE
  - The TE (thioesterase) domain hydrolyzes the completed polyketide chain from the ACP domain of the previous module

Flexibility and conformational mobility of proteins
• Proteins are flexible even though globular proteins are closely packed
• Undergo conformational changes on binding ligands or substrates
• Conformational changes play an important role in a certain class of enzymes (allosteric) for modulating activity
  - Allosteric effectors alter the shape of the protein: hemoglobin
• Equilibrium among two or more conformations of the protein in solution
Allostery: a phenomenon in which binding of a substrate, product, or other effector to a subunit of a multi-subunit enzyme at a site (allosteric site) other than the functional site alters its conformation and functional properties.

Modes of motion and flexibility of proteins
• Molecular tumbling
  - Globular proteins rotate in solution at frequencies close to those calculated for a rigid sphere
  - Rotational correlation time (ϕ): time taken to rotate through a defined angle; the reciprocal of the rate constant for the randomization of the orientation of the molecule by Brownian motion
  - For a rigid sphere, ϕ = Vη/kT
    V: molecular volume, η: viscosity of the medium, k: Boltzmann's constant, T: absolute temperature
  - Approximately, ϕ = Mr/2000 ns, where Mr is the molecular mass of a globular protein
    ex) Chymotrypsin (Mr = 25,000): ϕ ≈ 12 ns

Rotation of side chains
• NMR: most powerful technique for studying the mobility of individual amino acids
• Measurement of rotational freedom of the aromatic side chains of Tyr and Phe about the Cβ-Cγ bond: 1H NMR
  - Detects whether or not the aromatic ring is constrained in an anisotropic environment
  - Slow rotation: 1 ~ 10 /s
  - Fast rotation: 10^4 ~ 10^5 /s
• Surface amino acids are more mobile than interior ones, showing no unique conformation

Domain movement: hinge motion and segmental flexibility
• Larger-scale movement in proteins with low energy barriers
• Hinge motion:
  - Two elements of structure switch between open and closed conformations as if connected by a hinge
  - Examples: MBP, Abl protein kinase (N-lobe and C-lobe)
  - Detection: time-resolved fluorescence polarization spectroscopy, NMR

Protein mobility in solution
• Incorporation of 15N into the protein
  - Analysis of relaxation of 15N-NMR signals
• The term 'relaxation' describes how signals change with time
  - Signals deteriorate with time, becoming weaker and broader
  - The deterioration reflects that the NMR signal arises from the over-population of an excited state
  - Fluctuation in backbone structure
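The two estimates of the rotational correlation time given under molecular tumbling (ϕ = Vη/kT for a rigid sphere, and ϕ ≈ Mr/2000 ns for a globular protein) can be checked numerically. The sketch below is only an illustration: the function names are hypothetical, SI units are assumed for the rigid-sphere formula, and the molecular volume must be supplied by the caller because the notes do not say how to obtain it.

```python
BOLTZMANN_J_PER_K = 1.380649e-23  # Boltzmann's constant k, in J/K


def rigid_sphere_correlation_time(volume_m3, viscosity_pa_s, temperature_k):
    """Rotational correlation time of a rigid sphere, phi = V * eta / (k * T).

    Assumption: SI units throughout (m^3, Pa*s, K); the result is in seconds.
    """
    if temperature_k <= 0:
        raise ValueError("temperature must be positive (kelvin)")
    return volume_m3 * viscosity_pa_s / (BOLTZMANN_J_PER_K * temperature_k)


def approx_correlation_time_ns(molecular_mass):
    """Rule of thumb from the notes: phi is roughly Mr / 2000, in nanoseconds."""
    return molecular_mass / 2000.0


if __name__ == "__main__":
    # Chymotrypsin, Mr = 25,000 (value taken from the notes).
    print(approx_correlation_time_ns(25_000), "ns")
    # Hypothetical volume and conditions, chosen only to exercise the formula.
    print(rigid_sphere_correlation_time(2.0e-26, 1.0e-3, 300.0), "s")
```

For chymotrypsin the rule of thumb gives 25,000 / 2000 = 12.5 ns, which the notes round to 12 ns.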