23 Amino Acids, Peptides, and Proteins

The three kinds of polymers that are prevalent in nature are polysaccharides, proteins, and nucleic acids. You have already learned about polysaccharides, which are naturally occurring polymers of sugar subunits (Section 22.18), and nucleic acids are covered in Chapter 27. We will now look at proteins and the structurally similar, but shorter, peptides. Peptides and proteins are polymers of amino acids linked together by amide bonds. The repeating units are called amino acid residues. Amino acid polymers can be composed of any number of monomers. A dipeptide contains two amino acid residues, a tripeptide contains three, an oligopeptide contains three to 10, and a polypeptide contains many amino acid residues. Proteins are naturally occurring polypeptides that are made up of 40 to 4000 amino acid residues.

From the structure of an amino acid, we can see that the name is not very precise. The compounds commonly called amino acids are more precisely called α-aminocarboxylic acids.

[Structures: an α-aminocarboxylic acid (an amino acid), RCH(NH3+)COOH; a chain in which amino acids are linked together by amide bonds, –NHCH(R)C(=O)–NHCH(R′)C(=O)–NHCH(R″)C(=O)–. Chapter-opening model: oxidized glutathione.]

Proteins and peptides serve many functions in biological systems. Some protect organisms from their environment or impart strength to certain biological structures. Hair, horns, hooves, feathers, fur, and the tough outer layer of skin are all composed largely of a structural protein called keratin. Collagen, another structural protein, is a major component of bones, muscles, and tendons. Some proteins have other protective functions. Snake venoms and plant toxins, for example, protect their owners from other species, blood-clotting proteins protect the vascular system when it is injured, and antibodies and protein antibiotics protect us from disease.
A group of proteins called enzymes catalyzes the chemical reactions that occur in living systems, and some of the hormones that regulate these reactions are peptides. Proteins are also responsible for many physiological functions, such as the transport and storage of oxygen in the body and the contraction of muscles.

23.1 Classification and Nomenclature of Amino Acids

The structures of the 20 most common naturally occurring amino acids and the frequency with which each occurs in proteins are shown in Table 23.1. Other amino acids occur in nature, but only infrequently. All amino acids except proline contain a primary amino group. Proline contains a secondary amino group incorporated into a five-membered ring. The amino acids differ only in the substituent (R) attached to the α-carbon. The wide variation in these substituents (called side chains) is what gives proteins their great structural diversity and, as a consequence, their great functional diversity.

Table 23.1 The Most Common Naturally Occurring Amino Acids
The amino acids are shown in the form that predominates at physiological pH (7.3).
Formula                                   Name                        Abbreviations   Average relative abundance in proteins

Aliphatic side chain amino acids
CH2(NH3+)COO−                             Glycine                     Gly   G         7.5%
CH3CH(NH3+)COO−                           Alanine                     Ala   A         9.0%
(CH3)2CHCH(NH3+)COO−                      Valine*                     Val   V         6.9%
(CH3)2CHCH2CH(NH3+)COO−                   Leucine*                    Leu   L         7.5%
CH3CH2CH(CH3)CH(NH3+)COO−                 Isoleucine*                 Ile   I         4.6%

Hydroxy-containing amino acids
HOCH2CH(NH3+)COO−                         Serine                      Ser   S         7.1%
CH3CH(OH)CH(NH3+)COO−                     Threonine*                  Thr   T         6.0%

Sulfur-containing amino acids
HSCH2CH(NH3+)COO−                         Cysteine                    Cys   C         2.8%
CH3SCH2CH2CH(NH3+)COO−                    Methionine*                 Met   M         1.7%

Acidic amino acids
−OOCCH2CH(NH3+)COO−                       Aspartate (aspartic acid)   Asp   D         5.5%
−OOCCH2CH2CH(NH3+)COO−                    Glutamate (glutamic acid)   Glu   E         6.2%

Amides of acidic amino acids
H2NC(=O)CH2CH(NH3+)COO−                   Asparagine                  Asn   N         4.4%
H2NC(=O)CH2CH2CH(NH3+)COO−                Glutamine                   Gln   Q         3.9%

Basic amino acids
+H3NCH2CH2CH2CH2CH(NH3+)COO−              Lysine*                     Lys   K         7.0%
H2NC(=NH2+)NHCH2CH2CH2CH(NH3+)COO−        Arginine*                   Arg   R         4.7%

Benzene-containing amino acids
C6H5CH2CH(NH3+)COO−                       Phenylalanine*              Phe   F         3.5%
p-HOC6H4CH2CH(NH3+)COO−                   Tyrosine                    Tyr   Y         3.5%

Heterocyclic amino acids
(pyrrolidine ring: NH2+ in ring, COO−)    Proline                     Pro   P         4.6%
(imidazole ring)CH2CH(NH3+)COO−           Histidine*                  His   H         2.1%
(indole ring)CH2CH(NH3+)COO−              Tryptophan*                 Trp   W         1.1%

* Essential amino acids

The amino acids are almost always called by their common names. Often, the name tells you something about the amino acid. For example, glycine got its name because of its sweet taste (glykos is Greek for “sweet”), and valine, like valeric acid, has five carbon atoms. Asparagine was first found in asparagus, and tyrosine was isolated from cheese (tyros is Greek for “cheese”).
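The abbreviations in Table 23.1 lend themselves to a simple lookup table. The sketch below transcribes the three-letter and one-letter codes and the asterisked (essential) entries from the table; the function name `to_one_letter` and the hyphen-separated input format are illustrative assumptions, not part of the text.

```python
# Three-letter -> one-letter abbreviations, transcribed from Table 23.1.
THREE_TO_ONE = {
    "Gly": "G", "Ala": "A", "Val": "V", "Leu": "L", "Ile": "I",
    "Ser": "S", "Thr": "T", "Cys": "C", "Met": "M", "Asp": "D",
    "Glu": "E", "Asn": "N", "Gln": "Q", "Lys": "K", "Arg": "R",
    "Phe": "F", "Tyr": "Y", "Pro": "P", "His": "H", "Trp": "W",
}

# Amino acids marked with an asterisk (essential) in Table 23.1.
ESSENTIAL = {"Val", "Leu", "Ile", "Thr", "Met",
             "Lys", "Arg", "Phe", "His", "Trp"}


def to_one_letter(sequence: str) -> str:
    """Convert a hyphen-separated three-letter sequence to one-letter codes.

    The hyphen-separated input format is an assumption made for this sketch.
    """
    return "".join(THREE_TO_ONE[residue] for residue in sequence.split("-"))


if __name__ == "__main__":
    print(to_one_letter("Val-Cys-Ala-Glu-His"))
    print(sorted(ESSENTIAL))
```

A dictionary keeps the mapping explicit, which matters here because four of the one-letter codes (D, E, N, Q) cannot be guessed from the names.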
Dividing the amino acids into classes makes them easier to learn. The aliphatic side chain amino acids include glycine, the amino acid in which R = H, and four amino acids with alkyl side chains. Alanine is the amino acid with a methyl side chain, and valine has an isopropyl side chain. Can you guess which amino acid, leucine or isoleucine, has an isobutyl side chain? If you gave the obvious answer, you guessed incorrectly. Isoleucine does not have an “iso” group; it is leucine that has an isobutyl substituent, whereas isoleucine has a sec-butyl substituent. Each of the amino acids has both a three-letter abbreviation (the first three letters of the name in most cases) and a single-letter abbreviation.

Two amino acids, serine and threonine, have side chains that contain alcohol groups. Serine is an HO-substituted alanine and threonine has a branched ethanol substituent. There are also two sulfur-containing amino acids: Cysteine is an HS-substituted alanine and methionine has a 2-methylthioethyl substituent.

There are two acidic amino acids (amino acids with two carboxylic acid groups): aspartate and glutamate. Aspartate is a carboxy-substituted alanine and glutamate has one more methylene group than aspartate. (If their carboxyl groups are protonated, they are called aspartic acid and glutamic acid, respectively.) Two amino acids, asparagine and glutamine, are amides of the acidic amino acids; asparagine is the amide of aspartate and glutamine is the amide of glutamate. Notice that the obvious one-letter abbreviations cannot be used for these four amino acids because A and G are used for alanine and glycine. Aspartic acid and glutamic acid are abbreviated D and E, and asparagine and glutamine are abbreviated N and Q.

There are two basic amino acids (amino acids with two basic nitrogen-containing groups): lysine and arginine. Lysine has an ε-amino group and arginine has a δ-guanidino group. At physiological pH, these groups are protonated.
The ε and δ can remind you how many methylene groups each amino acid has.

[Structures: lysine, +H3NCH2CH2CH2CH2CH(NH3+)COO−, with an ε-amino group; arginine, H2NC(=NH2+)NHCH2CH2CH2CH(NH3+)COO−, with a δ-guanidino group.]

Two amino acids, phenylalanine and tyrosine, contain benzene rings. As its name indicates, phenylalanine is phenyl-substituted alanine. Tyrosine is phenylalanine with a para-hydroxy substituent.

Proline, histidine, and tryptophan are heterocyclic amino acids. Proline has its nitrogen incorporated into a five-membered ring; it is the only amino acid that contains a secondary amino group. Histidine is an imidazole-substituted alanine. Imidazole is an aromatic compound because it is cyclic and planar and has three pairs of delocalized π electrons (Section 21.11). The pKa of a protonated imidazole ring is 6.0, so the ring will be protonated in acidic solutions and nonprotonated in basic solutions (Section 23.3).

[Equilibrium: protonated imidazole ⇌ imidazole + H+]

Tryptophan is an indole-substituted alanine (Section 21.11). Like imidazole, indole is an aromatic compound. Because the lone pair on the nitrogen atom of indole is needed for the compound’s aromaticity, indole is a very weak base. (The pKa of protonated indole is −2.4.) Therefore, the ring nitrogen in tryptophan is never protonated under physiological conditions.

Ten amino acids are essential amino acids. We humans must obtain these 10 essential amino acids from our diets because we either cannot synthesize them at all or cannot synthesize them in adequate amounts. For example, we must have a dietary source of phenylalanine because we cannot synthesize benzene rings. However, we do not need tyrosine in our diets, because we can synthesize the necessary amounts from phenylalanine. The essential amino acids are denoted by red asterisks (*) in Table 23.1.
Although humans can synthesize arginine, it is needed for growth in greater amounts than can be synthesized. So arginine is an essential amino acid for children, but a nonessential amino acid for adults. Not all proteins contain the same amino acids. Bean protein is deficient in methionine, for example, and wheat protein is deficient in lysine. They are incomplete proteins: They contain too little of one or more essential amino acids to support growth. Therefore, a balanced diet must contain proteins from different sources.

Dietary protein is hydrolyzed in the body to individual amino acids. Some of these amino acids are used to synthesize proteins needed by the body, some are broken down further to supply energy to the body, and some are used as starting materials for the synthesis of nonprotein compounds the body needs, such as adrenaline, thyroxine, and melanin (Section 25.6).

PROBLEM 1
a. Explain why, when the imidazole ring of histidine is protonated, the double-bonded nitrogen is the nitrogen that accepts the proton.

[Reaction: histidine with a neutral imidazole ring and a free α-NH2 group + 2 H+ → histidine with a protonated imidazole ring and an α-NH3+ group]

b. Explain why, when the guanidino group of arginine is protonated, the double-bonded nitrogen is the nitrogen that accepts the proton.

[Reaction: H2NC(=NH)NHCH2CH2CH2CH(NH2)COO− + 2 H+ → H2NC(=NH2+)NHCH2CH2CH2CH(NH3+)COO−]

23.2 Configuration of Amino Acids

The α-carbon of all the naturally occurring amino acids except glycine is an asymmetric carbon. Therefore, 19 of the 20 amino acids listed in Table 23.1 can exist as enantiomers. The D and L notation used for monosaccharides (Section 22.2) is also used for amino acids. The D and L isomers of monosaccharides and amino acids are defined the same way.
Thus, an amino acid drawn in a Fischer projection with the carboxyl group on the top and the R group on the bottom of the vertical axis is a D-amino acid if the amino group is on the right and an L-amino acid if the amino group is on the left. Unlike monosaccharides, where the D isomer is the one found in nature, most amino acids found in nature have the L configuration. To date, D-amino acid residues have been found only in a few peptide antibiotics and in some small peptides attached to the cell walls of bacteria.

[Fischer projections: D-glyceraldehyde (OH on the right) and L-glyceraldehyde (OH on the left); a D-amino acid (NH3+ on the right) and an L-amino acid (NH3+ on the left), each drawn with COO− at the top and R at the bottom.]

Why D-sugars and L-amino acids? While it makes no difference which isomer nature “selected” to be synthesized, it is important that the same isomer be synthesized by all organisms. For example, if mammals ended up having L-amino acids, then L-amino acids would need to be the isomers synthesized by the organisms upon which mammals depend for food.

AMINO ACIDS AND DISEASE
The Chamorro people of Guam have a high incidence of a syndrome that resembles amyotrophic lateral sclerosis (ALS) with elements of Parkinson’s disease and dementia. This syndrome developed during World War II when, as a result of food shortages, the tribe ate large quantities of Cycas circinalis seeds. These seeds contain β-methylamino-L-alanine, an amino acid that binds to glutamate receptors. When monkeys are given β-methylamino-L-alanine, they develop some of the features of this syndrome. There is hope that, by studying the mechanism of action of β-methylamino-L-alanine, we may gain an understanding of how ALS and Parkinson’s disease arise.

PROBLEM 2 ◆
a. Which isomer, (R)-alanine or (S)-alanine, is D-alanine?
b. Which isomer, (R)-aspartate or (S)-aspartate, is D-aspartate?
c. Can a general statement be made relating R and S to D and L?

PROBLEM 3 ◆
Which amino acids in Table 23.1 have more than one asymmetric carbon?
23.3 Acid–Base Properties of Amino Acids

Every amino acid has a carboxyl group and an amino group, and each group can exist in an acidic form or a basic form, depending on the pH of the solution in which the amino acid is dissolved. The carboxyl groups of the amino acids have pKa values of approximately 2, and the protonated amino groups have pKa values near 9 (Table 23.2). Both groups, therefore, will be in their acidic forms in a very acidic solution (pH ≈ 0). At pH = 7, the pH of the solution is greater than the pKa of the carboxyl group, but less than the pKa of the protonated amino group. The carboxyl group, therefore, will be in its basic form and the amino group will be in its acidic form. In a strongly basic solution (pH ≈ 11), both groups will be in their basic forms.

[Structures: at pH = 0, RCH(NH3+)COOH; at pH = 7, RCH(NH3+)COO− (a zwitterion); at pH = 11, RCH(NH2)COO−. Each step toward higher pH releases one H+.]

Notice that an amino acid can never exist as an uncharged compound, regardless of the pH of the solution. To be uncharged, an amino acid would have to lose a proton from an +NH3 group with a pKa of about 9 before it would lose a proton from a COOH group with a pKa of about 2. This clearly is impossible: A weak acid cannot be more acidic than a strong acid. Therefore, at physiological pH (7.3) an amino acid exists as a dipolar ion, called a zwitterion.
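The reasoning above can be sketched numerically with the Henderson–Hasselbalch relationship, pH = pKa + log([basic form]/[acidic form]). The snippet below is a minimal illustration, assuming the rounded pKa values quoted in the paragraph (about 2 for the carboxyl group and about 9 for the protonated amino group); the function names are hypothetical.

```python
def fraction_acidic(pH: float, pKa: float) -> float:
    """Fraction of a group present in its acidic (protonated) form.

    Rearranged from pH = pKa + log([basic]/[acidic]).
    """
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))


def predominant_form(pH: float, pKa: float) -> str:
    """Return which form of the group predominates at the given pH."""
    if pH < pKa:
        return "acidic"
    if pH > pKa:
        return "basic"
    return "equal"  # at pH == pKa the two forms are present in equal amounts


# Rounded values from the paragraph above (assumed for illustration only).
PKA_CARBOXYL = 2.0
PKA_AMINO = 9.0

if __name__ == "__main__":
    for pH in (0.0, 7.0, 11.0):
        print(f"pH {pH:>4}: COOH group {predominant_form(pH, PKA_CARBOXYL)}, "
              f"+NH3 group {predominant_form(pH, PKA_AMINO)}")
```

At pH 7 the carboxyl group comes out in its basic form and the amino group in its acidic form, which is the zwitterion described in the text.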
A zwitterion is a compound that has a negative charge on one atom and a positive charge on a nonadjacent atom. (The name comes from zwitter, German for “hermaphrodite” or “hybrid.”)

Table 23.2 The pKa Values of Amino Acids

Amino acid       pKa α-COOH   pKa α-NH3+   pKa side chain
Alanine          2.34         9.69         —
Arginine         2.17         9.04         12.48
Asparagine       2.02         8.84         —
Aspartic acid    2.09         9.82         3.86
Cysteine         1.92         10.46        8.35
Glutamic acid    2.19         9.67         4.25
Glutamine        2.17         9.13         —
Glycine          2.34         9.60         —
Histidine        1.82         9.17         6.04
Isoleucine       2.36         9.68         —
Leucine          2.36         9.60         —
Lysine           2.18         8.95         10.79
Methionine       2.28         9.21         —
Phenylalanine    2.16         9.18         —
Proline          1.99         10.60        —
Serine           2.21         9.15         —
Threonine        2.63         9.10         —
Tryptophan       2.38         9.39         —
Tyrosine         2.20         9.11         10.07
Valine           2.32         9.62         —

[Margin note: Recall from the Henderson–Hasselbalch equation (Section 1.20) that the acidic form predominates if the pH of the solution is less than the pKa of the compound and the basic form predominates if the pH of the solution is greater than the pKa of the compound.]

A few amino acids have side chains with ionizable hydrogens (Table 23.2). The protonated imidazole side chain of histidine, for example, has a pKa of 6.04. Histidine, therefore, can exist in four different forms, and the form that predominates depends on the pH of the solution.

[Structures of histidine: at pH = 0, protonated imidazole ring, NH3+, and COOH; at pH = 4, protonated imidazole ring, NH3+, and COO−; at pH = 8, neutral imidazole ring, NH3+, and COO−; at pH = 12, neutral imidazole ring, NH2, and COO−.]

PROBLEM 4 ◆
Why are the carboxylic acid groups of the amino acids so much more acidic (pKa ≈ 2) than a carboxylic acid such as acetic acid (pKa = 4.76)?

PROBLEM 5 SOLVED
Draw the form in which each of the following amino acids predominantly exists at physiological pH (7.3):
a. aspartic acid
b. histidine
c. glutamine
d. lysine
e. arginine
f. tyrosine

SOLUTION TO 5a Both carboxyl groups are in their basic forms because the pH is greater than their pKa’s. The protonated amino group is in its acidic form because the pH is less than its pKa.
[Structure: −OOCCH2CH(NH3+)COO−]

PROBLEM 6 ◆
Draw the form in which glutamic acid predominantly exists in a solution with the following pH:
a. pH = 0
b. pH = 3
c. pH = 6
d. pH = 11

PROBLEM 7
a. Why is the pKa of the glutamic acid side chain greater than the pKa of the aspartic acid side chain?
b. Why is the pKa of the arginine side chain greater than the pKa of the lysine side chain?

23.4 The Isoelectric Point

The isoelectric point (pI) of an amino acid is the pH at which it has no net charge. In other words, it is the pH at which the amount of positive charge on an amino acid exactly balances the amount of negative charge:

pI (isoelectric point) = pH at which there is no net charge

The pI of an amino acid that does not have an ionizable side chain, such as alanine, is midway between its two pKa values. This is because at pH = 2.34, half the molecules have a negatively charged carboxyl group and half have an uncharged carboxyl group, and at pH = 9.69, half the molecules have a positively charged amino group and half have an uncharged amino group. As the pH increases from 2.34, the carboxyl group of more molecules becomes negatively charged; as the pH decreases from 9.69, the amino group of more molecules becomes positively charged. Therefore, at the average of the two pKa values, the number of negatively charged groups equals the number of positively charged groups.

[Margin note: Recall from the Henderson–Hasselbalch equation that when pH = pKa, half the group is in its acidic form and half is in its basic form (Section 1.20).]

[Structure: alanine, CH3CH(NH3+)COOH; the carboxyl group has pKa = 2.34.]

[Margin note: An amino acid will be positively charged if the pH of the solution is less than the pI of the amino acid and will be negatively charged if the pH of the solution is greater than the pI of the amino acid.]
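The midpoint argument for alanine can be checked numerically. The sketch below, which assumes only the two alanine pKa values quoted above (2.34 and 9.69), estimates the average net charge per molecule from the Henderson–Hasselbalch relationship and confirms that it vanishes at the average of the two pKa values. The function names are hypothetical, and the model treats the two groups as ionizing independently.

```python
def fraction_acidic(pH: float, pKa: float) -> float:
    """Fraction of a group in its acidic (protonated) form."""
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))


def net_charge_no_side_chain(pH: float, pKa_carboxyl: float,
                             pKa_amino: float) -> float:
    """Average net charge of an amino acid without an ionizable side chain.

    The protonated amino group (+NH3) contributes +1; the deprotonated
    carboxyl group (COO-) contributes -1.
    """
    positive = fraction_acidic(pH, pKa_amino)            # fraction with +NH3
    negative = 1.0 - fraction_acidic(pH, pKa_carboxyl)   # fraction with COO-
    return positive - negative


# Alanine pKa values as quoted in the text.
PKA_COOH, PKA_NH3 = 2.34, 9.69
pI = (PKA_COOH + PKA_NH3) / 2

if __name__ == "__main__":
    print(f"pI of alanine = {pI:.2f}")
    for pH in (5.0, pI, 7.0):
        charge = net_charge_no_side_chain(pH, PKA_COOH, PKA_NH3)
        print(f"pH {pH:.2f}: net charge {charge:+.5f}")
```

Below the pI the computed charge is slightly positive and above it slightly negative, in line with the margin note.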
For alanine, whose protonated amino group has pKa = 9.69:

\[ \mathrm{pI} = \frac{2.34 + 9.69}{2} = \frac{12.03}{2} = 6.02 \]

The pI of an amino acid that has an ionizable side chain is the average of the pKa values of the similarly ionizing groups (a positively charged group ionizing to an uncharged group or an uncharged group ionizing to a negatively charged group). For example, the pI of lysine is the average of the pKa values of the two groups that are positively charged in their acidic form and uncharged in their basic form. The pI of glutamate, on the other hand, is the average of the pKa values of the two groups that are uncharged in their acidic form and negatively charged in their basic form.

[Structure: lysine, +H3NCH2CH2CH2CH2CH(NH3+)COOH; pKa = 2.18 (COOH), pKa = 8.95 (α-NH3+), pKa = 10.79 (ε-NH3+).]

\[ \mathrm{pI_{lysine}} = \frac{8.95 + 10.79}{2} = \frac{19.74}{2} = 9.87 \]

[Structure: glutamic acid, HOOCCH2CH2CH(NH3+)COOH; pKa = 2.19 (α-COOH), pKa = 4.25 (side-chain COOH), pKa = 9.67 (α-NH3+).]

\[ \mathrm{pI_{glutamic\ acid}} = \frac{2.19 + 4.25}{2} = \frac{6.44}{2} = 3.22 \]

PROBLEM 8
Explain why the pI of lysine is the average of the pKa values of its two protonated amino groups.

PROBLEM 9 ◆
Calculate the pI of each of the following amino acids:
a. asparagine
b. arginine
c. serine

PROBLEM 10 ◆
a. Which amino acid has the lowest pI value?
b. Which amino acid has the highest pI value?
c. Which amino acid has the greatest amount of negative charge at pH = 6.20?
d. Which amino acid, glycine or methionine, has a greater negative charge at pH = 6.20?

PROBLEM 11
Explain why the pI values of tyrosine and cysteine cannot be determined by the method just described.

23.5 Separation of Amino Acids

Electrophoresis

A mixture of amino acids can be separated by several different techniques. Electrophoresis separates amino acids on the basis of their pI values. A few drops of a solution of an amino acid mixture are applied to the middle of a piece of filter paper or to a gel.
When the paper or the gel is placed in a buffered solution between two electrodes and an electric field is applied, an amino acid with a pI greater than the pH of the solution will have an overall positive charge and will migrate toward the cathode (the negative electrode). The farther the amino acid’s pI is from the pH of the buffer, the more positive the amino acid will be and the farther it will migrate toward the cathode in a given amount of time. An amino acid with a pI less than the pH of the buffer will have an overall negative charge and will migrate toward the anode (the positive electrode). If two molecules have the same charge, the larger one will move more slowly during electrophoresis because the same charge has to move a greater mass.

Since amino acids are colorless, how can we detect that they have been separated? When amino acids are heated with ninhydrin, they form a colored product. After electrophoretic separation of the amino acids, the filter paper is sprayed with ninhydrin and dried in a warm oven. Most amino acids form a purple product. The number of different kinds of amino acids in the mixture is determined by the number of colored spots on the filter paper (Figure 23.1). The individual amino acids are identified by their location on the paper compared with a standard.

[Figure 23.1 Arginine, alanine, and aspartic acid separated by electrophoresis at pH = 5. Arginine (pI = 10.76) lies nearest the cathode (−), aspartate (pI = 2.98) lies nearest the anode (+), and alanine (pI = 6.02) lies between them.]

The mechanism for formation of the colored product is as shown, omitting the mechanisms for the steps involving dehydration, imine formation, and imine hydrolysis. (These mechanisms are shown in Sections 18.6 and 18.7.)
mechanism for the reaction of an amino acid with ninhydrin to form a colored product

[Mechanism scheme: ninhydrin loses water to form a triketone, which reacts with the amino acid, H2NCH(R)COO−, to form an imine; loss of CO2 followed by hydrolysis of the resulting imine releases an aldehyde (RCH=O) and an amine, which reacts with a second molecule of ninhydrin to form the purple-colored product.]

Paper Chromatography and Thin-Layer Chromatography

Paper chromatography once played an important role in biochemical analysis because it provided a method for separating amino acids using very simple equipment. Although more modern techniques are now more commonly used, we will describe the principles behind paper chromatography because many of the same principles are employed in modern separation techniques.

The technique of paper chromatography separates amino acids on the basis of polarity. A few drops of a solution of an amino acid mixture are applied to the bottom of a strip of filter paper. The edge of the paper is then placed in a solvent (typically a mixture of water, acetic acid, and butanol). The solvent moves up the paper by capillary action, carrying the amino acids with it. Depending on their polarities, the amino acids have different affinities for the mobile (solvent) and stationary (paper) phases and therefore travel up the paper at different rates. The more polar the amino acid, the more strongly it is adsorbed onto the relatively polar paper. The less polar amino acids travel up the paper more rapidly, since they have a greater affinity for the mobile phase. Therefore, when the paper is developed with ninhydrin, the colored spot closest to the origin is the most polar amino acid and the spot farthest away from the origin is the least polar amino acid (Figure 23.2).

[Figure 23.2 Separation of glutamate, alanine, and leucine by paper chromatography. Leucine, the least polar amino acid, travels farthest from the origin.]
[Figure 23.2, continued: alanine travels an intermediate distance; glutamate, the most polar amino acid, stays closest to the origin.]

The most polar amino acids are those with charged side chains, the next most polar are those with side chains that can form hydrogen bonds, and the least polar are those with hydrocarbon side chains. For amino acids with hydrocarbon side chains, the larger the alkyl group, the less polar the amino acid. In other words, leucine is less polar than valine.

Paper chromatography has largely been replaced by thin-layer chromatography (TLC). TLC is similar to paper chromatography but uses a plate coated with a solid material instead of filter paper. The physical property on which the separation is based depends on the solid material and the solvent chosen for the mobile phase.

PROBLEM 12 ◆
A mixture of seven amino acids (glycine, glutamate, leucine, lysine, alanine, isoleucine, and aspartate) is separated by TLC. Explain why only six spots show up when the chromatographic plate is sprayed with ninhydrin and heated.

Ion-Exchange Chromatography

[Margin note: Cations bind most strongly to cation-exchange resins. Anions bind most strongly to anion-exchange resins.]

[Figure 23.3 A section of a cation-exchange resin. This particular resin is called Dowex® 50.]

Electrophoresis and thin-layer chromatography are analytical separations: small amounts of amino acids are separated for analysis. Preparative separation, in which larger amounts of amino acids are separated for use in subsequent processes, can be achieved using ion-exchange chromatography. This technique employs a column packed with an insoluble resin. A solution of a mixture of amino acids is loaded onto the top of the column and eluted with a buffer. The amino acids separate because they flow through the column at different rates, as explained below. The resin is a chemically inert material with charged side chains.
One commonly used resin is a copolymer of styrene and divinylbenzene with negatively charged sulfonic acid groups on some of the benzene rings (Figure 23.3). If a mixture of lysine and glutamate in a solution with a pH of 6 were loaded onto the column, glutamate would travel down the column rapidly because its negatively charged side chain would be repelled by the negatively charged sulfonic acid groups of the resin. The positively charged side chain of lysine, on the other hand, would cause that amino acid to be retained on the column. This kind of resin is called a cation-exchange resin because it exchanges the Na+ counterions of the SO3− groups for the positively charged species that are added to the column. In addition, the relatively nonpolar nature of the column causes it to retain nonpolar amino acids longer than polar amino acids. Resins with positively charged groups are called anion-exchange resins because they impede the flow of anions by exchanging their negatively charged counterions for negatively charged species that are added to the column. A common anion-exchange resin (Dowex® 1) has CH2N+(CH3)3 Cl− groups in place of the SO3− Na+ groups in Figure 23.3.

[Figure 23.3 structure: polystyrene chains (repeating CH2CH units) cross-linked by divinylbenzene, with SO3− Na+ groups on some of the benzene rings.]

An amino acid analyzer is an instrument that automates ion-exchange chromatography. When a solution of an amino acid mixture passes through the column of an amino acid analyzer containing a cation-exchange resin, the amino acids move through the column at different rates, depending on their overall charge. The solution leaving the column is collected in fractions, which are collected often enough that a different amino acid ends up in each fraction (Figure 23.4).
If ninhydrin is added to each of the fractions, the concentration of the amino acid in each fraction can be determined by the amount of absorption at 570 nm, because the colored compound formed by the reaction of an amino acid with ninhydrin has a λmax of 570 nm (Section 8.11). In this way, the identity and the relative amount of each amino acid can be determined (Figure 23.5).

[Figure 23.4 Separation of amino acids by ion-exchange chromatography. Fractions are collected sequentially.]

[Figure 23.5 plot: absorbance versus effluent volume (mL); peaks labeled Asp, Thr, Ser, Glu, Ala, Met, Ile, Gly, Leu, Tyr, Val, His, Phe, Lys, NH3, Arg, and Pro; the column is eluted in turn with pH 3.3, pH 4.3, and pH 5.3 buffers.]

WATER SOFTENERS: EXAMPLES OF CATION-EXCHANGE CHROMATOGRAPHY
Water softeners contain a column with a cation-exchange resin that has been flushed with concentrated sodium chloride. In Section 17.13, we saw that the presence of calcium and magnesium ions in water is what causes the water to be “hard.” When water passes through the column, the resin binds magnesium and calcium ions more tightly than it binds sodium ions. In this way, the water softener removes magnesium and calcium ions from water, replacing them with sodium ions. The resin must be recharged from time to time by flushing it with concentrated sodium chloride to replace the bound magnesium and calcium ions with sodium ions.

PROBLEM 13
Why are buffer solutions of increasingly higher pH used to elute the column that generates the chromatogram shown in Figure 23.5?

PROBLEM 14
Explain the order of elution (with a buffer of pH 4) of each of the following pairs of amino acids on a column packed with Dowex® 50 (Figure 23.3):
a. aspartate before serine
b. glycine before alanine
c. valine before leucine
d. tyrosine before phenylalanine

PROBLEM 15 ◆
In what order would the following amino acids be eluted with a buffer of pH 4 from a column containing an anion-exchange resin?
histidine, serine, aspartate, valine

[Figure 23.5 A typical chromatogram obtained from the separation of a mixture of amino acids using an automated amino acid analyzer.]

23.6 Resolution of Racemic Mixtures of Amino Acids

Chemists do not have to rely on nature to produce amino acids; they can synthesize them in the laboratory, using a variety of methods. One of the oldest methods replaces an α-hydrogen of a carboxylic acid with a bromine in a Hell–Volhard–Zelinski reaction (Section 19.5). The resulting α-bromocarboxylic acid then undergoes an SN2 reaction with ammonia to form the amino acid (Section 10.4).

[Reaction: RCH2COOH (a carboxylic acid) → (1. Br2, PBr3; 2. H3O+) → RCH(Br)COOH → (excess NH3) → RCH(NH3+)COO− (an amino acid) + NH4+ Br−]

PROBLEM 16
Why is excess ammonia used in the preceding reaction?

When amino acids are synthesized in nature, only the L-enantiomer is formed (Section 5.20). However, when amino acids are synthesized in the laboratory, the product is usually a racemic mixture of D and L enantiomers. If only one isomer is desired, the enantiomers must be separated. They can be separated by means of an enzyme-catalyzed reaction. Because an enzyme is chiral, it will react at a different rate with each of the enantiomers (Section 5.20). For example, pig kidney aminoacylase is an enzyme that catalyzes the hydrolysis of N-acetyl-L-amino acids, but not N-acetyl-D-amino acids. Therefore, if the racemic amino acid is converted into a pair of N-acetylamino acids and the N-acetylated mixture is hydrolyzed with pig kidney aminoacylase, the products will be the L-amino acid and N-acetyl-D-amino acid, which are easily separated. Because the resolution (separation) of the enantiomers depends on the difference in the rates of reaction of the enzyme with the two N-acetylated compounds, this technique is known as a kinetic resolution.
[Reaction scheme: H2NCH(R)COO− (D-amino acid + L-amino acid) + (CH3CO)2O → CH3C(=O)NHCH(R)COO− (N-acetyl-D-amino acid + N-acetyl-L-amino acid) + CH3COO−; then pig kidney aminoacylase, H2O → L-amino acid + N-acetyl-D-amino acid]

PROBLEM 17
Pig liver esterase is an enzyme that catalyzes the hydrolysis of esters. It hydrolyzes esters of L-amino acids more rapidly than esters of D-amino acids. How can this enzyme be used to separate a racemic mixture of amino acids?

PROBLEM 18 ◆
Amino acids can be synthesized by reductive amination of α-keto acids (Section 21.8).

[Reaction: RC(=O)COOH → (excess ammonia, H2/Raney Ni) → RCH(NH3+)COO−]

Biological organisms can also convert α-keto acids into amino acids, but because H2 and metal catalysts are not available to the cell, they do so by a different mechanism (Section 25.6).
a. What amino acid is obtained from the reductive amination of each of the following metabolic intermediates in the cell?

[Structures: pyruvic acid, CH3C(=O)COOH; oxaloacetic acid, HOOCCH2C(=O)COOH; α-ketoglutaric acid, HOOCCH2CH2C(=O)COOH]

b. What amino acids are obtained from the same metabolic intermediates when they are synthesized in the laboratory?

23.7 Peptide Bonds and Disulfide Bonds

Peptide bonds and disulfide bonds are the only covalent bonds that hold amino acid residues together in a peptide or a protein.

Peptide Bonds

The amide bonds that link amino acid residues are called peptide bonds. By convention, peptides and proteins are written with the free amino group (the N-terminal amino acid) on the left and the free carboxyl group (the C-terminal amino acid) on the right.

[Reaction: +H3NCH(R)COO− + +H3NCH(R′)COO− + +H3NCH(R″)COO− → +H3NCH(R)C(=O)–NHCH(R′)C(=O)–NHCH(R″)COO− (a tripeptide) + 2 H2O. The two C(=O)–NH linkages are the peptide bonds; the N-terminal amino acid is on the left and the C-terminal amino acid is on the right.]

When the identities of the amino acids in a peptide are known but their sequence is not known, the amino acids are written separated by commas.
When the sequence of amino acids is known, the amino acids are written separated by hyphens. In the pentapeptide Val-Cys-Ala-Glu-His shown below, valine is the N-terminal amino acid and histidine is the C-terminal amino acid. The amino acids are numbered starting with the N-terminal end. The glutamate residue is referred to as Glu 4 because it is the fourth amino acid from the N-terminal end. In naming the peptide, adjective names (ending in “yl”) are used for all the amino acids except the C-terminal amino acid. Thus, this pentapeptide is named valylcysteylalanylglutamylhistidine.

Glu, Cys, His, Val, Ala: the pentapeptide contains the indicated amino acids, but their sequence is not known
Val-Cys-Ala-Glu-His: the amino acids in the pentapeptide have the indicated sequence

A peptide bond has about 40% double-bond character because of electron delocalization. Steric hindrance causes the trans configuration to be more stable than the cis configuration, so the α-carbons of adjacent amino acids are trans to each other (Section 4.11).

[Scheme: the two resonance contributors of a peptide bond, with the α-carbons in the trans configuration]

Free rotation about the peptide bond is not possible because of its partial double-bond character. The carbon and nitrogen atoms of the peptide bond and the two atoms to which each is attached are held rigidly in a plane (Figure 23.6). This regional planarity affects the way a chain of amino acids can fold, so it has important implications for the three-dimensional shapes of peptides and proteins (Section 23.13).

Figure 23.6 A segment of a polypeptide chain. The plane defined by each peptide bond is indicated. Notice that the R groups bonded to the α-carbons are on alternate sides of the peptide backbone.

PROBLEM 19 Draw a peptide bond in a cis configuration.
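The numbering and naming conventions given earlier in this section for Val-Cys-Ala-Glu-His can be sketched in a few lines of Python. This is only an illustration: the function names and the two lookup tables are hypothetical, and the tables cover just the five residues of that pentapeptide, with the “yl” names spelled as they are in the text.

```python
# Hypothetical lookup tables, limited to the residues of Val-Cys-Ala-Glu-His.
# The "yl" spellings follow the text ("cysteinyl" is the IUPAC-recommended
# form of "cysteyl").
ACYL_NAME = {"Val": "valyl", "Cys": "cysteyl", "Ala": "alanyl",
             "Glu": "glutamyl", "His": "histidyl"}
FULL_NAME = {"Val": "valine", "Cys": "cysteine", "Ala": "alanine",
             "Glu": "glutamate", "His": "histidine"}


def parse_peptide(hyphenated):
    """Split a known sequence, written N-terminal to C-terminal, into residues."""
    return hyphenated.split("-")


def numbered(residues):
    """Number the residues starting from the N-terminal end."""
    return [f"{res} {i}" for i, res in enumerate(residues, start=1)]


def peptide_name(residues):
    """Use "yl" names for every residue except the C-terminal one."""
    *inner, c_terminal = residues
    return "".join(ACYL_NAME[res] for res in inner) + FULL_NAME[c_terminal]


pentapeptide = parse_peptide("Val-Cys-Ala-Glu-His")
print(numbered(pentapeptide))      # the fourth entry is "Glu 4"
print(peptide_name(pentapeptide))  # valylcysteylalanylglutamylhistidine
```

Because the name is built from the N-terminal end, reversing the list would name a different peptide, which is why the left-to-right writing convention matters.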
Disulfide Bonds When thiols are oxidized under mild conditions, they form disulfides. A disulfide is a compound with an S ¬ S bond. 2R mild oxidation SH RS a thiol SR a disulfide An oxidizing agent commonly used for this reaction is Br2 (or I2) in a basic solution. mechanism for oxidation of a thiol to a disulfide R SH HO− R H2O S Br − Br R S R Br S − R S S R + Br− + Br− Because thiols can be oxidized to disulfides, disulfides can be reduced to thiols. RS SR reduction 2R SH a thiol a disulfide Cysteine is an amino acid that contains a thiol group. Two cysteine molecules therefore can be oxidized to a disulfide. This disulfide is called cystine. O 2 HSCH2CHCO + NH3 cysteine O O − mild oxidation − OCCHCH2S + SCH2CHCO− + NH3 NH3 cystine Two cysteine residues in a protein can be oxidized to a disulfide. This is known as a disulfide bridge. Disulfide bridges are the only covalent bonds that can form between nonadjacent amino acids. They contribute to the overall shape of a protein by holding the cysteine residues in close proximity, as shown in Figure 23.7. Section 23.7 Peptide Bonds and Disulfide Bonds 975 > Figure 23.7 Disulfide bridges cross-linking portions of a peptide. SH oxidation HS SH SH S SS S S reduction S SH SH polypeptide disulfide bridges cross-linking portions of a polypeptide Insulin, a hormone secreted by the pancreas, controls the level of glucose in the blood by regulating glucose metabolism. Insulin is a polypeptide with two peptide chains. The short chain (the A-chain) contains 21 amino acids and the long chain (the B-chain) contains 30 amino acids. The two chains are held together by two disulfide bridges. These are interchain disulfide bridges (between the A- and B-chains). Insulin also has an intrachain disulfide bridge (within the A-chain). 
insulin
A-chain: Gly-Ile-Val-Glu-Gln-Cys-Cys-Thr-Ser-Ile-Cys-Ser-Leu-Tyr-Gln-Leu-Glu-Asn-Tyr-Cys-Asn
B-chain: Phe-Val-Asn-Gln-His-Leu-Cys-Gly-Ser-His-Leu-Val-Glu-Ala-Leu-Tyr-Leu-Val-Cys-Gly-Glu-Arg-Gly-Phe-Phe-Tyr-Thr-Pro-Lys-Ala
intrachain disulfide bridge: Cys 6 to Cys 11 of the A-chain
interchain disulfide bridges: Cys 7 of the A-chain to Cys 7 of the B-chain, and Cys 20 of the A-chain to Cys 19 of the B-chain

HAIR: STRAIGHT OR CURLY?
Hair is made up of a protein known as keratin. Keratin contains an unusually large number of cysteine residues (about 8% of the amino acids), which give it many disulfide bridges to maintain its three-dimensional structure. People can alter the structure of their hair (if they feel that it is either too straight or too curly) by changing the location of these disulfide bridges. This is accomplished by first applying a reducing agent to the hair to reduce all the disulfide bridges in the protein strands. Then the hair is given the desired shape (using curlers to curl it or combing it straight to uncurl it), and an oxidizing agent is applied that forms new disulfide bridges. The new disulfide bridges maintain the hair’s new shape. When this treatment is applied to straight hair, it is called a “permanent.” When it is applied to curly hair, it is called “hair straightening.”

[Figure: arrangement of disulfide bridges between protein strands in curly hair and in straight hair]

PROBLEM 20 ◆
a. How many different octapeptides can be made from the 20 naturally occurring amino acids?
b. How many different proteins containing 100 amino acids can be made from the 20 naturally occurring amino acids?

PROBLEM 21 ◆ Which bonds in the backbone of a peptide can rotate freely?

23.8 Some Interesting Peptides

Oxytocin was the first small peptide to be synthesized. Its synthesis was achieved in 1953 by Vincent du Vigneaud (1901–1978), who later synthesized vasopressin. Du Vigneaud was born in Chicago and was a professor at George Washington University Medical School and later at Cornell University Medical College.
For synthesizing these nonapeptides, he received the Nobel Prize in chemistry in 1955. Enkephalins are pentapeptides that are synthesized by the body to control pain. They decrease the body’s sensitivity to pain by binding to receptors in certain brain cells. Part of the three-dimensional structures of enkephalins must be similar to those of morphine and painkillers such as Demerol® because they bind to the same receptors (Sections 30.3 and 30.6). Tyr-Gly-Gly-Phe-Leu Tyr-Gly-Gly-Phe-Met leucine enkephalin methionine enkephalin Bradykinin, vasopressin, and oxytocin are peptide hormones. They are all nonapeptides. Bradykinin inhibits the inflammation of tissues. Vasopressin controls blood pressure by regulating the contraction of smooth muscle. It is also an antidiuretic. Oxytocin induces labor in pregnant women and stimulates milk production in nursing mothers. Vasopressin and oxytocin both have an intrachain disulfide bond, and their C-terminal amino acids contain amide rather than carboxyl groups. Notice that the C-terminal amide group is indicated by writing “NH 2” after the name of the C-terminal amino acid. In spite of their very different physiological effects, vasopressin and oxytocin differ only by two amino acids. bradykinin Arg-Pro-Pro-Gly-Phe-Ser-Pro-Phe-Arg vasopressin Cys-Tyr-Phe-Gln-Asn-Cys-Pro-Arg-Gly-NH2 S oxytocin S Cys-Tyr-Ile-Gln-Asn-Cys-Pro-Leu-Gly-NH2 S L-Val L-Pro L-Orn L-Phe L-Leu L-Leu D-Phe D-Orn L-Pro L-Val gramicidin S O + H3NCH2CH2CH2CHCO− + ornithine NH3 S Gramicidin S is an antibiotic produced by a strain of bacteria. It is a cyclic decapeptide. Notice that it contains the amino acids L-ornithine (L-Orn), D-ornithine (D-Orn), and also D-phenylalanine. Ornithine is not listed in Table 23.1 because it occurs rarely in nature. Ornithine resembles lysine, but has one less methylene group in its side chain. 
The synthetic sweetener aspartame, or NutraSweet® (Section 22.21), is the methyl ester of a dipeptide of L-aspartate and L-phenylalanine. Aspartame is about 200 times sweeter than sucrose. The ethyl ester of the same dipeptide is not sweet. If a D-amino acid is substituted for either of the L-amino acids of aspartame, the resulting dipeptide is bitter rather than sweet. O + H3NCHC CH2 COO− aspartame NutraSweet O NHCHCOCH3 CH2 Section 23.9 Strategy of Peptide Bond Synthesis: N-Protection and C-Activation Glutathione is a tripeptide of glutamate, cysteine, and glycine. Its function is to destroy harmful oxidizing agents in the body. Oxidizing agents are thought to be responsible for some of the effects of aging and are believed to play a role in cancer (Section 9.8). Glutathione removes oxidizing agents by reducing them. Consequently, glutathione is oxidized, forming a disulfide bond between two glutathione molecules. An enzyme subsequently reduces the disulfide bond, allowing glutathione to react with more oxidizing agents. COO− O O 2 H3NCHCH2CH2C NHCHC + O NHCH2CO− CH2 glutathione SH reducing agent COO− oxidizing agent O O H3NCHCH2CH2C NHCHC + 3-D Molecules: Glutathione; Oxidized glutathione O NHCH2CO− CH2 S S CH2 + H3NCHCH2CH2C COO− NHCHC O NHCH2CO− O O oxidized glutathione PROBLEM 22 What is unusual about glutathione’s structure? (If you can’t answer this question, draw the structure you would expect for a tripeptide of glutamate, cysteine, and glycine, and compare your structure with the structure of glutathione.) 23.9 Strategy of Peptide Bond Synthesis: N-Protection and C-Activation Because amino acids have two functional groups, a problem arises when one attempts to make a particular peptide bond. For example, suppose you wanted to make the dipeptide Gly-Ala. That dipeptide is only one of four possible dipeptides that could be formed from alanine and glycine. 
O + H3NCH2C O O − NHCHCO CH3 Gly-Ala + H3NCHC O O − NHCHCO CH3 Ala-Ala + H3NCH2C O O − NHCH2CO + H3NCHC CH3 O NHCH2CO− CH3 Gly-Gly If the amino group of the amino acid that is to be on the N-terminal end (in this case, Gly) is protected, it will not be available to form a peptide bond. If the carboxyl group of this same amino acid is activated before the second amino acid is added, the amino group of the added amino acid (in this case, Ala) will react with the activated Ala-Gly 977 978 CHAPTER 23 Amino Acids, Peptides, and Proteins carboxyl group of glycine in preference to reacting with a nonactivated carboxyl group of another alanine molecule. glycine alanine O O protect H2NCH2CO− H2NCHCO− CH3 activate peptide bond is formed between these groups The reagent that is most often used to protect the amino group of an amino acid is di-tert-butyl dicarbonate. Its popularity is due to the ease with which the protecting group can be removed when the need for protection is over. The protecting group is known by the acronym t-BOC (pronounced tee-boc). CH3 CH3C O C CH3 O O O C O CH3 CH3 O − CCH3 + H2NCH2CO CH3C CH3 O O C CH3 O − NHCH2CO + CO2 + HO CH3 di-tert-butyl dicarbonate CCH3 CH3 N-protected glycine glycine Carboxylic acids are generally activated by being converted into acyl chlorides (Section 17.20). Acyl chlorides, however, are so reactive that they can readily react with the substituents of some of the amino acids during peptide synthesis, creating unwanted products. The preferred method for activating the carboxyl group of an N-protected amino acid is to convert it into an imidate using dicyclohexylcarbodiimide (DCC). (By now, you have probably noticed that biochemists are even more fond of acronyms than organic chemists are.) DCC activates a carboxyl group by putting a good leaving group on the carbonyl carbon. 
CH3 CH3C O O O N C NHCH2COH + C CH3 proton transfer N CH3 CH3C O O C O N NHCH2CO− + C + NH CH3 N-protected amino acid dicyclohexylcarbodiimide DCC CH3 CH3C O CH3 O C O NHCH2CO protected N C NH activated an imidate After the amino acid has its N-terminal group protected and its C-terminal group activated, the second amino acid is added to form the new peptide bond. The C ¬ O bond of the tetrahedral intermediate is easily broken (the activated group is a good leaving Section 23.9 Strategy of Peptide Bond Synthesis: N-Protection and C-Activation group) because the bonding electrons are delocalized, forming dicyclohexylurea, a stable diamide. [Recall that the weaker (more stable) the base, the better it is as a leaving group; see Section 17.5.] H B+ CH3 O CH3C O OCNHCH2C CH3 O N O CH3 C CH3C O H N OCNHCH2C O C + CH3 NH HB NH NH CH3CH O − OC H2NCHCO− O CH3 tetrahedral intermediate amino acid CH3 O O O OCNHCH2C CH3C NH − NHCHCO CH3 + O CH3 C NH new peptide bond dicyclohexylurea a diamide Amino acids can be added to the growing C-terminal end by repeating these two steps: activating the carboxyl group of the C-terminal amino acid of the peptide by treating it with DCC and then adding a new amino acid. CH3 O CH3C OC O CH3 O O 1. DCC − NHCH2C NHCHCO 2. H2NCHCO− CH3 CH3 CH3C O N-protected dipeptide OC O O NHCH2C NHCHC CH3 O NHCHCO− CH3 R N-protected tripeptide R When the desired number of amino acids has been added to the chain, the protecting group on the N-terminal amino acid is removed. t-BOC is an ideal protecting group because it can be removed by washing with trifluoroacetic acid and methylene chloride, reagents that will not break any other covalent bonds. The protecting group is removed by an elimination reaction, forming isobutylene and carbon dioxide. Because these products are gases, they escape, driving the reaction to completion. 
[Scheme: removal of the t-BOC protecting group. Treating the N-protected tripeptide (CH3)3COC(=O)-NHCH2C(=O)-NHCH(CH3)C(=O)-NHCH(R)COOH with CF3COOH in CH2Cl2 gives isobutylene, CH2=C(CH3)2, together with CO2 and the tripeptide +H3NCH2C(=O)-NHCH(CH3)C(=O)-NHCH(R)COOH.]

Theoretically, one should be able to make as long a peptide as desired with this technique. Reactions do not produce 100% yields, however, and the yields are further decreased during the purification process. After each step of the synthesis, the peptide must be purified to prevent subsequent unwanted reactions with leftover reagents. Assuming that each amino acid can be added to the growing end of the peptide chain in an 80% yield (a relatively high yield, as you can probably appreciate from your own experience in the laboratory), the overall yield of a nonapeptide such as bradykinin would be only 17%, because eight peptide bonds must be formed and 0.80 raised to the eighth power is 0.17. It is clear that large polypeptides could never be synthesized in this way.

Number of amino acids    2     3     4     5     6     7     8     9
Overall yield            80%   64%   51%   41%   33%   26%   21%   17%

PROBLEM 23 What dipeptides would be formed by heating a mixture of valine and N-protected leucine?

PROBLEM 24 Suppose you are trying to synthesize the dipeptide Val-Ser. Compare the product that would be obtained if the carboxyl group of N-protected valine were activated with thionyl chloride with the product that would be obtained if the carboxyl group were activated with DCC.

PROBLEM 25 Show the steps in the synthesis of the tetrapeptide Leu-Phe-Lys-Val.

PROBLEM 26 ◆
a. Calculate the overall yield of bradykinin if the yield for the addition of each amino acid to the chain is 70%.
b. What would be the overall yield of a peptide containing 15 amino acid residues if the yield for the incorporation of each is 80%?

23.10 Automated Peptide Synthesis

R. Bruce Merrifield was born in 1921 and received a B.S. and a Ph.D. from the University of California, Los Angeles. He is a professor of chemistry at Rockefeller University.
Merrifield received the 1984 Nobel Prize in chemistry for developing automated solid-phase peptide synthesis. In addition to producing low overall yields, the method of peptide synthesis described in Section 23.9 is extremely time-consuming because the product must be purified at each step of the synthesis. In 1969, Bruce Merrifield described a method that revolutionized the synthesis of peptides because it provided a much faster way to produce peptides in much higher yields. Furthermore, because it is automated, the synthesis requires fewer hours of direct attention. Using this technique, bradykinin was synthesized with an 85% yield in 27 hours. Subsequent refinements in the technique now allow a reasonable yield of a peptide containing 100 amino acids to be synthesized in four days. In the Merrifield method, the C-terminal amino acid is covalently attached to a solid support contained in a column. Each N-terminal blocked amino acid is added one at a time, along with other needed reagents, so the protein is synthesized from the C-terminal end to the N-terminal end. Notice that this is opposite to the way proteins are synthesized in nature (from the N-terminal end to the C-terminal end; Section 27.13). Because it uses a solid support and is automated, Merrifield’s method of protein synthesis is called automated solid-phase peptide synthesis. 
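The low overall yields of the stepwise method of Section 23.9 come from compounding: a peptide of n residues needs n − 1 couplings, so the overall yield is the per-step yield raised to the power n − 1. The short Python sketch below reproduces the yield table of Section 23.9 and estimates the average per-step yield implied by an 85% overall yield for bradykinin. It assumes that every coupling has the same yield and that purification losses are folded into that number; the function names are hypothetical.

```python
def overall_yield(n_residues, step_yield):
    """Overall yield for a peptide built one residue at a time.

    Assumes n_residues - 1 couplings, each with the same fractional yield.
    """
    if n_residues < 2:
        raise ValueError("a peptide needs at least two residues")
    return step_yield ** (n_residues - 1)


def required_step_yield(n_residues, target_overall):
    """Average per-step yield needed to reach a target overall yield."""
    return target_overall ** (1.0 / (n_residues - 1))


# Reproduce the table for an 80% yield per coupling (2 to 9 residues).
for n in range(2, 10):
    print(n, f"{overall_yield(n, 0.80):.0%}")

# Per-step yield implied by an 85% overall yield for a nonapeptide.
print(f"{required_step_yield(9, 0.85):.1%}")
```

Under these assumptions an 85% overall yield for a nonapeptide corresponds to roughly 98% per coupling, which shows how much depends on losing almost nothing at each step.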
Merrifield automated solid-phase synthesis of a tripeptide
1. The N-protected C-terminal amino acid, (CH3)3COC(=O)NHCH(R)COO−, reacts with a ClCH2 group of the resin and becomes attached to the resin through its carboxyl group.
2. CF3COOH in CH2Cl2 removes the t-BOC group, releasing CH2=C(CH3)2 and CO2 and leaving a free amino group on the resin-bound amino acid.
3. The second amino acid, N-protected and C-activated with DCC, is added and forms a peptide bond with the free amino group.
4. CF3COOH in CH2Cl2 removes the t-BOC group from the resin-bound dipeptide.
5. The third amino acid, N-protected and C-activated with DCC, is added, giving the N-protected tripeptide on the resin.
6. CF3COOH in CH2Cl2 removes the final t-BOC group.
7. HF cleaves the tripeptide, +H3NCH(R)C(=O)NHCH(R)C(=O)NHCH(R)COOH, from the resin, leaving HOCH2-resin.

The solid support to which the C-terminal amino acid is attached is a polystyrene resin similar to the one used in ion-exchange chromatography (Section 23.5), except that the benzene rings have chloromethyl substituents instead of sulfonic acid substituents. Before the C-terminal amino acid is attached to the resin, its amino group is protected with t-BOC to prevent the amino group from reacting with the resin. The C-terminal amino acid is attached to the resin by means of an SN2 reaction: its carboxyl group attacks a benzyl carbon of the resin, displacing a chloride ion (Section 10.4). After the C-terminal amino acid is attached to the resin, the t-BOC protecting group is removed (Section 23.9). The next amino acid, with its amino group protected with t-BOC and its carboxyl group activated with DCC, is added to the column. A huge advantage of the Merrifield method of peptide synthesis is that the growing peptide can be purified by washing the column with an appropriate solvent after each step of the procedure.
The impurities are washed out of the column because they are not attached to the solid support. Since the peptide is covalently attached to the resin, none of it is lost in the purification step, leading to high yields of purified product. After the required amino acids have been added one by one, the peptide can be removed from the resin by treatment with HF under mild conditions that do not break the peptide bonds. Merrifield’s technique is constantly being improved so that peptides can be made more rapidly and more efficiently. However, it still cannot begin to compare with nature: A bacterial cell is able to synthesize a protein thousands of amino acids long in seconds and can simultaneously synthesize thousands of different proteins with no mistakes. Since the early 1980s, it has been possible to synthesize proteins by genetic engineering techniques. Strands of DNA can be introduced into bacterial cells, causing the cells to produce large amounts of a desired protein (Section 27.13). For example, mass quantities of human insulin are produced from genetically modified E. coli. Genetic engineering techniques also have been useful in synthesizing proteins that differ in one or a few amino acids from the natural protein. Such synthetic proteins have been used, for example, to learn how a change in a single amino acid affects the properties of a protein (Section 24.9). PROBLEM 27 Show the steps in the synthesis of the peptide in Problem 25, using Merrifield’s method. 23.11 Protein Structure Protein molecules are described by several levels of structure. The primary structure of a protein is the sequence of amino acids in the chain and the location of all the disulfide bridges. The secondary structure describes the regular conformation assumed by segments of the protein’s backbone. In other words, the secondary structure describes how local regions of the backbone fold. The tertiary structure describes the three-dimensional structure of the entire polypeptide. 
If a protein has more than one polypeptide chain, it has quaternary structure. The quaternary structure of a protein is the way the individual protein chains are arranged with respect to each other. Proteins can be divided roughly into two classes. Fibrous proteins contain long chains of polypeptides that occur in bundles. These proteins are insoluble in water. All the structural proteins described at the beginning of this chapter, such as keratin and collagen, are fibrous proteins. Globular proteins are soluble in water and tend to have roughly spherical shapes. Essentially all enzymes are globular proteins.

PRIMARY STRUCTURE AND EVOLUTION
When we examine the primary structures of proteins that carry out the same function in different organisms, we can relate the number of amino acid differences between the proteins to the taxonomic differences between the species. For example, cytochrome c, a protein that transfers electrons in biological oxidations, has about 100 amino acid residues. Yeast cytochrome c differs by 48 amino acids from horse cytochrome c, while duck cytochrome c differs by only two amino acids from chicken cytochrome c. Chickens and turkeys have cytochrome c’s with identical primary structures. Humans and chimpanzees also have identical cytochrome c’s, differing by one amino acid from the cytochrome c of the rhesus monkey.

23.12 Determining the Primary Structure of a Protein

The first step in determining the sequence of amino acids in a peptide or a protein is to reduce any disulfide bridges in the peptide or protein. A commonly used reducing agent is 2-mercaptoethanol, which is oxidized to a disulfide. Reaction of the protein thiol groups with iodoacetic acid prevents the disulfide bridges from reforming as a result of oxidation by O2.
cleaving disulfide bridges
protein-CH2S-SCH2-protein + 2 HSCH2CH2OH (2-mercaptoethanol) → 2 protein-CH2SH + HOCH2CH2S-SCH2CH2OH
2 protein-CH2SH + 2 ICH2COOH (iodoacetic acid) → 2 protein-CH2SCH2COOH + 2 HI

PROBLEM 28 Write the mechanism for the reaction of a cysteine residue with iodoacetic acid.

The next step is to determine the number and kinds of amino acids in the peptide or protein. To do this, a sample of the peptide or protein is dissolved in 6 N HCl and heated at 100 °C for 24 hours. This treatment hydrolyzes all the amide bonds in the protein, including the amide bonds of asparagine and glutamine.

protein → [6 N HCl, 100 °C, 24 h] → amino acids

Insulin was the first protein for which the primary sequence was determined. This was done in 1953 by Frederick Sanger, who received the 1958 Nobel Prize in chemistry for his work. Sanger was born in England in 1918 and received a Ph.D. from Cambridge University, where he has worked for his entire career. He also received a share of the 1980 Nobel Prize in chemistry (Section 27.15) for being the first to sequence a DNA molecule (with 5375 nucleotide pairs).

The mixture of amino acids is then passed through an amino acid analyzer to determine the number and kind of each amino acid in the peptide or protein (Section 23.5). Because all the asparagine and glutamine residues have been hydrolyzed to aspartate and glutamate residues, the number of aspartate or glutamate residues in the amino acid mixture tells us the number of aspartate plus asparagine (or glutamate plus glutamine) residues in the original protein. A separate technique must be used to distinguish between aspartate and asparagine or between glutamate and glutamine in the original protein. The strongly acidic conditions used for hydrolysis destroy all the tryptophan residues because the indole ring is unstable in acid (Section 21.9).
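The bookkeeping consequences of acid hydrolysis can be made concrete with a small Python sketch. It takes a sequence and returns the composition an amino acid analyzer would report after treatment with 6 N HCl: asparagine is counted as aspartate, glutamine as glutamate, and tryptophan does not appear at all. The function name and the example peptide are hypothetical and are chosen only to exercise the three rules.

```python
from collections import Counter


def composition_after_acid_hydrolysis(residues):
    """Composition reported after complete acid hydrolysis of a peptide."""
    observed = Counter()
    for res in residues:
        if res == "Trp":
            continue         # destroyed under the strongly acidic conditions
        if res == "Asn":
            res = "Asp"      # side-chain amide is hydrolyzed
        elif res == "Gln":
            res = "Glu"      # side-chain amide is hydrolyzed
        observed[res] += 1
    return observed


# Hypothetical pentapeptide used only to illustrate the rules.
print(composition_after_acid_hydrolysis(["Asn", "Asp", "Gln", "Trp", "Gly"]))
```

For this example the analyzer would report two Asp, one Glu, and one Gly, so five residues look like four and the Asn/Asp distinction is lost, which is why separate techniques are needed for those residues.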
The tryptophan content of the protein can be determined by hydroxide-ion-promoted hydrolysis of the protein. This is not a general method for peptide bond hydrolysis because the strongly basic conditions destroy several other amino acid residues. There are several ways to identify the N-terminal amino acid of a peptide or protein. One of the most widely used methods is to treat the protein with phenyl isothiocyanate (PITC), more commonly known as Edman’s reagent. This reagent reacts with the N-terminal amino group, and the resulting thiazolinone derivative is cleaved from the protein under mildly acidic conditions. The thiazolinone derivative is extracted into an organic solvent and in the presence of acid, rearranges to a more stable phenylthiohydantoin (PTH). N O O O S H2NCHC NHCHC NHCHC R R′ R′′ O O O C NHCHC NHCHC NHCHC S R R′ R′′ C phenyl isothiocyanate PITC Edman's reagent N H H N+ HF N H R C CH H F O O S C NHCHC NHCHC R′ R′′ O O H3NCHC NHCHC O R − O HC HN + S + HN + R′ thiazolinone derivative R′′ peptide without the original N-terminal amino acid HCl R O HC HN N S PTH–amino acid Because each amino acid has a different substituent (R), each amino acid forms a different PTH. The particular PTH can be identified by chromatography using known standards. Several successive Edman degradations can be carried out on a protein. The entire primary sequence cannot be determined in this way, however, because side Section 23.12 Determining the Primary Structure of a Protein products accumulate that interfere with the results. An automated instrument known as a sequenator allows about 50 successive Edman degradations to be carried out on a protein. The C-terminal amino acid of the peptide or protein can be identified by treating the protein with carboxypeptidase A. Carboxypeptidase A cleaves off the C-terminal amino acid as long as it is not arginine or lysine (Section 24.9). 
On the other hand, carboxypeptidase B cleaves off the C-terminal amino acid only if it is arginine or lysine. Carboxypeptidases are exopeptidases. An exopeptidase is an enzyme that catalyzes the hydrolysis of a peptide bond at the end of a peptide chain.

[Scheme: the peptide chain -NHCH(R)C(=O)-NHCH(R′)C(=O)-NHCH(R′′)COO−, with the peptide bond to the C-terminal residue marked as the site where carboxypeptidase cleaves]

Once the N-terminal and C-terminal amino acids have been identified, a sample of the protein is hydrolyzed with dilute acid. This treatment, called partial hydrolysis, hydrolyzes only some of the peptide bonds. The resulting fragments are separated, and the amino acid composition of each is determined. The N-terminal and C-terminal amino acids of each fragment can also be identified. The sequence of the original protein can then be determined by lining up the peptides and looking for points of overlap.

PROBLEM-SOLVING STRATEGY
A nonapeptide undergoes partial hydrolysis to give peptides whose amino acid compositions are shown. Reaction of the intact nonapeptide with Edman’s reagent releases PTH-Leu. What is the sequence of the nonapeptide?
a. Pro, Ser
b. Gly, Glu
c. Met, Ala, Leu
d. Gly, Ala
e. Glu, Ser, Val, Pro
f. Glu, Pro, Gly
g. Met, Leu
h. His, Val

Let’s start with the N-terminal amino acid. We know that it is Leu. Now we need to look for a fragment that contains Leu. Fragment (g) tells us that Met is next to Leu and fragment (c) tells us that Ala is next to Met. Now we look for a fragment that contains Ala. Fragment (d) contains Ala and tells us that Gly is next to Ala. From fragment (b), we know that Glu comes next. Glu is in both fragments (e) and (f). Fragment (e) has two amino acids we have yet to place in the growing peptide, but fragment (f) has only one, so from fragment (f), we know that Pro is the next amino acid. Fragment (a) tells us that Ser is next to Pro. Now we can use fragment (e). Fragment (e) tells us that the amino acid after Ser is Val, and fragment (h) tells us that His is the last (C-terminal) amino acid.
Thus, the amino acid sequence of the nonapeptide is

Leu-Met-Ala-Gly-Glu-Pro-Ser-Val-His

Now continue on to Problem 29.

PROBLEM 29 ◆ A decapeptide undergoes partial hydrolysis to give peptides whose amino acid compositions are shown. Reaction of the intact decapeptide with Edman’s reagent releases PTH-Gly. What is the sequence of the decapeptide?
a. Ala, Trp
b. Val, Pro, Asp
c. Pro, Val
d. Ala, Glu
e. Trp, Ala, Arg
f. Arg, Gly
g. Glu, Ala, Leu
h. Met, Pro, Leu, Glu

The peptide or protein can also be partially hydrolyzed using endopeptidases. An endopeptidase is an enzyme that catalyzes the hydrolysis of a peptide bond that is not at the end of a peptide chain. Trypsin, chymotrypsin, and elastase are endopeptidases that catalyze the hydrolysis of only the specific peptide bonds listed in Table 23.3. Trypsin, for example, catalyzes the hydrolysis of the peptide bond on the C-side of only arginine or lysine residues.

Table 23.3 Specificity of Peptide or Protein Cleavage

Reagent                  Specificity
Chemical reagents
  Edman’s reagent        removes the N-terminal amino acid
  Cyanogen bromide       hydrolyzes on the C-side of Met
Exopeptidases*
  Carboxypeptidase A     removes the C-terminal amino acid (not Arg or Lys)
  Carboxypeptidase B     removes the C-terminal amino acid (only Arg or Lys)
Endopeptidases*
  Trypsin                hydrolyzes on the C-side of Arg and Lys
  Chymotrypsin           hydrolyzes on the C-side of amino acids that contain aromatic six-membered rings (Phe, Tyr, Trp)
  Elastase               hydrolyzes on the C-side of small amino acids (Gly and Ala)
* Cleavage will not occur if Pro is on either side of the bond to be hydrolyzed.

[Scheme: a peptide segment containing a lysine residue and an arginine residue, with the peptide bonds on the C-side of lysine and on the C-side of arginine marked]

Thus, trypsin will catalyze the hydrolysis of three peptide bonds in the following peptide, creating a hexapeptide, a dipeptide, and two tripeptides.
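The specificities in Table 23.3 can be applied mechanically, which is a convenient way to check a predicted set of fragments. The Python sketch below is a minimal illustration under stated assumptions: the names `C_SIDE_RULES`, `ENZYMES`, and `digest` are hypothetical, every susceptible bond is assumed to be cleaved completely, and the proline restriction is applied only to the enzymes (the entries marked with an asterisk in the table), not to cyanogen bromide.

```python
# Residues after which each reagent cleaves (C-side), taken from Table 23.3.
C_SIDE_RULES = {
    "trypsin": {"Arg", "Lys"},
    "chymotrypsin": {"Phe", "Tyr", "Trp"},
    "elastase": {"Gly", "Ala"},
    "cyanogen bromide": {"Met"},
}
# Only the enzymes are blocked by a neighboring proline.
ENZYMES = {"trypsin", "chymotrypsin", "elastase"}


def digest(residues, reagent):
    """Return the fragments produced by complete cleavage with one reagent."""
    targets = C_SIDE_RULES[reagent]
    fragments, current = [], []
    for i, res in enumerate(residues):
        current.append(res)
        is_last = i == len(residues) - 1
        if is_last or res not in targets:
            continue
        # The residue on the N-side of the bond is a target, so it cannot be
        # Pro; only the residue on the C-side of the bond needs checking.
        if reagent in ENZYMES and residues[i + 1] == "Pro":
            continue
        fragments.append(current)
        current = []
    fragments.append(current)
    return ["-".join(fragment) for fragment in fragments]


peptide = "Ala-Lys-Phe-Gly-Asp-Trp-Ser-Arg-Met-Val-Arg-Tyr-Leu-His".split("-")
print(digest(peptide, "trypsin"))
print(digest(["Ala", "Lys", "Pro"], "trypsin"))   # proline blocks cleavage
```

For the 14-residue peptide, the trypsin digest gives Ala-Lys, Phe-Gly-Asp-Trp-Ser-Arg, Met-Val-Arg, and Tyr-Leu-His: a dipeptide, a hexapeptide, and two tripeptides from three cleaved bonds.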
cleavage by trypsin (cleavage sites marked with |)
Ala-Lys | Phe-Gly-Asp-Trp-Ser-Arg | Met-Val-Arg | Tyr-Leu-His

Chymotrypsin catalyzes the hydrolysis of the peptide bond on the C-side of amino acids that contain aromatic six-membered rings (Phe, Tyr, Trp).

cleavage by chymotrypsin
Ala-Lys-Phe | Gly-Asp-Trp | Ser-Arg-Met-Val-Arg-Tyr | Leu-His

Elastase catalyzes the hydrolysis of peptide bonds on the C-side of small amino acids (Gly, Ala). Chymotrypsin and elastase are much less specific than trypsin. (An explanation for the specificity of these enzymes is given in Section 24.9.)

cleavage by elastase
Ala | Lys-Phe-Gly | Asp-Trp-Ser-Arg-Met-Val-Arg-Tyr-Leu-His

None of the exopeptidases or endopeptidases that we have mentioned will catalyze the hydrolysis of an amide bond if proline is at the hydrolysis site. These enzymes recognize the appropriate hydrolysis site by its shape and charge, and proline’s structure causes the hydrolysis site to have an unrecognizable three-dimensional shape.

Ala-Lys-Pro: trypsin will not cleave
Leu-Phe-Pro: chymotrypsin will not cleave
Pro-Phe-Val: chymotrypsin will cleave

Cyanogen bromide (BrC≡N) causes the hydrolysis of the amide bond on the C-side of a methionine residue. Cyanogen bromide is more specific than the endopeptidases about what peptide bonds it cleaves, so it provides more reliable information about the primary structure (the sequence of amino acids). Because cyanogen bromide is not a protein and therefore does not recognize the substrate by its shape, cyanogen bromide will still cleave the peptide bond if proline is at the cleavage site.

cleavage by cyanogen bromide
Ala-Lys-Phe-Gly-Lys-Trp-Ser-Arg-Met | Val-Arg-Tyr-Leu-His

The first step in the mechanism for cleavage of a peptide bond by cyanogen bromide is attack by the highly nucleophilic sulfur of methionine on cyanogen bromide.
Formation of a five-membered ring with departure of the weakly basic leaving group is followed by acid-catalyzed hydrolysis, which cleaves the protein (Section 18.6). Further hydrolysis can cause the lactone (a cyclic ester) to open to a carboxyl group and an alcohol group (Section 17.11).

[Scheme: mechanism for the cleavage of a peptide bond by cyanogen bromide]

The last step in determining the primary structure of a protein is to figure out the location of any disulfide bonds. This is done by hydrolyzing a sample of the protein that has intact disulfide bonds. From a determination of the amino acids in the cysteine-containing fragments, the locations of the disulfide bonds in the protein can be established (Problem 47).

PROBLEM 30 Why won’t cyanogen bromide cleave at cysteine residues?

PROBLEM 31 ◆ In determining the primary structure of insulin, what would lead you to conclude that it had more than one polypeptide chain?

PROBLEM 32 SOLVED Determine the amino acid sequence of a polypeptide from the following results:
Acid hydrolysis gives Ala, Arg, His, 2 Lys, Leu, 2 Met, Pro, 2 Ser, Thr, Val.
Carboxypeptidase A releases Val.
Edman’s reagent releases PTH-Leu.
Cleavage with cyanogen bromide gives three peptides with the following amino acid compositions:
1. His, Lys, Met, Pro, Ser
2. Thr, Val
3. Ala, Arg, Leu, Lys, Met, Ser
Trypsin-catalyzed hydrolysis gives three peptides and a single amino acid:
1. Arg, Leu, Ser
2. Met, Pro, Ser, Thr, Val
3. Lys
4. Ala, His, Lys, Met

SOLUTION Acid hydrolysis shows that the polypeptide has 13 amino acids. The N-terminal amino acid is Leu (Edman’s reagent), and the C-terminal amino acid is Val (carboxypeptidase A).
Leu ... Val

Because cyanogen bromide cleaves on the C-side of Met, any peptide containing Met must have Met as its C-terminal amino acid. The peptide that does not contain Met must be the C-terminal peptide. We know that peptide 3 is the N-terminal peptide because it contains Leu. Since it is a hexapeptide, we know that the 6th amino acid in the 13-amino acid peptide is Met. We also know that the eleventh amino acid is Met because cyanogen bromide cleavage gave the dipeptide Thr, Val.

Leu-(Ala, Arg, Lys, Ser)-Met-(His, Lys, Pro, Ser)-Met-Thr-Val

Because trypsin cleaves on the C-side of Arg and Lys, any peptide containing Arg or Lys must have that amino acid as its C-terminal amino acid. Therefore, Arg is the C-terminal amino acid of peptide 1, so we now know that the first three amino acids are Leu-Ser-Arg. We also know that the next two are Lys-Ala because if they were Ala-Lys, trypsin cleavage would give an Ala, Lys dipeptide. The trypsin data also identify the positions of His and Lys.

Leu-Ser-Arg-Lys-Ala-Met-His-Lys-(Pro, Ser)-Met-Thr-Val

Finally, because trypsin successfully cleaves on the C-side of Lys, Pro cannot be adjacent to Lys. Thus, the amino acid sequence of the given polypeptide is

Leu-Ser-Arg-Lys-Ala-Met-His-Lys-Ser-Pro-Met-Thr-Val

PROBLEM 33 ◆
Determine the primary structure of an octapeptide from the following data:
Acid hydrolysis gives 2 Arg, Leu, Lys, Met, Phe, Ser, Tyr.
Carboxypeptidase A releases Ser.
Edman’s reagent releases Leu.
Cyanogen bromide forms two peptides with the following amino acid compositions:
1. Arg, Phe, Ser
2. Arg, Leu, Lys, Met, Tyr
Trypsin forms the following two peptides and two amino acids:
1. Arg
2. Ser
3. Arg, Met, Phe
4. Leu, Lys, Tyr

23.13 Secondary Structure of Proteins
Secondary structure describes the conformation of segments of the backbone chain of a peptide or protein.
To minimize energy, a polypeptide chain tends to fold in a repeating geometric structure such as an α-helix or a β-pleated sheet. Three factors determine the choice of secondary structure:
• the regional planarity about each peptide bond (as a result of the partial double-bond character of the amide bond), which limits the possible conformations of the peptide chain (Section 23.7)
• maximization of the number of peptide groups that engage in hydrogen bonding (i.e., hydrogen bonding between the carbonyl oxygen of one amino acid residue and the amide hydrogen of another)
• adequate separation between nearby R groups to avoid steric hindrance and repulsion of like charges

[Figure: hydrogen bonding between peptide groups]

Figure 23.8 (a) A segment of a protein in an α-helix. (b) Looking up the longitudinal axis of an α-helix.

α-Helix
One type of secondary structure is the α-helix. In an α-helix, the backbone of the polypeptide coils around the long axis of the protein molecule (Figure 23.8). The helix is stabilized by hydrogen bonds: Each hydrogen attached to an amide nitrogen is hydrogen bonded to a carbonyl oxygen of an amino acid four residues away. The substituents on the α-carbons of the amino acids protrude outward from the helix, thereby minimizing steric hindrance. Because the amino acids have the L-configuration, the α-helix is a right-handed helix; that is, it rotates in a clockwise direction as it spirals down. Each turn of the helix contains 3.6 amino acid residues, and the repeat distance of the helix is 5.4 Å.

Not all amino acids are able to fit into an α-helix. A proline residue, for example, forces a bend in a helix because the bond between the proline nitrogen and the α-carbon cannot rotate to enable it to fit readily into a helix.
Similarly, two adjacent amino acids that have more than one substituent on a β-carbon (valine, isoleucine, or threonine) cannot fit into a helix because of steric crowding between the R groups. Finally, two adjacent amino acids with like-charged substituents cannot fit into a helix because of electrostatic repulsion between the R groups. The percentage of amino acid residues coiled into an α-helix varies from protein to protein, but, on average, about 25% of the residues in globular proteins are in α-helices.

β-Pleated Sheet
The second type of secondary structure is the β-pleated sheet. In a β-pleated sheet, the polypeptide backbone is extended in a zigzag structure resembling a series of pleats. A β-pleated sheet is almost fully extended: the average two-residue repeat distance is 7.0 Å. The hydrogen bonding in a β-pleated sheet occurs between neighboring peptide chains. The adjacent hydrogen-bonded peptide chains can run in the same direction or in opposite directions. In a parallel β-pleated sheet, the adjacent chains run in the same direction. In an antiparallel β-pleated sheet, the adjacent chains run in opposite directions (Figure 23.9).

Figure 23.9 Segment of a β-pleated sheet drawn to illustrate its pleated character. [Panels: parallel and antiparallel arrangements, with the N-terminal and C-terminal ends of each chain labeled]

Because the substituents (R) on the α-carbons of the amino acids on adjacent chains are close to each other, the chains can nestle closely together to maximize hydrogen-bonding interactions only if the substituents are small. Silk, for example, a protein with a large number of relatively small amino acids (glycine and alanine), has large segments of β-pleated sheets.
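As a quick check on the dimensions quoted for the two structures, the rise per residue along the chain axis follows directly from the numbers given in this section (3.6 residues and 5.4 Å per turn of the helix; a two-residue repeat of 7.0 Å in the pleated sheet). The symbol d is introduced here only for illustration and does not appear in the original text.

```latex
\begin{align*}
d_{\alpha\text{-helix}} &= \frac{5.4\ \text{\AA}}{3.6\ \text{residues}} = 1.5\ \text{\AA\ per residue} \\
d_{\beta\text{-sheet}} &= \frac{7.0\ \text{\AA}}{2\ \text{residues}} = 3.5\ \text{\AA\ per residue}
\end{align*}
```

On these figures, a residue in a pleated sheet spans a little more than twice the axial distance of a residue in a helix.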
The number of side-by-side strands in a β-pleated sheet ranges from 2 to 15 in a globular protein. The average strand in a β-pleated sheet section of a globular protein contains six amino acid residues.

Wool and the fibrous protein of muscle are examples of proteins with secondary structures that are almost all α-helices. Consequently, these proteins can be stretched. In contrast, the secondary structures of silk and spider webs are predominantly β-pleated sheets. Because the β-pleated sheet is a fully extended structure, these proteins cannot be stretched.

Coil Conformation
Generally, less than half of a globular protein is in an α-helix or β-pleated sheet (Figure 23.10). Most of the rest of the protein is still highly ordered but is difficult to describe. These polypeptide fragments are said to be in a coil conformation or a loop conformation.

Figure 23.10 The backbone structure of carboxypeptidase A: α-helical segments are purple; β-pleated sheets are indicated by flat green arrows pointing in the N → C direction.

PROBLEM 34 ◆
How long is an α-helix that contains 74 amino acids? Compare the length of this α-helix with the length of a fully extended peptide chain containing the same number of amino acids. (The distance between consecutive amino acids in a fully extended chain is 3.5 Å.)

β-PEPTIDES: AN ATTEMPT TO IMPROVE ON NATURE
Chemists are currently studying β-peptides, which are polymers of β-amino acids. These peptides have backbones one carbon longer than the peptides nature synthesizes using α-amino acids. Therefore, each β-amino acid residue has two carbons to which side chains can be attached. Like α-polypeptides, β-polypeptides fold into relatively stable helical and pleated sheet conformations, causing scientists to wonder whether biological activity might be possible with such peptides.
Recently, a β-peptide with biological activity has been synthesized, one that mimics the activity of the hormone somatostatin. There is hope that β-polypeptides will provide a source of new drugs and catalysts. Surprisingly, the peptide bonds in β-polypeptides are resistant to the enzymes that catalyze the hydrolysis of peptide bonds in α-polypeptides. This resistance to hydrolysis means that a β-polypeptide drug will have a longer duration of action in the bloodstream.

[Structures: an α-amino acid and a β-amino acid]

23.14 Tertiary Structure of Proteins
The tertiary structure of a protein is the three-dimensional arrangement of all the atoms in the protein. Proteins fold spontaneously in solution in order to maximize their stability. Every time there is a stabilizing interaction between two atoms, free energy is released. The more free energy released (the more negative the ΔG°), the more stable the protein. So a protein tends to fold in a way that maximizes the number of stabilizing interactions (Figure 23.11).

Figure 23.11 Stabilizing interactions responsible for the tertiary structure of a protein.

Max Ferdinand Perutz and John Cowdery Kendrew were the first to determine the tertiary structure of a protein. Using X-ray diffraction, they determined the tertiary structure of myoglobin (1957) and hemoglobin (1959). For this work, they shared the 1962 Nobel Prize in chemistry. Max Perutz was born in Austria in 1914. In 1936, because of the rise of Nazism, he moved to England. He received a Ph.D. from, and became a professor at, Cambridge University. He worked on the three-dimensional structure of hemoglobin and assigned the work on myoglobin (a smaller protein) to John Kendrew (1917–1997).
Kendrew was born in England and was educated at Cambridge University.

The stabilizing interactions include covalent bonds, hydrogen bonds, electrostatic attractions (attractions between opposite charges), and hydrophobic (van der Waals) interactions. Stabilizing interactions can occur between peptide groups (atoms in the backbone of the protein), between side-chain groups (α-substituents), and between peptide and side-chain groups. Because the side-chain groups help determine how a protein folds, the tertiary structure of a protein is determined by its primary structure.

Disulfide bonds are the only covalent bonds that can form when a protein folds. The other bonding interactions that occur in folding are much weaker, but because there are so many of them (Figure 23.12), they are the important interactions in determining how a protein folds.

Most proteins exist in aqueous environments. Therefore, they tend to fold in a way that exposes the maximum number of polar groups to the aqueous environment and that buries the nonpolar groups in the interior of the protein, away from water.

The interactions between nonpolar groups are known as hydrophobic interactions. These interactions increase the stability of a protein by increasing the entropy of water molecules. Water molecules that surround nonpolar groups are highly structured. When two nonpolar groups come together, the surface area in contact with water decreases, decreasing the amount of structured water. Decreasing structure increases entropy, which in turn decreases the free energy, which increases the stability of the protein. (Recall that ΔG° = ΔH° − TΔS°.)

Figure 23.12 The three-dimensional structure of carboxypeptidase A.

PROBLEM 35 ◆
How would a protein that resides in the interior of a membrane fold, compared with the water-soluble protein just discussed?
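The entropy argument for hydrophobic interactions in Section 23.14 can be laid out term by term. This is only a sketch: the condition on the enthalpy term is an assumption added here for completeness and is not stated in the original text.

```latex
\begin{align*}
\Delta G^\circ &= \Delta H^\circ - T\,\Delta S^\circ \\
\Delta S^\circ &> 0 \quad \text{(structured water is released when nonpolar groups come together)} \\
\Rightarrow\quad -T\,\Delta S^\circ &< 0 \quad \text{(since } T > 0\text{)} \\
\Rightarrow\quad \Delta G^\circ &\ \text{becomes more negative, provided } \Delta H^\circ \text{ does not offset the entropy term.}
\end{align*}
```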
23.15 Quaternary Structure of Proteins
Proteins that have more than one peptide chain are called oligomers. The individual chains are called subunits. A protein with a single subunit is called a monomer, one with two subunits is called a dimer, one with three subunits is called a trimer, and one with four subunits is called a tetramer. Hemoglobin is an example of a tetramer. It has two different kinds of subunits and two of each kind. The quaternary structure of hemoglobin is shown in Figure 23.13.

The subunits are held together by the same kinds of interactions that hold the individual protein chains in a particular three-dimensional conformation: hydrophobic interactions, hydrogen bonding, and electrostatic attractions. The quaternary structure of a protein describes the way the subunits are arranged in space. Some of the possible arrangements of the six subunits of a hexamer are shown here:

[Figure: possible quaternary structures for a hexamer]

PROBLEM 36 ◆
a. Which of the following water-soluble proteins would have the greatest percentage of polar amino acids: a spherical protein, a cigar-shaped protein, or a subunit of a hexamer?
b. Which of these would have the smallest percentage of polar amino acids?

Figure 23.13 Computer graphic representation of the quaternary structure of hemoglobin. The orange and pink subunits are identical, as are the green and purple subunits. The cylindrical tubes represent the polypeptide chains, while the beads represent the iron-containing porphyrin rings (Section 21.11).

23.16 Protein Denaturation
Destroying the highly organized tertiary structure of a protein is called denaturation. Anything that breaks the bonds responsible for maintaining the three-dimensional shape of the protein will cause the protein to denature (unfold). Because these bonds are weak, proteins are easily denatured. The totally random conformation of a denatured protein is called a random coil.
The following are some of the ways that proteins can be denatured:
• Changing the pH denatures proteins because it changes the charges on many of the side chains. This disrupts electrostatic attractions and hydrogen bonds.
• Certain reagents such as urea and guanidine hydrochloride denature proteins by forming hydrogen bonds to the protein groups that are stronger than the hydrogen bonds formed between the groups.
• Detergents such as sodium dodecyl sulfate denature proteins by associating with the nonpolar groups of the protein, thus interfering with the normal hydrophobic interactions.
• Organic solvents denature proteins by disrupting hydrophobic interactions.
• Proteins can also be denatured by heat or by agitation. Both increase molecular motion, which can disrupt the attractive forces. A well-known example is the change that occurs to the white of an egg when it is heated or whipped.

Summary
Peptides and proteins are polymers of amino acids linked together by peptide (amide) bonds. A dipeptide contains two amino acid residues, a tripeptide contains three, an oligopeptide contains three to 10, and a polypeptide contains many amino acid residues. Proteins have 40 to 4000 amino acid residues. The amino acids differ only in the substituent attached to the α-carbon. Most amino acids found in nature have the L configuration.

The carboxyl groups of the amino acids have pKa values of ~2 and the protonated amino groups have pKa values of ~9. At physiological pH, an amino acid exists as a zwitterion. A few amino acids have side chains with ionizable hydrogens. The isoelectric point (pI) of an amino acid is the pH at which the amino acid has no net charge. A mixture of amino acids can be separated based on their pI’s by electrophoresis or based on their polarities by paper chromatography or thin-layer chromatography. Preparative separation can be achieved using ion-exchange chromatography employing a cation-exchange resin.
An amino acid analyzer is an instrument that automates ion-exchange chromatography. A racemic mixture of amino acids can be separated by a kinetic resolution.

The amide bonds that link amino acid residues are called peptide bonds. A peptide bond has about 40% double-bond character. Two cysteine residues can be oxidized to a disulfide bridge. Disulfide bridges are the only covalent bonds that can form between nonadjacent amino acids. By convention, peptides and proteins are written with the free amino group (the N-terminal amino acid) on the left and the free carboxyl group (the C-terminal amino acid) on the right.

To synthesize a peptide bond, the amino group of the first amino acid must be protected (by t-BOC) and its carboxyl group activated (with DCC). The second amino acid is added to form a dipeptide. Amino acids can be added to the growing C-terminal end by repeating these two steps: activating the carboxyl group of the C-terminal amino acid with DCC and adding a new amino acid. Automated solid-phase peptide synthesis allows peptides to be made more rapidly and in higher yields.

The primary structure of a protein is the sequence of its amino acids and the location of all its disulfide bridges. The N-terminal amino acid of a peptide or protein can be determined with Edman’s reagent. The C-terminal amino acid can be identified with carboxypeptidase. Partial hydrolysis hydrolyzes only some of the peptide bonds. An exopeptidase catalyzes the hydrolysis of a peptide bond at the end of a peptide chain. An endopeptidase catalyzes the hydrolysis of a peptide bond that is not at the end of a peptide chain.

The secondary structure of a protein describes how local segments of the protein’s backbone fold. A protein folds so as to maximize the number of stabilizing interactions: covalent bonds, hydrogen bonds, electrostatic attractions (attraction between opposite charges), and hydrophobic interactions (interactions between nonpolar groups).
An α-helix, a β-pleated sheet, and a coil conformation are types of secondary structure. The tertiary structure of a protein is the three-dimensional arrangement of all the atoms in the protein. Proteins with more than one peptide chain are called oligomers. The individual chains are called subunits. The quaternary structure of a protein describes the way the subunits are arranged with respect to each other in space.

Key Terms
amino acid (p. 959)
D-amino acid (p. 964)
L-amino acid (p. 964)
amino acid analyzer (p. 970)
amino acid residue (p. 959)
anion-exchange resin (p. 970)
antiparallel β-pleated sheet (p. 990)
automated solid-phase peptide synthesis (p. 980)
cation-exchange resin (p. 970)
coil conformation (p. 991)
C-terminal amino acid (p. 973)
denaturation (p. 994)
dipeptide (p. 959)
disulfide (p. 979)
disulfide bridge (p. 974)
Edman’s reagent (p. 984)
electrophoresis (p. 968)
endopeptidase (p. 985)
enkephalins (p. 976)
enzyme (p. 960)
essential amino acid (p. 963)
exopeptidase (p. 985)
fibrous protein (p. 982)
globular protein (p. 982)
α-helix (p. 989)
hydrophobic interactions (p. 992)
interchain disulfide bridge (p. 975)
intrachain disulfide bridge (p. 975)
ion-exchange chromatography (p. 970)
isoelectric point (p. 966)
kinetic resolution (p. 972)
loop conformation (p. 991)
N-terminal amino acid (p. 973)
oligomer (p. 993)
oligopeptide (p. 959)
paper chromatography (p. 969)
parallel β-pleated sheet (p. 990)
partial hydrolysis (p. 985)
peptide (p. 959)
peptide bond (p. 973)
β-pleated sheet (p. 990)
polypeptide (p. 959)
primary structure (p. 982)
protein (p. 959)
quaternary structure (p. 982)
random coil (p. 994)
secondary structure (p. 982)
structural protein (p. 959)
subunit (p. 993)
tertiary structure (p. 982)
thin-layer chromatography (p. 969)
tripeptide (p. 959)
zwitterion (p. 965)

Problems
37.
Unlike most amines and carboxylic acids, amino acids are insoluble in diethyl ether. Explain.

38. Indicate the peptides that would result from cleavage by the indicated reagent:
a. His-Lys-Leu-Val-Glu-Pro-Arg-Ala-Gly-Ala by trypsin
b. Leu-Gly-Ser-Met-Phe-Pro-Tyr-Gly-Val by chymotrypsin
c. Val-Arg-Gly-Met-Arg-Ala-Ser by carboxypeptidase A
d. Ser-Phe-Lys-Met-Pro-Ser-Ala-Asp by cyanogen bromide
e. Arg-Ser-Pro-Lys-Lys-Ser-Glu-Gly by trypsin

39. Aspartame has a pI of 5.9. Draw its most prevalent form at physiological pH.

40. Draw the form of aspartic acid that predominates at
a. pH = 1.0 b. pH = 2.6 c. pH = 6.0 d. pH = 11.0

41. Dr. Kim S. Tree was preparing a manuscript for publication in which she reported that the pI of the tripeptide Lys-Lys-Lys was 10.6. One of her students pointed out that there must be an error in her calculations because the pKa of the ε-amino group of lysine is 10.8 and the pI of the tripeptide has to be greater than any of its individual pKa values. Was the student correct?

42. A mixture of amino acids that do not separate sufficiently when a single technique is used can often be separated by two-dimensional chromatography. In this technique, the mixture of amino acids is applied to a piece of filter paper and separated by chromatographic techniques. The paper is then rotated 90°, and the amino acids are further separated by electrophoresis, producing a type of chromatogram called a fingerprint. Identify the spots in the fingerprint obtained from a mixture of Ser, Glu, Leu, His, Met, and Thr.
[Figure: fingerprint; electrophoresis at pH = 5 (− to +) on one axis, chromatography on the other]

43. Explain the difference in the pKa values of the carboxyl groups of alanine, serine, and cysteine.

44. Which would be a more effective buffer at physiological pH, a solution of 0.1 M glycylglycylglycylglycine or a solution of 0.2 M glycine?

45. Identify the location and type of charge on the hexapeptide Lys-Ser-Asp-Cys-His-Tyr at
a. pH = 7 b. pH = 5 c.
pH = 9

46. The following polypeptide was treated with 2-mercaptoethanol and then with iodoacetic acid. After reacting with maleic anhydride, the peptide was hydrolyzed by trypsin. (Treatment with maleic anhydride causes trypsin to cleave a peptide only at arginine residues.)
Gly-Ser-Asp-Ala-Leu-Pro-Gly-Ile-Thr-Ser-Arg-Asp-Val-Ser-Lys-Val-Glu-Tyr-Phe-Glu-Ala-Gly-Arg-Ser-Glu-Phe-Lys-Glu-Pro-Arg-Leu-Tyr-Met-Lys-Val-Glu-Gly-Arg-Pro-Val-Ser-Ala-Gly-Leu-Trp
a. Why, after a peptide is treated with maleic anhydride, does trypsin no longer cleave it at lysine residues?
b. How many fragments are obtained from the peptide?
c. In what order would the fragments be eluted from an anion-exchange column using a buffer of pH = 5?

47. Treatment of a polypeptide with 2-mercaptoethanol yields two polypeptides with the following primary sequences:
Val-Met-Tyr-Ala-Cys-Ser-Phe-Ala-Glu-Ser
Ser-Cys-Phe-Lys-Cys-Trp-Lys-Tyr-Cys-Phe-Arg-Cys-Ser
Treatment of the original intact polypeptide with chymotrypsin yields the following peptides:
a. Ala, Glu, Ser
b. 2 Phe, 2 Cys, Ser
c. Tyr, Val, Met
d. Arg, Ser, Cys
e. Ser, Phe, 2 Cys, Lys, Ala, Trp
f. Tyr, Lys
Determine the positions of the disulfide bridges in the original polypeptide.

48. Show how aspartame can be synthesized using DCC.

49. Reaction of a polypeptide with carboxypeptidase A releases Met. The polypeptide undergoes partial hydrolysis to give the following peptides. What is the sequence of the polypeptide?
a. Ser, Lys, Trp
b. Gly, His, Ala
c. Glu, Val, Ser
d. Leu, Glu, Ser
e. Met, Ala, Gly
f. Ser, Lys, Val
g. Glu, His
h. Leu, Lys, Trp
i. Lys, Ser
j. Glu, His, Val
k. Trp, Leu, Glu
l. Ala, Met

50. Glycine has pKa values of 2.3 and 9.6. Would you expect the pKa values of glycylglycine to be higher or lower than these values?

51. A mixture of 15 amino acids gave the fingerprint shown below (see also Problem 42). Identify the spots. (Hint 1: Pro reacts with ninhydrin to form a yellow color; Phe and Tyr form a yellow-green color.
Hint 2: Count the number of spots before you start.)
[Figure: fingerprint of the starting mixture (Ala, Arg, Asp, Glu, Gly, Ile, Leu, Met, Phe, Pro, Ser, Thr, Trp, Tyr, Val); electrophoresis at pH = 5 (− to +) on one axis, chromatography on the other, with the origin marked]

52. Dithiothreitol reacts with disulfide bridges in the same way that 2-mercaptoethanol does. With dithiothreitol, however, the equilibrium lies much more to the right. Explain.
[Scheme: dithiothreitol + RSSR ⇌ cyclic disulfide of dithiothreitol + 2 RSH]

53. α-Amino acids can be prepared by treating an aldehyde with ammonia and hydrogen cyanide, followed by acid-catalyzed hydrolysis.
a. Give the structures of the two intermediates formed in this reaction.
b. What amino acid is formed when the aldehyde that is used is 3-methylbutanal?
c. What aldehyde would be needed to prepare valine?

54. The UV spectra of tryptophan, tyrosine, and phenylalanine are shown here. Each spectrum is that of a 1 × 10⁻³ M solution of the amino acid, buffered at pH = 6.0. Calculate the approximate molar absorptivity of each of the three amino acids at 280 nm.
[Figure: UV spectra of tryptophan, tyrosine, and phenylalanine; absorbance (0 to 0.6) versus wavelength (230 to 310 nm)]

55. A normal polypeptide and a mutant of the polypeptide were hydrolyzed by an endopeptidase under the same conditions. The normal and mutant differ by one amino acid residue. The fingerprints of the peptides obtained from the normal and mutant polypeptides are as shown. What kind of amino acid substitution occurred as a result of the mutation? (That is, is the substituted amino acid more or less polar than the original amino acid? Is its pI lower or higher?)
[Figure: fingerprints of the normal and mutant polypeptides; electrophoresis at pH 6.5 (− to +) on one axis, paper chromatography on the other]

56. Determine the amino acid sequence of a polypeptide from the following results:
a. Complete hydrolysis of the peptide yields the following amino acids: Ala, Arg, Gly, 2 Lys, Met, Phe, Pro, 2 Ser, Tyr, Val.
b.
Treatment with Edman’s reagent gives PTH-Val.
c. Carboxypeptidase A releases Ala.
d. Treatment with cyanogen bromide yields the following two peptides:
1. Ala, 2 Lys, Phe, Pro, Ser, Tyr
2. Arg, Gly, Met, Ser, Val
e. Treatment with chymotrypsin yields the following three peptides:
1. 2 Lys, Phe, Pro
2. Arg, Gly, Met, Ser, Tyr, Val
3. Ala, Ser
f. Treatment with trypsin yields the following three peptides:
1. Gly, Lys, Met, Tyr
2. Ala, Lys, Phe, Pro, Ser
3. Arg, Ser, Val

57. The C-terminal end of a protein extends into the aqueous environment surrounding the protein. The C-terminal amino acids are Gln, Asp, 2 Ser, and three nonpolar amino acids. Assuming that the ΔG° for formation of a hydrogen bond is −3 kcal/mol and the ΔG° for removal of a hydrophobic group from water is −4 kcal/mol, calculate the ΔG° for folding the C-terminal end of the protein into the interior of the protein under the following conditions:
a. All the polar groups form one intramolecular hydrogen bond.
b. All but two of the polar groups form intramolecular hydrogen bonds.

58. Professor Mary Gold wanted to test her hypothesis that the disulfide bridges that form in many proteins do so after the minimum energy conformation of the protein has been achieved. She treated a sample of lysozyme, an enzyme containing four disulfide bridges, with 2-mercaptoethanol and then added urea to denature the enzyme. She slowly removed these reagents so that the enzyme could refold and reform the disulfide bridges. The lysozyme she recovered had 80% of its original activity. What would be the percent activity in the recovered enzyme if disulfide bridge formation were entirely random rather than determined by the tertiary structure? Does this experiment support Professor Gold’s hypothesis?