Exercise 2: Chymotrypsin - Active Site and Specificity

1. Obtain the amino acid sequences of bovine chymotrypsin and trypsin as follows. Go to the National Center for Biotechnology Information website: http://www.ncbi.nlm.nih.gov/. Select “Protein” from the dropdown menu and enter the identification numbers for the two proteins (“gi” numbers) in the search box: “576117, 60593450”. Click “Go.” The entries for the two proteins should appear. From the “Display” dropdown menu, select “FASTA.” You will then see the amino acid sequences of the proteins in FASTA format. (This is a format in which the “>” symbol is followed by identification information and a carriage return; the amino acid sequence, using the one-letter codes, begins after that.) Keep this window open so that you can copy the sequences during step 2.

Chymotrypsin and trypsin are both serine proteases. Name the three active site residues of serine proteases that constitute the catalytic triad.
___________________________________________________________________

What chemical reaction do these enzymes catalyze?
___________________________________________________________________

Describe the differing specificities of chymotrypsin and trypsin for this reaction.
___________________________________________________________________

2. Open a new browser window and go to the EMBL-EBI Toolbox pairwise alignment site: http://www.ebi.ac.uk/emboss/align/index.html. Copy and paste the sequences in FASTA format into the two textboxes (include the “>” and identification information). Leave all alignment parameters the same and click “Run.” When your output appears, click the “Needle output” link to obtain a printer-friendly version of the alignment. Print the alignment (you will turn in all alignments at the end).

Examine the sequence alignment, paying close attention to the residues of the catalytic triad. You may wish to highlight these residues. Determine whether or not this is a good alignment.
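When judging an alignment, it can help to see how a score is assembled from matches, mismatches, and gaps. The following is a minimal sketch, not the scoring used by the alignment site: it assumes a simple match/mismatch score in place of a substitution matrix, uses illustrative penalty values, and charges the opening penalty for the first position of a gap and the extension penalty for each additional position (programs differ on this convention). The function name and the toy sequences are hypothetical.

```python
def affine_gap_score(row_a, row_b, match=1, mismatch=-1,
                     gap_open=10.0, gap_extend=0.5):
    """Score two equal-length aligned rows, where '-' marks a gap."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    score = 0.0
    prev_gap = None  # row that held a gap in the previous column, if any
    for a, b in zip(row_a, row_b):
        if a == "-" and b == "-":
            raise ValueError("a column cannot contain two gaps")
        if a == "-" or b == "-":
            gap_row = "a" if a == "-" else "b"
            # A new gap pays gap_open; lengthening the same gap pays gap_extend.
            score -= gap_extend if prev_gap == gap_row else gap_open
            prev_gap = gap_row
        else:
            score += match if a == b else mismatch
            prev_gap = None
    return score


# Two toy alignments of the same pair of made-up sequences.
one_long_gap = ("ACDEFGHIK", "AC---GHIK")
three_short_gaps = ("ACDEFGHIK", "A-C-G-HIK")

print(affine_gap_score(*one_long_gap))       # -5.0
print(affine_gap_score(*three_short_gaps))   # -28.0

# With a cheap opening penalty the difference between the two shrinks.
print(affine_gap_score(*one_long_gap, gap_open=1, gap_extend=1))      # 3.0
print(affine_gap_score(*three_short_gaps, gap_open=1, gap_extend=1))  # -1.0
```

Under this scheme a high opening penalty favors a few long gaps over many short ones, which is one reason the same two sequences can align differently when the gap parameters change.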
Click the “back” button on the browser window twice to go back to the pairwise alignment site. Change the “Gap Open” and “Gap Extend” parameters and generate another alignment. Try at least two different combinations and print the alignment each time. Carefully compare all the alignments, determine which one is best, and label it “BEST” (you may decide that some alignments are roughly equal).

Use the space provided to answer the following discussion questions:

A. What constitutes a “good” alignment?

B. Why did you choose this particular alignment as the best relative to the other alignments you generated?

3. Chymotrypsin residues 189, 190, and 228 are three of the residues lining the specificity pocket of the enzyme (where a side chain of the substrate binds). Use your best sequence alignment to determine the identity of these residues for chymotrypsin, and then determine the identity of the corresponding residues for trypsin. (The residue numbers will be different for trypsin.)

        Chymotrypsin    Trypsin
189     ____________    ____________
190     ____________    ____________
228     ____________    ____________

Obtain the handout from the instructor which shows the structure of chymotrypsin.

Use the space provided to answer the following discussion question:

Although other features of these two enzymes are essential for their differing specificities, the residues lining the specificity pocket make an important contribution. Residue 189 of chymotrypsin and the corresponding residue of trypsin are each at the “base” of the specificity pocket, as you can see on the handout. Reconcile the identity of this residue in chymotrypsin and trypsin with the differing specificities of the two enzymes.

4. You have been working with bovine serine proteases so far. Choose an organism that is distantly related to the cow (Bos taurus) and search the NCBI protein database for a chymotrypsin or trypsin sequence from this organism.
(Go to http://www.ncbi.nlm.nih.gov/ and choose “Protein” from the dropdown menu.) When you find a sequence that looks interesting, click on the link to view the full entry. Read the entry carefully to be sure that you have found a full-length sequence. (For instance, 50 amino acids is only a partial sequence.) Write down the gi number for the sequence so that you will have it for step 5.

Use the space provided to propose some hypotheses before examining the sequence further:

A. Predict whether or not the enzyme you chose will contain the residues forming the catalytic triad. Explain your reasoning.

B. Predict the identities of the three residues lining the specificity pocket which you examined in step 3 for bovine chymotrypsin and trypsin. Explain your reasoning.

5. You will now do a multiple sequence alignment for bovine chymotrypsin, trypsin, and the serine protease you chose in step 4. You will need to obtain these three sequences in FASTA format. If you already closed the browser windows where these sequences were displayed, just go back to the NCBI website and use the gi numbers to find them.

Open a new browser window and go to the EMBL-EBI Toolbox ClustalW multiple sequence alignment site: http://www.ebi.ac.uk/clustalw/index.html. Copy and paste the three FASTA-format sequences into the textbox (paste the sequences one after the other with a carriage return between them; include the “>” and identification information for each sequence). Leave all alignment parameters the same and click “Run.” When your output appears, click the “Alignment file” link to obtain a printer-friendly version of the alignment. Print the alignment (you will turn in all alignments at the end).

Examine the alignment, paying careful attention to the residues of the catalytic triad and the three residues lining the specificity pocket. If you are not satisfied with the alignment, try it again, changing the “Gap open,” “End gaps,” “Gap extension,” or “Gap distances” parameters.
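Finding the residue that corresponds to a numbered position in another sequence amounts to walking along the alignment columns while counting only the non-gap characters in each row. The sketch below illustrates that bookkeeping on made-up rows. The function name is hypothetical, and it assumes that positions are counted from the first residue of the sequence as it appears in the alignment, which may not match the numbering convention used in a published structure.

```python
def aligned_partner(row_ref, row_other, ref_position):
    """Find what sits opposite residue number ref_position of row_ref.

    Positions are 1-based and skip gap characters ('-').
    Returns (ref_residue, other_residue, other_position); other_position is
    None when the reference residue is aligned against a gap.
    """
    ref_count = 0
    other_count = 0
    for ref_char, other_char in zip(row_ref, row_other):
        if ref_char != "-":
            ref_count += 1
        if other_char != "-":
            other_count += 1
        if ref_char != "-" and ref_count == ref_position:
            if other_char == "-":
                return ref_char, "-", None
            return ref_char, other_char, other_count
    raise ValueError("ref_position is beyond the end of the reference sequence")


# Toy aligned rows (not real protease sequences).
row_ref = "MKT-AYIAK"
row_other = "M-TGAY-AK"

print(aligned_partner(row_ref, row_other, 7))  # ('A', 'A', 6)
print(aligned_partner(row_ref, row_other, 6))  # ('I', '-', None)
```

The first call shows why the residue numbers differ between two aligned sequences: gaps shift the count in one row relative to the other.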
Use the space provided to answer the following discussion questions:

A. Were your hypotheses from step 4 correct? On your alignment, highlight the residues of the catalytic triad and the three residues lining the specificity pocket. Name any residues that turned out to be different from your predictions and discuss the implications in terms of enzymatic activity and specificity.

B. Based on amino acid sequence alone, do you expect the serine protease from the organism you chose to have chymotrypsin-like activity or trypsin-like activity? Justify your answer.

C. Compare your multiple sequence alignment to that of two other groups who have chosen a serine protease from a different organism; focus on the catalytic triad and the three residues lining the specificity pocket. Were the identities of the three residues lining the specificity pocket for the serine proteases chosen by the other groups different from those of the serine protease you chose? Discuss any differences.

D. Looking at your multiple sequence alignment, are there any regions that are highly conserved (many identical amino acids for all three sequences)? Are there any regions that are not well conserved (few identical amino acids for all three sequences)? Do any of these regions surround the residues of the catalytic triad or the three residues lining the specificity pocket? How might the locations of any highly conserved regions relate to the structure and function of the three enzymes?

When finished, turn in your sequence alignments and answers to the discussion questions.
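For groups that want to check their highlighting of conserved regions, the idea of a column that is identical in all three sequences can be sketched in a few lines. This is an illustration on made-up rows, with a hypothetical function name. It marks only fully identical columns, similar in spirit to the “*” marks in Clustal-format output, and does not reproduce any marks for merely similar residues.

```python
def conservation_line(rows):
    """Return a string with '*' under every column where all rows share
    the same residue, and a space elsewhere."""
    if len({len(row) for row in rows}) != 1:
        raise ValueError("all aligned rows must have the same length")
    marks = []
    for column in zip(*rows):
        # Identical means one distinct character in the column, and not a gap.
        identical = len(set(column)) == 1 and column[0] != "-"
        marks.append("*" if identical else " ")
    return "".join(marks)


# Toy aligned rows (not real protease sequences).
rows = [
    "MKTAYIAK",
    "MKSAY-AK",
    "MRTAYLAK",
]

for row in rows:
    print(row)
print(conservation_line(rows))  # "*  ** **"
```

Runs of consecutive “*” marks correspond to the highly conserved regions asked about in question D.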