Student Worksheet (Introductory)

Introductory Worksheet
Proteomics: Protein Identification Using On-Line Databases
Introduction:
In this activity you will learn how to search proteomic databases to determine the identity, functions and 3-D
structure of an unknown yeast protein. Mass spectrometry data files, prepared by undergraduate students at
Franklin and Marshall College, will be uploaded into a proteomic search engine that will query protein databases
and return a report of possible identifications for the unknown protein. You will also be able to determine the
primary structure of your identified protein, learn about its function and view an interactive 3-D model of its
structure.
Background Information:
The undergraduate students exposed their yeast to various environmental stresses: heat, hypoxia, high ethanol
concentration, and hydrogen peroxide (oxidative stress). They compared the proteins produced by the control
and stressed yeast and located proteins whose abundance had changed using a laboratory procedure known as 2-D
polyacrylamide gel electrophoresis (PAGE). These proteins were isolated from the gels, digested into smaller
peptide fragments using the enzyme trypsin, and analyzed using an instrument known as a liquid chromatography
mass spectrometer (LC/MS). The mass spectrometer determined the molecular masses of the peptide fragments
and saved the data in a file that can be uploaded into a proteomic search engine for analysis.
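To make the digestion and mass measurement steps concrete, here is a minimal Python sketch. It assumes the common rule of thumb that trypsin cuts after lysine (K) or arginine (R) unless the next residue is proline (P), and it uses standard monoisotopic residue masses. The function names and the short example sequence are hypothetical and are not part of the students' data.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added per free peptide (H on N-terminus, OH on C-terminus)


def tryptic_digest(sequence, missed_cleavages=1):
    """Return peptides from an idealized trypsin digest (assumed rule: cut after K/R, not before P)."""
    pieces, start = [], 0
    for i, aa in enumerate(sequence):
        next_aa = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and next_aa != "P":
            pieces.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        pieces.append(sequence[start:])  # C-terminal piece with no K/R at its end

    # Join neighbouring pieces to model sites that trypsin failed to cut.
    peptides = []
    for i in range(len(pieces)):
        for extra in range(missed_cleavages + 1):
            if i + extra < len(pieces):
                peptides.append("".join(pieces[i:i + extra + 1]))
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of a neutral, unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


if __name__ == "__main__":
    # Hypothetical sequence, used only for illustration.
    for pep in tryptic_digest("MKTAYRPLKGG", missed_cleavages=1):
        print(pep, round(peptide_mass(pep), 4))
```

A search engine does roughly the same thing to every protein in its database and then compares the predicted peptide masses with the masses measured by the instrument.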
To understand the laboratory procedure used to analyze proteins and produce mass spectrometry data files, go to
the website sponsored by Children’s Hospital Boston and view the following tutorial:
Guide to Sequencing and Identifying Proteins
http://www.childrenshospital.org/cfapps/research/data_admin/Site602/mainpageS602P0.html
Directions:
To begin identifying your protein, log on to the protein search engine, known as MASCOT,
at the following address:
http://www.matrixscience.com
Click on the MASCOT link at the top of the page and choose the MS/MS Ions Search option that will take you to
the search form.
Enter your name and email address in the appropriate fields along with a suitable title for the search. The email
address is used to send you your data file in the event you are disconnected from the internet during the search;
it is used for no other purpose.
Verify or select the following parameters from the menu choices on the Mascot MS/MS Ions Search data entry
page:
* Database: SwissProt
* Enzyme: Trypsin
* Missed Cleavages: 1
* Taxonomy: Yeast (Saccharomyces cerevisiae - Baker’s Yeast)
* Data file: Use the Browse button to locate the MS data file (mgf format) assigned to you by your instructor.
* All other fields should be left as the default settings.
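The mgf data file named in the list above is a plain-text file. The sketch below shows one way to read its general layout in Python, assuming the usual Mascot Generic Format conventions: each spectrum sits between BEGIN IONS and END IONS, with KEY=value lines followed by pairs of m/z and intensity values. The function name and the example spectrum are hypothetical; the numbers are made up for illustration.

```python
def parse_mgf(text):
    """Parse MGF-style text into a list of spectra (dicts with 'params' and 'peaks')."""
    spectra, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "BEGIN IONS":
            current = {"params": {}, "peaks": []}
        elif line == "END IONS":
            if current is not None:
                spectra.append(current)
            current = None
        elif current is not None:
            if "=" in line and not line[0].isdigit():
                # Parameter line such as TITLE=..., PEPMASS=..., CHARGE=...
                key, value = line.split("=", 1)
                current["params"][key.upper()] = value
            else:
                # Peak line: m/z followed by an optional intensity
                fields = line.split()
                mz = float(fields[0])
                intensity = float(fields[1]) if len(fields) > 1 else 0.0
                current["peaks"].append((mz, intensity))
    return spectra


# Hypothetical spectrum; values are invented for illustration only.
EXAMPLE_MGF = """\
BEGIN IONS
TITLE=hypothetical spectrum 1
PEPMASS=501.27 12000
CHARGE=2+
175.119 350.0
288.203 910.5
END IONS
"""

if __name__ == "__main__":
    for spectrum in parse_mgf(EXAMPLE_MGF):
        print(spectrum["params"]["TITLE"], len(spectrum["peaks"]), "peaks")
```

You do not need to open or edit the file to complete this activity; the sketch only illustrates the kind of information the search engine receives.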
Once all of the parameters are selected, you can attempt to identify the unknown protein by selecting the Start
Search button at the bottom of the page.
This should return a page entitled “Mascot Search Results”. It includes a summary of the search information,
possible protein hits (identifications), a “Mascot Score Histogram” and a “Peptide Summary Report”.
Examining the Mascot Score Histogram will tell you the level of statistical confidence Mascot has assigned to the
identification of the protein. Protein scores that fall to the right, outside of the green shaded box, indicate that
Mascot has found a significant match for your unknown protein.
The Peptide Summary Report lists the proteins Mascot has matched to the MS peptide data provided in the mgf
file. The report is arranged in hierarchical fashion; the protein identification with the greatest number of
matches is first, and the protein with the fewest matches is last. Examining the first protein, you will see its
accession tag highlighted in blue, the name of the protein, its Mowse score (how confident the software is in the
identification) and a list of the peptides, highlighted in red, detected by the mass spectrometer that are found in
the amino acid sequence of this protein.
Click on the accession tag for your protein to access the Mascot “Protein View” page. This will show you the
primary sequence (all of the amino acids indicated by their one-letter abbreviations) of the entire protein with
peptides detected by the mass spectrometer highlighted in red.
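The highlighted view described above can be thought of as a sequence coverage map. The following Python sketch is one hypothetical way to compute such a map: it marks every residue of the protein that falls inside a matched peptide and reports the percentage covered. The function names and example sequences are assumptions made for illustration and do not come from Mascot.

```python
def coverage_mask(protein, peptides):
    """Return a list of booleans, True where a residue lies inside a matched peptide."""
    covered = [False] * len(protein)
    for pep in peptides:
        if not pep:
            continue  # ignore empty strings
        start = protein.find(pep)
        while start != -1:
            for k in range(start, start + len(pep)):
                covered[k] = True
            start = protein.find(pep, start + 1)  # also mark repeated occurrences
    return covered


def coverage_percent(covered):
    """Percentage of residues that are covered by at least one peptide."""
    return 100.0 * sum(covered) / len(covered) if covered else 0.0


def render(protein, covered):
    """Show matched residues in upper case and unmatched residues in lower case."""
    return "".join(aa.upper() if hit else aa.lower() for aa, hit in zip(protein, covered))


if __name__ == "__main__":
    # Hypothetical protein and matched peptide, for illustration only.
    protein = "MKTAYRPLKGG"
    mask = coverage_mask(protein, ["TAYRPLK"])
    print(render(protein, mask), round(coverage_percent(mask), 1))
```

Here upper-case letters play the role of the red highlighting on the Protein View page.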
To determine the function of your protein, copy the accession tag and open the UniProt database in a second
browser window at the following address:
http://www.uniprot.org
Enter the accession tag into the “Query” window and verify that the “Protein Knowledgebase” (UniProtKB) has
been selected from the “Search in” menu before clicking on the Search button. The search result page gives a
description of the protein’s function, as well as links to more information about its role in metabolism and
literature citations.
Many proteins will also have 3D structural information available further down the UniProt information page,
under the heading “Cross-references: 3D structure databases.” If the protein has a link to the RCSB PDB (Protein
Data Bank), then clicking one of the entries will open a new window at the PDB site. If there are multiple PDB
entries on the UniProt page, select the one with the lowest resolution value (Å); a smaller value indicates a more
detailed structure.
At the PDB site the name of the protein will be listed and an image will appear on the right side of the page.
Clicking the “View in Jmol” button below the image will allow you to view a larger, rotatable image. Controls
below the image allow you to highlight different structures and display the protein using ribbons, a backbone
trace, or a ball-and-stick model. By clicking on the “Export Image” button at the bottom of the page you can save the
image to your computer and print it out using “Preview” or “Word”.
Complete the accompanying worksheet by answering the questions and attach a picture of your protein model to
it in the space provided.
Proteomics: Protein Identification Using On-Line Databases
1. Draw a picture of a 2-D gel and explain how the pattern of spots, representing different proteins, was created.
2. Why are the protein samples treated with trypsin?
Where is trypsin found in the human body?
What is its function?
3. Briefly explain how a mass spectrometer works.
What type of data is generated by the mass spectrometer for the protein sample?
4. Record the yeast MS file name (in mgf format) assigned to you, its accession tag and the name of your
protein.
File name: ____________________
Accession Tag: ____________________
Protein Name: ____________________
5. What is the P (probability) Score for your protein?
How many amino acids are in your protein?
How many peptide matches did your search return?
6. What is the function of your protein?
Picture: