BIO 224 Laboratory
CSU, Sacramento
Dr. Tom Peavy
September 27 & 29, 2010
Assignment 4
Protein Structure & Signature Sequences
(due Wednesday, October 6)
1. Using your assigned human protein sequence, answer the following questions:
A) Search for and list the UniProtKB/Swiss-Prot accession number for your
protein. __________________
(UniProtKB database site: http://www.uniprot.org/). You can search with the GenBank protein
accession number or the official abbreviation for your gene. If there is more than one
entry, list them all and examine what is essentially different between them. Then
choose the one that seems the most thorough or appropriate to work with.
Next, examine your UniProt entry for the following information:
B) Describe the types of information provided in the “General annotation/Comments”
section of the UniProtKB/Swiss-Prot entry. (Don’t relay the exact information for your
particular protein; rather, categorize the topics.) Why might you need to know these
pieces of information?
C) Describe the types of information provided in the “Sequence annotation/features”
section. Why might you need to know these pieces of information?
D) Find the Gene Ontology (GO) consortium entries for your protein within the UniProt
entry. List each of these GO entries, describe what it means, and state what kind of
evidence supports its designation (evidence codes such as “traceable author statement”).
E) Has the 3-D structure for your protein been determined? If so, provide the PDB
accession numbers (if there are more than 3 entries, provide just the first 3).
F) Has the 3-D structure of your protein (or a portion of the protein) been used to generate
a theoretical model for a homolog? (Click on the ModBase entry.) If so, examine
the entry to determine what was modeled (which species' homolog) and which template
was used (which crystal structure; in essence, the primary database link).
G) Follow the links from your UniProt entry to your protein's entries at the following
sites and describe the information provided at each. (Not exhaustively; give the emphasis
of each site and some of the key information you learned about your protein there.) If
your protein is not listed at some of these sites, please contact the instructor for a
suitable substitute.
i. InterPro
ii. Pfam
iii. SMART
iv. PRINTS
v. PROSITE
H) Choose one of your PROSITE links and list its entry number: __________
Then copy and paste the consensus pattern for this entry and explain what it
means (in other words, interpret the pattern). Try to choose a PROSITE entry that
has a consensus pattern so that you can complete the question.
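
To check your interpretation of a consensus pattern, it can help to rewrite it as a regular expression. The sketch below is a minimal, unofficial converter (the function name prosite_to_regex is made up for this handout) that assumes the usual PROSITE conventions: elements separated by "-", x for any residue, [..] for allowed residues, {..} for excluded residues, (n) or (n,m) for repeats, and < or > for the N- or C-terminus. It does not handle rarer cases such as > inside square brackets.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Convert a PROSITE-style consensus pattern into a Python regex string.

    Simplified sketch: handles x, [..], {..}, (n), (n,m), < and > only.
    """
    pattern = pattern.strip().rstrip(".")          # patterns usually end with a period
    parts = []
    for element in pattern.split("-"):
        element = (element
                   .replace("{", "[^").replace("}", "]")   # {P}    -> [^P]
                   .replace("(", "{").replace(")", "}")    # x(2,4) -> x{2,4}
                   .replace("x", ".")                      # x      -> any residue
                   .replace("<", "^").replace(">", "$"))   # sequence termini
        parts.append(element)
    return "".join(parts)

# Example with the N-glycosylation site pattern (PROSITE entry PS00001)
regex = prosite_to_regex("N-{P}-[ST]-{P}.")
print(regex)                                  # N[^P][ST][^P]
print(re.search(regex, "AANGSA").start())     # 2 (0-based index of the N)
```

Reading the converted pattern aloud ("N, then anything except P, then S or T, then anything except P") is essentially the interpretation the question asks for.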
I) Go to the ScanProsite search engine (http://us.expasy.org/tools/scanprosite/) and enter
your human protein sequence directly into the left-hand text box to search the database.
What kind of information did you receive? Compare it to the PROSITE links given in the
UniProt entry for your protein (e.g., did you get the same PROSITE hits, and did you
receive any additional information?).
2. Using your assigned human mRNA/protein, predict the following physical and chemical
properties for the protein:
A) Using the mRNA sequence for your protein (download or copy it directly from the
accession entry), from which frame is your mRNA translated to generate the full-length
protein sequence (i.e., frame 1, 2, or 3)?
Use the following program: Translate http://br.expasy.org/tools/dna.html
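
If you want to double-check the frame reported by the Translate tool, the three forward frames can also be translated with a few lines of Python. This is only a sketch under stated assumptions: it uses the standard genetic code, considers forward frames only, and the helper names (translate_frames, longest_orf) and the toy sequence are invented for illustration.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... (bases in T, C, A, G order)
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate_frames(mrna):
    """Translate forward frames 1, 2 and 3; '*' marks a stop, 'X' an unknown codon."""
    seq = "".join(mrna.upper().split()).replace("U", "T")
    frames = {}
    for frame in (1, 2, 3):
        sub = seq[frame - 1:]
        codons = (sub[i:i + 3] for i in range(0, len(sub) - 2, 3))
        frames[frame] = "".join(CODON_TABLE.get(c, "X") for c in codons)
    return frames

def longest_orf(protein):
    """Longest stretch that starts at an M and contains no stop codon."""
    best = ""
    for segment in protein.split("*"):
        start = segment.find("M")
        if start != -1 and len(segment) - start > len(best):
            best = segment[start:]
    return best

# Toy sequence (not a real mRNA): the start codon sits in frame 3
frames = translate_frames("GGATGGCCTTTAAATAGCC")
for frame, protein in frames.items():
    print(frame, protein, longest_orf(protein))
```

The frame whose longest open reading frame matches your UniProt protein sequence is the frame you should report.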
B) Does your protein have any transmembrane regions? Does the output appear to be
consistent with your protein having a signal peptide? (Double-check your UniProt
entry to see whether a signal peptide was documented, and then discuss your profile
output.)
TMpred http://www.ch.embnet.org/software/TMPRED_form.html
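
To build intuition for what a transmembrane profile shows, you can compute a simple sliding-window hydropathy plot yourself. Note that this is not TMpred's algorithm; it is a sketch that substitutes the Kyte-Doolittle hydropathy scale, with a 19-residue window chosen as an assumption. The function name hydropathy_profile is hypothetical, and the sequence must contain only the 20 standard one-letter residue codes.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic)
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(protein, window=19):
    """Average hydropathy of every window of `window` residues.

    Returns one value per window start; an empty list if the
    sequence is shorter than the window.
    """
    scores = [KD[aa] for aa in protein.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]

# Toy example: a hydrophobic stretch flanked by charged residues
toy = "KKDE" + "LLLLVVVVIIIIAAAALLL" + "DEKK"
profile = hydropathy_profile(toy)
peak = max(profile)
print(round(peak, 2), "at window start", profile.index(peak) + 1)
```

A strongly positive peak near the N-terminus is the kind of feature to compare against a documented signal peptide, while peaks elsewhere suggest candidate membrane-spanning segments.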
C) What are the pI and MW of the mature protein (remember to take the signal peptide
into consideration if the protein has one)?
Compute pI/Mw tool http://us.expasy.org/tools/pi_tool.html
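
The "mature protein" step is easy to get wrong, so here is a minimal sketch of the arithmetic behind the MW value: remove the signal peptide, sum the average residue masses, and add one water for the free termini. The mass table lists commonly tabulated average residue masses, the function names are invented for this handout, and pI is not computed here; treat the web tool's output as the value to report.

```python
# Average residue masses in daltons (amino acid mass minus one water)
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528

def mature_sequence(protein, signal_peptide_length):
    """Drop the N-terminal signal peptide (length taken from your UniProt entry)."""
    return protein[signal_peptide_length:]

def molecular_weight(protein):
    """Average molecular weight of an unmodified polypeptide chain."""
    return sum(RESIDUE_MASS[aa] for aa in protein.upper()) + WATER

# Toy example: pretend the first 5 residues are the signal peptide
precursor = "MKLLAGSTDEK"
mature = mature_sequence(precursor, 5)
print(mature, round(molecular_weight(mature), 2))
```

Post-translational modifications such as glycosylation are ignored, so the value is a theoretical one for the bare polypeptide chain.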
D) How many peptides would be generated if trypsin were used to proteolytically cleave
your protein? (Choose the “sophisticated model” for the trypsin analysis, which is listed
as the third entry under the "Please Select" menu.)
PeptideCutter http://us.expasy.org/tools/peptidecutter/
Peptides derived from their parent proteins (like above) are often used in proteomics
projects. Why is this so? And how are the peptides utilized?
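
For comparison with the PeptideCutter output, the textbook trypsin rule (cleave on the C-terminal side of K or R, except when the next residue is P) can be sketched as below. This is deliberately the simple rule, not PeptideCutter's "sophisticated model", which applies additional exceptions, so the peptide counts may differ. The function name tryptic_peptides and the toy sequence are assumptions made for illustration.

```python
def tryptic_peptides(protein):
    """Split a protein after every K or R that is not followed by P."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):          # C-terminal peptide without a K/R end
        peptides.append(protein[start:])
    return peptides

# Toy sequence: the R at position 3 is followed by P, so it is not cleaved
print(tryptic_peptides("MKRPAAKGG"))   # ['MK', 'RPAAK', 'GG']
```

Any difference between this count and the count from the sophisticated model is worth noting in your answer.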
E) Does your protein have any potential N-linked glycosylation sites?
NetNGlyc: http://www.cbs.dtu.dk/services/NetNGlyc/
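
Potential N-linked glycosylation sites occur at the sequon N-X-S/T, where X is any residue except proline. The sketch below only lists candidate sequons with a regular expression; it does not reproduce NetNGlyc, which additionally scores each sequon in its sequence context. The function name find_sequons and the toy sequence are hypothetical.

```python
import re

def find_sequons(protein):
    """Return (1-based position of N, sequon) for every N-X-[S/T] with X != P.

    A lookahead is used so that overlapping sequons are all reported.
    """
    return [(m.start() + 1, m.group(1))
            for m in re.finditer(r"(?=(N[^P][ST]))", protein)]

# Toy sequence: NPS is skipped because X is proline
print(find_sequons("MNGTNPSANNSS"))   # [(2, 'NGT'), (9, 'NNS'), (10, 'NSS')]
```

A sequon is only a potential site: it can be glycosylated only if that part of the protein enters the secretory pathway, so consider your signal peptide and subcellular location results alongside this list.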
F) What is the predicted subcellular location for your protein?
TargetP: http://www.cbs.dtu.dk/services/TargetP/