BCB 444/544 Fall 07 Sep 6
Lab 3
p. 1
BCB 444/544
Lab 3 Key
Database Searching
Each question was worth 1 point.
1a. How do the results of the SSEARCH and BLAST searches compare?
The results were quite similar overall. Interestingly, when I did the searches, BLAST had the top hits as perfect
matches while the top hits from SSEARCH had some mismatches. But the main point is that both SSEARCH
and BLAST found a lot of identical and very similar sequences in the database.
b. Did they find the same hits in the database?
Yes, the hits were basically the same.
c. Are the alignments the same?
The alignments were also the same for the most part.
d. Which program would you use more often and why?
I would use BLAST because it runs so much faster. Only if BLAST failed to find any hits after repeated tries
with different parameters would I consider using SSEARCH.
2. What does an E-value of 2 mean?
The E-value is basically the number of hits with the given score you would expect to see just by chance, i.e.
with NO biological significance. An E-value of 2 means that you would expect to find 2 hits with the given
score just by chance.
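To put a number on this, here is a small sketch. It assumes, as is usual for BLAST statistics, that the number of chance hits follows a Poisson distribution with mean E, so the probability of seeing at least one chance hit is 1 - exp(-E). The function name is hypothetical and chosen only for illustration.

```python
import math

def prob_at_least_one_chance_hit(e_value: float) -> float:
    """Probability of one or more chance hits, assuming the number of
    chance hits is Poisson-distributed with mean equal to the E-value."""
    return 1.0 - math.exp(-e_value)

# Compare the E-value from question 2 with the 0.1 cutoff used in this lab.
for e in (2.0, 0.1):
    p = prob_at_least_one_chance_hit(e)
    print(f"E = {e:g}: P(at least one chance hit) = {p:.3f}")
```

With E = 2 the probability of at least one chance hit is about 0.86, so such a hit tells you very little. With E = 0.1 it is about 0.095, which is close to E itself, as expected for small E-values.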
3. What is this protein?
My top hit was annotated as: “Aspartate aminotransferase, mitochondrial precursor (Transaminase A)
(Glutamate oxaloacetate transaminase 2).”
4. How many hits did you get? How many of them are significant, with an E-value below 0.1?
5. How many hits did you get? How many of them are significant, with an E-value below 0.1?
6. How many hits did you get? How many of them are significant, with an E-value below 0.1?
Here’s what I got:
Results table for questions 4-6

BLAST flavor              Number of hits   Number of hits with E-value < 0.1
megablast                 0                0
discontinuous megablast   0                0
blastn                    7                1
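Counts like these can also be tallied by a script instead of by eye. The sketch below assumes the search was saved in BLAST's 12-column tabular format (the one produced by -outfmt 6 in command-line BLAST+), where the E-value is the 11th column. The function name and the three example rows are made up for illustration; they are not the lab's actual results.

```python
def count_hits(tabular_text: str, threshold: float = 0.1) -> tuple[int, int]:
    """Return (total rows, rows with E-value below threshold).

    Assumes tab-separated BLAST tabular output with the E-value in
    column 11. Each row is one alignment, so a database sequence with
    several alignments is counted once per alignment.
    """
    total = 0
    significant = 0
    for line in tabular_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        e_value = float(fields[10])  # 11th column, 0-based index 10
        total += 1
        if e_value < threshold:
            significant += 1
    return total, significant

# Made-up rows, for illustration only.
EXAMPLE = "\n".join([
    "query1\tsubjA\t92.0\t50\t4\t0\t1\t50\t101\t150\t3e-12\t60.2",
    "query1\tsubjB\t80.0\t40\t8\t0\t5\t44\t10\t49\t0.05\t30.1",
    "query1\tsubjC\t75.0\t20\t5\t0\t9\t28\t70\t89\t1.8\t22.3",
])

print(count_hits(EXAMPLE))  # (3, 2)
```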
7. What explanation can you give for the different results from using megablast, discontinuous megablast, and
blastn?
Megablast uses settings designed to find highly similar (nearly identical) sequences. Discontinuous megablast
uses settings designed to find very similar sequences. Blastn uses settings designed to find less similar
sequences. The main point here is that the exact same BLAST algorithm is used with all three, but the settings
are different, and the different settings obviously affect the results that are obtained.
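One setting that differs between the flavors is the word size, the length of the exact match needed to seed an alignment. As far as I recall, the defaults are 28 for megablast and 11 for blastn, but treat those numbers as an assumption and check the BLAST documentation. The toy sketch below shows why a larger word size misses less similar sequences. It is a simplification: it only compares words at the same position in two equal-length sequences, whereas real BLAST looks for word matches anywhere in the database. The sequences and function names are hypothetical.

```python
def mutate_every(seq: str, step: int) -> str:
    """Return a copy of seq with every step-th base changed to a different base."""
    swap = {"A": "C", "C": "G", "G": "T", "T": "A"}
    return "".join(
        swap[base] if i % step == step - 1 else base
        for i, base in enumerate(seq)
    )

def diagonal_word_matches(query: str, subject: str, word_size: int) -> int:
    """Count positions where query and subject share an exact word of length
    word_size at the same offset (a toy stand-in for seed finding)."""
    n = min(len(query), len(subject))
    return sum(
        query[i:i + word_size] == subject[i:i + word_size]
        for i in range(n - word_size + 1)
    )

# Made-up 40-base query and a subject that is 87.5% identical to it
# (one mismatch in every 8 bases).
query = "ACGTTGCAGGCTAAGCTTACGGATCCGTTAGCATGCAAGT"
subject = mutate_every(query, 8)

for word_size in (28, 11, 7):
    seeds = diagonal_word_matches(query, subject, word_size)
    print(f"word size {word_size:2d}: {seeds} seed(s)")
```

Because the longest run of identical bases here is 7, word sizes of 28 and 11 find no seeds at all, while a word size of 7 finds five. No seed means no alignment is ever attempted, which is one way the same algorithm can give zero hits with one setting and several hits with another.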
8. How many hits did you get? How many of them are significant, with an E-value below 0.1?
I got 25 hits, 8 with E-value less than 0.1.
9. How many hits did you get? How many of them are significant, with an E-value below 0.1?
This time I got at least 100 hits, with about 90 of them having an E-value below 0.1.
10. What types of proteins do you get from the second PSI-BLAST iteration? Do you believe that our query
sequence is related to the results we found? Why or why not?
I got a lot of hits to a protein called diguanylate cyclase, various transporter proteins, and a handful of other
membrane proteins. PSI-BLAST works by creating a profile from the hits of the first round of BLAST
searching and then searching the database for hits to the profile. In this problem, we found a small number of
hits in our initial BLAST search with E-values around 1-2. These sequences may or may not be related to our
initial query sequence. The second round of PSI-BLAST created a profile based on the hits found, and then
looked for sequences similar to the profile. Since the second round is searching based on the profile, our results
may or may not be related. This is both the power and the pitfall of PSI-BLAST: we can detect similarities that
would be impossible to see from a single sequence, but we can also get off track if the set of sequences used to
build the profile is not very closely related.
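To make the idea of a profile more concrete, here is a toy sketch. It builds a position-specific scoring matrix from a few aligned sequences and scores other sequences against it. This is not PSI-BLAST's actual method: real PSI-BLAST weights the sequences, uses substitution-matrix-based pseudocounts and real background frequencies, and handles gaps. Here I assume a uniform background, a flat pseudocount, and ungapped sequences of equal length. The sequences and function names are hypothetical.

```python
import math
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids

def build_profile(aligned: list[str], pseudocount: float = 1.0) -> list[dict[str, float]]:
    """Build a log-odds profile (one dict per column) from equal-length,
    ungapped sequences, assuming a uniform background distribution."""
    background = 1.0 / len(ALPHABET)
    total = len(aligned) + pseudocount * len(ALPHABET)
    profile = []
    for col in range(len(aligned[0])):
        counts = Counter(seq[col] for seq in aligned)
        profile.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in ALPHABET
        })
    return profile

def score(profile: list[dict[str, float]], seq: str) -> float:
    """Sum the per-column log-odds scores for seq (same length as the profile)."""
    return sum(column[aa] for column, aa in zip(profile, seq))

# Made-up first-round "hits", already aligned, for illustration only.
hits = ["MKVLA", "MKILA", "MRVLA"]
profile = build_profile(hits)

for candidate in ("MKVLA", "MRILA", "GGGGG"):
    print(candidate, round(score(profile, candidate), 2))
```

Note that MRILA scores well even though it is not identical to any single hit, which is the power of a profile. The same mechanism is the pitfall: whatever goes into the hits list shapes the profile, so unrelated first-round hits would pull later rounds toward unrelated sequences.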
So, the answer to the final question is either yes or no, and that is up to you. As long as you give a justified
reason for saying yes or no, you will get full credit for your answer.
Personally, I would believe that our query protein has some type of association with the membrane, but not
necessarily any specific similarity to any of the hits we found.