BIO 224 Fall 2010
Dr. Tom Peavy
Lab Assignment 2
(due date Wed, Sept 22)
1. Go to the Dotlet website (http://www.isrec.isb-sib.ch/java/dotlet/Dotlet.html)
for dotplot comparisons. First, follow the “about” and “new features” links to
read more about the program.
A. Then, examine the “learn by example” link and go to the “repeated domains”
example.
Explain the pattern you see. (Note: be sure to describe what characterizes a
repeated-domain pattern in a dotplot.)
B. Go to the “Conserved domains” example in which two proteins are compared.
Explain the dotplot pattern.
C. Create a folder for yourself on the computer and download the FASTA file for
the protein sequence you were assigned last week. Then enter the sequence into
the Dotlet program and perform a dotplot comparison of the sequence against
itself (in essence, looking for repetitive domains within the protein). Explain
the pattern you observe. (Note: use the histogram slider bar to reduce noise;
see the "about" link for a tutorial on how to do so.)
D. Go back to NCBI (http://www.ncbi.nlm.nih.gov/) and search the Homologene
database for another protein sequence that is homologous to your human protein
and comes from neither a rodent (e.g. rat) nor a primate (e.g. chimpanzee).
The sequence can be as divergent as you want.
What species did you choose? ___________________
What is the accession number?___________________
E. Download/save the homologous non-rodent/non-primate protein sequence
and compare it to the human protein sequence above using Dotlet. Explain the
pattern. (Note whether a diagonal line is present and whether it is continuous
or broken, using the same histogram settings as above. Is there any shift in the
alignment? Why?)
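To make the idea behind a dotplot concrete, here is a minimal Python sketch. It is not Dotlet's algorithm: Dotlet scores each window with a substitution matrix and lets you set the cutoff with the histogram slider, whereas this sketch simply counts identical residues in each window. The function names, the window size, and the toy sequence are all made up for illustration.

```python
def dotplot(seq_a, seq_b, window=3, min_matches=3):
    """Return the set of (i, j) window start positions that count as a dot.

    A dot is placed when the window starting at i in seq_a and the window
    starting at j in seq_b agree in at least min_matches positions.
    (Simplification: identity counting instead of matrix-based scoring.)
    """
    dots = set()
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            matches = sum(
                1 for k in range(window) if seq_a[i + k] == seq_b[j + k]
            )
            if matches >= min_matches:
                dots.add((i, j))
    return dots


def render(dots, n_rows, n_cols):
    """Draw the dots as text: '*' for a dot, '.' for no dot."""
    return "\n".join(
        "".join("*" if (i, j) in dots else "." for j in range(n_cols))
        for i in range(n_rows)
    )


if __name__ == "__main__":
    # Hypothetical toy sequence in which the motif ACDEFG occurs twice.
    seq = "ACDEFGKLACDEFG"
    window = 3
    dots = dotplot(seq, seq, window=window, min_matches=3)
    n = len(seq) - window + 1
    print(render(dots, n, n))
```

Raising `min_matches` relative to `window` plays roughly the role of moving the histogram slider: fewer windows pass the cutoff, so background noise drops out and only the stronger matches remain.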
2. You are going to perform a series of local pairwise sequence alignments using
your human protein, the homologous mouse protein (found in Homologene in the
last assignment), and the other non-rodent/non-primate protein sequence you
used above in questions 1D and 1E.
A. For a local pairwise comparison, use the "bl2seq" local alignment program
found at the NCBI site (http://www.ncbi.nlm.nih.gov/blast/bl2seq/wblast2.cgi). Be
sure to change the program to "blastp" at the top of the page, since the default
setting is "blastn" (a nucleotide alignment program).
Perform pairwise comparisons with your three proteins to fill in the tables
below. Replace the question marks in each table with the percent identity and
percent similarity (put similarity in parentheses) for each pairwise
comparison using the specified scoring matrix. You will need to click on the
"Algorithm parameters" arrow at the bottom of the page to change the scoring
matrix.
Scoring Matrix = BLOSUM80

                Human protein   Mouse protein   Other protein
Human protein   100% (100%)     ------          ------
Mouse protein   ????            100% (100%)     ------
Other protein   ????            ????            100% (100%)
Scoring Matrix = BLOSUM62 (note: this is the default value)

                Human protein   Mouse protein   Other protein
Human protein   100% (100%)     ------          ------
Mouse protein   ????            100% (100%)     ------
Other protein   ????            ????            100% (100%)
Scoring Matrix = BLOSUM45

                Human protein   Mouse protein   Other protein
Human protein   100% (100%)     ------          ------
Mouse protein   ????            100% (100%)     ------
Other protein   ????            ????            100% (100%)
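As a cross-check on the values you read off the alignment report, here is a small Python sketch of how percent identity and percent similarity relate to the middle line of a BLAST-style protein alignment. It assumes the usual convention that the middle line shows a letter for an identical pair, a "+" for a non-identical pair with a positive matrix score, and a blank otherwise, and that both percentages are taken over the full alignment length, gaps included. The function name and the toy alignment are hypothetical.

```python
def percent_identity_similarity(midline, alignment_length=None):
    """Return (percent identity, percent similarity) from an alignment midline.

    midline: the line printed between the two aligned sequences.
    alignment_length: number of alignment columns, gaps included. Pass it
        explicitly if trailing blanks were stripped from the midline when
        it was copied.
    """
    length = len(midline) if alignment_length is None else alignment_length
    identities = sum(1 for ch in midline if ch.isalpha())
    # Similarity ("positives") counts identities plus the '+' columns.
    positives = identities + midline.count("+")
    return 100.0 * identities / length, 100.0 * positives / length


if __name__ == "__main__":
    # Hypothetical toy alignment, not real assignment data:
    #   Query    MKT-LLVA
    #            MK+ L+VA
    #   Subject  MKSALIVA
    identity, similarity = percent_identity_similarity("MK+ L+VA", 8)
    print(f"{identity:.1f}% ({similarity:.1f}%)")
```

Because the "+" columns depend on which substitutions score above zero, the similarity value can change when the scoring matrix changes, even for the same pair of sequences.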
B. Copy and paste one of your more interesting sequence alignments into this
document. Why did you choose this one? (Note: copy the alignment into a Word
document using a fixed-width font such as Courier so that the aligned columns
stay lined up.)
C. What trends do you note when comparing the different matrices for the local
alignment comparisons? How do you explain this?
3. Then perform the same comparative alignments using a global alignment
program found at the FASTA site
(http://fasta.bioch.virginia.edu/fasta_www2/fasta_www.cgi). You will need to
select the box underneath the statement "Compare your own sequences" near
the "(A) Program" specification area (which should be listed as "GGSEARCH:
global protein:protein"). You can change the scoring matrix at the bottom of the
page.
A. Once again, enter the percent identity and similarity scores as before.
Scoring Matrix = BLOSUM80

                Human protein   Mouse protein   Other protein
Human protein   100% (100%)     ------          ------
Mouse protein   ????            100% (100%)     ------
Other protein   ????            ????            100% (100%)
Scoring Matrix = BLOSUM62

                Human protein   Mouse protein   Other protein
Human protein   100% (100%)     ------          ------
Mouse protein   ????            100% (100%)     ------
Other protein   ????            ????            100% (100%)
Scoring Matrix = BLOSUM50 (note: this is the default value)

                Human protein   Mouse protein   Other protein
Human protein   100% (100%)     ------          ------
Mouse protein   ????            100% (100%)     ------
Other protein   ????            ????            100% (100%)
B. For a visual comparison of sequence alignment programs, copy and paste the
global sequence alignment for the same two species and BLOSUM matrix that you
used for the alignment pasted in question 2B above. (Note: copy the alignment
into a Word document using a fixed-width font such as Courier.) We will discuss
these two alignments in question 5.
C. What trends do you note when comparing the different matrices for the global
alignment comparisons? How do you explain this?
4. Using the tables you generated for the local and global alignments, compare
the values for percent identity/similarity for similar scoring matrices (e.g.
BLOSUM62 for local and global). Are the values identical or relatively close?
Explain why or why not. What is the trend?
5. Compare your sequence alignments (local vs global) for the same species and
scoring matrix. (For example: Are there conserved regions? Are there variable
or divergent regions? Are there gaps, and are they found in the same region?
How do the local and global alignments differ? Why would they differ?)
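The mechanical difference between the two alignment styles can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: it uses a flat match/mismatch score and a linear gap penalty rather than a BLOSUM matrix with the gap-opening and gap-extension penalties that bl2seq and GGSEARCH apply, and it returns only the optimal score, not the alignment itself. The function name, the scoring values, and the toy sequences are hypothetical.

```python
def alignment_score(a, b, local, match=2, mismatch=-1, gap=-2):
    """Optimal alignment score by dynamic programming.

    local=False: global alignment (Needleman-Wunsch style); both sequences
        must be aligned end to end, so unmatched ends are charged as gaps.
    local=True: local alignment (Smith-Waterman style); cell scores are
        floored at zero and the best-scoring cell anywhere is reported.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    if not local:
        # Global alignment pays for leading gaps.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            cell = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
            if local:
                cell = max(cell, 0)          # a local alignment may restart
                best = max(best, cell)
            score[i][j] = cell
    return best if local else score[-1][-1]


if __name__ == "__main__":
    # Hypothetical toy sequences sharing only the block ACDEFG.
    a = "WWWWACDEFG"
    b = "ACDEFGYYYY"
    print("local :", alignment_score(a, b, local=True))
    print("global:", alignment_score(a, b, local=False))
```

In this toy case the local score reflects only the shared block, while the global score is pulled down by the unrelated ends that must still be aligned or gapped.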
6. Calculate the log-odds score in a PAM250 matrix for a position in the
alignment in which a histidine (H) remains a histidine (H). (Note: show your
calculation; the relevant information is in table 3-2 [pg 63 new edition; pg 53
old text] and figure 3.13 [pg 68 new edition; pg 57 old text].)
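The form of the calculation can be sketched in Python, assuming the standard Dayhoff definition in which the score is ten times the base-10 logarithm of the ratio between the PAM250 mutation probability for the aligned pair and the normalized background frequency of the residue. The numbers below are made up to show the arithmetic and are not the histidine values; take the real ones from the table and figure cited in the question. The function name is hypothetical.

```python
import math


def pam_log_odds(mutation_probability, background_frequency):
    """Log-odds score: 10 * log10(observed probability / chance frequency).

    mutation_probability: probability (as a fraction, not a percent) from
        the PAM250 mutation probability matrix for the aligned pair.
    background_frequency: normalized frequency of the residue, i.e. how
        often the pairing would be expected by chance.
    """
    return 10 * math.log10(mutation_probability / background_frequency)


if __name__ == "__main__":
    # Made-up illustrative values (NOT the textbook values for histidine):
    # a residue conserved with probability 0.50 and background frequency 0.05.
    raw = pam_log_odds(0.50, 0.05)
    # Published matrices report scores rounded to the nearest integer.
    print(round(raw))
```

A positive score means the pairing is seen more often than chance would predict; a negative score means it is seen less often.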