Lab 3: Basic Database Similarity Searching

Review of Dynamic Programming

(Figure: scoring table with SEQUENCE 1 and SEQUENCE 2 on its two axes; the box to be scored is highlighted in yellow.)

We want to calculate the score for the yellow box. The final score that we fill in the yellow box is the SUM of two other scores, which we’ll call MATCH and MAX. Let’s try it…

Dynamic Programming

Score = MatchScore + MAX

Match Score: whether the sequences match at that location; 1 for a match, 0 for a non-match.

MAX: the highest of the following three
1. The score in the box at position i-1, j-1
2. The highest score in the row i-x, j-1 (where 2<=x<i)
3. The highest score in the column i-1, j-y (where 2<=y<j)

(Figure: grid with one axis labeled j-4, j-3, j-2, j-1, j and the other labeled i-4, i-3, i-2, i-1, i.)

Fill in the table from the top left-hand corner!

Dynamic Programming: Filling in the Table

      A  B  B  C  D
   A  1
   B
   C
   C
   C

The MATCH score is assigned based on whether the residues at position i, j (i.e. the yellow box) match. In this case, the residues at i, j are A and A, which match. Therefore, the MATCH score is 1. Since there is no i-1 or j-1 (i.e. no row or column before this one), we don’t have to worry about the MAX part of the score.

Moving one square to the right.

      A  B  B  C  D
   A  1  0
   B
   C
   C
   C

In this case, the residues at i, j are B and A, which do not match. Therefore, the MATCH score is 0. Again, there is no i-1 or j-1, so we don’t have to worry about the MAX part of the score.

We can fill in the rest of the first column and first row the same way.

      A  B  B  C  D
   A  1  0  0  0  0
   B  0
   C  0
   C  0
   C  0

Let’s move to the 2nd row. Score = MatchScore + MAX.

      A  B  B  C  D
   A  1  0  0  0  0
   B  0  2
   C  0
   C  0
   C  0

In this case there is no 2 or 3 to consider, only the box at i-1, j-1.
MatchScore = 1, MAX = 1, Score = 1 + 1 = 2

Moving across the row.

      A  B  B  C  D
   A  1  0  0  0  0
   B  0  2  2
   C  0
   C  0
   C  0

MatchScore = 1, MAX = 1, Score = 1 + 1 = 2

Moving across the row again!

      A  B  B  C  D
   A  1  0  0  0  0
   B  0  2  2  1  1
   C  0
   C  0
   C  0

MatchScore = 0, MAX = 1, Score = 0 + 1 = 1
We can fill in the last square of the row using the same method: Score = 1.

Moving to the next row.

      A  B  B  C  D
   A  1  0  0  0  0
   B  0  2  2  1  1
   C  0  1
   C  0
   C  0

MatchScore = 0, MAX = 1, Score = 0 + 1 = 1

Moving across the row.

      A  B  B  C  D
   A  1  0  0  0  0
   B  0  2  2  1  1
   C  0  1  2
   C  0
   C  0

MatchScore = 0, MAX = 2, Score = 0 + 2 = 2

Moving across the row again.

      A  B  B  C  D
   A  1  0  0  0  0
   B  0  2  2  1  1
   C  0  1  2  3  2
   C  0
   C  0

MatchScore = 1, MAX = 2 (two boxes tie at 2), Score = 1 + 2 = 3
We can fill in the last square of the row in similar fashion.

We can fill in the remaining squares!

      A  B  B  C  D
   A  1  0  0  0  0
   B  0  2  2  1  1
   C  0  1  2  3  2
   C  0  1  2  3  3
   C  0  1  2  3

The LAST square!
MATCH = 0, MAX = 3, Score = 0 + 3 = 3

      A  B  B  C  D
   A  1  0  0  0  0
   B  0  2  2  1  1
   C  0  1  2  3  2
   C  0  1  2  3  3
   C  0  1  2  3  3

QUESTIONS?

Traceback Protocol

Used to get the alignment from the filled-in table.

      A  A  T  V  D
   A  1  1  0  0  0
   V  0  1  1  2  1
   V  0  1  1  2  2
   D  0  1  1  1  3

Start in the lower right corner. You can only move to the largest number that is UP and TO THE LEFT.

   D
   D

The next step extends the alignment to:

   VD
   VD

All 3 paths start like this. But, moving up and to the left from the square with score 2, we have two possible choices, both of which are up and to the left and contain equal values.

   TVD        ATVD
   VVD        V-VD

We now have two possible alignments, red and yellow. Yellow has only one more square it can access. The red alignment can branch off again, however.

   AATVD      AATVD      AATVD
   -AVVD      AV-VD      A-VVD

These are the 3 possible paths through the matrix, in other words, the 3 possible alignments.

Every time a diagonal line “skips” a box (i.e. does not lead into the box immediately to the upper left, at i-1, j-1), we insert a gap into the alignment.

Is this possible??

   AATV-D
   -A-VVD

Optimal alignment??

QUESTIONS??
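
The table-filling rule and the traceback described above can be sketched in Python. This is only a minimal illustration under stated assumptions, not part of the lab material: sequence 1 is placed across the columns and sequence 2 down the rows, both sequences are non-empty, and the names candidates, fill_table and traceback_one are made up for this example. The traceback follows a single path and breaks ties by preferring the box immediately to the upper left, which is an assumption of this sketch; it does not enumerate every possible alignment.

```python
from typing import List, Tuple

Cell = Tuple[int, int]


def candidates(i: int, j: int) -> List[Cell]:
    """Boxes that feed MAX for box (i, j), with the diagonal box listed first."""
    if i == 0 or j == 0:
        return []  # first row or first column: nothing up and to the left
    cells = [(i - 1, j - 1)]                     # rule 1: the box at i-1, j-1
    cells += [(i - 1, k) for k in range(j - 1)]  # rules 2 and 3: the other boxes
    cells += [(k, j - 1) for k in range(i - 1)]  # in the previous row and column
    return cells


def fill_table(seq1: str, seq2: str) -> List[List[int]]:
    """Fill the table from the top left: Score = MatchScore + MAX."""
    score = [[0] * len(seq1) for _ in seq2]
    for i, res2 in enumerate(seq2):
        for j, res1 in enumerate(seq1):
            match = 1 if res1 == res2 else 0
            best = max((score[a][b] for a, b in candidates(i, j)), default=0)
            score[i][j] = match + best
    return score


def traceback_one(score: List[List[int]], seq1: str, seq2: str) -> Tuple[str, str]:
    """Follow one path from the lower right corner (assumed tie-break: diagonal)."""
    i, j = len(seq2) - 1, len(seq1) - 1
    path = [(i, j)]
    while candidates(i, j):
        # max() returns the first maximal item, and the diagonal box comes first
        i, j = max(candidates(i, j), key=lambda c: score[c[0]][c[1]])
        path.append((i, j))
    path.reverse()

    top, bottom = [], []
    prev_i, prev_j = -1, -1
    for i, j in path:
        for k in range(prev_j + 1, j):  # skipped residues of sequence 1
            top.append(seq1[k])
            bottom.append("-")
        for k in range(prev_i + 1, i):  # skipped residues of sequence 2
            top.append("-")
            bottom.append(seq2[k])
        top.append(seq1[j])             # the pair aligned at this box
        bottom.append(seq2[i])
        prev_i, prev_j = i, j
    return "".join(top), "".join(bottom)


if __name__ == "__main__":
    table = fill_table("AATVD", "AVVD")
    for row in table:
        print(" ".join(str(value) for value in row))
    for line in traceback_one(table, "AATVD", "AVVD"):
        print(line)
```

With the traceback example from the slides (AATVD across the top, AVVD down the side), this sketch reproduces the filled-in table and follows the path that gives AATVD over -AVVD. Choosing a different tie-break would follow one of the other paths instead.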
