Pairwise profile alignment

Usman Roshan
BNFO 601
Protein families
• PFAM: http://pfam.sanger.ac.uk/
• Family alignments can be used to
search for new members in a database
Profile-sequence alignment
• Given a family alignment, how can we align it
to a sequence?
• First, we compute a profile of the alignment.
• We then align the profile to the sequence
using standard dynamic programming.
• However, we need to describe how to align a
profile vector to a nucleotide or residue.
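The dynamic programming step can be sketched as a standard global (Needleman-Wunsch style) recurrence in which the usual residue-versus-residue score is replaced by a column-versus-nucleotide score. This is only an illustrative sketch: the function names, the linear gap penalty of -2, and `placeholder_score` (match = +1, mismatch = -1, weighted by the column frequencies) are assumptions, not values given in these slides.

```python
from typing import Callable, Dict, List

Column = Dict[str, float]  # nucleotide -> frequency at one profile position


def align_profile_to_sequence(
    profile: List[Column],
    seq: str,
    score: Callable[[Column, str], float],
    gap: float = -2.0,  # assumed linear gap penalty
) -> float:
    """Return the optimal global alignment score of a profile against a sequence."""
    n, m = len(profile), len(seq)
    # dp[i][j] = best score aligning the first i columns to the first j letters
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + score(profile[i - 1], seq[j - 1]),  # column vs letter
                dp[i - 1][j] + gap,  # column aligned to a gap
                dp[i][j - 1] + gap,  # letter aligned to a gap
            )
    return dp[n][m]


def placeholder_score(column: Column, nucleotide: str) -> float:
    """Hypothetical score: expected match (+1) / mismatch (-1) under the column."""
    return 2.0 * column.get(nucleotide, 0.0) - 1.0


toy_profile = [{"A": 1.0}, {"C": 1.0}, {"G": 1.0}]
full_score = align_profile_to_sequence(toy_profile, "ACG", placeholder_score)
short_score = align_profile_to_sequence(toy_profile, "AG", placeholder_score)
```

Any column-versus-nucleotide scoring function can be passed in place of `placeholder_score`; the recurrence itself does not change.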
Profile
• A profile can be described by a set of
vectors of nucleotide/residue
frequencies.
• For each position i of the alignment,
we compute the normalized frequency of
each of the nucleotides A, C, G, and T.
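A minimal sketch of the profile computation for a nucleotide alignment. The function name `compute_profile` is hypothetical, and the choice to ignore gap characters when normalizing is an assumption; the slides do not say how gaps are treated.

```python
from collections import Counter
from typing import Dict, List

ALPHABET = "ACGT"


def compute_profile(alignment: List[str]) -> List[Dict[str, float]]:
    """Return one frequency vector (nucleotide -> frequency) per alignment column."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all rows of the alignment must have the same length")
    profile = []
    for i in range(length):
        # Assumption: gaps and other non-ACGT symbols are skipped.
        counts = Counter(row[i] for row in alignment if row[i] in ALPHABET)
        total = sum(counts.values())
        profile.append(
            {x: (counts[x] / total if total else 0.0) for x in ALPHABET}
        )
    return profile


example_profile = compute_profile(["ACGT", "ACGA", "AGGT", "A-GT"])
```

In this toy alignment the second column contains C, C, G and a gap, so its vector is C = 2/3, G = 1/3 under the gap-skipping assumption.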
Aligning a profile vector to a nucleotide
• ClustalW/MUSCLE
– Let f be the profile vector
– Score(f, j) = $\sum_{i \in \{A,C,G,T\}} f_i \, S(i, j)$
– where S(i, j) is a substitution scoring matrix
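The frequency-weighted score above can be sketched in a few lines. The substitution matrix `S` used here (match = +1, mismatch = -1) is a hypothetical example, not the matrix used by ClustalW or MUSCLE.

```python
from typing import Dict

ALPHABET = "ACGT"

# Hypothetical substitution matrix: +1 for a match, -1 for a mismatch.
S: Dict[str, Dict[str, float]] = {
    a: {b: (1.0 if a == b else -1.0) for b in ALPHABET} for a in ALPHABET
}


def profile_column_score(f: Dict[str, float], j: str,
                         S: Dict[str, Dict[str, float]]) -> float:
    """Score(f, j) = sum over i of f_i * S(i, j)."""
    return sum(f[i] * S[i][j] for i in ALPHABET)


f = {"A": 0.5, "C": 0.25, "G": 0.25, "T": 0.0}
score_vs_A = profile_column_score(f, "A", S)  # 0.5 - 0.25 - 0.25 + 0 = 0.0
score_vs_T = profile_column_score(f, "T", S)  # -0.5 - 0.25 - 0.25 + 0 = -1.0
```

The score is simply the expected substitution score of nucleotide j against a nucleotide drawn from the column.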

Aligning a profile vector to a nucleotide
• PSI-BLAST
• Score(f, i) = $\log(Q_i / P_i)$
• $P_i$ is the background probability of nucleotide i
• $q_{ij}$ is a matrix of match/mismatch probabilities
• Define $g_i$ as
$g_i = \sum_{j \in \{A,C,G,T\}} \frac{f_j}{P_j} \, q_{ij}$
• and $Q_i$ as
$Q_i = \frac{\alpha f_i + \beta g_i}{\alpha + \beta}$
($\alpha$, $\beta$ are constants)
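A small sketch of the PSI-BLAST-style score, assuming the formulas read as g_i = sum over j of (f_j / P_j) q_ij and Q_i = (alpha f_i + beta g_i) / (alpha + beta). The uniform background `P`, the matrix `q`, and the values alpha = 3 and beta = 1 are made-up illustration values, not PSI-BLAST defaults.

```python
import math
from typing import Dict

ALPHABET = "ACGT"

# Hypothetical uniform background probabilities.
P: Dict[str, float] = {x: 0.25 for x in ALPHABET}

# Hypothetical match/mismatch probabilities; every row and column sums to 0.25.
q: Dict[str, Dict[str, float]] = {
    a: {b: (0.19 if a == b else 0.02) for b in ALPHABET} for a in ALPHABET
}


def psiblast_style_score(f: Dict[str, float], i: str,
                         P: Dict[str, float],
                         q: Dict[str, Dict[str, float]],
                         alpha: float, beta: float) -> float:
    """Score(f, i) = log(Q_i / P_i), with Q_i mixing f_i and the pseudocount g_i."""
    g_i = sum(f[j] / P[j] * q[i][j] for j in ALPHABET)
    Q_i = (alpha * f[i] + beta * g_i) / (alpha + beta)
    return math.log(Q_i / P[i])


f = {"A": 1.0, "C": 0.0, "G": 0.0, "T": 0.0}
score_A = psiblast_style_score(f, "A", P, q, alpha=3.0, beta=1.0)
score_C = psiblast_style_score(f, "C", P, q, alpha=3.0, beta=1.0)
```

For this all-A column, g_A = 0.76 and Q_A = 0.94, so the score against A is log(0.94 / 0.25) > 0, while Q_C = 0.02 gives a negative score against C. The g_i term keeps Q_i above zero for nucleotides never observed in the column, so the logarithm stays finite.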