Carmen Nigro
September 14, 2009

Topic: Bioinformatics: Sequence Alignment

Description: This research examines different algorithms for determining relationships between sequences of nucleotides (from DNA or RNA) or amino acids (from proteins).

Motivation: Sequence alignment can help scientists hypothesize the function of a particular DNA or protein sequence. Similarities between sequences can imply similarities in function and structure.

References:

D.J. Lipman, S.F. Altschul, and J.D. Kececioglu, “A Tool for Multiple Sequence Alignment”, Proc. Natl. Acad. Sci. USA, Vol. 86, pp. 4412-4415, June 1989. [This article offers an alternative to standard dynamic programming for multiple sequence alignment. Standard dynamic programming becomes impractical as the number of sequences grows, and this article proposes a more efficient algorithm.]

R. Chenna, H. Sugawara, T. Koike, R. Lopez, T.J. Gibson, D.G. Higgins, and J.D. Thompson, “Multiple sequence alignment with the Clustal series of programs”, Oxford Journals: Nucleic Acids Research, Vol. 31, pp. 3497-3500, 2003. [The Clustal series of programs is the most widely used set of programs for sequence alignment. This article describes the Clustal series and its implementation.]

J.D. Thompson, F. Plewniak, and O. Poch, “A comprehensive comparison of multiple sequence alignment programs”, Oxford Journals: Nucleic Acids Research, Vol. 27, pp. 2682-2690, 1999. [This article compares the most widely used programs for multiple sequence alignment. The results show that iterative algorithms offer better accuracy but take much longer to compute.]

J.D. Thompson, T.J. Gibson, F. Plewniak, F. Jeanmougin, and D.G. Higgins, “The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools”, Oxford Journals: Nucleic Acids Research, Vol. 25, pp. 4876-4882, 1997. [This article describes the Clustal_X user interface and its useful new features.
The article also describes the algorithms used to check alignment quality.]

M.S. Waterman, “Efficient Sequence Alignment Algorithms”, J. theor. Biol., Vol. 108, pp. 333-337, 1984. [This article evaluates sequence alignment algorithms and compares them using big O notation. The article proposes the use of concave weighting functions in order to increase efficiency.]

H. Rangwala and G. Karypis, “Incremental window-based protein sequence alignment algorithms”, Oxford Journals: Bioinformatics, Vol. 23, pp. e17-e23, 2007. [This article proposes a new algorithm for sequence alignment based on short, fixed- or variable-length, high-scoring subsequences. The results show that this algorithm performs comparably to algorithms already in use.]

I.M. Wallace, O. O'Sullivan, and D.G. Higgins, “Evaluation of Iterative Alignment Algorithms for Multiple Alignment”, Oxford Journals: Bioinformatics, Vol. 21, pp. 1408-1414, 2005. [This article compares different iterative algorithms for multiple alignment. The paper analyzes the results of several tests that were run on these algorithms.]

L.A. Newberg, “Memory efficient dynamic programming backtrace and pairwise local sequence alignment”, Oxford Journals: Bioinformatics, Vol. 24, pp. 1772-1778, 2008. [Because it is often infeasible to keep all intermediate values in memory, this article proposes a memory-efficient algorithm that recomputes these values as they are needed. The article describes the results obtained from experiments with this checkpointing scheme on pairwise local sequence alignment.]

J. Hérisson, G. Payen, and R. Gherbi, “A 3D pattern matching algorithm for DNA sequences”, Oxford Journals: Bioinformatics, Vol. 23, pp. 680-686, 2007. [The article proposes a 3D model for DNA rather than the traditional textual models. A 3D model would allow scientists to study syntax and other properties of DNA.]

T.W. Lam, W.K. Sung, S.L. Tam, C.K. Wong, and S.M.
Yiu, “Compressed indexing and local alignment of DNA”, Oxford Journals: Bioinformatics, Vol. 24, pp. 791-797, 2008. [The article focuses on finding local alignments of DNA sequences by building a compressed index over the DNA. This is a faster alternative to dynamic programming; however, it is a heuristic-based approach and may not be as accurate.]
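
Note: Several of the annotations above measure their methods against dynamic programming. As a point of reference, the following is a minimal sketch of pairwise global alignment scoring by dynamic programming (the standard Needleman-Wunsch recurrence). It is not taken from any of the listed articles; the function name and the match, mismatch, and gap values are illustrative assumptions.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score of the best global alignment of sequences a and b.

    Uses a linear gap penalty. The scoring values are illustrative
    assumptions, not parameters from any cited article.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against the empty sequence costs one gap per symbol.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + substitution,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,               # gap inserted in b
                score[i][j - 1] + gap,               # gap inserted in a
            )
    return score[-1][-1]


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "AGT"))
```

The table has one cell per pair of prefixes, so time and memory both grow with the product of the two sequence lengths. Extending the same recurrence to k sequences requires a k-dimensional table, which is the cost that the multiple-alignment and memory-efficiency articles above try to avoid.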