Classified References on Computational Biology
R. C. T. Lee

On Books
On Classification of Protein Folds
On Evolutionary Trees
On Divide-and-Conquer
On Genome Rearrangement
On LCS
On Miscellaneous
On Nearest Neighbor Search
On Pattern Discovery
On Physical Mapping
On Protein Structure
On RNA Structures
On Sequence Alignment
On Sequence Assembly Problem
On Sorting by Reversal
On String Matching
On Structure Alignment
On Superstrings
On Superstructures
On Visual Display
On NP-Complete Problems and Approximation Algorithms

On Sequence Assembly Problem

[CFM80] The k best spanning arborescences of a network, Camerini, P., Fratta, L. and Maffioli, F., Networks, Vol. 10, 1980, pp. 91-110.
[DS1991] A Sequence Assembly and Editing Program for Efficient Management of Large Projects, Dear, S. and Staden, R., Nucleic Acids Research, Vol. 19, 1991, pp. 3901-3917.
[GMSR79] Computer Programs for the Assembly of DNA Sequences, Gingeras, T. R., Milazzo, J. P., Sciaky, D. and Roberts, R. J., Nucleic Acids Res., Vol. 7, 1979, pp. 529-545.
[H92] A Contig Assembly Program Based on Sensitive Detection of Fragment Overlaps, Huang, X., Genomics, Vol. 14, 1992, pp. 18-25.
[HGST86] Efficient algorithms for finding minimum spanning trees in undirected and directed graphs, Gabow, H. N., Galil, Z., Spencer, T. and Tarjan, R., Combinatorica, Vol. 6, 1986, pp. 109-122.
[KE95] Combinatorial Algorithms for DNA Sequence Assembly, Kececioglu, J. D. and Myers, E. W., Algorithmica, Vol. 13, 1995, pp. 7-51.
[KLT2001] A Probabilistic Approach to Sequence Assembly Validation, Kim, S., Liao, L. and Tomb, J. F., Workshop on Data Mining in Bioinformatics, 2001.
[KM89] A Procedural Interface for a Fragment Assembly Tool, Kececioglu, J. D. and Myers, E. W., Technical Report 89-5, Department of Computer Science, The University of Arizona, 1989.
[M93] Rethinking the DNA Fragment Assembly Problem, Meidanis, J., 1993.
[M95] Towards Simplifying and Accurately Formulating Fragment Assembly, Myers, E. W., J. Comput.
Biology, Vol. 2, 1995, pp. 275-290.
[PTW2001] A New Approach to Fragment Assembly in DNA Sequencing, Pevzner, P. A., Tang, H. and Waterman, M. S., RECOMB, Montreal, Canada, 2001, pp. 256-267.

On Sequence Alignment

[AAS2000] On Approximation Algorithms for Local Multiple Alignment, Akutsu, T., Arimura, H. and Shimozono, S., RECOMB, Tokyo, 2000, pp. 1-7.
[AKMSW87] Geometric applications of a matrix-searching algorithm, Aggarwal, A., Klawe, M., Moran, S., Shor, P. and Wilber, R., Algorithmica, Vol. 2, 1987, pp. 195-208.
[AL89] Trees, Stars, and Multiple Biological Sequence Alignment, Altschul, S. F. and Lipman, D. J., SIAM J. Appl. Math., Vol. 49, 1989, pp. 197-209.
[B95] A space efficient algorithm for finding the best nonoverlapping alignment score, Benson, G., Theoret. Comput. Sci., Vol. 145, 1995, pp. 357-369.
[BLP97] Approximation Algorithms for Multiple Sequence Alignment, Bafna, V., Lawler, E. and Pevzner, P., Theoretical Computer Science, Vol. 182, 1997, pp. 233-244.
[BM98] Discovering internet marketing intelligence through online analytical web usage mining, Buechner, A. and Mulvenna, M., SIGMOD Record, Vol. 27, 1998, pp. 54-61.
[BV2001] The Complexity of Multiple Sequence Alignment with SP-Score that is a Metric, Bonizzoni, P. and Vedova, G. D., Theoretical Computer Science, Vol. 259, 2001, pp. 63-79.
[CHM92] Recent Developments in Linear-Space Alignment Methods: A Survey, Chao, K. M., Hardison, R. C. and Miller, W., J. Comput. Biol., Vol. 1, 1992, pp. 271-291.
[CL88] The Multiple Sequence Alignment Problem in Biology, Carrillo, H. and Lipman, D., SIAM J. Appl. Math., Vol. 48, 1988, pp. 1073-1082.
[CL92] Theoretical and Empirical Comparisons of Approximate String Matching Algorithms, Chang, W. I. and Lampe, J., In Proceedings of the 3rd Symposium on Combinatorial Pattern Matching, Lecture Notes in Computer Science, Vol. 644, 1992, pp. 172-181.
[CPM92] Aligning Two Sequences within a Specified Diagonal Band, Chao, K. M., Pearson, W. R. and Miller, W., Comput.
Appl. Biosciences, Vol. 8, 1992, pp. 481-487.
[CWC92] A survey of Multiple Sequence Comparison Methods, Chan, S. C., Wong, A. K. C. and Chiu, D. K. Y., Bull. Math. Biol., Vol. 54, 1992, pp. 563-598.
[DH75] Sequence Comparison by Dynamic Programming, Delcoigne, A. and Hansen, P., Biometrika, Vol. 62, 1975, pp. 661-664.
[FD87] Progressive Sequence Alignment as a Prerequisite to Correct Phylogenetic Trees, Feng, D. and Doolittle, R., J. Molec. Evol., Vol. 25, 1987, pp. 351-360.
[FFB2000] A task-based architecture for application-aware adjuncts, Farrell, R., Fairweather, P. and Breimer, E., Proceedings of the 2000 International Conference on Intelligent User Interfaces, 2000, pp. 82-85.
[G91] Efficient Methods for Multiple Sequence Alignment with Guaranteed Error Bounds, Gusfield, D., Tech. Report, Computer Science Division, University of California, Davis, CSE-91-4, 1991.
[G93] Efficient Methods for Multiple Sequence Alignment with Guaranteed Error Bounds, Gusfield, D., Bull. Mathematical Biology, Vol. 55, 1993, pp. 141-154.
[GBN94] Parametric Optimization of Sequence Alignment, Gusfield, D., Balasubramanian, K. and Naor, D., Algorithmica, Vol. 12, No. 4-5, Oct-Nov. 1994, pp. 312-326.
[GCS2000] Evaluation Measures of Multiple Sequence Alignments, Gonnet, G. H., Korostensky, C. and Benner, S., Journal of Computational Biology, Vol. 7, No. 1-2, 2000, pp. 261-276.
[GG89] Speeding up dynamic programming with applications to molecular biology, Galil, Z. and Giancarlo, R., Theoret. Comput. Sci., Vol. 64, 1989, pp. 107-118.
[GMP96] Gene recognition via spliced sequence alignment, Gelfand, M., Mironov, A. and Pevzner, P., Proc. Natl. Acad. Sci. USA, Vol. 93, 1996, pp. 9061-9066.
[J99] Reducing Gap-0 Multiple Alignment to Multiple Alignment, Just, W., Manuscript, 1999.
[K93] The Maximum Weight Trace Alignment Problem in Multiple Sequence Alignment, Kececioglu, J., in A. Apostolico, M. Crochemore, Z. Galil and U. Manber (Eds.), Combinatorial Pattern Matching 93, Padova, Italy, June 1993, Vol.
684, pp. 106-119.
[KM96] An algorithm for locating non-overlapping regions of maximum alignment score, Kannan, S. and Myers, E., SIAM J. Comput., Vol. 25, No. 3, 1996, pp. 648-662.
[KRGS2001] Gene Structure Prediction and Alternative Splicing Analysis Using Genomically Aligned ESTs, Kan, Z., Rouchka, E. C., Gish, W. R. and States, D. J., Genome Research, Vol. 11, 2001, pp. 889-900.
[L88] Computational Molecular Biology: Sources and Methods for Sequence Analysis, Lesk, A., Ed., Oxford University Press, 1988.
[LAK89] A Tool for Multiple Sequence Alignment, Lipman, D. J., Altschul, S. F. and Kececioglu, J. D., Proc. Nat. Acad. Sci., Vol. 86, 1989, pp. 4412-4415.
[LMW99] Finding Similar Regions in Many Sequences, Li, M., Ma, B. and Wang, L., Proc. 31st ACM Symp. Theory of Computing (STOC 99), 1999.
[LP2000] RNA Pseudoknot Prediction in Energy Based Models, Lyngso, R. B. and Pedersen, C. N. S., Journal of Computational Biology, Vol. 7, 2000, pp. 409-427.
[LPSH2001] Visualization and analysis of clickstream data of online stores for understanding web merchandising, Lee, J., Podlaseck, M., Schonberg, E. and Hoch, R., J. Data Mining Knowledge Discovery, Vol. 5, Nos. 1/2, 2001, pp. 59-84.
[LR99] Local Multiple Sequence Alignment Using Dead-End Elimination, Lukashin, A. V. and Rosa, J. J., Biogen, Inc, Cambridge Center, USA, Vol. 15, No. 11, 1999, pp. 947-953.
[LU2001] On the Common Substring Alignment Problem, Landau, G. and Ukelson, M., Journal of Algorithms, Vol. 41, 2001, pp. 338-359.
[M88] A Flexible Multiple Sequence Alignment Program, Martinez, M., Nucleic Acids Res., Vol. 16, 1988, pp. 1683-1691.
[MFDW97] DIALIGN: Finding Local Similarities by Multiple Sequence Alignment, Morgenstern, B., Frech, K., Dress, A. and Werner, T., GSF-National Research Center for Environment and Health, 1997.
[MRPG98] Performance-guarantee gene predictions via spliced alignment, Mironov, A., Roytberg, M., Pevzner, P. and Gelfand, M., Genomics 51 A.N. GE985251, 1998, pp. 332-339.
[MW97] Near Optimal Multiple Alignment within a Band in Polynomial Time, Ma, B. and Wang, L., in the Proc. 32nd ACM, pp. 1-23.
[NW70] A General Method Applicable to the Search for Similarities in the Amino Acid Sequence of Two Proteins, Needleman, S. B. and Wunsch, C. D., J. Mol. Biol., Vol. 48, 1970, pp. 443-453.
[P92] Multiple Alignment, Communication Cost, and Graph Matching, Pevzner, P. A., SIAM Journal on Applied Mathematics, Vol. 52, No. 6, Dec. 1992, pp. 1763-1779.
[S80] The Theory and Computations of Evolutionary Distances: Pattern Recognition, Sellers, P. H., J. Algorithms, Vol. 1, 1980, pp. 359-373.
[S2001] Non-Approximability of Weighted Multiple Sequence Alignment, Siebert, B., COCOON, 2001, pp. 75-85.
[SBDGGHHLKMPS91] A system for distributed intrusion detection, Snapp, S., Brentano, J., Dias, G., Goan, T., Grance, T., Heberlein, L., Ho, C., Levitt, K., Mukerjee, B., Mansur, D., Pon, K. and Smaha, S., COMPCON Spring 91, the 36th IEEE International Computer Conference, 1991, pp. 170-176.
[SM86] A Multiple Sequence Alignment Program, Sobel, E. and Martinez, M., Nucleic Acids Res., Vol. 14, 1986, pp. 363-374.
[SP97] Las Vegas algorithms for gene recognition: Suboptimal and error-tolerant spliced alignment, Sze, S. and Pevzner, P., J. Comp. Biol., Vol. 4, No. 3, 1997, pp. 297-309.
[SYYH02] Super Pairwise Alignment (SPA): An Efficient Approach to Global Alignment for Homologous Sequence, Shen, S. Y., Yang, J., Yao, A. and Hwang, P. I., Journal of Computational Biology, Vol. 9, 2002, pp. 477-486.
[SZ90] Fast Algorithm for the Unit Cost Editing Distance between Trees, Shasha, D. and Zhang, K., J. Algorithms, Vol. 11, 1990, pp. 581-621.
[T90] Hierarchical Method to Align Large Numbers of Biological Sequences, Taylor, W. R., Methods Enzymol., Vol. 183, 1990, pp. 456-474.
[UHLU] Using repeats to speedup DNA sequence alignment, private communication, Ukelson, M., Horesh, Y., Landau, G. and Unger, R.
[VLP94] Approximation Algorithms for Multiple Sequence Alignment, Bafna, V., Lawler, E. L. and Pevzner, P., Proc. of the 5th Annual Symp. on Combin. Pattern Matching (CPM'94), Lecture Notes in Computer Science, Vol. 807, 1994, pp. 43-53.
[WJ94] On the Complexity of Multiple Sequence Alignment, Wang, L. and Jiang, T., Journal of Computational Biology, Vol. 1, 1994, pp. 337-348.
[W95] A Simplified Proof of the NP- and MAX SNP-Hardness of Multiple Sequence Tree Alignments, Wareham, H. T., J. Comput. Biol., Vol. 2, No. 4, 1995, pp. 509-514.
[WJ94] On the Complexity of Multiple Sequence Alignment, Wang, L. and Jiang, T., J. Comput. Biol., Vol. 1, 1994, pp. 337-348.
[WSB76] Some Biological Sequence Metrics, Waterman, M. S., Smith, T. F. and Beyer, W. A., Adv. in Math., Vol. 20, 1976, pp. 367-378.
[Z96] A Constrained Edit Distance between Unordered Labeled Trees, Zhang, K., Algorithmica, Vol. 15, 1996, pp. 205-222.
[ZSS92] On the Editing Distance between Unordered Labeled Trees, Zhang, K., Statman, R. and Shasha, D., Information Processing Letters, Vol. 42, 1992, pp. 133-139.

On Evolutionary Trees

[AG83] Human Mitochondrial DNA Variation and Evolution: Analysis of Nucleotide Sequences from Seven Individuals, Aquadro, C. F. and Greenberg, B. D., Genetics, Vol. 103, 1983, pp. 287-312.
[AK97] Maximum Agreement Subtree in a Set of Evolutionary Trees: Metrics and Efficient Algorithms, Amir, A. and Keselman, D., SIAM J. Comput., Vol. 26, 1997, pp. 1656-1669.
[B71] The Recovery of Trees from Measures of Dissimilarity, Buneman, P., Mathematics in the Archaeological and Historical Sciences, 1971, pp. 387-395.
[BBJKLWZ2000] Practical Algorithm for Recovering the Best Supported Edges in an Evolutionary Tree, Berry, V., Bryant, D., Jiang, T., Kearney, P., Li, M., Wareham, T. and Zhang, H., Proc. 11th Annual ACM-SIAM Symp. on Discrete Algorithms, Jan. 2000.
[BPWW82] Mitochondrial DNA Sequences of Primates: Tempo and Mode of Evolution, Brown, W. M., Prager, E. M., Wang, A.
and Wilson, A. C., Journal of Molecular Evolution, Vol. 18, 1982, pp. 225-239.
[BSLGDV98] The Discovery of Two New Divergent STLVs has Implications for the Evolution and Epidemiology of HTLVs, Brussel, M. V., Salemi, M., Liu, H. F., Goubau, P., Desmyter, J. and Vandamme, A. M., Rev. Med. Virol., Vol. 9, 1999, pp. 155-170.
[CBW84] Polymorphic Sites and the Mechanism of Evolution in Human Mitochondrial DNA, Cann, R. L., Brown, W. M. and Wilson, A. C., Genetics, Vol. 106, 1984, pp. 479-499.
[CR89] A Fast Algorithm for Constructing Trees from Distance Matrices, Culbertson, J. C. and Rudnicki, P., Inform. Process. Lett., Vol. 30, No. 4, 1989, pp. 215-220.
[CSW87] Mitochondrial DNA and Human Evolution, Cann, R. L., Stoneking, M. and Wilson, A. C., Nature, Vol. 325, 1987, pp. 31-36.
[F81] Evolutionary Trees from DNA sequences: A Maximum Likelihood Approach, Felsenstein, J., J. Molecular Evolution, Vol. 17, 1981.
[F88] Phylogenies from Molecular Sequences: Inference and Reliability, Felsenstein, J., Annu. Rev. Genet., Vol. 22, 1988, pp. 521-565.
[FKW95] A Robust Model for Finding Optimal Evolutionary Trees, Farach, M., Kannan, S. and Warnow, T., Algorithmica, Vol. 13, No. 1-2, Jan-Feb. 1995, pp. 155-179.
[FM67] Construction of Phylogenetic Trees, Fitch, W. M. and Margoliash, E., Science, Vol. 155, No. 20, Jan. 1967, pp. 279-284.
[FT97] Sparse Dynamic Programming for Evolutionary Tree Comparison, Farach, M. and Thorup, M., SIAM J. Comput., Vol. 26, 1997, pp. 210-230.
[HH90] Intraspecific Nucleotide Sequence Differences in the Major Noncoding Region of Human Mitochondrial DNA, Horai, S. and Hayasaka, Am. J. Hum. Genet., Vol. 46, No. 828, 1990.
[HH91] Time of the Deepest Root for Polymorphism in Human Mitochondrial DNA, Hasegawa, M. and Horai, S., Journal of Molecular Evolution, Vol. 32, 1991, pp. 37-42.
[HT84] Fast Algorithms for Finding Nearest Common Ancestors, Harel, D. and Tarjan, R. E., SIAM J. Comput., Vol. 13, No. 2, 1984, pp. 338-355.
[JKL2001] A Polynomial Time Approximation Scheme For Inferring Evolutionary Trees From Quartet Topologies and Its Application, Jiang, T., Kearney, P. and Li, M., SIAM J. Comput., Vol. 30, No. 6, 2001, pp. 1942-1961.
[JLW94] Aligning Sequences via an Evolutionary Tree, Jiang, T., Lawler, E. L. and Wang, L., Conference Proceedings of the Annual ACM Symposium on Theory of Computing, May 23-25, 1994, pp. 760-769.
[KG98] Reconstructing a History of Recombination from a Set of Sequences, Kececioglu, J. and Gusfield, D., Discrete Applied Mathematics, Vol. 88, 1998, pp. 239-260.
[KHM97] Inferring Evolutionary Trees from Ordinal Data, Kearney, P., Hayward, R. B. and Meijer, H., Proc. 8th Annual ACM-SIAM Symposium on Discrete Algorithms, 1997, pp. 418-426.
[KLW96] Determining the Evolutionary Tree Using Experiments, Kannan, S. K., Lawler, E. L. and Warnow, T. J., J. Algorithms, Vol. 21, 1996, pp. 26-50.
[KW94] Inferring Evolutionary History from DNA Sequences, Kannan, S. K. and Warnow, T. J., SIAM Journal on Computing, Vol. 23, No. 4, Aug. 1994, pp. 713-737.
[KW95] Tree Reconstruction from Partial Orders, Kannan, S. and Warnow, T., SIAM J. Computing, Vol. 24, 1995, pp. 511-519.
[KWY98] Computing the Local Consensus of Trees, Kannan, S., Warnow, T. and Yooseph, S., SIAM Journal on Computing, Vol. 27, No. 6, Dec. 1998, pp. 1695-1724.
[LBC96] An Evolutionary Trace Method Defines Binding Surfaces Common to Protein Families, Lichtarge, O., Bourne, H. R. and Cohen, F. E., J. Mol. Biol., Vol. 257, 1996, pp. 342-358.
[LCJDLG98] Molecular Analysis of GB Virus C Isolates in Belgian Hemodialysis Patients, Liu, H. F., Cornu, C., Jadoul, M., Dahan, K., Loute, G. and Goubau, P., Journal of Medical Virology, Vol. 55, 1998, pp. 118-122.
[LMTDDG2000] High Prevalence of GB Virus C/Hepatitis G Virus in Kinshasa, Democratic Republic of Congo: A Phylogenetic Analysis, Liu, H. F., Muyembe-Tamfum, J. J., Dahan, K., Desmyter, J. and Goubau, P., Journal of Medical Virology, Vol. 60, 2000, pp.
159-165.
[S75] Minimum Mutation Tree of Sequences, Sankoff, D., SIAM J. Appl. Math., Vol. 28, 1975, pp. 35-42.
[S89] Origin of Early Modern Humans, Stringer, C. B., ibid, 1989, pp. 232-244.
[S92] The Complexity of Reconstructing Trees from Qualitative Characters and Subtrees, Steel, M., Journal of Classification, Vol. 9, 1992, pp. 91-116.
[SA83] Phylogeny and Classification of Birds Based on the Data of DNA-DNA-Hybridization, Sibley, C. G. and Ahlquist, J. E., Curr. Ornithol., Vol. 1, 1983, pp. 245-292.
[SA88] Genetic and Fossil Evidence for the Origin of Modern Humans, Stringer, C. B. and Andrews, P., Science, Vol. 239, 1988, pp. 1263-1268.
[SH96] Quartet Puzzling: A Quartet Maximum-Likelihood Method for Reconstructing Tree Topologies, Strimmer, K. and Haeseler, A. V., Molecular Biology and Evolution, Vol. 13, 1996, pp. 964-969.
[SJBW90] Geographic Variation in Human Mitochondrial DNA from Papua New Guinea, Stoneking, M., Jorde, L. B., Bhatia, K. and Wilson, A. C., Genetics, Vol. 124, 1990, pp. 717-733.
[SN87] The Neighbor-Joining Method: A New Method for Reconstructing Phylogenetic Trees, Saitou, N. and Nei, M., Molecular Biology and Evolution, Vol. 4, 1987, pp. 406-425.
[SV88] On Finding Lowest Common Ancestors: Simplification and Parallelization, Schieber, B. and Vishkin, U., SIAM J. Comput., Vol. 17, 1988, pp. 1253-1262.
[T91] Human Origins and Analysis of Mitochondrial DNA Sequences, Templeton, A., Science, Vol. 255, 1991, p. 737.
[VPHKW89] Mitochondrial DNA Sequences in Single Hairs from a Southern African Population, Vigilant, L., Pennington, R., Harpending, H., Kocher, T. D. and Wilson, A. C., Proc. Natl. Acad. Sci. U.S.A., Vol. 86, 1989, pp. 9350-9354.
[VSHHW91] African Populations and the Evolution of Human Mitochondrial DNA, Vigilant, L., Stoneking, M., Harpending, H., Hawkes, K. and Wilson, A. C., Science, Vol. 253, No. 27, Sept. 1991, pp. 1503-1507.
[WJ94] On the Complexity of Multiple Sequence Alignment, Wang, L.
and Jiang, T., Journal of Computational Biology, Vol. 1, No. 4, 1994, pp. 337-348.
[WLBCR2000] A Polynomial-Time Approximation Scheme for Minimum Routing Cost Spanning Trees, Wu, B. Y., Lancia, G., Bafna, V., Chao, K. M., Ravi, R. and Tang, C. Y., SIAM J. on Computing, Vol. 29, No. 3, Jan. 12, 2000, pp. 761-778.
[WSSB77] Additive Evolutionary Trees, Waterman, M. S., Smith, T. F., Singh, M. and Beyer, W. A., Journal of Theoretical Biology, Vol. 64, 1977, pp. 199-213.
[WZJS94] A System for Approximate Tree Matching, Wang, J. T. L., Zhang, K., Jeong, K. and Shasha, D., IEEE Transactions on Knowledge and Data Engineering, Vol. 6, No. 4, Aug. 1994, pp. 559-571.
[VSHHW91] African Populations and the Evolution of Human Mitochondrial DNA, Vigilant, L., Stoneking, M., Harpending, H., Hawkes, K. and Wilson, A. C., Science, New Series, Vol. 253, Issue 5027, 1991, pp. 1503-1507.

On Superstrings

[AS95] Improved Length Bounds for the Shortest Superstring Problem, Armen, C. and Stein, C., in Proceedings 5th International Workshop on Algorithms and Data Structures, Lecture Notes in Comput. Sci., Vol. 955, 1995, pp. 494-505.
[AS96] A 2 2/3 Approximation Algorithm for the Shortest Superstring Problem, Armen, C. and Stein, C., in Proceedings Combinatorial Pattern Matching, Lecture Notes in Comput. Sci., Vol. 1075, 1996, pp. 87-101.
[AS98] 2 2/3 Superstring Approximation Algorithm, Armen, C. and Stein, C., Discrete Applied Mathematics, Vol. 88, No. 1-3, Nov. 9, 1998, pp. 29-57.
[BJJ97] Rotations of Periodic Strings and Short Superstrings, Breslauer, D., Jiang, T. and Jiang, Z., J. Algorithms, Vol. 24, No. 2, August 1997, pp. 340-353.
[BJLTY91] Linear Approximation of Shortest Superstrings, Blum, A., Jiang, T., Li, M., Tromp, J. and Yannakakis, M., in Proceedings 23rd Annual ACM Symposium on Theory of Computing, ACM, 1991, pp. 328-336.
[E90] A linear time algorithm for finding approximate shortest common superstrings, Ukkonen, E., Algorithmica, Vol. 5, 1990, pp. 313-323.
[FS98] Greedy Algorithms for the Shortest Common Superstring that are Asymptotically Optimal, Frieze, A. and Szpankowski, W., Algorithmica, Vol. 21, No. 1, May 1998, pp. 21-36.
[GMS80] On finding minimal length superstring, Gallant, J., Maier, D. and Storer, J., Journal of Computer and System Sciences, Vol. 20, 1980, pp. 50-58.
[J89] Approximation algorithms for the shortest common superstring problem, Turner, J., Information and Computation, Vol. 83, 1989, pp. 1-20.
[JL95] On the Approximation of Shortest Common Supersequences and Longest Common Subsequences, Jiang, T. and Li, M., SIAM Journal on Computing, Vol. 24, No. 5, 1995, pp. 1122-1139.
[JU88] A greedy approximation algorithm for constructing shortest common superstrings, Tarhio, J. and Ukkonen, E., Theoretical Computer Science, Vol. 57, 1988, pp. 131-145.
[KPS94] Long Tours and Short Superstrings, Kosaraju, S. R., Park, J. K. and Stein, C., Proc. 35th Annual IEEE Symposium on Foundations of Computer Science, 1994, pp. 166-177.
[S99] A 2 1/2 Approximation Algorithm for Shortest Superstring, Sweedyk, Z., SIAM J. on Computing, Vol. 29, No. 3, 1999, pp. 954-986.
[TY93] Approximating Shortest Superstrings, Teng, S. and Yao, F., Proc. 34th Annual IEEE Symposium on Foundations of Computer Science, IEEE Computer Society Press, Los Alamitos, CA, 1993, pp. 158-165.

On Protein Structure

[A96] Protein Structure Alignment Using Dynamic Programming and Iterative Improvement, Akutsu, T., IEICE Trans. Inf. & Syst., Vol. E78-D, No. 0, 1996, pp. 1-8.
[AGMML90] Basic Local Alignment Search Tool, Altschul, S. F., Gish, W., Miller, W., Myers, E. W. and Lipman, D. J., J. Mol. Biol., Vol. 215, 1990, pp. 403-410.
[AH94] On the approximation of largest common subtrees and largest common point sets, Akutsu, T. and Halldorsson, M. M., Lecture Notes in Computer Science, 1994, pp. 405-413.
[AM97] On the Approximation of Protein Threading, Akutsu, T. and Miyano, S., RECOMB, 1997, pp. 3-8.
[AS99] Protein Threading Based on Multiple Protein Structure Alignment, Akutsu, T. and Sim, K. L., Genome Informatics, Vol. 10, 1999, pp. 23-29.
[AT98] Linear programming based approach to the derivation of a contact potential for protein threading, Akutsu, T. and Tashimo, H., Proc. Pacific Symposium on Biocomputing 1998, 1998, pp. 413-424.
[AMSZZML97] Gapped BLAST and PSI-BLAST: A new generation of protein database search programs, Altschul, S. F., Madden, T. L., Schaffer, A. A., Zhang, J., Zhang, Z., Miller, W. and Lipman, D. J., Nucleic Acids Research, Vol. 25, No. 17, 1997, pp. 3389-3402.
[B76] The Protein Data Bank: A computer-based archival file for macromolecular structure, Bernstein, F. C. et al., J. Molecular Biology, 1976, pp. 535-542.
[BKWMBRKST76] The Protein Data Bank: A Computer-Based Archival File for Macromolecular Structures, Bernstein, F. C., Koetzle, T. F., Williams, G. J. B., Meyer Jr., E. F., Brice, M. D., Rodgers, J. R., Kennard, O., Shimanouchi, T. and Tasumi, M., J. Molecular Biology, Vol. 112, 1976, pp. 535-542.
[BL98] Protein Folding in the Hydrophobic-Hydrophilic (HP) Model is NP-Complete, Berger, B. and Leighton, T., Journal of Computational Biology, Vol. 5, No. 1, 1998, pp. 27-40.
[BLE91] A method to identify protein sequences that fold into a known three-dimensional structure, Bowie, J. U., Luthy, R. and Eisenberg, D., Science, 1991, pp. 164-170.
[BT91] Introduction to Protein Structure, Branden, C. and Tooze, J., Garland Publishing, New York, 1991.
[BYZRS2000] Comprehensive statistical method for protein fold recognition, Bienkowska, J. R., Yu, L., Zarakhovich, S., Rogers Jr., R. G. and Smith, T. F., RECOMB 2000, Tokyo, Japan, 2000, pp. 76-85.
[CPMLC91] Pattern Recognition and Protein Structure Prediction, Cohen, B. I., Presnell, S. R., Morris, M., Langridge, R. and Cohen, F. E., System Sciences, Vol. 1, 1991, pp. 574-584.
[CPM92] Aligning two sequences within a specified diagonal band, Chao, K. M., Pearson, W.
R., and Miller, W., CABIOS, No. 8, 1992, pp. 481-487.
[D69] Computer Analysis of Protein Evolution, Dayhoff, M. O., Sci. Amer., July 1969, pp. 86-96.
[D2002] A Genomic Regulatory Network for Development, Davidson, E. H., Science, Vol. 295, 2002, pp. 1669-1678.
[DPR97] Protein structure prediction and potential energy landscape analysis using continuous global minimization, Dill, K. A., Phillips, A. T. and Rosen, J. B., RECOMB, 1997, pp. 109-117.
[EGGI92] Sparse Dynamic Programming I: Linear Cost Functions, Eppstein, D., Galil, Z., Giancarlo, R. and Italiano, G. F., Journal of the Association for Computing Machinery, Vol. 39, No. 3, 1992, pp. 519-545.
[EGGI92] Sparse Dynamic Programming II: Convex and Concave Cost Functions, Eppstein, D., Galil, Z., Giancarlo, R. and Italiano, G., J. Assoc. Comput. Mach., Vol. 39, 1992, pp. 546-567.
[G93] Efficient methods for multiple sequence alignment with guaranteed error bounds, Gusfield, D., Bulletin of Mathematical Biology, Vol. 55, 1993, pp. 141-154.
[GBDK89] An NTP-Binding Motif is the Most Conserved Sequence in a Highly Diverged Monophyletic Group of Proteins Involved in Positive Strand RNA Viral Replication, Gorbalenya, A. E., Blinov, V. M., Donchenko, A. P. and Koonin, E. V., J. Molec. Evol., Vol. 28, 1989, pp. 256-68.
[GGPPY98] On the Complexity of Protein Folding, Crescenzi, P., Goldman, D., Papadimitriou, C., Piccolboni, A. and Yannakakis, M., Journal of Computational Biology, Vol. 5, No. 3, 1998, pp. 423-465.
[GIP99] Algorithmic Aspects of Protein Structure Similarity, Goldman, D., Istrail, S. and Papadimitriou, C., IEEE Proc. 40th Ann. Conf. Foundations of Computer Science (FOCS’99), 1999, pp. 512-521.
[GL96] Using Iterative Dynamic Programming to Obtain Accurate Pairwise and Multiple Alignments of Protein Structures, Gerstein, M. and Levitt, M., In Proc. Fourth Int. Conf. on Intell. Sys. for Mol. Biol., Menlo Park, 1996, pp. 59-67.
[H95] A context dependent method for comparing sequences, Huang, X., Proc.
5th Symposium on Combinatorial Pattern Matching, 1995, pp. 54-63.
[HI96] Fast Protein Folding in the Hydrophobic-Hydrophilic Model within Three-Eighths of Optimal, Hart, W. E. and Istrail, S., Journal of Computational Biology, Spring, 1996.
[HOSTV92] A Database of Protein Structure Families with Common Folding Motifs, Holm, L., Ouzounis, C., Sander, C., Tuparev, G. and Vriend, G., Protein Science, Vol. 1, 1992, pp. 1691-1698.
[HS91] Database algorithm for generating protein backbone and side chain co-ordinates from a Ca trace. Application to model building and detection of co-ordinate errors, Holm, L. and Sander, C., J. Mol. Biol., Vol. 218, 1991, pp. 183-194.
[HS93] Protein structure comparison by alignment of distance matrices, Holm, L. and Sander, C., J. Mol. Biol., Vol. 233, 1993, pp. 123-138.
[HS94] The FSSP database of structurally aligned protein fold families, Holm, L. and Sander, C., Nucleic Acids Research, Vol. 22, 1994, pp. 3600-3609.
[HS95] 3-D Lookup: Fast Protein Structure Database Searches at 90% Reliability, Holm, L. and Sander, C., Proc. 3rd International Conference on Intelligent Systems for Molecular Biology (ISMB’95), 1995, pp. 179-187.
[HS96] Mapping the protein universe, Holm, L. and Sander, C., Science, Vol. 273, 1996, pp. 595-602.
[HS96] Alignment of three-dimensional protein structure, Holm, L. and Sander, C., Meth. Enz., Vol. 266, 1996, pp. 595-602.
[HS98] Dictionary of recurrent domains in protein structure, Holm, L. and Sander, C., Proteins, Vol. 33, 1998, pp. 88-96.
[L91] Protein Architecture: A Practical Approach, Lesk, A. M., IRL Press, New York, 1991.
[L94] The protein threading problem with sequence amino acid interaction preferences is NP-complete, Lathrop, R. H., Protein Engineering, Vol. 7, 1994, pp. 1059-1068.
[LS94] A Branch-and-Bound Algorithm for Optimal Protein Threading with Pairwise (Contact Potential) Amino Acid Interactions, Lathrop, R. H. and Smith, T. F., Proc.
27th Annual Hawaii International Conference on System Sciences, Vol. 5, 1994, pp. 365-374.
[LS96] Global optimum protein threading with gapped alignment and empirical pair score function, Lathrop, R. H. and Smith, T. F., J. Molecular Biology, Vol. 255, 1996, pp. 641-665.
[LEN2002] The Spectrum Kernel: A String Kernel for SVM Protein Classification, Leslie, C., Eskin, E. and Noble, W. S., Proceedings of the Pacific Symposium on Biocomputing, January 2002, pp. 564-575.
[MHBFP97] Critical assessment of methods of protein structure prediction (CASP): Round II, Moult, J., Hubbard, T., Bryant, S. H., Fidelis, K. and Pedersen, J. T., Proteins: Structure, Function, and Genetics, Suppl. 1, 1997, pp. 2-6.
[MPP99] Approximation Algorithms for Protein Folding Prediction, Mauri, G., Pavesi, G. and Piccolboni, A., Proceedings of the 10th Annual Symposium on Discrete Algorithms (SODA), 1999, pp. 945-946.
[N97] Molecular Modeling of Proteins and Mathematical Prediction of Protein Structure, Neumaier, A., SIAM Review, Vol. 39, No. 3, 1997, pp. 407-460.
[OJT94] Protein Superfamilies and Domain Superfolds, Orengo, C. A., Jones, D. T. and Thornton, J. M., Nature, Vol. 372, 1994, pp. 631-634.
[OTIA94] Protein Structure Prediction Based on Multi-Level Description, Onizuka, K., Tsuda, H., Ishikawa, M. and Aiba, A., System Sciences, Vol. V, 1994, pp. 355-364.
[PA92] A Data Bank Merging Related Protein Structures and Sequences, Pascarella, S. and Argos, P., Protein Engineering, Vol. 5, 1992, pp. 121-137.
[PV2000] Backbone Cluster Identification in Proteins by a Graph Theoretical Method, Patra, S. M. and Vishveshwara, S., Biophysical Chemistry, Vol. 84, 2000, pp. 13-25.
[R2001] Review: Protein Secondary Structure Prediction Continues to Rise, Rost, B., Journal of Structural Biology, Vol. 134, 2001, pp. 204-218.
[RCB95] Protein Fold Recognition from Secondary Structure Assignments, Russell, R. B., Copley, R. R. and Barton, G.
J., Proceedings of the 28th Annual Hawaii International Conference on System Sciences, 1995, pp. 302-311.
[RR73] Comparison of Super-Secondary Structure in Proteins, Rao, S. T. and Rossmann, M. G., J. Molecular Biology, Vol. 76, 1973, pp. 241-256.
[RS93] Prediction of protein structure at better than 70% accuracy, Rost, B. and Sander, C., J. Molecular Biology, Vol. 232, 1993, pp. 584-599.
[SO94] Derivation of Rules for Comparative Protein Modeling from a Database of Protein Structure Alignments, Sali, A. and Overington, J. P., Protein Science, Vol. 3, 1994, pp. 1582-1596.
[SSK94] How Does a Protein Fold?, Sali, A., Shakhnovich, E. and Karplus, M., Nature, Vol. 369, 1994, pp. 248-251.
[TO89] Protein Structure Alignment, Taylor, W. R. and Orengo, C. A., J. Molecular Biology, Vol. 208, 1989, pp. 1-22.
[UM93] Genetic Algorithms for Protein Folding Simulations, Unger, R. and Moult, J., Journal of Molecular Biology, Vol. 231, 1993, pp. 75-81.
[VS91] Detection of Common Three-Dimensional Substructures in Proteins, Vriend, G. and Sander, C., PROTEINS: Structure, Function, and Genetics, Vol. 11, 1991, pp. 52-58.
[YD94] Forces of Tertiary Structural Organization in Globular Proteins, Yue, K. and Dill, K. A., Proceedings of the National Academy of Sciences, USA, Vol. 92, 1994, pp. 146-150.
[ZB96] The use of amino acid patterns of classified helices and strands in secondary structure prediction, Zhu, Z. Y. and Blundell, T. L., J. Mol. Biol., 1996, pp. 261-276.
[ZWM89] Protein Structure Prediction by A Data-Level Parallel Algorithm, Zhang, X., Waltz, D. and Mesirov, J. P., Proceedings of the 1989 Conference on Supercomputing, 1989, pp. 215-223.

On String Matching

[A87] Generalized string matching, Abrahamson, K., SIAM J. Comput., Vol. 16, 1987, pp. 1039-1051.
[AF95] Efficient 2-dimensional approximate matching of half-rectangular figures, Amir, A. and Farach, M., Inform. and Comput., Vol. 118, 1995, pp. 1-11.
[B97] Parameterized Duplication in Strings: Algorithms and An Application to Software Maintenance, Baker, B. S., SIAM J. Comput., Vol. 26, No. 5, 1997, pp. 1343-1362.
[BG92] A New Approach to Text Searching, Baeza-Yates, R. A. and Gonnet, G. H., Communications of the ACM, Vol. 35, 1992, pp. 74-82.
[BN99] Faster Approximate String Matching, Baeza-Yates, R. and Navarro, G., Algorithmica, Vol. 23, No. 2, Feb. 1999, pp. 127-158.
[C95] Fast Approximate Matching Using Suffix Trees, Cobbs, A., In Proceedings of the 6th Symposium on Combinatorial Pattern Matching, Lecture Notes in Computer Science, Vol. 937, 1995, pp. 41-54.
[CL92] Theoretical and Empirical Comparisons of Approximate String Matching Algorithms, Chang, W. I. and Lampe, J., In Proceedings of the 3rd Symposium on Combinatorial Pattern Matching, Lecture Notes in Computer Science, Vol. 644, 1992, pp. 172-181.
[CL94] Sublinear Approximate String Matching and Biological Applications, Chang, W. I. and Lawler, E. L., Algorithmica, Vol. 12, No. 4-5, Oct-Nov. 1994, pp. 327-344.
[G2001] A Guided Tour to Approximate String Matching, Navarro, G., ACM Computing Surveys, Vol. 33, 2001, pp. 31-88.
[GP90] An Improved Algorithm for Approximate String Matching, Galil, Z. and Park, K., SIAM J. Comput., Vol. 19, 1990, pp. 989-999.
[GV2000] Compressed Suffix Arrays and Suffix Trees with Applications to Text Indexing and String Matching, Grossi, R. and Vitter, J. S., STOC, Portland, Oregon, USA, 2000, pp. 397-406.
[KNU2000] Approximate string matching over Ziv-Lempel compressed text, Karkkainen, J., Navarro, G. and Ukkonen, E., Proceedings of the 11th Annual Symposium on Combinatorial Pattern Matching, 2000, pp. 195-209.
[LMS98] Incremental string comparison, Landau, G., Myers, E. and Schmidt, J., SIAM J. Comput., Vol. 27, No. 2, 1998, pp. 557-582.
[LV86] Efficient string matching with k mismatches, Landau, G. M. and Vishkin, U., Theoret. Comput. Sci., Vol. 43, 1986, pp. 239-249.
[LV88] Fast String Matching with k Differences, Landau, G. M. and Vishkin, U., J. Comput.
Syst. Sci., Vol. 37, 1988, pp. 63-78. [LV89] Fast parallel and serial approximate string matching, Landau, G. M. and Vishkin, U., J. Algorithms, Vol. 10, 1989, pp. 157-169. [M76] A Space-economical Suffix Tree Construction Algorithm, McCreight, E. M., J. ACM, Vol. 23, No. 2, April 1976, pp. 262-272. [M94] Sublinear Algorithm for Approximate Keyword Searching, Myers, E. W., Algorithmica, Vol. 12, No. 4-5, Oct-Nov. 1994, pp. 345-374. [MP80] A Fast Algorithm for Computing String Edit Distances, Masek, W. J. and Paterson, M. S., J. Comput. Syst. Sci., Vol. 20, 1980, pp. 18-31. [MR95] String matching under a general matching relation, Muthukrishnan, S. and Ramesh, H., Inform. and Comput., Vol. 122, 1995, pp. 140-148. [N99] A Guided Tour to Approximate String Matching, Navarro, G., ACM Computing Surveys, Vol. 33, No. 1, March 2001, pp. 31-88. [PW95] Multiple Filtration and Approximate Pattern Matching, Pevzner, P. A. and Waterman, M. S., Algorithmica, Vol. 13, No. 1-2, Jan-Feb. 1995, pp. 135-154. [RS98] On Pattern Frequency Occurrences in a Markovian Sequence, Regnier, M. and Szpankowski, W., Algorithmica, Vol. 22, No. 4, Dec. 1998, pp. 631-649. [S93] Generalized Suffix Tree and Its (Un)expected Asymptotic Behaviors, Szpankowski, W., SIAM Journal on Computing, Vol. 22, No. 6, Dec. 1993, pp. 1176-1198. [S98] All highest scoring paths in weighted grid graphs and their application to finding all approximate repeats in strings, Schmidt, J., SIAM J. Comput., Vol. 27, No. 4, 1998, pp. 972-992. [U85] Finding Approximate Patterns in Strings, Ukkonen, E., J. Algorithms, Vol. 6, 1985, pp. 132-137. [U90] A linear-time algorithm for finding approximate shortest common super-strings, Ukkonen, E., Algorithmica, Vol. 5, 1990, pp. 313-323. [U92] Approximate String-Matching with q-Grams and Maximal Matches, Ukkonen, E., Theoret. Comput. Sci., Vol. 92, 1992, pp. 191-211.
[U93] On-Line Construction of Suffix-Trees, Ukkonen, E., Technical Report A-1993, Department of Computer Science, University of Helsinki, Finland, 1993. [WM92] Fast Text Searching Allowing Errors, Wu, S. and Manber, U., Comm. ACM, Vol. 35, No. 10, 1992, pp. 83-91. [WMM96] A Subquadratic Algorithm for Approximate Limited Expression Matching, Wu, S., Manber, U. and Myers, G., Algorithmica, Vol. 15, 1996, pp. 50-67. On Superstructures [AS98] On Testing Consecutive-Ones Property in Parallel, Annexstein, F. and Swaminathan, R., Discrete Applied Mathematics, Vol. 88, No. 1-3, Nov. 9, 1998, pp. 7-28. [JK98] Mapping Clones with a Given Ordering or Interleaving, Jiang, T. and Karp, R., Algorithmica, Vol. 21, 1998, pp. 262-284. [KS95] Exact and Approximation Algorithms for Sorting by Reversals, with Application to Genome Rearrangement, Kececioglu, J. and Sankoff, D., Algorithmica, Vol. 13, No. 1-2, Jan-Feb. 1995, pp. 180-210. [S85] Simultaneous Solution of the RNA Folding, Alignment and Protosequence Problems, Sankoff, D., SIAM J. Appl. Math., Vol. 45, 1985, pp. 810-825. On RNA Structures [BMR95] Computing similarity between RNA strings, Bafna, V., Muthukrishnan, S. and Ravi, R., Proceedings of the 6th Annual Symposium on Combinatorial Pattern Matching, Vol. 937, 1995, pp. 1-16. [CM94] RNAlign program: alignment of RNA sequences using both primary and secondary structures, Corpet, F. and Michot, B., Comput. Appl. Biosci., Vol. 10, 1994, pp. 389-399. [JLMZ2002] A General Edit Distance between RNA Structures, Jiang, T., Lin, G., Ma, B. and Zhang, K., Journal of Computational Biology, Vol. 9, No. 2, 2002, pp. 371-388. [LP2000] Pseudoknots in RNA Secondary Structures, Lyngsø, R. B. and Pedersen, C. N. S., RECOMB, Tokyo, 2000, pp. 201-209. [LRV98] A polyhedral approach to RNA sequence structure alignment, Lenhof, H., Reinert, K. and Vingron, M., Proceedings of the Second Annual International Conference on Computational Molecular Biology, 1998, pp. 153-159.
[RE99] A Dynamic Programming Algorithm for RNA Structure Prediction Including Pseudoknots, Rivas, E. and Eddy, S. R., J. Mol. Biol., Vol. 285, 1999, pp. 2053-2068. [S85] Simultaneous solution of the RNA folding, alignment, and protosequence problems, Sankoff, D., SIAM J. Appl. Math., Vol. 45, 1985, pp. 810-825. [T2000] Dynamic Programming Algorithms for RNA Secondary Structure Prediction with Pseudoknots, Akutsu, T., Discrete Applied Mathematics, Vol. 104, 2000, pp. 45-62. [TSF88] RNA Structure Prediction, Turner, D. H., Sugimoto, N. and Freier, S., Annual Review of Biophysics and Biophysical Chemistry, Vol. 17, 1988, pp. 167-192. [WS86] Rapid Dynamic Programming Algorithms for RNA Secondary Structure, Waterman, M. S. and Smith, T. F., Advances in Applied Mathematics, Vol. 7, 1986, pp. 455-464. [Z89] The Use of Dynamic Programming Algorithms in RNA Secondary Structure Prediction, Zuker, M., Mathematical Methods for DNA Sequences, Waterman, M. S., Ed., CRC Press, Inc., Boca Raton, Florida, chapter 7, 1989, pp. 159-184. [ZS84] RNA Secondary Structures and Their Prediction, Zuker, M. and Sankoff, D., Bulletin of Mathematical Biology, Vol. 46, 1984, pp. 591-621. [ZWM99] Computing similarity between RNA structures, Zhang, K., Wang, L. and Ma, B., Proceedings of the 10th Annual Symposium on Combinatorial Pattern Matching, Vol. 1645, 1999, pp. 281-293. On Miscellaneous [B75] On the Factorization of the Complete Uniform Hypergraph, Baranyai, Z., in A. Hajnal, T. Rado, V. T. Sos (Eds.), Infinite and Finite Sets, 1975, pp. 91-108. [BJKLW99] Quartet Cleaning: Improved Algorithms and Simulations, Berry, V., Jiang, T., Kearney, P., Li, M. and Wareham, T., Proc. 7th Annual European Symposium on Algorithms, July, 1999. [CW79] Universal Classes of Hash Functions, Carter, J. L. and Wegman, M. N., J. Comput. System Sci., Vol. 18, 1979, pp. 143-154. [FJKST] An Algorithmic Approach to Multiple Complete Digest Mapping, Fasulo, D., Jiang, T., Karp, R., Settergren, R.
and Thayer, E., Journal of Computational Biology, in press. [JKL98] Orchestrating Quartets: Approximation and Data Correction, Jiang, T., Kearney, P. and Li, M., Proc. 39th IEEE Symposium on Foundations of Computer Science, Palo Alto, CA, 1998. [JS94] Functional Equation Arising in the Analysis of Algorithms, Jacquet, P. and Szpankowski, W., Conference Proceedings of the Annual ACM Symposium on Theory of Computing, May 23-25, 1994, pp. 780-789. [S88] Data Compression: Methods and Theory, Storer, J., Computer Science Press, Rockville, MD, 1988. On Genome Rearrangement [APCVBLL2000] An SNP Map of the Human Genome Generated by Reduced Representation Shotgun Sequencing, Altshuler, D., Pollara, VJ., Cowles, CR., Van Etten, WJ., Baldwin, J., Linton, L. and Lander, ES., Nature, Vol. 407, 2000, pp. 513-516. [ATPEMAFJOL98] A Genome-Based Approach for the Identification of Essential Bacterial Genes, Arigoni, F., Talabot, F., Peitsch, M., Edgerton, M. D., Meldrum, E., Allet, E., Fish, R., Jamotte, T., Ourchod, M. L. and Loferer, H., Nat. Biotechnol., Vol. 16, 1998, pp. 851-857. [AW87] Sorting by insertion of leading element, Aigner, M. and West, D. B., Journal of Combinatorial Theory, Vol. 45, pp. 306-309. [B99] An Automated Comparative Analysis of 17 Complete Microbial Genomes, Bansal, A. K., Bioinformatics, Vol. 15, No. 11, 1999, pp. 900-908. [B99] The Complexity of the Breakpoint Median Problem, Bryant, D., Université de Montréal, 1999, pp. 1-12. [BH96] Fast Sorting by Reversals, Berman, P. and Hannenhalli, S., in Combinatorial Pattern Matching, Lecture Notes in Comput. Sci., Vol. 1075, 1996, pp. 168-185. [BHK2001] 1.375-approximation algorithm for sorting by reversals, Berman, P., Hannenhalli, S. and Karpinski, M., Electronic Colloquium on Computational Complexity TR01-047, 2001. [BK99] On some tighter inapproximability results, Berman, P. and Karpinski, M., In Proceedings of the 26th ICALP, Springer, 1999.
[BMY2001] A linear-time algorithm for computing inversion distances between signed permutations with an experimental study, Bader, D., Moret, B. and Yan, M., J. Comput. Biol., Vol. 8, No. 5, 2001, pp. 483-491. [BP94] Genome Rearrangements and Sorting by Reversals, Bafna, V. and Pevzner, P. A., the 34th IEEE Symposium on Foundations of Computer Science, 1994, pp. 148-157. [BP95] Sorting Permutations by Transpositions, Bafna, V. and Pevzner, P., in Proceedings of the 6th Annual Symposium on Discrete Algorithms, ACM, 1995, pp. 614-623. [BP96] Genome Rearrangements and Sorting by Reversals, Bafna, V. and Pevzner, P. A., SIAM Journal on Computing, Vol. 25, 1996, pp. 272-289. [BP98] Sorting by transposition, Bafna, V. and Pevzner, P., SIAM Journal on Discrete Mathematics, Vol. 11, No. 2, 1998, pp. 224-240. [C97] Sorting by reversals is difficult, Caprara, A., In Proceedings of the 1st Conference on Computational Molecular Biology (RECOMB97), 1997, pp. 75-83. [C97] Sorting Permutations by Reversals and Eulerian Cycle Decompositions, Caprara, A., to appear in SIAM Journal on Discrete Mathematics, April 1997, pp. 1-23. [C98] A 3/2-approximation algorithm for sorting by reversals, Christie, D. A., In Proceedings of the 9th Annual Symposium on Discrete Algorithms (SODA 98), ACM Press, 1998, pp. 244-252. [C99] Formulations and Hardness of Multiple Sorting by Reversals, Caprara, A., ACM, 1999, pp. 84-93. [C99] Sorting permutations by reversals and Eulerian cycle decompositions, Caprara, A., SIAM J. Discrete Math., Vol. 12, No. 1, 1999, pp. 91-110. [CFKRP93] The GDB Human Genome Data Base, Cuticchia, A. J., Fasman, K. H., Kingsbury, D. T., Robbins, R. J. and Pearson, P. L., Nucleic Acids Research, Vol. 21, 1993, pp. 3003. [CL2000] Experimental and Statistical Analysis of Sorting by Reversals, Caprara, A. and Lancia, G., Comparative Genomics: Empirical and Analytical Approaches to Gene Order Dynamics, 2000, pp.
171-183. [CLN] Fast Practical Solution of Sorting by Reversals, Caprara, A., Lancia, G. and Ng, S. K., Bioinformatics, pp. 12-21. [CLN99] A Column-Generation Based Branch-and-Bound Algorithm for Sorting by Reversals, Caprara, A., Lancia, G. and Ng, S. K., Mathematical Support for Molecular Biology; DIMACS Series in Discrete Mathematics and Theoretical Computer Science, Vol. 47, 1999, pp. 213-226. [CSGK2000] ProDom and ProDom-CG: Tools for Protein Domain Analysis and Whole Genome Comparisons, Corpet, F., Servant, F., Gouzy, J. and Kahn, D., Nucleic Acids Research, Vol. 28, No. 1, 2000, pp. 267-269. [D2000] Graphical Tools for Comparative Genome Analysis, Dicks, J., Yeast, Vol. 17, 2000, pp. 6-15. [DKFPWS99] Alignment of Whole Genomes, Delcher, A. L., Kasif, S., Fleischmann, R. D., Peterson, J., White, O. and Salzberg, S. L., Nucleic Acids Research, Vol. 27, No. 11, 1999, pp. 2369-2376. [E2001] (1+ε)-Approximation of Sorting by Reversals and Transpositions, Eriksen, N., Dept. of Mathematics, Royal Institute of Technology, 2001, pp. 227-237. [E2002] (1+ε)-approximation of sorting by reversals and transpositions, Eriksen, N., Theoretical Computer Science, Vol. 289, 2002, pp. 517-529. [EDAE2001] Gene order rearrangements with derange: weights and reliability, Eriksen, N., Dalevi, D., Andersson, S. G. E., and Eriksson, K., Submitted to the Journal of Computational Biology, 2001. [EEKSW2001] Sorting a bridge hand, Eriksson, H., Eriksson, K., Karlander, J., Svensson, L. and Wästlund, J., Discrete Mathematics, Vol. 241, 2001, pp. 289-300. [F94] Restructuring the Genome Data Base: a Model for a Federation of Biological Databases, Fasman, K. H., J. Computational Biology, Vol. 1, 1994, pp. 165-171. [FCK94] The GDB Human Genome Data Base, Fasman, K. H., Cuticchia, A. J. and Kingsbury, D. T., Nucleic Acids Research, Vol. 22, 1994, pp. 3462-3469. [FLCK96] Improvements to the GDB Human Genome Data Base, Fasman, K. H., Letovsky, S. I., Cottingham, B. W.
and Kingsbury, D. T., Nucleic Acids Research, Vol. 24, 1996, pp. 57-63. [FRSZSMM2000] Web-Based Visualization Tools for Bacterial Genome Alignments, Florea, L., Riemer, C., Schwartz, S., Zhang, Z., Stojanovic, N., Miller, W. and McClelland, M., Nucleic Acids Research, Vol. 28, 2000, pp. 3486-3496. [GK2000] Who’s Your Neighbor? New Computational Approaches for Functional Genomics, Galperin, MY. and Koonin, EV., Nat. Biotechnol., Vol. 18, No. 6, 2000, pp. 609-631. [GL2000] Gestalt: A Workbench for Automatic Integration and Visualization of Large-Scale Genomic Sequence Analyses, Glusman, G. and Lancet, D., Bioinformatics, Vol. 16, 2000, pp. 482-483. [GMP96] Spliced Alignment: A New Approach to Gene Recognition, Gelfand, M., Mironov, A. and Pevzner, P., in Hirschberg, D. and Myers, E. (eds.), Combinatorial Pattern Matching, Lecture Notes in Computer Science, Vol. 1075, pp. 141-159. [GP79] Bounds for Sorting by Prefix Reversals, Gates, W. H. and Papadimitriou, C. H., Discrete Mathematics, Vol. 27, 1979, pp. 47-57. [GPS99] A 2-approximation algorithm for genome rearrangements by reversals and transpositions, Gu, Q. P., Peng, S. and Sudborough, H., Theoretical Computer Science, Vol. 210, No. 2, 1999, pp. 327-339. [GR93] Prediction of the Exon-Intron Structure by a Dynamic-Programming Approach, Gelfand, M. S. and Roytberg, M. A., Biosystems, Vol. 30, 1993, pp. 173-182. [H81] The NP-Completeness of Some Edge-Partition Problems, Holyer, I., SIAM Journal on Computing, Vol. 10, 1981, pp. 713-717. [H96] Polynomial Algorithm for Computing Translocation Distance between Genomes, Hannenhalli, S., Discrete Appl. Math., Vol. 71, 1996, pp. 137-151. [HAZK97] A Tool for Analyzing and Annotating Genomic Sequences, Huang, X., Adams, MD., Zhou, H. and Kerlavage, AR., Genomics, Vol. 46, No. 1, 1997, pp. 37-45. [HP95] Transforming Cabbage into Turnip: Polynomial Algorithm for Sorting Signed Permutations by Reversals, Hannenhalli, S. and Pevzner, P.
A., Proceedings of the 27th Annual ACM Symposium on the Theory of Computing, 1995, pp. 178-187. [HP95] Transforming Men into Mice (Polynomial Algorithm for Genomic Distance Problems), Hannenhalli, S. and Pevzner, P. A., in Proceedings of the 27th Annual ACM Symposium on Theory of Computing, 1995, pp. 178-189. [J85] The complexity of finding minimum-length generator sequences, Jerrum, M., Theoretical Computer Science, Vol. 36, 1985, pp. 265-289. [K99] Why Genome Analysis?, Koonin, E., TIG, Vol. 15, No. 4, April 1999, pp. 131. [KM95] Combinatorial Algorithms for DNA Sequence Assembly, Kececioglu, J. D. and Myers, E. W., Algorithmica, Vol. 13, 1995, pp. 7-51. [KS93] Exact and Approximation Algorithms for Sorting by Reversals, with Application to Genome Rearrangement, Kececioglu, J. and Sankoff, D., Algorithmica, Vol. 13, 1995, pp. 180-210. A Preliminary Version Appeared in Proceedings CPM 93, 1993, pp. 87-105. [KS95] Exact and Approximation Algorithms for Sorting by Reversals, with Application to Genome Rearrangement, Kececioglu, J. and Sankoff, D., Algorithmica, Vol. 13, 1995, pp. 180-210. [KST97] Faster and Simpler Algorithm for Sorting Signed Permutations by Reversals, Kaplan, H., Shamir, R. and Tarjan, R. E., in Proceedings of the 8th ACM-SIAM Symposium on Discrete Algorithms (also in Proceedings of the First International Conference on Computational Molecular Biology (RECOMB)), 1997, pp. 344-351 (also pp. 163). [KST99] A Faster and Simpler Algorithm for Sorting Signed Permutations by Reversals, Kaplan, H., Shamir, R. and Tarjan, R. E., SIAM J. Comput., 1999, pp. 880-892. [LX99] Signed genome rearrangements by reversals and transpositions: Models and approximations, Lin, G. H. and Xue, G., In Proc. COCOON ’99, Lecture Notes in Computer Science, Vol. 1627, 1999, pp. 71-78. [MKRG99] Computer Analysis of Transcription Regulatory Patterns in Completely Sequenced Bacterial Genomes, Mironov, A. A., Koonin, E. V., Roytberg, M. A. and Gelfand, M. S., Nucleic Acids Research, Vol. 27, No.
14, 1999, pp. 2981-2989. [MMS99] Benchmarking PSI-BLAST in Genome Annotation, Müller, A., MacCallum, R. M. and Sternberg, M. J. E., Journal of Molecular Biology, Vol. 293, 1999, pp. 1257-1271. [PH88] Plant Mitochondrial DNA Evolves Rapidly in Structure, but Slowly in Sequence, Palmer, J. D. and Herbon, L. A., J. Molecular Evolution, Vol. 28, 1988, pp. 87-97. [PS] Approximation Algorithms for the Median Problem in the Breakpoint Model, Pe’er, I. and Shamir, R., Tel Aviv University, pp. 1-16. [PS98] The Median Problems for Breakpoints are NP-complete, Pe’er, I. and Shamir, R., Supported by Eshkol Scholarship from the Ministry of Science and Technology, Israel, Nov. 1998, pp. 1-15. [RAG97] Combinatorial approaches to gene recognition, Roytberg, M., Astakhova, T. and Gelfand, M., Comput. Chem., Vol. 21, No. 4, 1997, pp. 229-235. [S99] Genome Rearrangement with Gene Families, Sankoff, D., Bioinformatics, Vol. 15, No. 11, 1999, pp. 909-917. [SB98] Multiple Genome Rearrangement and Breakpoint Phylogeny, Sankoff, D. and Blanchette, M., J. Comput. Biol., Vol. 5, No. 3, 1998, pp. 555-570. [SB98] Multiple Genome Rearrangement, Sankoff, D. and Blanchette, M., RECOMB, New York, 1998, pp. 243-247. [SB99] Probability Models for Genome Rearrangement and Linear Invariants for Phylogenetic Inference, Sankoff, D. and Blanchette, M., ACM, 1999, pp. 302-309. [SBFHHIKKRSSSSTTWZH99] The Genome Sequence Database: Towards an Integrated Functional Genomics Resource, Skupski, M. P., Booker, M., Farmer, A., Harpold, M., Huang, W., Inman, J., Kiphart, D., Kodira, C., Root, S., Schilkey, F., Schwertfeger, J., Siepel, A., Stamper, D., Thayer, N., Thompson, R., Wortman, J., Zhuang, J. J. and Harger, C., Nucleic Acids Research, Vol. 27, 1999, pp. 35-38. [SCA90] Genomic Divergence through Gene Rearrangement, Sankoff, D., Cedergren, R. and Abel, Y., Methods in Enzymology, Vol. 183, 1990, pp. 428-438. [SD94] A Workbench for Large-Scale Sequence Homology Analysis, Sonnhammer, E. L. and Durbin, R., Comput.
Appl. Biosci., Vol. 10, No. 3, 1994, pp. 301-307. [SP99] Lecture 11: Algorithm for Molecular Biology, Shamir, R. and Pe’er, I., Tel Aviv University, February 14, 1999, pp. 1-23. [SRMTE98] Optimization of Restriction Fragment DNA Mapping, Siegel, A. F., Roach, J. C., Magness, C., Thayer, E. and Engh, V. D., J. Comput. Biol., Vol. 5, No. 1, 1998, pp. 113-126. [SSK96] Steiner Points in the Space of Genome Rearrangements, Sankoff, D., Sundaram, G. and Kececioglu, J., International Journal of Foundations of Computer Science, Vol. 7, No. 1, Jan. 1996, pp. 1-9. [TKO99] Complete Genomes in WWW Entrez: Data Representation and Analysis, Tatusova, T. A., Karsch-Mizrachi, I. and Ostell, J. A., Bioinformatics, Vol. 15, 1999, pp. 536-543. [WDM98] Reversal and transposition distance of linear chromosomes, Walter, M. E., Dias, Z. and Meidanis, J., In String Processing and Information Retrieval: A South American Symposium (SPIRE 98), 1998. [WDM2000] A new approach for approximating the transposition distance, Walter, M. E., Dias, Z. and Meidanis, J., In String Processing and Information Retrieval: A South American Symposium (SPIRE 00), 2000. On Pattern Discovery [B96] Parameterized Pattern Matching: Algorithms and Applications, Baker, B. S., J. Comput. Syst. Sci., Vol. 52, No. 1, 1996, pp. 28-42. [BBEG99] MEME, MAST, and Meta-MEME: New Tools for Motif Discovery in Protein Sequences, Bailey, T. L., Baker, M. E., Elkan, C. P., and Grundy, W. N., in Pattern Discovery in Biomolecular Data, Oxford University Press, 1999. [C2000] SPLASH: Structural Pattern Localization Analysis by Sequential Histograms, Califano, A., Bioinformatics, Vol. 16, No. 4, 2000, pp. 341-357. [CHG99] Discovering Concepts in Structural Data, Cook, D. J., Holder, L. B. and Galal, G., in Pattern Discovery in Biomolecular Data, Oxford University Press, 1999. [CPY96] Data mining for path traversal patterns in a web environment, Chen, M., Park, J.
and Yu, P., The 16th International Conference on Distributed Computing Systems, 1996, pp. 385-392. [GSF99] Motif Discovery in Protein Structure Databases, Glasgow, J., Steeg, E. and Fortier, S., in Pattern Discovery in Biomolecular Data, Oxford University Press, 1999. [GU96] A Fast Look-Up Algorithm for Detecting Repetitive DNA Sequences, Guan, X. and Uberbacher, E. C., Proceedings of the Pacific Symposium on Biocomputing, 1996, pp. 718-719. [H99] Assembling Blocks, Henikoff, J. G., in Pattern Discovery in Biomolecular Data, Oxford University Press, 1999. [JCH95] Finding Flexible Patterns in Unaligned Protein Sequences, Jonassen, I., Collins, J. F. and Higgins, D. G., Protein Sci., Vol. 4, 1995, pp. 1587-1595. [LSW99] A Framework for Biological Pattern Discovery on Network of Workstations, Li, B., Shasha, D. and Wang, J. T. L., in Pattern Discovery in Biomolecular Data, Oxford University Press, 1999. [M78] The Complexity of Some Problems on Subsequences and Supersequences, Maier, D., J. ACM, Vol. 25, 1978, pp. 322-336. [M83] An Efficient Method for Finding Repeats in Molecular Sequences, Martinez, M., Nucleic Acids Res., Vol. 11, 1983, pp. 4629-4634. [M99] Discovering Patterns in DNA Sequences by the Algorithmic Significance Method, Milosavljevic, A., in Pattern Discovery in Biomolecular Data, Oxford University Press, 1999. [NG94] Detecting Patterns in Protein Sequences, Neuwald, A. F. and Green, P., Journal of Molecular Biology, Vol. 239, 1994, pp. 698-712. [NW70] A General Method Applicable to the Search for Similarities in the Amino Acid Sequence of Two Proteins, Needleman, S. B. and Wunsch, C. D., J. Mol. Biol., Vol. 48, 1970, pp. 443-453. [PFR99] An Approximation Algorithm for Alignment of Multiple Sequences Using Motif Discovery, Parida, L., Floratos, A., and Rigoutsos, I., in Journal of Combinatorial Optimization, 1999.
[PRFPG2000] Pattern Discovery on Character Sets and Real-Valued Data: Linear Bound on Irredundant Motifs and an Efficient Polynomial Time Algorithm, Parida, L., Rigoutsos, I., Floratos, A., Platt, D. and Gao, Y., IBM Thomas J. Watson Research Center, to appear in SODA, 2000, pp. 297-308. [R92] A Search for Common Patterns in Many Sequences, Roytberg, M. A., Comput. Applic. Biosci., Vol. 8, 1992, pp. 57-64. [RF98] Combinatorial Pattern Discovery in Biological Sequences: the TEIRESIAS Algorithm, Rigoutsos, I. and Floratos, A., Bioinformatics, Vol. 14, No. 1, 1998, pp. 55-67. [RF98] Motif Discovery in Biological Sequences without Alignment or Enumeration, Rigoutsos, I. and Floratos, A., In Proceedings of the Annual Conference on Computational Molecular Biology (RECOMB ’98), ACM Press, March 1998, pp. 221-227. [RPCS99] Representation and Matching of Small Flexible Molecules in Large Database of 3D Molecular Information, Rigoutsos, I., Platt, D., Califano, A. and Silverman, D., in Pattern Discovery in Biomolecular Data, Oxford University Press, 1999. [SAC90] Finding Sequence Motifs in Groups of Functionally Related Proteins, Smith, H. O., Annau, T. M. and Chandrasegaran, S., Proc. Natl. Acad. Sci. USA, Vol. 87, 1990, pp. 826-830. [SKWC99] RNA Structure Analysis: A Multifaceted Approach, Shapiro, B. A., Kasprzak, W., Wu, J. C. and Currey, K., in Pattern Discovery in Biomolecular Data, Oxford University Press, 1999. [SNJ95] Searching for Common Sequence Patterns Among Distantly Related Proteins, Suyama, M., Nishioka, T. and Oda, J., Protein Eng., Vol. 8, 1995, pp. 1075-1080. [SV96] A Double Combinatorial Approach to Discovering Patterns in Biological Sequences, Sagot, M. F. and Viari, A., In Proceedings of the 7th Symposium on Combinatorial Pattern Matching, 1996, pp. 186-208. [SW81] Identification of Common Molecular Subsequences, Smith, T. F. and Waterman, M. S., J. Mol. Biol., Vol. 147, 1981, pp. 195-197.
[TK99] Systematic Detection of Protein Structural Motifs, Tomii, K. and Kanehisa, M., in Pattern Discovery in Biomolecular Data, Oxford University Press, 1999. [U85] Finding approximate patterns in strings, Ukkonen, E., J. Algorithms, Vol. 6, 1985, pp. 132-137. [WB95] Identification of Protein Motifs Using Conserved Amino Acid Properties and Partitioning Techniques, Wu, T. D. and Brutlag, D. L., In Proceedings of the 3rd International Conference on Intelligent Systems for Molecular Biology, AAAI Press, Menlo Park, CA, 1995, pp. 402-410. [WGA84] Pattern Recognition in Several Sequences: Consensus and Alignment, Waterman, M. S., Galas, D. J. and Arratia, R., Bull. Math. Biol., Vol. 46, 1984, pp. 515-527. [WMRSSCWZ99] Pattern Discovery and Classification in Biosequences, Wang, J. T. L., Marr, T. G., Rozen, S., Shasha, D., Shapiro, B. A., Chirn, G. W., Wang, Z. and Zhang, K., in Pattern Discovery in Biomolecular Data, Oxford University Press, 1999. [WMSSC94] Discovering Active Motifs in Sets of Related Protein Sequences and Using Them for Classification, Wang, J., Marr, T. G., Shasha, D., Shapiro, B. A. and Chirn, G., Nucleic Acids Res., Vol. 22, 1994, pp. 2769-2775. [YCHKLZ99] Overview: A System for Tracking and Managing the Results from Sequence Comparison Programs, Yee, D. P., Cushing, J. B., Hunkapiller, T., Kutter, E., Laird, J. and Zucker, F., in Pattern Discovery in Biomolecular Data, Oxford University Press, 1999. On Divide-and-Conquer [W97] Tighter Bounds on the Solution of a Divide-and-Conquer Maximin Recurrence, Wang, B. F., Journal of Algorithms, Vol. 23, 1997, pp. 329-344. [W2000] Tight Bounds on the Solutions of Multidimensional Divide-and-Conquer Maximin Recurrences, Wang, B. F., Theoretical Computer Science, Vol. 242, 2000, pp. 377-401. On Books [BS89] The Human Revolution: Behavioural and Biological Perspectives on the Origins of Modern Humans, Brauer, G., Edinburgh Univ. Press, Edinburgh, 1989, pp. 123-154.
[CC2001] Multidimensional Scaling, Cox, T. F. and Cox, M. A. A., Chapman & Hall/CRC, ISBN: 1584880945, pp. 1-308. [D2001] Genomic Regulatory System Development and Evolution, Davidson, E. H., Academic Press, 2001. [DEKM98] Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids, Durbin, R., Eddy, S. R., Krogh, A. and Mitchison, G., Cambridge University Press, 1998. [E93] Cluster Analysis, Everitt, B. S., Edward Arnold, ISBN: 0 470 220430, 1993, pp. 1-170. [G97] Algorithms on Strings, Trees, and Sequences, Gusfield, D., Cambridge University Press, 1997. [GJ79] Computers and Intractability: A Guide to the Theory of NP-Completeness, Garey, M. R. and Johnson, D. S., Freeman, New York, 1979. [GLS88] Geometric Algorithms and Combinatorial Optimization, Grotschel, M., Lovasz, L. and Schrijver, A., Springer, Berlin, 1988. [KM95] Combinatorial Algorithms for DNA Sequence Assembly, Kececioglu, J. D. and Myers, E. W., Algorithmica, Vol. 13, pp. 7-51, 1995. [KW78] Multidimensional Scaling, Kruskal, J. B. and Wish, M., Sage University Paper Series on Quantitative Applications in the Social Sciences, 07-011. Beverly Hills and London: Sage Publications, 1978. [KW91] Evolution of Life: Fossils, Molecules and Culture, Kocher, T. D. and Wilson, A. C., Springer-Verlag, Tokyo, 1991, pp. 391-413. [L96] Genomic Diversity and Molecular Phylogeny of Human and Simian T-Cell Lymphotropic Viruses, Liu, H. F., Katholieke Universiteit Leuven (Faculty of Medicine, Department of Microbiology and Immunology, Rega Institute for Medical Research), 1996, pp. 1-1099. [P2000] Computational Molecular Biology: An Algorithmic Approach, Pevzner, P. A., MIT Press, 2000. [SK83] Time Warps, String Edits, and Macromolecules: the Theory and Practice of Sequence Comparison, Sankoff, D. and Kruskal, J. B., Addison-Wesley, 1983. [SM97] Introduction to Computational Molecular Biology, Setubal, J. C. and Meidanis, J., PWS Pub., 1997. [SO90] Molecular Systematics, Swofford, D. L. and Olsen, G.
J., Sinauer, Sunderland, MA, 1990, pp. 411-501. [SR81] Biometry, Sokal, R. R. and Rohlf, F. J., Freeman, New York, 1981. [W95] Introduction to Computational Biology, Waterman, M. S., Chapman & Hall, 1995. [WSS99] Pattern Discovery in Biomolecular Data: Tools, Techniques, and Applications, Wang, J. T. L., Shapiro, B. A. and Shasha, D., Oxford University Press, 1999. On Sorting by Reversal [BH96] Fast Sorting by Reversal, Berman, P. and Hannenhalli, S., Proceedings of 7th Annual Symposium on Combinatorial Pattern Matching, Lecture Notes in Computer Science, 1996. [BK] On Some Tighter Inapproximability Results, Berman, P. and Karpinski, M., DIMACS Technical Report 99-23. [BMY2001] A Linear-Time Algorithm for Computing Inversion Distance between Signed Permutations with an Experimental Study, Bader, D., Moret, B. and Yan, M., In Proceedings of the 7th Workshop on Algorithms and Data Structures (WADS 01), 2001. [BP93] Genome Rearrangements and Sorting by Reversals, Bafna, V. and Pevzner, P., Proc. 34th FOCS, IEEE, 1993, pp. 148-157. [C97] Sorting by Reversals is Difficult, Caprara, A., Proceedings of the First Annual International Conference on Computational Molecular Biology (RECOMB’97), ACM Press, 1997. [C98] A 3/2-Approximation Algorithm for Sorting by Reversals, Christie, D. A., in Proceedings of the 9th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 98), ACM Press, 1998, pp. 244-252. [CLN99] Sorting Permutations by Reversals through Branch-and-Price, Caprara, A., Lancia, G. and Ng, S. K., Bioinformatics, 1999, pp. 1-32. [CLN2000] Fast Practical Solution of Sorting by Reversals, Caprara, A., Lancia, G. and Ng, S. K., Proceedings of the 11th ACM-SIAM Symposium on Discrete Algorithms, ACM Press, 2000, pp. 12-21. [H81] The NP-Completeness of Some Edge-Partition Problems, Holyer, I., SIAM Journal on Computing, Vol. 10, 1981, pp. 713-717. [KS95] Exact and Approximation Algorithms for Sorting by Reversals, Kececioglu, J. and Sankoff, D., Algorithmica, Vol.
13, 1995, pp. 180-210. [WDM98] Reversal and Transposition Distance of Linear Chromosomes, Walter, M. E. M. T., Dias, Z. and Meidanis, J., String Processing and Information Retrieval: A South American Symposium (SPIRE 98), 1998, pp. 1-7. On Visual Display [BW] Model Simplification Through Refinement, Brodsky, D. and Watson, B., Department of Computer Science, University of British Columbia. [CA74] Nonlinear Intrinsic Dimensionality Computations, Chen, C. K. and Andrews, H. C., IEEE Trans. Comput., Vol. C-23, Feb. 1974, pp. 178-184. [CC] Discrete Multi-Dimensional Scaling, Clouse, D. S. and Cottrell, G. W., Computer Science & Engineering 0114, University of California, San Diego. [CL71] Multivariate Data Analysis, Cooley, W. W. and Lohnes, P., New York: Wiley, 1971. [CL73] A Heuristic Relaxation Method for Nonlinear Mapping in Cluster Analysis, Chang, C. L. and Lee, R. C. T., IEEE Trans. on Systems, Man and Cybernetics, Vol. SMC-3, Mar. 1973, pp. 197-200. [F36] The Use of Multiple Measurements in Taxonomic Problems, Fisher, R. A., Ann. Eugen., Vol. 7, 1936, pp. 178-188. [F99] On the Use of Self-Organizing Maps for Clustering and Visualization, Flexer, A., The Austrian Research Institute for Artificial Intelligence, 1999, pp. 1-16. [GPRTWW2000] A Continuous Clustering Method for Vector Fields, Garcke, H., Preußer, T., Rumpf, M., Telea, A., Weikard, U. and Wijk, J. V., IEEE Visualization, pp. 1-9. [GR69] Minimum Spanning Trees and Single Linkage Cluster Analysis, Gower, J. C. and Ross, G. J. S., Appl. Stat., Vol. 18, No. 1, 1969, pp. 54-64. [H72] Direct Clustering of a Data Matrix, Hartigan, J. A., J. Amer. Stat. Assoc., Vol. 67, No. 337, Mar. 1972, pp. 123-129. [HK2001] Clustering Spatial Data Using Random Walks, Harel, D. and Koren, Y., ACM, 2001, pp. 1-6. [HR2001] Stochastic Neighbor Embedding, Hinton, G. and Roweis, S., Department of Computer Science, University of Toronto, 2001, pp. 1-8.
[HV2001] A Data Set Oriented Approach for Clustering Algorithm Selection, Halkidi, M. and Vazirgiannis, M., Department of Informatics, Athens University of Economics & Business, 2001, pp. 1-12. [JW2002] Applied Multivariate Statistical Analysis, Johnson, R. A. and Wichern, D. W., Prentice-Hall, Inc, 2002, pp. 1-767. [K64a] Multidimensional Scaling by Optimizing Goodness of Fit to a Nonmetric Hypothesis, Kruskal, J. B., Psychometrika, Vol. 29, No. 1, 1964, pp. 1-27. [K64b] Non-Metric Multidimensional Scaling: A Numerical Method, Kruskal, J. B., Psychometrika, Vol. 29, No. 2, 1964, pp. 115-129. [K68] A Course in Multivariate Analysis, Kendall, M. G., New York: Hafner, 1968. [KNTCW2001] Analysis and Visualization of Gene Expression Data Using Self-Organizing Maps, Kaski, S., Nikkila, J., Toronen, P., Castren, E. and Wong, G., Helsinki University of Technology, Neural Networks Research Centre; University of Kuopio, A. I. Virtanen-institute, 2001, pp. 1-5. [LSB76] A Triangulation Method for the Sequential Mapping of Points from N-Space to Two-Space, Lee, R. C. T., Slagle, J. R. and Blum, H., IEEE Trans. Comput., March 1977, pp. 288-292. [LSH77] A Triangulation Method for the Sequential Mapping of Points from N-Space to Two-Space, Lee, R. C. T., Slagle, J. R. and Blum, H., IEEE Transactions on Computers, 1977, pp. 288-292. [MT] Optimal Dilations for Metric Multidimensional Scaling, Malone, S. W. and Trosset, M. W., Department of Mathematics, Duke University. [N71] Problem Solving Methods in Artificial Intelligence, Nilsson, N. J., McGraw-Hill, 1971. [ND2000] Interactive Data Exploration Using MDS Mapping, Naud, A. and Duch, W., Department of Computer Methods, Nicolaus Copernicus University, 2000, pp. 1-6. [P57] Shortest Connection Networks and Some Generalizations, Prim, R. C., Bell Syst. Tech. J., Nov. 1957, pp. 1389-1401. [S64] The Analysis of Proximities: Multidimensional Scaling with an Unknown Distance Function, Shepard, R. N., Psychometrika, Vol. 27, 1962, pp.
125-139, 219-246.
[S69] A Nonlinear Mapping for Data Structure Analysis, Sammon, J. W. Jr., IEEE Trans. Comput., Vol. C-18, May 1969, pp. 401-409.
[S71] Artificial Intelligence: A Heuristic Programming Approach, Slagle, J. R., New York: McGraw-Hill, 1971.
[S80] Multidimensional Scaling, Tree-Fitting, and Clustering, Shepard, R. N., Science, Vol. 210, No. 4468, 1980, pp. 390-398.
[SC66] Parametric Representation of Nonlinear Data Structure, Shepard, R. and Carroll, J., P. R. Krishnaiah, Ed., New York: Academic Press, 1966.
[SES] Cluster Analysis and Its Applications to Gene Expression Data, Sharan, R., Elkon, R. and Shamir, R., School of Computer Science, Tel-Aviv University, Tel-Aviv 69978, Israel, pp. 1-28.
[SL71] Applications of Game Tree Searching Techniques to Sequential Pattern Recognition, Slagle, J. R. and Lee, R. C. T., Commun. Assoc. Comput. Mach., Vol. 14, No. 2, Feb. 1971, pp. 103-110.
[XOX2002] Clustering Gene Expression Data Using a Graph-Theoretic Approach: an Application of Minimum Spanning Tree, Xu, Y., Olman, V. and Xu, D., Bioinformatics, Vol. 18, No. 4, 2002, pp. 536-545.
[Y87] Multidimensional Scaling: History, Theory and Applications, Young, F. W., Hamer, R. M. (ed), Hillsdale, NJ: Lawrence Erlbaum, 1987.
[YH38] Discussion of a Set of Points in Terms of Their Mutual Distances, Young, G. and Householder, A. S., Psychometrika, Vol. 3, 1938, pp.19-22.
[Z71] Graph-Theoretical Methods for Detecting and Describing Gestalt Clusters, Zahn, C. T., IEEE Trans. Comput., Vol. C-20, Jan. 1971, pp. 68-86.
[ZKK] Texture Mapping Using Surface Flattening via Multi-Dimensional Scaling, Zigelman, G., Kimmel, R. and Kiryati, N., Department of Computer Science, Technion, Haifa 32000, Israel.

On LCS

[BG2002] Sparse Dynamic Programming for Longest Common Subsequence from Fragments, Baker, B. S. and Giancarlo, R., Journal of Algorithms, Vol. 42, 2002, pp. 231-254.
[BVM2001] Experimenting an Approximation Algorithm for the LCS, Bonizzoni, P., Vedova, G. D.
and Mauri, G., Discrete Applied Mathematics, Vol. 110, No. 1, 2001, pp.13-24.
[HS97] A Fast Algorithm for Computing Longest Common Subsequences, Hunt, J. W. and Szymanski, T. G., Comm. ACM, Vol. 20, 1977, pp. 350-353.
[J82] A Priority Queue in Which Initialization and Queue Operations Take O(loglogD) time, Johnson, D. B., Math. Systems Theory, Vol. 15, 1982, pp. 295-309.

On Classification of Protein Folds

[FMT94] An Algorithm for Automatically Generating Protein Topology Cartoons, Flores T.P., Moss D.S. and Thornton J.M., Protein Engineering, vol.7, no.1, 1994, pp.31-37.
[HOSTV92] A Database of Protein Structure Families with Common Folding Motifs, Holm L., Ouzounis C., Sander C., Tuparev G. and Vriend G., Protein Science, vol.1, 1992, pp.1691-1698.
[MBHC95] SCOP: a structural classification of proteins database for the investigation of sequences and structures, Murzin A. G., Brenner S. E., Hubbard T., Chothia C., J. Mol. Biol., Vol. 247, 1995, pp.536-540.
[O94] Classification of protein folds, Orengo C., Current Opinion in Structural Biology, Vol.4, 1994, pp.429-440.
[OFTT93] Identifying and Classifying Protein Fold Families, Orengo C.A., Flores T.P., Taylor W. R. and Thornton J.M., Protein Engineering, vol.6, 1993, pp.485-500.

On Structure Alignment

[AM88] A Simple Qualitative Representation of Polypeptide Chain Folds: Comparison of Protein Tertiary Structures, Abagyan R.A. and Maiorov V.N., J. Biomol. Struct. Dynam., Vol.5, No. 6, 1988, pp.1267-1279.
[BCPS90] Identification of Protein Folds: Matching Hydrophobicity Patterns of Sequence Sets with Solvent Accessibility Patterns of Known Structures, Bowie J.U., Clarke N.D., Pabo C.O. and Sauer R.T., Proteins: Struct. Funct. Genet., Vol.7, 1990, pp.257-264.
[BS87] Evaluation and Improvements in the Automatic Alignment of Protein Sequences, Barton, G.J. and Sternberg M.J.E., Protein Eng., Vol.1, No.2, 1987, pp.89-94.
[CL87] The Evolution of Protein Structures, Chothia C. and Lesk A.M., Cold Spring Harbor Symp. Quant.
Biol., 1987, pp.399-405.
[FMS90] Alignment of Protein Sequence using Secondary Structure: A Modified Dynamic Programming Method, Fischel-Ghodsian F., Mathiowitz G. and Smith T.F., Protein Eng., Vol.3, No.7, 1990, pp.577-581.
[JSB89] Phylogenetic Relationships from Three Dimensional Protein Structures, Johnson M.S., Sali A., Blundell T.L., Methods Enzymol., Vol.183, 1989, pp.670-690.
[KS83] Dictionary of Protein Secondary Structure: Pattern Recognition of Hydrogen-bonded and Geometrical Features, Kabsch W. and Sander C., Biopolymers, Vol.22, 1983, pp.2577-2637.
[M84] A Fast Method of Comparing Protein Structures, Murthy M.R.N., FEBS Lett., Vol. 168, 1984, pp.97-102.
[MARW89] Use of Techniques Derived from Graph Theory to Compare Secondary Structure Motifs in Proteins, Mitchell E.M., Artymiuk P.J., Rice D.W. and Willett P., J. Mol. Biol., Vol. 212, 1989, pp.151-166.
[NW92] A General Method Applicable to the Search for Similarities in the Amino Acid Sequence of Two Proteins, Needleman S.B. and Wunsch C. D., Journal of Molecular Biology, vol.48, 1970, pp.443-453.
[OBT92] Fast Structure Alignment for Protein Databank Searching, Orengo C.A., Brown N.P. and Taylor W.R., Proteins, vol.14, 1992, pp.139-167.
[OT90] A Rapid Method of protein structure alignment, Orengo C. A. and Taylor W. R., J. Theor. Biol., vol.147, pp.517-551.
[OT93] A Local Alignment Method for Protein Structure Motifs, Orengo C. A. and Taylor W. R., Journal of Molecular Biology, vol.233, No. 3, 1993, pp.488-497.
[P70] Development of Crystallographic Enzymology, Phillips D.C., Biochem. Soc. Symp., Vol.31, 1970, pp.11-28.
[R77] Beta-Sheet Topology and The Relatedness of Proteins, Richardson J.S., Nature, Vol. 268, 1977, pp.495-500.
[R81] The Anatomy and Taxonomy of Protein Structure, Richardson J. S., Adv. Protein Chem., Vol. 34, 1981, pp.167-339.
[RK88] Identification of Structural Motifs from Protein Coordinate Data: Secondary Structure and First Level Supersecondary Structure, Richards F. M.
and Kundrot C. E., Proteins: Struct. Funct. Genet., Vol.3, 1988, pp.71-84.
[RT91] Visualization of Structural Similarity in Proteins, Rippmann F. and Taylor W.R., J. Mol. Graph., Vol.2, 1982, pp.371-374.
[SB90] The Definition of General Topological Equivalence in Protein Structures: A Procedure Involving Comparison of Properties and Relationships Through Simulated Annealing and Dynamic Programming, Sali A. and Blundell T.L., J. Mol. Biol., Vol. 212, 1990, pp.403-428.
[SB] Protein Structure Alignment: A Comparison of Methods, Singh, A. P., Brutlag, D. L., Stanford University.
[TO89a] Protein structure alignment, Taylor W. R. and Orengo C. A., Journal of Molecular Biology, vol. 208, No. 1, 1989, pp.1-22.
[TO89b] A Holistic Approach to Protein Structure Alignment, Taylor W. R. and Orengo C. A., Protein Eng., Vol.2, No.7, pp.505-519.
[ZS89] The Alignment of Protein Structures in Three Dimensions, Zuker M. and Somorjai R. L., Bull. Math. Biol., Vol.51, No. 1, 1989, pp.55-78.
[GMB96] Surprising Similarities in Structure Comparison, Gibrat, J-F., Madej, T. and Bryant, S.H., Current Opinion in Structural Biology, Vol.6, 1996, pp.377-385.
[MGB] Threading a database of protein cores, Madej, T., Gibrat, J-F., and Bryant, S.H., Proteins: Struct. Funct. Genet., Vol.23, 1995, pp.356-369.
[HS93] Protein structure comparison by alignment of distance matrices, Holm, L. and Sander, C., Journal of Molecular Biology, Vol. 233, 1993, pp.123-138.

On Physical Mapping

[95] 11 Papers Presented at DIMACS Workshop (Combinatorial Methods in DNA Mapping and Sequencing), Special Issue of Journal of Computational Biology, Vol. 2, 1995.
[2001] A Physical Map of the Human Genome, Nature, Vol. 409, 15 Feb., 2001, pp.934-941.
[AJLTY91] Linear Approximation of Shortest Superstrings, Blum, A., Jiang, T., Li, M., Tromp, J. and Yannakakis, M., Proceedings of the 23rd ACM Symposium on Theory of Computing, 1991, pp. 328-336.
[AKNW93] Physical Mapping of Chromosomes: A Combinatorial Problem in Molecular Biology, Alizadeh, F., Karp, R. M., Newberg, L. A. and Weisser, D. K., Proceedings of the 4th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA '93), 1993, pp. 371-381.
[AKR93] Cutting Down on Fill Using Nested Dissection: Provably Good Elimination Orderings, Aggarwal, A., Klein, P. and Ravi, R., Springer, 1993, pp. 31-55.
[AKWZ94] Physical Mapping of Chromosomes Using Unique Probes, Alizadeh, F., Karp, R. M., Weisser, D. K. and Zweig, G., In Proc. Fifth Annual ACM-SIAM Symp. on Discrete Algorithms (SODA), ACM Press, 1994, pp. 489-500.
[AM96] On Physical Mapping and the Consecutive Ones Property for Sparse Matrices, Atkins and Middendorf, DAMATH: Discrete Applied Mathematics and Combinatorial Operations Research and Computer Science, 1996, Vol. 71.
[AS95] Improved Length Bounds for the Shortest Superstring Problem, Armen, C. and Stein, C., Proc. 4th Workshop on Algorithms and Data Structures (WADS95), Lecture Notes in Computer Science, Vol. 955, 1995, pp. 494-505.
[AS96] A 2+2/3-Approximation Algorithm for the Shortest Superstring Problem, Armen, C. and Stein, C., Proc. 7th Annual Symposium on Combinatorial Pattern Matching (CPM96), Lecture Notes in Computer Science, Vol. 1075, 1996, pp. 87-101.
[B59] On the Topology of the Genetic Fine Structure, Benzer, S., PNAS, Vol. 45, 1959, pp. 1607-1620.
[B62] The Fine Structure of the Gene, Benzer, S., Scientific American, Vol. 206, 1962, pp. 70-84.
[B74] A Characterization of Rigid Circuit Graphs, Buneman, P., Discrete Math., Vol. 9, 1974, pp. 205-212.
[B88] Construction of restriction maps, Bellon, B., CABIOS, Vol. 4, 1988, pp. 111-115.
[BDC91] Theoretical Analysis of a Physical Mapping Strategy Using Random Single-Copy Landmarks, Barillot E., Dausset J., and Cohen D., PNAS, Vol. 88, 1991, pp. 3917-3921.
[BFH94] Beyond NP-Completeness for Problems of Bounded Width: Hardness for the W Hierarchy (Extended Abstract), Bodlaender, H.
L., Fellows, M. R. and Hallet, M. T., In Proceedings of the 26th Annual ACM Symposium on the Theory of Computing, ACM Press, New York, 1994, pp. 449-458.
[BI99] Physical Mapping with Repeated Probes: The Hypergraph Superstring Problem, Batzoglou S. and Istrail S., Lecture Notes in Computer Science, 1999, Vol. 1645, pp. 66-??.
[BJLTY94] Linear Approximation of Shortest Superstrings, (Greedy: 4-approximation, Modified Greedy: 3-approximation), Blum, A., Jiang, T., Li, M., Tromp, J., and Yannakakis, M., Journal of the ACM, Vol. 41, 1994, pp. 630-647.
[BL76] Testing for the Consecutive Ones Property, Interval Graphs, and Planarity Using PQ-Tree Algorithms, Booth K. S. and Lueker G. S., J. Comput. Sys. Sci., 1976, 13:335-379.
[BLC91] Theoretical Analysis of Library Screening Using a N-Dimensional Pooling Strategy, Barillot E., Lacroix B., and Cohen D., NAR, Vol.19, No. 22, 1991, pp. 6241-6247.
[BP93] An Introduction to Chordal Graphs and Clique Trees, Blair, J. R. and Peyton, B., Springer, 1993, pp. 1-29.
[BSPGCW90] Optimizing Restriction Fragment Fingerprinting Methods for Ordering Large Genomic Libraries, Branscomb E., Slezak T., Pae R., Galas D., Carrano A.V., and Waterman M., Genomics, Vol. 8, 1990, pp. 351-366.
[CBPKM90] Radiation Hybrid Mapping: A Somatic Cell Genetic Method for Constructing High-Resolution Maps of Mammalian Chromosomes, Cox, D. R., Burmeister, M., Price, R., Kim, S., and Myers, R. M., Science, Vol. 250, 1990, pp. 245-250.
[CJKMR97] A Branch-and-Cut Approach to Physical Mapping with End-Probes, Christof, T., Jünger, M., Kececioglu, J., Mutzel, P. and Reinelt, G., Journal of Computational Biology, 1997, Vol. 4, pp. 433-447.
[C88] Mapping Our Genes - the Genome Projects, How Big? How Fast?, Congress, U. S., Technical Report OTA-BA-373, Office of Technology Assessment, Washington, D.C., 1988.
[C90] Orchestrating the Human Genome Project, Cantor, C. R., Science, Vol. 248, 1990, pp. 49-51.
[CH92] Reconstructing Sequences from Shotgun Data, Cull, P. and Holloway, J., Manuscript, 1992.
[CM2001] Algorithms for Large-Scale DNA Sequencing, Cerqueira, F. R. and Meidanis, J., Institute of Computing, University of Campinas, 2001.
[CNHZL90] Ordering of Cosmid Clones Covering the Herpes Simplex Virus Type I (HSV-I) Genome: A Test Case for Fingerprinting by Hybridization, Craig, A. G., Nizetic, D., Hoheisel, D., Zehetner, G., and Lehrach, H., NAR, Vol. 18, 1990, pp. 2653-2660.
[CSBK86] Toward a Physical Map of the Genome of the Nematode Caenorhabditis Elegans, Coulson, A., Sulston, J., Brenner, S., and Karn, J., PNAS, Vol. 83, 1986, pp. 7821-7825.
[D88] The Human Genome Project, DeLisi, C., The American Scientist, Vol. 76, 1988, pp. 488-493.
[D88] Computers in Molecular Biology: Current Applications and Emerging Trends, DeLisi, C., Science, Vol. 240, 1988, pp. 47-52.
[DB84] An efficient program to construct restriction maps from experimental data with realistic error levels, Durand, R., and Bregegere, F., Nucleic Acids Res., Vol. 12, 1984, pp. 703-716.
[DK88] Errors between sites in restriction site mapping, Dix, T. I., and Kieronska, D. H., CABIOS, Vol. 4, 1988, pp. 117-122.
[DLS92] An Efficient Algorithm for the All Pairs Suffix-Prefix Problem, Gusfield, D., Landau, G. and Schieber, B., Information Processing Letters, Vol. 41, 1992, pp. 181-185.
[EY85] Addendum: Simple Linear-Time Algorithms to Test Chordality of Graphs, Test Acyclicity of Hypergraphs, and Selectively Reduce Acyclic Hypergraphs, Tarjan, R. E. and Yannakakis, M., SIAM J. Computing, Vol. 14, 1985, pp. 254-255.
[F85] Interval Orders and Interval Graphs, Fishburn, P., Wiley, New York, 1985.
[FHW93] DNA Physical Mapping: Three Ways Difficult, Fellows, M. R., Hallet, M. T. and Wareham, H. T., In Proc. First European Symp. on Algorithms (ESA ’93), Springer, LNCS 726, pp. 157-168.
[FSR83] Mapping the order of DNA restriction fragments, Fitch, W.
M., Smith, T. F., and Ralph, W. W., Gene, Vol. 22, 1983, pp. 19-29.
[FSW2002] Rearrangement of DNA Fragments: a Branch-and-Cut Algorithm, Ferreira, C. E., de Souza, C. C. and Wakabayashi, Y., Discrete Applied Mathematics, Vol. 116, 2002, pp.161-177.
[FTA92] On the Design of Genome Mapping Experiments Using Short Synthetic Oligonucleotides, Fu Y.-x., Timberlake W. E., and Arnold J., Biometrics, Vol. 48, 1992, pp. 337-359.
[G74] An Algorithm for Testing Chordality of Graphs, Gavril, F., Inf. Pro. Letts., Vol. 3, 1974, pp. 110-112.
[G80] Algorithmic Graph Theory and Perfect Graphs, Golumbic M. C., Academic Press, New York, 1980.
[G85] Interval Graphs and Related Topics, Golumbic, M. C., Discrete Math., Vol. 55, 1985, pp. 113-121.
[GGKS93] Three Strikes Against Physical Mapping of DNA (Extended Abstract), Goldberg, P. W., Golumbic, M. C., Kaplan, H. and Shamir, R., Technical report, Institute of Computer Science, Tel Aviv University, 1993.
[GKS94] On the Complexity of DNA Physical Mapping, Golumbic, M. C., Kaplan, H. and Shamir, R., Advances in Applied Mathematics, Vol. 15, 1994, pp. 251-261.
[GH64] A Characterization of Comparability Graphs and of Interval Graphs, Gilmore P. C. and Hoffman A. J., Canad. J. Math., 1964, 16:539-548.
[GI95] Physical Mappings by STS Hybridization: Algorithmic Strategies and the Challenge of Software Evaluation, Greenberg, D. S. and Istrail, S., Technical report, Sandia National Labs, 1995.
[GI95] Physical Mapping by STS Hybridization: Algorithmic Strategies and the Challenge of Software Evaluation, Greenberg D. S. and Istrail S., J. Comp. Biol., 1995, Vol. 2, pp. 219-273.
[GKS95] Graph Sandwich Problems, Golumbic, M. C., Kaplan, H. and Shamir, R., Journal of Algorithms, Vol. 19, 1995, pp. 449-473.
[GM90] Mapping DNA by stochastic relaxation: a new approach to fragment sizes, Grigorjev, A. V., and Mironov, A. A., CABIOS, Vol. 6, 1990, pp. 107-111.
[GM91] Mapping DNA by stochastic relaxation: a schedule for optimal annealing, Grigorjev, A. V., and Mironov, A. A., J. DNA Mapping and Sequencing, Vol. 1, 1991, pp. 221-226.
[GMS80] On Finding Minimal Length Superstrings, Gallant J., Maier D., and Storer J., Journal of Computer and System Sciences, Vol. 20, 1980, pp. 50-58.
[GO90] Chromosomal Region of the Cystic Fibrosis Gene in Yeast Artificial Chromosomes: A Model for Human Genome Mapping, Green, E. D., and Olson, M. V., Science, Vol. 250, 1990, pp. 94-98.
[GW87] Mapping DNA by Stochastic Relaxation, Goldstein L. and Waterman M. S., Advances in Applied Mathematics, Vol. 8, 1987, pp. 194-207.
[H92] A Simple Test for Interval Graphs, Hsu, W.-L., In W-L. Hsu and R. C. T. Lee, editors, Proc. 18th Int. Workshop (WG’92), Graph-Theoretic Concepts in Computer Science, Springer-Verlag, 1992, LNCS 657, pp. 11-16.
[H95] An STS-Based Map of the Human Genome, Hudson, T. J. et al., Science, Vol. 270, 1995, pp. 1945-1954.
[HAY90] Restriction site mapping for three or more enzymes, Ho, S. T. S., Allison, L., and Yee, C. N., CABIOS, Vol. 6, 1990, pp. 195-204.
[HM91] Substitution Decomposition on Chordal Graphs and Applications, Hsu, W-L. and Ma, T-H, Proc. 2nd Int. Symp. on Algorithms (ISA ’91), Springer-Verlag, 1991, pp. 52-60.
[HS2000] A Clustering Algorithm Based on Graph Connectivity, Hartuv, E. and Shamir, R., Information Processing Letters, Elsevier, Vol. 76, 2000, pp. 175-181.
[K88] Algorithms for the restriction site mapping of DNA molecules, Krawczak, M., Proc. Nat. Acad. Sci. USA, Vol. 85, 1988, pp. 7298-7301.
[K91] Exact and Approximation Algorithms for DNA Sequence Reconstruction, Kececioglu, J. D., Ph.D. Thesis, University of Arizona, Tucson, 1991.
[K93] Mapping of the Genome: Some Combinatorial Problems Arising in Molecular Biology, Karp, R. M., SODA, 1993, pp.
278-285.
[KGRRG95] A Radiation Hybrid Map Spanning the Entire Human X Chromosome Integrating YACs, Genes and STS Markers, Kumlien J., Grigoriev A., Roest-Crollius H., Ross M., Goodfellow P., and Lehrach H., Technical Report, MPI, 1995.
[KM89] An Incremental Linear Time Algorithm for Recognizing Interval Graphs, Korte, N. and Mohring, R. H., SIAM J. Computing, Vol. 18, 1989, pp. 68-81.
[KS93] Pathwidth, Bandwidth and Completion Problems to Proper Interval Graphs with Small Cliques, Kaplan, H. and Shamir, R., Technical Report, CS Department, Tel Aviv University, 1993.
[KS94] On the complexity of DNA physical mapping, Kaplan, H. and Shamir, R., Advances in Applied Mathematics, Vol. 15, 1994, pp. 251-261.
[KS94] Tractability of Parameterized Completion Problems on Chordal and Interval Graphs: Minimum Fill-in and Physical Mapping, Kaplan, H., Shamir, R., Proceedings of the 35th Symposium on Foundations of Computer Science, 1994, pp. 780-791.
[KS96] Bounded Degree Interval Sandwich Problems, Kaplan H. and Shamir R., Technical Report, CS Dept., Tel Aviv University, 1996.
[KS96] Physical Maps and Interval Sandwich Problems: Bounded Degrees, Kaplan H. and Shamir R., In Proc. ISTCS, 1996.
[KST96] Tractability of Parameterized Completion Problems on Chordal, Strongly Chordal and Proper Interval Graphs, Kaplan H., Shamir R. and Tarjan R. E., Technical Report, CS Dept., Tel Aviv University, 1996.
[L90] Towards a DNA Sequencing Theory, Li, M., Proceedings of the 31st IEEE Symposium on Foundations of Computer Science, 1990, pp. 125-134.
[L90] High-Resolution Mapping of Human Chromosome 11 by in Situ Hybridization with Cosmid Clones, Lichter, P. et al., Science, Vol. 247, 1990, pp. 64-69.
[LB62] Representation of a Finite Graph by a Set of Intervals on the Real Line, Lekkerkerker C. G. and Boland J.Ch., Fundam. Math., 1962, 51:45-64.
[LW88] Genomic Mapping by Fingerprinting Random Clones: A Mathematical Analysis, Lander E. S. and Waterman M. S., Genomics, Vol. 2, 1988, pp.
231-239.
[M67] An Algorithm for Reconstructing Protein and RNA Sequences, Shapiro, M. B., Journal of the Association for Computing Machinery, Vol. 14, 1967, pp. 720-731.
[M85] Algorithmic Aspects of Comparability Graphs and Interval Graphs, Mohring, R. H., In I. Rival, Editor, Graphs and Order, Reidel, Dordrecht, 1985, pp. 41-101.
[M91] Mapping Irradiation Hybrids to Cosmid and Yeast Artificial Chromosome Libraries by Direct Hybridization to Alu-PCR Products, Monaco A. P. et al., Nucleic Acids Res., Vol. 19, 1991, pp. 3315-3318.
[NMS84] Plasmid mapping computer program, Nolan, G. P., Maina, C. V., and Szalay, A. A., Nucleic Acids Res., Vol. 12, 1984, pp. 717-729.
[ODGBHFMSF86] Random-Clone Strategy for Genomic Restriction Mapping in Yeast, Olson, M. V., Dutchik, J. E., Graham, M. Y., Brodeur, G. M., Helms, C., Frank, M., MacCollin, M., Scheinman, R., and Frank, T., PNAS, Vol. 83, 1986, pp.7826-7830.
[OHCB85] A Common Language for Physical Mapping of the Human Genome, Olson, M. V., Hood, L., Cantor, C. and Botstein, D., Science, Vol. 234, 1985, pp.1434-1435.
[P82] Automatic construction of restriction site maps, Pearson, W., Nucleic Acids Res., Vol. 10, 1982, pp. 217-227.
[P90] DNA physical mapping, Pevzner, P. A., Computer Analysis of Genetic Texts, Nauka, Moscow, 1990, pp. 154-188.
[P95] DNA Physical Mapping and Alternating Eulerian Cycles in Colored Graphs, Pevzner, P. A., Algorithmica, 1995, Vol. 13, pp. 77-105.
[PC93] Efficient Constructions of High-Resolution Physical Maps from Yeast Artificial Chromosomes Using Radiation Hybrids: Inner Product Mapping, Perlin, M. W., Chakravarti, A., Genomics, Vol. 18, 1993, pp. 283-289.
[PDDFFHNEQZSJR95] Rapid Construction of Integrated Maps Using Inner Product Mapping: YAC Coverage of Human Chromosome 11, Perlin, M. W., Duggan, D. J., Davis, K., Farr, J. E., Findler, R. B., Higgins, M. J., Nowak, N. J., Evans, G. A., Qin, S., Zhang, J., Shows, T. B., James, M. R., and Richard III, C. W., Genomics, Vol.
28, 1995, pp. 315-327.
[PDO84] PMAP, PMAPS: DNA physical map construction programs, Polner, G., Dorgai, L., and Orosz, L., Nucleic Acids Res., Vol. 12, 1984, pp. 227-236.
[PM87] An efficient method for physical mapping of DNA molecules, Pevzner, P. A., and Mironov, A. A., Molek. Biol., Vol. 21, 1987, pp. 788-796.
[PS95] Interval Graphs with Side (and Size) Constraints, Pe’er, I. and Shamir, R., Proc. of the Third Annual European Symp. on Algorithms (ESA 95), Corfu, Greece, Springer, 1995, pp. 142-154.
[PS95] Realizing Interval Graphs with Size and Distance Constraints, Pe’er, I. and Shamir, R., Technical Report, Computer Science Dept., 1995.
[PS95] Satisfiability Problems on Intervals and Unit Intervals, Pe’er, I. and Shamir, R., Technical report, Computer Science Dept., 1995.
[PSU1984] SEQAID: A DNA Sequence Assembling Program Based on a Mathematical Model, Peltola, H., Soderlund, H. and Ukkonen, E., Nucleic Acids Research, Vol. 12, 1984, pp.307-321.
[R69] Indifference Graphs, Roberts F. S., In F. Harary, Editor, Proof Techniques in Graph Theory, Academic Press, New York, 1969, pp. 139-146.
[R70] Triangulated Graphs and The Elimination Process, Rose, D. J., J. Math. Anal. Appl., Vol. 32, 1970, pp. 597-609.
[R76] Discrete Mathematical Models, with Applications to Social, Biological and Environmental Problems, Roberts F. S., Prentice-Hall, Englewood Cliffs, New Jersey, 1976.
[R92] Challenges in the Human Genome Project, Robbins, R. J., IEEE Eng. Med. Biol., Vol. 11, No. 1, 1992, pp. 25-34.
[R93] Clone Ordering by Simulated Annealing: the Application to the STS-Content Map of Chromosome 21, Rigault, P., Proc. of the Second International Conference on Bioinformatics, Supercomputing and Complex Genome Analysis, World Scientific, 1993, pp. 169-183.
[R95] Random Subcloning, Roach, J. C., Technical Report, Dept. of Molecular Biology, University of Washington, 1995.
[RTL76] Algorithmic Aspects of Vertex Elimination of Graphs, Rose, D. J., Tarjan, R. E. and Lueker, G.
S., SIAM J. Computing, Vol. 5, 1976, pp. 266-283.
[S78] Inferring DNA structure from segmentation data, Stefik, M., Artificial Intelligence, Vol. 11, 1978, pp. 85-114.
[S79] A Strategy of DNA Sequencing Employing Computer Programs, Staden, R., Nucleic Acids Research, Vol. 6, No. 7, 1979, pp. 2601-2610.
[S99] A 2+1/2-Approximation Algorithm for Shortest Superstring, Sweedyk, Z., SIAM Journal on Computing, Vol. 29, 1999, pp. 954-986.
[SB95] A Single Ataxia Telangiectasia Gene with a Product Similar to PI-3 kinase, Savitsky, K., Bar-Shira, A., et al., Science, Vol. 268, 1995, pp. 1749-1753.
[SBAC91] Humpty an Automated Contig Assembly and Spanning Algorithm for DNA Mapping, Slezak T., Branscomb, E., Ashworth, L. and Carrano, A. V., Technical Report, Biomedical Science Division, Lawrence Livermore Nat. Laboratory, 1991.
[SD95] SAM: a System for Iteratively Building Marker Maps, Soderlund, C., and Dunham, I., Technical report, The Sanger Centre, Cambridge, UK, 1995.
[SESKC87] A Physical Map of the Escherichia Coli K12 Genome, Smith, C. L., Econome, J. G., Schutt, A., Klco, S., and Cantor, C. R., Science, Vol. 236, 1987, pp. 1448-1453.
[T89] Approximation Algorithms for the Shortest Common Superstring Problem, Turner, J., Inform. and Comput., Vol. 83, 1989, pp. 1-20.
[TU77] A Greedy Approximation Algorithm for Constructing Shortest Common Superstrings, Tarhio, J. and Ukkonen, E., Theoretical Computer Science, Vol. 57, 1977, pp. 131-145.
[TY84] Simple Linear-Time Algorithms to Test Chordality of Graphs, Test Acyclicity of Hypergraphs, and Selectively Reduce Acyclic Hypergraphs, Tarjan, R. E. and Yannakakis, M., SIAM J. Computing, Vol. 13, 1984, pp. 566-579.
[TY93] Approximating Shortest Superstrings, (2+8/9-approximation), Teng, S. H. and Yao, F., FOCS 1993, pp. 158-165.
[U1990] A Linear-Time Algorithm for Finding Approximate Shortest Common Superstrings, Ukkonen, E., Algorithmica, Vol. 5, 1990, pp. 313-323.
[W85] Construction of Linkage Maps with DNA Markers for Human Chromosomes, White, R., Nature, Vol. 313, 1985, pp. 101-105.
[W90] The Human Genome Project: Past, Present and Future, Watson, J. D., Science, Vol. 248, 1990, pp. 44-49.
[WG86] Interval Graphs and Maps of DNA, Waterman M. S. and Griggs J. R., Bull. Math. Biol., Vol. 48, 1986, pp. 189-195.
[WJLTY94] Linear Approximation of Shortest Superstrings, Blum, A., Jiang, T., Li, M., Tromp, J., and Yannakakis, M., Journal of the ACM, Vol. 41, 1994, pp. 630-647.
[Y91] Restriction site mapping in CLP, Yap, R. H. C., Proceedings of the 8th International Conference on Logic Programming, MIT Press, Cambridge, MA, 1991, pp. 521-534.

On Nearest Neighbor Search

[A2002] Hierarchical Subspace Sampling: A Unified Framework for High Dimensional Data Reduction, Selectivity Estimation and Nearest Neighbor Search, Aggarwal, C. C., ACM SIGMOD, 2002, pp.452-463.
[AMNSW98] An Optimal Algorithm for Approximate Nearest Neighbor Searching in Fixed Dimensions, Arya, S., Mount, D. M., Netanyahu, N. S., Silverman, R., Wu, A. Y., Journal of ACM, Vol. 45, No. 6, 1998, pp.891-923.
[BKKS2000] Indexing the Solution Space: A New Technique for Nearest Neighbor Search in High-Dimensional Space, Berchtold, S., Keim, D. A., Kriegel, H. P., and Seidl, T., IEEE Transactions on Knowledge and Data Engineering, Vol. 12, No. 1, 2000, pp. 45-57.
[BOR99] Lower bounds for high dimensional nearest neighbor search and related problems, Borodin, A., Ostrovsky, R. and Rabani, Y., STOC’99, 1999, pp.312-321.
[C83] Fast algorithms for the all nearest neighbors problem, Clarkson, K. L., In Proc. 24th Ann. IEEE Sympos. on the Found. Comput. Sci., 1983, pp.339-349.
[C88] A randomized algorithm for closest-point queries, Clarkson, K. L., SIAM Journal on Computing, Vol. 17, No. 4, 1988, pp.830-847.
[C94] An algorithm for approximate closest-point queries, Clarkson, K. L., In Proc. 10th Annu. ACM Sympos. Comput. Geom., 1994, pp.160-164.
[C2001] Toward Optimal ε-Approximate Nearest Neighbor Algorithms, Cary, M., Journal of Algorithms, Vol. 41, 2001, pp.417-428.
[CHF2001] Fast algorithm for nearest neighbor search based on a lower bound tree, Chen, Y. S., Hung, Y. P. and Fuh, C. S., IEEE Computer Vision, Vol. 1, 2001, pp. 446-453.
[CHF2002] Fast semi-local alignment for DNA sequence database search, Chen, Y. S., Hung, Y. P., Fuh, C. S., Proceedings of the 16th International Conference on Pattern Recognition, Vol. 3, 2002, pp.1019-1022.
[CS2000] Nearest neighbor search using additive binary tree, Cha, S. H. and Srihari, S. N., IEEE Computer Vision and Pattern Recognition, Vol. 1, 2000, pp.782-787.
[CZPC2002] An efficient indexing method for nearest neighbor searches in high-dimensional image databases, Cha, G. H., Zhu, X., Petkovic, D., and Chung, C. W., IEEE Transactions on Multimedia, Vol. 4, No. 1, 2002, pp.76-87.
[CS2002] A fast nearest neighbor search algorithm by filtration, Cha, S. H. and Srihari, S. N., Pattern Recognition, Vol. 35, 2002, pp.515-525.
[JMF99] Data Clustering: A Review, Jain, A. K., Murty, M. N., and Flynn, P. J., ACM Computing Surveys, Vol. 31, No. 3, 1999, pp.264-323.
[KR2002] Finding nearest neighbors in growth-restricted metrics, Karger, D. R. and Ruhl, M., STOC’02, 2002, pp.741-750.
[LK2002] An efficient nearest neighbor search in high-dimensional data spaces, Lee, D. H. and Kim, H. J., Information Processing Letters, Vol. 81, 2002, pp.239-246.
[NN97] A simple algorithm for nearest neighbor search in high dimensions, Nene, S. A. and Nayar, S. K., IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 19, No. 9, 1997, pp.989-1003.
[RP99] Fast nearest-neighbor search based on Voronoi projections and its application to vector quantization encoding, Ramasubramanian, V. and Paliwal, K. K., IEEE Transactions on Speech and Audio Processing, Vol. 7, No. 2, 1999, pp.221-226.
[TW2002] Adaptive approximate nearest neighbor search for fractal image compression, Tong, C. S. and Wong, M., IEEE Transactions on Image Processing, Vol. 11, No. 6, 2002, pp.605-615.

On NP-Complete Problems and Approximation Algorithms

[ALMSS98] Proof Verification and the Hardness of Approximation Problems (Prove NP-Complete Problem), Arora, S., Lund, C., Motwani, R., Sudan, M. and Szegedy, M., Journal of the ACM, Vol. 45, No. 3, May 1998, pp. 501-555.
[AS95] Improved Length Bounds for the Shortest Superstring Problem (Shortest Common Superstring: 2 3/4-Approximation), Armen, C. and Stein, C., in Proceedings 4th International Workshop on Algorithms and Data Structures, Lecture Notes in Comput. Sci., Vol. 955, 1995, pp. 494-505.
[AS96] A 2 2/3 Approximation Algorithm for the Shortest Superstring Problem, Armen, C. and Stein, C., in Proceedings Combinatorial Pattern Matching, Lecture Notes in Comput. Sci., Vol. 1075, 1996, pp. 87-101.
[AS98] 2 2/3 Superstring Approximation Algorithm, Armen, C. and Stein, C., Discrete Applied Mathematics, Vol. 88, No. 1-3, Nov. 9, 1998, pp. 29-57.
[BFH94] Beyond NP-Completeness for Problems of Bounded Width: Hardness for the W Hierarchy (Extended Abstract), Bodlaender, H. L., Fellows, M. R. and Hallet, M. T., In Proceedings of the 26th Annual ACM Symposium on the Theory of Computing, ACM Press, New York, 1994, pp. 449-458.
[BJJ97] Rotations of Periodic Strings and Short Superstrings (2.596-Approximation), Breslauer, D., Jiang, T. and Jiang, Z., J. Algorithms, Vol. 24, No. 2, August, 1997, pp. 340-353.
[BJLTY94] Linear Approximation of Shortest Superstrings, (Greedy: 4-approximation, Modified Greedy: 3-approximation), Blum, A., Jiang, T., Li, M., Tromp, J., and Yannakakis, M., Journal of the ACM, Vol. 41, 1994, pp. 630-647.
[BL98] Protein Folding in the Hydrophobic-Hydrophilic (HP) Model is NP-Complete (Prove NP-Complete Problem), Berger, B. and Leighton, T., Journal of Computational Biology, Vol. 5, No.
1, 1998, pp. 27-40.
[BP93] Genome Rearrangements and Sorting by Reversals (Approximation Algorithm), Bafna, V. and Pevzner, P., Proc. 34th FOCS, IEEE, 1993, pp. 148-157.
[BV2001] The Complexity of Multiple Sequence Alignment with SP-Score that is a Metric (Prove NP-Complete Problem), Bonizzoni, P. and Vedova, G. D., Theoretical Computer Science, Vol. 259, 2001, pp. 63-79.
[C97] Sorting by Reversals is Difficult (Prove NP-Complete Problem), Caprara A., Proceedings of the First Annual International Conference on Computational Molecular Biology (RECOMB’97), ACM Press, 1997.
[C97] Sorting Permutations by Reversals and Eulerian Cycle Decompositions (Prove NP-Complete Problem), Caprara A., to appear in SIAM Journal on Discrete Mathematics, April 1997, pp. 1-23.
[C99] Formulations and Hardness of Multiple Sorting by Reversals (Prove NP-Complete Problem), Caprara A., ACM, 1999, pp. 84-93.
[FSW2002] Rearrangement of DNA Fragments: a Branch-and-Cut Algorithm (Prove NP-Complete Problem), Ferreira, C. E., de Souza, C. C. and Wakabayashi, Y., Discrete Applied Mathematics, Vol. 116, 2002, pp. 161-177.
[GGP2001] The Complexity of Gene Placement (Prove NP-Complete Problem), Goldberg, L. A., Goldberg, P. W. and Paterson, M., Journal of Algorithms, Vol. 41, 2001, pp.225-243.
[GMS80] On Finding Minimal Length Superstrings (Prove NP-Hard Problem), Gallant, J., Maier, D. and Storer, J. A., Journal of Computer and System Sciences, Vol. 20, 1980, pp. 50-58.
[GW87] Mapping DNA by Stochastic Relaxation (Prove NP-Complete Problem of Double Digest Problem), Goldstein, L., and Waterman, M. S., Adv. in Appl. Math., Vol. 8, 1987, pp. 194-207.
[H81] The NP-Completeness of Some Edge-Partition Problems, Holyer, I., SIAM Journal on Computing, Vol. 10, 1981, pp. 713-717.
[K72] Reducibility among Combinatorial Problems (Prove NP-Complete Problem), Karp R. M., Complexity of Computer Computations, Plenum, New York, 1972, pp. 85-103.
[KE95] Combinatorial Algorithms for DNA Sequence Assembly, Kececioglu, J. D.
and Myers, W. E., Algorithmica, Vol. 13, 1995, 7-51. [L94] The Protein Threading Problem with Sequence Amino Acid Interaction Preferences is NP-Complete(Prove NP-Complete Problem), Lathrop, R. H., Protein Engineering, Vol. 7, 1994, pp.1059-1068. [LCJW2001] The Longest Common Subsequence Problem for Sequences with Nested Arc Annotations(Prove NP-Complete Problem), Lin, G., Chen, Z. Z., Jiang, T. and Wen, J., Journal of Computer and System Sciences, Vol. 65, 2002, pp. 465-480. [LP2000] Pseudoknots in RNA Secondary Structure(Prove NP-Complete Problem), LyngsØ, R. B. and Pedersen, C. N. S., ACM(RECOM 2000 Tokyo Japan), 2000, pp. 201-209. [MS77] A Note on the Complexity of the Superstring Problem(Prove NP-Complete Problem), Maier, D. and Storer, J. A., Technical Report 233, Dept. of Electrical Engineering and Computer, 1977. [PS98] The Median Problems for Breakpoints are NP-complete(Prove NP-Complete Problem), Pe’er, I. and Shamir, R., Supported by Eshkol Scholarship from the Ministry of Science and Technology, Israel, Nov. 1998, pp. 1-15. [PW2002] Protein Design is NP-Hard(Prove NP-Complete Problem), Pierce, N. A. and Winfree, E., Protein Engineering, Vol. 15, No. 10, 2002, pp. 779-782. [S77] NP-Completeness Results Concerning Data Compression(Prove NP-Complete Problem), Storer, J. A., Technical Report 233, Dept. of Electrical Engineering and Computer Science, Princeton University, Princeton, N. J., 1977. [S99] A 2 1/2 Approximation Algorithm for Shortest Superstring, Sweedyk, Z., SIAM J. on Computing, Vol. 29, No. 3, 1999, pp. 954-986. [SSK96] Steiner Points in the Space of Genome Rearrangements, Sankoff, D., 72 Sundaram, G. and Kececioglu, J., in International Journal of Foundations of Computer Science, Vol. 7, No. 1, Jan. 1996, pp. 1-9. [TY93] Approximating Shortest Superstrings, Teng, S. and Yao, F., Proc. 34th Annual IEEE Symposium on Foundations of Computer Science, IEEE Computer Society Press, Los Alamitos, CA, 1993, pp.158-165. 
[UM93] Finding the Lowest Free Energy Conformation of a Protein is an NP-Hard Problem: Proof and Implications (Prove NP-Complete Problem), Unger, R. and Moult, J., Bull. Math. Biol., Vol. 55, 1993, pp. 1183-1198.
[W95] A Simplified Proof of the NP- and MAX SNP-Hardness of Multiple Sequence Tree Alignments (Prove NP-Complete Problem), Wareham, H. T., J. Comput. Biol., Vol. 2, No. 4, 1995, pp. 509-514.