Spring 2002

Bud Mishra's Research

Prof. Bud Mishra is a professor of computer science and mathematics whose most recent research is at the interface of computer science and biology. He has developed several sophisticated algorithms and statistical analysis tools to attack biological problems that range from deciphering the genomes of pathogens (E. coli, P. falciparum, etc.) to understanding chromosomal aberrations that are implicated in cancer. This work will eventually lead to diagnostics, therapies, vaccines and drugs for various infectious and genetic diseases.

His most recent focus has been on a bioinformatics environment that will make it easier for biologists to develop their own computational tools. This environment, dubbed Valis, includes tools for sophisticated visualization of biological information, design and simulation of in silico experiments, and storage and communication of biological information. Very recently, Prof. Mishra and his colleagues have used Valis to develop a set of sophisticated tools that can validate, and find errors in, recently released human genomic sequence data.

The following papers describe novel microarray-based tools and algorithms to find amplifications and homozygous deletions of chromosomal regions. The method can ultimately be used to find oncogenes and tumor-suppressor genes implicated in cancer. Since microarrays are based on hybridization, a rather noisy process adversely affected by cross-hybridization, hybridization failure, background noise, etc., great care needs to be taken in extracting information from microarray data. We used error-correcting codes in designing the biological experiments so that one can recover meaningful information in the face of occasional experimental failures. The papers below are joint work with members of the Wigler Lab at Cold Spring Harbor Laboratory and the Norton Lab of Sloan-Kettering.
"Comparing Genomes," Special issue on "Biocomputation," Computing in Science and Engineering, 4(1):42-49, January/February 2002.
"Placing Probes along the Genome using Pair-wise Distance Data," (with W. Casey and M. Wigler), Algorithms in Bioinformatics, First International Workshop, WABI 2001 Proceedings, LNCS 2149:52-68, Springer-Verlag, 2001.
"Detecting Gene Copy Number Fluctuations in Tumor Cells by Microarray Analysis of Genomic Representations," (with R. Lucito et al.), Genome Research, 10(11):1726-1736, 2000.

The following papers describe single-molecule based methods to make ordered restriction maps of small clones (cosmids), medium-sized clones (BACs) and whole genomes. They raise many statistically and algorithmically interesting problems. Almost all the problems arising here are NP-complete, and yet if the parameters of the underlying biological experiments are chosen properly, the problems admit efficient polynomial-time probabilistic algorithms that yield correct and highly accurate answers almost surely. Several of the papers deal with careful mathematical modeling of the underlying experiments, so that proper experimental design could be carried out along with the design of efficient algorithms. All the algorithms have been implemented by our group and are routinely used by chemists and biologists with no training in computer science. Interestingly, several computer science researchers have also attacked these problems; none of the other algorithms has computed correct answers on a significant proportion of experimental data sets, largely because they fail to model the dependence between the experimental errors and the algorithmic structure!

All the results reported here are joint work with a colleague, Thomas Anantharaman, graduate students, and members of Wisconsin's Schwartz Lab. Many of the applications are joint work with the MIT Whitehead Institute's Page Lab, the Celera/TIGR group headed by Venter, Wisconsin's Blattner Lab and others.
"Mapping the Genome One Molecule at a Time -- Optical Mapping," (with A.H. Samad et al.), Nature, 378:516-517, 1995.
"Optical Mapping and Its Potential for Large-Scale Sequencing Projects," (with C. Aston and D.C. Schwartz), Trends in Biotechnology, 17:297-302, 1999.
"Optical Mapping," Encyclopedia of the Human Genome, Nature Publishing Group, Macmillan Publishers Limited, London, UK, 2002.

The following papers explain the mathematical underpinnings of optical mapping and its application to many areas of genomics.
"Genomics via Optical Mapping I: Probabilistic Analysis of Optical Mapping Models," (with T.S. Anantharaman), 2001.
"Genomics via Optical Mapping II: Ordered Restriction Maps," (with T.S. Anantharaman and D.C. Schwartz), Journal of Computational Biology, 4(2):91-118, 1997.
"Genomics via Optical Mapping III: Contiging Genomic DNA and Variations," (with T.S. Anantharaman and D.C. Schwartz), Proceedings 7th Intl. Conf. on Intelligent Systems for Molecular Biology: ISMB '99, 7:18-27, AAAI Press, 1999.
"Genomics via Optical Mapping IV: Sequence Validation via Optical Map Matching," (with M. Antoniotti, T. Anantharaman and S. Paxia), Submitted, 2001.
"A Probabilistic Analysis of False Positives in Optical Map Alignment and Validation," (with T.S. Anantharaman), Algorithms in Bioinformatics, First International Workshop, WABI 2001 Proceedings, LNCS 2149:27-40, Springer-Verlag, 2001.
"Partitioning Single-Molecule Maps into Multiple Populations: Algorithms And Probabilistic Analysis," (with L. Parida), Discrete Applied Mathematics (The Computational Molecular Biology Series), 104(1-3):203-227, August 2000.

Optical Mapping was used to map the Y chromosome, one of the hardest chromosomes to decipher. The Y does not recombine with other chromosomes and seems to have a very complex structure in terms of the organization of its genes. In addition, our method helped in understanding the DAZ locus, where a deletion causes male infertility in one man in a thousand.
"Optical Mapping of BAC Clones from the Human Y Chromosome DAZ Locus," (with J. Giacalone et al.), Genome Research, 10(9):1421-1429, 2000.

Optical Mapping was used to understand the genome of a micro-organism that can survive very high doses of radiation (up to 50,000 Gy). Even if radiation or dehydration breaks its DNA, it can repair the damage and come back to life!
"Whole Genome Shotgun Optical Mapping of Deinococcus radiodurans," (with J. Lin et al.), Science, 285(5433):1558-1562, 1999.

Optical Mapping was used to understand the genome of the chloroquine-resistant malaria parasite, which kills about 2 million young children every year and affects some 500 million people annually.
"Optical Mapping of Plasmodium falciparum Chromosome 2," (with J. Jing et al.), Genome Research, 9:175-181, 1999.
"A Shotgun Sequence-Ready Optical Map of the Whole Plasmodium falciparum Genome," (with Z. Lai et al.), Nature Genetics, 23(3):309-313, 1999.

Optical Mapping was used to assemble the sequences of various strains of E. coli and Y. pestis. The paper on the Y. pestis map has been submitted for journal publication.
"Shotgun Optical Maps of the Whole Escherichia coli O157:H7 Genome," (with A. Lim et al.), Genome Research, 11(9):1584-1593, 2001.

Miscellaneous papers related to Optical Mapping and single-molecule methods.
"Optical PCR: Genomic Analysis by Long-Range PCR and Optical Mapping," (with J. Skiadas et al.), Mammalian Genome, 10:1005-1009, 1999.
"Automated High Resolution Optical Mapping Using Arrayed, Fluid Fixated, DNA Molecules," (with J. Jing et al.), Proc. National Academy of Sciences, 95:8046-8051, 1998.
"High Resolution Restriction Maps of Bacterial Artificial Chromosomes Constructed by Optical Mapping," (with W. Cai et al.), Proc. National Academy of Sciences, 95:3390-3395, 1998.

An advanced textbook aimed at mathematically sophisticated biologists, and an advanced undergraduate-level textbook for biology students.
Algorithmic Biology, in Courant Lecture Notes Series, 2000 (Tentative).
Bioinformatics, (with P. Benfey and A. Protopapas), Prentice Hall, 2002 (Tentative).