AJ_BIBLIO - Department of Computer Science • NJIT

Bibliography List:
General References:
• Baxevanis, A.D. and Ouellette, B.F.F., eds. 1998. “Bioinformatics, A Practical Guide to the
Analysis of Genes and Proteins”. John Wiley and Sons, Inc., NY.
• Brenner, S., Lewitter, F., Patterson, M., and Handel, M., eds. 1998. “Trends Guide to
Bioinformatics”. Elsevier Science.
• Waterman, M.S. 1995. “Introduction to Computational Biology: Maps, sequences and
genomes”. Chapman Hall.
• Salzberg, S.L., Searls, D.B., and Kasif, S., eds. 1998. “Computational Methods in
Molecular Biology”. Elsevier.
• Durbin, R., Eddy, S., Krogh, A. and Mitchison, G. 1998. “Biological Sequence Analysis:
Probabilistic models of proteins and nucleic acids”. CUP.
• Alberts, B., Bray, D., Lewis, J., Raff, M., Roberts, K. and Watson, J.D., eds. 1994.
“Molecular Biology of the Cell”. Garland Publishing.
• Baldi, P. and Brunak, S. 1998. “Bioinformatics: The machine learning approach”. MIT Press.
• Rashidi, H. H. and Buehler, L. K. 2000. “Bioinformatics Basics: Applications in Biological
Science and Medicine”. CRC Press.
Week 1: Foundations of Molecular Biology
• Alberts, B., Bray, D., Lewis, J., Raff, M., Roberts, K. and Watson, J.D., eds. 1994.
“Molecular Biology of the Cell”. Garland Publishing.
• Rubin, G.M., et al. 2000. “Comparative genomics of the eukaryotes”. Science 287:2204-2215.
• Schafer, A.J. and Hawkins, J. R. 1998. “DNA variation and the future of human genetics”
Nature Biotechnology 16:33-39.
Week 2: Modern Biochemical Techniques
Gene Sequence Analysis:
• Sterky, F. and Lundeberg, J. 2000. “Sequence analysis of genes and genomes”. Journal
of Biotechnology 76:1-31.
• Hunkapiller, T. et al. 1991. “Large-scale and automated DNA sequence determination”
Science 254:59-67.
Week 3: Problems in Computational Molecular Biology
• Salzberg. Chapter 1. “Grand challenges in computational biology”.
• Tsoka, S. and Ouzounis, C.A. 2000. “Recent developments and future directions in
computational genomics”. FEBS Letters 480:42-48.
• Delisi, C. 1988. “Computers in molecular biology: current applications and emerging
trends” Science 240:47-52.
• Howard, K. July 2000. “The bioinformatics gold rush” Scientific American pp 58-63.
• Koonin, E.V. 1999. “The emerging paradigm and open problems in comparative
genomics” Bioinformatics 15(4):265-266.
• Clark, M.S. 1999. “Comparative genomics: the key to understanding the Human Genome
project” Bioessays 21:121-130.
• Searls, D. B. 2000. “Using bioinformatics in gene and drug discovery” Drug Discovery
Today 5(4):135-143.
• Boguski, M.S. 1999. “Biosequence exegesis”. Science 286:453-455.
General:
• Nowak, R. 1995. “Entering the postgenome era”. Science 270:368-371.
• Kahn, P. 1995. “From genome to proteome”. Science 270:369-370.
Week 4: Statistical Preliminaries
Week 5: General Computational Search Methods
General:
• Chapter 2, Salzberg. “A tutorial introduction to computation for biologists”.
Data Structure and Algorithms

Techniques: Dynamic Programming
• Hillier, F.S. & Lieberman, G.J., 1995, “Introduction to Operations Research”, 6th ed.,
McGraw Hill (Chapter 10, pp 424-469).
• Gusfield, D., 1997, “Algorithms on Strings, Trees, and Sequences – Computer Science
and Computational Biology”, CUP (Chapter 11, pp 215-253).
• Bertsekas, D. P., 1995, “Dynamic Programming and Optimal Control”, Athena Scientific
(Chapter 1, pp 1-49).
Techniques: Hidden Markov Models
• Krogh, A. et al. 1994. “Hidden Markov Models in Computational Biology: applications
to protein modeling”. J. Mol. Biol. 235:1501-1531.
• Forney, G.D. 1973. “The Viterbi Algorithm”. Proceedings of the IEEE 61(3):268-278.
• Rabiner, L.R. and Juang, B.H. Jan 1986. “An Introduction to Hidden Markov Models”.
IEEE ASSP Magazine pp 1-16.
• Durbin et al. (Chapter 3)
• Baldi, P. et al. 1994. “Hidden Markov Models of biological primary sequence
information”. Proc. Natl. Acad. Sci. USA 91:1059-1063.
• Eddy, S.R. 1996. “Hidden Markov Models” Current Opinion in Structural Biology
6:361-365.
• Asai, K., Hayamizu, S. and Handa, H. 1993. “Prediction of protein secondary structure
by the hidden Markov model” Computer Applications in the Biosciences (CABIOS)
9(2):141-146.
Techniques: Pattern Recognition
• Stormo, G.D. and Hartzell, G.W. 1989. “Identifying protein-binding sites from unaligned
DNA fragments”. Proc. Natl. Acad. Sci. USA 86:1183-1187.
• Smith, R.F. and Smith, T.F. 1990. “Automatic generation of primary sequence patterns
from sets of related protein sequences”. Proc. Natl. Acad. Sci. USA 87:118-122.
• Lawrence, C.E. et al. 1993. “Detecting subtle sequence signals: a Gibbs sampling
strategy for multiple alignment”. Science 262:208-214.
• Galas, D.J., Eggert, M. and Waterman, M.S. 1985. “Rigorous pattern-recognition
methods for DNA sequences: Analysis of promoter sequences from Escherichia coli” J.
Mol. Biol. 186:117-128.
• Smith, H.O., Annau, T.M. and Chandrasegaran, S. 1990. “Finding sequence motifs in
groups of functionally related proteins”. Proc. Natl. Acad. Sci. USA 87:826-830.
Week 6: Sequence Comparisons & Alignment
Substitution Matrices
• Altschul, S.F. 1991. “Amino acid substitution matrices from an information theoretic
perspective”. Journal of Molecular Biology 219:555-565.
• Dayhoff, M.O., Schwartz, R.M., Orcutt, B.C. 1978. “A model of evolutionary change in
proteins”. In “Atlas of Protein Sequence and Structure, vol. 5, suppl. 3”, M.O. Dayhoff
(ed.), pp. 345-352, Natl. Biomed. Res. Found., Washington.
• Henikoff, S. and Henikoff, J.G. 1992. “Amino acid substitution matrices from protein
blocks”, Proc. Nat. Acad. Sci. USA 89:10915-10919.
• States, D.J., Gish, W., Altschul, S.F. 1991. “Improved Sensitivity of Nucleic Acid
Database Searches Using Application Specific Scoring Matrices”, Methods: A companion
to Methods in Enzymology 3:66-70.
• Thompson, J.D., Higgins, D.G. and Gibson, T.J. 1994. “Improved sensitivity of profile
searches through the use of sequence weights and gap excision” CABIOS 10(1):19-29.
Sequence Alignment
• Needleman, S.B., Wunsch, C.D. 1970. “A general method applicable to the search for
similarities in the amino acid sequences of two proteins”, Journal of Molecular Biology
48:443-453.
• Smith, T.F. & Waterman, M.S. 1981. “Identification of common molecular
subsequences.” J. Mol. Biol. 147:195-197.
• Gotoh, O. 1982. “An improved algorithm for matching biological sequences”. J. Mol.
Biol. 162:705-708.
• Pearson, W.R. & Lipman, D.J. 1988. “Improved tools for biological sequence
comparison.” Proc. Natl. Acad. Sci. USA 85:2444-2448.
• Johnson, M.S., Overington, J.P. 1993. “A Structural Basis of Sequence Comparisons: An
evaluation of scoring methodologies”, J. Mol. Biol. 233:716-738.
• Waterman, M.S., Eggert, M. 1987. “A new algorithm for best subsequence alignments
with applications to tRNA-rRNA comparisons”, J. Mol. Biol. 197:723-728.
• Brenner, S. E., Chothia, C. and Hubbard, T.J.P. 1998. “Assessing sequence comparison
methods with reliable structurally identified evolutionary relationships”, Proc. Natl.
Acad. Sci. USA 95:6073-6078.
• Henikoff, S. 1996. “Scores for sequence searches and alignments”. Curr. Op. Struct.
Biol. 6:353-360.
• Pearson, W.R. 1991. “Searching Protein Sequence Libraries: Comparison of the
Sensitivity and Selectivity of the Smith Waterman and FASTA algorithms”. Genomics
11:635-650.
• Myers, E.W. and Miller, W. 1988. “Optimal alignments in linear space” CABIOS
4(1):11-17.
• Chao, K., Pearson, W.R. and Miller, W. 1992. “Aligning two sequences within a specified
diagonal band” CABIOS 8(5):481-487.
• Barton, G.J. 1993. “An efficient algorithm to locate all locally optimal alignments
between two sequences allowing for gaps” CABIOS 9(6):729-734.
• Huang, X. and Zhang, J. 1996. “Methods for comparing a DNA sequence with a protein
sequence” CABIOS 12(6):497-506.
• Pearson, W. R. 1996. “Effective Protein Sequence Comparison” Methods in Enzymology
266:227-258.
The statistics of sequence alignment
• Altschul, S.F. 1991. “Amino Acid Substitution Matrices from an Information Theoretic
Perspective”, J. Mol. Biol. 219:555-565.
• Karlin, S. and Altschul, S. F. 1990. “Methods for assessing the statistical significance of
molecular sequence features by using general scoring schemes”, Proceedings of the
National Academy of Sciences USA 87:2264-2268.
• Karlin, S., Dembo, A. and Kawabata, T. 1990. “Statistical composition of high-scoring
segments from molecular sequences”. The Annals of Statistics 18(2):571-581.
• Dembo, A. and Karlin, S. 1991. “Strong limit theorems of empirical functionals for large
exceedances of partial sums of I.I.D. variables”. The Annals of Probability
19(4):1737-1755.
• Karlin, S. and Altschul, S. F. 1993. “Applications and statistics for multiple high-scoring
segments in molecular sequences”. Proc. Nat. Acad. Sci. USA 90:5873-5877.
• Sjolander K. et al. 1996. “Dirichlet mixtures: a method for improved detection of weak
but significant protein sequence homology” CABIOS 12(4):327-345.
• Altschul, S.F. and Gish, W. 1996. “Local Alignment Statistics”. Methods in Enzymology
266:460-480.
Database Alignment Tools & Searches: BLAST, FASTA, PSI-BLAST
• Altschul, S.F., Gish, W., Miller, W., Myers, E.W., Lipman, D.J. 1990. “Basic local
alignment search tool”, Journal of Molecular Biology 215:403-410.
• Pearson, W.R., Lipman, D.J. 1988. “Improved tools for biological sequence comparison”,
Proceedings of the National Academy of Sciences USA 85:2444-2448.
• Altschul, S.F., Madden, T.L., Schäffer, A.A., Zhang, J., Zhang, Z., Miller, W. & Lipman,
D.J. 1997. “Gapped BLAST and PSI-BLAST: a new generation of protein database
search programs”. Nucleic Acids Res. 25:3389-3402.
• Altschul, S.F. and Koonin, E.V. 1998. “Iterated profile searches with PSI-BLAST – a
tool for discovery in protein databases”. TIBS 23:444-447.
• Aravind, L. and Koonin, E.V. 1999. “Gleaning non-trivial structural and evolutionary
information about proteins by iterative database searches”. J. Mol. Biol. 287:1023-1040.
Multiple Sequence Alignment
• Lipman, D.J., Altschul, S.F., Kececioglu, J.D. 1989. “A tool for multiple sequence
alignment”, Proceedings of the National Academy of Sciences USA 86:4412-4415.
• Barton, G.J., Sternberg, M.J. 1987. “A strategy for the rapid multiple alignment of
protein sequences. Confidence levels from tertiary structure comparisons”, Journal of
Molecular Biology 198:327-337.
• Carrillo, H. and Lipman, D. 1988. “The multiple sequence alignment problem in biology”
SIAM J. Appl. Math. 48(5):1073-1082.
• Higgins, D.G., Bleasby, A.J. and Fuchs, R. 1992. “CLUSTAL V: improved software for
multiple sequence alignment” CABIOS 8(2):189-191.
• Hirosawa, M. et al. 1993. “MASCOT: multiple alignment system for protein sequences
based on three-way dynamic programming” CABIOS 9(2):161-167.
• Gotoh, O. 1993. “Optimal alignment between groups of sequences and its application to
multiple sequence alignment” CABIOS 9(3):361-370.
• Kim, J., Pramanik, S. and Chung, M.J. 1994. “Multiple sequence alignment using
simulated annealing” CABIOS 10(4):419-426.
• Stormo, G.D., Hartzell, G.W. III 1989. “Identifying protein-binding sites from unaligned
DNA fragments”. Proceedings of the National Academy of Sciences USA 86:1183-1187.
• Galas, D.J., Eggert, M., Waterman, M.S. 1985. “Rigorous pattern-recognition methods for
DNA sequences. Analysis of promoter sequences from Escherichia coli”. Journal of
Molecular Biology 186:117-128.
• Lawrence, C.E., Altschul, S.F., Boguski, M.S., Liu, J.S., Neuwald, A.F., Wootton, J.C.
1993. “Detecting Subtle Sequence signals: A Gibbs Sampling Strategy for Multiple
Alignment.” Science 262:208-214.
• Schuler, G.D., Altschul, S.F., Lipman, D.J. 1991. “A workbench for multiple alignment
construction and analysis”. Proteins 9:180-190.
Sequence Profiles
• Smith, R.F., Smith, T. F. 1990. “Automatic generation of primary sequence patterns from
sets of related protein sequences”. Proc. Nat. Acad. Sci. USA 87:118-122.
• Gribskov, M., McLachlan, A.D., Eisenberg, D. 1987. “Profile analysis: detection of
distantly related proteins”. Proc. Nat. Acad. Sci. USA 84:4355-4358.
Sequences and Evolutionary Trees
• Doolittle, R. 1981. “Similar amino acid sequences: chance or common ancestry?” Science
214:149-159.
• Henikoff, S. and Greene E.A. 1997. “Gene families: the taxonomy of protein paralogs
and chimeras” Science 278:609-614.
• Lake, J.A. 1994. “Reconstructing evolutionary trees from DNA and protein sequences:
paralinear distances” Proc. Natl. Acad. Sci. USA 91:1455-1459.
Weeks 7 & 8: Gene Finding and Sequence Annotation
SNPs:
• Buetow, K.H., Edmonson, M.N. and Cassidy, A.B. 1999. “Reliable identification of
large numbers of candidate SNPs from public EST data” Nature Genetics 21:323-325.
• Cargill, M. et al. 1999. “Characterization of single-nucleotide polymorphisms in coding
regions of human genes” Nature Genetics 22:231-238.
Clustering and ESTs:
• Chou, A. and Burke, J. 1999. “CRAWview: for viewing splicing variation, gene families,
and polymorphisms in clusters of ESTs and full-length sequences” Bioinformatics
15(5):376-381.
Finding consensus patterns:
• Hertz, G.Z., Hartzell, III, G.W. and Stormo, G.D. 1990. “Identification of consensus
patterns in unaligned DNA sequences known to be functionally related” CABIOS
6(2):81-92.
Week 9: Shotgun global sequence alignment
Weeks 10 & 11: Gene expression analysis
Gene Chips:
• Gullans, S.R. 2000. “Of microarrays and meandering data points” Nature Genetics 26:4-5.
• Brazma, A. and Vilo, J. 2000. “Gene expression data analysis”. FEBS Letters 480:17-24.
• Gerhold, D., Rushmore, T. and Caskey, C.T. 1999. “DNA chips: promising toys become
powerful tools”. TIBS 24:168-173.
• Knight, J. 2001. “When the chips are down.” Nature 410:860-861.
• Hamadeh, H. and Afshari, C.A. 2000. “Gene chips and functional genomics” American
Scientist 88:508-515.
• Celis J.E. et al. 2000. “Gene expression profiling: monitoring transcription and
translation products using DNA microarrays and proteomics”. FEBS Letters 480:2-16.
• Ermolaeva, O. et al. 1998. “Data management and analysis for gene expression arrays”
Nature Genetics 20:19-23.
• Khan, J. et al. 1999. “DNA microarray technology: the anticipated impact on the study of
human disease” Biochimica et Biophysica Acta 1423:M17-M28.
• Ramsay, G. 1998. “DNA chips: state of the art” Nature Biotechnology 16:40-44.
• Marshall, A. and Hodgson, J. 1998. “DNA chips: an array of possibilities” Nature
Biotechnology 16:27-31.
• Hacia, J.G. 1999. “Resequencing and mutational analysis using oligonucleotide
microarrays” Nature Genetics 21:42-47.
• Bucher, P. 1999. “Regulatory elements and expression profiles” Current Opinion in
Structural Biology 9:400-407.
• Debouck, C. and Goodfellow, P.N. 1999. “DNA microarrays in drug discovery and
development”. Nature Genetics 21:48-50.
• Schena, M. et al. 1995. “Quantitative monitoring of gene expression patterns with a
cDNA microarray” Science 270:467-470.
Computational Methods:
• Tavazoie, S. et al. 1999. “Systematic determination of genetic network architecture”
Nature Genetics 22:281-285.
• Greller, L.D. and Tobin, F. L. 1999. “Detecting selective expression of genes and
proteins” Genome Research 9(3):282-296.
• Eisen, M. B. et al. 1998. “Cluster analysis and display of genome-wide expression
patterns” PNAS 95:14863-14868.
Protein Microarrays:
• Lueking A. et al. 1999. “Protein microarrays for gene expression and antibody screening”
Analytical Biochemistry 270:103-111.
Weeks 12 & 13: Algorithms for RNA and protein structure prediction
RNA sequence and structure:
• Corpet, F. and Michot, B. 1994. “RNAlign program: alignment of RNA sequences using
both primary and secondary structures” CABIOS 10(4):389-399.
• Shapiro, B.A. and Wu, J. C. 1996. “An annealing mutation operator in the genetic
algorithms for RNA folding”. CABIOS 12(3):171-180.
Protein Structure Prediction:
• Kabsch, W. and Sander, C. 1984. “On the use of sequence homologies to predict protein
structure: identical pentapeptides can have completely different conformations”. Proc.
Nat. Acad. Sci. USA 81:1075-1078.
• Gribskov, M., Homyak, M., Edenfield, J., Eisenberg, D. 1988. “Profile scanning for
three-dimensional structure patterns in protein sequences”. Computer Applications in the
Biosciences 4:61-66.
Detecting Motifs:
• Bork, P. and Koonin, E. V. 1996. “Protein sequence motifs” Current Opinion in
Structural Biology 6:366-376.
• Lawrence, C.E. and Reilly, A.A. 1990. “An expectation maximization (EM) algorithm
for the identification and characterization of common sites in unaligned biopolymer
sequences” Proteins: Structure, Function and Genetics 7:41-51.
• Hughey, R. and Krogh, A. 1996. “Hidden Markov models for sequence analysis:
extension and analysis of the basic method” CABIOS 12(2):95-107.
• Smith, H.O., Annau, T.M., Chandrasegaran, S. 1990. “Finding sequence motifs in groups
of functionally related proteins”. Proceedings of the National Academy of Sciences USA
87:826-830.
• Staden, R. 1988. “Methods to define and locate patterns of motifs in sequences” CABIOS
4(1):53-60.
Structural Genomics: (prediction of protein structure)
• Sali, A. 1998. “100,000 protein structures for the biologist” Nature Structural Biology
5(12):1029-1032.
RNA motifs:
• Dandekar, T. and Hentze, M.W. 1995. “Finding the hairpin in the haystack: searching for
RNA motifs.” TIG 11(2):45-50.
Week 14: Databases and Web Tools
Database Searches:
• Altschul, S.F., Gish, W., Miller, W., Myers, E.W., Lipman, D.J. 1990. “Basic local
alignment search tool”, Journal of Molecular Biology 215:403-410.
• Pearson, W.R., Lipman, D.J. 1988. “Improved tools for biological sequence comparison”,
Proc. Nat. Acad. Sci. USA 85:2444-2448.
• Altschul, S.F., Boguski, M.S., Gish, W. and Wootton, J.C. 1994. “Issues in searching
molecular sequence databases”. Nature Genetics 6:119-129.
• Attwood, T. K. 2000. “The role of pattern sequence databases in sequence analysis”.
Briefings in Bioinformatics 1(1):45-59.
TO FIND:
Sansom, C. 2000. “Database searching with DNA and protein sequences: an introduction.”
Briefings in Bioinformatics 1(1):22-32.
Sankoff, D., Cedergren, R.J. 1983. “Simultaneous comparison of three or more sequences
related by a tree”. In “Time Warps, String Edits and Macromolecules: The Theory and Practice
of Sequence Comparison”, D. Sankoff & J.B. Kruskal (eds), pp. 253-263, Addison-Wesley,
Reading, MA.