“Being Digital” or “Genetic Codes as Codes”

Towards a Theoretical Basis for Bioinformatics: “Genetic Codes as Codes”
John R. Jungck
Bioinformatics has developed primarily as a discipline within mathematics
and computer science devoted to organizing and analyzing large biological
databases. However, biology has much to offer to a synthetic discipline of
bioinformatics that draws upon and respects the mutual contributions of
biology, mathematics, and computer science. In particular, biology has two
major theoretical foundations, both evolutionary: phylogenetic systematics
and population genetics. Together they can serve as a cornerstone of a
theoretical foundation for bioinformatics, alongside the traditional,
empirically driven, pattern-searching forms of classical bioinformatics.
In this reconception of bioinformatics, mathematics and computer science are
instrumental in developing biological theory and in solving practical
biological problems. Because the genetic code is both an evolutionary product
and a process for mediating the conversion of genotype to phenotype, it is
argued here that an evolutionary analysis of genetic codes will
fundamentally affect our ability to make meaning out of molecular messages
through a theoretically grounded bioinformatics.
Mathematical properties of genetic codes will be demonstrated with respect
to their rates of transmission, correctability and detectability of errors,
efficiencies, symmetries, and origins by employing coding theory (Baudot
codes, Gray codes, Hamming codes, Huffman codes, comma-free codes,
etc.), abstract algebra, graph theory, combinatorics, information theory, and
phylogenetic systematics of sequences. Genetic codes become much more
understandable and elegant to biologists, mathematicians, and computer
scientists when they are not considered as mere ciphers, but are instead
understood from three perspectives: codes per se, physical chemical
interactions, and evolutionary selective pressures. These various faces of
genetic codes are useful for making meaning out of molecular messages,
applying causal mechanisms to complex patterns, and efficiently storing
and retrieving large complex data sets. In addition, some of the alternative
distance metrics based upon different mathematical representations of
genetic codes will be illustrated; these have utility in genomic database
searching (comparative sequence analyses), phylogenetic tree construction,
and prediction of three-dimensional structure from primary structure.
Different evolutionary mechanisms affecting gene expression based upon
codon usage will also be considered.
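As a minimal sketch of how a distance metric can depend on the chosen
representation of the genetic code, the Python fragment below compares a
nucleotide-level Hamming distance with a bit-level one, counts the
synonymous single-substitution neighbors of a codon, and tallies codon
usage. The 2-bit labels in BITS (first bit pyrimidine/purine, second bit
weak/strong hydrogen bonding) are only one illustrative labeling among
several possible ones, and all function names are hypothetical; none of
this is taken from the author's own work or from the tools listed below.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons in the order
# TTT, TTC, TTA, TTG, TCT, ...; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Assumed 2-bit labeling (illustrative only):
# first bit 0 = pyrimidine, 1 = purine; second bit 0 = weak (A, T), 1 = strong (C, G).
BITS = {"T": "00", "C": "01", "A": "10", "G": "11"}


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def bit_distance(codon_a, codon_b):
    """Hamming distance between the 6-bit encodings of two codons."""
    encode = lambda codon: "".join(BITS[n] for n in codon)
    return hamming(encode(codon_a), encode(codon_b))


def neighbors(codon):
    """The nine codons reachable from `codon` by one nucleotide substitution."""
    return [codon[:i] + b + codon[i + 1:]
            for i in range(3) for b in BASES if b != codon[i]]


def synonymous_neighbors(codon):
    """How many single-substitution neighbors encode the same amino acid."""
    return sum(CODE[n] == CODE[codon] for n in neighbors(codon))


def codon_usage(seq):
    """Count in-frame codons in a coding sequence; a trailing partial codon is ignored."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))
```

Under this labeling, TTT and TTC are one nucleotide and one bit apart,
while TTT and TTG are one nucleotide but two bits apart, so the two
representations rank the same substitutions differently. A different
labeling of the four bases would give a different bit-level metric.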
Key words: Evolutionary Bioinformatics; Genetic Codes; Huffman Codes
(Fractals and Power Laws), Gray Codes, Hamming Codes, Baudot Codes,
Comma-free Codes, Commaless Codes, and Overlapping Codes; codon
usage; Gatlin-Grantham Hypotheses; Shannon’s Information Theory and
Chaitin-Kolmogorov Algorithmic Complexity and Compressibility;
Algebraic Coding Theory; Klein-4 groups.
Bibliography:
John R. Jungck, Ethel D. Stanley, and Marion Field Fass, Editors. (2002).
Microbes Count! Problem Posing, Problem Solving, and Peer Persuasion in
Microbiology. American Society for Microbiology Press: Washington,
D.C.
John R. Jungck, Editor, (1998-), The BioQUEST Library V & VI (2002).
Academic Press: San Diego, California.
John R. Jungck, (1998), Evolutionary Problem Solving. BioQUEST Notes 8
(2): 4-5 (February).
John R. Jungck and Robert M. Friedman. 1984. Mathematical Tools for
Molecular Genetics Data: An Annotated Bibliography. Bulletin of
Mathematical Biology 46 (4): 699-744.
John R. Jungck. 1984. The adaptationist programme in molecular evolution.
The origins of genetic codes. In Molecular Evolution and Protobiology,
K. Matsuno, K. Dose, K. Harada, and D. L. Rohlfing, eds., Plenum
Press: New York, pp. 345-364.
Martha O. Bertman and John R. Jungck. 1979. Group graph of the Genetic
Code. Journal of Heredity 70: 379-384.
John R. Jungck. 1978. The genetic code as a periodic table. Journal of
Molecular Evolution 11: 211-224.
Plus attach the PubMed bibliography on Codon Usage entitled:
CodonCompPubMedBibliogr.doc
Web site tools:
1. BioQUEST Curriculum Consortium
2. BEDROCK: Bioinformatics Education Dissemination: Reaching Out,
Connecting, and Knitting-together
3. Biology Workbench
4. Codon Composition Analyzers:
5.
6. Freeland Lab: Genetic Code Evolution : projects
a. A Bioinformatics Lab of the Biological Sciences Dept. at UMBC
b. http://www.evolvingcode.net/project.php
c. The CAI Calculator: measures codon usage bias in a gene
d. Codon Sequence Analyzer: This tool investigates the codon
error minimization property of the genetic code by analyzing
protein coding sequences
7. Codon Usage Database (NAKAMURA Yasukazu, Dr.)
a. http://www.kazusa.or.jp/codon/
8. Codon Usage Table analysis
i. http://www.entelechon.com/eng/cutanalysis.html
ii. (Entelechon - the syntheticgenes.company)
9. Graphical Codon Usage Analyzer http://gcua.schoedl.de/
Markus Fuhrmann, Lars Ferbitz, Amparo Hausherr,
Thomas Schödl and Peter Hegemann
10. The analysis of codon usage patterns. James O. McInerney
<http://www.rfcgr.mrc.ac.uk/embnet.news/vol4_2/codon.html>
11. Correspondence Analysis of Codon Usage: CodonW is a programme
designed to simplify the multivariate analysis (correspondence
analysis) of codon and amino acid usage.
http://www.molbiol.ox.ac.uk/cu/
12.