© Gareth Witten

Mathematical and Computational Challenges in the Biological Sciences

Gareth Witten
Department of Mathematics and Applied Mathematics, University of Cape Town, South Africa
Santa Fe Institute, 1399 Hyde Park Road, Santa Fe 87501, USA

Draft: 29 October 2003

The interface between mathematics and biology presents opportunities and challenges for both mathematicians and biologists. For biologists, research areas range from the level of the cell to the biosphere (see Figure 1). For mathematicians, research includes “traditional” areas such as mathematical statistics and dynamical systems, and “non-traditional” areas such as knot theory (De Witt Sumners, 1995), interval graphs and algorithms for the double digest problem (DDP) (Waterman, 1995), and differential inclusions (Aubin, 1991).

For example, in knot theory one would like to understand the 3-D structure of proteins and DNA in solution in the cell and the relationship between their structure and function (Figure 2). Packing, twisting and topological constraints pose serious functional problems for DNA, and the resulting entanglement interferes with replication, transcription and recombination. Performing experiments on circular DNA that has been modified so that certain enzymes recognize a particular region allows one to capture the topological effects, which can then be observed by gel electrophoresis and electron microscopy. These experiments can help us answer questions like, “How can one deduce enzyme mechanisms from observed changes in DNA geometry and topology?”

Fig. 1. Scope and scale of biology from sub-cellular structures to ecosystems.

Fig. 2. Topological approach to enzymology.
Opportunities have surfaced within the last three decades because of the enormous increase in the quantity and quality of biological data, driven by advances in technology, and because of the availability of powerful computing resources (hardware and software) that can potentially organize this plethora of data. Figure 3 gives a “roadmap” for the contributions to drug docking and molecular recognition made possible by advances in software and hardware. However, further advances in this direction are necessary but may not be sufficient to further our understanding of biological systems. In my opinion, there are two further requirements:

(1) We need to integrate information across different temporal and spatial scales (Figure 4a).
(2) We need theoretical frameworks for approaching the behaviour of spatially extended, hierarchical systems (for example, see May, 1992, Figure 4b).

Fig. 3. A “roadmap” for the biological contributions possible for drug docking and molecular recognition with advances in software and hardware (Wooley, 1999).

Fig. 4. (a) Suite of different models used to model different spatial (and temporal) scales in ecosystems, (b) Spatial dynamics of the population densities of hosts and parasitoids (Gibbs lecture, May, 1995).

Mathematical modelling does provide such a framework, but I will go further and argue for new tools and conceptual developments to understand such complex dynamical systems. In my opinion, it is to this end that the complex systems research community endeavours, but further development for large dynamical systems is required. Complex systems research borrows (and develops) quantitative tools from computer science, mathematics and physics and uses them to understand large social and biological systems (Figure 5).

Fig. 5. Domain of complex systems research (diagram labels: Complex Systems, Physics, Computer Science, Maths).

There exist areas in biology that are virtually devoid of mathematical theory, and some must remain so for years to come. In these areas anecdotal information accumulates, awaiting the integration and insights that come from mathematical abstraction. Examples of these fields include developmental biology (Figure 6a) and neurobiology (Figure 6b).

Fig. 6. (a) Growth of a human fetus, (b) Neuronal network.

Some questions that are being asked include:

“How are the developmental pathways stabilized and spatially organized to yield a sea urchin or lily or a giraffe?”
“How do genes act and interact within the context of cells so as to bring about these units of structure and function?”
“How do cells act and interact within the context of the organism to generate coherent wholes?”

In other areas, theoretical developments that may capture biological phenomena have run far ahead of the capability of empiricists to test them. For example, catastrophe theory was inspired by Waddington’s (1957) idea of an epigenetic landscape.

Fig. 6. Waddington’s landscape.

The ways in which whole fields of research are approached have changed. For example, evolutionary genetics and evolutionary biology were historically concerned largely with inferring process from pattern; the explosion of knowledge at the cellular and molecular levels has permitted the complementary approach, which begins with processes at the micro level.

Fig. 7. Fossil record during the Cambrian period (about 570 million years ago).

Most applications of mathematics to biology will have little effect on core areas of mathematics. Interactions of mathematics and biology can be divided into three categories:

1. Routine application of existing mathematical techniques to biological problems. E.g. Lotka-Volterra, PDEs for tumor growth…
2. Existing mathematical techniques are inadequate and new mathematics must be developed within conventional frameworks. E.g. individual-based models (IBMs), networks (Figure 8), differential inclusions.
3. Some fundamental issues in biology appear to require new ways of thinking. E.g. catastrophe theory.

Figure 8: A complex food web pattern from a marine ecosystem in South Africa.

Classical approaches to population biology, like classical approaches to other problems in biology, emphasised deterministic systems of low dimensionality, and thereby swept as much stochasticity and heterogeneity as possible under the rug. New techniques, together with advances in technology and in algorithm development, have led to highly detailed models in which a wide variety of components and mechanisms can be incorporated. A consequence of this drive to include more detail is a question that is central to science: “What detail at the level of individual units is essential to understand more macroscopic regularities?” Part of the problem is that structures and processes are modelled as different types of mathematical objects; for example, muscle fibre orientation is modelled by a tensor, while the action potential in a cell can be modelled by solutions of differential equations. The answers lie in the principles of dynamic organisation, which are still far from clear but which involve emergent properties that resolve the extreme complexity of gene and cellular activities into robust patterns of coherent order.

The reductionist approach (e.g. the Human Genome Project, HGP) ignores the fact that an organism is not a thing composed of parts, but a system of interacting processes. What is needed is a means of reconstructing the behaviour of a system from a detailed knowledge of its components and their interactions. Given the baroque complexity of living systems, any such reconstruction must be constructive and computational.
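To make the phrase “deterministic systems of low dimensionality” concrete, the following is a minimal sketch of the classical two-variable Lotka-Volterra predator-prey model, integrated with a hand-written fourth-order Runge-Kutta step. It is an illustration only: the function names, the parameter values in PARAMS and the initial densities are assumptions chosen for the example and do not come from the text or from any cited study.

```python
import math

# Illustrative parameter values (assumptions, not taken from the text):
# alpha = prey growth rate, beta = predation rate,
# gamma = predator death rate, delta = predator gain per prey consumed.
PARAMS = (1.0, 0.1, 1.5, 0.075)


def lotka_volterra(x, y, alpha, beta, gamma, delta):
    """Right-hand side of the classical predator-prey equations."""
    dx = alpha * x - beta * x * y    # prey: growth minus predation
    dy = delta * x * y - gamma * y   # predators: gain from prey minus death
    return dx, dy


def rk4_step(x, y, dt, params):
    """Advance (x, y) by one fourth-order Runge-Kutta step of size dt."""
    k1x, k1y = lotka_volterra(x, y, *params)
    k2x, k2y = lotka_volterra(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y, *params)
    k3x, k3y = lotka_volterra(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y, *params)
    k4x, k4y = lotka_volterra(x + dt * k3x, y + dt * k3y, *params)
    x_next = x + dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
    y_next = y + dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0
    return x_next, y_next


def invariant(x, y, alpha, beta, gamma, delta):
    """Quantity conserved along exact solutions of the model."""
    return delta * x - gamma * math.log(x) + beta * y - alpha * math.log(y)


def simulate(x0, y0, params, dt=0.001, steps=20000):
    """Return the list of (prey, predator) densities, initial state included."""
    trajectory = [(x0, y0)]
    x, y = x0, y0
    for _ in range(steps):
        x, y = rk4_step(x, y, dt, params)
        trajectory.append((x, y))
    return trajectory


if __name__ == "__main__":
    traj = simulate(10.0, 5.0, PARAMS)
    drift = abs(invariant(*traj[-1], *PARAMS) - invariant(*traj[0], *PARAMS))
    print("final state:", traj[-1])
    print("drift in conserved quantity:", drift)
```

The whole system is two state variables and four parameters, and its trajectories are closed orbits fixed entirely by the initial condition. There is no noise, no spatial structure and no variation among individuals, which is the sense in which such models leave stochasticity and heterogeneity out of the picture.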
For example, organismal biology deals with all aspects of the biology of individual plants and animals, including physiology, morphology, development, and behaviour. It interfaces with cellular and molecular biology at one end, and with ecology at the other. In cellular and molecular biology one attempts to develop integrative theories of organismal function, and in ecology one attempts to place individual behaviour and function within an environmental context.

However, there are several problems in understanding the behaviour of a biological system even when a detailed and accurate description of its components is available:

1. There is the sheer complexity of the system and the number of its components.
2. The components operate over radically different time scales and spatial scales.
3. The processes occur in a system that is spatially extended and organized within a structural and functional hierarchy.
4. Most of the functional interactions within biological systems are nonlinear, and small changes in the parameters of nonlinear systems can lead to large-scale, qualitative changes in their behaviour.

In summary, a number of fundamental mathematical issues cut across all of these challenges:

1. How can we incorporate variation among individual units in nonlinear systems?
2. How do we treat the interaction among phenomena that occur on a wide range of scales of space, time and organisational complexity?
3. What is the relation between pattern and process?

References:

1. D. Sumners (1995). Lifting the curtain: using topology to probe the hidden action of enzymes. Notices of the AMS, 42(5):528-535.
2. R. Sole and B. Goodwin (2000). Signs of Life: How complexity pervades biology. New York, Basic Books.
3. R. Leakey and R. Lewin (1995). The sixth extinction: Biodiversity and its survival. London, Phoenix publishers.
4. M. S. Waterman (1995). Introduction to Computational Biology: maps, sequences and genomes. New York, Chapman & Hall.
5. J. C. Wooley (1999). Trends in Computational Biology: A summary based on a RECOMB plenary session. Journal of Computational Biology, 6(3/4):459-474.
6. J. P. Aubin (1991). Viability Theory. Birkhäuser.
7. R. M. May (1995). Necessity and Chance: Deterministic chaos in ecology and evolution. Bulletin of the American Mathematical Society, 32(3):291-308.
8. O. Wolkenhauer (2001). Systems biology: The reincarnation of systems theory applied in biology. Briefings in Bioinformatics, 2(3):258-270.