Self-Organization of Population Structure in Biological Systems

Guy A. Hoelzer
Department of Biology
Department of Environmental and Resource Sciences
University of Nevada Reno
Reno, NV 89557
hoelzer@unr.edu

1. Introduction

Conventional wisdom in the field of population genetics suggests that discrete boundaries between distinctive, geographically adjacent biological populations must reflect the influence of external factors, such as differential selection or a barrier to dispersal [Endler 1977]. Therefore, empirical observations of such boundaries are usually taken as evidence of a previous period of geographical isolation, unless there is an obvious change in environment that coincides with the boundary. A justification for this practice was provided in a seminal paper [Avise 1987], which coined the term "phylogeography" and laid the groundwork for a great deal of recent research (mostly empirical) on the spatial distribution of genetic variation within species. However, the argument provided by [Avise 1987] was, at best, a tenuous one that permitted research in this area to proceed without the burden of a seriously complicating factor. The authors could see no reason to expect such boundaries to emerge intrinsically, and the few empirical examples available at that time seemed to be easily explained as cases of secondary contact after a period of complete geographical isolation [Avise 1987].

Since publication of [Avise 1987], spatial boundaries between gene pools currently exchanging migrants on a regular basis have been identified in many widespread species. Reliance on the argument of [Avise 1987] has led many to infer the earlier existence of a now defunct, complete barrier to gene flow between the gene pools. The boundaries are identified when a sufficient sampling of individuals, locations, and genetic markers reveals relatively homogeneous regions that differ significantly in allele frequencies [Avise 1999].
Complexity theory, especially the theory of self-organizing systems, provides a theoretical basis for emergence of boundaries between relatively homogeneous gene pools in systems exhibiting "isolation-by-distance" [Wright 1943]. Such systems are characterized by gene flow distances that are short relative to the extent of the species' range. This condition affords local populations a degree of evolutionary independence from distant parts of the species' range.

2. General qualities of self-organizing systems and their expression in spatially-structured, biological species

The theory of self-organizing systems [Bak 1996] is still in its infancy, and the necessary and sufficient conditions for the process of self-organization have yet to be elucidated. However, some factors have been identified as being typical of self-organizing systems and the logical bases for their effects have been explored [Bak 1996], particularly for those described as complex adaptive systems [Levin 1999]. I will list some of these factors and describe how each is expressed in the context of isolation-by-distance.

2.1. Diversity and individuality of components [Levin 1999]

Elements of complex adaptive systems have unique qualities and behave independently (to a degree). The dynamics and structure of the system are products of the interactions among these elements. In spatial population genetics, the distinctive elements are local gene pools composed of unique combinations of allele frequencies (alleles are alternative forms of a genetic locus). Local gene pools interact by exchanging individuals through migration and subsequent reproduction. The flow of alleles between local gene pools limits divergence, which is advanced by localized selection, genetic drift, and mutation. One feature of dynamic genetic systems is the constant possibility of allelic extinction.
The loss of allelic diversity would threaten the potential for self-organization, except that mutation and recombination continually add new variants [Levin 1999].

2.2. Localized interactions among components [Levin 1999]

Isolation-by-distance geographically limits interactions among gene pools. In general, this permits divergence among different regions of the system, which could define spatial organization. Localization of interaction in a vast system is commonly described as flows within complexity theory, because it creates time lags as effects of such interactions spread throughout the system. It is telling, or at least convenient, that the term gene flow is used in population genetics to describe the spread of alleles across localities.

2.3. Non-linear interactions among components [Levin 1999]

Non-linear interactions in complex systems can lead to events of surprisingly large effect, which can occur in predictable patterns, although the details of timing and causality are not predictable for particular events [Bak 1996]. These non-linear interactions provide the basis for the formation of boundaries among regions of the system that define its organized structure. The boundaries themselves are non-linear outcomes, which might require non-linear component interactions. Frequency distributions for dispersal distances in natural systems are generally very non-linear [e.g. Wasser 1987].

2.4. An autonomous process [Levin 1999]

A process inherent to the system must cause structural organization of the system. As [Levin 1999] pointed out, natural selection can be such a force; however, this depends on the confines placed on the system of interest. By limiting my system to the gene pool of a single species, many potential sources of selection (e.g. the abiotic environment and interactions with other species) are defined as external to the system.
Social interactions among individuals, and epistatic interactions among genetic loci, could still generate variation in selection pressures across the geographic range of a species, but this possibility will be ignored here. Instead, I will focus on genetic drift (i.e. random changes in allele frequencies caused by sampling error between generations) as a distinctly autonomous process that can lead to self-organization of population substructure. In his original paper, [Wright 1943] recognized that genetic drift acts independently at distant locations within a system of isolation-by-distance. This means that the identities of alleles increasing or decreasing in frequency due to sampling error are somewhat free to differ between distant sites. Wright also concluded that isolation-by-distance reduces local variation, while simultaneously increasing regional differences.

2.5. Dissipation [Nicolis 1989, Prigogine 1992]

Prigogine and his colleagues [Nicolis 1989, Prigogine 1992] have stressed the importance of dissipation in self-organizing systems. Prigogine has been concerned with systems typified by a constant input of energy, which must be dissipated before the state of maximum entropy is exceeded, as dictated by the second law of thermodynamics. The localization of interactions leads to viscosity of flow through the system, which makes the process of dissipation inefficient. The flux of energy through such systems results in the formation of structures that increase the efficiency of flow. The flow of genetic variation through systems exhibiting isolation-by-distance is analogous to the flow of energy through the systems described by Prigogine. Mutation provides a constant input of variation, and population genetic flux causes old alleles to be replaced by new ones. Genetic drift results in allelic replacement, but isolation-by-distance causes drift to be very inefficient at purging alleles from the system once they have become widespread.
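The definition of genetic drift as sampling error between generations can be made concrete with a small sketch. The code below is not taken from any of the models discussed in this paper; it assumes a standard idealization (a well-mixed haploid population of constant size, resampled with replacement each generation, often called a Wright-Fisher model), and the function names and parameters are hypothetical. It illustrates why drift purges variation much faster in a small, well-mixed gene pool than in a large one.

```python
import random


def drift_generations_to_fixation(pop_size, n_alleles=2, seed=0,
                                  max_generations=1_000_000):
    """Count generations until drift alone leaves a single allele.

    Assumption: a well-mixed haploid population of constant size in which
    every offspring copies one randomly chosen parent (sampling with
    replacement). No mutation, selection, or spatial structure.
    """
    rng = random.Random(seed)
    # Start with the alleles at (nearly) equal frequencies.
    population = [i % n_alleles for i in range(pop_size)]
    for generation in range(max_generations):
        if len(set(population)) == 1:
            return generation
        # Sampling error: the next generation is a random draw from this one.
        population = rng.choices(population, k=pop_size)
    return max_generations


def mean_fixation_time(pop_size, replicates=50):
    """Average the fixation time over independent replicate runs."""
    times = [drift_generations_to_fixation(pop_size, seed=s)
             for s in range(replicates)]
    return sum(times) / len(times)
```

Calling `mean_fixation_time(20)` and `mean_fixation_time(200)` should show the smaller population losing its variation in far fewer generations, which is the sense in which drift is described here as more or less efficient.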
I propose that the self-organization of discrete subpopulations serves to increase the efficiency of genetic drift as a mechanism of dissipating genetic variation. Without population subdivision, isolation-by-distance causes species to retain more allelic variation.

Consistent with the activity of an autonomous process and continual dissipation is the notion that self-organizing systems exhibit dynamic behavior. They are always in flux and their general structure is actively maintained by tension between processes eroding and regenerating structure. System dynamics, and the details of structural regeneration, are often contingent upon unpredictable events, such as the outcome of genetic drift. The birth and death of individuals causes genetic drift to be a continual process, and it prevents the system from attaining a static equilibrium state. The geographical distribution of genetic variation is contingent on the idiosyncratic history of allele frequency changes locally, and across the species' range. Every new generation changes the previous distribution in unpredictable ways, so that the emergence of large-scale population structure is necessarily a dynamic process.

3. Simulation-based evidence of the self-organization of spatial population structure

Because isolation-by-distance has not been effectively modeled using analytical methods, computational simulation is an attractive approach for exploration of its effects. The self-organizing property of such systems has been revealed using individual-based, spatially explicit models in at least two instances [see also Rolf 1971]. Both explored large geographic scales compared with average dispersal distances.

3.1. Self-organization of nuclear alleles in simulated plants

[Turner 1982] described a model in which pollination (i.e. gene flow) of individual plants was only allowed to occur between close neighbors in a 2-dimensional lattice of 100 x 100 (population size = 10,000).
To begin each simulation, two alleles, representing a nuclear locus, were combined at random into diploid genotypes, which were then placed onto vertices of the lattice at random. The system evolved a significant degree of spatial clumping of both genotypes and alleles. After 800 generations, 52% of the individuals in the system belonged to homogeneous clumps (i.e. subpopulations without allelic variation) of 100 or more individuals. This striking degree of population substructure emerged from the initially random distribution of both alleles and genotypes throughout the lattice. Because this simulation did not include the process of mutation, one allele would have eventually reached a frequency of 100%, marking the final loss of spatial structure and the dynamic nature of the system. Natural biological systems are constantly subject to mutation; thus the dynamics of self-organization should be perpetual.

3.2. Self-organization of mitochondrial alleles in simulated animals

The second model to exhibit self-organization was described by [Hoelzer 1998]. It was superficially very different from the model of [Turner 1982]. This model mimicked the evolution of haploid mitochondrial genomes, in which alleles were not combined into 2-allele genotypes, in the context of a primate social system. Individuals were organized into social groups, which existed in a 5 x 5 lattice, and the average migration distance was made very small by reducing the frequency with which individuals emigrated from their natal group. Again, individuals that did migrate were constrained to enter neighboring groups. This model included the influence of mutation, allowing for persistent system dynamics. This model generated the same sort of clumping (i.e. spatial autocorrelation among alleles) observed by [Turner 1982], but the transient nature of the clumps did not doom the system to homogeneity.
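The clumping statistic quoted for the plant model (the share of individuals belonging to homogeneous clumps of 100 or more) can be computed from a lattice snapshot roughly as sketched below. How the original authors delimited a clump is not stated here, so this sketch makes its own assumptions: a clump is a set of orthogonally adjacent vertices carrying the same label, the lattice edges do not wrap, and each vertex holds a single label (an allele, or a genotype code). The function name is hypothetical.

```python
from collections import deque


def fraction_in_large_clumps(grid, min_size=100):
    """Return the fraction of vertices lying in clumps of >= min_size.

    grid: list of rows, each a list of hashable labels, one per vertex.
    Assumption: a clump is a 4-connected patch of identical labels and the
    lattice has hard edges (no torus wrapping).
    """
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    in_large = 0
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            # Breadth-first flood fill over one homogeneous patch.
            label = grid[r][c]
            seen[r][c] = True
            queue = deque([(r, c)])
            size = 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and grid[ny][nx] == label):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if size >= min_size:
                in_large += size
    return in_large / (rows * cols)
```

For a randomly shuffled two-allele lattice this fraction should be close to zero, so a value near one half would indicate substantial spatial structure of the kind described above.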
A small fraction of new alleles, generated by mutation, would increase substantially in frequency and spread locally due to drift. Thus, new clumps were continually created, which replaced those that disappeared. When average gene flow distance was too great relative to the scale of the system, no self-organization occurred. However, a threshold was reached as viscosity was increased, where the system bifurcated into two subpopulations. This is biologically surprising, because dispersal across the geographic boundary between subpopulations was just as likely as dispersal within the bounds of a subpopulation. The locations of boundaries between adjacent subpopulations were arbitrary and the boundaries moved across the landscape over time. Genetic drift was efficient within subpopulations, so little variation was found within them at any point in time. However, the lineages occupying different subpopulations were highly divergent. Although an analysis of the behavior of this model was published [Hoelzer 1998, 1999], its self-organizing properties were not described.

3.3. A new simulation designed to explore self-organization of population substructure

The simulations described above were not designed to illustrate the self-organizing process; in fact, the observation of systemic self-organization was a surprise to the authors of both simulations. Furthermore, each simulation had idiosyncratic features that mask the generality of the phenomenon. Therefore, I am currently developing a new simulation model designed specifically to study spatial self-organization, in collaboration with Chris Ray at the University of Nevada Reno. This model includes mutation, but no social structure. It does not require genotype or recombination analyses, because it assumes a haploid genome. Finally, the scales of both the geographic range of the system and the distribution of dispersal distances can be varied.
In the following simulations, a 2-dimensional lattice of 100 x 100 vertices was used; thus, there was a maximum population size of 10,000, but smaller populations occurred when some vertices were unoccupied. The lattice was rolled onto a torus, producing a donut-shaped range without edges. Initial population size was set to 10, and each individual had a unique mutation. Mutations color the vertex to facilitate visual recognition of spatial genetic structure. Mutation rate was set at 10^-5/individual/generation. Offspring inherited the color of the parent, unless they experienced a new mutation. Generations did not overlap, so offspring could inherit the parental vertex. The expected number of offspring per individual was m/n, where m is a hypothetical maximum number of individuals and n is the number of individuals currently in the lattice. Here we set m equal to 10,000, 15,000 and 20,000 (Figures 1A, 1B, and 1C, respectively). In each case, the actual lattice capacity was 10,000. This construction caused the lattice to fill quickly with descendants of the 10 original founders.

Isolation-by-distance was implemented by constraining the vertices occupied by offspring to either the parental vertex or one of eight neighboring vertices. Offspring could migrate one or two steps up, down, left, or right on the lattice. The program attempted to place each offspring in one of these nine vertices at random, but attempts failed when the chosen vertex was already occupied. A failed placement was followed by up to 10 new, random choices among the same nine vertices. The method of controlling equilibrium densities described above effectively causes the number of attempts allowed for the random placement of offspring to vary between 10 and 20. Ultimately, failure to find an unoccupied vertex resulted in death of the offspring.
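The placement rules just described can be sketched in a few lines of Python. This is an illustrative reconstruction under stated assumptions, not the program used to produce the results reported here. The text does not specify how the number of offspring is drawn around its expectation m/n, where the founders are placed, or the order in which parents reproduce; the sketch assumes floor(m/n) offspring plus one more with probability equal to the fractional part, random founder vertices, and a random parent order. The default mutation rate of 1e-5 is likewise an assumed value, and all function and parameter names are hypothetical.

```python
import random


def step(lattice, size, m, mutation_rate, rng, next_color, retries=10):
    """Produce the next non-overlapping generation on a size x size torus.

    lattice maps (row, col) -> color (lineage label).
    Returns (new_lattice, next_color).
    """
    parents = list(lattice.items())
    rng.shuffle(parents)              # assumption: random reproduction order
    expected = m / len(parents)       # expected offspring per individual
    new_lattice = {}
    for (row, col), color in parents:
        # Assumption: floor(m/n) offspring, plus one with the leftover probability.
        n_offspring = int(expected) + (rng.random() < expected - int(expected))
        for _ in range(n_offspring):
            child_color = color
            if rng.random() < mutation_rate:
                child_color = next_color      # each mutation gets a new color
                next_color += 1
            # One initial attempt plus up to `retries` further random choices
            # among the parental vertex and its eight neighbors.
            for _ in range(1 + retries):
                r = (row + rng.choice((-1, 0, 1))) % size   # wrap: torus
                c = (col + rng.choice((-1, 0, 1))) % size
                if (r, c) not in new_lattice:
                    new_lattice[(r, c)] = child_color
                    break
            # If every attempt failed, the offspring dies (nothing is stored).
    return new_lattice, next_color


def simulate(size=100, m=10_000, founders=10, mutation_rate=1e-5,
             generations=100, seed=0):
    """Run the sketch and return the final {(row, col): color} lattice."""
    rng = random.Random(seed)
    all_cells = [(r, c) for r in range(size) for c in range(size)]
    # Assumption: founders start at random vertices, each with a unique color.
    lattice = {cell: i for i, cell in enumerate(rng.sample(all_cells, founders))}
    next_color = founders
    for _ in range(generations):
        if not lattice:               # extinction guard
            break
        lattice, next_color = step(lattice, size, m, mutation_rate,
                                   rng, next_color)
    return lattice
```

A call such as `simulate(size=20, m=400, generations=30)` runs quickly and returns a lattice whose occupancy can be compared with the capacity of 400. Raising `m` above the lattice capacity gives each individual more offspring, and therefore more placement attempts, which is how the text describes density being controlled.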
As expected, the dynamic equilibrium density on the lattice was higher when more attempts were made to place offspring; these densities were approximately 9,050, 9,950 and 10,000, when m was 10,000, 15,000 and 20,000, respectively.

Figure 1. Snapshots of simulations taken at generation 10,000. The conditions of all three simulations were identical (see section 3.3), except for equilibrium densities, which were about (A) 9,050, (B) 9,950, and (C) 10,000, respectively.

A snapshot of the spatial structure of genetic diversity on this landscape is shown in Figure 1 for each of the three equilibrium densities at generation 10,000. In each case, the widespread colors are identical to starting lineages, without any evolutionary change. The rare, locally clustered colors represent recent mutations that have begun to spread across the landscape. While this preliminary exploration of the model does not yet reveal substantial subpopulations derived from mutant lineages arising during the simulation, it nevertheless exhibits a clumped, non-random spatial distribution of colors. Contrasting results from simulations run under different population densities reveals that competition for space enhances the self-organizing effect. Under the high-density condition, the parental vertex is likely to be filled by one of its offspring, but neighboring sites will rarely be available. Thus, competition for space enhances viscosity, and the tendency for self-organization, in a system of isolation-by-distance.

4. Implications for the field of population genetics

Contrary to the conclusions of [Avise 1987], and current standards of practice in the field of phylogeography, observations of boundaries between gene pools in natural populations do not necessarily indicate secondary contact or the effects of selection in different environments. The model described here suggests that such boundaries can emerge as a result of the internal dynamics of a system exhibiting isolation-by-distance.
Therefore, this model provides a new null hypothesis, which predicts the occurrence of boundaries among subpopulations, and the maintenance of highly divergent alleles without intermediate forms, in natural populations that are sufficiently viscous relative to their geographic ranges. This is a null hypothesis, because it does not attribute the pattern to the influence of any factor external to the system under study. Unfortunately, this makes study of such external factors more difficult.

4.1. If a boundary is not evidence of secondary contact or local adaptation, then what would constitute evidence of these phenomena?

The influence of external factors on spatially structured populations was a subject of investigation before the advent of phylogeography. The model presented here brings into question the validity of some uses of phylogeographic methods, but it does not impinge on more traditional approaches. For example, physical (e.g. fossilized) evidence of historical ranges can suggest secondary contact, and local adaptation on either side of a boundary can be studied experimentally (e.g. through reciprocal translocations). Indeed, observation of a phylogeographic boundary can be the basis for hypotheses about the roles of external factors, which can then be tested in these ways.

It is also possible, given enough data, that this null hypothesis could be rejected based on predicted patterns of system dynamics. For example, complex systems are generally characterized by fractal patterns, 1/f noise, and power law relationships [Bak 1996]. While it has yet to be determined where the power laws lie in the population genetics of isolation-by-distance (perhaps the ranked frequency distribution of alleles at any point in time?), I expect such relationships to be predicted by this null hypothesis. The influence of external factors might make some of these predictions false.

4.2. Future directions

Exploration of the range of conditions under which spatially structured population subdivision is expected to self-organize will be needed to appreciate the potential role of this process in natural systems. It is possible that the ratio of gene flow distance to range required for self-organization is rarely realized in natural systems. It is also possible that external forces frequently interfere with the self-organizing process. Following further development of the theory, empirical research will be needed to explore these possibilities. However, before this research program is initiated the possibility of population self-organization must first be appreciated. I expect that this will require a period of intellectual digestion, including some discomforting indigestion, because biologists are traditionally trained to look for the influences of external factors when systemic structures are observed. Appreciation of this model will necessitate a new, general way of thinking about problems for many biologists.

I also expect that natural systems will not fit cleanly into either the model of self-organizing systems or the traditional view that all structure is explained by the effects of external forces. It is likely that both internal and external sources of structure interact in nature. For example, this null model predicts that the locations of boundaries will be arbitrary, assuming environmental homogeneity. This means that the model also predicts that boundaries existing for unlinked genetic markers would not necessarily coincide; however, features of the landscape might marginally reduce local dispersal and attract otherwise arbitrarily located boundaries. Geographic heterogeneity of this sort might cause alignment of boundaries for unlinked markers, resulting in a pattern of population subdivision reflecting most of the genome. This could set the stage for parapatric speciation [Endler 1977] in a way that has previously been unappreciated.
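As a starting point for the power-law question raised in section 4.1, the ranked frequency distribution of alleles in a sample can be tabulated and its slope on log-log axes estimated as sketched below. This is only an exploratory aid with hypothetical function names: an approximately straight line on log-log axes is suggestive at best, and the sketch makes no claim about what slope, if any, the null hypothesis would predict.

```python
import math
from collections import Counter


def ranked_frequencies(alleles):
    """Return allele frequencies sorted from most to least common."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return sorted((n / total for n in counts.values()), reverse=True)


def loglog_slope(frequencies):
    """Least-squares slope of log(frequency) against log(rank).

    A power law, frequency ~ rank**(-a), would appear as a slope near -a.
    Requires at least two distinct alleles.
    """
    if len(frequencies) < 2:
        raise ValueError("need at least two alleles to fit a slope")
    xs = [math.log(rank) for rank in range(1, len(frequencies) + 1)]
    ys = [math.log(f) for f in frequencies]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```

The input can be any list of allele labels, for example the colors occupying the vertices of a simulated lattice at a single point in time.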
References

Avise, J. C., Arnold, J., Ball, R. M., Bermingham, E., Lamb, T., Neigel, J. E., Reeb, C. A., & Saunders, S. C., 1987, Intraspecific phylogeography: the mitochondrial bridge between population genetics and systematics, Annual Review of Ecology and Systematics, 18, 489-522.

Avise, J. C., 1999, Phylogeography: The History & Formation of Species, Harvard University Press (Cambridge).

Bak, P., 1996, How Nature Works: The Science of Self-Organized Criticality, Springer Verlag (New York).

Endler, J. A., 1977, Geographic Variation, Speciation, and Clines, Monographs in Population Biology, no. 10, Princeton University Press (Princeton).

Hoelzer, G. A., Wallman, J., & Melnick, D. J., 1998, The effects of social structure, geographical structure and population size on the evolution of mitochondrial DNA. II. Molecular clocks and the lineage sorting period, Journal of Molecular Evolution, 47, 21-31.

Hoelzer, G. A., Wallman, J., & Melnick, D. J., 1999, Erratum: The effects of social structure, geographical structure and population size on the evolution of mitochondrial DNA. II. Molecular clocks and the lineage sorting period, Journal of Molecular Evolution, 48, 628-629.

Levin, S., 1999, Fragile Dominion, Perseus Books (Reading).

Nicolis, G., & Prigogine, I., 1989, Exploring Complexity, W. H. Freeman (New York).

Prigogine, I., 1992, Dissipative structure in quantum theory, Quantum-Theory Physics Reports-Review Section of Physics Letters, 219, 93-108.

Rolf, F. J., & Schnell, G. D., 1971, , American Naturalist, 105, 295-324.

Turner, M. E., Stephens, J. C., & Anderson, W. W., 1982, Homozygosity and patch structure in plant populations as a result of nearest-neighbor pollination, Proceedings of the National Academy of Sciences USA, 79, 203-207.

Wasser, P. W., 1987, A model predicting dispersal distance distributions, in Mammalian Dispersal Patterns: The Effects of Social Structure on Population Genetics, edited by B. D. Chepko-Sade and Z. Tang Halpin, University of Chicago Press (Chicago).

Wright, S., 1943, Isolation by distance, Genetics, 28, 114-138.