Complexity: An Invitation to Statisticians

Complessità: Un Invito agli Statistici
David A. Lane
Università degli Studi di Modena e Reggio Emilia
Viale Berengario 51, Modena
lane@unimo.it
Abstract: Over the last few decades, a new paradigm, based on the idea of complexity, has emerged
from research in physics, chemistry, biology and, recently, the social sciences. The paradigm is
concerned especially with the self-organization of higher-level, emergent structures from the
interactions among a set of lower-level entities or “agents.” Statisticians have already
begun to apply several techniques developed by complexity researchers, such as neural nets and genetic
algorithms. In this talk, I suggest that the statistical community might benefit from a closer look at
what complexity research is about – and may even contribute to successfully resolving some of the
inferential problems that arise in this research.
Key words: Complexity, Self-organization, Emergence
1. Introduction
All sciences are from time to time influenced, sometimes strongly, by concepts and methods imported
from other sciences. This is especially true for statistics, whose task it is to develop ideas and
techniques that lend meaning to data drawn from all the sciences. Over the last several decades, a new
scientific paradigm has been emerging from research in chemistry, condensed matter physics,
theoretical biology, computer science and, recently, social sciences: the paradigm of complexity.1 In
this talk, I will briefly describe the kinds of phenomena complexity theories set out to explore, some
attributes of the models that scientists use to explore these questions, how these models differ from
those that statisticians typically use, and how complexity theorists and statisticians might benefit from
each other’s concepts and methods.
Footnote 1: For interesting and accessible introductions to complexity, see Nicolis and Prigogine (1989), Gell-Mann (1994), Kauffman (1993), and Holland (1995, 1998). For an introduction to some standard complex system modeling techniques, see Weisbuch (1991), Serra and Zanarini (1990), Bar-Yam (1997) and Bossomaier and Green (2000). The web-sites of the Santa Fe Institute (www.santafe.edu) and the New England Complex Systems Institute (www.necsi.org) are good sources for current directions in complexity research.
Complex phenomena are characterized by a set of entities that interact with one another. Each
interaction event typically involves relatively few of the entities, and which entities interact with one
another depends on some sort of underlying network structure. As a result of their interactions, some
properties of an entity may change – including its position in the network structure and its interaction
modes. While interactions are local, the objects of interest in complex phenomena are usually global –
some function of a pattern of interaction events that is relatively stable over a time scale much larger
than that of the interaction events themselves. Frequently, such meta-stable patterns of interactions
self-organize, and their resulting structures and even functionality can often be described in a language
that makes no reference to the underlying entities and their interactions. In that case, these patterns are
called emergent, and the study of self-organization and emergence constitutes the principal goal of
complexity research. Typically, emergence is driven by a combination of positive and negative
feedbacks: the positive feedbacks amplify fluctuations at the level of local interactions between entities
into global, system-level features, and the negative feedbacks stabilize these features, rendering them
resistant for some extended time period to further micro-level interference. Such processes tend to
exhibit such dynamical features as multiple possible meta-stable states, bifurcation points at which the
process “chooses” among these states, and “lock-in” to the meta-stable state that actually arises. In
Section 2, I present two examples of complex phenomena exhibiting these features.
The models that are used to study complex phenomena have a structure similar to what I just described
for the phenomena themselves. They specify a collection of entities, organized according to some
given network structure, and rules for interaction among entities, which specify both the order of
interaction and the changes in entity properties and network structure that result from the interactions.
Usually, though not always, there is a stochastic component in the specification of the interaction rules.
The objects of inferential interest in the model are defined in terms of some function of interaction
histories, which in turn depend on entity properties and interaction rules. The functions of interest
generally are global, in the sense that they depend upon the ensemble of all the entities in the model.
These functions represent emergent phenomena within the model. Not surprisingly, in view of the
examples mentioned above, they are exceedingly complicated, nonlinear functions of entity properties
and interaction rules. Sections 3 and 4 introduce models for the phenomena described in Section 2.
In Section 5, I compare complexity models and standard statistical models, pointing out a few
similarities and some major differences between them. Section 6 concludes the talk, with some
suggestions about how statisticians may contribute to and benefit from the growing body of complexity
literature.
2. Two complex phenomena
In this section, I discuss two examples of complex phenomena. The first is biological. It concerns one
of the many fascinating behaviors of social insects, the raiding parties of tropical army ants. The
second is economic. It concerns the role of “word-of-mouth” advertising in determining whether a
just-released movie will be a hit or a flop – or whether it might be a great success in Switzerland and a
failure in Austria.
2.1 Army ants: “The Huns and Tartars of the insect world” (Hölldobler and Wilson, quoted in Sole et al., 1999)
“Swarm raid patterns in army ants are among the most astonishing social behaviors one can observe in
nature. Within a matter of hours, thousands of ants leave their [nest], forming swarms or columns, with
the only purpose of finding food for the colony. These raids are able to sweep out an area of over 1000
square meters in a single day” (Sole et al., 1999). What is of interest here is the shape of the swarm
and the process by which the swarm is formed.
Different species of army ants form swarms with different characteristic shapes. E. burchelli swarms
may contain up to 200,000 individuals. “The raid system is composed of a swarm front, a dense carpet
of ants that extends for approximately 1 meter behind the leading edge of the swarm, and a very large
system of …trails [behind the front]. These trails, along which the ants run out to the swarm front and
return with prey items, characteristically form loops that are small near the raid front and get ever
bigger and less frequent away from it. The final large loop leads to a single principal trail [usually a
mostly straight line] that provides a permanent link between the … raid and the [nest].” In contrast, E.
hamatum swarms consist of a single long column with a few smaller columns branching off at angles
of between about 30 and 90 degrees.
How do the swarms form? Clearly, there is no directing intelligence here: the swarm is an instance of
self-organization. Individual ants are recruited into the swarms by means of a chemical trail, produced
from pheromones deposited on the ground by other ants. As more and more ants enter the swarm, the
trail of pheromones becomes ever denser and thus more attractive to other ants, at least until the swarm
saturates and it becomes physically impossible for additional ants to follow the scent of the others.
Here we see both positive feedback (as ants join the swarm, they deposit pheromone making the swarm
more attractive to other ants) and negative feedback (spatial saturation, at the nest and along the most
densely occupied areas in the swarm). The direction that the swarm takes may initially be determined
randomly, but as ants return to the nest laden with prey, depositing pheromone as they go, the swarm
tends to move towards zones where prey is to be found.
What determines the characteristic shape of the swarm? In particular, why are E. burchelli and E.
hamatum swarms so different? There are several possible explanations. One is that the inherited
“instruction set” inside individual ants that determines how they respond to pheromones, how much
pheromone they deposit on their outward and return trips, and the physical characteristics of the
pheromones involved may differ between the species, in ways that determine the shapes of the swarms
that emerge from the individual ants’ interactions via their pheromones. Another, more interesting,
possibility is that the environment in which they hunt may interact with the ants’ inherited “instruction
sets” to produce the characteristic shapes of their swarms. Indeed, E. burchelli generally preys on
arthropods that are fairly common but dispersed in space, while E. hamatum preys on social insect
colonies that are relatively rare but offer a tight concentration of prey in a small space.
Ant researchers have been able to show through experiments that the environmental interaction
hypothesis is indeed correct. When prey is distributed sparsely but in high local concentrations, E.
burchelli ants form columnar swarms that closely resemble those of E. hamatum (Franks et al., 1991,
quoted in Bonabeau, Dorigo and Theraulaz, 1999). Thus, evolution seems to have constructed a search
algorithm that succeeds in self-organizing different solutions to different sorts of search problems. It
might be valuable to operations researchers and statisticians to discover how it managed to accomplish
this feat. To understand better the relation between ants, pheromones, prey distribution and swarm
formation and shape, we need to build a model for the process, which we do in Section 3.
2.2 Information contagion: learning from the experience of others
Suppose you are thinking about seeing a movie. Which one should you see? Even if you keep up with
the reviews in the newspapers and magazines you read, you will probably also try to find out which
movies your friends are seeing -- and what they thought of the ones they saw. So which movie you
decide to see will depend on what you learn from other people, people who already went through the
same choice process in which you are currently engaged. And after you watch your chosen movie,
other people will ask you what film you saw and what you thought of it, so your experience will help
inform their choices and hence their experiences as well.
Now suppose you are a regular reader of Variety, and you follow the fortunes of the summer releases
from the major studios. One of the films takes off like a rocket, but then loses momentum.
Nonetheless, it continues to draw and ends up making a nice profit. Another starts more slowly but
builds up an audience at an ever-increasing rate and works its way to a respectable position on the all-time earners' list. Most of the others fizzle away, failing to recover their production and distribution costs.
I have just described the same process at two different levels: the individual-level decision process and
the aggregate-level market-share allocation process. I focused on a particular feature of the individual-level process: people learn from other people, what they learn affects what they do – and then what
they do affects what others learn. Thus, the process that results in allocation of market-share to the
competing films is characterized by an informational feedback that derives from the fact that
individuals learn from the experience of others.
The question of interest here is: can this informational feedback suffice to “make a hit,” even when
there is little intrinsic difference in the entertainment value of the competing movies? That is, can the
local process of learning from others generate a positive feedback sufficient to amplify a more or less
random fluctuation in attendance that favors one of the competing movies, into an avalanche that
results in an overwhelming, society-wide hit, a global effect? If so, under what circumstances? Again,
to get at this question, we need a model whose analysis will tell us when and how hits self-organize.
3. A model for army ant swarm formation
In this section, I describe a model for swarming behavior, introduced by Deneubourg et al. (1989) and
slightly modified by Sole et al. (1999). In this model, all interactions between ants are indirect,
mediated by pheromones that ants deposit as they move. Thus, the network that describes which ants
interact with which cannot be described a priori: by means of their interactions with the environment,
an ant “interacts” with all other ants who pass through the same sites it visits. The spatial environment
on which these indirect interactions occur has a highly ordered structure:
1) Ants move on a discrete two-dimensional lattice (LxL), with the nest located at site (1,L/2). Ant
movements occur at discrete time steps. At each time step, an individual ant is either still in the
nest, searching for prey (that is, moving away from the nest, in the direction of an increasing value
of the first site coordinate), or returning to the nest with prey (that is, in the direction of a
decreasing value of the first site coordinate). An ant that “falls off” the edge of the lattice dies.
2) Moving ants deposit pheromone. A searching ant deposits one unit of pheromone per time step on
the site it currently occupies, unless the current pheromone level at that site exceeds a parameter s0.
A returning ant deposits q units of pheromone per time step, unless the current level exceeds s1.
Pheromone evaporates at rate d per time step.
3) An ant moves forward with a probability that increases with the total pheromone level in the sites
to which it may move (except near the edges of the lattice, there are three such sites in front of the
ant, one to the left, one to the right, and one straight ahead). The precise form of this function has
virtually no effect on the ensuing analysis. As a consequence of this condition, ants move faster
when they travel through a dense pheromone field than when they are exploring “virgin” territory,
which is well supported by empirical evidence (Sole et al., 1999). Figure 1 plots the graph of the
function used by Sole and his collaborators.
Figure 1: Probability of moving as a function of pheromone level l in forward sites: p(l) = ½ (1 + tanh(l/100 − 1))
4) Given that an ant moves, the probability that it moves, say, to the left is given by the formula
   p(left) = (u + c_l)² / [(u + c_l)² + (u + c_r)² + (u + c_d)²],
   where c_l, c_r, and c_d are the pheromone levels at the sites facing the ant to the left, right and
   center respectively, and u is a parameter that measures the attractiveness of sites that have no
   pheromone. According to Sole et al. (1999), this formula provides a reasonable fit to empirical
   evidence derived from experiments with ants moving through measured pheromone fields.
5) Sites saturate. The number of ants at a given site may not exceed a threshold A.
6) B ants leave the nest at every time step.
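To make rules 1)–6) concrete, here is a minimal, illustrative Python sketch of the simulation loop. It is not the authors' code: the parameter values are placeholders, prey is replaced by a small fixed probability that a searching ant turns around, and returning ants that reach the nest column simply leave the simulation.

import math
import random

# Illustrative parameter values -- placeholders, not the values used by Sole et al. (1999).
L = 100            # lattice is L x L (rule 1)
A_MAX = 30         # maximum number of ants per site (rule 5)
B_LEAVE = 10       # ants leaving the nest at each time step (rule 6)
S0, S1 = 50, 400   # pheromone caps for searching / returning ants (rule 2)
Q = 47             # pheromone deposited per step by a returning ant (rule 2)
D = 0.002          # fraction of pheromone evaporating per step (rule 2)
U = 47             # attractiveness of pheromone-free sites (rule 4)
P_PREY = 0.01      # stand-in for finding prey: chance a searching ant turns back

NEST = (0, L // 2)
pher = [[0.0] * L for _ in range(L)]   # pheromone field, indexed [x][y]
count = [[0] * L for _ in range(L)]    # ants per site, for the saturation rule
ants = []                              # each ant is a dict {'x', 'y', 'returning'}

def p_move(level):
    """Rule 3 / Figure 1: probability of moving, given total forward pheromone."""
    return 0.5 * (1.0 + math.tanh(level / 100.0 - 1.0))

def step():
    # Rule 6: new ants leave the nest, subject to the saturation rule 5.
    for _ in range(B_LEAVE):
        if count[NEST[0]][NEST[1]] < A_MAX:
            ants.append({'x': NEST[0], 'y': NEST[1], 'returning': False})
            count[NEST[0]][NEST[1]] += 1
    survivors = []
    for ant in ants:
        x, y = ant['x'], ant['y']
        # Rule 2: deposit pheromone unless the site has reached the relevant cap.
        cap, amount = (S1, Q) if ant['returning'] else (S0, 1.0)
        if pher[x][y] < cap:
            pher[x][y] += amount
        dx = -1 if ant['returning'] else 1
        if not (0 <= x + dx < L):      # off the far edge (dies) or back into the nest
            count[x][y] -= 1
            continue
        survivors.append(ant)
        sites = [(x + dx, j) for j in (y - 1, y, y + 1) if 0 <= j < L]
        if random.random() < p_move(sum(pher[i][j] for i, j in sites)):   # rule 3
            weights = [(U + pher[i][j]) ** 2 for i, j in sites]           # rule 4
            nx, ny = random.choices(sites, weights=weights)[0]
            if count[nx][ny] < A_MAX:                                     # rule 5
                count[x][y] -= 1
                count[nx][ny] += 1
                ant['x'], ant['y'] = nx, ny
        if not ant['returning'] and random.random() < P_PREY:
            ant['returning'] = True   # crude stand-in for reaching prey
    ants[:] = survivors
    for row in pher:                  # rule 2: evaporation
        for j in range(L):
            row[j] *= 1.0 - D

for _ in range(500):
    step()
print(sum(map(sum, count)), "ants on the lattice after 500 steps")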
As is typical of models for complex phenomena, this fairly simple model has a lot of parameters: A, B,
L, s0, s1, u, q and d; not to mention the shape of the functions for the probability and direction of
movement. Again like virtually all complexity models, the model may easily be simulated on a
computer, so the modelers may carry out robustness experiments to find out which of their parameters
matter, in the sense that they have a determining influence on the emergent phenomena the model is
designed to investigate, and which do not. (I should point out here that current practice for this kind of experiment is extremely informal and might benefit from some input from statisticians with experience in experimental design.) So henceforth we may follow Sole et al. (1999) in declaring L to be “large”, A equal to 30, and B equal to 10.
The point of the analysis is to find whether the model can produce swarms with shapes characteristic of
those of E. burchelli and E. hamatum. Deneubourg et al. (1989) claim that, with a single set of
parameter values that is reasonable given observational and experimental data, swarms characteristic
of both species emerge in model simulations, depending on the distribution of prey over sites. In
particular, when each site, independently of the others, receives 1 unit of prey with probability ½, a
swarm front like that of E. burchelli forms, while when each site receives 400 units of prey with
probability 0.01, the simulations generate branching columnar swarms like those of E. hamatum. How
do they document the resemblance between their computer simulations and real army ant swarms?
Naturally, it is extremely difficult to develop (and then apply) a measure of similarity for swarm
shapes. This is a typical problem that arises with emergent global patterns and structures. Like most
complexity modelers, Deneubourg and his collaborators do not attempt a formal analysis of swarm
shape, but rely on visual evidence and our built-in ocular pattern recognizers to verify the resemblance.
It would be interesting to see if statisticians could come up with a better procedure.
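One simple-minded possibility, offered purely as an illustration of the kind of statistic one might try (it is not a procedure from the literature): summarize a simulated or observed swarm by the lateral spread of occupied sites as a function of distance from the nest, which at least separates a broad E. burchelli-like front from a narrow E. hamatum-like column. Using the count grid from the sketch above:

import statistics

def lateral_spread(count):
    """For each distance x from the nest, return the spread (population standard
    deviation of the y-coordinates of occupied sites).  Large values near the front
    suggest a burchelli-like swarm; uniformly small values, a hamatum-like column."""
    spread = []
    for row in count:                                   # count is indexed [x][y]
        ys = [y for y, n in enumerate(row) if n > 0]
        spread.append(statistics.pstdev(ys) if len(ys) > 1 else 0.0)
    return spread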
Sole et al. (1999) provide a deeper analysis of the model. They ask: for a particular distribution of
prey, which parameter values would optimize the prey-gathering efficiency – that is, the amount of prey
gathered per time step, divided by the number of ants currently hunting at that time, summed over a
large number of time steps. Clearly, this is an exceedingly difficult optimization problem. To solve it,
the authors use an optimization method that itself derives from complexity research, similar to the
genetic algorithm that has become familiar to many statisticians in the past several years. They obtain
a remarkable result. Like Deneubourg et al. (1989), they consider two different prey distributions. In
case 1, each site, independently of the others, receives 200 prey units with probability 0.01; in case 2, 1
prey unit with probability ½. Table 1 gives the optimum parameter values in the two cases:
Table 1: Parameters that optimize efficiency for rare but concentrated prey (case 1) and common but
dispersed prey (case 2)
case     s0      s1      u       q      d
 1       431     592     47.5    46     2.5 x 10^-3
 2        51     390     47      47     2.1 x 10^-3
Note that the values for u, q, and d are nearly identical in the two cases. For both, the pheromone
evaporation rate is very low; clearly, it is important to establish trails that persist for substantial periods
of time. It is perhaps more surprising that there is no difference in u, which measures the attractiveness
of previously unexplored sites, and q, which measures the ratio of pheromone deposited on
returning with prey to that deposited while searching for prey. One might have expected to find a
higher “search premium” and a lesser “reward” for discovering prey when the prey distribution is
more dispersed. The difference between the optimal solutions to the two cases is just with respect to
the s-parameters: the amount of allowable pheromone per site is substantially greater when prey is
concentrated. Thus, ants searching for concentrated prey lay down denser pheromone trails than ants
searching for dispersed prey. In both cases, as one might expect, denser concentrations of pheromone
are tolerated by returning ants (who reinforce the direction they have discovered to prey) than by
searching ants.
The fact that the optimal models for both cases are so similar is suggestive. Assuming that the model is
“correct”, evolution seems to have found a way to build into both species a nearly optimal food-gathering strategy that is robust to prey distribution and to allow individual species to fine-tune that
strategy by adjusting a single kind of tolerance threshold, to match the distribution of prey they most
often hunt. That this fine-tuning should lead to the emergence of very different characteristic shapes
for the swarms in which the ants hunt is a surprising and beautiful example of a general “theorem” of
complexity theory: sharp boundaries in parameter space between qualitatively different kinds of
emergent patterns and structures.
Of course, Sole and his collaborators simulated the two optimal models they discovered. By now, you
should not be surprised to learn that the simulations of the optimal model for case 1 produced typical
columnar E. hamatum-like swarms, and the case 2 optimal model simulations resulted in a long and
deep swarm front, with ever-lengthening looping trails returning from that front, just like E. burchelli
swarms.
4. A model for information contagion
Arthur and Lane (1993) introduced a highly stylized model to study the phenomenon of information
contagion. In the model, a countable number of agents choose between two products, A and B. The
model focuses on the effect of private information obtained from previous adopters on agents'
choices, so other factors that might have an influence on the market allocation process, like price or
differences in needs and tastes between agents, are excluded. Here is a description of the model:
1) Agents are homogeneous.
2) The relative value of the two products to agents depends on the products’ respective
performance characteristics, real numbers cA and cB respectively. If agents knew the value of
these numbers, they would choose the product associated with the larger one.
3) Agents all have access to the same public information about the two products, which does not
change throughout the allocation process.
4) The allocation process begins with an initial seed of r A-adopters and s B-adopters.
5) Successive agents supplement the public information by information obtained from n randomly
sampled previous adopters. From each sampled informant, an agent learns two things: which
product the informant adopted and the value of that product's performance characteristic
perturbed by a standard normal observation error. (To be precise, agent j learns from informant i
the values of two random variables, X_ji and Y_ji: X_ji takes values in {A,B}, and Y_ji is a normal
random variable with mean c_{X_ji} and variance 1. Given {X_j1, …, X_jn}, the Y_j1, …, Y_jn are
independent, and the vectors Y_1, Y_2, … are independent given {X_1, X_2, …}.)
6) Agents translate their information into a choice between A and B by means of a decision rule D:
({A,B} × R)^n → {A,B}.
7) Here is a decision rule that corresponds (almost) to the economists’ and statisticians’ ideas of
“rationality”. Agents begin by converting their (shared) prior information about cA and cB into a
(shared) prior probability distribution: cA and cB are independent, with normal distributions.
Then, agents use Bayes’ Theorem to calculate posterior distributions for cA and cB, given their
private information (and the “right” probability model for this information). Finally, they
choose the product with maximum expected utility under the posterior distribution, where they
use the constant risk-aversion utility function u, parameterized by a non-negative number λ, with
the form u(x) = x for λ = 0 and u(x) = −exp(−2λx) otherwise. The parameter λ measures agents'
risk aversion. A tedious but straightforward calculation shows that agents with this decision rule
will choose whichever product i has the larger value of
   (n_i Ȳ_i + μ σ^(-2) − λ) / (n_i + σ^(-2)),
where n_i is the number of sampled informants who adopted product i, Ȳ_i is the mean of the
performance values they reported, and μ and σ² are the mean and variance of the agents' common
prior distribution.
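For readers who want the intermediate steps, here is a sketch of that calculation; it is my reconstruction, assuming each report carries observation variance 1 (as in the parenthetical remark in rule 5) and writing the posterior for product i's performance characteristic given the n_i reports with mean Ȳ_i:
\[
c_i \mid \text{reports} \sim N(m_i, v_i), \qquad
m_i = \frac{n_i \bar{Y}_i + \mu\,\sigma^{-2}}{n_i + \sigma^{-2}}, \qquad
v_i = \frac{1}{n_i + \sigma^{-2}} .
\]
Using the moment generating function of the normal distribution, the posterior expected utility of adopting product i is
\[
E\bigl[-e^{-2\lambda c_i}\bigr] = -\exp\bigl(-2\lambda m_i + 2\lambda^2 v_i\bigr),
\]
which is increasing in m_i − λ v_i; so the agent adopts the product with the larger value of
\[
m_i - \lambda v_i = \frac{n_i \bar{Y}_i + \mu\,\sigma^{-2} - \lambda}{n_i + \sigma^{-2}} .
\]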
In this model, agents interact directly with other agents, giving and receiving information about product
performance. The network structure is randomly determined, which is not very realistic, a point to
which I will return. An advantage of the simple structure of this model is that it can be solved
analytically: one can show that the proportion of agents adopting product A converges with probability
one, and one can determine the (discrete) support of the limiting distribution, which lies at the roots of
a certain polynomial of order n. Using this model, Arthur and Lane (1993) investigated the case in which the two
products were in fact equal in performance (that is, cA = cB), and the agents’ normal prior distributions
for cA and cB had identical means and variances. They showed that for some parameter values, this
individual-level symmetry between the products was preserved at the aggregate level, in the sense that
the only possible limiting market share for each product was 50% (independent of r and s, which may
perhaps be surprising). However, for other parameter values, the symmetry is broken, and either A or B
eventually takes essentially 100% market share. Which product wins is determined by chance
fluctuations in sampling and reporting early on, amplified by the informational positive feedback
described in Section 2. The only other possibility that can occur (and does occur, in a certain compact
set in parameter space, when n > 2) is that the support of the limiting distribution of A's market share has three points:
essentially 0%, 50%, or essentially 100%. That is, knowing the values of all the model parameters, one
cannot even determine a priori whether the market will end up equally shared out between the two
products, or one of the products will completely dominate the other.
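These limiting regimes are also easy to explore by direct simulation. Here is a minimal, illustrative Python sketch of the adoption process described in this section; the decision rule is the posterior-expected-utility rule written out above, and the parameter values in the example call are arbitrary choices of mine.

import random

def simulate_market(cA, cB, mu, sigma2, lam, n, r, s, T):
    """Simulate T adoptions in an Arthur-Lane-style information contagion model
    (illustrative sketch only).  Returns the final market share of product A."""
    adopters = ['A'] * r + ['B'] * s          # initial seed of adopters (rule 4)
    perf = {'A': cA, 'B': cB}
    for _ in range(T):
        # Rule 5: sample n previous adopters; learn each one's choice plus a
        # freshly perturbed performance value for the product it adopted.
        sampled = random.sample(adopters, min(n, len(adopters)))
        reports = {'A': [], 'B': []}
        for prod in sampled:
            reports[prod].append(perf[prod] + random.gauss(0, 1))
        # Rule 7: adopt the product with the larger posterior-expected-utility score.
        def score(prod):
            n_i = len(reports[prod])
            ybar = sum(reports[prod]) / n_i if n_i else 0.0
            return (n_i * ybar + mu / sigma2 - lam) / (n_i + 1.0 / sigma2)
        adopters.append('A' if score('A') >= score('B') else 'B')
    return adopters.count('A') / len(adopters)

# Example: identical products, risk-averse agents each sampling n = 5 previous adopters.
print(simulate_market(cA=0.0, cB=0.0, mu=0.0, sigma2=1.0, lam=1.0, n=5, r=5, s=5, T=5000))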
When one product is actually better than the other, the situation becomes considerably more
complicated. Generically, there are five possible outcomes in this case: 1) the superior product attains
essentially 100% market share; 2) the two products share the market, with the superior product having
the larger market share a (0.5 < a < 1); 3) one or the other product attains essentially 100% market
share – that is, the inferior product has positive probability of completely dominating the market (the
probability depends in a quite complicated way upon the difference in performance between the two
products, as well as on the initial distribution r and s); 4) the superior product attains either essentially
100% market share, or market share a, where 0.5 < a < 1; 5) one or the other product completely
dominates the market, or they share the market, with the market share for the superior product equal to
a, 0.5 < a < 1. Figure 2 shows the boundaries between these regimes as a function of λ and n, with fixed
values for the other parameters. Note how small is the region supporting only the most “reasonable”
aggregate level outcome, the one in which the population eventually “learns” which product is better
and adopts only that one.
Figure 2: Support of the distribution for the inferior product A (cA-cB = -0.1; prior distributions for cA
and cB have mean cB+2 and variance 1)
There are some surprising emergent features of this model. Here is one of them (see Lane, 1997, for
others). Certainly, at the individual level, more information is better. But this need not hold true at the
aggregate level. As Figure 3 shows, sometimes increasing the number of previous adopters each agent
samples can reduce the proportion of agents that adopt the better product!
Figure 3: Asymptotic market share of the superior product B (cA-cB = -0.1; prior distributions for cA
and cB have mean cB and variance 1)
As I pointed out at the outset, this model is highly stylized. It serves more as an “existence proof” for
phenomena related to information contagion than as a potential tool for data analysis and inference, in
contrast to the model for army ant swarms described in the previous section. Probably the most unrealistic aspect of the model is its random sampling of previous adopters. In fact, social
networks are highly structured, not totally random as in this model. At the other extreme from random
connections in networks are “totally ordered” networks – that is, lattices – also much used in complexity
models, and also quite an unrealistic representation of actual social networks. Recently, a class of
networks between random graphs and lattices, called small world networks, has been introduced and
applied to a variety of problems, including the spread of infectious disease in structured populations
(Watts, 1999). Already a large literature, analytical and empirical, is developing around these networks,
which have the lattice-like property that a large proportion of a given agent’s neighbors are also
neighbors of his neighbors, and the random graph-like property that any two agents in the network can
be joined by a quite small number of intermediary agents. It would be interesting to study the
information contagion phenomenon on small world networks.
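To make that last suggestion concrete, here is a minimal sketch (the function name and parameter values are my own) of how one might generate a Watts-style small-world network; the sampling step of the information contagion model could then draw an agent's informants from its neighbors rather than from the whole population.

import random

def small_world(n, k, p):
    """Watts-Strogatz-style small-world graph on n agents, as an adjacency dict of sets.
    Each agent starts linked to its k nearest neighbors on a ring; every lattice link is
    then rewired to a uniformly chosen agent with probability p."""
    neighbors = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(1, k // 2 + 1):            # regular ring lattice
            a, b = i, (i + j) % n
            neighbors[a].add(b)
            neighbors[b].add(a)
    for i in range(n):
        for j in list(neighbors[i]):
            if j > i and random.random() < p:     # consider each lattice edge once
                new = random.randrange(n)
                if new != i and new not in neighbors[i]:
                    neighbors[i].discard(j)
                    neighbors[j].discard(i)
                    neighbors[i].add(new)
                    neighbors[new].add(i)
    return neighbors

# Example: 1000 agents, 10 neighbors each, 5% of links rewired.
g = small_world(1000, 10, 0.05)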
5. Complexity models and statistical models: a brief comparison
There are several obvious respects in which the models I have just described are similar to statistical
models. In particular, both classes of models consist of parameterized families of probability models,
and both are designed to make inferences about real-world phenomena. But there are also several
respects in which they are quite different.
First, the observable objects in statistical models are, typically, variables; in complexity models, they
are entities, or agents, that interact with one another. The entity-orientation of complexity models
cannot be over-emphasized, as I will remind you in the next section when I talk about the application of
complexity ideas in statistical and data-analytic contexts.
Second, complexity models are designed to give insight into a particular class of phenomena:
emergence and self-organization. Thus, the objects of inferential interest are global-level phenomena
that typically require an entirely different language of description than that appropriate to the local-level entities whose interactions generate them. In statistics, the objects of inferential interest are either
observable variables, with exactly the same sort of ontological status as the observable variables that
are used to generate predictions about them, or parameters, that are generally meaningful only to the
extent they can be interpreted in terms of the observable variables to which they are coupled via
statistical models. Emergence, self-organization, even the distinction between global- and local-level
structures, are simply not meaningful concepts in the contexts in which virtually all statistical models
are employed.
Third, complexity models are inherently non-linear, and the feature that makes them so is amplifying
positive feedbacks. Most statistical models “start out” with a linear structure, and add nonlinear
features a step at a time when linear descriptions do not provide the required level of precision. Even
then, a typical approach to nonlinear analysis is by means of local linear approximations, which in most
complexity problems do not give useful results, even when they exist. Nor do amplifying positive feedbacks
play a role in standard statistical models.
The fourth and final difference I want to point out is closely related to the third. In general, what is
interesting in complex phenomena involves non-equilibrium, non-stationary dynamics. As John
Holland, the inventor of the genetic algorithm and one of the pioneers in complexity research, is fond
of pointing out, “real” complex systems settle down only when they die. So far, methods based upon
statistical models have not been notably successful when applied to problems involving non-equilibrium, non-stationary dynamics. However, complexity theorists have been showing in the past
ten or so years how many interesting problems in biology, chemistry, physics and the social sciences
involve just this kind of dynamics. Theories and methods of inference in this setting would help
greatly in addressing these problems.
6. How complexity theorists and statisticians might help each other
I conclude this talk with two proposals for extended collaboration between complexity theorists and
modelers and statisticians. The first proposal has to do with some ways in which statisticians might
help complexity theorists. Several years ago, science journalist John Horgan launched an attack on
complexity theory, calling it “ironic science”, and claiming that it lacked an essential credential for a
“real” science, confrontation with potentially disconfirming empirical evidence. Though Horgan
managed to sell a surprisingly large number of copies of his book, he was really only attacking a
straw man: in particular, there is no such thing as “complexity theory”, only complex phenomena and
complexity theorists and modelers who try to understand them. Still, his argument about the
remoteness of complexity theorists’ work from the grubby world of real data hit home: perhaps
complexity theorists were spending too much time gazing at the behavior of the models on their
computer screens and not enough looking at the phenomena they hoped to understand. In the past
several years, there has been a strong movement in the complexity community back to real data; the
work of the experimentalists, field observers and modelers studying social insect behavior together is a
good case in point, but there are many other such projects now, ranging in content from studies of the
formation of agricultural communities in the prehistoric American Southwest to paleontological work on
untangling the causes of mass extinction events.
As I have tried to suggest, though, data-analytic and inferential methods for studying emergent
phenomena are not easy to come by. Off-the-shelf statistical techniques are ill-suited for the task, and
measuring emergent structures and relating these measurements to suitable descriptions of
underlying local-level interaction events are very difficult problems. To date, most complexity
researchers are flying by the seat of their pants inferentially, and it is difficult to see just where they are
headed – and to recognize when they have arrived at their destinations. I think there is a real
opportunity for young statistically-trained scientists to make a positive and important contribution to
complexity research groups, similar to the work of Ripley and Titterington when they applied statistical
logic to help understand what neural nets were all about. And as with their work, I would guess that
the probability that such scientists could also make valuable contributions to statistical theory and
methodology from such collaboration is high.
Though I do not think there are very many opportunities for successful and important straightforward
application of existing statistical methods to complexity research, one possible exception is a problem I
already mentioned: how to design simulation experiments to determine which model parameters may
be safely neglected in an analysis of a particular emergent phenomenon. Agent-based modeling is
becoming a large industry, and some relatively standardized method for handling this particular
robustness problem might be not only useful but also much appreciated by the practitioners.
My second proposal has to do with two ways in which statistical research might benefit by importing a
few ideas and methods from complexity research. The first is about methods. Statistics has already
imported two relatively general-purpose tools from complexity research, neural nets and genetic
algorithms. (Going back a little further in time, it might be legitimate to count simulated annealing and
Monte Carlo methods as well; key elements of the latter arose in the proto-complexity environment of
Los Alamos Laboratories in the late 40's and early 50's.) Both of these are special instances of what is
called emergent or evolutionary
computation. As the names suggest, in these approaches to computations, solutions evolve or emerge;
a specific solution strategy need not pre-exist in the mind of a programmer. Typically, the techniques
are modeled upon some biological phenomenon in which “nature” or an organism is seen as solving
some particular problem: genetic algorithms are abstractions of natural selection operating on a
population of possible “solutions”, generating variability in the population through such operations as
cross-over and mutation; neural nets are inspired by collections of nerve cells in the brain self-organizing
patterns of response to environmental inputs. There are other approaches to emergent computation that
statisticians might also apply to problems like classification and pattern recognition.
Indeed, Bonabeau et al. (1999) offers many examples of what the authors call “swarm intelligence,”
algorithms based upon various behaviors of social insects, some of which are quite intriguing; for
example, letting artificial ants loose in a telecommunication network to develop efficient routing
algorithms based upon their artificial pheromone trails. Whether swarm intelligence will take off like
neural nets and genetic algorithms did is not at all clear at this time; but there is much research
underway in this and other methods of emergent computation, and statisticians should not be the last in
line to find out what functions these new ideas may be able to help fulfill.
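For readers who have not seen one, here is a bare-bones genetic-algorithm sketch in the spirit just described; the bit-string encoding, the "one-max" fitness function, and all parameter values are placeholder assumptions of mine.

import random

def genetic_algorithm(fitness, length=20, pop_size=50, generations=100, p_mutation=0.01):
    """Minimal genetic algorithm over bit strings: fitness-proportional selection,
    with variability generated by cross-over and mutation."""
    pop = [[random.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: parents drawn in proportion to their fitness.
        weights = [fitness(ind) for ind in pop]
        parents = random.choices(pop, weights=weights, k=pop_size)
        # Cross-over: each pair of parents splices at a random cut point.
        children = []
        for i in range(0, pop_size, 2):
            a, b = parents[i], parents[(i + 1) % pop_size]
            cut = random.randrange(1, length)
            children += [a[:cut] + b[cut:], b[:cut] + a[cut:]]
        # Mutation: flip each bit with small probability.
        pop = [[1 - bit if random.random() < p_mutation else bit for bit in child]
               for child in children[:pop_size]]
    return max(pop, key=fitness)

# Placeholder fitness ("one-max": count the ones).  In a statistical application the bits
# might encode, say, a feature subset and the fitness a cross-validated prediction score.
best = genetic_algorithm(fitness=sum)
print(best, sum(best))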
Finally, there are some ideas that have surfaced in the complexity research community that might
challenge statisticians who like to think about the foundations of statistics and the problem of
induction. One set of these ideas has to do with attempts to formalize just what constitutes
“complexity.” Some of these attempts resemble Kolmogorov’s idea of algorithmic complexity, which
he introduced in an effort to develop an intrinsic definition of randomness. But it turns out that there is
more than one dimension to complexity, and some of the others may also help illuminate some
underlying issues in statistics, for example the model selection problem. Zurek (1990) provides several
ways into an investigation of these concepts. One particular approach, developed by Jim Crutchfield
around the idea of “statistical complexity”, is leading to a new computation- and dynamic-theory based
approach to the analysis of time series. I have neither the time nor expertise to discuss this work here,
but I heartily recommend it to those statisticians who work in this area. A good introduction is Shalizi
and Crutchfield (1999).
References
Arthur B., Lane D. (1993), Information contagion, Structural Change and Economic Dynamics, 4, 81-104.
Bar-Yam Y. (1997), Dynamics of Complex Systems, Perseus, NY.
Bonabeau E., Dorigo M., Theraulaz G. (1999), Swarm Intelligence: From Natural to Artificial Systems, Oxford University Press, New York.
Bossomaier T., Green D. (2000), Complex Systems, Cambridge University Press, Cambridge UK.
Deneubourg J.-L., Goss S., Franks N.R., Pasteels J.M. (1989), The blind leading the blind: Modeling
chemically mediated army ant raid patterns, Journal of Insect Behavior, 2, 719-725.
Franks N.R., Gomez N., Goss S., Deneubourg J.-L. (1991), The blind leading the blind in army ant raid
patterns: Testing a model of self-organization, Journal of Insect Behavior, 5, 583-607.
Gell-Mann M. (1994), The Quark and the Jaguar: Adventures in the Simple and the Complex,
Freeman, NY.
Holland J.H. (1998), Emergence: From Chaos to Order, Addison-Wesley, Reading MA.
Holland J.H. (1995), Hidden Order: How Adaptation Builds Complexity, Addison-Wesley, Reading
MA.
Kauffman S. (1993), The Origins of Order: Self-Organization and Selection in Evolution, Oxford University Press, NY.
Lane D. (1997), Is what is good for each good for all?, in: The Economy as an Evolving Complex System II, Arthur W.B., Durlauf S., Lane D. (Eds.), Addison-Wesley.
Nicolis G., Prigogine I. (1989), Exploring Complexity: An Introduction, Freeman, NY.
Serra R., Zanarini G. (1990), Complex Systems and Cognitive Processes, Springer-Verlag, Berlin.
Shalizi C.R., Crutchfield J.P.(1999), Computational mechanics: Pattern and prediction, structure and
simplicity, Santa Fe Institute Working Paper.
Sole R.V., Bonabeau E., Delgado J., Fernandez P., Marin J. (1999), Pattern formation and optimization
in army ant raids, Santa Fe Institute Working Paper, submitted to Proceedings Royal Society of
London B.
Watts D.J. (1999), Small Worlds: The Dynamics of Networks between Order and Randomness,
Princeton University Press, Princeton NJ.
Weisbuch G. (1991), Complex Systems Dynamics, Addison-Wesley, Redwood City CA.
Zurek W. (1990), Complexity, Entropy and the Physics of Information, Addison-Wesley, Redwood
City CA.