In Search of the “World Formula”

Stuart Smith
stu@cs.uml.edu

INTRODUCTION

In a paper published in 1992 [1], French crystallographer and biochemist Gérard Langlet (1940-1996) claimed that (1) the natural processes studied in physics, biology, chemistry, and other sciences are computations, and (2) all such computations are ultimately based on a single, very simple algorithm that employs only boolean operations. Langlet named this algorithm “parity integration.”

At the time Langlet’s paper appeared, his first claim would not have been news. It is clearly recognizable as a modern version of Leibniz’ mathesis universalis, a universal science supported by a system of logical calculation; however, it was not until the publication of Stephen Wolfram’s A New Kind of Science [2] in 2002 that an easily accessible vocabulary and array of tools became available for assessing Langlet’s second claim. While Wolfram started from a position not unlike Langlet’s, he made clear through an abundance of examples that no one algorithm is the “world formula” and that, in fact, very simple programs of many different kinds can produce complex behavior analogous to that observed in nature.

For an algorithm to fulfill the role Langlet wanted parity integration to play, it would have to meet two criteria:

1. It would have to be universal, that is, given suitable initial conditions it must be able to emulate the behavior of any other algorithm; however, to be universal an algorithm need not be efficient or even computationally feasible.
2. Of all universal algorithms it would have to be computationally feasible and, in addition, satisfy some strong criterion of merit, such as being the most efficient, or the simplest, or the most compact.

It is shown here that while parity integration is feasible, as well as efficient, simple, and compact, it is not universal. This conclusion is based on Wolfram’s experiments with cellular automata and on more recent formal mathematical analyses by Chua et al.
[3,4].

SOME DEFINITIONS

Langlet developed a system of computation¹ that operates almost entirely in the boolean field, which consists of the values {0,1}, with exclusive-or (⊕) as "addition" and logical-and (∧) as "multiplication." Exclusive-or is extended element-wise to operate on pairs of vectors of the same length, that is:

\[ \mathrm{xor}(X, Y) = (x_1 \oplus y_1,\ x_2 \oplus y_2,\ \ldots,\ x_n \oplus y_n) \tag{1} \]

¹ The development of this system in [1] is somewhat sketchy. The complete development, with significant additions, can be found in [5].

Extended exclusive-or and two additional operations together constitute what Zaus [5] calls “generalized” exclusive-or. The additional operations are binary vector integral (bvi), which is defined as

\[ \mathrm{bvi}(X) = (x_1,\ x_1 \oplus x_2,\ \ldots,\ x_1 \oplus x_2 \oplus \cdots \oplus x_n) \tag{2} \]

and binary scalar integral (bsi), which is defined as

\[ \mathrm{bsi}(X) = x_1 \oplus x_2 \oplus \cdots \oplus x_n \tag{3} \]

As can be seen, the final term of the bvi of a given vector is the bsi of that vector. bvi implements parity integration, which is the core of Langlet’s system. All three generalized exclusive-or operations can also be performed with addition modulo 2.

A fourth fundamental operation, binary vector differential (bvd), is the inverse of bvi:

\[ \mathrm{bvd}(\mathrm{bvi}(X)) = X \tag{4} \]

It is worth noting that bvi is the same process used for graycode-to-binary conversion, while bvd is the same process used for its inverse, binary-to-graycode conversion. bvd is defined as

\[ \mathrm{bvd}(X) = \mathrm{xor}(\mathrm{shiftr}(X),\ X) \tag{5} \]

where shiftr shifts its vector argument one position to the right (the shift is end-off, with zero fill on the left).
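The four operations defined in equations 1-5 can be sketched in Python as follows. This is only an illustrative sketch, not Langlet's or Zaus's own code: it assumes bit vectors are represented as Python lists of 0s and 1s, and the helper name `bit_xor` is introduced here for convenience.

```python
from functools import reduce
from itertools import accumulate
from operator import xor as bit_xor  # scalar exclusive-or on ints


def xor(x, y):
    """Element-wise exclusive-or of two equal-length bit vectors (eq. 1)."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return [a ^ b for a, b in zip(x, y)]


def bvi(x):
    """Binary vector integral: cumulative parity, i.e. parity integration (eq. 2)."""
    return list(accumulate(x, bit_xor))


def bsi(x):
    """Binary scalar integral: parity of the whole vector (eq. 3)."""
    return reduce(bit_xor, x, 0)


def shiftr(x):
    """End-off shift one position to the right, zero fill on the left."""
    return ([0] + list(x))[:len(x)]


def bvd(x):
    """Binary vector differential, the inverse of bvi (eq. 5)."""
    return xor(shiftr(x), x)


x = [1, 0, 1, 1, 0, 1, 0, 0]
print(bvi(x))            # [1, 1, 0, 1, 1, 0, 0, 0]
print(bsi(x))            # 0, the same as the last element of bvi(x)
print(bvd(bvi(x)) == x)  # True, as stated in eq. 4
```

Applying `bvd` to an integer's bits corresponds to the binary-to-graycode conversion mentioned above, and `bvi` to the reverse conversion.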
The following notations have been used to characterize cellular automata [6]:

$i$ : the position of an individual cell in the one-dimensional array of cells
$t$ : the time step
$q_i(t)$ : the output state of the ith cell at the tth time step
$q_i(t+1)$ : the output of the ith cell at the (t+1)th time step

The next-state (transition) function for “Rule 60” of Wolfram’s elementary one-dimensional cellular automaton is:

\[ q_i(t+1) = q_{i-1}(t) \oplus q_i(t) \tag{6} \]

Thus, if the state of the automaton is represented by the one-dimensional boolean array q, the next state of each cell will be the exclusive-or of the current state of the cell and the current state of its left-hand neighbor. In Wolfram’s work, the left neighbor of the leftmost cell is the rightmost cell, and the right neighbor of the rightmost cell is the leftmost cell, that is, the boundary conditions are periodic (i.e., the array wraps around).

bvi, bvd, and Rule 60

In brief, the main argument of this paper is that Langlet’s parity integration cannot be the universal algorithm underlying all other algorithms because it produces an exact mirror image of the output of Wolfram’s elementary one-dimensional cellular automaton with Rule 60, which has been shown not to be universal [7].

The basic pattern for both Rule 60 and bvd is produced when the initial conditions are a single 1 followed by all zeroes, and the output of each step is used as the input to the next. Fig. 1, left, shows sixteen steps of these operations. Fig. 1, right, shows the mirror image produced by sixteen steps of bvi. The last row of the bvd case is used as the initial conditions, and the output of each step as the input to the next.

Fig. 1 Basic Rule 60 pattern (left) and basic bvi pattern (right)

As stated above, Rule 60 computes the next state of a cell as the exclusive-or of its current state and the current state of its left-hand neighbor.
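The Rule 60 update of equation 6 can be sketched in a few lines of Python. This is an illustrative sketch rather than code from any of the cited works; the names `rule60_step` and `evolve` are hypothetical, and the array is assumed to have a fixed width with the periodic boundary conditions described above.

```python
def rule60_step(q):
    """One synchronous Rule 60 update with periodic boundaries (eq. 6)."""
    # For i == 0, q[i - 1] is q[-1], so the rightmost cell acts as the
    # left neighbor of the leftmost cell (the array wraps around).
    return [q[i - 1] ^ q[i] for i in range(len(q))]


def evolve(q, steps):
    """Return the initial row followed by `steps` successive Rule 60 rows."""
    rows = [list(q)]
    for _ in range(steps):
        rows.append(rule60_step(rows[-1]))
    return rows


# A single 1 followed by zeroes produces the basic nested pattern.
for row in evolve([1] + [0] * 7, 7):
    print("".join("#" if cell else "." for cell in row))
```

With a width of 8, the wraparound does not affect the first seven steps, so these rows show the same basic pattern discussed in the text.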
bvd computes its result by performing a bitwise exclusive-or between a given vector and the same vector shifted one position right. For example, with the vector [1 0 0 0 0 0 0 0], the computation is:

    1 0 0 0 0 0 0 0
    0 1 0 0 0 0 0 0 0   ← bvd discards this bit to keep its result the same length as the initial conditions
    ---------------
    1 1 0 0 0 0 0 0

Wolfram’s cellular automata operate in an indefinitely large cell space with periodic boundary conditions, while Langlet’s operations generally create or modify arbitrarily large square matrices, as shown in fig. 1. This difference can be accommodated if, instead of discarding the rightmost bit of its result, bvd is allowed to grow one more column of cells to the right on each step. The image of the evolution of Wolfram’s cellular automaton with Rule 60 effectively grows one column of cells to the right on each step.

Because the basic pattern obtained with bvi is the mirror image of the basic pattern obtained with bvd, any feature of one can be immediately located in the other. Such a feature might be the state of a particular cell, an entire row or column, or a two-dimensional region of cells. For example, row i of the n×n result of n steps of bvd will be found as row n-i-1 in the mirror image produced by bvi, and vice versa. Column indices are identical in both results, so no additional computation is required to access a particular cell within a row. We can conclude therefore that the basic patterns of bvd/Rule 60 and bvi contain the same information.

The next step is to determine the behavior of bvd/Rule 60 and bvi given arbitrary initial conditions. Wolfram made an intensive investigation of all 256 rules for his elementary cellular automaton and identified two attributes that are significant for the non-universality of parity integration: computational irreducibility and additivity.
Each of these is discussed next.²

COMPUTATIONAL IRREDUCIBILITY

Wolfram uses the term “computational irreducibility” to describe the fact that for many cellular automata, there is no shorter way to compute state n of the automaton than to compute all n-1 states from the initial conditions to state n [8]. But for other cellular automata there are shortcuts that allow any given state to be computed in much less time. This is the case for Rule 60. The basic pattern for Rule 60 resembles the nested, triangles-within-triangles pattern of the Sierpinski Triangle, as shown in fig. 1. It has long been known that Pascal’s Triangle modulo 2 is precisely the same pattern of 1’s and 0’s as the Sierpinski Triangle. Because row n of Pascal’s Triangle can be computed directly as binomial(n,k), where k=0,1,...,n, it is necessary to compute only the values in row n to obtain row n of the Sierpinski Triangle. These n+1 values are then padded with zeroes on the right, out to the width of the array, to complete row n of the cellular automaton’s output. Thus there is a shortcut for computing any cell or row of the pattern.

The same conclusion can be drawn from Wolfram’s discovery that the state of an individual cell in a nested pattern can be computed directly from the base-2 digit sequences of the x and y coordinates of the cell. In the case of the Sierpinski Triangle pattern the state of a given cell is determined as follows: “if any digit in the y coordinate of a particular cell is 0 when the corresponding digit in the x coordinate is 1 then the square is white; otherwise it is black.” [9] Because Rule 60 has the “additive” property (to be discussed next), it turns out that given any initial conditions, the state of a given cell after any number of steps can be determined from its x and y coordinates alone.
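Both shortcuts can be checked against step-by-step iteration with a short Python sketch. It is offered only as an illustration under stated assumptions: the pattern starts from a single 1 in column 0, the x coordinate is taken to be the column index and the y coordinate the row (step) number, both counted from 0, and the function names are hypothetical.

```python
from math import comb


def cell_by_binomial(t, k):
    """State of column k in row t of the basic pattern: binomial(t, k) mod 2."""
    return comb(t, k) % 2  # comb returns 0 when k > t, which gives the zero padding


def cell_by_bits(t, k):
    """Same state from the binary digits of the coordinates alone.

    The cell is 0 (white) if k has a 1 digit where t has a 0 digit,
    and 1 (black) otherwise.
    """
    return 1 if (k & ~t) == 0 else 0


def row_by_iteration(t, width):
    """Reference result: t steps of xor-with-left-neighbor from a single 1,
    with zero fill on the left and truncation to `width` cells."""
    row = [1] + [0] * (width - 1)
    for _ in range(t):
        row = [row[0]] + [row[i - 1] ^ row[i] for i in range(1, width)]
    return row


width = 16
for t in range(width):
    direct = [cell_by_bits(t, k) for k in range(width)]
    assert direct == [cell_by_binomial(t, k) for k in range(width)]
    assert direct == row_by_iteration(t, width)
print("all three methods agree for the first", width, "rows")
```

The iterative version needs t passes over the row, whereas the two direct versions compute any row, or any single cell, without visiting the earlier rows.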
Furthermore, because the patterns produced by bvi are simply the mirror images of Rule 60 patterns, the state of any cell in a sequence of rows produced by parity integration can be determined similarly.

ADDITIVITY

Rule 60 is one of eight rules that possess the additive property. For any initial conditions, the patterns produced by an additive rule are always “simple superpositions of the basic nested pattern obtained by starting with a single black cell.” [10] No matter how complex the output of the automaton might appear when using an additive rule, the array of cells produced corresponds to the superposition of some number of shifted copies of the basic pattern. The number of positions to shift each copy is the index of the corresponding 1 in the vector of initial conditions. Thus, for example, if the initial conditions are [1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0], the result for Rule 60 will be the superposition of the un-shifted original and a copy shifted six positions to the right, as shown in fig. 2.

Fig. 2 Superposition of two basic Rule 60 patterns

² Chua et al. [3] actually do much more than investigate the capabilities of Wolfram’s elementary cellular automaton as a generator of interesting patterns of 1’s and 0’s. They treat the cellular automaton as a dynamical system whose behavior under the different rules can be described by appropriate ordinary differential equations, which they provide for all 256 rules. This has led to some real-world applications, such as the design of Cellular Neural Network chips.

Because superposition is accomplished with exclusive-or, wherever two or more 1’s overlap the result is their modulo 2 sum. As can be seen in fig. 2, the two Sierpinski patterns are visible for the first few steps, but as the un-shifted pattern begins to spread out to the right it starts interacting with the shifted copy. As a result the individual patterns become more difficult to discern.
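The superposition property can be checked numerically with the following Python sketch. It is an illustration only: it uses the fixed-width, zero-fill form of the update (as in bvd) rather than periodic boundaries, truncates shifted copies to the width of the initial conditions, and all function names are hypothetical.

```python
def rule60_rows(init, steps):
    """Rows 0..steps of xor-with-left-neighbor, with zero fill on the left
    and the result kept at the width of the initial conditions."""
    rows = [list(init)]
    for _ in range(steps):
        prev = rows[-1]
        rows.append([prev[0]] + [prev[i - 1] ^ prev[i] for i in range(1, len(prev))])
    return rows


def shift_right(row, s):
    """Shift a row s positions to the right, truncating to the same width."""
    return ([0] * s + list(row))[:len(row)]


def superpose(init, steps):
    """Exclusive-or together one copy of the basic pattern per 1 in `init`,
    each copy shifted right by the index of that 1."""
    width = len(init)
    basic = rule60_rows([1] + [0] * (width - 1), steps)
    out = [[0] * width for _ in range(steps + 1)]
    for s, bit in enumerate(init):
        if bit:
            for t in range(steps + 1):
                shifted = shift_right(basic[t], s)
                out[t] = [a ^ b for a, b in zip(out[t], shifted)]
    return out


# The initial conditions used in the text: 1s at indices 0 and 6.
init = [1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
print(superpose(init, 16) == rule60_rows(init, 16))  # True
```

The comparison succeeds because the update is linear over exclusive-or and commutes with right shifts, which is the content of the additive property.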
mathworld.wolfram.com/Rule60.html provides an animated demonstration of Rule 60’s additivity.

Because additivity is unaffected by mirror imaging, superposed copies of the basic pattern for parity integration (fig. 3) form the mirror image of fig. 2, as shown in fig. 4.³

Fig. 3 Basic pattern for parity integration

If the vector [1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0] is again used to determine the placement of the shifted copies, the result of superposition is shown in fig. 4.

Fig. 4 Superposition of two basic parity integration patterns

As can be seen, fig. 4 is the mirror image of fig. 2. As a consequence of additivity the state of any cell in any row can be determined simply by computing the whole row for the basic pattern (either from binomial or from the x/y coordinates, as described above) and then adding to it appropriately shifted copies of that row. As was shown in figs. 2 and 4, the shifted copies are truncated to the length of the basic pattern in order to retain the square matrix shape found throughout Langlet’s writings.⁴ No information is lost in this process because every state of the system all the way back to the initial conditions can be exactly recreated with the inverse of the function used to create the image, whether the results have been truncated or not.

This brings us to Wolfram’s conclusion concerning cellular automata with additive rules: “With simple initial conditions, [additive rules] always yield very regular nested patterns. But with more complicated initial conditions, they produce more complicated patterns of behavior; however, these patterns never in fact really correspond to more than rather simple transformations of the initial conditions. Indeed, after 1,048,576 steps—or any number of steps that is a power of two—the array of cells produced always turns out to correspond just to a simple superposition of two or three shifted copies of the initial conditions.
And since there are many kinds of behavior that do not return to such predictable forms after any limited number of steps, one must conclude that additive rules cannot be universal.” [11]

Is there a rule for the elementary cellular automaton that is universal? The answer is yes: Rule 110. Rule 110 has been proven universal by several investigators [12, 13, 14, 15]; however, none has claimed that Rule 110 is the “world formula” that Langlet had sought. What about Rule 60? It is not claimed here that it cannot be used for some computational purposes merely because it is not universal. In fact, [6] suggests applications for all of the additive cellular automata.

³ Langlet’s convention is that the generating vector [1 0 0 ... 0] creates the first row of the basic bvi pattern but does not itself appear until n steps have been computed, where n is the length of the generating vector. Thus, the first row is already the result of one step of parity integration rather than the initial conditions. This convention makes parity integration consistent with other important ways of creating the basic bvi pattern, such as Kronecker product and block replacement. (See the Appendix.)

⁴ Alternatively, each copy could be padded with the appropriate number of trailing zeroes to maintain a uniform length for all the vectors.

LIMITATIONS OF PARITY ALGORITHMS

Throughout his published writings, Langlet focused on the role played by parity in the natural processes studied in physics, chemistry, biology, and other sciences. The three parity-based operations that constitute generalized exclusive-or (equations 1-3 above) are the foundation of his system: bsi gives the parity of a boolean vector, bvi gives the cumulative parity of a boolean vector, and xor gives the element-wise parities of two boolean vectors. Leslie Valiant [16] has offered an argument challenging the idea that parity algorithms could play a role in evolution.
This argument potentially undermines Langlet’s entire enterprise: if we take seriously Daniel Dennett’s claim⁵ that natural selection is a substrate-neutral, mindless algorithm for moving through a design space, then it leaks out of biology into the universe at large and becomes a basic part of the explanation of why anything is the way it is. The “world formula” would then be a mathematical formalization of Darwinian evolution.

Valiant’s argument concerns the mechanisms that create different variants of an organism, which can be selected for survival, or not, depending on their fitness in a given environment. How can variants of some existing configuration of proteins in a genome be created such that they will converge to some new, fitness-enhancing configuration under the pressure of Darwinian selection? For the sake of simplicity, Valiant represents a configuration of proteins as a set of boolean variables. Thus, each protein is either present or absent (he does later use real-valued variables, representing the concentration of each protein, which, however, does not change the conclusions of his argument). Under this representation, a subset of proteins can be represented as a disjunction of variables. For example, a particular fitness-conferring subset of configuration X might consist of x3 or x17 or x25. If any one of the variables is true, the entire subset is true. Variants of an existing disjunction can be produced by swapping an existing variable for a new one, deleting a variable, or adding one. It can be shown that this way of generating new variants will produce convergence within the time available for evolution; however, this will not be the case if the variables are related by some parity function. The reason is that adding or deleting an individual variable in an existing subset will change the subset’s parity from odd to even, that is, from 1 to 0 or “true” to “false.” A parity function must have an odd number of ones to be true.
It can be shown that there is no parity algorithm for generating variants that is significantly faster than exhaustively generating and testing all possible combinations of variables.⁶ For example, with more than 20,000 proteins encoded in the human genome, the search space for a combination of proteins that produces some specific performance advantage could be larger than $2^{20{,}000}$ distinct combinations of boolean variables. There is no possibility of exhaustively generating and testing this enormous number of combinations. But for evolution to work, it must be possible to converge on a performance-enhancing combination much more efficiently. Valiant therefore concludes that variants cannot be produced by a parity algorithm; however, he does not regard this claim to be proven. He rather notes that “any evidence in biology that a parity function on a large subset of variables has been discovered in the course of evolution on Earth would contradict the validity of our model here. However, there seems to be no such evidence, an absence that is consistent and would be predicted by our theory.” [17]

CONCLUSIONS

The argument presented here concludes that parity integration cannot be the foundation of a “world formula” because, in Wolfram’s terminology, it lacks “intrinsic” randomness, that is, randomness that arises directly as a result of the computational process itself. As was shown above, it is possible to compute step n of a sequence of parity integration steps from just n itself and the initial conditions. Also, because parity integration is completely reversible, the state of any step previous to step n all the way back to the initial conditions can be recovered. Thus, whatever randomness is observed in a sequence of steps of parity integration is contained in the initial conditions. Consequently, parity integration is limited in the types of structures and patterns it can produce; hence it is not universal.
Rather than the desired universality, parity integration exhibits Wolfram’s “Class 3” behavior: many cells change state on each step, which inhibits the formation of persistent interacting structures that could be harnessed to enable useful computation. This can be seen in the overall visual character of the images generated by some number of steps of parity integration with arbitrary initial conditions: a relatively uniform texture that remains the same regardless of the number of steps generated.

Although [1] and [5] both make much of the fact that parity integration produces 1/f (i.e., “pink”) noise, which occurs in many physical, biological, and economic systems, neither of these papers demonstrates that parity integration is, in this respect, anything more than another process that generates this ubiquitous kind of randomness. It is not shown that parity integration is somehow the foundation of all instances of pink noise.

⁵ Darwin’s Dangerous Idea. New York: Simon and Schuster (1995).

⁶ This analysis is from the work of V. Feldman, e.g., “Robustness of Evolvability,” Proceedings of the 22nd Annual Conference on Learning Theory, Montreal, Quebec, Canada (2009).

Viewed in the light of Wolfram’s work and the work of others who have built on it, searching for a single foundational algorithm is pursuing the wrong goal. Given the richness of the field of cellular automata, a better goal would be to identify as many potentially universal, and useful non-universal, systems as possible and to study their operation.⁷

Author’s Note

Within the community of APL implementers, application developers, and users, Langlet was regarded with almost the same respect and admiration as the creator of APL, Ken Iverson. I first became acquainted with Langlet’s work when I attended his talk at the APL ’93 Conference in Toronto, where he presented a follow-on to his major paper of the preceding year.
Like many of his friends, colleagues, and students I was intrigued by the freshness of his thought and his iconoclasm. In the years following, I studied his writings and tried to find applications for his ideas. In particular, Langlet (and later, Michael Zaus) suggested that the two transforms that are part of his system would be useful in signal processing. Since I was involved for many years in sound recording and electroacoustic music, I decided to explore possible applications in audio technology. This effort was quite frustrating and, as I found out after his passing, Langlet himself hadn’t been able to make the connection he sought between his transforms and the FFT. The transforms never produced anything useful in my own practical work. Any problem I could come up with either already had a good solution using tried-and-true methods or it did not yield to an approach based on Langlet’s transforms. It became increasingly clear that his computational system is a solution in search of a problem. So, although I didn’t start out to write criticisms of his work, that’s where I’ve ended up.

It’s now clear, after the publication of A New Kind of Science, that Langlet’s work has been completely superseded by Wolfram’s and that of the many scientists and scholars who have built on it. While Langlet and Wolfram shared some important new ideas about computing in the sciences, Wolfram considered computation over a much wider range of applications and with a much deeper knowledge of computer science. Although some reviewers of A New Kind of Science have criticized Wolfram’s “experimental” mathematical methods, the book has nonetheless been influential in many fields and it remains a force to be reckoned with.

Appendix

The fundamental unit of Langlet’s system is the following 2×2 matrix, which Langlet calls G, for “geniton”:

\[ G = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} \]

The first row is generated by applying parity integration to the fundamental vector [1 0], which reappears as the last row of the matrix.
G is the basic pattern for parity integration. A key characteristic of parity integration is that it is periodic: given a length-n boolean vector of initial conditions (n a power of 2), the same sequence of rows appears every n steps. To obtain the next higher order G, compute the Kronecker product

\[ G_{2n} = G \otimes G_n \]

where n is the order of the matrix to be enlarged to the next higher order. Each time the product is computed, the length of the side is doubled and the number of cells is quadrupled.

The block replacement method is similar. Beginning with the 2×2 G matrix, replace each cell containing a 1 with a copy of G, and replace each cell containing a 0 with a 2×2 all-zeroes matrix. As with the Kronecker product method, this method doubles the length of the side and quadruples the number of cells each time it is applied. For example, if we compute either $G \otimes G$ or one round of block replacement on G, the result is

\[ \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix} \]

This example shows that G is a self-similar, fractal pattern. Specifically, it is a variety of L-system. The G pattern cannot be directly produced by any of the 256 rules of Wolfram’s elementary cellular automata. The reason is that the cellular automaton updates all cells in parallel while parity integration requires cells to be updated one at a time from left to right (see equation 2).

⁷ Langlet’s system can be explored with the Langlet Toolkit [18], which requires Matlab. Wolfram’s cellular automata (and several other systems) can be explored with NKS Explorer and, of course, Mathematica.

References

[1] Gérard Langlet. Towards the Ultimate APL-TOE. ACM SIGAPL Quote Quad, 23(1) (July 1992)
[2] Stephen Wolfram. A New Kind of Science. Champaign, IL: Wolfram Media (2002)
[3] Leon O. Chua, Sook Yoon, and Radu Dogaru. A nonlinear dynamics perspective of Wolfram’s new kind of science, Part I. Threshold of Complexity. International Journal of Bifurcation and Chaos, 12(12), 2655–2766 (2002)
[4] Leon O. Chua, V. I. Sbitnev, and Sook Yoon.
A nonlinear dynamics perspective of Wolfram’s new kind of science, Part II: Universal Neuron. International Journal of Bifurcation and Chaos, 13(9), 2377–2491 (2003)
[5] Michael Zaus. Crisp and Soft Computing with Hypercubical Calculus. Heidelberg: Physica-Verlag (1999)
[6] P. P. Chaudhuri, D. R. Chowdhury, Sukumar Nandi, and Santanu Chattopadhyay. Additive Cellular Automata. Los Alamitos, CA: IEEE Computer Society Press (1997)
[7] Wolfram, pp. 695-696.
[8] Wolfram, pp. 737-750.
[9] Wolfram, p. 608.
[10] Wolfram, p. 264.
[11] Wolfram, pp. 695-696.
[12] Chua et al., 2003.
[13] M. Cook. Universality in Cellular Automata. Complex Systems, 15(1), 1-40 (2004)
[14] M. Cook. A Concrete View of Rule 110 Computation. In: T. Neary, D. Woods, A. K. Seda, and N. Murphy, eds. The Complexity of Simple Programs, pp. 31-55. Elsevier (2011)
[15] H. V. McIntosh. Rule 110 is Universal! http://delta.cs.cinvestav.mx/~mcintosh/comun/texlet/texlet.html
[16] L. Valiant. Probably Approximately Correct. New York: Basic Books (2013)
[17] Valiant, p. 107.
[18] Stuart Smith. Langlet Toolkit. www.cs.uml.edu/~stu