

Darwin's Snowflake

(A blog I wrote for Andrea Borghini in 2006)

What is life? The Von Neumann machine is a persuasive model, but examples are given of things that behave like Von Neumann machines despite being obviously not alive. Crystals are one such example; it would be good to have an abstract model of bio-molecules in general that was as explanatory as our abstract models of crystals are. It is proposed here that a class of mathematical objects called Wang tilings can play that role, and some of the implications of this model are developed through a comparison between the structures of a crystal of water ice and a folded RNA molecule.

It’s hard for a curious person to look at a blade of grass, or a baby, or a bacterium, and not be astonished by the very fact of its existence. Why should there be any such wildly improbable phenomenon? Why isn’t the green, changing Earth more like the pale static Moon, where things just sit there stably being ‘things’? What is it about the way nature is set up that made this strange and complicated class of objects, all of them somehow ‘life’, burst forth spontaneously from the sublunary muck? And just what exactly is life?

For the last four or five decades, the answer almost any serious biologist would have given to this question is John Von Neumann’s: a living thing is essentially a naturally arising self-replicating machine. Von Neumann himself only intended this simple formulation as a starting point for an inquiry into more subtle aspects of the phenomenon (for example, is DNA or some other similar memory molecule really necessary, or could there be living things with no genome at all?), but his untimely death meant that the world got his biological work in an unfinished form, and what is remembered now is the snappy and persuasive formulation he started it with.

The temptation to take this influential starting point, on its own, as an adequate final theory of what life is, is reflected in the definition NASA officially uses to guide its search for life on other planets: life is defined as ‘any self-replicating chemical system that is capable of Darwinian evolution.’ This definition represents a rather common theoretical point of view. But it doesn’t really tell us what we need to know. It’s true that the ability to evolve in a Darwinian manner is a universal feature of life, but we really want the definition to tell us why some self-replicating systems can evolve in this way, and some can’t. It’s quite easy to find non-living things that replicate themselves, but don’t evolve. The question is what is different about the ones that do. For what about fire? Does not a forest fire leap from tree to tree, replicating itself wherever it finds suitable conditions, and is not a flame a ‘chemical system’? And in precisely what sense is a fire not also a ‘naturally-arising self-replicating machine’? If a hurricane or a bacterial flagellum should count as ‘natural motors’ then surely a flame is also a sort of mechanism, which also turns free energy into heat in a way that performs work (for example, the transport of hot air and sparks that is such an important step in its self-replication). The lattice structure of a salt crystal or a crystal of water ice will also replicate itself indefinitely under the right conditions, and many stars die in ways that ultimately produce more stars. Are these things then all examples of life? No, because they don’t evolve in a Darwinian manner - but we still don’t know why. What qualities do living things all have that ice and fire and stars all lack, and why? What exactly is missing from the standard approach typified by the NASA definition?

Von Neumann’s story is intuitively appealing because of the apparent simplicity and directness of its connection to the Darwinian model of evolution. Since the defining characteristic of a Von Neumann machine is that it is ‘self-replicating’, the machines that replicate themselves most efficiently can apparently be easily identified with the Darwinian ‘fittest’. If all evolution is just survival of the fittest replicators from a randomly varying collection of them, and a replicator is just a certain type of machine, then apparently we can do without any distinct conception of ‘what life is’ that is any more demanding than the one Von Neumann has given us: life is any naturally arising self-replicating mechanism that occasionally makes small errors in self-replication.

But doesn’t this description also apply to the crystalline lattice of a growing snowflake, which grows into peculiar shapes because of its unique small imperfections? It might be helpful if we could compare the biological equivalent of a snowflake to its icy counterpart, in order to determine just exactly what the basic difference between them is.

But what is that equivalent? It could be argued that whole cells are the minimal repeating unit of life, just as the minimal crystalline ‘cell’ is in snowflakes. But for our present purposes, it is more revealing to look at life on a finer scale. It is instructive to compare the fine structure of a conventional crystal like salt or ice with the structural arrangement of that original and still central biological item, a folded RNA molecule.

RNA is a good molecule to analyze if we want to know what living things are fundamentally all about, both because it combines in a single object the two key functions of information storage and mechanical activity that are otherwise carried out separately by DNA and proteins, and because it is widely believed to be the oldest surviving, and perhaps even the original, type of biomolecule. For life to get going, its original molecules had, at a minimum, to both act as copying devices, and serve as templates to be copied. An RNA molecule can do the latter, in its unfolded form, because each of its two complementary strands completely specifies the other, but its ability to fold up into any arbitrary shape depending on its sequence means that it can also be set up to act, when folded, as a copying mechanism, which can replicate itself using another, unfolded copy of itself as template. (The idea that a string can be folded up in a way that makes it act as a mechanism may seem like an unfamiliar one to non-biologists, but everyone knows about knots, which are the same phenomenon on a macroscopic scale – a noose is nothing but a machine, made from folded string, that converts pulling into squeezing. On a quantum scale statistical forces like friction become less important, and exact gear-like or key-like or hinge-like interactions between molecular surfaces on the basis of things like shape and charge become dominant, making the mechanical possibilities of an RNA molecule much more complicated than those of a knotted string, but the principle is the same.)

How, then, does a folded RNA compare with a snowflake? Classical crystals such as salt are characterized by a deterministic repeating micro-structure, in which the same collection of atoms is packed together in the same way over and over. Water ice makes somewhat more complex crystals, because water molecules are tetrahedral. Equilateral (or nearly-equilateral) tetrahedra will not pack together periodically in a way that fills space completely, and the presence of electrons at some points of tetrahedral water molecules, and protons (the nuclei of hydrogen atoms) at others actually can cause these molecules to aggregate together point to point, so water ice can have a variety of crystalline structures. (A snowflake is the macroscopic certificate of this richness of microscopic possibilities.) But all these structures are, as far as we can tell, roughly periodic on some scale.

We do not know of any single shape that can only be packed into space in a non-periodic way; but, as people like Roger Penrose and Hao Wang discovered in the 1960s and 1970s, there are plenty of small collections of two- and three-dimensional shapes that can only tile space aperiodically. Here, already, is one dramatic difference between a crystal of water ice and an RNA molecule; both are largely made of roughly tetrahedral units - carbon, nitrogen, and oxygen atoms with sp3-hybridized orbitals, in the case of RNA – but water is made of only one type of tetrahedron in isolation, while the RNA molecule is made out of several types, glued together into four distinct complex aggregates in combination with a few other simple shapes. While water ice has several possible periodic microstructures, folded RNA molecules apparently have an infinity of nonperiodic space-filling structures available to them; so it would actually be rather unsurprising if the distinction between periodic tilings or crystals and aperiodic ones coincided in some way with the distinction between the abiological and the biological, as Erwin Schrödinger suggested six decades ago.

Internally, a folded RNA molecule has no real repeating structure at all, despite being made from a small set of standard units. It is assembled as a single or double string of smaller molecules, strung together, in various sequences, into a polymer by complex protein machinery, on the basis of instructions encoded on a template molecule, either another RNA or a molecule of DNA. Its three-dimensional structure is a consequence of exactly what these sequences are. Complementary subsequences tend to stick together. Thus, depending on its precise sequence, an RNA molecule can self-assemble into more or less any three-dimensional shape at all.
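As a rough illustration of how sequence alone can decide which parts of a strand stick together, the sketch below scans an RNA string for short stretches whose reverse complement appears further along the same strand, which is the precondition for a hairpin stem. It is a toy under stated assumptions: only the standard A-U and G-C pairings are considered, and the names `find_stems`, `stem_len` and `min_loop` are hypothetical, invented for this example. Real folding depends on thermodynamics that this ignores entirely.

```python
# Assumption: only canonical Watson-Crick pairs (A-U, G-C) are modelled.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the strand that would pair with `seq`, read in the same direction."""
    return "".join(PAIR[base] for base in reversed(seq))


def find_stems(seq: str, stem_len: int = 4, min_loop: int = 3):
    """List (i, j) index pairs where seq[i:i+stem_len] could pair with
    seq[j:j+stem_len], leaving at least `min_loop` unpaired bases between them."""
    stems = []
    for i in range(len(seq) - stem_len + 1):
        target = reverse_complement(seq[i:i + stem_len])
        # Only look downstream, past the minimum loop length.
        j = seq.find(target, i + stem_len + min_loop)
        while j != -1:
            stems.append((i, j))
            j = seq.find(target, j + 1)
    return stems


# A hypothetical 12-base strand: GGGC ... GCCC can fold back on itself.
hairpin = "GGGCAAAAGCCC"
stems = find_stems(hairpin)
```

For the example strand, the only candidate stem pairs the first four bases with the last four, leaving the run of A's as the loop. Changing a single base in either arm removes the stem, which is the sense in which the fold is a consequence of the exact sequence.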

We have nice abstract models of classical crystals, which involve stylized representations of their structures as regular lattices of points in space, or honeycombs of repeating identical cells. But is there any abstraction which similarly captures the very different sort of non-repeating structure we find in the folded RNA molecule? There arguably is. The candidate is something called a Wang tiling, and the conjecture being made here is that a folded biogenic RNA molecule is what might be called a ‘Wang crystal’, a crystal-dense packing with a Wang-tiling-like structure. If this is so, it implies all kinds of interesting things about the nature and evolutionary tendencies of life.

{Author’s note: when I wrote this, in 2006, I wasn’t yet aware of the existence of Faulhammer, D., Cukras, A., Lipton, R. and Landweber, L., 2000. “Molecular Computation: RNA Solutions to Chess Problems.” PNAS 97: 1385 – 1395.}

Hao Wang, the logician who invented Wang tilings and discovered many of their properties during the 1960s, was, at the time, working towards a goal that seemed to have nothing at all to do with molecular biology. He wanted to know if there could, even in principle, be what is called a ‘nontrivial, non-futile game’.

A trivial game is one like tic-tac-toe in which it is possible for one player to know at its beginning that she can force a win if she pursues a certain strategy. (In this context, the word ‘strategy’ means exactly ‘sequence of moves’.) The winning strategy may depend on the precise moves made by the other player, but if the game is trivial one of the players can know from the outset that for any move the other player might make, there is a move she can make that keeps her on a path to inevitable victory. A game in which one player can know from the start that she can force a draw is called ‘futile’. So a nontrivial, non-futile game (which might perhaps be referred to as an ‘interesting’ game) is one in which neither player can know, even in principle, whether some particular opening move will or will not force a win or a draw at some later point.
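What it means for a player to ‘know from the outset’ that she can force a win can be made concrete with a small exhaustive search. The sketch below uses a hypothetical take-away game, chosen only because it is tiny and not because Wang discussed it: two players alternately remove one, two or three counters from a pile, and whoever removes the last counter wins. The function names are likewise assumptions made for this illustration.

```python
from functools import lru_cache
from typing import Optional

MOVES = (1, 2, 3)  # assumption: a player may remove 1, 2 or 3 counters


@lru_cache(maxsize=None)
def mover_can_force_win(counters: int) -> bool:
    """True if the player about to move can force a win from this position."""
    if counters == 0:
        # The previous player took the last counter, so the mover has lost.
        return False
    # The mover wins if some legal move leaves the opponent in a losing position.
    return any(
        not mover_can_force_win(counters - m) for m in MOVES if m <= counters
    )


def winning_move(counters: int) -> Optional[int]:
    """Return a move that keeps the mover on a forced-win path, if one exists."""
    for m in MOVES:
        if m <= counters and not mover_can_force_win(counters - m):
            return m
    return None


first_player_wins = mover_can_force_win(10)
```

Because the search terminates and settles the outcome before any counter is touched, this game is trivial in the sense just defined. An ‘interesting’ game would be one for which no such terminating search can exist.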

The games Wang chose to investigate were ones involving dominos. He did so because of the generality of this representation; any game at all can be encoded as a domino game. Each possible game-state can be given a domino. Each possible way of getting into that state can then be encoded as an edge of that domino (colored so that only possible predecessor states can be put next to it). Each way of getting out of that state can be encoded as another edge, again colored so that only the tile representing the resulting successor state can be placed adjacent to it.

Wang’s great innovation was to realize that any computational device could also be encoded in a collection of dominos or tiles, and therefore that any possible computation starting from any possible input could be encoded in the playing out of a properly-set-up domino game. (In essence, he showed, in a very elegant way, that a strategy, conceived of as a sequence of game moves, and a calculation or algorithmic process, conceived of as a sequence of computational states or steps, are, from a formal perspective, more or less the same thing.) Because there are things about some computational processes that cannot, even in principle, be predicted or proven from a computer’s starting state (for example, the starting state can be cleverly set up so that any attempt to determine in advance whether the consequent computation will ever halt will itself go on forever), this amounted to a proof that there are in fact ‘interesting’ domino games, games which are both nontrivial and non-futile.

What connects all of this rather directly to molecular biology is the very simple way Wang found to encode any computation whatsoever as a game played with dominos or tiles, in order to prove these points. Since a folded RNA molecule is basically a tiling of three-dimensional space with four tiles or dominos – the four basic nucleotides – and at the same time a cog in the computational machinery that regulates, repairs, and replicates the cell that assembled it, a bridging of the conceptual gap between tilings and computations is also a bridging of the conceptual gap between structural biology and the concept of the ‘organism’, which necessarily includes, besides Von Neumann’s chosen attribute of self-replication, some provision for homeostatic self-control, for self-repair and self-regulation. So Wang’s elegant encoding offers us the enticing promise of a sort of grand unified theory of the organism, one that is reductionist enough to satisfy the most puritanical biochemist, and yet holistic enough to please even Leibniz, because it is directly based on the physical shapes and affinities of life’s minimal material constituents, and yet promises us an account of the living thing as a non-local organic unity that cannot simply be naïvely identified with its momentary or local material substrate.

What is this biologically interesting encoding? Alan Turing showed that the very simplest kind of general computer is a device that can read symbols on a paper tape, write symbols on the same paper tape, enter various machine states in response to the symbols it encounters (states which will determine what it would do when confronted with a particular next symbol) and move one square to the left or right along the tape, or halt, depending on the last symbol read off the tape and the state the device was in when it read it.

So at each particular square of the tape which such a computing device reaches at some point in its progress through the computation, it needs to be in a state, to encounter a symbol, to write another symbol over that (or to not write anything) in a manner that depends on the state it is in and the symbol it encounters, to either halt, continue moving in the same direction it was moving before, or reverse direction, and to be in a state when it finishes doing so that depends on the state it was in when it entered and the symbol it encountered.

Wang invented a type of square domino, called a Wang tile, with four colored faces, each of which can be placed next to the same-colored face of another such domino. He showed that if we think of each Wang tile as a square on the paper tape of a Turing machine of the kind just described, then (provided the machine is going to continue on in the same direction from that square) we can use different colors to encode the state the machine was in when it entered the square on one side of the tile, the symbol it encounters there on the top edge of the tile, the symbol it leaves there, having been in that state and encountered that symbol, on the bottom edge of the tile, and the state it leaves the square in on the far side of the tile. A series of computational steps during which the machine moves along its tape in one direction is a single horizontal row of such tiles or dominos.
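The encoding of a single one-directional sweep can be sketched in a few lines. What follows is a minimal illustration, not Wang’s own construction in full: the two-state machine in `RULES` is a hypothetical example (it overwrites each bit with the running parity of the bits read so far), and the names `Tile`, `sweep` and `row_is_valid` are assumptions made for this sketch. Each tile records the entering state on its left edge, the symbol read on its top edge, the symbol written on its bottom edge, and the leaving state on its right edge.

```python
from typing import List, NamedTuple


class Tile(NamedTuple):
    left: str    # state the machine is in when it enters the square
    top: str     # symbol it encounters there
    bottom: str  # symbol it leaves there
    right: str   # state it is in when it leaves the square


# Hypothetical machine: (state, symbol read) -> (symbol written, next state).
RULES = {
    ("even", "0"): ("0", "even"),
    ("even", "1"): ("1", "odd"),
    ("odd", "0"): ("1", "odd"),
    ("odd", "1"): ("0", "even"),
}

# One tile type per rule: the finite set of dominos the game is played with.
TILE_SET = [Tile(s, r, w, n) for (s, r), (w, n) in RULES.items()]


def sweep(tape: str, state: str) -> List[Tile]:
    """Lay one row of tiles for a single left-to-right pass over the tape."""
    row = []
    for symbol in tape:
        written, next_state = RULES[(state, symbol)]
        row.append(Tile(state, symbol, written, next_state))
        state = next_state
    return row


def row_is_valid(row: List[Tile]) -> bool:
    """Adjacent tiles must share a color: right edge equals next left edge."""
    return all(a.right == b.left for a, b in zip(row, row[1:]))


row = sweep("1101", "even")
new_tape = "".join(tile.bottom for tile in row)
```

Reading the bottom edges of the row from left to right gives the tape as the machine leaves it, and the right edge of the last tile gives the state it finishes in. The matching rule on side edges is what forces the row to be a legal sequence of machine steps.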

Where the device would change direction and begin moving back along its own tape, reading and reacting to the symbols it just deposited there, we can encode this reversal, in a Wang tiling, by moving down to the next row and laying the next line of tiles down in the opposite direction. (So we need a new special color for the bottom edge of change-of-direction dominos, which encodes both a state and a deposited symbol – no problem if we have enough different hues at our disposal.)

We lay dominos in that row, moving back along the bottom edge of the row above, where the symbols the machine just left when it was previously in those ‘tape-squares’ are encoded as bottom edge colors, which dictate the sequence of top-edge colors in our new row, until we halt or change direction again; when we do that we move down another row, and so on until the machine halts, or ad infinitum if it never does.

Thus we can think of the computational process growing out of a certain machine starting state, encoded as an initial row or sequence of tiles, as a domino game involving a finite number of types of dominos (the number depends on the number of states and symbols the encoded computing machine has, but people have designed extremely simple general computing machines, with only a few states and a few symbols, using various clever tricks). In fact it is rather like the game of assembling a jigsaw puzzle; the assembly of a jigsaw puzzle itself can, conversely, also be seen as a sort of analog geometric computation.

The idea that embraces both is construction. From a computational perspective, construction is calculation, and calculation construction, which in itself accounts for the difficulty of assembling a jigsaw puzzle or a Wang tiling; complex constructions don’t just materialize from nothing, they require either lots of trial and error or the conservation of superior ‘calculating strategies’ or paths of assembly. Since a strategy is nothing but a series of moves, and a series of moves in a domino game is nothing but a series of tiles emplaced adjacent to one another in a certain order, a conserved construction strategy can, in this kind of world, be nothing but a conserved string of tiles.

Why should trial and error even be necessary to assemble a jigsaw puzzle or a patch of Wang tiling? Partly it is required just to sort through candidate tiles for ones that will fit in a certain location. But the globally appropriate choice can, in a jigsaw puzzle or a Wang tiling, also be locally underdetermined; it may be possible to insert either piece A or piece B in a given space, even though the insertion of B will eventually have ruinous consequences somewhere else in the puzzle. This is precisely the kind of coordination problem evolution has to overcome in designing macromolecules, cells, and whole organisms; the need to produce globally compatible structures in ways not completely forced by purely local constraints, as the much simpler repeating structures of classical crystals are.
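Local underdetermination is easy to reproduce in miniature. In the sketch below, a backtracking search fills a strip of squares with Wang tiles, and the tile set is a hypothetical one built so that pieces A and B both fit the second square, while only A can be extended any further. The names `Tile`, `fits` and `solve`, and the edge colors, are assumptions made for this illustration; the search is ordinary depth-first backtracking, used here as a stand-in for trial and error.

```python
from typing import Dict, List, NamedTuple, Optional


class Tile(NamedTuple):
    name: str
    top: str
    right: str
    bottom: str
    left: str


def fits(grid: List[Tile], width: int, pos: int, tile: Tile) -> bool:
    """Check only the local constraints: the left and upper neighbors."""
    row, col = divmod(pos, width)
    if col > 0 and grid[pos - 1].right != tile.left:
        return False
    if row > 0 and grid[pos - width].bottom != tile.top:
        return False
    return True


def solve(tiles: List[Tile], width: int, height: int,
          grid: Optional[List[Tile]] = None,
          stats: Optional[Dict[str, int]] = None) -> Optional[List[Tile]]:
    """Fill the grid square by square, undoing choices that lead to dead ends."""
    grid = [] if grid is None else grid
    if len(grid) == width * height:
        return list(grid)
    for tile in tiles:
        if fits(grid, width, len(grid), tile):
            grid.append(tile)
            result = solve(tiles, width, height, grid, stats)
            if result is not None:
                return result
            grid.pop()  # the choice fitted locally but failed further on
            if stats is not None:
                stats["backtracks"] += 1
    return None


# Hypothetical tile set: B and A both match S locally, but nothing matches B's "x".
TILES = [
    Tile("S", "-", "a", "-", "s"),
    Tile("B", "-", "x", "-", "a"),
    Tile("A", "-", "b", "-", "a"),
    Tile("C", "-", "c", "-", "b"),
]

stats = {"backtracks": 0}
solution = solve(TILES, width=3, height=1, stats=stats)
names = [tile.name for tile in solution]
```

The search places S, tries B because it fits, discovers that no tile can follow it, and only then backs up and places A. A search that had a record of the string S, A, C would not need to rediscover this.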

Physically stringing design elements together like knots on a cord conserves effective global solutions to these locally insoluble coordination problems, so that they don’t have to be found again and again through uncoordinated trial and error. (The ultimate conserved connected code-string is, of course, a whole chromosome.) Memory, in this sense of an encoded record of what worked out well globally in previous cases, a string of instructions to add this rather than that to the structure at this point, appears to be unique to life and its artifacts.

The mechanical operations of cells consist largely of contact between properly shaped and constituted faces of interacting molecules – an enzyme with its substrate, a receptor with its ligand, a transcription factor with its target sequence on the genome. In a Wang-crystal model of how molecular biology works, this should perhaps be thought of as the interaction of the output of two separate computations, or else as a two-player domino game. Within a single organism the game is a cooperative one (unlike the entirely antagonistic domino game of ‘antibody design’ which the immune system plays ceaselessly against pathogens). The intra-organismic ‘players’ are selected to put matching codes on meta-tiles that ought (from the point of view of the organism’s fitness) to interact, and non-matching ones on meta-tiles that shouldn’t. Thus a fitness-optimizing computer is built out of smaller computers, a complex set of large Wang tiles (which jointly encode a large set of strong, but not very general, computational languages) out of a simple set of smaller Wang tiles (which jointly encode a single weaker, but more general language). This hierarchy of more specialized computers made of less specialized computers made of even less specialized computers continues all the way up to the reader’s brain. (So apparently, Descartes was wrong; Res Cogitans and Res Extensa are really more or less the same thing.)

The very simple picture of what life is that has been presented here – that it is a sort of hierarchy of nested Wang crystals (whose complex modular elements are filtered through natural selection, at each level of the hierarchy, both singly, as tiles in a set, and jointly, as whole assemblies of these single tiles or modules, which together act as single meta-tiles in meta-tile-sets on the next hierarchical level) – may be an unfamiliar one to anyone steeped in the conventional Von Neumann/NASA story about what life is. The good thing about it, though, is that it begins to explain why life is the sort of thing that can evolve, while ice and fire are not.

A classical, periodic crystal like salt or ice has essentially the same repeating structure throughout, whose form is dictated by purely local forces. No long, coordinated series of appropriate assembly steps needs to be found over the course of billions of years by trial and error, or conserved for future generations in the form of a strategy-encoding string. The first crystal of salt that ever forms exhausts most of the structural possibilities salt offers; there is nothing more to be found through a long stochastic process of trial and error, no superior strategy for making crystals of salt that is not already transparently implied by the unvarying elementary properties of sodium and chlorine. (The same thing is true for fire; the first one to ever break out must have already exhibited all the really important features of the phenomenon.) The structure of present-day salt and the way combustion works nowadays are both things that are physically necessary; neither phenomenon could really have been otherwise in our universe.

Life, on the other hand, occupies the much larger modal realm of the necessarily possible. It is an inevitable feature of the universe that life as we know it could exist; but the actual existence of babies and blades of grass is merely one of the ways this spectrum of necessary possibilities could have been played out. Because a Wang crystal has no deterministically repeating local structure, but has open to it the entire huge realm of varied structures isomorphic to the realm of all possible computational processes, the set of assembly steps that lead to any particular complex structure needs to be found over time, among the googols of possibilities, and conserved on the genome by natural selection as a module in an ever-growing modular tool-kit. The first small RNA molecule that ever formed was very far indeed from exhausting the structural possibilities of folded organic polymers, or even of RNA.

Time is required for good assembly strategies to be discovered through trial and error; and this asymmetric evolutionary quest puts living things in a kind of memory-defined, directional time that has no real parallel in the abiotic world. True diamonds are forever, but we living quasi-diamonds each have our day; we must wait for the baby or the blade of grass to arrive at their own proper point in cosmic history. Even after all that waiting, though, we should still expect, when first looking into the strange blue eyes of this new example just fallen from the moon, to be once again surprised by the newest twists and turns in Darwin’s evolving snowflake.

Bibliography

Albert, D. (2000) Time and Chance. Cambridge, Massachusetts: Harvard University Press.

Cairns Smith, A. G. (1990) Seven Clues to the Origin of Life. Cambridge: Canto Press.

Grünbaum, B. and Shephard, G. (1987) Tilings and Patterns. New York: W. H. Freeman & Co.

Janot, C. (1994) Quasicrystals: a Primer. Oxford: Oxford University Press.

Kauffman, S. (1993) The Origins of Order. Oxford: Oxford University Press.

Maynard Smith, J. (1982) Evolution and the Theory of Games. Cambridge: Cambridge University Press.

Patthy, L. (1999) Protein Evolution. Hoboken: Wiley-Blackwell.

Robinson, R. M. (1971) ‘Undecidability and Nonperiodicity for Tilings of the Plane.’ Inventiones Mathematicae 12: 177 – 209.

Schrödinger, E. (1947) What is Life? The Physical Aspect of the Living Cell. New York: Macmillan.

Van Ophuysen (1991) ‘Non-Locality and Aperiodicity of d-Dimensional Tilings.’ In Quasicrystals and Discrete Geometry. Jiri Patera, ed. Toronto: Fields Institute Monographs.

Von Neumann, J. (1966) Theory of Self-Reproducing Automata. Edited and Compiled by Arthur W. Burks. Champaign: University of Illinois Press.

Wang (1965) ‘Logic, Games, and Computers.’ Scientific American, November 1965, pp. 98 – 106.

Zubay (2000) The Origins of Life On Earth and in the Cosmos. Waltham: Academic Press.
