Three classes of systems involved in most adaptation and innovation. In addition to
proteins and RNA, discussed in the main text, two other classes of biological systems play
central roles in evolutionary adaptation and innovation.
Regulatory circuits are systems of one or more regulatory molecules that influence
each other’s activity. Especially important are transcriptional regulation circuits, which
consist of transcriptional regulators that influence each other’s expression. Each regulatory
circuit has a regulatory genotype that specifies how its member molecules mutually regulate
each other’s activities, and how they produce a gene expression phenotype that influences
many processes in physiology and development. Genotypic change that alters regulatory
interactions can bring forth novel gene expression phenotypes. These are involved in many
evolutionary adaptations and innovations, such as the dissected leaves of some plants, the
eyespots of some butterflies, the flowers of flowering plants, and the limbs of
vertebrates (Bharathan et al. 2002; Brakefield et al. 1996; Burke et al. 1995; Carroll et al.
2001; Coen & Meyerowitz 1991; Davidson & Erwin 2006; Hay & Tsiantis 2006; Hughes &
Kaufman 2002; Keys et al. 1999). Any one circuit genotype exists in a much larger genotype
space. This space captures all biochemically feasible circuits involving a given set of
Metabolic networks, a third system class, comprise hundreds to thousands of chemical
reactions that are catalyzed by enzymes, which are encoded by genes. These networks are
responsible for providing cells with energy and multiple molecular building blocks -- amino
acids, nucleotides, lipids, and others -- for cell growth. Innovations involving metabolic
networks enable an organism to produce useful secondary metabolites, to detoxify waste
products of its metabolism, or to use novel molecules as a source of energy or chemical
elements. Heterotrophic bacteria, for example, have acquired the ability to use a broad
spectrum of different molecules as sole carbon sources that include crude oil and natural gas,
but also man-made compounds such as antibiotics and industrial chemicals (Dantas et al.
2008; Rehmann & Daugulis 2008; van der Meer 1995; van der Meer et al. 1998). The
necessary biochemical pathways often do not arise through the evolution of novel enzymes,
but through novel combinations of already existing, individually widespread enzymes, which
may be facilitated by horizontal gene transfer (Copley 2000; Lerat et al. 2005; Ochman et al.
2005; Pal et al. 2005). Metabolic networks exist in a metabolic genotype space, a space of
possible metabolic networks, where each network has a different metabolic genotype. This
genotype can be compactly represented through information about the presence or absence of
individual reactions (enzyme-coding genes) from a much larger universe of metabolic
reactions (Rodrigues & Wagner 2009; Samal et al. 2010).
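The presence/absence representation described above can be illustrated with a minimal sketch. The reaction labels and the function names `encode_genotype` and `decode_genotype` are hypothetical and chosen only for illustration; a realistic reaction universe would contain thousands of entries.

```python
# Hypothetical reaction universe; real universes contain thousands of reactions.
REACTION_UNIVERSE = ("R1", "R2", "R3", "R4", "R5")


def encode_genotype(present, universe=REACTION_UNIVERSE):
    """Return a tuple of 0/1 flags, one per reaction in the universe."""
    unknown = set(present) - set(universe)
    if unknown:
        raise ValueError(f"reactions not in universe: {sorted(unknown)}")
    return tuple(1 if reaction in present else 0 for reaction in universe)


def decode_genotype(bits, universe=REACTION_UNIVERSE):
    """Return the set of reactions whose flag is 1."""
    if len(bits) != len(universe):
        raise ValueError("genotype length must match the universe size")
    return {reaction for reaction, bit in zip(universe, bits) if bit}
```

For example, a network containing only the hypothetical reactions R1 and R4 would be encoded as the vector (1, 0, 0, 1, 0).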
Macromolecules, regulatory circuits, and metabolism share features important for
evolutionary adaptation and innovation. In all three major system classes, the neighbor of a
genotype G in genotype space is an important concept. In the macromolecules discussed in
the main text, this is a genotype that differs from G in exactly one amino acid or nucleotide.
In the case of regulatory circuits, a genotype’s neighbor differs from it in one regulatory
interaction, and in metabolic networks, it differs in one metabolic reaction (one enzyme-coding gene). More generally, a genotype’s k-neighbor
differs from it in k system parts (amino acids, regulatory interactions, metabolic reactions). A
genotype’s (k-)neighborhood includes all its (k-)neighbors, and may comprise thousands of
different genotypes. One can also define a distance between two genotypes as the
fraction of system parts in which they differ.
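The notions of neighbor, k-neighbor, and distance can be sketched in code, assuming that a genotype is represented as a fixed-length tuple of parts drawn from a finite alphabet (for instance 20 amino acids, or 0/1 for the absence or presence of a reaction). The function names are illustrative assumptions, not taken from the cited work.

```python
from math import comb


def neighbors(genotype, alphabet):
    """Yield every genotype that differs from `genotype` in exactly one part."""
    for i, part in enumerate(genotype):
        for alternative in alphabet:
            if alternative != part:
                yield genotype[:i] + (alternative,) + genotype[i + 1:]


def count_k_neighbors(length, alphabet_size, k):
    """Number of genotypes that differ from a given one in exactly k parts."""
    return comb(length, k) * (alphabet_size - 1) ** k


def distance(g1, g2):
    """Fraction of parts in which two genotypes of equal length differ."""
    if len(g1) != len(g2):
        raise ValueError("genotypes must have the same number of parts")
    return sum(a != b for a, b in zip(g1, g2)) / len(g1)
```

Under this representation, a protein of 100 amino acids has 100 × 19 = 1900 neighbors, which illustrates why a neighborhood may comprise thousands of genotypes.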
One can view mutational robustness as a property of a genotype G’s neighborhood,
namely as the fraction of G’s neighbors that have the same phenotype P as G itself. Systems
in all three classes are to some extent robust to mutations. This has been shown through
engineered mutations that eliminate enzyme-coding genes from a genome, through rewiring
of regulatory circuits, through large-scale mutagenesis studies in proteins, and through a
variety of comparative and modeling approaches (Alon et al. 1999; Blank et al. 2005;
Edwards & Palsson 2000; Giaever et al. 2002; Hafner et al. 2009; Huang et al. 1996; Isalan et
al. 2008; Kleina & Miller 1990; Raman & Wagner 2011; Rennell et al. 1991; Segre et al.
2002; Soyer & Pfeiffer 2010; Stelling et al. 2002; Thompson et al. 1999; Wagner 2005a;
Wagner 2005b; Wang & Zhang 2009; Weatherall & Clegg 1976). The fraction of neighbors
with the same phenotype varies widely, and typically ranges from 10 percent to more than
50 percent, depending on system class, system size, and phenotype (Wagner 2011b).
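The definition of mutational robustness given above can be made concrete with a short sketch. It assumes genotypes are tuples over a finite alphabet and that a genotype-phenotype map is available as a function. The map `toy_phenotype` below is purely hypothetical and stands in for a real map such as protein folding or a viability computation on a metabolic network.

```python
def robustness(genotype, phenotype_of, alphabet=(0, 1)):
    """Fraction of one-part neighbors that share the genotype's phenotype."""
    target = phenotype_of(genotype)
    same = 0
    total = 0
    for i, part in enumerate(genotype):
        for alternative in alphabet:
            if alternative == part:
                continue
            neighbor = genotype[:i] + (alternative,) + genotype[i + 1:]
            total += 1
            if phenotype_of(neighbor) == target:
                same += 1
    return same / total


def toy_phenotype(genotype):
    """Hypothetical map: 'viable' if at least two of the first three parts are present."""
    return sum(genotype[:3]) >= 2
```

With this toy map, the genotype (1, 1, 0, 0, 0) has five neighbors, three of which remain viable, so its robustness is 0.6.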
Genotype networks exist in metabolic and regulatory networks, just as they exist in
macromolecules (Ciliberti et al. 2007b; Giurumescu et al. 2009; MacCarthy et al. 2003;
Ndifon et al. 2009; Rodrigues & Wagner 2009; Rodrigues & Wagner 2011; Samal et al.
2010). Genotype networks typically extend far (between 70 and 100 percent) through
genotype space. This means that two genotypes can differ in more than 70 percent of their
parts (amino acids, regulatory interactions, metabolic reactions) and still have the same
phenotype.
Robustness is both necessary (Wagner 2011b) and sufficient (Reidys et al. 1997) for
the existence of genotype networks with this property. I will briefly sketch how one can show
that this assertion is correct, an argument that is presented in greater detail elsewhere (Wagner
2011a; Wagner 2011b). Consider a typical phenotype P in any of the three system classes I
mentioned. It will be adopted by some very large number M of genotypes that, however,
jointly constitute a very small fraction of a vast genotype space (Ciliberti et al. 2007a; Samal
et al. 2010; Sumedha et al. 2007; Todd et al. 1999; Wagner 2011b). Let us assume that this set
of genotypes consists of genotypes chosen at random from genotype space, without requiring
that each genotype has neighbors with the same phenotype. One can then estimate the
probability that a given one of these genotypes has no neighbors with phenotype P, that is, that it
completely lacks robustness. This probability is very close to one. Genotypes scattered at
random in this way would thus be isolated from one another. In other words, robustness is
necessary for the existence of genotype networks.
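The order of magnitude of this probability can be illustrated with a back-of-the-envelope calculation. This is only a sketch under a simplifying assumption stated here and not drawn from the cited work: each neighbor of a genotype independently has phenotype P with probability f = M/N, where N is the size of genotype space. The probability of having no neighbor with P is then (1 - f)^n for n neighbors. The numbers in the usage note are illustrative, not empirical.

```python
import math


def prob_no_neighbor_with_phenotype(length, alphabet_size, num_with_phenotype):
    """Probability that a genotype has no neighbor with phenotype P, assuming
    the genotypes with P are scattered independently at random in genotype space."""
    space_size = alphabet_size ** length           # N, as an exact integer
    n_neighbors = length * (alphabet_size - 1)     # number of one-part neighbors
    f = num_with_phenotype / space_size            # fraction of space with phenotype P
    # (1 - f) ** n, computed via log1p to stay accurate when f is tiny
    return math.exp(n_neighbors * math.log1p(-f))
```

For a sequence of 100 parts over an alphabet of size 20, even if M were as large as 10^50 genotypes, f would be on the order of 10^-80, and the probability returned is indistinguishable from one in floating-point arithmetic.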
Robustness is also sufficient. To see this, it is useful to view genotype networks as
graphs (mathematical objects that consist of nodes, and of edges that link these nodes), and to
ask how genotype space would be organized if genotype networks were random networks that
shared only the one feature that each genotype has some fraction ν of neighbors with the same
phenotype as itself. I emphasize that such random networks may show little resemblance to
actual genotype networks. However, they are useful in forming null-hypotheses about
genotype space organization. One can show that a random graph constructed by connecting a
genotype G to a fraction ν of its neighbors, and each of these neighbors to a fraction ν of their
neighbors, and so on – without any further assumptions – would form a genotype network
that would span genotype space or nearly so. In other words, random genotype networks
would extend far through genotype space, as long as genotypes in them have many neighbors
with the same phenotype (Wagner 2011b, Chapter 6).
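The random-graph construction described above can be explored numerically with a small simulation. The sketch below makes several assumptions that are not in the source: genotypes are binary strings of length L encoded as integers, each visited genotype is connected to exactly round(ν·L) of its L neighbors chosen uniformly at random, and the starting genotype is the all-zeros string. It illustrates the null model only and is not the analytical argument of the cited chapter.

```python
import random
from collections import deque


def grow_random_network(length, nu, seed=0):
    """Grow a random network from the all-zeros genotype; return its members."""
    rng = random.Random(seed)
    links_per_genotype = round(nu * length)
    start = 0
    members = {start}
    queue = deque([start])
    while queue:
        genotype = queue.popleft()
        # Flipping one bit gives each of the `length` one-part neighbors.
        neighbors = [genotype ^ (1 << i) for i in range(length)]
        for neighbor in rng.sample(neighbors, links_per_genotype):
            if neighbor not in members:
                members.add(neighbor)
                queue.append(neighbor)
    return members


def max_distance_fraction(members, length):
    """Largest distance from the all-zeros start, as a fraction of parts."""
    return max(bin(genotype).count("1") for genotype in members) / length
```

Calling `max_distance_fraction(grow_random_network(12, 0.3), 12)` for different seeds and values of ν gives a rough impression of how far such a random network extends from its starting genotype.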
A second common property of different system classes regards the neighborhoods of
different genotypes G1 and G2 that have the same phenotype, and that lie on the same
genotype network. One can ask whether any one phenotype that occurs in one of these
neighborhoods occurs only in this neighborhood (and not in the other neighborhood), or
whether it occurs in both neighborhoods. The answer is that the fraction of phenotypes unique
to one neighborhood in this sense increases with the distance between two genotypes. But
even if the two genotypes G1 and G2 have a modest distance and differ in as few as 25
percent of their parts, the majority of phenotypes in one genotype’s neighborhood
typically does not occur in the other neighborhood (Ferrada & Wagner 2010). Pertinent
evidence exists for proteins (Ferrada & Wagner 2010), RNA (Huynen 1996; Schuster et al.
1994; Sumedha et al. 2007), model regulatory circuits (Ciliberti et al. 2007c), and metabolic
networks (Rodrigues & Wagner 2009; Rodrigues & Wagner 2011).
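The comparison of two neighborhoods can be expressed as a simple set computation. The sketch below assumes that the phenotypes found in each neighborhood have already been collected into sets. The function name and the optional exclusion of the shared focal phenotype P are illustrative assumptions, not the exact procedure of the cited studies.

```python
def fraction_unique(phenotypes_1, phenotypes_2, focal=None):
    """Fraction of phenotypes in neighborhood 1 that do not occur in neighborhood 2.

    If `focal` is given, that phenotype (the one shared by G1 and G2)
    is excluded from both neighborhoods before comparing them.
    """
    first = set(phenotypes_1) - {focal}
    second = set(phenotypes_2) - {focal}
    if not first:
        return 0.0
    return len(first - second) / len(first)
```

For example, if the neighborhood of G1 contains phenotypes A, B, C, and D, and that of G2 contains C, D, and E, then half of the phenotypes near G1 are unique to its neighborhood.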
