Protein interaction world – an alternative hypothesis about the origins of life
Peter Andras, PhD1 and Csaba Andras, MSc2
1 School of Computing Science,
University of Newcastle,
Newcastle upon Tyne, NE1 7RU, UK
2 Department of Chemical Engineering,
Budapest University of Technology and Economics,
Budapest, Hungary
Correspondence:
Peter Andras
School of Computing Science
University of Newcastle
Newcastle upon Tyne
NE1 7RU
United Kingdom
tel: +44-191-2227946
fax: +44-191-2228232
e-mail: peter.andras@ncl.ac.uk
ABSTRACT
The protein interaction world hypothesis about the origins of life is introduced in this
paper. According to this hypothesis, life emerged as a self reproducing and expanding
system of protein interactions, which is conceptualized as an abstract communication
system. We describe key components of abstract communication systems and how
such systems work, including the role of memories of communications. Protein
interaction systems are made of communications that are the interactions between
proteins. In the context of the protein interaction world hypothesis RNA molecules
serve as memories of protein interactions and DNA molecules are memories of RNA
interactions. The protein interaction world hypothesis is based on plausible prebiotic
processes and offers a systematic view on how life emerged and evolved towards
current cellular life forms. We compare the protein interaction world hypothesis with
the most commonly accepted RNA world hypothesis about the origins of life. We
conclude that the protein interaction world hypothesis is more plausible than the RNA
world hypothesis. We discuss the role of rare nucleic bases in the context of the
protein interaction world, and we show that their role can be explained more
parsimoniously in this context than in the context of the RNA world. We also discuss
replication in the context of the two theories, and we highlight that while in the
case of the RNA world replication refers to the material replication of some molecules,
in the context of the protein interaction world hypothesis replication happens at
the level of interactions between proteins.
1. INTRODUCTION
The origins of life remain largely a mystery (1, 2, 3, 4). Since the
1950s several experiments have aimed to elucidate how life emerged on
Earth (5). During the past 50 years several meteorite remnants have been analysed to
find traces of life related molecules (6, 7, 8). These works led to the conclusion that
many organic molecules could emerge in abiotic conditions, including amino acids,
lipids, aromatic compounds and possibly simple sugars (5), and even small globular
vesicles may form in such conditions (9). At the same time these experiments
provided no indication of how self-replicating structures like cells, ribonucleic acid
(RNA) and deoxyribonucleic acid (DNA) molecules could emerge.
Currently the most widely accepted hypothesis about the origins of life is based on the
assumption that RNA molecules emerged in abiotic conditions (10, 11, 12, 13).
Theoretical investigations and experimental evidence indicate that the building blocks
of nucleotides (i.e., nucleic bases and sugars) were synthesised in the prebiotic
environment (10, 14, 15, 16, 17, 18, 19). It is supposed that these molecules
constituted nucleotides, which formed heteropolymers in the form of RNA molecules.
Such RNA molecules could catalyze organic chemical reactions and replicate
themselves forming the basis of self-replicating life. Of course there are several
unanswered questions in the context of the RNA world hypothesis, like how did
purine (adenine and guanine) and pyrimidine (cytosine, thymine and uracil) bases
combine with sugars (ribose) to form nucleotides in abiotic conditions (20), or how
did the replication process achieve a high enough precision in order to allow stable
evolutionary selection of some RNAs (21).
The other major alternative hypothesis is the protein world hypothesis (22). This
supposes that proteins and peptides were the first molecules that started life and RNA
molecules emerged later to support the replication of protein molecules. The
fundamental problem of this hypothesis is that proteins are unable to replicate
themselves, which calls into question a protein world that began with the self-replication of proteins. Variants of the protein world hypothesis suggest that the
protein world may have started as a world of thioesters (23), or that proteins could
have co-evolved together with RNA molecules (22).
Systems theory offers powerful analysis methods to untangle complex problems (24,
25). The fundamental assumption of systems theory analysis is that the analysed
phenomenon can be conceptualised as an abstract communication system. The
phenomenon itself consists of communications in this communication system.
Analysing such communication systems allows defining components of the complex
system and functions of these components. The components are characterized by a set
of constraints on communications. Earlier versions of systems theory have been
applied to problems of life sciences (26), but these applications lacked the conceptual
clarity brought by new advances in systems theory of abstract communication systems.
Here we will use concepts of systems theory to analyse cells and early life forms, and
to formulate an alternative hypothesis about the origins of life.
We follow the protein world hypothesis and propose that the prebiotic world
consisted of a mixture of small organic molecules (e.g., short-chain fatty acids, amino
acids) that produced relatively short peptides (e.g., the products of proteinoids (27) and
thioesters (23), or peptides generated using carbonyl sulphide in an aqueous medium (28)).
This prebiotic world led to the emergence of proteins (i.e., polypeptides; more
specifically, those peptides that are made of the 20 biotic amino acids
produced in living cells), with smaller peptides catalysing reactions between
further peptides and other molecules. We propose that emerging life was constituted as
a protein interaction world. We see this world as an abstract communication system
constituted by communications in the form of interactions between proteins. The
replication of the system consists of the replication of these interactions. According to
our hypothesis the protein interaction world led to the emergence of encapsulated
reproduction of sequences of protein interactions in the form of protocells. Such
protocells turned into advanced protocells by developing memories of protein
interactions in the form of RNA molecules. These in turn evolved into proper
cells having complex intracellular processes and long term memory in the form of
DNA. In this paper we describe how systems theory provides the conceptual
foundations for our hypothesis and we argue that the protein interaction world
hypothesis may be more plausible than the RNA world hypothesis.
Our proposal shows similarities with the earlier proposal of Lacey et al. (22), who
suggested the co-evolution of proteins and RNA molecules, including the suggestion
that RNA molecules may have served as the memories of proteins. Similar ideas can
be found also in the work of Kauffman (29), who suggested that life may have
emerged as a system of biochemical interactions, which reached a critical level of
compound variability and interaction density. To some extent similarly to our ideas,
Segre and Lancet (3) have also suggested that life may have emerged as a system of
molecular interactions that reproduced itself.
The rest of the paper is structured as follows: Section 2 briefly describes the RNA
world hypothesis; in Section 3 we provide a review of key systems theory concepts; in
Section 4 we describe in detail the protein interaction world hypothesis; Section 5
contains discussion of implications of the protein interaction world hypothesis; the
paper is closed by the conclusions in Section 6.
2. RNA WORLD
RNA or ribonucleic acid molecules are sequences of nucleotides, containing typically
four types of bases: adenine, guanine, cytosine and uracil. The identity of an RNA
molecule is determined by the sequence of the nucleotides. An RNA molecule may
contain a few tens or even thousands of nucleotides. The role of RNA molecules in
cells is to drive the building process of proteins that takes place at the ribosomes (2,
30). The RNA molecules are built within the cell nucleus by copying segments of the
DNA. The primary RNA molecules go through a maturation process, when parts of
them are cut out or possibly changed.
It is a widely accepted hypothesis that RNA molecules were present in the prebiotic
environment on the Earth (e.g., 11, 20). RNA molecules can act as information
storage molecules due to their specificity in terms of interactions with other molecules
(31), i.e., they store the information about which molecules they should interact with,
for example by building them in the case of mRNAs. The RNA world hypothesis is
built on the assumption that information (i.e., the sequence of nucleotides, which is
different from random) was stored in RNA in the prebiotic environment and this
information was replicated by copying of RNA molecules.
A cornerstone of the RNA world hypothesis is that RNA molecules can catalyze
biosynthesis processes (30). By catalysing interactions between other organic
molecules (e.g., proteins) the RNA molecules can facilitate the molecular interaction
mechanisms needed for the synthesis and replication of the RNA (12). Consequently
the RNA molecules can organize autocatalytic processes required for self-reproductive biochemical systems (1, 32).
The replication of RNA molecules can be more efficient if the ingredients of the RNA
catalyzed self-reproductive system are enclosed, which may have led to the
emergence of protocells (9). Such protocells may have contained a mixture of proteins
which were used during the RNA replication process. The DNA may have evolved as
a long term information storage molecule, which being more stable than the RNA,
could maintain the information needed for the RNA replication over periods of
disturbance. The evolution of the replication mechanism of information encoded in
RNA molecules led to the evolution of cells with complex intracellular mechanisms,
involving a large number of proteins and DNA used as the cell’s long term
information storage device.
Theoretical studies (12) and existing experimental evidence (8, 15) indicate that the
components of RNA molecules, sugars, purines and pyrimidines can be synthesised in
abiotic conditions, although in relatively small quantities (8, 33). There are also
suggestions about how the synthesis of RNA molecules from mono-nucleotides (i.e.,
combinations of purines / pyrimidines and sugars) may have emerged on catalysing
clay surfaces (34, 35). Still, a fundamental problem with the RNA world hypothesis is
that it does not explain how nucleotides, made of purines, pyrimidines and sugars,
formed before the emergence of RNA molecules (20). While synthesis of nucleotides
and of RNA happens in biotic conditions, robust abiotic counterparts of such
processes are not known (23). This negative result calls into question the foundation of the
RNA world hypothesis.
Even supposing that RNA molecules and their constituent nucleotides were available in the
prebiotic environment, we still face further important problems. RNA molecules are
not very stable and, except in relatively cold environments, they decompose into their
constituents, preventing faithful replication of themselves (36, 37). Theoretical
estimations have shown that the replication of information molecules should be very
precise to allow stable evolution (2, 38) (i.e., otherwise replication noise would turn
the population of replicating molecules back into a random mixture).
Considering the instability of RNA molecules and the actual frequency of replication
errors it seems unlikely that simple RNA replication mechanisms could have
reproduced RNA molecules with the required accuracy for stable evolution (2).
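The dependence of replication accuracy on sequence length can be illustrated with a minimal sketch. It assumes, for illustration only, that copying errors occur independently at each nucleotide with the same per-base fidelity; the function names and the numerical values are hypothetical and are not taken from the cited estimations (2, 38).

```python
import math


def error_free_copy_probability(per_base_fidelity: float, length: int) -> float:
    """Probability that all `length` positions are copied correctly,
    assuming independent errors with one shared per-base fidelity."""
    return per_base_fidelity ** length


def max_length_for_fidelity(per_base_fidelity: float,
                            min_copy_probability: float) -> int:
    """Longest sequence whose error-free copy probability stays at or
    above `min_copy_probability` (same independence assumption)."""
    if not (0.0 < per_base_fidelity < 1.0 and 0.0 < min_copy_probability < 1.0):
        raise ValueError("both arguments must lie strictly between 0 and 1")
    return math.floor(math.log(min_copy_probability) / math.log(per_base_fidelity))


if __name__ == "__main__":
    # Illustrative values only (assumed, not measured).
    for fidelity in (0.9, 0.99, 0.999):
        print(fidelity, max_length_for_fidelity(fidelity, 0.5))
```

Under these assumptions a per-base fidelity of 0.99 limits sequences to fewer than 70 nucleotides if at least half of the copies are to be error free.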
Some authors suggested alternative versions of the RNA world hypothesis, including
a start from a DNA world (20) or the initial use of peptide nucleic acids as
information coding molecules (39). These alternatives do not really solve the
fundamental problems of the RNA world; they merely replace them with comparable problems
associated with the alternative proposal.
Summarizing, the RNA world hypothesis and its alternative versions offer a good
explanation of how life may have emerged, supposing the existence of RNA molecules
and sufficiently high fidelity and stable RNA replication mechanisms. The Achilles
heel of the hypothesis is that the existence of prebiotic RNA molecules and high
fidelity RNA replication mechanisms is questionable and has so far not been
demonstrated.
3. SYSTEMS THEORY
The origins of systems theory go back to the 1940s, when cybernetics research (40)
programs started to investigate the behaviour of complex engineering systems.
Starting in the 1950s, general systems theory was developed following the work of
von Bertalanffy (41). The mathematical theory of complex systems emerged in the
1960s and focuses on systems that can be described by sets of differential equations
and analyses the properties of these equations (e.g., 42). Between the
1960s and 1980s Maturana and Varela developed the theory of autopoietic systems (26),
which aims to explain how self-regenerating and self-replicating living systems
emerge and evolve. More recently Luhmann (25) introduced a new approach to
systems theory following to some extent the works of Maturana and Varela.
Luhmann’s work concentrates on abstract communication systems made of
communications, ignoring the communication units that generate these
communications. The theory of abstract communication systems offers a fresh look at
complex systems, different from the classical approaches of the 1940s – 1960s and
also from the approach of mathematical complex systems theory. This new approach
offers powerful analysis tools that allow us to identify systems and their components,
and to analyse the function of these components in the context of the system. We
follow the work of Luhmann in this paper.
Communication units produce symbols that are transmitted to other communication
units, which perceive them. Communications are sequences of such symbols. Abstract
communication systems are made of such communications (see Figure 1A for
illustration). By definition, the communication units are not part of the system.
Communications reference other communications in the sense that the sequence of
symbols contained in a communication is dependent on the contents of other earlier or
simultaneous communications. A dense cluster of inter-referencing communications
surrounded by a sparse network of communications constitutes a communication system
(see Figure 1B for illustration).
Figure 1. A) The concept of communication. B) The concept of communication
system. Squares represent communication units, continuous arrows represent
communications, and segmented arrows represent referencing relations between
communications. The communication system is a dense cluster of inter-referencing
communications (area in the middle), surrounded by a sparse network of other
communications. The communication units are not part of the communication system.
A communication system is defined by the regularities that govern how referenced
communications determine the content of referencing communications. All
communications that follow the set of rules defining the system are part of the system.
All other communications, which do not follow the rules of the system, are part of the
system’s environment. Communication systems reproduce themselves by recruiting
new communications between communication units. A communication is recruited to
a system if it follows the referencing rules of the system. How successful the
recruitment of new communications is depends on earlier communications generated by
the system and on the environment of the system. We can view the system as a self-describing system made of communications. At the same time the system describes its
environment in a complementary sense. Better descriptions of the system’s
environment lead to higher success in recruiting new communications.
Systems that reproduce and expand faster than other systems may drive the
slower reproducing and expanding systems to extinction. The limits of system expansion are
determined by the probabilistic nature of referencing rules. A communication may
reference several earlier communications indirectly through other referenced
communications, constituting referencing sequences of communications. The
probabilities of referencing rules determine how long such referencing
sequences of communications can be before the last communication becomes a random
continuation. Longer referencing sequences of communications (i.e., more detailed
descriptions) allow better descriptions of the system and its environment. The
optimal size of the system (i.e., the number of simultaneous communications being
part of the system) is also determined by the probabilistic indeterminacies of
referencing rules. Systems that overgrow their optimal size may split into two similar
systems with the same (or possibly similar) set of referencing rules.
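One simple way to make the dependence of referencing sequence length on rule probabilities concrete is sketched below. It assumes, purely for illustration, that every referencing step independently follows the rules of the system with a fixed probability (here called rule_reliability, a hypothetical parameter), so that the number of rule-following steps before the first random continuation is geometrically distributed.

```python
def expected_chain_length(rule_reliability: float) -> float:
    """Expected number of consecutive rule-following references before the
    first random continuation, assuming independent steps that each follow
    the rule with probability `rule_reliability` (geometric distribution)."""
    if not 0.0 <= rule_reliability < 1.0:
        raise ValueError("rule_reliability must be in [0, 1)")
    return rule_reliability / (1.0 - rule_reliability)


if __name__ == "__main__":
    # Illustrative reliabilities (assumed values).
    for reliability in (0.5, 0.9, 0.99):
        print(reliability, expected_chain_length(reliability))
```

Under this assumption, raising the reliability from 0.5 to 0.9 extends the expected non-random sequence from one to nine referencing steps.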
Communication systems may develop subsystems that are systems within the system,
i.e., they constitute a denser inter-referencing cluster within the dense communication
cluster of the system. Communications that are part of subsystems follow system rules
with additional constraints that are characteristic of the subsystem. More constrained
referencing rules decrease indeterminacies and allow the system to generate better
complementary descriptions of the environment and expand itself faster than systems
without subsystems. Systems may also change by simplification of the set of their
communication symbols (i.e., reduction of the number of such symbols). This may
lead to reduction of probabilistic indeterminacies in the referencing rules.
Consequently systems with simpler sets of communication symbols may expand
faster than systems with larger sets of communication symbols.
Another way of extending reliable descriptions of the environment (i.e., non-random
sequences of referencing communications) is by retaining records of earlier
communications, i.e., by having memories of earlier communications that can be
referenced by later communications. We can view such memories as the use of
additional communication units that reproduce for a certain period a certain
communication. Having memories reduces the indeterminacies in referencing by
allowing direct referencing of much earlier communications, instead of referencing
them through a chain of references. Systems with memory can expand faster than
systems without memory.
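The advantage of direct referencing through memory can be sketched under two illustrative assumptions: each step in a chain of references is reliable with a fixed probability, and a single reference to a memory is reliable with another fixed probability. Both parameters and the function name are hypothetical. The sketch finds the distance into the past beyond which the memory becomes the more reliable route.

```python
def break_even_distance(step_reliability: float, memory_reliability: float) -> int:
    """Smallest number of chained references k for which the chain
    (reliability step_reliability ** k) is less reliable than one direct
    reference to a memory (reliability memory_reliability)."""
    if not 0.0 < step_reliability < 1.0:
        raise ValueError("step_reliability must be strictly between 0 and 1")
    if not 0.0 < memory_reliability <= 1.0:
        raise ValueError("memory_reliability must be in (0, 1]")
    k = 1
    # The chain reliability shrinks with every extra step, so the loop ends.
    while step_reliability ** k >= memory_reliability:
        k += 1
    return k


if __name__ == "__main__":
    # Illustrative values: a memory that is less reliable than a single
    # chained step still wins for communications three or more steps back.
    print(break_even_distance(0.9, 0.8))
```

With these assumed values the chain is preferable only for the two most recent communications; anything earlier is referenced more reliably through memory.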
Systems with memory may develop a memory or information subsystem (i.e., the
memory is information about the past of the system) consisting of communications
between communication units generating memory communications. If such
communications constitute a dense cluster of inter-referencing communications
determined by a set of characteristic referencing rules, the information subsystem of
the system emerges. Having an information subsystem allows memories to be
combined, and thereby allows the generation of descriptions of the environment that are
better than the environment descriptions in systems with memory but without an information
subsystem.
Systems compete with each other for communications. Systems that have better
complementary descriptions of the environment can generate communications that fit
their environment better and make the recruitment of new communications easier.
Systems with better environment descriptions outcompete systems with poorer
descriptions of the environment. Systems having subsystems, a simple communication
symbol set, memory and an information subsystem can generate better descriptions of
their environment than systems that lack any of these features.
4. PROTEIN INTERACTION WORLD
Experiments simulating the prebiotic Earth environment (5) and the analysis of larger
meteorite remains indicate that organic molecules like amino acids, short-chain fatty
acids and others can form without requiring the preceding existence of life. These
molecules may form simple autocatalytic interaction systems (1, 43) and small
vesicles delimited by lipidic or amphiphilic membranes (5).
Experiments have shown that amino acids form tight clusters called proteinoids (22,
27) at high temperatures, which may lead to the formation of simpler peptides (i.e.,
short chains of amino acids or oligopeptides) (44). Another way of forming peptides
is by the transformation of thioesters (23, 45), a chemical pathway that works
efficiently in abiotic conditions and is also used in biological organisms (23).
Experimental simulation of marine hydrothermal vents has shown that amino acids
may form short peptides in such conditions (44). Recently, Leman et al. (28) have
shown that peptides may form with high yield in plausible volcanic marine
environments in the presence of carbonyl sulphide, a common volcanic gas. Most
evidence from genetic analysis suggests that early life emerged in a high temperature
environment rich in sulphur (46), which supports the plausibility of the above
mentioned routes to the synthesis of early peptides. The interactions between peptides
may catalyse the formation of long chain peptides (proto-proteins), long-chain linear
fatty acids, lipids, and other organic molecules. The analysis of common and
evolutionarily preserved parts of many genomes indicates that the most preserved are
about 60 genes that are involved in the translation of genetic information into
proteins (47). This also suggests that early cells may have developed in protein-rich
environments, which provided the building blocks for the development and
multiplication of cells.
By adopting a systems view perspective, we can see the interactions between amino
acids, peptides and other molecules as an abstract communication system. In this system
the communication units are the peptides and other molecules, while their interactions
constitute the communications, the communication symbols being the constituent
phases of such interactions. Reproduction of this system means the reproduction of
the interactions between these molecules. The interactions between amino acids,
peptides and other molecules depend on earlier interactions between these molecules
that prepare the right molecules in the right conformation to perform interactions.
Subsets of the possible interactions may form a dense cluster of interdependent
interactions, each referencing the other interactions on whose output it
depends. Because peptides are the catalysts of most interactions, and the
catalytic activity of peptides depends to a good extent on interactions with other
peptides, peptide interactions lie at the core of dense interdependent interaction
clusters.
Following considerations from systems theory, peptide interaction systems can
reproduce and expand faster if they are in an enclosed environment, which reduces the
diffusion of molecules that need to participate in the generation of interactions to
maintain and expand the system. This may have led to the emergence of protocells
made of isolating lipid membranes encapsulating interacting peptides and other
molecules (9). As such systems grow they reach their growth limit and split into similar
systems, which continue their growth.
Contemporary living cells can be viewed as protein interaction systems in which
proteins interact with each other and with other molecules, change their conformation
to prepare for further interactions, and perform phenomenological cellular functions
(e.g., cell respiration) by specific sequences of molecular interactions (e.g., metabolic
cycles, signalling pathways). In this view cells are self-reproducing and expanding
protein interaction systems, which reproduce the interactions between proteins and
other molecules according to system specific rules determining the referential (or
dependency) sequences of these interactions.
The peptide interaction system of protocells describes itself and, in a complementary
sense, its environment. Systems theory indicates that a system with memory is likely
to outperform systems without memory. In this context memory means long term
storage of information about earlier system communications. In the case of peptide
interaction systems the memory should represent the interactions between peptides
and, in a complementary sense, the environment of protocells. In a very rough
approximation the environment of cells is determined by the amount and availability
of atomic building blocks of proteins, namely the carbon (C), nitrogen (N), oxygen
(O), hydrogen (H) and to some extent sulphur (S), phosphorus (P) and halogen (F, Cl,
Br) atoms.
Our hypothesis is that the candidate molecules that could serve as memories of
peptide interactions and representations of the environment are the sugars,
representing C, O, and H content of the environment and also the presence of P in
their phosphorylated compounds, and purines and pyrimidines, representing C, N, H
content of the environment and also the presence of S and halogens in their
corresponding compounds (e.g., 4-thiouridine (48), 5-chlorocytosine (49)). This
means that protocells may have used sugars, purines and pyrimidines to store
information about interactions between peptides and also in complementary sense
about their environment. Proteins and peptides can be seen as the product of
interaction between other proteins / peptides containing sub-chains of amino acids
corresponding to parts of the amino acid chain of the product protein or peptide (note
that proteins are also peptides). To keep a memory of such interactions the system
memory of protocells should have been able to record the sequence of amino acids of
participating peptides. This may have led to the emergence of proto-RNA, which
encoded the sequence of amino acids of interacting peptides by using specific
combinations of sugars, purines and pyrimidines to encode amino acids. Theoretical
investigations (22) about interactions between peptides and mono-nucleotides support
our hypothesis. According to these earlier works chains of amino acids can form
double helix chains with mono-nucleotides, such that each amino acid is linked to a
triplet of mono-nucleotides. Such complementary chains of peptides could have
turned into heteropolymers of nucleotides forming the ancient version of RNA
molecules.
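A simple counting argument, which is an illustration added here and not the argument made in (22), shows why a triplet is the shortest fixed-length codeword that can distinguish 20 amino acids when only four bases are available. The function name and parameters are hypothetical.

```python
def min_codeword_length(alphabet_size: int, symbols_to_encode: int) -> int:
    """Shortest fixed codeword length such that alphabet_size ** length
    distinct codewords suffice for `symbols_to_encode` symbols."""
    if alphabet_size < 2 or symbols_to_encode < 1:
        raise ValueError("need at least two letters and one symbol")
    length = 1
    while alphabet_size ** length < symbols_to_encode:
        length += 1
    return length


if __name__ == "__main__":
    # Four bases, twenty amino acids: doublets give 16 codewords (too few),
    # triplets give 64 (enough).
    print(min_codeword_length(4, 20))
```

The same count suggests that a larger base alphabet would permit shorter codewords, at the cost of the larger symbol set that the simplification argument disfavours.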
The requirement that system memories should work as communication units, allowing
the referencing of their memory and the reproduction of the communication content of
their memory, implies that the memory molecules should be able to interact with
peptides and catalyze interactions between peptides. Considering that all peptides are
produced from other peptides and possibly by adding some amino acids, the above
argument implies that proto-RNA molecules should have existed for all peptides and
amino acids. These molecules would have catalyzed interactions between the
corresponding peptides and amino acids.
Arguments from systems theory suggest that the simplification of the vocabulary of
interactions leads to faster expanding systems. This could have been the reason for the
elimination of many amino acids from the list of amino acids making peptides
participating in the most successful peptide interaction systems, leading to the
selection of the 20 protein forming biotic amino acids commonly occurring in living
organisms and the corresponding proteins made of these amino acids. Simplified,
more reliable peptide interaction systems allowed peptides to increase in length,
leading through several evolutionary steps to the giant proteins of currently
living cells.
Selection factors that led to the selection of some sugar, purine, pyrimidine
combinations could have been: (a) the ability of these nucleotides to form long chain
heteropolymers, (b) thermal stability of these polymers, and (c) specific enzymatic
abilities of proto-RNA molecules. These factors together with simplification –
expansion pressures are likely to have led to the selection of combinations of ribose
with a few purines and pyrimidines as nucleotide coding units in the proto-RNA, and
ultimately to the emergence of present day RNA molecules.
The above presented view suggests that protocells were vesicles surrounded by a lipid
membrane, reproducing inside series of peptide interactions helped by proto-RNA
molecules, which catalyzed these interactions. The building blocks of proto-RNAs
were produced by chemical reactions catalyzed by proteins. The proto-RNAs served
as memory molecules in the protocells. The competition between protocell systems
led to systems with simplified symbol sets (peptides containing selected amino acids
and few bases used to build proto-RNA) corresponding to early cells containing
proteins made of biotic amino acids and RNA molecules built mostly from the usual
ribose-adenine (A), ribose-cytosine (C), ribose-guanine (G) and ribose-uracil (U)
building blocks. Some RNA molecules contain unusual bases as well, which are
combinations of ribose with rare purines and pyrimidines.
The interactions between RNA molecules led to the emergence of the information
subsystem of the cell, which consists of a dense cluster of interactions between RNA
molecules, which depend on earlier interactions between RNA molecules. Some early
cells may have developed memory for their information subsystem. This memory
would have consisted of molecules that could interact with RNA molecules and would
retain the memory of interactions between RNA molecules. The memory of RNA
interactions should have encoded the RNAs that were present at the same time and
same location participating in interactions between them. Our conjecture is that DNA
molecules are memories of RNA interactions, and in early cells they catalyzed the
interaction and possibly formation of corresponding RNAs. Being memories of
memories, DNA molecules act in the context of the cell system as long term memory.
Consequently, having DNA makes cells more successful in reproducing and expanding themselves
than cells without DNA.
Early versions of protocells could have contained many versions of combinations of
proto-RNAs catalyzing interactions between peptides. The simplification – expansion
argument leads to the conjecture that this mechanism evolved towards a simplified
version of it, preserving only interactions between proto-RNAs corresponding to
proteins and proto-RNAs corresponding to amino acids. In this way the proteins can
be built up by condensation of single amino acids with an existing partial chain of the
protein, reducing the unreliability of the catalysis of interactions between larger
peptides.
The emergence and expansion of the RNA communication system implies that there
should be many RNA interactions that do not lead directly to generation of proteins,
but in functional terms regulate the process of protein generation (see recent results on
siRNAs and microRNAs (50, 51)). In a similar way the existence of DNA memories
of RNA interactions may have led to the emergence of a system of DNA interactions
forming a new subsystem of the cell. The emergence of dense interdependent DNA
communications (i.e., interactions between DNA molecules) could have led to the
clustering of DNA molecules and the formation of the cell nucleus. This also suggests
that in cells with a nucleus there should be many DNA interactions that do not lead to
the production of RNA molecules, but rather regulate this process in functional terms.
Summarizing, our hypothesis is that life originated from peptide interaction systems,
which reproduced and expanded as vesicles surrounded by lipid bi-layer membranes.
Such peptide interaction systems led to the emergence of proto-RNA molecules that
served as memories of peptide interactions, facilitating the reproduction and
expansion of protocells. Simplification driven expansion led to the selection of biotic
amino acids and the reduction of the typical RNA alphabet to the four usual bases (A,
C, G, U). Interactions between RNA molecules led to the emergence of the RNA
interaction subsystem of the cell and to the emergence of memories of RNA
interactions in the form of DNA molecules. The expansion of DNA molecule
interactions led to the clustering of DNA molecules and the formation of the cell nucleus.
5. DISCUSSION
A. Protein interaction world vs. RNA world
Experimental results indicate that the protein interaction world that we described may
have originated in an abiotic environment able to produce amino acids, oligopeptides,
short-chain fatty acids and other relatively simple organic molecules. The RNA world
hypothesis is so far not supported by experimental evidence describing routes for
the abiotic synthesis of nucleotides, the building blocks of RNAs. Consequently, the
origin requirements of the protein interaction world hypothesis are more plausibly
satisfied than the origin requirements of the RNA world.
The reproduction of protocells and cells in the context of the protein interaction world
hypothesis is relatively simple, requiring the reproduction of interactions between
peptides / proteins, which can occur in autocatalytic systems of peptide / protein
interactions. The reproduction of cells in the context of the RNA world requires
complicated, high precision molecular interaction machinery. This makes it
questionable whether early RNA world life could have reproduced with the fidelity
that would be required for evolution towards modern cellular forms. This indicates that
early life and its evolution are simpler to maintain and reproduce according to the
protein interaction world hypothesis than in the context of the RNA world hypothesis.
The protein interaction world hypothesis offers a well integrated scenario for the
emergence, role and evolution of all macromolecular components of living cells (i.e.,
DNA, RNA, proteins and other molecules), conceptualizing them in the context of the
cell’s internal communication system as communication units and memories, and
their interactions as communications. The RNA world hypothesis provides an
integrative description of cells, but with several ad hoc elements (e.g., proteins are
side products that turn out to be useful as catalysts of RNA replication) and without
providing a clear conceptual framework that would explain the evolutionary steps
leading to contemporary living cells. This indicates that the protein interaction world
hypothesis may have more explanatory and predictive power than the RNA world
hypothesis with respect to the interpretation of functional processes characterising
living cells and the evolution of cellular life.
B. Rare nucleic bases
Since the 1960s, researchers have found several unusual nucleic bases in RNAs of various
micro-organisms. RNA bases like inosine, 1-methylguanine, pseudouridine,
4-thiouridine, wybutosine, 5-fluoro-uracil and others are found typically in tRNAs
(transfer RNA: amino acid specific RNA responsible for the transportation of amino
acids to the protein assembly sites) of bacteria and archaea. In many cases the routes
of biosynthesis of these unusual RNA bases are already known. The RNA world
hypothesis does not provide an easy answer to why such unusual nucleic bases exist and
how they emerged as RNA bases.
In the context of the protein interaction world hypothesis we can find a relatively
straightforward explanation of the existence and role of unusual nucleic bases. By
considering that RNA memories of protein interactions should represent, in a crude
sense, the atomic composition of the environment, it follows that micro-organisms
living in environments characterised by high sulphur or halogen content
should represent these in their RNA memories. This implies that the RNA of such
organisms should include bases that contain sulphur or halogen, representing the
high bio-impact atomic content (i.e., atoms that relatively easily participate in a large
number of organic chemical compounds) of the environment. Considering that most
RNAs representing proteins went through a long evolutionary selection process driven
partly by simplification – expansion forces, we expect that sulphur and halogen
containing bases should be present in the older, more conserved tRNAs representing
the amino acids that are added to the growing protein chain during protein synthesis.
Experimental evidence shows that sulphur containing bases occur commonly in
tRNAs of thermophilic bacteria and archaea (e.g., the bacterium Thermus
thermophilus) (52, 53, 54). These
organisms typically live in high sulphur concentration, high temperature, deep marine
environments. Our theory implies that their tRNA should contain nucleic bases
which represent sulphur, and this is confirmed by experimental analysis. In the
case of Escherichia coli growing in the presence of iron, which is associated in natural
conditions with the presence of sulphur (55), tRNA molecules include sulphur
containing 2-methylthio-N6-(∆2-isopentenyl)-adenosine bases. If iron is bound by
iron-binding molecules, inducing low iron content and implying low sulphur content,
the same tRNAs lose the sulphur containing bases, which are replaced by
N6-(∆2-isopentenyl)-adenosine bases (56). This supports our theory, which implies that in
low sulphur environments bacterial tRNA should include fewer sulphur containing bases
than in high sulphur environments.
Halogen containing RNA bases are reported in the literature as cancer treatment
agents (e.g., 5-fluoro-uracil) (e.g., 57, 58), which prevent normal development of
some proteins participating in thymine synthesis and indirectly block the replication
of DNA. Others report that halogenated bases are formed in bacteria attacked by
immune cells, and these bases contribute to the killing of bacteria (49). According to
our hypothesis it should be possible to find such nucleic bases in some primitive life
forms living in halogen-rich environments. Considering that the incorporation of
halogen containing bases may prevent the formation of thymine, we consider it
unlikely that DNA containing micro-organisms with halogen containing bases in
their tRNA will be found. At the same time it might be possible to find RNA viruses of
halobacteria that contain halogenated RNA bases. Alternatively, it may also be possible to
find primitive unicellular organisms living in halogen-rich environments which
replicate without using thymine containing DNA. Such organisms could live only in
isolated ecological niches, where competition with DNA containing life forms has
not driven them to extinction.
C. Replication
A fundamental concept of evolution theory is replication (31), which means the
identical or almost identical replication of life forms (e.g., cells, whole multi-cellular
organisms). In the context of the RNA world hypothesis replication happens at the
level of RNA and DNA, which are replicated by complex molecular interaction
machinery involving RNA, DNA molecules, proteins and other molecules. Because high
precision replication is required for stable evolution, the supposition of such RNA
replication is a cornerstone of the RNA world hypothesis. At the same time this is also
the weakest point of this theory, as no such high precision replication machinery is
known that works without pre-existing large RNA and DNA molecules regulating the
replication process (2).
The protein interaction world hypothesis considers that replication of early life
happens in terms of replication of peptide/protein interactions, organized in sequences
of interdependent interactions. The replication of such interactions requires the
presence of the same molecules to reproduce the interaction. The replication of
conditional sequences of interactions requires the execution of conditioning
interactions that produce the peptides/proteins in the right conformation to perform
the conditional interaction. The replication of longer sequences of conditional
interactions is limited by the diffusion of required molecules. The diffusion of
molecules can be reduced by encapsulating them in vesicles made of lipid membranes.
Such vesicles could have formed in prebiotic conditions according to results of
experiments trying to replicate the prebiotic Earth environment (5). This suggests that the
replication required for the protein interaction world hypothesis can be based on
plausible processes.
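
The effect of diffusion on longer conditional sequences can be illustrated with a toy calculation. This is only a sketch under assumptions that are not part of the hypothesis itself: each step of a sequence is assumed to find its required molecule independently and with the same probability, and the function name and the numerical values are hypothetical. Under these assumptions the chance that a whole sequence completes falls geometrically with its length, so a higher per-step availability, such as encapsulation in a vesicle might provide, matters most for long sequences.

```python
def chain_completion_probability(p_available: float, n_steps: int) -> float:
    """Probability that a sequence of n_steps conditional interactions completes.

    Assumption (ours, for illustration only): every step independently finds
    its required molecule with the same probability p_available.
    """
    if not 0.0 <= p_available <= 1.0:
        raise ValueError("p_available must lie in [0, 1]")
    if n_steps < 0:
        raise ValueError("n_steps must be non-negative")
    return p_available ** n_steps


# Hypothetical per-step availabilities: molecules free to diffuse away
# versus molecules retained inside a lipid vesicle.
open_solution = chain_completion_probability(0.5, 10)
in_vesicle = chain_completion_probability(0.9, 10)
print(f"open solution: {open_solution:.6f}")
print(f"in vesicle:    {in_vesicle:.6f}")
```

With these illustrative values a ten-step sequence completes far more often when the molecules are retained, which is the qualitative point of the encapsulation argument; the specific probabilities carry no empirical weight.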
The emergence of memories of peptide/protein interactions in the form of RNAs and of
RNA interactions in the form of DNA molecules allows high precision replication of
interactions between large proteins resulting from many interactions between other
proteins, amino acids and other molecules. The protein interaction world hypothesis
provides a well integrated role for RNA and DNA molecules in the process of
replication of life, including the replication of these molecules.
A key difference between the replication concepts of the protein interaction world and
RNA world hypotheses is that while the RNA world builds on the replication of
molecules, the protein interaction world hypothesis builds on the replication of
interactions between molecules. In the context of the protein interaction world the
replication of molecules is a side effect of the replication of interactions between molecules.
According to the protein interaction world hypothesis the concept of replication is
extended into the concept of replication and expansion. Protein interaction systems
replicate and expand by producing protein interactions that follow their conditional
rules and in this way they reproduce and expand themselves. The growth limits
of interaction systems lead to the splitting of systems and to system scale replication
and expansion.
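
The idea that growth limits lead to splitting, and hence to replication at the scale of whole systems, can be sketched as a toy simulation. It is an illustration under simplifying assumptions that are not taken from the hypothesis: a system is reduced to a single number (its count of interactions), growth is a fixed increment per step, and a system that exceeds the limit splits into two nearly equal parts. The function name and parameter values are hypothetical.

```python
def replicate_and_expand(sizes, growth, limit, steps):
    """Toy model of system growth and splitting.

    sizes  -- initial interaction counts, one integer per system
    growth -- interactions added to every system at each step (assumed fixed)
    limit  -- largest size a single system can sustain (assumed fixed)
    steps  -- number of growth steps to simulate
    """
    current = list(sizes)
    for _ in range(steps):
        next_generation = []
        for size in current:
            size += growth
            if size > limit:
                # The system has outgrown its limit: split it in two.
                half = size // 2
                next_generation.extend([half, size - half])
            else:
                next_generation.append(size)
        current = next_generation
    return current


population = replicate_and_expand([4], growth=3, limit=8, steps=4)
print(population)       # [5, 6, 5, 6]
print(sum(population))  # 22
```

Starting from one system of four interactions, four steps yield four systems holding 22 interactions in total: the number of systems grows (replication) together with the total number of interactions (expansion), while no single system exceeds the assumed limit.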
6. CONCLUSIONS
The protein interaction world hypothesis described above formulates an alternative to
the most commonly accepted RNA world hypothesis about the origins of life. The
protein interaction world hypothesis is fundamentally different from the RNA world
hypothesis in the sense that while the RNA world hypothesis is built on the concept
of replication of RNA and DNA molecules, the protein interaction world hypothesis is
built on the assumption of replication and expansion of protein interaction systems
perceived as abstract communication systems.
The protein interaction world hypothesis is based on experimentally validated,
plausible assumptions about the emergence of early peptide/protein interaction
systems in the prebiotic Earth environment. The protein interaction world hypothesis
provides a systematic integrated view of how life emerged and developed, providing
well-defined places for RNA molecules that are seen as memories of protein
interactions, and DNA molecules considered as memories of RNA interactions.
Predictions based on the protein interaction world hypothesis about the sulphur
containing unusual RNA bases fit with experimental findings about these bases,
providing additional support for this hypothesis. Further predictions about halogen
containing bases are not yet validated by experimental evidence, but they point to
specific directions in which experimental evidence might be found.
The perception of living systems as abstract communication systems opens up new
avenues for research about the role of various organic molecules, the evolution of
their role in the context of living systems, and for the analysis of the boundary
between living and non-living systems. We believe that the protein interaction world
hypothesis can provide more parsimonious explanations of how living systems work
and organize themselves than other hypotheses about the origins of life like the RNA
world hypothesis.
REFERENCES
1. Ganti, T (1997). Biogenesis itself. Journal of Theoretical Biology, 187:583-593.
2. Joyce, GF (2002). Booting up life. Nature, 420:278-279.
3. Segre, D, Lancet, D (2000). Composing life. EMBO Reports, 1:217-222.
4. Woese, CR (1987). Bacterial evolution. Microbiological Reviews, 51:221-271.
5. Miller, SL, Orgel, LE (1974). The Origins of Life on the Earth. Englewood Cliffs,
NJ, Prentice Hall.
6. Botta, O, Bada, JL (2002). Extraterrestrial organic compounds in meteorites.
Surveys in Geophysics, 23:411-467.
7. Henry, DA (2003). ‘Star dust memories’ – A brief history of the Murchison
carbonaceous chondrite. Publications of the Astronomical Society of Australia,
20:vii-ix.
8. Pizzarello, S (2004). Chemical evolution and meteorites: An update. Origins of Life
and Evolution of the Biosphere, 34:25-34.
9. Deamer, DW (1997). The first living systems: a bioenergetic perspective.
Microbiology and Molecular Biology Reviews, 61:239-261.
10. Joyce, GF (1989). RNA evolution and the origins of life. Nature, 338:217-224.
11. Joyce, GF (2002). The antiquity of RNA-based evolution. Nature, 418:214-221.
12. Unrau, PJ, Bartel, DP (1998). RNA-catalysed nucleotide synthesis. Nature,
395:260-263.
13. Zubay, G, Schechter, A (2000). Current status of the RNA – only world.
CHEMTRACTS – Biochemistry and Molecular Biology, 13:829-836.
14. Cottin, H, Gazeau, MC, Raulin, F (1999). Cometary organic chemistry: a review
from observations, numerical and experimental simulations. Planetary and Space
Science, 47: 1141-1162.
15. Glavin, DP, Bada, JL (2004). Isolation of purines and pyrimidines from the
Murchison meteorite using sublimation. 35th Lunar and Planetary Science Conference,
paper 1022.
16. Robertson, MP, Miller, SL (1995). An efficient prebiotic synthesis of cytosine and
uracil. Nature, 375:772-774.
17. Zubay, G (1999). Synthesis of the first nucleotides. CHEMTRACTS –
Biochemistry and Molecular Biology, 12:432-452.
18. Zubay, G (2000). Biochemical pathways may provide leads to prebiotic pathways.
CHEMTRACTS – Biochemistry and Molecular Biology, 13:357-363.
19. Zubay, G, Schechter, A (2001). Prebiotic routes for the synthesis and separation of
ribose. CHEMTRACTS – Biochemistry and Molecular Biology, 14:117-124.
20. Dworkin, JP, Lazcano, A, Miller, SL (2003). The roads to and from the RNA
world. Journal of Theoretical Biology, 222:127-134.
21. Maynard-Smith, J, Szathmary, E (1997). The Major Transitions in Evolution.
Oxford, Oxford University Press.
22. Lacey, JC, Cook, GW, Mullins, DW (1999). Concepts related to the origin of
coded protein synthesis. CHEMTRACTS – Biochemistry and Molecular Biology,
12:398-418.
23. DeDuve C (1993). RNA without protein or protein without RNA ? In: What is
Life ? The Next Fifty Years, Murphy, MP, O’Neill, LAJ (eds.), Cambridge University
Press, Cambridge, UK, pp.79-82.
24. Charlton, BG, Andras, P (2003). The Modernization Imperative. Exeter,
Academic Imprint.
25. Luhmann, N (1996). Social Systems. Stanford University Press.
26. Maturana, HR, Varela, FJ (1980). Autopoiesis and Cognition: the realization of
the living. Boston, D. Reidel Publishing Company.
27. Fox, SW, Bahn, PR, Dose, K, et al. (1994). Experimental retracement of the
origins of a protocell – it was also a protoneuron. Journal of Biological Physics,
20:17-36.
28. Leman, L, Orgel, L, Ghadiri, MR (2004). Carbonyl sulphide mediated prebiotic
formation of peptides. Science, 306:283-286.
29. Kauffman, SA (1993). ‘What is life ?’: was Schrodinger right ? In: What is Life ?
The Next Fifty Years, Murphy, MP, O’Neill, LAJ (eds.), Cambridge University Press,
Cambridge, UK, pp.83-114.
30. Doudna, JA, Cech, TR (2002). The chemical repertoire of natural ribozymes.
Nature, 418:222-228.
31. Szathmary, E (2003). Why are there four letters in the genetic alphabet ? Nature
Reviews Genetics, 4:995-1001.
32. Segre, D, Lancet, D, Kedem, O, Pilpel, Y (1998). Graded autocatalysis replication
domain (GARD) : Kinetic analysis of self-replication in mutually catalytic sets.
Origins of Life and Evolution of the Biosphere, 28:501-514.
33. Dworkin, JP (1997). Attempted prebiotic synthesis of pseudouridine. Origins of
Life and Evolution of the Biosphere, 27:345-355.
34. Ferris, JP (2003). Montmorillonite catalysis of 30-50 MER oligonucleotides:
laboratory demonstration of potential steps in the origin of the RNA world. Origins of
Life and Evolution of the Biosphere, 32:311-332.
35. Huang, W, Ferris, JP (2003). Synthesis of 35-40 mers of RNA oligomers from
unblocked monomers. A simple approach to the RNA world. Chemical
Communications, 12:1458-1459.
36. Larralde, Robertson, MP, Miller, SL (1995). Rates of decomposition of ribose and
other sugars : Implications for chemical evolution. PNAS, 92:8158-8160.
37. Levy, M, Miller, SL (1998). The stability of RNA bases: Implications for the
origin of life. PNAS, 95:7933-7938.
38. Szabo, P, Scheuring, I, Czaran, T, Szathmary, E (2002). In silico simulations reveal
that replicators with limited dispersal evolve towards higher efficiency and fidelity.
Nature, 420: 340-343.
39. Nelson, KE, Levy, M, Miller, SL (2000). Peptide nucleic acids rather than RNA
may have been the first genetic molecule. PNAS, 97:3868-3871.
40. Wiener, N (1948). Cybernetics or, Control and communication in the animal and
the machine. New York, Wiley.
41. von Bertalanffy, L (1973). General System Theory: foundations, development,
applications. Harmondsworth, Penguin.
42. Perko, L (1996). Differential Equations and Dynamical Systems. New York,
Springer.
43. Orgel, LE (2000). Self-organizing biochemical cycles. PNAS, 97:12503-12507.
44. Imai, E, Honda, H, Hatori, K, Brack, A, Matsuno, K (1999). Elongation of
oligopeptides in a simulated submarine hydrothermal system. Science, 283:831-833.
45. Francis, BR (2000). A hypothesis that ribosomal protein synthesis evolved from
coupled protein and nucleic acid synthesis. CHEMTRACTS – Biochemistry and
Molecular Biology, 13: 153-191.
46. Deamer, DW, Chakrabarti, A (1999). The first living organisms: In the light or in
the dark. CHEMTRACTS – Biochemistry and Molecular Biology, 12: 453-467.
47. Koonin, EV (2003). Comparative genomics, minimal gene sets and the last
universal common ancestor. Nature Reviews Microbiology, 1:127-136.
48. Mueller, EG, Buck, CJ, Palenchar, CM, Barnhart, LE, Paulson, JL (1998).
Identification of a gene involved in the generation of 4-thiouridine in tRNA. Nucleic
Acids Research, 26:2606-2610.
49. Henderson, JP, Byun, J, Heinecke, JW (1999). Molecular Chlorine Generated by
the Myeloperoxidase-Hydrogen Peroxide-Chloride System of Phagocytes Produces
5-Chlorocytosine in Bacterial RNA. Journal of Biological Chemistry, 274:33440-33448.
50. Denli, AM, Tops, BBJ, Plasterk, RHA, Ketting, RF, Hannon, GJ (2004).
Processing of primary microRNAs by the Microprocessor complex. Nature,
432:231-234.
51. Gregory, RI, Yan, K-P, Amuthan, G, et al. (2004). The Microprocessor complex
mediates the genesis of microRNAs. Nature, 432: 235-241.
52. Naoki, S, Suzuki, T, Tamakoshi, M, Oshima, T, Watanabe, K (2002). Conserved
bases in the TΨC loop of tRNA are determinants for thermophile-specific
2-thiouridylation at position 54. The Journal of Biological Chemistry, 277:39128-39135.
53. McCloskey, JA, Graham, DE, Zhou, S, et al. (2001). Post-transcriptional
modification in archaeal tRNAs: identities and phylogenetic relations of nucleotides
from mesophilic and hyperthermophilic Methanococcales. Nucleic Acids Research,
29:4699-4706.
54. Watanabe, K, Kuchino, Y, Yamazuki, Z, Kato, M, Oshima, T, Nishimura, S
(1979). Nucleotide sequence of formyl-methionine tRNA from an extreme
thermophile, Thermus thermophilus. Journal of Biochemistry, 86: 893-905.
55. Sekowska, A, Kung, H-F, Danchin, A (2000). Sulphur metabolism in Escherichia
coli and related bacteria: Facts and fiction. Journal of Molecular Microbiology and
Biotechnology, 2:145-177.
56. Buck, M, Griffiths, E (1982). Iron mediated methylthiolation of tRNA as a
regulator of operon expression in Escherichia coli. Nucleic Acids Research,
10:2609-2624.
57. Maxwell, PJ, Longley, DB, Latif, T, et al. (2003). Identification of
5-fluorouracil-inducible Target Genes Using cDNA Microarray Profiling. Cancer
Research, 63:4602-4606.
58. Samuelsson, T (1991). Interactions of transfer RNA pseudouridine synthases with
RNAs substituted with fluorouracil. Nucleic Acids Research, 19:6139-6144.