Supplementary material for Thomas Cavalier-Smith

Supplementary material for Thomas Cavalier-Smith: Kingdoms Protozoa and Chromista and
the eozoan root of the eukaryotic tree. Biology Letters.
This electronic supplement contains additional explanations for my conclusions, discussion of the
drawbacks of alternative ideas in the literature, a summary of the revised classification of both
kingdoms with nomenclatural details (Table 1), and further references, which severe space
constraints did not allow including in the printed paper.
Since the final version of the paper was prepared I have found nine additional lines of evidence
for the root being between Euglenozoa and neokaryotes. As space constraints did not allow their
insertion into this paper I explain them elsewhere (Cavalier-Smith 2009d). They are: (1) absence of
the centromeric histone H3 variant CENPA (crucial for neokaryote centromere assembly) in
trypanosomatids and somewhat shorter N-terminal tails for histone H3 (with the segment embracing
the key lysine for neokaryote acetylation labelling for heterochromatinization missing) and histone
H4; (2) the RNase III dicer enzyme that generates small RNAs lacks two domains in
trypanosomatids (like the ancestral prokaryotic RNase III) that were arguably added stepwise in
neokaryotes and then neozoa; (3) absence in trypanosomatids (as in bacteria) of the PIWI paralogue
of the Argonaute proteins that targets double-stranded RNA for digestion; (4) absence in
trypanosomatids of RNA polymerase II transcription factors IIA, F, and H; (5) ER luminal quality
control of nascent glycoproteins is simpler in trypanosomatids with two enzymes that Neozoa use to
digest faulty ones (Mannosidase I and peptide-N-glycanase); (3) absence in trypanosomatids of
kinesins 4-8 and 15; (6) absence in trypanosomatids of widespread tail domains from two of the
three putatively ancestral myosins; (7) absence in trypanosomatids of the chromosomal cohesin
Smc heterodimer Smc5/6; (8) absence in trypanosomatids of the ER calcium-binding protein
calreticulin; (9) trypanosomatids have more primitive archaebacteria-like small nucleolar RNAs
(snoRNAs) involved in pre-rRNA processing than do neokaryotes. All nine characters are most
simply interpreted as the primitive condition for Euglenozoa (testable by studying them in
bodonids, diplonemids, and euglenoids) and also for eukaryotes as a whole, rather than secondary
simplifications of trypanosomatids alone.
Thus with ORC and TOM40 this makes at least 11 independent trypanosomatid characters
best interpreted as the primitive state for all eukaryotes and supporting the primary eukaryotic
dichotomy being between Euglenozoa and neokaryotes. The fact that these include such
fundamental and diverse cell properties as mitochondrial protein import, nuclear DNA replication
initiation, snoRNAs, and centromere biogenesis means that they cannot be dismissed as trivia and
highlights the importance of studying all these features intensively in a phylogenetically broad
spectrum of eukaryotes and carrying out genome projects for a similarly broad range of
deep-branching Euglenozoa.
Furthermore, on the Polo-like paralogue-rooted aurora kinase tree Euglenozoa are the most
divergent eukaryotes (Brown et al. 2004), as they are for each of the four paralogue subtrees of the
giant chromatin protein SMC family (Gluenz et al. 2008); the latter may be especially significant as
SMCs are extremely long and well-conserved proteins that seem to suffer much less from episodic
quantum evolution in the stems of paralogue-rooted trees than most other proteins and arguably give
better single-gene trees than almost any other protein normally used for deep phylogeny.
‘Chromophyte’ here refers collectively to all algae with chlorophyll c so as to contrast them with
non-photosynthetic Chromista such as Rhizaria, Ciliophora, Pseudofungi, and Heliozoa.
Throughout this paper ‘haem lyase’ refers always to the invariably nuclear-encoded
monomolecular haem (=heme to Americans) lyase of neozoa only. Sometimes the unrelated,
non-homologous multigene Ccms of excavates and corticates (encoded in the mitochondrial genome in
Loukozoa) are confusingly annotated in GenBank as haem or heme lyase, instead of by the more
usual term ‘cytochrome c-type biogenesis protein’ normally used for their bacterial homologues.
Allen et al. (2008) clearly explain all the different types of cytochrome c biogenesis enzymes
currently known. The fourth method of cytochrome c-type biogenesis in eukaryotes is that
chloroplasts use a second bacterial c-type biogenesis mechanism involving Res, probably
introduced by their cyanobacterial ancestor and transferred to chromophytes by the secondary
symbiogenetic red alga, but this was not discussed in this paper or shown on Fig. 1 as it is
irrelevant to rooting the tree.
Table 1. Revised Classification of the Kingdoms Protozoa and Chromista
Kingdom Protozoa† Owen 1858 emend.
  Subkingdom Eozoa†* Cavalier-Smith 1997 emend.
    Infrakingdom and Phylum Euglenozoa Cavalier-Smith 1981 (Euglenoidea,
        Diplonemea, Postgaardea**, Kinetoplastea)
    Infrakingdom Excavata† Cavalier-Smith 2002 emend.
      Phylum Percolozoa Cavalier-Smith 1991 (Pharyngomonadia and Tetramitia, i.e.
          Lyromonadea, Heterolobosea, Percolatea)
      Phylum Loukozoa† Cavalier-Smith 1999 (Jakobea, Malawimonadea, ?Diphyllatea)
      Phylum Metamonada Cavalier-Smith 1981 emend. 2003 (Anaeromonadea, Eopharyngia,
          Parabasalia)
  Subkingdom Sarcomastigota† Cavalier-Smith 1983 emend.
      Phylum Amoebozoa Lühe 1913 emend. Cavalier-Smith 1998
      Phylum Apusozoa Cavalier-Smith 1997 stat. n. 2003 emend. 2008
      Phylum Choanozoa† Cavalier-Smith 1981
Kingdom Chromista Cavalier-Smith 1981 emend.
  Subkingdom Harosa subk. n. Diagnosis: typically with cortical alveoli or tripartite
      ciliary hairs or reticulose or filose pseudopods or ciliary gliding. Etymology: HAR the
      initials of Heterokonta (=stramenopiles) Alveolata and Rhizaria (= SAR group of Burki et
      al. 2007); plus meaningless suffix –osa as used in such names as Filosa (within the rhizarian
      Cercozoa), Lobosa (Amoebozoa) and Conosa (Amoebozoa) referring also to at least
      partially amoeboid groups.
    Infrakingdom Heterokonta Cavalier-Smith 1986 (also known by the unnecessary junior (1989)
        synonym ‘stramenopiles’)
      Phylum Ochrophyta Cavalier-Smith 1986 (e.g. diatoms, brown algae, chrysophytes)
      Phylum Pseudofungi Cavalier-Smith 1986 stat. n. 1989 (Oomycetes, Hyphochytrea,
          Developayella)
      Phylum Bigyra Cavalier-Smith 1998 (Opalozoa, e.g. Actinophryida, Blastocystis; Bicoecea;
          Labyrinthulea)
    Infrakingdom Alveolata Cavalier-Smith 1991
      Phylum Myzozoa Cavalier-Smith 2004 (Dinozoa [dinoflagellates, ellobiopsids and
          perkinsids]; Apicomplexa [apicomonads, Chromera, Sporozoa])
      Phylum Ciliophora Doflein 1901 stat. n. Copeland 1956 (ciliates and suctorians)
    Infrakingdom Rhizaria Cavalier-Smith 2002 emend.
      Phylum Cercozoa Cavalier-Smith 1998
      Phylum Retaria Cavalier-Smith 1999 (Foraminifera; Radiozoa)
  Subkingdom Hacrobia (Okamoto et al. 2009***) subking. n.
      Phylum Cryptista Cavalier-Smith 1989 (Cryptophyceae, Goniomonadea; Katablepharidea,
          Telonemea)
      Phylum Haptophyta Hibberd ex Cavalier-Smith 1986
      Phylum Heliozoa Haeckel 1862 stat. n. Margulis 1974 em. Cavalier-Smith 2003
          (Centrohelea)
† Paraphyletic; the validity and importance of ancestral (paraphyletic) taxa (e.g. Bacteria, Protozoa,
Eozoa, Excavata, Loukozoa, Choanozoa, Sarcomastigota) is explained elsewhere (Cavalier-Smith
2009c).
Only these four taxa and those they include should be treated under the International Code of Botanical
Nomenclature. All other chromist and all protozoan names should be subject to the International Code
of Zoological Nomenclature.
* I invented the name Eozoa as a subkingdom name (Cavalier-Smith 1997) for the protozoan phyla
Euglenozoa, Percolozoa, and Trichozoa (the latter now subsumed within Metamonada: Cavalier-Smith
2003b). Here I emend the subkingdom by formally adding the phylum Loukozoa (Cavalier-Smith
1999) and all taxa now included in Metamonada Cavalier-Smith (2003b).
**Includes Postgaardi and Calkinsia (Cavalier-Smith 2003b,c). Separating Calkinsia into a new
unranked higher taxon with a new name (Yubuki et al. 2009) was not merited by morphology and was
unwise in the absence of molecular data for Postgaardi.
***These authors introduced this name for the clade comprising the last common ancestor of
haptophytes and cryptomonads and all its descendants, but without assigning a taxonomic rank. Here I
formally adopt the same name for this new subkingdom (Diagnosis as in their paper p. 5), which
currently has the same taxonomic composition as in their paper (I formally exclude from the
subkingdom those dinoflagellates that are partially descended from haptophytes by acquiring their
plastids and various genes).
__________________________________________________________________________________
Note 1. Dual Green and Red secondary symbiogenesis in the origin of kingdom Chromista
‘I think it best to put forward simple, detailed and specific hypotheses, since these have a better chance of
stimulating (and being refuted or corroborated by) future research than are vague or unnecessarily
complicated ones.’ (Cavalier-Smith 1993a p. 339.)
This radical new interpretation unifies for the first time three independent recent discoveries: (1)
that all four chromalveolate groups (Haptophyta, Cryptista, Heterokonta, Alveolata) accepted to
have acquired their chloroplasts symbiogenetically from a red alga also contain scores, hundreds, or
over a thousand nuclear genes that are proposed to be specifically related to those of
non-streptophyte (possibly prasinophyte) green algae (in heterokonts even more than those specifically
related to reds) (Moustafa et al. 2009); (2) that the chlorarachnean alga Bigelowiella, which belongs
to the phylum Cercozoa within Rhizaria, accepted to have acquired its nucleomorph and plastid
from a non-streptophyte (probably ulvophyte) green alga (Ishida et al. 1997) also contains
numerous genes of probably red algal origin (Archibald et al. 2003); (3) that Rhizaria, including
Cercozoa and Bigelowiella, robustly group on multigene trees as sister to Heterokonta/Alveolata
within the chromist subkingdom Harosa as its deepest branch. Moustafa et al. (2009) pointed out
that the presence of many of the same putatively prasinophyte ‘green’ genes in all four
chromalveolate lineages, including the entirely plastid-free Ciliophora (Frommolt et al. 2008)
strongly indicates that they were implanted in the common ancestor of all four groups (incidentally
strongly supporting their monophyly independently of earlier evidence for this), and furthermore
makes it probable that they were implanted over a relatively short time by gene
transfer from a previously unrecognised intracellular green algal symbiont (endosymbiotic gene
transfer: EGT) in the common ancestor of chromalveolates rather than by numerous independent
lateral transfers (LGT) spread over a long evolutionary timespan.
Moustafa et al. (2009) refer to their reasonably postulated ancestral green algal
endosymbiosis as ‘cryptic’, meaning presumably that no visible cellular evidence of it survives
today comparable to the chlorophyll c-containing plastids of chromalveolates or the nucleomorphs
of Cryptophyceae. However, I suggest here that the increasingly strong evidence that Rhizaria
branch within Chromista after Harosa and Hacrobia diverged means that the postulated second
ancestral green symbiogenesis (if it genuinely occurred; see technical caveat below) might not have
been cryptic at all, but may persist to this day in the form of the green algal chloroplast of
chlorarachnean Rhizaria and their nucleomorph (Cavalier-Smith 2006a). Particularly as the donor
of the chlorarachnean chloroplast and nucleomorph and the donor of the chromalveolate green
genes both appear to have been a chlorophyte, non-streptophyte alga, it would not be parsimonious
to assume one green algal symbiosis in the common ancestor of all chromists to explain the data of
Moustafa et al. (2009) and another (perhaps not much later) in just one chromist lineage to explain
the origin of chlorarachnean algae, unless one were sure that each involved a different group of
green algae, which is not currently the case. Assuming one green algal symbiogenesis only has
important and testable implications also for the red algal symbiogenesis (below).
Those averse to accepting evolutionary loss will immediately point out that this entails
accepting several more losses of the green algal chloroplast and nucleus within Chromista than the
Frommolt et al. (2008) and Moustafa et al. (2009) idea of a cryptic endosymbiosis, which assumes
only one such loss for both organelles, but no persistence in chlorarachneans, as if that were a
disadvantage of this new simplifying hypothesis, which it is not. Already one must accept at least
4-5 independent losses of the red algal nucleomorph within Chromista and still more losses of the
‘red’ plastid and even more losses of photosynthesis but with retention of a leucoplast. Loss is
pervasive in cellular evolution and often (not always) much easier mechanistically than gain of
complex characters whether by symbiogenesis or otherwise. Discussing parsimony about the
number of qualitatively incomparable events in alternative scenarios is very misleading without
realistically weighting them according to the mechanistic changes involved. The tiny cost in
parsimony of assuming about 5-7 losses of the green nucleomorph and plastids within Chromista
(the exact number is uncertain because the basal topology of the tree within Rhizaria is still
unsettled) must be weighed against the great economy of hypothesis that the present theory of a
dual concerted secondary symbiogenesis can provide.
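The weighting argument above can be illustrated with a minimal sketch. It is not part of the original analysis: the event counts are taken loosely from the text (about 5-7 losses under the dual theory, taken here as 6; one loss under the cryptic endosymbiosis idea, which also needs a second green symbiogenesis for chlorarachneans), whereas the relative weights are purely hypothetical values chosen only to show how weighting can reverse a ranking based on raw event counts.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    name: str
    events: dict  # event type -> number of times the scenario assumes it


# Hypothetical weights (an assumption, not a measured quantity): one secondary
# symbiogenesis is treated as mechanistically as "costly" as 20 organelle losses.
WEIGHTS = {"green_symbiogenesis": 20.0, "green_organelle_loss": 1.0}

# Counts loosely based on the text; 6 stands in for "about 5-7" losses.
DUAL = Scenario(
    "single green symbiogenesis persisting in chlorarachneans",
    {"green_symbiogenesis": 1, "green_organelle_loss": 6},
)
CRYPTIC = Scenario(
    "cryptic green symbiosis plus a separate chlorarachnean one",
    {"green_symbiogenesis": 2, "green_organelle_loss": 1},
)


def unweighted_cost(scenario):
    """Raw event count: every event counts as one step."""
    return sum(scenario.events.values())


def weighted_cost(scenario, weights):
    """Event count with each event type scaled by its assumed difficulty."""
    return sum(weights[kind] * n for kind, n in scenario.events.items())


if __name__ == "__main__":
    for s in (DUAL, CRYPTIC):
        print(s.name, unweighted_cost(s), weighted_cost(s, WEIGHTS))
```

With equal weights the dual scenario looks costlier (7 events against 3), but under the assumed weights it becomes the cheaper one (26 against 41). The conclusion depends entirely on the weights chosen, which is why the relative mechanistic difficulty of loss and gain must be argued for rather than ignored.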
There is no need to assume a serial symbiogenesis as did Moustafa et al. (2009). A
temporally overlapping dual symbiogenesis is more parsimonious as it would allow the initial stage
of the evolution of protein-targeting of rough ER-made proteins into the green and red chloroplasts
to be shared. This would halve the evolutionary difficulty of secondary symbiogenesis compared
with the traditional idea that the green and red enslavements took place in separate cells
(Cavalier-Smith 1999, 2000, 2003). Therefore I now suggest that in the stem lineage of Chromista green and
red symbionts became enslaved in the same cell and that at least some of the metabolite exchange
proteins that arguably initiated that process are shared between chlorarachneans and
chromalveolates. Furthermore I suggest that the initial stages of protein targeting from the ER to the
plastid were also shared, there being a common set of new SNAREs that targeted Golgi vesicles
indiscriminately to both the red and green plastids.
This predicts that when elucidated the plastid-destined vesicle targeting systems of the
harosan groups Alveolata and Chlorarachnea may share more properties than would be expected if
they evolved independently. The next stage in targeting is across the periplastid membrane, the
former plasma membrane of the enslaved algae. In chromalveolates this is mediated by transit-like
presequences that probably evolved from chloroplast transit sequences (Cavalier-Smith 1999,
2003a) which are recognised by Derlin proteins (Der1) of a relocated periplastid membrane-specific
version of the ERAD export machinery used to export damaged proteins from the ER of all
eukaryote cells, and which evolved by gene duplication in the ancestral chromist (Bolte et al. 2009;
Hempel et al. 2009; Agrawal et al. 2009). The unity of this periplastid membrane machinery in all
chromalveolates is one of the two strongest lines of evidence for there having been only one
secondary symbiogenetic intracellular enslavement of a red alga in the history of life.
Possibly the chlorarachnean periplastid membrane protein system also might turn out to
have more in common with the chromalveolate periplastid Derlin system than would be expected if
the green and red symbiogeneses evolved independently. However, a common origin of the
periplastid membrane transport machinery in chlorarachneans and chromophytes, though a
permissible feature of the dual origin theory, is not a necessary one; what one might expect would
depend on the relative timing of the events and how much common evolution proceeded before
each divergence shown in figure 1. I have long argued that symbiogeneses could be completed
surprisingly quickly, and that the basal divergences of both Plantae and Chromista were probably
much more rapid than most biologists would intuitively expect and that such rapid divergence is the
chief reason why the topology of both kingdoms is hard to resolve on sequence trees
(Cavalier-Smith 1993). Moreover, if two different secondary plastids were indeed being enslaved partially
simultaneously, one would expect some divergent selection for specificity to cause their import
machinery to diverge even if it had some shared features originally (just such divergence occurred
during temporary coexistence in the tertiary symbiogenetic replacement of a dinoflagellate plastid
by one from haptophytes: Patron and Waller 2007). Present evidence is indecisive:
transit-peptide-like sequences of the rhizarian Bigelowiella are similar in amino acid composition to those of
Apicomplexa (Alveolata) but that for the RuBisCo gene did not support targeting into the plastid of
Toxoplasma in a heterologous transformation experiment (Rogers et al. 2004).
The above scenario assumes that the ‘green’ genes of Moustafa et al. and the chloroplast and
nucleomorph of Chlorarachnea both came from the same secondary symbiogenesis and therefore
from the same species of non-streptophyte green alga. Currently it is unclear whether or not this
assumption is valid. An early EF-Tu protein sequence tree suggested that the donor for
chlorarachneans was a chlorophyte belonging to the class Ulvophyceae (Ishida et al. 1997) and 18S
rRNA trees also suggested this but with extremely weak support (Silver et al. 2007). 70-gene
chloroplast trees robustly rule out both streptophytes and prasinophytes as donor and indicate a
position deep within tetraphytine green algae (Cavalier-Smith 2007) close to Ulvophyceae (Turmel
et al. 2009). By contrast the analyses of Moustafa et al. (2009) suggest Prasinophyceae as the gene
donor and apparently rule out streptophytes. However, as there are no genome sequences for
Ulvophyceae, their trees cannot contain ulvophyte sequences, making it premature to conclude that
most of the genes came from prasinophytes rather than ulvophytes (or close relatives of them).
When more green algal genomes, including several ulvophytes are available it should be possible to
distinguish between a single green algal (probably ulvophyte) secondary symbiogenesis only in
Chromista, as proposed here for its heuristic simplicity, or an additional separate cryptic symbiosis
from a different (prasinophyte) donor as they suggest.
Tertiary symbiogenesis is a red herring for understanding chromist deep phylogeny.
The unity of chromalveolates shown by the periplastid targeting machinery, and equally
compellingly but entirely independently, by the gene duplication, plastid retargeting and gene
replacement of plastid GAPDH and plastid FBA in all four chromalveolate groups (Fast et al. 2001;
Patron et al. 2004), proves beyond any shadow of doubt that only one secondary symbiogenetic
enslavement of a red alga was involved in the origin of chromalveolate plastids.
But this triply compelling evidence does not in itself rule out the possibility that chromist
plastids and nuclei with all these genes were also transferred bodily and laterally among distantly
related lineages, a process known as tertiary symbiogenesis (whose possibility for chromists I first
emphasized: Cavalier-Smith et al. 1994). Indeed one such case is proven: the replacement of the
typical peridinin-containing plastid of one small lineage of dinoflagellates by a foreign
fucoxanthin-containing one from a haptophyte (Patron et al. 2006). However, this tertiary symbiogenesis was
almost certainly a replacement of a pre-existing chloroplast, as evidence for genes of both still
persisting attests (Patron et al. 2006). In principle, already having a plastid of secondary origin and
nuclear genes coding for the machinery for protein import across several membranes and a
thousand or more genes encoding proteins with topogenic sequences recognised by that machinery
ought to facilitate replacement of that chloroplast by a foreign one with similar machinery, which
ought therefore to be much easier evolutionarily than tertiary implantation of a plastid with four
bounding membranes into a purely heterotrophic lineage that never had a plastid. It would therefore
be fallacious to argue that this known case of tertiary chloroplast replacement makes it acceptable
to assume that tertiary symbiogenesis into a plastid-free lineage, which must be evolutionarily
extremely difficult (not one example exists), is anywhere near as likely as the evolutionary loss of
plastids, which could in principle result from a single mutation.
Nonetheless Sanchez-Puerta and Delwiche (2008) postulated that tertiary symbiogenesis
occurred either once or twice into lineages that they suggest originally never had plastids (as others
also have, but I single out their hypothesis for criticism as it is better argued than most). Exactly
why they suggested this is unclear, as they gave no explicit reasons. Reading between the lines I can
only suppose that they do not want to accept that Rhizaria had a photosynthetic ancestor with a red
algal plastid and would prefer not to accept this for Ciliophora either, and assume (wrongly I think)
that chromist monophyly should be easier to demonstrate on sequence trees than it is. But accepting
a red algal plastid in the ancestral rhizarian adds only one additional plastid loss, and accepting one
in ciliates also (which they seem more ready to, for an unstated reason) adds just two losses, to the
several with which they seem to have no problem; that hypothetical reduction in losses is an
extremely weak justification for their incredibly complicated and entirely unnecessary scenario.
As stressed long ago, worry over Chromista and Plantae not appearing monophyletic on
many sequence trees is misplaced as this inconclusive resolution was expected for sound
evolutionary reasons (Cavalier-Smith 1993a p. 331-2). The problem was worst with single-gene
trees but persists with multigene trees. However most multigene trees do group the main
chromalveolate taxa together in pairs (haptophytes with cryptists and heterokonts with alveolates:
Burki et al. 2007, 2008, Hackett et al. 2007) and the most recent, most taxonomically
comprehensive tree based on the most genes (127; Burki et al. 2009) groups all four together with
moderately good support, with of course the inclusion also of Rhizaria and Heliozoa - which simply
indicates that these two taxa evolved secondarily from chromophyte ancestors, as was long
considered likely for the heterotrophic heterokont phyla (Pseudofungi and Bigyra) and should now
also be accepted for Ciliophora. This latest multigene tree thus shows the monophyly of both
Plantae (now generally accepted, but about which there also was scepticism for decades since I first
advocated it: Cavalier-Smith 1981, 1982) and Chromista in the expanded sense of this paper, and
shows Plantae and Chromista as sisters (i.e. monophyletic corticates) as on figure 1 of this paper.
Thus it appears that the failure of Plantae and Chromista to appear as two distinct
non-interdigitating clades on so many published trees may indeed simply be poor resolution resulting
from extremely rapid radiation after the single primary symbiogenesis that all now accept made
chloroplasts and the single secondary symbiogenesis that made the chromalveolate/chromist plastid
(as long argued: Cavalier-Smith 1993a). Thus there is no reason to postulate multiple tertiary
symbiogenesis to explain the origins of the ancestral plastid characterising any of the four main
chromophyte groups on the mistaken grounds that the kingdom Chromista (now including Rhizaria
and Heliozoa) is not monophyletic. As taxon sampling for hundreds of genes improves, evidence
that Chromista are both monophyletic and holophyletic will probably grow stronger still.
Moreover, as Archibald (2009) points out, the lateral gene transfer shared by haptophyte and
cryptophyte chloroplast genomes, in which a bacterial ribosomal protein gene (rpl36; see Fig. 1 in
blue) replaced the endogenous one, severely limits the tertiary symbiogeneses that could justifiably
be assumed. One could not reasonably postulate lateral transfer between the two chromist
subkingdoms recognised here, in an effort to explain the origins of their main plastids, at any time
after the donor subkingdom had undergone its primary bifurcation to produce its two main
photosynthetic lineages. This is because after that time the donor plastid would lack either the
replacement bacterial gene or the original endogenous gene that is now present in the postulated
recipient group’s plastids. Thus only tertiary symbiogeneses from a stem group prior to the date of
the LGT into hacrobian plastids can permissibly be postulated (Archibald 2009). However such early
transfers cannot explain the extensively mosaic presence and absence of plastids within Hacrobia,
Heterokonta, and Alveolata; as Archibald (2009) correctly stresses, one would still have to accept
several plastid losses within each group. Yet Sanchez-Puerta and Delwiche (2008) ignore this
restriction in their specific proposals: either two separate tertiary symbiogeneses from a haptophyte
to the ancestors of Myzozoa and Heterokonta or just one to their last common ancestor. Both
hypotheses contravene the rpl36 replacement constraints by gratuitously assuming that the original
and replacement genes both persisted for a long time in the donor lineage (up to the time of the
postulated symbiogenesis) and that the replacement gene was lost at least five times independently
(by Cryptophyceae, at least twice within haptophytes and by heterokonts). Many would expect the
replacement by recombination to be almost instantaneous and thus regard this scenario as most
implausible.
Bodyl et al. (2008) also invoked tertiary symbiogeneses (most unparsimoniously postulating
four!), claiming even that a single secondary symbiogenetic insertion of a red alga is impossible
because the topology of the multigene tree of Burki et al. (2008), in which Hacrobia are sisters to
Plantae not to Harosa, would imply that this had to take place before red algae had even evolved.
However, Burki et al. (2009) using 127 genes and a much richer taxon sample, especially of
Hacrobia, have now shown that the earlier tree was probably incorrect in that respect and that
Hacrobia and Harosa are probably sisters, their tree showing a holophyletic Chromista (in the
present sense) with moderately good support. Thus, contrary to the claim of Bodyl et al. (2008),
there is no solid phylogenetic objection to a single ancestral chromistan secondary acquisition of a
red alga. Far from being impossible, it is highly likely.
At best, invoking tertiary symbiogeneses can slightly reduce the number of plastid losses that
must be accepted; it cannot eliminate the need to accept evolutionary plastid losses in some
lineages. But there is no scientific merit in postulating numerous evolutionarily extremely complex
and mechanistically onerous events (tertiary symbiogeneses) to avoid postulating a similar number
of evolutionarily and mechanistically extremely simple ones (plastid losses). In evolutionary
biology one must distinguish between the possible and highly likely and the possible but extremely
unlikely. Mere possibility alone is no reason to promote an unnecessarily complex explanation with
less likely assumptions. Another limitation of such tertiary symbiogenesis ideas is that they address
only the origins of plastids, not those of the tubular ciliary hairs of Cryptista and Heterokonta,
which were the second major reason for establishing the Chromista (Cavalier-Smith 1981, 1986),
and are evidence independent of plastids for chromist monophyly (assuming they are indeed
homologous, which remains to be tested by molecular biology). One could not explain the presence
of tubular hairs in heterokonts by tertiary transfer from haptophytes, which lack them (putatively
secondarily: Cavalier-Smith 1986, 1994; multiple losses of tripartite hairs are now clear within
Heterokonta and likely within Cryptista: Cavalier-Smith and Chao 2006; Cavalier-Smith 2004).
Molecular data are also needed to test whether the simple ciliary hairs of Myzozoa and solid hairs
of Goniomonadea are related to the tubular hairs of heterokonts, Cryptophyceae and Telonemea
despite their markedly different morphology.
The discovery of the multitude of genes of green algal origin in both chromist subkingdoms
(Moustafa et al. 2009) also favours a photosynthetic ancestry for all chromists and their monophyly.
It could not be readily explained by the specific tertiary symbioses of Sanchez-Puerta and Delwiche
(2008) or of Bodyl et al. (2008) or of Fig. 2 of Archibald (2009). As heterokonts (the postulated
recipients) seem to have several times as many ‘green’ genes as haptophytes (the postulated donors)
supposing that they got them from haptophytes is implausible.
The presence of genes of red algal origin in Bigelowiella is not specific support for the dual
theory; it may simply be a consequence of the rhizarian ancestor having had an ancestor containing
a nucleomorph of red algal origin. However, the presence of 20:5(n-3) fatty acids in the glycolipids
of the rhizarian chlorarachnean algae, which are characteristic of red algae but unknown in green
algae (Leblond et al. 2005), specifically favours the dual simultaneous symbiogenesis theory as
these lipids are probably now located in the green chloroplast, suggesting at least a brief period of
coexistence of green and red plastids in the same cell in early Cercozoa.
Biologists have been too ready to invoke multiple origins, whether by LGT or by
symbiogenesis, when an ancestral presence of multiple characters followed by several differential
losses of one character or another often offers a simpler and more likely explanation of patchy
character distribution, as exemplified by the complex mutually exclusive distribution of the
alternative protein synthesis elongation proteins (EF1-α and EFL) in Euglenozoa (Gile et al. 2009).
Contrary to earlier assumptions of repeated LGT the simplest interpretation now is that both were
present in the ancestral euglenozoan and that these two genes evolved by gene duplication and
divergence from a single ancestral prokaryotic protein in the ancestral eukaryote.
Technical Caveat
Like Dagan and Martin (2009) I am concerned that when dealing with ‘thousands of trees some
trees will give erroneous results purely by chance’ and that ‘what constitutes “evidence” in the
analysis of thousands of gene trees remains subjective’, and thus I think that the assumption of a
massive early influx of green algal genes into chromalveolates (Moustafa et al. 2009) is at least
partly open to question. In particular Moustafa et al. (in their electronic supplement) too glibly
dismiss the possibility that Plantae really are sisters of Chromista as shown on the best multigene
trees (e.g. Burki et al. 2009) and figure 1 here. If that is so, the first step of their methodology,
searching for diatom genes more related to red and green algae than to any other non-chromist taxa,
could largely be selecting for genes that were vertically inherited from the common ancestor of
Plantae and Chromista. Chance and lineage-specific biases and model violations could make the
diatoms/chromists artefactually sister to (or even branch within) either green plants or red algae in a
significant proportion of the trees even if their correct position were as sister to reds + greens +
glaucophytes (as no glaucophyte genome is available they would be missing from many trees). The
vastly greater number of green plant than red genes in the target data set could have biased the
initial selection of genes towards the greens.
Such tree topology artefacts are particularly likely because most of the genes studied probably
had fewer than 250 alignable amino acids (they gave no data on that, but on average each gene of
Burki et al. (2009) contributed only 230 amino acids) and the chances are that many (conceivably
most) of the thousands of genes they screened automatically just gave bad trees that do not
accurately reflect their evolutionary history. As they only included two trees in the supplement, one
cannot form a judgement about this, the adequacy of taxon sampling, or whether the tree topology
was sensible in other respects. For the vast majority of these genes there are no published trees and
no track record of giving congruent results in general; remarkably many of the green ones
especially are hypothetical of unknown function. If chromists originated 700 My ago, Plantae 770
My ago and the red-green split was 735 My ago and the unikont-corticate split was 800 My ago (all
reasonable guesses based on the fossil record), then only amino acid changes that occurred in the
narrow 770-735 My ago window and were not overwritten by changes in either the chromist or
plant lineages in the following 735 My would retain evidence for the true position. If changes were
random along the molecule and unbiased through time the overwriting problem means there is very
little chance that any protein would have more than one or two amino acids that happened to evolve
that way and most might have none; either the gene would be evolving too slowly to have many
amino acid changes in the relevant relatively short time interval or too fast to retain them
subsequently. Of course molecules do not evolve that predictably, as evolution is more erratic and
biased. Such biases, e.g. frozen covarion effects or temporary rapid evolution, will sometimes
accentuate the true history for some deep branches and make it easier than expected to recover the
correct history by that gene and sometimes instead lead to incorrect topologies with strong enough
‘support’ to fool us. The PhyML method used is particularly prone to getting stuck in local minima
and giving contradictory branch topologies with high support for different genes, and the aLRT of
0.75 gives no assurance that genes labeled ‘green’ actually came from green algae.
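The overwriting argument above can be illustrated with a toy calculation. The sketch below is not from the original analysis: it assumes a constant per-site substitution rate with Poisson-distributed changes, takes the 770-735 My window and the following 735 My (counted once each for the chromist and plant lineages) from the dates guessed above, and uses the 230-amino-acid average gene length quoted for Burki et al. (2009). The function names are hypothetical.

```python
import math

# Dates are the rough guesses given in the text (in My).
WINDOW_MY = 770 - 735       # interval in which informative changes could arise
OVERWRITE_MY = 2 * 735      # later time, summed over chromist and plant lineages


def retained_fraction(rate):
    """Fraction of sites that changed in the window and were never overwritten.

    rate: substitutions per site per My (assumed constant; Poisson model).
    """
    changed_in_window = 1.0 - math.exp(-rate * WINDOW_MY)
    never_overwritten = math.exp(-rate * OVERWRITE_MY)
    return changed_in_window * never_overwritten


def best_rate():
    """Rate that maximizes retained_fraction (from setting its derivative to zero)."""
    return math.log((WINDOW_MY + OVERWRITE_MY) / OVERWRITE_MY) / WINDOW_MY


def expected_sites(length, rate):
    """Expected number of informative sites in a protein of the given length."""
    return length * retained_fraction(rate)


if __name__ == "__main__":
    r_opt = best_rate()
    for label, r in [("10x slower", r_opt / 10), ("optimal", r_opt), ("10x faster", r_opt * 10)]:
        print(f"{label:>10}: {expected_sites(230, r):.2f} informative sites")
```

Under these simplifying assumptions even a gene evolving at the most favourable rate retains only about two informative sites in 230 positions, and genes evolving ten times slower or faster retain far fewer, in line with the verbal estimate of "one or two amino acids". Real proteins violate the constant-rate assumption, as the text goes on to note.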
Thus I think the number of genes that purportedly came from green algae into the first
chromist may be substantially overestimated. Nonetheless, it seems likely that there is a real signal
amidst the almost inevitable phylogenetic noise, especially as the strong bias towards a non-streptophyte affinity in their results, which would not obviously be expected from the
considerations just mentioned, is concordant with the evidence for a non-streptophyte green algal
secondary symbiogenesis making the chlorarachnean algae, which now seem to have had a deep
common history with the chromalveolates. The dual symbiogenesis hypothesis simply and
economically accounts for this non-streptophyte bias in the results. Among green algae the marine
Ulvophyceae and largely marine prasinophytes are a priori more likely symbionts for the ancestral
chromist, which was almost certainly marine (Cavalier-Smith 2009a), as they are more abundant in
the oceans than either Chlorophyceae or unicellular streptophytes, which are both largely freshwater
or soil organisms. Thus even if there were two independent secondary green symbioses in
chromists both would probably have involved non-streptophytes. The parsimony in the present
hypothesis therefore comes not from the mere fact that both lines of evidence point to non-streptophyte green algae, but from a potential mechanistic economy in the initial stages of the red
and green symbiogenesis. The significance of this broad taxonomic agreement over the donor is
simply that it fails to contradict the present theory of a common cause for both observations. There
is a pleasing symmetry in the chromist part of figure 1 in that in each subkingdom the first
diverging branch includes an algal class that retains a nucleomorph and either a green or a red
plastid; in each case the later branches retain only the red plastid or neither.
More and more new data, like the discovery of photosynthetic apicomplexans, e.g.
Chromera (Moore et al. 2008) and of colourless plastids in species often assumed never to have had
algal ancestors (both within Myzozoa and Heterokonta), attest to chloroplasts being ancestral for
Myzozoa, which many long resisted, and thus increase the plausibility of their antiquity in
Chromista, albeit falling short of proving their presence in their last common ancestor (Burki et al.
(2009) and Archibald (2009) discuss this further).
Chromista (1981) versus chromalveolates (1999). Placing alveolates within Chromista expands
Chromista so all chromalveolates are now included within Chromista. Adding also Rhizaria and
centrohelid Heliozoa to Chromista has made it now even broader than my original
‘chromalveolates’ (Cavalier-Smith 1999), a name unnecessary for formal classification. Adl et al.
(2005) unwisely tentatively introduced the slightly different name Chromalveolata as a clade name
and with a totally inadequate diagnosis, but it did not include either Rhizaria or Heliozoa, so it is
not a synonym for Chromista in either its original sense or the new broader one established in this
paper. Chromalveolata as they defined it is paraphyletic; Moustafa et al. (2009) used it in a broader
sense to include also Rhizaria. The name Chromista was introduced for a kingdom (Cavalier-Smith
1981) and has historical precedence and is shorter and in my view greatly preferable as the taxon
name to ‘chromalveolates’, which simply started as a convenient name for an alignment file on my
computer. Adl et al. did not make Chromalveolata a taxon or rank it by a conventional Linnean
category. It might be least confusing to retain ‘chromalveolates’ as an informal name for the
paraphyletic group comprising Heterokonta, Alveolata, Haptophyta, and Cryptista (its original and
still most widely used meaning), should anyone wish to retain it, rather than to expand the concept
to include Rhizaria and Heliozoa, which would make it an unnecessary junior synonym of
Chromista as here expanded. However, I do not envisage many circumstances in which one would
want a term denoting chromists other than Rhizaria and Heliozoa, except as a means to reduce
confusion in the transitional period whilst the wider meaning of the older and more euphonious
Chromista becomes adopted. For the photosynthetic chromists, a still narrower paraphyletic subset
of the chromalveolates, the older term ‘chromophytes’ suffices and will often remain useful.
Note 2. Further explanation of Fig. 1 and Tom40 and ORC distribution
Tom40 distribution. Published data on the distribution of Tom40 was restricted to genomes
completely sequenced some while ago and therefore did not include free-living Metamonada,
Loukozoa, Percolozoa, Diplonemea or Euglenoidea. I made some additional BLAST studies for
Eozoa. Using GenBank I readily detected a Tom40 homologue in the free-living metamonad
Trimastix pyriformis, and using the JGI website for the now completed genome of the percolozoan
Naegleria gruberi I found one Tom40 homologue: estExt_fgeneshHS_pg.C_460029. I also
consulted the Protist EST database at the University of Montreal (http://tbestdb.bcm.umontreal.ca),
which contains ESTs and automatically annotated BLAST hits for 60 protists including 6 Loukozoa
and 3 non-kinetoplastid Euglenozoa. I found one annotated putative Tom40 homologue cluster
SEL00000632 for the jakobid Seculamonas, which when reblasted had the putative Trimastix
Tom40 among its top hits, but no putative homologues in the two euglenoids and one diplonemid
were listed. But complete genomes are obviously needed for better evidence that they are truly
undetectable in all three main groups of Euglenozoa. Since even Microsporidia and Giardia have
Tom40 (and Microsporidia at least also the major receptor Tom70) in their mitosome outer
membranes, despite their dramatic simplification compared with aerobic mitochondria, and since in
general Microsporidia proteins evolve much faster even than those of trypanosomatids yet one can
identify their Tom40 by BLAST, its undetectability in trypanosomatids cannot easily be dismissed
as simply secondary divergence. Direct biochemical studies of the mitochondrial targeting
machinery are needed in trypanosomatids, euglenoids and diplonemids to determine what proteins
they use and whether porin VDAC is part of this machinery. Such studies are important not only for
testing my present hypothesis of the location of the eukaryotic root, but also for better
understanding the evolutionary flexibility and origins of the protein-import machinery during the
origin of mitochondria.
Distribution of the origin recognition complex (ORC). I also carried out BLAST studies to
clarify the distribution of ORC and two ancillary proteins, Cdc6 and Cdt1, which interact with it in
neozoa. In neozoa ORC consists of six proteins: five evolutionarily related proteins, Orc1-5, which
share a major domain related in turn to Cdc6, plus Orc6, a smaller and faster-evolving protein
unrelated to any of the others. Cdt1 belongs to a third protein family. The collective function of
these eight proteins is to load the hexameric Mcm2-7 DNA helicase complex onto DNA at the
proper sites recognized by ORC and its two associated proteins, so that the helicase can open the
double helix to allow access by the replication machinery. In Archaebacteria the same function is
mediated by homologues of Cdc6 and Cdt1 only, ORC being absent (Robinson and Bell 2007).
During the origin of eukaryotes there was a major increase in chromatin complexity associated with
the evolution of additional histones and mitosis (Cavalier-Smith 2002); as part of this process
Mcms, which in archaebacteria consist of a single protein that forms a homohexamer, underwent
duplication and divergence to form a heterohexamer (Liu et al. 2009). It has been assumed that a
heterohexameric ORC also evolved at that time (Duncker et al. 2009), but Godoy et al. (2009)
provide evidence that hexameric ORC is absent from trypanosomes and suggest that they exemplify
a primitive, archaebacteria-like, evolutionary phase before the gene duplications that created a
heterohexameric ORC. My BLAST results support this and suggest that ORCs arose and increased
in complexity after the origin of eukaryotes in four distinct phases:
I suggest that, as in archaebacteria, trypanosomatid and probably also other euglenozoan
Mcms are loaded onto replicon origins solely by Cdc6 and a distant homologue of Cdt1, even
though they have a heterohexameric Mcm. I propose that a primitive ORC first evolved in the
ancestor of neokaryotes, and may have contained as few as two different proteins: Orc6 and Orc2
(i.e. just one member of the Orc1-5/Cdc6 family). Then a further increase in complexity involved
the origin of at least Orc4 in a common ancestor of Metamonada and Neozoa (as Orc4 is detectable
by BLAST in the metamonad Trichomonas but not the percolozoan Naegleria or Euglenozoa;
http://genome.jgi-psf.org/Naegr1/Naegr1.home.html) (Fig. 1). Then a major change in the ancestor
of neozoa spliced a large chromodomain (BAH) onto the N-terminal end of an Orc1-5/Cdc6 domain
creating the much larger neozoan Orc1 (Fig. 1). This might be associated with substantial changes
to heterochromatin biology in neozoa compared with Eozoa. Finally an extra domain seems to have
been added to the N-terminus of Orc2 in the ancestor of opisthokonts.
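The stepwise scenario just described can be written down as a small data structure, which makes the proposed order of additions explicit. This is only a bookkeeping sketch of the hypothesis in the preceding paragraph, not an analysis; the names `ORC_STEPS` and `components_at` are hypothetical, and Orc3 and Orc5 are deliberately left out because their time of origin is not established by the BLAST data.

```python
# Each step: (lineage in which the change is proposed, components added).
# Step 0 is the archaebacteria-like starting state retained in Euglenozoa.
ORC_STEPS = [
    ("ancestral eukaryote (state retained in Euglenozoa)", {"Cdc6", "Cdt1"}),
    ("ancestor of neokaryotes", {"Orc2", "Orc6"}),
    ("common ancestor of Metamonada and Neozoa", {"Orc4"}),
    ("ancestor of Neozoa", {"Orc1 (BAH domain fused to Orc1-5/Cdc6 domain)"}),
    ("ancestor of opisthokonts", {"extra N-terminal domain on Orc2"}),
]


def components_at(step_index):
    """Return every component proposed to be present after the given step."""
    present = set()
    for _, added in ORC_STEPS[: step_index + 1]:
        present |= added
    return present


if __name__ == "__main__":
    for i, (lineage, _) in enumerate(ORC_STEPS):
        print(f"{i}. {lineage}: {sorted(components_at(i))}")
```

Listing the steps cumulatively shows which absences the scenario predicts: for example, Orc4 is expected in Trichomonas but not in Naegleria or Euglenozoa.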
Just looking at gene annotations is confusing, as the single Cdc6 protein is rather
indiscriminately annotated in GenBank in both archaebacteria and trypanosomatids as Cdc6 or
Orc1, just because they share the same AAA ATPase domain with Orc1-5, even though no
archaebacteria or eozoan sequences that I examined had the BAH domain characteristic of neozoan
Orc1. Because of this historical annotation confusion, Godoy et al. (2009) referred to the
trypanosome replication initiating AAA ATPase as Orc1/Cdc6, despite showing that it rescued
yeast Cdc6 mutations but not Orc1 mutations. Clearly it is functionally more like Cdc6 than
neozoan Orc1 and is best simply called Cdc6. On trees, however, Orc1 and Cdc6 are robustly
sisters, and closer to each other than to any of Orc2-5 (Duncker et al. 2009). The high degree of
conservation of Orc1, 2 and 4 and Cdc6 in neozoa, and the ability to detect them even in the
radically modified microsporidian fungi that have truncated the N-terminal end of both Orc1 and
Orc2 makes it unlikely that they would have been missed in BLAST searches, so I regard the
absence of Orc 1,2,3 in trypanosomes and the absence of Orc4 in Naegleria as very significant.
However Orc3, 5 and 6 and Cdt1 all evolve more rapidly, sufficiently so to make false negatives a
genuine risk in rapidly diverging lineages using simple BLASTP. Thus my inability to find most of
these four proteins (except Orc6 in Naegleria) in Eozoa needs interpreting with caution. Direct
biochemical studies on prereplication complexes in Eozoa are needed to establish what subunits
they actually contain and thus test the strong indications from BLAST of a stepwise increase in
ORC complexity in Eozoa and to establish when in this process Orc3 and 5 evolved. When using
human Cdt1 as query, homologues could be more readily detected in Posibacteria (the putative
ancestors of both eukaryotes and archaebacteria: Cavalier-Smith 2006b: Valas and Bourne 2009)
than in archaebacteria or some neozoan lineages (e.g. alveolates, red algae) prone to rapid protein
evolution, but were readily detectable in green plants. The origin of Orc6 is unclear but a few
possible distant homologues are detectable in both archaebacteria and eubacteria including proteobacteria. The simplest scenario for the origins of Orc1-5 is by successive gene duplications of
Cdc6, with repeated divergences in which the Cdc6 domain of Orc1 most conservatively retained
its original structure.
The fact that chromatin in trypanosomatids is generally dispersed throughout the cell cycle
whereas in diplonemids and euglenoids it is condensed throughout (Table 2 below), both different
from most neokaryotes, makes it especially important to study Orcs and heterochromatin properties
in all major groups of Euglenozoa, including especially the putatively early diverging
petalomonads, so as to disentangle which chromatin characters are ancestral and which derived,
both for Euglenozoa and eukaryotes as a whole.
Chromist diversification. Fig. 1 assumes that chlorophylls c1 and c2 and the carotenoid pigment
fucoxanthin all evolved in the ancestral chromist prior to the primary divergence of harobiotes and
hacrobians. This implies that cryptists and the photosynthetic alveolate Chromera (Moore et al.
2008) independently lost all three pigments, and that the alveolate dinoflagellates lost chlorophyll
c1. Multiple losses of photosynthetic pigments within Chromista are well established. Within the
heterokont (=stramenopile) class Chrysomonadea chlorophyll c2 was lost by Synurales, and
fucoxanthin was lost within the class Raphidophyceae and independently by Eustigmatophyceae
and Xanthophyceae. Given that fucoxanthin is thus known to have been lost three times within
heterokonts, postulating two further losses (in Alveolata and Cryptista) soon after the initial
diversification of chromists is not evolutionarily onerous. There is no need to invoke either
independent origins of complex pigment biosynthetic pathways or lateral transfer, whether by
tertiary symbiogenesis or otherwise, to explain chromist pigment distribution. Accepting that
alveolates branch within Chromista also increases the number of nucleomorph losses that must be
postulated; in addition to those already necessitated within Hacrobia at least one is required prior to
the common ancestor of Heterokonta/Alveolata. On present information, small GTPase paralogue
Rab1A appears to be a synapomorphy just for Harosa, not for chromalveolates as a whole as
implied by Elias et al. (2009).
Euglenoid chloroplast origin: to reduce clutter the secondary symbiogenesis that implanted a
prasinophyte green algal chloroplast (Turmel et al. 2009) into an advanced subgroup of euglenoid
Euglenozoa, which thereafter abandoned phagotrophy, is not shown.
Dates: based on evidence and arguments in Cavalier-Smith (2006c). However accepting an eozoan
root adds the further complication that no Eozoa fossilize well, and almost no unambiguous eozoan
fossils exist beyond a plausible euglenoid in rather recent amber. The oldest fossils that I accept as
unambiguously eukaryotic are testate amoebae (Melanocyrillium) dating back to 760-800 My ago; I
do not accept identifications of any of them as Cercozoa. Though some are plausibly Amoebozoa
they might in fact come from an extinct, probably neozoan stem group. No Eozoa have pseudopods
known to be able to manipulate particles to make tests; heterolobosean pseudopods (Percolozoa, the
only ones well established in free-living Eozoa) are apparently not that versatile. But it may not be
safe to conclude that none ever did so in the past. If they did not, the ~800 My date would represent
the date for neozoa; very likely eukaryotes would be somewhat older; the reasonably good
resolution in the basal part of the excavate tree could be interpreted as resulting from either
reasonably good temporal spacing or from frozen rapid covarion-like changes during their early
diversification. If the former interpretation were correct, then ~900-1000 My ago might be a better
rough estimate for the date of origin of eukaryotes, taking both sequence tree proportions and
Neoproterozoic fossil data into account. In that case the largest acritarchs in the period 800-1000
My ago that cannot confidently be assigned to any protist or bacterial phylum, despite claims to the
contrary, might reasonably be interpreted as possible eozoan cysts rather than stem eukaryotes or
prokaryotes. However, if there was once an eozoan lineage of testate amoebae represented by the
Melanocyrillium fossils, the date of origin of both neozoa and eukaryotes could be more recent than
that.
Note 3: invalidity of an earlier rooting between unikont and bikont eukaryotes.
Both arguments for the root being between unikonts (originally comprising only opisthokonts and
Amoebozoa) and bikonts as originally presented are now invalid.
First the pattern of ciliary transformation of bikonts in which the anterior cilium was the
younger and was transformed into the structurally and functionally different posterior cilium was
proposed to be a derived state for bikonts only, whereas unikonts were believed to have either no
ciliary transformation or an opposite pattern with the anterior cilium being older (Cavalier-Smith
2002). The assumption of two contrasting patterns of ciliary transformation in eukaryotes was based
on the review of Moestrup (2000), and especially a paper describing that pattern in the amoebozoan
myxogastrid Physarum (Wright et al. 1980). Both Moestrup and I overlooked that Gely and Wright
(1986) had retracted their earlier interpretation, so that ciliary transformation in Physarum (and by
implication other Amoebozoa also) has the same pattern as in bikonts. Even though my maximum
likelihood 18S rRNA tree showed Apusozoa as grouping within unikonts as sister to Amoebozoa
(but other methods gave no bootstrap support: Cavalier-Smith 2002) I then regarded Apusozoa as
bikonts as they were all biciliate and one species (originally incorrectly described under the name
Rhynchomonas mutabilis) had been observed to have the bikont pattern of ciliary transformation
(Griessmann 1913). However evidence is growing from other gene trees also that Apusozoa do
belong in unikonts, most likely as sister to opisthokonts rather than Amoebozoa though the
possibility that Apusozoa are the paraphyletic ancestors of both opisthokonts and Amoebozoa has
not been ruled out (Kim et al. 2006; Moreira et al. 2007; Brown et al. 2009). This means that
Amoebozoa and opisthokonts, each of which has been argued to have been ancestrally uniciliate
(despite the ancestral opisthokont at least having had two centrioles) probably became uniciliate
independently. I now suggest that this happened by suppression of the anterior cilium’s growth in
the ancestral opisthokont, making it posteriorly uniciliate, and the independent suppression of
posterior ciliary growth in the ancestral amoebozoan making it anteriorly uniciliate (but with
reversion to the biciliate condition by relieving this suppression in the ancestor of myxogastrid
slime moulds).
The second argument for bikonts being derived was the dihydrofolate-reductase-thymidylate-synthetase (DHFR-TS) gene-fusion shared by bikonts, including the apusozoan
Amastigomonas debruynei (not its correct name: Cavalier-Smith and Chao submitted). If
apusomonad Apusozoa are genuinely sisters to opisthokonts then this gene fusion must have been
reversed in one or both of Amoebozoa and opisthokonts; both cannot be regarded as representing a
primitive uniciliate state. The argument from myosin synapomorphies (Richards and Cavalier-Smith 2006) for unikont holophyly makes it likely that the DHFR-TS gene fusion was reversed
independently in opisthokonts and Amoebozoa by gene duplication and differential deletions, as is
mechanistically plausible. Thus neither of the reasons for placing the root outside bikonts has
withstood subsequent scrutiny. Neither is an obstacle to the earlier view that the root was in Eozoa
because of the primitive mitochondrial genomes of jakobid excavates (Cavalier-Smith 2000).
The Ccm/lyase argument for an eozoan root is inherently stronger than the DHFR/TS
argument because (a) there is no doubt that the presence of Ccm genes in excavate mitochondrial
genomes is the primitive state and (b) it is arguably easier for a gene fusion to be secondarily lost by
gene duplication and differential deletions of the two parts than it would be for nuclear lyase to be
replaced by mitochondrially coded Ccms by lateral transfer from bacteria into mitochondria (no
case of such transfer is known). If the root were amongst core excavates (Loukozoa, Metamonada,
Percolozoa) such loss must still be postulated because of the strong sequence evidence that
Euglenozoa are related to them on unrooted trees.
In principle the root within Eozoa could either be within excavates, e.g. beside the
zooflagellate jakobid Loukozoa as sometimes suggested because of their particularly primitive
mitochondrial genomes (notably retention of the α-proteobacterial RNA polymerase) (Cavalier-Smith 2000), or between excavates and Euglenozoa, as proposed here. Other once plausible places
within excavates are between Loukozoa and Discicristata (Euglenozoa, Percolozoa) because of their
different mitochondrial cristae or within or beside Percolozoa because of their absence of Golgi
stacking and aberrant short nuclear rRNAs, but neither of these has a strong rationale. Tom40 and
Orc arguments discussed above strongly favour a root within or beside Euglenozoa instead. If these
are accepted, one must assume that the viral RNA polymerase now used by eukaryotes other than
jakobids for transcribing mitochondrial DNA entered the ancestral eukaryote prior to the divergence
of neokaryotes and Euglenozoa, and that the α-proteobacterial RNA polymerase was immediately
lost by Euglenozoa but persisted in neokaryotes for a period until after the divergence of jakobids
(the second neokaryote branch after Percolozoa) and was lost independently twice within
neokaryotes: by the ancestor of Percolozoa and by the common ancestor of Malawimonas,
Metamonada and neozoa. As the time interval between the divergence of Percolozoa and jakobids
could have been very short, brief coexistence of two RNA polymerases is not an evolutionarily
onerous assumption, especially as a similar viral polymerase and the cyanobacterial polymerase
have coexisted in chloroplasts with partially overlapping functions for at least 600 My. The present
rooting requires only three losses of the α-proteobacterial polymerase unlike the unikont/bikont root
that required at least four, and more importantly a briefer period of coexistence in only one segment
of the tree, not on several segments.
********************************************************************
Rogozin et al. (2009) have also argued that the root is within bikonts, postulating that it is
not in Eozoa but between Plantae and all other eukaryotes. However, their own analyses of rare
conserved changes in numerous proteins actually contradict their conclusion and are fully consistent
with the root being within Eozoa as argued here. They used an ingenious four-taxon method (three
eukaryote groups plus bacterial outgroup) to analyse eight different trifurcations within eukaryotes.
Analyses of all three trifurcations that included eozoa showed with strong statistical support the
eozoan group as branching more deeply than either plants or opisthokonts, exactly as on my Fig. 1.
However, they assumed that all three results were biased by long-branch artefacts and should all be
ignored, basing their conclusions only on the five other analyses that included Neozoa only (making
their analyses irrelevant to the question of whether the root is within Eozoa or not!). The branches
for the three eozoan taxa (Giardia, Trichomonas, and kinetoplastids) were indeed very long.
However, in principle extremely rapid evolution in such a branch by introducing numerous
convergences with the bacterial outgroup, could either change the topology of the tree by putting
the branch deeper than it should be or instead simply add false amplification to a weaker true signal
that it really is a deep branch, and thus not give a topologically incorrect conclusion. A priori there
is no way of knowing whether a long branch is giving a false or a true topology; to assume that all
analyses including Eozoa gave a false result is purely arbitrary. They might all be topologically
correct or some right and some wrong!
The subjectivity of how Rogozin et al. (2009) drew conclusions from their analyses is also
illustrated in two parts of the neozoan tree. One was the trifurcation involving the amoebozoans
Entamoeba and Dictyostelium and opisthokonts. The raw data showed that these two amoebae share
more rare amino acid substitutions with each other than with opisthokonts, which is consistent with
the holophyly of Amoebozoa strongly indicated by the best available multigene tree (Minge et al.
2009). However, their statistical test, which attempts to correct for long branches by assuming that
they necessarily proportionally introduce homoplasies, favoured instead the contradictory idea that
Entamoeba branches more deeply than Dictyostelium and that Amoebozoa are paraphyletic. They
unwisely concluded that the elaborate statistical test, which makes untestable assumptions about the
numerical relationship between branch lengths and misleading convergences, gave the right answer
and that the contradictory raw data were misleading and that Amoebozoa really are paraphyletic.
Almost certainly this conclusion is wrong and the statistical test simply overcorrected for
homoplasy; a multigene tree for scores of genes including 7 Amoebozoa (Minge et al. 2008) is
probably more reliable than a statistical analysis with dubiously valid assumptions based on only
two Amoebozoa and a tiny number of conservative amino acid positions in many fewer proteins. It
is also odd that Rogozin et al. (2009) chose to accept the statistical conclusion for this unikont
trifurcation, even though they rejected the statistical conclusion that kinetoplastids are deeper
branching than Plantae or opisthokonts as a long-branch artefact, when in fact the Entamoeba
branch was objectively longer than that for kinetoplastids. In the case of the metamonads Giardia
and Trichomonas, both the raw data and the statistical tests agreed in placing them more deeply
than Plantae, yet they concluded that their analyses were ‘best compatible with’ Plantae being
deepest! The other problematic interpretation concerned the chromalveolate (chromist), plant,
opisthokont trifurcation, where contradictory topologies were favoured by different species samples
and different genes. They recognised that there was a strong signal from many genes and chromist
species for a sister relationship between Plantae and Chromista as shown in my Fig. 1 and
multigene trees (Burki et al. 2009), but dismissed these (without any evidence or specific arguments
for any gene) all as cases of replacement of host genes by those from the enslaved red alga. Instead
they assumed that the genes that showed chromalveolates as sisters to opisthokonts were giving the
true vertical signal for the host; however, it is perfectly possible that it is these genes that are the
artefacts by excessive divergence and those indicating a sisterhood of chromist and plants are the
true signal. They provide no way of distinguishing these possibilities, making their ‘conclusion’ as
to the position of the root totally subjective. Overall one can argue that statistical treatment of such
rare amino acid changes involving untestable assumptions about the impact of branch lengths
coupled with the necessary restriction of the method to three eukaryote taxa at a time is more likely
to lead to artefact than conventional multigene trees, so one cannot regard this method as a panacea
to avoid such problems.
Note 4. Euglenozoan characters in relation to the position of the root
Euglenozoa comprise four classes (Table 1) whose relationships are not thoroughly established.
Protein sequence trees suggested that Kinetoplastea (parasitic Trypanosomatida plus the ancestral
mostly free-living Bodonida from which they evolved) are sisters to Diplonemea, but recent 18S
rRNA evidence for Calkinsia, which I have classified within Postgaardea, suggests that Postgaardea
instead might be sister to Kinetoplastea (Yubuki et al. 2009), as assumed (Cavalier-Smith 1998),
and raises the possibility that Euglenoidea could be paraphyletic ancestors of the other three classes,
rather than sisters of Kinetoplastea as protein trees (poorly sampled for deep-branching euglenoids)
have suggested. Because of this topological uncertainty and the paucity of biochemical and total
absence of genomic information from the four nutritionally ancestral free-living phagotrophic
groups (Bodonida, Diplonemea, Postgaardea and Peranemia [basal phagotrophic euglenoids]) it is
not possible yet to say what are the ancestral molecular characters for Euglenozoa. Therefore one
cannot currently distinguish between the eukaryotic root being between Euglenozoa and
neokaryotes or deep within Euglenozoa themselves.
Cytologically it would probably be simplest if the root were between all Euglenozoa and all
neokaryotes, as the special features of Euglenozoa (1-3 in Table 2) and excavates would then be
divergent specializations of a possibly simpler early eukaryote; that avoids supposing that the
remarkably stable euglenozoan pattern was ancestral to the alternative pattern that characterizes
excavates. However, given our ignorance about molecular and cytological diversity in the
putatively most deeply branching euglenozoan group, the bacteria-eating petalomonad euglenoids
(Peranemia), such a conclusion might be premature. Most free-living Euglenozoa have two cilia;
petalomonads have only one emergent cilium (the anterior one, used for gliding on surfaces), though some
have two centrioles and a second rudimentary or vestigial non-emergent cilium, and all are often
considered as derived from biciliate ancestors. There is unambiguous phylogenetic evidence that the
uniciliate trypanosomatids evolved from the biciliate bodonids, but no evidence that the
petalomonad genus Scytomonas, which also has only one cilium and centriole (Mignot 1961), had
biciliate ancestors; like other petalomonads Scytomonas has simpler mouthparts than most
phagotrophic euglenoids or diplonemids and a particularly simple pellicle; it is also the only
euglenoid for which sexual cell fusion is known. It is therefore a candidate for a descendant from
the long-postulated unicentriolar uniciliate ancestor of bicentriolar eukaryotes. Cultures need to be
obtained to test whether it is an early diverging euglenozoan lineage that diverged from other
eukaryotes before most of the features now widespread in Euglenozoa (Table 2) evolved or instead
arose by simplification from biciliate petalomonads.
As Table 2 below indicates, most Euglenozoa have radically different properties from all
neokaryotes. Characters 4-9 are clearly derived specialised characters that cannot be regarded as
ancestral to those of neokaryotes, as they are as divergent from those of prokaryotes as from
standard neokaryotic ones, and it would be mechanistically hard to envisage most of them giving
rise later to neokaryotic properties; character 10 is also unlikely to be primitive for eukaryotes. In
this respect they differ profoundly from characters like the absence of Tom40 and ORC, which
seem to reflect the ancestral conditions respectively in the proteobacterial ancestor of mitochondria
and the neomuran ancestor of the rest of the eukaryote cell (Cavalier-Smith 2009b), as arguably do
the nine other characters mentioned in the second paragraph of this supplement. Thus the magnitude
and number of the special euglenozoan properties of Table 2 are comprehensible consequences of
the root being either between Euglenozoa and neokaryotes or deeply amongst deep-branching
Euglenozoa. The latter must be studied to see whether any have Tom40 or ORC and which, if any, of
the characters of Table 2 they possess. If all petalomonads possessed most or all of the Table 2 properties but had
neither Tom40 nor ORC nor any of the other 8 characters mentioned above as absent from
trypanosomatids the only plausible place for the root would be between Euglenozoa as a whole and
neokaryotes. But more complex character distributions could favour a root either somewhere
between trypanosomatids and euglenoids or deep within euglenoids.
Note that my contrasting the morphology of excavate vanes and rods, and arguing for a
primary eukaryotic bifurcation between them, does not exclude the possibility that they had a
simpler common ancestor; conceivably some of their proteins are distantly related.
Table 2. Eleven unusual properties of Euglenozoa absent from other eukaryotes
___________________________________________________________________________
1. Two cilia with dissimilar lattice paraxial rods stemming from parallel centrioles located in a deep
anterior reservoir (Simpson 1997).
2. Complex anterior ingestion apparatus ancestrally with dense rod and plicate vanes, sometimes
reduced to an MTR pocket (Simpson 1997).
3. Long rod-shaped extrusomes (Simpson 1997).
4. Unique cytochrome c with only one cysteine for haem binding and mechanism of biogenesis unique
in the living world (Allen et al. 2008).
5. Mitochondrial DNA of multiple circles with extensive U-insertion editing (Gray 2003; Marande et
al. 2005).
6. Nuclear succinate dehydrogenase split into two genes for proteins separately imported into
mitochondria (Gawryluk & Gray 2009).
7. Nuclear messenger RNA made by trans-splicing splice-leaders onto coding regions (Frantz et al.
2000) (similar behaviour independently evolved in dinoflagellates, and for some nematode genes).
8. Nuclear protein-coding transcripts are almost all polygenic each with several unrelated genes
(Berriman et al. 2005).
9. Unusual base J (beta-D-glucopyranosyloxymethyluracil) in their nuclear DNA (Borst & Sabatini
2008).
10. Nuclear chromosomes remain visibly condensed throughout interphase (euglenoids and
diplonemids; in kinetoplastids they are never visibly condensed even during mitosis, probably a
derived state) (Triemer & Farmer 1991).
11. Systematically longer 18S rRNA than other eukaryotes with unique expansion segments (also true
of Foraminifera and Myxogastria).
All 11 unique characters in Table 2 were probably ancestral for Euglenozoa, most being
found in at least three of the four classes (4-10 are unstudied in Postgaardea as they have never been
cultured, but as they are almost certainly not the deepest branch (Yubuki et al. 2009) this is irrelevant
to deducing the ancestral state). All except perhaps (10) (condensed chromatin) are probably derived
characters for Euglenozoa compared with the ancestral eukaryote. The significance of these 11
remarkable differences from other eukaryotes is that they may constitute one of two early divergent
evolutionary responses to the problems of being eukaryotic (the general ‘typical textbook’ features
of neokaryotes being the other); unlike the absence of Tom40 and ORC and the nine other arguably
primitive characters listed in paragraph 2, most cannot be primitive precursors of the typical pattern
seen in neokaryotes. However, some Euglenozoa have additional unusual features that might be
ancestral; these include euglenoid multiple mitotic spindles and kinetoplastid glycolysis being
located in peroxisome-like microbodies, not the cytosol. Euglena mitochondria synthesize fatty
acids anaerobically by unique machinery and make wax esters and ferment them anaerobically in
the cytoplasm by enzymes lacking homologues in other eukaryotes (but with bacterial relatives)
(Hoffmeister et al. 2005). Euglena small nucleolar guide RNAs (which process rRNA) are smaller and
more uniform than in other eukaryotes (Russell et al. 2006), more like those of archaebacteria. This
feature could be primitive, but at least five of these 17 unique characters are not ancestral eukaryotic
properties, but secondarily derived: three have independently derived parallels in other eukaryotes
(notably trans-splicing); the mitochondrial genomes of dinoflagellates are analogously radically
changed from the ancestral state best exemplified by the jakobid Reclinomonas.
_______________________________________________________________________
Note 5. The clade names neokaryotes, corticates, taxon names Eozoa, Loukozoa, Excavata,
and Sarcomastigota, and grade name ‘discicristates’.
I invented the name neokaryote to denote all eukaryotes that branch higher in rRNA trees than
Euglenozoa (Cavalier-Smith 1993b). It is used in precisely that sense here, assuming that the tree is
rooted between Euglenozoa and neokaryotes, even though when proposed it was mistakenly
thought that some groups (Metamonada, Microsporidia, Archamoebae and Percolozoa) branched
more deeply. Now that we know that Metamonada, Microsporidia and Archamoebae are not
primitively amitochondrial, and as argued here are not the first branch on the eukaryote tree, my
later redefinition of neokaryote (Cavalier-Smith 1998) as all eukaryotes other than Metamonada
sensu Cavalier-Smith (2003b) would not define a clade and lacks utility, and thus understandably
never became widely used. So sticking to the original phylogenetic definition but with altered
circumscription will not be confusing. The name was coined specifically to emphasize the marked
differences in genome organization between Euglenozoa and neokaryotes, which the present
rooting stresses even more by seeing it as the primary eukaryotic divergence in both genome
organisation and cell structure.
I invented the subkingdom name Neozoa (Cavalier-Smith 1983) to denote all protozoa except
Discicristata (Euglenozoa and Percolozoa) and Metamonada. That paraphyletic taxon is not
retained here as it would be equivalent to Sarcomastigota plus Rhizaria and Heliozoa, which are
here placed in two separate kingdoms. I therefore now use neozoa not for a taxon, but as an
informal name for the smallest clade that includes the last common ancestor of those three taxa
(Fig. 1); this phylogenetic redefinition expands its composition to include all Animalia, Fungi,
Chromista, and Plantae.
I later invented the subkingdom name Eozoa (Cavalier-Smith 1987) to denote all Protozoa that
branched below neozoa on phylogenetic trees, but the present superclass Eopharyngia of
Metamonada was excluded as they were then in a separate kingdom Archezoa. Eozoa is here
formally emended by including all Metamonada as now circumscribed (Cavalier-Smith 2003b, i.e.
including Eopharyngia) as well as Loukozoa (not yet created in 1987), Percolozoa, and Euglenozoa.
I invented the phylum name Loukozoa (Cavalier-Smith 1999) and revised it (Cavalier-Smith
2003b,c) to embrace two excavate classes only (Jakobea and Malawimonadea); it is not yet widely
used because of widespread aversion to paraphyletic taxa that is based on flawed arguments
(Cavalier-Smith 2009c). The class Diphyllatea (Diphylleia, Collodictyon, Sulcomonas) may also
belong here (as its groove weakly suggests) but the position of Diphylleia on 18S rRNA trees is
extremely unstable, as it sometimes groups with excavates near Loukozoa/Metamonada and
sometimes with Amoebozoa or Apusozoa within unikonts.
Following the study that first defined jakobids (O’Kelly 1993), which pointed out that their
possession of ciliary vanes and ciliary root characters suggested an affinity with retortamonads,
Simpson and Patterson (1999) used the informal name ‘excavates’ to denote eukaryotes which like
jakobids and Carpediemonas membranifera (a novel kind of free-living metamonad whose
ultrastructure they characterized) had a noticeably ‘scooped out’ or ‘excavated’ ventral feeding
groove associated with vaned or flange-bearing cilia and distinctive ciliary roots; initially excavates
embraced only a subset of Metamonada with clear grooves and/or vanes (Carpediemonas,
Trimastix, retortamonads) and Jakobida (Patterson 1999), an assemblage that was initially thought
to be paraphyletic. But the concept was soon extended to include Malawimonas (O’Kelly & Nerad 1999) and
Percolozoa, which have similar ciliary roots. To these ‘core excavates’ Simpson and I
independently eventually added Euglenozoa, despite their not having the core excavate
ultrastructure, on the assumption that the eukaryote root was between bikonts and unikonts
(Cavalier-Smith, 2002, which established the taxon Excavata to include both core excavates and
Euglenozoa) coupled with the fact that they grouped on unrooted sequence trees with excavates;
this assumption about the root position, I have argued here, was mistaken, but the unrooted
grouping has been strongly confirmed. If the present rooting of the tree is correct, both the original
taxon Excavata (Cavalier-Smith 2002) and the present phenotypically substantially more
homogeneous one made by excluding Euglenozoa are paraphyletic. Those ready to accept Excavata
in either sense as a taxon should also in principle be willing to accept the paraphyletic taxa
Loukozoa, Eozoa, Choanozoa, Sarcomastigota, Protozoa, and Bacteria.
The name Sarcomastigota is not, as sometimes incorrectly assumed, a synonym for the old
Sarcomastigophora, but was invented for a new protozoan infrakingdom (Cavalier-Smith 1983) that
excluded all Eozoa but at first included what are now called Amoebozoa plus those former
flagellate and amoeboid protozoa here transferred to kingdom Chromista. Originally it excluded
Choanozoa, but these were subsequently added and Heliozoa, Radiozoa and Alveolata excluded
(Cavalier-Smith 1998); later Rhizaria also were excluded as evidence for the unikont-bikont
dichotomy grew (Cavalier-Smith 2002, 2003). Apusozoa have sometimes been excluded (Cavalier-Smith 2002) but have usually been placed within Sarcomastigota, as I do here because of
increased evidence that Apusozoa are sisters to opisthokonts (Kim et al. 2006; Brown et al. 2009;
and our own unpublished multigene data) and because apusomonads have myosin II like other
unikonts (Berney and Cavalier-Smith in prep.). Sarcomastigota should be more stable in
composition than in the past as it now includes only protozoa with pronounced actomyosin
pseudopodial activity dependent on myosin II. The only other eukaryote known to have myosin II is
Naegleria (first noted by T. A. Richards pers. comm.; see also Odronitz and Kollmar (2007), who
incorrectly assume Naegleria to be related to Amoebozoa); it is unclear whether Naegleria acquired
it by lateral gene transfer from a sarcomastigote, in which case myosin II is an additional
synapomorphy for unikonts to those mentioned in Fig. 1, or whether instead myosin II originated in
the ancestral neokaryote and was lost independently by corticates and metamonads.
Discoba: this name was introduced by Hampl et al. (2009) for a putative clade comprising
discicristates and jakobids. If the root is within discicristates as argued here, this grouping is not a
clade but paraphyletic; it seems of no taxonomic utility and of doubtful descriptive value as (unlike
the also paraphyletic discicristates) it does not unite organisms that share phenotypic characters that
would make it useful to distinguish them from others by name. Rodríguez-Ezpeleta et al. (2007)
discovered insertions in the protein Rpl24A that they interpreted as synapomorphic for discoba.
However the insertion in Euglenozoa is one amino acid shorter than in jakobids plus Percolozoa,
meaning either that the euglenozoan insertion was independent or that there was a single amino acid
deletion in the ancestral euglenozoan or else an independent single amino acid insertion in the
ancestor of Percolozoa/Jakobea. One of the two latter possibilities seems most likely as both
insertions start with a proline even though there is almost no other sequence similarity between the
euglenozoan and jakobid/percolozoan insertions. Thus on any scenario for the position of the root
there must have been at least two evolutionarily independent length changes in this part of the
molecule. If my present rooting of the tree is correct three changes are needed; I suggest that either
a four- or a five-amino-acid insertion occurred in the ancestral eukaryote and that the five amino acids
were secondarily deleted in the common ancestor of Neozoa, Metamonada, and Malawimonas.
Given the many reasons discussed here, avoiding postulating this five-amino-acid secondary deletion
is not a sufficiently strong reason for placing the root within excavates (e.g. between Malawimonas
and jakobids) instead of beside or within Euglenozoa.
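The change counts in this argument can be checked with a small parsimony sketch. The Python below applies Fitch parsimony to the Rpl24A insertion length, treated as an unordered character, on two rooted topologies. Several things here are assumptions made only for illustration and are not taken from the analyses cited: the branching order within each tree, the inclusion of a prokaryotic outgroup scored as lacking the insertion, and the names `INSERT_LEN`, `fitch`, `NO_INSERT`, `eozoan_root` and `excavate_root`. The lengths 4 and 5 follow the text (the euglenozoan insertion being one amino acid shorter than that of jakobids and Percolozoa).

```python
# Hypothetical sketch: count the minimum number of insertion-length changes
# under two candidate root positions, using Fitch parsimony.

# Insertion length in amino acids per taxon; 0 means no insertion.
# The outgroup value (0) is an assumption, not a stated observation.
INSERT_LEN = {
    "outgroup": 0,
    "Euglenozoa": 4,
    "Percolozoa": 5,
    "Jakobea": 5,
    "Malawimonas": 0,
    "Metamonada": 0,
    "Neozoa": 0,
}


def fitch(node):
    """Return (possible ancestral states, minimum changes) for a binary tree.

    A tree is either a taxon name (leaf) or a 2-tuple of subtrees.
    """
    if isinstance(node, str):
        return {INSERT_LEN[node]}, 0
    left_states, left_changes = fitch(node[0])
    right_states, right_changes = fitch(node[1])
    changes = left_changes + right_changes
    shared = left_states & right_states
    if shared:
        # Both subtrees can agree on a state: no extra change needed here.
        return shared, changes
    # No state in common: one change must be placed on a branch below this node.
    return left_states | right_states, changes + 1


# Assumed clade lacking the insertion.
NO_INSERT = ("Malawimonas", ("Metamonada", "Neozoa"))

# Root between Euglenozoa and all other eukaryotes (assumed branching order).
eozoan_root = ("outgroup", ("Euglenozoa", ("Percolozoa", ("Jakobea", NO_INSERT))))

# Root within excavates, between Malawimonas and jakobids (assumed branching order).
excavate_root = ("outgroup", (NO_INSERT, ("Jakobea", ("Percolozoa", "Euglenozoa"))))

eozoan_changes = fitch(eozoan_root)[1]
excavate_changes = fitch(excavate_root)[1]
print(eozoan_changes, excavate_changes)  # prints: 3 2
```

Under these assumptions the sketch reproduces the counts given above: three length changes with the root beside Euglenozoa and two with the root between Malawimonas and jakobids. Fitch parsimony counts every length change as one event whatever its size, so it says nothing about whether a five-residue deletion is more or less probable than a one-residue change.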
I have abandoned the superphylum Discicristata Cavalier-Smith 2002, but “discicristates” is still
useful to refer to Euglenozoa and Percolozoa jointly because of their shared discoid mitochondrial
cristae, which now appear to have been the ancestral state for eukaryotes.
__________________________________________________________________________________
References for Supplementary Material Additional to those in the Printed Version
Adl, S. M., Simpson, A. G., Farmer, M. A., Andersen, R. A., Anderson, O. R., Barta, J. R., Bowser, S.
S., Brugerolle, G., Fensome, R. A., Fredericq, S., James, T. Y., Karpov, S., Kugrens, P., Krug, J.,
Lane, C. E., Lewis, L. A., Lodge, J., Lynn, D. H., Mann, D. G., McCourt, R. M., Mendoza, L.,
Moestrup, O., Mozley-Standridge, S. E., Nerad, T. A., Shearer, C. A., Smirnov, A. V., Spiegel, F.
W. & Taylor, M. F. 2005 The new higher level classification of eukaryotes with emphasis on the
taxonomy of protists. J. Eukaryot. Microbiol. 52, 399-451.
Agrawal, S., van Dooren, G. G., Beatty, W. L. & Striepen, B. 2009 Genetic evidence that an
endosymbiont-derived ERAD system functions in import of apicoplast proteins. J. Biol. Chem. 284,
33683-33691.
Archibald, J. M. 2009 The puzzle of plastid evolution. Curr. Biol. 19, R81-R88.
Archibald, J. M., Rogers, M. B., Toop, M., Ishida, K. & Keeling, P. J. 2003 Lateral gene transfer and
the evolution of plastid-targeted proteins in the secondary plastid-containing alga Bigelowiella
natans. Proc. Natl Acad. Sci. USA 100, 7678-7683.
Bodyl, A., Stiller, J. W. & Mackiewicz, P. 2008 Chromalveolate plastids: direct descent or multiple
endosymbioses? Trends Ecol. Evol. 24, 119-121.
Bolte, K., Bullmann, L., Hempel, F., Bozarth, A., Zauner, S. & Maier, U. G. 2009 Protein targeting
into secondary plastids. J. Eukaryot. Microbiol. 56, 9-15.
Borst, P. & Sabatini, R. 2008 Base J: discovery, biosynthesis, and possible functions. Ann. Rev.
Microbiol. 62, 235-251.
Brown, J. R., Koretke, K. K., Birkeland, M. L., Sanseau, P. & Patrick, D. R. 2004 Evolutionary
relationships of Aurora kinases: implications for model organism studies and the development of
anti-cancer drugs. BMC Evol. Biol. 4, 39.
Brown, M. W., Spiegel, F. W. & Silberman, J. D. 2009 Phylogeny of the "forgotten" cellular slime
mould, Fonticula alba, reveals a key evolutionary branch within Opisthokonta. Mol. Biol. Evol. 26,
2699-2709.
Burri, L., Williams, B. A., Bursac, D., Lithgow, T., Keeling, P. J. 2006 Microsporidian mitosomes
retain elements of the general mitochondrial targeting system. Proc. Natl Acad. Sci. USA 103,
15916-15920.
Cavalier-Smith, T. 1982 The origins of plastids. Biol. J. Linn. Soc. 17, 289-306.
Cavalier-Smith, T. 1983 A 6-kingdom classification and a unified phylogeny. In Endocytobiology II
(ed. W. Schwemmler & H. E. A. Schenk), pp. 1027-1034. Berlin: de Gruyter.
Cavalier-Smith, T. 1986 The kingdom Chromista: origin and systematics. In Progress in Phycological
Research. F. E. Round & D. J. Chapman, eds Vol. 4, pp. 309-347. Biopress Ltd., Bristol.
Cavalier-Smith, T. 1993a The origin, losses and gains of chloroplasts. In Origin of Plastids:
Symbiogenesis, Prochlorophytes and the Origins of Chloroplasts. R. A. Lewin (ed.). pp. 291-348.
Chapman & Hall, New York.
Cavalier-Smith, T. 1993b Evolution of the eukaryotic genome. In The Eukaryotic Genome, eds. P.
Broda, S. G. Oliver & P. Sims. pp. 333-385. Cambridge University Press.
Cavalier-Smith, T. 1994 Origin and relationships of Haptophyta. In The Haptophyte Algae, eds J.C.
Green and B. S. C. Leadbeater. Clarendon Press, Oxford. pp. 413-435.
Cavalier-Smith, T. 1997 Amoeboflagellates and mitochondrial cristae in eukaryotic evolution:
megasystematics of the new protozoan subkingdoms Eozoa and Neozoa. Arch. Protistenk. 147,
237-258.
Cavalier-Smith, T. 1998 A revised six-kingdom system of life. Biol. Rev. 73, 203-266.
Cavalier-Smith, T. 1999 Principles of protein and lipid targeting in secondary symbiogenesis:
euglenoid, dinoflagellate, and sporozoan plastid origins and the eukaryote family tree. J. Euk.
Microbiol. 46, 347-366.
Cavalier-Smith, T. 2000 Flagellate megaevolution: the basis for eukaryote diversification. In The
Flagellates. (eds J. C. Green & B. S. C. Leadbeater), pp. 361-390. London: Taylor and Francis.
Cavalier-Smith, T. 2003a Genomic reduction and evolution of novel genetic membranes and protein-targeting
machinery in eukaryote-eukaryote chimaeras (meta-algae). Phil. Trans. Roy. Soc. Lond. B
358, 109-134.
Cavalier-Smith, T. 2003b The excavate protozoan phyla Metamonada Grassé emend.
(Anaeromonadea, Parabasalia, Carpediemonas, Eopharyngia) and Loukozoa emend. (Jakobea,
Malawimonas): their evolutionary affinities and new higher taxa. Int. J. Syst. Evol. Microbiol. 53,
1741-1758.
Cavalier-Smith, T. 2003c Protist phylogeny and the high-level classification of Protozoa. Eur. J.
Protistol. 39, 338-348.
Cavalier-Smith, T. 2004 Chromalveolate diversity and cell megaevolution: interplay of membranes,
genomes and cytoskeleton. In Organelles, Genomes and Eukaryote Phylogeny Systematics
Association Special Volume 68 eds R. P. Hirt & D. S. Horner. Taylor & Francis, London. pp. 75-108.
Cavalier-Smith, T. 2006a The tiny enslaved genome of a rhizarian alga. Proc. Natl Acad. Sci. USA
103, 9779-9780.
Cavalier-Smith, T. 2006b Rooting the tree of life by transition analysis. Biol. Direct 1: 19.
Cavalier-Smith, T. 2006c Cell evolution and earth history: stasis and revolution. Phil. Trans. Roy. Soc.
Lond. B. 361, 969-1006.
Cavalier-Smith, T. 2007 Evolution and relationships of algae: major branches of the tree of life. In
Unravelling the Algae (ed. J. Brodie & J. Lewis), pp. 21-55. Boca Raton: CRC Press.
Cavalier-Smith, T. 2009a Megaphylogeny, cell body plans, adaptive zones: causes and timing of
eukaryote basal radiations. J. Euk. Microbiol. 56, 26-33.
Cavalier-Smith, T. 2009b Predation and eukaryote cell origins: a coevolutionary perspective. Int. J.
Biochem. Cell Biol. 41, 307-322.
Cavalier-Smith, T. 2009c Deep phylogeny, ancestral groups, and the four ages of life. Phil. Trans.
Roy. Soc. B. in press.
Cavalier-Smith, T. 2009d Origin of the cell nucleus and sex: roles of intracellular coevolution. Biol.
Direct In press.
Cavalier-Smith, T. & Chao, E. E. 2006 Phylogeny and megasystematics of phagotrophic heterokonts
(kingdom Chromista). J. Mol. Evol. 62, 388-420.
Cavalier-Smith, T., Allsopp, M. T. E. P. & Chao, E. E. 1994 Chimeric conundra: are nucleomorphs
and chromists monophyletic or polyphyletic? Proc. Natl Acad. Sci. USA 91, 11368-11372.
Dagan, T. & Martin, W. 2009 Seeing green and red in diatom genomes. Science 323, 1651-1652.
Dagley, M. J., Dolezal, P., Likic, V. A., Smid, O., Purcell, A. W., Buchanan, S. K., Tachezy, J. &
Lithgow, T. 2009 The protein import channel in the outer mitosomal membrane of Giardia
intestinalis. Mol. Biol. Evol. 26, 1941-1947.
Dolezal, P., Likic, V., Tachezy, J. & Lithgow, T. 2006 Evolution of the molecular machines for
protein import into mitochondria. Science 313, 314-318.
Elias, M., Patron, N. J. & Keeling, P. J. 2009 The RAB family GTPase Rab1A from Plasmodium
falciparum defines a unique paralog shared by chromalveolates and Rhizaria. J. Eukaryot.
Microbiol. 56, 348-356.
Fast, N. M., Kissinger, J. C., Roos, D. S. & Keeling, P. J. 2001 Nuclear-encoded, plastid-targeted
genes suggest a single common origin for apicomplexan and dinoflagellate plastids. Mol. Biol. Evol.
18, 418-426.
Frantz, C., Ebel, C., Paulus, F. & Imbault, P. 2000 Characterization of trans-splicing in euglenoids.
Curr. Genet. 37, 349-355.
Frommolt, R., Werner, S., Paulsen, H., Goss, R., Wilhelm, C., Zauner, S., Maier, U. G., Grossman, A.
R., Bhattacharya, D. & Lohr, M. 2008 Ancient recruitment by chromists of green algal genes
encoding enzymes for carotenoid biosynthesis. Mol. Biol. Evol. 25, 2653-2667.
Gawryluk, R. M. R. & Gray, M. W. 2009 A split and rearranged nuclear gene encoding the iron-sulfur
subunit of mitochondrial succinate dehydrogenase in Euglenozoa. BMC Res. Notes 2, 16.
Gely, C. & Wright, M. 1986 The centriole cycle in the amoebae of the myxomycete Physarum
polycephalum. Protoplasma 132, 23-31.
Gile, G. H., Faktorová, D., Castlejohn, C. A., Burger, G., Lang, B. F., Farmer, M. A., Lukes, J. & Keeling, P. J. 2009
Distribution and phylogeny of EFL and EF-1α in Euglenozoa suggest ancestral co-occurrence
followed by differential loss. PLoS One.
Gluenz, E., Sharma, R., Carrington, M. & Gull, K. 2008 Functional characterization of cohesin subunit
SCC1 in Trypanosoma brucei and dissection of mutant phenotypes in two life cycle stages. Mol
Microbiol 69, 666-680.
Griessmann, K. 1913 Über marine Flagellaten. Archiv Protistenk. 32, 1-78.
Gray, M. W. 2003 Diversity and evolution of mitochondrial RNA editing systems. IUBMB Life 55,
227-233.
Hampl, V., Hug, L., Leigh, J. W., Dacks, J. B., Lang, B. F., Simpson, A. G. & Roger, A. J. 2009
Phylogenomic analyses support the monophyly of Excavata and resolve relationships among
eukaryotic "supergroups". Proc. Natl Acad. Sci. USA 106, 3859-3864.
Hempel, F., Bullmann, L., Lau, J., Zauner, S. & Maier, U. G. 2009 ERAD-derived preprotein transport
across the second outermost plastid membrane of diatoms. Mol. Biol. Evol. 26, 1781-1790.
Hoffmeister, M., Piotrowski, M., Nowitzki, U. & Martin, W. 2005 Mitochondrial trans-2-enoyl-CoA
reductase of wax ester fermentation from Euglena gracilis defines a new family of enzymes
involved in lipid synthesis. J. Biol. Chem. 280, 4329–4338.
Ishida, K., Cao, Y., Hasegawa, M., Okada, N. & Hara, Y. 1997 The origin of chlorarachniophyte
plastids, as inferred from phylogenetic comparisons of amino acid sequences of EF-Tu. J. Mol.
Evol. 45, 682-687.
Kim, E., Simpson, A. G. & Graham, L. E. 2006 Evolutionary relationships of apusomonads inferred
from taxon-rich analyses of six nuclear-encoded genes. Mol. Biol. Evol. 23, 2455-2466.
Leblond, J. D., Dahmen, J. L., Seipelt, R. L., Elrod-Erickson, M. J. and Kincaid, R. 2005 Lipid
composition of chlorarachniophytes (Chlorarachniophyceae) from the genera Bigelowiella,
Gymnochlora, and Lotharella. J. Phycol. 41, 311-321.
Liu, Y., Richards, T. A. & Aves, S. J. 2009 Ancient diversification of eukaryotic MCM DNA
replication proteins. BMC Evol. Biol. 9, 60.
Marande, W., Lukes, J. & Burger, G. 2005 Unique mitochondrial genome structure in diplonemids, the
sister group of kinetoplastids. Eukaryot. Cell 4, 1137-1146.
Mignot, J.-P. 1961 Contribution à l’étude cytologique de Scytomonas pusilla (Stein) (Flagellé
euglénien). Bull Biol Fr Belg 95, 665-678.
Minge, M., Silberman, J. D., Orr, R., Cavalier-Smith, T., Shalchian-Tabrizi, K., Burki, F.,
Skjaeveland, Å. & Jakobsen, K. S. 2009 Evolutionary position of breviate amoebae illuminates the
primary eukaryote divergence. Proc. R. Soc. B 276, 597-604.
Moestrup, Ø. 2000 The flagellate cytoskeleton: introduction of a general terminology for microtubular
roots in protists. In The flagellates: unity, diversity and evolution (ed. B. S. Leadbeater & J. C.
Green), pp. 69-94. London: Taylor & Francis.
Moreira, D., von der Heyden, S., López-García, P., Bass, D., Chao, E. and Cavalier-Smith, T. 2007
Global eukaryote phylogeny: combined small- and large-subunit ribosomal DNA trees support
monophyly of Rhizaria, Retaria and Excavata. Mol. Phylogen. Evol. 44, 255-266.
Moore, R. B., Obornik, M., Janouskovec, J., Chrudimsky, T., Vancova, M., Green, D. H., Wright, S.
W., Davies, N. W., Bolch, C. J., Heimann, K., Slapeta, J., Hoegh-Guldberg, O., Logsdon, J. M. &
Carter, D. A. 2008 A photosynthetic alveolate closely related to apicomplexan parasites. Nature
451, 959-963.
Odronitz, F. & Kollmar, M. 2007 Drawing the tree of eukaryotic life based on the analysis of 2,269
manually annotated myosins from 328 species. Genome Biol 8, R196.
O'Kelly, C. 1993 The jakobid flagellates: structural features of Jakoba, Reclinomonas and Histiona
and implications for the early diversification of eukaryotes. J. Euk. Microbiol. 40, 627-636.
O'Kelly, C., Nerad, T. A. 1999 Malawimonas jakobiformis n. gen., n. sp. (Malawimonadidae fam.
nov.): a jakoba-like heterotrophic nanoflagellate with discoidal mitochondrial cristae. J. Euk.
Microbiol. 46, 522-531.
Patron, N. J. & Waller, R. F. 2007 Transit peptide diversity and divergence: A global analysis of
plastid targeting signals. BioEssays 29, 1048-1058.
Patron, N. J., Rogers, M. B. & Keeling, P. J. 2004 Gene replacement of fructose-1,6-bisphosphate
aldolase supports the hypothesis of a single photosynthetic ancestor of chromalveolates. Eukaryot.
Cell 3, 1169-1175.
Patron, N. J., Waller, R. F. & Keeling, P. J. 2006 A tertiary plastid uses genes from two
endosymbionts. J. Mol. Biol. 357, 1373-1382.
Patterson, D. J. 1999 The diversity of eukaryotes. Am. Nat. 154, S96-S124.
Pusnik, M., Charriere, F., Maser, P., Waller, R. F., Dagley, M. J., Lithgow, T., Schneider, A. 2009 The
single mitochondrial porin of Trypanosoma brucei is the main metabolite transporter in the outer
mitochondrial membrane. Mol. Biol. Evol. 26: 671-680.
Richards, T. A. & Cavalier-Smith, T. 2005 Myosin domain evolution and the primary divergence of
eukaryotes. Nature 436, 1113-1118.
Robinson, N. P. & Bell, S. D. 2007 Extrachromosomal element capture and the evolution of multiple
replication origins in archaeal chromosomes. Proc. Natl Acad. Sci. USA 104, 5806-5811.
Rodríguez-Ezpeleta, N., Brinkmann, H., Burger, G., Roger, A. J., Gray, M. W., Philippe, H. & Lang, B.
F. 2007 Toward resolving the eukaryotic tree: the phylogenetic positions of jakobids and
cercozoans. Curr. Biol. 17: 1420-1425.
Rogers, M. B., Archibald, J. M., Field, M. A., Li, C., Striepen, B. & Keeling, P. J. 2004 Plastid-targeting
peptides from the chlorarachniophyte Bigelowiella natans. J. Eukaryot. Microbiol. 51,
529-535.
Russell, A. G., Schnare, M. N. & Gray, M. W. 2006 A large collection of compact box C/D snoRNAs
and their isoforms in Euglena gracilis: structural functional and evolutionary insights. J. Mol. Biol.
367, 1548–1565.
Sanchez-Puerta, M.V. and Delwiche, C.F. 2008 A hypothesis for plastid evolution in chromalveolates.
J. Phycol. 44, 1097–1107.
Silver, T. D., Koike, S., Yabuki, A., Kofuji, R., Archibald, J. M. & Ishida, K. 2007 Phylogeny and
nucleomorph karyotype diversity of chlorarachniophyte algae. J. Eukaryot. Microbiol. 54, 403-410.
Simpson, A. G. B. 1997 The identity and composition of the Euglenozoa. Arch. Protistenk. 148, 318-328.
Simpson, A. G. B., Patterson, D. J. 1999 The ultrastructure of Carpediemonas membranifera
(Eukaryota) with reference to the "excavate hypothesis". Eur. J. Protistol. 35, 353-370.
Spork, S., Hiss, J. A., Mandel, K., Sommer, M., Kooij, T. W., Chu, T., Schneider, G., Maier, U. G. &
Przyborski, J. M. 2009 An unusual ERAD-like complex is targeted to the apicoplast of Plasmodium
falciparum. Eukaryot. Cell 8, 1134-1145.
Triemer, R. E. & Farmer, M. A. 1991 The ultrastructural organization of heterotrophic euglenids and
its evolutionary implications. In The biology of free-living heterotrophic flagellates (eds D. J.
Patterson & J. Larsen), pp. 185-204. Oxford: Clarendon Press.
Turmel, M., Gagnon, M. C., O'Kelly, C. J., Otis, C. & Lemieux, C. 2009 The chloroplast genomes of
the green algae Pyramimonas, Monomastix, and Pycnococcus shed new light on the evolutionary
history of prasinophytes and the origin of the secondary chloroplasts of euglenids. Mol Biol Evol
26, 631-648.
Valas, R. E. & Bourne, P. E. 2009 Structural analysis of polarizing indels: an emerging consensus on
the root of the tree of life. Biol. Direct 4, 30.
Waller, R. F., Jabbour, C., Chan, N. C., Celik, N., Likic, V. A., Mulhern, T. D. & Lithgow, T. 2009
Evidence of a reduced and modified mitochondrial protein import apparatus in microsporidian
mitosomes. Eukaryot. Cell 8: 19-26.
Wright, M., Moisand, A. & Mir, L. 1980 Centriole maturation in the amoebae of Physarum
polycephalum. Protoplasma 105, 149-160.
Yubuki, N., Edgcomb, V. P., Bernhard, J. M. & Leander, B. S. 2009 Ultrastructure and molecular
phylogeny of Calkinsia aureus: cellular identity of a novel clade of deep-sea euglenozoans with
epibiotic bacteria. BMC Microbiol 9, 16.