Supplementary material for Thomas Cavalier-Smith: Kingdoms Protozoa and Chromista and the eozoan root of the eukaryotic tree. Biology Letters.
This electronic supplement contains additional explanations for my conclusions, discussion of the drawbacks of alternative ideas in the literature, a summary of the revised classification of both kingdoms with nomenclatural details (Table 1), and further references, which severe space constraints did not allow including in the printed paper.
Since the final version of the paper was prepared I have found nine additional lines of evidence for the root being between Euglenozoa and neokaryotes. As space constraints did not allow their insertion into this paper I explain them elsewhere (Cavalier-Smith 2009d). They are: (1) absence of the centromeric histone H3 variant CENPA (crucial for neokaryote centromere assembly) in trypanosomatids and somewhat shorter N-terminal tails for histone H3 (with the segment embracing the key lysine for neokaryote acetylation labelling for heterochromatinization missing) and histone H4; (2) the RNase III dicer enzyme that generates small RNAs lacks two domains in trypanosomatids (like the ancestral prokaryotic RNase III) that were arguably added stepwise in neokaryotes and then neozoa; (3) absence in trypanosomatids (as in bacteria) of the PIWI paralogue of the Argonaute proteins that targets double-stranded RNA for digestion; (4) absence in trypanosomatids of RNA polymerase II transcription factors IIA, F, and H; (5) ER luminal quality control of nascent glycoproteins is simpler in trypanosomatids with two enzymes that Neozoa use to digest faulty ones (Mannosidase I and peptide-N-glycanase); (3) absence in trypanosomatids of kinesins 4-8 and 15; (6) absence in trypanosomatids of widespread tail domains from two of the three putatively ancestral myosins; (7) absence in trypanosomatids of the chromosomal cohesin Smc heterodimer Smc5/6; (8) absence in trypanosomatids of the ER calcium-binding protein calreticulin; (9) trypanosomatids have more primitive archaebacteria-like small nucleolar RNAs (snoRNAs) involved in pre-rRNA processing than do neokaryotes. All nine characters are most simply interpreted as the primitive condition for Euglenozoa (testable by studying them in bodonids, diplonemids, and euglenoids) and also for eukaryotes as a whole, rather than secondary simplifications of trypanosomatids alone. Thus with ORC and TOM40 this makes at least 11 independent trypanosomatid characters best interpreted as the primitive state for all eukaryotes and supporting the primary eukaryotic dichotomy being between Euglenozoa and neokaryotes. The fact that these include such fundamental and diverse cell properties as mitochondrial protein import, nuclear DNA replication initiation, snoRNAs, and centromere biogenesis means that they cannot be dismissed as trivia and highlights the importance of studying all these features intensively in a phylogenetically broad spectrum of eukaryotes and carrying out genome projects for a similarly broad range of deep-branching Euglenozoa.
Furthermore, on the Polo-like paralogue-rooted aurora kinase tree Euglenozoa are the most divergent eukaryotes (Brown et al. 2004), as they are for each of the four paralogue subtrees of the giant chromatin protein SMC family (Gluenz et al. 2008); the latter may be especially significant as SMCs are extremely long and well-conserved proteins that seem to suffer much less from episodic quantum evolution in the stems of paralogue-rooted trees than most other proteins and arguably give better single-gene trees than almost any other protein normally used for deep phylogeny.
‘Chromophyte’ here refers collectively to all algae with chlorophyll c so as to contrast them with non-photosynthetic Chromista such as Rhizaria, Ciliophora, Pseudofungi, and Heliozoa.
Throughout this paper ‘haem lyase’ refers always to the invariably nuclear-encoded monomolecular haem (=heme to Americans) lyase of neozoa only. Sometimes the unrelated nonhomologous multigene Ccms of excavates and corticates (encoded in the mitochondrial genome in Loukozoa) are confusingly annotated in GenBank as haem or heme lyase, instead of by the more usual term ‘cytochrome c-type biogenesis protein’ normally used for their bacterial homologues; Allen et al. (2008) clearly explain all the different types of cytochrome c biogenesis enzymes currently known; the fourth method of cytochrome c-type biogenesis in eukaryotes is that chloroplasts use a second bacterial c-type biogenesis mechanism involving Res, probably introduced by their cyanobacterial ancestor and transferred to chromophytes by the secondary symbiogenetic red alga - but this was not discussed in this paper or shown on Fig. 1 as it is irrelevant to rooting the tree.

Table 1. Revised Classification of the Kingdoms Protozoa and Chromista

Kingdom Protozoa† Owen 1858 emend.
  Subkingdom Eozoa†* Cavalier-Smith 1997 emend.
    Infrakingdom and Phylum Euglenozoa Cavalier-Smith 1981 (Euglenoidea, Diplonemea, Postgaardea**, Kinetoplastea)
    Infrakingdom Excavata† Cavalier-Smith 2002 emend.
      Phylum Percolozoa Cavalier-Smith 1991 (Pharyngomonadia and Tetramitia, i.e. Lyromonadea, Heterolobosea, Percolatea)
      Phylum Loukozoa† Cavalier-Smith 1999 (Jakobea, Malawimonadea, ?Diphyllatea)
      Phylum Metamonada Cavalier-Smith 1981 emend. 2003 (Anaeromonadea, Eopharyngia, Parabasalia)
  Subkingdom Sarcomastigota† Cavalier-Smith 1983 emend.
    Phylum Amoebozoa Lühe 1913 emend. Cavalier-Smith 1998
    Phylum Apusozoa Cavalier-Smith 1997 stat. n. 2003 emend. 2008
    Phylum Choanozoa† Cavalier-Smith 1981
Kingdom Chromista Cavalier-Smith 1981 emend.
  Subkingdom Harosa subk. n.
    Diagnosis: typically with cortical alveoli or tripartite ciliary hairs or reticulose or filose pseudopods or ciliary gliding.
    Etymology: HAR the initials of Heterokonta (=stramenopiles), Alveolata and Rhizaria (= SAR group of Burki et al. 2007); plus meaningless suffix –osa as used in such names as Filosa (within the rhizarian Cercozoa), Lobosa (Amoebozoa) and Conosa (Amoebozoa) referring also to at least partially amoeboid groups.
    Infrakingdom Heterokonta Cavalier-Smith 1986 (also known by the unnecessary junior (1989) synonym ‘stramenopiles’)
      Phylum Ochrophyta Cavalier-Smith 1986 (e.g. diatoms, brown algae, chrysophytes)
      Phylum Pseudofungi Cavalier-Smith 1986 stat. n. 1989 (Oomycetes, Hyphochytrea, Developayella)
      Phylum Bigyra Cavalier-Smith 1998 (Opalozoa, e.g. Actinophryida, Blastocystis; Bicoecea; Labyrinthulea)
    Infrakingdom Alveolata Cavalier-Smith 1991
      Phylum Myzozoa Cavalier-Smith 2004 (Dinozoa [dinoflagellates, ellobiopsids and perkinsids]; Apicomplexa [apicomonads, Chromera, Sporozoa])
      Phylum Ciliophora Doflein 1901 stat. n. Copeland 1956 (ciliates and suctorians)
    Infrakingdom Rhizaria Cavalier-Smith 2002 emend.
      Phylum Cercozoa Cavalier-Smith 1998
      Phylum Retaria Cavalier-Smith 1999 (Foraminifera; Radiozoa)
  Subkingdom Hacrobia (Okamoto et al. 2009***) subking. n.
    Phylum Cryptista Cavalier-Smith 1989 (Cryptophyceae, Goniomonadea; Katablepharidea, Telonemea)
    Phylum Haptophyta Hibberd ex Cavalier-Smith 1986
    Phylum Heliozoa Haeckel 1862 stat. n. Margulis 1974 em. Cavalier-Smith 2003 (Centrohelea)

† Paraphyletic; the validity and importance of ancestral (paraphyletic) taxa (e.g. Bacteria, Protozoa, Eozoa, Excavata, Loukozoa, Choanozoa, Sarcomastigota) is explained elsewhere (Cavalier-Smith 2009c). Only these four taxa and those they include should be treated under the International Code of Botanical Nomenclature. All other chromist and all protozoan names should be subject to the International Code of Zoological Nomenclature.
* I invented the name Eozoa as a subkingdom name (Cavalier-Smith 1997) for the protozoan phyla Euglenozoa, Percolozoa, and Trichozoa (the latter now subsumed within Metamonada: Cavalier-Smith 2003b). Here I emend the subkingdom by formally adding the phylum Loukozoa (Cavalier-Smith 1999) and all taxa now included in Metamonada Cavalier-Smith (2003b).
** Includes Postgaardi and Calkinsia (Cavalier-Smith 2003b,c). Separating Calkinsia into a new unranked higher taxon with a new name (Yubuki et al. 2009) was not merited by morphology and was unwise in the absence of molecular data for Postgaardi.
*** These authors introduced this name for the clade comprising the last common ancestor of haptophytes and cryptomonads and all its descendants, but without assigning a taxonomic rank. Here I formally adopt the same name for this new subkingdom (Diagnosis as in their paper p. 5), which currently has the same taxonomic composition as in their paper (I formally exclude from the subkingdom those dinoflagellates that are partially descended from haptophytes by acquiring their plastids and various genes).
__________________________________________________________________________________
Note 1. Dual Green and Red secondary symbiogenesis in the origin of kingdom Chromista
‘I think it best to put forward simple, detailed and specific hypotheses, since these have a better chance of stimulating (and being refuted or corroborated by) future research than are vague or unnecessarily complicated ones.’ (Cavalier-Smith 1993a p. 339.)
This radical new interpretation unifies for the first time three independent recent discoveries: (1) that all four chromalveolate groups (Haptophyta, Cryptista, Heterokonta, Alveolata) accepted to have acquired their chloroplasts symbiogenetically from a red alga also contain scores, hundreds or over a thousand nuclear genes that are proposed to be specifically related to those of non-streptophyte (possibly prasinophyte) green algae (in heterokonts even more than those specifically related to reds) (Moustafa et al. 2009); (2) that the chlorarachnean alga Bigelowiella, which belongs to the phylum Cercozoa within Rhizaria, accepted to have acquired its nucleomorph and plastid from a non-streptophyte (probably ulvophyte) green alga (Ishida et al. 1997) also contains numerous genes of probably red algal origin (Archibald et al. 2003); (3) that Rhizaria, including Cercozoa and Bigelowiella, robustly group on multigene trees as sister to Heterokonta/Alveolata within the chromist subkingdom Harosa as its deepest branch. Moustafa et al. (2009) pointed out that the presence of many of the same putatively prasinophyte ‘green’ genes in all four chromalveolate lineages, including the entirely plastid-free Ciliophora (Frommolt et al. 2008), strongly indicates that they were implanted in the common ancestor of all four groups (incidentally strongly supporting their monophyly independently of earlier evidence for this), and furthermore makes it probable that they were implanted over a relatively short time by gene transfer from a previously unrecognised intracellular green algal symbiont (endosymbiotic gene transfer: EGT) in the common ancestor of chromalveolates rather than by numerous independent lateral transfers (LGT) spread over a long evolutionary timespan.
Moustafa et al. (2009) refer to their reasonably postulated ancestral green algal endosymbiosis as ‘cryptic’, meaning presumably that no visible cellular evidence of it survives today comparable to the chlorophyll c-containing plastids of chromalveolates or the nucleomorphs of Cryptophyceae. However, I suggest here that the increasingly strong evidence that Rhizaria branch within Chromista after Harosa and Hacrobia diverged means that the postulated second ancestral green symbiogenesis (if it genuinely occurred; see technical caveat below) might not have been cryptic at all, but may persist to this day in the form of the green algal chloroplast of chlorarachnean Rhizaria and their nucleomorph (Cavalier-Smith 2006a). Particularly as the donor of the chlorarachnean chloroplast and nucleomorph and the donor of the chromalveolate green genes both appear to have been a chlorophyte, non-streptophyte alga, it would not be parsimonious to assume one green algal symbiosis in the common ancestor of all chromists to explain the data of Moustafa et al. (2009) and another (perhaps not much later) in just one chromist lineage to explain the origin of chlorarachnean algae, unless one were sure that each involved a different group of green algae, which is not currently the case. Assuming one green algal symbiogenesis only has important and testable implications also for the red algal symbiogenesis (below). Those averse to accepting evolutionary loss will immediately point out that this entails accepting several more losses of the green algal chloroplast and nucleus within Chromista than the Frommolt et al. (2008) and Moustafa et al. (2009) idea of a cryptic endosymbiosis, which assumes only one such loss for both organelles, but no persistence in chlorarachneans, as if that were a disadvantage of this new simplifying hypothesis, which it is not.
Already one must accept at least 4-5 independent losses of the red algal nucleomorph within Chromista and still more losses of the ‘red’ plastid and even more losses of photosynthesis but with retention of a leucoplast. Loss is pervasive in cellular evolution and often (not always) much easier mechanistically than gain of complex characters whether by symbiogenesis or otherwise. Discussing parsimony about the number of qualitatively incomparable events in alternative scenarios is very misleading without realistically weighting them according to the mechanistic changes involved. The tiny cost in parsimony of assuming about 5-7 losses of the green nucleomorph and plastids within Chromista (the exact number is uncertain because the basal topology of the tree within Rhizaria is still unsettled) must be weighed against the great economy of hypothesis that the present theory of a dual concerted secondary symbiogenesis can provide. There is no need to assume a serial symbiogenesis as did Moustafa et al. (2009). A temporally overlapping dual symbiogenesis is more parsimonious as it would allow the initial stage of the evolution of protein-targeting of rough ER-made proteins into the green and red chloroplasts to be shared. This would halve the evolutionary difficulty of secondary symbiogenesis compared with the traditional idea that the green and red enslavements took place in separate cells (Cavalier-Smith 1999, 2000, 2003). Therefore I now suggest that in the stem lineage of Chromista green and red symbionts became enslaved in the same cell and that at least some of the metabolite exchange proteins that arguably initiated that process are shared between chlorarachneans and chromalveolates. Furthermore I suggest that the initial stages of protein targeting from the ER to the plastid were also shared, there being a common set of new SNAREs that targeted Golgi vesicles indiscriminately to both the red and green plastids.
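The point about weighting qualitatively different events can be illustrated with a minimal sketch. It assumes, purely for illustration, that each scenario is summarised as a count of symbiogenetic gains and organelle losses (one gain and about six losses for a single green symbiogenesis; two gains and one loss for a cryptic symbiosis plus a separate chlorarachnean one), and that a gain costs some multiple of a loss. The class, function names, and weights are hypothetical and are not taken from any cited analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """Event counts for one evolutionary scenario (illustrative only)."""
    name: str
    gains: int   # symbiogenetic acquisitions of a green organelle
    losses: int  # later losses of that organelle


def weighted_cost(s: Scenario, gain_weight: float, loss_weight: float = 1.0) -> float:
    """Total cost when gains and losses carry different weights."""
    return s.gains * gain_weight + s.losses * loss_weight


def break_even_gain_weight(a: Scenario, b: Scenario, loss_weight: float = 1.0) -> float:
    """Gain weight at which scenarios a and b cost exactly the same."""
    if a.gains == b.gains:
        raise ValueError("scenarios have equal gains; no break-even weight exists")
    return loss_weight * (b.losses - a.losses) / (a.gains - b.gains)


# Hypothetical event counts loosely following the text (about 5-7 losses vs. one).
single = Scenario("one green symbiogenesis, about six losses", gains=1, losses=6)
separate = Scenario("cryptic symbiosis plus separate chlorarachnean one", gains=2, losses=1)

for w in (1.0, 5.0, 20.0):  # assumed cost of one gain, in units of one loss
    print(w, weighted_cost(single, w), weighted_cost(separate, w))
print("break-even gain weight:", break_even_gain_weight(single, separate))
```

Under these assumed counts, unweighted counting favours the two-symbiosis scenario, whereas any gain weight above the break-even value favours the single symbiogenesis; the conclusion therefore depends entirely on how much harder a gain is judged to be than a loss.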
This predicts that when elucidated the plastid-destined vesicle targeting systems of the harosan groups Alveolata and Chlorarachnea may share more properties than would be expected if they evolved independently. The next stage in targeting is across the periplastid membrane, the former plasma membrane of the enslaved algae. In chromalveolates this is mediated by transit-like presequences that probably evolved from chloroplast transit sequences (Cavalier-Smith 1999, 2003a) which are recognised by Derlin proteins (Der1) of a relocated periplastid membrane-specific version of the ERAD export machinery used to export damaged proteins from the ER of all eukaryote cells, and which evolved by gene duplication in the ancestral chromist (Bolte et al. 2009; Hempel et al. 2009; Agrawal et al. 2009). The unity of this periplastid membrane machinery in all chromalveolates is one of the two strongest lines of evidence for there having been only one secondary symbiogenetic intracellular enslavement of a red alga in the history of life. Possibly the chlorarachnean periplastid membrane protein system also might turn out to have more in common with the chromalveolate periplastid Derlin system than would be expected if the green and red symbiogeneses evolved independently. However, a common origin of the periplastid membrane transport machinery in chlorarachneans and chromophytes, though a permissible feature of the dual origin theory, is not a necessary one; what one might expect would depend on the relative timing of the events and how much common evolution proceeded before each divergence shown in figure 1. I have long argued that symbiogeneses could be completed surprisingly quickly, and that the basal divergences of both Plantae and Chromista were probably much more rapid than most biologists would intuitively expect and that such rapid divergence is the chief reason why the topology of both kingdoms is hard to resolve on sequence trees (Cavalier-Smith 1993).
Moreover, if two different secondary plastids were indeed being enslaved partially simultaneously, one would expect some divergent selection for specificity to cause their import machinery to diverge even if it had some shared features originally (just such divergence occurred during temporary coexistence in the tertiary symbiogenetic replacement of a dinoflagellate plastid by one from haptophytes: Patron and Waller 2007). Present evidence is indecisive: transit-peptide-like sequences of the rhizarian Bigelowiella are similar in amino acid composition to those of Apicomplexa (Alveolata) but that for the RuBisCo gene did not support targeting into the plastid of Toxoplasma in a heterologous transformation experiment (Rogers et al. 2004). The above scenario assumes that the ‘green’ genes of Moustafa et al. and the chloroplast and nucleomorph of Chlorarachnea both came from the same secondary symbiogenesis and therefore from the same species of non-streptophyte green alga. Currently it is unclear whether or not this assumption is valid. An early EF-Tu protein sequence tree suggested that the donor for chlorarachneans was a chlorophyte belonging to the class Ulvophyceae (Ishida et al. 1997) and 18S rRNA trees also suggested this but with extremely weak support (Silver et al. 2007). 70-gene chloroplast trees robustly rule out both streptophytes and prasinophytes as donor and indicate a position deep within tetraphytine green algae (Cavalier-Smith 2007) close to Ulvophyceae (Turmel et al. 2009). By contrast the analyses of Moustafa et al. (2009) suggest Prasinophyceae as the gene donor and apparently rule out streptophytes. However, as there are no genome sequences for Ulvophyceae, their trees cannot contain ulvophyte sequences, making it premature to conclude that most of the genes came from prasinophytes rather than ulvophytes (or close relatives of them).
When more green algal genomes, including several ulvophytes, are available it should be possible to distinguish between a single green algal (probably ulvophyte) secondary symbiogenesis only in Chromista, as proposed here for its heuristic simplicity, or an additional separate cryptic symbiosis from a different (prasinophyte) donor as they suggest. Tertiary symbiogenesis is a red herring for understanding chromist deep phylogeny. The unity of chromalveolates shown by the periplastid targeting machinery, and equally compellingly but entirely independently, by the gene duplication, plastid retargeting and gene replacement of plastid GAPDH and plastid FBA in all four chromalveolate groups (Fast et al. 2001; Patron et al. 2004), proves beyond any shadow of doubt that only one secondary symbiogenetic enslavement of a red alga was involved in the origin of chromalveolate plastids. But this triply compelling evidence does not in itself rule out the possibility that chromist plastids and nuclei with all these genes were also transferred bodily and laterally among distantly related lineages, a process known as tertiary symbiogenesis (whose possibility for chromists I first emphasized: Cavalier-Smith et al. 1994). Indeed one such case is proven: the replacement of the typical peridinin-containing plastid of one small lineage of dinoflagellates by a foreign fucoxanthin-containing one from a haptophyte (Patron et al. 2006). However, this tertiary symbiogenesis was almost certainly a replacement of a pre-existing chloroplast, as evidence for genes of both still persisting attests (Patron et al. 2006).
In principle, already having a plastid of secondary origin and nuclear genes coding for the machinery for protein import across several membranes and a thousand or more genes encoding proteins with topogenic sequences recognised by that machinery ought to facilitate replacement of that chloroplast by a foreign one with similar machinery, which ought therefore to be much easier evolutionarily than tertiary implantation of a plastid with four bounding membranes into a purely heterotrophic lineage that never had a plastid. It would therefore be fallacious to argue that this known case of tertiary chloroplast replacement makes it acceptable to assume that tertiary symbiogenesis into a plastid-free lineage, which must be evolutionarily extremely difficult (not one example exists), is anywhere near as likely as the evolutionary loss of plastids, which could in principle result from a single mutation. Nonetheless Sanchez-Puerta and Delwiche (2008) postulated that tertiary symbiogenesis occurred either once or twice into lineages that they suggest originally never had plastids (as others also have, but I single out their hypothesis for criticism as it is better argued than most). Exactly why they suggested this is unclear, as they gave no explicit reasons. Reading between the lines I can only suppose that they do not want to accept that Rhizaria had a photosynthetic ancestor with a red algal plastid and would prefer not to accept this for Ciliophora either, and assume (wrongly I think) that chromist monophyly should be easier to demonstrate on sequence trees than it is.
But accepting a red algal plastid in the ancestral rhizarian adds only one additional plastid loss, and accepting one in ciliates also (which they seem more ready to, for an unstated reason) adds just two losses, to the several with which they seem to have no problem; that hypothetical reduction in losses is an extremely weak justification for their incredibly complicated and entirely unnecessary scenario. As stressed long ago, worry over Chromista and Plantae not appearing monophyletic on many sequence trees is misplaced as this inconclusive resolution was expected for sound evolutionary reasons (Cavalier-Smith 1993a p. 331-2). The problem was worst with single-gene trees but persists with multigene trees. However most multigene trees do group the main chromalveolate taxa together in pairs (haptophytes with cryptists and heterokonts with alveolates: Burki et al. 2007, 2008, Hackett et al. 2007) and the most recent, most taxonomically comprehensive tree based on most (127) genes (Burki et al. 2009) groups all four together with moderately good support, with of course the inclusion also of Rhizaria and Heliozoa - which simply indicates that these two taxa evolved secondarily from chromophyte ancestors, as was long considered likely for the heterotrophic heterokont phyla (Pseudofungi and Bigyra) and should now also be accepted for Ciliophora. This latest multigene tree thus shows the monophyly of both Plantae (now generally accepted, but about which there also was scepticism for decades since I first advocated it: Cavalier-Smith 1981, 1982) and Chromista in the expanded sense of this paper, and shows Plantae and Chromista as sisters (i.e. monophyletic corticates) as on figure 1 of this paper. 
Thus it appears that the failure of Plantae and Chromista to appear as two distinct noninterdigitating clades on so many published trees may indeed simply be poor resolution resulting from extremely rapid radiation after the single primary symbiogenesis that all now accept made chloroplasts and the single secondary symbiogenesis that made the chromalveolate/chromist plastid (as long argued: Cavalier-Smith 1993a). Thus there is no reason to postulate multiple tertiary symbiogenesis to explain the origins of the ancestral plastid characterising any of the four main chromophyte groups on the mistaken grounds that the kingdom Chromista (now including Rhizaria and Heliozoa) is not monophyletic. As taxon sampling for hundreds of genes improves, evidence that Chromista are both monophyletic and holophyletic will probably grow stronger still. Moreover, as Archibald (2009) points out, the lateral gene transfer shared by haptophyte and cryptophyte chloroplast genomes, in which a bacterial ribosomal protein gene (rpl36; see Fig. 1 in blue) replaced the endogenous one, severely limits the tertiary symbiogeneses that could justifiably be assumed. One could not reasonably postulate lateral transfer between the two chromist subkingdoms recognised here, in an effort to explain the origins of their main plastids, at any time after the donor subkingdom had undergone its primary bifurcation to produce its two main photosynthetic lineages. This is because after that time the donor plastid would lack either the replacement bacterial gene or the original endogenous gene that is now present in the postulated recipient group’s plastids. Thus only tertiary symbiogenesis from a stem group prior to the date of the LGT into hacrobian plastids is permissibly postulatable (Archibald 2009).
However such early transfers cannot explain the extensively mosaic presence and absence of plastids within Hacrobia, Heterokonta, and Alveolata; as Archibald (2009) correctly stresses, one would still have to accept several plastid losses within each group. Yet Sanchez-Puerta and Delwiche (2008) ignore this restriction in their specific proposals: either two separate tertiary symbiogeneses from a haptophyte to the ancestors of Myzozoa and Heterokonta or just one to their last common ancestor. Both hypotheses contravene the rpl36 replacement constraints by gratuitously assuming that the original and replacement genes both persisted for a long time in the donor lineage (up to the time of the postulated symbiogenesis) and that the replacement gene was lost at least five times independently (by Cryptophyceae, at least twice within haptophytes and by heterokonts). Many would expect the replacement by recombination to be almost instantaneous and thus regard this scenario as most implausible. Bodyl et al. (2008) also invoked tertiary symbiogeneses (most unparsimoniously postulating four!), claiming even that a single secondary symbiogenetic insertion of a red alga is impossible because the topology of the multigene tree of Burki et al. (2008), in which Hacrobia are sisters to Plantae not to Harosa, would imply that this had to take place before red algae had even evolved. However, Burki et al. (2009) using 127 genes and a much richer taxon sample, especially of Hacrobia, have now shown that the earlier tree was probably incorrect in that respect and that Hacrobia and Harosa are probably sisters, their tree showing a holophyletic Chromista (in the present sense) with moderately good support. Thus, contrary to the claim of Bodyl et al. (2008), there is no solid phylogenetic objection to a single ancestral chromistan secondary acquisition of a red alga. Far from being impossible, it is highly likely.
At best, invoking tertiary symbiogeneses can slightly reduce the number of plastid losses that must be accepted; it cannot eliminate the need to accept evolutionary plastid losses in some lineages. But there is no scientific merit in postulating numerous evolutionarily extremely complex and mechanistically onerous events (tertiary symbiogeneses) to avoid postulating a similar number of evolutionarily and mechanistically extremely simple ones (plastid losses). In evolutionary biology one must distinguish between the possible and highly likely and the possible but extremely unlikely. Mere possibility alone is no reason to promote an unnecessarily complex explanation with less likely assumptions. Another limitation of such tertiary symbiogenesis ideas is that they address only the origins of plastids, not those of the tubular ciliary hairs of Cryptista and Heterokonta, which were the second major reason for establishing the Chromista (Cavalier-Smith 1981, 1986), and are evidence independent of plastids for chromist monophyly (assuming they are indeed homologous, which remains to be tested by molecular biology). One could not explain the presence of tubular hairs in heterokonts by tertiary transfer from haptophytes, which lack them (putatively secondarily: Cavalier-Smith 1986, 1994; multiple losses of tripartite hairs are now clear within Heterokonta and likely within Cryptista: Cavalier-Smith and Chao 2006; Cavalier-Smith 2004). Molecular data are also needed to test whether the simple ciliary hairs of Myzozoa and solid hairs of Goniomonadea are related to the tubular hairs of heterokonts, Cryptophyceae and Telonemea despite their markedly different morphology. The discovery of the multitude of genes of green algal origin in both chromist subkingdoms (Moustafa et al. 2009) also favours a photosynthetic ancestry for all chromists and their monophyly. 
It could not be readily explained by the specific tertiary symbioses of Sanchez-Puerta and Delwiche (2008) or of Bodyl et al. (2008) or of Fig. 2 of Archibald (2009). As heterokonts (the postulated recipients) seem to have several times as many ‘green’ genes as haptophytes (the postulated donors), supposing that they got them from haptophytes is implausible. The presence of genes of red algal origin in Bigelowiella is not specific support for the dual theory; it may simply be a consequence of the rhizarian ancestor having had an ancestor containing a nucleomorph of red algal origin. However, the presence of 20:5(n-3) fatty acids in the glycolipids of the rhizarian chlorarachnean algae, which are characteristic of red algae but unknown in green algae (Leblond et al. 2005), specifically favours the dual simultaneous symbiogenesis theory as these lipids are probably now located in the green chloroplast, suggesting at least a brief period of coexistence of green and red plastids in the same cell in early Cercozoa. Biologists have been too ready to invoke multiple origins, whether by LGT or by symbiogenesis, when an ancestral presence of multiple characters followed by several differential losses of one character or another often offers a simpler and more likely explanation of patchy character distribution, as exemplified by the complex mutually exclusive distribution of the alternative protein synthesis elongation proteins (EF1-α and EFL) in Euglenozoa (Gile et al. 2009). Contrary to earlier assumptions of repeated LGT the simplest interpretation now is that both were present in the ancestral euglenozoan and that these two genes evolved by gene duplication and divergence from a single ancestral prokaryotic protein in the ancestral eukaryote.
Technical Caveat
Like Dagan and Martin (2009) I am concerned that when dealing with ‘thousands of trees some trees will give erroneous results purely by chance’ and that ‘what constitutes “evidence” in the analysis of thousands of gene trees remains subjective’, and thus I think that the assumption of a massive early influx of green algal genes into chromalveolates (Moustafa et al. 2009) is at least partly open to question. In particular Moustafa et al. (in their electronic supplement) too glibly dismiss the possibility that Plantae really are sisters of Chromista as shown on the best multigene trees (e.g. Burki et al. 2009) and figure 1 here. If that is so, the first step of their methodology, searching for diatom genes more related to red and green algae than to any other non-chromist taxa, could largely be selecting for genes that were vertically inherited from the common ancestor of Plantae and Chromista. Chance and lineage-specific biases and model violations could make the diatoms/chromists artefactually sister to (or even branch within) either green plants or red algae in a significant proportion of the trees even if their correct position were as sister to reds + greens + glaucophytes (as no glaucophyte genome is available they would be missing from many trees). The vastly greater number of green plant than red genes in the target data set could have biased the initial selection of genes towards the greens. Such tree topology artefacts are particularly likely because most of the genes studied probably had fewer than 250 alignable amino acids (they gave no data on that, but on average each gene of Burki et al. (2009) contributed only 230 amino acids) and the chances are that many (conceivably most) of the thousands of genes they screened automatically just gave bad trees that do not accurately reflect their evolutionary history.
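The worry about chance artefacts in a large screen can be made concrete with a back-of-envelope expectation. The sketch below assumes, as a deliberate simplification, that a fixed fraction of trees misplace the query gene and that a misplaced gene attaches to a green or a red sequence in proportion to how many of each are in the target data set. Every number in it is hypothetical and none comes from Moustafa et al. (2009).

```python
def expected_spurious_calls(n_trees: int, error_rate: float,
                            n_green: int, n_red: int) -> tuple[float, float]:
    """Expected numbers of artefactual 'green' and 'red' calls.

    Assumes each tree is wrong independently with probability `error_rate`,
    and that a wrong tree lands on green vs. red in proportion to the number
    of green and red sequences available (an illustrative assumption).
    """
    spurious = n_trees * error_rate
    green_share = n_green / (n_green + n_red)
    return spurious * green_share, spurious * (1.0 - green_share)


# Hypothetical screen: 5000 trees, 10% misplaced, nine green genomes per red one.
green_calls, red_calls = expected_spurious_calls(5000, 0.10, n_green=9, n_red=1)
print(f"expected spurious green calls: {green_calls:.0f}")
print(f"expected spurious red calls:   {red_calls:.0f}")
```

Even with a modest assumed error rate, a screen of this size would be expected to yield hundreds of spurious calls, most of them "green" simply because green sequences dominate the assumed target set.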
As they only included two trees in the supplement, one cannot form a judgement about this, the adequacy of taxon sampling, or whether the tree topology was sensible in other respects. For the vast majority of these genes there are no published trees and no track record of giving congruent results in general; remarkably many of the green ones especially are hypothetical of unknown function. If chromists originated 700 My ago, Plantae 770 My ago and the red-green split was 735 My ago and the unikont-corticate split was 800 My ago (all reasonable guesses based on the fossil record), then only amino acid changes that occurred in the narrow 770-735 My ago window and were not overwritten by changes in either the chromist or plant lineages in the following 735 My would retain evidence for the true position. If changes were random along the molecule and unbiased through time the overwriting problem means there is very little chance that any protein would have more than one or two amino acids that happened to evolve that way and most might have none; either the gene would be evolving too slowly to have many amino acid changes in the relevant relatively short time interval or too fast to retain them subsequently. Of course molecules do not evolve that predictably, as evolution is more erratic and biased. Such biases, e.g. frozen covarion effects or temporary rapid evolution, will sometimes accentuate the true history for some deep branches and make it easier than expected to recover the correct history by that gene and sometimes instead lead to incorrect topologies with strong enough ‘support’ to fool us. The PhyML method used is particularly prone to getting stuck in local minima and giving contradictory branch topologies with high support for different genes, and the aLRT of 0.75 gives no assurance that genes labeled ‘green’ actually came from green algae. 
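The overwriting argument above can be made concrete with a deliberately crude calculation. The Python sketch below is an illustration under stated assumptions, not an analysis from the paper: it assumes that substitutions hit each site as a Poisson process at one constant rate, that a site is informative only if it changed during the 35 My window (770-735 My ago) and was then left untouched along two descendant lineages of 735 My each, and it borrows the 230-residue average gene length quoted above. The function names and the two-lineage treatment are hypothetical choices made for this example.

```python
import math


def retained_fraction(rate, window_my=35.0, later_my=735.0, lineages=2):
    """Probability that a site changed in the window and was never hit again.

    rate is in substitutions per site per My (assumed constant, Poisson).
    """
    changed_in_window = 1.0 - math.exp(-rate * window_my)
    untouched_later = math.exp(-rate * later_my * lineages)
    return changed_in_window * untouched_later


def expected_informative_sites(rate, n_sites=230, **kwargs):
    """Expected number of sites still carrying the signal from the window."""
    return n_sites * retained_fraction(rate, **kwargs)


def best_rate(window_my=35.0, later_my=735.0, lineages=2):
    """Rate that maximises retained_fraction (set its derivative to zero)."""
    total_later = later_my * lineages
    return math.log((window_my + total_later) / total_later) / window_my


if __name__ == "__main__":
    # Scan from very slow to very fast genes; the optimum sits in between.
    for rate in (1e-5, 1e-4, best_rate(), 5e-3, 2e-2):
        print(f"rate {rate:.2e} per site per My -> "
              f"{expected_informative_sites(rate):.2f} informative sites")
```

Under these assumptions the expected number of surviving informative sites peaks at about two per 230-residue protein and falls away steeply for slower or faster genes, which is the too-slow-or-too-fast squeeze described in the text. Real proteins violate every one of these assumptions, as the text itself stresses, so the numbers indicate scale only.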
Thus I think the number of genes that purportedly came from green algae into the first chromist may be substantially overestimated. Nonetheless, it seems likely that there is a real signal amidst the almost inevitable phylogenetic noise, especially as the strong bias towards a non-streptophyte affinity in their results, which would not obviously be expected from the considerations just mentioned, is concordant with the evidence for a non-streptophyte green algal secondary symbiogenesis making the chlorarachnean algae, which now seem to have had a deep common history with the chromalveolates. The dual symbiogenesis hypothesis simply and economically accounts for this non-streptophyte bias in the results. Among green algae the marine Ulvophyceae and largely marine prasinophytes are a priori more likely symbionts for the ancestral chromist, which was almost certainly marine (Cavalier-Smith 2009a), as they are more abundant in the oceans than either Chlorophyceae or unicellular streptophytes, which are both largely freshwater or soil organisms. Thus even if there were two independent secondary green symbioses in chromists both would probably have involved non-streptophytes. The parsimony in the present hypothesis therefore comes not from the mere fact that both lines of evidence point to non-streptophyte green algae, but from a potential mechanistic economy in the initial stages of the red and green symbiogenesis. The significance of this broad taxonomic agreement over the donor is simply that it fails to contradict the present theory of a common cause for both observations. There is a pleasing symmetry in the chromist part of figure 1 in that in each subkingdom the first diverging branch includes an algal class that retains a nucleomorph and either a green or a red plastid; in each case the later branches retain only the red plastid or neither. More and more new data, like the discovery of photosynthetic apicomplexans, e.g. Chromera (Moore et al. 
2008) and of colourless plastids in species often assumed never to have had algal ancestors (both within Myzozoa and Heterokonta), attest to chloroplasts being ancestral for Myzozoa, which many long resisted, and thus increase the plausibility of their antiquity in Chromista, albeit falling short of proving their presence in their last common ancestor (Burki et al. (2009) and Archibald (2009) discuss this further).
Chromista (1981) versus chromalveolates (1999). Placing alveolates within Chromista expands Chromista so that all chromalveolates are now included within Chromista. Adding also Rhizaria and centrohelid Heliozoa to Chromista has made it now even broader than my original ‘chromalveolates’ (Cavalier-Smith 1999), a name unnecessary for formal classification. Adl et al. (2005) unwisely tentatively introduced the slightly different name Chromalveolata as a clade name and with a totally inadequate diagnosis, but it did not include either Rhizaria or Heliozoa, so it is not a synonym for Chromista in either its original sense or the new broader one established in this paper. Chromalveolata as they defined it is paraphyletic; Moustafa et al. (2009) used it in a broader sense to include also Rhizaria. The name Chromista was introduced for a kingdom (Cavalier-Smith 1981) and has historical precedence and is shorter and in my view greatly preferable as the taxon name to ‘chromalveolates’, which simply started as a convenient name for an alignment file on my computer. Adl et al. did not make Chromalveolata a taxon or rank it by a conventional Linnean category. It might be least confusing to retain ‘chromalveolates’ as an informal name for the paraphyletic group comprising Heterokonta, Alveolata, Haptophyta, and Cryptista (its original and still most widely used meaning), should anyone wish to retain it, rather than to expand the concept to include Rhizaria and Heliozoa, which would make it an unnecessary junior synonym of Chromista as here expanded. 
However, I do not envisage many circumstances in which one would want a term denoting chromists other than Rhizaria and Heliozoa, except as a means to reduce confusion in the transitional period whilst the wider meaning of the older and more euphonious Chromista becomes adopted. For the photosynthetic chromists, a still narrower paraphyletic subset of the chromalveolates, the older term ‘chromophytes’ suffices and will often remain useful.
Note 2. Further explanation of Fig. 1 and Tom40 and ORC distribution
Tom40 distribution. Published data on the distribution of Tom40 were restricted to genomes completely sequenced some while ago and therefore did not include free-living Metamonada, Loukozoa, Percolozoa, Diplonemea or Euglenoidea. I made some additional BLAST studies for Eozoa. Using GenBank I readily detected a Tom40 homologue in the free-living metamonad Trimastix pyriformis, and using the JGI website for the now completed genome of the percolozoan Naegleria gruberi I found one Tom40 homologue: estExt_fgeneshHS_pg.C_460029. I also consulted the Protist EST database at the University of Montreal (http://tbestdb.bcm.umontreal.ca), which contains ESTs and automatically annotated BLAST hits for 60 protists including 6 Loukozoa and 3 non-kinetoplastid Euglenozoa. I found one annotated putative Tom40 homologue cluster, SEL00000632, for the jakobid Seculamonas, which when reblasted had the putative Trimastix Tom40 among its top hits, but no putative homologues were listed for the two euglenoids and one diplonemid. But complete genomes are obviously needed for better evidence that they are truly undetectable in all three main groups of Euglenozoa. 
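The reblast check described above (does a candidate, when used as a query, recover a known Tom40 among its top hits?) can be sketched as a small filter over BLAST tabular output. The Python below is a minimal illustration, not the procedure actually used for this supplement: it assumes hits saved in the standard 12-column BLAST+ tabular format, and the function names, the top-five window and the 1e-5 E-value cutoff are hypothetical choices, since no thresholds are stated in the text.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def parse_hits(tabular_text):
    """Parse BLAST tabular text into a list of dicts, best bitscore first."""
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    hits = []
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(COLUMNS, row))
        hit["evalue"] = float(hit["evalue"])
        hit["bitscore"] = float(hit["bitscore"])
        hits.append(hit)
    return sorted(hits, key=lambda h: h["bitscore"], reverse=True)


def among_top_hits(hits, subject_id, top_n=5, max_evalue=1e-5):
    """True if subject_id is one of the top_n hits passing the E-value cutoff.

    top_n and max_evalue are illustrative assumptions, not published values.
    """
    passing = [h for h in hits if h["evalue"] <= max_evalue]
    return any(h["sseqid"] == subject_id for h in passing[:top_n])
```

A candidate that fails this check is not thereby shown to be absent: as the text notes for fast-evolving proteins, a simple BLAST search can miss genuine homologues, so a negative result still calls for complete genomes or biochemical confirmation.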
Since even Microsporidia and Giardia have Tom40 (and Microsporidia at least also the major receptor Tom70) in their mitosome outer membranes, despite their dramatic simplification compared with aerobic mitochondria, and since in general microsporidian proteins evolve much faster even than those of trypanosomatids yet one can identify their Tom40 by BLAST, its undetectability in trypanosomatids cannot easily be dismissed as simply secondary divergence. Direct biochemical studies of the mitochondrial targeting machinery are needed in trypanosomatids, euglenoids and diplonemids to determine what proteins they use and whether porin VDAC is part of this machinery. Such studies are important not only for testing my present hypothesis of the location of the eukaryotic root, but also for better understanding the evolutionary flexibility and origins of the protein-import machinery during the origin of mitochondria.
Distribution of the origin recognition complex (ORC). I also carried out BLAST studies to clarify the distribution of ORC and two ancillary proteins, Cdc6 and Cdt1, which interact with it in neozoa. In neozoa ORC consists of six proteins: five evolutionarily related proteins, Orc1-5, which share a major domain related in turn to Cdc6, plus Orc6, a smaller and faster-evolving protein unrelated to any of the others. Cdt1 belongs to a third protein family. The collective function of these eight proteins is to load the hexameric Mcm2-7 DNA helicase complex onto DNA at the proper sites recognized by ORC and its two associated proteins, so that the helicase can open the double helix to allow access by the replication machinery. In archaebacteria the same function is mediated by homologues of Cdc6 and Cdt1 only, ORC being absent (Robinson and Bell 2007). 
During the origin of eukaryotes there was a major increase in chromatin complexity associated with the evolution of additional histones and mitosis (Cavalier-Smith 2002); as part of this process Mcms, which in archaebacteria consist of a single protein that forms a homohexamer, underwent duplication and divergence to form a heterohexamer (Liu et al. 2009). It has been assumed that a heterohexameric ORC also evolved at that time (Duncker et al. 2009), but Godoy et al. (2009) provide evidence that hexameric ORC is absent from trypanosomes and suggest that they exemplify a primitive, archaebacteria-like, evolutionary phase before the gene duplications that created a heterohexameric ORC. My BLAST results support this and suggest that ORCs arose and increased in complexity after the origin of eukaryotes in four distinct phases. I suggest that, as in archaebacteria, trypanosomatid and probably also other euglenozoan Mcms are loaded onto replicon origins solely by Cdc6 and a distant homologue of Cdt1, even though they have a heterohexameric Mcm. First, I propose that a primitive ORC evolved in the ancestor of neokaryotes, and may have contained as few as two different proteins: Orc6 and Orc2 (i.e. just one member of the Orc1-5/Cdc6 family). Second, a further increase in complexity involved the origin of at least Orc4 in a common ancestor of Metamonada and Neozoa (as Orc4 is detectable by BLAST in the metamonad Trichomonas but not the percolozoan Naegleria or Euglenozoa; http://genome.jgi-psf.org/Naegr1/Naegr1.home.html) (Fig. 1). Third, a major change in the ancestor of neozoa spliced a large chromodomain (BAH) onto the N-terminal end of an Orc1-5/Cdc6 domain, creating the much larger neozoan Orc1 (Fig. 1). This might be associated with substantial changes to heterochromatin biology in neozoa compared with Eozoa. Finally, an extra domain seems to have been added to the N-terminal end of Orc2 in the ancestor of opisthokonts. 
Just looking at gene annotations is confusing, as the single Cdc6 protein is rather indiscriminately annotated in GenBank in both archaebacteria and trypanosomatids as Cdc6 or Orc1, just because they share the same AAA ATPase domain with Orc1-5, even though no archaebacterial or eozoan sequences that I examined had the BAH domain characteristic of neozoan Orc1. Because of this historical annotation confusion, Godoy et al. (2009) referred to the trypanosome replication-initiating AAA ATPase as Orc1/Cdc6, despite showing that it rescued yeast Cdc6 mutations but not Orc1 mutations. Clearly it is functionally more like Cdc6 than neozoan Orc1 and is best simply called Cdc6. On trees, however, Orc1 and Cdc6 are robustly sisters, and closer to each other than to any of Orc2-5 (Duncker et al. 2009). The high degree of conservation of Orc1, 2 and 4 and Cdc6 in neozoa, and the ability to detect them even in the radically modified microsporidian fungi that have truncated the N-terminal end of both Orc1 and Orc2, make it unlikely that they would have been missed in BLAST searches, so I regard the absence of Orc1, 2 and 3 in trypanosomes and the absence of Orc4 in Naegleria as very significant. However, Orc3, 5 and 6 and Cdt1 all evolve more rapidly, sufficiently so to make false negatives a genuine risk in rapidly diverging lineages using simple BLASTP. Thus my inability to find most of these four proteins (except Orc6 in Naegleria) in Eozoa needs interpreting with caution. Direct biochemical studies on prereplication complexes in Eozoa are needed to establish what subunits they actually contain and thus test the strong indications from BLAST of a stepwise increase in ORC complexity in Eozoa and to establish when in this process Orc3 and 5 evolved. 
When using human Cdt1 as query, homologues could be more readily detected in Posibacteria (the putative ancestors of both eukaryotes and archaebacteria: Cavalier-Smith 2006b; Valas and Bourne 2009) than in archaebacteria or some neozoan lineages (e.g. alveolates, red algae) prone to rapid protein evolution, but were readily detectable in green plants. The origin of Orc6 is unclear but a few possible distant homologues are detectable in both archaebacteria and eubacteria including proteobacteria. The simplest scenario for the origins of Orc1-5 is by successive gene duplications of Cdc6, with repeated divergences in which the Cdc6 domain of Orc1 most conservatively retained its original structure. The fact that chromatin in trypanosomatids is generally dispersed throughout the cell cycle whereas in diplonemids and euglenoids it is condensed throughout (Table 2 below), both different from most neokaryotes, makes it especially important to study Orcs and heterochromatin properties in all major groups of Euglenozoa, including especially the putatively early diverging petalomonads, so as to disentangle which chromatin characters are ancestral and which derived, both for Euglenozoa and eukaryotes as a whole.
Chromist diversification. Fig. 1 assumes that chlorophylls c1 and c2 and the carotenoid pigment fucoxanthin all evolved in the ancestral chromist prior to the primary divergence of harobiotes and hacrobians. This implies that cryptists and the photosynthetic alveolate Chromera (Moore et al. 2008) independently lost all three pigments, and that the alveolate dinoflagellates lost chlorophyll c1. Multiple losses of photosynthetic pigments within Chromista are well established. Within the heterokont (=stramenopile) class Chrysomonadea chlorophyll c2 was lost by Synurales, and fucoxanthin was lost within the class Raphidophyceae and independently by Eustigmatophyceae and Xanthophyceae. 
Given that fucoxanthin is thus known to have been lost three times within heterokonts, postulating two further losses (in Alveolata and Cryptista) soon after the initial diversification of chromists is not evolutionarily onerous. There is no need to invoke either independent origins of complex pigment biosynthetic pathways or lateral transfer, whether by tertiary symbiogenesis or otherwise, to explain chromist pigment distribution. Accepting that alveolates branch within Chromista also increases the number of nucleomorph losses that must be postulated; in addition to those already necessitated within Hacrobia at least one is required prior to the common ancestor of Heterokonta/Alveolata. On present information, small GTPase paralogue Rab1A appears to be a synapomorphy just for Harosa, not for chromalveolates as a whole as implied by Elias et al. (2009).
Euglenoid chloroplast origin: to reduce clutter the secondary symbiogenesis that implanted a prasinophyte green algal chloroplast (Turmel et al. 2009) into an advanced subgroup of euglenoid Euglenozoa, which thereafter abandoned phagotrophy, is not shown.
Dates: based on evidence and arguments in Cavalier-Smith (2006c). However, accepting an eozoan root adds the further complication that no Eozoa fossilize well, and almost no unambiguous eozoan fossils exist beyond a plausible euglenoid in rather recent amber. The oldest fossils that I accept as unambiguously eukaryotic are testate amoebae (Melanocyrillium) dating back to 760-800 My ago; I do not accept identifications of any of them as Cercozoa. Though some are plausibly Amoebozoa they might in fact come from an extinct, probably neozoan stem group. No Eozoa have pseudopods known to be able to manipulate particles to make tests; heterolobosean pseudopods (Percolozoa, the only ones well established in free-living Eozoa) are apparently not that versatile. But it may not be safe to conclude that none ever did so in the past. 
If they did not, the ~800 My date would represent the date for neozoa; very likely eukaryotes would be somewhat older; the reasonably good resolution in the basal part of the excavate tree could be interpreted as resulting from either reasonably good temporal spacing or from frozen rapid covarion-like changes during their early diversification. If the former interpretation were correct, then ~900-1000 My ago might be a better rough estimate for the date of origin of eukaryotes, taking both sequence tree proportions and Neoproterozoic fossil data into account. In that case the largest acritarchs in the period 800-1000 My ago that cannot confidently be assigned to any protist or bacterial phylum, despite claims to the contrary, might reasonably be interpreted as possible eozoan cysts rather than stem eukaryotes or prokaryotes. However, if there was once an eozoan lineage of testate amoebae represented by the Melanocyrillium fossils, the date of origin of both neozoa and eukaryotes could be more recent than that.
Note 3: invalidity of an earlier rooting between unikont and bikont eukaryotes. Both arguments for the root being between unikonts (originally comprising only opisthokonts and Amoebozoa) and bikonts as originally presented are now invalid. First, the pattern of ciliary transformation of bikonts in which the anterior cilium was the younger and was transformed into the structurally and functionally different posterior cilium was proposed to be a derived state for bikonts only, whereas unikonts were believed to have either no ciliary transformation or an opposite pattern with the anterior cilium being older (Cavalier-Smith 2002). The assumption of two contrasting patterns of ciliary transformation in eukaryotes was based on the review of Moestrup (2000), and especially a paper describing that pattern in the amoebozoan myxogastrid Physarum (Wright et al. 1980). 
Both Moestrup and I overlooked that Gely and Wright (1986) had retracted their earlier interpretation, so that ciliary transformation in Physarum (and by implication other Amoebozoa also) has the same pattern as in bikonts. Even though my maximum likelihood 18S rRNA tree showed Apusozoa as grouping within unikonts as sister to Amoebozoa (but other methods gave no bootstrap support: Cavalier-Smith 2002) I then regarded Apusozoa as bikonts as they were all biciliate and one species (originally incorrectly described under the name Rhynchomonas mutabilis) had been observed to have the bikont pattern of ciliary transformation (Griessmann 1913). However, evidence is growing from other gene trees also that Apusozoa do belong in unikonts, most likely as sister to opisthokonts rather than Amoebozoa, though the possibility that Apusozoa are the paraphyletic ancestors of both opisthokonts and Amoebozoa has not been ruled out (Kim et al. 2006; Moreira et al. 2007; Brown et al. 2009). This means that Amoebozoa and opisthokonts, each of which has been argued to have been ancestrally uniciliate (despite the ancestral opisthokont at least having had two centrioles), probably became uniciliate independently. I now suggest that this happened by suppression of the anterior cilium’s growth in the ancestral opisthokont, making it posteriorly uniciliate, and the independent suppression of posterior ciliary growth in the ancestral amoebozoan, making it anteriorly uniciliate (but with reversion to the biciliate condition by relieving this suppression in the ancestor of myxogastrid slime moulds). The second argument for bikonts being derived was the dihydrofolate-reductase-thymidylate-synthetase (DHFR-TS) gene fusion shared by bikonts, including the apusozoan Amastigomonas debruynei (not its correct name: Cavalier-Smith and Chao submitted). 
If apusomonad Apusozoa are genuinely sisters to opisthokonts then this gene fusion must have been reversed in one or both of Amoebozoa and opisthokonts; both cannot be regarded as representing a primitive uniciliate state. The argument from myosin synapomorphies (Richards and Cavalier-Smith 2006) for unikont holophyly makes it likely that the DHFR-TS gene fusion was reversed independently in opisthokonts and Amoebozoa by gene duplication and differential deletions, as is mechanistically plausible. Thus neither of the reasons for placing the root outside bikonts has withstood subsequent scrutiny. Neither is an obstacle to the earlier view that the root was in Eozoa because of the primitive mitochondrial genomes of jakobid excavates (Cavalier-Smith 2000). The Ccm/lyase argument for an eozoan root is inherently stronger than the DHFR-TS argument because (a) there is no doubt that the presence of Ccm genes in excavate mitochondrial genomes is the primitive state and (b) it is arguably easier for a gene fusion to be secondarily lost by gene duplication and differential deletions of the two parts than it would be for nuclear lyase to be replaced by mitochondrially coded Ccms by lateral transfer from bacteria into mitochondria (no case of such transfer is known). If the root were amongst core excavates (Loukozoa, Metamonada, Percolozoa) such loss must still be postulated because of the strong sequence evidence that Euglenozoa are related to them on unrooted trees. In principle the root within Eozoa could either be within excavates, e.g. beside the zooflagellate jakobid Loukozoa as sometimes suggested because of their particularly primitive mitochondrial genomes (notably retention of the α-proteobacterial RNA polymerase) (Cavalier-Smith 2000), or between excavates and Euglenozoa, as proposed here. 
Other once plausible places within excavates are between Loukozoa and Discicristata (Euglenozoa, Percolozoa) because of their different mitochondrial cristae or within or beside Percolozoa because of their absence of Golgi stacking and aberrant short nuclear rRNAs, but neither of these has a strong rationale. Tom40 and Orc arguments discussed above strongly favour a root within or beside Euglenozoa instead. If these are accepted, one must assume that the viral RNA polymerase now used by eukaryotes other than jakobids for transcribing mitochondrial DNA entered the ancestral eukaryote prior to the divergence of neokaryotes and Euglenozoa, and that the α-proteobacterial RNA polymerase was immediately lost by Euglenozoa but persisted in neokaryotes for a period until after the divergence of jakobids (the second neokaryote branch after Percolozoa) and was lost independently twice within neokaryotes: by the ancestor of Percolozoa and by the common ancestor of Malawimonas, Metamonada and neozoa. As the time interval between the divergence of Percolozoa and jakobids could have been very short, brief coexistence of two RNA polymerases is not an evolutionarily onerous assumption, especially as a similar viral polymerase and the cyanobacterial polymerase have coexisted in chloroplasts with partially overlapping functions for at least 600 My. The present rooting requires only three losses of the α-proteobacterial polymerase, unlike the unikont/bikont root that required at least four, and more importantly a briefer period of coexistence in only one segment of the tree, not on several segments.
********************************************************************
Rogozin et al. (2009) have also argued that the root is within bikonts, postulating that it is not in Eozoa but between Plantae and all other eukaryotes. 
However, their own analyses of rare conserved changes in numerous proteins actually contradict their conclusion and are fully consistent with the root being within Eozoa as argued here. They used an ingenious four-taxon method (three eukaryote groups plus bacterial outgroup) to analyse eight different trifurcations within eukaryotes. Analyses of all three trifurcations that included Eozoa showed with strong statistical support the eozoan group as branching more deeply than either plants or opisthokonts, exactly as on my Fig. 1. However, they assumed that all three results were biased by long-branch artefacts and should all be ignored, basing their conclusions only on the five other analyses that included Neozoa only (making their analyses irrelevant to the question of whether the root is within Eozoa or not!). The branches for the three eozoan taxa (Giardia, Trichomonas, and kinetoplastids) were indeed very long. However, in principle extremely rapid evolution in such a branch, by introducing numerous convergences with the bacterial outgroup, could either change the topology of the tree by putting the branch deeper than it should be or instead simply add false amplification to a weaker true signal that it really is a deep branch, and thus not give a topologically incorrect conclusion. A priori there is no way of knowing whether a long branch is giving a false or a true topology; to assume that all analyses including Eozoa gave a false result is purely arbitrary. They might all be topologically correct or some right and some wrong! The subjectivity of how Rogozin et al. (2009) drew conclusions from their analyses is also illustrated in two parts of the neozoan tree. One was the trifurcation involving the amoebozoans Entamoeba and Dictyostelium and opisthokonts. 
The raw data showed that these two amoebae share more rare amino acid substitutions with each other than with opisthokonts, which is consistent with the holophyly of Amoebozoa strongly indicated by the best available multigene tree (Minge et al. 2009). However, their statistical test, which attempts to correct for long branches by assuming that they necessarily proportionally introduce homoplasies, favoured instead the contradictory idea that Entamoeba branches more deeply than Dictyostelium and that Amoebozoa are paraphyletic. They unwisely concluded that the elaborate statistical test, which makes untestable assumptions about the numerical relationship between branch lengths and misleading convergences, gave the right answer and that the contradictory raw data were misleading and that Amoebozoa really are paraphyletic. Almost certainly this conclusion is wrong and the statistical test simply overcorrected for homoplasy; a multigene tree for scores of genes including 7 Amoebozoa (Minge et al. 2008) is probably more reliable than a statistical analysis with dubiously valid assumptions based on only two Amoebozoa and a tiny number of conservative amino acid positions in many fewer proteins. It is also odd that Rogozin et al. (2009) chose to accept the statistical conclusion for this unikont trifurcation, even though they rejected the statistical conclusion that kinetoplastids are deeper branching than Plantae or opisthokonts as a long-branch artefact, when in fact the Entamoeba branch was objectively longer than that for kinetoplastids. In the case of the metamonads Giardia and Trichomonas, both the raw data and the statistical tests agreed in placing them more deeply than Plantae, yet they concluded that their analyses were ‘best compatible with’ Plantae being deepest! 
The other problematic interpretation concerned the chromalveolate (chromist), plant, opisthokont trifurcation, where contradictory topologies were favoured by different species samples and different genes. They recognised that there was a strong signal from many genes and chromist species for a sister relationship between Plantae and Chromista as shown in my Fig. 1 and multigene trees (Burki et al. 2009), but dismissed these (without any evidence or specific arguments for any gene) all as cases of replacement of host genes by those from the enslaved red alga. Instead they assumed that the genes that showed chromalveolates as sisters to opisthokonts were giving the true vertical signal for the host; however, it is perfectly possible that it is these genes that are the artefacts by excessive divergence and those indicating a sisterhood of chromists and plants are the true signal. They provide no way of distinguishing these possibilities, making their ‘conclusion’ as to the position of the root totally subjective. Overall one can argue that statistical treatment of such rare amino acid changes involving untestable assumptions about the impact of branch lengths, coupled with the necessary restriction of the method to three eukaryote taxa at a time, is more likely to lead to artefact than conventional multigene trees, so one cannot regard this method as a panacea to avoid such problems.
Note 4. Euglenozoan characters in relation to the position of the root
Euglenozoa comprise four classes (Table 1) whose relationships are not thoroughly established. Protein sequence trees suggested that Kinetoplastea (parasitic Trypanosomatida plus the ancestral mostly free-living Bodonida from which they evolved) are sisters to Diplonemea, but recent 18S rRNA evidence for Calkinsia, which I have classified within Postgaardea, suggests that Postgaardea instead might be sister to Kinetoplastea (Yubuki et al. 
2009), as assumed (Cavalier-Smith 1998), and raises the possibility that Euglenoidea could be paraphyletic ancestors of the other three classes, rather than sisters of Kinetoplastea as protein trees (poorly sampled for deep-branching euglenoids) have suggested. Because of this topological uncertainty and the paucity of biochemical and total absence of genomic information from the four nutritionally ancestral free-living phagotrophic groups (Bodonida, Diplonemea, Postgaardea and Peranemia [basal phagotrophic euglenoids]) it is not possible yet to say what are the ancestral molecular characters for Euglenozoa. Therefore one cannot currently distinguish between the eukaryotic root being between Euglenozoa and neokaryotes or deep within Euglenozoa themselves. Cytologically it would probably be simplest if the root were between all Euglenozoa and all neokaryotes, as the special features of Euglenozoa (1-3 in Table 2) and excavates would then be divergent specializations of a possibly simpler early eukaryote; that avoids supposing that the remarkably stable euglenozoan pattern was ancestral to the alternative pattern that characterizes excavates. However, given our ignorance about molecular and cytological diversity in the putatively most deeply branching euglenozoan group, the bacteria-eating petalomonad euglenoids (Peranemia), such a conclusion might be premature. Most free-living Euglenozoa have two cilia; petalomonads have only one emergent cilium (the anterior, used for gliding on surfaces), though some have two centrioles and a second rudimentary or vestigial non-emergent cilium, and all are often considered as derived from biciliate ancestors. 
There is unambiguous phylogenetic evidence that the uniciliate trypanosomatids evolved from the biciliate bodonids, but no evidence that the petalomonad genus Scytomonas, which also has only one cilium and centriole (Mignot 1961), had biciliate ancestors; like other petalomonads Scytomonas has simpler mouthparts than most phagotrophic euglenoids or diplonemids and a particularly simple pellicle; it is also the only euglenoid for which sexual cell fusion is known. It is therefore a candidate for a descendant from the long-postulated unicentriolar uniciliate ancestor of bicentriolar eukaryotes. Cultures need to be obtained to test whether it is an early diverging euglenozoan lineage that diverged from other eukaryotes before most of the features now widespread in Euglenozoa (Table 2) evolved or instead arose by simplification from biciliate petalomonads. As Table 2 below indicates, most Euglenozoa have radically different properties from all neokaryotes. Characters 4-9 are clearly derived specialised characters that cannot be regarded as ancestral to those of neokaryotes, as they are as divergent from those of prokaryotes as from standard neokaryotic ones, and it would be mechanistically hard to envisage most of them giving rise later to neokaryotic properties; character 10 is also unlikely to be primitive for eukaryotes. In this respect they differ profoundly from characters like the absence of Tom40 and ORC, which seem to reflect the ancestral conditions respectively in the proteobacterial ancestor of mitochondria and the neomuran ancestor of the rest of the eukaryote cell (Cavalier-Smith 2009b), as arguably do the nine other characters mentioned in the second paragraph of this supplement. Thus the magnitude and number of the special euglenozoan properties of Table 2 are comprehensible consequences of the root being either between Euglenozoa and neokaryotes or deeply amongst deep-branching Euglenozoa. 
The latter must be studied to see whether any have Tom40 or ORC and which, if any, of the characters of Table 2 they possess. If all petalomonads possessed most or all of the Table 2 properties but had neither Tom40 nor ORC nor any of the other 8 characters mentioned above as absent from trypanosomatids, the only plausible place for the root would be between Euglenozoa as a whole and neokaryotes. But more complex character distributions could favour a root either somewhere between trypanosomatids and euglenoids or deep within euglenoids. Note that my contrasting of the morphology of excavate vanes and rods and arguing for a primary eukaryotic bifurcation between them does not exclude the possibility that they had a simpler common ancestor; conceivably some of their proteins are distantly related.

Table 2. Eleven unusual properties of Euglenozoa absent from other eukaryotes
___________________________________________________________________________
1. Two cilia with dissimilar lattice paraxial rods stemming from parallel centrioles located in a deep anterior reservoir (Simpson 1997).
2. Complex anterior ingestion apparatus ancestrally with dense rod and plicate vanes, sometimes reduced to an MTR pocket (Simpson 1997).
3. Long rod-shaped extrusomes (Simpson 1997).
4. Unique cytochrome c with only one cysteine for haem binding and mechanism of biogenesis unique in the living world (Allen et al. 2008).
5. Mitochondrial DNA of multiple circles with extensive U-insertion editing (Gray 2003; Marande et al. 2005).
6. Nuclear succinate dehydrogenase split into two genes for proteins separately imported into mitochondria (Gawryluk & Gray 2009).
7. Nuclear messenger RNA made by trans-splicing splice-leaders onto coding regions (Frantz et al. 2000) (similar behaviour independently evolved in dinoflagellates, and for some nematode genes).
8. Nuclear protein-coding transcripts are almost all polygenic, each with several unrelated genes (Berriman et al. 2005).
9. Unusual base J (beta-d-glucopyranosyloxymethyluracil) in their nuclear DNA (Borst & Sabatini 2008).
10. Nuclear chromosomes remain visibly condensed throughout interphase (euglenoids and diplonemids; in kinetoplastids they are never visibly condensed even during mitosis, probably a derived state) (Triemer & Farmer 1991).
11. Systematically longer 18S rRNA than other eukaryotes with unique expansion segments (also true of Foraminifera and Myxogastria).

All 11 unique characters in Table 2 were probably ancestral for Euglenozoa, most being found in at least three of the four classes (4-10 are unstudied in Postgaardea as they have never been cultured, but as they are almost certainly not the deepest branch (Yubuki et al. 2009) this is irrelevant to deducing the ancestral state). All except perhaps (10) (condensed chromatin) are probably derived characters for Euglenozoa compared with the ancestral eukaryote. The significance of these 11 remarkable differences from other eukaryotes is that they may constitute one of two early divergent evolutionary responses to the problems of being eukaryotic (the general ‘typical textbook’ features of neokaryotes being the other); unlike the absence of Tom40 and ORC and the nine other arguably primitive characters listed in paragraph 2, most cannot be primitive precursors of the typical pattern seen in neokaryotes. However, some Euglenozoa have additional unusual features that might be ancestral; these include euglenoid multiple mitotic spindles and kinetoplastid glycolysis being located in peroxisome-like microbodies, not the cytosol. Euglena mitochondria synthesize fatty acids anaerobically by unique machinery and make wax esters and ferment them anaerobically in the cytoplasm by enzymes lacking homologues in other eukaryotes (but with bacterial relatives) (Hoffmeister et al. 2005). Euglena small nucleolar guide RNAs (which process rRNA) are smaller and more uniform than in other eukaryotes (Russell et al. 2006), more like those of archaebacteria. 
This feature could be primitive, but at least five of these 17 unique characters are not ancestral eukaryotic properties but secondarily derived: three have independently derived parallels in other eukaryotes (notably trans-splicing); the mitochondrial genomes of dinoflagellates are analogously radically changed from the ancestral state best exemplified by the jakobid Reclinomonas.
_______________________________________________________________________
Note 5. The clade names neokaryotes and corticates, the taxon names Eozoa, Loukozoa, Excavata, and Sarcomastigota, and the grade name ‘discicristates’.
I invented the name neokaryote to denote all eukaryotes that branch higher in rRNA trees than Euglenozoa (Cavalier-Smith 1993b). It is used in precisely that sense here, assuming that the tree is rooted between Euglenozoa and neokaryotes, even though when proposed it was mistakenly thought that some groups (Metamonada, Microsporidia, Archamoebae and Percolozoa) branched more deeply. Now that we know that Metamonada, Microsporidia, and Archamoebae are not primitively amitochondrial, and as argued here are not the first branch on the eukaryote tree, my later redefinition of neokaryote (Cavalier-Smith 1998) as all eukaryotes other than Metamonada sensu Cavalier-Smith (2003b) would not define a clade and lacks utility, and thus understandably never became widely used. So sticking to the original phylogenetic definition but with altered circumscription will not be confusing. The name was coined specifically to emphasize the marked differences in genome organization between Euglenozoa and neokaryotes, which the present rooting stresses even more by seeing it as the primary eukaryotic divergence in both genome organisation and cell structure. I invented the subkingdom name Neozoa (Cavalier-Smith 1983) to denote all protozoa except Discicristata (Euglenozoa and Percolozoa) and Metamonada. 
That paraphyletic taxon is not retained here as it would be equivalent to Sarcomastigota plus Rhizaria and Heliozoa, which are here placed in two separate kingdoms. I therefore now use neozoa not for a taxon, but as an informal name for the smallest clade that includes the last common ancestor of those three taxa (Fig. 1); this phylogenetic redefinition expands its composition to include all Animalia, Fungi, Chromista, and Plantae. I later invented the subkingdom name Eozoa (Cavalier-Smith 1987) to denote all Protozoa that branched below neozoa on phylogenetic trees, but the present superclass Eopharyngia of Metamonada was excluded as they were then in a separate kingdom Archezoa. Eozoa is here formally emended by including all Metamonada as now circumscribed (Cavalier-Smith 2003b, i.e. including Eopharyngia) as well as Loukozoa (not yet created in 1987), Percolozoa, and Euglenozoa. I invented the phylum name Loukozoa (Cavalier-Smith 1999) and revised it (Cavalier-Smith 2003b,c) to embrace two excavate classes only (Jakobea and Malawimonadea); it is not yet widely used because of widespread aversion to paraphyletic taxa that is based on flawed arguments (Cavalier-Smith 2009c). The class Diphyllatea (Diphylleia, Collodictyon, Sulcomonas) may also belong here (as its groove weakly suggests) but the position of Diphylleia on 18S rRNA trees is extremely unstable, as it sometimes groups with excavates near Loukozoa/Metamonada and sometimes with Amoebozoa or Apusozoa within unikonts. 
Following the study that first defined jakobids (O’Kelly 1993), which pointed out that their possession of ciliary vanes and ciliary root characters suggested an affinity with retortamonads, Simpson and Patterson (1999) used the informal name ‘excavates’ to denote eukaryotes which, like jakobids and Carpediemonas membranifera (a novel kind of free-living metamonad whose ultrastructure they characterized), had a noticeably ‘scooped out’ or ‘excavated’ ventral feeding groove associated with vaned or flange-bearing cilia and distinctive ciliary roots; initially excavates embraced only a subset of Metamonada with clear grooves and/or vanes (Carpediemonas, Trimastix, retortamonads) and Jakobida (Patterson 1999), an assemblage that was initially thought to be paraphyletic. But the concept was soon extended to include Malawimonas (O’Kelly & Nerad 1999) and Percolozoa, which have similar ciliary roots. To these ‘core excavates’ Simpson and I independently eventually added Euglenozoa, despite their not having the core excavate ultrastructure, on the assumption that the eukaryote root was between bikonts and unikonts (Cavalier-Smith 2002, which established the taxon Excavata to include both core excavates and Euglenozoa), coupled with the fact that they grouped on unrooted sequence trees with excavates; this assumption about the root position, I have argued here, was mistaken, but the unrooted grouping has been strongly confirmed. If the present rooting of the tree is correct, both the original taxon Excavata (Cavalier-Smith 2002) and the present phenotypically substantially more homogeneous one made by excluding Euglenozoa are paraphyletic. Those ready to accept Excavata in either sense as a taxon should also in principle be willing to accept the paraphyletic taxa Loukozoa, Eozoa, Choanozoa, Sarcomastigota, Protozoa, and Bacteria. 
The name Sarcomastigota is not, as sometimes incorrectly assumed, a synonym for the old Sarcomastigophora, but was invented for a new protozoan infrakingdom (Cavalier-Smith 1983) that excluded all Eozoa but at first included what are now called Amoebozoa plus those former flagellate and amoeboid protozoa here transferred to kingdom Chromista. Originally it excluded Choanozoa, but these were subsequently added and Heliozoa, Radiozoa and Alveolata excluded (Cavalier-Smith 1998); later Rhizaria also were excluded as evidence for the unikont-bikont dichotomy grew (Cavalier-Smith 2002, 2003). Apusozoa have sometimes been excluded (Cavalier-Smith 2002) but have usually been placed within Sarcomastigota, as I do here because of increased evidence that Apusozoa are sisters to opisthokonts (Kim et al. 2006; Brown et al. 2009; and our own unpublished multigene data) and because apusomonads have myosin II like other unikonts (Berney and Cavalier-Smith in prep.). Sarcomastigota should be more stable in composition than in the past as it now includes only protozoa with pronounced actomyosin pseudopodial activity dependent on myosin II. The only other eukaryote known to have myosin II is Naegleria (first noted by T. A. Richards pers. comm.; see also Odronitz and Kollmar (2007), who incorrectly assume Naegleria to be related to Amoebozoa); it is unclear whether Naegleria acquired it by lateral gene transfer from a sarcomastigote, in which case myosin II is an additional synapomorphy for unikonts to those mentioned in Fig. 1, or whether instead myosin II originated in the ancestral neokaryote and was lost independently by corticates and metamonads.
Discoba: this name was introduced by Hampl et al. (2009) for a putative clade comprising discicristates and jakobids. 
If the root is within discicristates as argued here, this grouping is not a clade but paraphyletic; it seems of no taxonomic utility and of doubtful descriptive value as (unlike the also paraphyletic discicristates) it does not unite organisms that share phenotypic characters that would make it useful to distinguish them from others by name. Rodríguez-Ezpeleta et al. (2007) discovered insertions in the protein Rpl24A that they interpreted as synapomorphic for Discoba. However, the insertion in Euglenozoa is one amino acid shorter than in jakobids plus Percolozoa, meaning either that the euglenozoan insertion was independent or that there was a single amino acid deletion in the ancestral euglenozoan or else an independent single amino acid insertion in the ancestor of Percolozoa/Jakobea. One of the two latter possibilities seems most likely, as both insertions start with a proline even though there is almost no other sequence similarity between the euglenozoan and jakobid/percolozoan insertions. Thus on any scenario for the position of the root there must have been at least two evolutionarily independent length changes in this part of the molecule. If my present rooting of the tree is correct, three changes are needed; I suggest that either a four or five amino acid insertion occurred in the ancestral eukaryote and that the five amino acids were secondarily deleted in the common ancestor of Neozoa, Metamonada, and Malawimonas. Avoiding postulating this 5-amino acid secondary deletion is not a sufficiently strong reason for placing the root within excavates (e.g. between Malawimonas and jakobids) instead of beside or within Euglenozoa, given the many reasons discussed here.
I have abandoned the superphylum Discicristata Cavalier-Smith 2002, but “discicristates” is still useful to refer to Euglenozoa and Percolozoa jointly because of their shared discoid mitochondrial cristae, which now appear to have been the ancestral state for eukaryotes. 
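The counts of Rpl24A length changes discussed above (at least two on any rooting, three on the rooting favoured here) can be reproduced with a small Fitch-parsimony sketch in Python. It is an illustration only and not part of the original analysis. The two tree shapes are simplified, fully bifurcating stand-ins for the alternative rootings. The prokaryotic outgroup is assumed to lack the insertion, and the insertion is scored as an unordered character with states 0, 4, and 5 amino acids. The function name and tree encoding are hypothetical.

```python
from typing import Dict, Set, Tuple, Union

# A tree is either a leaf name or a pair of subtrees (rooted, binary).
Tree = Union[str, Tuple["Tree", "Tree"]]

# Rpl24A insertion length in amino acids. 4 and 5 follow the text;
# 0 for the outgroup is an assumption made for this sketch.
INSERTION_LENGTH: Dict[str, int] = {
    "outgroup": 0,
    "Euglenozoa": 4,
    "Percolozoa": 5,
    "Jakobea": 5,
    "Malawimonas": 0,
    "Metamonada": 0,
    "Neozoa": 0,
}

# Assumed simplified topology with the root beside Euglenozoa.
ROOT_BESIDE_EUGLENOZOA: Tree = (
    "outgroup",
    ("Euglenozoa",
     ("Percolozoa",
      ("Jakobea",
       ("Malawimonas", ("Metamonada", "Neozoa"))))),
)

# Assumed simplified topology with the root within excavates,
# between Malawimonas and jakobids.
ROOT_WITHIN_EXCAVATES: Tree = (
    "outgroup",
    ((("Metamonada", "Neozoa"), "Malawimonas"),
     ("Jakobea", ("Percolozoa", "Euglenozoa"))),
)


def fitch_changes(tree: Tree, states: Dict[str, int]) -> int:
    """Minimum number of changes of an unordered character (Fitch)."""

    def visit(node: Tree) -> Tuple[Set[int], int]:
        if isinstance(node, str):
            return {states[node]}, 0
        left, right = node
        left_set, left_cost = visit(left)
        right_set, right_cost = visit(right)
        shared = left_set & right_set
        if shared:
            # Children agree on at least one state: no extra change.
            return shared, left_cost + right_cost
        # Children disagree: one change is needed on this node's branches.
        return left_set | right_set, left_cost + right_cost + 1

    return visit(tree)[1]


print(fitch_changes(ROOT_BESIDE_EUGLENOZOA, INSERTION_LENGTH))  # 3
print(fitch_changes(ROOT_WITHIN_EXCAVATES, INSERTION_LENGTH))   # 2
```

Under these assumptions the extra step on the first topology corresponds to the secondary deletion postulated in the text; the sketch says nothing about which rooting is better supported by the other characters discussed.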
__________________________________________________________________________________
References for Supplementary Material Additional to those in the Printed Version

Adl, S. M., Simpson, A. G., Farmer, M. A., Andersen, R. A., Anderson, O. R., Barta, J. R., Bowser, S. S., Brugerolle, G., Fensome, R. A., Fredericq, S., James, T. Y., Karpov, S., Kugrens, P., Krug, J., Lane, C. E., Lewis, L. A., Lodge, J., Lynn, D. H., Mann, D. G., McCourt, R. M., Mendoza, L., Moestrup, O., Mozley-Standridge, S. E., Nerad, T. A., Shearer, C. A., Smirnov, A. V., Spiegel, F. W. & Taylor, M. F. 2005 The new higher level classification of eukaryotes with emphasis on the taxonomy of protists. J. Eukaryot. Microbiol. 52, 399-451.
Agrawal, S., van Dooren, G. G., Beatty, W. L. & Striepen, B. 2009 Genetic evidence that an endosymbiont-derived ERAD system functions in import of apicoplast proteins. J. Biol. Chem. 284, 33683-33691.
Archibald, J. M. 2009 The puzzle of plastid evolution. Curr. Biol. 19, R81-R88.
Archibald, J. M., Rogers, M. B., Toop, M., Ishida, K. & Keeling, P. J. 2003 Lateral gene transfer and the evolution of plastid-targeted proteins in the secondary plastid-containing alga Bigelowiella natans. Proc. Natl Acad. Sci. USA 100, 7678-7683.
Bodyl, A., Stiller, J. W. & Mackiewicz, P. 2008 Chromalveolate plastids: direct descent or multiple endosymbioses? Trends Ecol. Evol. 24, 119-121.
Bolte, K., Bullmann, L., Hempel, F., Bozarth, A., Zauner, S. & Maier, U. G. 2009 Protein targeting into secondary plastids. J. Eukaryot. Microbiol. 56, 9-15.
Borst, P. & Sabatini, R. 2008 Base J: discovery, biosynthesis, and possible functions. Ann. Rev. Microbiol. 62, 235-251.
Brown, J. R., Koretke, K. K., Birkeland, M. L., Sanseau, P. & Patrick, D. R. 2004 Evolutionary relationships of Aurora kinases: implications for model organism studies and the development of anti-cancer drugs. BMC Evol. Biol. 4, 39.
Brown, M. W., Spiegel, F. W. & Silberman, J. D. 2009 Phylogeny of the "forgotten" cellular slime mould, Fonticula alba, reveals a key evolutionary branch within Opisthokonta. Mol. Biol. Evol. 26, 2699-2709.
Burri, L., Williams, B. A., Bursac, D., Lithgow, T. & Keeling, P. J. 2006 Microsporidian mitosomes retain elements of the general mitochondrial targeting system. Proc. Natl Acad. Sci. USA 103, 15916-15920.
Cavalier-Smith, T. 1982 The origins of plastids. Biol. J. Linn. Soc. 17, 289-306.
Cavalier-Smith, T. 1983 A 6-kingdom classification and a unified phylogeny. In Endocytobiology II (ed. W. Schwemmler & H. E. A. Schenk), pp. 1027-1034. Berlin: de Gruyter.
Cavalier-Smith, T. 1986 The kingdom Chromista: origin and systematics. In Progress in Phycological Research. F. E. Round & D. J. Chapman, eds Vol. 4, pp. 309-347. Biopress Ltd., Bristol.
Cavalier-Smith, T. 1993a The origin, losses and gains of chloroplasts. In Origin of Plastids: Symbiogenesis, Prochlorophytes and the Origins of Chloroplasts. R. A. Lewin (ed.). pp. 291-348. Chapman & Hall, New York.
Cavalier-Smith, T. 1993b Evolution of the eukaryotic genome. In The Eukaryotic Genome, eds. P. Broda, S. G. Oliver & P. Sims. pp. 333-385. Cambridge University Press.
Cavalier-Smith, T. 1994 Origin and relationships of Haptophyta. In The Haptophyte Algae, eds J. C. Green and B. S. C. Leadbeater. Clarendon Press, Oxford. pp. 413-435.
Cavalier-Smith, T. 1997 Amoeboflagellates and mitochondrial cristae in eukaryotic evolution: megasystematics of the new protozoan subkingdoms Eozoa and Neozoa. Arch. Protistenk. 147, 237-258.
Cavalier-Smith, T. 1998 A revised six-kingdom system of life. Biol. Rev. 73, 203-266.
Cavalier-Smith, T. 1999 Principles of protein and lipid targeting in secondary symbiogenesis: euglenoid, dinoflagellate, and sporozoan plastid origins and the eukaryote family tree. J. Euk. Microbiol. 46, 347-366.
Cavalier-Smith, T. 2000 Flagellate megaevolution: the basis for eukaryote diversification. In The Flagellates. (eds J. C. Green & B. S. C. Leadbeater), pp. 361-390. London: Taylor and Francis.
Cavalier-Smith, T. 2003a Genomic reduction and evolution of novel genetic membranes and protein-targeting machinery in eukaryote-eukaryote chimaeras (meta-algae). Phil. Trans. Roy. Soc. Lond. B 358, 109-134.
Cavalier-Smith, T. 2003b The excavate protozoan phyla Metamonada Grassé emend. (Anaeromonadea, Parabasalia, Carpediemonas, Eopharyngia) and Loukozoa emend. (Jakobea, Malawimonas): their evolutionary affinities and new higher taxa. Int. J. Syst. Evol. Microbiol. 53, 1741-1758.
Cavalier-Smith, T. 2003c Protist phylogeny and the high-level classification of Protozoa. Eur. J. Protistol. 39, 338-348.
Cavalier-Smith, T. 2004 Chromalveolate diversity and cell megaevolution: interplay of membranes, genomes and cytoskeleton. In Organelles, Genomes and Eukaryote Phylogeny. Systematics Association Special Volume 68, eds R. P. Hirt & D. S. Horner. Taylor & Francis, London. pp. 75-108.
Cavalier-Smith, T. 2006a The tiny enslaved genome of a rhizarian alga. Proc. Natl Acad. Sci. USA 103, 9779-9780.
Cavalier-Smith, T. 2006b Rooting the tree of life by transition analysis. Biol. Direct 1, 19.
Cavalier-Smith, T. 2006c Cell evolution and earth history: stasis and revolution. Phil. Trans. Roy. Soc. Lond. B 361, 969-1006.
Cavalier-Smith, T. 2007 Evolution and relationships of algae: major branches of the tree of life. In Unravelling the Algae (ed. J. Brodie & J. Lewis), pp. 21-55. Boca Raton: CRC Press.
Cavalier-Smith, T. 2009a Megaphylogeny, cell body plans, adaptive zones: causes and timing of eukaryote basal radiations. J. Euk. Microbiol. 56, 26-33.
Cavalier-Smith, T. 2009b Predation and eukaryote cell origins: a coevolutionary perspective. Int. J. Biochem. Cell Biol. 41, 307-322.
Cavalier-Smith, T. 2009c Deep phylogeny, ancestral groups, and the four ages of life. Phil. Trans. Roy. Soc. B. In press.
Cavalier-Smith, T. 2009d Origin of the cell nucleus and sex: roles of intracellular coevolution. Biol. Direct. In press.
Cavalier-Smith, T. & Chao, E. E. 2006 Phylogeny and megasystematics of phagotrophic heterokonts (kingdom Chromista). J. Mol. Evol. 62, 388-420.
Cavalier-Smith, T., Allsopp, M. T. E. P. & Chao, E. E. 1994 Chimeric conundra: are nucleomorphs and chromists monophyletic or polyphyletic? Proc. Natl Acad. Sci. USA 91, 11368-11372.
Dagan, T. & Martin, W. 2009 Seeing green and red in diatom genomes. Science 323, 1651-1652.
Dagley, M. J., Dolezal, P., Likic, V. A., Smid, O., Purcell, A. W., Buchanan, S. K., Tachezy, J. & Lithgow, T. 2009 The protein import channel in the outer mitosomal membrane of Giardia intestinalis. Mol. Biol. Evol. 26, 1941-1947.
Dolezal, P., Likic, V., Tachezy, J. & Lithgow, T. 2006 Evolution of the molecular machines for protein import into mitochondria. Science 313, 314-318.
Elias, M., Patron, N. J. & Keeling, P. J. 2009 The RAB family GTPase Rab1A from Plasmodium falciparum defines a unique paralog shared by chromalveolates and Rhizaria. J. Eukaryot. Microbiol. 56, 348-356.
Fast, N. M., Kissinger, J. C., Roos, D. S. & Keeling, P. J. 2001 Nuclear-encoded, plastid-targeted genes suggest a single common origin for apicomplexan and dinoflagellate plastids. Mol. Biol. Evol. 18, 418-426.
Frantz, C., Ebel, C., Paulus, F. & Imbault, P. 2000 Characterization of trans-splicing in euglenoids. Curr. Genet. 37, 349-355.
Frommolt, R., Werner, S., Paulsen, H., Goss, R., Wilhelm, C., Zauner, S., Maier, U. G., Grossman, A. R., Bhattacharya, D. & Lohr, M. 2008 Ancient recruitment by chromists of green algal genes encoding enzymes for carotenoid biosynthesis. Mol. Biol. Evol. 25, 2653-2667.
Gawryluk, R. M. R. & Gray, M. W. 2009 A split and rearranged nuclear gene encoding the iron-sulfur subunit of mitochondrial succinate dehydrogenase in Euglenozoa. BMC Res. Notes 2, 16.
Gely, C. & Wright, M. 1986 The centriole cycle in the amoebae of the myxomycete Physarum polycephalum. Protoplasma 132, 23-31.
Gile, G. H., Faktorová, D., Castlejohn, C. A., Burger, G., Lang, B. F., Farmer, M. A., Lukes, J. & Keeling, P. J. 2009 Distribution and phylogeny of EFL and EF-1α in Euglenozoa suggest ancestral co-occurrence followed by differential loss. PLoS One.
Gluenz, E., Sharma, R., Carrington, M. & Gull, K. 2008 Functional characterization of cohesin subunit SCC1 in Trypanosoma brucei and dissection of mutant phenotypes in two life cycle stages. Mol. Microbiol. 69, 666-680.
Griessmann, K. 1913 Über marine Flagellaten. Archiv Protistenk. 32, 1-78.
Gray, M. W. 2003 Diversity and evolution of mitochondrial RNA editing systems. IUBMB Life 55, 227-233.
Hampl, V., Hug, L., Leigh, J. W., Dacks, J. B., Lang, B. F., Simpson, A. G. & Roger, A. J. 2009 Phylogenomic analyses support the monophyly of Excavata and resolve relationships among eukaryotic "supergroups". Proc. Natl Acad. Sci. USA 106, 3859-3864.
Hempel, F., Bullmann, L., Lau, J., Zauner, S. & Maier, U. G. 2009 ERAD-derived preprotein transport across the second outermost plastid membrane of diatoms. Mol. Biol. Evol. 26, 1781-1790.
Hoffmeister, M., Piotrowski, M., Nowitzki, U. & Martin, W. 2005 Mitochondrial trans-2-enoyl-CoA reductase of wax ester fermentation from Euglena gracilis defines a new family of enzymes involved in lipid synthesis. J. Biol. Chem. 280, 4329-4338.
Ishida, K., Cao, Y., Hasegawa, M., Okada, N. & Hara, Y. 1997 The origin of chlorarachniophyte plastids, as inferred from phylogenetic comparisons of amino acid sequences of EF-Tu. J. Mol. Evol. 45, 682-687.
Kim, E., Simpson, A. G. & Graham, L. E. 2006 Evolutionary relationships of apusomonads inferred from taxon-rich analyses of six nuclear-encoded genes. Mol. Biol. Evol. 23, 2455-2466.
Leblond, J. D., Dahmen, J. L., Seipelt, R. L., Elrod-Erickson, M. J. & Kincaid, R. 2005 Lipid composition of chlorarachniophytes (Chlorarachniophyceae) from the genera Bigelowiella, Gymnochlora, and Lotharella. J. Phycol. 41, 311-321.
Liu, Y., Richards, T. A. & Aves, S. J. 2009 Ancient diversification of eukaryotic MCM DNA replication proteins. BMC Evol. Biol. 9, 60.
Marande, W., Lukes, J. & Burger, G. 2005 Unique mitochondrial genome structure in diplonemids, the sister group of kinetoplastids. Eukaryot. Cell 4, 1137-1146.
Mignot, J.-P. 1961 Contribution à l’étude cytologique de Scytomonas pusilla (Stein) (Flagellé euglénien). Bull. Biol. Fr. Belg. 95, 665-678.
Minge, M., Silberman, J. D., Orr, R., Cavalier-Smith, T., Shalchian-Tabrizi, K., Burki, F., Skjaeveland, Å. & Jakobsen, K. S. 2009 Evolutionary position of breviate amoebae illuminates the primary eukaryote divergence. Proc. Roy. Soc. B 276, 597-604.
Moestrup, Ø. 2000 The flagellate cytoskeleton: introduction of a general terminology for microtubular roots in protists. In The flagellates: unity, diversity and evolution (ed. B. S. Leadbeater & J. C. Green), pp. 69-94. London: Taylor & Francis.
Moreira, D., von der Heyden, S., López-García, P., Bass, D., Chao, E. & Cavalier-Smith, T. 2007 Global eukaryote phylogeny: combined small- and large-subunit ribosomal DNA trees support monophyly of Rhizaria, Retaria and Excavata. Mol. Phylogen. Evol. 44, 255-266.
Moore, R. B., Obornik, M., Janouskovec, J., Chrudimsky, T., Vancova, M., Green, D. H., Wright, S. W., Davies, N. W., Bolch, C. J., Heimann, K., Slapeta, J., Hoegh-Guldberg, O., Logsdon, J. M. & Carter, D. A. 2008 A photosynthetic alveolate closely related to apicomplexan parasites. Nature 451, 959-963.
Odronitz, F. & Kollmar, M. 2007 Drawing the tree of eukaryotic life based on the analysis of 2,269 manually annotated myosins from 328 species. Genome Biol. 8, R196.
O'Kelly, C. 1993 The jakobid flagellates: structural features of Jakoba, Reclinomonas and Histiona and implications for the early diversification of eukaryotes. J. Euk. Microbiol. 40, 627-636.
O'Kelly, C. & Nerad, T. A. 1999 Malawimonas jakobiformis n. gen., n. sp. (Malawimonadidae fam. nov.): a jakoba-like heterotrophic nanoflagellate with discoidal mitochondrial cristae. J. Euk. Microbiol. 46, 522-531.
Patron, N. J. & Waller, R. F. 2007 Transit peptide diversity and divergence: A global analysis of plastid targeting signals. BioEssays 29, 1048-1058.
Patron, N. J., Rogers, M. B. & Keeling, P. J. 2004 Gene replacement of fructose-1,6-bisphosphate aldolase supports the hypothesis of a single photosynthetic ancestor of chromalveolates. Eukaryot. Cell 3, 1169-1175.
Patron, N. J., Waller, R. F. & Keeling, P. J. 2006 A tertiary plastid uses genes from two endosymbionts. J. Mol. Biol. 357, 1373-1382.
Patterson, D. J. 1999 The diversity of eukaryotes. Am. Nat. 154, S96-S124.
Pusnik, M., Charriere, F., Maser, P., Waller, R. F., Dagley, M. J., Lithgow, T. & Schneider, A. 2009 The single mitochondrial porin of Trypanosoma brucei is the main metabolite transporter in the outer mitochondrial membrane. Mol. Biol. Evol. 26, 671-680.
Richards, T. A. & Cavalier-Smith, T. 2005 Myosin domain evolution and the primary divergence of eukaryotes. Nature 436, 1113-1118.
Robinson, N. P. & Bell, S. D. 2007 Extrachromosomal element capture and the evolution of multiple replication origins in archaeal chromosomes. Proc. Natl Acad. Sci. USA 104, 5806-5811.
Rodríguez-Ezpeleta, N., Brinkmann, H., Burger, G., Roger, A. J., Gray, M. W., Philippe, H. & Lang, B. F. 2007 Toward resolving the eukaryotic tree: the phylogenetic positions of jakobids and cercozoans. Curr. Biol. 17, 1420-1425.
Rogers, M. B., Archibald, J. M., Field, M. A., Li, C., Striepen, B. & Keeling, P. J. 2004 Plastid-targeting peptides from the chlorarachniophyte Bigelowiella natans. J. Eukaryot. Microbiol. 51, 529-535.
Russell, A. G., Schnare, M. N. & Gray, M. W. 2006 A large collection of compact box C/D snoRNAs and their isoforms in Euglena gracilis: structural functional and evolutionary insights. J. Mol. Biol. 367, 1548-1565.
Sanchez-Puerta, M. V. & Delwiche, C. F. 2008 A hypothesis for plastid evolution in chromalveolates. J. Phycol. 44, 1097-1107.
Silver, T. D., Koike, S., Yabuki, A., Kofuji, R., Archibald, J. M. & Ishida, K. 2007 Phylogeny and nucleomorph karyotype diversity of chlorarachniophyte algae. J. Eukaryot. Microbiol. 54, 403-410.
Simpson, A. G. B. 1997 The identity and composition of the Euglenozoa. Arch. Protistenk. 148, 318-328.
Simpson, A. G. B. & Patterson, D. J. 1999 The ultrastructure of Carpediemonas membranifera (Eukaryota) with reference to the "excavate hypothesis". Eur. J. Protistol. 35, 353-370.
Spork, S., Hiss, J. A., Mandel, K., Sommer, M., Kooij, T. W., Chu, T., Schneider, G., Maier, U. G. & Przyborski, J. M. 2009 An unusual ERAD-like complex is targeted to the apicoplast of Plasmodium falciparum. Eukaryot. Cell 8, 1134-1145.
Triemer, R. E. & Farmer, M. A. 1991 The ultrastructural organization of heterotrophic euglenids and its evolutionary implications. In The biology of free-living heterotrophic flagellates (eds D. J. Patterson & J. Larsen), pp. 185-204. Oxford: Clarendon Press.
Turmel, M., Gagnon, M. C., O'Kelly, C. J., Otis, C. & Lemieux, C. 2009 The chloroplast genomes of the green algae Pyramimonas, Monomastix, and Pycnococcus shed new light on the evolutionary history of prasinophytes and the origin of the secondary chloroplasts of euglenids. Mol. Biol. Evol. 26, 631-648.
Valas, R. E. & Bourne, P. E. 2009 Structural analysis of polarizing indels: an emerging consensus on the root of the tree of life. Biol. Direct 4, 30.
Waller, R. F., Jabbour, C., Chan, N. C., Celik, N., Likic, V. A., Mulhern, T. D. & Lithgow, T. 2009 Evidence of a reduced and modified mitochondrial protein import apparatus in microsporidian mitosomes. Eukaryot. Cell 8, 19-26.
Wright, M., Moisand, A. & Mir, L. 1980 Centriole maturation in the amoebae of Physarum polycephalum. Protoplasma 105, 149-160.
Yubuki, N., Edgcomb, V. P., Bernhard, J. M. & Leander, B. S. 2009 Ultrastructure and molecular phylogeny of Calkinsia aureus: cellular identity of a novel clade of deep-sea euglenozoans with epibiotic bacteria. BMC Microbiol. 9, 16.