PLASTID-TARGETED PROTEINS ARE ABSENT FROM THE PROTEOMES OF
ACHLYA HYPOGYNA AND THRAUSTOTHECA CLAVATA (OOMYCOTA,
STRAMENOPILA): IMPLICATIONS FOR THE ORIGIN OF CHROMALVEOLATE
PLASTIDS AND THE ‘GREEN GENE’ HYPOTHESIS
Lindsay Rukenbrod
A Thesis Submitted to the
University of North Carolina Wilmington in Partial Fulfillment
of the Requirements for the Degree of
Master of Science
Center for Marine Science
University of North Carolina Wilmington
2012
Approved by
Advisory Committee
D. Wilson Freshwater
Jeremy Morgan
Allison Taylor
J. Craig Bailey
Chair
Accepted by
Robert Roer (digitally signed 2012.11.27)
Dean, Graduate School
This thesis has been prepared in the style and format consistent with the Journal of
Eukaryotic Microbiology.
TABLE OF CONTENTS
ABSTRACT
ACKNOWLEDGMENTS
DEDICATION
LIST OF TABLES
LIST OF FIGURES
CHAPTER 1: Implications for the origin of chromalveolate plastids
    INTRODUCTION
    METHODS
    RESULTS AND DISCUSSION
        Revised Hypotheses for the Evolution of Chromalveolate Plastids
CHAPTER 2: Do chromalveolate genomes encode ‘green genes’?
    INTRODUCTION
    METHODS
    RESULTS AND DISCUSSION
        Green Genes in Oomycetes and Other Chromalveolates?
SUPPLEMENTAL INFORMATION
LITERATURE CITED
ABSTRACT
Chapter 1
The chromalveolate hypothesis predicts that extant nonphotosynthetic stramenopiles
are secondarily nonphotosynthetic and derived from ancestors bearing a secondary red-type plastid. To test this hypothesis, proteomes of the oomycetes Achlya hypogyna and
Thraustotheca clavata were canvassed for plastid-targeted genes. Proteins for each
species encoding putative plastid-targeting signal peptides were identified, annotated,
and assigned to protein families if possible. Forty-six candidate proteins were culled
from the two genomes. Bioinformatic analyses revealed that the proteomes of Achlya
and Thraustotheca do not encode plastid-targeted genes acquired by endosymbiotic
gene transfer. All proteins possessing non-mitochondrial-targeting signal peptides
identified were judged to belong to the secretome (i.e., extracellularly secreted proteins).
These results indicate that oomycetes are ancestrally aplastidic stramenopiles and do
not support the chromalveolate theory of plastid evolution. Revised hypotheses for the
origin of plastids characterized by chlorophylls a and c and fucoxanthin are presented. It
is concluded that alveolate and stramenopile plastids are likely tertiary or higher order
plastids, not secondary plastids.
Chapter 2
The hypothesis that a green algal symbiosis preceded the red algal symbiont that gave
rise to red-type plastids in the ancestors of the chromalveolates is reexamined. A
network approach was used to detect nuclear encoded proteins from the genomes of
Achlya hypogyna, Thraustotheca clavata, other oomycetes, and other chromalveolates
that cluster with green algal genes. Twelve oomycete proteins clustering with green
algal genes at high stringency were annotated and selected for further analyses.
Representative homologs from all other eukaryotic taxa available were aligned to
sequences comprising each network and maximum likelihood trees were constructed
from these alignments. Protein trees derived from these data exhibited obvious errors
resulting from taxon biases and heterotachy. These results argue that ‘green genes’
detected in phylogenomics studies are artifactual and not indicative of endosymbiotic
gene transfer.
ACKNOWLEDGMENTS
My thanks go to my advisor, Dr. J. Craig Bailey, whose enthusiasm about
molecular protistology caught my interest in the very beginning of my scientific
education. His continuous encouragement, wit, and sense of humor made this journey
an enjoyable one. Ian Misner and Dr. Chris Lane of the University of Rhode Island have
also been instrumental in my education, providing feedback and technical support in my
research. I’d also like to thank my committee members, Dr. D. Wilson Freshwater, Dr.
Jeremy Morgan, and Dr. Allison Taylor, for their encouragement and flexibility
throughout this process.
My lab mates past and present, particularly Cory Dashiell, Erika Shwarz, Ashley
Hayes, and Allison Martin, helped me maintain my focus over the years throughout
failed DNA extractions, computer malfunctions, approaching deadlines, and many other
graduate school related challenges.
The Department of Biology and Marine Biology, the Center for Marine Science
and the National Science Foundation provided financial support for my education and
research.
Finally, I’d like to thank my parents and my husband for supporting me every step
of the way.
DEDICATION
I’d like to dedicate this to my mother, whose endless patience has allowed me to
explore life with few restrictions and overwhelming love and support.
LIST OF TABLES
Chapter 1
1. Protein IDs for 46 hypothetical proteins detected in the genomes of Thraustotheca and/or Achlya
2. Protein ID numbers, annotations and protein family designations
3. Proteins sorted into one of 14 unique protein families
4. List of seven proteins from the Achlya and Thraustotheca genomes and putative homologs found in the Arabidopsis thaliana plastid proteome
Chapter 2
1. List of 12 annotated proteins from the Achlya and/or Thraustotheca proteomes or other oomycetes found in EGNs
LIST OF FIGURES
Chapter 1
1. Hypotheses for the origin of complex, higher order chlorophyll a+c-containing plastids in chromalveolates
Chapter 2
1. Three examples of putative green genes in oomycete genomes based on EGN analysis
2. DEXDc ML tree
3. RPB ML tree
4. ALDH ML tree
5. TOR-containing kinase ML tree
6. YAK1 ML tree
7. ALS ML tree
CHAPTER 1: Implications for the origin of chromalveolate plastids.
INTRODUCTION
The evolutionary origin and subsequent movement of secondary and higher order
plastids among photosynthetic eukaryotes is the subject of intense debate. The principal
key to unraveling the evolutionary history of plastids is an accurate understanding of the
relationships among both host and plastid lineages (Archibald 2009; Green 2011). This
goal is hampered by the mosaic nature of eukaryotic genomes composed of lineage-specific genes inherited vertically, thousands of genes acquired by endosymbiotic gene
transfer (EGT), and genes obtained via lateral gene transfer (LGT) (Archibald 2008;
Green 2011; Keeling 2009; Larkum 2007).
The chromalveolate hypothesis posits that the alveolates, cryptomonads,
haptophytes and stramenopiles are monophyletic and that the last common ancestor of
these lineages was a photosynthetic alga bearing a red-type plastid (Cavalier-Smith
1999; 2003). This notion is supported, in the first instance, by the fact that
photosynthetic members of these chlorophyll a+c-containing groups all possess red-type plastids surrounded by three or four unit membranes [the so-called chloroplast endoplasmic reticulum, or CER], a feature indicative of secondary endosymbiosis
(Dodge 1975; Foth and McFadden 2003; Guillot and Gibbs 1980a, b; Gibbs 1981a, b;
Köhler et al. 1997). Second, nuclear-encoded plastid-targeted proteins in these algae
are characterized by the presence of a 5’ bipartite signal sequence that directs gene
products to the plastid and across the outer- and inner-pair of plastid membranes (Kroth
2002; Soll and Schleiff 2004). In terms of coding capacity, gene content, and
organization the plastid genomes of chromalveolates resemble those of red algae far
more closely than they resemble the plastid genomes of green algae (Delwiche 1999;
Keeling 2004; Yoon et al. 2002). Cavalier-Smith (1999) originally emphasized that the
chromalveolate hypothesis is consistent with the idea that the chloroplast endoplasmic reticulum (CER) and complex protein-trafficking systems that characterize
chromalveolates are unlikely to have evolved independently on different occasions (see
Kroth 2002; Ralph et al. 2004).
Over the last decade, tests of the ‘chromalveolate’ concept have been the subject,
implicitly or explicitly, of numerous broad-scale phylogenetic studies. The
chromalveolates have not been recovered as a monophyletic group in any study
(Archibald 2009; Baurain et al. 2010). More recent studies imply that the relationships
among chromalveolate host cells and their plastids are more complex than originally
supposed, perhaps involving tertiary and higher-order transfers among hosts (Archibald
2009; Bodyl 2005; Keeling 2004; Sanchez-Puerta and Delwiche 2008). In this paper
the chromalveolate hypothesis is re-examined in light of new genomic data available for
nonphotosynthetic members of the Stramenopila.
The stramenopiles, one of the four principal taxa included in the Chromalveolata,
are divided into two groups: (i) the ‘photosynthetic stramenopiles’, ‘heterokont algae’ or
‘ochrophytes’, comprising chlorophyll a+c-containing photosynthetic algae
including phaeophytes, chrysophytes, diatoms, eustigmatophytes, pelagophytes,
and xanthophytes (Lee et al. 2000); and (ii) nonphotosynthetic organisms that are
bacterivorous, parasitic or saprobic heterotrophs, including bicosoecids,
hyphochytrids, labyrinthulids, oomycetes, and thraustochytrids, among others (Lee et al.
2000). The oomycetes are the most diverse, well studied, and economically important
of all nonphotosynthetic stramenopiles.
The chromalveolate hypothesis implies that extant aplastidic stramenopiles are
derived from ancestors that once possessed a secondary red-type plastid. However,
there is no ultrastructural or DNA evidence suggesting that bicosoecids, hyphochytrids,
labyrinthulids, oomycetes, or thraustochytrids possess, or possessed in the past, a
plastid. Furthermore, ultrastructural or DNA sequence evidence for cryptic plastids in
these organisms is absent or controversial (Lee et al. 2000; Reyes-Prieto et al. 2008;
Slamovits and Keeling 2008; Stiller et al. 2009).
In this study the proteomes of the oomycetes Achlya hypogyna and
Thraustotheca clavata were canvassed in search of photosynthesis-related genes.
METHODS
Full-length predicted proteins were obtained from ongoing genome sequencing projects
for Achlya hypogyna (ATCC 48635) and Thraustotheca clavata (ATCC 34112), estimated
to encode 17,430 and 12,154 predicted proteins, respectively; additional details will be
published separately. The Achlya and Thraustotheca proteomes were searched for
possible plastid-targeted genes using the chloroplast transit peptide prediction program ChloroP
(v.1.1) (Emanuelsson et al. 1999). Hypothetical proteins returned from these searches
were subsequently analyzed using SignalP (v.4.0) (Petersen et al. 2011), annotated and
assigned to protein families if possible using the Conserved Domain Database (CDD)
(Marchler-Bauer et al. 2007). Mitochondria-targeted proteins and proteins possessing
transmembrane regions identified using TMHMM (v.2.0) were removed from the data set
(Krogh et al. 2001). Searches for heterokont-like bipartite plastid-targeting peptides,
consisting of both signal and transit peptide motifs, were conducted using HECTAR
(Gruber et al. 2007; Gschloessl et al. 2008; Waller et al. 2000). Finally, the oomycete
proteins were searched with BLAST against the Arabidopsis thaliana plastid proteome database
(which includes plastid- and nuclear-encoded plastid-targeted proteins) using plprot
v.2.3 (Baginsky et al. 2005; Kleffmann et al. 2004; 2006).
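The filtering logic described above can be sketched as follows. This is a minimal illustration only, assuming that the predictions from each program have already been parsed into one record per protein. The `Prediction` record, its field names, and the example identifiers are hypothetical and do not correspond to the native output formats of ChloroP, SignalP, TMHMM, or HECTAR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    """Hypothetical per-protein summary of parsed predictor results."""
    protein_id: str
    chlorop_transit_peptide: bool  # a putative transit peptide was reported
    mitochondrial: bool            # predicted to be mitochondria-targeted
    tm_helices: int                # number of predicted transmembrane helices


def candidate_plastid_proteins(predictions):
    """Keep transit-peptide-positive proteins that are neither
    mitochondria-targeted nor predicted to span a membrane."""
    return [
        p.protein_id
        for p in predictions
        if p.chlorop_transit_peptide
        and not p.mitochondrial
        and p.tm_helices == 0
    ]


# Illustrative records only; identifiers and values are invented.
example = [
    Prediction("PROT_A", True, False, 0),
    Prediction("PROT_B", True, True, 0),    # removed: mitochondrial
    Prediction("PROT_C", True, False, 3),   # removed: transmembrane regions
    Prediction("PROT_D", False, False, 0),  # removed: no transit peptide
]
print(candidate_plastid_proteins(example))
```

Keeping the filter as a single pure function makes it straightforward to apply the same criteria to both proteomes.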
RESULTS AND DISCUSSION
The chromalveolate hypothesis implies that the ancestors of oomycetes were
photosynthetic organisms bearing red-type plastids and putative plastid-related genes
have been reported from the genomes of the plant pathogens Phytophthora ramorum
and P. sojae (Tyler et al. 2006). The competing hypothesis is the long-held view that
oomycetes are ancestrally aplastidic. It is possible that the ancestors of oomycetes
were photosynthetic but that extant members of the group have not retained any plastid-associated genes. On the other hand, empirical data including studies of
apicomplexans, dinoflagellates and other taxa imply plastid-associated genes are
unlikely to be completely purged from the genome even in organisms where a vestigial,
nonphotosynthetic plastid is absent (Barbrook et al. 2006; de Koning and Keeling 2004;
Matsuzuki et al. 2008; Wilson 2004; Sanchez-Puerta et al. 2007).
Thirty hypothetical proteins from the Achlya genome and 16 from the
Thraustotheca genome putatively possessing a 5’ plastid-targeting signal peptide were
identified (Table 1). Of these 46 proteins, 22 are presently characterized as hypothetical
proteins of unknown function; 24 of the proteins were annotated (E-value < 1.00E-25) and found
to represent 14 unique protein families (Tables 2, 3).
BLASTp queries revealed that none of the oomycete proteins (Table 2) are
encoded by the 271 eukaryotic plastid genomes sequenced to date. None of the 46
presequences examined here possess the ASAFP (Y/W/L) motif necessary for plastid
import in diatoms, although the significance of this observation is unclear (Gruber et al.
2007) (see supplementary Tables S1 and S2).
Putative homologs to seven of the oomycete proteins were detected in the A.
thaliana plastid proteome (Table 4). These seven oomycete proteins are more-or-less
distant relatives of three A. thaliana genes. Both Achlya and Thraustotheca encode
proteins similar to the zinc-finger type WRKY1 DNA-binding transcription factor that
plays a role in disease resistance in A. thaliana (Dong et al. 2003; Shindo et al. 2012).
Three Achlya proteins and one Thraustotheca protein putatively encoding cysteine proteinase
RD21A have homologs in the A. thaliana plastid proteome. Finally, a single
Achlya protein distantly related (6E-17) to A. thaliana aldehyde dehydrogenase (ALDH)
was also detected.
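The comparison of hits by E-value can be illustrated with a short sketch. It assumes, purely for illustration, that the search results are available as tab-separated lines in the standard 12-column BLAST tabular layout (query ID in column 1, subject ID in column 2, E-value in column 11); the text does not state how the plprot hits were tabulated. The function name, the default cutoff, and the example lines are hypothetical.

```python
def best_hits(tabular_lines, max_evalue=1e-10):
    """Return {query_id: (subject_id, evalue)}, keeping for each query the
    hit with the lowest E-value among hits at or below max_evalue.

    max_evalue is an arbitrary illustrative cutoff, not a value from the text.
    """
    best = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if evalue > max_evalue:
            continue
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best


# Invented hits: two for query q1, and one weak hit for q2.
example_hits = [
    "q1\tsA\t40.0\t100\t60\t0\t1\t100\t1\t100\t6e-17\t80.0",
    "q1\tsB\t45.0\t100\t55\t0\t1\t100\t1\t100\t2e-20\t95.0",
    "q2\tsC\t25.0\t50\t37\t1\t1\t50\t1\t50\t0.5\t20.0",
]
print(best_hits(example_hits))
```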
These three genes are not indicators for photosynthesis per se because
homologs have been detected from across the tree of life in photosynthetic (e.g., plants
and green algae) and nonphotosynthetic organisms (e.g., eubacteria, animals, fungi,
and the amoebozoan Dictyostelium). Homologs of each of these putative genes that are
more closely related to the Achlya and Thraustotheca proteins have been previously detected
in the genomes of Phytophthora infestans, P. sojae (Pythiales) and the white rust
Albugo laibachii (Tyler et al. 2006).
The annotated proteins recovered in this study include nine known to belong to
oomycete secretomes, and six of these are common hydrolytic enzymes such as proteases, chitinase and
cellulase (Tables 2, 3; Birch et al. 2006; Gaulin et al. 2008; Kamoun 2006; Levesque et
al. 2010). One of the proteins belongs to the elicitin family, a family of virulence genes
unique to oomycetes (Jiang et al. 2006). Based upon these data, plastid-associated
genes are not present in the Achlya or Thraustotheca predicted proteomes.
Revised Hypotheses for the Evolution of Chromalveolate Plastids
These data, as well as the study by Stiller et al. (2009), indicate that oomycetes are
ancestrally aplastidic despite reports to the contrary (Tyler et al. 2006). This information
and the results of recent phylogenomics investigations have been synthesized and
revised hypotheses for the evolution of chromalveolate plastids are presented in Figures
1 and 2. These diagrams reflect a number of assumptions that are enumerated for the
sake of clarity. (i) The Chromalveolata sensu stricto is paraphyletic (e.g., Iida et al.
2007; Khan et al. 2007; reviewed in Green 2011; Rogers et al. 2007). (ii) Oomycetes, all
other heterotrophic stramenopiles, as well as the ciliates are ancestrally aplastidic
(Archibald 2008; Reyes-Prieto et al. 2008; Tyler et al. 2006). (iii) The SAR clade is
recognized as natural (Burki et al. 2007; Hackett et al. 2007; Lane & Archibald 2008).
(iv) Recent studies imply that SAR and Hacrobia host cells are likely distantly related
(Baurain et al. 2010; Hackett et al. 2007; Parfrey et al. 2010). For these reasons, no
specifically defined relationship between SAR and Hacrobia host cells is implied in
Figure 1. The diagrams comprising Figure 1 are drawn under the assumption that the
Hacrobia is monophyletic (Burki et al. 2007; Hackett et al. 2007; Harper et al. 2005;
Patron et al. 2007).
These hypotheses share elements in common with prior models of
chromalveolate plastid evolution in which multiple plastid acquisitions (or plastid
replacements) are inferred via serial endosymbiotic transfer (Archibald 2008; Bodyl
2005; Bodyl et al. 2009; Bodyl and Moszczynski 2006; Sanchez-Puerta & Delwiche
2008). Two predictions derived from these models bear emphasizing: (1) alveolates
and stramenopiles likely possess tertiary or quaternary plastids and (2) it is
conceivable that one of these taxa, the alveolates or stramenopiles, may have obtained
its plastid from the other (Fig. 1). Finally, it is noted that the number of membranes
surrounding higher-order, complex plastids seems to be fixed at four or fewer.
Fig. 1 Hypotheses for the origin of complex, higher order chlorophyll a+c-containing plastids in
chromalveolates. (A) Independent acquisition of a tertiary (3°) plastid in the alveolate and
stramenopile lineages from the Hacrobia lineage. (B) Serial endosymbiotic transfer resulting in
a quaternary (4°) alveolate plastid from the 3° stramenopile plastid. (C) Serial endosymbiotic
transfer resulting in a 4° stramenopile plastid from the 3° alveolate plastid.
Table 1. Protein IDs for 46 hypothetical proteins detected in the genomes of Thraustotheca
and/or Achlya characterized by the presence of a putative plastid-targeting 5’ signal peptide
sequence. ChloroP was used to detect classical plastid transit peptides. HECTAR was used to
search for bipartite plastid targeting leader sequences characteristic of stramenopiles and other
chromalveolates (Kilian and Kroth 2003; McFadden and van Dooren 2004; Vesteg et al. 2009).

Protein ID      HECTAR
Thraustotheca clavata
THRCLA_02069    Chloroplast
THRCLA_03737    Signal peptide
THRCLA_03876    Signal peptide
THRCLA_04285    Signal peptide
THRCLA_04386    Signal peptide
THRCLA_04952    Signal peptide
THRCLA_05863    Signal peptide
THRCLA_06099    Signal peptide
THRCLA_07047    Signal peptide
THRCLA_08011    Signal peptide
THRCLA_10855    Signal peptide
THRCLA_10997    Signal peptide
THRCLA_11248    No N-terminal target peptide found
THRCLA_11271    Chloroplast
THRCLA_11391    Signal peptide
THRCLA_11516    Signal peptide
SignalP / ChloroP (THRCLA_02069 to THRCLA_11248): Y Y Y Y Y N Y Y Y Y N N Y Y Y Y Y Y Y Y Y Y
SignalP / ChloroP (THRCLA_11271 to THRCLA_11516): Y Y Y Y Y Y

Achlya hypogyna
ACHHYP_00269    Signal peptide
ACHHYP_01095    Signal peptide
ACHHYP_01226    Signal peptide
ACHHYP_01546    Signal peptide
ACHHYP_02169    Chloroplast
ACHHYP_02305    Signal peptide
ACHHYP_03044    Signal peptide
ACHHYP_03052    Signal peptide
ACHHYP_04549    Chloroplast
ACHHYP_04706    Signal peptide
ACHHYP_04908    Signal peptide
ACHHYP_05005    Signal peptide
SignalP / ChloroP (ACHHYP_00269 to ACHHYP_05005): Y Y Y Y Y Y Y N Y Y Y Y Y Y Y Y Y Y Y Y Y

Table 1 cont.
Protein ID      HECTAR
ACHHYP_05180    Signal peptide
ACHHYP_05326    Signal peptide
ACHHYP_05770    Signal peptide
ACHHYP_06287    Signal peptide
ACHHYP_06505    Signal peptide
ACHHYP_06977    Signal peptide
ACHHYP_07400    Chloroplast
ACHHYP_08323    Signal peptide
ACHHYP_09221    Chloroplast
ACHHYP_09519    Chloroplast
ACHHYP_10824    Signal peptide
ACHHYP_11025    Chloroplast
ACHHYP_11286    Signal peptide
ACHHYP_11397    Signal peptide
ACHHYP_12628    Chloroplast
ACHHYP_13722    Chloroplast
ACHHYP_14385    Signal peptide
ACHHYP_15409    Chloroplast
SignalP / ChloroP (ACHHYP_05180 to ACHHYP_15409): Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y
Table 2. Protein ID numbers, annotations (E-value < 1.00E-25), and protein family designations for 46
proteins from the Thraustotheca and Achlya genomes putatively possessing 5’ plastid-targeting
signal peptides.

Gene/Protein ID   Annotation                                                pfam
Thraustotheca clavata
THRCLA_02069      putative GPI-anchored serine-rich hypothetical protein    _
THRCLA_03737      cd05384: SCP_PRY1_like [COG2340]                          pfam00188
THRCLA_03876      hypothetical protein, with EGF-like motif                 _
THRCLA_04285      Kazal-type serine proteinase inhibitor                    pfam07648
THRCLA_04386      hypothetical protein                                      _
THRCLA_04952      hypothetical protein                                      _
THRCLA_05863      hypothetical protein                                      _
THRCLA_06099      putative GPI-anchored serine-rich hypothetical protein    _
THRCLA_07047      hypothetical protein                                      _
THRCLA_08011      cysteine protease family C01A, putative                   pfam00112
THRCLA_10855      hypothetical protein                                      _
THRCLA_10997      chitinase D-like                                          pfam00704
THRCLA_11248      hypothetical protein, unknown function                    _
THRCLA_11271      hypothetical protein, elicitin superfamily                pfam00964
THRCLA_11391      beta-N-acetylglucosaminidase                              pfam00728
THRCLA_11516      hypothetical protein, unknown function                    _
Achlya hypogyna
ACHHYP_00269      putative GPI-anchored serine-rich hypothetical protein    _
ACHHYP_01095      beta-N-acetylglucosaminidase                              pfam00728
ACHHYP_01226      hypothetical protein                                      pfam12937
ACHHYP_01546      hypothetical protein                                      _
ACHHYP_02169      trypsin-like serine protease                              pfam13365
ACHHYP_02305      putative GPI-anchored serine-rich hypothetical protein    _
ACHHYP_03044      putative chitinase-like carbohydrate-binding protein      pfam00704
ACHHYP_03052      hypothetical protein                                      _
ACHHYP_04549      hypothetical protein                                      _
ACHHYP_04706      hypothetical protein encoding ricin_B_lectin              pfam00652
ACHHYP_04908      hypothetical protein                                      _
ACHHYP_05005      putative D-lactate dehydrogenase                          pfam01565

Table 2 cont.
Gene/Protein ID   Annotation                                                pfam
ACHHYP_05180      hypothetical protein                                      _
ACHHYP_05326      Cellulase                                                 pfam00150
ACHHYP_05770      hypothetical protein                                      _
ACHHYP_06287      hypothetical protein                                      _
ACHHYP_06505      papain family cysteine protease                           pfam00112
ACHHYP_06977      hypothetical protein                                      _
ACHHYP_07400      hypothetical protein                                      _
ACHHYP_08323      hypothetical protein containing PAN domain                pfam00024
ACHHYP_09221      hypothetical protein                                      _
ACHHYP_09519      hypothetical protein encoding ricin_B_lectin              pfam00652
ACHHYP_10824      ankyrin repeat protein                                    pfam12796
ACHHYP_11025      hypothetical protein                                      _
ACHHYP_11286      aldehyde dehydrogenase                                    pfam0017
ACHHYP_11397      hypothetical protein                                      _
ACHHYP_12628      papain-like cysteine protease C1                          pfam00112
ACHHYP_13722      hypothetical protein                                      _
ACHHYP_14385      hypothetical protein                                      _
ACHHYP_15409      papain-like cysteine protease C1                          pfam00112
Table 3. Proteins investigated in this study were sorted into one of 14 unique protein families,
which are listed below. Note that all proteins investigated (see Table 2) are predicted to have a
5’ signal peptide and that nine of the 14 families include secreted proteins. Six of the families
include proteases, and the elicitin family of virulence proteins is secreted extracellularly
and is unique to oomycetes.

pfam ID   Protein ID      Protein family / Conserved domains
00188     THRCLA_03737    Cysteine-rich secretory protein family
07648     THRCLA_04285    Kazal_2: Kazal-type serine protease inhibitor domain
00112     THRCLA_08011    Peptidase_C1: Papain family cysteine protease
          ACHHYP_06505
          ACHHYP_12628
          ACHHYP_15409
00964     THRCLA_11271    Elicitin
00728     THRCLA_11391    Glyco_hydro_20: Glycosyl hydrolase family 20, catalytic domain
          ACHHYP_01095
12937     ACHHYP_01226    F-box-like
13365     ACHHYP_02169    Trypsin_2: Trypsin-like peptidase domain
00704     THRCLA_10997    Glyco_hydro_18: Glycosyl hydrolases family 18
          ACHHYP_03044
00652     ACHHYP_04706    Ricin_B_lectin: Ricin-type beta-trefoil lectin domain
          ACHHYP_09519
01565     ACHHYP_05005    FAD_binding_4: FAD binding domain
00150     ACHHYP_05326    Cellulase: Cellulase (glycosyl hydrolase family 5)
00024     ACHHYP_08323    PAN_1: PAN domain
12796     ACHHYP_10824    Ank_2: Ankyrin repeats
0017      ACHHYP_11286    aldehyde dehydrogenase superfamily (ALDH-SF)
Table 4. List of seven proteins from the Achlya hypogyna and Thraustotheca clavata oomycete
genomes and putative homologs found in the Arabidopsis thaliana plastid proteome. Reference
refers to functional studies of the genes identified in this analysis.

Oomycete protein ID   A. thaliana plastid proteome ID   Gene annotation                                                   E-value    Reference
ACH_05770             plp_at_01492                      disease resistance protein related to DNA-binding protein WRKY1   2.00E-20
THR_04952             plp_at_01492                      disease resistance protein related to DNA-binding protein WRKY1   7.00E-23
THR_08011             plp_at_00089                      cysteine proteinase RD21A (=thiol protease RD21A)                 4.00E-53   Shindo et al. 2012
ACH_15409             plp_at_00089                      cysteine proteinase RD21A (=thiol protease RD21A)                 1.00E-24   Shindo et al. 2012
ACH_12628             plp_at_00089                      cysteine proteinase RD21A (=thiol protease RD21A)                 1.00E-18   Shindo et al. 2012
ACH_06505             plp_at_00089                      cysteine proteinase RD21A (=thiol protease RD21A)                 9.00E-47   Shindo et al. 2012
ACH_11286             plp_at_00466                      aldehyde dehydrogenase (ALDH)                                     6.00E-17
CHAPTER 2: Do chromalveolate genomes encode ‘green genes’?
INTRODUCTION
One of the most vexing problems in eukaryote systematics concerns the
interrelationships among the so-called ‘chromalveolates’ (Archibald 2008; Cavalier-Smith
1999; Green 2011; Keeling 2004). The Chromalveolata is a paraphyletic taxon
whose members can be divided into two groups. The first group (the SAR clade)
includes the alveolates (apicomplexans, dinoflagellates, and ciliates), which are sister to
the stramenopiles (including phaeophytes, chrysophytes, and oomycetes). In turn, these two
clades are sister to the Rhizaria, a group principally comprised of free-living amoebae
(Burki et al. 2007; Hackett et al. 2007; Lane and Archibald 2008; Rogers et al. 2007).
The second group, the Hacrobia, includes cryptomonads and haptophytes and lesser-known
relatives such as the telonemids, centrohelids, and picobiliphytes (Burki et al.
2007; Elias and Archibald 2009; Hackett et al. 2007; Okamoto et al. 2009; Rice and
Palmer 2006; Patron et al. 2007). The exact relationship between host cells and plastids
belonging to members of the SAR and Hacrobia clades is unclear (Baurain et al. 2010;
Harper et al. 2005). Despite these uncertainties, it is clear that all photosynthetic
chromalveolates possess secondary or higher-order plastids that are bounded by three or
four membranes and ultimately derived from a red alga (Hackett et al. 2004; Janouskovec et al.
2010; Khan et al. 2007; Yoon et al. 2002; 2004; Sanchez-Puerta et al. 2007). How
these plastids were acquired is a contentious issue but most recent models reflect a
growing consensus that multiple independent origins and/or serial endosymbiotic events
best explain the available data (Bodyl 2005; Bodyl and Moszczynski 2006; Sanchez-Puerta
and Delwiche 2008).
The understanding of the evolutionary history of chromalveolates has recently
been further complicated by the unexpected discovery of so-called ‘green genes’ in
chromalveolate genomes. Whole genome sequencing and EST studies have revealed
that the genomes of chromalveolate species encode hundreds or thousands of genes apparently
derived from within the green algal lineage (Moustafa et al. 2009; Tyler et al. 2006;
Woehle et al. 2011). For example, the genomes of the diatoms Phaeodactylum and
Thalassiosira reportedly contain thousands of genes whose phylogenetic affinities lie
within green algae (Armbrust et al. 2004; Bowler et al. 2008; Chan et al. 2011; Moustafa
et al. 2009). Putative ‘green genes’ (albeit fewer in number) have also been detected in
the genomes of other chromalveolates examined (Cock et al. 2010). The presence of
‘green genes’ has led some authorities to speculate that the last common ancestor of
the chromalveolates once harbored a green algal symbiont that was later replaced by a
red algal symbiont that gave rise to the chlorophyll a+c-containing red-type plastids
that characterize most extant chromalveolates (Armbrust 2009; Dorrell & Smith 2011;
Frommolt et al. 2008; Moustafa et al. 2009). In short, the green genes found in
chromalveolate genomes are hypothesized to have been obtained via endosymbiotic
gene transfer (EGT) (Huang et al. 2004; Reyes-Prieto et al. 2008; Slamovits and
Keeling 2008; Tyler et al. 2006).
Other studies suggest, implicitly or explicitly, that the green phylogenetic signal in
chromalveolate (particularly diatom) genomes may be more apparent than real. Biases
associated with the heuristic phylogenomics pipelines needed to construct trees at the
genome level and the uneven distribution of protein sequences for eukaryotic taxa have
been previously described (Stiller et al. 2009; Woehle et al. 2011). In this study, the
proteomes of two chromalveolates, the nonphotosynthetic stramenopiles Achlya and
Thraustotheca, were canvassed for proteins of putative green algal origin. These proteins
were annotated and combined with homologs from other oomycete genomes or expressed
sequence tag (EST) databases and with homologs representing all other available
eukaryotic taxa. The phylogenetic trees obtained were used to (1) determine whether
nonphotosynthetic, aplastidic oomycetes encode green algal genes similar to those found
in diatoms and other chromalveolates (note that if oomycetes are ancestrally
nonphotosynthetic then their genomes should not encode ‘green genes’) and (2) critically
reassess the veracity of green genes found in chromalveolates in toto.
METHODS
The genomes of Achlya hypogyna (ATCC 48635) and Thraustotheca clavata
(ATCC 34112) were sequenced and assembled, yielding 17,430 and 12,154 predicted
proteins, respectively. Green genes possibly obtained by HGT or EGT events were
identified using evolutionary gene network (EGN) analyses as described in Bittner et
al. (2010). In brief, all sequences were compared against one another with BLASTp. Two
sequences were connected in the EGN connected components graph when their BLASTp
E-value fell below a user-defined threshold and their sequence identity equaled or
exceeded a user-defined minimum. For example, an EGN built with the user-defined
parameters ‘1E-20 at 80% similarity’ connects sequences whose BLASTp E-values are below
1E-20 and whose sequence identities are equal to or greater than 80%.
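The edge rule described above can be sketched in a few lines of Python. This is a minimal illustration only, not the pipeline of Bittner et al. (2010): the tuple layout of the hits, the function name `build_egn_components`, and the toy sequence labels and scores are all assumptions introduced here.

```python
from collections import defaultdict

def build_egn_components(hits, max_evalue=1e-20, min_identity=80.0):
    """Group sequences into connected components of a gene network.

    `hits` is an iterable of (query, subject, evalue, percent_identity)
    tuples, for example parsed from all-against-all BLASTp output
    (hypothetical layout). An edge is kept when the E-value is below
    `max_evalue` and the identity is at least `min_identity`.
    """
    adjacency = defaultdict(set)
    for query, subject, evalue, identity in hits:
        if query == subject:
            continue  # ignore self-hits
        if evalue < max_evalue and identity >= min_identity:
            adjacency[query].add(subject)
            adjacency[subject].add(query)

    components, seen = [], set()
    for start in adjacency:
        if start in seen:
            continue
        # Iterative depth-first search collects one connected component.
        stack, members = [start], set()
        while stack:
            node = stack.pop()
            if node in members:
                continue
            members.add(node)
            stack.extend(adjacency[node] - members)
        seen |= members
        components.append(members)
    return components

# Toy hits (invented labels and scores, for illustration only).
toy_hits = [
    ("oomycete_A", "green_alga_B", 1e-50, 85.0),    # passes both limits
    ("green_alga_B", "green_alga_C", 1e-30, 90.0),  # passes both limits
    ("oomycete_A", "fungus_D", 1e-25, 60.0),        # identity too low
    ("fungus_D", "fungus_E", 1e-10, 95.0),          # E-value too high
]
components = build_egn_components(toy_hits)
```

With these toy hits, only the first two edges satisfy both limits, so the oomycete sequence and the two green algal sequences fall into a single component. Sequences without any qualifying edge do not appear in a component.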
In this study, batches of networks were constructed separately with minimum
protein identity thresholds of 35, 45 and 65% and an E-value threshold of 1E-20.
Networks including oomycete proteins and one or more protein sequences derived from
representatives of (1) the green algal lineage (GAL) or (2) Fungi were selected for
further investigation. Annotations for candidate HGT/EGT proteins in the Achlya and
Thraustotheca genomes were then refined using NCBI’s conserved domain (CDD) and
KOG databases (Marchler-Bauer et al. 2007; Tatusov et al. 2003) and used to
drive BLASTp searches aimed at recovering more distantly related eukaryotic homologs
from GenBank. Homologous sequences from representatives of all available eukaryotic
lineages were selected, aligned using “Geneious Alignment” with default settings in
Geneious v5.5 (Drummond et al. 2011), and manually edited as necessary. Thus, each
protein alignment included all sequences in the EGN of interest as well as a number of
more distant homologs from other eukaryotes. Maximum likelihood trees for each
protein alignment were constructed using PhyML (Guindon et al. 2010) with the WAG
substitution model (Whelan & Goldman 2001), chosen to account for heterotachy, and 500
bootstrap replicates. Bayesian posterior probabilities were calculated using the
MrBayes plugin for Geneious, run with default settings and the WAG substitution
model.
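The network selection step described above (keeping networks that contain oomycete proteins together with GAL or fungal proteins, at each identity threshold) can be sketched as follows. The function name `select_candidate_networks`, the lineage labels, and the toy components are hypothetical and serve only to illustrate the filter. They are not taken from the study's data.

```python
def select_candidate_networks(components, lineage_of, targets=("GAL", "Fungi")):
    """Keep components holding at least one oomycete protein and at least
    one protein from a target lineage (GAL or Fungi by default)."""
    wanted = set(targets)
    selected = []
    for members in components:
        # Sequences with no recorded lineage are labelled "other".
        lineages = {lineage_of.get(seq, "other") for seq in members}
        if "Oomycota" in lineages and lineages & wanted:
            selected.append(members)
    return selected

# Toy components for the three identity thresholds (invented labels).
components_by_identity = {
    35: [{"ach_1", "chl_1", "fun_1"}, {"ach_2", "hum_1"}],
    45: [{"ach_1", "chl_1"}, {"ach_2", "hum_1"}],
    65: [{"ach_1", "chl_1"}],
}
lineage_of = {
    "ach_1": "Oomycota", "ach_2": "Oomycota",
    "chl_1": "GAL", "fun_1": "Fungi", "hum_1": "Metazoa",
}
candidates = {
    threshold: select_candidate_networks(comps, lineage_of)
    for threshold, comps in components_by_identity.items()
}
```

In this toy case the oomycete/animal network is discarded at every threshold, while the oomycete/green network is retained. It loses its fungal member once the identity threshold rises from 35% to 45%.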
RESULTS AND DISCUSSION
Because they are ancestrally aplastidic, oomycetes are a perfect foil for
examining the hypothesis that chromalveolate genomes harbor varying numbers of
green genes acquired via EGT from an ancient green algal endosymbiont (Dorrell &
Smith 2011; Moustafa et al. 2009). Genes of cyanobacterial and/or red algal origin were
originally reported for the genomes of Phytophthora ramorum and P. sojae, but it has
since been demonstrated that these genes are very unlikely to reflect cyanobacterial or
red algal contributions to these genomes (Tyler et al. 2006; Stiller et al. 2009; Woehle et
al. 2011).
In this study, we examined 12 protein-encoding genes from the Achlya, Thraustotheca or
other oomycete genomes that, based on EGN analyses, are closely
related to genes found in green algae (Table 1). Three exemplary EGNs are
depicted in Figure 1. These networks indicate that Phytophthora spp. share one or more
copies of the pyruvate phosphate dikinase (PPDK) gene with the green algae
Chlamydomonas and Volvox (Fig. 1a). The PPDK gene is, however, absent from the
genomes of Achlya and Thraustotheca. This observation, coupled with the current
understanding of oomycete systematics, implies that PPDK was likely acquired in the
Phytophthora lineage following the pythialean/saprolegnialean divergence (Beakes &
Sekimoto 2009; Sekimoto et al. 2009). In any event, the PPDK network clearly
demonstrates a putative green algal gene in Phytophthora spp. that is unknown in other
oomycetes. If the Phytophthora PPDK genes were acquired via EGT, then this
acquisition is most parsimoniously interpreted as a recent event, not one that can be
associated with the presence of an ancient green algal symbiont. All oomycetes
examined encode single copies of eukaryotic translation initiation factor 5B and an
aldehyde dehydrogenase whose most similar homologs are putatively found in the
bryophyte Physcomitrella patens (Fig. 1b and 1c, respectively).
Maximum likelihood (ML) trees for six of the 12 oomycete proteins of putative
green algal origin examined in this study are depicted in Figures 2–7. These six were
selected for demonstration because they are the most taxon-replete and best
supported; trees for the remaining six proteins are equally problematic, or worse (see
below).
A tree composed of DEXDc homologs is presented in Fig. 2. The EGN for
DEXDc implies a green origin for this gene in oomycetes, specifically uniting oomycete
homologs with the sequence for Chlamydomonas reinhardtii (not shown). Note,
however, that in the tree the C. reinhardtii DEXDc terminates a very long branch and
that when other eukaryotic homologs are added the oomycete/green relationship
becomes less clear. In fact, this tree implies that oomycetes share a common ancestor
with the opisthokonts (fungi and animals), a result clearly at odds with the current
understanding of eukaryotic systematics. In summary, at least two phylogenetic errors
are apparent in the DEXDc tree: long branch attraction and a topological error that can
likely be traced to problems associated with taxon sampling, i.e., clear homologs to the
algal, plant, oomycete, and fungal DEXDc genes have yet to be identified in other
eukaryotes. The same taxon sampling issue, specifically the differential distribution
of homologs among eukaryotic lineages, also plagues the RPB tree (Fig. 3). Bearing in
mind that protein sequences for animals, fungi and plants far outnumber those available
for other organisms, the RPB subunit II tree implies that the alveolates are sister to a
clade including stramenopiles (brown algae, diatoms, and oomycetes), animals, and
green algae + land plants (Fig. 3). This topological error is likely compounded by the
fact that the alveolate sequences terminate long branches whereas the
embryophytes terminate shorter branches; heterotachy is a well-known source of
phylogenetic error (Kolaczkowski and Thornton 2008; Pagel and Meade 2008; Philippe
et al. 2008; Shalchian-Tabrizi et al. 2006). The ALDH tree implies that the stramenopiles
are not monophyletic; green algal sequences are nested within a clade including
sequences for alveolates and stramenopiles (Fig. 4).
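The long terminal branches noted above can be screened for with a simple heuristic. The sketch below is an illustration under stated assumptions and was not part of the analyses reported here: it reads terminal branch lengths from a Newick string and lists leaves whose branch exceeds a multiple of the median. The function names, the threshold factor of 3, and the toy tree are all hypothetical. A flagged branch marks a sequence at risk of long branch attraction; it does not demonstrate that attraction occurred.

```python
import re
from statistics import median

# A leaf is a label that directly follows "(" or "," and carries a branch length.
LEAF_PATTERN = re.compile(r"(?<=[(,])\s*([^():,;\s]+):([0-9.eE+-]+)")

def terminal_branch_lengths(newick):
    """Return {leaf_name: terminal branch length} from a Newick string."""
    return {name: float(length) for name, length in LEAF_PATTERN.findall(newick)}

def flag_long_branches(newick, factor=3.0):
    """List leaves whose terminal branch exceeds `factor` times the median."""
    lengths = terminal_branch_lengths(newick)
    cutoff = factor * median(lengths.values())
    return sorted(name for name, length in lengths.items() if length > cutoff)

# Toy tree with invented branch lengths (not taken from Figs. 2-7).
toy_tree = (
    "((oomycete_1:0.12,oomycete_2:0.10):0.05,"
    "(fungus_1:0.15,animal_1:0.14):0.04,green_alga_1:1.30);"
)
long_leaves = flag_long_branches(toy_tree)
```

In the toy tree the median terminal branch length is 0.14, so the cutoff is 0.42 and only the green algal leaf is flagged.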
These same types of phylogenetic errors are evident in Figures 5–7
and are not described again in detail. What these trees clearly demonstrate, however,
is the pervasive influence that the vast number of sequences available for fungi (80+
complete genomes) may have on phylogenomics studies (cf. Stiller et al. 2009). The
TOR-containing kinase tree suggests that green algae may not be monophyletic and that
green algae and stramenopiles are, again, sister to animals and fungi (opisthokonts)
(Fig. 5). The unorthodox relationships among green algae, oomycetes, and fungi are
also recovered in the YAK1 tree (Fig. 6). The ALS tree is equally vexing and seems to
suggest that the chromalveolates (in toto?) may have obtained their copy of this gene
via horizontal gene transfer from fungi (Fig. 7).
Green Genes in Oomycetes and Other Chromalveolates?
On the basis of the data collected, the notion that chromalveolate genomes encode
hundreds or thousands of genes derived from green algae is false.
Critical analyses of protein-encoding sequences of putative green algal origin from
oomycetes and other chromalveolates yielded trees seriously compromised by a
number of obvious and well-known sources of phylogenetic error. These included, at a
minimum, biased taxon sampling, long branch attraction, and heterotachy. This
argument is bolstered by the curious fact that so-called ‘green genes’ can be detected in
oomycetes even though these organisms are ancestrally aplastidic. These results, and
those of Stiller et al. (2009), suggest that these biases are currently so prevalent that
broad-scale evolutionary scenarios drawn from phylogenomics studies need to be
interpreted with a higher level of skepticism.
Table 1. List of 12 annotated proteins from the Achlya and/or Thraustotheca proteomes or other
oomycetes found in EGN connected components graphs clustering with homologs from green
algae.
Protein                               Annotation
TOR-phosphatidylinositol kinase       phosphatidylinositol kinase, putative target of rapamycin (TOR)
Yak1                                  PKc-like superfamily, Yak1-like protein kinase
acetolactate synthase (ALS or AHAS)   TPP_AHAS [cd02015], Thiamine pyrophosphate (TPP) family,
                                      Acetohydroxyacid synthase (AHAS) subfamily
DEXDc                                 DEXDc superfamily, pre-mRNA-splicing factor ATP-dependent RNA
                                      helicase PRP16, putative
RPB                                   RNA polymerase beta subunit, cd00653: RNA_pol_B_RPB2
RRM                                   RRM superfamily, PREDICTED: cleavage stimulation factor subunit 2-like
RRM2                                  RRM superfamily, PREDICTED: similar to RNA binding motif protein
Sm_D1                                 Sm-like superfamily, small nuclear ribonucleoprotein D1
Sm_E                                  Sm-like superfamily, small nuclear ribonucleoprotein E
thioredoxin peroxidase                thioredoxin-like superfamily, cd03015: PRX_Typ2cys
threonine protease                    threonine protease family T01A, putative, cd01911: proteasome_alpha
ALDH                                  ALDH-SF superfamily, cd07084: ALDH_KGSADH-like
Fig. 1. Three examples of putative green genes in oomycete genomes based on EGN analysis
conducted at 65% protein identity. (A) All species of Phytophthora in this analysis share a copy
of pyruvate phosphate dikinase (PPDK: P. infestans gene ID 03724) with Chlamydomonas reinhardtii
and Volvox carteri, two microscopic green algae. Note that PPDK is not encoded in the Achlya
or Thraustotheca genomes. (B) The moss Physcomitrella patens shares eukaryotic
translation initiation factor 5B (P. infestans gene ID 20386) and (C) an aldehyde dehydrogenase
(P. infestans gene ID 00034) with all oomycetes included in this analysis.
Fig. 2. DEXDc ML tree: Oomycetes are shown sister to animals, sharing a common ancestor
with fungi. The phylogenetic errors demonstrated include long branch attraction and topological
error due to sampling bias.
Fig. 3. RPB ML tree: Alveolates are shown as sister to a clade including stramenopiles, animals,
and GAL. Long branches in the alveolate clade and short branches in the GAL, stramenopile
and animal clade are indicative of topological error due to heterotachy.
Fig. 4. ALDH ML tree: Stramenopiles and GAL are shown as not monophyletic. Long branch
attraction between GAL, stramenopiles, and alveolates is likely responsible for the phylogenetic
error.
Fig. 5. TOR-containing kinase ML tree: Stramenopiles are sister to GAL, shown sharing a
common ancestor with animals. Heterotachy and topological error due to sampling bias are
demonstrated.
Fig. 6. YAK1 ML tree: GAL and stramenopiles are shown sharing a common ancestor with fungi.
Long branch attraction between GAL and stramenopiles, heterotachy, and topological error due
to sampling bias are demonstrated.
Fig. 7. ALS ML tree: A two-clade tree is shown, which makes inferences about the relationship
between the two clades impossible. Phylogenetic error is likely due to the abundance of
available fungal genome data (sampling bias).
SUPPLEMENTAL INFORMATION
Table S1. Selected hypothetical proteins (n=16) from the Thraustotheca clavata
genome possessing putative N-terminal transit peptides. Chloroplast transit peptides predicted
using ChloroP (v.1.1) are shown in bold face. Transit peptide sequences predicted
using SignalP (v.4.0) are underlined.
>THRCLA_02069
MVRISALLGTFALIHAQTTTAPPASASNSWTMTTVNSIQARVVSDAATWDATNKKFG
LVMKQNTVTFPDQYRAAMDTVNTASVEGALFYVQTEGINKQFDVNCMRKTNMSYIWF
LNVTIVQPTFAIAEYADNGGVVPEYGKFIAMDNGQCTPLDTKGTMSDECMTLGGLNYH
ANIGPFIGGEPRKEHLLAKYPDNIWFSYPNSCFTKTFIAKDTKCREAQKGGLCPLGVQP
DGIKCTYSFDILGYIRIDELVGITNLTNSQTGQKYKDRVEFCKDSKVEFDFSTMKSDLTF
WDNPTDEAANTNRTTKMLELYNNLIKTGTGDAAYMKSLPTAAELTAKNPPCWKNSPIC
ATAEFGCRRKLTAQICEKCTSASPDCKKPTSSDSVPPKLTKAVAPPLPTDASGKTTVP
RNPTGAGGNGNAAAAESSASSLVAFTSLIITLAALFA
>THRCLA_03737
MKSTFVLLAAISLVNASSSTKLRGAAPCPNSNSGSSDNSSDYSGSESNWDSGSGSD
WDDCGSGSTSTSDSGSNDYPSNWDSNSGSDTTEEPATYAPAPTSAPTSAPTETPAT
SKGTLKEQIIHQTNLIRAAHGLGPVKWNDELAAKMQAWANSDPQQNGGGHGGPPGN
QNLASFDVCNDNCMRMTGPAWAWYSGEEKLWDYDANKSRDGIWETTGHFSNSMDP
GVNEIACGYSTFYNPQIGHDDSLVWCNYLGGNNGVIPRPRIDQATLEKQLTSAY
>THRCLA_03876
MNLKAWILSVAIASAAAASGSSSGSGSTTDAPLTQENLSSRPGLCNTSKDCAKYTKG
SNVYSCIAVKSNIVNLTTLKQCVLGDGCSGGKAGSCPTFTSWPQKFRQVQPVCAFVA
VPNCNSAVNSQGQVVSVRSLREQAAKPGNVTCFQAKFGSNSSSSDDSATVYGIYQCV
DKKLYAEKNLGYLDNTPKQLQSCAGNVTVVNGQSVSNVLCNGHGTCVPQTDFSDIYK
CLCSTGYSDKDNCGAATGNVCSAFGQCGNGNCNPDTGKCVCPYGSTGDQCSKCDP
AQNNNASVTNMCNGNGKCGIDGTCQCSDGYLGTNCETQIKKNSTASSATGSTTSSKK
SAASGLHEASIAIFSIATIFAAALI
>THRCLA_04285
MQIKSIIATLTLAALAQADNNNCEKSCTKELSPLCASNNETYNNLCLFQIAQCQQPTLTI
SANQSCSTNVKFCTRLCPTVYQPVCGSDNTTYPTECDLKNKACNNPSLTVTKQGACD
NCPKACLEILAPVCGSDGKTYDNTCFLLKTACANPSLNLTFVSTGSCTNGNNTTTTAPP
SGTTLPPSGTTLPPTTSGNPSTTTTPPTTKPASSATTAMLSLMSAAAIAITYML
>THRCLA_04386
MKWQVALLSLVTSGIAQDHCGSTTVPTIVPTPAPTLAPTPAPTPAPTPAPTPAPTPAPT
PAPTPAPTPAPTPAPTPAPTPAPTPAPTPAPTPAPTPAPTPAPTPAPLAVATATWTNLW
SDIVQVATDNTQICIRETNGDVDCKPWSTDSSLPTVYGGHSSNFLATGGGWSISTVNN
VNYLVVISPLYNANVMVLDEAILYAATDGATCCITTSTFRCASQKLDMTFVKMTDKYITS
SSIYNAVIYGVDAQGKLYKGSTASISTGVANWQEVSTPCPFTQVSYDGTTLCGLYAST
NTIVCTSGTLSLQPNWVALQSNKWKQFSITQSYIYAVDTSNNVQRLQISQPIAVAP
>THRCLA_04952
MTLASSPTFSRPLLLPPLTSALSPSIAQQMKRQHECEGGGSVKRHCSTFPYMEMPRL
PSITQPSSHIGYLSESYYPSPTSLPMLPPASTLLQQATRKSMDLVPSNAYAPTLPEPCT
LYKSNENTKPSPSNEEVRGECLDAQCHNSVKHRGYCKLHGGARRCDVPGCPKGVQG
GNLCIGHGGGKRCRFPGCSKATQSQGLCKAHGGGVRCKYDGCNKSSQGGGFCRRH
GGGKRCSVAGCPRGAQRGTTCAQHGGKAQCMIDGCVRADRGGGYCEVHRKDKVC
RQGYCNRLARIKCEGYCTQHHREFCITSPPQ
>THRCLA_05863
MGSVLVLLFSPLHAWLTSSSSSSLPCLQTSFLLSQNASLDQIATAQPRINDAIVSFLAQ
SNVQSRIQWTNVDITTSTIDGNGPAIVQMCLYVPPNTSVNQVAMAMSATVNWSGLKTS
ISTLHRRFQLFDLTTPLQVLNQVTQYNFQRIPFPYVQWYLVVQGRFDYFWPQIKIKHAIA
VLLNISSSSVIPQDIIFPPYDAYNDIATILPFAITQVNSSTFARTANTLTGPLQDILALHGILL
LTQFPDPNGNGKLQQSVPWPEYPQLDPTSFYPFHNWTPVPNSFVVKLIYGGLLTLTN
MSSVILQVLDVLDSPQTANFTDFQTLTLTYPPNNGTATFESSRYNTLDFIVAGDRSTLE
ANQQTLGESLYQIGVSIFDVIDINSTMQTAQWYPYMQLDCPYNLSALASIIQRIALAAFF
SIPLSSIQLIEIATNSTTFEIACNDTLEQRYLKKQLKETTRWSTVMNNFTANSAFCTIGGE
SLAYPPMFPGSTYGWSQPSSSMDNTCSVNTIELTACDQCDRYLNAVCFTNPNCYQTQ
TTLLSQLLVSSNASSVFQQLSLSTSANTKTLNTLALYYSCIAAFQCLIAPNTSIITSDEVYT
IDINANGANFSTTLYYPQDDIYLVLNDQTTLEEIQINLSSSISNSIFVNVSGTSSSFNVTM
DSVVIPFQLPVIAYSTVPATIQRISASIPQLVFLSNSSNDTTVLLNGKCTTCLTQMDECK
MSPSCPSIAICWSNVVESAISQLDSVYSTLEISTQLISCYENASLEDFEMFLRVQKCLLQ
SSCPISPTLESIVKGTMIVLRSTTGFQTIELTPTPAVTLTIGTESIILSSNSISGLQATMINFL
SPLCQASIQSNTANLTIQFNDFGAPILPTINGTIYSQMPRIFLDRMPLDSSRFGFSYQSY
KQLSPSSLPNAFTTTLNSNCQMCQNLFDQCLLSSFCASIISNFQNTIAGATNAFIGWSV
ALQRLSFDIPEWDQFAQTLSCFEIHNCPINSTISMLKNGRMLLLSSTPVVLSVTFSSSPF
EAAIYVQRFRQPINVSSNSSAAYIQGQFQMNFGSLALTNVSITNTSMELSLNSYYGPTP
EFMVTSSEFSNKTIILGTSMVSVVSYSPAAYFPY
>THRCLA_06099
MKFALVSSLAVLASAQTNNSSAGSNSNVNCPLQFTSACANTQECGTLNGYPLECQV
YGSVKQCVCSKENANCQNSTNIANTIPQFGVCTGGKQCAGSGFKALQTPVRTCSEQL
VCIPQYASGNELQSICHTCSSCKQQNKPDATGRLIFNCTQICPLGQGDPIVTIPPVTTAP
TNSTKKNDSSKGSGSTAGSKPKSAATSIVAGVATVAIVAIASLF
>THRCLA_07047
MILINLLFGLRLCTDGVSLLQQQVPRKPSKRTKQSRCKHVPFVASTALKPTHETLAPL
MPLVVYQEVTENDMAHLISLVDNQDNQEDNEEITENVVADVFVPLVDNQVSQEANEN
VVVDFADNQDNQEANENVAEFEPLVNYQDNQENVAEFVSLVDNQGSQEASENVVVE
LVPSVVCRDSFEPTEEDVAAVLHGRFAANQAALLRVSSFQPADDRSLTAIQLIRYFELY
HLVRMDYNQLRHLEPSRLEKIQLVRLSILERQAIEAMLSDVAELWSRQPNDVSSAKKLQ
WFKNLQYGLMWDMLELLEHQKPDHHCARGLCPQLYQEKLDIIYSE
>THRCLA_08011
MKTIFLTTALLASTSCALQMTNKERNEILDELNKWKQSAVGKAALVHNFLPSSQRQEG
LSIDAKQDLEITRFAHTKKVVEQLNKEHKGSAVFSTNNMFALMSDEEYKKWVKGAFGR
DHKKRQLRGENIQLELTAEQREASGIDWTSNKCMPAVKNQGQCGSCWTFASVGAAE
MAHCLVTGNLLDLAEQQLVDCASDAGQGCQGGWPTKALQYITQTGMCTSRDYPYTA
SDGQCNNSCKKTKLSIGEPVDIQGESALQSALNKQPISVVVEAGNDVWRNYQSGIVQQ
CPGAQSDHAVIAVGYGSDGGDYFKIRNSWGAEWGEQGYIRLRRGVGGKGMCNVAE
GPSYPSMSGKPNPDGPTDEPSNDPTDEPSNDPTDEPSNDPTDEPSNDPTDDPSDDP
TDDPWNGSNDWDWGN
>THRCLA_10855
MIVPVVLMALTGISITGVTLRWCCSHRQKTSWKEERKEPLLATPVPLPRIKTDVFIERSI
AMDVPLMETCSGCGAWIDPSLAAIANGLCVVCSYQTIPSLEIDIDENISENDETSNDKDS
DKPILTDEDTTADIESPKDVEIQIEESNFEDEMEDISTTSQDMVIPISQDNNCNEGDEDV
EIAEEALALVQDMWDIAYQAHLGVGDDPTADIFVEMALDLDATAEAIKKEPHLLSESFH
FLSLSLASLMELVPEAWVAHVEATELKFEALQFRYHSKLTVENCLDLATHLYELVECAQ
EFGVDPAVASSLMDGLEELVEAIEETPCELVSWLAYLAATVKLLKSYQRDFEQAEMWD
TVVECERNLEPLEMHCWEIYSPC
>THRCLA_10997
MKASLCIATLAAMGSIASSRNIRHHAESVMGNPVQRRSESTRLPTHPLTGYWHDFPN
PAGDTYPLTQITKDWDVIVVAFANSLGSGKVGFDVDPKAGSETQFIKDISTLKAAGKTIV
LSLGGQNGAVTLNDATETANFVSSVYDLIKKFGFDGIDLDLENGISKDLPIINNLITAVKQ
LKQKVGDSFYLSMAPTYGGIWGAYLPIIDGLRNELTQIHVQYYNNGGFVYTDGRTLNE
GTVDCLVGGSVMLIEGFQTNYGNGWKFNGLRPDQVSFGVPSGTSAAGRGFVTPEVV
KRALTCLVQGVGCDTVKPPKTYPTYRGAMTWSINWDSHDGYVFSRPARQALDSLGG
SPPQPNPTAVNPTDAPNPLTNPPTSRPTNTPTVTPTQSPRPTSQPTSLPTSSPSSVPTI
NPTPIPTSVAPQPTQAPSSSC
>THRCLA_11248
MANTIQWLFIYCVIVASQGPPNNGERTCSVTLGGPVSQTSTAGTMSFCTAFPQERCC
LPVHDEYVKSTFYALLDSGYICASATNTAIAHLQTMFCLACDPSMSLYLTPPRNTTFFS
APQTLKVCRALAISFKQHIDAVSPYYFSDCGLTYAGDRNNLCIPKTAISPNMVFPGCSE
GQNICYSTTQGYYSPIWYCSSSPCGPDTPFGLNDIPCSGPTCTPAFQFLNDNRAAKPP
FFEPFAVEIIDESTCAPGESSCCMTDSSIVPTS
>THRCLA_11271
MKTTAFVLALASTAAASSPCTGSAVITAVTPLIAQATTCSTDSGFDLVALISGTTPTDA
QKQKFLTAESCKTLYASVQKSLAGITPACTIGDIDTSGWSTVSMDKGLDALIKSLPSLLA
SSGATNSTSNSTANSTISSTTVSPSSTTAAPAKSGVAATGVTIAAVALTTAILHLNANKQ
QEIHEHLRLTIKESDVETLGEVMSMSLIPAAEAHQFI
>THRCLA_11391
MKLSILLAAFGVVASSSIPKHTYKCNDGVCVQTPLNGAGVSLGSPLLSLRMCEMTCG
AGSLWPYPASVSLGTTATAIDTNKVSHSIKINGAEATSTLTNSIVQTFNEGVKAKTKWV
RGQSEIGAISHSIYGTISSNNEVLGQDTDESYELSIDGPRVKINAATIYGYRHALTTLNQL
IDYDELTNSVKMISKATISDKPAYSHRGIVLDTSRNFYPIESLKRMIDTMGANKLNTFHW
HMTDSSSFPIEINGEPRLTTYGAYSAEQIYTQDQIRDLVQFAKARGVRIIPELDAPAHAG
AGWQWGPKAGYGDLTLCYGADPWMNYCLEPPCGQLNPLNKQVYSVLDTVYKELTSL
FDGDVFHMGGDEVSIPCWNSSKVITDHLKDTNKPGAFFDLWGDFQTKAAAMLNKKVM
VWSSDLTTDPYLKYFEPNNTIIQLWGGSTDGDATRITSQGYDVVASYWDAYYLDCGFG
GWVSKGNGWCAPYKSWQVIYDLDITANMTAANAKHVLGSEVAMWSEIADAHVVETKV
WPRAAALAERLWTNPKTDWKSAMGRMRIQRDRIADAGIGADAVHPLWCRQNPGKCQ
LV
>THRCLA_11516
YTCVAVQTAIAGIALASQCVLGTTCGGNSAGQCPTFSSWSSSYQKIQPVCAFVNVTN
CVNFIKAGSEAKATSGSGSTSTVNCYQATFSANNISQVVSGIYKCVDSGLYVSQNLGAI
KNLTTTQMDVCAGNLTTSVGALCNGHGTCAPTAAFSSKYQCICNEGYSATDNCNVAT
SNVCNAFGSCGAGNTCDTTSKQCSCTTGTTGPQCSLCDPTASSSVVCNGNGVCSSS
GTCTCNSDYTGSLCSRTATTNSTGSNKSSSSSHLVASLATIATCLLAILM
Table S2. Selected hypothetical proteins (n=30) from the Achlya hypogyna genome
possessing putative N-terminal transit peptides. Chloroplast transit peptides predicted using
ChloroP (v.1.1) are shown in bold face. Transit peptide sequences predicted using
SignalP (v.4.0) are underlined.
>ACHHYP_00269
MVRTLSLLLLAAGVAGQTSTTPVPTPAVSNPPFTMTLVNSIQARVVAEAATWDETNQ
KFGLVLKQNTNTFEERYRAVMDTVNTASVEGALYYVQTEGIDKPLQTGCMRKTNMSYI
WFLNITMVQPTFAIAEYQDNGGVVPEYGKFVAMDGGLCTPVGTETPLECLTYGGLNFN
KNLGQWVGGEARKKNGRANYDDNYWFSFPNSCYTMRFDAKTKACRDLQKGGLCPIG
TQPDGVKCTYSFDVLGYLAIDDLVGITSMKNTLTGQNFKGFSEFCKAGKTEYNFADSS
SDLTFWNDPLEPAANANRTKVMMQKYNDLVQNGVGDQKHMKALPSVEELTKANPPC
WKNSPRCATAANGCRRKLLSQICEVCSAPADDCKKPGPNDKAAPMLNKQFQPALPTD
ATGNTKQPRAPNAAPLDAPAGGAGGNVIKGSGAAATSLILATAVGLVALAV
>ACHHYP_01095
MLARLAALIGVAAALQVPFTTYECVRGRCEPRPRSFSPPDSASSLRLCEMTCGAGNL
WPLPTSVSLGTTTRVVSVDYVSHTVTFLDNSVPISPLVGAIQRIFDNTLALKATECALAS
VGGAELAVTASIESGNEVRDYFRTFTMAADDNTMVQELELETDESYTLTIVDGAATIHA
ATVYGYRHALTTLSQLIEYDELSHDMHIISAVTITDAPHFAHRGIVLDTSRQYYSVPAIKR
LLDGMGATKLNSFHWHFTDTASFPIEIKGEPRLTAFGAYHPRSVYTQQAMRDIVAYAR
ARGVRVIPEVDAPSHVGAGWQWGKDAGLGELAVCFGHNPWTEACVEPPCGQLNPF
NPHVYDVLETVYEELNEIFDSDVFHMGGDEVHLGCWNMSAAVTAHMTDRSPDAFYRV
WGRFQMQARQLVGEKKIAVWTSDLTNAPYLRKYFDPASTIIQMWTLSTGSDAARFTA
QGYPVIASYYDAYYLDCGFGNWLLKGADWCTPYHHWSVLYDLDVLHNVPAAQRNLVL
GGEVALWSEEVDEATMDAKIWPRAAAAAERWWSNPVNGTWKDAIDRMRIQRDRLVD
IGLQADALQPLWCRQNAGDLSQGSGISISATVKSKSEALTVDTDESYELSIDGPKVSIN
AATVYGYRHALTTLNQLIDYDEISNSVKMIAKAKIADKPAYSHRGIVLDTARNYYSIDSLK
RLVDTMGANKLNTFHWHFSDSSSFPFEIKSEPRLTSYGAYSKDQVYTQDQIRDFVQFA
KARGVRIIPELDAPSHAGAGWQWGPKAGYGELTLCYGSDPWMDYCLEPPCGQLNPL
NDHVYDILKTVFEEMHGLFDSNVFHMGGDEVSVPCWNSSKVITDHLKNTTSNAPFFDL
WGTFQTKAGALIEKANKKIMVWTSDLTTDPYLKYFKPSNTIVQLWGGSTDGDAERLTS
KGYEVVASYWDAYYLDCGFGGWVSKGNGWCAPYKSWQVIYDLDVRANLTATNAKRV
LGSEVAMWSEIADEKAVEAKIWPRAAALAERLWTNPKTNWKSAMTRMRIQRDRIADA
GVGTDAVHPLWCRQNPGKCTLV
>ACHHYP_01226
MTALADAVWLAVMAFLDGQDLSRLMRVSRAHWRRLQAQVRRWREIQLGLGLGHWV
QRNVRLTINTQVQEAQSLAVQRSPDARVPPRVETIQKELGPIEAERSVHRLTATTPLFT
ATQQAVLVLSFDCTSADTKPLLVHTSQRARTLYTTLTLTIFDRTLRRHVYHKASGDLAT
VPVAEKQAWTNAGATLRCDVASNDKSCQVQLGLPARLDGKIDCYHIERVDFTLHKREL
YPVFSLPLEPSLPTCWIHLQFHDLARAQCLARVSAPCHALLEMAASRTDDTNHPARRT
AVEQLEVATFRSTQPTSLPDISSLAKPGMISMVISGPERHQAFYHTAFGHSGATRKSDS
AHVLAATWVPGVLEFAMYPDTLNRRVLKGIFTLEFAVSGALTSLVVLAQHLSPRRLLRY
NARVASYSRRPEAERNEDA
>ACHHYP_01546
MVALFLGTAIALALASSATGSFTGLAMPAANSSEPKSGQCKLMKLLPRATQFNVALS
PRHYGRGGHCGRCVQTQCDRCAASAPIIAQVTDRASDVGLSKPMLRALFGSGAPSAV
TWDFVDCPVNDPIALCTKPRNTSAYIIYVQPTNTVAGVQNMTIDGFRGRLTNASYHFKA
PMPANWSNVRVSMKSFTGDAIAASVALRPGRCVTIPHQFSPSPAAASGTPAVIDYDGD
EDADSITVPPPYK
>ACHHYP_02169
MAWIVVLGILAHVATALQSSLCATSSAFSPPGCHANRRLATWSRAIVRLNAGGHVCT
GWFVGSEGHILTAHHCIHKARAVEVVVEETPAQTCPPRTIRGRMTTGIDVVAFSVALDY
ALLRPLNRSVRGPVHLQLHSSAADIVGLEAIVAQHVDASSPVVLSEAGRIVSTTFAGCG
RRDRLAYALDTKASASGSPILSTATGAVLGLHTCGGIHCHGKSVPMWIVIGCSSEPGH
WNSGAVAADVVADLRQRHHLPPDAVAHETLSAPTPSTIIVERGRLVQRAANTTSVDAY
LLTMAMPGRVTLDLLAWTMDAQGRWHDLRRDCDGSFFDTKVILAVVDDADGRPLLRR
IAENDNDTRHQGMGDGSIDNRDAFLDVYLASPGDYYVLVGTAAMLLPAVFAPRLSAPT
DGGQHLYGCGNTRATEANYNLRITTDDGTLQRIEAPFPRTAACSSSARKCPAAHADTA
LTLDAVVAGTLHRTYSSGTSMDHISFELTKAGRIAIDVVSYQEHTNGSIAIDGLHDVCGR
AYLDTVLYVFGATIPSGEYLDPAALVATASDRPPTHVASQRYRSVSTRDPYVEVDLPA
GNFTLVVGQQPLSLFEAVRVLYPGSRETDAPLLCGRPHPFGHYHVFFWVQHRRMLSA
TMPGSFDHAACTHEVCSDSML
>ACHHYP_02305
MKFTTLLVATVFGQNTTTAPSSAPTPAPTKCLLQFTSPCKSSSECGDLNGFNLTCIKS
GSNKQCNFNGGSTVAKDNQFKAADNLVYQFGDCSTASCTTGHGFTEGLPTTVTCQE
PLVCVKEINDNPGVVLKSQCHTCGSCKAQSLKDTRFDCSKVCPLTPAPTTKAPKVPGA
TGSAASSGSGSETSAPATRAPKTGTPAPTAASSASTALVSGIAVVALAFAQLC
>ACHHYP_03044
MAGLIVGILAAVGTFSGSGESISTGTSSTPAPTTHTPTTLSPSPTTKPTTVTPTPTLAN
GLCPLRGMYLSGTSCVACPTPKKTFSVFWESQVDCSTFATSSAAAYVTHIYWSFALID
PTTGTVSSTFQGSSATLKACIAAARAKCIKNYVSIGGATMRQTFVALNSSAQLTTFALS
AAQVVQEYGFDGVDIDDESGNLLAGGDWKANALPNVLVYLQGLKTQLAALPRAATEP
KYQITWDEFPTSLSTGCDLASGDYLRCFDVRIANIVDQVNIMMYNSASSTDYDNFLNVV
TPTEWATAMPASKIVIGGCVGPIGTIGGCAFGAAPTATQLKAYASLLDPALHERLSRMD
LGFMLDLARDELLVLLESEQAHNPGVAVREGEGREDKQQQRRVQREVGAEEVDEAH
VGEERVEGGVRRDLAGVEQQ
>ACHHYP_03052
MAAVSNPLLPLQLALADLLERPIHAALDDALRQPSNEQHLHHCVRSLPPSATVDALD
ASLAFVVHARALLTICSDYLDQHIAPQHALKKITDLLSVSREIANDAEVNATADDADVDE
AATDDSDQFASPKGEPPVGPWSGSETPAAPTSRQSWWAQIWGGDEDNDSAGDDVS
APPEEETLPSLPVEVANTIASLAAFPTNLKLQLHGLEALVEYVHGPCCCESVGPLYAAP
DMLPAVLHAISSLAQSKRAQIAGLSLLANPSSPKANMPMLPANLPTQQVRRLILRAMQR
FKAHAQIQGLGCLALSNLCRGPAISESHALKARGCRLVWSSWLLALICASSGTSMRAH
PLTGGPEDMQYAVLDAGSVAVVEAASRRFQDDDRVRKHADMALREMLQKHASRRAP
QCAFQ
>ACHHYP_04549
MRARAFFVLAGCATAAASPPLPWQSSCQVCAHTGRCGGASSPIKFCGTWPTGACC
CSANVNCPTPGVHATCDCGFLADYPVDAALPPVADVLGYNFS
>ACHHYP_04706
MRASVLAIAATVAAAANNQTATTKVFSLEVGTVGVHASRNQDSVLIPCKSNVCVPTG
SATLEFCRKACNRETGEHDCTTNCACNGTTPGYMCAGICNKAKTADECGSPVFQTCS
GEDLVPDYECANYKCTNHQRTNYLGANNRCANYERAAYPLPDIHARVRFVKLLNPGE
AHLIEYYTGLYFGPGQNNANDGFIWNPSVGSIKSISGNSCLDAYVAVDHNVYVHTYPC
DDSNPNQWWLYDSSLHQLRHKTHSTMCLDADPNDANKKVQMYLCSPGNANQYFDM
RPILS
>ACHHYP_04908
MTSVVAVTACLLSWLQRSRASPPVAYSAPNSVAFPAEIVHIKVILSRRRSSVLANGVL
PPVAPPRRAAHHEGHLAPLTRDLLSDKSGNAPP
>ACHHYP_05005
MSHCHFAFFVPMLARSLASFTRASRRCFSTEGPFEHRAVSAEVIAELKALYGDRVSTA
ASVREHHGTDESYHTPSPPDVVVYADSTEEVSKILQIASASKTPVIPFGAGSSLEGHISA
LHGGISLDLTNMKSVISVEQENMSCRVQCGVTRLQLESELRATGLFFPVDPGADATLG
GMVATNASGTTTVRYGNMKSNVLGLTAVMADGKIIKTGSKARKSSAGYDLTRLFIGSE
GTLAVVTEVELRLQGVPEAQKIAVCSFPTIQDAVDTCTVIMQMGIPVARMEFMDHKAIE
ATNSYSKLNNIVSPCLVIEMNGTPEEIEHHTATVQALAEEYSVQRMSWAATEEDRKELL
KARHSAWYATMNLVPGSRALSTDVCVPISNLTQVIVDTQADLEASNLVGTIVGHVGDG
NFHVMLPFLPEDEPAVRAFSDRLVERALAADGTCTGEHGIGSGKIKYLRMEHGDSVDV
MRTIKQALDPHNILNPSKLF
>ACHHYP_05180
MYNTADSVAFLSLLTSTVRAITPLPPLQFRVQAKFATGPLPASKPSPSSFISVRFVWNI
LVRLVVYRRRATPTPVDMAQERTVLA
>ACHHYP_05326
MHCTFFLSIVTAALAGVAGHVQQRIRSGAVKARGVNLGSWLVTEHFMMPQSPIYQNV
SADLQPLGEYVVTTALGRAVADPLFKAHRSSWITENDIKEIASFGLNTVRVPVGWWIYE
DPNDSDWQAYSPGGIQYLDALINDWALKYNVAVLVGMHGAKGSQNGEGHSAPQLPG
ESHFTDDADNVYTTMQSAKFIMSRYQSSVAFLGLEMLNEPTITPGRVYNIDRTKLIIYYT
NLYSKLRAICSSCIIMLSPLLNEQYESFGNQWANVLPTGSNNWIDWHKYLIWGFENWS
MKDIINTGTQWIANDITLWQSRRSAPIFVGEWSLAAAEGILGELKNGTNLNTYANRALA
AMKEAKAGWTYWSWKVNATDWRSYGWNMQALLRAGVIDLKNA
>ACHHYP_05770
MSKLSLAFLLHPTALACPPGPEAYVCPLSPETIVCPLSPRVSPASSARAKPKRSPPA
PRSRPCKEPGCTKYAVTRGHCIAHGGGKRCSVEQCPSGAKSNGLCWKHGGSKTCS
FPKCSNRSKTYGVCWSHGGGKQCADPNCTKTALRHGFCWAHGGGKRCRTEGCQR
PAYERNDNLCDVHCAKAS
>ACHHYP_06287
MQLSHILLFATAAAAQHTLLDSGTPEDRPSSWGSPVTKQIPSAVRFRSSGLCGEAQTI
DYVDFMVNTDLADIKANATWIGVEICPSVEDVPACPPTSVAEQIPIEVRGKRTTLHWVP
ATPKVLEPESLYWFIVSSNVENALQAVSWYPGSKRYGTDNDPKSDVASATRMLVPWG
GMDWVVEPSGGVAPLDHRRVPNAKIVVKA
>ACHHYP_06505
MIKSFTITATLLASASSLQMTNKERNELIDELNQWKKSQAGKTALVQGLLPPHPKTESF
DANAKLEAELVRFATTKKVVEKLNAEHNGSAVFSTDNQFALMTDDEFKKYVQGAFGK
PHKKRQLRGENIQLELTPAQREASGKDWTTSKCMPAVKNQGSCGSCWSFAAVGASA
MAHCLVSGKLIDLSEQQLVSCASSAGQGCQGGWPNKALEYIAQTGVCTAADFPYTQS
NGQCKQSCRKNKLSIGRPVDIRGESALQSALDKQPVTVVVEAGNNVWRNYKSGIVKS
CPGAQSDHAVIAVGYGNGFFKIRNSWGANWGEQGYMRLQKGSGGNGMCNVAEAPS
YPSMSGSPKPNNDDNNMPDDNDD
>ACHHYP_06977
MKISRVAVIGLLFVAARSTRAQSTSSSTQSNTETTSTESTPFSSSSSSGPAPIVDVIAAA
IDAGATPKQAAIVAVAADTGASLAAIIQTAVDAGVSPSIASAVASAANSAAGSGADDVTS
APITTVADAAVDAGATTAQAAAIANAASSGVSGDDLVNVAISVGVPASIASSVASAAGS
AAGTPAPIADVIQAALDSGASLNQAAAIAVAVSAGVSVDDIQTAAIQQGLPASVASSIAS
AVQSTIASAAGSTSADALAANGLGVTSASSTSYVPPSEVTPLKLTGAKDPEAASDVNS
PEAYSFSAPMTSGSTKSSESPLSGISGMFNNIVALVTSAPSPAEEPKPRLRASCRTA
>ACHHYP_07400
MKTPAFLASALFAVATGERPACGPDTPSPTMTPTADPTFAPTSGPTFPPTPAPGQWT
SLGGFAHDISFDGTNVCVKNGDGAFCGFAGQPFDQWKPVATQLKDIEQVACAKGVAF
VWGRSSGDLVMKTINLKTGEEHDAKMQDGESPRQFSTDGSVVCGTTNSRLFGAKVT
NGALGAYSTISEDHEIYKTAVAGEFLIVAGYDGALQATLLDAENWDTFSFDVVPVDLRA
REISTDGVDLCIVTYELDIACSKLSSGLEKWTKVPGEWKTVAVSNNTIYGVDFKSSEIRY
TYLK
>ACHHYP_08323
MVAWAWLPAAAAVVAATETHWSHLGNASSDRGLRIHTPITRADLHDEYNDAPVTQR
RLSGSAASLFRAVAGYGFRGLSNAAIFSGVTLDMCASACVTDARCLSFDYEASTCYIA
HTDRYAYPADFVPRATSTYYEWQGAAATPTIEPNGGRLTSYGAFQLFTTSRAAAMYY
QFKSLENGTVTVYTLYSPGTTVTLPEYPCVVQAYTTKAGLSDSIVLVSNAFTVYAARYA
YLVPFYNGLGFHGLVTRVQLDVQGVKRPRPSRVLEFTDINSTLGIGPFRGQLSTINLTA
YDARLAGFFDAFTGITTTLCPQVESRVAVSTVTYVNVSLQVFQNASRWVLVPAPLYAS
APGDLVFSSSVSLVEEYLYLCPHQNAKGHAGVIAKVNLRAFNATSHLPFQPAIEMLDLT
VIDPSLTGFGSCFANRNYGYFVQRRNAAGLAGQIVRVNLDLFAQPALAVTVLNATTFD
ARFVGFSGAVVYKNVAYLVPFERNKVGLELNPNYKYFPTPTSSIMGRLDLTTFSTVTPV
DLSVLDVKYACGYFGGFTVSYYVYLVPNMWTTDTTSPGVNPYHGLVARLNTLTMNVE
SLDLTLVDPSLKGFMRGFAFGRYAILVPHRNGLTTELPVRLNKSQKNNLGTIVAIDTDNF
TPSGVRYLDLTLALRSQIPNMPDADLRGFIGGGVSGEYGFFVPYFNGVRFSGKVVRVN
LRKFGEVQVLDMTQVHTSLRGFTNAVFPQLYEPTVTSLWNYVIPDGTQTPYTFITVDV
>ACHHYP_09221
MVSVTTPSMTLLGAIALVAGQATVAPTTATPSAPSASPTKGPWAFKSVRTVQARVQA
DVPVWDAAHKEWVAVFPQNTVTFEQRYRAAMDTINTATVEGALFYVQTEGIDKAVQA
ANGCMRKSNMSYIWYYDIEVVQPVYSVAEFGQNTGYAPEYGPFIAMDNGMCTPTSGT
TVPQGCMQFTGLAGNIALGNYIGGEPRTKHQYANYANNYWFSYPNSCFTKSFTAKTD
ACRNSPMQKGGLCPYGTKPDGINCTYSFSVLGYLSIDDLVGITSTVNPQTGKAFSNHM
EFCKAGKYEWDFTTSTGLPFWADPLNVTANAARSAKMMDLYTAKVAAGVGEYANMK
PFPKVSELVAQNPSCSDNSPYCAKQPHGCQRSLLGQICVPCSSASPSCKPPTRAFPA
LPVATTPPPVTDAAGNVVPMSTNLLGQAVPATSSASTVAFSATAAILVLALA
>ACHHYP_09519
MIVSAIVFAVLASAAGQSPLKIASSVPYALTIDGSAPVSTVISNTRATSLSVHIASMNLP
PGATLTIGTVDGKDKVVYTGAHTNLVSDYFIQNKVVVSYAAASYSNNTTPLVAIDKYFA
GTPDAGGLESICSTTGDLSRPAACYATSEPVKYAKARAIARLVIGGSSLCTGWLFGSE
GHLLTNNHCINNDRLAASTQVEFGAECASCSDGSNNVQLACKGTIVASNVTLLATSSK
LDFALVKINLNAGVDLSKYGYLQARDSAPVLNEPVWLAGHPQGDPLRMAVATSNNAE
GAIVSTNVTDSCKDNQVGYLLDTQGGSSGSPVMSTVDNSVVAIHNCGGCDSETPSNG
GIPLTKILAYLRANNIALPKNSVSAAPAPTTAKTTTASPATPAPSTSAPATAAPKPPTFTL
CSVSNKVISEYYTGLYVAPAGHTANEQFSYSPDTGAIQVQSNGQCLDAYWGGSSFLV
HTWPCDRGNNNQKWTVANNQVMHRVHGVCLTSVAGSKSLGVAPCNAADVRQWIYT
NCDTANVRNFVQLRTPRGALVSEWYSSVLAKQPQSSWTELWEINGQQMRSFSGSTC
LDAYWDNSRFQVHTWQCDPTNGNQQWRVGNSVVAHATHSNLCLDVDPTDPRQAAQ
VWGCHSATINSNQLFDVVAF
>ACHHYP_10824
MASISQWLCLSCWAPMSTPKTTMATDAWCGTFWKHMLMSVSVTPPPACDCSTDGA
TALFFAAQRGHSDIVYLLMSAGATAEESTLGISPKQIAQANGHTIVAAIFDTLPPPLPHRL
HWERSSVLFLSSFLVYRCNLLLLRH
>ACHHYP_11025
MHARFFAPVLGTLSLVAGSATTLAVNSSRTPQVNAQVRRLSKRALPRDMGKSSTSA
QAPEGSSKPDMMKDFPIFLFTIE
>ACHHYP_11286
MASESTPLLALLELPLLKPTSAETIQGHVTALRASFISGAMRPLAARKAQLRAIRALVE
DGCEILQAAMWKDLHKHAAETFVTETSSVLLEVQDHLDNLDDWAAPHKVGTNLLNLP
GSSYIRSDPLGVACIMDTWNYPIMLLLMPLIGAI
>ACHHYP_11397
MDRLLLLSALATAVAVDDAAPRPSRAPLPTTLVPWGSPLAAPTAPCTWGGRAHALD
WNLTTSVPGSRQCFPNLFAADQPLEFPYPRSSYNYDLDPPVVGPRVQVQWTNGVTN
VTAPVAAFDYRTFEMTGDELLFHALPDAPGVYRLAVQAFDWDRASSECRACLAVTDQ
VRPRATVARAGLCGASTTAPYSPEALAAADDRVRALVRYRATATNNDACSDRRCDAV
TVAQTGFLSAFPTAVVDGANAAVDAVPDGWLGCLAAPLSARERQRLTTPLALVDDAR
DYFVALQELYTPFRCGAPPGRPTCAGAASETCALMQAVVLPASHLVARVAVKLKATAG
HIADPAAAFPGAGYLPPSARHLHLAIPCYPTNASFSSFCADTVEWRVSDLFELSAELNA
SQPWGFDAAAPLVTWFVQQGPAWVAVADNKRLAFDKFQDTLVFRAMTPCGQVGEDI
AWTVFSHRAEALSVDAWWNSLWSCGGCNVPKADFSVCRFRFDPTSPLVSAMLHPPA
SCRDAAGRSCRNGCLARGQCNGRSTAASCGQQAGATWCDARGSALLAAAVPRYSL
RSLQCVWQYANTSSANWSVAVDVAVDTAFALKLRNADATELSVSCTLTFDPDTGEPA
VVKTRSLALSLRNCDGPRFEDHALAFVKDRCDASWRPGVGRQPAPRQACAGHLVFP
STTDAAATVLLTPADDLACCSGPVAAFSCQPLPGHPGLKQCQRADTATALLAAEPQA
WPPVALAASLALVFVLVRRRRQPSDTDLSRPLIDGDRC
>ACHHYP_12628
MIVQILALAATASAFTKCHIRHPNRTEVLSTPCPHEYVTELPASFDWRNVNGTNFVTV
SRNQHVPHYCGSCWAFAATSALSDRVRIARERNSEGKDRVLVTRQVNLSPQVLLNC
DKEDMGCHGGEGLSAYRYIHENGIPEEGCQRYLATGHDVGNTCTAIDVCRNCEPSKG
CFPQPSYDTYHVSEYGAVDGEAKMMAEIFARGPIVCGVAVTDEFLNYSGGVIDDKSGR
TDIDHDISIVGWGVDGSGTKYWVGRNSWGTYWGEEGWFRLRRGNNNLGVETDCAF
GVPADDGWPKRHTETTSPAKAAVWSGEIKSLLQPSRAQAKSRAPVHFVGGEKVLSPR
PHEEIDVLALPKQWDWRNIAGINYVTWDKNQHIPQYCGSCWAQATTSALSDRIAILRN
ASWPEIALSPQVVVNCHGGGSCEGGNPGAVYEYAHRHGIPDQTCQAYVAKDGQCNA
LGVCETCWPTNSSFTPGKCVAVPKFKSYYVAEYGHVRGADKMKAELYKRGPIGCGM
HVTDKFEAYTGGIYSEKTWFPIPNHEISIAGWGFDEATQTEYWIGRNSWGTYWGENG
WFRIKMHSDNLGIEGDCDWGVPIPDGSQPLL
>ACHHYP_13722
MKCFAVLAFAAFAAAATSEQAATTQPATTTAAVTTAAPNTTTVVPLVSTKAPNTTTVTP
APTTKAVTTVPVTTVKANTTAPVTTVPVVTQTNVTSPDETETPEPVIEQPTDAPLPVPT
KKKSNATTVPPSASASISMLSVASVAVAVAAYVM
>ACHHYP_14385
MATSVLALCFSSLTANSTNTPEPKYQTRTVDTVVYESSAKWPKYMGKGSAIQMYTTA
ALSAQILVSFPETTTVLEKVATVGPLVSLSAVIFFGAKYLGERVITNVTSCRTVGQRGIT
DAIYLYLDEFLKIQVAGGLRPKTFECYPKGVSALRLISYLKLVSKDENGMCNVKINRTTF
WLDLGKAQVHQEQSLKILLDGKPLLVRKGKIKKAARA
>ACHHYP_15409
MGLFAPVLAFATVAVAGSSSTTLPTAPASLSTTRSVPLTDRAALIQELAKWKDSKAGK
YAAANGFLKLSRLESAGDAEAELAAFAETKATVEALNQQYPLARFSTENPFALLTNDEF
ATWVSGGRDKVQRKVPEASTTQSTTASIAPGTVDWTMSGCVASVRSQGVCGSCFAF
AAVAAAESAYCLLHDRHLTPFSDQQVLSCGPGNGCMGGWSDQSLAWMASHGVCTG
ASYPHTNDWNTTAAACIPECKALSMPYSSVASVAGEHELEAAIALQPVAVDISATSPVF
KNYESGIITGGCNVDFNHVVLGVGYGVAEVPYFKMKNSWGDWWGEGGFVRLQRGV
GGVGTCGLARHAAYPVVFPMPFNLVTFRGVVISEYYSNLFASAKQGSVNELWTYDAIT
RHITVGSNHQCLDAYPTGSSYAVHTYSCDAKNDNQKWVIDSANHAIKHAVHPTLCLDV
DPNQNNKVQVWSCSPGNQNQWVAVSEERVKLWNVNGNFLASDGNLIQFYSPSSPSY
EWAVSNLDHTWRARSNVGAPDLCLDAYEPWNGGAVHLYTCDSTNGNQKWIYDAKTQ
QLRHLTHVGFCLDMRTALGDKAHLWTCNTPANSLQKFQYKSLTFPA
LITERATURE CITED
Archibald, J. M. 2008. The origin and spread of eukaryotic photosynthesis: evolving
views in light of genomics. Bot. Mar., 52:95--103.
Archibald, J. M. 2009. The puzzle of plastid evolution. Curr. Biol., 19:R81--R88.
Armbrust, E. V. 2009. The life of diatoms in the world’s oceans. Nature, 459:185--192.
Armbrust, E. V., Berges, J. A., Bowler, C., Green, B. R., Martinez, D., Putnam, N. H.,
Zhou, S., Allen, A. E., Apt, K. E., Bechner, M., Brzezinski, M. A., Chaal, B. K., Chiovitti,
A., Davis, A. K., Demarest, M. S., Detter, J. C., Glavina, T., Goodstein, D., Hadi, M. Z.,
Hellsten, U., Hildebrand, M., Jenkins, B. D., Jurka, J., Kapitonov, V. V., Kröger, N., Lau,
W. W. Y., Lane, T. W., Larimer, F. W., Lippmeier, J. C., Lucas, S., Medina, M.,
Montsant, A., Obornik, M., Parker, M. S., Palenik, B., Pazour, G. J., Richardson, P. M.,
Rynearson, T. A., Saito, M. A., Schwartz, D. C., Thamatrakoln, K., Valentin, K., Vardi,
A., Wilkerson, F. P. & Rokhsar, D. S. 2004. The genome of the diatom Thalassiosira
pseudonana: Ecology, evolution and metabolism. Science, 306:79--86.
Baginsky, S., Kleffmann, T., von Zychlinski, A. & Gruissem, W. 2005. Analysis of
shotgun proteomics and RNA profiling data from Arabidopsis thaliana chloroplasts. J.
Prot. Res., 4:637--640.
Barbrook, A. C., Howe, C. J. & Purton, S. 2006. Why are plastid genomes retained in
non-photosynthetic organisms? Trends Plant Sci., 11:101--108.
Baurain, D., Brinkmann, H., Petersen, J., Rodríguez-Ezpeleta, N., Stechmann, A.,
Demoulin, V., Roger, A. J., Burger, G., Lang, B. F. & Philippe, H. 2010. Phylogenomic
evidence for separate acquisition of plastids in cryptophytes, haptophytes and
stramenopiles. Mol. Biol. Evol., 27:1698--1709.
Beakes, G. W. & Sekimoto, S. 2009. The evolutionary phylogeny of oomycetes: insights
gained from studies of holocarpic parasites of algae and invertebrates. In: K.
Lamour and S. Kamoun (eds.), Oomycete Genetics and Genomics: Diversity,
Interactions, and Research Tools. John Wiley & Sons, Inc., Hoboken, NJ, USA.
doi: 10.1002/9780470475898.ch1.
Birch, P. R. J., Rehmany, A. P., Pritchard, L., Kamoun, S. & Beynon, J. L. 2006.
Trafficking arms: oomycete effectors enter host plant cells. Trends Microbiol., 14:8--11.
Bittner, L., Halary, S., Payri, C., Cruaud, C., de Reviers, B., Lopez, P. & Bapteste, E.
2010. Some considerations for analyzing biodiversity using integrative metagenomics
and gene networks. Biol. Direct, 5:47. doi:10.1186/1745-6150-5-47.
Bodyl, A. & Moszczynski, K. 2006. Did the peridinin plastid evolve through tertiary
endosymbiosis? A hypothesis. Eur. J. Phycol., 41:435--448.
Bodyl, A. 2005. Do plastid-related characters support the chromalveolate hypothesis? J.
Phycol., 41:712--719.
Bodyl, A., Stiller, J. W. & Mackiewicz, P. 2009. Chromalveolate plastids: direct descent
or multiple endosymbiosis. Trends Ecol. Evol., 3:119--121.
Bowler, C., Allen, A. E., Badger, J. H., Grimwood, J., Jabbari, K., Kuo, A., Maheswari,
U., Martens, C., Maumus, F., Otillar, R. P., Rayko, E., Salamov, A., Vandepoele, K.,
Beszteri, B., Gruber, A., Heijde, M., Katinka, M., Mock, T., Valentin, K., Verret, F.,
Berges, J. A., Brownlee, C., Cadoret, J. P., Chiovitti, A., Choi, C. J., Coesel, S., De
Martino, A., Detter, J. C., Durkin, C., Falciatore, A., Fournet, J., Haruta, M., Huysman,
M. J., Jenkins, B. D., Jiroutova, K., Jorgensen, R. E., Joubert, Y., Kaplan, A., Kroger, N.,
Kroth, P. G., La Roche, J., Lindquist, E., Lommer, M., Martin-Jezequel, V., Lopez, P. J.,
Lucas, S., Mangogna, M., McGinnis, K., Medlin, L. K., Montsant, A., Oudot-Le Secq, M.
P., Napoli, C., Obornik, M., Parker, M. S., Petit, J. L., Porcel, B. M., Poulsen, N.,
Robison, M., Rychlewski, L., Rynearson, T. A., Schmutz, J., Shapiro, H., Siaut, M.,
Stanley, M., Sussman, M. R., Taylor, A. R., Vardi, A., von Dassow, P., Vyverman, W.,
Willis, A., Wyrwicz, L. S., Rokhsar, D. S., Weissenbach, J., Armbrust, E. V., Green, B. R.,
Van de Peer, Y. & Grigoriev, I. V. 2008. The Phaeodactylum genome reveals the
evolutionary history of diatom genomes. Nature, 456:239--244.
Burki, F., Shalchian-Tabrizi, K., Minge, M., Skjaeveland, A., Nikolaev, S. I., Jakobsen,
K. S. & Pawlowski, J. 2007. Phylogenomics reshuffles the eukaryotic supergroups.
PLoS One, 2:e790.
Cavalier-Smith, T. 1999. Principles of protein and lipid targeting in secondary
symbiogenesis: euglenoid, dinoflagellate, and sporozoan plastid origins and the
eukaryote family tree. J. Eukaryot. Microbiol., 46:347--366.
Cavalier-Smith, T. 2003. Genomic reduction and evolution of novel genetic membranes
and protein-targeting machinery in eukaryote-eukaryote chimaeras (meta-algae).
Philos. Trans. R. Soc. Lond. B. Biol., 359:109--134.
Chan, C. X., Reyes-Prieto, A. & Bhattacharya, D. 2011. Red and green algal origin of
diatom membrane transporters: Insights into environmental adaptation and cell
evolution. PLoS ONE, 6:e29138. doi:10.1371/journal.pone.0029138.
Cock, J. M., Sterck, L., Rouze, P., Scornet, D., Allen, A. E., Amoutzias, G., Anthouard,
V., Artiguenave, F., Aury, J. M., Badger, J. H., Beszteri, B., Billiau, K., Bonnet, E.,
Bothwell, J. H., Bowler, C., Boyen, C., Brownlee, C., Carrano, C. J., Charrier, B., Cho,
G. Y., Coelho, S. M., Collen, J., Corre, E., Da Silva, C., Delage, L., Delaroque, N.,
Dittami, S. M., Doulbeau, S., Elias, M., Farnham, G., Gachon, C. M. M., Gschloessl, B.,
Heesch, S., Jabbari, K., Jubin, C., Kawai, H., Kimura, K., Kloareg, B., Küpper, F. C.,
Lang, D., Le Bail, A., Leblanc, C., Lerouge, P., Lohr, M., Lopez, P. J., Martens, C.,
Maumus, F., Michel, G., Miranda-Saavedra, D., Morales, J., Moreau, H., Motomura, T.,
Nagasato, Ch., Napoli, C. A., Nelson, D. R., Nyvall-Collén, P., Peters, A. F., Pommier,
C., Potin, P., Poulain, J., Quesneville, H., Read, B., Rensing, S. A., Ritter, A., Rousvoal,
S., Samanta, M., Samson, G., Schroeder, D. C., Ségurens, B., Strittmatter, M., Tonon,
T., Tregear, J. W., Valentin, K., von Dassow, P., Yamagishi, T., Van de Peer, Y., &
Wincker, P. 2010. The Ectocarpus genome and the independent evolution of
multicellularity in brown algae. Nature, 465:617--621.
De Koning, A. P. & Keeling, P. J. 2004. Nucleus-encoded genes for plastid-targeted
proteins in Helicosporidium: functional diversity of a cryptic plastid in a parasitic alga.
Eukaryot. Cell, 3:1198--1205.
Delwiche, C. F. 1999. Tracing the thread of plastid diversity through the tapestry of life.
Am. Nat., 154:S164--S177.
Dodge, J. D. 1975. A survey of chloroplast ultrastructure in the Dinophyceae. Phycologia,
14:253--263.
Dong, J., Chen, C. & Chen, Z. 2003. Expression profiles of the Arabidopsis WRKY
gene superfamily during plant defense response. Plant Mol. Biol., 51:21--37.
Dorrell, R. G. & Smith, A. G. 2011. Do red and green make brown?: perspectives on
plastid acquisitions within chromalveolates. Eukaryot. Cell, 10:856--868.
Drummond, A. J., Ashton, B., Buxton, S., Cheung, M., Cooper, A., Duran, C., Field, M.,
Heled, J., Kearse, M., Markowitz, S., Moir, R., Stones-Havas, S., Sturrock, S., Thierer,
T. & Wilson, A. 2011. Geneious v5.5. www.geneious.com.
Elias, M. & Archibald, J. M. 2009. Sizing up the genomic footprint of endosymbiosis.
BioEssays, 31:1273--1279.
Emanuelsson, O., Nielsen, H. & von Heijne, G. 1999. ChloroP, a neural network-based
method for predicting chloroplast transit peptides and their cleavage sites.
Prot. Sci., 8:978--984.
Foth, B. J. & McFadden, G. I. 2003. The apicoplast: a plastid in Plasmodium falciparum
and other Apicomplexan parasites. Int. Rev. Cytol., 224:57--110.
Gaulin, E., Madoui, M. A., Bottin, A., Jacquet, C., Mathé, C., Couloux, A., Wincker, P. &
Dumas, B. 2008. Transcriptome of Aphanomyces euteiches: new oomycete putative
pathogenicity factors and metabolic pathways. PLoS ONE, 3:e1723.
doi:10.1371/journal.pone.0001723.
Gibbs, S. 1981a. The chloroplast endoplasmic reticulum: structure, function, and
evolutionary significance. Int. Rev. Cytol., 72:49--99.
Gibbs, S. 1981b. The chloroplast of some algal groups may have evolved from
endosymbiotic eukaryotic algae. Ann. N.Y. Acad. Sci., 361:193--208.
Green, B. R. 2011. After the primary endosymbiosis: an update on the chromalveolate
hypothesis and the origins of algae with Chl c. Photosynth. Res., 107:103--115.
Gruber, A., Vugrinec, S., Hempel, F., Gould, S. B., Maier, U. G. & Kroth, P. G. 2007.
Protein targeting into complex diatom plastids: functional characterisation of a specific
targeting motif. Plant Mol. Biol., 64:519--530.
Gschloessl, B., Guermeur, Y. & Cock, J. M. 2008. HECTAR: A method to predict
subcellular targeting in heterokonts. BMC Bioinformatics, 9:393. doi:10.1186/1471-2105-9-393.
Guillot, M. & Gibbs, S. 1980a. Evidence that the chloroplast and nucleomorph of
cryptomonads are remnants of a eukaryotic symbiont. J. Cell Biol., 87:186.
Guillot, M. & Gibbs, S. 1980b. The cryptomonad nucleomorph: its ultrastructure and
evolutionary significance. J. Phycol., 16:558--568.
Guindon, S., Dufayard, J. F., Lefort, V., Anisimova, M., Hordijk, W. & Gascuel, O. 2010.
New algorithms and methods to estimate maximum-likelihood phylogenies: assessing
the performance of PhyML 3.0. Syst. Biol., 59:307--321.
Hackett, J. D., Yoon, H. S., Li, S., Reyes-Prieto, A., Rümmele, S. E. & Bhattacharya, D.
2007. Phylogenomic analysis supports the monophyly of cryptophytes and haptophytes
and the association of Rhizaria with chromalveolates. Mol. Biol. Evol., 24:1702--1713.
Hackett, J. D., Yoon, H. S., Soares, M. B., Bonaldo, M. F., Casavant, T. L., Sheetz, T.
E., Nosenko, T. & Bhattacharya, D. 2004. Migration of the plastid genome to the
nucleus in a peridinin dinoflagellate. Curr. Biol., 14:213--218.
Harper, J. T., Waanders, E. & Keeling, P. J. 2005. On the monophyly of
chromalveolates using a six-protein phylogeny of eukaryotes. Int. J. Syst. Evol. Micr.,
55:487--496.
Huang, J., Mullapudi, N., Lancto, C. A., Scott, M., Abrahamsen, M. S. & Kissinger, J. C.
2004. Genomic evidence supports past endosymbiosis, intracellular and horizontal
gene transfer in Cryptosporidium parvum. Genome Biol., 11:R88.
Iida, K., Takishita, K., Ohshima, K. & Inagaki, Y. 2007. Assessing the monophyly of
chlorophyll-c containing plastids by multi-gene phylogenies under the unlinked model
conditions. Mol. Phylogenet. Evol., 45:227--238.
Janouskovec, J., Horak, A., Obornik, M., Lukes, J. & Keeling, P. J. 2010. A common red
algal origin of the apicomplexan, dinoflagellate, and heterokont plastids. Proc. Natl.
Acad. Sci. USA, 107:10949--10954.
Jiang, R. H., Tyler, B. M., Whisson, S. C., Hardham, A. R. & Govers, F. 2006. Ancient
origin of elicitin gene clusters in Phytophthora genomes. Mol. Biol. Evol., 2:338--351.
Kamoun, S. 2006. A catalogue of the effector secretome of plant pathogenic
oomycetes. Annu. Rev. Phytopathol., 44:41--60.
Keeling, P. J. 2004. Diversity and evolutionary history of plastids and their hosts. Am.
J. Bot., 91:1481--1493.
Keeling, P. J. 2009. Role of horizontal gene transfer in the evolution of photosynthetic
eukaryotes and their plastids. Methods Mol. Biol., 532:501--515.
Khan, H., Parks, N., Kozera, C., Curtis, B. A., Parsons, B. J., Bowman, S. & Archibald,
J. M. 2007. Plastid genome sequence of the cryptophyte alga, Rhodomonas salina
CCMP1319: lateral transfer of putative DNA replication machinery and a test of chromist
plastid phylogeny. Mol. Biol. Evol., 24:1832--1842.
Kleffmann, T., Russenberger, D., von Zychlinski, A., Christopher, W., Sjolander, K.,
Gruissem, W. & Baginsky, S. 2004. The Arabidopsis thaliana chloroplast proteome
reveals pathway abundance and novel protein functions. Curr. Biol., 14:354--362.
Kleffmann, T., Hirsch-Hoffmann, M., Gruissem, W. & Baginsky, S. 2006. plprot: a
comprehensive proteome database for different plastid types. Plant Cell Physiol.,
47:432--436.
Köhler, S., Delwiche, C. F., Denny, P. W., Tilney, L. G., Webster, P., Wilson, R. J.,
Palmer, J. D. & Roos, D. S. 1997. A plastid of probable green algal origin in
apicomplexan parasites. Science, 275:1485--1489.
Kolaczkowski, B. & Thornton, J. W. 2008. A mixed branch length model of heterotachy
improves phylogenetic accuracy. Mol. Biol. Evol., 25:1054--1066.
Kroth, P. G. 2002. Protein transport into secondary plastids and the evolution of
primary and secondary plastids. Int. Rev. Cytol., 221:191--255.
Lane, C. E. & Archibald, J. M. 2008. The eukaryotic tree of life: endosymbiosis takes
its TOL. Trends Ecol. Evol., 5:268--275.
Larkum, A. W. D., Lockhart, P. J. & Howe, C. J. 2007. Shopping for plastids. Trends
Plant Sci., 12:189--195.
Lee, J. J., Leedale, G. F. & Bradbury, P. (eds.) 2000. Illustrated Guide to the Protozoa.
2nd ed., Society of Protozoologists, Allen Press, Lawrence, Kansas.
Marchler-Bauer, A., Anderson, J. B., Derbyshire, M. K., DeWeese-Scott, C., Gonzales,
N. R., Gwadz, M., Hao, L., He, S., Hurwitz, D. I., Jackson, J. D., Ke, Z., Krylov, D.,
Lanczycki, C. J., Liebert, C. A., Liu, C., Lu, F., Lu, S., Marchler, G. H., Mullokandov, M.,
Song, J. S., Thanki, N., Yamashita, R. A., Yin, J. J., Zhang, D. & Bryant, S. H. 2007.
CDD: a conserved domain database for interactive domain family analysis. Nucleic Acids
Res., 35:D237--240.
Moustafa, A., Beszteri, B., Maier, U. G., Bowler, C., Valentin, K. & Bhattacharya, D.
2009. Genomic footprints of a cryptic plastid endosymbiosis in diatoms. Science,
324:1724--1726.
Okamoto, N., Chantangsi, C., Horák, A., Leander, B. S. & Keeling, P. J. 2009.
Molecular phylogeny and description of the novel katablepharid Roombia truncata gen.
et sp. nov., and establishment of the Hacrobia taxon nov. PLoS ONE, 4:e7080.
doi:10.1371/journal.pone.0007080.
Pagel, M. & Meade, A. 2008. Modelling heterotachy in phylogenetic inference by
reversible-jump Markov chain Monte Carlo. Phil. Trans. R. Soc. B., 363:3955--3964.
Parfrey, L. W., Grant, J., Tekle, I. Y., Lasek-Nesselquist, E., Morrison, H. G., Sogin, M.
L., Patterson, D. J. & Katz, L. A. 2010. Broadly sampled multigene analyses yield a
well-resolved eukaryotic tree of life. Syst. Biol., 59:518--533.
Patron, N. J., Inagaki, Y. & Keeling, P. J. 2007. Multiple gene phylogenies support the
monophyly of cryptomonad and haptophyte host lineages. Curr. Biol., 17:887--891.
Petersen, T. N., Brunak, S., von Heijne, G. & Nielsen, H. 2011. SignalP 4.0:
discriminating signal peptides from transmembrane regions. Nature Methods, 8:785--786.
Philippe, H., Zhou, Y., Brinkmann, H., Rodrigue, N. & Delsuc, F. 2005. Heterotachy and
long-branch attraction in phylogenetics. BMC Evol. Biol., 5:50. doi:10.1186/1471-2148-5-50.
Ralph, S. A., van Dooren, G. G., Waller, R. F., Crawford, M. J., Fraunholz, J. J., Foth, B.
J., Tonkin, C. J., Roos, D. S. & McFadden, G. I. 2004. Metabolic maps and functions of
the Plasmodium falciparum apicoplast. Nature Rev. Microbiol., 2:203--216.
Reyes-Prieto, A., Moustafa, A. & Bhattacharya, D. 2008. Multiple genes of apparent
algal origin suggest ciliates may once have been photosynthetic. Curr. Biol., 13:956--962.
Rice, D. W. & Palmer, J. D. 2006. An exceptional gene transfer in plastids: gene
replacement by a distant bacterial paralog and evidence that haptophyte and
cryptophyte plastids are sisters. BMC Biol., 4:31.
Rogers, M. B., Patron, N. J. & Keeling, P. J. 2007. Horizontal transfer of a eukaryotic
plastid-targeted protein gene to cyanobacteria. BMC Biol., 5:26.
Sanchez-Puerta, M. V. & Delwiche, C. F. 2008. A hypothesis for plastid evolution in
chromalveolates. J. Phycol., 44:1097--1107.
Sanchez-Puerta, M. V., Lippmeier, J. C., Apt, K. E. & Delwiche, C. F. 2007. Plastid
genes in a non-photosynthetic dinoflagellate. Protist, 158:105--117.
Sekimoto, S., Klochkova, T. A., West, J. A., Beakes, G. W. & Honda, D. 2009.
Olpidiopsis bostrychiae sp. nov.: an endoparasitic oomycete that infects Bostrychia and
other red algae (Rhodophyta). Phycologia, 48:460--472.
Shindo, T., Misas-Villamil, J. C., Hörger A. C., Song, J. & van der Hoorn, R. A. L. 2012.
A role in immunity for Arabidopsis cysteine protease RD21, the ortholog of the tomato
immune protease C14. PLoS ONE, 7:e29317. doi:10.1371/journal.pone.0029317.
Slamovits, C. H. & Keeling, P. J. 2008. Plastid-derived genes in the nonphotosynthetic
alveolate Oxyrrhis marina. Mol. Biol. Evol., 25:1297--1306.
Soll, J. & Schleiff, E. 2004. Protein import into chloroplasts. Nature Rev. Mol. Cell
Biol., 5:198--208.
Stiller, J. W., Huang, J., Ding, Q., Tian, J. & Goodwillie, C. 2009. Are algal genes in
nonphotosynthetic protists evidence of historical plastid endosymbiosis? BMC
Genomics, 10:484. doi:10.1186/1471-2164-10-484.
Tatusov, R. L., Natale, D. A., Fedorova, N. D., Jackson, J., Jacobs, A., Krylov, D. M.,
Mekhedov, S. L., Nikolskaya, A. N., Rao, B. S., Wolf, Y. I., Aravind, L., Lanczycki, C.,
Mazumder, R., Sreekumar, K., Vasudevan, S., Walker, D. R., Tatusova, T. A., Yao, K.,
Yin, J. & Koonin, E. V. 2003. The COG database: an updated version includes
eukaryotes. BMC Bioinformatics, 4:41.
Tyler, B. M., Tripathy, S., Zhang, X., Dehal, P., Jiang, R. H. Y., Aerts, A., Arredondo, F.
P., Baxter, L., Bensasson, D., Beynon, J. L., Chapman, J., Damasceno, C. M. B.,
Dorrance, A. E., Dou, D., Dickerman, A. W., Dubchak, I. L., Garbelotto, M., Gijzen, M.,
Gordon, S. G., Govers, F., Grunwald, N. J., Huang, W., Ivors, K. L., Jones, R. W.,
Kamoun, S., Krampis, K., Lamour, K. H., Lee, M. K., McDonald, W. H., Medina, M.,
Meijer, H. J. G., Nordberg, E. K., Maclean, D. J., Ospina-Giraldo, M. D., Morris, P. F.,
Phuntumart, V., Putnam, N. H., Rash, S., Rose, J. K. C., Sakihama, Y., Salamov, A. A.,
Savidor, A., Scheuring, C. F., Smith, B. M., Sobral, B. W. S., Terry, A., Torto-Alalibo, T.
A., Win, J., Xu, Z., Zhang, H., Grigoriev, I. V., Rokhsar, D. S. & Boore, J. L. 2006.
Phytophthora genome sequences uncover evolutionary origins and mechanisms of
pathogenesis. Science, 313:1261--1266.
Whelan, S. & Goldman, N. 2001. A general empirical model of protein evolution
derived from multiple protein families using a maximum-likelihood approach. Mol. Biol.
Evol., 18:691--699.
Wilson, R. J. M. 2004. Plastid functions in the Apicomplexa. Protist, 155:11--12.
Woehle, C., Dagan, T., Martin, W. F. & Gould, S. B. 2011. Red and problematic green
phylogenetic signals among thousands of nuclear genes from the photosynthetic and
apicomplexa-related Chromera velia. Genome Biol. Evol., 3:1220--1230.
Yoon, H. S., Hackett, J. D., Ciniglia, C., Pinto, G. & Bhattacharya, D. 2004. A
molecular timeline for the origin of photosynthetic eukaryotes. Mol. Biol. Evol., 21:809--818.
Yoon, H. S., Hackett, J. D., Pinto, G. & Bhattacharya, D. 2002. The single, ancient
origin of chromist plastids. Proc. Natl. Acad. Sci. USA, 99:15507--15512.