SUPPLEMENTARY INFORMATION

Increasing genomic diversity and evidence of constrained lifestyle evolution due to insertion sequences in Aeromonas salmonicida

Antony T. Vincent a,b,c, Mélanie V. Trudel a,b,c, Luca Freschi a,e, Vandan Nagar d, Cynthia Gagné-Thivierge a,b,c, Roger C. Levesque a,e, Steve J. Charette a,b,c#

a. Institut de biologie intégrative et des systèmes, Pavillon Charles-Eugène-Marchand, Université Laval, 1030 avenue de la Médecine, Quebec City, QC, Canada, G1V 0A6
b. Centre de recherche de l’Institut universitaire de cardiologie et de pneumologie de Québec (Hôpital Laval), 2725 Chemin Sainte-Foy, Quebec City, QC, Canada, G1V 4G5
c. Département de biochimie, de microbiologie et de bio-informatique, Faculté des sciences et de génie, Université Laval, 1045 avenue de la Médecine, Quebec City, QC, Canada G1V 0A6
d. Food Technology Division, Bhabha Atomic Research Centre, Mumbai, 400085, India
e. Département de microbiologie-infectiologie et immunologie, Faculté de médecine, Université Laval, Quebec City, QC, Canada

# Corresponding author: Steve J. Charette, Institut de Biologie Intégrative et des Systèmes (IBIS), Pavillon Charles-Eugène-Marchand, 1030 avenue de la Médecine, Université Laval, Quebec City, QC, Canada G1V 0A6
Telephone: 418-656-2131, ext. 6914, Fax: 418-656-7176
steve.charette@bcm.ulaval.ca

Supplementary Experimental Procedures

Phylogenetic analyses
To perform a robust core genome phylogeny, we wrote an in-house Perl script called CoreFinder.pl that relies on BioPerl modules [1] to find the genes that make up the core genome. The script uses coding sequences (CDSs) extracted from a GenBank file and sequentially performs tblastn [2] searches against fasta or multi-fasta (for draft genomes) files (Figure S1). We used A. hydrophila ATCC 7966T [3], which is the A. hydrophila type strain, as a reference. The genome of this strain has been well studied and has a high-quality annotation. The other aeromonads used in the present study are listed in Table S1. The CDSs were searched with a minimum query cover of 85% at various similarity thresholds (25% to 100%, in 5% steps). The graphical interpretation of the results revealed three linear sections and two breakpoints estimated at 40% and 80% similarity (Figure S2). To verify the importance of this parameter (i.e., the similarity percent) with respect to the final phylogeny, we performed all subsequent analyses (as indicated in the main manuscript) at 40% and 80% similarity.

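The filtering step described above can be sketched in a few lines of Python. This is only an illustration and not the CoreFinder.pl script: it assumes that the tblastn hits have already been parsed into records, and the `Hit` fields and function names are hypothetical. A reference CDS is kept in the core genome when every target genome has at least one hit that meets both the query cover and the similarity thresholds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One parsed tblastn hit (hypothetical record layout)."""
    cds_id: str         # CDS from the reference genome
    genome: str         # target genome that was searched
    query_cover: float  # percent of the query covered by the hit
    similarity: float   # percent similarity of the hit


def core_genome(cds_ids, genomes, hits, min_similarity, min_query_cover=85.0):
    """Return the reference CDSs found in every target genome."""
    found = {cds: set() for cds in cds_ids}
    for hit in hits:
        if (hit.cds_id in found
                and hit.query_cover >= min_query_cover
                and hit.similarity >= min_similarity):
            found[hit.cds_id].add(hit.genome)
    wanted = set(genomes)
    return [cds for cds in cds_ids if wanted <= found[cds]]


def threshold_scan(cds_ids, genomes, hits, start=25, stop=100, step=5):
    """Core genome size at each similarity threshold (25% to 100%, 5% steps)."""
    return {t: len(core_genome(cds_ids, genomes, hits, t))
            for t in range(start, stop + 1, step)}
```

Plotting the output of `threshold_scan` against the threshold would give a curve of the kind shown in Figure S2, from which breakpoints can be read.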
To choose the most appropriate phylogenetic model, the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC) were computed using jModelTest version 2.1.7 [4] for both matrixes. In both cases, the best-fit model was GTR+Γ, closely followed by GTR+I+Γ (Table S2), with no significant difference between the two. However, as discussed and reviewed elsewhere [5], modeling invariable sites as a separate class with a rate of zero is meaningless since the α parameter, which governs the shape of the gamma distribution, already allows for low-rate sites: when α < 1, the gamma distribution is L-shaped. Moreover, the mixture model +I+Γ might be over-parameterized since it would be difficult to optimize both parameters together. We thus used the GTR+Γ model for both matrixes (40% and 80% similarity).

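For readers unfamiliar with the two criteria, the sketch below uses their standard definitions, AIC = 2k + 2(-lnL) and BIC = k ln(n) + 2(-lnL), where k is the number of free parameters and n the sample size. Taking n as the number of alignment sites is an assumption here; the settings used in jModelTest are not stated in this document. The function names are illustrative only.

```python
import math


def aic(neg_log_likelihood, n_params):
    """Akaike Information Criterion from -lnL and the number of parameters."""
    return 2.0 * n_params + 2.0 * neg_log_likelihood


def bic(neg_log_likelihood, n_params, n_sites):
    """Bayesian Information Criterion; n_sites is the assumed sample size."""
    return n_params * math.log(n_sites) + 2.0 * neg_log_likelihood


def rank_models(scores):
    """Model names ordered from best (lowest score) to worst."""
    return sorted(scores, key=scores.get)


# AIC values reported in Table S2 for the matrix at 40% similarity.
aic_40 = {
    "GTR+Γ": 25655928,
    "GTR+I+Γ": 25656069,
    "HKY+Γ": 25665737,
    "HKY+I+Γ": 25665878,
    "SYM+Γ": 26010784,
}
best_model = rank_models(aic_40)[0]
```

With the Table S2 values, the lowest AIC belongs to GTR+Γ, in agreement with the model choice described above.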
Species relatedness was inferred by average nucleotide identity (ANI) analyses for key taxa using JSpecies version 1.2.1 [6]. The analyses were performed with MUMmer version 3.23 [7] since it provides more robust results than blast searches for genomes sharing a high level of similarity (ANI > 90%) [6]. Two taxa were considered to be members of the same species if they shared an ANI ≥ 96%, a value that is well adapted to the aeromonads [8].

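The species criterion is a simple threshold test, illustrated below with three pairwise ANI values taken from Figure S6. The function and variable names are illustrative and not part of JSpecies.

```python
ANI_SPECIES_THRESHOLD = 96.0  # percent, the cut-off stated above


def same_species(ani_percent, threshold=ANI_SPECIES_THRESHOLD):
    """True when two taxa share an ANI at or above the threshold."""
    return ani_percent >= threshold


# Pairwise ANI values (percent) reported in Figure S6.
ani_pairs = {
    ("A. bestiarum", "A. salmonicida CBA100"): 97.49,
    ("A. bestiarum", "A. salmonicida subsp. pectinolytica"): 90.54,
    ("A. salmonicida subsp. pectinolytica", "A. salmonicida Y577"): 97.52,
}

calls = {pair: same_species(value) for pair, value in ani_pairs.items()}
```

Under this criterion, CBA100 groups with A. bestiarum and Y577 groups with the pectinolytica subspecies, whereas A. bestiarum and the pectinolytica subspecies do not group together.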
Bacterial growth at 7°C
The Indian isolates (Y577, Y567 and Y47) as well as A. salmonicida subsp. pectinolytica (34melT), A. salmonicida subsp. smithia (JF4097), A. salmonicida subsp. masoucida (NBRC 13784T), and A. salmonicida subsp. salmonicida (01-B526) were inoculated on furunculosis agar or on tryptic soy agar (TSA) from frozen stocks and were grown at 18°C for 24 to 48 h. The isolates were then inoculated in 3 ml of lysogeny broth (LB) and incubated at 7°C overnight with shaking at 200 rpm. The turbidity was adjusted to an optical density of 0.1 at 595 nm (OD595), and the cultures were incubated at 7°C with shaking at 200 rpm. The OD595 was read every hour for 8 h. The experiment was performed in triplicate.

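Triplicate readings of this kind are typically summarized per time point as a mean with its standard error (as plotted in Figure S7). The sketch below shows that calculation with the Python standard library; the function name is illustrative, and the readings in the usage line are invented for the example, not measured data.

```python
import math
import statistics


def mean_and_sem(readings):
    """Mean and standard error of the mean for replicate OD readings."""
    if len(readings) < 2:
        raise ValueError("at least two replicates are needed for the SEM")
    mean = statistics.mean(readings)
    sem = statistics.stdev(readings) / math.sqrt(len(readings))
    return mean, sem


# Invented example: three replicate OD595 readings at one time point.
example_mean, example_sem = mean_and_sem([0.10, 0.20, 0.30])
```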
PCR assays
We performed PCR assays using previously published conditions [9] to verify whether the pAsa5 plasmid of strain RS 534 had lost its type three secretion system (TTSS) by the recombination of ISAS11B and ISAS11C [10].

Plasmid characterization
The contigs of the strains sequenced in the present study were locally mapped on the chromosome sequence of the A. salmonicida reference strain A449 [11], the only A. salmonicida strain with a fully assembled chromosome, using CONTIGuator version 2.7.4 [12]. The unmapped contig sequences were identified by blast searches against the NCBI nr/nt database. Sequence manipulations were performed using the bioinformatics package EMBOSS version 6.6.0.0 [13].

The plasmid sequences that were discovered were automatically annotated by the RAST webserver [14]. All the putative CDSs were manually curated by performing blastp searches against the NCBI nr/nt database. Putative toxin-antitoxin systems were identified using TAfinder [15].

The average copy number of each new plasmid per chromosome was calculated using the sequencing depth, a procedure that has been successfully used in the past [16]. We filtered the sequencing reads using Trimmomatic version 0.32 [17] with the parameters suggested in the manual. The filtered reads were mapped on the gyrB gene (single copy per chromosome) with CUSHAW3 version 3.0.3 [18] without allowing any mismatches, in order to avoid cross-mapping of reads from other genes. The reads were also mapped on the plasmid sequences. The average coverages were calculated using Qualimap version 2.1.1-dev [19].

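The depth-based estimate amounts to dividing the average coverage of a plasmid by the average coverage of a single-copy chromosomal marker (here gyrB). The sketch below assumes that simple ratio; the exact computation is not spelled out in the text, the function name is illustrative, and the coverage values in the usage line are invented.

```python
def plasmid_copy_number(plasmid_coverage, marker_coverage):
    """Estimate plasmid copies per chromosome from average coverages.

    marker_coverage is the average coverage of a single-copy
    chromosomal gene such as gyrB.
    """
    if marker_coverage <= 0:
        raise ValueError("marker coverage must be positive")
    return plasmid_coverage / marker_coverage


# Invented example: a plasmid covered 1440x and gyrB covered 60x.
copies = plasmid_copy_number(1440.0, 60.0)
```

Because the ratio is relative to the marker gene, any bias in the marker coverage carries over directly to the estimated copy number.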
Biochemical tests
Three Indian A. salmonicida isolates (Y47, Y567 and Y577) were further phenotypically characterized using a set of biochemical tests as described by Pavan et al. [20] and Abbott et al. [21]. All tests were performed in triplicate according to conventional protocols, with suitable positive and negative controls, and were incubated at 35°C for 48 to 72 h (unless otherwise mentioned). The tests for carbohydrate fermentation and extracellular enzymes were read daily for 7 days, whereas the tests for Voges-Proskauer, polypectate degradation, and production of brown diffusible pigment on tryptic soy agar (TSA) were incubated at 25°C for 2 to 4 days.

Supplementary results

Sequencing results
Despite the stable average coverage of the assemblies, the N50 values, which are an indicator of contig length, varied considerably (Table S3). For example, the Indian strain Y567 had an N50 value approximately two times higher than those of the other strains, while JF4097 (smithia) had the lowest N50 value and the smallest largest contig. Large repeated elements such as ISs, duplicated genes, and ribosomal RNA clusters cause contig breaks during de novo assembly [22], which suggests that the A. salmonicida subsp. smithia genome contains numerous large repeated elements.

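For reference, the N50 is the length of the contig at which the cumulative length of the contigs, sorted from longest to shortest, first reaches half of the total assembly size. The short sketch below implements this standard definition; it is not the code used by the assembly or evaluation tools, and the function name is illustrative.

```python
def n50(contig_lengths):
    """N50 of an assembly given the lengths of its contigs."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contigs were provided")
```

An assembly broken into many short contigs by repeated elements accumulates half of its length only after many contigs, which lowers the N50.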
Molecular phylogeny optimization
At 40% similarity (of the translated sequences), the core genome was estimated at 1645 genes compared to 1190 genes at 80%. The functional categories of the genes (at 40% and 80%) were determined using an in-house Perl script, as explained in the main manuscript, to verify whether one or more categories were enriched. The relative abundance of the functional categories differed markedly between 40% and 80% similarity in only two categories (J: translation, ribosomal structure and biogenesis, and K: transcription), indicating that the gains/losses were uniform in the other categories (Figure S3). The relative importance of the J category is higher at 80% than at 40%, which is consistent with translation being a conserved process. The high relative importance of the K category at 40% is in accordance with the capacity of various aeromonads to react to a wide diversity of stimuli.

The basic features of the phylogenetic analyses are presented in Table S4. The matrix at 80% similarity had 35% fewer sites than the matrix at 40% similarity. In both cases, the number of alignment patterns (i.e., the number of different patterns in the matrix) corresponded to approximately 60% of the total number of sites. There was no significant difference between the α parameters of the two phylogenetic analyses as estimated by RAxML, meaning that the additional sites at 40% similarity shared the same rate as the other sites.

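The two percentages quoted above follow directly from the site and pattern counts in Table S4. The sketch below simply redoes that arithmetic; the helper names are illustrative.

```python
# Counts reported in Table S4, keyed by similarity percent.
sites = {40: 696_249, 80: 454_574}
patterns = {40: 420_006, 80: 271_519}


def percent_fewer(larger, smaller):
    """How much smaller `smaller` is than `larger`, in percent."""
    return 100.0 * (larger - smaller) / larger


def percent_of(part, whole):
    """`part` expressed as a percentage of `whole`."""
    return 100.0 * part / whole


reduction = percent_fewer(sites[40], sites[80])
pattern_share = {s: percent_of(patterns[s], sites[s]) for s in sites}
```

Rounded to whole numbers, `reduction` is 35 and both entries of `pattern_share` are 60, matching the values given in the text.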
There were differences between the resulting trees in terms of bootstrap values and topology (Figures S4 and S5). The phylogenetic analysis at 80% similarity had the weakest bootstrap values (Figure S4), indicating that the additional sites obtained at 40% similarity are important for obtaining a more robust tree (Figure S5). The topology diverged for the clade containing Aeromonas veronii. This observation was understandable since this clade had a weak bootstrap value, even with the matrix at 40% similarity (Figure S5). Based on the bootstrap values, we thus believe that the tree based on the core genome found at 40% similarity more accurately represents the true evolutionary links between the taxa, which is why we used this phylogenetic tree for the remainder of the study.

Phylogenetic position of A. salmonicida
As mentioned in the main manuscript, the molecular phylogeny of the present paper revealed that A. salmonicida CBA100, a recently deposited Chilean strain [23], is phylogenetically closer to A. bestiarum than to A. salmonicida (Figure S5). To verify the relatedness of the CBA100 strain and A. bestiarum, the average nucleotide identity (ANI) values were computed for some key taxa (Figure S6). The fact that the ANI value between CBA100 and A. bestiarum is above 96% reinforces the close evolutionary link between the two taxa and suggests that CBA100 has been misclassified.

Strain Y577 shared a clade with A. salmonicida subsp. pectinolytica. Based strictly on the molecular phylogeny and the ANI values, we cannot rule out the possibility that Y577 is in fact A. salmonicida subsp. pectinolytica. As previously published, the pectinolytica subspecies is the only known aeromonad with pectinase activity [20]. Interestingly, all the genes in the pectinolytica subspecies that are needed to degrade and use pectin as a carbon source [24] were found in the genome of Y577. Given this, the pectinase activity of Y577 was tested and confirmed experimentally (Table S5). It is tempting to suggest that Y577 is a member of the pectinolytica subspecies or of a new subspecies sharing a recent common ancestor. However, the overall chromosomal organizations of Y577 and the pectinolytica subspecies strains appeared divergent (Figure 1, main manuscript), which is unusual for such closely related strains given the chromosomal uniformity of salmonicida subspecies strains. The results of some other biochemical tests also diverged (Table S5), suggesting that strains Y577 and A. salmonicida subsp. pectinolytica 34melT may not belong to the same subspecies.

Strains Y47 and Y567 formed a basal clade to the masoucida and salmonicida subspecies. However, as with Y577 and the pectinolytica 34melT strain, we cannot infer that Y47 and Y567 belong to the same subspecies based solely on the molecular phylogeny and the ANI values, especially since there were also macro-chromosomal differences between the two strains. If they belong to the same subspecies, this would indicate that they display significant genomic plasticity. Many of the biochemical test results also differed (Table S5), which points to a potential taxonomic difference. Surprisingly, Y47 and Y567 were also pectinolytic. While 34melT and Y577 bore genes coding for three lyases involved in the first step of pectin degradation, Y567 and Y47 did not. The genomes of strains Y567, Y47, and Y577 (as a positive control) were annotated using the RAST webserver [25] to verify whether they possessed a subsystem related to pectin degradation. The annotation of Y577 contained the three lyases (EC 4.2.2.2, EC 4.2.2.6 and EC 4.2.2.9) in the “D-galacturonate and D-glucuronate utilization” subsystem, while the annotations of Y47 and Y567 did not contain any enzymes involved in pectin degradation. The pectinase activities of Y47 and Y567 likely involve an unknown pathway and are potentially the result of convergent evolution (i.e., when compared to the strains pectinolytica 34melT and Y577). This result is interesting since it suggests that pectinolytic activity could be important for mesophilic A. salmonicida.

Bacterial growth at 7°C
The capacity of various A. salmonicida isolates to grow at 7°C was tested in addition to 18°C and 37°C (main manuscript). The same trend as at 18°C was observed, with the mesophilic strains growing more efficiently than the psychrophilic ones (Figure S7). The isolate JF4097 of the subspecies smithia was not able to grow at this temperature. This was expected given that this isolate had a weak growth capacity at 18°C (main manuscript).

Investigation of the plasmidome
The putative chromosomal contigs were removed and the remaining contigs were analyzed in order to investigate the plasmidome of the strains for which the DNA was sequenced in the present study. This resulted in the identification of three small cryptic plasmids in Indian strain Y47 (Figure S8). To our knowledge, these plasmids had not been described before. We named them pY47-1, pY47-2, and pY47-3 and deposited their sequences in GenBank under accession numbers KT334396, KT334397, and KT334398, respectively. No clear known functions were associated with these plasmids. All bore a putative type II toxin-antitoxin maintenance system and/or a phage resistance mechanism [26]. The plasmids pY47-2 and pY47-3 are ColE2-type replicon plasmids with a short RNA (RNA I) replication regulator [27]. Interestingly, a blastn search (word size of 11) revealed sequence identity and structural similarity between pY47-3 and the ColE2-type replicon plasmids pAQ2-1 and pAQ2-2 in Aeromonas sobria and Aeromonas hydrophila, respectively [28]. However, unlike these plasmids, pY47-3 did not bear the qnrS2 quinolone resistance gene.

The high sequencing depth provided by the Illumina technology was used to infer the average copy number per chromosome of each plasmid. As has been reported elsewhere [29], ColE2-type replicon plasmids are maintained at high copy numbers per cell (24 copies for pY47-2 and 13 copies for pY47-3). The plasmid pY47-1, for which the incompatibility group is unknown, also had a high copy number (22 copies). It is important to mention that inferring relative plasmid copy numbers has an inherent bias since it is assumed that there is a single copy of the chromosome in each cell, which is not true after replication. However, the results showed that these plasmids were maintained at much higher copy numbers than the bacterial chromosome. No plasmids were found in strain Y567, while Y577 harbored a pY47-3 plasmid that shared more than 99% identity with the one in Y47 (7 point mutations).

A plasmid maintained at a high copy number (~40 copies/cell) was also found in A. salmonicida subsp. smithia JF4097. It was named pJF4097, and its sequence was deposited in GenBank under accession number KT334395 (Figure S9). pJF4097 bears the mobABCD genes, which are related to mobilization, an ISAS11, and a gene encoding an ExoY-like protein. ExoY is a type-three secretion system (TTSS) effector in the human pathogen Pseudomonas aeruginosa [30].

The A. salmonicida subsp. salmonicida RS 534 strain harbors the same five plasmids as the A449 reference strain [11]: the large plasmid pAsa4, which encodes many drug resistance genes; pAsa5, which normally bears the type-three secretion system; and the cryptic plasmids pAsa1, pAsa2, and pAsa3 [31]. Basic bioinformatics analyses showed that the pAsa5 plasmid of the RS 534 strain has lost its TTSS. It is known that this region is bordered by two ISAS11s (B and C) and that growth above 25°C may result in the recombination of the two ISAS11s and the loss of the TTSS [9,10]. We confirmed by PCR that the TTSS was lost by a recombination of ISAS11B and C (Figure S10).

Pan-genome analysis
We used an in-house Perl script, as indicated in the main manuscript, to find the pan-genome of A. salmonicida. The resulting binary matrix (i.e., presence/absence) was used to map the characters (i.e., the genes) on a phylogenetic tree based on the core genome (Figure S11A). This analysis made it possible to determine which genes were acquired or lost during evolution and, consequently, may have played a role in the adaptation of a given strain. As indicated in the main manuscript, three functional categories (K, N and X) at branch 1 experienced many events (i.e., gains and losses) (Figure S11B). The L, R, T and U categories have also acquired and lost many genes, but this can more likely be attributed to general evolution rather than to mesophilic-to-psychrophilic evolution. In the case of branch 2 (Figure S11C), the three functional categories exhibiting the most important changes are energy production and conversion (C) (only losses for this category), carbohydrate transport and metabolism (G), and replication, recombination, and repair (L). Interestingly, only gains were detected for the category related to the mobilome (X).

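To make the notion of a binary presence/absence matrix concrete, the sketch below builds one from per-strain gene sets and splits the genes into those present in every strain and the rest. It is only an illustration with hypothetical function names and invented gene and strain labels; it is not the in-house Perl script, and it does not cover the mapping of gains and losses onto the tree.

```python
def presence_absence(gene_sets):
    """Build a gene-by-strain binary matrix from {strain: set_of_genes}."""
    strains = sorted(gene_sets)
    genes = sorted(set().union(*gene_sets.values()))
    matrix = {
        gene: [1 if gene in gene_sets[strain] else 0 for strain in strains]
        for gene in genes
    }
    return strains, matrix


def split_core_accessory(matrix):
    """Genes present in every strain versus genes missing from at least one."""
    core = [gene for gene, row in matrix.items() if all(row)]
    accessory = [gene for gene, row in matrix.items() if not all(row)]
    return core, accessory


# Invented example with three strains and four genes.
example_sets = {
    "strain1": {"geneA", "geneB", "geneC"},
    "strain2": {"geneA", "geneC"},
    "strain3": {"geneA", "geneC", "geneD"},
}
strains, matrix = presence_absence(example_sets)
core, accessory = split_core_accessory(matrix)
```

Each row of `matrix` is one character (gene) whose 0/1 states across the strains can then be reconstructed along the branches of a tree.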
Unfortunately, it was impossible to assign a cluster of orthologous groups (COG) to 45.4% and 59.1% of the genes for branches 1 and 2, respectively, and, consequently, to infer their functional categories. This highlights a drawback of bioinformatics analyses: their dependence on incomplete and poorly curated databases.

Functional categories of the genes under positive selection in the mesophilic lineages
A total of 322 genes appear to be under positive selection in various lineages of the salmonicida species, including 241 that were specific to at least one mesophilic lineage. We used a COG assignment of these 241 genes to find their functional categories (Figure S12). Many categories in the mesophilic lineages were under positive selection, indicating that these lineages may have a high evolutionary potential.

Table S1. Aeromonads used in the study.

| Species | Strain | Accession no. | Reference |
|---|---|---|---|
| A. allosaccharophila | CECT 4199T | CDBR00000000 | [8] |
| A. allosaccharophila | BVH88 | CDCB00000000 | [8] |
| A. australiensis | CECT 8023T | CDDH00000000 | [8] |
| A. bestiarum | CECT 4227T | CDDA00000000 | [8] |
| A. bivalvium | CECT 7113T | CDBT00000000 | [8] |
| A. caviae | CECT 838T | CDBK00000000 | [8] |
| A. dhakensis | CIP 107500 | CDBH00000000 | [8] |
| A. diversa | CECT 4254T | CDCE00000000 | [8] |
| A. encheleia | CECT 4342T | CDDI00000000 | [8] |
| A. enteropelogenes | CECT 4487T | CDCG00000000 | [8] |
| A. eucrenophila | CECT 4224T | CDDF00000000 | [8] |
| A. fluvialis | LMG 24681T | CDBO00000000 | [8] |
| A. hydrophila (a) | ATCC 7966T | CP000462 | [3] |
| A. jandaei | CECT 4228T | CDBV00000000 | [8] |
| A. media | CECT 4232T | CDBZ00000000 | [8] |
| A. molluscorum | 848T | AQGQ00000000 | [32] |
| A. piscicola | LMG 24783T | CDBL00000000 | [8] |
| A. popoffii | CIP 105493T | CDBI00000000 | [8] |
| A. rivuli | DSM 22539T | CDBJ00000000 | [8] |
| A. salmonicida subsp. salmonicida | A449 | CP000644 | [11] |
| A. salmonicida subsp. salmonicida | 01-B526 | AGVO01000000 | [33] |
| A. salmonicida subsp. salmonicida | RS 534 | JYFF00000000 | This study |
| A. salmonicida subsp. salmonicida | JF3224 | JXTA00000000 | [9] |
| A. salmonicida subsp. salmonicida | CIP 103209 | CDDW00000000 | [8] |
| A. salmonicida subsp. salmonicida | 2009-144K3 | JRYV00000000 | [34] |
| A. salmonicida subsp. salmonicida | 2004-05MF26 | JRYW00000000 | [34] |
| A. salmonicida | CBA100 | JPWL00000000 | [23] |
| A. salmonicida subsp. achromogenes | AS03 | AMQG00000000 | [35] |
| A. salmonicida subsp. smithia | JF4097 | JZTI00000000 | This study |
| A. salmonicida subsp. pectinolytica | 34melT | ARYZ00000000 | [36] |
| A. salmonicida subsp. masoucida | NBRC 13784T | BAWQ00000000 | N/A (b) |
| A. salmonicida | Y47 | JZTF00000000 | This study |
| A. salmonicida | Y567 | JZTG00000000 | This study |
| A. salmonicida | Y577 | JZTH00000000 | This study |
| A. sanarellii | LMG 24682T | CDBN00000000 | [8] |
| A. schubertii | CECT 4240T | CDDB00000000 | [8] |
| A. simiae | CIP 107798T | CDBY00000000 | [8] |
| A. sobria | CECT 4245T | CDBW00000000 | [8] |
| A. species | AH4 | ERX552948 (c) | [8] |
| A. species | AMC34 | AGWU00000000 | N/A |
| A. taiwanensis | LMG 24683T | CDDD00000000 | [8] |
| A. tecta | CECT 7082T | CDCA00000000 | [8] |
| A. veronii | CECT 4257T | CDDK00000000 | [8] |

a: This strain was used as a model to find the genes involved in the core genome.
b: N/A means that no publication is associated with the sequence.
c: Only the sequencing reads were available via the SRA database for A. species AH4. The reads were de novo assembled as indicated in the “Methods” section of the main manuscript.

Table S2. The five best models and their -lnL, AIC, and BIC values.

| Model | -lnL (40%) | AIC (40%) | BIC (40%) | -lnL (80%) | AIC (80%) | BIC (80%) |
|---|---|---|---|---|---|---|
| GTR+Γ | 12827871 | 25655928 | 25656993 | 7941102 | 15882391 | 15883417 |
| GTR+I+Γ | 12827940 | 25656069 | 25657146 | 7941148 | 15882484 | 15883521 |
| HKY+Γ | 12832779 | 25665737 | 25666756 | 7944260 | 15888698 | 15889679 |
| HKY+I+Γ | 12832849 | 25665878 | 25666909 | 7944305 | 15888791 | 15889783 |
| SYM+Γ | 13005302 | 26010784 | 26011815 | 8071711 | 16143602 | 16144595 |

Table S3. Assembly results.

| Strain | Contigs | Largest contig (kbp) | N50 (kbp) | Average coverage | Assembly size (Mbp) | A449 fraction (%) (a) |
|---|---|---|---|---|---|---|
| Y47 | 118 | 395.27 | 117.77 | 66.92 | 4.710233 | 85.077 |
| Y567 | 47 | 448.10 | 217.34 | 68.03 | 4.554847 | 85.607 |
| Y577 | 104 | 383.04 | 101.76 | 78.53 | 4.736410 | 83.731 |
| JF4097 | 344 | 109.61 | 28.95 | 88.72 | 4.307768 | 84.059 |
| RS 534 | 123 | 382.43 | 119.02 | 62.59 | 4.889640 | 97.599 |

a: The chromosome sequence of the strain A449 (A. salmonicida subsp. salmonicida) [11] was used. This feature was found using QUAST version 3.1 [37].

Table S4. Phylogenetic features.

| Similarity percent | 40% | 80% |
|---|---|---|
| Genes | 1645 | 1190 |
| Sites | 696,249 | 454,574 |
| Alignment patterns | 420,006 | 271,519 |
| Best model | GTR+Γ | GTR+Γ |
| α parameter | 1.703415 | 1.686099 |

Table S5. Biochemical tests used for the mesophilic A. salmonicida strains.

Strains tested: 34melT (a), Y577, Y47, Y567.

| Biochemical test | Number of "+" results recorded (of four strains) |
|---|---|
| Indole (35°C) | 4 |
| ONPG | 4 |
| VP (25°C) | 4 |
| Simmons citrate | 4 |
| Esculin hydrolysis | 2 |
| Polypectate degradation (25°C) | 4 |
| Motility (35°C) | 3 |
| Brown pigment (25°C) | 1 |
| Growth (37°C) | 4 |
| Dnase | 4 |
| Lipase | 4 |
| Gelatinase | 4 |
| H2S | 0 |
| VP (35°C) | 0 |
| ODC | 0 |
| LDC | 3 |
| ADH | 3 |
| Urease | 0 |
| Cellobiose | 3 |
| Salicin | 2 |
| Sorbitol | 4 |
| Rhamnose | 0 |
| Mannitol | 4 |
| Sucrose | 4 |
| Glucose (gas) | 4 |
| L-Arabinose | 4 |
| Lactose | 4 |
| Glycose | 4 |
| Inositol | 0 |
| Melibiose | 0 |
| Glu | 4 |
| Amygdalin | 2 |
| Hemolysin (sheep, horse) (a) | 4 |

a: These results (strain 34melT) are from [20]. We used horse blood agar to assess hemolysis, whereas Pavan et al. (2000) [20] used sheep blood agar plates.

[Figure S1: flowchart. Elements: initial GenBank file (Genome 1); CDS extraction; translated sequences; sequential tblastn searches against the sequence(s) of Genome 2 through Genome n, each producing a list of CDSs; core genome; list of CDSs with known biological function.]
Figure S1. Conceptual schematization of the in-house CoreFinder.pl Perl script.

[Figure S2: plot. x-axis: similarity (%), from 100 down to 25; y-axis: core genome (genes), from 0 to 1500.]
Figure S2. Number of genes involved in the core genome based on the similarity percent used with the CoreFinder.pl script. The blue dots at 40% and 80% indicate the similarity percent used to perform the optimization analyses.

Figure S3. Relative abundance of 26 functional categories for genes used to construct the phylogenetic matrixes at 40% and 80% similarity.

Figure S4. (A) Molecular core genome phylogeny of 43 aeromonads inferred from the sequences of 1190 genes (determined using 80% similarity) by maximum likelihood using the GTR+Γ model and a rapid bootstrap analysis with 1000 replicates. Only bootstrap values under 100 are shown. For clarity, the bootstrap values have been removed for the taxa of the salmonicida species. The mesophilic strains are in red while the psychrophilic strains are in blue. (B) Zoom on the salmonicida species with equal branch lengths. Only bootstrap values under 100 are shown. The mesophilic, intermediate, and psychrophilic strains are shown in red, purple, and blue, respectively.

[Figure S5: unrooted phylogenetic tree of the aeromonads; scale bar: 0.08.]
Figure S5. Molecular phylogeny of 43 aeromonads inferred from 1645 core genes by maximum likelihood using the GTR+Γ model. Only bootstrap values under 100 are shown in this figure. All the bootstrap values for the salmonicida subspecies are given in Figure 1 (main article) for clarity. The red branches correspond to mesophilic taxa, the purple branch corresponds to the intermediate taxon, and the blue branches correspond to psychrophilic taxa. The strain numbers are shown only when there are two taxa from the same species or subspecies.

| Taxon | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| 1. A. bestiarum | --- | | | | | | | | | |
| 2. A. s CBA100 | 97.49 | --- | | | | | | | | |
| 3. A. s subsp. pectinolytica | 90.54 | 90.62 | --- | | | | | | | |
| 4. A. s Y577 | 90.51 | 90.63 | 97.52 | --- | | | | | | |
| 5. A. s Y567 | 90.69 | 90.83 | 97.14 | 97.11 | --- | | | | | |
| 6. A. s Y47 | 90.63 | 90.78 | 97.11 | 97.13 | 97.64 | --- | | | | |
| 7. A. s subsp. masoucida | 90.52 | 90.59 | 97.08 | 97.08 | 97.59 | 97.47 | --- | | | |
| 8. A. s subsp. smithia | 90.56 | 90.71 | 97.08 | 97.09 | 97.58 | 97.47 | 99.70 | --- | | |
| 9. A. s subsp. achromogenes | 90.54 | 90.75 | 97.11 | 97.11 | 97.59 | 97.50 | 99.68 | 99.62 | --- | |
| 10. A. s subsp. salmonicida | 90.52 | 90.75 | 97.06 | 97.06 | 97.55 | 97.46 | 99.75 | 99.69 | 99.64 | --- |

Figure S6. Average nucleotide identity (ANI) analyses (values in %) for some A. salmonicida subspecies included in this study. A. bestiarum is also included for comparative purposes with A. salmonicida CBA100. Two taxa were considered as belonging to the same species if they shared an ANI value ≥ 96% (yellow and green). The column numbers correspond to the numbered taxa in the rows.

[Figure S7: growth curves. x-axis: time (h), 0 to 8; y-axis: O.D. (595 nm), 0.0 to 0.6. Curves for A. salmonicida pectinolytica, Y577, Y567, Y47, masoucida, and 01-B526.]
Figure S7. Growth curves at 7°C for selected A. salmonicida subspecies. The growth curves were determined three times in independent experiments. The means of three replicates with the standard error of the mean are shown for each subspecies.

[Figure S8: circular maps of pY47-1 (12 495 bp), pY47-2 (6 042 bp), and pY47-3 (5 104 bp). Labeled features include repA, repB, relB, relE, parA, parB, parD, parE, ccdA, ccdB, mob, mobA, mobB, mobC, mobD, RNA I, the origin of replication, and genes encoding a chemotaxis protein, an acyltransferase, and a GTP-binding protein.]
Figure S8. The three high-copy plasmids found in the Indian strain Y47. The pY47-3 plasmid was also found in the Indian strain Y577. The blue arrows represent genes with a known function, the green arrows represent genes encoding hypothetical proteins, and the black arrow represents the putative RNA regulator.

[Figure S9: circular map of pJF4097 (6 231 bp). Labeled features: mobA, mobB, mobC, mobD, tnp, ISAS11, RNA I, RNA II, and exoY-like.]
Figure S9. The high-copy plasmid pJF4097 found in A. salmonicida subsp. smithia. The blue arrows represent genes with a known function, the green arrows represent genes encoding hypothetical proteins, the black arrows represent the putative RNA regulators, and the grey rectangle represents the ISAS11.

[Figure S10: agarose gel with wells 1 to 4.]
Figure S10. Result of the PCR assay confirming that the RS 534 strain lost its TTSS by the recombination of two ISAS11s (B-C rearrangement [10]). The wells are as follows: (1) 2-log DNA ladder (New England Biolabs), (2) RS 534, (3) JF3224 (positive control), and (4) 01-B526 (negative control).

1700
1600
1500
148/547
pectinolytica
salmonicida
achromogenes
masoucida
smithia
A
370/348
1400
1300
1200
572/279
489/412
1100
514/335
Y47
Y567
Y577
1000
900
25/326
800
700
212/149
70/45
117/156
217/235
1
300
100
66/34
0
popoffii (outgroup)
B
Gain
0.09
Relative importance
500
200
46/119
Loss
1
0.06
0.03
0.00
A B C D E F G H I
J K L M N O P Q R S T U V W X Y Z
Functional category
2
Gain
Loss
0.10
Relative Importance
600
400
201/235
44/262
C
2
A
B
C
D
E
F
G
H
I
J
K
L
M
N
O
P
Q
R
S
T
U
V
W
X
Y
Z
RNA processing and modification
Chromatin structure and dynamics
Energy production and conversion
Cell cycle control, cell division, chromosome partitioning
Amino acid transport and metabolism
Nucleotide transport and metabolism
Carbohydrate transport and metabolism
Coenzyme transport and metabolism
Lipid transport and metabolism
Translation, ribosomal structure and biogenesis
Transcription
Replication, recombination and repair
Cell wall/membrane/envelope biogenesis
Cell motility
Posttranslational modification, protein turnover, chaperones
Inorganic ion transport and metabolism
Secondary metabolites biosynthesis, transport and catabolism
General function prediction only
Function unknown
Signal transduction mechanisms
Intracellular trafficking, secretion, and vesicular transport
Defense mechanisms
Extracellular structures
Mobilome: prophages, transposons
Nuclear structure
Cytoskeleton
0.05
0.00
340
A B C D E F G H I
J K L M N O P Q R S T U V W X Y Z
Functional category
28
341
342
343
344
345
346
347
Figure S11. Pan-genome analysis of selected A. salmonicida subspecies, including A. popoffii as
an outgroup. (A) Distribution of the pan-genome on a phylogenetic tree for some key taxa. The
phylogenetic tree was based on the tree found using the core genome. The green and black values
indicate the number of genes acquired and lost, respectively, for the specific branch using the
parsimonious Dollo model. The branch lengths represent the number of genes acquired or lost.
For A. salmonicida subsp. salmonicida the strain used was 01-B526. Relative importance of 26
functional categories for the genes implicated in branches 1 (B) and 2 (C).
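As an illustration of how per-branch gain and loss counts can be obtained under Dollo parsimony, the following is a minimal sketch and not the software used for Figure S11. It assumes that each gene is gained exactly once, on the branch leading to the last common ancestor of all taxa carrying it, and is lost on every branch that leaves the set of nodes connecting that ancestor to the carrying taxa. The function name, the child-to-parent tree encoding, and the toy data are hypothetical.

```python
from collections import defaultdict


def dollo_gain_loss(parent, presence):
    """Count gene gains and losses per branch under Dollo parsimony.

    parent   -- dict mapping each node to its parent (the root maps to None);
                hypothetical encoding chosen for this sketch
    presence -- dict mapping each gene to the set of leaves that carry it
    Returns (gains, losses), each a dict node -> count for the branch
    leading to that node.
    """
    children = defaultdict(list)
    for node, par in parent.items():
        if par is not None:
            children[par].append(node)

    def path_to_root(node):
        path = [node]
        while parent[path[-1]] is not None:
            path.append(parent[path[-1]])
        return path  # ordered from the node up to the root

    gains, losses = defaultdict(int), defaultdict(int)
    for gene, leaves in presence.items():
        if not leaves:
            continue
        paths = [path_to_root(leaf) for leaf in leaves]
        # Nodes shared by every leaf-to-root path; the deepest one is the
        # last common ancestor, where the single gain is placed.
        common = set(paths[0]).intersection(*paths[1:])
        origin = next(node for node in paths[0] if node in common)
        gains[origin] += 1
        # Nodes that must carry the gene: everything between the origin
        # and a carrying leaf.
        carrying = set()
        for path in paths:
            carrying.update(path[:path.index(origin) + 1])
        # A loss occurs on each branch leaving the carrying set.
        for node in carrying:
            for child in children[node]:
                if child not in carrying:
                    losses[child] += 1
    return dict(gains), dict(losses)


# Toy tree: root -> (X, out); X -> (a, b)
toy_parent = {"root": None, "X": "root", "out": "root", "a": "X", "b": "X"}
toy_presence = {"g1": {"a", "b"}, "g2": {"a", "out"}, "g3": {"a"}}
toy_gains, toy_losses = dollo_gain_loss(toy_parent, toy_presence)
```

In the toy example, gene g2 is present in leaf a and in the outgroup, so it is placed at the root and counted as lost on the branch leading to b.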
[Figure S12 image. Bar chart; x-axis: functional category (A to Z); y-axis: number of genes.]

Functional categories:
A: RNA processing and modification
B: Chromatin structure and dynamics
C: Energy production and conversion
D: Cell cycle control, cell division, chromosome partitioning
E: Amino acid transport and metabolism
F: Nucleotide transport and metabolism
G: Carbohydrate transport and metabolism
H: Coenzyme transport and metabolism
I: Lipid transport and metabolism
J: Translation, ribosomal structure and biogenesis
K: Transcription
L: Replication, recombination and repair
M: Cell wall/membrane/envelope biogenesis
N: Cell motility
O: Posttranslational modification, protein turnover, chaperones
P: Inorganic ion transport and metabolism
Q: Secondary metabolites biosynthesis, transport and catabolism
R: General function prediction only
S: Function unknown
T: Signal transduction mechanisms
U: Intracellular trafficking, secretion, and vesicular transport
V: Defense mechanisms
W: Extracellular structures
X: Mobilome: prophages, transposons
Y: Nuclear structure
Z: Cytoskeleton
Figure S12. Functional categories of the genes under positive selection in the A. salmonicida
mesophilic lineages.
References
1. Stajich JE, Block D, Boulez K, Brenner SE, Chervitz SA, Dagdigian C, et al. The Bioperl
toolkit: Perl modules for the life sciences. Genome Res. 2002;12:1611–8.
2. Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, et al. Gapped BLAST
and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res.
1997;25:3389–402.
3. Seshadri R, Joseph SW, Chopra AK, Sha J, Shaw J, Graf J, et al. Genome sequence of
Aeromonas hydrophila ATCC 7966T: Jack of all trades. J. Bacteriol. 2006;188:8272–82.
4. Darriba D, Taboada GL, Doallo R, Posada D. jModelTest 2: more models, new heuristics and
parallel computing. Nat. Methods. 2012;9:772–772.
5. Jia F, Lo N, Ho SYW. The impact of modelling rate heterogeneity among sites on
phylogenetic estimates of intraspecific evolutionary rates and timescales. PLoS One.
2014;9:e95722.
6. Richter M, Rosselló-Móra R. Shifting the genomic gold standard for the prokaryotic species
definition. Proc. Natl. Acad. Sci. U. S. A. 2009;106:19126–31.
7. Kurtz S, Phillippy A, Delcher AL, Smoot M, Shumway M, Antonescu C, et al. Versatile and
open software for comparing large genomes. Genome Biol. 2004;5:R12.
8. Colston SM, Fullmer MS, Beka L, Lamy B, Gogarten JP. Bioinformatic Genome Comparisons
for Taxonomic and Phylogenetic Assignments Using Aeromonas as a Test Case. MBio.
2014;5:1–13.
9. Emond-Rheault J-G, Vincent AT, Trudel M V, Frey J, Frenette M, Charette SJ. AsaGEI2b: a
new variant of a genomic island identified in the Aeromonas salmonicida subsp. salmonicida
JF3224 strain isolated from a wild fish in Switzerland. FEMS Microbiol. Lett. 2015;362:fnv093.
10. Tanaka KH, Dallaire-Dufresne S, Daher RK, Frenette M, Charette SJ. An Insertion
Sequence-Dependent Plasmid Rearrangement in Aeromonas salmonicida Causes the Loss of the
Type Three Secretion System. PLoS One. 2012;7:e33725.
11. Reith ME, Singh RK, Curtis B, Boyd JM, Bouevitch A, Kimball J, et al. The genome of
Aeromonas salmonicida subsp. salmonicida A449: insights into the evolution of a fish pathogen.
BMC Genomics. 2008;9:427.
12. Galardini M, Biondi EG, Bazzicalupo M, Mengoni A. CONTIGuator: a bacterial genomes
finishing tool for structural insights on draft genomes. Source Code Biol. Med. 2011;6:11.
13. Rice P, Longden I, Bleasby A. EMBOSS: The European Molecular Biology Open Software
Suite. Trends Genet. 2000;16:276–7.
14. Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA, et al. The RAST Server:
rapid annotations using subsystems technology. BMC Genomics. 2008;9:75.
15. Shao Y, Harrison EM, Bi D, Tai C, He X, Ou HY, et al. TADB: A web-based resource for
Type 2 toxin-antitoxin loci in bacteria and archaea. Nucleic Acids Res. 2011;39:D606–11.
16. Rasko DA, Rosovitz MJ, Økstad OA, Fouts DE, Jiang L, Cer RZ, et al. Complete sequence
analysis of novel plasmids from emetic and periodontal Bacillus cereus isolates reveals a
common evolutionary history among the B. cereus-group plasmids, including Bacillus anthracis
pXO1. J. Bacteriol. 2007;189:52–64.
17. Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence
data. Bioinformatics. 2014;30:2114–20.
18. Liu Y, Popp B, Schmidt B. CUSHAW3: sensitive and accurate base-space and color-space
short-read alignment with hybrid seeding. PLoS One. 2014;9:e86869.
19. García-Alcalde F, Okonechnikov K, Carbonell J, Cruz LM, Götz S, Tarazona S, et al.
Qualimap: Evaluating next-generation sequencing alignment data. Bioinformatics.
2012;28:2678–9.
20. Pavan ME, Abbott SL, Zorzópulos J, Janda JM. Aeromonas salmonicida subsp. pectinolytica
subsp. nov., a new pectinase-positive subspecies isolated from a heavily polluted river. Int. J.
Syst. Evol. Microbiol. 2000;50:1119–24.
21. Abbott SL, Cheung WKW, Janda JM. The genus Aeromonas: Biochemical characteristics,
atypical reactions, and phenotypic identification schemes. J. Clin. Microbiol. 2003;41:2348–57.
22. Vincent AT, Boyle B, Derome N, Charette SJ. Improvement in the DNA sequencing of
genomes bearing long repeated elements. J. Microbiol. Methods. 2014;107:186–8.
23. Valdes N, Espinoza C, Sanhueza L, Gonzalez A, Corsini G, Tello M. Draft Genome
Sequence of the Chilean isolate Aeromonas salmonicida strain CBA100. FEMS Microbiol. Lett.
2015;362:fnu062.
24. Pavan ME, Pavan EE, López NI, Levin L, Pettinari MJ. Living in an extremely polluted
environment: clues from the genome of melanin-producing Aeromonas salmonicida subsp.
pectinolytica 34melT. Appl. Environ. Microbiol. 2015;81:5235–48.
25. Overbeek R, Olson R, Pusch GD, Olsen GJ, Davis JJ, Disz T, et al. The SEED and the Rapid
Annotation of microbial genomes using Subsystems Technology (RAST). Nucleic Acids Res.
2014;42:D206–14.
26. Samson JE, Magadán AH, Sabri M, Moineau S. Revenge of the phages: defeating bacterial
defences. Nat. Rev. Microbiol. 2013;11:675–87.
27. Sugiyama T, Itoh T. Control of ColE2 DNA replication: in vitro binding of the antisense
RNA to the Rep mRNA. Nucleic Acids Res. 1993;21:5972–7.
28. Han JE, Kim JH, Choresca JH, Shin SP, Jun JW, Chai JY, et al. First description of
ColE-type plasmid in Aeromonas spp. carrying quinolone resistance (qnrS2) gene. Lett. Appl.
Microbiol. 2012;55:290–4.
29. Horii T, Itoh T. Replication of ColE2 and ColE3 plasmids: The regions sufficient for
autonomous replication. Mol. Gen. Genet. MGG. 1988;212:225–31.
30. Yahr TL, Vallis AJ, Hancock MK, Barbieri JT, Frank DW. ExoY, an adenylate cyclase
secreted by the Pseudomonas aeruginosa type III system. Proc. Natl. Acad. Sci. U. S. A.
1998;95:13899–904.
31. Boyd J, Williams J, Curtis B, Kozera C, Singh R, Reith M. Three small, cryptic plasmids
from Aeromonas salmonicida subsp. salmonicida A449. Plasmid. 2003;50:131–44.
32. Spataro N, Farfán M, Albarral V, Sanglas A, Lorén JG, Fusté MC, et al. Draft Genome
Sequence of Aeromonas molluscorum Strain 848TT, Isolated from Bivalve Molluscs. Genome
Announc. 2013;1:e00382–13.
33. Charette SJ, Brochu F, Boyle B, Filion G, Tanaka KH, Derome N. Draft genome sequence of
the virulent strain 01-B526 of the fish pathogen Aeromonas salmonicida. J. Bacteriol.
2012;194:722–3.
34. Vincent AT, Tanaka KH, Trudel M V, Frenette M, Derome N, Charette SJ. Draft genome
sequences of two Aeromonas salmonicida subsp. salmonicida isolates harboring plasmids
conferring antibiotic resistance. FEMS Microbiol. Lett. 2015;362:1–4.
35. Han JE, Kim JH, Shin SP, Jun JW, Chai JY, Park SC. Draft Genome Sequence of Aeromonas
salmonicida subsp. achromogenes AS03, an Atypical Strain Isolated from Crucian Carp
(Carassius carassius) in the Republic of Korea. Genome Announc. 2013;1:e00791–13.
36. Pavan ME, Pavan EE, López NI, Levin L, Pettinari MJ. Genome Sequence of the
Melanin-Producing Extremophile Aeromonas salmonicida subsp. pectinolytica Strain 34melT.
Genome Announc. 2013;1:e00675–13.
37. Gurevich A, Saveliev V, Vyahhi N, Tesler G. QUAST: quality assessment tool for genome
assemblies. Bioinformatics. 2013;29:1072–5.