Supplementary Information (docx 48K)

Multi-omics analysis of niche specificity provides new insights into ecological adaptation in bacteria

Bo Zhu1*, Muhammad Ibrahim1*, Zhouqi Cui1, Guanlin Xie1, Gulei Jin2, Michael Kube3, Bin Li1$, Xueping Zhou1$

1 State Key Laboratory of Rice Biology, Institute of Biotechnology, Zhejiang University, Hangzhou 310029, China
2 Hangzhou Guhe Info Co., Ltd, Hangzhou 310029, China
3 Albrecht Daniel Thaer-Institute of Agricultural and Horticultural Sciences, Humboldt-Universität zu Berlin, 14195 Berlin, Germany

Running title: Ecological adaptation in B. seminalis

*These authors contributed equally to this work.

$Corresponding authors: Bin Li, Xueping Zhou
Mailing address: State Key Laboratory of Rice Biology, Institute of Biotechnology, Zhejiang University, 310058, Hangzhou, China.
Phone: 86-571-88982412. Fax: 86-571-88982412.
libin0571@zju.edu.cn; zzhou@zju.edu.cn

Conflict of Interest Statement
The authors declare no conflict of interest.

Materials and Methods
Strains used in this study
Burkholderia seminalis strains DSM 23518 (= LMG 24067T), 0901, S9 and R456 originated from the sputum of a cystic fibrosis (CF) patient (Vanlaere et al 2008), diseased apricot (Fang et al 2009), West Lake water (Fang et al 2011) and rice rhizosphere soil (Li et al 2011), respectively. Unless otherwise specified, bacterial cultures were maintained on nutrient agar (NA) or in nutrient broth (NB) at 30°C prior to use. For long-term storage, cultures were kept in 20% aqueous glycerol at -80°C.

Characterization of ecological roles
B. seminalis strains were tested for virulence in the alfalfa model (Bernier et al 2003) as described by Ibrahim et al (2012). Pathogenicity of B. seminalis to apricot was examined according to the method of Fang et al (2009), except that premature fruits were inoculated with 10 μL of bacterial suspension at a concentration of 1 × 10^5 CFU/mL using sterilized tips. Inhibition of the mycelial growth of Rhizoctonia solani by B. seminalis was determined according to the method of Li et al (2011). The morphology of bacterial cells was observed using a JEOL JSM-6400 scanning electron microscope (JEOL, Tokyo, Japan).

Growth in various niches
Adaptation of the B. seminalis strains to various niches was investigated by incubating the four strains in CF, water and soil extract media. The plant condition was excluded because only strain 0901 was pathogenic to apricot. Water medium, consisting of M9 minimal salts with 3% glycerol, was used to simulate the water environment (Schell et al 2011). CF medium was prepared to mimic the sputum of CF patients according to the method of Dinesh (2010). Soil extract medium was prepared to mimic soil conditions as described by Yoder-Himes et al (2009), except that the soil was collected from the rice rhizosphere, the original niche of strain R456. In addition, three different niche conditions were tested for each of strains S9, DSM 23518 and R456. Bacterial numbers were estimated from OD600 measurements (Ibrahim et al 2012).

Whole genome sequencing, assembly and annotation
Bacterial genomic DNA was isolated using the Wizard Genomic DNA Purification Kit (Promega, Madison, WI, USA) and used for whole-genome sequencing on three platforms: PacBio (Pacific Biosciences, Menlo Park, CA, USA), 454 (Roche, Branford, CT, USA) and Illumina (Illumina, San Diego, CA, USA). Four single-molecule real-time (SMRT) cells were run on the PacBio RS II sequencer with a 120-minute movie time per SMRT cell. SMRT Analysis Portal version 2.1 was used for read filtering and adapter trimming with default parameters, and post-filtered data of 350-580 Mb (around 40-60X coverage) per cell per strain, with an average read length of 7 kb, were used for assembly. All four genomes were first assembled de novo using the HGAP assembly protocol, which is available in the SMRT Analysis package and was accessed through SMRT Analysis Portal version 2.1. After this first round, PBJelly V14.1.14 was used to fill or reduce as many captured gaps as possible, producing upgraded draft genomes (English et al 2012). Because B. seminalis genomes are considerably larger than those of typical bacteria, around 50 scaffolds remained after this step. Quality-filtered Illumina and 454 reads were then used to correct false SNPs and indels arising from low coverage in some regions. These reads were also used for gap closure on the pre-assembled genomes with WGS-assembler and SSPACE (Boetzer et al 2011, Myers et al 2000). Finally, a consensus sequence was obtained from the above procedure. If the sequence was not complete, scaffolding and gap closure were repeated until nearly complete bacterial genome sequences were obtained.

Coding DNA sequences (CDSs) were predicted using Prodigal version 2.6 with default parameters (Hyatt et al 2010). RNA-Seq results were also used to refine the gene predictions. Gene functions were assigned automatically by the RAST annotation engine (Aziz et al 2008). Predicted genes were compared against the genomic sequences using Blastn to verify their accuracy. rRNA operons and tRNAs were predicted by RNAmmer and tRNAscan-SE, respectively (Lagesen et al 2007, Lowe and Eddy 1997), and additional analysis was carried out using NCBI's uniprot database (http://www.ncbi.nlm.nih.gov/), COG (Tatusov et al 2001), KEGG (Ogata et al 1999) and GO terms (Ashburner et al 2000).

Variant calling
Paired-end Illumina reads were mapped onto the genome sequence using the Burrows-Wheeler Aligner (BWA) (Li and Durbin 2009). Default settings were used except that the maximum edit distance was set to 0.02 (-n 0.02). The MarkDuplicates command in Picard (http://picard.sourceforge.net/) was used to remove reads that mapped to the same positions in the strain DSM 23518 genome (PCR duplicates). After IndelRealigner and BaseRecalibrator processing, SNPs and indels were called using GATK (Gac et al 2013, Tenaillon et al 2012). Default settings were used except that the maximum read depth in GATK was set to 500 (-dcov 500). The resulting SNPs and indels were then filtered with custom Perl scripts to minimize false-positive mutation calls. First, mutations with a total read depth below 20X were discarded. Second, SNPs and indels with a Phred quality score below 30 were removed. Third, a mutation call was kept only when at least 80% of the reads supported it. The lists of SNPs and indels were then annotated with in-house Perl scripts. For mutations in coding regions, PROVEAN was used to predict whether a protein sequence variation is deleterious or neutral (Chieng et al 2012).

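The three filtering rules described for the variant calls can be illustrated with a minimal Python sketch. It is not the custom Perl script used in the study; the `VariantCall` record, its field names and the example values are hypothetical, and only the three thresholds (depth 20X, Phred quality 30, 80% supporting reads) come from the text.

```python
from dataclasses import dataclass


@dataclass
class VariantCall:
    """Hypothetical minimal record for one SNP or indel call."""
    position: int    # 1-based reference coordinate
    depth: int       # total reads covering the site
    alt_reads: int   # reads supporting the variant allele
    qual: float      # Phred-scaled quality score


def passes_filters(call, min_depth=20, min_qual=30.0, min_alt_fraction=0.8):
    """Apply the three filters in the order given in the text."""
    if call.depth < min_depth:      # rule 1: total read depth below 20X
        return False
    if call.qual < min_qual:        # rule 2: Phred quality below 30
        return False
    # rule 3: at least 80% of the reads must support the variant
    return call.alt_reads / call.depth >= min_alt_fraction


# Illustrative (made-up) calls, one per outcome.
calls = [
    VariantCall(1200, 45, 44, 99.0),  # passes all three filters
    VariantCall(3400, 12, 12, 80.0),  # fails on depth
    VariantCall(5600, 60, 58, 25.0),  # fails on quality
    VariantCall(7800, 50, 30, 90.0),  # fails on supporting-read fraction
]
kept = [c for c in calls if passes_filters(c)]
print([c.position for c in kept])
```

Keeping the thresholds as keyword arguments makes it easy to see how sensitive the final variant list is to each cutoff.
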
Phylogenetic and comparative genome analysis
The sequences of the four whole-genome-sequenced strains were aligned and visualized using Murasaki (Popendorf et al 2010). For genome-based phylogeny, in addition to the four B. seminalis genomes sequenced in this study, 28 complete Burkholderia genome sequences were obtained from the Burkholderia Genome Database (Winsor et al 2008). Furthermore, a well-resolved phylogenetic tree was generated by multi-locus sequence analysis (MLSA) of the atpD, gltB, gyrB, lepA, phaC, recA and trpB genes, which has been widely applied in the identification and discrimination of Burkholderia species (Spilker et al 2009). The identity of the strains was confirmed by calculating whole-genome average nucleotide identity (ANI) with the Blast and MUMmer algorithms in JSpecies (Richter and Rosselló-Móra 2009). Multiple sequence alignment was performed with MUSCLE 3.8 (Edgar 2004) and the maximum-likelihood (ML) tree was generated with MEGA 6 (Tamura et al 2013). In addition, genomic islands (GIs) were detected with IslandViewer, which integrates the widely used GI detection algorithms IslandPick, SIGI-HMM and IslandPath-DIMOB (Langille and Brinkman 2009).

DNA methylation analysis pipeline
SMRT sequencing data were analyzed with the RS_Modification and Motif Analysis pipeline in SMRT Analysis 2.2, provided through the Pacific Biosciences SMRT Portal, with default parameters. Under these defaults, coverage and the inter-pulse duration (IPD) ratio were calculated for each position; the IPD ratio is the IPD observed for incorporation opposite a methylated base in the DNA template divided by that for incorporation opposite a canonical base (Lluch-Senar et al 2013). All data sets contain kinetic values for each reference position and DNA strand, with the corresponding sequences generated by the assembly procedure. For statistical analysis, methylation sites were assigned to three regions: the 200 bp upstream of the coding region, the coding region itself and the 200 bp downstream of the coding region. For every gene, the most highly methylated strain was then selected for further analysis.

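The assignment of methylation sites to the three regions can be sketched as follows. This is only an illustration under stated assumptions: the function name `classify_site`, the 1-based inclusive coordinates and the strand-aware swapping of upstream and downstream for reverse-strand genes are not described in the text; only the 200 bp flank size is.

```python
def classify_site(pos, gene_start, gene_end, strand="+", flank=200):
    """Assign a methylated position to a region relative to one gene.

    Coordinates are assumed to be 1-based and inclusive, with
    gene_start <= gene_end on the reference. Returns 'upstream',
    'coding', 'downstream', or None if the site lies outside the
    gene and both 200 bp flanks.
    """
    if gene_start <= pos <= gene_end:
        return "coding"
    if gene_start - flank <= pos < gene_start:
        # Left flank: upstream for forward-strand genes (assumption)
        return "upstream" if strand == "+" else "downstream"
    if gene_end < pos <= gene_end + flank:
        # Right flank: downstream for forward-strand genes (assumption)
        return "downstream" if strand == "+" else "upstream"
    return None


# Hypothetical gene spanning positions 1000-2000 on the forward strand.
for site in (900, 1500, 2100, 700):
    print(site, classify_site(site, 1000, 2000))
```

Counting the labels returned for every detected site then gives the per-region totals used for the statistical comparison.
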
Growth conditions for RNA-Seq analysis
To simulate the original niche environments of the four B. seminalis strains, 2 mL of overnight culture was inoculated into 50 mL of each of the following four types of media. Water medium, consisting of M9 minimal salts (0.6% Na2HPO4 + 0.3% KH2PO4 + 0.05% NaCl + 0.1% NH4Cl + 0.02% MgSO4 + 0.015% CaCl2) with 3% glycerol, was used to simulate the water environment (Schell et al 2011). CF medium was prepared according to the method of Dinesh (2010). Briefly, 5.0 g/L mucin from pig stomach mucosa (Sangon Biotech), 4.0 g/L low-molecular-weight salmon sperm DNA (Fluka), 5.9 mg diethylenetriaminepentaacetic acid (DTPA) (Sigma), 5.0 g/L NaCl (Sigma), 2.2 g/L KCl (Sigma) and 1.8 g/L Tris base (Sigma) were mixed and autoclaved; 5.0 mL/L egg yolk emulsion (Oxoid) and 5.0 g/L casamino acids (Sangon Biotech) were added once the medium had cooled to 37°C after autoclaving. Soil extract medium was prepared to mimic soil conditions as described by Yoder-Himes et al (2009), except that the soil was collected from the rice rhizosphere, the original niche of strain R456. The plant condition, used to obtain in vivo bacteria, was prepared according to the method described in our recent paper (Li et al 2014).

Total RNA harvesting
Each bacterial strain was grown to stationary phase under its respective condition. After centrifugation at 4500 g and 4°C, pellets were resuspended in 3 mL of PBS. One milliliter of bacterial culture was subjected to RNA purification with the RNeasy Mini Kit (Qiagen) and eluted in 50 µL of RNase-free water. Samples were treated with DNase I to remove residual DNA and purified by phenol-chloroform-isoamyl alcohol extraction and ethanol precipitation.

mRNA purification and cDNA synthesis
Ten micrograms of each total RNA sample was treated with the MICROBExpress Bacterial mRNA Enrichment Kit (Ambion) and the RiboMinus™ Transcriptome Isolation Kit (Bacteria) (Invitrogen) following the manufacturers' instructions. Samples were resuspended in 15 μL of RNase-free water. Bacterial mRNAs were chemically fragmented to a size range of 200-250 bp using 1 × fragmentation solution (Ambion) for 2.5 min at 94°C. cDNA was generated according to the instructions of the SuperScript Double-Stranded cDNA Synthesis Kit (Invitrogen). Briefly, each mRNA sample was mixed with 100 pmol of random hexamers, incubated at 65°C for 5 min, chilled on ice, mixed with 4 μL of First-Strand Reaction Buffer (Invitrogen), 2 μL of 0.1 M DTT, 1 μL of 10 mM RNase-free dNTP mix and 1 μL of SuperScript III reverse transcriptase (Invitrogen), and incubated at 50°C for 1 h. To generate the second strand, the following Invitrogen reagents were added: 51.5 μL of RNase-free water, 20 μL of second-strand reaction buffer, 2.5 μL of 10 mM RNase-free dNTP mix, 50 U of E. coli DNA polymerase and 5 U of E. coli RNase H; the reaction was incubated at 16°C for 2.5 h.

RNA Sequencing
The Illumina Paired End Sample Prep Kit was used to construct RNA-Seq libraries according to the manufacturer's instructions: fragmented cDNA was end-repaired, ligated to Illumina adaptors and amplified by 18 cycles of PCR. Paired-end 100-bp reads were generated by high-throughput sequencing on the Illumina HiSeq 2000 instrument.

RNA-Seq data analysis
After removal of low-quality reads and adaptors, RNA-Seq reads were aligned to the corresponding B. seminalis genome using TopHat 2.0.7 (Trapnell et al 2009), allowing a maximum of two mismatches. If a read mapped to more than one location, only the alignment with the highest score was kept. Reads mapping to rRNA and tRNA regions were excluded from further analysis. Once read counts had been obtained for every sample, edgeR with the TMM normalization method was used to identify differentially expressed genes (DEGs). Significant DEGs (FDR < 0.05 and at least a two-fold change) were selected for further analysis. Cluster 3.0 and TreeView 1.1.6 were used to generate the clustered heatmap based on RPKM values (de Hoon et al 2004, Saldanha 2004).

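The RPKM values and the DEG selection thresholds mentioned above can be made concrete with a short Python sketch. It uses the standard RPKM definition (reads per kilobase of gene per million mapped reads) and assumes that fold changes are available as log2 values alongside an FDR; the function names and example numbers are hypothetical, and the statistical testing itself (done with edgeR in the study) is not reproduced here.

```python
import math


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def is_deg(log2_fold_change, fdr, fdr_cutoff=0.05, min_fold=2.0):
    """Selection rule from the text: FDR < 0.05 and at least a two-fold change."""
    return fdr < fdr_cutoff and abs(log2_fold_change) >= math.log2(min_fold)


# Hypothetical gene: 500 reads, 1 kb long, 10 million mapped reads in the sample.
print(rpkm(500, 1000, 10_000_000))

# Hypothetical test results (log2 fold change, FDR), as a tool like edgeR might report.
print(is_deg(1.0, 0.01), is_deg(0.9, 0.01), is_deg(-2.0, 0.049), is_deg(3.0, 0.05))
```

Taking the absolute value of the log2 fold change keeps both up-regulated and down-regulated genes under the same two-fold criterion.
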
COG enrichment analysis
All DEGs between different strains or conditions were classified by COG category (Tatusov et al 2001). Based on the whole-genome COG classification, the significance of the enrichment of DEGs in each COG category was tested using the hypergeometric distribution:

$$p = \sum_{i=x}^{n} \frac{\binom{M}{i}\binom{N-M}{n-i}}{\binom{N}{n}}$$

where N is the number of genes in the genome, M is the number of genes assigned to one COG category in the whole genome, n is the number of DEGs, x is the number of DEGs that fall into that COG category and i is the summation index. The results are shown in Table S4.

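As a minimal sketch of the hypergeometric enrichment test, the upper-tail probability can be computed directly with Python's standard library. The function name `cog_enrichment_p` and the toy numbers are hypothetical, and the observed count of DEGs in the category is called `x` here as an assumption about the notation; N, M and n follow the definitions in the text.

```python
from math import comb


def cog_enrichment_p(N, M, n, x):
    """Upper-tail hypergeometric probability of seeing at least x
    category members among n DEGs.

    N: genes in the genome
    M: genes assigned to the COG category in the whole genome
    n: number of DEGs
    x: DEGs that fall into the COG category (assumed notation)
    """
    total = comb(N, n)
    # math.comb returns 0 when k > n, so impossible terms drop out.
    tail = sum(comb(M, i) * comb(N - M, n - i) for i in range(x, n + 1))
    return tail / total


# Toy example: genome of 10 genes, 3 in the category, 2 DEGs, both in the category.
print(cog_enrichment_p(10, 3, 2, 2))
```

Exact integer binomials avoid the rounding problems that arise when factorials of genome-scale numbers are evaluated in floating point.
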
Validation of mix sample method
Each sample was derived from a pool of five biological replicates, an approach developed to increase efficiency and cost-effectiveness while retaining equivalent statistical power (Greenwald et al 2012, Peng et al 2003). To validate the accuracy of the mix-sample method, a single biological RNA sample of strain DSM 23518 from soil extract (SE) medium was prepared. The correlation coefficient between samples was determined by statistical analysis.

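One way to compute such a correlation coefficient between a single sample and a pooled sample is Pearson's r on log2-transformed RPKM values. The sketch below is an assumption-laden illustration: the text does not name the statistic used for this comparison, the pseudocount of 1 added before the log transform is hypothetical, and the expression values are made up.

```python
import math


def log2_rpkm(values, pseudocount=1.0):
    """Log2-transform RPKM values; the pseudocount (an assumption) avoids log2(0)."""
    return [math.log2(v + pseudocount) for v in values]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up RPKM values for the same five genes in a single and a pooled sample.
single = log2_rpkm([0.0, 3.0, 15.0, 63.0, 255.0])
pooled = log2_rpkm([1.0, 4.0, 12.0, 70.0, 240.0])
print(round(pearson_r(single, pooled), 3))
```
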
Quantitative real-time PCR
Total RNA was extracted from exponentially growing cells using the RNeasy Mini spin column kit (Qiagen) and treated with one unit of RNase-free DNase I (Qiagen). cDNA synthesis was performed with a Moloney murine leukemia virus reverse transcriptase first-strand cDNA synthesis kit (Qiagen). The cDNA was then used directly as the template for quantitative real-time PCR (qRT-PCR) with a SYBR Green master mix (Protech Technology Enterprise Co., Ltd.) on an ABI Prism 7000 sequence detection system (Applied Biosystems). Primers for the selected genes were designed from the genome sequences using Primer3 (Untergasser et al 2012). All primers are listed in Table S3, and an annealing temperature of 58°C was used for all of them. Short-chain dehydrogenase (BCAL2694), which has been shown to be stably expressed in the Burkholderia cepacia complex (Bcc), was used as the internal control (Van Acker et al 2013). Fold changes were calculated by the delta-delta CT method, and the values are also shown in Table S3. The correlation between RNA-Seq and qRT-PCR results was tested by Pearson's correlation.

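The delta-delta CT calculation can be written out as a short Python sketch. It assumes the usual form of the method, with fold change equal to 2 raised to the negative delta-delta CT and roughly 100% amplification efficiency for both primer pairs; the function name and the CT values are hypothetical.

```python
def fold_change_ddct(ct_target_test, ct_ref_test, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression by the delta-delta CT method.

    ct_target_*: CT of the gene of interest
    ct_ref_*:    CT of the internal control (e.g. the short-chain dehydrogenase gene)
    *_test / *_ctrl: test condition versus reference condition
    Assumes about 100% PCR efficiency, i.e. a doubling per cycle.
    """
    delta_test = ct_target_test - ct_ref_test   # normalize to internal control
    delta_ctrl = ct_target_ctrl - ct_ref_ctrl
    delta_delta = delta_test - delta_ctrl       # compare the two conditions
    return 2.0 ** (-delta_delta)


# Made-up CT values: the target crosses threshold two cycles earlier in the
# test condition while the internal control is unchanged.
print(fold_change_ddct(22.0, 18.0, 24.0, 18.0))
```

A result above 1 indicates higher expression in the test condition, and a result below 1 indicates lower expression.
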
Supplementary references

Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM et al (2000). Gene Ontology: tool for the unification of biology. Nat Genet 25: 25-29.

Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA et al (2008). The RAST Server: rapid annotations using subsystems technology. BMC Genomics 9: 75.

Bernier SP, Silo-Suh L, Woods DE, Ohman DE, Sokol PA (2003). Comparative analysis of plant and animal models for characterization of Burkholderia cepacia virulence. Infect Immun 71: 5306-5313.

Boetzer M, Henkel CV, Jansen HJ, Butler D, Pirovano W (2011). Scaffolding pre-assembled contigs using SSPACE. Bioinformatics 27: 578-579.

Chieng S, Carreto L, Nathan S (2012). Burkholderia pseudomallei transcriptional adaptation in macrophages. BMC Genomics 13: 328.

de Hoon MJL, Imoto S, Nolan J, Miyano S (2004). Open source clustering software. Bioinformatics 20: 1453-1454.

Dinesh SD (2010). Artificial Sputum Medium. Protocol Exchange doi:10.1038/protex.2010.212.

Edgar RC (2004). MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 32: 1792-1797.

English AC, Richards S, Han Y, Wang M, Vee V, Qu J et al (2012). Mind the gap: upgrading genomes with Pacific Biosciences RS long-read sequencing technology. PLoS One 7: e47768.

Fang Y, Li B, Wang F, Liu B, Wu Z, Su T et al (2009). Bacterial fruit rot of apricot caused by Burkholderia cepacia in China. Plant Pathol J 25: 429-432.

Fang Y, Xie G, Lou M, Li B, Muhammad I (2011). Diversity analysis of Burkholderia cepacia complex in the water bodies of West Lake, Hangzhou, China. J Microbiol 49: 309-314.

Gac M, Cooper TF, Cruveiller S, Médigue C, Schneider D (2013). Evolutionary history and genetic parallelism affect correlated responses to evolution. Mol Ecol 22: 3292-3303.

Greenwald JW, Greenwald CJ, Philmus BJ, Begley TP, Gross DC (2012). RNA-seq analysis reveals that an ECF σ factor, AcsS, regulates achromobactin biosynthesis in Pseudomonas syringae pv. syringae B728a. PLoS One 7: e34804.

Hyatt D, Chen GL, LoCascio PF, Land ML, Larimer FW, Hauser LJ (2010). Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11.

Ibrahim M, Tang Q, Shi Y, Almoneafy A, Fang Y, Xu L et al (2012). Diversity of potential pathogenicity and biofilm formation among Burkholderia cepacia complex water, clinical, and agricultural isolates in China. World J Microb Biot 28: 2113-2123.

Lagesen K, Hallin P, Rødland EA, Stærfeldt HH, Rognes T, Ussery DW (2007). RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucleic Acids Res 35: 3100-3108.

Langille MGI, Brinkman FSL (2009). IslandViewer: an integrated interface for computational identification and visualization of genomic islands. Bioinformatics 25: 664-665.

Li B, Liu BP, Yu RR, Lou MM, Wang YL, Xie GL et al (2011). Phenotypic and molecular characterization of rhizobacterium Burkholderia sp. strain R456 antagonistic to Rhizoctonia solani, sheath blight of rice. World J Microb Biot 27: 2305-2313.

Li B, Ibrahim M, Ge M, Cui Z, Sun G, Xu F et al (2014). Transcriptome analysis of Acidovorax avenae subsp. avenae cultivated in vivo and co-culture with Burkholderia seminalis. Sci Rep 4.

Li H, Durbin R (2009). Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25: 1754-1760.

Lluch-Senar M, Luong K, Lloréns-Rico V, Delgado J, Fang G, Spittle K et al (2013). Comprehensive methylome characterization of Mycoplasma genitalium and Mycoplasma pneumoniae at single-base resolution. PLoS Genet 9: e1003191.

Lowe TM, Eddy SR (1997). tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25: 955-964.

Myers EW, Sutton GG, Delcher AL, Dew IM, Fasulo DP, Flanigan MJ et al (2000). A whole-genome assembly of Drosophila. Science 287: 2196-2204.

Ogata H, Goto S, Sato K, Fujibuchi W, Bono H, Kanehisa M (1999). KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res 27: 29-34.

Peng X, Wood CL, Blalock EM, Chen KC, Landfield PW, Stromberg AJ (2003). Statistical implications of pooling RNA samples for microarray experiments. BMC Bioinformatics 4: 26.

Popendorf K, Tsuyoshi H, Osana Y, Sakakibara Y (2010). Murasaki: a fast, parallelizable algorithm to find anchors from multiple genomes. PLoS One 5: e12651.

Richter M, Rosselló-Móra R (2009). Shifting the genomic gold standard for the prokaryotic species definition. Proc Natl Acad Sci U S A 106: 19126-19131.

Saldanha AJ (2004). Java Treeview - extensible visualization of microarray data. Bioinformatics 20: 3246-3248.

Schell MA, Zhao P, Wells L (2011). Outer membrane proteome of Burkholderia pseudomallei and Burkholderia mallei from diverse growth conditions. J Proteome Res 10: 2417-2424.

Spilker T, Baldwin A, Bumford A, Dowson CG, Mahenthiralingam E, LiPuma JJ (2009). Expanded multilocus sequence typing for Burkholderia species. J Clin Microbiol 47: 2607-2610.

Tamura K, Stecher G, Peterson D, Filipski A, Kumar S (2013). MEGA6: molecular evolutionary genetics analysis version 6.0. Mol Biol Evol 30: 2725-2729.

Tatusov RL, Natale DA, Garkavtsev IV, Tatusova TA, Shankavaram UT, Rao BS et al (2001). The COG database: new developments in phylogenetic classification of proteins from complete genomes. Nucleic Acids Res 29: 22-28.

Tenaillon O, Rodríguez-Verdugo A, Gaut RL, McDonald P, Bennett AF, Long AD et al (2012). The molecular diversity of adaptive convergence. Science 335: 457-461.

Trapnell C, Pachter L, Salzberg SL (2009). TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25: 1105-1111.

Untergasser A, Cutcutache I, Koressaar T, Ye J, Faircloth BC, Remm M et al (2012). Primer3 - new capabilities and interfaces. Nucleic Acids Res 40: e115.

Van Acker H, Sass A, Bazzini S, De Roy K, Udine C, Messiaen T et al (2013). Biofilm-grown Burkholderia cepacia complex cells survive antibiotic treatment by avoiding production of reactive oxygen species. PLoS One 8: e58943.

Vanlaere E, LiPuma JJ, Baldwin A, Henry D, De Brandt E, Mahenthiralingam E et al (2008). Burkholderia latens sp. nov., Burkholderia diffusa sp. nov., Burkholderia arboris sp. nov., Burkholderia seminalis sp. nov. and Burkholderia metallica sp. nov., novel species within the Burkholderia cepacia complex. Int J Syst Evol Microbiol 58: 1580-1590.

Winsor GL, Khaira B, Van Rossum T, Lo R, Whiteside MD, Brinkman FSL (2008). The Burkholderia Genome Database: facilitating flexible queries and comparative analyses. Bioinformatics 24: 2803-2804.

Yoder-Himes D, Chain P, Zhu Y, Wurtzel O, Rubin E, Tiedje JM et al (2009). Mapping the Burkholderia cenocepacia niche response via high-throughput sequencing. Proc Natl Acad Sci U S A 106: 3976-3981.

Supplementary Figure and Table Legends

Figure S1: Distribution of differentially expressed genes along the chromosome. The thick grey circles, ordered relative to strain 0901, represent from inner to outer the chromosomes of strains 0901, DSM 23518, R456 and S9, respectively. The red, green, blue and black peaks outside each chromosome represent the log2 RPKM values of genes under CF, apricot, soil and water conditions. Outside the black peaks (water RPKM values) is a heatmap of gene density per 10 kb along the chromosome, colored from blue to red.

Figure S2: Full genome alignment among the four Burkholderia seminalis strains 0901, DSM 23518, R456 and S9.

Figure S3: Expression pattern clustering based on normalized RPKM values. RNA-Seq samples were clustered based on their log2 RPKM values.

Figure S4: Histogram of the number of DNA methylation sites in Burkholderia seminalis strains 0901, S9, R456 and DSM 23518.

Figure S5: Phylogenetic relationship of the four Burkholderia seminalis strains to other species of Burkholderia. (a) Maximum-likelihood tree constructed by MLSA of the four B. seminalis strains sequenced in this study and 28 other Burkholderia strains. Among these strains, B. seminalis DSM 23518 (= LMG 24067), B. lata 383, B. thailandensis E264, B. mallei ATCC 23344, B. phymatum STM815, B. phytofirmans PsJN and B. xenovorans LB400 are type strains. (b) Maximum-likelihood tree constructed from whole-genome sequences. The type strains are the same as in (a).

Figure S6: Correlation coefficient between the SE-single sample and the SE-mix sample of strain DSM 23518 based on log2 RPKM values.

Figure S7: Correlation coefficient between the SE-mix sample and the W-mix sample of strain DSM 23518 based on log2 RPKM values.

Table S1: Physiological characteristics of Burkholderia seminalis strains 0901, S9, R456 and DSM 23518.

Table S2: Comparison of general genomic features between Burkholderia seminalis strains 0901, DSM 23518, R456 and S9.

Table S3: Summary of RNA-Seq results (Illumina HiSeq 2000).

Table S4: Integrated information of Burkholderia seminalis strains 0901, S9, R456 and DSM 23518.

Table S5: Average Nucleotide Identity (ANI) among the Burkholderia seminalis genomes and the selected Burkholderia cenocepacia genomes.

Table S6: COG enrichment results from DEGs. a), strain 0901; b), strain DSM 23518; c), strain S9; d), strain R456.

Table S7: Gene clusters involved in niche adaptation.

Table S8: (a): Primers of qRT-PCR used in this study. (b): Internal primer used in qRT-PCR and its RPKM values in different strains and conditions.