emi12582-sup-0001-si

Table S1: Summary of EBPR performance characteristics and genome presence for all 13 metagenomic samples.
| Sample | NCBI BioSample Accession | SBR | Operating condition | Date | MLSS (mg/L) | VFA levels at end of anaerobic (mg/L) | P removed, ppm (after feed – at end of anaerobic) | Days of operation when sample was taken | Genome (ppk1 type) |
|---|---|---|---|---|---|---|---|---|---|
| F2411 | SAMN02445067 | 1 | floc, fast fed | 24/11/08 | 2400 | 0.51 | 34.06 | 137 | BA-93 (IA) |
| M92408 | SAMN00778987 | 1 | floc, fast fed | 24/08/10 | 3580 | 0.54 | 13.22 | 775 | BA-93 (IA), SK-02 (IIC) |
| G2411 | SAMN02445066 | 2 | granule, fast fed | 24/11/08 | 8305 | 1.1 | 39.45 | 137 | BA-91 (IIC), BA-92 (IC), BA-94 (IIF) |
| M82408 | SAMN00778986 | 3 | floc, slow fed | 24/08/10 | 2485 | 0.67 | 17.11 | 111 | SK-01 (IIC) |
| M92705 | SAMN02445076 | 4 | floc, fast fed | 27/05/11 | 2385 | 0 | 39.8 | 107 | SK-11 (IIF) |
| M92206 | SAMN02445080 | 4 | floc, fast fed | 22/06/11 | 1970 | 0.74 | 14.2 | 133 | SK-11 (IIF) |
| M90108 | SAMN02445083 | 4 | floc, fast fed | 1/08/11 | 1135 | 1.07 | 3 | 173 | SK-11 (IIF), SK-12 (IIF) |
| M90709 | SAMN02445085 | 4 | floc, fast fed | 7/09/11 | 1475 | 2.67 | 32.6 | 210 | SK-11 (IIF), SK-12 (IIF) |
| M92511 | SAMN02445089 | 4 | floc, fast fed | 25/11/11 | 3625 | 0.5 | 8.6 | 289 | SK-12 (IIF) |
| M91801 | SAMN02445092 | 4 | floc, fast fed | 18/01/12 | 2315 | 2.73 | 9.4 | 343 | SK-12 (IIF) |
| M81706 | SAMN02445068 | 5 | floc, slow fed | 17/06/11 | 1960 | 0.83 | 18.2 | 128 | SK-11 (IIF) |
| M80509 | SAMN02445069 | 5 | floc, slow fed | 5/09/11 | 1830 | 0 | 22.2 | 208 | SK-11 (IIF) |
| M81612 | SAMN02445070 | 5 | floc, slow fed | 16/12/11 | 4120 | 0.48 | 2.6 | 310 | SK-11 (IIF) |
Table S2: Binning strategy used to recover each of the Accumulibacter genomes
sequenced in this study.
| Genome | Metagenome Presence | NCBI Biosample ID | Binning Strategy |
|---|---|---|---|
| SK-01 | M82408 | SAMN00778986 | Tetranucleotide frequencies using ESOM and coverage binning |
| SK-02 | M92408 | SAMN00778987 | Tetranucleotide frequencies using ESOM and coverage binning |
| SK-11 | M90108, M92511, M92705, M92206, M90809, M91801, M81706, M81612, M80509 | SAMN02850038, SAMN02850039, SAMN02850040, SAMN02445080, SAMN02445085, SAMN02445092, SAMN02445068, SAMN02445070, SAMN02445069 | GroopM based differential coverage binning |
| SK-12 | M90108, M92511, M92705, M92206, M90809, M91801 | SAMN02850038, SAMN02850039, SAMN02850040, SAMN02445080, SAMN02445085, SAMN02445092 | GroopM based differential coverage binning |
| BA-91 | G2411 | SAMN02445066 | Tetranucleotide frequencies using ESOM and coverage binning |
| BA-92 | G2411 | SAMN02445066 | Tetranucleotide frequencies using ESOM and coverage binning |
| BA-93 | F2411 | SAMN02445067 | Tetranucleotide frequencies using ESOM and coverage binning |
| BA-94 | G2411 | SAMN02445066 | Tetranucleotide frequencies using ESOM and coverage binning |
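As a rough illustration of the tetranucleotide signal named in Table S2, the sketch below computes the tetranucleotide frequency vector of a single contig. It is not the authors' pipeline: the function names are hypothetical, collapsing each tetramer with its reverse complement is an assumption made here, and the ESOM training and coverage binning steps are not shown.

```python
from collections import Counter
from itertools import product

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(kmer):
    """Return the reverse complement of an unambiguous DNA k-mer."""
    return "".join(_COMPLEMENT[base] for base in reversed(kmer))


def tetranucleotide_frequencies(sequence, k=4):
    """Relative frequency of each strand-collapsed tetranucleotide.

    Assumption (not from the source): a k-mer and its reverse complement
    are counted together, giving 136 features for k=4. Windows that
    contain ambiguous bases such as N are skipped.
    """
    seq = sequence.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if all(base in _COMPLEMENT for base in kmer):
            counts[min(kmer, reverse_complement(kmer))] += 1
    total = sum(counts.values())
    canonical = sorted(
        {min(m, reverse_complement(m))
         for m in ("".join(p) for p in product("ACGT", repeat=k))}
    )
    return {m: (counts[m] / total if total else 0.0) for m in canonical}
```

Each contig then becomes one fixed-length vector, which is the kind of input a self-organising map can cluster; contig coverage would be considered alongside it.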
Table S3: Assembly parameters used to recover each Accumulibacter draft genome
| Genome | NCBI Biosample ID |
|---|---|
| SK-01 | SAMN00778986 |
| SK-02 | SAMN00778987 |
| SK-11 | SAMN02850038, SAMN02850039, SAMN02850040, SAMN02445080, SAMN02445085, SAMN02445092, SAMN02445068, SAMN02445070, SAMN02445069 |
| SK-12 | SAMN02850038, SAMN02850039, SAMN02850040, SAMN02445080, SAMN02445085, SAMN02445092 |
| BA-91, BA-92, BA-94, BA-93 | SAMN02445067, SAMN02445066, SAMN02445066, SAMN02445067 |
Assembly Parameters
CLC Genomics Workbench 5.5; Word size: 22; Bubble size: 50; Insert size: 180-400; Quality trimming using threshold 0.01
Velvet 1.0.19; Kmer: 47; Expected coverage: 100; Coverage cutoff: 2; Insert size: 300; No quality trimming
CLC Genomics Workbench 5.5; Word size: 27; Bubble size: 119; Insert size: 180-400; Quality trimming using threshold 0.01
CLC Genomics Workbench 5.5; Word size: 23; Bubble size: 50; Insert size: 180-400; Quality trimming using threshold 0.01
CLC Genomics Workbench 5.5; Word size: 24; Bubble size: 50; Insert size: 180-400; Quality trimming using threshold 0.01
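For readers who want to see how the Velvet settings listed above map onto the Velvet command line, the following sketch builds the two command lines as argument lists. It is an illustration only: the read file name, output directory, FASTQ format and short paired-end read type are assumptions not stated in the table, and the helper name is hypothetical.

```python
def velvet_commands(reads="reads.fastq", outdir="velvet_out", kmer=47,
                    exp_cov=100, cov_cutoff=2, ins_length=300):
    """Build velveth/velvetg argument lists from the Table S3 values.

    Assumptions (not from the source): interleaved paired-end reads in
    FASTQ format, and the file and directory names used as defaults.
    """
    # velveth hashes the reads at the chosen k-mer length.
    velveth = ["velveth", outdir, str(kmer), "-fastq", "-shortPaired", reads]
    # velvetg builds the graph using the coverage and insert-size settings.
    velvetg = [
        "velvetg", outdir,
        "-exp_cov", str(exp_cov),
        "-cov_cutoff", str(cov_cutoff),
        "-ins_length", str(ins_length),
    ]
    return velveth, velvetg
```

The lists can be passed to `subprocess.run(..., check=True)` on a machine where Velvet is installed.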
Figure S1: Reactor operation and sampling timeline. Each shaded box represents the
timeframe for reactor operation over a period of ~3 years. Each reactor is shaded based
on its broad operational conditions and vertical lines through each box represent sampling
points for metagenomic sequencing.
[Figure S2 clade labels: IID, IIF, IIC, IIB, IIA, IA, IB, IC, ID, IE]
Figure S2: Maximum likelihood phylogenetic tree of Accumulibacter ppk1 nucleotide
sequences. Monophyletic Accumulibacter clades are shown mostly as compressed
wedges; however, clades IIC, IIF, IA and IC are expanded to show the placement of
genomes sequenced in this study (coloured red) and the two other Accumulibacter
genome representatives (coloured blue). Black circles on node branches indicate >70%
bootstrap support.
[Figure S3 tree tips, with ppk1 type where annotated: Ca. Accumulibacter sp. SK-02 (IIC); Ca. Accumulibacter sp. SK-01 (IIC); Ca. Accumulibacter sp. BA-91 (IIC); Ca. Accumulibacter sp. SK-12 (IIF); Ca. Accumulibacter sp. SK-11 (IIF); Ca. Accumulibacter sp. BA-94 (IIF); Ca. Accumulibacter phosphatis UW-1 (IIA); Ca. Accumulibacter sp. UW-2 (IA); Ca. Accumulibacter sp. BA-93 (IA); Ca. Accumulibacter sp. BA-92 (IC); Dechlorosoma suillum PS; Dechloromonas aromatica RCB; Azoarcus sp. BH72; Thauera sp. MZ1T; Aromatoleum aromaticum EbN1; Uliginosibacterium gangwonense DSM 18521; Rhodocyclaceae bacterium RZ94; Methyloversatilis sp. RZ18-153; Methyloversatilis universalis FAM500; Methyloversatilis sp. NVD; Neisseria meningitidis 053442]
Figure S3: Maximum likelihood phylogenetic tree of Rhodocyclaceae constructed from
the concatenated alignment of 38 single-copy genes; Neisseria meningitidis was used as
the outgroup. Node labels refer to bootstrap support and Accumulibacter genomes are
annotated by their ppk1 type.
Figure S4: Boxplots of SNP frequencies for core Accumulibacter genes in each of the
genomes, for the metagenome samples in which that genome was present. Genes with a
SNP frequency greater than the third quartile plus 1.5x the interquartile range are marked
as outliers using a blue plus symbol. The median SNP frequency for all of the genes is
marked with a red line.
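The outlier rule in the Figure S4 caption can be sketched as follows. This is a minimal illustration, not the authors' plotting code: the function name and the input mapping of gene identifier to SNP frequency are hypothetical, and the quartile convention (the "inclusive" method of Python's statistics module) is an assumption, since plotting packages differ in how they interpolate quartiles.

```python
import statistics


def flag_snp_outliers(snp_freq_by_gene):
    """Return (median, upper_fence, outlier_genes) for one genome/sample.

    A gene is an outlier when its SNP frequency exceeds
    Q3 + 1.5 * (Q3 - Q1). Needs at least two genes.
    """
    values = list(snp_freq_by_gene.values())
    # Quartiles; the interpolation method is an assumption made here.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    upper_fence = q3 + 1.5 * (q3 - q1)
    outliers = sorted(
        gene for gene, freq in snp_freq_by_gene.items() if freq > upper_fence
    )
    return statistics.median(values), upper_fence, outliers
```

Only the upper fence is computed because the caption describes outliers on the high side only.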
Figure S5: Mean coverage per contig for all Accumulibacter strains sequenced in this
study in SBR4 and SBR5, which were sampled at six and three timepoints respectively.
Figure S6: Comparison of COG categories for unique genes in each Accumulibacter
genome. COG categories: C, energy production and conversion; D, cell cycle control,
cell division, chromosome partitioning; E, amino acid transport and metabolism; F,
nucleotide transport and metabolism; G, carbohydrate transport and metabolism; H,
coenzyme transport and metabolism; I, lipid transport and metabolism; J, translation,
ribosome structure and biogenesis; K, transcription; L, replication, recombination and
repair; M, cell wall/membrane/envelope biogenesis; N, cell motility; O, posttranslational
modification, protein turnover, chaperones; P, inorganic ion transport and metabolism; Q,
secondary metabolites biosynthesis, transport and catabolism; R, general function
prediction; S, function unknown; T, signal transduction; U, intracellular trafficking,
secretion and vesicular transport; V, defence mechanisms.
[Figure S7 labels: Accumulibacter BA-93; scaffold gap marked "...NNN..."; Accumulibacter UW-2, with IMG gene IDs 2079071185, 2079071186, 2079071190, 2079071189, 2079071188, 2079071187, 2079071191, 2079071192]
Figure S7: Alignment of the Accumulibacter UW-2 and BA-93 genomes around the position
of RuBisCO. Both genomes are syntenic in this region, as shown by the grey shading
between genes; however, a gap in the scaffold of the UW-2 genome has
removed the majority of RuBisCO and two other genes. The IMG gene IDs are given for
the genes in the Accumulibacter UW-2 genome.
Brucella melitensis NI
Ochrobactrum anthropi ATCC 49188
Pseudovibrio sp. FO-BEG1
Tistrella mobilis KA081020-065
Maricaulis maris MCS10
Methylobacterium sp. 4-46
Methylobacterium radiotolerans JCM 2831
Methylobacterium nodulans ORS 2060
Oligotropha carboxidovorans OM5
Hyphomicrobium denitrificans ATCC 51888
Hyphomicrobium sp. MC1
Azorhizobium caulinodans ORS 571
Methylocella silvestris BL2
Caulobacter sp. K31
Hyphomonas neptunium ATCC 15444
Paracoccus denitrificans PD1222
Roseobacter denitrificans OCh 114
Phenylobacterium zucineum HLK1
Acidiphilium multivorum AIU301
Burkholderia sp. KJ006
Burkholderia ambifaria MC40-6
Burkholderia multivorans ATCC 17616
Burkholderia glumae BGR1
Burkholderia gladioli BSR3
Burkholderia pseudomallei 1026b
Burkholderia thailandensis E264
Ralstonia pickettii 12J
Bordetella petrii DSM 12804
Polaromonas naphthalenivorans CJ2
Polaromonas sp. JS666
Albidiferax ferrireducens T118
Acidovorax avenae subsp. avenae ATCC 19860
Accumulibacter SK-01
Accumulibacter BA-91
Burkholderia sp. CCGE1003
Burkholderia phymatum STM815
Methylibium petroleiphilum PM1
Alicycliphilus denitrificans K601
Acidovorax sp. JS42
Delftia sp. Cs1-4
Thiomonas intermedia K12
Pseudogulbenkiania sp. NH8B
Nitrobacter hamburgensis X14
Nitrobacter hamburgensis X14
Nitrobacter hamburgensis X14
Nitrobacter winogradskyi Nb-255
Nitrobacter winogradskyi Nb-255
Anaeromyxobacter dehalogenans 2CP-1
Anaeromyxobacter sp. Fw109-5
Geobacter metallireducens GS-15
Geobacter lovleyi SZ
Geobacter metallireducens GS-15
Geobacter lovleyi SZ
Geobacter sp. M18
Desulfomonile tiedjei DSM 6799
Desulfobacterium autotrophicum HRM2
Methylophaga nitratireducenticrescens
Thiobacillus denitrificans ATCC 25259
Thauera sp. MZ1T
Aromatoleum aromaticum EbN1
Acidithiobacillus ferrivorans SS3
Acidithiobacillus caldus SM-1
Edwardsiella tarda FL6-60
Shewanella halifaxensis HAW-EB4
Ralstonia eutropha JMP134
Ralstonia eutropha H16
Cupriavidus metallidurans CH34
Ralstonia solanacearum Po82
Ralstonia pickettii 12D
Burkholderia thailandensis E264
Burkholderia sp. YI23
Achromobacter xylosoxidans A8
Bordetella petrii DSM 12804
Herbaspirillum seropedicae SmR1
Pseudogulbenkiania sp. NH8B
Chromobacterium violaceum ATCC 12472
Herminiimonas arsenicoxydans
Candidatus Vesicomyosocius okutanii HA
Alkalilimnicola ehrlichii MLHE-1
gamma proteobacterium HdN1
Salmonella bongori NCTC 12419
Salmonella enterica subsp. enterica serovar Heidelberg str. B182
Citrobacter rodentium ICC168
Citrobacter koseri ATCC BAA-895
Escherichia coli str. clone D i14
Klebsiella pneumoniae KCTC 2242
Klebsiella oxytoca KCTC 1686
Enterobacter lignolyticus SCF1
Enterobacter cloacae subsp. dissolvens SDM
Enterobacter sp. 638
Shimwellia blattae DSM 4481 = NBRC 105725
Klebsiella pneumoniae KCTC 2242
Enterobacter cloacae subsp. dissolvens SDM
Escherichia coli str. K-12 substr. W3110
Enterobacter lignolyticus SCF1
Cronobacter sakazakii ES15
Shimwellia blattae DSM 4481 = NBRC 105725
Dickeya zeae Ech1591
Dickeya dadantii Ech703
Pectobacterium carotovorum subsp. carotovorum PC1
Rahnella aquatilis HX2
Pantoea vagans C9-1
Erwinia billingiae Eb661
Serratia sp. AS13
Providencia stuartii MRSN 2154
Proteus mirabilis HI4320
Vibrio sp. EJY3
Shewanella sp. MR-7
Halomonas elongata DSM 2581
Chromohalobacter salexigens DSM 3043
Alcanivorax borkumensis SK2
Marinobacter hydrocarbonoclasticus ATCC 49840
Alteromonas macleodii str. Deep ecotype
Hahella chejuensis KCTC 2396
Methylophaga nitratireducenticrescens
Pseudomonas stutzeri DSM 4166
Pseudomonas fluorescens F113
Moraxella catarrhalis RH4
Psychrobacter arcticus 273-4
Shewanella woodyi ATCC 51908
Shewanella halifaxensis HAW-EB4
Kangiella koreensis DSM 16069
Pseudomonas aeruginosa DK2
Stenotrophomonas maltophilia K279a
0.4
Figure S8: Maximum likelihood phylogenetic tree of Accumulibacter NarG with NarG
proteins from representative Burkholderiales genomes. Pseudogulbenkiania
(Neisseriales) was used as the outgroup.
Figure S9: Contig coverage per 10 kbp block for each of the Accumulibacter strains
sequenced in the study. All contigs from each draft genome are shown concatenated in
order across the x-axis for each subplot. If a contig was less than 10 kbp in length, then
the average coverage across the total length of that contig was used. For each genome,
only the samples in which it was identified are shown.
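The blocking scheme described in the Figure S9 caption can be sketched as below. This is an illustration under stated assumptions rather than the authors' script: the function names and the per-base depth lists are hypothetical inputs, and averaging a trailing partial block over its own length is a choice made here that the caption does not specify.

```python
def block_coverage(depths, block=10_000):
    """Mean coverage per block for one contig, given per-base depths.

    Contigs shorter than one block yield a single value: the mean over
    the whole contig, as the caption describes.
    """
    n = len(depths)
    if n == 0:
        return []
    if n < block:
        return [sum(depths) / n]
    means = []
    for start in range(0, n, block):
        chunk = depths[start:start + block]
        # Assumption: a trailing partial block is averaged over its own length.
        means.append(sum(chunk) / len(chunk))
    return means


def genome_profile(contig_depths, block=10_000):
    """Concatenate block means for all contigs, in the order given."""
    profile = []
    for depths in contig_depths:
        profile.extend(block_coverage(depths, block))
    return profile
```

Plotting `genome_profile(...)` against block index for each sample gives one trace per metagenome, with contigs laid end to end along the x-axis.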