Table S1: Summary of EBPR performance characteristics and genome presence for all 13 metagenomic samples.

Sample | NCBI BioSample accession | SBR | Operating condition | Date (dd/mm/yy) | MLSS (mg/L) | VFA at end of anaerobic phase (mg/L) | P removed (ppm; after feed – at end of anaerobic phase) | Days of operation when sample was taken | Genome (ppk1 type)
F2411 | SAMN02445067 | 1 | floc, fast fed | 24/11/08 | 2400 | 0.51 | 34.06 | 137 | BA-93 (IA)
M92408 | SAMN00778987 | 1 | floc, fast fed | 24/08/10 | 3580 | 0.54 | 13.22 | 775 | BA-93 (IA), SK-02 (IIC)
G2411 | SAMN02445066 | 2 | granule, fast fed | 24/11/08 | 8305 | 1.1 | 39.45 | 137 | BA-91 (IIC), BA-92 (IC), BA-94 (IIF)
M82408 | SAMN00778986 | 3 | floc, slow fed | 24/08/10 | 2485 | 0.67 | 17.11 | 111 | SK-01 (IIC)
M92705 | SAMN02445076 | 4 | floc, fast fed | 27/05/11 | 2385 | 0 | 39.8 | 107 | SK-11 (IIF)
M92206 | SAMN02445080 | 4 | floc, fast fed | 22/06/11 | 1970 | 0.74 | 14.2 | 133 | SK-11 (IIF)
M90108 | SAMN02445083 | 4 | floc, fast fed | 1/08/11 | 1135 | 1.07 | 3 | 173 | SK-11 (IIF), SK-12 (IIF)
M90709 | SAMN02445085 | 4 | floc, fast fed | 7/09/11 | 1475 | 2.67 | 32.6 | 210 | SK-11 (IIF), SK-12 (IIF)
M92511 | SAMN02445089 | 4 | floc, fast fed | 25/11/11 | 3625 | 0.5 | 8.6 | 289 | SK-12 (IIF)
M91801 | SAMN02445092 | 4 | floc, fast fed | 18/01/12 | 2315 | 2.73 | 9.4 | 343 | SK-12 (IIF)
M81706 | SAMN02445068 | 5 | floc, slow fed | 17/06/11 | 1960 | 0.83 | 18.2 | 128 | SK-11 (IIF)
M80509 | SAMN02445069 | 5 | floc, slow fed | 5/09/11 | 1830 | 0 | 22.2 | 208 | SK-11 (IIF)
M81612 | SAMN02445070 | 5 | floc, slow fed | 16/12/11 | 4120 | 0.48 | 2.6 | 310 | SK-11 (IIF)

Table S2: Binning strategy used to recover each of the Accumulibacter genomes sequenced in this study.
Genome | Metagenome presence | NCBI BioSample ID | Binning strategy
SK-01 | M82408 | SAMN00778986 | Tetranucleotide frequencies using ESOM and coverage binning
SK-02 | M92408 | SAMN00778987 | Tetranucleotide frequencies using ESOM and coverage binning
SK-11 | M90108, M92511, M92705, M92206, M90809, M91801, M81706, M81612, M80509 | SAMN02850038, SAMN02850039, SAMN02850040, SAMN02445080, SAMN02445085, SAMN02445092, SAMN02445068, SAMN02445070, SAMN02445069 | GroopM based differential coverage binning
SK-12 | M90108, M92511, M92705, M92206, M90809, M91801 | SAMN02850038, SAMN02850039, SAMN02850040, SAMN02445080, SAMN02445085, SAMN02445092 | GroopM based differential coverage binning
BA-91 | G2411 | SAMN02445066 | Tetranucleotide frequencies using ESOM and coverage binning
BA-92 | G2411 | SAMN02445066 | Tetranucleotide frequencies using ESOM and coverage binning
BA-93 | F2411 | SAMN02445067 | Tetranucleotide frequencies using ESOM and coverage binning
BA-94 | G2411 | SAMN02445066 | Tetranucleotide frequencies using ESOM and coverage binning

Table S3: Assembly parameters used to recover each Accumulibacter draft genome.

Genome | NCBI BioSample ID
SK-01 | SAMN00778986
SK-02 | SAMN00778987
SK-11 | SAMN02850038, SAMN02850039, SAMN02850040, SAMN02445080, SAMN02445085, SAMN02445092, SAMN02445068, SAMN02445070, SAMN02445069
SK-12 | SAMN02850038, SAMN02850039, SAMN02850040, SAMN02445080, SAMN02445085, SAMN02445092
BA-91, BA-92, BA-94, BA-93 | SAMN02445066, SAMN02445067

Assembly parameters:
CLC Genomics Workbench 5.5; word size: 22; bubble size: 50; insert size: 180-400; quality trimming using threshold 0.01
Velvet 1.0.19; k-mer: 47; expected coverage: 100; coverage cutoff: 2; insert size: 300; no quality trimming
CLC Genomics Workbench 5.5; word size: 27; bubble size: 119; insert size: 180-400; quality trimming using threshold 0.01
CLC Genomics Workbench 5.5; word size: 23; bubble size: 50; insert size: 180-400; quality trimming using threshold 0.01
CLC Genomics Workbench 5.5; word size: 24; bubble size: 50; insert size: 180-400; quality trimming using threshold 0.01
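Table S2 names tetranucleotide frequencies as one of the binning signals. As a rough illustration of what that signal is, the sketch below counts overlapping 4-mers in a contig and normalises them to relative frequencies. It is not the authors' workflow: the function name `tetranucleotide_frequencies` is hypothetical, the ESOM and GroopM steps are not reproduced, and details such as merging reverse-complement k-mers or splitting contigs into windows (common in practice) are deliberately left out.

```python
from collections import Counter
from itertools import product


def tetranucleotide_frequencies(seq):
    """Return relative frequencies of all 256 tetranucleotides in seq.

    Hypothetical helper for illustration only. Windows containing
    characters other than A, C, G, T (for example N) are ignored.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=4)]
    # Count every overlapping window of length 4.
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    # Only unambiguous tetranucleotides contribute to the total.
    total = sum(counts[k] for k in kmers)
    if total == 0:
        return {k: 0.0 for k in kmers}
    return {k: counts[k] / total for k in kmers}


if __name__ == "__main__":
    freqs = tetranucleotide_frequencies("ACGTACGT")
    print(freqs["ACGT"])
```

Each contig then becomes a 256-dimensional vector, and contigs from the same genome tend to have similar vectors. Under this assumption, such vectors would be the kind of input that is clustered alongside coverage.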
Figure S1: Reactor operation and sampling timeline. Each shaded box represents the timeframe of reactor operation over a period of ~3 years. Each reactor is shaded according to its broad operational conditions, and vertical lines through each box represent sampling points for metagenomic sequencing.

[Figure S2 image: phylogenetic tree; clade labels IID, IIF, IIC, IIB, IIA, IA, IB, IC, ID, IE.]

Figure S2: Maximum likelihood phylogenetic tree of Accumulibacter ppk1 nucleotide sequences. Monophyletic Accumulibacter clades are mostly shown as compressed wedges; clades IIC, IIF, IA and IC are expanded to show the placement of the genomes sequenced in this study (coloured red) and the two other Accumulibacter genome representatives (coloured blue). Black circles on node branches indicate >70% bootstrap support.

[Figure S3 image: phylogenetic tree. Taxa shown: Ca. Accumulibacter sp. SK-02, SK-01 and BA-91 (IIC); Ca. Accumulibacter sp. SK-12, SK-11 and BA-94 (IIF); Ca. Accumulibacter phosphatis UW-1 (IIA); Ca. Accumulibacter sp. UW-2 and BA-93 (IA); Ca. Accumulibacter sp. BA-92 (IC); Dechlorosoma suillum PS; Dechloromonas aromatica RCB; Azoarcus sp. BH72; Thauera sp. MZ1T; Aromatoleum aromaticum EbN1; Uliginosibacterium gangwonense DSM 18521; Rhodocyclaceae bacterium RZ94; Methyloversatilis sp. RZ18-153; Methyloversatilis universalis FAM500; Methyloversatilis sp. NVD; Neisseria meningitidis 053442.]

Figure S3: Maximum likelihood phylogenetic tree of Rhodocyclaceae constructed from the concatenated alignment of 38 single-copy genes; Neisseria meningitidis was used as the outgroup. Node labels refer to bootstrap support, and Accumulibacter genomes are annotated by their ppk1 type.

Figure S4: Boxplots of SNP frequencies for core Accumulibacter genes, shown for each genome in each metagenome sample in which it was present. Genes with a SNP frequency greater than the third quartile plus 1.5x the interquartile range are marked as outliers with a blue plus symbol.
The median SNP frequency for all of the genes is marked with a red line.

Figure S5: Mean coverage per contig for all Accumulibacter strains sequenced in this study in SBR4 and SBR5, which were sampled at six and three timepoints, respectively.

Figure S6: Comparison of COG categories for unique genes in each Accumulibacter genome. COG categories: C, energy production and conversion; D, cell cycle control, cell division, chromosome partitioning; E, amino acid transport and metabolism; F, nucleotide transport and metabolism; G, carbohydrate transport and metabolism; H, coenzyme transport and metabolism; I, lipid transport and metabolism; J, translation, ribosomal structure and biogenesis; K, transcription; L, replication, recombination and repair; M, cell wall/membrane/envelope biogenesis; N, cell motility; O, posttranslational modification, protein turnover, chaperones; P, inorganic ion transport and metabolism; Q, secondary metabolites biosynthesis, transport and catabolism; R, general function prediction only; S, function unknown; T, signal transduction; U, intracellular trafficking, secretion and vesicular transport; V, defence mechanisms.

[Figure S7 image: gene maps of Accumulibacter BA-93 and Accumulibacter UW-2; labels: RuBisCO, ...NNN...; IMG gene IDs 2079071185, 2079071186, 2079071190, 2079071189, 2079071188, 2079071187, 2079071191, 2079071192.]

Figure S7: Alignment of the Accumulibacter UW-2 and BA-93 genomes around the position of RuBisCO. Both genomes are syntenic in this region, as represented by the grey shading between genes; however, a gap in the scaffold of the UW-2 genome has removed the majority of RuBisCO and two other genes. The IMG gene IDs are given for the genes in the Accumulibacter UW-2 genome.

[Figure S8 image: phylogenetic tree with tips labelled by organism name, including Accumulibacter SK-01 and Accumulibacter BA-91; scale bar 0.4.]

Figure S8: Maximum likelihood phylogenetic tree of Accumulibacter NarG with NarG proteins from representative Burkholderiales genomes. Pseudogulbenkiania (Neisseriales) was used as the outgroup.

Figure S9: Contig coverage per 10 kbp block for each of the Accumulibacter strains sequenced in the study.
All contigs from each draft genome are shown concatenated in order along the x-axis of each subplot. For contigs shorter than 10 kbp, the average coverage across the total length of the contig was used. For each genome, only the samples in which it was identified are shown.
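The block averaging described for Figure S9 can be sketched as follows. This is a minimal illustration, not the authors' plotting code. The per-base depth lists are an assumed input format, the names `block_coverage` and `genome_profile` are hypothetical, and the handling of a final partial block on longer contigs (averaged here as its own block) is an assumption the caption does not specify.

```python
from statistics import mean

BLOCK = 10_000  # block size in bp (10 kbp, as stated in the caption)


def block_coverage(depths, block=BLOCK):
    """Mean coverage per block for one contig.

    depths: per-base read depths for the contig (assumed input format).
    A contig shorter than one block yields a single whole-contig average.
    """
    if not depths:
        return []
    if len(depths) < block:
        return [mean(depths)]
    # Assumption: a trailing partial block is averaged as its own block.
    return [mean(depths[i:i + block]) for i in range(0, len(depths), block)]


def genome_profile(contigs, block=BLOCK):
    """Concatenate block coverages of all contigs, in the order given."""
    profile = []
    for depths in contigs:
        profile.extend(block_coverage(depths, block))
    return profile


if __name__ == "__main__":
    # Toy example with a block size of 2 bp so the arithmetic is visible.
    print(genome_profile([[1, 3], [2, 2, 4, 4]], block=2))
```

One such profile per sample, plotted against block index, gives the kind of concatenated x-axis the caption describes.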