SUPPLEMENTARY MATERIAL for

Taxonomic Relatedness Shapes Bacterial Assembly in Activated Sludge of Globally Distributed Wastewater Treatment Plants

Feng Ju1, Yu Xia1, Feng Guo1, Zhiping Wang1,2, Tong Zhang1*

1Environmental Biotechnology Lab, The University of Hong Kong SAR, China; 2School of Environmental Science and Engineering, Shanghai Jiao Tong University, Shanghai, China.

Submitted to Environmental Microbiology

*Corresponding author phone: +852-28578551; fax: +852-25595337; e-mail: zhangt@hkucc.hku.hk

Supporting Information S1: Molecular methods
Supporting Information S2: Topological properties of the activated sludge bacterial network
Supporting Information S3: R commands used to analyse the species co-occurrence patterns

Figures
Figure S1 Flowchart of the network analysis of activated sludge bacterial community
Figure S2 Degree distribution of the nodes for the activated sludge association network (closed circles) and random network (open squares) of identical size, respectively.
Figure S3 The relative abundance of phylum Acidobacteria (upper panel) and Verrucomicrobia (lower panel) in the 50 activated sludge samples.
Figure S4 The relative abundance of functional genera in the 50 activated sludge samples.
Figure S5 Examples of strong and significant correlations within different groups of bacterial genera in the activated sludge.

Tables
Table S1 The 50 activated sludge samples from globally-collected wastewater treatment plants (WWTPs) and the 454 datasets used in the network analysis
Table S2 Strong (Spearman’s ρ < -0.6) and significant (P-value < 0.01) negative correlations identified between bacterial genera of activated sludge.
Table S3 Clustering coefficient and average path length of the microbial co-occurring network of activated sludge (AS) and comparisons to other real ecological networks and the corresponding Erdős-Rényi random networks
Table S4 The observed co-occurring incidence for intra- and inter-phylum/class co-occurrence versus that expected under random association.
Table S5 The inter-correlations between activated sludge generalists (nodes in Figure 4) and the number of co-occurring samples.

Supplementary Information S1: Molecular methods

For the first group of 17 industrial AS samples, the V3-V4 regions of the 16S rRNA genes (~465 nucleotides) were amplified with primers 338F (5’-ACTCCTACGGRAGGCAGCAG-3’) and 802R (5’-TACNVGGGTATCTAATCC-3’) (Claesson et al 2010). Barcodes for sample multiplexing during sequencing were added to the 5’ terminus of the forward primer. Each polymerase chain reaction (PCR) was conducted in a 100 µl reaction system using MightyAmp polymerase (TaKaRa, Otsu, Japan) and an i-Cycler (BioRad, Hercules, CA, USA) under the following conditions: initial denaturation at 94 °C for 5 min; 35 cycles of 94 °C for 50 s, 40 °C for 30 s and 72 °C for 90 s; and a final extension at 72 °C for 5 min. The resulting barcoded PCR products for each sample were first purified, then mixed in equal concentrations by mass, and finally sequenced on a Roche 454 FLX Titanium platform (Roche) at the Genome Research Center of the University of Hong Kong.

Supplementary Information S2: Topological properties of the activated sludge bacterial network

For the AS network, the observed average path length (APL, 3.55), clustering coefficient (CC, 0.55) and modularity (MD, 0.57; values > 0.4 suggest that the network has a modular structure (Newman 2006)) were all greater than the APLr (2.74), CCr (0.057) and MDr (0.35) of the corresponding Erdős-Rényi random networks (Table S3).
The high CC/CCr ratio of 10.2 strongly suggested that the correlation network had ‘small-world’ properties, that is, nodes are more connected than in an identical-size random network (Watts and Strogatz 1998) (see Supplementary Information S1 for more details about the network topologies). Although the effect size (measured as the log response ratio, ln(CC/CCr)) was smaller than that of the soil microbial network (3.31) (Barberán et al 2011), it was much higher than those of other ecological networks, such as food-web networks (-1.20 to 1.34) (Dunne et al 2002, Montoya et al 2006) and functional gene networks of grassland soil (0.79 to 1.28) (Zhou et al 2010), slightly higher than that of the marine three-domain (Bacteria, Archaea and Protozoa) network (1.81) (Steele et al 2011), and comparable to the upper bound of pollinator-plant networks (2.39) (Montoya et al 2006, Olesen et al 2006). Interestingly, the CC of 0.43 for the one-domain (Bacteria) activated sludge microbial network in this study was higher than the 0.33 for the two-domain (Bacteria and Archaea) soil microbial network (Barberán et al 2011), and both were higher than the 0.27 for the three-domain marine network (Steele et al 2011). This may indicate a higher frequency of strong co-occurrence between taxa within the bacterial domain than between a bacterial taxon and an archaeal or eukaryotic taxon. While the higher CC implies that the AS bacterial network included more highly correlated edges (relationships) than other ecological networks, comparisons of topological properties between these networks offer insights into how the general characteristics of different habitat types affect the assembly structure of microbial communities.
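The effect size used in the comparison above is the natural log of the observed-to-random ratio. As a minimal sketch (not the authors' own script), the following Python snippet recomputes it from the AS network values listed in Table S3; the function name `log_response_ratio` is a hypothetical helper introduced only for illustration.

```python
import math

def log_response_ratio(observed: float, random_expectation: float) -> float:
    """Natural log of observed / random, i.e. the effect size reported in Table S3."""
    if observed <= 0 or random_expectation <= 0:
        raise ValueError("both values must be positive")
    return math.log(observed / random_expectation)

# AS network values as listed in Table S3 (not the values quoted in the running text).
cc, cc_random = 0.56, 0.067
apl, apl_random = 3.42, 2.57

cc_ratio = cc / cc_random                           # CC/CCr
cc_effect = log_response_ratio(cc, cc_random)       # ln(CC/CCr)
apl_effect = log_response_ratio(apl, apl_random)    # ln(APL/APLr)

print(f"CC/CCr = {cc_ratio:.2f}, ln(CC/CCr) = {cc_effect:.2f}")
print(f"ln(APL/APLr) = {apl_effect:.3f}")
```

Because the tabulated CC and APL values are themselves rounded, the last digit of a recomputed ratio can differ slightly from the corresponding entry in Table S3.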
Based on the modularity class, the entire network could be parsed into 7 major modules (i.e., clusters of species that interact more among themselves than with other species, compared to a random association), with 82 of the 107 vertices belonging to the three largest modules: Modules I, II and III (Figure 1c). Two types of hubs (highly connected nodes) appeared in the AS co-occurring network: (1) genera highly connected within one module (i.e., within-module hubs, e.g., Hyphomicrobium, Prosthecobacter, and Thiobacter) and (2) genera acting as ‘connectors’ between multiple modules (i.e., between-module hubs, such as Clostridium XI, Variovorax, and Gp16). The presence of a small number of highly connected hubs in the ‘small-world’ AS bacterial network rendered it more robust to change, whereas the elimination of hubs from the network would change its structure dramatically (Albert et al 2000). Thus, these hubs could be regarded as microbial ‘keystone species’ in AS. In particular, the between-module hubs were important for linking different modules within the AS network (e.g., Gp16 connects Modules I and III). In Modules II and III, the number and vertex degree of the between-module hubs far exceeded those of the within-module hubs (Figure 1c), possibly revealing a very close linkage between these two modules in the AS network.
Supporting Information S3: R commands used to analyse the species co-occurrence patterns

a) C-score calculation (vegan package)
    library(vegan)
    oecosimu(data.matrix, nestedchecker, method = "swap", nsimul = 30000)

b) Spearman’s correlation calculation (Hmisc package)
    library(Hmisc)
    rcorr(t(data.matrix), type = "spearman")

c) Multiple testing correction for P-value adjustment
    p.adjust(p.matrix, method = "BH")

d) Generation of graph using igraph
    library(igraph)
    graph.adjacency(qualified.data.matrix, weighted = TRUE, mode = "undirected")

e) Generation of 10000 Erdős-Rényi random graphs
    for (i in 1:10000) {
      # random graph with 107 vertices and 382 edges (same size as the AS network)
      g <- erdos.renyi.game(107, 382, type = "gnm", directed = FALSE)
      g <- simplify(g)
      V(g)$label <- taxonomy(name)   # label lookup as written in the original listing
      V(g)$degree <- degree(g)
      write.graph(g, paste(i, ".random.gml", sep = ""), format = "gml")
    }

Supplementary Figures

[Figure S1 flowchart. Box labels: 454 pyrosequencing of 16S rRNA amplicons of 50 activated sludge samples; bacterial abundance matrix; bacterial incidence matrix; correlation analysis (e.g. Spearman); species occupancy and checkerboard score testing; whether co-occurrence is non-random or not; non-random co-occurrence patterns; activated sludge generalists; inter-phylum/class correlation; intra-phylum/class correlation; visualization of activated sludge network; module detection and degree distribution; generation of 10000 Erdős-Rényi random networks; incidence of inter- and intra-phylum/class co-occurrence; network topological characterization; co-occurrence patterns within AS generalists and functional groups; who are the core species, and how do they interact with each other?]

Figure S1 Flowchart of the network analysis of activated sludge bacterial community

[Figure S2 plot: P(k) against vertex degree (k); power-law fit, R² = 0.92; Gaussian fit, R² = 0.91]

Figure S2 Degree distribution of the nodes for the activated sludge association network (closed circles) and random network (open squares) of identical size, respectively. The node degree (i.e., the number of edges connected to the node) is plotted against the probability P(k) that a node would have that degree in the network.
The solid line shows the power-law fit (R² = 0.92) to the degree distribution of the activated sludge association network, and the dashed line shows the Gaussian fit (R² = 0.91) to the degree distribution of the random network.

For the entire activated sludge (AS) bacterial network, the node degree distribution best follows a scale-free power-law distribution ($P(k) = 0.3446\,(1+k)^{-0.9665}$; R² = 0.92, Figure S2), resulting from the preferential attachment of new vertices to the more highly connected vertices. This is quite different from the Poisson shape of the random network of identical size (best fitted by a Gaussian curve, R² = 0.91) (Figure S2). Although the exponent of the power-law distribution for the AS bacterial network was much smaller than the typical exponents of 2-4 observed for many large networks such as social or protein interaction networks (Amaral et al 2000, Newman 2003, Palla et al 2005), this value was very close to those of certain other ecological networks (e.g., a marine microbial network (Steele et al 2011) and some food webs (Dunne et al 2002)). This structural similarity among these ecological networks, in contrast with the Gaussian connectivity distribution expected under randomness, also indicated the existence of meaningful, non-random associations in the AS bacterial network.

[Figure S3 plot: relative abundance (%) of Acidobacteria (upper panel) and Verrucomicrobia (lower panel) against sample ID. S1-S21: municipal activated sludge; S22-S50: industrial activated sludge. Samples labelled in the upper panel: S26 (textile), S27 and S29 (polymer), S34-37 (morpholine), S41-42 (acrylonitrile)]

Figure S3 The relative abundance of phylum Acidobacteria (upper panel) and Verrucomicrobia (lower panel) in the 50 activated sludge samples. The relative abundance is calculated as the number of sequences assigned to the taxon divided by the total number of 16S rRNA gene sequences in the sample.
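The power-law fit reported for Figure S2 can be illustrated with a log-log least-squares regression. This is a sketch under assumptions: the source does not state which fitting procedure was used, the function `fit_power_law` is hypothetical, and the degree distribution below is synthetic, generated from the reported curve purely to check that the fit recovers the reported parameters.

```python
import math

def fit_power_law(degrees, probabilities):
    """Least-squares fit of ln P(k) = ln a + b * ln(1 + k); returns (a, b, r_squared)."""
    xs = [math.log(1 + k) for k in degrees]
    ys = [math.log(p) for p in probabilities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return math.exp(intercept), slope, 1 - ss_res / ss_tot

# Synthetic (hypothetical) data drawn from the reported curve, not the real network.
degrees = list(range(1, 27))
probabilities = [0.3446 * (1 + k) ** -0.9665 for k in degrees]

a, b, r_squared = fit_power_law(degrees, probabilities)
print(f"a = {a:.4f}, b = {b:.4f}, R^2 = {r_squared:.4f}")
```

With real degree counts the points scatter around the line, so R² falls below 1, as in the reported fit (R² = 0.92); degrees with zero observed probability would need to be excluded before taking logarithms.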
[Figure S4 plot. Panel a: richness (number of genera), evenness (Simpson index) and diversity (Shannon index) of municipal and industrial AS at the 0.01% and 0.1% abundance cutoffs. Panel b: percentage (%) in total 16S pyrotags for each genus, municipal versus industrial AS, with asterisks marking significant differences]

Figure S4 Comparison between the municipal (S1-S21) and industrial (S22-S50) activated sludge (AS) samples in terms of: (I) the bacterial richness, evenness and diversity, and (II) functional groups with significantly different (t-test, P-value < 0.05) abundances in the two groups of AS. In inset a, the 0.01% and 0.1% cutoffs represent the minimum relative abundance required for a genus to be counted as present. In inset b, the asterisks above the boxes of each genus indicate the significance of the difference: *P-value < 0.05, **P-value < 0.005 and ***P-value < 0.0005. Boxplots represent observation-normalized results, where boxes represent the interquartile range, whiskers indicate the 0th and 100th percentiles and plus symbols indicate outliers. The underlined genera are AS functional generalists (occurring in at least 60% of the AS samples) that are widely distributed across the activated sludge samples.

Figure S4a shows that the bacterial diversity and richness of the municipal AS were higher than those of the industrial AS. Figure S4b shows that 8 of the 24 functional groups (Figure 4) had significantly different abundances (t-test, P-value < 0.05) between the municipal and industrial AS. For example, Dechloromonas and Nitrosospira had higher abundance in the municipal AS, whereas Hyphomicrobium, Defluviicoccus and Thauera were more abundant in the industrial AS.
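The abundance cutoffs and the Shannon index described for Figure S4a can be sketched as follows. This is an illustrative Python example only: the genus names and read counts are hypothetical, the helper functions are not from the source, and treating a genus exactly at the cutoff as present (`>=`) is an assumption. The Simpson-based evenness is omitted because the source does not specify which variant was used.

```python
import math

def relative_abundance(counts):
    """Percentage of total reads assigned to each genus."""
    total = sum(counts.values())
    return {genus: 100.0 * n / total for genus, n in counts.items()}

def richness(counts, cutoff_percent):
    """Number of genera whose relative abundance reaches the cutoff (in %)."""
    return sum(1 for p in relative_abundance(counts).values() if p >= cutoff_percent)

def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over genera with non-zero counts."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values() if n > 0)

# Hypothetical read counts for one sample (10,000 reads in total).
sample = {"genus_A": 5000, "genus_B": 3000, "genus_C": 1985, "genus_D": 12, "genus_E": 3}

print("richness at 0.01% cutoff:", richness(sample, 0.01))
print("richness at 0.1% cutoff:", richness(sample, 0.1))
print("Shannon index:", round(shannon_index(sample), 3))
```

Raising the cutoff from 0.01% to 0.1% removes rare genera from the richness count, which is why Figure S4a reports richness at both thresholds.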
[Figure S5 plots: relative abundance (%) against WWTP id. Panel a: Nitrosospira, Thauera, Mesorhizobium and Hyphomicrobium (Spearman's ρ = 0.64 and 0.63). Panel b: Bifidobacterium and Blautia; Bifidobacterium and Lactococcus (Spearman's ρ = 0.85 and 0.71). Panel c: Gp16, Gp6 and Gp3 (Spearman's ρ = 0.67-0.78); Bdellovibrio and Acidovorax (Spearman's ρ = 0.61)]

Figure S5 Examples of strong and significant correlations within different groups of bacterial genera in the activated sludge. A correlation is considered strong and significant when the Spearman’s correlation coefficient (ρ) > 0.6 and the P-value < 0.01. The x-axis of each subfigure corresponds to the id (Table S1) of the individual WWTP and the y-axis is the relative representation of each taxon in the corresponding total 16S rRNA pyrotag library.

Supplementary Tables

Table S1 The 50 activated sludge samples from globally-collected wastewater treatment plants (WWTPs) and the 454 datasets used in the network analysis. Samples with IDs S01 to S21 were collected from full-scale WWTPs treating municipal wastewater; samples with IDs S22 to S50 were collected from full-scale WWTPs treating various industrial wastewaters. MWW: municipal wastewater; IWW: industrial wastewater.
16S datasets from 454 pyrosequencing Activated sludge sample Sam ple ID S01 S02 S03 S04 S05 S06 S07 S08 S09 S10 S11 WWTP Latitude location and Wastewater (city, Longitud constitutes country) e Guelph, 43.54, Canada -80.24 removal foaming or not or not A/O Yes NA 22 01-2010 21308 224 A/A/O Yes NA 17 05-2010 24557 223 A/O Yes NA 10 04-2010 24455 224 A/O Yes NA 16 05-2010 26077 224 Yes NA 16 12-2010 21714 223 time 16S (0C) (mm-year) pyrotag s ge length (bp) MWW Columbia, 34.00, predominent USA -81.03 MWW 33.24, predominent -84.26 MWW Haerbin, 45.80, predominent A/A/O+M China 126.53 MWW BR Beijing, 39.90, China 116.40 95% MWW A/A/O Yes NA 17 12-2009 23106 223 Qingdao, 36.06, China 120.38 70% MWW A/A/O Yes NA 26 12-2009 24160 223 Nanjing, 32.06, China 118.79 NO NA 27 11-2009 20927 223 Shanghai, 31.23, China 121.47 CAS NO NA 17 11-2009 20659 224 Wuhan, 30.59, predominent China 114.30 MWW OD NO NA 18 12-2009 22227 223 Guangzhou, 23.12, China 113.26 60% MWW CAS NO NA 13 04-2010 21525 223 90% MWW A/O Yes NA 27 08-2010 19151 223 100% MWW A/O Yes NA 27 08-2010 24442 223 30 11-2009 27648 223 31 11-2009 27366 223 Griffin, USA i,Hong Kong, 22.50, 147.12 Stanley,Hong 22.50, Kong, China 147.12 g Kong, g Kong, China S17 setup ature Avera predominent Sha-Tin,Hon S16 & of clear 1.32, China S15 nitrogen Sampling 103.77 Sha-Tin,Hon S14 Process Temper Singapore China S13 Bulking Ulu Pandan, Shek-Wu-Hu S12 85% MWW Number Design for 22.38, 114.19 22.38, 114.19 Tai-Po,Hong 22.44, Kong, China 114.16 Yuen-Long, 22.44, 85% MWW 70% MWW CAS+MB R Yes 95% MWW A/O Yes (Jan. to Mar.) Yes 95% MWW A/O Yes (Jan. to Mar.) 
95% MWW A/O Yes NA 31 11-2011 33307 208 95% MWW A/O Yes NA 30 11-2011 28600 208 Data sourc e this study this study this study this study this study this study this study this study this study this study this study this study this study this study this study this study this Hong Kong, 114.02 study China SRR S18 Buenos-Aires -34.60, predominent , Argentina -58.38 MWW CAS NO NA 21 11-2008 8141 408 6277 55, SRA SRR S19 Buenos-Aires -34.60, predominent , Argentina -58.38 MWW CAS NO NA 19 03-2009 9596 398 6277 56, SRA S20 S21 Seoul, Korea Seoul, Korea 37.56, predominent 126.97 MWW 37.56, predominent 126.97 MWW Attac A/A/O Yes NA 23 06-2009 43991 453 hed AS Susp A/A/O Yes NA 23 06-2009 23695 435 ende d AS SRR S22 Cordoba, -31.39, Whey filtering Argentina -64.18 IWW CAS NO NA 22 05-2010 19032 420 6277 52, SRA S23 Buenos-Aires -34.60, , Argentina -58.38 Petroleum refinery SRR Yes, but A/O IWW low NA 34 05-2009 13004 421 efficiency 6277 53, SRA SRR S24 Buenos-Aires -34.60, Pharmaceutical , Argentina -58.38 IWW CAS NO NA 21 05-2011 22405 408 6277 54, SRA SRR S25 Buenos-Aires -34.60, Textile dyeing , Argentina -58.38 IWW CAS NO NA 33 09-2008 19187 418 6277 57, SRA SRR S26 Buenos-Aires -34.60, Textile dyeing , Argentina -58.38 IWW CAS NO NA 28 05-2011 7252 417 6277 58, SRA SRR S27 Buenos-Aires -34.60, Acrylic , Argentina -58.38 polymer IWW CAS NO NA 23 05-2009 15276 438 6277 59, SRA SRR S28 Buenos-Aires -34.60, Pharmaceutical , Argentina -58.38 IWW CAS NO NA 22 07-2011 11220 411 6277 60, SRA SRR S29 Buenos-Aires -34.60, Acrylic , Argentina -58.38 polymer IWW CAS NO NA 23 09-2008 13491 415 6277 61, SRA S30 Cordoba, -31.39, Whey filtering Argentina -64.18 IWW SRR CAS NO NA 20 06-2011 9972 423 6277 73, SRA S31 Buenos-Aires -34.60, Petroleum , Argentina -58.38 refinery IWW SRR Yes, but A/O low NA 35 05-2011 5859 388 efficiency 6277 74, SRA SRR S32 Buenos-Aires -34.60, , Argentina -58.38 Pet food IWW CAS NO NA 22 01-2011 9307 400 6277 75, SRA SRR S33 Buenos-Aires -34.60, , Argentina 
-58.38 Pet food IWW CAS NO NA 22 09-2008 23234 402 6277 76, SRA S34 S35 S36 S37 S38 S39 S40 S41 S42 Shanghai, 31.23, Morpholine China 121.47 IWW Shanghai, 31.23, Morpholine China 121.47 IWW Shanghai, 31.23, Morpholine China 121.47 IWW Shanghai, 31.23, Morpholine China 121.47 IWW Shanghai, 31.23, China 121.47 Shanghai, 31.23, China 121.47 Shanghai, 31.23, China 121.47 A/O Yes A/O Yes A/O Yes A/O Yes Coking IWW A/A/O Yes Coking IWW A/A/O Coking IWW Shanghai, 31.23, Acrylonitrile China 121.47 IWW Shanghai, 31.23, Acrylonitrile China 121.47 IWW Yes(som 25 08-2012 6886 417 25 08-2012 8699 416 25 08-2012 8675 416 25 08-2012 8066 417 NA 25 08-2012 6815 425 Yes NA 25 08-2012 6158 425 A/A/O Yes NA 25 08-2012 5960 425 A/O NA NA 25 08-2012 7799 419 A/O NA NA 25 08-2012 7978 419 BCO Yes NA 20 10-2012 3993 402 BCO Yes NA 20 10-2012 3104 402 BCO Yes NA 20 10-2012 3496 402 BCO Yes NA 20 10-2012 3461 402 BCO Yes NA 20 10-2012 3965 403 etimes) Yes(som etimes) Yes(som etimes) Yes(som etimes) this study this study this study this study this study this study this study this study this study 80% printing S43 Wuxi, China 31.49, and dyeing 120.31 IWW, 20% this study MWW 80% printing S44 Wuxi, China 31.49, and dyeing 120.31 IWW, 20% this study MWW 80% printing S45 Wuxi, China 31.49, and dyeing 120.31 IWW, 20% this study MWW 80% printing S46 Wuxi, China 31.49, and dyeing 120.31 IWW, 20% this study MWW 80% printing S47 Wuxi, China 31.49, and dyeing 120.31 IWW, 20% MWW this study 80% printing S48 Wuxi, China 31.49, and dyeing 120.31 IWW, 20% BCO Yes NA 20 10-2012 4674 401 BCO Yes NA 20 10-2012 4889 402 BCO Yes NA 20 10-2012 3790 401 this study MWW 85% printing S49 Wuxi, China 31.49, and dyeing 120.31 IWW, 15% this study chemical IWW 80% printing S50 Wuxi, China 31.49, and dyeing 120.31 IWW, 20% MWW Abbreviations: A/O, anoxic/aerobic; A/A/O, anaerobic/anoxic/aerobic; CAS, conventional activated sludge; MBR, membrane bioreactor; NA, not available; OD: oxidation ditch; STPs, sewage treatment 
plants; BCO, biological contact oxidation process.
this study

Table S2 Strong (Spearman’s ρ < -0.6) and significant (P-value < 0.01) negative correlations identified between bacterial genera of activated sludge. In total, 20 of the 402 pairs of significant and robust correlations were negative, involving 23 of the 110 co-occurring bacterial genera. ‘Module’ refers to the sub-clusters of the positive co-occurrence network of activated sludge bacteria (Figure 1c). NA: genera not found in any module of the positive network. Half of the 20 pairs of strong negative correlations involved Alicycliphilus and Sphingopyxis. These two bacterial genera were mainly detected in the industrial AS samples and were mostly negatively correlated with genera (e.g. Flavobacterium, Blautia and Variovorax) that appeared more often in municipal AS (Module I). Their roles in the industrial AS were most likely associated with the biodegradation of alicyclic and hydrocarbon compounds, respectively.
| Node1 | Node1-affiliated phylum/class | Node1 module | Node2 | Node2-affiliated phylum/class | Node2 module | Correlation |
|---|---|---|---|---|---|---|
| Alicycliphilus | Betaproteobacteria | III | Flavobacterium | Bacteroidetes | I | -0.70626 |
| Alicycliphilus | Betaproteobacteria | III | Luteimonas | Gammaproteobacteria | I | -0.68482 |
| Alicycliphilus | Betaproteobacteria | III | Blautia | Firmicutes | I | -0.60155 |
| Alicycliphilus | Betaproteobacteria | III | Prosthecobacter | Verrucomicrobia | I | -0.62994 |
| Alicycliphilus | Betaproteobacteria | III | Opitutus | Verrucomicrobia | I | -0.62136 |
| Sphingopyxis | Alphaproteobacteria | III | Variovorax | Betaproteobacteria | I | -0.6181 |
| Sphingopyxis | Alphaproteobacteria | III | Blautia | Firmicutes | I | -0.62385 |
| Sphingopyxis | Alphaproteobacteria | III | Parachlamydia | Chlamydiae | II | -0.64476 |
| Sphingopyxis | Alphaproteobacteria | III | Azoarcus | Betaproteobacteria | NA | -0.62192 |
| Sphingopyxis | Alphaproteobacteria | III | Lewinella | Bacteroidetes | NA | -0.62018 |
| Rhodoplanes | Alphaproteobacteria | III | Opitutus | Verrucomicrobia | I | -0.64346 |
| Rhodoplanes | Alphaproteobacteria | III | Zoogloea | Betaproteobacteria | I | -0.64166 |
| Sphaerobacter | Chloroflexi | III | Curvibacter | Betaproteobacteria | I | -0.66003 |
| Sphaerobacter | Chloroflexi | III | Rubrivivax | Betaproteobacteria | NA | -0.65837 |
| Flavobacterium | Bacteroidetes | I | Hyphomicrobium | Alphaproteobacteria | III | -0.63994 |
| Flavobacterium | Bacteroidetes | I | Elioraea | Alphaproteobacteria | III | -0.6208 |
| Parvibaculum | Alphaproteobacteria | I | Byssovorax | Deltaproteobacteria | II | -0.62944 |
| Bifidobacterium | Actinobacteria | I | Caenimonas | Betaproteobacteria | V | -0.60046 |
| Lewinella | Bacteroidetes | NA | Caenimonas | Betaproteobacteria | V | -0.64731 |
| Lewinella | Bacteroidetes | NA | Ideonella | Betaproteobacteria | VII | -0.65017 |

Table S3 Clustering coefficient and average path length of the bacterial co-occurring network of activated sludge (AS) and comparisons to other real ecological networks and the corresponding Erdős-Rényi random networks

| Network | CC a | CCr | CC/CCr | ln(CC/CCr) | APL | APLr | ln(APL/APLr) |
|---|---|---|---|---|---|---|---|
| AS network (this study) | 0.56 | 0.067 | 8.36 | 2.12 | 3.42 | 2.57 | 0.28 |
| Other ecological networks b | | | | | | | |
| Soil microbial network | 0.33 | 0.012 | 27.5 | 3.31 | 5.53 | 3.88 | 0.35 |
| Marine microbial network | 0.27 | 0.044 | 6.14 | 1.81 | 2.99 | 2.62 | 0.13 |
| Food-web network | 0.02 to 0.43 | 0.03 to 0.33 | 0.30 to 3.80 | -1.20 to 1.34 | 1.33 to 3.74 | 1.41 to 3.73 | 0.34 to 1.32 |
| Pollinator-plant networks | 0.72 to 1.00 | 0.08 to 1.0 | 1.0 to 10.9 | 0.0 to 2.39 | 1.0 to 2.31 | ND | ND |
| Functional gene networks | 0.10 to 0.22 | 0.028 to 0.099 | 2.22 to 3.57 | 0.79 to 1.28 | 3.09 to 4.21 | 3.00 to 3.84 | 0.030 to 0.091 |

a CC is the average clustering coefficient for the real network; CCr is the clustering coefficient identified from a random network of identical size; CC/CCr is the ratio of CC to CCr; APL is the average path length for the real network; APLr is the average path length for the random network; ln(CC/CCr) and ln(APL/APLr) are the log response ratios for the average clustering coefficient and average path length between the observed and random networks; ND indicates missing data.
b Topological properties of the soil microbial networks were calculated using data from Barberán et al (2011); the topological properties of the marine microbial network, food-web network, pollinator-plant network, and functional gene networks were quoted from Steele et al (2011).

Table S4 The observed co-occurring incidence for intra- and inter-phylum/class co-occurrence versus that expected under random association. The observed co-occurring incidence (O) of two taxa was calculated as the percentage of the total 382 edges of the AS positive network that were observed between them, while the random co-occurring incidence (R) was estimated using two different approaches: (I) RER was the mean value of the observed co-occurring incidences for 10,000 identical-sized Erdős-Rényi random networks; (II) RTheo was the theoretical incidence of co-occurrence calculated by considering the phylum/class frequencies and random association. Here we only consider those strong co-occurrence patterns (between two phyla/classes) that were consistently supported by both of the above approaches and by at least 5 edges.
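The two incidence measures defined in the Table S4 caption can be sketched in Python as below. The formula used for RTheo is an assumption inferred from the caption (the number of possible node pairs between or within groups, divided by all possible pairs among the 107 network nodes); it reproduces the tabulated RTheo values for the rows checked here, but the source does not spell it out. The function names are hypothetical.

```python
from math import comb

TOTAL_NODES = 107   # vertices in the AS network
TOTAL_EDGES = 382   # edges in the positive AS network

def observed_incidence(n_edges):
    """O: percentage of all network edges that link the two groups."""
    return 100.0 * n_edges / TOTAL_EDGES

def theoretical_incidence(freq1, freq2=None):
    """RTheo (assumed form): percentage of all possible node pairs.

    With one frequency the pairs are intra-group, C(freq1, 2);
    with two frequencies they are inter-group, freq1 * freq2.
    """
    all_pairs = comb(TOTAL_NODES, 2)
    pairs = comb(freq1, 2) if freq2 is None else freq1 * freq2
    return 100.0 * pairs / all_pairs

# Intra-class example: alpha-proteobacteria (23 nodes, 29 observed edges).
o_alpha = observed_incidence(29)
r_alpha = theoretical_incidence(23)

# Inter-phylum example: Firmicutes (8 nodes) and Actinobacteria (10 nodes), 22 edges.
o_firm_act = observed_incidence(22)
r_firm_act = theoretical_incidence(8, 10)

print(f"alpha-proteobacteria: O = {o_alpha:.2f}%, RTheo = {r_alpha:.2f}%")
print(f"Firmicutes-Actinobacteria: O = {o_firm_act:.2f}%, RTheo = {r_firm_act:.2f}%, "
      f"O/RTheo = {o_firm_act / r_firm_act:.1f}")
```

An O/RTheo ratio well above 1, as for Firmicutes and Actinobacteria, means the two groups co-occur more often than their frequencies alone would predict.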
Overall, 91 of the 382 edges in the AS network were intra-phylum co-occurrence correlations (edges), while an average of 48.4 (±6.3) intra-phylum co-occurrence correlations were obtained from the 10,000 Erdős-Rényi random networks.

| Node1 phylum/class | Node2 phylum/class | Frequency Node1 | Frequency Node2 | Number of observed edges | Observed (O, %) | RandomER (RER, %) | RandomTheo (RTheo, %) | O/RER | O/RTheo |
|---|---|---|---|---|---|---|---|---|---|
| α-proteobacteria | α-proteobacteria | 23 | 23 | 29 | 7.59 | 4.45 | 4.46 | 1.7 | 1.7 |
| Firmicutes | Firmicutes | 8 | 8 | 14 | 3.66 | 0.58 | 0.49 | 6.4 | 7.4 |
| β-proteobacteria | β-proteobacteria | 24 | 24 | 14 | 3.66 | 4.88 | 4.87 | 0.8 | 0.8 |
| Actinobacteria | Actinobacteria | 10 | 10 | 12 | 3.14 | 0.83 | 0.79 | 3.8 | 4.0 |
| Acidobacteria | Acidobacteria | 8 | 8 | 9 | 2.36 | 0.57 | 0.49 | 4.1 | 4.8 |
| γ-proteobacteria | γ-proteobacteria | 10 | 10 | 5 | 1.31 | 0.83 | 0.79 | 1.6 | 1.6 |
| δ-proteobacteria | δ-proteobacteria | 8 | 8 | 4 | 1.05 | 0.58 | 0.49 | 1.8 | 2.1 |
| Verrucomicrobia | Verrucomicrobia | 3 | 3 | 2 | 0.52 | 0.28 | 0.05 | 1.9 | 9.9 |
| Bacteroidetes | Bacteroidetes | 4 | 4 | 1 | 0.26 | 0.31 | 0.11 | 0.9 | 2.5 |
| Chloroflexi | Chloroflexi | 4 | 4 | 1 | 0.26 | 0.31 | 0.11 | 0.8 | 2.5 |
| Firmicutes | Actinobacteria | 8 | 10 | 22 | 5.76 | 1.42 | 1.41 | 4.1 | 4.1 |
| β-proteobacteria | γ-proteobacteria | 24 | 10 | 18 | 4.71 | 4.24 | 4.23 | 1.1 | 1.1 |
| α-proteobacteria | β-proteobacteria | 23 | 24 | 17 | 4.45 | 9.71 | 9.73 | 0.5 | 0.5 |
| γ-proteobacteria | Firmicutes | 10 | 8 | 13 | 3.40 | 1.42 | 1.41 | 2.4 | 2.4 |
| α-proteobacteria | Actinobacteria | 23 | 10 | 12 | 3.14 | 4.05 | 4.06 | 0.8 | 0.8 |
| α-proteobacteria | Acidobacteria | 23 | 8 | 11 | 2.88 | 3.23 | 3.24 | 0.9 | 0.9 |
| β-proteobacteria | Acidobacteria | 24 | 8 | 11 | 2.88 | 3.39 | 3.39 | 0.8 | 0.9 |
| β-proteobacteria | Firmicutes | 24 | 8 | 10 | 2.62 | 3.39 | 3.39 | 0.8 | 0.8 |
| δ-proteobacteria | Acidobacteria | 8 | 8 | 10 | 2.62 | 1.14 | 1.13 | 2.3 | 2.3 |
| Verrucomicrobia | β-proteobacteria | 3 | 24 | 10 | 2.62 | 1.29 | 1.27 | 2.0 | 2.1 |
| β-proteobacteria | Actinobacteria | 24 | 10 | 9 | 2.36 | 4.23 | 4.23 | 0.6 | 0.6 |
| β-proteobacteria | δ-proteobacteria | 24 | 8 | 9 | 2.36 | 3.36 | 3.39 | 0.7 | 0.7 |
| δ-proteobacteria | Firmicutes | 8 | 8 | 9 | 2.36 | 1.14 | 1.13 | 2.1 | 2.1 |
| Firmicutes | Acidobacteria | 8 | 8 | 7 | 1.83 | 1.15 | 1.13 | 1.6 | 1.6 |
| Actinobacteria | Acidobacteria | 10 | 8 | 6 | 1.57 | 1.42 | 1.41 | 1.1 | 1.1 |
| α-proteobacteria | Chloroflexi | 23 | 4 | 6 | 1.57 | 1.62 | 1.62 | 1.0 | 1.0 |
| α-proteobacteria | Firmicutes | 23 | 8 | 6 | 1.57 | 3.24 | 3.24 | 0.5 | 0.5 |
| δ-proteobacteria | Actinobacteria | 8 | 10 | 6 | 1.57 | 1.42 | 1.41 | 1.1 | 1.1 |
| Verrucomicrobia | γ-proteobacteria | 3 | 10 | 6 | 1.57 | 0.60 | 0.53 | 2.6 | 3.0 |
| β-proteobacteria | Bacteroidetes | 24 | 4 | 5 | 1.31 | 1.70 | 1.69 | 0.8 | 0.8 |
| γ-proteobacteria | Bacteroidetes | 10 | 4 | 5 | 1.31 | 0.76 | 0.71 | 1.7 | 1.9 |
| WS3 | Acidobacteria | 1 | 8 | 5 | 1.31 | 0.33 | 0.14 | 4.0 | 9.3 |
| α-proteobacteria | γ-proteobacteria | 23 | 10 | 4 | 1.05 | 4.05 | 4.06 | 0.3 | 0.3 |
| Chlamydiae | Actinobacteria | 1 | 10 | 4 | 1.05 | 0.35 | 0.18 | 3.0 | 5.9 |
| Chloroflexi | Acidobacteria | 4 | 8 | 4 | 1.05 | 0.63 | 0.56 | 1.7 | 1.9 |
| ε-proteobacteria | α-proteobacteria | 1 | 23 | 4 | 1.05 | 0.51 | 0.41 | 2.1 | 2.6 |
| Firmicutes | Chlamydiae | 8 | 1 | 4 | 1.05 | 0.33 | 0.14 | 3.2 | 7.4 |
| γ-proteobacteria | Actinobacteria | 10 | 10 | 4 | 1.05 | 1.76 | 1.76 | 0.6 | 0.6 |
| Verrucomicrobia | Actinobacteria | 3 | 10 | 4 | 1.05 | 0.60 | 0.53 | 1.7 | 2.0 |
| α-proteobacteria | δ-proteobacteria | 23 | 8 | 3 | 0.79 | 3.24 | 3.24 | 0.2 | 0.2 |
| Chloroflexi | Actinobacteria | 4 | 10 | 3 | 0.79 | 0.74 | 0.71 | 1.1 | 1.1 |
| δ-proteobacteria | γ-proteobacteria | 8 | 10 | 3 | 0.79 | 1.42 | 1.41 | 0.6 | 0.6 |
| Verrucomicrobia | Bacteroidetes | 3 | 4 | 3 | 0.79 | 0.37 | 0.21 | 2.1 | 3.7 |
| Verrucomicrobia | Firmicutes | 3 | 8 | 3 | 0.79 | 0.52 | 0.42 | 1.5 | 1.9 |
| Verrucomicrobia | Spirochaetes | 3 | 1 | 3 | 0.79 | 0.28 | 0.05 | 2.8 | 14.8 |
| WS3 | δ-proteobacteria | 1 | 8 | 3 | 0.79 | 0.33 | 0.14 | 2.4 | 5.6 |
| α-proteobacteria | Chlamydiae | 23 | 1 | 2 | 0.52 | 0.50 | 0.41 | 1.0 | 1.3 |
| β-proteobacteria | Chloroflexi | 24 | 4 | 2 | 0.52 | 1.70 | 1.69 | 0.3 | 0.3 |
| Chlamydiae | Acidobacteria | 1 | 8 | 2 | 0.52 | 0.33 | 0.14 | 1.6 | 3.7 |
| δ-proteobacteria | Chlamydiae | 8 | 1 | 2 | 0.52 | 0.33 | 0.14 | 1.6 | 3.7 |
| γ-proteobacteria | Acidobacteria | 10 | 8 | 2 | 0.52 | 1.42 | 1.41 | 0.4 | 0.4 |
| γ-proteobacteria | Chlamydiae | 10 | 1 | 2 | 0.52 | 0.35 | 0.18 | 1.5 | 3.0 |
| Gemmatimonadetes | Acidobacteria | 1 | 8 | 2 | 0.52 | 0.33 | 0.14 | 1.6 | 3.7 |
| WS3 | Firmicutes | 1 | 8 | 2 | 0.52 | 0.33 | 0.14 | 1.6 | 3.7 |
| α-proteobacteria | Bacteroidetes | 23 | 4 | 1 | 0.26 | 1.63 | 1.62 | 0.2 | 0.2 |
| Bacteroidetes | Acidobacteria | 4 | 8 | 1 | 0.26 | 0.63 | 0.56 | 0.4 | 0.5 |
| β-proteobacteria | Chlamydiae | 24 | 1 | 1 | 0.26 | 0.52 | 0.42 | 0.5 | 0.6 |
| ε-proteobacteria | Chloroflexi | 1 | 4 | 1 | 0.26 | 0.29 | 0.07 | 0.9 | 3.7 |
| ε-proteobacteria | δ-proteobacteria | 1 | 8 | 1 | 0.26 | 0.33 | 0.14 | 0.8 | 1.9 |
| Firmicutes | Chloroflexi | 8 | 4 | 1 | 0.26 | 0.63 | 0.56 | 0.4 | 0.5 |
| γ-proteobacteria | Chloroflexi | 10 | 4 | 1 | 0.26 | 0.75 | 0.71 | 0.3 | 0.4 |
| Spirochaetes | β-proteobacteria | 1 | 24 | 1 | 0.26 | 0.52 | 0.42 | 0.5 | 0.6 |
| Verrucomicrobia | Acidobacteria | 3 | 8 | 1 | 0.26 | 0.52 | 0.42 | 0.5 | 0.6 |
| WS3 | Actinobacteria | 1 | 10 | 1 | 0.26 | 0.35 | 0.18 | 0.7 | 1.5 |
| WS3 | α-proteobacteria | 1 | 23 | 1 | 0.26 | 0.51 | 0.41 | 0.5 | 0.6 |
| WS3 | β-proteobacteria | 1 | 24 | 1 | 0.26 | 0.52 | 0.42 | 0.5 | 0.6 |
| WS3 | Chlamydiae | 1 | 1 | 1 | 0.26 | 0.26 | 0.02 | 1.0 | 14.8 |

Table S5 The inter-correlations between activated sludge (AS) generalists (nodes in Figure 4) and the number of co-occurring samples. The AS generalists are defined as genera that were detected in at least 60% of the municipal AS samples (12 samples) and at least 60% of the industrial AS samples (18 samples).

| Cosmopolitan affiliated phylum | Cosmopolitan | Co-occurrent | Co-occurrent affiliated phylum | Spearman’s ρ | Number of co-occurring AS samples |
|---|---|---|---|---|---|
| Acidobacteria | Gp3 | Gp4 | Acidobacteria | 0.61 | 39 |
| Acidobacteria | Gp3 | Gp6 | Acidobacteria | 0.78 | 39 |
| Acidobacteria | Gp16 | Gp6 | Acidobacteria | 0.67 | 32 |
| Actinobacteria | Ilumatobacter | Mycobacterium | Actinobacteria | 0.61 | 40 |
| Alphaproteobacteria | Hyphomicrobium | Thauera | Betaproteobacteria | 0.63 | 48 |
| Alphaproteobacteria | Methylocystis | Bradyrhizobium | Alphaproteobacteria | 0.70 | 40 |
| Alphaproteobacteria | Paracoccus | Clostridium_sensu_stricto | Firmicutes | 0.61 | 37 |
| Alphaproteobacteria | Hyphomicrobium | Sphaerobacter | Chloroflexi | 0.67 | 37 |
| Alphaproteobacteria | Mesorhizobium | Sphaerobacter | Chloroflexi | 0.68 | 37 |
| Alphaproteobacteria | Mesorhizobium | Hyphomicrobium | Alphaproteobacteria | 0.66 | 37 |
| Alphaproteobacteria | Mesorhizobium | Nitrosospira | Betaproteobacteria | 0.64 | 34 |
| Alphaproteobacteria | Bauldia | Hyphomicrobium | Alphaproteobacteria | 0.62 | 33 |
| Alphaproteobacteria | Paracoccus | Gp16 | Acidobacteria | 0.68 | 32 |
| Alphaproteobacteria | Mesorhizobium | Gp16 | Acidobacteria | 0.61 | 32 |
| Alphaproteobacteria | Pseudolabrys | Sphaerobacter | Chloroflexi | 0.63 | 30 |
| Alphaproteobacteria | Pseudolabrys | Thauera | Betaproteobacteria | 0.63 | 30 |
| Alphaproteobacteria | Pseudolabrys | Nitrosospira | Betaproteobacteria | 0.69 | 30 |
| Alphaproteobacteria | Pseudolabrys | Hyphomicrobium | Alphaproteobacteria | 0.78 | 30 |
| Alphaproteobacteria | Pseudolabrys | Mesorhizobium | Alphaproteobacteria | 0.66 | 30 |
| Alphaproteobacteria | Bauldia | Pseudolabrys | Alphaproteobacteria | 0.61 | 30 |
| Alphaproteobacteria | Pseudolabrys | Gp16 | Acidobacteria | 0.71 | 30 |
| Alphaproteobacteria | Xanthobacter | Hyphomicrobium | Alphaproteobacteria | 0.63 | 28 |
| Betaproteobacteria | Diaphorobacter | Comamonas | Betaproteobacteria | 0.61 | 41 |
| Betaproteobacteria | Acidovorax | Bdellovibrio | Deltaproteobacteria | 0.61 | 37 |
| Betaproteobacteria | Thauera | Sphaerobacter | Chloroflexi | 0.69 | 37 |
| Betaproteobacteria | Acidovorax | Ferruginibacter | Bacteroidetes | 0.65 | 37 |
| Betaproteobacteria | Nitrosospira | Hyphomicrobium | Alphaproteobacteria | 0.69 | 34 |
| Betaproteobacteria | Diaphorobacter | Thiobacillus | Betaproteobacteria | 0.64 | 31 |
| Chloroflexi | Caldilinea | Mycobacterium | Actinobacteria | 0.62 | 40 |
| Chloroflexi | Longilinea | Gp3 | Acidobacteria | 0.61 | 39 |
| Chloroflexi | Sphaerobacter | Leucobacter | Actinobacteria | 0.64 | 36 |
| Chloroflexi | Sphaerobacter | Gp16 | Acidobacteria | 0.70 | 32 |
| Deltaproteobacteria | Bdellovibrio | Comamonas | Betaproteobacteria | 0.62 | 41 |
| Firmicutes | Clostridium_XI | Mycobacterium | Actinobacteria | 0.69 | 39 |
| Firmicutes | Clostridium_XI | Ilumatobacter | Actinobacteria | 0.64 | 39 |
| Firmicutes | Clostridium_sensu_stricto | Clostridium_XI | Firmicutes | 0.90 | 37 |
| Firmicutes | Clostridium_sensu_stricto | Mycobacterium | Actinobacteria | 0.67 | 37 |
| Firmicutes | Clostridium_sensu_stricto | Gp6 | Acidobacteria | 0.64 | 37 |
| Firmicutes | Clostridium_sensu_stricto | Gp16 | Acidobacteria | 0.78 | 32 |
| Gammaproteobacteria | Steroidobacter | Hyphomicrobium | Alphaproteobacteria | 0.62 | 41 |
| Gammaproteobacteria | Steroidobacter | Sphaerobacter | Chloroflexi | 0.68 | 37 |
| Gammaproteobacteria | Steroidobacter | Thiobacillus | Betaproteobacteria | 0.63 | 31 |
| Gemmatimonadetes | Gemmatimonas | Gp6 | Acidobacteria | 0.62 | 41 |

Reference

Albert R, Jeong H, Barabási A-L (2000). Error and attack tolerance of complex networks. Nature 406: 378-382.
Amaral LAN, Scala A, Barthélémy M, Stanley HE (2000). Classes of small-world networks. Proceedings of the National Academy of Sciences 97: 11149-11152.
Barberán A, Bates ST, Casamayor EO, Fierer N (2011). Using network analysis to explore co-occurrence patterns in soil microbial communities. ISME J 6: 343-351.
Claesson MJ, Wang Q, O'Sullivan O, Greene-Diniz R, Cole JR, Ross RP et al (2010). Comparison of two next-generation sequencing technologies for resolving highly complex microbiota composition using tandem variable 16S rRNA gene regions. Nucleic Acids Res 38: e200.
Dunne JA, Williams RJ, Martinez ND (2002). Food-web structure and network theory: the role of connectance and size. Proceedings of the National Academy of Sciences 99: 12917-12922.
Montoya JM, Pimm SL, Solé RV (2006). Ecological networks and their fragility. Nature 442: 259-264.
Newman MEJ (2003). The structure and function of complex networks. SIAM Review 45: 167-256.
Newman MEJ (2006). Modularity and community structure in networks. Proceedings of the National Academy of Sciences 103: 8577-8582.
Olesen JM, Bascompte J, Dupont YL, Jordano P (2006). The smallest of all worlds: pollination networks. J Theor Biol 240: 270-276.
Palla G, Derényi I, Farkas I, Vicsek T (2005). Uncovering the overlapping community structure of complex networks in nature and society. Nature 435: 814-818.
Steele JA, Countway PD, Xia L, Vigil PD, Beman JM, Kim DY et al (2011). Marine bacterial, archaeal and protistan association networks reveal ecological linkages. ISME J 5: 1414-1425.
Watts DJ, Strogatz SH (1998). Collective dynamics of ‘small-world’ networks. Nature 393: 440-442.
Zhou J, Deng Y, Luo F, He Z, Tu Q, Zhi X (2010). Functional molecular ecological networks. MBio 1.