Characterization of the Relative Importance of Human- and Infrastructure-Associated Bacteria in Graywater: A Case Study

SUPPORTING INFORMATION

SCOTT P. KEELY1,4#, NICHOLE E. BRINKMAN1,4, BRIAN D. ZIMMERMAN1,3, DAVID WENDELL3, KRISTEN M. EKEREN2, SUSAN K. DE LONG2, SYBIL SHARVELLE2, JAY L. GARLAND1

1National Exposure Research Laboratory, United States Environmental Protection Agency, Cincinnati, OH 45268; 2Department of Civil and Environmental Engineering, Colorado State University, Fort Collins, CO 80523; 3Department of Energy, Environmental, Biological & Medical Engineering and 4McMicken College of Arts and Sciences, Department of Biological Sciences, University of Cincinnati, Cincinnati, OH 45221.

#corresponding author
Scott P. Keely, Ph.D.
National Exposure Research Laboratory
US Environmental Protection Agency
26 W. Martin Luther King Drive
Cincinnati, OH 45268

Table S1. Characteristics of Raw GW from the CSU Collection System

Parameter*                          Average  Standard Deviation
TOC (mg l⁻¹-C)                      44       12.2
Turbidity (NTU)                     32       4.2
NH3-N (mg l⁻¹-N)                    8.4      2.2
Total coliforms (log cfu 100 ml⁻¹)  8.4      0.6
E. coli (log cfu 100 ml⁻¹)          4.2      2.5

*TOC indicates total organic carbon; NTU indicates nephelometric turbidity units. Note that these data are for GW collected post-ET from the “Raw Graywater Sampling Port” in Fig. 1. Some changes to water quality will have occurred due to storage (e.g., reduction in turbidity due to settling).

Table S2. Characteristics of GW from the UC Laundry

Parameter*                          Average  Standard Deviation
Dissolved Oxygen (mg l⁻¹)           7.4      0.6
pH                                  7.6      0.02
Specific Conductivity (µS)          599.0    86.1
Oxidation Reduction Potential (mV)  326.3    209.7
Salinity (ppt)                      0.29     0.05

*µS indicates microsiemens; mV indicates millivolts; ppt indicates parts per thousand.

Table S3.
454 Read Statistics

Sample  Number of  Total Number  Total Number of  Percent        Average Number of   Average Number of         Average Number  Total Number
Type    Samples    of Raw Reads  Unique Reads     Chimeras       High-Quality Reads  Classified Reads (Genus)  of Genera       of Genera
SH      18         518,502       225,136          10.39 ± 3.28*  9,789 ± 2,215*      5,063 ± 1,281*            86 ± 24‡        191
LA      24         914,881       440,540          2.07 ± 2.90    14,685 ± 2,273      12,234 ± 1,961            105 ± 21        295
ET      6          269,138       111,232          10.35 ± 2.39   10,112 ± 1,776      5,148 ± 1,144             53 ± 11         90
BC      3          80,994        34,007           4.45 ± 1.56    10,844 ± 1,598      6,000 ± 1,433             82 ± 8          107
PW      1          38,673        11,019           0.20†          10,997†             8,415†                    37†             37

*Standard deviation
†Standard deviation was not calculated because only one sample was sequenced.

Table S4. OTU statistics and BLAST analysis of partially-classified 16S rRNA gene reads observed in BC and ET

MEGAN Tree Taxonomic group  GW  OTU No.  % Reads OTU⁻¹  OTU length (bp)  % Identity  % Coverage  Acc. No.  Isolation Source
Alphaproteobacteria         BC  1        23.55          474              97          100         JX521421  Terrestrial Sulfidic Springs
Betaproteobacteria          BC  2        86.24          521              99          100         AY945914  Sludge-seeded Bioreactor
Coriobacteriaceae           BC  3        89.07          506              99          97          DQ168843  Anaerobic Sludge
Oxalobacteraceae            BC  4        87.32          517              99          100         GU563775  Parasitic Nematode
Propionibacteriaceae        BC  5        35.44          501              99          99          EU812976  Biogas Reactor Fed Corn Silage
Xanthomonadaceae            BC  6        80.19          527              99          100         EF515229  Microbial Fuel Cell Anode
Bacteroidetes               ET  7        59.45          515              99          100         JX828412  Iron-Reducing Culture
Betaproteobacteria          ET  8        44.84          521              99          100         AY945914  Sludge-seeded Bioreactor
Clostridiales               ET  9        42.63          515              98          100         JN368364  Pitcher Plant
Enterobacteriaceae          ET  10       24.60          519              99          99          FJ393177  Microbial Fuel Cell
Rhodocyclaceae              ET  11       18.86          523              99          100         JF341927  Concrete Sewer Biofilm

Table S5. Bonferroni-corrected p-values for GW groups

           GW Group
GW Group   LA      SH      Dec. SH  ET
SH         0.0015
Dec. SH    0.0015  0.003
ET         0.0015  0.0015  0.0045
BC         0.0015  0.0105  0.1695   0.021

Figure Legends

Figure S1.
A black-and-white screenshot of the network model in Supplemental File 2. The network was constructed using the edge-weighted spring embedded model with read counts as weights. The larger nodes are GW samples (i.e., SH, ET, BC and LA) and the smaller nodes represent genera.

Figure S2. Relative abundance analysis of the classified GW genera for PW. Genera with 5 or more reads are shown. The number of reads per genus is specified above each bar.

Supplemental Files

Data S1. Microsoft Excel file containing bacterial genera read counts for GW samples.

Data S2. Network model of the relationships between GWs. A network was constructed using the edge-weighted spring embedded model with read counts as edge weights. The names of the GW and genera are given on the nodes. Node colors and sizes represent the relative number of network connections. Green, yellow, brown, orange and red nodes indicate >4, 4, 3, 2 and 1 connections, respectively. The edge colors indicate read abundance: green specifies the greatest number of reads; yellow, orange and brown are intermediate levels; and red is the lowest.

Figure S1

Figure S2
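
As an illustration only, the node color legend given for Data S2 can be written as a small lookup. The sketch below assumes that a "connection" is an edge with a nonzero read count between a GW sample and a genus. The function names, the placeholder genus labels (genus_A, genus_B, genus_C) and the read counts are hypothetical. They are not taken from Data S1 or from the software used to build the network.

```python
from collections import defaultdict


def color_for_connections(n_connections):
    """Return the node color that the Data S2 legend assigns to a connection count."""
    if n_connections < 1:
        raise ValueError("a node in the network has at least one connection")
    if n_connections > 4:
        return "green"
    # Exactly 4, 3, 2 and 1 connections map to the remaining legend colors.
    return {4: "yellow", 3: "brown", 2: "orange", 1: "red"}[n_connections]


def count_connections(read_counts):
    """Count connections per node from {(sample, genus): reads} (assumed input layout)."""
    connections = defaultdict(int)
    for (sample, genus), reads in read_counts.items():
        if reads > 0:  # assumption: only nonzero read counts form an edge
            connections[sample] += 1
            connections[genus] += 1
    return dict(connections)


# Hypothetical read counts used only to exercise the functions above.
example_read_counts = {
    ("SH", "genus_A"): 120, ("LA", "genus_A"): 300, ("ET", "genus_A"): 45,
    ("BC", "genus_A"): 60, ("PW", "genus_A"): 8,
    ("SH", "genus_B"): 15, ("LA", "genus_B"): 22,
    ("ET", "genus_C"): 5,
}

connections = count_connections(example_read_counts)
node_colors = {node: color_for_connections(n) for node, n in connections.items()}
```

In this hypothetical input, genus_A is connected to five GW samples and is therefore colored green, whereas genus_C is connected to one sample and is colored red.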