Supplementary Methods

Land cover classification and catchment delineation
Spatial data were obtained from the West Virginia GIS Technical Center (http://wvgis.wvu.edu/) and loaded into ArcMap 10.0. We delineated catchments for each sampling location from a 30-meter digital elevation model (DEM; US Geological Survey National Elevation Dataset) with the ArcMap 10.0 Spatial Analyst hydrology tools. Mines and reclaimed mines were classified using 1-meter color orthophotos from the USDA’s National Agriculture Imagery Program.

Biofilm community 16S rRNA gene analysis
We homogenized biofilm from two of the veneers and subsampled 0.25 g wet mass for DNA extraction. Community DNA was extracted from the biofilm using the PowerBiofilm™ DNA isolation kit (MO BIO, Carlsbad, CA, USA) and stored at -20 °C. Bacterial community DNA was amplified at the 27 to 338 region of the 16S (small-subunit ribosomal) RNA gene (regions V1 and V2 using the Escherichia coli genome numbering system). The forward primer 27F had a Roche Titanium Fusion Primer Adapter A, followed by a 4-nucleotide key, then a 454-specific 10-nucleotide MID barcode (Supplementary Table S2) for each sample site, and finally the 16S template-specific sequence (5’ CCA TCT CAT CCC TGC GTG TCT CCG AC TCAG NNN NNN NNN N AG AGT TTG ATC CTG GCT CAG 3’). The reverse primer 338R (5’ GCT GCC TCC CGT AGG AGT 3’) had no barcode, so sequencing was unidirectional using the Roche 454 Lib-L kit. The polymerase chain reaction recipe was 2.5 μL (10 mM) of each forward and reverse primer, 1 μL of template DNA, 4 μL of dNTPs (1 mM each), 2.5 μL BSA (10 mg/mL), 0.75 μL MgCl2 (50 mM), 2.5 μL 10x buffer, 1 μL Platinum Taq polymerase, and 9.1 μL ddH2O. The MgCl2, 10x buffer, and Taq polymerase were all from the Invitrogen Platinum kit.
The PCR program was a 5 min initial denaturation at 95 °C followed by 25 cycles of amplification: 95 °C for 60 s, 52 °C for 60 s, and 72 °C for 105 s.
After the last cycle, amplification was extended at 72 °C for 7 min. Samples, including negative controls (no template added), were amplified in triplicate. All PCR work was completed in a laminar flow hood. Replicate PCR samples were pooled and purified with the QIAquick PCR Purification Kit (Qiagen, Valencia, CA). Purified PCR products were normalized with the SequalPrep™ Normalization Plate Kit (Applied Biosystems®, Life Technologies, Grand Island, NY, USA). Equimolar purified PCR amplicons were combined in three microcentrifuge tubes, each holding one set of barcodes (MID1-10). Samples were sent to the Genome Sequencing and Analysis Core Resource at Duke University (Durham, NC, USA) for pyrosequencing with a Roche 454 Life Sciences Genome Sequencer Flex Titanium instrument (Branford, CT, USA).

Bacterial community analyses
NMDS ordinations using counts relativized to 1543 sequences were created using 200 runs started at random configurations with the nmds function in ecodist (Goslee & Urban, 2007). The final solutions for both the Bray-Curtis distance and GUniFrac distance NMDS ordinations were created using a step-down procedure. GUniFrac distances were based on a UPGMA phylogenetic tree (Chen et al., 2012). The final solution used three axes to relieve stress from the configuration, but the third axis did not increase the explanatory power of the ordination. We rotated the ordination result to achieve the highest variance on Axis 1. We partitioned variation in our community distance matrices (both Bray-Curtis and GUniFrac) among our a priori and post hoc groupings with permutational multivariate analysis of variance (perMANOVA; Anderson, 2001). For the a priori grouping, we decomposed total variation among three orthogonal contrasts: mined vs. unmined, reclaimed vs. active, and mainstem vs. valley fill. For the post hoc grouping, we decomposed total variation among two orthogonal contrasts: mined vs. unmined, and group A vs. group B. We tested the significance of each contrast using a pseudo F-statistic generated from random permutation of the site categories. This test was carried out using 999 permutations in the adonis function of the vegan package (Oksanen et al., 2011) in R version 2.14.1 (R Development Core Team 2011).
We characterized taxa driving the multivariate patterns using indicator species analysis (Dufrêne & Legendre 1997; De Cáceres & Legendre 2009; De Cáceres et al. 2010) in PC-ORD software (McCune & Mefford, V6). This technique analyzed the association between the relative abundance and frequency of taxa and their designated sample site groups, and identified which taxa had the greatest indicator value (percent of perfect indication, where 100% signifies perfect indication) in a designated site group. The index is greatest for an OTU when it is found within all of the sites comprising one group. Monte Carlo randomizations with 4999 permutations were used to test the statistical significance of each OTU indicator. OTUs selected as good indicators were those with indicator values >0.3 and p < 0.05, as recommended by Dufrêne & Legendre (1997).

Bacteria taxa and environmental analysis
Environmental variables used in the PCA were those that differed significantly (p ≤ 0.05) between mined and unmined sites or between post hoc A and B sites (described below) using a two-tailed Student’s t-test. They included average water chemistry that differed between mined and unmined sites (Table 1) as well as biofilm Ca, Cd, Mg, Mn, Ni, Sr, Th and Zn. Chemistry variables, biofilm biomass, and biofilm C content that failed the Shapiro-Wilk normality test were transformed using log, square root, or inverse functions as needed.
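
The indicator value used in the indicator species analysis described under "Bacterial community analyses" can be illustrated with a short sketch. It follows the general form of the Dufrêne & Legendre (1997) index (specificity × fidelity × 100) and a Monte Carlo test on shuffled site labels; it is not the PC-ORD implementation. The function names, the toy abundances, the group labels, and the choice of the maximum indicator value across groups as the test statistic are all assumptions made for illustration.

```python
import random


def indicator_values(abundance, groups):
    """Return {group: indicator value in percent} for a single OTU.

    abundance: per-site relative abundances of the OTU (hypothetical data)
    groups:    per-site group labels, same length as abundance
    """
    labels = sorted(set(groups))
    mean_abund = {}
    frequency = {}
    for g in labels:
        values = [a for a, lab in zip(abundance, groups) if lab == g]
        # Mean abundance of the OTU within the group
        mean_abund[g] = sum(values) / len(values)
        # Fidelity: fraction of the group's sites where the OTU occurs
        frequency[g] = sum(1 for v in values if v > 0) / len(values)
    total = sum(mean_abund.values())
    result = {}
    for g in labels:
        # Specificity: the group's share of the summed group means
        specificity = mean_abund[g] / total if total > 0 else 0.0
        result[g] = 100.0 * specificity * frequency[g]
    return result


def permutation_p_value(abundance, groups, n_perm=4999, seed=0):
    """Monte Carlo p-value for the maximum indicator value of one OTU.

    Site labels are shuffled n_perm times; the p-value is the share of
    shuffles (plus the observed arrangement) whose maximum indicator value
    is at least as large as the observed maximum.
    """
    rng = random.Random(seed)
    observed = max(indicator_values(abundance, groups).values())
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if max(indicator_values(abundance, shuffled).values()) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy example: an OTU present at every "mined" site and absent elsewhere
toy_abundance = [5, 3, 4, 6, 0, 0, 0, 0]
toy_groups = ["mined"] * 4 + ["unmined"] * 4
print(indicator_values(toy_abundance, toy_groups))
print(permutation_p_value(toy_abundance, toy_groups))
```

In this toy case the OTU is a perfect indicator of the hypothetical "mined" group, so its indicator value there is 100%. An OTU that is spread evenly across groups, or present at only some sites of a group, scores lower.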

Bacterial diversity along the mining gradient
The Chao1 richness estimator (Chao, 1984) is a useful index for phylotype richness of uncultured microbial communities that include many rare taxa because it uses a non-parametric estimation that accounts for datasets skewed towards low-abundance classes, which is typical for microbial community datasets. The Shannon diversity index is a commonly used estimate of macroorganism diversity that accounts for group abundance and evenness, but focuses on the diversity of common taxa (Hill, 1973).

References

Anderson, M. J. (2001). A new method for non-parametric multivariate analysis of variance. Austral Ecology, 26, 32–46.

Chao, A. (1984). Nonparametric estimation of the number of classes in a population. Scandinavian Journal of Statistics, 11(4), 265–270.

Chen, J., Bittinger, K., Charlson, E. S., Hoffmann, C., Lewis, J., Wu, G. D., … Li, H. (2012). Associating microbiome composition with environmental covariates using generalized UniFrac distances. Bioinformatics (Oxford, England), 1–8. doi:10.1093/bioinformatics/bts342

De Cáceres, M., & Legendre, P. (2009). Associations between species and groups of sites: indices and statistical inference. Ecology, 90(12), 3566–74. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/20120823

De Cáceres, M., Legendre, P., & Moretti, M. (2010). Improving indicator species analysis by combining groups of sites. Oikos, 119(10), 1674–1684. doi:10.1111/j.1600-0706.2010.18334.x

Dufrêne, M., & Legendre, P. (1997). Species assemblages and indicator species: the need for a flexible asymmetrical approach. Ecological Monographs, 67(3), 345–366.

Goslee, A. S., & Urban, D. (2007). Package “ecodist.”

Hill, M. O. (1973). Diversity and evenness: a unifying notation and its consequences. Ecology, 54(2), 427–432.

McCune, B., & Mefford, M. J. (n.d.). PC-ORD. Multivariate analysis of ecological data. Gleneden Beach, Oregon, USA: MjM Software.

Oksanen, A. J., Blanchet, F. G., Kindt, R., Legen-, P., Minchin, P. R., Hara, R. B. O., … Wagner, H. (2011). Package “vegan.”