Supplementary Information

Halomonas sulfidaeris-dominated Communities Inhabit 1.8 km-deep Saline Sandstone Reservoirs

Yiran Dong1,2, Charu Gupta Kumar1, Nicholas Chia1,3,4, Pan-Jun Kim1,5, Philip A. Miller1,2,6, Nathan D. Price1,3,7, Isaac K.O. Cann1,8,9, Theodore M. Flynn1,2,10, Robert A. Sanford1,2, Ivan G. Krapac11, Randall A. Locke II11, Pei-Ying Hong12, Hideyuki Tamaki12,13, Wen-Tso Liu1,12, Alvaro G. Hernandez14, Chris L. Wright14, Mark A. Mikel14, Jared L. Walker15, Mayandi Sivaguru1, Glenn Fried1, Anthony C. Yannarell16, Roderick I. Mackie1, Bruce W. Fouke1,2,3,9,11*

1 Energy Biosciences Institute, Institute for Genomic Biology, University of Illinois Urbana-Champaign, 1206 W. Gregory Drive, Urbana, Illinois 61801 USA
2 Department of Geology, University of Illinois Urbana-Champaign, 1301 W. Green Street, Urbana, Illinois 61801 USA
3 Institute for Systems Biology, 401 Terry Avenue North, Seattle, Washington 98109 USA
4 Department of Surgery, Mayo Clinic, 200 First Street S.W., Rochester, Minnesota 55905, USA
5 Asia Pacific Center for Theoretical Physics, San 31 Hyoja-dong, Nam-gu, Pohang, Gyeongbuk, 790-784 Korea
6 Shell Oil Company, 150 N. Dairy Ashford Drive, Houston, Texas 77079 USA
7 Chemical and Biomolecular Engineering, University of Illinois Urbana-Champaign, 600 S. Mathews Avenue, Urbana, Illinois 61801 USA
8 Department of Animal Sciences, University of Illinois Urbana-Champaign, 1207 W. Gregory Drive, Urbana, Illinois 61801 USA
9 Department of Microbiology, University of Illinois Urbana-Champaign, 601 S. Goodwin Avenue, Urbana, Illinois 61801 USA
10 Biosciences Division, Argonne National Laboratory, 9700 South Cass Avenue, Argonne, Illinois 60439 USA
11 Illinois State Geological Survey, Prairie Research Institute, University of Illinois Urbana-Champaign, 615 E. Peabody Drive, Urbana, Illinois 61801 USA
12 Civil and Environmental Engineering, University of Illinois Urbana-Champaign, 205 N. Mathews Avenue, Urbana, Illinois 61801 USA
13 Bioproduction Research Institute, National Institute of Advanced Industrial Science and Technology (AIST), Central 6, Higashi 1-1-1, Tsukuba, Ibaraki, 305-8566 Japan
14 Roy J. Carver Biotechnology Center, University of Illinois Urbana-Champaign, 2613 Institute for Genomic Biology, 1206 W. Gregory Drive, Urbana, Illinois 61801 USA
15 Schlumberger Carbon Services, 14090 SW Freeway, Suite 300, Sugar Land, Texas 77478 USA
16 Department of Natural Resources and Environmental Sciences, University of Illinois at Urbana-Champaign, 1102 South Goodwin Avenue, Urbana, Illinois 61801 USA

* Corresponding Author: Bruce W. Fouke, Institute for Genomic Biology, University of Illinois Urbana-Champaign, 1206 W. Gregory Drive, Urbana, Illinois 61801 USA (E-mail: fouke@illinois.edu)

Index
Geological and Hydrological Setting
Drilling and Subsurface Reservoir Analyses
  Well Drilling
  Rock Core Collection
  Schlumberger Quicksilver Probe
  Wire-Line Log Electrical Data Collection
  Drillstem Test (DST)
  Collection of Quicksilver Probe MDT Water Samples
  Gas collection
  Centrifugation of water samples
  Geochemical Analyses
  Strontium Isotopic Analysis
Microscopic Analyses
  Lithology analysis of thin sections
  Fluorescence In-Situ Hybridization (FISH) for cell enumeration
  Cultivation of Organisms
Microbial Community Analyses
  DNA extraction
  Development of 16S rRNA Clone library
  Terminal Restriction Fragment Length Polymorphism (T-RFLP)
Metagenomic Analyses
  V1-V3 16S rRNA Hypervariable Region Pyrosequencing
  Halomonas sulfidaeris strain Esulfide1 as a Reference Genome for D5872 Metagenome
  454 pyrosequencing of D5872 Metagenome and H. sulfidaeris Esulfide1 genome
  Metagenomic and Genomic Analyses
Figures and Tables
  Figure S1. Fluorescence hybridization of the D5872 microbial consortium
  Figure S2. T-RFLP analysis of D5872 microbial community and drilling mud
  Figure S3. Phylogenetic relationship of Halomonas species in D5872 microbial community versus drilling mud
  Figure S4. Rarefaction of D5872 and DDM microbial community richness
  Figure S5. Distribution of GC-content with length for D5872 Metagenome contigs
  Figure S6. Taxonomic assignment of D5872 Metagenome in ORFs predicted in contigs and singletons with ORFs longer than 100 aa
  Figure S7. Significant enrichment of SEED subsystems in D5872 Metagenome proteins over H. sulfidaeris Esulfide1 proteins
  Figure S8. KEGG pathway depicting degradation of dichlorobenzoate as an example of the biodegradative capability of halogenated hydrocarbons amongst D5872 syntrophs
  Figure S9. KEGG pathway depicting reductive fixation of ammonia into amino acids
  Table S1. Geochemistry of formation fluid collected from the horizon 1.8 km (5872 feet) in depth of the Decatur Well, IL
  Table S2. Summary of D5872 microbial community identified by clone library and pyrosequencing
References
Other Tables Attached as Excel Documents
  Table S3. Predicted proteins involved in translation, ribosome structure, and biogenesis in D5872 Metagenome
  Table S4. Predicted proteins involved in transcription in D5872 Metagenome
  Table S5. Predicted proteins involved in replication, recombination, and DNA repair in D5872 Metagenome
  Table S6. tRNA genes identified in D5872 Metagenome
  Table S7. Comparison of SEED class2 category assignments in D5872 Metagenome contig and singleton proteins (D5872) and SEED microbial proteins (SEED)
  Table S8. Predicted proteins involved in inorganic ion transport and metabolism in D5872 Metagenome
  Table S9. Predicted proteins involved in carbohydrate transport and metabolism in D5872 Metagenome
  Table S10. Predicted proteins involved in osmolyte synthesis and degradation in D5872 Metagenome
  Table S11. Predicted proteins involved in nucleotide transport and metabolism in D5872 Metagenome
  Table S12. Predicted proteins involved in amino acid transport and metabolism in D5872 Metagenome
  Table S13. Predicted proteins involved in lipid transport and metabolism in D5872 Metagenome
  Table S14. Predicted proteins involved in nitrate, nitrite and nitric oxide transport and metabolism in D5872 Metagenome
  Table S15. Predicted proteins involved in defense mechanisms in D5872 Metagenome
  Table S16. Predicted proteins involved in cell motility in D5872 Metagenome

Geologic and Hydrologic Setting
A thorough understanding of the geologic and hydrologic history of the Paleozoic Mt. Simon Sandstone in the Illinois Basin is required to accurately decipher the ecology and evolution of its indigenous deep subsurface microbial community. Sedimentary rock reservoirs like the Mt. Simon Sandstone, often associated with hydrocarbon migration and entrapment, are also important targets for large-scale carbon capture, transport and storage (CCS) projects (Morse and Leetaru, 2003; DOE et al., 2007; Leetaru et al., 2009). The Mt. Simon Sandstone is a 494 m-thick (1620 ft.)
deposit of Cambrian-age sandstones at the base of the Sauk Sequence (Sloss, 1963; Buschbach, 1964) that unconformably overlies the weathered Precambrian granites, granodiorites and rhyolites that comprise the basement of the Illinois Basin (Bradbury and Atherton, 1965; Atekwana, 1996). The uppermost basement surface was heavily eroded prior to deposition of the Mt. Simon Sandstone, creating a paleotopographic surface with as much as 600 m (2000 ft.) of vertical relief (Leetaru and McBride, 2009). The Mt. Simon Sandstone was therefore deposited in an arid, high-relief depositional environment containing alluvial fans and braided river systems (Leetaru et al., 2009).

The porosity and permeability of lithologies like the Mt. Simon Sandstone are strongly influenced by both depositional (primary) and post-depositional (diagenetic) alteration processes. The early Paleozoic section of the Illinois Basin in the Midcontinent of North America, which includes the Mt. Simon Sandstone, has experienced multiple events of regional subsurface fluid flow and resulting diagenesis that were driven by tectonic events (Bethke et al., 1984; Bethke, 1986; Bethke and Marshak, 1990; Bethke et al., 1991). Specifically, the Mt. Simon Sandstone experienced increasing subsidence and burial throughout the Paleozoic, reaching a maximum burial depth of 2.4 km and temperatures of 100°C in the early Mesozoic (Rowan et al., 2002; Makowitz et al., 2006).

Several studies have been conducted on the Mt. Simon Sandstone formation waters (Visocky et al., 1985; Siegel, 1989). However, the wells from which these waters were sampled penetrated extremely shallow sections of the Mt. Simon Sandstone in the northern portions of the Illinois Basin. Furthermore, these studies do not provide a complete suite of elemental and isotopic water analyses.
Therefore, interpretation of the origin and chemistry of the burial brines that precipitated the quartz cement overgrowths observed in the Mt. Simon Sandstone in the present study has been completed using analogous formation water collected from Silurian and Devonian strata in the Illinois Basin (Stueber and Walter, 1991). This approach was necessary because the geochemical analyses of deep subsurface Mt. Simon Sandstone formation water collected from the Illinois Basin–Decatur Project (IBDP) are currently being completed by the Illinois State Geological Survey (ISGS) and the data are not yet available.

Formation water from Silurian to Devonian strata in the Illinois Basin has a chloride (Cl)/bromide (Br) ratio that plots near a modeled seawater evaporation trajectory (Carpenter, 1978) and stops just short of halite precipitation (Stueber and Walter, 1991). The Cl/Br ratio during seawater evaporation remains linear because both Cl and Br are inert in most biological and diagenetic processes until the brine reaches halite saturation (Stueber and Walter, 1991). At the point of halite precipitation, Br (distribution coefficient kD = 0.032) will preferentially remain within the brine as the Cl is incorporated into the halite (Carpenter, 1978; McCaffrey et al., 1987). By analogy, this suggests that the Mt. Simon Sandstone formation water originated at the surface from evaporated seawater. In addition, cation/Br ratios within the Silurian to Devonian strata formation water suggest that water and rock interactions further modified the evaporated seawater (Stueber and Walter, 1991). For example, sodium (Na)/Br and potassium (K)/Br ratios are both depleted in the Silurian-Devonian formation water when compared to normal modern-day seawater, and fall below the seawater evaporation trajectory (Riley and Chester, 1971; Carpenter, 1978; Stueber and Walter, 1991).
This loss of Na is generally explained by the albitization of feldspar (Carpenter, 1978; Land and Prezbindowski, 1981; Stueber and Walter, 1991) and simultaneous loss of K during the precipitation of potassium feldspar cements (Egeberg and Aagaard, 1989), as represented by:

$$2\,\mathrm{KAlSi_3O_8} + 2.5\,\mathrm{Al_2Si_2O_5(OH)_4} + \mathrm{Na^+} = \mathrm{NaAlSi_3O_8} + 2\,\mathrm{KAl_3Si_3O_{10}(OH)_2} + 2\,\mathrm{SiO_2} + 2.5\,\mathrm{H_2O} + \mathrm{H^+}$$

This reaction is an additional potential source for the precipitation of quartz overgrowth cements within the Mt. Simon Sandstone, as well as clay cementation (Worden and Morad, 2000). Furthermore, the evaporation of seawater and the resulting increase in Na would tend to drive this reaction forward and further enhance quartz cementation.

The distribution of Silurian and Devonian formation water Cl/Br along modeled seawater evaporation trajectories suggests that this modified subsurface evaporated seawater also experienced mixing with recharging meteoric water (Stueber and Walter, 1991). The possibility that evaporated seawater mixed with meteoric water and/or seawater has been evaluated by Land and Prezbindowski (1981). They suggest that the formation water chemistry, which is depleted in deuterium, is inconsistent with isotopic fractionation during seawater evaporation (Holser, 1979; Knauth and Beeunas, 1986; Stueber and Walter, 1991). Stueber and Walter (1991) suggest that, based on δD and δ18O covariations, the formation water was diluted by mixtures of as much as 50% meteoric water. This addition of meteoric water into the Mt. Simon Sandstone may have been the source of oxygen to precipitate the multiple iron oxide cements in the Lower and Upper Stratigraphic Intervals.

The Mt. Simon Sandstone formation water 87Sr/86Sr ratios are greater than the Sr-isotope ratios of Cambrian seawater (Veizer et al., 1999).
This implies that dissolved Sr was derived from a source other than connate burial seawater within the Illinois Basin. Potential sources include: (1) overlying Paleozoic shales; (2) granites and rhyolites comprising the Precambrian igneous crystalline basement; and (3) shales and sands within the Mt. Simon Sandstone. Several previous studies have concluded that Silurian and Devonian formation water in the Illinois Basin derived dissolved radiogenic Sr from the New Albany Shale Group (Stueber et al., 1987; Bethke et al., 1991; Stueber and Walter, 1991). However, it is uncertain whether this would apply to the Mt. Simon Sandstone, which has no obvious vertical hydrologic connectivity with the overlying Eau Claire shale (Leetaru et al., 2009; Bowen et al., 2010). It is also possible that dissolved radiogenic Sr was derived from water-rock interaction with granites and rhyolites comprising the Precambrian basement (Clark and Fritz, 1997). This hypothesis is viable because the Mt. Simon Sandstone is in direct contact with the Precambrian basement. Further, the Precambrian basement would not need to contribute much radiogenic Sr because its 87Sr/86Sr ratio is likely very high (Clark and Fritz, 1997). The third hypothesis, that radiogenic Sr was derived from shales and sands within the Mt. Simon Sandstone, appears to be feasible as well. Since the Mt. Simon Sandstone is primarily derived from Precambrian igneous basement, a significant concentration of radiogenic rubidium (87Rb) is expected (Clark and Fritz, 1997). 87Rb is directly linked to potassium (K) geochemistry because it substitutes readily for K+ in potassium feldspar and decays into the daughter product, 87Sr, thus increasing the 87Sr/86Sr ratios (Clark and Fritz, 1997). Furthermore, the radiogenic Sr within the thin shale beds that occur in the Mt. Simon Sandstone appears to be highly accessible. Stueber et al.
(1987) conducted leachate experiments on later Paleozoic shale formations within the Illinois Basin, which liberated Sr with high 87Sr/86Sr ratios of 0.72793 to 0.75472. It is therefore reasonable to extrapolate these values to the shale within the Mt. Simon Sandstone.

Drilling and Subsurface Reservoir Analyses

Well Drilling
Drilling, geological analyses and formation water sampling were conducted at the IBDP well site in Decatur, Illinois (39°52’36.58” N, 88°53’36.00” W) by Schlumberger Carbon Services using multiple techniques and downhole tools. Detailed descriptions of these are available at http://www.slb.com and are briefly summarized here. Drilling of the IBDP well began in February 2009. The maximum drilling depth of 2205 m was reached on April 26, 2009, and the well was completed in May 2009. Rotary drilling was used, in which sections of hollow pipe (drillstem) are screwed together and added as the depth of the well increases. Attached to the leading end of the drillstem is a tri-cone bit that cuts through the bedrock to produce a slurry of drilling mud, formation water and broken-up fragments of bedrock (cuttings). To prevent the borehole from becoming clogged by this slurry and to cool and lubricate the drill bit, drilling fluids are continuously pumped down through the drillstem and out of the spinning drill bit. This mud and rock slurry then returns to the surface, where the cuttings are separated and the drilling muds are replenished and recirculated back down the drillstem. The drilling fluids consisted of a proprietary mixture of bentonite clays, polymers (guar gum) and local groundwater, with the density, viscosity and other characteristics of the drilling mud continually adjusted based on the changing physical properties of the well with increasing depth of penetration (Phelps and Fredrickson, 2002).
Completion of the IBDP well required 16 tri-cone drill bits (66 cm, 45 cm and 31 cm in diameter) composed of a combination of steel, tungsten-carbide, polycrystalline diamond compact and diamond cutting and grinding edges. In addition, three well casings (surface, intermediate and production) were inserted into the borehole. During this drilling process, water and rock sampling was conducted with a drillstem test (DST), multiple downhole wireline logging tools, and borehole and sidewall rock coring.

Rock Core Collection
Three whole sections of borehole cores were taken during the drilling process: one in the Eau Claire Formation and two within the Mt. Simon Sandstone. Whole coring is the process of extracting cylindrical, intact rock samples from specific subsurface horizons using a specialized drill coring bit in conjunction with a core barrel and core catcher. The coring bit is open in the center, allowing the bit to travel around a cylindrical rock sample, which is then held in the core barrel. The core is held inside the core barrel by the core catcher until the core is extracted. These three whole cores were 9.14 m (30 ft) long and about 10.16 cm (4 in) in diameter. The depths at the top of the cores of the Eau Claire Formation and the two within the Mt. Simon Sandstone were 1668 m, 1950 m and 2057 m, respectively. The whole cores were used to establish the facies and depositional environments of the Mt. Simon Sandstone (Bowen et al., 2010).

Additional cores were acquired through sidewall coring via percussion and mechanical sampling. Percussion sampling involves firing a hollow core bit into the formation and then extracting the bit with the rock core material inside (2.3 cm × 5 cm cores).
Mechanical sampling utilized a Mechanical Sidewall Coring Tool (MSCT), a hollow rotary drill core that cut into the formation and extracted similarly sized core plugs. A total of 51 sidewall cores were collected from the Mt. Simon Sandstone.

Schlumberger Quicksilver Probe
The Quicksilver Probe is a newly developed wireline formation tester (WFT) that was lowered into the IBDP wellbore on a wireline logging cable. The WFT pumping system is designed such that the sterilized sampling bottles can be isolated from the formation water pumping system. Therefore, extensive pumping and associated pump tests can be completed without actually collecting the fluids and contaminating the sterilized and sealed sample collection bottles. A series of preliminary pump tests was completed by positioning the probe at a variety of stratigraphic horizons within the Mt. Simon Sandstone that were identified from wire-line log data. This was done in order to determine which stratigraphic horizons would produce a high enough flow rate for eventual pumping and sampling. Once the target stratigraphic sample horizon at 5872 ft was determined with this technique, the probe was stabilized with lateral support arms against the wall of the wellbore and dual straddle packers were positioned and inflated above and below the sample depth (Ayan et al., 2013).

Compared to conventional WFT probes, the ®Quicksilver Probe is unique in that it comprises two pumps: (1) an outer 20 cm-diameter concentric ring pump (called the guard probe); and (2) an inner 3 cm-diameter central pump (called the sample probe). When activated, both of these pumps extend horizontally from the probe to seat directly against the face of the borehole wall (described in detail at http://www.slb.com/services/characterization/wireline_open_hole/insitu_fluid/quicksilver_probe_wireline_sampling.aspx). The ®Quicksilver Probe simultaneously pumps fluid from the guard and sample probes through separate flowlines. The guard probe draws contaminated fluid from the outer portions of the target rock formation and these fluids are discarded into the wellbore below the packers. Conversely, the sample probe collects fluid from the cleaner central portion of the rock formation (Akkurt et al., 2006; Schlumberger, 2007). The water collected through the center sample pump is subjected to real-time geochemical monitoring with WFT sensors of K, Br and other drilling mud contaminants, until mud infiltrate contamination falls to a threshold level of ≤ 4%. In the case of the 1.8 km-deep sample collected from the Mt. Simon Sandstone, approximately 4 L of formation water were collected after more than 2.5 hours of pumping and monitoring at depth. Prior to its application in the present study, the ®Quicksilver Probe had been used exclusively to collect uncontaminated oil samples (Akkurt et al., 2006; Ayan et al., 2013).

Wire-Line Log Electrical Data Collection
The core data were used in conjunction with a suite of wireline logs collected using a Schlumberger Modular Formation Dynamics Tester (MDT), a wireline tool. A wireline tool is a cylindrical device that consists of one or more tools, each of which collects specific data, such as porosity and permeability. The wireline tool was lowered to the base of the well and was slowly pulled to the surface to provide wireline data. Wireline data, or logs, improve understanding of the overall reservoir characteristics and well integrity by recording different parameters, such as porosity, density, lithologies, seismic profiling and casing cement, and wireline tools can also provide physical water and rock samples.
The measurements used to create these logs were the gamma ray, neutron porosity, standard deviation permeability (SDR Perm) and Elemental Analysis (ELAN). The gamma ray log recorded the amount of naturally occurring radioactivity within a deposit from elements such as uranium and thorium. The neutron porosity log is based on the effect of a formation on neutrons emitted by a source. Hydrogen has the largest effect on slowing or capturing neutrons, and because hydrogen occurs mainly in pore fluids, the porosity can be calculated (slb.com). The ELAN log is Schlumberger’s analysis of all the wireline logs that have been collected, which are combined and adjusted to core data or fluid samples to create an overall log of the well.

Drillstem Test (DST)
A drillstem test (DST) was conducted on drilling mud and formation water within the Ordovician St. Peter Formation on March 7, 2009. The DST tested well pressure and permeability, and permitted water collection by placing temporary packers, which are flexible elastomeric devices that inflate or expand to seal a wellbore, directly above and below the stratigraphic interval being sampled. The opposing pressure within the drillstem was released through pressure step-down valves that allowed fluids within the targeted subsurface formation to flow up the drillstem as a result of subsurface formation pressure. The valves were then closed, the packers were removed and the drillstem was removed from the borehole. The mixture of cutting fluid slurry and formation water traveled up the drillstem and was collected at the surface. These drillstem samples, collected from each length of pipe as it was brought to the surface and disconnected, ranged in composition from nearly 100% drilling mud to formation water that was highly contaminated with drilling mud.
The drilling mud collected was used as an “internal tracer” to evaluate the quality of formation water collected using a Quicksilver probe (as described below).

The DST yielded 2 L of “less contaminated” formation water and approximately 212 L of drilling mud and highly contaminated formation water from the St. Peter Formation. Water samples were collected in 19 L buckets by catching the water that flowed out of the bottom of each section of drillstem pipe as it was brought out of the ground and unscrewed. This broke the seal between the pipes and allowed the formation water to flow out from between the pipes at the surface. The 2 L sample of less contaminated formation water collected from the end of the drillstem was emptied straight into two autoclaved 1 L Nalgene bottles. Additionally, a small proportion of the collected water was filtered with 0.45 µm filters and analyzed by the ISGS to obtain the aqueous geochemistry. These 0.45 µm filters were immediately collected, with sterile gloved hands and tweezers, and stored at -80 °C (Pinkart et al., 2002). After collection of the water samples and filters, the samples were returned to the University of Illinois at Urbana-Champaign, where the drilling mud and water samples were placed in a dark 4 °C cold room, with sample splits of drilling mud and formation water and the filters placed in a -80 °C freezer.

Collection of Quicksilver Probe MDT Water Samples
The clean formation water samples obtained from the Quicksilver probe Modular Formation Dynamics Tester (MDT) were used for gas collection, aqueous geochemical analysis and microbial DNA extractions.
Prior to deployment and collection, the Quicksilver MDT probe was thoroughly cleaned and sanitized with multiple rinses of MilliQ water, bleach and ethanol in Schlumberger facilities in Brownstown, Texas, where it was immediately sterile wrapped, packed and shipped on Schlumberger trucks to Decatur, Illinois. Once the Quicksilver Probe MDT reached the IBDP well site, it was again cleaned and pressurized with 100% ethanol before it was lowered into the borehole. When the probe reached the 1.8 km sample depth horizon within the borehole, collection of the formation water began by monitoring the resistivity in the flowlines in real time. The factors that affect the actual contamination (e.g., vertical permeability, horizontal permeability, pump rate, formation fluid mobility and sanding potential) were taken into consideration and the tool pump was adjusted to the optimum speed and drawdown. Due to the contrast between the formation water and the liquid phase of the drilling fluid, changes in resistivity could be observed during the transition from initial drilling mud filtrate to the virgin formation water. Pumping continued until the resistivity approached an asymptote as low as 0.02 ohms and remained constant. The resistivity acts as a definitive tracer, which indicates that the drilling fluid has been evacuated and that contamination has been removed (Phelps and Fredrickson, 2002). Additional tracers and procedures presented by Phelps and Fredrickson (2002) were thoroughly considered, but ultimately abandoned due to their high cost.

The 4 L sample bottle was sterilized, sealed and inserted into the ®Quicksilver Probe. The sample bottle remained closed at depth until 2.5 hours of test pumping was completed and the < 4% geochemical contamination level had been reached.
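The stopping criterion described above (pump until the monitored signal flattens and the estimated contamination falls below the threshold) can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure implemented in the Schlumberger tool: the two end-member mixing formula, the function names, the window length, the tolerance and all numerical values are assumptions introduced here.

```python
from typing import Sequence


def contamination_fraction(c_sample: float, c_formation: float, c_filtrate: float) -> float:
    """Estimate the mud-filtrate fraction in a sample by two end-member mixing.

    Assumes a conservative tracer whose concentration mixes linearly between
    pristine formation water (c_formation) and drilling-mud filtrate (c_filtrate).
    """
    if c_filtrate == c_formation:
        raise ValueError("end-member concentrations must differ")
    fraction = (c_sample - c_formation) / (c_filtrate - c_formation)
    # Clamp to [0, 1] so measurement noise cannot yield an unphysical fraction.
    return min(1.0, max(0.0, fraction))


def has_plateaued(readings: Sequence[float], window: int = 5, rel_tol: float = 0.01) -> bool:
    """Return True when the last `window` readings span at most rel_tol of their mean."""
    if window < 2 or len(readings) < window:
        return False
    tail = list(readings[-window:])
    mean = sum(tail) / window
    return (max(tail) - min(tail)) <= rel_tol * abs(mean)


def ready_to_sample(resistivity_log: Sequence[float], c_sample: float,
                    c_formation: float, c_filtrate: float,
                    max_contamination: float = 0.04) -> bool:
    """Combine both checks: stable resistivity and contamination at or below the threshold."""
    stable = has_plateaued(resistivity_log)
    clean = contamination_fraction(c_sample, c_formation, c_filtrate) <= max_contamination
    return stable and clean


# Hypothetical readings in arbitrary units, not data from the IBDP well.
resistivity_log = [0.10, 0.06, 0.04, 0.03, 0.02, 0.02, 0.02, 0.02, 0.02]
print(ready_to_sample(resistivity_log, c_sample=12.0, c_formation=10.0, c_filtrate=90.0))
```

With these made-up inputs the last five resistivity readings are identical and the estimated filtrate fraction is 2.5%, so the check passes. In practice the end-member values would have to come from measured compositions of the drilling mud and of the formation water, and the plateau tolerance would need to reflect the noise of the actual sensor.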
Once this contamination threshold had been reached, the sample bottle was filled in situ within the ®Quicksilver Probe and automatically sealed when full for transport back to the surface. Immediately upon return to the surface, the 4 L sample bottle was removed from the ®Quicksilver Probe within a mobile Schlumberger laboratory unit. The formation water was then immediately offloaded under sterile conditions into sterilized 1 L Nalgene bottles, leaving no gas head in the bottles. These were then placed on ice within a dark cooler and immediately transported 87 km back to the laboratories at the University of Illinois at Urbana-Champaign (UIUC). Subsequent sample handling and treatment are described in the sections below.

Gas collection
The headspace gases of the MDT sample (D5872) were collected for isotopic analyses. Gas collection was accomplished by using serum bottles that had been evacuated with nitrogen gas, a 10 mL syringe, an 18-gauge needle, a 22-gauge needle and a T-valve. With gloved hands and on a sterilized lab bench, the T-valve was placed onto the 10 mL syringe with the 18-gauge needle. The 18-gauge needle was heated with a Bunsen burner and used to penetrate through the cap of the 1 L Nalgene bottle into the headspace. The T-valve was quickly opened and the syringe plunger was pulled back to 6 mL, which was followed immediately by closing the T-valve. With the T-valve closed, the 18-gauge needle was replaced with a 22-gauge needle and the 6 mL of gas was injected into the evacuated serum bottles. The bottles were then sent to Isotech Laboratories Inc., IL, for isotopic analyses of gas components.

Centrifugation of water samples

The formation water collected from the MDT was centrifuged on a Thermo Scientific Sorvall Legend RT Plus fixed angle centrifuge (Thermo Scientific, MA) immediately after it was transported back to the laboratory at UIUC. The centrifugation process was conducted with gloved hands on a clean lab bench and all the samples were kept at 4 °C. The 1 L Nalgene samples were opened in a high air flow environment to eliminate airborne contamination. Negative controls were created and treated identically to the samples. The centrifugation was performed at 9,000 rpm for 20 minutes at 4 °C. The supernatant from this operation was decanted into sterilized Nalgene bottles and stored in the 4 °C room. The pellets were pooled and washed twice with pre-cooled sterile 1× PBS buffer solution (pH 7.4). The final pellets were then re-suspended in RNAlater® (Ambion Inc., TX) and incubated at 4 °C for 24 hours before being transferred to the -80 °C freezer for long-term storage.

Geochemical Analyses

Geochemical parameters, such as pH, conductivity and redox potential (Eh), were measured on the freshly collected formation water in the field with the ORION® 5-Star pH/RDO*/Conductivity portable meters (Thermo Scientific, MA) connected with appropriate probes. Density was measured with a portable density meter (Mettler-Toledo Inc., OH). Supernatant of centrifuged samples was promptly aliquoted and sent on ice to the geochemistry laboratory of the ISGS, the Nuclear Engineering Teaching Laboratory of the University of Texas, Austin and Prairie Analytical Systems, Inc (Prairie Lab) (Springfield, IL) to quantify ion and trace metal concentrations. The samples were kept at 4 °C in the dark before analysis. To assess systematic errors among laboratories, some of the parameters were measured in more than one lab.
Ion composition and concentrations were determined at ISGS by using a Dionex 211i ion chromatograph (IC) (Dionex Company, CA) installed with an Ionpac AG14 Guard Column, an Ionpac AS14 Analytical Column, and a CDM-3 conductivity detector cell with a DS4 detection stabilizer following U.S. EPA Method 300.0 (Pfaff, 1993). Some of the ion and metal quantifications were performed with an IC and an inductively coupled plasma mass spectrometer (ICP-MS) at the Prairie Lab. TOC was measured with a Shimadzu TOC-V CSN (Shimadzu Corporation, Japan).

Strontium Isotopic Analysis

Sr analyses were completed on the formation water from the MDT to obtain the modern 87Sr/86Sr ratios for further diagenetic episodes and basinal brine fluid migrations in conjunction with solid rock samples. The preparation for these samples started with the Sr-spec resin and Sr columns under laminar flow hoods with gloves in a clean lab. The Sr-spec resin by Eichrom was thoroughly cleaned by adding ultrapure deionized (DI) water to the resin and leaving it under a heat lamp overnight, then decanting and refilling with more ultrapure DI water. This process was repeated three times. The Teflon Sr columns were then loaded with the Sr-spec resin slurry, taking the utmost care not to create any bubbles within the resin by letting the Sr-spec resin settle slowly. The columns were then washed with one full reservoir (>2 mL) of 3N HNO3, followed by one full reservoir of ultrapure DI water and 1 mL of 3N HNO3 to precondition the column for samples.

Once the resin and columns were prepared, Sr samples were prepared in laminar flow hoods in a clean lab with gloved hands by drying down 1 mL of the centrifuged water samples in acid-washed (HCl) beakers. A negative control and a duplicate were treated identically to the samples.
When a small solid formed from the dry-down process, the samples were brought up into 4 mL of 3N nitric acid (HNO3). Once the Sr columns had been preconditioned with the 1 mL of 3N HNO3 (see above), 100 µL of the sample was added to the columns and allowed to flow through the column completely. Next, three additions of 2 mL 3N HNO3 were added to the columns and allowed to drip completely through the column. This was followed by two additions of 2 mL of warm 0.05N HNO3 to elute the Sr. The eluted Sr was collected in new, HCl-cleaned beakers. The collections from the 0.05N HNO3 were evaporated to dryness, and 50 µL of concentrated (10.5 N) HNO3 was added to the small solid, which was dried down again to break up organic residues.

The samples were brought up in 2 % HNO3 with a 1:100 dilution in a laminar flow hood for mass spectrometer analyses. The Sr analyses were conducted on a Nu Plasma HR multicollector inductively-coupled-plasma mass spectrometer (MC-ICP-MS) with consistent plasma parameters of: (1) coolant gas: 13 L/min; (2) auxiliary gas: 0.9 L/min; (3) RF power: 1300 W. The mass spectrometer has an electrostatic analyzer (ESA) that filters the ions by their kinetic energies, followed by a magnetic sector that separates ions according to their mass-to-charge ratio. Combining the magnetic sector with the electrostatic analyzer creates double focusing and allows for exceptional resolution with "flat-topped" peak signals.

Microscopic Analyses

Lithology analysis of thin sections

Polished uncovered thin sections (30 µm in thickness) were prepared from the chosen sidewall cores by Wagner Petrographic, UT (http://www.wagnerpetrophic.com).
Each thin section was analyzed under plane-light (PL) and polarized light using an Axiovert 200M Fluorescent Microscope with Apotome (Carl Zeiss Company, Oberkochen, Germany) operated by Axiovision software, allowing for ultrahigh resolution imaging and microscopy (125-200 nanometer working resolution). Images were collected with an AxioCam MRc5 color camera and an AxioCam MRm high-resolution camera and analyzed for grain size and shape using the Axiovision software. All of the sidewall core lithologies were described using the siliciclastic sedimentary rock classification scheme (Williams et al., 1982).

Fluorescence In-Situ Hybridization (FISH) for cell enumeration

Microbial samples were stained with fluorescent dyes for enumeration of microorganisms with a method modified from Pernthaler et al. (2000). Immediately after arriving at the laboratory on ice, cells in 1 liter of MDT formation water and about 200 mL of the DST slurry were concentrated by centrifuging at 8,500 rpm for 20 min at 4 °C to collect biomass. The pellets were pooled, washed twice with 15 mL PBS buffer and fixed with 4 % paraformaldehyde solution at 4 °C for 4 hours. The fixed sample was washed twice with PBS and then stored in a PBS and ethanol mixture (1:1, v:v) at -20 °C before staining and microscopic observation.

The fixed samples were diluted in 1 mL PBS buffer solution at the proper dilution ratio. To minimize nonspecific staining of the soil matrix (e.g., clay), a dilution series from 1:10 to 1:1000 was prepared in PBS buffer. The samples were filtered through black polycarbonate filters with 0.22 µm pores (GE Water and Process Technologies, PA) and air-dried. Then, they were sequentially stained with the Cy3-labeled EUB338 probe (EUB338-cy3) (5 µg/mL) (Integrated DNA Technologies Inc., IA), 5× Sybr Green I (Invitrogen Corp., CA) and TOPRO3 (5 µg/mL) (Sigma-Aldrich Corp., MO).
A series of preliminary experiments were performed on the samples from D5872 to optimize the experimental conditions. Wavelength channels were carefully selected to achieve the minimal overlap of different fluorescent dyes. As shown in Fig. S1, the spots visualized in the cy3 channel are not all co-localized with those in the Sybr Green I channel, suggesting that no significant overlap occurred between these two channels. In our preliminary experiment, weak signal intensity and smeared signals were observed in the DAPI channel. Thus, the DAPI dye employed in the original protocol (Pernthaler et al., 2000) was replaced with TOPRO3. Compared to DAPI, TOPRO3 exhibits several advantages, including: (1) TOPRO3 is very specific to DNA; (2) the Far Red channel (emission wavelengths above 650 nm), unlike the DAPI channel (400-470 nm), does not overlap with the natural autofluorescence inherent to organic matter. At emission wavelengths above 650 nm, the autofluorescence observed for most biological and organic compounds is therefore effectively minimized. EUB338-cy3, the universal probe for bacteria (Amann et al., 1995), was used in this study for the enumeration of the cell density in the D5872 and drilling mud samples. Hybridization was performed with a hybridization buffer containing 30 % formamide (Sigma-Aldrich Corp., MO). It was followed by staining with 5× Sybr Green I solution for 30 min in the dark and with TOPRO3 for 3 min. After each staining step, excess dye was removed with washing solution (for EUB338-cy3) (Pernthaler et al., 2000) or PBS buffer (for Sybr Green I and TOPRO3). The membranes were mounted with 1-2 drops of ProLong® Gold Antifade Reagent (Invitrogen Corp., CA) and covered with cover slips.
After keeping the samples at room temperature in the dark for 24 hours, the cover slips were sealed with nail polish and the samples were stored at 4 °C before they were viewed under the confocal microscope.

Using a 40× oil-immersion lens on the Zeiss Axiovert 200M with the Apotome Structured Illumination Optical Sectioning System and appropriate filters, about 12 to 25 Z-stack images were taken (at 5 µm intervals over a total distance of 60-125 µm, depending on the thickness of the area) for each spot. An extended depth of focus image using all focal planes was created in the Axiovision program for each channel. The pictures taken from the three channels were overlaid, and only the overlapping regions of EUB338, SYBR Green I and TOPRO3 signals were identified and defined as stained cells. Such cells turned white after this process, and all pixels in colors other than white were discarded with the aid of Adobe Photoshop (Adobe Systems, CA). For each sample, these operations were repeated for at least 10 spots randomly distributed on the membrane. The number of cells per unit volume of formation water was calculated as the average number of signals per unit of area normalized by the area of filtered sample on the membrane, the dilution ratio during sample preparation and the initial volume of formation water used for FISH.

Cultivation of Organisms

Significant efforts were made to grow culturable organisms from the D5872 formation water sample. The groundwater used for cultivation was flushed with sterile cotton-filtered N2:CO2 (80:20, v:v) and stored at 4 °C in the dark immediately after arriving at the UIUC laboratory, until microcosms were set up within a week. A variety of experimental conditions, including media, salt concentrations and nutrients, were created. The pH of the media was buffered with 20 mM bicarbonate and an N2:CO2 (80:20, v:v) headspace for the anaerobic media.
The sterilized anaerobic media were prepared under an N2:CO2 (80:20, v:v) headspace in serum tubes sealed with butyl rubber stoppers and aluminum seals. All the amendments were added from sterilized anaerobic stock solutions.

Two media, the synthetic groundwater (Roh et al., 2002) and R2A medium, were selected. A series of geochemical conditions were created, including fermentation, iron-, nitrate-, and sulfate-reducing conditions with glucose, Fe(III)-citrate, NaNO3 and Na2SO4 (5 mM each), respectively. A fatty acid mixture (acetate, lactate and butyrate, 5 mM each) was added as electron donors and carbon source in the non-fermentation microcosms. Under each condition, groundwater was injected to a final proportion of 10 or 50 % of the total volume, both to adjust the final salt concentrations and as the source of indigenous organisms.

The cultures were incubated without shaking at 37 °C in the dark and manually shaken every 3-5 days. In order to determine whether organisms were successfully cultured, the cultures were read for OD600 using a SPECTRONIC 20D+ spectrophotometer (Thermo Scientific, MA) and observed under a Nikon optical microscope (Nikon Instruments Inc., Japan) every 3-5 days for the first month and weekly for the remainder of the three months. Microcosm samples filtered through a 0.22 µm membrane were measured monthly for changes in fatty acid concentrations using an HPLC.

Microbial Community Analyses

DNA extraction

High molecular weight genomic DNA was extracted directly from the frozen centrifuged pellets (as described in Centrifugation of water samples above) using a procedure similar to that of Tsai and Olson (1991). All reagents used were of molecular-biology grade or higher purity.
Cells were lysed by incubating one filter at 37 °C in the lysis solution (0.15 M NaCl, 0.1 M Na2EDTA, pH=8.0) with 15 mg lysozyme per mL, followed by another incubation in STS solution (0.1 M NaCl, 0.5 M Tris-Cl, pH=8.0, 10 % sodium dodecyl sulfate) (Tsai and Olson, 1991). After three freeze-thaw cycles between a 55 °C water bath and liquid nitrogen to further disrupt cell membranes, proteins were removed through sequential extractions with equal volumes of phenol (pH=7.8), phenol:chloroform:isoamyl alcohol (25:24:1, pH=7.8) and chloroform:isoamyl alcohol (24:1). DNA, which remained in the aqueous phase, was precipitated in isopropanol containing 2.5 M ammonium acetate and 100 µg/mL glycogen (Ambion Inc., TX). The centrifuged DNA pellet was then air dried and resuspended in nuclease-free TE buffer. We checked both the quality and quantity of the recovered DNA by gel electrophoresis. DNA samples were loaded onto a 0.8 % agarose gel and stained with ethidium bromide. The size and intensity of the genomic DNA bands were compared visually with a quantitative DNA ladder (Hyperladder I, Bioline USA, Boston, MA). Due to the low biomass in the environmental samples, DNA may be present but not visible via gel electrophoresis.

Development of 16S rRNA Clone library

In order to develop clone libraries for full-length 16S rRNA genes, PCR was performed with forward primers for Bacteria (8F, 5′-AGA GTT TGA TCC TGG CTC AG-3′) or Archaea (21F, 5′-TCC GGT TGA TCC YGC CGG-3′) and a universally conserved reverse primer (1492R, 5′-GGT TAC CTT GTT ACG ACT T-3′). PCR amplification was performed in 50 μL reaction volumes using the TaKaRa Ex Taq™ kit (Takara Bio Inc., CA) following the manufacturer's instructions.
Amplification began with a denaturing hot start at 95 °C and was followed by 25 cycles of the following: 30 s of denaturing at 94 °C, 30 s for primer annealing at 55 °C, and 90 s for extension at 72 °C. The PCR was completed with a final extension step at 72 °C for 5 minutes. The size and intensity of the PCR product were confirmed with agarose electrophoresis as described above. As a precaution against contamination, appropriate negative controls were used to ensure that no sample was contaminated with exogenous DNA. An equivalent volume of sterile ultrapure water was filtered in the same manner as the formation water samples, and the filter was then subjected to the same DNA extraction procedure. A negative procedural control containing no sample during DNA extraction was prepared as well. No DNA was detected in any control extraction by gel electrophoresis, and no amplification was observed in any PCR reactions using these samples as a template. The amplified PCR products were purified with the Qiaquick PCR purification kit (Qiagen Inc., CA). Purified PCR products were cloned into pGEM-T Easy Vector® and transformed into JM109 High Efficiency Competent Cells as instructed by the manufacturer (Promega Corporation, WI). For each 16S rRNA clone library, a total of 196 recombinant plasmids were extracted from randomly picked clones and sequenced at the Illinois Biotechnology Center of University of Illinois, Urbana.

The full-length 16S rRNA gene sequences from clone libraries were assembled with Sequencher 4.9 (Gene Codes Corporation, Ann Arbor, Michigan). Low-quality sequences were eliminated. Alignments and distance matrices of assembled sequences were generated with the aid of the NAST (DeSantis et al., 2006b) and PHYLIP (Retief, 2000) programs from Greengenes (DeSantis et al., 2006a).
All sequences were checked for chimeric artifacts by Bellerophon (version 3) (Huber et al., 2004) and the chimeras were removed from the datasets before the subsequent sequence analyses. Taxonomic classification was determined by using the sequence match tool of the Ribosomal Database Project (RDP) II database (Maidak et al., 2001) and double checked with the sequence classification program from Greengenes (DeSantis et al., 2006a). Operational taxonomic units (OTUs) were determined at a sequence similarity level of 99 %. One representative sequence per OTU was used to construct phylogenetic trees based on the neighbor-joining algorithm (Jukes-Cantor correction) in ARB (Ludwig et al., 2004). The robustness of the inferred tree topologies was evaluated with 1000 bootstrap replicates of the neighbor-joining data.

Terminal Restriction Fragment Length Polymorphism (T-RFLP)

Preparation of T-RFLP samples was similar to that used in the development of the clone library. Full-length 16S rRNA genes were amplified from genomic DNA using FAM-labeled 8F and 1492R under the same thermal cycling conditions. The PCR products were purified using the QIAquick® PCR purification Kit (Qiagen Inc., CA). The purified PCR products were digested with RsaI (New England Biolabs Inc., MA) and analyzed with an AB 3730xl DNA Analyzer using ROX1000 as the standard. The results were visualized using the GeneMapper® software and analyzed with a Microsoft Excel script designed by our group.

Metagenomic Analyses

V1-V3 16S rRNA Hypervariable Region Pyrosequencing

PCR products for bar-coded pyrosequencing were amplified with the primer pair targeting the V1-V3 hypervariable region of 16S rRNA genes (position 27-338 (V1-V3), Escherichia coli numbering). Each primer was attached to a 12 bp Roche 454 pyrosequencing bar-code sequence, a TC linker and a Roche 454 sequencing adapter.
Amplification, pooling and pyrosequencing were performed as described (Jindou et al., 2008). The single band of PCR products was purified from the cut agarose gel (0.8 %) with the QIAGEN® Gel Extraction Kit and the concentrations were quantified with a Qubit® Fluorometer (Invitrogen Co., CA) after staining the DNA using a Quant-iT™ dsDNA BR Assay Kit (Invitrogen Co., CA). Equal amounts of PCR products for each sample were pooled and run on a Roche Titanium 454 pyrosequencing machine at the W. M. Keck Center, University of Illinois, Urbana-Champaign.

Pyrosequencing reads were first assessed for quality by using several criteria: reads that had degenerate bases, had an overall low quality score, lacked a complete barcode and primer at one end, or were shorter than 300 bp were removed. The remaining reads were aligned against a SILVA-based bacterial reference (Pruesse et al., 2007), and chimeras were identified with the ChimeraSlayer package in Mothur (Haas et al., 2011) and excluded from further analyses. The remaining sequences were trimmed with the aid of Jalview (Clamp et al., 2004) to cover the same aligned region. The reads were then preclustered (Huse et al., 2010) and assigned to OTUs to generate rarefaction curves. Sequencing recovery was estimated using the non-parametric abundance-based coverage estimator (Chao and Lee, 1992) at different dissimilarity levels. The pyrosequencing reads were classified into different taxonomic groups by comparing them with the entries in the SILVA database bacterial reference (Pruesse et al., 2007). The number of reads in each OTU, defined at a cutoff value of 97 %, was counted at each taxonomic level and the proportion of each group in a sample was calculated.

Halomonas sulfidaeris strain Esulfide1 as a Reference Genome for D5872 Metagenome

Halomonas sulfidaeris strain Esulfide1 (H.
sulfidaeris Esulfide1) was selected as the reference genome for the D5872 Metagenome. The culture was purchased from the American Type Culture Collection (ATCC) and grown on agar plates containing the Halomonas medium as recommended by ATCC and 15 % agar. Single colonies were picked and inoculated in liquid Halomonas medium for 36-48 hours at 30 °C. After the cells were collected by centrifugation, genomic DNA was extracted with the method described above.

Pyrosequencing of D5872 Metagenome and H. sulfidaeris Esulfide1 genome

Genomic DNA of the D5872 H. sulfidaeris-dominated microbial community and H. sulfidaeris Esulfide1 was sequenced at the W. M. Keck Center, University of Illinois Urbana-Champaign. A combination of 454 shotgun, 454 paired-end (3-5 kb inserts) and Illumina (500 bp inserts) sequencing was performed (Kumar et al., In preparation) using a Roche GS-FLX sequencer and an Illumina HiSeq2000, respectively. The libraries for 454 Roche shotgun and mate-pair sequencing of 3 kb libraries were prepared following the standard protocols from Roche (Roche). For Illumina sequencing, the genomic DNA was fragmented into roughly 500 bp fragments using standard Illumina protocols and paired-end sequenced.

Metagenomic and Genomic Analyses

The genomic reads sequenced using 454 array-based pyrosequencing (Ronaghi, 2001) were assembled using the Newbler assembler v2.3 (www.454.com). Reads were quality-trimmed using the default trimming provided by Newbler. For the metagenome, the unassembled raw reads (singletons) were trimmed using Newbler's quality trimming, and singletons <100 bp in length were removed. Sequences that were identical to each other were also removed in order to retain a unique sequence set. Stringent parameters for minimum overlap between two reads (60 bp in length and 95 % in identity) were used to prevent assembly into chimeric contigs.
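The singleton clean-up described above (length cutoff followed by removal of identical sequences) can be illustrated with a minimal sketch. This is not the script used in the study: the function name `filter_singletons`, the in-memory list of `(name, sequence)` pairs, and the choice to keep the first copy of each duplicate are assumptions made for illustration. Only the 100 bp threshold is taken from the text.

```python
def filter_singletons(reads, min_length=100):
    """Return reads that are at least `min_length` bp long and unique.

    `reads` is a list of (name, sequence) tuples (hypothetical input format).
    Exact duplicates are compared case-insensitively; the first occurrence
    of each sequence is kept (an assumption, the text does not say which).
    """
    seen = set()
    kept = []
    for name, seq in reads:
        seq = seq.upper()
        if len(seq) < min_length:
            continue  # singletons shorter than the cutoff are removed
        if seq in seen:
            continue  # identical sequence already retained
        seen.add(seq)
        kept.append((name, seq))
    return kept


# Toy input: r2 duplicates r1, r3 is too short, r4 is long and unique.
example_reads = [
    ("r1", "A" * 120),
    ("r2", "a" * 120),
    ("r3", "ACGT" * 10),
    ("r4", "ACGT" * 30),
]
unique_reads = filter_singletons(example_reads)
```

In this toy input only `r1` and `r4` survive. Real data would be read from the quality-trimmed Newbler output rather than built in memory.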
The assembly was compared with assemblies produced using different parameters (default) and versions (v2.0) of the Newbler assembler, and the GC-content distributions of the contigs were compared among these assemblies to ensure a high quality assembly.

tRNA genes were predicted in the contigs and singletons using tRNAScan-SE (Lowe and Eddy, 1997; Schattner et al., 2005), and rRNA genes were predicted using a Python-based script, rna_hmm3 (Huang et al., 2011), that runs on HMMER3.0 (Eddy, 2009) to search for partial 5S, 16S, and 23S genes in the metagenome contigs. From the predicted coordinates, the rRNA sequences were parsed and the corresponding sequences were assembled using the CAP3 sequence assembly program (Huang and Madan, 1999) to identify complete operons. Metagenomic reads often contain partial open-reading frames (ORFs). MetaGeneMark (Zhu et al., 2010) was used to predict protein-encoding genes in the contigs and singletons, allowing for partial ORFs. The program uses hidden Markov models (HMM) trained on short bacterial and archaeal sequences based on their GC-content and codon usage. MetaGeneMark predictions were shown to compare favorably with other prediction tools, such as MetaGene (Noguchi et al., 2006) and MetaGeneAnnotator (Noguchi et al., 2008). Duplicates amongst the predicted proteins were removed from the corresponding ORFs. In order to determine the percentage of proteins representative of the reference genome, the predicted proteins were searched against the predicted proteins of the reference genome H. sulfidaeris Esulfide1 (Kumar et al., In preparation). Conserved protein families, domains, and motifs were predicted in the metagenome translated ORFs by searching the Pfam (Finn et al., 2010) and TIGRfam (Haft et al., 2001) databases using HMMER3.0 (Eddy, 2009). Known enzymes were predicted using PRIAM (Claudel-Renard et al., 2003) at an e-value cutoff of 10^-10.
This was followed by searching the proteins against the non-redundant (nr) database at an e-value of 10^-5 using Blastp (Altschul et al., 1997). The Blastp results were used as input for taxonomic assignment of the ORFs using MEGAN v4 (Huson et al., 2007), and also for assigning ORFs to known SEED categories.

Taxonomic classification of the assembled contigs and singletons was performed using an NCBI Blastp (Altschul et al., 1990; Altschul et al., 1997) search against the NCBI non-redundant protein database (nr), and the gi numbers (NCBI sequence identifiers) for the top 10 hits for each protein were picked. Of the 42,408 (91 %) proteins that had hits at an E-value of <10^-5, only 109 proteins were unassigned to any taxon or assigned as root. Using a SEED server script, the corresponding SEED protein-encoding gene (PEG) IDs for the gi numbers were identified in the SEED database (Aziz et al., 2008; Meyer et al., 2008). Sorted by e-values for each gene, the PEGs were mapped to the subsystems within the SEED database and the first subsystem that exists for a PEG was picked. For the SEED classification, Level 2 within the hierarchy was used where possible. Otherwise, Level 1 classification of the SEED was used. In order to identify enriched SEED categories in the metagenome, counts of proteins assigned to each SEED category for the metagenome proteins were compared to those for proteins in the entire SEED database using Fisher's exact test. To correct for multiple comparisons, a false discovery rate error adjustment to the P-values was made using the Benjamini-Hochberg method (Benjamini and Hochberg, 1995). All categories with an adjusted P-value of <0.05 were considered significantly enriched in the metagenome.
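The enrichment procedure described above, a per-category Fisher's exact test followed by Benjamini-Hochberg adjustment, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' script: the function names, the 2×2 table layout and the one-sided ("greater") alternative are assumptions, since the text does not state which alternative hypothesis was used. Only the Python standard library is used.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test, P(X >= a), for the 2x2 table

        [[a, b],    metagenome proteins: in category / not in category
         [c, d]]    database proteins:   in category / not in category

    computed from the hypergeometric distribution with fixed margins.
    """
    row1 = a + b          # total metagenome proteins
    col1 = a + c          # total proteins in the category
    n = a + b + c + d     # grand total
    upper = min(row1, col1)
    numerator = sum(
        comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1)
    )
    return numerator / comb(n, col1)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical counts for illustration only (not data from this study).
p_example = fisher_exact_greater(3, 1, 1, 3)
adjusted_example = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
enriched = [q < 0.05 for q in adjusted_example]
```

In practice one table would be built per SEED category, the resulting P-values adjusted together, and categories with an adjusted P-value below 0.05 reported as enriched.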
The predicted proteins were also classified on the basis of Clusters of Orthologous Groups (COG) domains using a reverse-position-specific Blast (rpsBlast) (Marchler-Bauer et al., 2002) search against a database of position-specific scoring matrices (PSSMs) of automatically aligned sequences classified in the COG resource for unicellular species (Tatusov et al., 2003). The rpsBlast search results were parsed, followed by generation of counts of metagenome proteins that were assigned to each COG category.

Figures and Tables

[Figure S1 image panels: (a) Sybr Green I; (b) EUB338-cy3; (c) TOPRO3; (d) Overlapped]

Figure S1. Fluorescence hybridization of the D5872 microbial consortium by Sybr Green (a), cy3 labeled probe EUB338 (b) and TOPRO3 (c). Subfigure (d) shows overlapped pictures (a-c). The spots appearing at the same location in (a)-(c) will overlap in (d) and are thus determined as an effectively stained cell. The arrows point to one of the identified stained cells.

[Figure S2 plots: (a) D5872; (b) DDM. x-axis: Length of Fragments; y-axis: Fraction. Labeled peaks in (a): 165, 421, 645, 700, 859, 872, 888; in (b): 119, 307, 422-423, 467, 470, 562, 646, 823, 869, 881, 896]

Figure S2. T-RFLP analysis of D5872 microbial community (a) and drilling mud (b). Full-length 16S rRNA genes were amplified with bacterial primers 8F and 1492R and digested with restriction enzyme RsaI. Individual fragment abundance was normalized to the total abundance and shown as the fraction. The values above the peaks indicate the length of the corresponding fragments in base pairs.

[Figure S3 panels: (a), (b)]

Figure S3.
Phylogenetic relationship of Halomonas species in D5872 microbial community versus the DST drilling mud (DDM) by 454 pyrosequencing reads targeting the V1-V3 hypervariable region of 16S rRNA genes (a) and comparison of abundance of each OTU in the two samples (b). Each sample code indicates one OTU identified at a 97 % cutoff value. Escherichia coli (X80725) was used as the out-group and the scale bar indicates 0.1 changes per nucleotide position. The accession number of each type strain is in parentheses. The values listed in the parentheses after each OTU indicate numbers of sequences in the D5872 versus those in the drilling mud. The abundance of each OTU is proportional to the size of bars in (b).

[Figure S4 plots: (a) and (b), OTUs versus Number of Reads for D5872 and DDM]

Figure S4. Rarefaction of D5872 and the DST drilling mud (DDM) microbial community richness using full-length 16S rRNA genes (a) and pyrosequencing reads of the V1-V3 hypervariable region of 16S rRNA genes (b). The cutoff values for (a) and (b) are 99 % and 97 %, respectively.

Figure S5. Distribution of GC-content with length for D5872 Metagenome contigs. The solid black line represents the average GC-content for H. sulfidaeris Esulfide1 (0.54).

[Figure S6 panels: (a), (b)]

Figure S6. Taxonomic assignment of D5872 Metagenome ORFs predicted in contigs (a) and singletons with ORFs longer than 100 aa (b). The 'Others' category comprises a mix of diverse taxa and includes proteins, the majority of which were classified within root, bacteria, and cellular organisms.

Figure S7. Significant enrichment of SEED subsystems in D5872 Metagenome proteins over H. sulfidaeris Esulfide1 proteins. X-axis indicates the negative log of P-value.
Y-axis denotes the difference in percentage of proteins in D5872 Metagenome and H. sulfidaeris Esulfide1 that are assigned to SEED subsystems. The enriched subsystems are highlighted in red.

Figure S8. KEGG pathway depicting degradation of dichlorobenzoate as an example of the capability for biodegradation of halogenated hydrocarbons in the D5872 Metagenome. Enzymes highlighted in green are the ones present in D5872 Metagenome.

Figure S9. KEGG pathway depicting reductive fixation of ammonia into amino acids. Enzymes highlighted in green are the ones present in D5872 Metagenome.

Table S1. Geochemistry of formation fluid collected from the horizon 1.8 km (5872 feet) in depth of the Decatur Well, IL (a).

Geochemical Parameters: Depth (m); T (°C); Formation Pressure (bar); Redox Potential (Eh) (mV/pH unit)
D5872: 1789; 49.9; 181; 166.5
Drilling Mud: Wellhead; 26.5; NA (b); NA
7.98 NA (c) BDL NA 240011* 14412 15.70±0.28 213.4±16.2 52767±5552 2.025±0.035 6.45±0.27 BDL 2.30 503±15 7260 BDL 21200 642 12.20 12732±987 BDL 292.300.71 4.98±0.62 3.03 BDL 2392±11 NA BDL BDL BDL 350±77 131.0±4.2 4.65±0.35 48±11 52.5±4.3 1.14±0.05 1581±128 0.0634±0.0032 -21.9 -35.16 -63 0.71123±0.00001 (1.5±0.4)×10^6 NA NA NA NA (8.4±3.5)×10^10
pH (0.3); Dissolved oxygen (DO); TOC (mg/L) (d); TDS (×10^3 ppm) (e, f); F- (ppm); Br- (ppm); Cl- (ppm); I- (ppm); NO3--N (ppm); PO43- (ppm); Ba2+ (ppm); SO42- (ppm); Ca2+ (ppm); Fe2+ (ppm); K+ (ppm); Mg2+ (ppm); Mn2+ (ppm); Na+ (ppm); Zn2+ (ppm); Isotope (‰): 13CO2, 13CH4, 13DC1; 87Sr/86Sr; Cell density (cells/L)

(a) Temperature, Eh, pH and DO were measured in-situ, while other parameters were measured on filtered formation fluid or gaseous phase by the laboratories at UIUC, ISGS, UT-Austin and Isotech Inc. The average was calculated if a parameter was measured at different laboratories, and uncertainties indicate the standard deviation of replicate measurements.
(b) NA: not measured.
(c) BDL: below detection limit.
(d) TOC: total organic carbon.
(e) TDS: total dissolved salts. The samples were measured after passing through a 0.45 µm filter membrane; ppm indicates parts per million.
(f) The detection limits for the ions are (ppm): F- (0.05); Br- (0.25); Cl- (3.5); I- (0.1); NO3- (0.25); Ba2+ (0.5); PO43- (0.8); SO42- (0.25); Ca2+ (10); Fe2+ (14); K+ (19); Mn2+ (0.1); Na+ (12); Zn2+ (0.001).

Table S2. Summary of D5872 microbial community identified by clone library and pyrosequencing

                                              16S rRNA clone library          V1-V3 pyrosequencing
                                              D5872           Drilling Mud    D5872           Drilling Mud
Number of sequences/reads                     166             156             8856            9918
Number of OTU (a)                             9               39              120             201
Number of sequences/reads affiliated
  to genus Halomonas                          163             4               8637            237
Closest type strain related to the
  dominant sequences/reads                    H. sulfidaeris  H. neptunia     H. sulfidaeris  H. neptunia and H. sulfidaeris
Coverage                                      0.98            0.84            0.995           0.992

(a) 99 % and 97 % were used as the cutoff values for OTUs in analyses of 16S rRNA clone library and pyrosequencing, respectively.

References

Akkurt, R., Bowcock, M., Davies, J., Del Campo, C., Hill, B., Joshi, S. et al. (2006) Focusing on downhole fluid sampling and analysis. In Oilfield Review: Schlumberger, pp. 4-19.
Altschul, S.F., Gish, W., Miller, W., Myers, E.W., and Lipman, D.J. (1990) Basic local alignment search tool. J Mol Biol 215: 403-410.
Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389-3402.
Amann, R.I., Ludwig, W., and Schleifer, K.H. (1995) Phylogenetic Identification and in-Situ Detection of Individual Microbial-Cells without Cultivation. Microbiological reviews 59: 143-169.
Atekwana, E.A. (1996) Precambrian basement beneath the central Midcontinent United States as interpreted from potential field imagery, in B. A.
van der Pluijm and P. A. 839 Catacosinos, eds., . In Basement and basins of eastern North America: Geological Society of 840 America Special Paper. van der Pluijm, B.A., and Catacosinos, P.A. (eds): Geological 841 Society of America. 842 Ayan, C., Corre, P.-Y., Firinu, M., Garcia, G., Kristennsen, M.R., O'Keefe, M. et al. (2013) 843 New dimensions in wireline formation testing. In Oilfield Review: Schlumberger, pp. 32-41. 844 Aziz, R.K., Bartels, D., Best, A.A., DeJongh, M., Disz, T., Edwards, R.A. et al. (2008) The 845 RAST Server: rapid annotations using subsystems technology. BMC Genomics 9: 75. 846 Benjamini, Y., and Hochberg, Y. (1995) Controlling the false discovery rate: A practical and 847 powerful approach to multiple testing. J Roy Statist Soc Ser B 57: 289–300. 848 Bethke, C.M. (1986) Hydrologic constraints on the genesis of the upper mississippi valley 849 mineral district from Illinois basin brines. Econ Geol 81: 233-249. 37 850 Bethke, C.M., and Marshak, S. (1990) Brine migrations across North America — The plate 851 tectonics of groundwater. Annu Rev Earth Pl Sc 18: 287–315 852 Bethke, C.M., Pruitt, J.D., and Barrows, M.H. (1984) Petrographic, geochemical, and 853 paleohydrologic evidence of nature of petroleum migration in Illinois basin. AAPG Bulletin 854 68: 454-454. 855 Bethke, C.M., Reed, J.D., and Oltz, D.F. (1991) Long-range petroleum migration in the 856 Illinois Basin. AAPG Bulletin 75: 925-945. 857 Bowen, B.B., Ochoa, R., Wilkens, N.D., Brophy, J., Lovell, T.R., Fischietto, N. et al. (2010) 858 Depositional and diagenetic variability within the Cambrain Mount Simon Sandstone: 859 Implications for carbon dioxide sequestration. Environmental Geosciences 18: 69-89. 860 Bradbury, J.C., and Atherton, E. (1965) The Precambrian basement of Illinois. In Illinois 861 Geol Survey Circ, p. 13. 862 Buschbach, T.C. (1964) Cambrian and Ordovician strata of north- eastern Illinois. In Illinois 863 State Geological Survey Report of Investigations, p. 90. 
Carpenter, A.B. (1978) Thirteenth annual forum on the geology of industrial minerals: Oklahoma Geological Survey.
Chao, A., and Lee, S.M. (1992) Estimating the Number of Classes Via Sample Coverage. J Amer Statist Assoc 87: 210-217.
Clamp, M., Cuff, J., Searle, S.M., and Barton, G.J. (2004) The Jalview Java alignment editor. Bioinformatics 20: 426-427.
Clark, I., and Fritz, P. (1997) Environmental isotopes in hydrogeology. New York: CRC Press LLC.
Claudel-Renard, C., Chevalet, C., Faraut, T., and Kahn, D. (2003) Enzyme-specific profiles for genome annotation: PRIAM. Nucleic Acids Res 31: 6633-6639.
DeSantis, T.Z., Hugenholtz, P., Larsen, N., Rojas, M., Brodie, E.L., Keller, K. et al. (2006a) Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB. Appl Environ Microbiol 72: 5069-5072.
DeSantis, T.Z., Jr., Hugenholtz, P., Keller, K., Brodie, E.L., Larsen, N., Piceno, Y.M. et al. (2006b) NAST: a multiple sequence alignment server for comparative analysis of 16S rRNA genes. Nucleic Acids Res 34: 394-399.
DOE, Energy, Office of Fossil Energy, and NETL (2007) Carbon Sequestration Technology Roadmap and Program Plan.
Eddy, S.R. (2009) A new generation of homology search tools based on probabilistic inference. Genome Inform 23: 205-211.
Egeberg, P.K., and Aagaard, P. (1989) Origin and evolution of formation waters from oil fields on the Norwegian shelf. Appl Geochem 4: 131-142.
Finn, R.D., Mistry, J., Tate, J., Coggill, P., Heger, A., Pollington, J.E. et al. (2010) The Pfam protein families database. Nucleic Acids Res 38: D211-222.
Haas, B.J., Gevers, D., Earl, A.M., Feldgarden, M., Ward, D.V., Giannoukos, G. et al. (2011) Chimeric 16S rRNA sequence formation and detection in Sanger and 454-pyrosequenced PCR amplicons. Genome Res 21: 494-504.
Haft, D.H., Loftus, B.J., Richardson, D.L., Yang, F., Eisen, J.A., Paulsen, I.T., and White, O. (2001) TIGRFAMs: a protein family resource for the functional identification of proteins. Nucleic Acids Res 29: 41-43.
Holser, W.T. (1979) Elements and Isotopes in Evaporites. In Marine Minerals: Reviews in Mineralogy and Geochemistry. Burns, R.G. (ed): Mineralogical Society of America, pp. 295-346.
Huang, X., and Madan, A. (1999) CAP3: A DNA sequence assembly program. Genome Res 9: 868-877.
Huang, Y., Li, W., Finn, P.W., and Perkins, D.L. (2011) Ribosomal RNA Identification in Metagenomic and Metatranscriptomic Datasets. In Handbook of Molecular Microbial Ecology I: Metagenomics and Complementary Approaches. Hoboken, NJ, USA: John Wiley & Sons, Inc., pp. 387-391.
Huber, T., Faulkner, G., and Hugenholtz, P. (2004) Bellerophon: a program to detect chimeric sequences in multiple sequence alignments. Bioinformatics 20: 2317-2319.
Huse, S.M., Welch, D.M., Morrison, H.G., and Sogin, M.L. (2010) Ironing out the wrinkles in the rare biosphere through improved OTU clustering. Environ Microbiol 12: 1889-1898.
Huson, D.H., Auch, A.F., Qi, J., and Schuster, S.C. (2007) MEGAN analysis of metagenomic data. Genome Res 17: 377-386.
Jindou, S., Brulc, J.M., Levy-Assaraf, M., Rincon, M.T., Flint, H.J., Berg, M.E. et al. (2008) Cellulosome gene cluster analysis for gauging the diversity of the ruminal cellulolytic bacterium Ruminococcus flavefaciens. FEMS Microbiol Lett 285: 188-194.
Knauth, L.P., and Beeunas, M.A. (1986) Isotope geochemistry of fluid inclusions in Permian halite with implications for the isotopic history of ocean water and the origin of saline formation waters. Geochim Cosmochim Acta 50: 419-433.
Kumar, C.G., Dong, Y., Olsen, G.J., Kim, P.-J., Cann, I.K.O., Mackie, R. et al. (In preparation) H.
sulfidaeris: metabolic strategy unites hydrothermal vents and subsurface arenite rocks.
Land, L.S., and Prezbindowski, D.R. (1981) The origin and evolution of saline formation water, Lower Cretaceous carbonates, south-central Texas, U.S.A. J Hydrol 54: 51-74.
Leetaru, H.E., and McBride, J.H. (2009) Reservoir uncertainty, Precambrian topography, and carbon sequestration in the Mt. Simon Sandstone, Illinois Basin. Environmental Geosciences 16: 235-243.
Leetaru, H.E., Frailey, S., Morse, D., Finley, R.J., Rupp, J.A., Drahovzal, J.A., and McBride, J.H. (2009) Carbon sequestration in the Mt. Simon Sandstone saline reservoir. In Carbon dioxide sequestration in geological media: State of the science: AAPG Studies in Geology. Grobe, M., Pashin, J.C., and Dodge, R.L. (eds), pp. 261-277.
Lowe, T.M., and Eddy, S.R. (1997) tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25: 955-964.
Ludwig, W., Strunk, O., Westram, R., Richter, L., Meier, H., Yadhukumar et al. (2004) ARB: a software environment for sequence data. Nucleic Acids Res 32: 1363-1371.
Maidak, B.L., Cole, J.R., Lilburn, T.G., Parker, C.T., Jr., Saxman, P.R., Farris, R.J. et al. (2001) The RDP-II (Ribosomal Database Project). Nucleic Acids Res 29: 173-174.
Makowitz, A., Lander, R.H., and Milliken, K.L. (2006) Diagenetic modeling to assess the relative timing of quartz cementation and brittle grain process during compaction. AAPG Bulletin 90: 873-885.
Marchler-Bauer, A., Panchenko, A.R., Shoemaker, B.A., Thiessen, P.A., Geer, L.Y., and Bryant, S.H. (2002) CDD: a database of conserved domain alignments with links to domain three-dimensional structure. Nucleic Acids Res 30: 281-283.
McCaffrey, M.A., Lazar, B., and Holland, H.D. (1987) The evaporation path of seawater and the coprecipitation of Br- and K+ with halite. J Sediment Res 57: 928-937.
Meyer, F., Paarmann, D., D'Souza, M., Olson, R., Glass, E.M., Kubal, M. et al. (2008) The metagenomics RAST server - a public resource for the automatic phylogenetic and functional analysis of metagenomes. BMC Bioinformatics 9: 386.
Morse, D.G., and Leetaru, H.E. (2003) 3D Geologic Visualization, Key to Managing Cambrian, Mt. Simon Sandstone Aquifer Gas Storage Reservoirs in Illinois. In Western Regional Annual Joint Meeting of American Association of Petroleum Geologists and the Society of Petroleum Engineers. Long Beach, CA.
Noguchi, H., Park, J., and Takagi, T. (2006) MetaGene: prokaryotic gene finding from environmental genome shotgun sequences. Nucleic Acids Res 34: 5623-5630.
Noguchi, H., Taniguchi, T., and Itoh, T. (2008) MetaGeneAnnotator: detecting species-specific patterns of ribosomal binding site for precise gene prediction in anonymous prokaryotic and phage genomes. DNA Res 15: 387-396.
Pernthaler, J., Glöckner, F.O., Schönhuber, W., and Amann, R. (2000) Fluorescence in situ Hybridization (FISH) with rRNA-targeted Oligonucleotide Probes. In Marine Microbiology. Paul, J.H. (ed): Academic Press, pp. 207-226.
Pfaff, J.D. (1993) Method 300.0 Determination of Inorganic Anions by Ion Chromatography. US Environmental Protection Agency, Cincinnati, OH.
Phelps, T.J., and Fredrickson, J.K. (2002) Drilling, Coring, and Sampling Subsurface Environments. In Manual of Environmental Microbiology. Hurst, C.J., and Crawford, R.L. (eds).
Pinkart, H.C., Ringelberg, D.B., Piceno, Y.M., Macnaughton, S.J., and White, D.C. (eds) (2002) Biochemical Approaches to Biomass Measurements and Community Structure Analysis. Washington, DC: American Society for Microbiology.
Pruesse, E., Quast, C., Knittel, K., Fuchs, B., Ludwig, W., Peplies, J., and Glöckner, F.O.
(2007) SILVA: a comprehensive online resource for quality checked and aligned ribosomal RNA sequence data compatible with ARB. Nucleic Acids Res 35: 7188-7196.
Retief, J.D. (2000) Phylogenetic analysis using PHYLIP. Methods Mol Biol 132: 243-258.
Riley, J.P., and Chester, R. (1971) Introduction to Marine Chemistry: Academic Press.
Roche http://www.roche-applied-science.com/index.jsp.
Roh, Y., Liu, S.V., Li, G., Huang, H., Phelps, T.J., and Zhou, J. (2002) Isolation and characterization of metal-reducing Thermoanaerobacter strains from deep subsurface environments of the Piceance Basin, Colorado. Appl Environ Microbiol 68: 6013-6020.
Ronaghi, M. (2001) Pyrosequencing sheds light on DNA sequencing. Genome Res 11: 3-11.
Rowan, E.L., Goldhaber, M.B., and Hatch, J.R. (2002) Regional fluid flow as a factor in the thermal history of the Illinois basin: Constraints from fluid inclusions and the maturity of Pennsylvanian coals. AAPG Bulletin 86: 257-277.
Schattner, P., Brooks, A.N., and Lowe, T.M. (2005) The tRNAscan-SE, snoscan and snoGPS web servers for the detection of tRNAs and snoRNAs. Nucleic Acids Res 33: W686-689.
Schlumberger (2007) Quicksilver Probe, The fluid purity you need, in a fraction of the time. URL http://www.slb.com/media/services/evaluation/reservoir/quicksilver_probe_brochure.pdf
Siegel, D.I. (1989) Geochemistry of the Cambrian-Ordovician aquifer system in the northern Midwest, United States. US Geological Survey Professional Paper 1405-D: 76.
Sloss, L.L. (1963) Sequences in the cratonic interior of North America. Geol Soc Am Bull 74: 93-114.
Stueber, A.M., and Walter, L.M. (1991) Origin and chemical evolution of formation waters from Silurian-Devonian strata in the Illinois basin, USA. Geochim Cosmochim Acta 55: 309-325.
Stueber, A.M., Pushkar, P., and Hetherington, E.A.
(1987) A strontium isotopic study of formation waters from the Illinois basin, U.S.A. Appl Geochem 2: 477-494.
Tatusov, R.L., Fedorova, N.D., Jackson, J.D., Jacobs, A.R., Kiryutin, B., Koonin, E.V. et al. (2003) The COG database: an updated version includes eukaryotes. BMC Bioinformatics 4: 41.
Tsai, Y.L., and Olson, B.H. (1991) Rapid method for direct extraction of DNA from soil and sediments. Appl Environ Microbiol 57: 1070-1074.
Veizer, J., Ala, D., Azmy, K., Bruckschen, P., Buhl, D., Bruhn, F. et al. (1999) Sr-87/Sr-86, delta C-13 and delta O-18 evolution of Phanerozoic seawater. Chem Geol 161: 59-88.
Visocky, A.P., Sherrill, M.G., and Cartwright, K. (1985) Geology, hydrology, and water quality of the Cambrian and Ordovician systems in northern Illinois. In: State of Illinois, Department of Energy and Natural Resources.
Williams, H., Turner, F., and Gilbert, C. (1982) Petrography: An Introduction to the Study of Rocks in Thin Section. NY: W. H. Freeman & Company.
Worden, R.H., and Morad, S. (eds) (2000) Quartz cementation in oil field sandstones: a review of the key controversies: Blackwell Science.
Zhu, W., Lomsadze, A., and Borodovsky, M. (2010) Ab initio gene identification in metagenomic sequences. Nucleic Acids Res 38: e132.