Supplementary Information

Detailed Materials and Methods

Bacterial Cultures

Escherichia coli K12 MG1655 was obtained from the Coli Genetic Stock Center (Yale University, New Haven, CT). Meiothermus ruber DSM 1279 and Pedobacter heparinus DSM 2366 were obtained from the DSMZ (Braunschweig, Germany). Liquid cultures of these organisms were grown to mid-log phase in their respective media: E. coli, LB broth at 37°C; M. ruber, DSMZ medium 256 (peptone 5 g/L, yeast extract 1 g/L, soluble starch 1 g/L, pH 8.0) at 55°C; P. heparinus, nutrient broth at 28°C. From each culture 5 aliquots of 1 x 10^6 cells were pelleted by centrifugation.

Treatments

For each treatment one aliquot (1 x 10^6 cells) was resuspended in a final volume of 1 mL. Cryogenic storage involved resuspending the cells in 1x phosphate buffered saline (PBS; pH = 7.4) and adding glycerol at a final concentration of 20% (v/v) followed by storage at -80°C. For ethanol fixation, the cell pellet was resuspended in 1x PBS (pH = 7.4) and 100% ethanol was added to produce a final concentration of 70% (v/v) followed by incubation at room temperature for 2 hours and storage at 4°C. Formaldehyde fixation required resuspension of the cells in 1x PBS (pH = 7.4) and the addition of a 20% paraformaldehyde solution to produce a final concentration of 4% (v/v) followed by incubation at room temperature for 2 hours and storage at 4°C. In order to reverse crosslinks induced by formaldehyde fixation we adapted a method used in chromatin immunoprecipitation studies that breaks the crosslinks using high temperature and a high salt buffer (Nelson et al., 2006). The fixed cells were washed twice in 2x PBS (pH = 7.4), incubated at 65°C for 12 hours and then placed on ice for flow sorting.

The FISH treatment was carried out by resuspending the cells in hybridization buffer (0.9 M NaCl, 20 mM Tris, 30% v/v formamide, 0.01% v/v SDS) and incubation at 46°C for 1 hour.
This was followed by two wash steps of pelleting the cells and resuspending in wash buffer (102 mM NaCl, 20 mM Tris, 0.01% v/v SDS). The cells were then incubated at 48°C for 15 min and placed on ice until flow sorting.

Cell sorting

All cells were washed twice with sterile PBS to minimize carryover of fixation chemicals into the amplification reaction. Cells were collected following the clean sorting procedures in Rodrigue et al. (2009). Briefly, the cells were sorted on a Cytopeia Influx Cell Sorter (BD Biosciences, San Jose, CA) into 384-well plates containing 1 µL of UV-sterilized TE per well. The cells were stained with SYBR Green I (Invitrogen, Grand Island, NY) and illuminated by a 488 nm laser (Coherent Inc., Santa Clara, CA). Each organism was sorted onto a separate plate. The plate layout consisted of two rows for each treatment. Within each row, the first two wells had nothing sorted into them to serve as negative controls, wells 3-4 had 100 cells sorted into each to serve as positive controls, and wells 5-24 each had a single cell sorted into it. Thus each treatment consisted of 4 wells of negative controls, 4 wells of positive controls and 40 wells of single cells (Figure 1).

Cell lysis and Genome Amplification

Cells were lysed and their genomes amplified following the procedures of Woyke et al. (2011). Briefly, cells were lysed by exposure to an alkaline solution for 5 min at room temperature before neutralization. Real-time multiple displacement amplification (MDA) was done using the Repliphi Phi29 reagents (Epicentre, Madison, WI) in total reaction volumes of 15 µL. The reactions were incubated at 30°C for 16 hours and the reaction kinetics were monitored on a Roche LightCycler 480 (Roche Applied Science, Indianapolis, IN), analogous to a real-time PCR, with readings taken every 15 minutes. To reduce contamination, all reagents were exposed to UV light (Woyke et al., 2011).
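
The plate layout described under Cell sorting can be written out as a small sketch to check the per-treatment well counts. This is an illustration only, not the sorting software's configuration: the treatment labels, their row order, and the use of row letters A-P are assumptions introduced here; only the two-rows-per-treatment rule and the column roles come from the text.

```python
from collections import Counter

# Hypothetical short labels for the five treatments described above; the
# source does not name them this way or state which rows they occupied.
TREATMENTS = ["cryo", "ethanol", "formaldehyde", "crosslink_reversal", "fish"]

ROWS_PER_TREATMENT = 2           # from the text: two rows per treatment
WELLS_PER_ROW = 24               # a 384-well plate has 16 rows of 24 wells
ROW_LETTERS = "ABCDEFGHIJKLMNOP"  # assumed conventional row naming


def well_role(column):
    """Return the role of a well from its 1-based column number."""
    if column in (1, 2):
        return "negative_control"   # nothing sorted into the well
    if column in (3, 4):
        return "positive_control"   # 100 cells sorted into the well
    return "single_cell"            # wells 5-24: one cell each


def build_layout(treatments):
    """Map (row_letter, column) -> (treatment, role) for one organism's plate."""
    layout = {}
    for t_index, treatment in enumerate(treatments):
        for r in range(ROWS_PER_TREATMENT):
            row_letter = ROW_LETTERS[t_index * ROWS_PER_TREATMENT + r]
            for column in range(1, WELLS_PER_ROW + 1):
                layout[(row_letter, column)] = (treatment, well_role(column))
    return layout


layout = build_layout(TREATMENTS)
counts = Counter(layout.values())  # wells per (treatment, role) pair
```

Under these assumptions, `counts` holds 4 negative-control wells, 4 positive-control wells and 40 single-cell wells for each treatment, matching the totals stated in the text, and the five treatments occupy 10 of the plate's 16 rows.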

16S rRNA Screening

The MDA products were diluted 20-fold in TE buffer and 1 µL aliquots of this diluted MDA product served as the template DNA in 10 µL total volume real-time PCR reactions. The PCR reactions used the KAPA SYBR FAST Roche LightCycler 480 2x qPCR Master Mix (KAPA Biosystems, Wilmington, MA) and were run on the Roche LightCycler 480. The PCR targeted a ~450 bp fragment of the 16S rRNA gene using primers: 926F (GAAACTYAAAKGAATTGRCGG) and 1392R (ACGGGCGGTGTGTRC). The PCR reactions were run with the following program: 95°C for 3:00, 35 cycles of 95°C for 0:30, 60°C for 0:30, 72°C for 0:45. PCR products were Sanger sequenced using the ABI 3730xl DNA Analyzer.

Sequencing

Samples were indexed and combined into two pools of 36 samples each. Shotgun sequencing was performed on Illumina HiSeq instruments (San Diego, CA) following standard protocols with reads of 2 x 150 bp. Each pool was loaded on 2 lanes of a flow cell which produced an average total sequence of 2.3 Gbp per single cell. The sequence data is available for download at http://genome.jgi.doe.gov/SincelFixationRD/SincelFixationRD.info.html.

Data analysis

The FASTQ output files of all sequencing runs were normalized to an average genome coverage of 315 reads per base by random subsampling. After subsampling, 10.85 million reads for each P. heparinus, 6.51 million reads for each M. ruber, and 9.74 million reads for each E. coli cell were mapped back to the respective reference genomes using the BWA alignment software version 0.5.9-r16 (Li & Durbin, 2009). Assessment of mapped read coverage was performed using SAMtools (Li et al., 2009). For de novo assembly, a k-mer filter was used to normalize the sequence data after Illumina artifact- and quality-trimming was applied. Reads representing highly abundant k-mers were removed such that no k-mers with a coverage of more than 30x were present after filtering.
Reads with an average k-mer depth of less than 2x were removed. The scripts for the normalization tool are available at http://genome.jgi.doe.gov/SincelFixationRD/SincelFixationRD.info.html. Following these steps, the normalized Illumina reads were assembled using Velvet version 1.1.04 (Zerbino & Birney, 2008) and 1-3 Kbp simulated paired-end reads were created from the Velvet contigs. The final assemblies were performed using the normalized Illumina reads with the simulated paired-end reads using Allpaths-LG version r41043 (Gnerre & MacCallum, 2011).

References

Gnerre S, MacCallum I. (2011). High-quality draft assemblies of mammalian genomes from massively parallel sequence data. PNAS 108:1513–1518.

Li H, Durbin R. (2009). Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25:1754–1760.

Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. (2009). The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078–2079.

Nelson JD, Denisenko O, Sova P, Bomsztyk K. (2006). Fast chromatin immunoprecipitation assay. Nucleic Acids Res 34:e2.

Rodrigue S, Malmstrom RR, Berlin AM, Birren BW, Henn MR, Chisholm SW. (2009). Whole genome amplification and de novo assembly of single bacterial cells. PLoS ONE 4:e6864.

Woyke T, Sczyrba A, Lee J, Rinke C, Tighe D, Clingenpeel S, et al. (2011). Decontamination of MDA reagents for single cell whole genome amplification. PLoS ONE 6:e26161.

Zerbino D, Birney E. (2008). Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res 18:821–829.

Figure S1. Estimated genome completeness vs. Cp value. Genome completeness was determined by mapping the reads to the reference genomes. A base pair was considered to be recovered if at least 10 reads were mapped to cover it. Cp value is the time at which the inflection point of the real-time amplification curve occurs.
A linear regression line for all the data is also shown.

Figure S2. Normalization of sequencing data. The sequencing data was subsampled to provide a consistent average read depth of 315x for all single cells.

Figure S3. Percent of the genomes recovered for each organism/treatment combination. Light grey bars are the genome recovery by mapping the reads to the reference genomes. A base pair was considered to be recovered if at least 10 reads were mapped to cover it. Dark grey bars are the genome recovery by mapping contigs produced by de novo assembly to the reference genomes. Within each strain the treatments are significantly different from each other (P. heparinus p < 0.0001, E. coli p < 0.0001, M. ruber p = 0.01).
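
The recovery criterion used in Figures S1 and S3 (a base pair counts as recovered when at least 10 mapped reads cover it) can be illustrated with a minimal sketch. This is not the analysis script used for the figures: the function name and the toy depth values are hypothetical, and the input is assumed to be one depth value per reference position, zeros included (current SAMtools releases can report this with `samtools depth -a`, though the source does not say how the depths were extracted).

```python
MIN_DEPTH = 10  # from the figure captions: at least 10 mapped reads per base


def fraction_recovered(depths, min_depth=MIN_DEPTH):
    """Fraction of reference positions covered by at least min_depth reads.

    depths must hold one entry per reference base, including zero-depth
    positions; otherwise the denominator undercounts the genome length.
    """
    if not depths:
        raise ValueError("depths must not be empty")
    recovered = sum(1 for d in depths if d >= min_depth)
    return recovered / len(depths)


# Toy per-base depths, made up for illustration (not data from the study).
toy_depths = [0, 3, 10, 12, 250, 9, 10, 0, 40, 11]
toy_fraction = fraction_recovered(toy_depths)
```

A position with exactly 10 reads counts as recovered and one with 9 does not, so six of the ten toy positions pass the threshold. Multiplying the result by 100 gives a percentage in the units plotted in Figure S3.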