Additional files

Tables

Additional file 1: Table S1. Amplification primers for subsequent SOLiD sequencing. The letters in bold are the 10-mer indexing barcodes. Primer sequences were derived from [Smith et al., 2010].

Universal forward primer:

5'-CCTCTCTATGGGCAGTCGGTGATTACTGAGGTCGGTACACTCT-3'

Individual reverse primers:

| Sample | Reverse primer |
| --- | --- |
| Simulated clinical sample A | 5'-CTGCCCCGGGTTCCTCATTCTCT**AGGCTGTCTA**CTGCTGTACGGCCAAGGCGAGTAGCCGTGACTATCGACT-3' |
| Simulated clinical sample B | 5'-CTGCCCCGGGTTCCTCATTCTCT**GTGACCTACT**CTGCTGTACGGCCAAGGCGAGTAGCCGTGACTATCGACT-3' |
| Simulated clinical sample C | 5'-CTGCCCCGGGTTCCTCATTCTCT**GCGTATTGGG**CTGCTGTACGGCCAAGGCGAGTAGCCGTGACTATCGACT-3' |
| Simulated clinical sample D | 5'-CTGCCCCGGGTTCCTCATTCTCT**AAGGGATTAC**CTGCTGTACGGCCAAGGCGAGTAGCCGTGACTATCGACT-3' |
| Simulated clinical sample E | 5'-CTGCCCCGGGTTCCTCATTCTCT**GTTACGATGC**CTGCTGTACGGCCAAGGCGAGTAGCCGTGACTATCGACT-3' |
| Clinical sample A08-2 | 5'-CTGCCCCGGGTTCCTCATTCTCT**ATGGGTGTTT**CTGCTGTACGGCCAAGGCGAGTAGCCGTGACTATCGACT-3' |
| Clinical sample A10-4 | 5'-CTGCCCCGGGTTCCTCATTCTCT**GAGTCCGGCA**CTGCTGTACGGCCAAGGCGAGTAGCCGTGACTATCGACT-3' |
| Clinical sample A19-4 | 5'-CTGCCCCGGGTTCCTCATTCTCT**AATCGAAGAG**CTGCTGTACGGCCAAGGCGAGTAGCCGTGACTATCGACT-3' |
| Clinical sample A01-1 | 5'-CTGCCCCGGGTTCCTCATTCTCT**GGTCGTCGAA**CTGCTGTACGGCCAAGGCGAGTAGCCGTGACTATCGACT-3' |
| Clinical sample A22-3 | 5'-CTGCCCCGGGTTCCTCATTCTCT**GAGGGATGGC**CTGCTGTACGGCCAAGGCGAGTAGCCGTGACTATCGACT-3' |
| Clinical sample A10-2 | 5'-CTGCCCCGGGTTCCTCATTCTCT**GAAGGCTTGC**CTGCTGTACGGCCAAGGCGAGTAGCCGTGACTATCGACT-3' |
| Clinical sample A17-3 | 5'-CTGCCCCGGGTTCCTCATTCTCT**GTAATTGTAA**CTGCTGTACGGCCAAGGCGAGTAGCCGTGACTATCGACT-3' |
| Clinical sample A16-4 | 5'-CTGCCCCGGGTTCCTCATTCTCT**GTCATCAAGT**CTGCTGTACGGCCAAGGCGAGTAGCCGTGACTATCGACT-3' |
| Clinical sample A03-2 | 5'-CTGCCCCGGGTTCCTCATTCTCT**GCATGTCACC**CTGCTGTACGGCCAAGGCGAGTAGCCGTGACTATCGACT-3' |
| Clinical sample A27-2 | 5'-CTGCCCCGGGTTCCTCATTCTCT**CTAGTAAGAA**CTGCTGTACGGCCAAGGCGAGTAGCCGTGACTATCGACT-3' |
| Clinical sample A07-1 | 5'-CTGCCCCGGGTTCCTCATTCTCT**TAAAGTGGCG**CTGCTGTACGGCCAAGGCGAGTAGCCGTGACTATCGACT-3' |
| Clinical sample A07-2 | 5'-CTGCCCCGGGTTCCTCATTCTCT**AAGTAATGTC**CTGCTGTACGGCCAAGGCGAGTAGCCGTGACTATCGACT-3' |
| Clinical sample A20-3 | 5'-CTGCCCCGGGTTCCTCATTCTCT**ATGTCATAAG**CTGCTGTACGGCCAAGGCGAGTAGCCGTGACTATCGACT-3' |
| Clinical sample A25-2 | 5'-CTGCCCCGGGTTCCTCATTCTCT**AAGCAGGAGT**CTGCTGTACGGCCAAGGCGAGTAGCCGTGACTATCGACT-3' |

Additional file 2: Table S2. Clinical samples: comparison of BigDye-terminator reads, Tag4 fluorescent signals, and SOLiD reads. The BigDye-terminator data are from [Hyman et al., 2012]. For the purposes of this table, those bacteria whose presence was supported by fewer than ten BigDye-terminator reads have been ignored. Novel bacteria and bacteria without a public genome sequence have also been ignored because they cannot be detected by the molecular probes. “1”, a majority of molecular probes for this genome was positive. “0”, a majority of molecular probes for this genome was not positive.

A01-1
Bacterium: B. longum, C. jejuni, L. crispatus, L. jensenii, P. aeruginosa, T. pallidum
BigDye-terminator reads (%): 96%, < 1%
Probes/Tag4, Probes/SOLiD: 0 0 1 1 0 0 1 1 1 1 1 1

A03-2
Bacterium: L. gasseri, L. jensenii
BigDye-terminator reads (%): 44%, 44%
Probes/Tag4, Probes/SOLiD: 1 1 1 0

A03-3
Bacterium: L. gasseri
BigDye-terminator reads (%): 88%
Probes/Tag4: 0

A07-1
Bacterium: L. gasseri, L. jensenii
BigDye-terminator reads (%): 16%, 72%
Probes/Tag4, Probes/SOLiD: 1 1 1 0

A07-2
Bacterium: E. coli, L. gasseri, L. jensenii
BigDye-terminator reads (%): 21%, 75%
Probes/Tag4, Probes/SOLiD: 0 0 1 1 1 1

A08-2
Bacterium: L. crispatus, L. jensenii
BigDye-terminator reads (%): 95%, < 1%
Probes/Tag4, Probes/SOLiD: 1 1 1 1

A10-2
Bacterium: B. fragilis, C. glutamicum, L. crispatus, L. gasseri, S. pyogenes, T. pallidum
BigDye-terminator reads (%): < 1%, 95%, 1%
Probes/Tag4, Probes/SOLiD: 1 0 1 0 1 0 0 0 1 1 0 1

A10-4
Bacterium: L. crispatus, L. gasseri
BigDye-terminator reads (%): 89%, < 1%
Probes/Tag4, Probes/SOLiD: 1 0 1 0

A12-2
Bacterium: L. jensenii
BigDye-terminator reads (%): 74%
Probes/Tag4: 1

A13-4
Bacterium: L. jensenii
BigDye-terminator reads (%): 88%
Probes/Tag4: 1

A16-2
Bacterium: L. gasseri
BigDye-terminator reads (%): 100%
Probes/Tag4: 1

A16-3
Bacterium: L. gasseri
BigDye-terminator reads (%): 100%
Probes/Tag4: 1

A16-4
Bacterium: B. longum, Janthinobacterium, L. crispatus, N. gonorrhoeae, S. pyogenes, U. urealyticum
BigDye-terminator reads (%): 100%
Probes/Tag4, Probes/SOLiD: 1 1 1 1 1 1 0 0 1 0 0 0

A17-3
Bacterium: L. gasseri
BigDye-terminator reads (%): 70%
Probes/Tag4, Probes/SOLiD: 1 1

A19-4
Bacterium: L. crispatus, L. gasseri, M. genitalium
BigDye-terminator reads (%): 97%, 2%
Probes/Tag4, Probes/SOLiD: 1 0 1 1 0 0

A20-3
Bacterium: L. crispatus
BigDye-terminator reads (%): 81%
Probes/Tag4, Probes/SOLiD: 1 1

A22-3
Bacterium: E. faecalis, L. crispatus, L. jensenii, T. pallidum
BigDye-terminator reads (%): 86%, 13%
Probes/Tag4, Probes/SOLiD: 1 1 1 0 0 1 1 1

A23-1
Bacterium: L. crispatus, L. jensenii, L. gasseri
BigDye-terminator reads (%): 96%, 2%, 2%
Probes/Tag4: 1 1 0

A24-1
Bacterium: L. jensenii
BigDye-terminator reads (%): 93%
Probes/Tag4: 0

A25-2
Bacterium: A. baumannii, L. crispatus, L. jensenii, M. genitalium
BigDye-terminator reads (%): 86%, < 1%
Probes/Tag4, Probes/SOLiD: 1 1 1 1 0 1 1 0

A27-2
Bacterium: L. crispatus, L. jensenii
BigDye-terminator reads (%): 95%, 4%
Probes/Tag4, Probes/SOLiD: 1 1 1 1

Additional file 3: Table S3. Bacteria and RefSeq genome sequence numbers.

| nc_id | genus | species | strain |
| --- | --- | --- | --- |
| NC_008752 | Acidovorax | avenae | subsp__citrulli_AAC00_1 |
| NC_008782 | Acidovorax | sp. | JS42 |
| NC_009085 | Acinetobacter | baumannii | ATCC_17978 |
| NC_010410 | Acinetobacter | baumannii | AYE |
| NC_010400 | Acinetobacter | baumannii | Complete genome |
| NC_005966 | Acinetobacter | sp. | ADP1 |
| NC_008570 | Aeromonas | hydrophila | subsp__hydrophila_ATCC_7966 |
| NC_003228 | Bacteroides | fragilis | NCTC_9343 |
| NC_006347 | Bacteroides | fragilis | YCH46 |
| NC_004663 | Bacteroides | thetaiotaomicron | VPI_5482 |
| NC_009614 | Bacteroides | vulgatus | ATCC_8482 |
| NC_008618 | Bifidobacterium | adolescentis | ATCC_15703 |
| NC_004307 | Bifidobacterium | longum | NCC2705 |
| NC_010816 | Bifidobacterium | longum | DJO10A |
| NC_010551 | Burkholderia | ambifaria | MC40_6_chromosome_1 |
| NC_008060 | Burkholderia | cenocepacia | AU_1054_chromosome_1 |
| NC_010508 | Burkholderia | cenocepacia | MC0_3_chromosome_1 |
| NC_008390 | Burkholderia | cepacia | AMMD_chromosome_1 |
| NC_008836 | Burkholderia | mallei | NCTC_10229_chromosome_I |
| NC_010084 | Burkholderia | multivorans | ATCC_17616_chromosome_1 |
| NC_006350 | Burkholderia | pseudomallei | K96243_chromosome_1 |
| NC_009076 | Burkholderia | pseudomallei | 1106a_chromosome_I |
| NC_007651 | Burkholderia | thailandensis | E264_chromosome_I |
| NC_009256 | Burkholderia | vietnamiensis | G4_chromosome_1 |
| NC_007951 | Burkholderia | xenovorans | LB400_chromosome_1 |
| NC_009714 | Campylobacter | hominis | ATCC_BAA_381 |
| NC_002163 | Campylobacter | jejuni | subsp__jejuni_NCTC_11168 |
| NC_003912 | Campylobacter | jejuni | RM1221 |
| NC_009707 | Campylobacter | jejuni | subsp__doylei_269_97 |
| NC_009839 | Campylobacter | jejuni | subsp__jejuni_81116 |
| NC_000117 | Chlamydia | trachomatis | D_UW_3_CX |
| NC_007429 | Chlamydia | trachomatis | A_HAR_13 |
| NC_010280 | Chlamydia | trachomatis | L2b_UCH_1_proctitis |
| NC_010287 | Chlamydia | trachomatis | 434_Bu |
| NC_008261 | Clostridium | perfringens | ATCC_13124 |
| NC_008262 | Clostridium | perfringens | SM101 |
| NC_003450 | Corynebacterium | glutamicum | ATCC_13032 |
| NC_006958 | Corynebacterium | glutamicum | ATCC_13032 |
| NC_009342 | Corynebacterium | glutamicum | R |
| NC_010002 | Delftia | acidovorans | SPH_1 |
| NC_009778 | Enterobacter | sakazakii | ATCC_BAA_894 |
| NC_009436 | Enterobacter | sp. | 638 |
| NC_004668 | Enterococcus | faecalis | V583 |
| AC_000091 | Escherichia | coli | W3110_DNA |
| NC_000913 | Escherichia | coli | str__K_12_substr__MG1655 |
| NC_004431 | Escherichia | coli | CFT073 |
| NC_007946 | Escherichia | coli | UTI89 |
| NC_008253 | Escherichia | coli | 536 |
| NC_008563 | Escherichia | coli | APEC_O1 |
| NC_009800 | Escherichia | coli | HS |
| NC_010473 | Escherichia | coli | str__K_12_substr__DH10B |
| NC_010376 | Finegoldia | magna | ATCC_29328 |
| NC_009441 | Flavobacterium | johnsoniae | UW101 |
| NC_009613 | Flavobacterium | psychrophilum | JIP02_86 |
| NC_003454 | Fusobacterium | nucleatum | subsp__nucleatum_ATCC_25586 |
| NC_009659 | Janthinobacterium | sp. | Marseille |
| NC_006814 | Lactobacillus | acidophilus | NCFM |
| NC_008497 | Lactobacillus | brevis | ATCC_367 |
| NC_008054 | Lactobacillus | delbrueckii | subsp__bulgaricus_ATCC_11842 |
| NC_008529 | Lactobacillus | delbrueckii | subsp__bulgaricus_ATCC_BAA_365 |
| NC_008530 | Lactobacillus | gasseri | ATCC_33323 |
| NC_004567 | Lactobacillus | plantarum | WCFS1 |
| NC_000908 | Mycoplasma | genitalium | G37 |
| NC_002946 | Neisseria | gonorrhoeae | FA_1090 |
| NC_003112 | Neisseria | meningitidis | MC58 |
| NC_003116 | Neisseria | meningitidis | Z2491 |
| NC_008767 | Neisseria | meningitidis | FAM18 |
| NC_010120 | Neisseria | meningitidis | 053442 |
| NC_006085 | Propionibacterium | acnes | KPA171202 |
| NC_002516 | Pseudomonas | aeruginosa | PAO1 |
| NC_008463 | Pseudomonas | aeruginosa | UCBPP_PA14 |
| NC_009656 | Pseudomonas | aeruginosa | PA7 |
| NC_008027 | Pseudomonas | entomophila | L48 |
| NC_004129 | Pseudomonas | fluorescens | Pf_5 |
| NC_007492 | Pseudomonas | fluorescens | PfO_1 |
| NC_009439 | Pseudomonas | mendocina | ymp |
| NC_002947 | Pseudomonas | putida | KT2440 |
| NC_002947 | Pseudomonas | putida | KT2440 |
| NC_009512 | Pseudomonas | putida | F1 |
| NC_010322 | Pseudomonas | putida | GB_1 |
| NC_010501 | Pseudomonas | putida | W619 |
| NC_009434 | Pseudomonas | stutzeri | A1501 |
| NC_002758 | Staphylococcus | aureus | subsp__aureus_Mu50 |
| NC_002951 | Staphylococcus | aureus | subsp__aureus_COL |
| NC_003923 | Staphylococcus | aureus | subsp__aureus_MW2 |
| NC_009487 | Staphylococcus | aureus | subsp__aureus_JH9 |
| NC_009632 | Staphylococcus | aureus | subsp__aureus_JH1 |
| NC_009641 | Staphylococcus | aureus | subsp__aureus_str__Newman |
| NC_009782 | Staphylococcus | aureus | subsp__aureus_Mu3 |
| NC_002976 | Staphylococcus | epidermidis | RP62A |
| NC_004461 | Staphylococcus | epidermidis | ATCC_12228 |
| NC_010943 | Stenotrophomonas | maltophilia | K279a |
| NC_004116 | Streptococcus | agalactiae | 2603V_R |
| NC_004368 | Streptococcus | agalactiae | NEM316 |
| NC_007432 | Streptococcus | agalactiae | A909 |
| NC_004350 | Streptococcus | mutans | UA159 |
| NC_004606 | Streptococcus | pyogenes | SSI_1 |
| NC_006086 | Streptococcus | pyogenes | MGAS10394 |
| NC_008023 | Streptococcus | pyogenes | MGAS2096 |
| NC_009332 | Streptococcus | pyogenes | str__Manfredo |
| NC_000919 | Treponema | pallidum | subsp__pallidum_str__Nichols |
| NC_011374 | Ureaplasma | urealyticum | serovar_10_str_ATCC_33699 |

Figures

Additional file 4: Figure S1. Quantitative data for the SOLiD assay for simulated clinical sample A (SCA). The red crosses indicate the known concentrations of each genomic DNA (right ordinate). The horizontal lines indicate the number of sequence reads for each individual molecular probe (left ordinate). Individual bacteria are listed alphabetically across the abscissa. The number of reads for L. acidophilus is just above background.

Additional file 5: Figure S2. Quantitative data for the SOLiD assay for simulated clinical sample C (SCC). The red crosses indicate the known concentrations of each genomic DNA (right ordinate). The horizontal lines indicate the number of sequence reads for each individual molecular probe (left ordinate). Individual bacteria are listed alphabetically across the abscissa. Some of the molecular probes for L. crispatus and L. jensenii DNAs registered positive by cross-reaction with L. gasseri DNA.

Additional file 6: Figure S3. Quantitative data for the SOLiD assay for simulated clinical sample D (SCD). The red crosses indicate the known concentrations of each genomic DNA (right ordinate). The horizontal lines indicate the number of sequence reads for each individual molecular probe (left ordinate). Individual bacteria are listed alphabetically across the abscissa. Some of the molecular probes for L. crispatus and L. jensenii DNAs registered positive by cross-reaction with L. gasseri DNA.

Additional file 7: Figure S4. Quantitative data for the SOLiD assay for simulated clinical sample E (SCE). The red crosses indicate the known concentrations of each genomic DNA (right ordinate). The horizontal lines indicate the number of sequence reads for each individual molecular probe (left ordinate). Individual bacteria are listed alphabetically across the abscissa.
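
The reverse primers in Table S1 differ only in their 10-mer barcode, so a barcode can be recovered from a primer by stripping the two flanking sequences that all reverse primers share. The following is a minimal Python sketch of that check, not part of the original work. The helper name `extract_barcode` and the constants are hypothetical, and the flanks are simply read off the primers listed in Table S1.

```python
# Flanks shared by every reverse primer in Table S1 (read off the listed
# sequences; an assumption of this sketch, not stated in the source).
FLANK_5 = "CTGCCCCGGGTTCCTCATTCTCT"
FLANK_3 = "CTGCTGTACGGCCAAGGCGAGTAGCCGTGACTATCGACT"
BARCODE_LEN = 10  # the table legend describes 10-mer indexing barcodes


def extract_barcode(primer: str) -> str:
    """Return the barcode between the shared flanks (hypothetical helper)."""
    # Remove whitespace introduced by line wrapping and normalise case.
    seq = "".join(primer.split()).upper()
    if not (seq.startswith(FLANK_5) and seq.endswith(FLANK_3)):
        raise ValueError("primer does not carry the expected flanks")
    barcode = seq[len(FLANK_5):len(seq) - len(FLANK_3)]
    if len(barcode) != BARCODE_LEN:
        raise ValueError(f"expected a {BARCODE_LEN}-mer, got {len(barcode)} nt")
    return barcode


# Two primers copied from Table S1, including the wrapped-line spaces.
primers = {
    "Simulated clinical sample A":
        "CTGCCCCGGGTTCCTCATTCTCTAGGCTGTCTACTGCTGTACGGCCAAGGC GAGTAGCCGTGACTATCGACT",
    "Clinical sample A10-4":
        "CTGCCCCGGGTTCCTCATTCTCTGAGTCCGGCACTGCTGTACGGCCAAGG CGAGTAGCCGTGACTATCGACT",
}
barcodes = {name: extract_barcode(seq) for name, seq in primers.items()}
for name, barcode in barcodes.items():
    print(f"{name}: {barcode}")
```

Checking the flanks first means a mistyped or truncated primer raises an error instead of silently yielding a wrong barcode.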