Supplementary Figures

[Figure 1 schematic. Labels: dsDNA fragments ("P" ends), A adaptor, B adaptor ("Bio"). Steps: Ligation; Fill in; Capture on SA-Beads & Wash; Alkaline Elution.]
Manuscript 2005-05-05204
Supplementary Figure 1. Non-phosphorylated A and B adaptors are ligated to the
ends of phosphorylated, polished, double-stranded genomic DNA fragments. The A
and B adaptors differ in both nucleotide sequence and the presence of a 5’ biotin
tag on the B adaptor. Nicks present at the 3’ junctions between each adaptor
and the library fragment are filled in by the strand-displacement activity of Bst DNA
polymerase. Streptavidin-biotin interactions are used to remove fragments flanked
by homozygous adaptor sets (A/A and B/B) and to generate single-stranded library
templates. Fragments are bound to streptavidin beads; unbound material
(composed of homozygous A/A adaptor sets, which lack biotin) is washed away.
The immobilized fragments are then denatured; both strands of the B/B fragments
remain immobilized through the biotinylated B adaptor, while A/B fragments are
washed free and used in subsequent sequencing steps. Replicate library
preparations were observed to yield genome coverage and oversampling with
CVs of 5% or less.
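The selection logic in the Figure 1 caption can be sketched as a small Python function. This is only an illustration of the bookkeeping, not the authors' protocol: the function name and the returned labels are hypothetical, and the only fact taken from the caption is that the B adaptor alone carries the 5’ biotin tag.

```python
def fate_after_selection(left: str, right: str) -> str:
    """Classify a fragment by its two flanking adaptors ("A" or "B").

    Assumption (from the caption): only the B adaptor is biotinylated,
    so only strands carrying a B adaptor can stay on streptavidin beads.
    """
    if left not in ("A", "B") or right not in ("A", "B"):
        raise ValueError("adaptors must be 'A' or 'B'")

    biotinylated_ends = (left == "B") + (right == "B")
    if biotinylated_ends == 0:
        # A/A: no biotin, so the fragment never binds and is washed away.
        return "unbound, washed away"
    if biotinylated_ends == 2:
        # B/B: both strands stay immobilized after denaturation.
        return "both strands retained on beads"
    # A/B: one strand is held by biotin, the other is freed on denaturation.
    return "eluted as single-stranded template"


for pair in (("A", "A"), ("A", "B"), ("B", "A"), ("B", "B")):
    print(pair, "->", fate_after_selection(*pair))
```

Only the mixed A/B (or B/A) fragments end up in the eluate, which is the property the caption relies on.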
[Figure 2 plot. Y-axis: Fluorescence; x-axis: Time (seconds).]
Supplementary Figure 2. Size distribution of nebulized DNA sample. Sharp
flanking peaks are upper and lower reference markers.
Supplementary Figure 3. Kinetic modeling of a single well. Assumption: 10
million DNA copies per bead, [DNA] = 0.3 μM.
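As a quick consistency check on the two numbers in the Figure 3 caption, the sketch below computes the liquid volume they jointly imply through n = C * V * N_A. The caption does not state a volume, so the result is an inference from the stated assumption, and the function name is hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)


def implied_volume_litres(copies: float, molar_conc: float) -> float:
    """Volume in litres that holds `copies` molecules at `molar_conc` (mol/L)."""
    return copies / (AVOGADRO * molar_conc)


# Caption values: 10 million copies per bead, [DNA] = 0.3 uM.
volume_l = implied_volume_litres(copies=1e7, molar_conc=0.3e-6)
print(f"implied volume: {volume_l * 1e12:.1f} pL")  # roughly 55 pL
```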
Supplementary Figure 4. Chemical cross-talk modeling. At t = 0, [DNA] in well 1 =
0.3 μM and [DNA] in well 2 = 0.
[Figure 5 bar chart, "Error Distribution". Y-axis: Error (0.0% to 1.0%); x-axis: homopolymer length (1-mer to 6-mer); series: Insertion, Deletion.]
Supplementary Figure 5. Detailed error rates in sequencing a mixture of 6 test
fragments, as a function of homopolymer length. Single base error rates are
referenced to the total number of single bases sequenced. For each
homopolymer, the error rate is referenced to the total number of bases
sequenced that belong to homopolymers of that length.
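The referencing convention in the Figure 5 caption can be made concrete with a short Python sketch. The denominator follows the caption: the number of bases sequenced that belong to homopolymers of a given length, with single bases treated as length 1. The numerator is an assumption made here for illustration (one error event per miscalled run, an insertion when the run is overcalled and a deletion when it is undercalled), and the function name and input format are hypothetical.

```python
from collections import Counter


def indel_rates(runs):
    """Per-length (insertion_rate, deletion_rate).

    `runs` is an iterable of (true_length, called_length) pairs, one per
    homopolymer run; a single base is a run of length 1.
    Assumption: each miscalled run counts as one error event.
    """
    bases = Counter()       # bases sequenced in runs of each true length
    insertions = Counter()  # overcalled runs
    deletions = Counter()   # undercalled runs
    for true_len, called_len in runs:
        bases[true_len] += true_len
        if called_len > true_len:
            insertions[true_len] += 1
        elif called_len < true_len:
            deletions[true_len] += 1
    return {
        n: (insertions[n] / bases[n], deletions[n] / bases[n])
        for n in sorted(bases)
    }


# Toy data: 100 single bases (one overcall, one undercall)
# and ten 3-mers (one undercall).
toy = [(1, 1)] * 98 + [(1, 2), (1, 0)] + [(3, 3)] * 9 + [(3, 2)]
print(indel_rates(toy))
```

In the toy data the ten 3-mers contribute 30 bases to the denominator, so a single undercalled 3-mer gives a deletion rate of 1/30 rather than 1/10.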
[Figure 6 histogram. Y-axis: Number of Flows; x-axis: Normalized Signal; series: 0-mer, 1-mer, 2-mer. The extracted ticks suggest an inset magnifying normalized signals between 0.5 and 0.8.]
Supplementary Figure 6. Typical histogram of signal intensities for negative
and positive flows.
[Figure 7 plot. Y-axis: Mean Signal (µ); x-axis: Homopolymer (n). Linear fit: µ = 0.0186 + 0.98956*n (R^2 = 0.99999).]
Supplementary Figure 7. Average of the flow signals ascribed to various
homopolymers for the mapped reads of the M. genitalium run discussed in the
paper.
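The linear fit reported in Figure 7 (µ = 0.0186 + 0.98956*n) can be inverted to turn a normalized flow signal into an estimated homopolymer length. The sketch below does this with nearest-integer rounding, which is an assumption made for illustration and not the base-calling procedure used in the paper; the function names are hypothetical.

```python
INTERCEPT = 0.0186  # fitted intercept from Figure 7
SLOPE = 0.98956     # fitted slope from Figure 7


def expected_signal(n: int) -> float:
    """Mean normalized signal predicted by the fit for an n-mer."""
    return INTERCEPT + SLOPE * n


def estimate_length(signal: float) -> int:
    """Invert the fit and round to the nearest non-negative integer.

    Assumption: simple rounding, ignoring signal variance.
    """
    return max(0, round((signal - INTERCEPT) / SLOPE))


for n in range(9):
    print(n, round(expected_signal(n), 4), estimate_length(expected_signal(n)))
```

Because the slope is close to 1 and the intercept close to 0, the mean signal tracks the homopolymer length almost one for one across the plotted range.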
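[Figure 8 bar chart, "Error Distribution". Y-axis: Error (0.0% to 6.0%); x-axis: homopolymer length (1-mer to 6-mer); series: Individual read error, Consensus error. Bar labels as extracted (their assignment to bars is not recoverable): 4.85%, 4.06%, 3.08%, 2.23%, 2.20%, 1.65%, 0.01%, 0.04%, 0.02%, 0.05%, 0.21%, 0.10%.]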
Supplementary Figure 8. Detailed error rates in sequencing an M. genitalium
library, as a function of homopolymer length. As for test fragments, single base
error rates are referenced to the total number of single bases sequenced; for
homopolymers, the error rate is referenced to the total number of bases
sequenced that belong to homopolymers of each length. The error rates are
shown for individual reads and after the consensus sequence was formed using
all reads, without Z-score restriction.
[Figure 9 plot. Y-axis: Depth of Coverage; x-axis: Genome position (kb).]
Supplementary Figure 9. Depth of coverage as a function of genome position for
the M. genitalium run. Slightly lower coverage in isolated regions is due to the
presence of repeat regions excluded in the mapping.
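For readers who want to reproduce a plot like Figure 9 from their own alignments, per-position depth can be computed with a difference array. This is a generic sketch, not the mapping pipeline used for the M. genitalium run; the function name and the 0-based, half-open interval convention are assumptions.

```python
def depth_of_coverage(genome_length, alignments):
    """Per-position read depth.

    `alignments` is an iterable of (start, end) pairs, 0-based and
    half-open, each lying within [0, genome_length].
    """
    diff = [0] * (genome_length + 1)
    for start, end in alignments:
        diff[start] += 1  # a read begins covering at `start`
        diff[end] -= 1    # and stops covering at `end`

    depth = []
    running = 0
    for delta in diff[:genome_length]:
        running += delta
        depth.append(running)
    return depth


print(depth_of_coverage(10, [(0, 5), (3, 8), (4, 10)]))
```

Positions to which no reads are assigned, such as repeat regions excluded from the mapping, simply accumulate no increments and therefore show reduced depth.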
[Figure 10 plot. Y-axis: Observed phred Score; x-axis: Predicted phred Score.]
Supplementary Figure 10. Correlation between predicted and observed quality
scores for a sequencing run of C. jejuni (data not shown).
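The comparison in Figure 10 relies on the standard phred definition, Q = -10 * log10(p), where p is the probability that a base call is wrong. The sketch below shows how an observed score could be derived from error counts among bases sharing a predicted score. The binning and counting are assumptions made for illustration, since the caption does not describe how the observed scores were computed, and the function names are hypothetical.

```python
import math


def phred_from_error_rate(p: float) -> float:
    """Standard phred quality score for an error probability p (0 < p <= 1)."""
    return -10.0 * math.log10(p)


def observed_phred(errors: int, bases: int) -> float:
    """Observed score for a group of bases; infinite if no errors were seen."""
    if bases <= 0:
        raise ValueError("bases must be positive")
    if errors == 0:
        return math.inf
    return phred_from_error_rate(errors / bases)


# One error in 1,000 bases corresponds to a score of about 30.
print(observed_phred(errors=1, bases=1000))
```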
[Figure 11 histogram, "Read Lengths - Double Ended Sequencing (2 x 21 cycles)". Y-axis: Number of Reads; x-axis: Read Length; series: Read 1 (63184 reads), Read 2 (56027 reads).]
Supplementary Figure 11. Read lengths of paired-end reads. Note that this was
a 21-cycle run, so the average read length is commensurate with the lower number
of cycles.