How Accurate is Heterozygote Base Calling in Dye-Terminator Sequencing?
GR Taylor1,2, LA Ellis1, MD Robinson1, RS Charlton1, RF Mueller1, MA Knowles2, DT
Bishop2
Regional DNA Laboratory1 and ICRF Mutation Detection Facility2, St James's University
Hospital, Leeds LS9 7TF, UK
Introduction
Recent developments in dye terminator chemistry (1) have led to the suggestion that, with their
reported improvement in accuracy and inherent simplicity of use, dye terminators could render four-colour dye primer sequencing obsolete. This would have a significant impact on mutation
detection strategies both for candidate gene and diagnostic applications. At present quality
control criteria for diagnostic sequencing are lacking, but given the increasing medical use of
DNA sequencing this is likely to become a medical-legal issue of great importance. In this
study we investigated dye terminator sequencing with the new Big-Dye terminators using a
set of cloned mismatch templates as well as amplicons from human genomic DNA (p53,
VHL, CFTR and a series of cloned sequence variants). Plasmids were either sequenced
directly from minipreps or re-amplified prior to sequencing. Sequence runs of over 800
bases were achieved. The best resolution of longer fragments was obtained with Long Ranger gels. After 500 bases the reliability of base calling began to fall. We conclude that
reliable heterozygote detection by dye terminator sequencing in sequence runs of up to 500
bases is possible, but that the sequence should be generated from both strands for the
detection of heterozygous point mutations.
Aims
Explore the limits of diagnostic quality sequencing using cloned control and diagnostic
examples.
Determine the maximum read length for reliable heterozygote calling using slab and
capillary gels.
Verify the sequence of the ATCC mismatch clone series.
Investigate the detection of heterozygotes when present as a minority using reconstruction
experiments.
Identify quality control criteria for diagnostic DNA sequence.
Methods
Sequencing used the ABI Big-Dye terminator kit according to the manufacturer's protocol
except that volumes were halved and "Half-term Big-Dye" diluent (Genpak Ltd) was added
at an equal volume. Cycle sequencing was 30 cycles of 95°C (5 seconds), 53°C (5 seconds)
and 72°C (4 minutes). After a final extension time of 10 minutes the samples were cooled to
4°C for up to 15 hours. Samples were prepared for sequencing by ethanol precipitation at
room temperature in the presence of sodium acetate. The precipitated DNA was redissolved
in 95% formamide, 1% dextran blue in TBE and loaded onto 32 cm well-to-read gels. Gels
were either 4% 19:1 acrylamide:bis or Hydrolink 4.25% “Singels”, both in TBE with urea.
Electrophoresis used 1x run speeds with the temperature set to either 51 or 45°C. For
capillary sequencing, samples were resuspended in template suppressant buffer (TSB) and
loaded without dextran blue.
The following sequences were analysed: PCR products derived from the plasmid pJD series
of cloned mismatches (ATCC Accession No 87584); exons from the human p53, CFTR and
VHL genes.
Sequence analysis used the ABI analysis software and alignment of sequences was
performed using “Align” in DNAstar according to the Wilbur and Lipman algorithm or by
Clustal in Sequence Navigator.
Mutation detection using Big Dye terminators
VHL: wild type and 522 delTG. Deletions are easy to detect because of their downstream effects.
p53: T>A called correctly in one direction, but missed on the reverse strand.
CFTR: M470V missed on the forward strand, but called on the reverse strand. Setting base calling
to detect all putative heterozygotes generates more miscalls because it is not sensitive to the context
of the sequence mismatches.
Mismatch control plasmids
When the plasmid series was grown and amplified, one of the insertion clones (pJDTid2), labeled
above, gave PCR products that were larger than expected. The sequence of 500 bases around the
mismatch is therefore being checked in all of our ATCC stocks. So far the C and A clones have
been confirmed to have the correct sequence.
Row 1: A; row 2: C; row 3: A/C heteroduplex mismatch on forward and reverse strands. Forward
sequence (nt121) called as C, reverse sequence (nt313) called as T.
Sequencing heteroduplexes
Mismatch detection in control plasmid amplicons
Figure panels: pJDC, mixture, pJDA.
The reverse strand was more difficult for the software to call. The G at this position was weak even
in the homozygote, so reliable detection of this mutation requires sequencing the reverse
complement. The mismatch was visible (though not called by the software) in the mixture when the
primer was 50 bases away, but when the primer was 340 bases away the weak G signal was lost.
Sequencing in only one direction would have missed this heterozygote.
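The value of reading both strands can be sketched in a few lines of Python. This is an illustration only, not part of the ABI analysis software: it assumes the forward and reverse reads cover exactly the same region with no insertions or deletions, and the function names and example sequences are hypothetical. Any position where the two strands disagree is merged to an IUPAC ambiguity code and flagged for visual inspection.

```python
# IUPAC codes for single bases and two-base (heterozygous) combinations.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
}
BASES = {code: bases for bases, code in IUPAC.items()}
COMPLEMENT = str.maketrans("ACGTRYSWKMN", "TGCAYRSWMKN")


def reverse_complement(seq):
    """Return the reverse complement, including IUPAC two-base codes."""
    return seq.translate(COMPLEMENT)[::-1]


def merge_strands(forward, reverse):
    """Merge a forward read with a reverse read of the same region.

    Assumption: both reads span the same bases with no indels, so the
    reverse complement of `reverse` lines up with `forward` base by base.
    Returns the merged sequence and the 0-based positions that disagree.
    """
    rev = reverse_complement(reverse)
    if len(forward) != len(rev):
        raise ValueError("reads must cover the same region")
    merged, flagged = [], []
    for i, (f, r) in enumerate(zip(forward, rev)):
        if f == r:
            merged.append(f)
            continue
        flagged.append(i)
        # An unknown call ("N") contributes no bases to the union.
        union = BASES.get(f, frozenset()) | BASES.get(r, frozenset())
        merged.append(IUPAC.get(union, "N"))
    return "".join(merged), flagged


# Hypothetical example: the forward read calls an A/C heterozygote (M),
# while the reverse read calls the same position as homozygous.
consensus, to_inspect = merge_strands("ACGTMCA", "TGGACGT")
print(consensus, to_inspect)
```

A merge of this kind only reports disagreement; it does not decide which strand is right, so the flagged positions still need to be checked on the traces.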
The neighboring sequence profile was affected by the base change, making reliable automated base
calling difficult. In this case, although a cytosine peak is seen in the mixture, it is not reported by
the software. There is a trade-off in adjusting the signal:noise ratio: the setting must give useful
base calls yet not miss minor peaks in heterozygotes.
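The trade-off can be illustrated with a minimal Python sketch. It assumes each position is summarised by four peak heights and calls a heterozygote when the second-highest peak reaches a set fraction of the highest. The function name, the threshold values and the peak heights are hypothetical and are not taken from the ABI software.

```python
# IUPAC codes for the six possible two-base heterozygotes.
IUPAC_HET = {
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
}


def call_position(peaks, min_ratio=0.3, min_signal=50):
    """Call one position from peak heights, e.g. {"A": 1000, "C": 250, ...}.

    A heterozygote is called when the second peak is at least
    `min_ratio` times the height of the main peak (assumed rule).
    """
    ranked = sorted(peaks.items(), key=lambda item: item[1], reverse=True)
    (base1, height1), (base2, height2) = ranked[0], ranked[1]
    if height1 < min_signal:
        return "N"  # signal too weak to call at all
    if height2 >= min_ratio * height1:
        return IUPAC_HET[frozenset((base1, base2))]
    return base1


# Hypothetical peak heights: a weak but genuine minor C peak, and a
# homozygous A with a background T peak of similar size.
weak_het = {"A": 1000, "C": 250, "G": 10, "T": 20}
background = {"A": 1000, "C": 15, "G": 5, "T": 200}

for ratio in (0.3, 0.15):
    print(ratio, call_position(weak_het, ratio), call_position(background, ratio))
```

With the stricter ratio the genuine heterozygote is reported as a plain A; with the looser ratio it is detected, but the background peak is also reported as a heterozygote. No single threshold separates the two cases, which is why the rule takes no account of sequence context.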
PCR amplicons generated from plasmids identical except at one base position enable sequencing
performance characteristics to be evaluated systematically.
Sequence read length is limited by gel resolution
Apart from miscalls right at the beginning of the sequence, sequencing an amplicon of
1,000 bases revealed only 2 errors in the first 650 bases.
The first error was due to an “A” called as “N” because of a background “T” peak.
Background “T” peaks have been seen on several occasions. The second error
was a missed “C” peak, which was weak after 2 “G”s. The error rate increased rapidly
after 690 bases, mostly due to the poor resolution of peaks in this region of the gel.
Figure panels: traces at 100, 390, 650 and 1000 bases.
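The way errors accumulate along a read can be tabulated by comparing the called sequence with the known reference in fixed windows. The following Python sketch is one possible way to do this, assuming the two sequences have already been aligned and contain no insertions or deletions; the function name and the toy sequences are hypothetical.

```python
def errors_by_window(called, reference, window=100):
    """Count base-call mismatches per window along a read.

    Assumption: `called` and `reference` are pre-aligned and of equal
    length (no indels). Returns (first_base, last_base, errors) tuples
    using 1-based positions.
    """
    if len(called) != len(reference):
        raise ValueError("sequences must be aligned to equal length")
    counts = []
    for start in range(0, len(reference), window):
        end = min(start + window, len(reference))
        errors = sum(
            1 for c, r in zip(called[start:end], reference[start:end]) if c != r
        )
        counts.append((start + 1, end, errors))
    return counts


# Toy example: a 20-base reference with one "N" and one miscalled base.
reference = "ACGT" * 5
called = list(reference)
called[3] = "N"   # reference T reported as N
called[17] = "A"  # reference C reported as A
print(errors_by_window("".join(called), reference, window=10))
```

Applied to real reads, a table like this makes the position at which the error rate starts to climb easy to see.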
Discussion
Big Dye terminator chemistry lacks the precision of dye primer sequencing. However, the greater
convenience of terminator chemistry makes it attractive for routine use, particularly for fragments
of 500 bases and less. Although heterozygotes are visible by eye, it is very difficult to set the base-calling sensitivity to call heterozygotes without miscalling background peaks. Reliable
heterozygote detection by dye terminator sequencing is possible in sequence runs of up to 500
bases, but the sequence should be generated from both strands for the detection of heterozygous
point mutations and the sequence should still be visually inspected.
Whilst it is possible to sequence plasmid DNA beyond 800 bases using Big Dye terminators, we
found that PCR products rarely read beyond 600 bases, after which base calling errors started to
appear. Reading beyond 600 bases loses accuracy for two reasons: weaker signals and poor gel
resolution. Signal strength could be improved by increasing the cycle number and loading more
sample. Hydrolink gel was superior in our hands to 19:1 acrylamide:bis, giving increased base
separation and longer reads.
Further improvements may be possible using 48 cm gels instead of the 36 cm gels used in this
study. Gel integrity was better preserved by setting the temperature to 45°C rather than the default
51°C. We found little advantage in column purification to remove dye terminators, although
unincorporated dye terminator peaks were reduced through the use of half-term diluent in the
sequencing mix. Sequencing on the 310 with a 61 cm capillary gave resolution at 500 bases equivalent
to that of the 377 with Hydrolink gels; larger fragments were not analysed in this study.
Summary
The progress in sequencing the human genome means that in the immediate future there will be an
increasing demand to re-sequence genes for diagnostic (mutation identification),
epidemiological and candidate gene studies.
There is a clear need for quality assessment of sequencing systems, particularly for diagnostic
applications. The detection of heterozygotes has been recognized as a special case in point (3).
We observed a number of context-dependent sequencing errors, which would be difficult to check
manually on a large scale and which may be difficult to deal with using the default settings of
Factura or Phred. One solution is to directly compare sequences with a known standard using a
peak subtraction algorithm (4). This may quickly identify sequence anomalies, increasing the
throughput of sequence checking.
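The idea of comparing a trace against a known standard can be sketched as follows. This is a deliberately simplified illustration in Python and not the published algorithm (4): it assumes the sample and standard traces are already aligned point by point, scales each trace by its total signal, and flags points where any channel differs by more than a threshold. All names, values and the threshold are hypothetical.

```python
CHANNELS = "ACGT"


def normalise(trace):
    """Scale a trace so that the mean total signal per point is 1."""
    n_points = len(trace["A"])
    total = sum(sum(trace[base]) for base in CHANNELS)
    scale = n_points / total if total else 0.0
    return {base: [v * scale for v in trace[base]] for base in CHANNELS}


def subtract_traces(sample, standard):
    """Per-channel difference between normalised sample and standard.

    Assumption: both traces have the same number of points and are
    already aligned, which real traces generally are not.
    """
    s, r = normalise(sample), normalise(standard)
    return {
        base: [x - y for x, y in zip(s[base], r[base])] for base in CHANNELS
    }


def flag_anomalies(diff, threshold=0.2):
    """Return the points where any channel differs by more than threshold."""
    n_points = len(diff["A"])
    return [
        i for i in range(n_points)
        if any(abs(diff[base][i]) > threshold for base in CHANNELS)
    ]


# Toy traces with one peak height per base: the standard reads A, C, G, T;
# the sample has a half-height C and a half-height T at the second base.
standard = {"A": [100, 0, 0, 0], "C": [0, 100, 0, 0],
            "G": [0, 0, 100, 0], "T": [0, 0, 0, 100]}
sample = {"A": [100, 0, 0, 0], "C": [0, 50, 0, 0],
          "G": [0, 0, 100, 0], "T": [0, 50, 0, 100]}
print(flag_anomalies(subtract_traces(sample, standard)))
```

In the difference trace a heterozygote shows up as a drop in one channel paired with a rise in another at the same point, which is easier to detect automatically than a minor peak in the raw trace.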
References
1 Rosenblum BB, Lee LG, Spurgeon SL, Khan SH, Menchen SM, Heiner CR, Chen SM (1997)
New dye-labeled terminators for improved DNA sequencing patterns. Nucleic Acids
Research 25, 4500-4504.
2 Deeble VJ, Roberts E, Robinson MD, Woods CG, Bishop DT, Taylor GR (1999)
Comparison of enzyme mismatch cleavage and chemical cleavage of mismatch on a defined set of
heteroduplexes. Genetic Testing 1, 1-8.
3 Kronick MN (1997) Heterozygote sequencing using automated DNA sequencing technology. In:
Laboratory Methods for the Detection of Mutations and Polymorphisms in DNA (Ed. GR Taylor).
CRC Press.
4 Bonfield JK, Rada C, Staden R (1998) Automated detection of point mutations using
fluorescent sequence trace subtraction. Nucleic Acids Research 26, 3404-3409.