DNA SEQUENCING

Objectives:
1) To understand these strategies of dideoxy DNA sequencing:
   a) Shotgun
   b) Primer walking
   c) Cycle sequencing
   d) Automated sequencing
2) To understand the process of data assembly.
3) To recognize common artifacts in a four-color chromatogram.

General review of dideoxy sequencing

Dideoxy sequencing requires enough prior knowledge of the sequence to make a primer. This diagram illustrates the dideoxy terminator sequencing strategy in the old-fashioned four-lane polyacrylamide gel format. The bands would have been visualized by autoradiography, based on radiolabeling of the sequencing primer. In modern usage, the terminators are fluorescently labeled, each with a different color dye. The reaction is conducted in one tube and separated by capillary electrophoresis, resulting in a four-color chromatogram. One can expect 600-900 bases of sequence per read.

Cycle sequencing

Most sequencing is now done by cycle sequencing. In cycle sequencing, the extension is done with a thermostable DNA polymerase at high temperature. A major advantage of this is that the primer can be annealed at high stringency, thus avoiding adventitious priming at alternate sites. It is then possible to sequence directly from templates as large as 300 kb.

A second advantage is that the reaction can be conducted in a thermal cycling machine (PCR machine) and cycled from a stringent temperature for annealing and polymerization, up to a higher temperature to denature the product from the template, and back to the annealing temperature to initiate another round of priming and synthesis. In this way, product can be accumulated from relatively small amounts of template. (Note: this is not a chain reaction, because only one primer is used and the product does not itself become a template in later cycles. Therefore, product accumulates in proportion to the number of cycles, not exponentially.)
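The difference between linear and exponential accumulation noted above can be made concrete with a small calculation. The following Python sketch is only an idealized illustration: the function names and the `efficiency` parameter are hypothetical, and it assumes every template molecule is primed and extended with the same efficiency in every cycle.

```python
def cycle_sequencing_yield(template, cycles, efficiency=1.0):
    """Idealized cycle sequencing: one primer, so products never serve as
    templates. Each cycle adds at most one product per template molecule,
    and the total grows linearly with the number of cycles."""
    return template * cycles * efficiency


def pcr_yield(template, cycles, efficiency=1.0):
    """Idealized PCR, for comparison: two primers, so products become
    templates. The number of copies (including the starting template)
    multiplies by (1 + efficiency) in every cycle."""
    return template * (1 + efficiency) ** cycles


if __name__ == "__main__":
    # Relative amounts, starting from one unit of template.
    for n in (1, 10, 25):
        linear = cycle_sequencing_yield(1.0, n)
        exponential = pcr_yield(1.0, n)
        print(f"{n:2d} cycles: cycle sequencing {linear:g}, PCR {exponential:g}")
```

Under these assumptions, 25 cycles yield 25 units of sequencing product per unit of template, which is enough to work from small template amounts but far short of the exponential amplification of PCR.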
The ability to sequence from small amounts of template overcomes a major problem with prior sequencing techniques for double-stranded DNA. Double-stranded templates tend to reanneal and inhibit progress of the polymerase. Keeping the template concentration low offsets this problem. Furthermore, putting the label on the terminators rather than on the primer means that any products due to polymerase stalling are unlabeled, because they do not end in a dideoxy terminator.

Strategies of dideoxy sequencing

Primer walking

For sequencing a small (<2 kb) clone, one generally starts with primers complementary to the vector. In primer walking, the sequence gained from priming with the vector primer is then used to design a new custom primer, and one walks stepwise through the sequence in this fashion.

Shotgun sequencing

For larger sequencing projects, some form of shotgun cloning is generally employed. The DNA to be sequenced is broken into a large collection of semi-random fragments, usually by shearing (for example, with a sonicator). The fragments are cloned, and each is sequenced from a vector primer without any prior characterization. A computer program searches the collection of sequences for overlaps and assembles a contiguous sequence from the fragmentary sequence data.

The problem with this approach is that near the end of the project, one has a greater chance of retrieving another sequence from a region that is already done than of gaining a sequence that closes the last gaps. Generally, one should expect to sequence 8 times the total sequence length to fill the last gaps. Most groups gather data by shotgun cloning until they reach a point of diminishing returns; then they switch to a primer walking strategy.

Shotgun assembly is hampered by repetitive sequences. A common strategy around this problem is to size the inserts prior to making the library, and then to keep track of the left and right end sequence reads of each cloned insert.
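The diminishing returns of shotgun sequencing described above can be illustrated with a simple random-coverage estimate. The Python sketch below is not taken from these notes: it assumes an idealized model in which reads start at uniformly random positions, so the expected fraction of the target left uncovered at c-fold redundancy is approximately e^(-c). The function names, the 100 kb target, and the 700-base read length are hypothetical values chosen for illustration.

```python
import math


def fraction_uncovered(coverage):
    """Expected fraction of the target not covered by any read, assuming
    reads start at uniformly random positions (idealized Poisson model)."""
    return math.exp(-coverage)


def reads_needed(target_length, read_length, coverage):
    """Number of reads required to reach a given fold redundancy."""
    return math.ceil(coverage * target_length / read_length)


if __name__ == "__main__":
    target = 100_000   # hypothetical target size, in bases
    read_len = 700     # hypothetical read length, within the 600-900 range
    for c in (1, 2, 4, 8):
        missing = fraction_uncovered(c) * target
        print(f"{c}x: {reads_needed(target, read_len, c)} reads, "
              f"about {missing:.0f} bases still uncovered")
```

In this model each additional fold of redundancy removes only a constant fraction of the remaining gaps, which is consistent with the advice to switch to primer walking once shotgun reads stop closing gaps efficiently. Real libraries are not perfectly random, so actual gap closure is usually slower than this estimate.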
Commonly, several different libraries are also made, with inserts of different sizes.

Last updated 2/28/2005 - Steve Hardies

See also http://biochem.uthscsa.edu/~hs_lab/frames/molgen/seq2005.html for images and a description of assembly by phred/phrap and of sequence artifact problems in four-color traces.