Bacterial Genome Finishing Using Optical Mapping

Dibyendu Kumar, Fahong Yu and William Farmerie
Interdisciplinary Center for Biotechnology Research,
University of Florida, Gainesville, FL-32611
Abstract
Optical Mapping Method
The cost-efficiency of modern next-generation DNA sequencing technology allows
investigators to undertake genome projects that were not affordable earlier. These
sequencing techniques also bring new problems in genome assembly and
finishing. Our core laboratory has several ongoing bacterial genome projects
addressing a variety of challenges to genome assembly and closure. Several factors
contribute to these challenges, including sequence repeats that exceed read length,
intrinsic sequencing errors, and genome rearrangements. Together these factors
complicate genome closure when using shotgun DNA sequencing data alone. In
the absence of a physical map, we adopted whole-genome optical mapping as a
tool to validate bacterial genome assemblies. OpGen, Inc. (Gaithersburg,
Maryland) prepared the optical maps used in these projects. Briefly, an optical
map is a complete genome restriction map deduced from a number of partial
restriction maps. Optical maps are generated by spreading carefully extracted
genomic DNA onto a treated glass surface containing many narrow channels,
followed by digestion in situ with restriction enzymes. About 50–100 overlapping
partial optical contigs are combined by alignment software to produce a
contiguous whole genome restriction map. The contiguous optical map can be
aligned and compared with the in silico restriction map of contigs obtained from
whole-genome assembly. We successfully used optical mapping for guiding the
closure of four closely related bacterial genomes. The optical maps not only oriented
scaffolds but also allowed us to identify assembly errors, which was not possible
using shotgun DNA sequencing data alone. Thus, we conclude that, to ensure the
accuracy of a finished bacterial genome and to accelerate the overall finishing
process, optical mapping is an important complement to de novo assemblies
generated by next-generation DNA sequencing.
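The in silico restriction map mentioned in the abstract can be illustrated with a minimal sketch. It assumes a linear contig sequence held in a plain string and uses XhoI (recognition site C^TCGAG, a palindrome, so only one strand needs searching). The function name `in_silico_digest`, its parameters, and the toy sequence are hypothetical, chosen only for illustration; this is not the software used to produce the maps on this poster.

```python
def in_silico_digest(sequence, site="CTCGAG", cut_offset=1):
    """Return fragment lengths (bp) from cutting a linear sequence at every site.

    site       -- recognition sequence (default: XhoI, CTCGAG)
    cut_offset -- distance from the start of the site to the cut
                  (XhoI cuts after the first C, so the offset is 1)
    """
    sequence = sequence.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        # Resume one base later so overlapping sites would also be found.
        start = sequence.find(site, start + 1)

    # Fragment boundaries are the sequence ends plus every cut position.
    boundaries = [0] + cuts + [len(sequence)]
    return [right - left for left, right in zip(boundaries, boundaries[1:])]


# Toy contig (made up for illustration) containing two XhoI sites.
contig = "AAAA" + "CTCGAG" + "GGGGGGGG" + "CTCGAG" + "TT"
fragments = in_silico_digest(contig)
print(fragments)  # ordered fragment lengths, comparable with an optical map
```

The ordered list of fragment lengths is the quantity that is compared with the optical map; a circular genome would additionally need the first and last fragments joined.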
Challenges to Assembly
Optical chip containing single DNA molecule
After digestion with restriction enzyme
Size and order of fragments deduction
Vertical lines indicate the locations of
restriction enzyme cut sites
Anchoring and Orienting Contigs to Optical Map
Optical circular XhoI map of a bacterial genome.
The outermost red circle represents the consensus
map created from the single-molecule maps shown
as arcs. Arc colors are arbitrary and used only for
contrast.
Optical Comparative Mapping
No physical library; unknown gap sizes
Sequence read length
Sequencing errors such as carry-forward, homopolymer length, and incomplete
extension
Genome rearrangement
Detecting Misassembly
Detecting Deletion in Assembled Sequence
Assembled Genome
Mismatch in Fragment length, missing repeat
Optical Map
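The comparison behind this panel can be sketched as follows, assuming the fragments of the assembled sequence and the optical map have already been aligned one to one. A fragment that is markedly shorter in the assembly than in the optical map suggests sequence missing from the assembly, such as a collapsed repeat. The function name `flag_size_mismatches`, the 10% tolerance, and all fragment sizes are hypothetical values chosen for illustration, not measurements from these projects.

```python
def flag_size_mismatches(optical_kb, in_silico_kb, tolerance=0.10):
    """Flag aligned fragment pairs whose sizes disagree by more than `tolerance`.

    Both lists hold fragment sizes in kb, in map order, already aligned
    one to one (an assumption of this sketch). Returns tuples of
    (index, optical size, assembly size, assembly minus optical).
    """
    if len(optical_kb) != len(in_silico_kb):
        raise ValueError("fragment lists must be aligned one to one")

    flagged = []
    for index, (optical, predicted) in enumerate(zip(optical_kb, in_silico_kb)):
        difference = predicted - optical
        # The tolerance is relative to the optical fragment size.
        if abs(difference) > tolerance * optical:
            flagged.append((index, optical, predicted, difference))
    return flagged


# Made-up sizes: the third assembly fragment is much shorter than its
# optical counterpart, as expected if a repeat copy was collapsed.
optical = [12.0, 30.5, 19.2, 8.4]
assembly = [12.3, 30.1, 13.9, 8.5]

flagged = flag_size_mismatches(optical, assembly)
for index, optical_size, assembly_size, difference in flagged:
    kind = "possible deletion" if difference < 0 else "possible insertion"
    print(f"fragment {index}: optical {optical_size} kb, "
          f"assembly {assembly_size} kb ({kind} in assembly)")
```

A negative difference points to sequence missing from the assembly, while a positive one points to extra sequence; the appropriate tolerance depends on the sizing accuracy of the map and would need to be chosen per project.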
Optical Mapping Limitation: Missing Fragments
Assembled Genome
Missed fragment
Optical Map
Optical Mapping Limitation: Assembled Contig Length
Conclusions
Based on our experience, we strongly recommend including optical
mapping in the normal genome sequencing pipeline.
It is a relatively efficient, very fast, and independent way to validate a
bacterial genome assembly.
Estimating Gap Size
Contigs
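One simple way to read gap sizes off an optical-map alignment is sketched below, assuming each contig has already been anchored to the map and its start and end are known in map coordinates (bp). The function name `estimate_gaps`, the contig names, and the coordinates are all hypothetical; a negative value indicates that two placements overlap rather than leave a gap.

```python
def estimate_gaps(placements):
    """Estimate gaps between neighbouring contigs anchored to an optical map.

    placements -- list of (name, map_start, map_end) tuples in bp,
                  in any order (an assumed input format for this sketch).
    Returns (left contig, right contig, gap in bp) for each adjacent pair.
    """
    ordered = sorted(placements, key=lambda placement: placement[1])
    gaps = []
    for left, right in zip(ordered, ordered[1:]):
        left_name, _, left_end = left
        right_name, right_start, _ = right
        # Gap = where the next contig starts minus where the previous one ends.
        gaps.append((left_name, right_name, right_start - left_end))
    return gaps


# Made-up placements, deliberately unsorted.
placements = [
    ("contig_2", 250_000, 400_000),
    ("contig_1", 10_000, 230_000),
    ("contig_3", 395_000, 500_000),
]

gaps = estimate_gaps(placements)
for left_name, right_name, gap in gaps:
    print(f"{left_name} -> {right_name}: {gap} bp")
```

Because optical fragment sizes carry measurement error, gap sizes obtained this way are estimates whose resolution is limited by the spacing of restriction sites near the contig ends.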
Paired-end reads are critical for building scaffolds that can be aligned to
the optical map. Without paired-end reads, only a minority of contigs align to
the map, and many contigs remain orphans.
The effectiveness of an optical map depends on the choice of enzyme used for
mapping. For some finishing jobs, a second map made with a different enzyme is critical.
Optical map [PvuII]
Fragment: 64
Length: 19,211 bp
Cut Position: 423,252
Scaffolds
An optical map increases the speed of finishing and decreases the overall
cost of the genome sequencing project.