Structural variation in two human genomes mapped at single-nucleotide resolution
by whole genome de novo assembly
Jian Wang & Jun Wang, et al
Nature Biotechnology 29, 723–730 (2011)
Presenter: Chih-Hao Wu
Commentator: Dr. Tsung-Lin Liu
Date/Time: 2011/11/24, 17:10-18:00
Place: Room 601, Med College Building
Background:
Several new DNA sequencing technologies, collectively called next-generation sequencing, provide
high-throughput, low-cost approaches to DNA sequencing. In bioinformatics, sequence
assembly refers to reconstructing the original sequence by aligning and merging many fragments of
DNA sequence. Structural variation (SV) is defined as a variant region of DNA approximately 1 kb
or larger in size, and includes deletions, duplications, insertions, inversions and translocations.
Previous methods for calling structural variation are limited in their ability to identify
complex variation. In theory, accurate and complete de novo assembly of human genomes
should allow more comprehensive mapping of structural variations.
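As a rough illustration of why an assembly makes such mapping possible, the sketch below scans a gapped pairwise alignment between a reference sequence and an assembled sequence and reports gap runs as insertions or deletions. It is a toy example under stated assumptions, not the authors' pipeline: the function name `call_indels`, the `min_len` threshold, and the input format (two equal-length strings with `-` for gaps) are all hypothetical, and inversions, duplications and translocations are not handled.

```python
from typing import List, Tuple


def call_indels(ref_aln: str, asm_aln: str, min_len: int = 1) -> List[Tuple[str, int, int]]:
    """Report gap runs in a gapped pairwise alignment as insertions/deletions.

    ref_aln and asm_aln are the two rows of an alignment (hypothetical input
    format), with '-' marking gaps. A gap run in the reference row means the
    assembly carries extra bases (insertion); a gap run in the assembly row
    means reference bases are absent from the assembly (deletion).

    Each call is (kind, ref_pos, length), where ref_pos is the 0-based number
    of reference bases consumed before the event starts.
    """
    if len(ref_aln) != len(asm_aln):
        raise ValueError("alignment rows must have equal length")

    calls: List[Tuple[str, int, int]] = []
    ref_pos = 0  # reference bases consumed so far
    i, n = 0, len(ref_aln)
    while i < n:
        r, a = ref_aln[i], asm_aln[i]
        if r == "-" and a != "-":
            # Extend over the whole run of reference gaps.
            j = i
            while j < n and ref_aln[j] == "-" and asm_aln[j] != "-":
                j += 1
            if j - i >= min_len:
                calls.append(("insertion", ref_pos, j - i))
            i = j
        elif a == "-" and r != "-":
            # Extend over the whole run of assembly gaps.
            j = i
            while j < n and asm_aln[j] == "-" and ref_aln[j] != "-":
                j += 1
            if j - i >= min_len:
                calls.append(("deletion", ref_pos, j - i))
            ref_pos += j - i  # deleted bases still exist in the reference
            i = j
        else:
            # Match or mismatch column (or a column with gaps in both rows).
            if r != "-":
                ref_pos += 1
            i += 1
    return calls


if __name__ == "__main__":
    ref = "ACGT--ACGTACGT"
    asm = "ACGTTTACG---GT"
    for kind, pos, length in call_indels(ref, asm):
        print(kind, pos, length)
```

Raising `min_len` (for example to 1000, matching the approximate 1 kb size given above) would restrict the output to events in the structural-variation size range; the choice of threshold here is illustrative only.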
Objective:
The authors analyze structural variation in two human genomes using a whole-genome de
novo assembly approach.
Result:
First, the authors establish an accurate and less biased structural variation map by
eliminating false-positive alignments, and compare their predictions with those of other existing methods.
Computational simulations suggest that their results are accurate and that de novo assembly can
identify structural variations over a wider range of lengths than previous methods.
Next, they examine the distribution of structural variation in an Asian genome (YH) and an African
genome (NA18507). They also analyze the frequency of structural variation in protein-coding
regions and Alu elements. Structural variation is less frequent in genes than in the
genome as a whole, and coding sequences have a lower structural variation rate than introns.
Finally, the authors compare the structural variations detected in the YH and NA18507 genomes
with data from the 1000 Genomes Project. The results show that structural
variations are more specific to individuals than SNPs are.
Conclusion:
The authors demonstrate the feasibility and power of identifying structural variations
in human genomes by comparing whole-genome de novo assemblies with a reference genome.
Reference:
1. Li, R. et al. De novo assembly of human genomes with massively parallel short read sequencing. Genome
Res. 20, 265–272 (2010).
2. Alkan, C., Sajjadian, S. & Eichler, E.E. Limitations of next-generation genome sequence assembly. Nat. Methods 8, 61–65 (2011).