utmj submission template - University of Toronto Medical Journal

ELECTRONIC SUBMISSION
FOR CONSIDERATION IN THE
UNIVERSITY OF TORONTO MEDICAL JOURNAL
TITLE: Exome sequencing expedites rare disease gene discovery
AUTHOR NAMES:
Kun Huang (1T3)
MD candidate, University of Toronto
CORRESPONDING AUTHOR EMAIL ADDRESS:
Kun Huang
University of Toronto, Faculty of Medicine
1 King's College Circle,
Medical Sciences Building, Room 2109
Toronto, ON, M5S 1A8
Tel: (647) 863-2715
email: kun.huang@utoronto.ca
UTMJ Perspectives SUBMISSION
Page 1 of 8
Kun Huang
For the past several decades, genes underlying Mendelian disorders have been identified
through traditional positional cloning strategies, which often have reduced power due to marked
locus heterogeneity, small kindred sizes, or substantial reproductive disadvantage. 1 The
emergence of next-generation sequencing techniques (i.e. whole-genome, whole-exome and
whole-transcriptome sequencing) has allowed substantial advances in identifying genetic alterations.
In theory, whole-genome sequencing for discovery of genetic variants could
identify the gene underlying any given monogenic disease. However, the cost
of sequencing a whole genome is daunting. An alternative approach, exome
sequencing, involves the targeted resequencing of all protein-coding regions, which
requires only ~5% as much sequencing as a whole human genome. 2-4 An increasing number of
studies in the past two years have demonstrated whole-exome sequencing to be a powerful
approach for identifying causative genes underlying extremely rare Mendelian disorders. 5-12
To process the massive volume of sequencing data effectively, some assumptions, albeit
arbitrary, are made about causal mutations underlying Mendelian disease. 2 11 12 First, the
disease is monogenic and caused by a single mutation. Secondly, the mutation has a significant
effect on phenotype; it is therefore most likely coding and highly penetrant, i.e. a missense or
nonsense substitution, a coding indel, or a splice acceptor or donor site change.
Thirdly, the mutation is rare or novel, and probably private to affected individuals. Where
necessary, a further assumption is often made that the disease is genetically homogeneous, i.e.
unrelated affected individuals have mutations in the same gene, at least for the individuals
whose DNA was sequenced for the study.
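The genetic homogeneity assumption can be illustrated with a minimal Python sketch. It assumes each sequenced individual has already been reduced to the set of genes carrying at least one candidate variant; the function name, the input format and the toy gene labels are hypothetical and are not taken from any of the cited studies.

```python
from collections import Counter


def shared_candidate_genes(genes_per_individual, min_individuals=None):
    """Return genes that carry a candidate variant in enough individuals.

    genes_per_individual: dict mapping a sample ID to the set of genes in
    which that sample has at least one candidate variant (assumed format).
    min_individuals: how many individuals must be hit; defaults to all of
    them, which corresponds to strict genetic homogeneity.
    """
    if min_individuals is None:
        min_individuals = len(genes_per_individual)
    counts = Counter(
        gene for genes in genes_per_individual.values() for gene in set(genes)
    )
    return sorted(gene for gene, n in counts.items() if n >= min_individuals)


# Toy data, invented for illustration only.
toy = {
    "case1": {"GENE_A", "GENE_B", "GENE_C"},
    "case2": {"GENE_A", "GENE_C", "GENE_D"},
    "case3": {"GENE_A", "GENE_E"},
}

strict = shared_candidate_genes(toy)       # gene must be hit in all three cases
relaxed = shared_candidate_genes(toy, 2)   # gene must be hit in at least two cases
print(strict, relaxed)
```

Relaxing `min_individuals` is one simple way to tolerate some locus heterogeneity, at the cost of a longer candidate list.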
Proof of concept for exome sequencing was first demonstrated in 2009 in a rare
Mendelian disorder, Freeman–Sheldon syndrome.
Using only four cases, the study was able to identify MYH3 as the single causal gene. 2
Since then, more than 40 exome sequencing studies have applied various strategies to identify
the causal variants for different disorders such as Miller syndrome 12, Kabuki syndrome 11,
Fowler syndrome 13, Perrault syndrome 14 and Schinzel-Giedion syndrome 15. Some studies
also integrated exome sequencing data with traditionally used linkage and homozygosity
analysis 8 16.
However, the avalanche of data from exome sequencing poses a statistical and
computational challenge: how to separate the causative alterations from the noise caused by
normal variants. Based on the aforementioned assumptions, the primary filter used to identify
potentially causal mutations is variant function, with the rationale that mutations which are
disruptive to proteins and/or lie at more conserved sites are more likely to be pathogenic.
Therefore, non-coding and synonymous variants are often ignored or greatly down-weighted.
Tools such as SIFT 17, PolyPhen 18 19, CDPred 20, PhyloP 21 and GERP 22 23 have been developed to rank
variants by their potential effect on protein structure and function, and also by conservation
scores. Although such a strategy has been justified in many studies, it will almost certainly not
always hold. Shortcomings of this method include the inability to capture regulatory or
evolutionarily conserved sequences in non-coding regions. As more disorders are studied, there
will be a growing need for functional annotation of non-coding regions and for tools to analyze
them.
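The functional filter described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the effect labels, the record layout and the conservation scale (higher meaning more conserved) are hypothetical, and the sketch does not call SIFT, PolyPhen, PhyloP, GERP or any other real tool.

```python
# Functional classes treated as likely protein-altering, following the text.
# The label strings are hypothetical; real annotation tools use their own terms.
PROTEIN_ALTERING = {
    "missense",
    "nonsense",
    "coding_indel",
    "splice_acceptor",
    "splice_donor",
}


def filter_by_function(variants):
    """Keep variants whose annotated effect is in PROTEIN_ALTERING.

    Non-coding and synonymous variants are dropped outright here; a softer
    scheme could down-weight them instead.
    """
    return [v for v in variants if v["effect"] in PROTEIN_ALTERING]


def rank_by_conservation(variants):
    """Sort variants so that the most conserved sites come first.

    Assumes each record has a numeric 'conservation' field where a higher
    value means stronger conservation (an assumed convention).
    """
    return sorted(variants, key=lambda v: v["conservation"], reverse=True)


# Toy records, invented for illustration only.
toy_variants = [
    {"id": "v1", "effect": "synonymous", "conservation": 0.2},
    {"id": "v2", "effect": "missense", "conservation": 1.5},
    {"id": "v3", "effect": "nonsense", "conservation": 4.8},
    {"id": "v4", "effect": "intronic", "conservation": 3.0},
    {"id": "v5", "effect": "splice_donor", "conservation": 2.7},
]

ranked = rank_by_conservation(filter_by_function(toy_variants))
print([v["id"] for v in ranked])
```

Note that the intronic variant `v4` is discarded despite its high conservation score, which is exactly the shortcoming for non-coding regions noted in the text.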
Empirical analysis of published exomes estimates about 20 000 single-nucleotide
variants in a given exome. 2 11 12 Rare mutations with a major effect and a distinctive
phenotype are not expected to be found in the population at large, and hence should not be
seen in genome-wide scans for variants [e.g. the 1000 Genomes Project 24], nor in
polymorphism repositories [e.g. dbSNP 25]. Exclusion from these data sets is typically an
important criterion in defining a rare, novel or private variant. This simple assumption and
filtering strategy makes it possible to sift quickly through the exome data for promising causal
variants. One caveat, however, is that dbSNP has a considerable false-positive rate of 15–17%. 26
It is also possible that recessive disease-causing mutations from unaffected carriers have been
deposited in the database, in which case filtering against it could discard a causal variant.
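Exclusion against public variant catalogues can be sketched as a simple set lookup. The minimal Python example below assumes variants are identified by chromosome, position, reference allele and alternate allele, and that the known variants have already been loaded into a set; it does not parse dbSNP or 1000 Genomes files, and all names and coordinates are hypothetical. Known variants are returned separately rather than silently dropped, so that they can still be reviewed in light of the caveat about carrier alleles.

```python
def variant_key(variant):
    """Build a hashable identifier for a variant record (assumed layout)."""
    return (variant["chrom"], variant["pos"], variant["ref"], variant["alt"])


def split_by_known(variants, known_keys):
    """Partition variants into (novel, previously_reported).

    known_keys: set of (chrom, pos, ref, alt) tuples representing variants
    already present in a public catalogue (assumed to be preloaded).
    """
    novel, reported = [], []
    for variant in variants:
        if variant_key(variant) in known_keys:
            reported.append(variant)
        else:
            novel.append(variant)
    return novel, reported


# Toy data, invented for illustration only.
known = {("1", 1000, "A", "G"), ("2", 5000, "C", "T")}
calls = [
    {"chrom": "1", "pos": 1000, "ref": "A", "alt": "G"},  # same site and allele as a known entry
    {"chrom": "1", "pos": 1000, "ref": "A", "alt": "T"},  # same site, different allele
    {"chrom": "3", "pos": 42, "ref": "G", "alt": "A"},    # not in the catalogue
]

novel, reported = split_by_known(calls, known)
print(len(novel), len(reported))
```

Matching on the alternate allele as well as the position is a deliberate choice in this sketch: a different substitution at a known polymorphic site is still treated as novel.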
As an illustration, a recent study by Haack et al 27 provides an elegant example of how
exome sequencing in combination with appropriate filtering strategies can be effective in the
elucidation of a human respiratory chain disease, mitochondrial complex I deficiency (figure 1).
Discovering the molecular basis of this disease is challenging given the large number of both
mitochondrial and nuclear genes that are involved. Using whole-exome sequencing followed by
filtering with prioritization of mitochondrial proteins (figure 1), Haack et al identified heterozygous
mutations in ACAD9, a mitochondrial acyl-CoA dehydrogenase gene, in a single individual
with severe, isolated complex I deficiency. The authors went on to screen 120 additional
complex I defective index cases for ACAD9 mutations. Two additional unrelated cases and a
total of five pathogenic ACAD9 alleles were identified, further supporting the conclusion that mutations in ACAD9
are associated with a mitochondrial disorder dominated by severe and generalized complex I
deficiency.
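Prioritizing candidates by a disease-relevant gene set, such as genes encoding mitochondrial proteins, can be sketched as a stable reordering. The Python example below is a generic illustration and is not the pipeline used by Haack et al; the function name, the record layout and the placeholder gene labels other than ACAD9 are hypothetical.

```python
def prioritize_by_gene_set(variants, gene_set):
    """Move variants in gene_set to the front, preserving relative order.

    Variants outside the set are kept at the end rather than discarded,
    so that prioritization does not exclude an unexpected causal gene.
    """
    in_set = [v for v in variants if v["gene"] in gene_set]
    out_of_set = [v for v in variants if v["gene"] not in gene_set]
    return in_set + out_of_set


# Placeholder gene set: ACAD9 is described in the text as mitochondrial;
# GENE_M is an invented label standing in for other mitochondrial genes.
mito_genes = {"ACAD9", "GENE_M"}

# Toy candidate list, invented for illustration only.
candidates = [
    {"id": "v1", "gene": "GENE_X"},
    {"id": "v2", "gene": "ACAD9"},
    {"id": "v3", "gene": "GENE_Y"},
    {"id": "v4", "gene": "ACAD9"},
]

ordered = prioritize_by_gene_set(candidates, mito_genes)
print([v["id"] for v in ordered])
```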
Of particular interest, supplementation with riboflavin, whose metabolite fosters
ACAD assembly and stability, resulted in a significant increase in complex I activity in mutant
cell cultures from the patient. 27 This is the first exome sequencing study that also obtained a
promising clinical response. Follow-up clinical trials are needed to establish the efficacy of
supplementation with vitamins and cofactors in individuals with ACAD9 mutations.
Exome sequencing is revolutionizing the way that the genetic bases of Mendelian disorders
are studied. More studies have now applied exome and whole-genome sequencing to common
and genetically complex diseases such as mental retardation. 28 29 Although still an emerging
technique, exome sequencing has already expedited disease gene discovery and is
poised to make personalized medicine a reality.
Figure 1. Exome sequencing and filtering strategy.
References
1. Collins FS. Positional cloning moves from perditional to traditional. Nat Genet 1995;9(4):347-50.
2. Ng SB, Turner EH, Robertson PD, Flygare SD, Bigham AW, Lee C, et al. Targeted capture
and massively parallel sequencing of 12 human exomes. Nature 2009;461(7261):272-6.
3. Choi M, Scholl UI, Ji W, Liu T, Tikhonova IR, Zumbo P, et al. Genetic diagnosis by whole
exome capture and massively parallel DNA sequencing. Proc Natl Acad Sci U S A
2009;106(45):19096-101.
4. Hodges E, Xuan Z, Balija V, Kramer M, Molla MN, Smith SW, et al. Genome-wide in situ exon
capture for selective resequencing. Nat Genet 2007;39(12):1522-7.
5. Johnson JO, Mandrioli J, Benatar M, Abramzon Y, Van Deerlin VM, Trojanowski JQ, et al.
Exome sequencing reveals VCP mutations as a cause of familial ALS. Neuron
2010;68(5):857-64.
6. Kalay E, Yigit G, Aslan Y, Brown KE, Pohl E, Bicknell LS, et al. CEP152 is a genome
maintenance protein disrupted in Seckel syndrome. Nat Genet 2011;43(1):23-6.
7. Wang JL, Yang X, Xia K, Hu ZM, Weng L, Jin X, et al. TGM6 identified as a novel causative
gene of spinocerebellar ataxias using exome sequencing. Brain 2010;133(Pt 12):3510-8.
8. Musunuru K, Pirruccello JP, Do R, Peloso GM, Guiducci C, Sougnez C, et al. Exome
sequencing, ANGPTL3 mutations, and familial combined hypolipidemia. N Engl J Med
2010;363(23):2220-7.
9. Green P, Wiseman M, Crow YJ, Houlden H, Riphagen S, Lin JP, et al. Brown-Vialetto-Van
Laere syndrome, a ponto-bulbar palsy with deafness, is caused by mutations in c20orf54.
Am J Hum Genet 2010;86(3):485-9.
10. Bilguvar K, Ozturk AK, Louvi A, Kwan KY, Choi M, Tatli B, et al. Whole-exome sequencing
identifies recessive WDR62 mutations in severe brain malformations. Nature
2010;467(7312):207-10.
11. Ng SB, Bigham AW, Buckingham KJ, Hannibal MC, McMillin MJ, Gildersleeve HI, et al.
Exome sequencing identifies MLL2 mutations as a cause of Kabuki syndrome. Nat Genet
2010;42(9):790-3.
12. Ng SB, Buckingham KJ, Lee C, Bigham AW, Tabor HK, Dent KM, et al. Exome sequencing
identifies the cause of a mendelian disorder. Nat Genet 2010;42(1):30-5.
13. Lalonde E, Albrecht S, Ha KC, Jacob K, Bolduc N, Polychronakos C, et al. Unexpected
allelic heterogeneity and spectrum of mutations in Fowler syndrome revealed by
next-generation exome sequencing. Hum Mutat 2010;31(8):918-23.
14. Pierce SB, Walsh T, Chisholm KM, Lee MK, Thornton AM, Fiumara A, et al. Mutations in the
DBP-deficiency protein HSD17B4 cause ovarian dysgenesis, hearing loss, and ataxia of
Perrault Syndrome. Am J Hum Genet 2010;87(2):282-8.
15. Hoischen A, van Bon BW, Gilissen C, Arts P, van Lier B, Steehouwer M, et al. De novo
mutations of SETBP1 cause Schinzel-Giedion syndrome. Nat Genet 2010;42(6):483-5.
16. Krawitz PM, Schweiger MR, Rodelsperger C, Marcelis C, Kolsch U, Meisel C, et al.
Identity-by-descent filtering of exome sequence data identifies PIGV mutations in
hyperphosphatasia mental retardation syndrome. Nat Genet 2010;42(10):827-9.
17. Ng PC, Henikoff S. SIFT: Predicting amino acid changes that affect protein function. Nucleic
Acids Res 2003;31(13):3812-4.
18. Adzhubei IA, Schmidt S, Peshkin L, Ramensky VE, Gerasimova A, Bork P, et al. A method
and server for predicting damaging missense mutations. Nat Methods 2010;7(4):248-9.
19. Ramensky V, Bork P, Sunyaev S. Human non-synonymous SNPs: server and survey.
Nucleic Acids Res 2002;30(17):3894-900.
20. Johnston JJ, Teer JK, Cherukuri PF, Hansen NF, Loftus SK, Chong K, et al. Massively
parallel sequencing of exons on the X chromosome identifies RBM10 as the gene that
causes a syndromic form of cleft palate. Am J Hum Genet 2010;86(5):743-8.
21. Pollard KS, Hubisz MJ, Rosenbloom KR, Siepel A. Detection of nonneutral substitution rates
on mammalian phylogenies. Genome Res 2010;20(1):110-21.
22. Cooper GM, Stone EA, Asimenos G, Green ED, Batzoglou S, Sidow A. Distribution and
intensity of constraint in mammalian genomic sequence. Genome Res 2005;15(7):901-13.
23. Cooper GM, Goode DL, Ng SB, Sidow A, Bamshad MJ, Shendure J, et al. Single-nucleotide
evolutionary constraint scores highlight disease-causing mutations. Nat Methods
2010;7(4):250-1.
24. Pennisi E. Genomics. 1000 Genomes Project gives new map of genetic diversity. Science
2010;330(6004):574-5.
25. Sherry ST, Ward MH, Kholodov M, Baker J, Phan L, Smigielski EM, et al. dbSNP: the NCBI
database of genetic variation. Nucleic Acids Res 2001;29(1):308-11.
26. Day IN. dbSNP in the detail and copy number complexities. Hum Mutat 2010;31(1):2-4.
27. Haack TB, Danhauser K, Haberberger B, Hoser J, Strecker V, Boehm D, et al. Exome
sequencing identifies ACAD9 mutations as a cause of complex I deficiency. Nat Genet
2010;42(12):1131-4.
28. Vissers LE, de Ligt J, Gilissen C, Janssen I, Steehouwer M, de Vries P, et al. A de novo
paradigm for mental retardation. Nat Genet 2010;42(12):1109-12.
29. Caliskan M, Chong JX, Uricchio L, Anderson R, Chen P, Sougnez C, et al. Exome
sequencing reveals a novel mutation for autosomal recessive non-syndromic mental
retardation in the TECR gene on chromosome 19p13. Hum Mol Genet 2011;20(7):1285-9.