Bioinformatic approach to filter candidate variants
To define candidate variants for further analyses, we developed and implemented a
customized variant filtering and prioritization strategy (Figure 1) based on population
frequency, functional consequence and in silico prediction, using the dbNSFP plugin [1]
via VEP. In the functional filter, all exonic non-synonymous, nonsense, splice-site,
loss-of-function and truncating variants in the targeted regions were considered
potential candidates. The frequency filter used allele frequencies from the 1000
Genomes and ESP databases [2, 3]. Using stringent thresholds of one individual in the
1000 Genomes database (1/1092, approximately 0.09%) and two individuals in the ESP
database (2/6503, approximately 0.03%), we excluded all variants with an allele
frequency greater than the set threshold in the respective database. Variants
previously reported in dbSNP build 141 [4] were noted but not excluded from
downstream analyses. By comparing the remaining variants with those reported in the
Human Gene Mutation Database (HGMD) [5], we identified variants previously reported
in the literature as disease-causing mutations. The pathogenicity of missense variants
was predicted in silico using two ensemble scores from dbNSFP (RadialSVM and LR),
which are based on ten component scores (SIFT, PolyPhen-2 HDIV, PolyPhen-2 HVAR,
GERP++, MutationTaster, Mutation Assessor, FATHMM, LRT, SiPhy, PhyloP) and the
maximum frequency observed in the 1000 Genomes populations [1].
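The functional and frequency filters described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the `Variant` record, its field names and the consequence labels are hypothetical, and the sketch assumes that a variant is dropped when it exceeds the threshold in either database and that a variant absent from a database has a frequency of zero. Only the two thresholds (1/1092 and 2/6503) and the rule that dbSNP membership is recorded but not used for exclusion come from the text.

```python
from dataclasses import dataclass
from typing import List, Optional

# Thresholds quoted in the text: one individual in 1000 Genomes, two in ESP.
MAX_AF_1000G = 1 / 1092  # about 0.09%
MAX_AF_ESP = 2 / 6503    # about 0.03%

# Hypothetical labels standing in for the functional classes named in the text.
CANDIDATE_CONSEQUENCES = {
    "non_synonymous", "nonsense", "splice_site", "loss_of_function", "truncating",
}


@dataclass
class Variant:
    """Hypothetical annotated variant record (field names are illustrative)."""
    consequence: str
    af_1000g: Optional[float] = None  # None: not observed in 1000 Genomes
    af_esp: Optional[float] = None    # None: not observed in ESP
    dbsnp_id: Optional[str] = None    # recorded only, never used to exclude


def passes_functional_filter(v: Variant) -> bool:
    """Keep only variants whose consequence is one of the candidate classes."""
    return v.consequence in CANDIDATE_CONSEQUENCES


def passes_frequency_filter(v: Variant) -> bool:
    """Assumption: exceeding the threshold in either database excludes the variant."""
    af_1000g = v.af_1000g if v.af_1000g is not None else 0.0
    af_esp = v.af_esp if v.af_esp is not None else 0.0
    return af_1000g <= MAX_AF_1000G and af_esp <= MAX_AF_ESP


def filter_candidates(variants: List[Variant]) -> List[Variant]:
    """Apply the functional filter, then the frequency filter."""
    return [
        v for v in variants
        if passes_functional_filter(v) and passes_frequency_filter(v)
    ]
```

Under these assumptions, a nonsense variant seen at a frequency of 0.0005 in 1000 Genomes is retained whether or not it has a dbSNP identifier, whereas a non-synonymous variant at 0.01 is removed.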
Standard manual genetic analysis
A team of geneticists and a clinician also evaluated the results of this study manually.
The clinical evaluation involved inspection of all identified variants, followed by
filtering based on frequency in the reference population databases (1000 Genomes and
ESP projects) and in silico prediction scores (SIFT, PolyPhen, MutationTaster). All
previously published variants were evaluated based on the literature and other
available information, such as variants reported in NCBI’s ClinVar database
(www.clinvar.com). Variants were classified into five categories (pathogenic, likely
pathogenic, variant of unknown significance, likely benign and benign).
References
1. Liu X, Jian X, Boerwinkle E. dbNSFP v2.0: A Database of Human Nonsynonymous
SNVs and Their Functional Predictions and Annotations. Human Mutation
2013;34:E2393-E2402.
2. 1000 Genomes Project Consortium, Abecasis GR, Auton A, Brooks LD, DePristo MA,
Durbin RM, Handsaker RE, Kang HM, Marth GT, McVean GA. An integrated map of
genetic variation from 1,092 human genomes. Nature 2012;491(7422):56-65.
3. NHLBI. Exome Variant Server, NHLBI Exome Sequencing Project (ESP), Seattle, WA
(URL: http://evs.gs.washington.edu/EVS/).
4. Sherry ST, Ward MH, Kholodov M, Baker J, Phan L, Smigielski EM, Sirotkin K.
dbSNP: the NCBI database of genetic variation. Nucleic Acids Research
2001;29(1):308-11.
5. Stenson PD, Ball EV, Mort M, Phillips AD, Shiel JA, Thomas NS, Abeysinghe S,
Krawczak M, Cooper DN. Human Gene Mutation Database (HGMD): 2003 update.
Human Mutation 2003;21(6):577-81.