Manual (MS Word file)

Huvariome
The central requirement for implementing next-generation sequencing (NGS) in clinical practice is simple and
secure access to databases containing curated knowledge of variants scored as clinically relevant pathogenic
mutations, with standardized clinical reporting. Huvariome provides the user with whole-genome allele
frequencies, their associated quality scores (detection and chance to detect the variant), gene-based ranking,
and integrated access to publicly available data for the detection of common, rare and deleterious variants.
The functional impact of variants in Huvariome is provided by the Complete Genomics (CG) annotation pipeline.
The novelty of Huvariome is that it provides rapid and simple access to SNVs, short indels, and de novo
assembled regions at any position in the human genome, together with allelic frequencies and their associated
error. Huvariome also delivers common variants from a small cohort of Benelux genomes from unrelated
individuals with no disease association. We have developed Huvariome as a simple application that goes beyond
current platforms with similar goals by enabling clinical research scientists to search allelic frequencies
efficiently in both public and private genomes. The following pages describe in detail how to use Huvariome.
hg18 variant analysis
The current database contains whole genomes from 165 individuals, all mapped to hg18. The following example
describes how to query Huvariome and what Huvariome returns. Known variants from EGFR were downloaded from the
UCSC Table Browser and a subset was used to query Huvariome. The query takes the form of a chromosome number
(e.g. 7) and a 0-based start location (e.g. 55216556), as in the example below (Fig 1). The user then keeps the
default query build, NCBI36 (hg18), and submits the request by clicking the Run Variomatic button at the bottom
of the query page (Fig 1B).
Huvariome query
7 55216556
7 55216983
7 55217384
7 55217551
7 55217609
7 55217875
7 55217935
7 55217949
7 55218287
7 55218347
7 55218596
7 55218972
7 55219034
7 55219086
Fig-1A
Fig-1B
Fig 1 – Variants from EGFR from UCSC (A), and the chromosome number and start position used for the query to Huvariome (B).
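The query format described above (one variant per line, chromosome number followed by a 0-based start position)
can be checked before it is pasted into the query box. The following is a minimal sketch in Python, not part of
Huvariome itself; the function name parse_query is hypothetical, and the assumption that fields are separated by
whitespace is taken from the layout of the example query.

```python
def parse_query(text):
    """Parse query text into (chromosome, 0-based start) tuples.

    Assumption: one variant per line, chromosome and position separated
    by whitespace (tab or spaces). This helper is illustrative only.
    """
    variants = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        fields = line.split()
        if len(fields) != 2:
            raise ValueError(
                f"line {line_no}: expected 2 fields, got {len(fields)}"
            )
        chrom, pos = fields
        try:
            position = int(pos)
        except ValueError:
            raise ValueError(
                f"line {line_no}: position {pos!r} is not an integer"
            ) from None
        if position < 0:
            raise ValueError(f"line {line_no}: position must be >= 0")
        variants.append((chrom, position))
    return variants


# First three EGFR positions from the hg18 example query.
query = "7\t55216556\n7\t55216983\n7\t55217384\n"
print(parse_query(query))
```

Rejecting malformed lines locally avoids submitting a request that cannot be matched to any genomic location.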
The results page is then returned. Depending on the size of the request or the load on the server, a notification
appears stating that the page will be refreshed every 30 s; the page can also be refreshed manually by pressing F5
on the keyboard (Fig 2). If the request is large, Huvariome returns the genomic locations and the genotype
frequencies before the annotations, which reduces the waiting time for the user (Fig 3).
Fig-2
Fig-3
Fig 2 – Results page from Huvariome.
Fig 3 – If the query is large, the chromosome location of the query and the genotypes are returned with a warning that the
annotation will be delivered as soon as possible.
The results for the 14 submitted EGFR variants are displayed in Fig 4, initially with the Diversity panel
genotypes in the primary display.
Fig 4 – An illustration of the analysis of five genomic locations using the guest account. The balloons labeled on this figure
outline the key data that are returned for each variant (each row). The frequency of each genotype is highlighted by the
size of the associated blue bar. Abbreviations: gsym = gene symbol, comp = gene component (e.g. exon, intron),
xref = external reference for variants, dgv = database of genomic variants, vista = VISTA enhancers (http://enhancer.lbl.gov/).
hg19 variant analysis
The current database contains whole genomes from 165 individuals, all mapped to hg18. Variants mapped to hg19 can be
queried against Huvariome, which returns both the hg19 position and the hg18 position of the submitted variant. Huvariome
uses the UCSC mapping table to “liftover” hg19 variants to the hg18 results currently stored in the database.
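The idea behind lifting a position from hg19 to hg18 can be illustrated with a toy block lookup. This is a
sketch only, not the UCSC liftOver tool and not Huvariome's implementation. The single block below is an
assumption inferred from the EGFR example positions in this manual, where each hg19 start is 32506 bases higher
than its hg18 counterpart; it is not read from a real UCSC chain file, and the names BLOCKS and lift_hg19_to_hg18
are hypothetical.

```python
# Each block: (chromosome, hg19_start, hg19_end, hg18_start), 0-based, half-open.
# Illustrative block only: bounds and offset are inferred from the EGFR
# example positions in this manual, not taken from a UCSC chain file.
BLOCKS = [
    ("7", 55249062, 55251593, 55216556),
]


def lift_hg19_to_hg18(chrom, pos, blocks=BLOCKS):
    """Return (chrom, hg18_pos) for a 0-based hg19 position, or None.

    A position maps only if it falls inside an aligned block; positions
    outside every block are reported as unmapped (None).
    """
    for b_chrom, src_start, src_end, dst_start in blocks:
        if b_chrom == chrom and src_start <= pos < src_end:
            return chrom, dst_start + (pos - src_start)
    return None


print(lift_hg19_to_hg18("7", 55249062))
print(lift_hg19_to_hg18("7", 100))
```

A real mapping table contains many such blocks per chromosome, and positions that fall in gaps between blocks
cannot be lifted.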
In the following example, known variants from EGFR were downloaded from the UCSC Table Browser and a subset (Fig 5A)
was used as a test case to query Huvariome. UCSC uses 0-based, half-open notation for nucleotide positions,
whilst Huvariome uses 0-based start positions; thus only the chromosome number and start position are required to query
Huvariome, in tab-delimited format (Fig 5B), as with the hg18 query.
Chromosome  Start     End       Alleles
chr7        55249062  55249063  A,G
chr7        55249489  55249490  A,G
chr7        55249890  55249891  C,T
chr7        55250057  55250058  A,G
chr7        55250115  55250116  C,T
chr7        55250381  55250382  A,G
chr7        55250441  55250442  A,G
chr7        55250455  55250456  A,G
chr7        55250793  55250794  C,T
chr7        55250853  55250854  G,T
chr7        55251102  55251103  A,G
chr7        55251478  55251479  C,T
chr7        55251540  55251541  A,G
chr7        55251592  55251593  A,G
Fig 5A
Huvariome query
7 55249062
7 55249489
7 55249890
7 55250057
7 55250115
7 55250381
7 55250441
7 55250455
7 55250793
7 55250853
7 55251102
7 55251478
7 55251540
7 55251592
Fig 5B
Fig 5C
Fig 5 – Variants from EGFR from UCSC (A), the chromosome number and start position taken from them (B), and the query submitted to Huvariome (C).
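Turning UCSC rows (Fig 5A) into the tab-delimited query (Fig 5B) amounts to dropping the "chr" prefix and keeping
only the 0-based start. A minimal Python sketch follows; the function name ucsc_rows_to_query and the tuple
layout of the input rows are assumptions made for illustration.

```python
def ucsc_rows_to_query(rows):
    """Convert UCSC rows into tab-delimited Huvariome query lines.

    Assumption: each row is (chromosome, start, end, alleles) with a
    0-based, half-open interval, as in the UCSC Table Browser output.
    Only the chromosome number and the start position are kept.
    """
    lines = []
    for chrom, start, _end, _alleles in rows:
        # "chr7" -> "7"; names without the prefix are passed through.
        number = chrom[3:] if chrom.startswith("chr") else chrom
        lines.append(f"{number}\t{int(start)}")
    return "\n".join(lines)


# First two rows of Fig 5A.
rows = [
    ("chr7", 55249062, 55249063, "A,G"),
    ("chr7", 55249489, 55249490, "A,G"),
]
print(ucsc_rows_to_query(rows))
```

Because both UCSC and the query use 0-based starts, the start value is copied unchanged; no +1 or -1 adjustment
is applied.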
The resulting query (Fig 5B) is inserted into the query box on the Huvariome start page; the user then selects GRCh37
(hg19; lift over to hg18) and clicks the Run Variomatic button (Fig 5C). The results are displayed as in the hg18 view, but
the hg19 position is now shown as original (chr, pos), as in Fig 6 below.
Fig-6
Fig 6 – Huvariome output: hg18 (chr, pos), hg19 (original-chr, -pos), ref (reference allele), a1 & a2 (genome set alleles), nc rate (no-call rate),
xref (external reference), impact (effect on gene), gsym (gene symbol), comp (gene feature), dgv (database of genomic variants), vista
(human regulatory elements), common var (>=5% MAF in 31 genomes, denoted by a “1”).
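The common var column marks variants with a minor allele frequency (MAF) of at least 5%. How Huvariome computes
this flag internally is not described here; the Python sketch below shows one plausible calculation for a
biallelic site, assuming the MAF is taken over called alleles only. The function name and its parameters are
hypothetical.

```python
def common_variant_flag(alt_allele_count, called_alleles, threshold=0.05):
    """Return 1 if the minor allele frequency is >= threshold, else 0.

    Assumptions (not confirmed by the manual): the site is biallelic and
    no-call alleles are excluded from the denominator.
    """
    if called_alleles <= 0:
        return 0  # no genotype calls: cannot be classed as common
    alt_freq = alt_allele_count / called_alleles
    # The minor allele is whichever of the two alleles is rarer.
    maf = min(alt_freq, 1.0 - alt_freq)
    return 1 if maf >= threshold else 0


# 31 diploid genomes give 62 alleles when every genotype is called.
print(common_variant_flag(4, 62))  # 4/62 is about 6.5%
print(common_variant_flag(3, 62))  # 3/62 is about 4.8%
```

Under these assumptions, a variant needs at least 4 copies of the minor allele among 62 called alleles to be
flagged as common.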