Genetic Variations Resources
May 15, 2013
Ansuman Chattopadhyay, PhD, Head, Molecular Biology Information Service
Health Sciences Library System, University of Pittsburgh
ansuman@pitt.edu
http://www.hsls.pitt.edu/guides/genetics

Objectives
• Human genetic variations
• Genetic variation databases
• Functional analysis of mutations/SNPs

Topics
Databases: dbSNP, dbGaP, PheGenI, DGV, DECIPHER, OMIM, HGMD, RegulomeDB
Tools: HuGE Navigator, FASTSNP, SPOT, GenomeTrax

Human Genetic Variations
[Figure: types of human genome variation arranged on a size scale (>5 Mb, 1 Mb, 1 kb, 1 bp): chromosomal rearrangements (deletions / inversions / translocations); copy number variations (CNVs: duplications, deletions, insertions); in/dels; micro- and minisatellites; SNPs. Population labels shown in the figure: 0.4-0.6%, 100%, 100%.]

Human Genome Variations
Scherer, S.W. (2009), "Copy number variation", in Scherer, S. (ed.), Copy Number Variation: , The Biomedical & Life Sciences Collection, Henry Stewart Talks Ltd, London
• SNPs: 3,213,401 bp (0.11% of the genome)
• CNVs: 40,568,593 bp (1.35% of the genome)

SNP Facts

Life Cycle of SNPs and Mutations
Mutation / Private SNP; SNPs

Classifications of SNPs: Genomic location based

Classifications of SNPs: Nucleotide substitution based

Polymorphisms and Disease Markers

International HapMap Project
http://www.hapmap.org/
Whole-genome genotyping of 10 million SNPs is:
• Technologically daunting
• Prohibitively expensive
Researchers tried to downsize the problem of genome-wide genotyping by studying haplotypes. A haplotype is a contiguous, linear set of SNP alleles along a genome that is inherited as a block.
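As a rough illustration of why haplotypes shrink the genotyping problem, the Python sketch below counts the distinct haplotypes in a small set of phased chromosomes. The allele strings are made-up data and the function name is hypothetical; nothing here is taken from HapMap itself.

```python
from collections import Counter

# Hypothetical phased chromosomes: each string lists the alleles observed at
# four neighbouring SNP sites on one chromosome copy (illustrative data only).
chromosomes = [
    "ACGT", "ACGT", "GTAC", "ACGT",
    "GTAC", "GTAC", "ACGT", "GTAT",
]


def haplotype_frequencies(chroms):
    """Return each distinct haplotype with its relative frequency."""
    counts = Counter(chroms)
    total = len(chroms)
    return {hap: n / total for hap, n in counts.most_common()}


freqs = haplotype_frequencies(chromosomes)
n_sites = len(chromosomes[0])
# Upper bound on allele combinations if every biallelic site varied independently.
possible = 2 ** n_sites

for hap, freq in freqs.items():
    print(f"{hap}: {freq:.3f}")
print(f"{len(freqs)} haplotypes observed out of {possible} possible combinations")
```

With four biallelic sites there could be up to 16 allele combinations, yet only three haplotypes occur in this toy sample. Because the alleles travel together, genotyping a subset of the sites is often enough to tell the haplotypes apart.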
The Origin of Haplotypes

Haplotype Blocks

Haplotype and Tag SNPs

HapMap Population

http://www.1000genomes.org/about

Bioinformatics Institutions
http://www.ncbi.nlm.nih.gov/
http://www.ebi.ac.uk/

dbSNP

dbSNP Stats
http://www.ncbi.nlm.nih.gov/SNP/snp_summary.cgi (as of January 25, 2010)

Current Status of dbSNP
http://goo.gl/TwltA (MassGenomics blog by Dan Koboldt)

dbSNP Data Types

Ref SNP: rs4244285

Genetic Terminologies

Ref SNP: rs4244285
Submitted SNP: ss5586415

SNPedia and NextBio
http://goo.gl/aOsoX
http://goo.gl/Lqqd4

Searching dbSNP
• Identify SNPs present in a gene sequence
• Identify SNPs reported to be present in a genomic region

Searching dbSNP
• UCSC Genome Browser
• UCSC Table Browser
• NCBI dbSNP page: http://www.ncbi.nlm.nih.gov/snp

dbSNP Search Result Display

GWAS

GWAS Plot
Each SNP is assessed for "genome-wide" significance after Bonferroni correction.

Publications on HapMap

HapMap

GWAS: Genome-Wide Association Studies
http://www.genome.gov/gwastudies/

Find SNPs for a Disease/Trait
CDC-developed HuGE Navigator: http://hugenavigator.net/

GWAS Integrator
What SNPs are associated with "asthma"?

GWAS Integrator: rs7216389
Ref: Moffatt MF et al. childhood asthma. Nature. 2007 Jul 26;448(7152):470-3. Epub 2007 Jul 4. PubMed PMID: 17611496.
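The Bonferroni correction mentioned for the GWAS plot can be sketched in a few lines of Python. This is a minimal illustration, not the method used by any of the tools above: the function names, the SNP labels, the p-values and the assumed count of one million tests are all hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-SNP cutoff that keeps the family-wise error rate at or below alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def genome_wide_hits(p_values, n_tests, alpha=0.05):
    """Return the SNPs whose p-value falls below the Bonferroni cutoff."""
    cutoff = bonferroni_threshold(alpha, n_tests)
    return {snp: p for snp, p in p_values.items() if p < cutoff}


# Hypothetical association p-values (placeholders, not real GWAS results).
example_p_values = {"snp_a": 1e-10, "snp_b": 3e-6, "snp_c": 4e-8}
N_TESTS = 1_000_000  # assumed number of SNPs tested in the study

cutoff = bonferroni_threshold(0.05, N_TESTS)
hits = genome_wide_hits(example_p_values, N_TESTS)

print(f"cutoff = {cutoff:.1e}")
print("genome-wide significant:", sorted(hits))
```

Dividing the overall significance level by the number of SNPs tested is what makes the cutoff so strict: a p-value of 3e-6 looks small in isolation but does not pass when a million SNPs are tested.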
GWAS Integrator: rs7216389

Find Associated Genes for a Disease/Trait

Gene Prospector: Asthma

HuGE Navigator
An integrated, searchable knowledge base of genetic associations and human genome epidemiology

PheGenI

Clinically Associated Mutations
• OMIM
• HGMD

Online Mendelian Inheritance in Man (OMIM)

OMIM

Epigenome and Encyclopedia of DNA Elements Project

Spatiotemporal Gene Expression
EGFR, TP53

A movie on regulated transcription
http://vcell.ndsu.edu/animations/regulatedtranscription/index.htm

Epigenetic Mechanisms
Source: NCBI http://www.ncbi.nlm.nih.gov/books/NBK45788/#epi_sci_bkgrd.About_Epigenetics

Genome in 3D
http://www.nature.com/nature/journal/v470/n7333/pdf/470289a.pdf

Chromatin Immunoprecipitation-Seq (ChIP-Seq)

Epigenetic Markers
Landmark paper: http://www.nature.com/ng/journal/v39/n3/full/ng1966.html

Histone Modifications
http://goo.gl/GQ9V8

ENCODE Project
http://www.genome.gov/10005107
http://goo.gl/QeIbQ

RegulomeDB

RegulomeDB Search
rs7216389
rs2853669

Hands-on Exercise on Searching dbSNP
Mutations in the human BRCA1 gene are reported to be associated with the early onset of breast cancer.
• How many coding nonsynonymous SNPs have been reported for this gene?
• How many of these SNPs show >40% heterozygosity?
• Pick a SNP from the list and find the position in the protein sequence where the amino acid changes because of this SNP.
• How many indels are reported in the region chr21:33,031,597-33,041,570?
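Two pieces of arithmetic sit behind the exercise above: the size of the chr21 region and the 40% heterozygosity cutoff. The Python sketch below parses the region string and computes the expected heterozygosity of a biallelic SNP as 2pq. It assumes 1-based inclusive coordinates and Hardy-Weinberg proportions; dbSNP may estimate its reported heterozygosity differently, and the helper names are hypothetical.

```python
import re

# Matches regions such as "chr21:33,031,597-33,041,570" (commas optional).
REGION_PATTERN = re.compile(r"\s*(chr[0-9A-Za-z]+)\s*:\s*([\d,]+)\s*-\s*([\d,]+)\s*")


def parse_region(text):
    """Split a browser-style region string into (chromosome, start, end)."""
    match = REGION_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"unrecognised region: {text!r}")
    chrom = match.group(1)
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if start > end:
        raise ValueError("start must not exceed end")
    return chrom, start, end


def expected_heterozygosity(allele_freq):
    """Expected heterozygosity 2pq for a biallelic SNP with allele frequency p."""
    return 2 * allele_freq * (1 - allele_freq)


chrom, start, end = parse_region("chr21:33,031,597-33,041,570")
region_length = end - start + 1  # inclusive coordinates assumed

print(chrom, start, end, region_length)
for p in (0.1, 0.3, 0.5):
    h = expected_heterozygosity(p)
    print(f"p = {p}: heterozygosity = {h:.2f}, above 40%: {h > 0.4}")
```

Under these assumptions a SNP only clears the 40% cutoff when its minor allele is fairly common, which is why the filter removes most rare variants from the result list.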
Hands-on Exercise
• Identify genes and SNPs associated with your disease/trait of interest
• Crohn's disease, prostate cancer, LDL cholesterol

Structural Variations
Normal: A B C
Inversion: C B A
Duplication: A B B
Deletion: A C
Insertion: A D B C
CNV: B C

Structural Variations

Structural Variations Databases
• Database of Genomic Variants (DGV)
• dbVar
• DECIPHER
• dbRIP
• Mitelman Breakpoint Database

DGV

DGV: Genome-wide view

DGV: Chr 1

DGV Hands-on Exercise
• Is the CETN1 or GRIP1 gene located in a region that is copy number variable?
• Are there any other genes in this region?
• Can you find any inversions or indels there as well?
• What is the frequency of the CNV reported in the study population?

DGV: Genome Browser

dbVar

DECIPHER
https://decipher.sanger.ac.uk/application/dashboard

DECIPHER Syndrome Report

Structural Variations Databases
• Database of Genomic Variants (DGV)
• dbVar
• DECIPHER
• dbRIP
• Mitelman Breakpoint Database

dbRIP
http://dbrip.brocku.ca/searchRIP.html

Mitelman Breakpoint

Hands-on Exercise
Generate an integrated variation map with reference SNPs, Mitelman breakpoints and OMIM diseases for chromosome 17, region 7,773,000-7,792,000 bp. What gene(s) have you found in this region?
Answer key: http://www.ncbi.nlm.nih.gov/Class/NAWBIS/Modules/Variation/Exercises/var_qa3.html

Map Viewer Setup
http://www.ncbi.nlm.nih.gov/projects/mapview

Genetic Variations Map

Online Mendelian Inheritance in Man (OMIM)

OMIM

UCSC Table Browser

Hands-on Exercise
Find all human genes which have only one exon. How many of these also show CNVs?
Tips: Use the UCSC Table Browser

BIOBASE Genome Trax and HGMD
http://goo.gl/pUhQ4

Variant File: VCF

GenomeTrax Input

GenomeTrax Result

Functional Analysis of SNPs

SNPs and the Structure of a Gene

Decision Tree for SNP Analysis

Exonic Splicing Enhancer/Silencer

Functional Analysis of SNPs
A gene variant found primarily in African Americans slightly increases the risk of developing an irregular heartbeat, known as arrhythmia. The variant occurs in the cardiac sodium channel gene SCN5A and changes the amino acid at position 1102 from serine to tyrosine (S to Y). Can you predict the effect of this non-synonymous SNP (rs7626962)?

Answer

Functional Analysis of SNPs
• Entrez SNP - Search Entrez SNP by refSNP ID to find SNP information
• Entrez Protein - Find protein information, including the amino acid sequence and the presence of functional domains
• NCBI Amino Acid Explorer - Compare amino acids in terms of physicochemical properties
• NCBI Mutation Analyzer - Predict the effect of an amino acid change on the protein structure
• TMHMM Server v. 2.0 - Predict the presence of transmembrane helices in a protein sequence
• Russell et al., Amino Acid Properties Table - Predict the effect of an amino acid change on the protein structure

SNP Gene View for SCN5A

Multiple Sequence Alignment

Amino Acids Comparison
NCBI Amino Acid Explorer

Compare Amino Acid Properties
Amino Acid Properties Table: http://www.russell.embl.de/aas/

Amino Acid Substitution Preference

Tools for Amino Acid Substitution Effect Prediction
• SIFT: http://sift.jcvi.org/
• PolyPhen: http://genetics.bwh.harvard.edu/pph/
• SNPs3D: http://www.snps3d.org/
• pMUT: http://mmb2.pcb.ub.es:8080/PMut/

Comparison of AAS Prediction Tools
Pauline C. Ng and Steven Henikoff, Annu. Rev. Genomics Hum. Genet. 2006. 7:61–80

Tools on Functional SNP Analysis
• Search.HSLS MolBio link: http://search.hsls.pitt.edu/vivisimo/cgi-bin/querymeta?v%3aproject=BioInfoTools&v%3afile=viv_B7AUre&v%3aframe=list&v%3astate=root%7cN891&id=N891&action=list&
• FASTSNP - an always up-to-date and extendable service for SNP function analysis and prioritization: http://fastsnp.ibms.sinica.edu.tw/
• F-SNP - computationally predicted functional SNPs for disease association studies: http://compbio.cs.queensu.ca/F-SNP/

FASTSNP Analysis

F-SNP: A Collection of Functional SNPs Specifically Prioritized for Disease Association Studies

Tutorials and References
• Advanced Course on NCBI Resources (Browser: IE, select .html format)
• Predictive Functional Analysis of Polymorphisms: An Overview. Author: Michael R. Barnes. Book: Bioinformatics for Geneticists. Source: Wiley InterScience: Online Books
• Functional In Silico Analysis of Non-Coding SNPs. Author: Thomas Werner. Book: Bioinformatics for Geneticists. Source: Wiley InterScience: Online Books
• Predicting the Effects of Amino Acid Substitutions on Protein Function. Pauline C. Ng and Steven Henikoff, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109

Thank you! Any questions?
Carrie Iwema, iwema@pitt.edu, 412-383-6887
Ansuman Chattopadhyay, ansuman@pitt.edu, 412-648-1297
http://www.hsls.pitt.edu/guides/genetics