Exercises 23rd of April, 2013

Internet resources
You will need the following genome browsers/databases to do these exercises:
www.ensembl.org
www.ncbi.nlm.nih.gov/omim
www.ncbi.nlm.nih.gov/pubmed

Exercise 3.
a) Which disease has the OMIM number 105830?
b) This syndrome has three gene/locus numbers listed. What are the three numbers, and what are the names of the genes?

Exercise 4.
You have done a linkage analysis in a family with cerebellar ataxia and found a huge linkage peak on chromosome 22.
a) Are there any genes causing pure spinocerebellar ataxia on chromosome 22? (You can use either OMIM or Ensembl to answer this question.)
b) Search for the ATXN10 gene in the Ensembl database. Choose “Human” and type in the name of the gene. You will get a result summary: choose and double-click “Gene” and then “Human”. What is the location of the gene?
c) Click on the location link. Where on the chromosome is the gene located (q or p arm)?
d) In the left-side menu, click on “Chromosome summary”. What is the length of the chromosome? (Is this given in physical or genetic distance?)
e) How many protein-coding genes are there on chromosome 22?

Exercise 5.
a) Go to www.pubmed.org. Note that this is also the starting point for OMIM and many other databases: by scrolling down the menu that currently shows “PubMed”, you can see all the various NCBI databases and resources. See if you can find an appropriate database to search for the SNP variant rs186343462. Which database would you suggest?
b) Click on the only search result and you will be directed to dbSNP. What are the reference alleles for this SNP?
c) Go back to the front page at www.pubmed.org. Now scroll down the menu to find an appropriate database to search for the SCN1A gene. Which database did you find?
d) Choose the gene listed first in the results (SCN1A [Homo sapiens]). What is the official full name?
e) What is the chromosomal position of this gene?
f) Can you find this gene by searching for it in OMIM?
(Hint: You might find a shortcut on the right side menu.) What is the MIM number of this gene?
g) What does the star in front of the number mean? (Hint: This is described under “Limits” if you go to the OMIM start page.)

Exercise 6.
a) You will now explore the region 123,456,789-132,121,131 on chromosome 3. How large is this region in physical distance?
b) Go to www.ensembl.org. Go to the tool BioMart and make an xls list of genes in this region by using the settings: Ensembl genes, Homo sapiens, chromosome 3: 123,456,789-132,121,131, Ensembl gene ID, associated gene name, MIM morbid description, MIM gene accession, MIM gene description. How many genes do you retrieve?
c) How many have a MIM morbid description? (Note that some genes may have more than one MIM description.)

Exercise 7.
The children in this family are affected with an unknown disease, and you suspect that the condition is monogenic and recessive.
a) What is the maximum LOD score for this pedigree? Would you consider starting a linkage project with this family?
b) You find out that the parents are first cousins. Explain how this changes your view of the potential for disease mapping in this family. Which mapping technique becomes relevant in light of the new information?
c) What is the difference between homozygosity and autozygosity? For each of the children, how much of the genome is expected to be autozygous?
d) For any pair of the children, what is the expected fraction of the genome where they are both autozygous? And the fraction of the genome where all three are autozygous?
e) Explain why it can be a good idea to sequence one or both parents when doing homozygosity mapping.
f) You decide to sequence the exome of two family members. Who would you choose? If you could afford to do three, who would you choose then? Discuss pros and cons of the competing alternatives.
g) The exome sequencing produces thousands of variants for each individual.
List some (as many as you can) of the filters you would use to remove uninteresting variants.
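
As a starting point for discussing Exercise 7 g), the sketch below shows how a handful of typical filters could be chained for a recessive disease in a consanguineous family. It is only an illustration under stated assumptions: the Variant record, its field names, the thresholds (max_af, min_depth), the consequence categories and the gene names are all hypothetical and do not come from any particular variant caller or annotation tool. The list of filters is not meant to be complete.

```python
from dataclasses import dataclass

# Hypothetical, simplified variant record (field names are assumptions).
@dataclass
class Variant:
    gene: str
    consequence: str            # e.g. "missense", "synonymous", "stop_gained"
    population_af: float        # allele frequency in a reference population
    depth: int                  # read depth at the site
    genotypes: dict             # sample -> "hom_ref" | "het" | "hom_alt"
    in_autozygous_region: bool  # overlaps a shared homozygous segment

# Assumed set of consequences treated as potentially damaging.
DAMAGING = {"missense", "stop_gained", "frameshift", "splice_site"}

def passes_filters(v, affected, parents, max_af=0.01, min_depth=10):
    """Return True if the variant survives all filters (recessive model)."""
    if v.depth < min_depth:              # quality: too few reads
        return False
    if v.population_af > max_af:         # too common to cause a rare disease
        return False
    if v.consequence not in DAMAGING:    # unlikely to alter the protein
        return False
    if not v.in_autozygous_region:       # outside the mapped candidate regions
        return False
    # Every affected child must be homozygous for the alternative allele.
    if any(v.genotypes.get(s) != "hom_alt" for s in affected):
        return False
    # Every sequenced parent must be a heterozygous carrier.
    if any(v.genotypes[p] != "het" for p in parents if p in v.genotypes):
        return False
    return True

affected = ["child1", "child2", "child3"]
parents = ["mother", "father"]

all_hom = {"child1": "hom_alt", "child2": "hom_alt", "child3": "hom_alt"}
variants = [
    Variant("GENE_A", "missense", 0.0, 40, {**all_hom, "mother": "het"}, True),
    Variant("GENE_B", "synonymous", 0.0, 35, {**all_hom, "mother": "het"}, True),
    Variant("GENE_C", "stop_gained", 0.15, 50, {**all_hom, "mother": "het"}, True),
    Variant("GENE_D", "missense", 0.001, 30,
            {"child1": "hom_alt", "child2": "het", "child3": "hom_alt"}, True),
]

candidates = [v.gene for v in variants if passes_filters(v, affected, parents)]
print(candidates)
```

In this toy data set only the first variant survives: the second is synonymous, the third is too common, and the fourth is not homozygous in all affected children. In a real project the order of the filters does not change the result, but applying the cheap ones (quality, frequency) first keeps the list manageable.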