Projects - BC Bioinformatics

BI420 – Introduction to Bioinformatics

Midterm exam (Prof. Marth)

Due Tuesday, November 2, 2004, in class. Each question is worth 10 points.

1. Sequencing informatics: (a) What is the function of a genome fragment assembler? (b) What is the essential input to a genome fragment assembler? (c) How does a genome fragment assembler piece together sequences? (d) Why does a genome fragment assembler have trouble multiply aligning alternatively spliced expressed sequences? (e) Why is it necessary to clip vector sequences before sequence assembly?

2. Genome sequencing: (a) What is the difference between clone-based and whole-genome shotgun sequencing? (b) Draw a figure that marks the main steps of clone-based genome sequencing. (c) Give examples of at least three genomes that were sequenced with each approach.

3. Variation informatics: (a) What are the main steps of polymorphism detection? (b) What is a multiple alignment? (c) What sequence is needed to perform an “anchored” multiple alignment? (d) What is the purpose of paralog filtering? (e) What are the four main quantities that are taken into account by the Bayesian polymorphism detection algorithm implemented in the PolyBayes polymorphism discovery software?

4. Sequence variations: (a) What is the event that gives rise to polymorphisms? (b) Can we detect sequence variations on human chromosome 5 in the DNA of a single person? (c) Can we detect sequence variations on human chromosome X in the DNA of a single person? (d) Name at least three different types of DNA polymorphisms. (e) What is the average rate of polymorphisms when pairs of human sequences are compared?

5. Genomes: (a) Using the NCBI Homology resource (http://www.ncbi.nlm.nih.gov/Homology/) determine those chromosomes in rat and mouse that are homologous to human chromosome 5, band 5p15.1. (b) Name at least one human gene from this band. (c) Determine if that human gene has a homolog in either rat or mouse.

6. Genomics: (a) What is a restriction enzyme? (b) What is the recognition sequence of the enzyme HindIII? (c) If you cut the following sequence with the restriction enzyme EcoRI, how many DNA restriction fragments will be produced and how long will they be: TTATCCATGAATTCGGTTAAATGTAGGAATTCATA? (d) Approximately how many EcoRI restriction sites are there in the human genome? (e) How many are there in the C. elegans genome?
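
For readers who want to check their reasoning on question 6 (c) and (d) computationally, here is a minimal sketch, not part of the original exam. It assumes that EcoRI recognizes GAATTC and cuts after the G (G^AATTC), that the sequence is linear and fully digested, and that fragment lengths are counted on the top strand only. The function names `digest` and `expected_site_count` are illustrative, and the site-count estimate rests on the simplifying assumption that the four bases are equally frequent and independent, which real genomes only approximate.

```python
ECORI_SITE = "GAATTC"   # EcoRI recognition sequence
ECORI_CUT_OFFSET = 1    # assumed cut position: after the G (G^AATTC)


def digest(sequence, site=ECORI_SITE, cut_offset=ECORI_CUT_OFFSET):
    """Return the top-strand fragments of a linear sequence after complete digestion."""
    sequence = sequence.upper()
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset            # index where the top strand is cut
        fragments.append(sequence[start:cut])
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(sequence[start:])    # remainder after the last cut
    return fragments


def expected_site_count(genome_length, site_length=6):
    """Rough estimate of site count, assuming equal and independent base frequencies."""
    return genome_length / 4 ** site_length


if __name__ == "__main__":
    seq = "TTATCCATGAATTCGGTTAAATGTAGGAATTCATA"
    for fragment in digest(seq):
        print(len(fragment), fragment)
```

Because EcoRI leaves a four-nucleotide 5' overhang, the bottom-strand lengths of each fragment differ from the top-strand lengths printed here. To use `expected_site_count`, supply the genome length in base pairs for the organism of interest.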

7. Genome annotation: (a) Name at least three features that are annotated by genome annotations. (b) How are known human repeats annotated? (c) Name at least three different known human repeat types.

8. Proteomics: (a) Draw a figure marking the organization and units of eukaryotic genes. (b) What is the average number of amino acids in a human gene? (c) What is the average genomic length of a human gene? (d) What is the meaning of the term “gene expression”? (e) Name three scientific or medical questions that can be answered by analyzing gene expression.

9. Genomics: (a) What is the BRCA2 gene? (b) What medical condition is associated with this gene? (c) Name at least three other organisms that carry a homologous gene. (d) Using the Ensembl database entry for the BRCA2 gene, determine if Exon 3 (the second coding exon) has any amino-acid changing SNPs. If so, describe the nucleotide and amino-acid change. (e) Provide the title of a journal article that describes BRCA2.

10. Genomics: (a) What is the relationship between GC nucleotide content and gene density in the human genome? (b) What are segmental duplications? (c) What are paralogous sequences? (d) On average, how many human genes are expected between two genetic markers that are 5.7 cM apart? (e) What is the basic principle behind genetic mapping?

Please staple all your sheets together and write your name, email address, and the date on the front page.
