Bio/CS 251 Bioinformatics

Homework 1
1/23/06
20 points
Due Date: Thursday, 1/26/06 at 5 pm to Dr. James, 255 SC
Assignment: Complete Problems 1, 2, 3, and either #4 or #5 (but not
both!).
1. A double-stranded DNA molecule is 28% deoxyguanosine (G). What is the complete
base composition of this molecule?
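Hint for Problem 1: in double-stranded DNA, G pairs with C and A pairs with T, so each member of a pair occurs at the same frequency. The short Python sketch below illustrates that bookkeeping. It deliberately uses a different G content (20%) so that the problem itself is left to you; the function name base_composition is only illustrative.

```python
def base_composition(g_percent):
    """Return the percent of each base in double-stranded DNA, given percent G."""
    if not 0 <= g_percent <= 50:
        raise ValueError("G content of double-stranded DNA must be between 0 and 50%")
    c_percent = g_percent                    # every G pairs with a C
    a_percent = (100 - 2 * g_percent) / 2    # the remainder is split equally between A and T
    return {"G": g_percent, "C": c_percent, "A": a_percent, "T": a_percent}


# Illustrative value only, not the value from Problem 1
print(base_composition(20))
```

The four percentages should always sum to 100, which is a quick check on a hand calculation.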
2. Draw the structures of the nucleoside monophosphates containing Inosine, Guanosine,
and Adenosine, then do the following:
a.
Identify the base hypoxanthine by drawing a circle around it.
b.
Which two bases are the most different in chemical structure? How many
chemical changes would need to occur to change one base into the other, or
vice versa? Indicate whether these changes are aminations, deaminations, or
oxidations.
c.
From your answer in b., explain why evolution (natural selection) chose these
two particular nucleotides, over the third in this trio, as the two purine bases in DNA.
3. The total number of base pairs of DNA in one human cell is 6 x 10^9. From the
values presented in Monday’s PowerPoint lecture, do the following:
a.
Calculate the total length of the DNA from a human cell, expressed in meters.
Show your calculations.
b.
Your body is composed of approximately one quadrillion cells (1 x 10^15 cells).
If you laid all of your DNA molecules end to end, approximately how many times
would this DNA extend to the moon and back?
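The arithmetic in Problem 3 is a unit conversion followed by a division, and can be sketched in Python as shown below. The two constants are assumptions, not the lecture values: roughly 0.34 nm of length per base pair (B-form DNA) and a mean Earth-Moon distance of about 384,400 km. Substitute the numbers from Monday's lecture if they differ. The example uses a bacterial genome of about 4.6 million base pairs rather than the homework values.

```python
RISE_PER_BP_M = 0.34e-9     # assumed: ~0.34 nm per base pair (B-form DNA)
MOON_DISTANCE_M = 3.844e8   # assumed: mean Earth-Moon distance, ~384,400 km


def dna_length_m(base_pairs, rise_per_bp_m=RISE_PER_BP_M):
    """Length in meters of a DNA molecule with the given number of base pairs."""
    return base_pairs * rise_per_bp_m


def moon_round_trips(total_length_m, one_way_m=MOON_DISTANCE_M):
    """Number of trips to the Moon and back that a given length would cover."""
    return total_length_m / (2 * one_way_m)


# Illustrative example: a bacterial genome of ~4.6 million base pairs
ecoli_length = dna_length_m(4.6e6)
print(f"{ecoli_length:.2e} m")
```

For part b, multiply the per-cell length by the number of cells before calling moon_round_trips.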
Complete one of the following two exercises, either #4 or #5.
4.
Bioinformatics Exercise: Do the following:
a.
Go to the National Center for Biotechnology Information homepage:
http://www.ncbi.nlm.nih.gov/
b.
Under “Search”, choose “Gene” from the scroll-down list. In the search box,
type “HPRT”, then hit the “Go” button. (HPRT is the abbreviation for
Hypoxanthine-Guanine Phosphoribosyl Transferase, the gene whose product
salvages excess hypoxanthine and guanine and recycles them back into usable
nucleoside monophosphates.)
c.
How many GenBank entries exist for this gene?
d.
Examine the first three pages of GenBank entries for this gene, and take note of
the organisms from which this gene has been identified. The names of these
organisms are italicized, within brackets, at the end of the second line of each
gene entry.
(1) List at least 10 species, and no more than 20, in which this gene has been
reported. After listing the genus and species names for each organism,
e.g., Homo sapiens, indicate the type of organism, such as human, dog,
chicken, plant, fungus, bacteria, etc.
(2) How diverse is the range of species in which this gene occurs?
(e.g., primates-only, mammals-only, vertebrates-only, etc.)
(if you need help with the biology here, please see Dr. James)
(3) From your answer in (2) above, would you say that the HPRT gene
evolved recently, or did it evolve in the distant past? What information
led you to your conclusion?
e.
From the list of HPRT genes, locate the human version of this gene and click on
the blue HPRT link.
(1) This is the Entrez Gene page containing the single-page annotation for the
human HPRT gene. Print out this page, and submit it with this
assignment.
(2) Scroll towards the bottom of this page of annotation, and click on the
genomic sequence link titled “M26434”. Scroll towards the bottom until
you find the DNA sequence for the human HPRT gene.
How long is this gene, in nucleotides?
f.
Go back to the previous page, and click on the link titled mRNA
“M31642”.
How long is the messenger RNA (mRNA) for the same gene?
g.
Go back to the previous page, and click on the link titled Protein
“AAA52690”
How long is the protein encoded by the HPRT gene and mRNA?
For the moment, take note of the length differences between the
gene, the mRNA, and the protein. The reasons for these
differences will be explained a lecture or two from now.
h.
Finally, determine the number of sites in the gene at which allelic variants, or
mutations, are known to occur. These mutations are often the result of a
single base substitution, also known as a Single Nucleotide Polymorphism, or
SNP. Mutations can also be caused by deletion or insertion of one or more
bases.
To determine the number of sites within the DNA sequence of this gene at
which single-base mutations have been discovered, do the following:
From the “Display” box on the Entrez Gene page, select “SNP links”, and
proceed to the new page.
(1) How many different allelic variants are listed in the SNP database for
this gene?
(2) Scroll down and click on SNP #18 and #19. From the “Allele” column on
the “Single Nucleotide Polymorphism” page for each of these variants,
determine the type of variation in each case. List and briefly describe each
type of mutation. For example, is the mutation caused by substitution of one
base for another (which ones?), or by the insertion or deletion of a base
(which one)?
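The length questions in parts e. through g. can be answered by reading the record, but counting by script is a useful cross-check. The Python sketch below counts the sequence letters in the ORIGIN section of a GenBank flat-file record, ignoring the position numbers and spaces. The toy record is made up for illustration and is not the HPRT sequence; the function name count_bases is hypothetical.

```python
def count_bases(origin_block):
    """Count sequence letters in the ORIGIN section of a GenBank flat-file record."""
    count = 0
    for line in origin_block.splitlines():
        line = line.strip()
        # Skip blank lines, the ORIGIN header, and the end-of-record marker
        if not line or line.startswith("ORIGIN") or line.startswith("//"):
            continue
        # Position numbers and spaces are not letters, so only bases are counted
        count += sum(1 for ch in line if ch.isalpha())
    return count


# Made-up toy record, NOT the HPRT sequence
toy_record = """ORIGIN
        1 gatcctccat atacaacggt atctccacct
       31 caggtttaga
//"""
print(count_bases(toy_record))  # 40
```

The same count should agree with the length reported on the LOCUS line at the top of the record.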
5.
Bioinformatics Exercise: Do the following:
a.
Go to the National Center for Biotechnology Information homepage:
http://www.ncbi.nlm.nih.gov/
b.
Under “Search”, choose “Gene” from the scroll-down list. In the search box,
type “Adenosine deaminase”, then hit the “Go” button. This will provide a
list of ADA and ADA-related genes. One of these genes is responsible for
Adenosine Deaminase Deficiency, aka Severe Combined Immunodeficiency
Syndrome (SCID), aka “Bubble Boy Disease”.
c.
How many GenBank entries exist for this gene?
d.
Examine the first three pages of GenBank entries for this gene, and take note of
the organisms from which this gene has been identified. The names of these
organisms are italicized, within brackets, at the end of the second line of each
gene entry.
(1) List at least 10 species, and no more than 20, in which this gene has been
reported. After listing the genus and species names for each organism,
e.g., Homo sapiens, indicate the type of organism, such as human, dog,
chicken, plant, fungus, bacteria, etc.
(2) How diverse is the range of species in which this gene occurs?
(e.g., primates-only, mammals-only, vertebrates-only, etc.)
(if you need help with the biology here, please see Dr. James)
(3) From your answer in (2) above, would you say that the ADA gene
evolved recently, or did it evolve in the distant past? What information
led you to your conclusion?
e.
From the list of ADA and ADA-related genes, scroll down to #19, “ADA”, and
open this link.
(1) This is the Entrez Gene page containing the single-page annotation for one
version of the human gene. Print out this page, and submit it with this
assignment.
(2) Scroll towards the bottom of this page of annotation, and click on the
genomic sequence link titled “M13792”. Scroll towards the bottom until
you find the DNA sequence for the human ADA gene.
How long is this gene, in nucleotides?
f.
Go back to the previous page, and click on the link titled mRNA
“X02994”.
How long is the messenger RNA (mRNA) for the same gene?
g.
Go back to the previous page, and click on the link titled Protein
“AAA78791”
How long is the protein encoded by the ADA gene and mRNA?
For the moment, take note of the length differences between the
gene, the mRNA, and the protein. The reasons for these
differences will be explained a lecture or two from now.
h.
Finally, determine the number of sites in the gene at which allelic variants, or
mutations, are known to occur. These mutations are often the result of a
single base substitution, also known as a Single Nucleotide Polymorphism, or
SNP. Mutations can also be caused by deletion or insertion of one or more
bases.
To determine the number of sites within the DNA sequence of this gene at
which single-base mutations have been discovered, do the following:
From the “Display” box on the Entrez Gene page, select “SNP links”, and
proceed to the new page.
(1) How many different allelic variants are listed in the SNP database for
this gene?
(2) Scroll down and study SNP #s 7 and 8 by clicking on each SNP. From the
“Allele” column on the “Single Nucleotide Polymorphism” page for each
of these variants, determine the type of variation in each case. List and
briefly describe each type of mutation. For example, is the mutation caused
by substitution of one base for another (which ones?), or by the insertion or
deletion of a base (which one)?
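For part h. of either exercise, the distinction between a substitution and an insertion or deletion can be sketched as a small Python function. It assumes the “Allele” column lists the alternatives separated by a slash, with a dash standing for an absent base (for example “A/G” or “-/T”); check the actual page, since the notation there may differ. The function name classify_variant is hypothetical.

```python
def classify_variant(alleles):
    """Classify an allele string such as 'A/G' or '-/T' (assumed notation)."""
    parts = [a.strip().upper() for a in alleles.split("/")]
    if len(parts) < 2:
        raise ValueError("expected at least two alleles separated by '/'")
    # A dash is assumed to mean the base is absent in one allele
    if "-" in parts:
        return "insertion/deletion"
    # Every allele is a single standard base: one base substituted for another
    if all(len(p) == 1 and p in "ACGT" for p in parts):
        return "substitution"
    return "other"


print(classify_variant("A/G"))   # substitution
print(classify_variant("-/T"))   # insertion/deletion
```

Your written answer should still name the specific bases involved, which the function does not report.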