Assignment 2 Answers

Assignment II
Using NCBI databases
1. Find the protein with the accession number: P23367 in the NCBI protein
database. (10 points)
a. How many amino acids are in the protein? 615
b. What is the function of the protein? DNA mismatch repair
2. Find the gene mutL of Escherichia coli. (15 points)
a. How many records did you retrieve in the NCBI Gene database? 33
b. How many mutL genes does one Escherichia coli genome have? 1
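The Gene search in question 2 can also be written as an NCBI E-utilities request. The sketch below only builds the request URL and does not contact NCBI. The query string `mutL[gene] AND Escherichia coli[orgn]` is an assumed phrasing of the search, not necessarily the one that produced the count above, and `build_esearch_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Base address of the E-utilities ESearch service.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(db, term):
    """Return an ESearch URL for the given Entrez database and query term."""
    # urlencode percent-encodes the brackets and turns spaces into '+'.
    return ESEARCH + "?" + urlencode({"db": db, "term": term})


# Assumed query: gene symbol restricted by the organism field tag.
url = build_esearch_url("gene", "mutL[gene] AND Escherichia coli[orgn]")
print(url)
```

The number of records returned can change over time as the database is updated, so a count obtained this way may differ from the one recorded above.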
3. Searching for the Homo sapiens g6pd protein in the NCBI protein database
will result in records from both RefSeq and GenBank.
(10 points)
a. How many records are from GenBank, RefSeq, and SwissProt?
GenBank – 124; RefSeq – 3; SwissProt - 1
b. Read about RefSeq and GenBank (e.g., in
http://www.ncbi.nlm.nih.gov/RefSeq/RSfaq.html#rsgbdiff). In which database do you
expect to find more records? Why? GenBank. It is an archive of submissions from
numerous individual laboratories, which gives it wide coverage but also makes
it redundant. RefSeq records are curated and non-redundant, so there are
fewer of them.
4. Find the tumor suppressor pp32r1 gene (accession number AF008216) in the
nucleotide database. (15 points)
a. What is the source organism and the chromosome from which the sequence
has been obtained? Homo sapiens; chromosome 4
b. At which nucleotide does translation start? 4453
c. How many amino acids are in the protein? 234
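The answers to 4b and 4c are linked by simple arithmetic, sketched below. This assumes the coding sequence is a single uninterrupted interval whose coordinates include the stop codon; the record itself should be checked for a joined (multi-exon) CDS. The function names are hypothetical.

```python
def protein_length_from_cds(start, end):
    """Amino acid count for a CDS given 1-based inclusive coordinates.

    Assumes one uninterrupted interval that includes the stop codon.
    """
    n_nt = end - start + 1
    if n_nt % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    # One codon per amino acid, minus the stop codon.
    return n_nt // 3 - 1


def cds_end_for(start, n_amino_acids):
    """Last nucleotide of a CDS (stop codon included) under the same assumption."""
    return start + 3 * (n_amino_acids + 1) - 1


# With translation starting at 4453 and a 234-residue protein:
end = cds_end_for(4453, 234)
print(end, protein_length_from_cds(4453, end))
```

Under that assumption, 234 amino acids plus a stop codon span 705 nucleotides.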
5. Using the NCBI cross-database search, find all entries for Human
immunodeficiency virus 2 (HIV-2). (15 points)
a. In which database will you be able to find how many coding sequences are in
its genome? Entrez Genome
b. How many coding sequences are there? 9
6. Using the NCBI genome database, find the entry for the Aquifex aeolicus
VF5 genome (without plasmids). (10 points)
a. What is the GC content of its chromosome? 43%
b. What is the length of its genome? 1,551,335 bp
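GC content is the fraction of bases that are G or C. A minimal sketch of the calculation follows, using a short made-up sequence rather than the Aquifex aeolicus chromosome; `gc_content` is a hypothetical helper, and ignoring ambiguous bases such as N is a choice made here, not something stated by NCBI.

```python
def gc_content(sequence):
    """Return the fraction of G and C among the A, C, G, T bases of a sequence."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A, C, G or T")
    # Ambiguous symbols (e.g. N) are left out of the denominator.
    return gc / (gc + at)


# Toy example: 4 of the 8 bases are G or C.
print(f"{gc_content('ATGCGCAT'):.0%}")
```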
7. Using the NCBI Genome Project database, answer the following questions:
(10 points)
a. How many chromosomes are in the genome of Saccharomyces cerevisiae? 16
b. How many different Saccharomyces species can be found in the Genome
Project database? 8 species (query: Saccharomyces[orgn]; limits: Genome
overview)
8. Mutations in the BRCA1 gene have been reported to be associated with the early
onset of breast cancer. (15 points)
a. How many non-synonymous mutations of human BRCA1 are in the SNP
database? Limiting the BRCA1 search in the SNP database to “coding nonsynonymous” and Homo sapiens yields 1282 results.
Examine record rs70953662 in the SNP database.
b. What are the nucleotide alleles? T to A
c. What is the amino acid change between these alleles? Ile to Asn
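The answers to 8b and 8c can be cross-checked against the standard genetic code. The sketch below lists every isoleucine codon in which a single T to A substitution yields asparagine. It assumes the alleles are read on the coding strand, which may not match the strand reported in the SNP record; `single_base_changes` is a hypothetical helper.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def single_base_changes(from_aa, to_aa, ref, alt):
    """List (codon, mutated codon, position) where ref->alt turns from_aa into to_aa."""
    hits = []
    for codon, aa in CODON_TABLE.items():
        if aa != from_aa:
            continue
        for i, base in enumerate(codon):
            if base != ref:
                continue
            mutated = codon[:i] + alt + codon[i + 1:]
            if CODON_TABLE[mutated] == to_aa:
                hits.append((codon, mutated, i + 1))  # 1-based codon position
    return hits


# Ile (I) to Asn (N) through a T to A change.
for codon, mutated, pos in single_base_changes("I", "N", "T", "A"):
    print(codon, "->", mutated, "at codon position", pos)
```

Only a change at the second codon position (ATT to AAT, or ATC to AAC) is consistent with both answers.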