Lab 1 exercise

Laboratory 1
NCBI, Entrez, and GenBank
The purpose of this exercise is for you to gain rudimentary experience using sequence
databases (i.e., GenBank) through an online portal (i.e., Entrez). In your web browser,
go to the NCBI Entrez website (www.ncbi.nlm.nih.gov/sites/gquery). Provide answers
to the following questions.
1. The last known Tasmanian tiger died in the Hobart Zoo in 1936. DNA sequences
have been obtained from museum specimens. (In fact, there is an effort to clone this
animal using museum material.) You can retrieve Tasmanian tiger sequences using
the Taxonomy Browser. Search the Taxonomy database for "Tasmanian tiger". How
many DNA and protein sequences are there? What genes were cloned? With the
Taxonomy Browser, you can also build a phylogenetic dataset that could be used to
analyze the taxonomic position of the Tasmanian tiger. Click on the Metatheria
(marsupial) link in the lineage of the Tasmanian tiger. How many nucleotide
sequences are there for Metatheria? Retrieve the entry for Metatheria and get the
nucleotide sequences. In Entrez, you can refine the query to include only
cytochrome b sequences through the Preview/Index tab. How many marsupial
cytochrome b sequences are there? You could save these in FASTA format for use in
a phylogenetic analysis, and you could browse further up the lineage to get an
outgroup sequence. (4 points)
2. Michael Crichton's fantasy about cloning dinosaurs, Jurassic Park, contains a
putative dinosaur DNA sequence. Use nucleotide-nucleotide BLAST against the
default nucleotide database, nr, to identify the real source of the sequence below.
Select, copy, and paste the sequence into the BLAST form window. What is the real
source? This is probably the most common use of nucleotide-nucleotide BLAST:
sequence identification, that is, establishing whether an exact match for a
sequence is already present in the database. (1 point)
>DinoDNA from JURASSIC PARK p. 103 nt 1-1200
GCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGC
GGTGGCGAAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCG
TGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGC
TGCTCACGCTGTACCTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTG
CCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAA
AGTAGGACAGGTGCCGGCAGCGCTCTGGGTCATTTTCGGCGAGGACCGCTTTCGCTGGAG
ATCGGCCTGTCGCTTGCGGTATTCGGAATCTTGCACGCCCTCGCTCAAGCCTTCGTCACT
CCAAACGTTTCGGCGAGAAGCAGGCCATTATCGCCGGCATGGCGGCCGACGCGCTGGGCT
GGCGTTCGCGACGCGAGGCTGGATGGCCTTCCCCATTATGATTCTTCTCGCTTCCGGCGG
CCCGCGTTGCAGGCCATGCTGTCCAGGCAGGTAGATGACGACCATCAGGGACAGCTTCAA
CGGCTCTTACCAGCCTAACTTCGATCACTGGACCGCTGATCGTCACGGCGATTTATGCCG
CACATGGACGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAA
CAAGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAA
GCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGG
CTTTCTCAATGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTG
ACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCA
ACACGACTTAACGGGTTGGCATGGATTGTAGGCGCCGCCCTATACCTTGTCTGCCTCCCC
GCGGTGCATGGAGCCGGGCCACCTCGACCTGAATGGAAGCCGGCGGCACCTCGCTAACGG
CCAAGAATTGGAGCCAATCAATTCTTGCGGAGAACTGTGAATGCGCAAACCAACCCTTGG
CCATCGCGTCCGCCATCTCCAGCAGCCGCACGCGGCGCATCTCGGGCAGCGTTGGGTCCT
3. Substantial data are available for two species of filarial nematodes that are human
parasites. Use the Taxonomy Browser to examine the number of nucleotide
sequences for the superfamily Filarioidea and determine which two species these
are. How many nucleotide and protein sequences are there for each of these two
species? Display the nucleotide records for each species. What kind of sequence
makes up most of these records? (2 points)
4. Search for population and phylogenetic studies on bears in Entrez PopSet. Find the
study on brown bears and polar bears and display the alignment. What gene or
molecular region was used in this study? Use the tool bar link to display variations
in the alignment. Are there fixed differences between the brown bear (Ursus arctos)
sequences and the polar bear sequences in the alignment? What if the Ursus
arctos sequence from the "ABC" islands (Sequence 7) is removed?
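The query refinement in question 1 can also be written as a fielded Entrez search term. The sketch below is optional and is not part of the graded exercise. It only builds the term and a matching NCBI E-utilities ESearch URL, and it sends no request. The helper names build_term and build_esearch_url are hypothetical, and restricting by the phrase "cytochrome b" in the record title is an assumption: the Preview/Index tab may match a different set of records (for example, entries labeled only cytb or cob).

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_term(taxon, title_phrase=None):
    """Combine an organism restriction with an optional title phrase.

    Hypothetical helper: the [Title] restriction is an assumption and
    may not return the same records as the Preview/Index tab.
    """
    term = f"{taxon}[Organism]"
    if title_phrase:
        term += f' AND "{title_phrase}"[Title]'
    return term


def build_esearch_url(term, db="nucleotide", retmax=20):
    """Return an ESearch URL for the term.

    retmax caps how many record IDs are returned; the response also
    reports the total hit count.
    """
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{ESEARCH_URL}?{urlencode(params)}"


all_marsupials = build_term("Metatheria")
cytb_only = build_term("Metatheria", "cytochrome b")

print(all_marsupials)
print(cytb_only)
print(build_esearch_url(cytb_only))
```

Pasting either printed term into the Entrez Nucleotide search box should give a count comparable to the one obtained by clicking through the Taxonomy Browser, which is a useful cross-check on your answer.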
Responses to these questions are due via email as an MS Word file before the next lab
period. Name your file as follows: LastName_lab1.doc.