Use of genomic tools
Small Brains, BIG Ideas
Bioinformatics
The purpose of this laboratory is to become familiar with some of the tools used to
analyze DNA and protein sequences. All of these tools are available online, and most
require an active internet connection.
Databases and tools for sequence alignments
http://www.ncbi.nlm.nih.gov
This is the portal for many resources, including
PubMed (references to publications: www.pubmed.org), some books (follow the
“Books” link), and DNA, mRNA, and protein sequence databases. We will use the
following links:
Access to all databases:
http://www.ncbi.nlm.nih.gov/gquery/gquery.fcgi?itool=toolbar
Sequence databases:
http://www.ncbi.nlm.nih.gov/sites/entrez?db=nuccore
Database of homologous genes:
http://www.ncbi.nlm.nih.gov/sites/entrez?db=homologene
Site for comparing 2 sequences (“BLAST2”):
http://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE_TYPE=BlastSearch&PROG_DEF=blastn&BLAST_PROG_DEF=megaBlast&BLAST_SPEC=blast2seq
Site for comparing insect sequences (12 Drosophila species, bee, mosquito, etc):
http://flybase.org/blast/
Sites for making phylogenetic trees:
http://www.ebi.ac.uk/Tools/clustalw2/index.html
http://align.genome.jp/
Site for making the reverse complement of a sequence:
http://www.bioinformatics.org/SMS/rev_comp.html
Site for translating a sequence:
http://www.expasy.ch/tools/dna.html
Site for making primers:
http://frodo.wi.mit.edu/
Source of other sites:
http://www.bioinformaticsonline.org/links/ch_09_t_6.html
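What the reverse-complement and translation sites listed above compute can be sketched in a few lines of Python. This is only an illustration, not the code those sites run: the function names are made up for this example, the standard genetic code is assumed, and only reading frame 1 of the strand as given is translated.

```python
from itertools import product

# Complement of each base; both upper and lower case are handled.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

# Standard genetic code (assumed here), laid out in the usual T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate frame 1 of a DNA sequence; '*' marks a stop codon.

    Trailing bases that do not fill a codon are ignored, and codons
    containing anything other than A, C, G, T are shown as 'X'.
    """
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


if __name__ == "__main__":
    dna = "ATGGCCTAA"
    print(reverse_complement(dna))
    print(translate(dna))
```

To read the opposite strand, translate the output of `reverse_complement`; to read the other frames, drop one or two bases from the start of the sequence before translating.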
Exercises
The purpose of this exercise is to introduce you to the tools that are used to study the
structure of a gene, compare its sequence with that of its homologs in different animals,
etc.
1- Choose a gene (e.g., per2).
2- Find the DNA sequence of the gene.
3- Find the RNA sequence of the gene.
4- Make a Word file with the sequence. Use BLAST2 to identify the exons, introns, and
5’ and 3’ untranslated regions; identify the transcription start site and the start of
the coding region. Mark these features on the genomic sequence.
5- Use BLAST against all genomes to identify homologs using the mRNA sequences
(why use the mRNA sequence?).
6- Repeat the same exercise using the protein sequence. Identify conserved regions.
7- Make a Word file with the sequences of the homologous genes in FASTA format:
> Name 1 (sequence 1)
Sequence ………………………..
> Name 2 (sequence 2)
Sequence ………………………..
> Name 3 (sequence 3)
Sequence ………………………..
etc…..
Note:
- The “>” is critical for this format, as is the name that is on the first line. The
sequence must start on the following line.
- Some programs only consider the first letters of the name, so make sure that the
different names differ in their first letters.
- Make sure you use the Courier font (a monospaced font, i.e. one in which each
letter uses the same space).
8- Copy the text into one of the programs for making phylogenetic trees. Make a tree
first using species that are phylogenetically close, then more distant ones. Repeat the
same process using the protein sequence. Comment on the results obtained in both
cases.
9- Repeat the BLAST search you did in (5) but using only non-coding sequences (5’ and
3’ untranslated regions, introns). Investigate the conservation of these sequences in
different species and animal groups.
10- Use these tools to answer a question you find interesting.
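The FASTA layout described in step 7, together with the note about names that must differ in their first letters, can also be produced and checked with a short script instead of by hand. The sketch below is one possible way to do it; the function names, the 60-character line width, and the 10-character name prefix are assumptions chosen for illustration, since the number of letters a given tree program actually reads varies.

```python
from collections import Counter


def to_fasta(records, width=60):
    """Format (name, sequence) pairs as FASTA text.

    Each record gets a '>' header line with its name; the sequence
    starts on the following line and is wrapped at `width` characters.
    """
    lines = []
    for name, seq in records:
        lines.append(f">{name}")
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


def clashing_prefixes(names, prefix_len=10):
    """Return the name prefixes shared by more than one record.

    `prefix_len` is an assumed cutoff; adjust it to the program you use.
    """
    counts = Counter(name[:prefix_len] for name in names)
    return [prefix for prefix, count in counts.items() if count > 1]


if __name__ == "__main__":
    # Placeholder names and sequences, not real database entries.
    records = [
        ("Name1_species_A", "ATGGCCATTGTAATGGGCCGC"),
        ("Name2_species_B", "ATGGCTATTGTCATGGGACGC"),
    ]
    print(to_fasta(records, width=10))
    print(clashing_prefixes([name for name, _ in records]))
```

An empty list from `clashing_prefixes` means every name is distinguishable within the assumed prefix length; otherwise, rename the records, for example by putting the species abbreviation first.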