
Bioinformatics Worksheet Instructions
Biology 22
Winter 2009
Helpful hint: Keep the following items open while you navigate the databases:
A. For quick links to the databases, use the Bioinformatics Tools link on the Course page of Dr. Colavito’s homepage.
B. For an online version of the Bioinformatics worksheet, open the version posted on the Lecture Notes page of Dr. Colavito’s homepage.
C. To store data as you retrieve it from the databases, copy and paste the information into a text file.
1. Accessing the genomic sequence to study
Database: National Center for Biotechnology Information (NCBI)
Go To: http://homepage.smc.edu/colavito_mary/biology22/course.htm
Steps to Take:
• Choose the Bioinformatics Tools link
• Choose the sequence number shown on your worksheet
• Copy the nucleotide sequence as it appears on the screen and save it for future use (#2).
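Optional: sequence text copied from the screen may carry position numbers, spaces, and line breaks. The sketch below shows one way to strip these and count the nucleotides. The function name `clean_sequence` and the example text are made up for illustration, and the numbered layout is an assumption about how the sequence appears on the page.

```python
import re

def clean_sequence(raw):
    """Drop everything that is not a letter (numbers, spaces, line breaks)
    and return the remaining bases in uppercase."""
    return re.sub(r"[^A-Za-z]", "", raw).upper()

# Made-up example text in a numbered, space-separated layout (an assumption).
raw_text = """
        1 gatcctccat atacaacggt atctccacct caggtttaga
       41 tctcaacaac ggaaccattg
"""

sequence = clean_sequence(raw_text)
print(sequence)
print("Number of nucleotides:", len(sequence))
```

The length printed here is the kind of value asked for under "Number of Nucleotides in the genomic sequence"; check it against the length shown in the database entry.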
2. Locating the gene within the genomic sequence
Search Tool: Genscan at MIT
Go To: http://genes.mit.edu/GENSCAN.html
Steps to Take:
• Paste the nucleotide sequence from Step 1 into the white query box
• Click on Run Genscan
• Record the number of exons predicted from the scan
• Copy the Results information for future use (#3 and #7).
• Copy the largest predicted peptide sequence from the results and save it in your text file
Information for Summary Tables:
• Number of Nucleotides in the genomic sequence
• Number of exons
• Intron-Exon Junction Comparison (see #7)
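Optional: if the scan predicts more than one peptide, the peptides can be ranked by length so that the largest one (and the next largest, should it be needed in #3) is easy to find. This is a minimal sketch that assumes the predicted peptides were saved in a FASTA-like layout, with each peptide introduced by a line starting with ">". The record names and sequences are invented, and `parse_fasta` and `rank_by_length` are hypothetical helper names.

```python
def parse_fasta(text):
    """Return a dict mapping each '>' header (without the '>') to its sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records

def rank_by_length(records):
    """Return (name, sequence) pairs sorted from longest to shortest."""
    return sorted(records.items(), key=lambda item: len(item[1]), reverse=True)

# Invented example records, not real Genscan output.
saved_text = """
>predicted_peptide_1
MKTAYIAKQR
QISFVKSHFS
>predicted_peptide_2
MSLLT
"""

ranked = rank_by_length(parse_fasta(saved_text))
for name, peptide in ranked:
    print(name, len(peptide))
```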
3. Identifying the predicted protein sequence
Search Tool: BLAST at NCBI (Basic Local Alignment Search Tool)
Go To: http://blast.ncbi.nlm.nih.gov/Blast.cgi
Steps to Take:
• Under Basic BLAST, choose protein blast
• Paste the predicted peptide sequence in the query box
• Under Choose Search Set, select Swissprot protein sequences
• Under Program Selection, choose blastp algorithm (protein-protein BLAST)
• Click on the blue BLAST button at the bottom of the page
• Scroll down the results page to “sequences producing significant alignments”
• Click the link next to the human sequence with the highest score [If there is no high-scoring human sequence, go back to the Genscan output, select the next largest protein and repeat the BLAST analysis]
• Copy the amino acid sequence generated by the BLAST search to use in the future (#4a)
• Note the abbreviation for the gene name under the “Features” heading on the sequence page for future use (#5 and #6).
Information for Summary Tables:
• Number of amino acids (Be sure to take this from the protein sequence detected by the BLAST search rather than the one predicted by Genscan)
• Protein Name
4a. Determining the functional domains of the protein
Search Tool: Pfam at the Sanger Institute
Go To: http://pfam.sanger.ac.uk/search
Steps to Take:
• Paste the saved human sequence from #3 into the white query box
• Click on the “Go” button to the right of the box
• On the results page, note the rows highlighted in green, showing significant matches for domains of the protein
• Click on the link on the far left of the row to find a description of the protein domain
Information for Summary Tables:
• Protein Domains and their Functions
• Overall Function of Protein in the Cell
4b. Alternative method for determining overall protein function (Use these steps only if step 4a didn’t provide this information)
Database: Universal Protein Resource (UniProt) at European Bioinformatics Institute, Swiss Institute of Bioinformatics and Protein Information Resource
Go To: http://www.uniprot.org/
Steps to Take:
• Under “Search In” choose Protein Knowledgebase
• Type the protein name in the Query box and select search
• Choose an entry related to humans and click on the accession number
• Note both the general information and material presented by keyword
Information for Summary Table:
• Overall Function of Protein in the Cell
5. Determining Chromosomal Location and Identifying Inherited Disease Relationship
Database: Online Mendelian Inheritance in Man (OMIM) at NCBI
Go To: http://www.ncbi.nlm.nih.gov/sites/entrez?db=OMIM
Steps to Take:
• Type the name of the gene in the search box and click Go
• Select an entry that matches the gene name
• Make note of the abbreviation of the gene name for future use (#6).
• Note the “Gene Map Locus” listed with the entry
• Note the name of the disease listed with the entry
• Read the “Genotype/Phenotype Correlations” and/or “Allelic Variants” portions of the entry to learn how the disease is inherited
Information for Summary Table:
• Gene Location on the Chromosome
• Related Disease
• Inheritance Pattern for Disease Development
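Optional: a gene map locus is usually written as a chromosome number (or X/Y), an arm letter (p for the short arm, q for the long arm), and a band number. The sketch below splits a locus written in that simple form into its parts. The example strings are illustrations only, the helper name `parse_locus` is hypothetical, and loci written as ranges are deliberately not handled.

```python
import re

# Simple form only: chromosome, arm (p or q), band with optional sub-band.
LOCUS_PATTERN = re.compile(r"^(\d{1,2}|X|Y)([pq])(\d+(?:\.\d+)?)$")

def parse_locus(locus):
    """Split a locus such as '7q31.2' into chromosome, arm, and band.
    Returns None when the text is not in the simple form."""
    match = LOCUS_PATTERN.match(locus.strip())
    if match is None:
        return None
    chromosome, arm, band = match.groups()
    arm_name = "short (p)" if arm == "p" else "long (q)"
    return {"chromosome": chromosome, "arm": arm_name, "band": band}

print(parse_locus("7q31.2"))
print(parse_locus("Xp21.2"))
```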
6. Detecting Single Nucleotide Polymorphisms
Search Tool: National Center for Biotechnology Information (NCBI)
Go To: http://www.ncbi.nlm.nih.gov/sites/entrez?db=snp
Steps to Take:
• Type the name of the gene (usually a three-letter code) in the query box and click Go
• Select the “Human” tab above the search results
• Under any of the entries, select “Gene View”
Information for Summary Table:
For any missense mutation you observe, provide the:
• Exon Number
• mRNA position
• Change in nucleotide at that position
• Amino Acid position
• Change in amino acid at that position
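Optional: a missense mutation is a single-nucleotide change that swaps one amino acid for another. The sketch below shows how a nucleotide change maps to a codon and an amino acid change using the standard genetic code. It assumes the position is counted from the first base of the start codon within the coding sequence, which may differ from the mRNA position reported by the database. The function name `classify_snp` and the short example sequence are made up for illustration.

```python
BASES = "TCAG"
# Standard genetic code, listed in TCAG order for each codon position.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def classify_snp(coding_seq, cds_position, new_base):
    """Classify a substitution at a 1-based position in a coding sequence.
    Assumption: position 1 is the A of the ATG start codon."""
    seq = coding_seq.upper().replace("U", "T")
    index = cds_position - 1
    offset = index % 3                 # position of the change within the codon
    codon_start = index - offset
    old_codon = seq[codon_start:codon_start + 3]
    new_codon = old_codon[:offset] + new_base.upper() + old_codon[offset + 1:]
    old_aa = CODON_TABLE[old_codon]
    new_aa = CODON_TABLE[new_codon]
    if old_aa == new_aa:
        kind = "synonymous"
    elif new_aa == "*":
        kind = "nonsense"
    else:
        kind = "missense"
    return {
        "amino_acid_position": codon_start // 3 + 1,
        "codon_change": old_codon + ">" + new_codon,
        "amino_acid_change": old_aa + ">" + new_aa,
        "type": kind,
    }

# Made-up coding sequence: ATG GAG GTT (Met-Glu-Val).
result = classify_snp("ATGGAGGTT", 5, "T")
print(result)
```

In this example the change at position 5 turns codon GAG into GTG, so amino acid 2 changes from E (glutamate) to V (valine).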
7. Studying Intron-Exon Junctions
Database: National Center for Biotechnology Information (NCBI)
Go To: http://homepage.smc.edu/colavito_mary/biology22/course.htm
Steps to Take:
• Choose the Bioinformatics Tools link
• Choose GenBank reference
• Select the sequence number shown on your worksheet
• Use this numbered version of your sequence to perform the intron-exon junction comparison
• Choose any two consecutive exons from the Genscan results in #2.
• Examine the Intron-Exon Junctions at the 5’ and 3’ ends of the intron between these two exons
• Compare these to the consensus sequence shown in your textbook (duplicated in the figure presented below)
Information for Summary Table:
• Exon Numbers
• Sequence of Nucleotides at junctions
• Comparison to consensus sequences
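Optional: most introns begin with GT at the 5’ end and finish with AG at the 3’ end. The sketch below pulls out the intron between two consecutive exons and checks those two dinucleotides only; the consensus in your textbook may extend further on each side. It assumes exon coordinates are 1-based, inclusive, and on the forward strand. The function name `intron_junctions` and the example sequence are made up for illustration.

```python
def intron_junctions(genomic_seq, upstream_exon_end, downstream_exon_start):
    """Return the intron between two consecutive exons and its end dinucleotides.
    Coordinates are 1-based and inclusive on the forward strand (assumption)."""
    # The intron runs from the base after the upstream exon
    # to the base before the downstream exon.
    intron = genomic_seq[upstream_exon_end:downstream_exon_start - 1].upper()
    five_prime = intron[:2]
    three_prime = intron[-2:]
    return {
        "intron_length": len(intron),
        "five_prime": five_prime,
        "three_prime": three_prime,
        "matches_gt_ag": five_prime == "GT" and three_prime == "AG",
    }

# Made-up sequence: exon 1 = positions 1-6, intron = 7-18, exon 2 = 19-24.
example = "ATGGCC" + "GTAAGTTTTCAG" + "GGATAA"
junctions = intron_junctions(example, upstream_exon_end=6, downstream_exon_start=19)
print(junctions)
```

For exons predicted on the reverse strand, the sequence would first need to be reverse-complemented; that case is not covered here.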