ISU_Bioinformatics_Assignment

Bioinformatics Assignment
(BLUE is the same assignment done on the Biology Workbench)
1. Translation of DNA into Amino Acid sequence
Log on to the ISU Bioinformatics Portal (contact Mike Thomas, mthomas@isu.edu)
Username: #####
Password: #####
On the left side you will see a listing of nearly 20 software programs. Click on the file labeled
EMBOSS, click on Nucleic, click on Translation, click on transeq
Below input Section, enter your DNA sequence in the box marked Actual data
In the advanced Section, under Frame(s) to translate, choose All six frames
Scroll back up to the top, change the default email address to your own (otherwise I will delete
any emails sent to me)
To begin the search, click on Submit transeq at the top of the page
When your results are returned, you will see at the top of the page
Results: click on outseq.out
You will see 6 different results, one for each reading frame. An asterisk (*) in a sequence
marks a stop codon. For this assignment, immediately disregard any sequence that contains
an *. Copy the remaining sequences and proceed to the next step.
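For readers who want to see what the six-frame translation does, here is a minimal Python sketch. It is not the transeq program itself: it assumes the standard genetic code, the function names (translate, six_frames, open_frames) are made up for illustration, and the frame labels (+1 to +3, -1 to -3) are this sketch's own and may not match the numbering transeq prints.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate codon by codon; '*' is a stop codon, 'X' an unrecognized codon."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frames(dna):
    """Return the three forward and three reverse-complement translations."""
    dna = "".join(dna.split()).upper()          # drop whitespace and line breaks
    reverse = dna.translate(COMPLEMENT)[::-1]   # reverse complement strand
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(reverse[offset:])
    return frames


def open_frames(dna):
    """Keep only the frames whose translation contains no stop codon."""
    return {name: protein for name, protein in six_frames(dna).items()
            if "*" not in protein}
```

Calling open_frames on your DNA sequence returns only the translations without an asterisk, which are the ones to carry into the BLAST step.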
1. Translation of DNA into Amino Acid sequence (Biology Workbench)
Go to workbench.sdsc.edu, and register/log in.
Click on Session Tools, and then select “Start New Session”.
Give your session a name such as “Unknown Id” and click on the “Start New Session”
button.
Click on Nucleic Tools, and then select “Add New Nucleic Sequence” and “Run”.
Put a description of your sequence (such as Unknown Sequence 1) into the “Label:” box.
Copy and paste your sequence into the “Sequence:” box.
Scroll down the page and click on “Save”; you will end up back in the Nucleic Tools
window.
Select your sequence by clicking in the check box.
Scroll to the bottom of the available tools and select “SIXFRAME” and then “Run”.
You will be taken to the options page; click on the “Submit” button. The next screen shows
you the amino acid translations of your nucleotide sequence in all six reading frames.
Select the sequences that contain no * (an * indicates a stop codon) by clicking on their
check boxes. Then click on “Import Sequence(s)” at the bottom of the page.
Your translated sequence(s) are now in the Protein Tools section of the Workbench.
2. BLAST Protein search
Log on to the following web address:
http://www.ncbi.nlm.nih.gov/BLAST/
Under Protein, click on Protein-protein BLAST (blastp)
Enter your amino acid sequence in the box labeled Search
Click on BLAST!, then on the next screen click on Format!
This may take a few seconds to search all the available databases, so please wait patiently
Your file will be returned with a match to known amino acid sequences.
As you scroll down the screen you will see
Distribution of Blast Hits on the Query Sequence, keep scrolling down until you find
Related Structures
Sequences producing significant alignments:
Each sequence has a score (bits), which tells you how closely aligned the 2 sequences are (a
higher score means a closer match). Clicking on the Score takes you down to the match.
Each match appears as follows:
Query: 1
Sbjct: 300 (or some other number)
Query is the sequence you submitted for analysis. Sbjct is the match that was found. The middle
line is the computer’s attempt to align them for you.
Find 10 different species (if available, use human as one species) that share similarity to the
submitted sequence. Copy and paste each Sbjct sequence into a new file to be used for
phylogenetic analysis. Each species must have a ONE word name; if the name consists of more
than one word, please use a one word abbreviation of 8 characters or fewer. Do NOT use the
Latin name of genus and species; find the common name. The format must be as follows for
the sequences to be used in the next step:
>species name(return)
Amino acid sequence(return)
>species name(return)
Amino acid sequence(return)
>human
ANSNCVMFKLGIRKMRL
>frog
ANSDHYMKLGIKMRL
Write all 10 sequences like this!!
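If you would rather build the file with a short script than by hand, the format above can be sketched in Python as follows. The function name to_fasta is hypothetical, and the check for repeated names is an assumption based on the requirement of 10 different species.

```python
def to_fasta(records):
    """Format (name, sequence) pairs as '>name' lines followed by the sequence.

    Enforces the naming rule above: one word, 8 characters or fewer.
    Rejecting repeated names is an assumption of this sketch.
    """
    lines = []
    seen = set()
    for name, sequence in records:
        if not name or len(name) > 8 or any(ch.isspace() for ch in name):
            raise ValueError(f"name must be one word of 8 characters or fewer: {name!r}")
        if name in seen:
            raise ValueError(f"species listed twice: {name!r}")
        seen.add(name)
        lines.append(f">{name}")
        # Remove spaces and line breaks picked up when copying from the BLAST page.
        lines.append("".join(sequence.split()).upper())
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_fasta([("human", "ANSNCVMFKLGIRKMRL"), ("frog", "ANSDHYMKLGIKMRL")]), end="")
```

A name such as "sea urchin" raises an error, which is a reminder to shorten it to a single word before the ClustalW step.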
2. BLAST protein search (Biology Workbench)
Select one of your translated sequences by clicking on the check box.
Scroll through the list of tools, and select “BLASTP”, then click on “Run”.
From the list of databases, select “Genpept Full Release” and “Genpept Updates”. Do not
change any other options on this page.
Scroll down the page and click on “Submit”.
In less than a minute, your results will appear on the screen. As you scroll down the screen
you will see “Sequences producing significant alignments:” Each sequence has a score
(bits), which tells you how closely aligned the 2 sequences are (a higher score means a
closer match). Clicking on the Score takes you down to the match. Each match appears as follows:
Query: 1
Sbjct: 300 (or some other number)
Query is the sequence you submitted for analysis. Sbjct is the match that was found. The
middle line is the computer’s attempt to align them for you.
If the Expect value associated with the first Sbjct is not close to 0, then the reading frame is not
correct, and you will need to repeat the above steps with your other translated sequences.
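To see why a good match has an Expect value close to 0, it helps to know how the Expect value relates to the bit score. In BLAST's statistics the expected number of chance hits is approximately E = m × n × 2^(−S), where S is the bit score, m the query length, and n the database length. The Python sketch below applies that relation; the function name and the example lengths are hypothetical, and BLAST itself uses adjusted ("effective") lengths, so its reported values will differ somewhat.

```python
def expect_value(bit_score, query_length, database_length):
    """Approximate number of chance hits scoring at least bit_score.

    Uses E = m * n * 2**(-S); lengths are in residues.
    """
    return query_length * database_length * 2.0 ** (-bit_score)


if __name__ == "__main__":
    # Hypothetical 100-residue query against a 1,000,000-residue database.
    for score in (10, 30, 50, 100):
        print(score, expect_value(score, 100, 1_000_000))
```

Each additional bit of score halves the Expect value, so a translation in the wrong reading frame, which scores poorly, ends up with an Expect value far from 0.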
Find 10 different species (if available, use human as one species) that share similarity to the
submitted sequence. Scroll through the table at the top of the screen, and click on the
check box for each sequence that you wish to select. Then click on the “Import
Sequences” button.
Select the ten sequences from the BLAST search by clicking on their check boxes.
Select “Edit Protein Sequence(s)” and click on “Run”.
Replace the Genpept:### that follows the > in the “Sequence:” box with the common name
of the species from which it comes. Each species must have a ONE word name; if the
name consists of more than one word, please use a one word abbreviation of 8
characters or fewer. Do NOT use the Latin name of genus and species; find the common
name. Be careful to keep the > symbol exactly where it is.
When you have changed all of the Sequence files, click on the “Save” button, which can be
found at either the top or the bottom of the screen.
3. Phylogenetic Analysis
Back on the ISU Bioinformatics Portal, on the left side, click on Clustalw, click on clustalw
Under Number 2. Actual data, paste in your 10 sequences in the format above
In the next box labeled Actions, use the drop down menu to choose
-tree: calculate NJ tree
Scroll down, under Multiple Alignments Parameters
The second drop down box asks you to choose between Protein or DNA (-type), choose protein
Scroll back up to the top of the page, change the default Email to your own, and click on submit
clustalw
If you did everything correctly, you will see under Results: a file named infile.ph
Below this file is a drop down box, choose drawgram
Click on the box next to this which reads Run the selected program on infile.ph
A new screen will pop up, under Drawgram options
Choose the Tree style named P: Phenogram
Scroll down to Drawgram Options
Next to the box labeled Which plotter or printer will the tree be drawn on
Choose P: PCX file format
Scroll back to the top of the page and click on Run drawgram
Your results will appear in a file named plotfile.pcx
Save this file to your computer.
3. Phylogenetic Analysis (Biology Workbench)
Select your newly edited sequences, which are now labeled “edited”.
Scroll through the Tools, and select “CLUSTALW”, then click on “Run”.
Do not change the options; click on “Submit”.
In less than a minute, you will see the aligned sequences. These sequences can be copied
and pasted into a word document, although you will lose the color. Click on the
“Import Alignment” button.
Your aligned sequences are now in the Alignment Tools section. Select your alignment,
scroll to and select “DRAWGRAM” and click on “Run”.
On the options page change “Exclude positions with gaps:” from no to yes and “Correct for
multiple substitutions:” from no to yes. Click on the submit button.
You can download a postscript version of your tree. Click on the “Return” button when
you are done.
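The two options changed above affect how distances between sequences are computed before the tree is drawn. The Python sketch below illustrates the idea under stated assumptions: "excluding positions with gaps" is taken to mean dropping every alignment column in which any sequence has a gap, and "correcting for multiple substitutions" is taken to be the Kimura-style formula K = −ln(1 − D − D²/5) that ClustalW's documentation describes for its tree option. The function names are hypothetical, and this is not the Workbench's own code.

```python
import math


def drop_gap_columns(alignment):
    """Remove every column in which at least one sequence has a gap ('-')."""
    names = list(alignment)
    columns = [col for col in zip(*(alignment[n] for n in names)) if "-" not in col]
    return {name: "".join(col[i] for col in columns) for i, name in enumerate(names)}


def p_distance(seq_a, seq_b):
    """Fraction of aligned positions at which the two sequences differ."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    differences = sum(a != b for a, b in zip(seq_a, seq_b))
    return differences / len(seq_a)


def kimura_distance(p):
    """Correct an observed distance p for multiple substitutions at one site."""
    inner = 1.0 - p - p * p / 5.0
    if inner <= 0:
        raise ValueError("sequences too divergent for this correction")
    return -math.log(inner)


if __name__ == "__main__":
    # Made-up aligned fragments, for illustration only.
    alignment = {"human": "ANSNCVMF", "frog": "ANSD-VMF", "mouse": "ANSNCVMY"}
    trimmed = drop_gap_columns(alignment)
    p = p_distance(trimmed["human"], trimmed["frog"])
    print(p, kimura_distance(p))
```

The corrected distance is always at least as large as the observed one, because some sites will have changed more than once since two species diverged.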
Do NOT print your assignment; send it to me via email. Please include the following
information in your assignment:
1. The name of your protein
2. An alignment of each of 10 different species with differences shown in bold
or a different color (each sequence is compared only to the human, not to each
other)
3. A phylogenetic tree showing the evolutionary relatedness of each species
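Finding the differences for item 2 by eye is error-prone, so a short script can locate them first. The Python sketch below compares one aligned sequence against the human sequence and wraps each differing position in square brackets; the function name and the bracket markers are choices made for this illustration, and you would still apply the bold or color yourself in your document.

```python
def mark_differences(reference, other, open_mark="[", close_mark="]"):
    """Return `other` with every position that differs from `reference` bracketed.

    Both sequences must come from the same alignment, so they have equal length.
    """
    if len(reference) != len(other):
        raise ValueError("sequences must be aligned to the same length")
    marked = []
    for ref_residue, residue in zip(reference, other):
        if residue == ref_residue:
            marked.append(residue)
        else:
            marked.append(f"{open_mark}{residue}{close_mark}")
    return "".join(marked)


if __name__ == "__main__":
    # Made-up aligned fragments, for illustration only.
    human = "ANSNCVMF"
    print(mark_differences(human, "ANSD-VMF"))
```

Run it once per species, always with the human sequence as the reference, since each sequence is compared only to the human.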