BLAST SEARCHING - UNC Center for Bioinformatics

BLAST SEARCHING
BLAST (Basic Local Alignment Search Tool) is a set of similarity search programs designed to
explore all of the available sequence databases regardless of whether the query is protein or
DNA. The BLAST programs have been designed for speed, with a minimal sacrifice of
sensitivity to distant sequence relationships. The scores assigned in a BLAST search have a well-defined statistical interpretation, making real matches easier to distinguish from random
background hits. BLAST uses a heuristic algorithm which seeks local as opposed to global
alignments and is therefore able to detect relationships among sequences which share only isolated
regions of similarity (Altschul et al., 1990).
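To make the local-versus-global distinction concrete, here is a minimal sketch of local alignment scoring by dynamic programming (the Smith-Waterman method). This is not the BLAST heuristic itself, which trades this exhaustive search for speed. The function name, the example sequences, and the match, mismatch, and gap values are all assumptions chosen for illustration.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Exhaustive Smith-Waterman scoring with a linear gap penalty.
    The scoring values are illustrative, not BLAST defaults.
    """
    prev = [0] * (len(b) + 1)  # scores for the previous row of the matrix
    best = 0
    for ca in a:
        curr = [0]  # column 0 is always 0 for a local alignment
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            # Clamping at 0 lets an alignment start anywhere, which is
            # what makes the alignment local rather than global.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


# Two otherwise unrelated sequences that share one isolated region (GATTACA).
shared = local_alignment_score("TTTTTTGATTACATTTTTT", "CCCCCGATTACACCCCC")
unrelated = local_alignment_score("AAAA", "CCCC")
```

The shared region scores highly even though the flanking sequence does not match at all. A global alignment would be penalized for every one of those flanking mismatches.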
The NCBI website has a wealth of information regarding the BLAST tools available. The main
site can be found at http://www.ncbi.nlm.nih.gov/BLAST. For those who are new to BLAST
searching, the BLAST Guide is probably the most helpful resource. The guide details the entire
process from start to finish and explains what is going on at every step. Once you are familiar
with BLAST, the BLAST FAQ will answer more specific questions and aid in troubleshooting
your search.
When you are ready to perform your own BLAST search, the BLAST Tutorial may be very
helpful. The tutorial walks you through every step involved with submitting a custom-configured
BLAST query.
GCG/SeqLab BLAST Tutorial:
BLAST is available through GCG/SeqLab. GCG provides two commands: BLAST, which runs a
local search, and NetBLAST, which runs remote searches only and is currently unavailable.
Use the BLAST command for this tutorial.
We will use a sequence from the Swissprot database. Start SeqLab by running the export
DISPLAY=<ip number>:0 command followed by the “seqlab &” command. Load the ABP protein
sequence (accession number P08689) from the Swissprot database into a list.
Highlight the ABP sequence and move your cursor to the Functions menu. Scroll down and
select “Database Sequence Searching” and then select “Blast”.
You have the selection of searching a protein or nucleotide database. These selections are the
same as described on the NCBI website (i.e. blastn, blastp, blastx, tblastn, tblastx). You can
choose your search set by clicking on the “Search Set...” button.
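As a quick reference for choosing among the five programs, the sketch below lists the kind of query and database each one compares. The dictionary layout and the helper name programs_for are assumptions made for illustration and are not part of GCG or NCBI.

```python
# Each program maps to (query type, database type, how they are compared).
BLAST_PROGRAMS = {
    "blastn": ("nucleotide", "nucleotide", "compared directly"),
    "blastp": ("protein", "protein", "compared directly"),
    "blastx": ("nucleotide", "protein", "query translated in six frames"),
    "tblastn": ("protein", "nucleotide", "database translated in six frames"),
    "tblastx": ("nucleotide", "nucleotide", "query and database both translated"),
}


def programs_for(query_type, database_type):
    """Return the program names that accept this query/database pairing."""
    return sorted(
        name
        for name, (query, database, _note) in BLAST_PROGRAMS.items()
        if query == query_type and database == database_type
    )


# The ABP sequence P08689 is a protein, so a protein database calls for blastp.
print(programs_for("protein", "protein"))
```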
Next, click on the Options button. Be sure that the filters have been selected and that you are
using the Blosum62 matrix.
Close the Options menu and click on the Run button.
The results of this BLAST search can be found in the linked file blast.txt.
Vector NTI BLAST Search Tutorial:
The first step to performing a BLAST search using Vector NTI is to import your sequence into
the Vector NTI database. For this search we will use the NCBI Entrez Browser to download a
sequence and then import it into Vector NTI.
Begin by opening your browser and navigating to http://www.ncbi.nlm.nih.gov.
In the drop-down menu on the search bar, select “Protein”, and in the box to the right type in
P08689. Click the Go button.
On the resulting page, click the link to the sequence P08689.
On the resulting page, select “GenPept” from the drop-down display menu, and click “Display”.
Next, click on the “Send to” button with “File” selected. A dialog box will appear. Choose Save
and save the file to your desktop.
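If you prefer to script the download, the same GenPept record can be requested through NCBI's E-utilities efetch service. The following is a minimal sketch, assuming network access. The function names and the output file name are hypothetical, and whether Vector NTI accepts the saved file exactly as the browser download is not confirmed here.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base address of the NCBI E-utilities efetch service.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def genpept_url(accession):
    """Build the efetch URL for a protein record in GenPept flat-file format."""
    params = {
        "db": "protein",     # the Entrez Protein database
        "id": accession,     # e.g. the ABP accession P08689
        "rettype": "gp",     # GenPept format
        "retmode": "text",   # plain text rather than XML
    }
    return EFETCH_URL + "?" + urlencode(params)


def save_genpept(accession, path):
    """Download the record and write it to path (requires network access)."""
    with urlopen(genpept_url(accession)) as response, open(path, "wb") as out:
        out.write(response.read())


url = genpept_url("P08689")
# save_genpept("P08689", "P08689.gp")  # hypothetical file name; uncomment to download
```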
Now, start up Vector NTI. When the VNTI Database program opens, drag the sequence file you
just saved from the Desktop to the Database window. A window will then appear.
Select OK, find the P08689 sequence in the database and double-click on it to open it in Vector
NTI.
Now that the sequence has been imported, we can begin the BLAST search. From the Tools
menu, select BLAST Search, and in the resulting pop-up window, select “Whole Sequence”.
Next, a window will appear prompting you to choose a BLAST server. Select NCBI and hit OK.
You will now see the BLAST Search program with the P08689 sequence loaded.
For this search we will use the blastp program included in the BLAST 2.0 suite. These options
should already be selected. From the database menu, select nr. This ensures that the query sequence is compared against all non-redundant protein sequences. Also, take a moment to
look over the parameters tab. We will leave the default options for this search, but take note that
these options are the same as the ones given on NCBI’s BLAST website. The PSI, PHI and
MEGABLAST tabs will be inactive unless you are performing a PSI, PHI, or MEGABLAST search.
To perform the search, simply click Submit once you are ready. The program will place your
search in NCBI’s queue, and it will retrieve the results once they are ready. It is not necessary to
leave the Blast Search program running while the search is being performed. You may close the
program and bring it back up to retrieve your results at your convenience.
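The same submit-then-retrieve pattern is exposed by NCBI's BLAST URL API, where a search is submitted with CMD=Put and its status is polled with CMD=Get. The sketch below only builds the request parameters and parses the replies. It is not a description of what Vector NTI does internally, and the helper names and the sample reply text are assumptions for illustration.

```python
import re

# Endpoint of the NCBI BLAST URL API.
BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def build_submit_params(query, program="blastp", database="nr"):
    """Parameters that place a search in NCBI's queue."""
    return {"CMD": "Put", "PROGRAM": program, "DATABASE": database, "QUERY": query}


def build_status_params(rid):
    """Parameters that ask whether the search with this request ID is done."""
    return {"CMD": "Get", "FORMAT_OBJECT": "SearchInfo", "RID": rid}


def parse_rid(reply):
    """Extract the request ID (RID) from the reply to a Put request."""
    found = re.search(r"RID\s*=\s*(\S+)", reply)
    return found.group(1) if found else None


def parse_status(reply):
    """Extract the status (e.g. WAITING or READY) from a SearchInfo reply."""
    found = re.search(r"Status\s*=\s*(\w+)", reply)
    return found.group(1) if found else None


# Illustrative reply fragments, not captured from a real search.
sample_put_reply = "QBlastInfoBegin\n    RID = ABC123XYZ\n    RTOE = 30\nQBlastInfoEnd"
sample_get_reply = "QBlastInfoBegin\n    Status=READY\nQBlastInfoEnd"

rid = parse_rid(sample_put_reply)
status = parse_status(sample_get_reply)
```

Because the request ID identifies the search on NCBI's side, a client can stop and later resume polling with the saved ID, much as the Blast Search program can be closed and reopened.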
Once your results are ready, the status of your search will change to “Finished”.
Double-click on the icon to bring up the results.