Parsing BLAST outputs

Output of a local BLAST search
“less” program
Full path to the BLAST output file
BLAST program used for the search
Reference
Information of the query sequence
Information of the database
One-line summary of
the search results
Detailed information for the
first 2 HSPs of the first hit:
Accession number, description,
organism, score, E value,
identities, positives, and
alignment
Sample BLAST output (continued)
HSP information from
the first hit
Press “q” to quit the “less” viewing mode
The size of the BLAST output is limited only
by the free disk space on your computer.
It is virtually impossible to open a very large
text file in an editor, let alone go through
the file line by line.
The purpose of parsing BLAST output is to
extract user-defined information from the
BLAST output file for clear visualization and
summarization.
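Before parsing, a few standard commands can summarize a large report without opening it in an editor. The following is a minimal sketch: the miniature report it creates is made up for illustration, and it assumes the plain-text BLAST format, in which each query begins with a line starting with "Query=".

```shell
# Hypothetical miniature report standing in for a real BLAST text output.
report="$(mktemp)"
printf 'BLASTN\n\nQuery= seq1\n\nQuery= seq2\n' > "$report"

# Count lines and queries without loading the file into an editor.
n_lines=$(wc -l < "$report" | tr -d ' ')
n_queries=$(grep -c '^Query=' "$report")
echo "lines: $n_lines  queries: $n_queries"
```

On a real report, replace "$report" with the full path to the BLAST output file.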
Search result parsing
The Bio::SearchIO system (part of BioPerl) was
designed for parsing the output of sequence
database searches (BLAST, sim4, waba, FASTA,
HMMER, exonerate, etc.)
One-line summary of
the search results
Load Bio::SearchIO module
Usage information
It will appear if the program
is invoked without arguments
Define the class
Print out the header information
Process each result
Process each hit
Process each HSP
Control for the number of hits to be extracted
Indicator showing the work is done
Change directory (cd)
to where the Perl
script and the BLAST
output file are stored
Confirm that the Perl
script and the BLAST
output are in place
Oops… an error message
It is caused by the different line
endings used by Windows and Unix.
Find the file in Windows
and open it with Notepad++
Select “convert to UNIX format” in
the “Format” drop-down menu
After the conversion, save the file
and exit Notepad++
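The same conversion can also be done on the Unix command line. This is a minimal sketch, not the method shown in the slides: the script name parse_blast.pl and its contents are hypothetical, and the example works in a scratch directory so nothing real is overwritten.

```shell
work="$(mktemp -d)"
# Hypothetical script saved with Windows (CRLF) line endings.
printf '#!/usr/bin/perl\r\nprint "ok\\n";\r\n' > "$work/parse_blast.pl"

# Count lines that contain a carriage return; non-zero means CRLF endings.
cr=$(printf '\r')
crlf_before=$(grep -c "$cr" "$work/parse_blast.pl")

# Strip carriage returns to get Unix (LF) line endings.
tr -d '\r' < "$work/parse_blast.pl" > "$work/parse_blast.unix.pl"

# grep exits non-zero when nothing matches, so "|| true" keeps the count of 0.
crlf_after=$(grep -c "$cr" "$work/parse_blast.unix.pl" || true)
echo "CRLF lines before: $crlf_before, after: $crlf_after"
```

Writing to a second file and checking it before replacing the original avoids losing the script if something goes wrong.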
Another error message
This is because the Perl
interpreter has been installed
in another location (/usr/bin/)
while the script is looking for
the Perl interpreter in
/usr/local/bin
Solution:
Create a symbolic link to /usr/bin/perl in /usr/local/bin
Command:
ln -s /usr/bin/perl /usr/local/bin/perl
Now it’s working!
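The symbolic-link fix can be tried safely before touching system directories. The sketch below is an illustration under assumptions: it uses a scratch directory as a stand-in for /usr/local/bin, because creating the real link usually requires administrator rights.

```shell
# Locate the installed interpreter; fall back to the path named in the slide.
perl_path=$(command -v perl || echo /usr/bin/perl)

# Scratch directory standing in for /usr/local/bin (hypothetical, avoids needing root).
bindir="$(mktemp -d)"
ln -s "$perl_path" "$bindir/perl"

# Show where the link points.
link_target=$(readlink "$bindir/perl")
echo "$bindir/perl -> $link_target"
```

Once the link behaves as expected, the same ln -s command can be run with /usr/local/bin/perl as the link name.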
This is the file you’ve
just generated.
Congratulations! You’ve just parsed a BLAST output!
Let’s see what the file looks like, using “less”.
Here is what it looks like.
The parsed output is
tab-delimited and can
be imported into Excel
for better visualization.
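A tab-delimited file can also be checked on the command line before importing it into Excel. This is a rough sketch only: the column names and the two hits below are invented for illustration and do not reproduce the layout written by the script in the slides.

```shell
tsv="$(mktemp)"
# Hypothetical parsed output: header row plus two made-up hits.
printf 'Query\tAccession\tE_value\tDescription\n' > "$tsv"
printf 'seq1\tACC0001\t1e-50\texample protein A\n' >> "$tsv"
printf 'seq1\tACC0002\t3e-20\texample protein B\n' >> "$tsv"

# Count the columns in the header row.
n_cols=$(head -n 1 "$tsv" | awk -F '\t' '{ print NF }')

# Keep only the query, accession and E value columns, skipping the header.
top_hits=$(tail -n +2 "$tsv" | cut -f 1-3)
echo "columns: $n_cols"
echo "$top_hits"
```

cut splits on tabs by default, so no delimiter option is needed for this kind of file.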
Locate the file in
Windows
Header row
Query sequence
Accession numbers of the top 3 hits
E values of the top 3 hits
Descriptions of the top 3 hits
Information of each HSP of the
top 3 hits