Frequently Asked Questions (FAQ) for mpiBLAST
Does mpiBLAST support PSI-BLAST, PHI-BLAST, RPS-BLAST, etc.?
No. Although it may be possible to parallelize these search algorithms using database
segmentation, our preliminary studies indicate that they would benefit less from such a
parallelization scheme than the other BLAST search types do.
Does mpiBLAST support Mega-BLAST?
No. We are focusing our efforts on blastn, blastp, blastx, tblastn, and tblastx.
Can mpiBLAST run without local storage?
Yes. On systems without local storage, enable the --use-virtual-frags option for better
performance.
Can mpiBLAST be run on a single processor system for testing purposes?
Yes. Simply launch the desired number of MPI processes using the -np flag. The minimum is -np 3.
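As a minimal sketch of such a test run, the helper below assembles a command line and enforces the minimum process count before anything is launched. It assumes the MPI launcher is called mpirun; the database name nt, the query file query.fas, and the output file results.txt are placeholders, and the helper name mpiblast_cmd is hypothetical.

```shell
# mpiblast_cmd NP: print an mpiBLAST test command using at least 3 MPI processes.
# Assumptions: launcher is "mpirun"; "nt", "query.fas", "results.txt" are placeholders.
mpiblast_cmd() {
    np=$1
    if [ "$np" -lt 3 ]; then
        # The FAQ gives 3 as the minimum number of processes.
        echo "raising process count from $np to 3" >&2
        np=3
    fi
    echo "mpirun -np $np mpiblast -p blastn -d nt -i query.fas -o results.txt"
}

mpiblast_cmd 3
```

The function only prints the command, so it can be inspected first and then run by pasting it into the shell.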
I benchmarked mpiBLAST but I don't see super-linear speedup! Why?!
mpiBLAST yields super-linear speedup only when the database being searched is significantly
larger than the core memory on an individual node. The super-linear speedup results published
in the ClusterWorld 2003 paper describing mpiBLAST are measurements of mpiBLAST v0.9
searching a 1.2 GB (compressed) database on a cluster where each node has 640 MB of RAM. In
that setting, a single-node search cannot hold the database in memory, which results in heavy
disk I/O and a long search time. If your database already fits in one node's memory, expect
at most linear speedup.
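To make the memory argument concrete, the rough sketch below estimates how many workers are needed before each database fragment fits in one node's RAM. It assumes equal-sized fragments and ignores memory used by the operating system and by BLAST itself; the helper name min_workers is hypothetical. The example call treats the 1.2 GB figure as the in-memory size, which understates it because that figure refers to the compressed database.

```shell
# min_workers DB_MB RAM_MB: smallest number of workers whose equal-sized
# database fragments each fit within RAM_MB megabytes.
min_workers() {
    db_mb=$1
    ram_mb=$2
    # Integer ceiling of db_mb / ram_mb.
    echo $(( (db_mb + ram_mb - 1) / ram_mb ))
}

min_workers 1200 640   # prints 2
```

With fewer workers than this estimate, each worker still reads its fragment from disk repeatedly, so the disk-bound single-node baseline is not the only slow configuration.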
Does mpiBLAST run on Mac OS X?
Yes. mpiBLAST version 1.3.0 and later support Mac OS X.
How do I compile mpiBLAST from SVN?
Please see the instructions on the development page.
How do I format a huge database?
Large databases like nt can consume several gigabytes of disk space, so it is preferable to store
them in compressed form. Starting with mpiBLAST 1.4.0, it is possible to pipe FASTA-formatted
sequence data into mpiformatdb. This makes it possible to format a compressed (gzip, bzip2,
etc.) database directly, using command line syntax like:
zcat nt.gz | mpiformatdb -i stdin -N 100 -t nt -p F
mpiformatdb needs the -t <title> and -p <T|F> options to format a database piped via
standard input.
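The same pattern extends to other compression formats. The sketch below picks a decompressor from the file extension and prints the resulting pipeline. It assumes zcat and bzcat are available on the system; the helper name format_cmd is hypothetical, and the fragment count, title, and protein flag are passed through unchanged.

```shell
# format_cmd FILE FRAGMENTS TITLE PROTEIN: print a pipeline that streams FILE
# into mpiformatdb. PROTEIN is T or F, as for the -p option.
format_cmd() {
    file=$1
    frags=$2
    title=$3
    protein=$4
    # Choose a reader based on the file extension (assumed tools: zcat, bzcat).
    case "$file" in
        *.gz)  reader="zcat" ;;
        *.bz2) reader="bzcat" ;;
        *)     reader="cat" ;;
    esac
    echo "$reader $file | mpiformatdb -i stdin -N $frags -t $title -p $protein"
}

format_cmd nt.gz 100 nt F
```

For nt.gz this prints the same command shown above; for an uncompressed file it falls back to cat.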
How accurate are the E-value statistics?
In mpiBLAST 1.3 or later, they are exact for all supported search types. In versions 1.2.1 and
earlier, E-values for blastn were loosely approximated using a linear equation, and E-values
for blastp, blastx, tblastn, and tblastx were inaccurate. Note that by "exact" we mean exactly
the same as those generated by NCBI BLAST with the traditional search engine. As of 2009,
NCBI is still refining the E-value calculations in its BLAST implementation.
How does mpiBLAST output differ from NCBI blastall output?
In mpiBLAST 1.3 or later, the text, XML, tabular, and ASN.1 output formats are identical to
those of NCBI BLAST with the traditional search engine. The one exception is ordering: when an
individual query has multiple database hits with the same E-value and bit score, mpiBLAST may
report these hits in a different order than NCBI BLAST.
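Because tied hits may appear in a different order, a plain diff of the two outputs can report differences that are not meaningful. One way to check that two tabular result files contain the same hits is to compare them after sorting, as sketched below. The helper name same_hits and the file names in the usage comment are hypothetical, and sorting discards all ordering, not only the order of ties.

```shell
# same_hits FILE_A FILE_B: succeed if both files contain the same lines,
# regardless of the order in which the lines appear.
same_hits() {
    [ "$(sort "$1")" = "$(sort "$2")" ]
}

# Usage (placeholder file names, tabular output from each program):
#   same_hits mpiblast.tab blastall.tab && echo "same hits"
```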