IMP: An Automated Pipeline For Intron Prediction From Non-Cognate ESTs And Flanking
Primer Design To Aid In Marker Development
Lacey-Anne Sanderson
Abstract
Single and multiple nucleotide polymorphisms
(SNPs and MNPs), insertions/deletions (Indels) and
size polymorphisms (SiPs) are important tools for
developing gene-based markers. Low levels of
polymorphism observed in transcribed sequences
are a hurdle in marker development in many orphan
crop species, because ESTs are often the only
sequence resource available. Intron sequences, if
they can be successfully predicted and amplified,
can greatly enhance gene-based marker
development, because they are usually highly
polymorphic. The Intron Marker Pipeline (IMP) can be
used to design intron flanking primers based on
gapped alignment of EST sequences of one
species to genomic sequence of model systems by
exploiting the conservation of exonic sequences
and exon-intron structure between homologous
genes in different species. In comparison to other
pipelines that design intron flanking primers, IMP
utilizes more recent and accurate algorithms and
allows the researcher maximal flexibility in choice of
genomic sequence for intron prediction.
Furthermore, 50 intron flanking primers have been
designed in bean using IMP and tested in the lab
with 87% of them yielding amplicons under a
standard set of PCR conditions. Thus, IMP is a
valuable tool in high-throughput polymorphism
discovery projects in any species with little or no
genomic sequence data available. IMP is available
from Google Code under the GNU GPL open-source
license and has been tested on Mac and
Unix machines.
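The central idea of the abstract, predicting introns from gaps in an EST-to-genome alignment, can be illustrated with a minimal sketch. It assumes the alignment is already available as a list of ungapped blocks (EST start, genomic start, length), similar in spirit to the block lists reported by spliced aligners. The names `Block`, `Intron` and `predict_introns`, the block layout and the 20 bp minimum intron length are illustrative assumptions, not IMP's actual implementation.

```python
from typing import List, NamedTuple


class Block(NamedTuple):
    """One ungapped aligned segment (0-based coordinates)."""
    est_start: int     # start of the block in the EST
    genome_start: int  # start of the block in the genomic sequence
    length: int        # number of aligned bases


class Intron(NamedTuple):
    est_position: int  # EST coordinate at which the intron is predicted
    genome_start: int  # first genomic base of the gap
    genome_end: int    # one past the last genomic base of the gap


def predict_introns(blocks: List[Block], min_intron: int = 20) -> List[Intron]:
    """Report a putative intron wherever two blocks are adjacent in the EST
    but separated by at least `min_intron` bases in the genome."""
    introns = []
    ordered = sorted(blocks, key=lambda b: b.est_start)
    for left, right in zip(ordered, ordered[1:]):
        est_gap = right.est_start - (left.est_start + left.length)
        genome_gap = right.genome_start - (left.genome_start + left.length)
        if est_gap == 0 and genome_gap >= min_intron:
            introns.append(
                Intron(right.est_start,
                       left.genome_start + left.length,
                       right.genome_start)
            )
    return introns


# Hypothetical alignment: a 300 bp genomic gap, then a 5 bp gap that is
# too short to be called an intron under the assumed threshold.
blocks = [Block(0, 1000, 150), Block(150, 1450, 200), Block(350, 1655, 100)]
print(predict_introns(blocks))
```

Primers placed on either side of a predicted `est_position` would then flank the intron in the species from which the EST came, provided the exon-intron structure is conserved.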
Required
RepeatMasker
BLAT
GeneSeqer
Primer3 (Command-line)
Perl
Recommended
Linux-style or Mac operating system
Download and Installation
1. Download from either Google Code
(http://code.google.com/p/intron-marker-pipeline/) or SourceForge
(http://sourceforge.net/projects/intronmarkerpip/)
2. Unpack (will create its own containing folder) using
tar -zxvf <filename> in the terminal
3. Install RepeatMasker, BLAT, GeneSeqer and Primer3
command line utilities and note down the path to and
including the executable for each one
(e.g. /usr/local/share/applications/Primer3/src/primer3_core)
4. Run perl install.pl in the base directory of IMP
5. Enter the paths already noted down when prompted
Installation Complete!
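Before entering the paths in step 5, it can help to confirm that each one points to a runnable file. The following is a minimal sketch of such a check and is not part of IMP; the helper names and the example locations are hypothetical and should be replaced with the paths noted in step 3.

```python
import os
import shutil
from typing import Dict, Optional


def resolve_executable(path_or_name: str) -> Optional[str]:
    """Return an absolute path if the argument is an executable file,
    otherwise look the name up on PATH; return None if neither works."""
    if os.path.isfile(path_or_name) and os.access(path_or_name, os.X_OK):
        return os.path.abspath(path_or_name)
    return shutil.which(path_or_name)


def check_tools(tools: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Map each tool label to its resolved executable path (or None)."""
    return {label: resolve_executable(path) for label, path in tools.items()}


# Hypothetical install locations, for illustration only.
noted_paths = {
    "RepeatMasker": "/usr/local/RepeatMasker/RepeatMasker",
    "BLAT": "/usr/local/bin/blat",
    "GeneSeqer": "/usr/local/bin/GeneSeqer",
    "Primer3": "/usr/local/share/applications/Primer3/src/primer3_core",
}
for label, found in check_tools(noted_paths).items():
    print(f"{label}: {found or 'NOT FOUND'}")
```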
Execute IMP by entering
perl IMP-1.0 <options> <genomic fasta> <query fasta>
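When IMP is run from a script, the command line above can be assembled programmatically. This sketch assumes only the argument order shown in the usage line; the option syntax is not given here, so options are passed through as opaque strings, and the file names are placeholders.

```python
import shlex
from typing import List, Sequence


def build_imp_command(genomic_fasta: str, query_fasta: str,
                      options: Sequence[str] = ()) -> List[str]:
    """Assemble: perl IMP-1.0 <options> <genomic fasta> <query fasta>."""
    return ["perl", "IMP-1.0", *options, genomic_fasta, query_fasta]


# Placeholder file names; no options are passed in this example.
cmd = build_imp_command("genome.fa", "ests.fa")
print(shlex.join(cmd))
# To actually run it from the IMP base directory:
#   import subprocess; subprocess.run(cmd, check=True)
```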
Implementation
[Flow diagram: ESTs and Genomic Model Sequence; Alignment of ESTs on Non-Cognate Genomic Sequence; Filtering/Evaluation of Primersets; Intron flanking primersets for marker discovery. Step labels: Sequence Similarity, Splice Site Prediction, Primer Design.]
Options
5extensionFwd/Rvs
Allows the user to enter a string which will be added
to the 5’ end of every forward or reverse primer
Autocuration
Allows the user to automatically filter the
primersets returned to fit their needs
Splice Site Model Species
Splice site model used by GeneSeqer to aid in
prediction of intron-exon boundaries
Pmin/opt/maxSize/Tm
Primer length/melting temperature boundaries for
primer design by Primer3
PMask
Allows for intelligent design of primers in sequence
in which masked regions (for example repeat-masked regions) are lower-cased
Alignment Program
Choose the program to be used to align the ESTs to
the genome
And Many More!
Comparison to other Programs
[Table comparing IMP, Potential Intron Polymorphism (PIP) (6), cisPrimer Tool (7) and Gem Prospector (8). Columns: Program, Species-specific Primers, Choice of Genomic Template, Ease of Installation. Surviving entries for Choice of Genomic Template: Any, Rice & Arabidopsis, Legume or Grasses, Any. Surviving entries for Ease of Installation: Easy, No Installation, Difficult, No Installation, Download.]
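To make the Pmin/opt/maxSize/Tm, PMask and 5extensionFwd/Rvs options listed under Options more concrete, the sketch below shows how such settings could be written as a Primer3 Boulder-IO record and how a 5’ extension could be applied to a designed primer. The tag names follow the Primer3 2.x manual; how IMP itself passes these values to Primer3 is not described here, so the function names, the mapping of PMask to PRIMER_LOWERCASE_MASKING and the numeric values are assumptions for illustration.

```python
from typing import Tuple


def boulder_record(seq_id: str, template: str,
                   target_start: int, target_len: int,
                   size: Tuple[int, int, int] = (18, 20, 27),
                   tm: Tuple[float, float, float] = (57.0, 60.0, 63.0),
                   lowercase_masking: bool = True) -> str:
    """Build one Boulder-IO record (TAG=VALUE lines, terminated by '=')."""
    tags = {
        "SEQUENCE_ID": seq_id,
        "SEQUENCE_TEMPLATE": template,
        # Region the primer pair must flank, e.g. a predicted intron position.
        "SEQUENCE_TARGET": f"{target_start},{target_len}",
        "PRIMER_MIN_SIZE": size[0],
        "PRIMER_OPT_SIZE": size[1],
        "PRIMER_MAX_SIZE": size[2],
        "PRIMER_MIN_TM": tm[0],
        "PRIMER_OPT_TM": tm[1],
        "PRIMER_MAX_TM": tm[2],
        # Assumed counterpart of PMask: reject primers whose 3' end is lower-case.
        "PRIMER_LOWERCASE_MASKING": int(lowercase_masking),
    }
    lines = [f"{tag}={value}" for tag, value in tags.items()]
    lines.append("=")
    return "\n".join(lines) + "\n"


def add_5prime_extension(primer: str, extension: str) -> str:
    """Prepend a user-supplied string to the 5' end of a primer."""
    return extension + primer


# Placeholder sequence and extension, for illustration only.
record = boulder_record("est_0001", "ACGTACGTacgtacgtACGTACGT", 10, 2)
print(record)
print(add_5prime_extension("ACGTACGT", "GGTT"))
```

The record could be piped to the `primer3_core` executable on standard input; the extension is added after design so that it does not influence the melting temperature limits applied to the template-specific part of the primer.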