Glimmer and GeneMark

Glimmer
• Glimmer is a system for finding genes in microbial DNA.
  – http://ccb.jhu.edu/software/glimmer/index.shtml
• The system works by creating a variable-length Markov model from a training set of genes and then using that model to attempt to identify all genes in a given DNA sequence.
• Local installation on burrow.soic.indiana.edu; all the relevant code is in:
  – /l/glimmer3.02/bin/
• I have added E. coli data in the following directory to play with:
  – /tmp/ecoli
• Running Glimmer is a two-step process:
  1. Build the model from known genes:
     – /l/glimmer3.02/bin/build-icm -r run1.icm < /tmp/ecoli/ecoli-genes.fasta
  2. Make gene predictions with the glimmer3 program:
     – /l/glimmer3.02/bin/glimmer3 -o50 -g110 -t30 /tmp/ecoli/ecoli.fna run1.icm run1
• For more details, please refer to:
  – /l/glimmer3.02/glim302notes.pdf

GeneMark
• GeneMark is a suite of software tools for predicting protein-coding genes in various types of genomes.
  – http://opal.biology.gatech.edu/
• The algorithms use hidden Markov models that reflect the "grammar" of gene organization.
• Local installation on burrow:
  – /l/gmsuite/
• You can run prokaryotic gene prediction with the following command:
  – /l/gmsuite/gmsn.pl --prok --format GFF /tmp/ecoli/ecoli.fna
• For more details, please refer to:
  – /l/gmsuite/README.GeneMarkSuite
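The commands listed above can be collected into one small script, which may be convenient when re-running both predictors on the same genome. This is only a sketch: the paths and flags are copied from the commands above, while the variable names (GLIMMER_BIN, GMSUITE_DIR, DATA_DIR, TAG) and the DRY_RUN switch are hypothetical conveniences introduced here, not part of either tool. With DRY_RUN=1 (the default in this sketch) the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: wraps the Glimmer and GeneMark commands from the slides.
# All variable names and the DRY_RUN switch are assumptions made for
# illustration; the default values are the paths given in the slides.
GLIMMER_BIN="${GLIMMER_BIN:-/l/glimmer3.02/bin}"
GMSUITE_DIR="${GMSUITE_DIR:-/l/gmsuite}"
DATA_DIR="${DATA_DIR:-/tmp/ecoli}"
TAG="${TAG:-run1}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = actually run them

build_model() {
    # Glimmer step 1: train the model on known genes (FASTA read from stdin).
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$GLIMMER_BIN/build-icm -r $TAG.icm < $DATA_DIR/ecoli-genes.fasta"
    else
        "$GLIMMER_BIN/build-icm" -r "$TAG.icm" < "$DATA_DIR/ecoli-genes.fasta"
    fi
}

predict_genes() {
    # Glimmer step 2: predict genes in the genome using the trained model.
    # -o50: max overlap, -g110: min gene length, -t30: score threshold.
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$GLIMMER_BIN/glimmer3 -o50 -g110 -t30 $DATA_DIR/ecoli.fna $TAG.icm $TAG"
    else
        "$GLIMMER_BIN/glimmer3" -o50 -g110 -t30 "$DATA_DIR/ecoli.fna" "$TAG.icm" "$TAG"
    fi
}

run_genemark() {
    # GeneMark: prokaryotic gene prediction with GFF output.
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$GMSUITE_DIR/gmsn.pl --prok --format GFF $DATA_DIR/ecoli.fna"
    else
        "$GMSUITE_DIR/gmsn.pl" --prok --format GFF "$DATA_DIR/ecoli.fna"
    fi
}

# Run the Glimmer steps in order, then GeneMark; stop at the first failure.
build_model && predict_genes && run_genemark
```

To execute the commands for real on burrow, set DRY_RUN=0 when invoking the script; TAG can be changed to keep the outputs of several runs apart. The output files each tool writes are described in the notes referenced above.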