BLASTing loci

Sheffield Molecular Genetics Facility
Protocols:
Checking loci are unique (and are from the species you hope
they are from!)
This search will …
1. Identify any matching clones.
(Before submitting clones to EMBL, you should already have checked them using
GENEJOCKEY software, but the results of this search make duplicates easy to spot,
and the search appears to be more sensitive. It also identifies any clones that match
sequences from other libraries, not just those you developed yourself.)
2. Prevent you from genotyping the same individuals with the same locus twice.
You should check for matching sequences even when the loci were developed in different
libraries of the same species, and even when they were developed from one or several
different species.
3. Identify similar sequences in other species, which may point to genes of
interest, etc.
Notes
The microsatellite repeat region is automatically blocked out and is not compared between
sequences, to prevent thousands of matches.
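To illustrate what "blocking out" the repeat region means, here is a minimal Python sketch. It is not the filter that BLAST itself applies. The function name mask_repeats, the 1-6 bp motif length and the threshold of five copies are all assumptions chosen for illustration only.

```python
import re


def mask_repeats(seq, min_repeats=5):
    """Replace runs of short tandem repeats with 'n' (illustrative only).

    Assumption: a repeat is a 1-6 bp motif occurring at least
    min_repeats times in a row. This is NOT BLAST's own filter.
    """
    pattern = re.compile(
        r"([acgt]{1,6}?)\1{%d,}" % (min_repeats - 1), re.IGNORECASE
    )
    # Keep the sequence length unchanged so positions still line up.
    return pattern.sub(lambda m: "n" * len(m.group(0)), seq)


if __name__ == "__main__":
    # Made-up locus: flanking sequence either side of an (ac)6 repeat.
    print(mask_repeats("ttgg" + "ac" * 6 + "ttgg"))
```

Only the flanking sequence is left to be compared, which is why two unrelated loci with the same repeat motif do not show up as matches.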
Method
1. Prepare sequence data ready to be BLAST searched.
You can use either…
The EMBL data (emailed back to you when the sequences were submitted, if you submitted
them yourself)
Available from…
http://srs.ebi.ac.uk/srs6bin/cgi-bin/wgetz?-page+top+-newId
Or
A Word document containing all the sequences, pasted in from EMBL. You can paste in the
whole EMBL record for each entry; the non-sequence text is ignored by the BLAST
program.
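If you would rather strip the non-sequence text yourself before pasting, the following Python sketch pulls the sequence out of each pasted EMBL record and writes it in FASTA format, which BLAST accepts. It assumes the standard EMBL flat-file layout (an ID line, an SQ line before the sequence, and // at the end of each record). The function names extract_embl_sequences and to_fasta are made up for this example.

```python
def extract_embl_sequences(text):
    """Return (entry_name, sequence) pairs found in EMBL flat-file text."""
    records = []
    name, chunks, in_seq = None, [], False
    for line in text.splitlines():
        if line.startswith("ID"):
            # Take the first word after the ID line code as the entry name.
            fields = line[2:].split()
            name = fields[0].rstrip(";") if fields else None
        elif line.startswith("SQ"):
            # Sequence lines follow the SQ header line.
            in_seq = True
        elif line.startswith("//"):
            # End of record: store what was collected and reset.
            if chunks:
                records.append((name or "unnamed", "".join(chunks)))
            name, chunks, in_seq = None, [], False
        elif in_seq:
            # Keep only letters: drops the spaces and the trailing base count.
            chunks.append("".join(c for c in line if c.isalpha()).lower())
    return records


def to_fasta(records):
    """Format (name, sequence) pairs as FASTA text."""
    return "\n".join(">%s\n%s" % (name, seq) for name, seq in records)
```

Calling to_fasta(extract_embl_sequences(pasted_text)) on the contents of the document gives one named entry per locus, which also makes the BLAST results easier to match back to your own clone names.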
2. Search the prepared data against the sequence data in the EMBL/GenBank/DDBJ databases.
Go to…
http://www.ncbi.nlm.nih.gov/BLAST/
3. Click nucleotide-nucleotide.
4. Paste in the sequences (including the text words; the program cuts them out
automatically).
NB The pasted text has to start with the sequence data (e.g. gatcaccgttt etc.) of the
first sequence, so that it can be recognised as sequence data (otherwise you will get the
"protein data entered" error message).
NB Search with just 10-15 sequences at a time. This seems to be the most that can be
BLASTed at once and is also the easiest number to compare when the results are displayed.
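The two notes above can be checked before pasting with a short Python sketch. It assumes the sequences are already held as a list (for example of name and sequence pairs). The function names, the batch size of 10 and the 10-base check are assumptions for illustration, not part of BLAST.

```python
def batch_sequences(records, batch_size=10):
    """Yield successive groups of at most batch_size records."""
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


def starts_with_sequence(text, min_bases=10):
    """Return True if the pasted text begins with nucleotide letters.

    Assumption: the first min_bases characters being a, c, g, t or n
    is taken as a sign that the text starts with sequence data.
    """
    first = text.lstrip()[:min_bases].lower()
    return len(first) == min_bases and all(c in "acgtn" for c in first)


if __name__ == "__main__":
    loci = ["locus%d" % i for i in range(1, 26)]  # 25 made-up locus names
    for group in batch_sequences(loci, batch_size=10):
        print(len(group), group[0], "to", group[-1])
    print(starts_with_sequence("gatcaccgttt etc"))
    print(starts_with_sequence("ID   LOC1; SV 1; linear"))
```

Each group can then be pasted into the search box as a separate search.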