Phylogeny activity

How to infer genetic relatedness among organisms by constructing a
PHYLOGENETIC TREE
Mr. Abercrombie
AP Biology
Tree by David Hillis
The image above represents a phylogenetic tree constructed by comparing
sequences of 3000 different organisms. You have learned about some of the amazing
conservation of molecular machinery that humans share with other living
organisms. In this exercise, you will use bioinformatics tools available online to
generate a “TREE” that illustrates the genetic similarities among organisms that all
share a common gene. Phylogenetic trees, or phylogenies, are frequently used by
biologists to infer genetic relatedness among species for which we have genetic
information (this can be DNA, RNA, or protein). You will construct a tree by
comparing protein sequences for GAPDH (glyceraldehyde-3-phosphate
dehydrogenase), a “housekeeping” gene with ancient origins that is therefore
shared across eukaryotes, as you will see.
1. The first step in this process is to visit the NCBI website:
http://www.ncbi.nlm.nih.gov/
2. Click on the tab that says All Databases, scroll down to HomoloGene, and
click it.
3. Once this selection is highlighted, type GAPDH into the search bar and hit
the search button.
4. Click on the first search result, HomoloGene:107053. Gene conserved
in Eukaryota.
5. In the display bar that says HomoloGene, scroll down to Alignment scores.
This will display the percent similarity that each sequence has to every
other sequence. For example, the human GAPDH protein sequence is 100%
identical to the Chimpanzee GAPDH protein.
6. Now, in the same display bar, scroll down to FASTA. This will convert all
protein sequences into a format that can be interpreted by the bioinformatics
tool you will use to make the phylogenetic tree. For the first part of this
assignment, I have taken some of the work out of this for you by renaming
the sequences with simpler designations that will make the labels for the tree
easy to read. (YOU WILL HAVE TO DO THIS later for the gene that you
select to construct your own phylogeny with.) Copy the FASTA format
GAPDH sequences that I have placed at the bottom of this document.
7. To paste the sequences, you will have to go to this web page for the software
that will allow you to construct the tree: http://www.phylogeny.fr/
8. On the main bar, click phylogeny analysis, then “one-click”. This is where
you will input the FASTA formatted GAPDH sequences. Paste them in the
white box, then click submit. The results may take a minute or so,
depending on your connection speed.
9. Now the GAPDH sequences should be aligned and a phylogenetic tree
will be constructed.
10. From here, you can right-click the tree, copy the image, and then paste it
into a Word or PowerPoint file.
11. Now that you know how to make a phylogenetic tree, your assignment is
to choose your favorite gene to use to construct your own unique
phylogenetic tree. No two students may use the same gene. It may take a
few tries to find a gene that makes for a “good phylogeny.” For
example, if the HomoloGene search results only give you one or two different
organisms, you will need to choose a gene for which more organisms
share homology (at least 8). Think about basic enzymes of eukaryotic
metabolism, or maybe you want to analyze some prokaryotic genes. There
are many other ways to construct phylogenetic trees.
12. Your final document will comprise a ‘copy image’ copy of the GAPDH
phylogeny and a copy of your favorite gene phylogeny.
13. Remember! You will need to simplify your FASTA formatted sequences
by replacing each sequence name with a simplified name for the organism,
being careful to leave the > in place; otherwise it will not work. Take a
closer look at the GAPDH sequences below.
14. Last but not least, I will not accept any excuses as to why you cannot
complete this assignment. If you cannot complete this at home due to
limitations, you will have to work on this on the school computers in the
library. Getting hands-on experience with bioinformatics tools and
constructing phylogenies is an important part of this class.
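If you would rather not rename every sequence by hand (step 13), the renaming can be sketched in a few lines of Python. This is only an illustration, not part of the required workflow. It assumes that each downloaded header line ends with the organism name in square brackets, as in `>XX_000001.1 some protein [Homo sapiens]`; the accession numbers in the example are made up, and the function name `simplify_fasta_headers` is hypothetical. Check your own headers before relying on it.

```python
import re


def simplify_fasta_headers(fasta_text):
    """Replace each '>' header line with a short, unique organism label."""
    counts = {}
    out = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # Assumption: the organism name sits inside [square brackets].
            match = re.search(r"\[([^\]]+)\]", line)
            if match:
                name = match.group(1)
            else:
                # Fall back to the first word after '>' (or a placeholder).
                name = (line[1:].split() or ["seq"])[0]
            # Keep only letters, digits, and underscores so the label is one word.
            label = re.sub(r"\W+", "_", name.strip()).strip("_")
            # Number repeated organisms (e.g. Rat, Rat2) so labels stay unique.
            counts[label] = counts.get(label, 0) + 1
            if counts[label] > 1:
                label = f"{label}{counts[label]}"
            out.append(">" + label)  # the '>' must stay in place
        else:
            out.append(line)
    return "\n".join(out)


example = (
    ">XX_000001.1 example protein [Homo sapiens]\n"
    "MGKV\n"
    ">XX_000002.1 example protein [Rattus norvegicus]\n"
    "MVKV\n"
    ">XX_000003.1 example protein [Rattus norvegicus]\n"
    "MVKV"
)
print(simplify_fasta_headers(example))
```

The sequence lines are left untouched; only the lines that begin with > are rewritten, and the > itself is always kept.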
Copy and paste this set of GAPDH sequences, which is in FASTA format:
>Human
MGKVKVGVNGFGRIGRLVTRAAFNSGKVDIVAINDPFIDLNYMVYMFQYDSTHGKFHGTVKAENGKLVIN
GNPITIFQERDPSKIKWGDAGAEYVVESTGVFTTMEKAGAHLQGGAKRVIISAPSADAPMFVMGVNHEKY
DNSLKIISNASCTTNCLAPLAKVIHDNFGIVEGLMTTVHAITATQKTVDGPSGKLWRDGRGALQNIIPAS
TGAAKAVGKVIPELNGKLTGMAFRVPTANVSVVDLTCRLEKPAKYDDIKKVVKQASEGPLKGILGYTEHQ
VVSSDFNSDTHSSTFDAGAGIALNDHFVKLISWYDNEFGYSNRVVDLMAHMASKE
>Chimp
MGKVKVGVNGFGRIGRLVTRAAFNSGKVDIVAINDPFIDLNYMVYMFQYDSTHGKFHGTVKAENGKLVIN
GNPITIFQERDPSKIKWGDAGAEYVVESTGVFTTMEKAGAHLQGGAKRVIISAPSADAPMFVMGVNHEKY
DNSLKIISNASCTTNCLAPLAKVIHDNFGIVEGLMTTVHAITATQKTVDGPSGKLWRDGRGALQNIIPAS
TGAAKAVGKVIPELNGKLTGMAFRVPTANVSVVDLTCRLEKPAKYDDIKKVVKQASEGPLKGILGYTEHQ
VVSSDFNSDTHSSTFDAGAGIALNDHFVKLISWYDNEFGYSNRVVDLMAHMASKE
>Dog
MVKVGVNGFGRIGRLVTRAAFNSGKVDIVAINDPFIDLNYMVYMFQYDSTHGKFHGTVKAENGKLVINGK
SISIFQERDPANIKWGDAGAEYVVESTGVFTTMEKAGAHLKGGAKRVIISAPSADAPMFVMGVNHEKYDN
SLKIVSNASCTTNCLAPLAKVIHDHFGIVEGLMTTVHAITATQKTVDGPSGKMWRDGRGAAQNIIPASTG
AAKAVGKVIPELNGKLTGMAFRVPTPNVSVVDLTCRLEKAAKYDDIKKVVKQASEGPLKGILGYTEDQVV
SCDFNSDTHSSTFDAGAGIALNDHFVKLISWYDNEFGYSNRVVDLMVYMASKE
>Cow2
MVKVGVNGFGRIGRLVTRAAFNSGKVDIVAINDPFIDLHYMVYMFQYDSTHGKFNGTVKAENGKLVINGK
AITIFQERDPANIKWGDAGAEYVVESTGVFTTMEKAGAHLKGGAKRVIISAPSADAPMFVMGVNHEKYNN
TLKIVSNASCTTNCLAPLAKVIHDHFGIVEGLMTTVHAITATQKTVDGPSGKLWRDGRGAAQNIIPASTG
AAKAVGKVIPELNGKLTGMAFRVPTPNVSVVDLTCRLEKPAKYDEIKKVVKQASEGPLKGILGYTEDQVV
SCDFNSDTHSSTFDAGAGIALNDHFVKLISWYDNEFGYSNRVVDLMVHMASKE
>Cow1
MVKVGVNGFGRIGRLVTRAAFNSGKVDIVAINDPFIDLHYMVYMFQYDSTHGKFNGTVKAENGKLVINGK
AITIFQERDPANIKWGDAGAEYVVESTGVFTTMEKAGAHLKGGAKRVIISAPSADAPMFVMGVNHEKYNN
TLKIVSNASCTTNCLAPLAKVIHDHFGIVEGLMTTVHAITATQKTVDGPSGKLWRDGRGAAQNIIPASTG
AAKAVGKVIPELNGKLTGMAFRVPTPNVSVVDLTCRLEKPAKYDEIKKVVKQASEGPLKGILGYTEDQVV
SCDFNSDTHSSTFDAGAGIALNDHFVKLISWYDNEFGYSNRVVDLMVHMASKE
>Mouse
MVKVGVNGFGRIGRLVTRAAICSGKVEIVAINDPFIDLNYMVYMFQYDSTHGKFNSTVKAENGKLVINGK
PITIFQERDPANIKWGEASAEYVVESTGVFTTMEKARAHLKGGAKRVIISAPSADAPMFVMGVNHEKYDN
SLKIFSNASCTTNCLAPVAKVIHDNFGIVEGLMTTVHAITATQKTVDGPSGKLWRDGRGAAQNIIPASTG
AAKAVGKVIPELNWKLTGMAFRVPTPNVFVLDLTCRLEKPAKYDDIKKVVKQASEGPLKGILGYTEDQVV
SCYFNSNSHSSTFDARAGIALNDNFVKLISWYDNEYGYSNRVVDLMAYMASKE
>Rat1
MVKVGVNGFGRIGRLVTRAAFSCDKVDIVAINDPFIDLNYMVYMFQYDSTHGKFNGTVKAENGKLVINGK
PITIFQERDPANIKWGDAGAEYVVESTGVFTTMEKAGAHLKGGAKRVIISAPSADAPMFVMGVNHEKYDN
SLKIVSNASCTTNCLAPLAKVIHDNFGIVEGLMTTVHAITATQKTVDGPSGKLWRDGRGAAQNIIPASTG
AAKAVGKVIPELNGKLTGMAFRVPTPNVSVVDLTCRLEKPAKYDDIKKVVKQAAEGPLKGILGYTEDQVV
SCDFNSNSHSSTFDAGAGIALNDNFVKLISWYDNEYGYSNRVVDLMAYMASKE
>Rat2
MVKVGVNGFGRIGHLVTRAAFSCDKVDIVAINDPFIDLNYMVYMFQYDSTHGKFNGTVKAENGKLVINGK
PITIFQERDPANIKWGDAGAEYVVESTGVFTTMEKAGAHLKGGAKRVIISAPSADAPMFVMGVNHEKYDN
SLKIVSNASCTTNCLAPLGKVIHDNFGIVEGLMTTVHAITATQKTVDGPSGKLWRDGRGAAQNIIPASTG
AAKAVGKVIPELNGKLTGMAFRVPTPNVSVVDLTCRLEKPAKYDDIKKVVKQVAEGPLKGILGYTEDQVV
SCDFNSNSHSSTFDAGAGIALNDNFVKLISWYDNEYGYSNRVVDLMAYMASKE
>Rat3
MVKVGVNGFGRIGRLVTRAAFSCDKVDIVAINDPFIDLNYMVYMFQYDSTHGKFNGTVKAENGKLVINGK
PITIFQERDPANIKWGDAGAEYVVESTGVFTTMEKAGAHLKGGAKRVIISAPSADAPMFVMGVNHEKYDN
SLKIVSNASCTTNCLAPLAKVIHDNFGIVEGLMTTVHAITATQKTVDGPSGKLWRDGRGAAQNIIPASTG
AAKAVGKVIPELNGKLTGMAFRVPTPNVSVVDLTCRLEKPAKYDDIKKVVKQAAEGPLKGILGYTEDQVV
SCDFNSNSHSSTFDAGAGIALNDNFVKLISWYDNEYGYSNRVVDLMAYMASKE
>Rat4
MVKVGVNGFGRIGRLVTRAAFSCDKVDIVAINDPFIDLNYMVYMSQYGSPHGKFNSTVKAENGKLVNNGK
PITIFQERDPANIKWGDAGAEYVMESTGIFTTMEKAGAHLKGGAKRVIISAPSADAPMFVMGVNHEKYDN
SLKIVSNASCTTNCLAPLAKVIHDNFGIVEGLMTTVHAITATQKTVDGPSGKLWRDGRGAAQNIIPASTG
AAKAVGKVIPELNGKLTGMAFRVPTPNVSVVDLTCRLEKPAKYDDIKKVVKQAAEGPLKGILGYTEDQVV
SCDFNSNSHSSTFDAGAGIALNDNFVKLISWYDNEYGYSNRVVDLMAYMASKE
>Chicken
MVKVGVNGFGRIGRLVTRAAVLSGKVQVVAINDPFIDLNYMVYMFKYDSTHGHFKGTVKAENGKLVINGH
AITIFQERDPSNIKWADAGAEYVVESTGVFTTMEKAGAHLKGGAKRVIISAPSADAPMFVMGVNHEKYDK
SLKIVSNASCTTNCLAPLAKVIHDNFGIVEGLMTTVHAITATQKTVDGPSGKLWRDGRGAAQNIIPASTG
AAKAVGKVIPELNGKLTGMAFRVPTPNVSVVDLTCRLEKPAKYDDIKRVVKAAADGPLKGILGYTEDQVV
SCDFNGDSHSSTFDAGAGIALNDHFVKLVSWYDNEFGYSNRVVDLMVHMASKE
>Zebrafish
MVKVGINGFGRIGRLVTRAAFLTKKVEIVAINDPFIDLDYMVYMFQYDSTHGKYKGEVKAEGGKLVIDGH
AITVYSERDPANIKWGDAGATYVVESTGVFTTIEKASAHIKGGAKRVIISAPSADAPMFVMGVNHEKYDN
SLTVVSNASCTTNCLAPLAKVINDNFVIVEGLMSTVHAITATQKTVDGPSGKLWRDGRGASQNIIPASTG
AAKAVGKVIPELNGKLTGMAFRVPTPNVSVVDLTVRLEKPAKYDEIKKVVKAAADGPMKGILGYTEHQVV
STDFNGDCRSSIFDAGAGIALNDHFVKLVTWYDNEFGYSNRVCDLMAHMASKE
>Fruitfly1
MSKIGINGFGRIGRLVLRAAIDKGASVVAVNDPFIDVNYMVYLFKFDSTHGRFKGTVAAEGGFLVVNGQK
ITVFSERDPANINWASAGAEYVVESTGVFTTIDKASTHLKGGAKKVIISAPSADAPMFVCGVNLDAYSPD
MKVVSNASCTTNCLAPLAKVINDNFEIVEGLMTTVHATTATQKTVDGPSGKLWRDGRGAAQNIIPAATGA
AKAVGKVIPALNGKLTGMAFRVPTPNVSVVDLTVRLGKGATYDEIKAKVEEASKGPLKGILGYTDEEVVS
TDFFSDTHSSVFDAKAGISLNDKFVKLISWYDNEFGYSNRVIDLIKYMQSKD
>Fruitfly2
MSKIGINGFGRIGRLVLRAAIDKGANVVAVNDPFIDVNYMVYLFKFDSTHGRFKGTVAAEGGFLVVNGQK
ITVFSERDPANINWASAGAEYIVESTGVFTTIDKASTHLKGGAKKVIISAPSADAPMFVCGVNLDAYKPD
MKVVSNASCTTNCLAPLAKVINDNFEIVEGLMTTVHATTATQKTVDGPSGKLWRDGRGAAQNIIPASTGA
AKAVGKVIPALNGKLTGMAFRVPTPNVSVVDLTVRLGKGASYDEIKAKVQEAANGPLKGILGYTDEEVVS
TDFLSDTHSSVFDAKAGISLNDKFVKLISWYDNEFGYSNRVIDLIKYMQSKD
>mosquito
MSKIGINGFGRIGRLVLRAAITKGASVVAINDPFIGVDYMVYLFKYDSTHGRFKGEVSAQDGCLVVNGQK
IAVFQERDPKAIPWGKAGAEYVVESTGVFTTTEKASAHLEGGAKKVIISAPSADAPMFVVGVNLEAYEPS
MKVVSNASCTTNCLAPLAKVINDNFGILEGLMTTVHATTATQKTVDGPSGKLWRDGRGAAQNIIPAATGA
AKAVGKVIPALNGKLTGMAFRVPTPNVSVVDLTVRLSKPATYDQIKQKVKEAANGPMKGILDYTEEEVVS
TDFVGDCHSSIFDAKAGIQLSDTFVKLISWYDNEYGYSNRVVDLIKYMQTKD
>nematode1
MSKANVGINGFGRIGRLVLRAAVEKDTVQVVAVNDPFITIDYMVYLFKYDSTHGQFKGTVTYDGDFLIVQ
KDGKSSHKIKVFNSKDPAAIAWGSVKADFVVESTGVFTTKEKASAHLQGGAKKVIISAPSADAPMYVVGV
NHEKYDASNDHVISNASCTTNCLAPLAKVINDNFGIIEGLMTTVHAVTATQKTVDGPSGKLWRDGRGAGQ
NIIPASTGAAKAVGKVIPELNGKLTGMAFRVPTPDVSVVDLTVRLEKPASMDDIKKVVKAAADGPMKGIL
AYTEDQVVSTDFVSDPHSSIFDTGACISLNPNFVKLVSWYDNEYGYSNRVVDLIGYIATRG
>nematode2
MTKPSVGINGFGRIGRLVLRAAVEKDSVNVVAVNDPFISIDYMVYLFQYDSTHGRFKGTVAHEGDYLLVA
KEGKSQHKIKVYNSRDPAEIQWGASGADYVVESTGVFTTIEKANAHLKGGAKKVIISAPSADAPMFVVGV
NHEKYDHANDHIISNASCTTNCLAPLAKVINDNFGIIEGLMTTVHAVTATQKTVDGPSGKLWRDGRGAGQ
NIIPASTGAAKAVGKVIPELNGKLTGMAFRVPTPDVSVVDLTARLEKPASLDDIKKVIKAAADGPMKGIL
AYTEDQVVSTDFVSDTNSSIFDAGASISLNPHFVKLVSWYDNEFGYSNRVVDLISYIATKA
>nematode3
MPKPSVGINGFGRIGRLVLRAAVEKDSVNVVAVNDPFISIDYMVYLFQYDSTHGRFKGTVAHEGDYLLVA
KEGKSQHKIKVYNSRDPAEIQWGASGADYVVESTGVFTTIEKANAHLKGGAKKVIISAPSADAPMFVVGV
NHEKYDHANDHIISNASCTTNCLAPLAKVINDNFGIIEGLMTTVHAVTATQKTVDGPSGKLWRDGRGAGQ
NIIPASTGAAKAVGKVIPELNGKLTGMAFRVPTPDVSVVDLTARLEKPASLDDIKKVIKAAADGPMKGIL
AYTEDQVVSTDFVSDTNSSIFDAGASISLNPHFVKLVSWYDNEFGYSNRVVDLISYIATKA
>nematode4
MSKANVGINGFGRIGRLVLRAAVEKDTVQVVAVNDPFITIDYMVYLFKYDSTHGQFKGTVTYDGDFLIVQ
KDGKSSHKIKVFNSKDPAAIAWGSVKADFVVESTGVFTTKEKASAHLQGGAKKVIISAPSADAPMYVVGV
NHEKYDASNDHVVSNASCTTNCLAPLAKVINDNFGIIEGLMTTVHAVTATQKTVDGPSGKLWRDGRGAGQ
NIIPASTGAAKAVGKVIPELNGKLTGMAFRVPTPDVSVVDLTVRLEKPASMDDIKKVVKAAADGPMKGIL
AYTEDQVVSTDFVSDPHSSIFDAGACISLNPNFVKLVSWYDNEYGYSNRVVDLIGYIATRG
>schizoYEAST1
MAIPKVGINGFGRIGRIVLRNALVAKTIQVVAINDPFIDLEYMAYMFKYDSTHGRFDGSVEIKDGKLVID
GNAIDVHNERDPADIKWSTSGADYVIESTGVFTTQETASAHLKGGAKRVIISAPSKDAPMYVVGVNEEKF
NPSEKVISNASCTTNCLAPLAKVINDTFGIEEGLMTTVHATTATQKTVDGPSKKDWRGGRGASANIIPSS
TGAAKAVGKVIPALNGKLTGMAFRVPTPDVSVVDLTVKLAKPTNYEDIKAAIKAASEGPMKGVLGYTEDA
VVSTDFCGDNHSSIFDASAGIQLSPQFVKLVSWYDNEWGYSRRVVDLVAYTAAKDN
>SchizoYEAST2
MAIPKVGINGFGRIGRIVLRNAILTGKIQVVAVNDPFIDLDYMAYMFKYDSTHGRFEGSVETKGGKLVID
GHSIDVHNERDPANIKWSASGAEYVIESTGVFTTKETASAHLKGGAKRVIISAPSKDAPMFVVGVNLEKF
NPSEKVISNASCTTNCLAPLAKVINDTFGIEEGLMTTVHATTATQKTVDGPSKKDWRGGRGASANIIPSS
TGAAKAVGKVIPALNGKLTGMAFRVPTPDVSVVDLTVKLAKPTNYEDIKAAIKAASEGPMKGVLGYTEDS
VVSTDFCGDNHSSIFDASAGIQLSPQFVKLVSWYDNEWGYSHRVVDLVAYTASKD
>SCYeast1
MVRVAINGFGRIGRLVMRIALSRPNVEVVALNDPFITNDYAAYMFKYDSTHGRYAGEVSHDDKHIIVDGK
KIATYQERDPANLPWGSSNVDIAIDSTGVFKELDTAQKHIDAGAKKVVITAPSSTAPMFVMGVNEEKYTS
DLKIVSNASCTTNCLAPLAKVINDAFGIEEGLMTTVHSLTATQKTVDGPSHKDWRGGRTASGNIIPSSTG
AAKAVGKVLPELQGKLTGMAFRVPTVDVSVVDLTVKLNKETTYDEIKKVVKAAAEGKLKGVLGYTEDAVV
SSDFLGDSHSSIFDASAGIQLSPKFVKLVSWYDNEYGYSTRVVDLVEHVAKA
>SCYeast2
MVRVAINGFGRIGRLVMRIALQRKNVEVVALNDPFISNDYSAYMFKYDSTHGRYAGEVSHDDKHIIVDGH
KIATFQERDPANLPWASLNIDIAIDSTGVFKELDTAQKHIDAGAKKVVITAPSSTAPMFVMGVNEEKYTS
DLKIVSNASCTTNCLAPLAKVINDAFGIEEGLMTTVHSMTATQKTVDGPSHKDWRGGRTASGNIIPSSTG
AAKAVGKVLPELQGKLTGMAFRVPTVDVSVVDLTVKLNKETTYDEIKKVVKAAAEGKLKGVLGYTEDAVV
SSDFLGDSNSSIFDAAAGIQLSPKFVKLVSWYDNEYGYSTRVVDLVEHVAKA
>SC3
MIRIAINGFGRIGRLVLRLALQRKDIEVVAVNDPFISNDYAAYMVKYDSTHGRYKGTVSHDDKHIIIDGV
KIATYQERDPANLPWGSLKIDVAVDSTGVFKELDTAQKHIDAGAKKVVITAPSSSAPMFVVGVNHTKYTP
DKKIVSNASCTTNCLAPLAKVINDAFGIEEGLMTTVHSMTATQKTVDGPSHKDWRGGRTASGNIIPSSTG
AAKAVGKVLPELQGKLTGMAFRVPTVDVSVVDLTVKLEKEATYDQIKKAVKAAAEGPMKGVLGYTEDAVV
SSDFLGDTHASIFDASAGIQLSPKFVKLISWYDNEYGYSARVVDLIEYVAKA
>KluYEAST
MVRVAINGFGRIGRLVLRIALSRPNVEVVAINDPFISVDYAAYMFKYDSTHGRFAGEVSHDENSLIIDGK
KVLVFQERDPATLPWGEHNVDIAIDSTGVFKELDSAQKHIDAGAKKVVITAPSSTAPMFVVGVNEDKYNG
ETIVSNASCTTNCLAPLAKVVNNAFGIEEGLMSTIHSITATQKTVDGPSQKDWRGGRTASGNIIPSSTGA
AKAVGKVLPELQGKLTGMAFRVPTVDVSVVDLTVKLAKEATYDEIKAVIKKASENELKGILGYTEDAVVS
SDFLGDTNSSIFDAAAGIQLSPKFVKLVTWYDNEYGYSTRVVDLVELVAKN
>Mold
MVKVAINGLGKIGRLVMRIALSRANVEVVAINDPFITVDYAAYMFKYDSTHGKYAGDVQYEGNTLVIDGK
KIKVFQERDPAQLPWGEEGIDIAIDSTGVFKELDSAQKHIDAGAKKVVITAPSSTAPMFVMGVNEEKYAG
ETIVSNASCTTNCLAPLAKVIDEQFGIEEGLMTTVHSLTATQKTVDGPSMKDWRGGRTASGNIIPSSTGA
AKAVGKVLPQLNGKLTGMAFRVPTVDVSVVDLTVKLNKETTYDEIKAAIKAASEGKLKGILGYTEDAVVS
TDFLGDNNSSIFDASAGIMLSPKFVKLVSWYDNEYGYSTRVVDLVEHVAAN
>RiceBlastFungus
MVKCGINGFGRIGRIVFRNAIEHPDCEIVAVNDPFIEPKYAKYMLEYDSTHGRFKGTVEVSGSDLVVNGK
KVKFYTERDPANIPWSETGAEYVVESTGVFTTTDKASAHLKGGAKKVIISAPSADAPMYVMGVNEKSYDG
SASVISNASCTTNCLAPLAKVINDKFGIVEGLMTTVHSYTATQKTVDGPSAKDWRGGRGAAQNIIPSSTG
AAKAVGKVIPALNGKLTGMSMRVPTANVSVVDLTCRLEKGASYEEIKAAIKEAADGPLKGILEYTEDDVV
SSDMIGNNASSIFDAQAGIALNDKFVKLVSWYDNEWGYSRRVIDLVTYISKVDGGK
>RedBreadMold
MVVKVGINGFGRIGRIVFRNAIEHDDIHIVAVNDPFIEPKYAAYMLRYDTTHGNFKGTIEVDGADLVVNG
KKVKFYTERDPAAIPWSETGADYIVESTGVFTTTEKASAHLKGGAKKVIISAPSADAPMYVMGVNNETYD
GSADVISNASCTTNCLAPLAKVIHDNFTIVEGLMTTVHSYTATQKTVDGPSAKDWRGGRTAAQNIIPSST
GAAKAVGKVIPDLNGKLTGMAMRVPTANVSVVDLTARIEKGATYDEIKEVIKKASEGPLAGILAYTEDEV
VSSDMNGNPASSIFDAKAGISLNKNFVKLVSWYDNEWGYSRRVLDLISYISKVDAKKA
>Arabidopsis1
MAFSSLLRSAASYTVAAPRPDFFSSPASDHSKVLSSLGFSRNLKPSRFSSGISSSLQNGNARSVQPIKAT
ATEVPSAVRRSSSSGKTKVGINGFGRIGRLVLRIATSRDDIEVVAVNDPFIDAKYMAYMLKYDSTHGNFK
GSINVIDDSTLEINGKKVNVVSKRDPSEIPWADLGADYVVESSGVFTTLSKAASHLKGGAKKVIISAPSA
DAPMFVVGVNEHTYQPNMDIVSNASCTTNCLAPLAKVVHEEFGILEGLMTTVHATTATQKTVDGPSMKDW
RGGRGASQNIIPSSTGAAKAVGKVLPELNGKLTGMAFRVPTSNVSVVDLTCRLEKGASYEDVKAAIKHAS
EGPLKGILGYTDEDVVSNDFVGDSRSSIFDANAGIGLSKSFVKLVSWYDNEWGYSNRVLDLIEHMALVAA
SH
>Arabidopsis2
MALSSLLRSAATSAAAPRVELYPSSSYNHSQVTSSLGFSHSLTSSRFSGAAVSTGKYNAKRVQPIKATAT
EAPPAVHRSRSSGKTKVGINGFGRIGRLVLRIATFRDDIEVVAVNDPFIDAKYMAYMFKYDSTHGNYKGT
INVIDDSTLEINGKQVKVVSKRDPAEIPWADLGAEYVVESSGVFTTVGQASSHLKGGAKKVIISAPSADA
PMFVVGVNEKTYLPNMDIVSNASCTTNCLAPLAKVVHEEFGILEGLMTTVHATTATQKTVDGPSMKDWRG
GRGASQNIIPSSTGAAKAVGKVLPELNGKLTGMAFRVPTPNVSVVDLTCRLEKDASYEDVKAAIKFASEG
PLRGILGYTEEDVVSNDFLGDSRSSIFDANAGIGLSKSFMKLVSWYDNEWGYSNRVLDLIEHMALVAASR
>SativaRice
MAQQLSAPFRAAAAAGSRASAAAADPAKVLRLRSAGSAQFTSIAASSSFARNIEPLRAIATQAPPAVPQY
SSGEKTKVGINGFGRIGRLVLRIATSRDDIEVVAVNDPFIDAKYMAYMFKYDSTHGPFKGSIKVVDDSTL
EINGKKVTITSKRDPADIPWGNFGAEYVVESSGVFTTTEKASAHLKGGAKKVVISAPSADAPMFVVGVNE
KSYDPKMNVVSNASCTTNCLAPLAKVVHEEFGIVEGLMTTVHATTATQKTVDGPSMKDWRGGRGAAQNII
PSSTGAAKAVGKVLPELNGKLTGMAFRVPTPNVSVVDLTCRIEKSASYDDVKAAIKAASEGALKGILGYT
DEDVVSNDFVGDARSSIFDAKAGIGLSSSFMKLVSWYDNEWGYSNRVLDLIAHMALVNAKH
>JaponicaRice
MASLAVPLRASATPAIAGTGSGGGSRAADPVKVSCVRSKVTCGFPSVGASSSLASSVEPVRATATQAPLA
THQSSSTEKTKVGINGFGRIGRLVLRIATNRDDIEVVAVNDPFIDAKYMAYMFKYDSTHGPFKGTIKVVD
ESTLEINGKKISVTSKRDPSDIPWGNFGAEYVVESSGVFTTTEKASAHLKGGARKVVISAPSADAPMFVV
GVNEKNYNPSMNVVSNASCTTNCLAPLAKIVHEEFGIAEGLMTTVHATTATQKTVDGPSMKDWRGGRGAS
QNIIPSSTGAAKAVGKVLPALNGKLTGMAFRVPTPNVSVVDLTCRLEKSASYEDVKAAIKEASEGSLKGI
LGYTDEDVVSNDFIGDTRSSIFDAKAGIGLSSSFMKLVSWYDNEWGYSNRVLDLIGHMALVNAKP
>Malaria
MAVTKLGINGFGRIGRLVFRAAFGRKDIEVVAINDPFMDLNHLCYLLKYDSVHGQFPCEVTHADGFLLIG
EKKVSVFAEKDPSQIPWGKCQVDVVCESTGVFLTKELASSHLKGGAKKVIMSAPPKDDTPIYVMGINHHQ
YDTKQLIVSNASCTTNCLAPLAKVINDRFGIVEGLMTTVHASTANQLVVDGPSKGGKDWRAGRCALSNII
PASTGAAKAVGKVLPELNGKLTGVAFRVPIGTVSVVDLVCRLQKPAKYEEVALEIKKAAEGPLKGILGYT
EDEVVSQDFVHDNRSSIFDMKAGLALNDNFFKLVSWYDNEWGYSNRVLDLAVHITNN
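
Before pasting a set of sequences into the tree-building tool, it can help to check that the FASTA text is well formed and to get a feel for what a percent identity score means. The Python sketch below is one way to do that. The function names `parse_fasta` and `percent_identity` are hypothetical helpers written for this handout, not part of any NCBI tool. The identity calculation is a simplification: it compares two sequences position by position and therefore only works on sequences of equal length, whereas the scores on the HomoloGene page come from a proper alignment.

```python
def parse_fasta(text):
    """Return a dict mapping each label (the text after '>') to its sequence."""
    records = {}
    label = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            label = line[1:].strip()
            if not label:
                raise ValueError("found a '>' line with no name after it")
            if label in records:
                raise ValueError(f"duplicate label: {label}")
            records[label] = ""
        elif label is None:
            raise ValueError("sequence text found before the first '>' line")
        else:
            # Sequences are wrapped over several lines; join them back together.
            records[label] += line
    return records


def percent_identity(seq_a, seq_b):
    """Percent of positions with the same residue (equal-length sequences only)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences differ in length; align them first")
    if not seq_a:
        raise ValueError("empty sequences")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)


# First line of the Human and Chimp GAPDH sequences from the set above.
sample = """>Human
MGKVKVGVNGFGRIGRLVTRAAFNSGKVDIVAINDPFIDLNYMVYMFQYDSTHGKFHGTVKAENGKLVIN
>Chimp
MGKVKVGVNGFGRIGRLVTRAAFNSGKVDIVAINDPFIDLNYMVYMFQYDSTHGKFHGTVKAENGKLVIN
"""
records = parse_fasta(sample)
print(len(records), "sequences:", ", ".join(records))
print(percent_identity(records["Human"], records["Chimp"]))
```

For the Human and Chimp fragment shown, the two strings are the same at every position, which is consistent with the 100% identity mentioned in step 5. A duplicate or missing label raises an error, which is a quick way to catch renaming mistakes in your own gene set.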