How to infer genetic relatedness among organisms by constructing a PHYLOGENETIC TREE
Mr. Abercrombie
AP Biology

Tree by David Hillis

The image above represents a phylogenetic tree constructed by comparing sequences from 3000 different organisms.

You have learned about some of the amazing conservation of molecular machinery that humans share with other living organisms. In this exercise, you will use bioinformatics tools available online to generate a “TREE” that illustrates the genetic similarities among organisms that all share a common gene. Phylogenetic trees, or phylogenies, are frequently used by biologists to infer genetic relatedness among species for which we have genetic information (DNA, RNA, or protein sequences). You will construct a tree by comparing protein sequences for GAPDH (glyceraldehyde-3-phosphate dehydrogenase), a “housekeeping” gene with ancient origins that is therefore shared across eukaryotes, as you will see.

1. The first step in this process is to visit the NCBI website: http://www.ncbi.nlm.nih.gov/
2. Click on the tab that says All Databases, scroll down to HomoloGene, and click it.
3. Once this selection is highlighted, type GAPDH into the search bar and click the Search button.
4. Click on the first search result, HomoloGene:107053. Gene conserved in Eukaryota
5. In the display bar that says HomoloGene, scroll down to Alignment Scores. This will display the percent similarity of each sequence to every other sequence. For example, the human GAPDH protein sequence is 100% identical to the chimpanzee GAPDH protein sequence.
6. Now, in the same display bar, scroll down to FASTA. This will convert all protein sequences into a format that can be interpreted by the bioinformatics tool you will use to make the phylogenetic tree. For the first part of this assignment, I have taken some of the work out of this for you by renaming the sequences with simpler designations that will make the labels for the tree easy to read.
(YOU WILL HAVE TO DO THIS YOURSELF later for the gene that you select to construct your own phylogeny.) Copy the FASTA-formatted GAPDH sequences that I have placed at the bottom of this document.
7. To paste the sequences, go to the web page for the software that will construct the tree: http://www.phylogeny.fr/
8. On the main bar, click Phylogeny Analysis, then “One Click”. This is where you will input your FASTA sequences: paste the FASTA-formatted GAPDH sequences into the white box, then click Submit. The results may take a minute or so, depending on your connection speed.
9. The GAPDH sequences will now be aligned and a phylogenetic tree will be constructed.
10. From here, you can right-click the tree, choose Copy Image, and then paste it into a Word or PowerPoint file.
11. Now that you know how to make a phylogenetic tree, your assignment is to choose your favorite gene and use it to construct your own unique phylogenetic tree. No two students may use the same gene. It may take a few tries to find a gene that makes for a “good phylogeny.” For example, if the HomoloGene search results give you only one or two different organisms, you will need to choose a gene for which more organisms (at least 8) share homology. Think about basic enzymes of eukaryotic metabolism, or maybe you want to analyze some prokaryotic genes. There are many other ways to construct phylogenetic trees.
12. Your final document will comprise a ‘copy image’ copy of the GAPDH phylogeny and a copy of your favorite gene phylogeny.
13. Remember! You will need to simplify your FASTA-formatted sequences by replacing each name with a simplified version of the organism name, being careful to leave the > in place; otherwise it will not work. Take a closer look at the GAPDH sequences below.
14. Last but not least, I will not accept any excuses as to why you cannot complete this assignment.
If you cannot complete this at home due to limitations, you will have to work on this on the school computers in the library. Getting hands-on experience with bioinformatics tools and constructing phylogenies is an important part of this class.

Copy and paste this set of GAPDH sequences, which is in FASTA format:

>Human
MGKVKVGVNGFGRIGRLVTRAAFNSGKVDIVAINDPFIDLNYMVYMFQYDSTHGKFHGTVKAENGKLVIN
GNPITIFQERDPSKIKWGDAGAEYVVESTGVFTTMEKAGAHLQGGAKRVIISAPSADAPMFVMGVNHEKY
DNSLKIISNASCTTNCLAPLAKVIHDNFGIVEGLMTTVHAITATQKTVDGPSGKLWRDGRGALQNIIPAS
TGAAKAVGKVIPELNGKLTGMAFRVPTANVSVVDLTCRLEKPAKYDDIKKVVKQASEGPLKGILGYTEHQ
VVSSDFNSDTHSSTFDAGAGIALNDHFVKLISWYDNEFGYSNRVVDLMAHMASKE
>Chimp
MGKVKVGVNGFGRIGRLVTRAAFNSGKVDIVAINDPFIDLNYMVYMFQYDSTHGKFHGTVKAENGKLVIN
GNPITIFQERDPSKIKWGDAGAEYVVESTGVFTTMEKAGAHLQGGAKRVIISAPSADAPMFVMGVNHEKY
DNSLKIISNASCTTNCLAPLAKVIHDNFGIVEGLMTTVHAITATQKTVDGPSGKLWRDGRGALQNIIPAS
TGAAKAVGKVIPELNGKLTGMAFRVPTANVSVVDLTCRLEKPAKYDDIKKVVKQASEGPLKGILGYTEHQ
VVSSDFNSDTHSSTFDAGAGIALNDHFVKLISWYDNEFGYSNRVVDLMAHMASKE
>Dog
MVKVGVNGFGRIGRLVTRAAFNSGKVDIVAINDPFIDLNYMVYMFQYDSTHGKFHGTVKAENGKLVINGK
SISIFQERDPANIKWGDAGAEYVVESTGVFTTMEKAGAHLKGGAKRVIISAPSADAPMFVMGVNHEKYDN
SLKIVSNASCTTNCLAPLAKVIHDHFGIVEGLMTTVHAITATQKTVDGPSGKMWRDGRGAAQNIIPASTG
AAKAVGKVIPELNGKLTGMAFRVPTPNVSVVDLTCRLEKAAKYDDIKKVVKQASEGPLKGILGYTEDQVV
SCDFNSDTHSSTFDAGAGIALNDHFVKLISWYDNEFGYSNRVVDLMVYMASKE
>Cow2
MVKVGVNGFGRIGRLVTRAAFNSGKVDIVAINDPFIDLHYMVYMFQYDSTHGKFNGTVKAENGKLVINGK
AITIFQERDPANIKWGDAGAEYVVESTGVFTTMEKAGAHLKGGAKRVIISAPSADAPMFVMGVNHEKYNN
TLKIVSNASCTTNCLAPLAKVIHDHFGIVEGLMTTVHAITATQKTVDGPSGKLWRDGRGAAQNIIPASTG
AAKAVGKVIPELNGKLTGMAFRVPTPNVSVVDLTCRLEKPAKYDEIKKVVKQASEGPLKGILGYTEDQVV
SCDFNSDTHSSTFDAGAGIALNDHFVKLISWYDNEFGYSNRVVDLMVHMASKE
>Cow1
MVKVGVNGFGRIGRLVTRAAFNSGKVDIVAINDPFIDLHYMVYMFQYDSTHGKFNGTVKAENGKLVINGK
AITIFQERDPANIKWGDAGAEYVVESTGVFTTMEKAGAHLKGGAKRVIISAPSADAPMFVMGVNHEKYNN
TLKIVSNASCTTNCLAPLAKVIHDHFGIVEGLMTTVHAITATQKTVDGPSGKLWRDGRGAAQNIIPASTG
AAKAVGKVIPELNGKLTGMAFRVPTPNVSVVDLTCRLEKPAKYDEIKKVVKQASEGPLKGILGYTEDQVV
SCDFNSDTHSSTFDAGAGIALNDHFVKLISWYDNEFGYSNRVVDLMVHMASKE
>Mouse
MVKVGVNGFGRIGRLVTRAAICSGKVEIVAINDPFIDLNYMVYMFQYDSTHGKFNSTVKAENGKLVINGK
PITIFQERDPANIKWGEASAEYVVESTGVFTTMEKARAHLKGGAKRVIISAPSADAPMFVMGVNHEKYDN
SLKIFSNASCTTNCLAPVAKVIHDNFGIVEGLMTTVHAITATQKTVDGPSGKLWRDGRGAAQNIIPASTG
AAKAVGKVIPELNWKLTGMAFRVPTPNVFVLDLTCRLEKPAKYDDIKKVVKQASEGPLKGILGYTEDQVV
SCYFNSNSHSSTFDARAGIALNDNFVKLISWYDNEYGYSNRVVDLMAYMASKE
>Rat1
MVKVGVNGFGRIGRLVTRAAFSCDKVDIVAINDPFIDLNYMVYMFQYDSTHGKFNGTVKAENGKLVINGK
PITIFQERDPANIKWGDAGAEYVVESTGVFTTMEKAGAHLKGGAKRVIISAPSADAPMFVMGVNHEKYDN
SLKIVSNASCTTNCLAPLAKVIHDNFGIVEGLMTTVHAITATQKTVDGPSGKLWRDGRGAAQNIIPASTG
AAKAVGKVIPELNGKLTGMAFRVPTPNVSVVDLTCRLEKPAKYDDIKKVVKQAAEGPLKGILGYTEDQVV
SCDFNSNSHSSTFDAGAGIALNDNFVKLISWYDNEYGYSNRVVDLMAYMASKE
>Rat2
MVKVGVNGFGRIGHLVTRAAFSCDKVDIVAINDPFIDLNYMVYMFQYDSTHGKFNGTVKAENGKLVINGK
PITIFQERDPANIKWGDAGAEYVVESTGVFTTMEKAGAHLKGGAKRVIISAPSADAPMFVMGVNHEKYDN
SLKIVSNASCTTNCLAPLGKVIHDNFGIVEGLMTTVHAITATQKTVDGPSGKLWRDGRGAAQNIIPASTG
AAKAVGKVIPELNGKLTGMAFRVPTPNVSVVDLTCRLEKPAKYDDIKKVVKQVAEGPLKGILGYTEDQVV
SCDFNSNSHSSTFDAGAGIALNDNFVKLISWYDNEYGYSNRVVDLMAYMASKE
>Rat3
MVKVGVNGFGRIGRLVTRAAFSCDKVDIVAINDPFIDLNYMVYMFQYDSTHGKFNGTVKAENGKLVINGK
PITIFQERDPANIKWGDAGAEYVVESTGVFTTMEKAGAHLKGGAKRVIISAPSADAPMFVMGVNHEKYDN
SLKIVSNASCTTNCLAPLAKVIHDNFGIVEGLMTTVHAITATQKTVDGPSGKLWRDGRGAAQNIIPASTG
AAKAVGKVIPELNGKLTGMAFRVPTPNVSVVDLTCRLEKPAKYDDIKKVVKQAAEGPLKGILGYTEDQVV
SCDFNSNSHSSTFDAGAGIALNDNFVKLISWYDNEYGYSNRVVDLMAYMASKE
>Rat4
MVKVGVNGFGRIGRLVTRAAFSCDKVDIVAINDPFIDLNYMVYMSQYGSPHGKFNSTVKAENGKLVNNGK
PITIFQERDPANIKWGDAGAEYVMESTGIFTTMEKAGAHLKGGAKRVIISAPSADAPMFVMGVNHEKYDN
SLKIVSNASCTTNCLAPLAKVIHDNFGIVEGLMTTVHAITATQKTVDGPSGKLWRDGRGAAQNIIPASTG
AAKAVGKVIPELNGKLTGMAFRVPTPNVSVVDLTCRLEKPAKYDDIKKVVKQAAEGPLKGILGYTEDQVV
SCDFNSNSHSSTFDAGAGIALNDNFVKLISWYDNEYGYSNRVVDLMAYMASKE
>Chicken
MVKVGVNGFGRIGRLVTRAAVLSGKVQVVAINDPFIDLNYMVYMFKYDSTHGHFKGTVKAENGKLVINGH
AITIFQERDPSNIKWADAGAEYVVESTGVFTTMEKAGAHLKGGAKRVIISAPSADAPMFVMGVNHEKYDK
SLKIVSNASCTTNCLAPLAKVIHDNFGIVEGLMTTVHAITATQKTVDGPSGKLWRDGRGAAQNIIPASTG
AAKAVGKVIPELNGKLTGMAFRVPTPNVSVVDLTCRLEKPAKYDDIKRVVKAAADGPLKGILGYTEDQVV
SCDFNGDSHSSTFDAGAGIALNDHFVKLVSWYDNEFGYSNRVVDLMVHMASKE
>Zebrafish
MVKVGINGFGRIGRLVTRAAFLTKKVEIVAINDPFIDLDYMVYMFQYDSTHGKYKGEVKAEGGKLVIDGH
AITVYSERDPANIKWGDAGATYVVESTGVFTTIEKASAHIKGGAKRVIISAPSADAPMFVMGVNHEKYDN
SLTVVSNASCTTNCLAPLAKVINDNFVIVEGLMSTVHAITATQKTVDGPSGKLWRDGRGASQNIIPASTG
AAKAVGKVIPELNGKLTGMAFRVPTPNVSVVDLTVRLEKPAKYDEIKKVVKAAADGPMKGILGYTEHQVV
STDFNGDCRSSIFDAGAGIALNDHFVKLVTWYDNEFGYSNRVCDLMAHMASKE
>Fruitfly1
MSKIGINGFGRIGRLVLRAAIDKGASVVAVNDPFIDVNYMVYLFKFDSTHGRFKGTVAAEGGFLVVNGQK
ITVFSERDPANINWASAGAEYVVESTGVFTTIDKASTHLKGGAKKVIISAPSADAPMFVCGVNLDAYSPD
MKVVSNASCTTNCLAPLAKVINDNFEIVEGLMTTVHATTATQKTVDGPSGKLWRDGRGAAQNIIPAATGA
AKAVGKVIPALNGKLTGMAFRVPTPNVSVVDLTVRLGKGATYDEIKAKVEEASKGPLKGILGYTDEEVVS
TDFFSDTHSSVFDAKAGISLNDKFVKLISWYDNEFGYSNRVIDLIKYMQSKD
>Fruitfly2
MSKIGINGFGRIGRLVLRAAIDKGANVVAVNDPFIDVNYMVYLFKFDSTHGRFKGTVAAEGGFLVVNGQK
ITVFSERDPANINWASAGAEYIVESTGVFTTIDKASTHLKGGAKKVIISAPSADAPMFVCGVNLDAYKPD
MKVVSNASCTTNCLAPLAKVINDNFEIVEGLMTTVHATTATQKTVDGPSGKLWRDGRGAAQNIIPASTGA
AKAVGKVIPALNGKLTGMAFRVPTPNVSVVDLTVRLGKGASYDEIKAKVQEAANGPLKGILGYTDEEVVS
TDFLSDTHSSVFDAKAGISLNDKFVKLISWYDNEFGYSNRVIDLIKYMQSKD
>mosquito
MSKIGINGFGRIGRLVLRAAITKGASVVAINDPFIGVDYMVYLFKYDSTHGRFKGEVSAQDGCLVVNGQK
IAVFQERDPKAIPWGKAGAEYVVESTGVFTTTEKASAHLEGGAKKVIISAPSADAPMFVVGVNLEAYEPS
MKVVSNASCTTNCLAPLAKVINDNFGILEGLMTTVHATTATQKTVDGPSGKLWRDGRGAAQNIIPAATGA
AKAVGKVIPALNGKLTGMAFRVPTPNVSVVDLTVRLSKPATYDQIKQKVKEAANGPMKGILDYTEEEVVS
TDFVGDCHSSIFDAKAGIQLSDTFVKLISWYDNEYGYSNRVVDLIKYMQTKD
>nematode1
MSKANVGINGFGRIGRLVLRAAVEKDTVQVVAVNDPFITIDYMVYLFKYDSTHGQFKGTVTYDGDFLIVQ
KDGKSSHKIKVFNSKDPAAIAWGSVKADFVVESTGVFTTKEKASAHLQGGAKKVIISAPSADAPMYVVGV
NHEKYDASNDHVISNASCTTNCLAPLAKVINDNFGIIEGLMTTVHAVTATQKTVDGPSGKLWRDGRGAGQ
NIIPASTGAAKAVGKVIPELNGKLTGMAFRVPTPDVSVVDLTVRLEKPASMDDIKKVVKAAADGPMKGIL
AYTEDQVVSTDFVSDPHSSIFDTGACISLNPNFVKLVSWYDNEYGYSNRVVDLIGYIATRG
>nematode2
MTKPSVGINGFGRIGRLVLRAAVEKDSVNVVAVNDPFISIDYMVYLFQYDSTHGRFKGTVAHEGDYLLVA
KEGKSQHKIKVYNSRDPAEIQWGASGADYVVESTGVFTTIEKANAHLKGGAKKVIISAPSADAPMFVVGV
NHEKYDHANDHIISNASCTTNCLAPLAKVINDNFGIIEGLMTTVHAVTATQKTVDGPSGKLWRDGRGAGQ
NIIPASTGAAKAVGKVIPELNGKLTGMAFRVPTPDVSVVDLTARLEKPASLDDIKKVIKAAADGPMKGIL
AYTEDQVVSTDFVSDTNSSIFDAGASISLNPHFVKLVSWYDNEFGYSNRVVDLISYIATKA
>nematode3
MPKPSVGINGFGRIGRLVLRAAVEKDSVNVVAVNDPFISIDYMVYLFQYDSTHGRFKGTVAHEGDYLLVA
KEGKSQHKIKVYNSRDPAEIQWGASGADYVVESTGVFTTIEKANAHLKGGAKKVIISAPSADAPMFVVGV
NHEKYDHANDHIISNASCTTNCLAPLAKVINDNFGIIEGLMTTVHAVTATQKTVDGPSGKLWRDGRGAGQ
NIIPASTGAAKAVGKVIPELNGKLTGMAFRVPTPDVSVVDLTARLEKPASLDDIKKVIKAAADGPMKGIL
AYTEDQVVSTDFVSDTNSSIFDAGASISLNPHFVKLVSWYDNEFGYSNRVVDLISYIATKA
>nematode4
MSKANVGINGFGRIGRLVLRAAVEKDTVQVVAVNDPFITIDYMVYLFKYDSTHGQFKGTVTYDGDFLIVQ
KDGKSSHKIKVFNSKDPAAIAWGSVKADFVVESTGVFTTKEKASAHLQGGAKKVIISAPSADAPMYVVGV
NHEKYDASNDHVVSNASCTTNCLAPLAKVINDNFGIIEGLMTTVHAVTATQKTVDGPSGKLWRDGRGAGQ
NIIPASTGAAKAVGKVIPELNGKLTGMAFRVPTPDVSVVDLTVRLEKPASMDDIKKVVKAAADGPMKGIL
AYTEDQVVSTDFVSDPHSSIFDAGACISLNPNFVKLVSWYDNEYGYSNRVVDLIGYIATRG
>schizoYEAST1
MAIPKVGINGFGRIGRIVLRNALVAKTIQVVAINDPFIDLEYMAYMFKYDSTHGRFDGSVEIKDGKLVID
GNAIDVHNERDPADIKWSTSGADYVIESTGVFTTQETASAHLKGGAKRVIISAPSKDAPMYVVGVNEEKF
NPSEKVISNASCTTNCLAPLAKVINDTFGIEEGLMTTVHATTATQKTVDGPSKKDWRGGRGASANIIPSS
TGAAKAVGKVIPALNGKLTGMAFRVPTPDVSVVDLTVKLAKPTNYEDIKAAIKAASEGPMKGVLGYTEDA
VVSTDFCGDNHSSIFDASAGIQLSPQFVKLVSWYDNEWGYSRRVVDLVAYTAAKDN
>SchizoYEAST2
MAIPKVGINGFGRIGRIVLRNAILTGKIQVVAVNDPFIDLDYMAYMFKYDSTHGRFEGSVETKGGKLVID
GHSIDVHNERDPANIKWSASGAEYVIESTGVFTTKETASAHLKGGAKRVIISAPSKDAPMFVVGVNLEKF
NPSEKVISNASCTTNCLAPLAKVINDTFGIEEGLMTTVHATTATQKTVDGPSKKDWRGGRGASANIIPSS
TGAAKAVGKVIPALNGKLTGMAFRVPTPDVSVVDLTVKLAKPTNYEDIKAAIKAASEGPMKGVLGYTEDS
VVSTDFCGDNHSSIFDASAGIQLSPQFVKLVSWYDNEWGYSHRVVDLVAYTASKD
>SCYeast1
MVRVAINGFGRIGRLVMRIALSRPNVEVVALNDPFITNDYAAYMFKYDSTHGRYAGEVSHDDKHIIVDGK
KIATYQERDPANLPWGSSNVDIAIDSTGVFKELDTAQKHIDAGAKKVVITAPSSTAPMFVMGVNEEKYTS
DLKIVSNASCTTNCLAPLAKVINDAFGIEEGLMTTVHSLTATQKTVDGPSHKDWRGGRTASGNIIPSSTG
AAKAVGKVLPELQGKLTGMAFRVPTVDVSVVDLTVKLNKETTYDEIKKVVKAAAEGKLKGVLGYTEDAVV
SSDFLGDSHSSIFDASAGIQLSPKFVKLVSWYDNEYGYSTRVVDLVEHVAKA
>SCYeast2
MVRVAINGFGRIGRLVMRIALQRKNVEVVALNDPFISNDYSAYMFKYDSTHGRYAGEVSHDDKHIIVDGH
KIATFQERDPANLPWASLNIDIAIDSTGVFKELDTAQKHIDAGAKKVVITAPSSTAPMFVMGVNEEKYTS
DLKIVSNASCTTNCLAPLAKVINDAFGIEEGLMTTVHSMTATQKTVDGPSHKDWRGGRTASGNIIPSSTG
AAKAVGKVLPELQGKLTGMAFRVPTVDVSVVDLTVKLNKETTYDEIKKVVKAAAEGKLKGVLGYTEDAVV
SSDFLGDSNSSIFDAAAGIQLSPKFVKLVSWYDNEYGYSTRVVDLVEHVAKA
>SC3
MIRIAINGFGRIGRLVLRLALQRKDIEVVAVNDPFISNDYAAYMVKYDSTHGRYKGTVSHDDKHIIIDGV
KIATYQERDPANLPWGSLKIDVAVDSTGVFKELDTAQKHIDAGAKKVVITAPSSSAPMFVVGVNHTKYTP
DKKIVSNASCTTNCLAPLAKVINDAFGIEEGLMTTVHSMTATQKTVDGPSHKDWRGGRTASGNIIPSSTG
AAKAVGKVLPELQGKLTGMAFRVPTVDVSVVDLTVKLEKEATYDQIKKAVKAAAEGPMKGVLGYTEDAVV
SSDFLGDTHASIFDASAGIQLSPKFVKLISWYDNEYGYSARVVDLIEYVAKA
>KluYEAST
MVRVAINGFGRIGRLVLRIALSRPNVEVVAINDPFISVDYAAYMFKYDSTHGRFAGEVSHDENSLIIDGK
KVLVFQERDPATLPWGEHNVDIAIDSTGVFKELDSAQKHIDAGAKKVVITAPSSTAPMFVVGVNEDKYNG
ETIVSNASCTTNCLAPLAKVVNNAFGIEEGLMSTIHSITATQKTVDGPSQKDWRGGRTASGNIIPSSTGA
AKAVGKVLPELQGKLTGMAFRVPTVDVSVVDLTVKLAKEATYDEIKAVIKKASENELKGILGYTEDAVVS
SDFLGDTNSSIFDAAAGIQLSPKFVKLVTWYDNEYGYSTRVVDLVELVAKN
>Mold
MVKVAINGLGKIGRLVMRIALSRANVEVVAINDPFITVDYAAYMFKYDSTHGKYAGDVQYEGNTLVIDGK
KIKVFQERDPAQLPWGEEGIDIAIDSTGVFKELDSAQKHIDAGAKKVVITAPSSTAPMFVMGVNEEKYAG
ETIVSNASCTTNCLAPLAKVIDEQFGIEEGLMTTVHSLTATQKTVDGPSMKDWRGGRTASGNIIPSSTGA
AKAVGKVLPQLNGKLTGMAFRVPTVDVSVVDLTVKLNKETTYDEIKAAIKAASEGKLKGILGYTEDAVVS
TDFLGDNNSSIFDASAGIMLSPKFVKLVSWYDNEYGYSTRVVDLVEHVAAN
>RiceBlastFungus
MVKCGINGFGRIGRIVFRNAIEHPDCEIVAVNDPFIEPKYAKYMLEYDSTHGRFKGTVEVSGSDLVVNGK
KVKFYTERDPANIPWSETGAEYVVESTGVFTTTDKASAHLKGGAKKVIISAPSADAPMYVMGVNEKSYDG
SASVISNASCTTNCLAPLAKVINDKFGIVEGLMTTVHSYTATQKTVDGPSAKDWRGGRGAAQNIIPSSTG
AAKAVGKVIPALNGKLTGMSMRVPTANVSVVDLTCRLEKGASYEEIKAAIKEAADGPLKGILEYTEDDVV
SSDMIGNNASSIFDAQAGIALNDKFVKLVSWYDNEWGYSRRVIDLVTYISKVDGGK
>Red bread mold
MVVKVGINGFGRIGRIVFRNAIEHDDIHIVAVNDPFIEPKYAAYMLRYDTTHGNFKGTIEVDGADLVVNG
KKVKFYTERDPAAIPWSETGADYIVESTGVFTTTEKASAHLKGGAKKVIISAPSADAPMYVMGVNNETYD
GSADVISNASCTTNCLAPLAKVIHDNFTIVEGLMTTVHSYTATQKTVDGPSAKDWRGGRTAAQNIIPSST
GAAKAVGKVIPDLNGKLTGMAMRVPTANVSVVDLTARIEKGATYDEIKEVIKKASEGPLAGILAYTEDEV
VSSDMNGNPASSIFDAKAGISLNKNFVKLVSWYDNEWGYSRRVLDLISYISKVDAKKA
>Arabidopsis1
MAFSSLLRSAASYTVAAPRPDFFSSPASDHSKVLSSLGFSRNLKPSRFSSGISSSLQNGNARSVQPIKAT
ATEVPSAVRRSSSSGKTKVGINGFGRIGRLVLRIATSRDDIEVVAVNDPFIDAKYMAYMLKYDSTHGNFK
GSINVIDDSTLEINGKKVNVVSKRDPSEIPWADLGADYVVESSGVFTTLSKAASHLKGGAKKVIISAPSA
DAPMFVVGVNEHTYQPNMDIVSNASCTTNCLAPLAKVVHEEFGILEGLMTTVHATTATQKTVDGPSMKDW
RGGRGASQNIIPSSTGAAKAVGKVLPELNGKLTGMAFRVPTSNVSVVDLTCRLEKGASYEDVKAAIKHAS
EGPLKGILGYTDEDVVSNDFVGDSRSSIFDANAGIGLSKSFVKLVSWYDNEWGYSNRVLDLIEHMALVAA
SH
>Arabidopsis2
MALSSLLRSAATSAAAPRVELYPSSSYNHSQVTSSLGFSHSLTSSRFSGAAVSTGKYNAKRVQPIKATAT
EAPPAVHRSRSSGKTKVGINGFGRIGRLVLRIATFRDDIEVVAVNDPFIDAKYMAYMFKYDSTHGNYKGT
INVIDDSTLEINGKQVKVVSKRDPAEIPWADLGAEYVVESSGVFTTVGQASSHLKGGAKKVIISAPSADA
PMFVVGVNEKTYLPNMDIVSNASCTTNCLAPLAKVVHEEFGILEGLMTTVHATTATQKTVDGPSMKDWRG
GRGASQNIIPSSTGAAKAVGKVLPELNGKLTGMAFRVPTPNVSVVDLTCRLEKDASYEDVKAAIKFASEG
PLRGILGYTEEDVVSNDFLGDSRSSIFDANAGIGLSKSFMKLVSWYDNEWGYSNRVLDLIEHMALVAASR
>SativaRice
MAQQLSAPFRAAAAAGSRASAAAADPAKVLRLRSAGSAQFTSIAASSSFARNIEPLRAIATQAPPAVPQY
SSGEKTKVGINGFGRIGRLVLRIATSRDDIEVVAVNDPFIDAKYMAYMFKYDSTHGPFKGSIKVVDDSTL
EINGKKVTITSKRDPADIPWGNFGAEYVVESSGVFTTTEKASAHLKGGAKKVVISAPSADAPMFVVGVNE
KSYDPKMNVVSNASCTTNCLAPLAKVVHEEFGIVEGLMTTVHATTATQKTVDGPSMKDWRGGRGAAQNII
PSSTGAAKAVGKVLPELNGKLTGMAFRVPTPNVSVVDLTCRIEKSASYDDVKAAIKAASEGALKGILGYT
DEDVVSNDFVGDARSSIFDAKAGIGLSSSFMKLVSWYDNEWGYSNRVLDLIAHMALVNAKH
>JaponicaRice
MASLAVPLRASATPAIAGTGSGGGSRAADPVKVSCVRSKVTCGFPSVGASSSLASSVEPVRATATQAPLA
THQSSSTEKTKVGINGFGRIGRLVLRIATNRDDIEVVAVNDPFIDAKYMAYMFKYDSTHGPFKGTIKVVD
ESTLEINGKKISVTSKRDPSDIPWGNFGAEYVVESSGVFTTTEKASAHLKGGARKVVISAPSADAPMFVV
GVNEKNYNPSMNVVSNASCTTNCLAPLAKIVHEEFGIAEGLMTTVHATTATQKTVDGPSMKDWRGGRGAS
QNIIPSSTGAAKAVGKVLPALNGKLTGMAFRVPTPNVSVVDLTCRLEKSASYEDVKAAIKEASEGSLKGI
LGYTDEDVVSNDFIGDTRSSIFDAKAGIGLSSSFMKLVSWYDNEWGYSNRVLDLIGHMALVNAKP
>Malaria
MAVTKLGINGFGRIGRLVFRAAFGRKDIEVVAINDPFMDLNHLCYLLKYDSVHGQFPCEVTHADGFLLIG
EKKVSVFAEKDPSQIPWGKCQVDVVCESTGVFLTKELASSHLKGGAKKVIMSAPPKDDTPIYVMGINHHQ
YDTKQLIVSNASCTTNCLAPLAKVINDRFGIVEGLMTTVHASTANQLVVDGPSKGGKDWRAGRCALSNII
PASTGAAKAVGKVLPELNGKLTGVAFRVPIGTVSVVDLVCRLQKPAKYEEVALEIKKAAEGPLKGILGYT
EDEVVSQDFVHDNRSSIFDMKAGLALNDNFFKLVSWYDNEWGYSNRVLDLAVHITNN
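
Step 13 asks you to replace each FASTA name with a simplified organism name while leaving the > in place. If you would rather not do this by hand for your own gene, a short Python script can do it for you. What follows is only a sketch, not a required method. It assumes each downloaded name line ends with the species in square brackets, as NCBI protein records typically do, and it falls back to the first word of the name line when no brackets are found. The helper names short_label and simplify_fasta_headers are hypothetical, and the accession numbers and protein name in the example are made-up placeholders, not real records.

```python
import re
from collections import Counter


def short_label(header):
    """Derive a short label from one FASTA name line (the text after '>').

    ASSUMPTION: the species sits in a trailing [Genus species] pair.
    If no brackets are found, the first word of the header is used.
    """
    match = re.search(r"\[([^\]]+)\]\s*$", header)
    if match:
        label = match.group(1)
    elif header.split():
        label = header.split()[0]
    else:
        label = "unnamed"
    # Replace spaces and punctuation with underscores so the label is one word.
    return re.sub(r"\W+", "_", label).strip("_") or "unnamed"


def simplify_fasta_headers(fasta_text):
    """Return FASTA text with every name line shortened; sequences are untouched."""
    lines = [ln.strip() for ln in fasta_text.splitlines() if ln.strip()]

    # First pass: count labels so repeated organisms can be numbered (Rat1, Rat2, ...).
    totals = Counter(short_label(ln[1:].strip()) for ln in lines if ln.startswith(">"))

    seen = Counter()
    out = []
    for ln in lines:
        if ln.startswith(">"):
            label = short_label(ln[1:].strip())
            if totals[label] > 1:
                seen[label] += 1
                label = f"{label}{seen[label]}"
            out.append(">" + label)  # the '>' must stay in place
        else:
            out.append(ln)
    return "\n".join(out) + "\n"


# Placeholder records for illustration only (accessions and protein name are invented;
# the residues are the opening stretch of sequences from the set above).
example = """>XP_0000001.1 hypothetical protein A [Homo sapiens]
MGKVKVGVNGFGRIGRLVTRAAFNSGKVDIV
>XP_0000002.1 hypothetical protein A [Rattus norvegicus]
MVKVGVNGFGRIGRLVTRAAFSCDKVDIV
>XP_0000003.1 hypothetical protein A [Rattus norvegicus]
MVKVGVNGFGRIGHLVTRAAFSCDKVDIV
"""

result = simplify_fasta_headers(example)
print(result)
```

The script numbers repeated organisms in the order they appear, which mirrors labels such as Rat1 through Rat4 in the GAPDH set. Spaces are replaced with underscores because many tools read a FASTA name only up to the first space. Always check the renamed file by eye before pasting it into the tree-building page.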