Screening Rpl7a for Eukaryote-Eukaryote LGT
The presence of the intron in the G. lamblia Rpl7a gene could, in
principle, be the result of a eukaryote-eukaryote LGT event. Such
events, although rare, now seem to be well supported, so it is prudent
to rule this out as a potential confounding factor. A representative
sample of L7a sequences from 20 eukaryotes and two archaea was aligned
over its full length and then trimmed to remove low-confidence aligned
positions, leaving 236 aligned residues. The L7a sequence is not
strongly conserved, so phylogenetic noise is relatively high for this
protein. Analysis of 100 bootstrap replicates with PROML under the
JTT+Gamma model yielded a poorly resolved tree, with strong support
only for the Metazoa as a clade and for one jakobid subgroup. The
G. lamblia sequence clustered basally within the eukaryotic radiation,
specifically as the sister taxon to the Spironucleus barkhanus
sequence, albeit with less than 50% bootstrap support. We take this to
indicate that the G. lamblia sequence has no strong association with
any phylogenetically inappropriate taxon, and thus that there is no
indication that this gene entered the G. lamblia genome through a
lateral gene transfer event.
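
The bootstrap analysis described above rests on resampling alignment
columns with replacement before each tree is inferred. The Python
sketch below illustrates only that resampling step. It is not the
authors' pipeline (tree inference was done with PROML and is not
reproduced here), the function names are hypothetical, and the toy
alignment is just the first ten columns of three sequences from the
alignment in this document.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one pseudoreplicate: columns drawn with replacement."""
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all sequences must have the same aligned length")
    # Draw as many column indices as the alignment has columns.
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(alignment[name][c] for c in cols) for name in names}


def bootstrap_replicates(alignment, n_replicates=100, seed=0):
    """Build n_replicates pseudoreplicate alignments (seeded for repeatability)."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


# Toy input (assumption: illustration only, not the full 236-column data).
toy = {
    "Hsap_L7a": "KKVAPAPLFE",
    "Cint_L7a": "KRVAPAPLFE",
    "Atha_L7A": "VKVA---LFE",
}

replicates = bootstrap_replicates(toy, n_replicates=100, seed=1)
print(len(replicates), len(replicates[0]["Hsap_L7a"]))
```

Each pseudoreplicate keeps every column intact across taxa, so a tree
inferred from it can be compared with the trees from the other
replicates; the percentage of replicates that recover a given clade is
that clade's bootstrap support.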
[Figure: L7a phylogenetic tree. Taxa, in the order drawn: Haloarcula
marismortui, Methanococcus jannaschii, Trichomonas vaginalis, Naegleria
gruberi, Drosophila melanogaster, Homo sapiens, Ciona intestinalis,
Dictyostelium discoideum, Hartmannella vermiformis, Cyanidioschyzon
merolae, Acanthamoeba castellanii, Saccharomyces cerevisiae,
Arabidopsis thaliana, Bigelowiella natans, Trimastix pyriformis,
Malawimonas jakobiformis, Seculamonas ecuadoriensis, Reclinomonas
americana, Jakoba libera, Euglena gracilis, Spironucleus barkhanus,
Giardia lamblia. Bootstrap values shown on the figure: 78, 78, 69, 98,
50.]
Accession Numbers
Rpl7a (partial EST sequences)

Taxon                      Accession number or source
Acanthamoeba castellanii   AY925000
Arabidopsis thaliana       NM_130329
Bigelowiella natans        DQ118090
Ciona intestinalis         DOE Joint Genome Institute (http://genome.jgipsf.org/ciona4/ciona4.home.html) (ID:ci0100137466)
Cyanidioschyzon merolae    Cyanidioschyzon merolae Genome Project (http://merolae.biol.s.u-tokyo.ac.jp/) (ID:CML317C)
Dictyostelium discoideum   AC116100
Drosophila melanogaster    X82782
Euglena gracilis           AY925002
Giardia lamblia            Giardia lamblia Genome Database (contig 4036) (www.mbl.edu/Giardia)
Haloarcula marismortui     YP_134885
Hartmannella vermiformis   AY925001
Homo sapiens               X52130
Jakoba libera              AY924997
Malawimonas jakobiformis   AY924996
Methanococcus jannaschii   NP_248198
Naegleria gruberi          DQ118091
Reclinomonas americana     AY924999
Saccharomyces cerevisiae   AAB65045
Seculamonas ecuadoriensis  AY924998
Spironucleus barkhanus     DQ118093
Trimastix pyriformis       DQ118092
Trichomonas vaginalis      TIGR (Trichomonas vaginalis Genome Project) (ID: 43310.m00099)
L7a Sequence Alignment
22 236
Hver_L7a   QKKTNTKLFE KTPKNFGIGQ DVAPKRNMTH FVRWPKYVRL QRQHRILLNR LKVPPVINQF TNTLDKNAAT SLFKILHAHR PEDKASKKKR LLEAAQKAKK PK-FVKMGIN HVTSLVESKK AKLVVIAHDV DPIEIVMWLP TLCVKMGIPY VIVKGKARLG QVVHKKTAAV LAVTEVDPKF STDFTNLVAL AKDQYNNKYT EQMKKYGGRT FGYKHTSQKA KQDRRRRKEE AKKEQX
Cmer_L7a   TEKASKRLFE RRPKNFGIGQ SIQPKRDLSR FVRWPRYVRL QRQRKILLQR LKVPPAIAQF QKTADKNLTD SLFRLLERYR PEEKVAKRQR LQQAAERTPK PI-FVKHGVN HVTDLIEQKK AKLVIIAHDV DPIELVMWMP ALCRKLDIPY VIVKGKARLG ALVHLKTATC LAVTGVHDRD RAELSKIIEV CKMRFNDRYE EIRRQWGGGV LGIKSTHKLE KRRRALAIEE AKRAQA
Scerevisia KKVAPAPLTH STPKNFGIGQ AVQPKRNLSR YVKWPEYVRV QRQKKILSIR LKVPPTIAQF QYTLDRNTAA ETFKLFNKYR PETAAEKKER LTKEAAASPK PY-AVKYGLN HVVALIENKK AKLVLIANDV DPIELVVFLP ALCKKMGVPY AIVKGKARLG TLVNQKTSAV AALTEVRAED EAALAKLVST IDANFADKYD EVKKHWGGGI LGNKAQAKMD KRAKNSDSAX XXXXXX
Acas_L7a   KQVAAAPLFE KRPKNFGIGQ DIQPKRDLTR FVRWPKYIRM QRQKRVLLHR LKVPPTINQF TRTLDKNLAK NLFAFVDKYR PETKAEKADR LKKRAAEVAT PY-FVKYGLN HVTSLVESKK AKLVVIAHDV DPIELVVWLP SLCKKVGVPY CIVKSKSRLG QVVHKKTSAV LAITNVRKED QPALATLTKA IQENYNDRYD DLRRQWGGLQ LGRKSVHKQK AKAKAAAANQ XXXXXX
Hsap_L7a   KKVAPAPLFE KRPKNFGIGQ DIQPKRDLTR FVKWPRYIRL QRQRAILYKR LKVPPAINQF TQALDRQTAT QLLKLAHKYR PETKQEKKQR LLARAEKTKR PP-VLRAGVN TVTTLVENKK AQLVVIAHDV DPIELVVFLP ALCRKMGVPY CIIKGKARLG RLVHRKTCTT VAFTQVNSED KGALAKLVEA IRTNYNDRYD EIRRHWGGNV LGPKSVARIA KLEKAKAKEL ATKLGX
Cint_L7a   KRVAPAPLFE KRPKNFGIGQ DIQPKRDLTH FVRWPKYIRL QRQKVVLQKR LKIPPAINQF TNTLDRQTAT SLFRLAKKYQ PESKLEKRER LRQRAQETRR PL-VVSSGVN TVTNLIERKK AQLVVIAHDV EPVEIVVYLP ALCRKMNVPY CIVKGKSRLG RLVHRKTCTC VAITDVNNED KGALSKLVES VRTNYNERFD EIRRHWGGGI MGNKSLARIA KIEKLRAKDA AQKASL
Dmel_L7A   KKVAPAPLFE KRPKNFGIGQ NVQPKRDLSR FVRWPKYIRV QRQKAVLQKR LKVPPPIHQF SQTLDKTTAV KLFKLLEKYR PESPLAKKLR LKKIAEAKKK PS-YVSAGTN TVTKLIEQKK AQLVVIAHDV DPLELVLFLP ALCRKMGVPY CIVKGKARLG RLVRRKTCTT LALTTVDNND KANFGKVLEA VKTNFNERHE EIRRHWGGGI LGSKSLARIS KLERAKAREL AQKQGX
Atha_L7A   VKVA---LFE RRPKQFGIGG ALPPKKDLSR YIKWPKSIRL QRQKRILKQR LKVPPALNQF TKTLDKNLAT SLFKVLLKYR PEDKAAKKER LVKKAQASKK PI-VVKYGLN HVTYLIEQNK AQLVVIAHDV DPIELVVWLP ALCRKMEVPY CIVKGKSRLG AVVHQKTASC LCLTTVKNED KLEFSKILEA IKANFNDKYE EYRKKWGGGI MGSKSQAKTK AKERVIAKEA AQRMNX
Jlib_L7a   RVVAKPNLHV ARPKNFGLGG TVQPKRDVTR FVKWPKYIRL QRQRSILLRR LKVPPSIRQF RFTLDKNVAT QLITLLSKYS PETKAEKSAR LLAAAQEKKK PF-VLKFGLN HITTLVEQKK AKLVVIAHDV DPIELVLWLP ALCRKMDVPY CIIKGKSRLG QLVHQKTATC VALTDVADEH SAAFNRLVES CKLSLPDEHX XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX XXXXXX
Secu_L7a   KPYEKPGLFP STPKTFGIGG TVLPKRDLTR FVKWPQYVRL QRQRSILLRR LKVPPAINRF SHVLDKNLAT NLFKLLIKYR PEDKQEKNAR LKAAAEATSK PY-VVKYGLN HITSLVEQKK AKLVVIAHDV DPIELVVWLP TLCKKMGVPY CIVKGKARLG QVVHQKNATA LAFVDVRDED KNSFSKLLES IENAHVENXX XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX XXXXXX
Rame_L7a   XXXXEP-LHL ARPKNMSVGA SVRKVKDLTR FVRWPKYVRL QRQRSILNRR LKVPPAINHF TFTLNKNAAT NLFKLLLKYR PETRSEKRSR LREQAANSTK PH-FVKYGLN HIVSLVESKE AKLVIIAHDV DPIELVVFLP VLCRRMGVPY CIVKGKARLG QVVHKKTATA LAITSVRDED VGALSRVLES VDGAIPKDSH ALNXXXXXXX XXXXXXXXXX XXXXXXXXXX XXXXXX
Mjak_L7a   PFIPGSKLFD SRKLNFGIGQ SLKKGLDLSR YVKWPRYVRL QRQRKILYQR LKVPPSIAQF TNALDKNNAT SLFKLLHKYR PEDKAEKAAR LKAAGEATKK PV-VLKYGLN HITSLVENNK AKLVVIAHDV DPIELVLWLP ALCRKRNVPY VIVKSKSRLG QLVHKKTATA VALTGVRAED EHDFAQLVQV ARTSYXXXXX XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX XXXXXX
Tpyr_L7a   KKTAGKKLIA KRPRSFHVGG DVHPPRDVSR YVKWPKYIRL QRQKKVMYQR MKTPAMINQF TQTLDKHVAT EMFKLLHKYR PEDKKAKKER LLKLAEAPKK PM-TLKCGLN HITTLVEQKK AKLVVIAHDV DPIELVIWLP TLCRKMDVPY CIVKGKARLG TLVGLKTATC LALTDVKPED KKTFSDLVAV CRTNYIDRAE AIVRQQGGGK LGVKSAKKIE KRQKALAKDE KQREKM
Bnat_L7a   RRVAANKLFK KTPKDFRIGR SVQPKQNLSR FVKWPRYIRI QRQKAILKQR LRVPPAVNQF TNTIDKNQAM TLFKLLAKYR PENRKEKQER LKAIAEEKDK PK-VLKYGLN HITSLVESRK AKLVVIAHDV DPIEMVIWLP TLCRKMKVPF CIVKGKARLG ALCHMKTCTA VAITDVEKQD DRALNQFVAS VAPMYEEAPR AWGAVG---- FGIKSRHKIR ARQXXXXXXX XXXXXX
Ddis_L7a   TKAAPAKLYT KNVKNFGTGF GVQPKRDLTH FTHWPRYIKL QRQRRVLLKR LKVPPTINQF TRVFDKNTAV HLFKLLDKYR PEEASVKKAR LLKIAEAAEK PVQHLRFGID SVTKLIEKKK AKLVVIAHDV DPVELVLYLP TLCRRMDVPY CIVKSKSRLG ELVHMRNASC VALTGVNSAD SNELALLVES AKQMFNNN-S EHRKTWGGNT LSGPARAILA KRQKAEAKES LAKSKX
Tvag_L7a   EKVIED-LFA EETAEK---- -VADKTAQTR FPK---YVQL QRQKRILMKR LKVPPPVNHF NHTLGKDAAV ALFKFLEKYR PETKTEKKQR NKEDAEKSGN KK-ALVQGVK NVTAAIESKK AQLVIIAHDV DPIELVIWMP ALCRNLEIPY CIVKSKSRLG QIVGMKTCSC VALAEVKPED RAAFTKIVDS VNSGFLAHYK EEMHQWGGGE LSEKTIEKLK AQGKYKEXXX XXXXXX
Ngru_L7a   KKVVKTSLLA PRPKDIGIGR DLGAKMNLTR FVKWPVYVRL QRQKRILLKR VKIPPAINQF RIVADKPLAH SVVQFLAKYK PEDEQQKKER LRQAAKDKKS ES-SLIHGIN EVVKAVERKQ ASLVVIAHDV EPIELVLFLP ALCKKLDIPY VIVKSKSRLG QLVHMKNCAA VALTEVKNED RDPFAKIVEA VRGAFVDRFR TVNTKWGGGQ LSKRSIIEQK NRIKKLKXXX XXXXXX
Egra_L7a   KDATGKKLFE SRPKNFSVGQ DLQPKRDLSR FVRWPAYIKR QRQKRILLKR LRVPPAINQF NHTVDRHLKK ELFKFALKYK PESSFERRSR LKKEAEAVPG PR--VYSGAQ RVFRLVEQKR AKLVLIAHDV DPIEIVLCLP ALCRKQGIPW CIVKGKANLG KLVGLKTATS LAFVDIKNGD KTDFEKLTQS VKLAYNDKYE ELSRKWGGLR LSKKSKQKMA KKKRIAAANA AKXXXX
Glamblia   SKVSGSDLAV PENKSRS--- --KCDFDLTP FVRWPRQVRI QRQKAVLQRR LKVPPTVNQF MNPISRNLTN EIFNLARKYS PESKEEHKAR LLQIADA-KS DKLVIASGIR RITSLVESKR AKLVLIANDV DPLELVLWLP TLCHKMGVPY AIVRTKGDLG KLVHLKKTTS VCFTDVNPED KPTFDKILAA V--AHEVDYA KAMKTYGGGV RREDEAQQMX XXXXXXXXXX XXXXXX
Sbar_L7a   PNSSSSKIVK TTVKTSQMGV GVHTQKDLTR YVRWPAYIRI QRQKALLQTR LKVPGAINQF HHPVSKNLNI EILNFAKKYI KETREERKVR ITKMAET--- ----VISGIN EVTNLIEKKK AKLVLIANDV DPIELVMWLP SLCHKMQIPY AIIRSKSELG ALAGLKTCAV LAIDEIRSED TAALRSITDK IAVEVN--YE KTIKNHGGNT LSLRSQRKLP SVEKHQFYQF EKKKKK
Hmarismort XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX XMPVYVDFDV PADLEDDALE ALEVARD--- ----VKKGTN ETTKSIERGS AELVFVAEDV QPEEIVMHIP ELADEKGVPF IFVEQQDDLG HAAGLEVGSA AAAVTDAGEA DADVEDIADK VEELRXXXXX XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX XXXXXX
Mjannischi XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX XMAVYVKFKV PEEIQKELLD AVAKAQK--- ----IKKGAN EVTKAVERGI AKLVIIAEDV KPEEVVAHLP YLCEEKGIPY AYVASKQDLG KAAGLEVAAS SVAIINEGDA E-ELKVLIEK VNVLKQXXXX XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX XXXXXX
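
Several sequences in this alignment come from partial ESTs and contain
long runs of X as well as gap characters. The Python sketch below
shows one way to read a PHYLIP-style alignment (a "taxa sites" header,
then a ten-character name field followed by the residues) and to count
gap and X positions per taxon. It is an illustration under stated
assumptions, not part of the original analysis: the function names are
hypothetical, the reading of X as missing data is an assumption, and
the three-row input is only the first twenty columns of three
sequences from the alignment.

```python
def parse_phylip(text):
    """Parse sequential PHYLIP text with one line per taxon."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n_taxa, n_sites = (int(x) for x in lines[0].split())
    records = {}
    for line in lines[1:]:
        name = line[:10].strip()            # strict 10-character name field
        records[name] = line[10:].replace(" ", "")
    if len(records) != n_taxa:
        raise ValueError("header taxon count does not match the data")
    for name, seq in records.items():
        if len(seq) != n_sites:
            raise ValueError(f"{name}: expected {n_sites} sites, got {len(seq)}")
    return records


def missing_summary(records):
    """Count gaps ('-'), unknown residues ('X') and remaining sites per taxon."""
    summary = {}
    for name, seq in records.items():
        gaps = seq.count("-")
        unknown = seq.upper().count("X")    # assumption: X marks missing data
        summary[name] = {
            "gaps": gaps,
            "unknown": unknown,
            "informative": len(seq) - gaps - unknown,
        }
    return summary


# Toy input: first 20 columns of three rows (illustration only).
EXAMPLE = """3 20
Hsap_L7a  KKVAPAPLFE KRPKNFGIGQ
Atha_L7A  VKVA---LFE RRPKQFGIGG
Hmarismort XXXXXXXXXX XXXXXXXXXX
"""

records = parse_phylip(EXAMPLE)
summary = missing_summary(records)
for name, counts in summary.items():
    print(name, counts)
```

A per-taxon count of this kind makes it easy to see which sequences
contribute little signal to a given region, which is relevant when a
tree built from partial sequences is poorly resolved.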