We believe the putative At3G48190 protein is likely an artifactual chimera of two separate proteins, a PWWP-RRM protein and an ATM kinase protein. We know of no direct data demonstrating that the PWWP-RRM sequences are physically joined to the AtATM homologous region, and there is significant evidence against At3G48190 being a single, composite protein, drawn largely from other plants:

1) In plant species where EST sequences encoding proteins homologous to the N-terminal portion of AtATM (AA 1-836) are known, the translated C-terminal regions are highly homologous and contain stop codons.

>Brassica oleracea
QKPKNPPISVESMKQNLLMMTAMLDKSGDHLSRETKAKLKSDISALLEKISSMPSSSSSSS*
>Raphanus sativus
RELNEKPTSVDSVKQNLLMITAMLEKSGDRLSRETKAKLKSEVTELLEKVSSMPCSSSSSSSA*
>Quercus petraea
ANPGEAPPLDSIRQNLQMMTSMLEKSGDNLSPEMRAKLESEIKGLLHKVTSMAGSSST*
>Manihot esculenta
SMPAEAPPIDFIRQNLEMMTSMLEKSGDNLSLEMRAKLETEIKGLLKKVTSMXXSSSS*
>Theobroma cacao
PHLGDAPPIDFIRQNLEMMTSMLERSGDNLSPEMKAKLESEIKGLLKKVSSLPNSSSS*
>Populus trichocarpa
PKLAEAPPIDFIRQNLEMMTSMLEKSGDNLSPEMRAKLEIEIKGLLKKVSSLPSSSS*
>Euphorbia esula
PAPAPAPPIDLIRQNLEMMTSMLEKSGDNLSPEMRAKLETEIKGLLKKVSSLPGSSSSS*
>Carica papaya
CHPGEVPPIDFIKQNLEAMTSMLEKSGDNLSPEMRVKLESEIKGLLEKISSMPSSSSS*
>Gossypium hirsutum
PRLGDAPPIDVIKKNLEMMTSMLEKSGDNLSPEMKAKLESEIKGLLKKVSHCPGSSS*
>Glycine max
PRLGDAPPIDVIKKNLEMMTSMLEKSGDNLSPEMKAKLESEIKGLLKKVSHCPGSSS*
>Solanum tuberosum
PSNGEGPDLVVIKQNLESMTTMLEKAGDNISPEMKAKLESEVKGFLEKVSNMVGSSSS*
>Solanum lycopersicum
GAVPSNGEGPDLVVIKQNLESMTTMLEKAGDNISPEIKAKLESEVKGFLEKVSTMVGSSSS*
>Theobroma cacao
SSKKSLASQKSADQASQNSADQASQLNFFRHKLEMLTSMLEKSDEKMSSEIKSKVHSEIKGLLEKVNTMVKSSS*
>Nicotiana tabacum
PSNGEGPDLVVIKQNLEAMTSMLEKAGDNISPEMRAKLENEVKGFLKKVSSMVGSSSS*
>Capsella rubella
PKPNSIPTSVESMRQNLLMMTAMLENSGDSLSRETKAKLKSEITGLLEKVSSMPSSSSS*

Translation of EST sequences containing the C-termini of PWWP-RRM proteins from different species. (Note that all have an in-frame stop codon.)

2) The deduced C-terminus of the Arabidopsis PWWP-RRM N-terminal portion of At3G48190 ends with a stop codon and is highly homologous to proteins from other species.

  1 MKLQNPDKKT LREGFSQESS VVALDSGVLA MSGLKCDGKF PVKDVLMEEG
 51 GDKVRKIQVS GGNISLVVDF SGARTSSNNF FESNASCVNE NLVKGNGYRE
101 DETQEFLVGN LVWVMTKYKK WWPGEVVDFK ADAKESFMVR SIGQSHLVSW
151 FASSKLKPFK ESFEQVLNQR NDNGFFDALQ KAMSLLSNSL KLDMTCSCIA
201 DGNGIVSAQN ITTRKNKPLI LREFSVDRLE PKEFVTQLKN IAKCVLNAGV
251 LESTVMQSQL SAFYTLFGHK QIPMAQLHEN EGRKSFTAKM SDSKFIGSPS
301 ICAGNSRKRF RKEWFRKFVS EVDNVSARDD LVNVPPSDLI SKLKLLAVGY
351 NCSEETENIG LFEWFFSKFR ISVYHDENAY KMQLANMAGF KDLMLATNAN
401 RGTVQKTLKS KKIGKSKMEP LNGVSVADTE QKTFELQISK KSNIESLNGV
451 SVADTEQKTF ELQILEKSNI ESLNGVSTPN IDHEASKSNN SGKTKINHII
501 GHSNFPSSVA KVQLAKDFQD KLLVQAPDRK AMTADTLSRP AAILVPDLNS
551 GGNALGTAEF DHMQRPETLI QHNVCPQEEK TPRSTILNFQ VTAHQGVSGT
601 QFVSSQPTSY KHFTSADLFT YSGKKKRGRK RKNAEELPIV AHASATTGIP
651 DLNGTNTEPT LVLPQVEPTQ RRRRRKKEES PNGLTRGITI LFLKFSSQVS
701 MPSRDDLTST FSAFGPLDSS ETHVSEEFSG AQVAFVSSAD AIEAVKSLEK
751 ANPFGETLVN FRLQQKLITV QRNIAPRMPV ISHVSPVPKP NNIPTSMDAM
801 RQNLLMMTAM LEKSGDSLSR ETKAKLKSEI TGLLEKVSSI PSSSSS*

The predicted C-terminal sequence is shown; the sequence generated by extending the reading frame beyond the putative AtATM 2/3 exon splice site is shown in red. The sequence used to search the plant EST database is italicized.

Comparison of the predicted C-terminal sequence from the Arabidopsis At3G48190 PWWP-RRM protein region AA 788-836 (annotated as the N-terminal portion of AtATM) with C-terminal sequences ending with a stop codon, generated from ESTs from different plant species. Plant EST sequences were searched with the At3G48190 protein fragment (AA 788-836). No EST sequences encompassing this region in Arabidopsis are represented in the Arabidopsis EST database.

EST clones from other species (including the closely related species Brassica oleracea, Raphanus sativus and Capsella rubella) have in-frame stop codons, supporting the view that the N-terminal portion of the Arabidopsis AtATM protein is most likely a separate PWWP-RRM protein. (Note that the evidence for the PWWP-RRM region of the AtATM mRNA sequence was generated by RT-PCR, which could potentially generate an aberrant product, and has no other independent confirmation in ESTs.)

3) The deduced Arabidopsis ATM N-terminal sequence is highly homologous to the annotated N-terminal ATM regions from other species, which lack the PWWP-RRM extension.

4) In most species the genes encoding the PWWP-RRM protein and the ATM protein are located on separate chromosomes. While in the Brassicaceae the gene for a putative PWWP-RRM protein is located in close proximity (1.6 kb) to the chromosomal region encoding the ATM catalytic domain, in other species the homologs of At3G48190 are located on separate chromosomes. For example, the Theobroma cacao gene encoding a 1076 AA PWWP-RRM protein homologous to the PWWP-RRM N-terminal segment of AtATM (region 50-856 AA) is located on chromosome 10, while the gene encoding a 3039 AA protein homologous to AtATM region 820-3856 AA, which contains the ATM catalytic domain, is located on chromosome 3. (In Medicago truncatula, Solanum lycopersicum, Phaseolus vulgaris, Citrus sinensis, etc., the gene arrangement is similar to that of Theobroma cacao.)
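
The stop-codon and homology observations in points 1) and 2) can be checked mechanically. The following is a minimal Python sketch, not the procedure used to generate the data in this comment. It takes four of the translated C-terminal sequences quoted above (the Arabidopsis entry is the final numbered row of the At3G48190 PWWP-RRM sequence), confirms that each ends with a stop codon (`*`), and computes an ungapped identity over the last residues before the stop. The function names and the 40-residue window are illustrative assumptions; the identity measure is deliberately crude, and a real analysis would use a proper alignment tool.

```python
# Sequences copied from the comment above; "*" marks the in-frame stop codon.
FASTA = """
>Arabidopsis thaliana
RQNLLMMTAMLEKSGDSLSRETKAKLKSEITGLLEKVSSIPSSSSS*
>Capsella rubella
PKPNSIPTSVESMRQNLLMMTAMLENSGDSLSRETKAKLKSEITGLLEKVSSMPSSSSS*
>Brassica oleracea
QKPKNPPISVESMKQNLLMMTAMLDKSGDHLSRETKAKLKSDISALLEKISSMPSSSSSSS*
>Raphanus sativus
RELNEKPTSVDSVKQNLLMITAMLEKSGDRLSRETKAKLKSEVTELLEKVSSMPCSSSSSSSA*
"""


def parse_fasta(text):
    """Parse FASTA-formatted text into a {header: sequence} dictionary."""
    records = {}
    header = None
    for line in text.strip().splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].strip()
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


def ends_with_stop(seq):
    """Return True if the translated sequence terminates in a stop codon."""
    return seq.endswith("*")


def tail_identity(a, b, n):
    """Fraction of identical residues in the last n residues before the stop.

    The two tails are compared position by position, anchored at the
    C-terminus, with no gaps allowed (an assumption made for simplicity).
    """
    a_tail = a.rstrip("*")[-n:]
    b_tail = b.rstrip("*")[-n:]
    if len(a_tail) != n or len(b_tail) != n:
        raise ValueError("both sequences must have at least n residues")
    matches = sum(x == y for x, y in zip(a_tail, b_tail))
    return matches / n


records = parse_fasta(FASTA)

# Point 1): every translated EST C-terminus should end with a stop codon.
all_stop = all(ends_with_stop(seq) for seq in records.values())

# Point 2): compare the Arabidopsis tail with a close relative.
identity = tail_identity(
    records["Arabidopsis thaliana"], records["Capsella rubella"], 40
)

print("all sequences end with a stop codon:", all_stop)
print("Arabidopsis vs Capsella tail identity: %.2f" % identity)
```

Because the serine-rich tails differ in length between species, anchoring at the C-terminus only makes sense for pairs whose tails have the same length, as with Arabidopsis and Capsella rubella here; other pairs would need a gapped alignment.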