Record of past encounters with phages and plasmids delivers new insights about the origin and dispersal of fire blight pathogen Erwinia amylovora

F. Rezzonico, T.H.M. Smits and B. Duffy
Agroscope Changins-Wädenswil
Swiss National Competence Centre for Fire Blight
CH-8820 Wädenswil, Switzerland

Keywords: fire blight, CRISPR, diversity, dispersal

Abstract
Comparative genomics identified CRISPRs (Clustered Regularly Interspaced Short Palindromic Repeats) as regions of higher than expected variability in Erwinia amylovora, a species in which diversity is otherwise very narrow and difficult to evaluate with standard methodologies. Analysis of CRISPR regions revealed the existence of three major clusters for strains isolated from Spiraeoideae (e.g., Malus and Pyrus spp.) and suggested that the genotype of the causative agent of fire blight in Pomaceae was selectively enriched from the broader genetic pool that is present on wild host plants in North America.

INTRODUCTION
Most current methods for exploring relatedness in E. amylovora rely on the changes that single nucleotide polymorphisms (SNPs) induce in the overall restriction (e.g., PFGE, AFLP, ARDRA) or amplification (e.g., rep-PCR, RAPD) pattern of a strain. However, the occurrence of SNPs is essentially stochastic and does not deliver any precise information about the temporal succession of the mutational events. Additionally, the number of SNPs detected in E. amylovora is too low to provide adequate resolution (Smits et al., 2010c). CRISPRs are genomic regions composed of highly conserved direct repeats interspersed with variable spacer sequences of foreign origin (i.e., from phages or plasmids) that are incorporated into the CRISPR array upon internalization and processing of the invading nucleic acid element (Sorek et al., 2008).
These repeats, together with the associated cas genes (Haft et al., 2005), were shown to protect bacteria against invasion by mobile DNA elements by means of an RNA-interference-like mechanism (Makarova et al., 2006). Insertion of new spacer units is polarized at the 5′ end next to the leader sequence, whereas loss and duplication of spacers can also occur in the middle of the CRISPR array (Barrangou et al., 2007). This ordered incorporation of novel spacers delivers a chronological record of past encounters with phages and plasmids that makes it possible to retrace the point of divergence between lineages (Fig. 1) and, applied to E. amylovora, to put forward a substantiated hypothesis on the origin and dispersal of fire blight.

MATERIALS AND METHODS
Three CRISPR repeat regions (CRRs) were identified on the basis of the genome sequence of E. amylovora strain CFBP 1430 (Smits et al., 2010c). Inward-facing oligonucleotide primers were designed on the flanking regions and used for PCR amplification (Rezzonico et al., 2010). A primer walking strategy was adopted to obtain the sequence of the three CRRs within a collection of geographically and temporally distant E. amylovora strains comprising thirty-four isolates from Spiraeoideae hosts (belonging to the genera Malus, Pyrus, Prunus, Crataegus, Cydonia, Sorbus, Pyracantha and Rhaphiolepis) and three isolates from Rosoideae (Rubus spp.). The number and arrangement of the spacers in each strain were assessed to infer the most probable genealogy of the strains, and similarity to known phage or plasmid sequences was analyzed with BLASTN.

RESULTS AND DISCUSSION
Using CRISPR analysis, a total of 18 distinct genotypes were defined (Rezzonico et al., 2010). All the strains isolated from Spiraeoideae, except one, clustered in three major CRISPR groups (Fig.
2): both group II and group III were composed exclusively of bacteria originating from the United States, whereas group I (further subdivided into groups IA and IB) also contained strains from Europe, New Zealand and the Middle East. The strain isolated from Indian hawthorn was excluded from these three groups, as it shared only 26 spacers at the 3′ end of its CRR2 with them. Strains isolated from Rosoideae clustered separately and displayed a higher intrinsic diversity, sharing barely any spacers either with the E. amylovora strains isolated from Spiraeoideae or with each other. Overall, there was a high correlation between spacer sequences and plasmid content, supporting the hypothesis of a role for CRISPRs in adaptive, heritable immunity to foreign DNA elements (Rezzonico et al., 2010).

CRR3 was almost invariant: it comprised the same five spacers in every strain but three, all isolated from Rubus host plants, which showed minor changes in the identity of the last one or two spacers at the 5′ end of the array (Rezzonico et al., 2010). CRR3 is considered a remnant of a different CRISPR type that is present in E. pyrifoliae DSM 12163T and E. tasmaniensis Et1/99 (Kube et al., 2008; Rezzonico et al., 2010; Smits et al., 2010a; Smits et al., 2010b). In contrast, the number of spacers in CRR1 and CRR2 varied widely among the different E. amylovora strains (Rezzonico et al., 2010). CRR1 contained between 31 and 37 spacers in group I strains, with the variability mainly restricted to the sporadic deletion of a small number of spacers. Group II strains showed a major deletion of a total of 24 internal spacers in CRR1, while the remaining spacers were collinear with those found in group I strains. In contrast, CRR1 of group III strains was remarkably large and contained up to 98 spacers, only 10 and 2 of which were shared with those of groups I and II, respectively.
A number of the recently acquired spacers in the CRR1 of group III strains showed perfect identity to sequences of plasmid pEU30 (Foster et al., 2004), which is present in all group III strains. Likewise, compared with groups I and II, these isolates carried an additional set of 17 spacers at the 5′ end of their CRR2, adding up to a total of 49 spacers. The sequence of many of these additional spacers matched plasmid pEU30, suggesting that group III strains diverged recently from the other two CRISPR groups after the acquisition of the plasmid. In group II and group IA strains, CRR2 was composed of 28 to 35 spacers, while the two group IB strains showed internal deletions of 8 and 11 spacers, respectively. In both cases, the collinearity of the shared spacers was conserved.

A tentative evolutionary model for E. amylovora based on CRISPR data was developed under the assumption that strains showing internal spacer deletions or the addition of new spacers at the 5′ end of the array cannot be ancestral (Fig. 3). This is the case for both group IB and group II strains, which have undergone major deletions in the central regions of CRR2 and CRR1, respectively, as well as for group III strains, which show a burst of new spacers at the 5′ end of CRR2 after the encounter with plasmid pEU30. Thus, the most probable genotype of the earliest causal agent of fire blight in apple and pear trees must have been similar to that of group IA strains, which show the most complete CRRs among all the E. amylovora groups. This common ancestor probably originated from the reservoir of E. amylovora strains with broader genetic variability found on undomesticated Rubus spp. in North America, as suggested by the 3′ terminal spacer in CRR2 shared by some Spiraeoideae and Rubus isolates and by the only minor rearrangements in CRR3.
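
The assumption underlying this model can be illustrated with a minimal Python sketch. It is not the procedure used in the study: the function name `derive`, the spacer labels and the toy arrays are hypothetical, spacer duplications are ignored, and "internal deletion" is simplified to mean that both terminal spacers of the ancestral array are retained. Arrays are written from the 5′ (leader) end to the 3′ end.

```python
def derive(ancestor, descendant):
    """Explain `descendant` as `ancestor` plus 5' additions and internal deletions.

    Arrays are lists of spacer labels ordered from the 5' (leader) end to the
    3' end. Returns (added_at_5prime, deleted_internal), or None when the
    descendant cannot be derived this way. Duplications are ignored here.
    """
    known = set(ancestor)
    # Leading spacers unknown to the ancestor count as new 5' acquisitions.
    n_new = 0
    while n_new < len(descendant) and descendant[n_new] not in known:
        n_new += 1
    added, retained = descendant[:n_new], descendant[n_new:]
    # Assumption: deletions are internal, so both ends of the ancestor survive.
    if not retained or retained[0] != ancestor[0] or retained[-1] != ancestor[-1]:
        return None
    # Retained spacers must be collinear with the ancestor (a subsequence of it).
    remaining = iter(ancestor)
    if not all(spacer in remaining for spacer in retained):
        return None
    kept = set(retained)
    deleted = [spacer for spacer in ancestor if spacer not in kept]
    return added, deleted


# Hypothetical toy arrays; the labels are invented for illustration only.
complete = ["s5", "s4", "s3", "s2", "s1"]
with_deletion = ["s5", "s2", "s1"]
with_additions = ["n2", "n1", "s5", "s4", "s3", "s2", "s1"]

print(derive(complete, with_deletion))
print(derive(complete, with_additions))
print(derive(with_deletion, complete))
print(derive(with_additions, complete))
```

In this toy setting, both variant arrays can be derived from the complete array, whereas the complete array cannot be derived from either variant. This mirrors the reasoning by which the group with the most complete CRRs is taken as closest to the ancestral genotype.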
The Indian hawthorn isolate seems to be genetically close to the "missing link" between strains on undomesticated plants and fruit tree isolates, as it shares the more ancient part of its CRR2 region with other isolates from Spiraeoideae but shows (with the exception of the 3′ terminal spacer) a completely divergent CRR1 region.

In summary, global biodiversity among E. amylovora strains isolated from Spiraeoideae is very low in comparison to the diversity among strains isolated from Rubus spp. in North America. A possible explanation for this phenomenon is that the Spiraeoideae genotype of E. amylovora was selected and enriched from the broader genetic pool of the species through the recent encounter between a host plant imported from Europe (probably a Malus or Pyrus sp.) and a pathogen that was among the genotypes present on undomesticated plants on the North American continent (Smits et al., 2010b). Following this rationale, the observed genetic homogeneity of E. amylovora may not be the result of a protracted coevolution, but rather the outcome of a fortuitous encounter between the host plant and the bacterium. This could also explain the relatively low number of type III secretion system effectors found in this species compared to other plant pathogens such as Pseudomonas syringae or Xanthomonas spp. (Grant et al., 2006).

ACKNOWLEDGEMENTS
Funding was provided by the Swiss Federal Office of Agriculture (BLW Fire Blight Project – Pathogen). This work was conducted within the European Science Foundation supported research network COST Action 864.

Literature cited
Barrangou, R., Fremaux, C., Deveau, H., Richards, M., Boyaval, P., Moineau, S., Romero, D.A., and Horvath, P. 2007. CRISPR provides acquired resistance against viruses in prokaryotes. Science 315: 1709-1712.
Foster, G.C., McGhee, G.C., Jones, A.L., and Sundin, G.W. 2004. Nucleotide sequences, genetic organization, and distribution of pEU30 and pEL60 from Erwinia amylovora. Appl. Environ.
Microbiol. 70: 7539-7544.
Grant, S.R., Fisher, E.J., Chang, J.H., Mole, B.M., and Dangl, J.L. 2006. Subterfuge and manipulation: type III effector proteins of phytopathogenic bacteria. Annu. Rev. Microbiol. 60: 425-449.
Haft, D.H., Selengut, J., Mongodin, E.F., and Nelson, K.E. 2005. A guild of 45 CRISPR-associated (Cas) protein families and multiple CRISPR/Cas subtypes exist in prokaryotic genomes. PLoS Comput. Biol. 1: e60.
Kube, M., Migdoll, A.M., Müller, I., Kuhl, H., Beck, A., Reinhardt, R., and Geider, K. 2008. The genome of Erwinia tasmaniensis strain Et1/99, a non-pathogenic bacterium in the genus Erwinia. Environ. Microbiol. 10: 2211-2222.
Makarova, K.S., Grishin, N.V., Shabalina, S.A., Wolf, Y.I., and Koonin, E.V. 2006. A putative RNA-interference-based immune system in prokaryotes: computational analysis of the predicted enzymatic machinery, functional analogies with eukaryotic RNAi, and hypothetical mechanisms of action. Biol. Direct 1: 7.
Rezzonico, F., Smits, T.H.M., and Duffy, B. 2010. Diversity and functionality of CRISPR regions in fire blight pathogen Erwinia amylovora. Submitted.
Smits, T.H.M., Jaenicke, S., Rezzonico, F., Kamber, T., Goesmann, A., Frey, J.E., and Duffy, B. 2010a. Complete genome sequence of the fire blight pathogen Erwinia pyrifoliae DSM 12163T and comparative genomic insights into plant pathogenicity. BMC Genomics 11: 2.
Smits, T.H.M., Rezzonico, F., and Duffy, B. 2010b. Evolutionary insights from Erwinia amylovora genomics. Submitted.
Smits, T.H.M., Rezzonico, F., Kamber, T., Blom, J., Goesmann, A., Frey, J.E., and Duffy, B. 2010c. Complete genome sequence of the fire blight pathogen Erwinia amylovora CFBP 1430 and comparison to other Erwinia species. Mol. Plant-Microbe Interact. 23: 384-393.
Sorek, R., Kunin, V., and Hugenholtz, P. 2008. CRISPR – a widespread system that provides acquired resistance against phages in bacteria and archaea. Nat. Rev. Microbiol. 6: 181-186.

Figures

Fig. 1.
The assembly mechanism of CRISPR arrays makes it possible to retrace the point of divergence between lineages and to deduce the genealogy of the strains. Insertion of new spacers is polarized at the 5′ end of the cluster (A-D), while loss and duplication of older spacers can also occur in the middle of the array (E).

Fig. 2. Clustering of 37 E. amylovora strains based on the cumulative spacer patterns of CRR1, CRR2 and CRR3. Results of CRISPR typing were converted into a binary array according to the presence or absence of each spacer, and relationships were inferred using the UPGMA method. All Spiraeoideae strains except IH 3-1 (isolated from Indian hawthorn) clustered in one of the three main CRISPR groups. Strains Ea 6-96r, Ea 7-96r and IL-5 were isolated from Rubus plant species.

Fig. 3. Origin and dispersal of the causative agent of fire blight on Pomaceae based on CRISPR data. Undomesticated Rubus spp. in North America contained a reservoir of E. amylovora strains with broad genetic variability. A single highly virulent phenotype/genotype of E. amylovora was selectively enriched from this broader genetic pool through the introduction of domesticated Malus spp. from Europe to the East Coast of the United States. This genotype may have been similar to the one found in the Indian hawthorn isolate, suggesting that other plants may have been intermediate hosts. The common ancestor of E. amylovora on Malus and Pyrus spp. was probably similar to group IA isolates. Over two centuries, the CRRs of group II and group III strains diverged from those of group I strains as pomaceous fruits (and fire blight) followed the westward migration of the American settlers; the higher diversity of group III strains may be explained by the encounter with plasmid pEU30, which caused a burst in the incorporation of new spacers. The introduction of fire blight to Europe and New Zealand was probably caused by the import into those regions of plant material infected with E.
amylovora strains of group IA, thus probably originating from the East Coast of the United States.
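
The clustering procedure summarized in the caption of Fig. 2 (binary presence/absence coding followed by UPGMA) can be sketched in Python as follows. This is an illustration under stated assumptions and not the software used in the study: the strain names and spacer labels are hypothetical, the helper names are invented, and the distance measure (the fraction of spacers present in one profile but not the other) is assumed because the text does not specify one.

```python
from itertools import combinations


def presence_matrix(strains):
    """Map each strain to a 0/1 vector over all spacers seen in any strain."""
    spacers = sorted(set().union(*strains.values()))
    return {name: [int(s in found) for s in spacers]
            for name, found in strains.items()}


def mismatch(a, b):
    """Fraction of spacers present in one profile but not the other (assumed metric)."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def upgma(profiles):
    """Return a nested-tuple tree built by average-linkage (UPGMA) clustering."""
    size = {name: 1 for name in profiles}
    dist = {frozenset(pair): mismatch(profiles[pair[0]], profiles[pair[1]])
            for pair in combinations(profiles, 2)}
    while len(size) > 1:
        a, b = min(dist, key=dist.get)  # closest pair of clusters
        merged = (a, b)
        for c in size:
            if c != a and c != b:
                # Size-weighted mean keeps the distance an average over all leaves.
                dist[frozenset((merged, c))] = (
                    size[a] * dist[frozenset((a, c))]
                    + size[b] * dist[frozenset((b, c))]
                ) / (size[a] + size[b])
        dist = {k: v for k, v in dist.items() if a not in k and b not in k}
        size[merged] = size.pop(a) + size.pop(b)
    return next(iter(size))


# Hypothetical spacer content; not data from the study.
strains = {
    "strain_A": {"s1", "s2", "s3", "s4"},
    "strain_B": {"s1", "s2", "s3"},
    "strain_C": {"s1", "x1", "x2", "x3"},
}
profiles = presence_matrix(strains)
tree = upgma(profiles)
print(tree)
```

With these toy inputs, the two strains that differ by a single spacer are joined first and the third strain attaches to that pair. The order of the two branches within each printed tuple is arbitrary.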