Supplementary materials:

Figure S1: ClustalW alignments (http://www.ebi.ac.uk/Tools/clustalw2/) of reads or mate pairs (middle line) that resulted from recombination between UBA and 5-way CG type sequences, with the UBA sequence (top line) and the 5-way CG consensus sequence (bottom line). Boxed sequence is identical, while red bases are divergent from the two other sequences. Green bases are unique to one of the three sequences.
(a) Contig11111: two cases of recombination within a read.
(b) Contig11233: recombination within a read.
(c) Contig11276: recombination within a read.
(d) Contig11276: recombination event occurred between the mate-paired sequence reads.
(d) Contig11364: mate-paired reads with a recombination point in both, indicating that a block of ~2 kb of the UBA type has recombined into this region.
(e) Contig11389: Read XYG40166.b1 seems to have a recombination point in the middle, after which the UBA and 5-way CG sequences are identical. In reality, it is the reads making up the consensus sequence that are the recombinant reads, switching from the 5-way CG type to the UBA type sequence. The second half of XYG40166.b1 and its mate pair are both identical to the original 5-way CG type sequence and thus display SNPs relative to both the UBA and the “5-way CG” consensus sequence.
(f) Contig11284: Although the initial segment of the Leptospirillum group II UBA sequence covered by this 5-way CG dataset read contains an insertion relative to the 5-way CG sequence, the second part seems to display a switch from the 5-way CG type to a variant of the UBA type (some SNPs are shared, some new ones have occurred).
(g) Contig11277: Recombination between the mate-paired reads. While read .g1 is identical to the 5-way CG consensus sequence, read .b1 diverges from both the UBA and 5-way CG type sequences.
The second part of the .b1 read is significantly more closely related to the UBA sequence than to the 5-way CG sequence, indicating that the 5-way CG consensus sequence in that region (determined by the majority of sequencing reads in that region) is significantly different and that the minority subset of reads to which XYG41253.b1 belongs presumably corresponds to the original 5-way CG type sequence.

Figure S2A: Other samples strongly dominated by one genome type only (as in Figure 2).

Figure S2B: Other samples strongly dominated by one genome type only (as in Figure 2).

Figure S3: Other samples with a mixture of two or three genome types (as in Figure 3).

Figure S4: Detail of the CRISPR region and Cas proteins presumably missing in the 5-way CG type-dominated sample 31. Cas proteins were identified in every other sample, and the corresponding region in a representative sample (4) is presented for reference.