Supplementary materials

Figure S1: ClustalW alignments (http://www.ebi.ac.uk/Tools/clustalw2/) of reads or
mate pairs (middle line), which resulted from recombination between UBA and 5-way
CG type sequences, with the UBA (top line) and 5-way CG consensus sequence (bottom
line). Boxed sequence is identical, while red bases are divergent from the two other
sequences. Green bases are unique to one of the three sequences.
(a) Contig11111: 2 cases of recombination within a read
(b) Contig11233: recombination within a read
(c) Contig11276: recombination within a read
(d) Contig11276: recombination event occurred between the mate-paired sequence reads
(d) Contig11364: mate-paired reads with a recombination point in both, indicating that a block of
~2 kb of the UBA type has recombined into this region.
(e) Contig11389: Read XYG40166.b1 appears to have a recombination point in the
middle, after which the UBA and 5-way CG sequences are identical. In reality, it is
the consensus sequence reads that are recombinant, switching from the 5-way CG
type to the UBA type sequence. The second half of XYG40166.b1 and its mate pair
are both identical to the original 5-way CG type sequence and thus display
SNPs relative to both the UBA and the 5-way CG consensus sequences.
(f) Contig11284: Though the initial segment of Leptospirillum group II UBA sequence
covered by this 5-way CG dataset read contains an insertion relative to the 5-way CG
sequence, the second part seems to display a switch from the 5-way CG type to a
variant of the UBA type (some SNPs are shared, while some new ones have occurred).
(g) Contig11277: Recombination between the mate-paired reads. While read .g1 is
identical to the 5-way CG consensus sequence, read .b1 diverges from both UBA and
5-way CG type sequences. The second part of the .b1 read is significantly more closely
related to the UBA sequence than to the 5-way CG sequence, indicating that the 5-way CG
consensus sequence in that region (determined by the majority of sequencing reads in that
region) is significantly different, and that the minority subset of reads to which XYG41253.b1
belongs presumably corresponds to the original 5-way CG type sequence.
Figure S2A: Other samples strongly dominated by one genome type only (as in Figure 2)
Figure S2B: Other samples strongly dominated by one genome type only (as in Figure 2)
Figure S3: Other samples with a mixture of two or three genome types (as in Figure 3)
Figure S4: Detail of the CRISPR region and Cas proteins presumably missing in the 5-way
CG type dominating sample 31. Cas proteins were identified in every other sample and the
corresponding region in a representative sample (4) is presented for reference.