TPJ_4826_sm_Supportinginformationlegends

Supporting Information Legends
Figure S1. Structure of the MID-containing primers and of the reads. (a) Structure of
the primer used in the 2nd PCR, which is complementary to the Splinkerette linker. From
the 5' end, it consists of 454 adaptor A, the sequencing key for the 454 sequencer, the
MID, and the sequence complementary to the Splinkerette linker. (b) Structure of the
reads. The 5' end contains the Mbo I site at which the Splinkerette linker was ligated.
The 3' end corresponds to the sequence adjacent to a LORE1 insertion site and is called
the insertion site end in this article.
Figure S2. Proportion of reads derived from background sequences. The background
sequences comprised 9 classes: 5' FSTs of the pre-existing LORE1 copies LORE1a, b,
c, d, i, and j; newly transposed LORE1 copies segregating in the 2 plant lineages
G297-2 and G533-6; and the 3' internal region of LORE1.
Figure S3. Relation between the number of FSTs and the number of reads constituting
an FST. The enlarged graph shows only FSTs composed of fewer than 20 reads. The
median number of reads per FST, 11, is indicated by an arrow in each graph.
Figure S4. Relation between read length and frequency in each FST. Each plot
represents one FST. Number and length of the reads constituting each FST are
indicated.
Figure S5. Distribution of LORE1 insertions on MG-20 pseudomolecules. The frequency
of insertions in each 100-kb interval is indicated.
Figure S6. Distribution of insertion sites, TEs and repetitive sequences on chromosome
3. (a) Distribution of LORE1 insertion sites. (b) Distribution of candidate regions of
LTR retroelements detected by LTRharvest (Ellinghaus et al., 2008). (c) Distribution of
TEs and repetitive sequences. The copy number of each TE family in the rel2.5
pseudomolecules was estimated as follows. A blastn search was conducted using the
rel2.5 pseudomolecules as the query against the LjTE_REP dataset (see Experimental
procedures). Among the identified genome regions homologous to TEs and repeats, only
those longer than 300 bp with E-values smaller than 1.0E-5 were each counted as one
copy of a TE or a repetitive sequence. When multiple regions homologous to the same
TE were detected in series, they were together counted as one copy of a TE or a
repetitive sequence. (d) Distribution of LjTR1, a tandem repeat sequence known to
accumulate in the constitutive heterochromatin regions of the L. japonicus genome
(Ohmido et al., 2010). A blastn search was conducted using the rel2.5 pseudomolecule
of chromosome 3 as the query against LjTR1. Among the identified genome regions
homologous to LjTR1, non-overlapping regions with E-values smaller than 1.0E-5 were
counted and are shown in the histogram.
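The copy-counting rule described for panel (c) can be illustrated with a minimal Python sketch. It is not the authors' script. The hit layout (family, start, end, E-value), the family names TE_A and TE_B, and the MAX_GAP threshold used to decide when hits are "in series" are all assumptions made for illustration; the legend specifies only the 300 bp and 1.0E-5 cutoffs.

```python
from collections import Counter

MIN_LENGTH = 300     # bp; a hit must be longer than this (from the legend)
MAX_EVALUE = 1.0e-5  # a hit must have a smaller E-value (from the legend)
MAX_GAP = 100        # bp; ASSUMPTION, the legend gives no gap threshold


def count_copies(hits, max_gap=MAX_GAP):
    """Count TE/repeat copies per family from blastn hits on one sequence.

    hits: iterable of (family, start, end, evalue) tuples with 1-based
    inclusive coordinates and start <= end (hypothetical layout).
    """
    # Keep only hits that pass the length and E-value cutoffs.
    kept = [h for h in hits
            if (h[2] - h[1] + 1) > MIN_LENGTH and h[3] < MAX_EVALUE]
    kept.sort(key=lambda h: (h[1], h[2]))

    copies = Counter()
    prev_family, prev_end = None, None
    for family, start, end, _evalue in kept:
        # ASSUMPTION: "in series" means consecutive hits to the same
        # family separated by at most max_gap bp.
        in_series = family == prev_family and start - prev_end <= max_gap
        if in_series:
            prev_end = max(prev_end, end)  # extend the current copy
        else:
            copies[family] += 1            # start a new copy
            prev_family, prev_end = family, end
    return copies


example_hits = [
    ("TE_A", 1, 400, 1e-20),
    ("TE_A", 450, 900, 1e-30),     # in series with the previous hit
    ("TE_B", 2000, 2200, 1e-50),   # too short, discarded
    ("TE_A", 5000, 5600, 1e-10),   # far away, counted as a second copy
    ("TE_B", 7000, 7500, 1e-3),    # E-value too large, discarded
    ("TE_B", 8000, 8400, 1e-8),
]
print(count_copies(example_hits))
```

In this sketch the cutoffs are applied to each hit before merging; applying them to the merged region instead would be an equally plausible reading of the legend.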
Figure S7. Nucleotide composition around insertion sites of LORE1. The frequency of
each nucleotide at each position around the 5-bp TSD is plotted. The first nucleotide of
the 5-bp TSD is numbered +1, and positions with negative numbers correspond to the
region 5' upstream of the TSD. Note that the flanking sequences shown here are
oriented according to the orientation of the LORE1 insertions.
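The position numbering used in Figure S7 can be sketched in Python as follows. This is an illustrative sketch, not the analysis code behind the figure. It assumes that the flanking sequences are of equal length and already oriented with the LORE1 insertion, and that there is no position 0 (the numbering runs ..., -2, -1, +1, +2, ...); the function names and example sequences are hypothetical.

```python
from collections import Counter


def position_labels(n_upstream, n_total):
    """Label columns so the first TSD base is +1 and upstream bases are
    negative. ASSUMPTION: there is no position 0."""
    return [i - n_upstream if i < n_upstream else i - n_upstream + 1
            for i in range(n_total)]


def nucleotide_frequencies(flanks, n_upstream):
    """Return {position: {nucleotide: frequency}} for aligned flanks.

    flanks: equal-length sequences, already oriented like the LORE1
    insertion, each with n_upstream bases preceding the TSD.
    """
    labels = position_labels(n_upstream, len(flanks[0]))
    freqs = {}
    for col, label in enumerate(labels):
        counts = Counter(seq[col].upper() for seq in flanks)
        freqs[label] = {nt: counts[nt] / len(flanks) for nt in "ACGT"}
    return freqs


# Two toy flanks: 2 upstream bases followed by a 5-bp TSD.
example = nucleotide_frequencies(["AATGCAT", "ATTGCAA"], n_upstream=2)
for pos in sorted(example):
    print(pos, example[pos])
```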
Figure S8. SSAP analysis detecting LORE2 copies in the selfed progeny of G329-3.
Arrowheads indicate bands that segregate in the progeny of G329-3 but are absent in the
wild-type B-129 accession, suggesting that LORE2 transposed at an early stage during
establishment of the plant line.
Table S1. Primers used for high-throughput FST sequencing.
Table S2. Primers used for insertion site investigation.
Table S3. Primers used for LORE2 SSAP.
Table S4. Comparison of transposition frequencies among different lineages of
regenerated plants.
Table S5. Predicted fragment sizes of FSTs from pre-existing LORE1 copies in the B-129
accession.
Table S6. Number of FSTs in TEs and repetitive sequences.