Supporting Information Legends

Figure S1. Structure of the MID-containing primers and of the reads. (a) Structure of the primer used in the second PCR, which is complementary to the Splinkerette linker. From the 5' end: the 454 adaptor A sequence, the sequencing key for the 454 sequencer, the MID, and the sequence complementary to the Splinkerette linker. (b) Structure of the reads. The 5' end contains the Mbo I site corresponding to the site where the Splinkerette linker was ligated. The 3' end corresponds to the sequence proximal to a LORE1 insertion site, called the insertion site end in this article.

Figure S2. Proportion of reads of background sequences. The background sequences comprised nine classes: 5' FSTs of the pre-existing LORE1 copies (LORE1a, b, c, d, i, and j) and of newly transposed LORE1 copies segregating in the two plant lineages G297-2 and G533-6, and the 3' internal region of LORE1.

Figure S3. Relation between the number of FSTs and the number of reads constituting an FST. The enlargement shows only the FSTs constituted by fewer than 20 reads. The median number of reads per FST, 11, is indicated with an arrow in each graph.

Figure S4. Relation between read length and frequency in each FST. Each plot represents one FST. The number and length of the reads constituting each FST are indicated.

Figure S5. Distribution of LORE1 insertions on the MG-20 pseudomolecules. The frequency of insertions in each 100-kb interval is indicated.

Figure S6. Distribution of insertion sites, TEs and repetitive sequences on chromosome 3. (a) Distribution of LORE1 insertion sites. (b) Distribution of candidate regions of LTR retroelements detected by LTRharvest (Ellinghaus et al., 2008). (c) Distribution of TEs and repetitive sequences. The copy number of each TE family in the rel2.5 pseudomolecules was estimated as follows. A Blastn search was conducted using the rel2.5 pseudomolecules as the query against the LjTE_REP dataset (see Experimental procedures).
Among the identified genome regions homologous to TEs and repeats, only those longer than 300 bp with E-values smaller than 1.0E-5 were each regarded as one copy of a TE or a repetitive sequence. When multiple regions homologous to a given TE were detected in series, they were combined and regarded as one copy of a TE or a repetitive sequence. (d) Distribution of LjTR1, a tandem repeat sequence known to accumulate in the constitutive heterochromatin regions of the L. japonicus genome (Ohmido et al., 2010). A Blastn search was conducted using the rel2.5 pseudomolecule of chromosome 3 as the query against LjTR1. Among the identified genome regions homologous to LjTR1, non-overlapping regions with E-values smaller than 1.0E-5 were counted and are represented in this histogram.

Figure S7. Nucleotide composition around LORE1 insertion sites. The frequency of each nucleotide at each site around the 5-bp TSD is plotted. The first nucleotide of the 5-bp TSD is counted as +1, and sites with negative numbers correspond to the region 5' upstream of the TSD. Note that the flanking sequences represented here are oriented according to the orientation of the LORE1 insertions.

Figure S8. SSAP analysis detecting LORE2 copies in the selfed progeny of G329-3. Arrowheads indicate bands segregating in the progeny of G329-3 that are absent in the wild-type B-129 accession, suggesting that LORE2 transposed at an early stage in the establishment of the plant line.

Table S1. Primers used for high-throughput FST sequencing.

Table S2. Primers used for insertion site investigation.

Table S3. Primers used for LORE2 SSAP.

Table S4. Comparison of transposition frequencies among different lineages of regenerated plants.

Table S5. Predicted fragment sizes of FSTs from pre-existing LORE1 copies in the B-129 accession.

Table S6. Number of FSTs in TEs and repetitive sequences.
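
The copy-counting rule described in the legend of Figure S6(c) can be illustrated with a minimal sketch. This is not the authors' script; it only restates the stated thresholds (regions longer than 300 bp, E-values smaller than 1.0E-5) under several assumptions: the Blastn hits have already been parsed into records, the filter is applied before merging, and "in series" is interpreted as hits to the same family that are adjacent or overlapping on the same pseudomolecule. The names Hit, count_copies and max_gap, and the family labels famA and famB, are hypothetical.

```python
from collections import Counter
from typing import NamedTuple


class Hit(NamedTuple):
    """One Blastn hit on a pseudomolecule (hypothetical record layout)."""
    chrom: str      # name of the query pseudomolecule
    start: int      # 1-based start on the query
    end: int        # 1-based inclusive end on the query
    family: str     # TE or repeat family of the matched database entry
    evalue: float


MIN_LENGTH = 300      # legend: only regions longer than 300 bp
MAX_EVALUE = 1.0e-5   # legend: E-values smaller than 1.0E-5


def count_copies(hits, max_gap=0):
    """Count copies per family.

    Assumption: hits are filtered first, then hits to the same family that
    lie within max_gap bp of each other on the same pseudomolecule are
    treated as "in series" and counted as a single copy.
    """
    kept = [
        h for h in hits
        if (h.end - h.start + 1) > MIN_LENGTH and h.evalue < MAX_EVALUE
    ]
    kept.sort(key=lambda h: (h.chrom, h.family, h.start))

    copies = Counter()
    prev = None  # (chrom, family, furthest end seen in the current run)
    for h in kept:
        in_series = (
            prev is not None
            and prev[0] == h.chrom
            and prev[1] == h.family
            and h.start - prev[2] - 1 <= max_gap
        )
        if in_series:
            # Extend the current run instead of counting a new copy.
            prev = (h.chrom, h.family, max(prev[2], h.end))
        else:
            copies[h.family] += 1
            prev = (h.chrom, h.family, h.end)
    return copies


# Made-up hits for illustration only (not data from the study).
example_hits = [
    Hit("chr3", 1, 400, "famA", 1e-50),        # kept
    Hit("chr3", 401, 900, "famA", 1e-30),      # adjacent: merged with the hit above
    Hit("chr3", 5000, 5200, "famA", 1e-20),    # too short: dropped
    Hit("chr3", 8000, 8500, "famA", 1e-3),     # E-value too large: dropped
    Hit("chr3", 10000, 10400, "famB", 1e-12),  # kept
    Hit("chr3", 20000, 20500, "famA", 1e-10),  # kept, separate copy
]
example_counts = count_copies(example_hits)
```

With these made-up hits, famA is counted twice and famB once. Raising max_gap would merge more distant hits into one copy; the legend does not state which distance, if any, was used.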