Supporting Information Legends

Figure S1. Analysis of the 1.6x cis mutant

A. Northern blot analysis of hairpin-derived siRNAs in the 1.6x and 15.5a (T+SΔ35S) cis mutants. A mutant defective in the DNA methyltransferase DRM2, which contains the T+SWT transgene silencing system, is shown as a positive control for 21-, 22-, and 24-nt hairpin-derived siRNAs (Naumann et al., 2011). In this initial experiment, siRNAs were not observed in the two cis mutants. Lanes 1.6x and drm2 are reproduced with permission of the Genetics Society of America (GSA). The copyright is retained by the GSA.

B. Bisulfite sequence analysis of DNA methylation in the transgene target enhancer in the 1.6x cis mutant. No methylation was observed. For the levels of methylation induced by the original unaltered silencer in T+SWT plants, see Fig. 3c (T+SWT).

C. Reactivation of GFP expression in the 1.6x cis mutant (right). GFP is silenced in T+SWT plants (left).

D. Genotyping of Target and Silencer. +T/+S and -T/-S indicate the presence and absence of the junction fragments containing target (T) or silencer (S) transgenes and flanking plant DNA. Both junction fragments are present in T+SWT, 1.6x and 15.5a, although the latter two have undergone partial deletions of the S transgene construct (Fig. 2; Fig. S2). Primers that amplify the T construct-plant DNA and S construct-plant DNA junction fragments are shown in Table S2.

Reference
Naumann U. et al. (2011) Genetic evidence that DNA methyltransferase DRM2 has a direct catalytic role in RNA-directed DNA methylation in Arabidopsis thaliana. Genetics 187:977-979.

Figure S2.
Sequence of the SΔ35S locus as derived from whole genome sequencing of the 15.5a (T+SΔ35S) cis mutant

The entire 6045 bp sequence of the wild-type silencer locus (SWT) (Accession number: HE584556), as inferred from whole genome sequence data on the T+SWT line (NCBI Sequence Read Accession Number SRX312279) (Sasaki et al., 2012), is shown on the following pages and annotated according to the basic elements of the original transgene construct drawn below (color code similar in this figure and in the sequence shown at the end of this text):

[Diagram of the original transgene construct, nt 1 to 6045. Labeled elements: LB, vector (pUC18), NST, HygR, 19S pro, α’ loop, 35S pro, lox (two sites), RB.]

The SWT sequence was used as a reference sequence to assemble the sequence of SΔ35S from whole genome sequencing data from T+SΔ35S plants. The whole genome sequencing data from the T+SΔ35S line are available from NCBI's Sequence Read Archive under accession number SRX312270.

The regions of SWT that are inferred to be present at the SΔ35S locus in the T+SΔ35S mutant are shaded green on both DNA strands in the sequence. Regions that are deleted in the T+SΔ35S mutant are indicated by the lack of green shading of the double-stranded DNA in the sequence. Most notably, the 35S promoter, including the two flanking lox sites, is deleted (approximately nt 2680 to 3320), while the two halves of the inverted repeat (IR; opposing pink arrows) remain largely intact (approximately nt 1760 to 2080 and 2360 to 2680). The 58 bp of ‘unknown vector DNA’ (nt 3345) is present at the SΔ35S locus but not at the SWT locus and was apparently inserted, by an unknown mechanism, during the event that deleted the 35S promoter. The ‘breaks’ in the sequence indicate that gaps remained between some contigs in the vector region (approximately nt 4820 and 5568), in areas that do not affect the IR region of the silencer.
The spacer region between the two halves of the IR (α’ loop) is derived from the embryo-specific promoter of the α’ subunit of the soybean seed storage protein β-conglycinin (Chen et al., 1986; Kanno et al., 2005; Eun et al., 2012). This sequence is not homologous to any part of the target locus or to any region of the Arabidopsis genome. It occupies coordinates 2081-2363 of the SWT locus (Accession number HE584556). The target locus sequence is available under accession number HE582394. The pUC18 sequences are in the T-DNA region of the transgene construct, whereas the RSF1010 sequences are from the binary vector outside of the T-DNA region. Some RSF1010 fragments apparently integrated together with the T-DNA during the transformation event.

Abbreviations: NST, nopaline synthase terminator; HygR, gene encoding resistance to hygromycin; 19S pro, 19S promoter of cauliflower mosaic virus; LB and RB, left and right T-DNA borders. Thin black arrows on top indicate the direction of transcription.

References
Chen, Z.L., Schuler, M.A. and Beachy, R.N. (1986) Functional analysis of regulatory elements of a plant embryo-specific gene. Proc. Natl. Acad. Sci. USA, 83, 8560-8564.
Kanno, T., Huettel, B., Mette, M.F., Aufsatz, W., Jaligot, E., Daxinger, L., Kreil, D.P., Matzke, M. and Matzke, A.J.M. (2005) Atypical RNA polymerase subunits are required for RNA-directed DNA methylation. Nat. Genet., 37, 761-765.
Sasaki, T., Naumann, U., Forai, P., Matzke, A.J.M. and Matzke, M. (2012) Unusual case of apparent hypermutation in Arabidopsis thaliana. Genetics, 192, 1271-1280.

Figure S3. Bisulfite sequence analysis of DNA methylation at the target enhancer in the indicated genotypes

The Y-axis indicates the percent methylation at individual cytosines at the target enhancer (tandem repeat represented by gray arrows). CG, CHG and CHH methylation are indicated by the black, blue and red lines, respectively.
The top two panels show complete loss of methylation in T+SWT and T+SΔ35S plants in an nrpe1 mutant, which is impaired in the function of the largest subunit of Pol V (NRPE1). The bottom two panels show loss of DNA methylation at the target enhancer (T*) following segregation of the silencer locus in either T+SWT or T+SΔ35S plants. Fourteen to twenty clones were sequenced for each genotype. The levels of methylation in wild-type T+SWT and T+SΔ35S plants are shown in Fig. 3c.

Figure S4. Effect of an rdr2 mutation on GFP silencing and target enhancer methylation

An rdr2 mutation does not release GFP silencing (top, left) or reduce target enhancer methylation (bottom, left) in T+SWT plants, which synthesize Pol II-dependent hairpin-derived siRNAs that do not rely on the Pol IV-RDR2 pathway (Daxinger et al., 2009). By contrast, GFP silencing is at least partially released (top, right) and target enhancer methylation is reduced (bottom, right) in T+SΔ35S rdr2 plants, which depend on Pol IV-dependent siRNAs for silencing (Fig. 3b) and RdDM (Fig. 3c).

Reference
Daxinger, L., Kanno, T., Bucher, E., van der Winden, J., Naumann, U., Matzke, A.J. and Matzke, M. (2009) A stepwise pathway for biogenesis of 24-nt secondary siRNAs and spreading of DNA methylation. EMBO J., 28, 48-57.

Figure S5. Small RNA accumulation detected by Northern blotting

Northern blot analysis of hairpin-derived siRNAs in T+SWT, T+SΔ35S, T, SWT, SΔ35S, and nrpb2-3 (T+SWT background). On this blot, nrpb2-3 is on the same membrane as T+SWT, allowing a direct comparison between siRNAs in the two genotypes. In both T+SWT and SWT, which are inferred to contain siRNAs produced by both the Pol II-dependent pathway (21, 22 and 24 nt, with 21 nt predominating; Fig. 4, lane 4) and the Pol IV-dependent pathway (predominantly 24 nt; Fig. 4, lane 2), the levels of 24 nt siRNAs do not exceed those of 21 nt siRNAs.
By contrast, in the nrpb2-3 mutant, Pol II-dependent siRNAs are reduced and 24 nt siRNAs, which include those made in the Pol IV pathway, predominate.

Figure S6. siRNA accumulation in downstream region of target enhancer

Accumulation of siRNAs (21 nt in red and 24 nt in blue) at the region downstream of the enhancer sequence in both polarities (top and bottom) in the indicated genotypes. This downstream sequence is present only at the T locus and not at the S locus. The level of these siRNAs (Daxinger et al., 2009) correlates with the degree of DNA methylation in the downstream region, i.e. siRNAs are more abundant and methylation is higher in T+SWT than in T+SΔ35S (Fig. 3c).

Reference
Daxinger, L., Kanno, T., Bucher, E., van der Winden, J., Naumann, U., Matzke, A.J. and Matzke, M. (2009) A stepwise pathway for biogenesis of 24-nt secondary siRNAs and spreading of DNA methylation. EMBO J., 28, 48-57.

Figure S7. Sequence of the target enhancer region and peaks of siRNA accumulation

A. Sequence of the target enhancer (corresponding to nucleotides 1600-1930 of the entire target construct; Accession number HE582394). The tandem repeat (3 x 42 bp) is highlighted with black arrows, and an incomplete copy of the repeat monomer (10 bp) at the 3’ end is indicated by the dotted underline. The upstream non-repetitive region comprising a peak of siRNAs is shaded gray. Within this upstream non-repetitive region, the portion highlighted with white letters indicates the region with the most abundant 24 nt siRNAs (right-hand portion of the upstream peak in part B). The portion with black letters gives rise to considerably fewer siRNAs (left-hand portion of the upstream peak in part B). In the target enhancer region, nearly all cytosines in CG sites are methylated in T+SWT.

B. siRNA accumulation in T+SWT plants (also shown in Fig. 5a). The numbered regions correspond to the nucleotide (nt) sequence of the target enhancer sequence (Accession number HE582394).
The siRNA-depleted region (1696-1774) between the upstream peak of siRNAs and the tandem repeat is GC-poor (17.7% compared to 36.4% for the upstream peak and around 40% for the tandem repeat).

Figure S8. Accumulation of siRNAs at ‘GF’ inverted repeat

This information is taken from Molnar et al. (2010). The GEO reference number for the small RNA sequences is GSM518446. GFP-matching filtered reads (≥15 nt) were analyzed for this figure.

A. Constructs encoding GFP (top) and the inverted repeat (IR) of the ‘GF’ portion (bottom). The 35S promoter (p35S) was used in both constructs. The nopaline synthase (NOS) and octopine synthase (OCS) transcriptional terminators (t) were used in the GFP and GF-IR constructs, respectively.

B. Accumulation of 21-nt siRNAs (top panel, red) and 24-nt siRNAs (middle panel, blue) from the ‘GF’ and ‘P’ regions (black and white bar below panels) of the GFP gene as determined by deep sequencing of small RNAs. Levels are shown as the number of siRNA species targeting each nucleotide. Nearly all siRNAs accumulated from the top strand. The bottom panel shows the GC ratio (blue) and the distribution of CG dinucleotides (red) in the GFP sequence.

C. Sequence of the GFP coding region. The portion used in the GF-IR is indicated by the dotted outline. CG dinucleotides are highlighted in red. Underlined sequences correspond to regions accumulating the most siRNAs (peaks in part B). Note that most siRNAs accumulate from the internal part of the GF-IR, which also contains the most CG dinucleotides.

Reference
Molnar A, Melnyk CW, Bassett A, Hardcastle TJ, Dunn R, Baulcombe DC (2010) Small silencing RNAs in plants are mobile and direct epigenetic modification in recipient cells. Science 328: 872-875.

Figure S9. Correlation between Pol IV-dependent 24-nt siRNA accumulation and CG methylation at endogenous tandem repeats

A. Flow-chart showing categorization of endogenous tandem repeats.
Tandem repeats whose normalized 24-nt siRNA abundance/100 bp is >1 are categorized as “siRNA +”, and those whose meCG ratio is >0.4 are categorized as “meCG +” (for details, see Supplementary Table S1).

B. Scatter plots of 24-nt siRNA abundance and CG methylation. The upper panel shows a correlation between CG methylation and 24-nt siRNA abundance at tandem repeats in IGN regions (blue) and a weaker correlation in genic regions (red). The Y-axis scale is changed in the graph on the right to show details. The lower panel shows a correlation in transposons (purple) and pseudogenes (green), although the number sampled is low. Original data are shown in Table S1.

Figure S10. Correlation between siRNA accumulation, GC content and cytosine methylation at selected endogenous intergenic (IGN) tandem repeats

Top panel: Top and bottom strand accumulation of Pol IV-dependent 24-nt siRNAs, shown as the number of siRNA species targeting each nucleotide.
Middle panel: GC ratio (blue) and CG dinucleotide ratio (red) within a 24-nt sliding window. Black arrows below the middle panel indicate positions of tandem repeat monomers.
Bottom panel: DNA methylation level in WT and nrpd1 mutant plants (red, blue, and green bars indicate DNA methylation in CG, CHG, and CHH contexts, respectively).

Note that siRNA accumulation is often enriched in regions containing methylated CG dinucleotides and depleted in regions that are GC-poor. CG methylation generally persists in the nrpd1 mutant, indicating that it is independent of Pol IV-dependent siRNAs. These IGN tandem repeats are examples of the siRNA-enriched IGN tandem repeats listed in Supplementary Table S1.

Supplementary Tables

Table S1. List of endogenous euchromatic tandem repeats
Table S2. Primers used in this study
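
The two thresholds given in the Figure S9A legend (normalized 24-nt siRNA abundance/100 bp > 1 and meCG ratio > 0.4) can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The record layout, the field names, the “-” labels for repeats below a threshold, and the assumption that abundance is scaled by repeat length to obtain a per-100-bp value are all hypothetical and introduced here only for illustration.

```python
from dataclasses import dataclass

# Cutoffs as stated in the Figure S9A legend.
SIRNA_PER_100BP_CUTOFF = 1.0
MECG_RATIO_CUTOFF = 0.4


@dataclass
class TandemRepeat:
    """Hypothetical record for one endogenous tandem repeat."""
    name: str                # illustrative identifier, not from Table S1
    length_bp: int           # repeat length in base pairs
    sirna_24nt_norm: float   # normalized 24-nt siRNA abundance over the repeat
    mecg_ratio: float        # fraction of methylated CG sites, 0 to 1


def sirna_per_100bp(rep: TandemRepeat) -> float:
    # Assumption: the per-100-bp value is abundance scaled by repeat length.
    return rep.sirna_24nt_norm / rep.length_bp * 100.0


def categorize(rep: TandemRepeat) -> tuple[str, str]:
    """Return the (siRNA, meCG) category labels for one repeat."""
    above_sirna = sirna_per_100bp(rep) > SIRNA_PER_100BP_CUTOFF
    above_mecg = rep.mecg_ratio > MECG_RATIO_CUTOFF
    sirna_label = "siRNA +" if above_sirna else "siRNA -"
    mecg_label = "meCG +" if above_mecg else "meCG -"
    return sirna_label, mecg_label


if __name__ == "__main__":
    # Made-up values, used only to exercise the function.
    examples = [
        TandemRepeat("example_A", 200, 5.0, 0.85),
        TandemRepeat("example_B", 300, 2.0, 0.10),
    ]
    for rep in examples:
        print(rep.name, categorize(rep))
```

Both comparisons are strict, matching the “>” used in the legend, so a repeat lying exactly on a cutoff falls into the “-” category in this sketch.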