piRNA pathway targets active LINE1 elements to establish the

Supplementary figures and legends
Figure S1. Distribution of the H3K9me3 mark in the genome of three cell types
(A) Input normalized level of the H3K9me3 mark in three cell types of 10 dpp animals
on four types of genomic partitions (TSSs, exons, introns, and intergenic space). For
germ cells, the data were obtained from FACS-sorted spermatogonia. For liver and testis
somatic cells, values of four and two ChIP-seq replicates were averaged, respectively. The
error bars show standard error. (B) Input normalized level of the H3K9me3 mark in
testicular somatic and germ cells (spermatogonia) on five TE classes. (C) Correlation of
H3K9me3 levels between independent ChIP-seq experiments performed on different
spermatogonia samples: FACS- and MACS-sorted spermatogonia from Miwi2 KO and
control animals. The correlation was analyzed for all 1 kb genomic windows and,
separately, on all TE families. For this analysis, only TE families that had at least 5,000
reads mapped in the input sample were considered (291 families in total). Likewise, only
genomic windows with more than 50 reads in the input sample were considered.
Figure S2. H3K9me3 in spermatogonia of Miwi2 knock-out and control animals
Input normalized level of the H3K9me3 mark on genomic partitions (A) and transposon
classes (B) in FACS-sorted spermatogonia isolated from Miwi2 knock-out and control
(Miwi2 heterozygote) mice. (C) Enrichment of H3 ChIP signal on select TE families in
Miwi2 KO and heterozygous control. The bars show input-normalized H3 ChIP signal on
bodies of TEs belonging to six families of LINE and LTR classes of TEs. Reads mapped
to all annotated genomic instances of TEs were summarized and normalized to account
for differences in library depths (see Materials and Methods). (D) Differences in
H3K9me3 levels in LINE and LTR families in MACS-sorted spermatogonia of Miwi2
knock-out animals relative to control littermates. 5' repeats of selected LINE elements
are displayed separately as black dots. Fold changes observed in two independent
experiments are averaged, and the error bars show standard error. (E) Distribution of the
H3K9me3 mark along the L1-T consensus sequence in spermatogonia of Miwi2 KO and
control animals. (F) Distribution of the H3K9me3 mark along the L1-Gf consensus
sequence in spermatogonia of Miwi2 KO and control animals. (G) Correlation of
changes in H3K9me3 levels on TE families upon Miwi2 deficiency in independent
ChIP-seq experiments performed on FACS- and MACS-sorted spermatogonia. The x-axis
shows change in input normalized level of the H3K9me3 mark on TE families observed
upon Miwi2 knock-out in FACS-sorted testicular germ cells. The y-axis shows fold
changes observed in two independent experiments on MACS-sorted testicular germ
cells. We considered only TE families that had at least 5,000 reads mapped in the input
sample (291 families in total).
Figure S3. Different strategies for normalization of H3K9me3 ChIP-seq signal on
TE families
Differences in H3K9me3 levels in LINE and LTR families in spermatogonia of Miwi2
knock-out animals relative to control (heterozygous) littermates. 5' repeats of selected
LINE elements are displayed separately as black dots. In addition to normalization to the
ChIP input sample, two alternative normalization strategies were implemented: 1) to the
average H3K9me3 enrichment in 100 kb windows that have a high level of H3K9me3
signal (more than 2-fold enrichment in spermatogonia of control mice); 2) to the average
H3K9me3 enrichment on major satellite repeats, which have a very high level of the
H3K9me3 mark. Both normalization strategies show that H3K9me3 levels are decreased on
the 5' portions of three L1 families (L1-A, L1-T and L1-Gf) in Miwi2 KO spermatogonia sorted
by two methods. For MACS-sorted cells, fold changes observed in two independent
experiments are averaged, and the error bars show standard error.
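The normalization strategies described in this legend can be illustrated with a minimal sketch. It is not the authors' pipeline (the exact procedure is given in Materials and Methods); it assumes that input normalization means a ratio of depth-scaled read counts, and that the alternative strategies divide each enrichment by the enrichment of a reference set (high-H3K9me3 windows or major satellites). The function and parameter names are hypothetical.

```python
def input_normalized(chip_reads, chip_total, input_reads, input_total):
    """Assumed definition: depth-scaled ChIP signal over depth-scaled input signal."""
    if chip_total <= 0 or input_total <= 0 or input_reads <= 0:
        raise ValueError("library sizes and input read count must be positive")
    return (chip_reads / chip_total) / (input_reads / input_total)


def ko_vs_control_fold_change(ko_enrichment, ctrl_enrichment,
                              ko_reference=1.0, ctrl_reference=1.0):
    """KO/control ratio of enrichments, each optionally rescaled by the
    enrichment of a reference set measured in the same sample (assumption)."""
    return (ko_enrichment / ko_reference) / (ctrl_enrichment / ctrl_reference)


# Toy numbers, for illustration only (not data from the study).
ctrl = input_normalized(400, 1_000_000, 100, 1_000_000)
ko = input_normalized(200, 1_000_000, 100, 1_000_000)
plain_fc = ko_vs_control_fold_change(ko, ctrl)
rescaled_fc = ko_vs_control_fold_change(ko, ctrl, ko_reference=8.0, ctrl_reference=10.0)
```

With the default reference values of 1.0 the second function reduces to plain input normalization, so the strategies can be compared on the same families.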
Figure S4. H3K9me3 signal in sequences flanking LTR IAPEz insertions and
validation of ChIP-seq results in somatic cells
(A) Metaplots of input normalized level of the H3K9me3 mark in spermatogonia of Miwi2
KO and control animals over 25-kb genomic regions flanking all annotated IAPEz
insertions. Only uniquely mapped reads were considered. Dashed lines show the
distance at which the signal dropped 2-fold from the peak value. (B, C) ChIP-qPCR was
used to measure H3K9me3 signal on major satellite, IAPLTR1a, and L1-A in somatic
testicular (B) and liver (C) cells from 10-day-old Miwi2 knock-out and heterozygous animals.
Somatic testicular cells were purified by MACS. H3K9me3 levels are normalized to input
and to signal on B1 SINE. Bars represent fold enrichment of the average of two
independent ChIP experiments on each sample, and error bars show standard
deviations.
Figure S5. MIWI2-bound piRNAs targeting LTR and LINE families
(A) Relationship between the amount of piRNA in the MIWI2 complex in prospermatogonia
(E16.5), TE transcript abundance in normal testis, and TE derepression upon Miwi2
deficiency. The x-axis shows the level of expression of selected TE families in control
(Miwi2 heterozygous) testis of 10-day-old mice as measured by RNA-Seq. The y-axis
shows the fold change of the expression between Miwi2 knock-out and heterozygous
animals as measured by RNA-Seq. The amount of MIWI2-associated piRNAs derived
from each TE family corresponds to the size of the bubbles. (B) Strand distribution of
piRNAs in immunopurified MIWI2 complexes in prospermatogonia of E16.5 wild-type
animals.
Figure S6. Derepression of TE in Miwi2-deficient spermatogonia
(A) RT-qPCR was used to measure expression of L1-T in testis of 10 dpp Miwi2 KO and
control animals. DNase-treated total RNA from two separate sets of heterozygous and
KO littermate total testes was used for oligo(dT)-primed reverse transcription. Primers
targeting ORF1 and ORF2 regions of the L1-T transcript were used for qPCR. The
values were normalized to actin. Error bars show standard deviations. (B) Relationship
between levels of H3K9me3 in LINE families in germ cells relative to somatic cells and
effect of Miwi2 knock-out on LINE expression. The x-axis shows the fold difference of
H3K9me3 levels on different LINE families between germ cells and somatic testicular
cells. The y-axis shows the fold difference in abundance of LINE transcripts between
Miwi2 knock-out and control cells. (C) Relationship between L1-A element divergence
and derepression in Miwi2 KO. All genomic copies of L1-A were binned in groups based
on their divergence from the consensus sequence. Numbers of genomic L1 copies in
each group are indicated above the boxes. The y-axis shows fold difference in the
transcript abundance in testes of Miwi2 knock-out animals vs. heterozygous littermates.
Boxes span the 25th to 75th percentiles; the thick line inside each box is the
median. The whiskers extend to 1.5 × IQR from the box, or to the farthest data point
if that point lies within 1.5 × IQR.
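As an illustration of the box plot convention described above, here is a minimal sketch. It assumes the common Tukey-style reading in which each whisker ends at the most extreme data point lying within 1.5 × IQR of the box, and it uses the "inclusive" quartile method from the Python standard library; the plotting software used for the figures may compute quartiles slightly differently. The function name is hypothetical.

```python
import statistics


def box_stats(values):
    """Return (q1, median, q3, lower_whisker, upper_whisker) for a sample.

    Whiskers end at the most extreme data points within 1.5 * IQR of the
    box (an assumed, Tukey-style convention).
    """
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    lower_whisker = min(x for x in data if x >= lower_fence)
    upper_whisker = max(x for x in data if x <= upper_fence)
    return q1, median, q3, lower_whisker, upper_whisker


# Toy sample: the value 100 lies beyond the upper fence and is left out of the whisker.
stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
```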
Figure S7. Distribution of the H3K9me3 mark on the downstream flanking
sequences of L1-A elements
Level of the H3K9me3 mark on the 1-kb downstream flanks of individual L1-A copies
(y-axis) in relation to the length of each insertion (x-axis) in spermatogonia of Miwi2
heterozygous (A) and knock-out (C) mice. Only uniquely mapped ChIP-seq reads were
considered. The dots correspond to individual L1-A copies that had at least one read
mapped to their flanks in both ChIP and input libraries (9,855 insertions in control and
10,072 in KO). (B) and (D) show box plot representations of the data in (A) and (C) for two
categories of insertions: shorter than 2 kb and longer than 5 kb. The whiskers extend to
1.5 × IQR from the box, or to the farthest data point if that point lies within 1.5 × IQR.
Figure S8. Distribution of the H3K9me3 mark on the flanking regions of L1-F
elements
(A), (C), (E), (G) Level of the H3K9me3 mark in the upstream and downstream 1-kb
flanks of individual L1-F copies (y-axis) in relation to the length of each insertion (x-axis)
in spermatogonia of Miwi2 KO and heterozygous mice. Only uniquely mapped ChIP-seq
reads were considered. The dots correspond to individual L1-F copies that had at least
one read mapped to their flanks in both the ChIP and input libraries. (B), (D), (F), (H)
Box plot representation of data distribution shown in the other panels, for two categories
of insertions: shorter than 2 kb and longer than 5 kb. Boxes span the 25th to 75th
percentiles; the lines inside the boxes are the medians. The whiskers extend to
1.5 × IQR from the box, or to the farthest data point if that point lies within 1.5 × IQR.
Table S1. Statistics for ChIP-seq libraries
List of ChIP-seq libraries generated in the study together with genome mapping
statistics. Total: total number of reads in the library; unique mappers: reads with only
one unique valid alignment to the mouse genome (mm10) with zero mismatches; unique
+ multi mappers: reads with up to 10,000 valid alignments to the mouse genome with
zero mismatches; complexity: the number of distinct sequences divided by the total
number of reads (only uniquely mapped reads and corresponding sequences were
considered for evaluation of complexity).
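The complexity measure defined above is a simple ratio, sketched below. The function name is hypothetical, and the input is assumed to be the sequences of uniquely mapped reads only, as the legend specifies.

```python
def library_complexity(unique_mapper_sequences):
    """Number of distinct sequences divided by the total number of reads.

    The caller is expected to pass only the sequences of uniquely mapped reads.
    """
    reads = list(unique_mapper_sequences)
    if not reads:
        raise ValueError("no reads supplied")
    return len(set(reads)) / len(reads)


# Toy example: four reads, three distinct sequences.
toy_complexity = library_complexity(["ACGT", "ACGT", "TTGA", "GGCC"])
```

A value close to 1 means that almost every read has a distinct sequence; lower values indicate more duplicated reads.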
Table S2. Evaluation of mapping threshold for genomic alignment of ChIP-seq
reads
Four libraries (Miwi2 KO and heterozygote ChIP-seq and chromatin input libraries)
were truncated to 1 million reads and aligned to the mouse genome (mm10) using
Bowtie 0.12.7, allowing zero mismatches and an unlimited number of valid alignments
per read. The table lists the numbers of reads and distinct sequences that were discarded
when a given threshold on the maximum number of genomic alignments was
applied. Five thresholds were tested: 1000, 5000, 10000, 15000, and 25000. The
table also lists the numbers of reads that have at least one alignment within a genomic
region annotated as a simple repeat in the RepeatMasker track of the UCSC Genome
Browser.
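The threshold evaluation summarized in this table can be sketched as follows. This is an illustration under stated assumptions, not the script used in the study: it assumes that each distinct sequence has already been assigned its read count and its number of valid genomic alignments, and that a sequence is discarded when its alignment count exceeds the threshold. The function name and data layout are hypothetical.

```python
def discarded_at_thresholds(per_sequence, thresholds=(1000, 5000, 10000, 15000, 25000)):
    """Count reads and distinct sequences discarded at each alignment threshold.

    per_sequence maps a sequence to a (read_count, alignment_count) pair.
    Returns {threshold: (discarded_reads, discarded_sequences)}.
    """
    result = {}
    for threshold in thresholds:
        discarded_reads = 0
        discarded_sequences = 0
        for read_count, alignment_count in per_sequence.values():
            if alignment_count > threshold:  # assumed cutoff rule
                discarded_reads += read_count
                discarded_sequences += 1
        result[threshold] = (discarded_reads, discarded_sequences)
    return result


# Toy data, for illustration only: sequence -> (reads, genomic alignments).
toy = {
    "seqA": (3, 1),
    "seqB": (10, 1200),
    "seqC": (2, 12000),
    "seqD": (5, 30000),
}
summary = discarded_at_thresholds(toy)
```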
Table S3. Statistics for RNA-seq libraries
List of RNA-seq libraries generated in the study together with genome and transcriptome
mapping statistics. Total: total number of reads in the library; rRNA: number of reads
mapped to rRNA consensus sequence with up to 3 mismatches; non-rRNA: number of
reads that did not align to rRNA; transcriptome: number of non-rRNA reads that have at
least one valid alignment to the mouse transcriptome with up to 3 mismatches; unique
mappers: non-rRNA reads with only one unique valid alignment to the mouse genome
(mm10) with zero mismatches; unique + multi mappers: non-rRNA reads with up to
10,000 valid alignments to the mouse genome with zero mismatches; complexity: the
number of distinct sequences divided by the total number of reads (only uniquely
mapped reads and corresponding sequences were considered for evaluation of
complexity).
Table S4. Intronic location among truncated L1 insertions
Table S5. Individual L1-A insertions analyzed in Fig. 4E.
Shown are the chromosomal location and the length of each L1 insertion, H3K9me3
enrichment in FACS-sorted spermatogonia of 10 dpp control animals, the ratio of
H3K9me3 in KO vs control animals, KO/Het ratio of both H3K9me3 and H3K4me2/3
levels measured by ChIP-qPCR, and p-values for the difference between Het and KO for
each individual insertion.
Table S6. Three genes located in vicinity of L1 insertions have altered expression
levels in Miwi2 KO testes
Out of 353 genes whose TSSs are positioned within 25 kb of a full-length (6000 bp or
longer) L1 insertion, three genes were differentially expressed in testes of 10-day-old
Miwi2 KO and control (heterozygous littermate) mice. Average expression: mean
normalized effective count of reads (eXpress estimation); Miwi2 Het testis: normalized
effective count of reads in control (Miwi2 heterozygous) testis; Miwi2 KO testis:
normalized effective count of reads in Miwi2 KO testis; foldChange: fold change in
expression between control and KO; log2FoldChange: fold change on log2 scale; p
value: p value of differential expression (DESeq estimation); adjusted p value: p value
adjusted for multiple testing using the Benjamini-Hochberg algorithm.
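For readers unfamiliar with the Benjamini-Hochberg adjustment named above, a minimal sketch of the standard procedure follows. It is a generic implementation for illustration, not the DESeq code; the function name is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy p-values, for illustration only.
toy_adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Each raw p-value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values non-decreasing with rank.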
Table S7. Primer sequences for ChIP-qPCR on both TE consensuses and individual loci
Table S8. Primer sequences for DNA methylation analysis
Table S9. Primer sequences for qPCR analysis of gene expression