science.sciencemag.org/cgi/content/full/science.abb0074/DC1
Supplementary Materials for
Mechanisms of OCT4-SOX2 motif readout on nucleosomes
Alicia K. Michael, Ralph S. Grand, Luke Isbel, Simone Cavadini, Zuzanna Kozicka,
Georg Kempf, Richard D. Bunker, Andreas D. Schenk, Alexandra Graff-Meyer, Ganesh
R. Pathare, Joscha Weiss, Syota Matsumoto, Lukas Burger, Dirk Schübeler*, Nicolas H.
Thomä*
*Corresponding author. Email: nicolas.thoma@fmi.ch (N.H.T.); dirk@fmi.ch (D.S.)
Published 23 April 2020 on Science First Release
DOI: 10.1126/science.abb0074
This PDF file includes:
Materials and Methods
Figs. S1 to S17
Table S1
Caption for Movie S1
References
Other Supplementary Material for this manuscript includes the following:
(available at science.sciencemag.org/cgi/content/full/science.abb0074/DC1)
MDAR Reproducibility Checklist (.pdf)
Movie S1 (.mov)
Materials and Methods
Human octamer histones expression, purification and reconstitution
Human histones were expressed and purified as described previously (40). Lyophilized histones were mixed
at equimolar ratios in 20 mM Tris-HCl (pH 7.5) buffer, containing 7 M guanidine hydrochloride and 20
mM 2-mercaptoethanol. Samples were dialyzed against 10 mM Tris-HCl (pH 7.5) buffer, containing 2 M
NaCl, 1 mM EDTA, and 2 mM 2-mercaptoethanol. The resulting histone complexes were purified by size
exclusion chromatography (Superdex 200; GE Healthcare).
DNA preparation
DNA for medium to large scale individual nucleosome purifications was generated by Phusion (Thermo
Fisher) PCR amplification (on average 2 x 96 well plates) or large-scale plasmid purification (GigaPrep,
Invitrogen) followed by EcoRV-HF (New England Biolabs) blunt-end restriction enzyme cleavage.
The resulting DNA fragment was purified by a monoQ column (GE Healthcare). The unmodified 601
Widom sequence was purified with a large-scale plasmid purification using a high copy plasmid containing
32 copies of the 601 sequence previously cloned into pUC19 vector (41, 42). All purified DNA was
concentrated and stored at -20C in 10mM Tris-HCl pH 7.5 until use.
Nucleosome assembly
The DNA and the histone octamer complex were mixed in a 1:1.5 molar ratio in the presence of 2 M KCl.
The samples were dialyzed against refolding buffer (RB) high (10 mM Tris-HCl (pH 7.5), 2 M KCl, 1 mM
EDTA, and 1 mM DTT). The KCl concentration was gradually reduced from 2 M to 0.25 M using a
peristaltic pump with RB low (10 mM Tris-HCl (pH 7.5), 250 mM KCl, 1 mM EDTA, and 1 mM DTT) at
4°C. Samples were further dialyzed against RB low buffer at 4°C overnight. After dialysis, nucleosomes
were incubated at 55C for 2 hours. The reconstituted nucleosome pools used for SeEN-seq were then
purified by native polyacrylamide gel electrophoresis using a Prep Cell apparatus (Bio-Rad) in TCS buffer
(10 mM Tris-HCl (pH 7.5) and 500 M TCEP). Large scale assemblies of individual nucleosomes were
purified by on a monoQ 5/50 ion exchange gradient (GE Healthcare) and desalted using a Zeba spin column
(Thermo Fisher) into 20mM Tris-HCl pH 7.5 and 500 M TCEP and stored at 4°C.
Protein expression and purification of OCT4 and SOX2
Human full-length OCT4 (residues 1 – 360), OCT4 DNA binding domain (residues 134-290) or human
SOX2 DNA binding domain (residues 37 – 118) were subcloned into pAC-derived vectors (43) containing
an N-terminal StrepII tag. An additional N-terminal EGFP tag and C-terminal sortase-6XHIS tag
(LPETGGHHHHHH) were fused in frame with full-length OCT4 to improve purification; this construct was also used
for cryo-EM sample preparation. Recombinant proteins were expressed in 2-4 L cultures of Trichoplusia
ni High Five cells using the Bac-to-Bac system (Thermo Fisher). Cells were cultured at 27°C, harvested 2
days after infection, resuspended in lysis buffer (50 mM Tris-HCl pH 8.0, 1 M NaCl, 100 µM
phenylmethylsulfonyl fluoride, 1 × protease inhibitor cocktail (Sigma), 250 µM TCEP), and lysed by
sonication. The supernatant was harvested and the proteins were purified by Streptactin affinity
chromatography (IBA), and then purified by Heparin ion exchange chromatography (GE Healthcare). All
proteins were further purified by size exclusion chromatography (Superdex 200; GE Healthcare) in GF
buffer (20 mM HEPES pH 7.4, 150 mM NaCl, 5% glycerol, 500 µM TCEP) as a last purification step. The
purified proteins were concentrated and stored at -80°C.
SeEN-seq library pool preparation
DNA sequences were generated by replacing Widom 601 sequence with the canonical consensus JASPAR
OCT4-SOX2 motif (CTTTGTTATGCAAAT, MA0142.1) (25) at 1bp intervals across the entire W601.
The W601-OCT4-SOX2 variant DNA sequences were flanked by EcoRV sites and adapter sequences and
ordered as gene fragments from TWIST Biosciences. The individual gene fragments were
suspended, pooled equally, cut with EcoRV-HF (NEB) and W601-OCT4-SOX2 variant DNA fragments
(153bp) purified from an agarose gel using QIAquick Gel Extraction kit (Qiagen). The W601-OCT4-SOX2 DNA pool was spiked with an excess of W601 DNA (1:30 molar ratio; pool:601). The nucleosome
pool was assembled and purified as described above.
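The tiling design of the library can be sketched as follows. This is a minimal illustration and not the script used for the study: the function name `tile_motif` is hypothetical, the backbone passed in is a placeholder rather than the real W601 sequence, and the sketch assumes the motif always lies fully inside the backbone. EcoRV sites and adapter sequences are not modeled.

```python
MOTIF = "CTTTGTTATGCAAAT"  # canonical OCT4-SOX2 motif (JASPAR MA0142.1), as given in the text


def tile_motif(backbone, motif=MOTIF):
    """Return one variant per 1-bp offset, with the motif overwriting the backbone.

    Hypothetical helper: the backbone length is preserved, so every variant
    differs from the backbone only inside the motif window.
    """
    if len(motif) > len(backbone):
        raise ValueError("motif is longer than the backbone")
    variants = []
    for start in range(len(backbone) - len(motif) + 1):
        end = start + len(motif)
        variants.append(backbone[:start] + motif + backbone[end:])
    return variants
```

Under these assumptions a 147-bp backbone and the 15-bp motif give 147 - 15 + 1 = 133 variants, each of the same length as the backbone.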
SeEN-seq assay
For SeEN-seq EMSAs, nucleosomes (100nM) were incubated with increasing amounts of full-length
OCT4, SOX2 (residues 37-118) or OCT4 and SOX2 (100nM, 200nM and 400nM) in 20uL reactions
containing 20mM Tris-HCl pH 7.5, 75mM NaCl, 10mM KCl, 1mM MgCl2, 0.1 mg/ml BSA, and 1 mM
DTT. The reactions were incubated at room temperature for ~1 hour and loaded onto a 6% non-denaturing
polyacrylamide gel (acrylamide:bis = 37.5:1) in 0.5X TGE and run for 1 hour (150V, room temperature).
At least 2 technical replicate experiments were generated for all TF(s) conditions tested. Gels were then
stained with SYBR gold nucleic acid stain (~10 min, Invitrogen). DNA bands corresponding to the size of
TF bound and unbound nucleosome complexes were imaged and excised using a C300 gel doc UV transilluminator (Azure Biosystems). Gel slices were incubated with acrylamide gel extraction buffer
(100uL, 500mM ammonium acetate, 10mM magnesium acetate, 1mM EDTA, 0.1% SDS) and heated
(50°C, 30 min). H2O (50uL) and QIAquick Gel Extraction kit QG buffer (450uL, Qiagen) were added and
the samples heated (50°C, 30 min). Samples were briefly spun and the supernatant containing DNA
fragments was transferred to QIAquick Gel Extraction spin columns. Samples were purified according to
the manufacturer's instructions, eluted in H2O (22uL) and DNA quantified by Qubit reagent (Thermo Fisher
Scientific). Purified DNA (20uL, ~2-20ng DNA) was used for next-generation sequencing (NGS) library
preparation (NEBNext ChIP-seq, E6240S) with dual indexing (E7600S) and no more than 10 cycles of
PCR amplification. Purified sequencing libraries were quantified by Qubit reagent (Thermo Fisher) and the
library size checked on the bioanalyser platform (Agilent) before sequencing on an Illumina MiSeq or
NextSeq platform (300bp paired-end). Sequencing fragments were mapped to the W601 sequence and
OCT4-SOX2 motif containing variants (153bp) using the Bioconductor package QuasR with default
settings (44), which internally uses bowtie for read mapping (45). The number of sequence reads aligned to
each construct was quantified by the QuasR function qCount with every construct represented. SeEN-seq
enrichments are calculated by determining the fold change between library-size normalized read counts for
each 601-OCT4-SOX2 variant in the TF-bound and unbound nucleosome fractions. These fold changes
represent a relative affinity difference between all positions. In all replicates we were able to capture every
motif position, suggesting that the OCT4-SOX2 motif does not dramatically affect nucleosome stability.
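The enrichment calculation can be illustrated with a short sketch. It assumes that library size is the total number of reads assigned to all constructs in a fraction and that every construct has a nonzero count in both fractions (as reported for all replicates); the function name `seenseq_enrichment` is hypothetical, and the original analysis was carried out in R.

```python
def seenseq_enrichment(bound, unbound):
    """Fold change of library-size normalized counts, bound over unbound.

    `bound` and `unbound` map construct names to raw read counts.
    Library size is taken as the sum of counts in each fraction (assumption).
    """
    total_bound = sum(bound.values())
    total_unbound = sum(unbound.values())
    enrichment = {}
    for construct, n_bound in bound.items():
        frac_bound = n_bound / total_bound
        frac_unbound = unbound[construct] / total_unbound
        enrichment[construct] = frac_bound / frac_unbound
    return enrichment
```

Values above 1 indicate constructs over-represented in the TF-bound fraction relative to the unbound fraction.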
For autocorrelation analysis, SeEN-seq profiles (averages of technical replicates) were de-trended by
subtracting an 11bp running mean and autocorrelation values were calculated from the residuals using
default parameters of the acf function from the R TSA package (46).
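A plain-Python sketch of the de-trending and autocorrelation steps is given here for orientation. It is not the R code used for the study: the running mean is assumed to be centered with the window truncated at the profile ends, and the autocorrelation uses the standard sample estimator for lags 1 and above. All function names are hypothetical.

```python
def running_mean(values, window=11):
    """Centered running mean; the window is truncated at the profile edges (assumption)."""
    half = window // 2
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def autocorrelation(values, max_lag):
    """Sample autocorrelation for lags 1..max_lag (sums divided by the lag-0 sum of squares)."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    denom = sum(c * c for c in centered)
    return [
        sum(centered[t] * centered[t + lag] for t in range(n - lag)) / denom
        for lag in range(1, max_lag + 1)
    ]


def detrended_acf(profile, window=11, max_lag=30):
    """Subtract the running mean from the profile, then autocorrelate the residuals."""
    trend = running_mean(profile, window)
    residuals = [p - t for p, t in zip(profile, trend)]
    return autocorrelation(residuals, max_lag)
```

For a profile with a ~10-bp periodicity, the returned list shows a positive peak at index 9 (lag 10) and a negative trough at index 4 (lag 5).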
In vivo analysis of Oct4 binding and accessibility at full and partial motifs
Comparison of in vivo binding of Oct4 and Sox2
Previously published ChIP-seq data for Oct4 and Sox2 (34) were downloaded from GEO: GSM1910640
and GSM1910642 (two replicates of Sox2 ChIP-seq), GSM1910644 and GSM1910646 (two replicates of
Oct4 ChIP-seq) and GSM1910648 (input chromatin control; only the first replicate was used due to the limited number
of reads in the second replicate). ChIP-seq datasets were selected from this particular study (34) due to the
existence of replicates for both factors, a well-matched control and their limited GC bias, as GC bias can
confound the analysis of NGS data if not carefully controlled for (47-49). Datasets were downloaded from
GEO using the SRAdb R package (50) and aligned to the mm10 assembly of the mouse genome using
Bowtie (45) within the QuasR (44) package. Bowtie was run using QuasR default parameters, returning
only unique alignments. All read counting in given genomic regions was done using the QuasR function
qCount, whereby reads were shifted by 80 base-pairs. For all replicates across TF datasets, peaks were
identified using MACS2 (51) using default parameters and with corresponding control samples as a
background. Resulting peaks were then filtered requiring at least 80% mappability. Here we define
mappability as the fraction of all possible 50mers in a given region that are uniquely mappable using default
QuasR parameters for Bowtie (read lengths of the ChIP-seq datasets used in this study are 45 and 50). The
library-size normalized counts were determined as $\mathit{ns}_{\mathrm{IP}} = \min(N_{\mathrm{IP}}, N_{\mathrm{control}}) \cdot (n_{\mathrm{IP}}/N_{\mathrm{IP}})$ and
$\mathit{ns}_{\mathrm{control}} = \min(N_{\mathrm{IP}}, N_{\mathrm{control}}) \cdot (n_{\mathrm{control}}/N_{\mathrm{control}})$, where $n_{\mathrm{IP}}$ and $n_{\mathrm{control}}$ are the raw counts per peak and
$N_{\mathrm{IP}}$ and $N_{\mathrm{control}}$ are the total number of reads mapping to the genome in the IP and control sample, respectively.
Thus, counts were in each case scaled down to the smaller library. For each dataset, enrichment over input
in peaks was defined as $\log_2(\mathit{ns}_{\mathrm{IP}} + 8) - \log_2(\mathit{ns}_{\mathrm{control}} + 8)$, where a pseudo-count of 8 is used to reduce
noise levels at small read counts. Only peaks with a log2 enrichment of at least 1 were retained for further
analysis. The joint peak set was defined as the union of all peaks identified in any of the samples. In cases
where two or multiple peaks overlapped, a new peak region was defined containing all the nucleotides of
the overlapping peaks. The 500 top-enriched peaks of each sample were used for de novo motif finding
using HOMER (52). HOMER was run using the function findMotifsGenome.pl using 5 different motif
lengths (6, 10, 14, 18 and 22) and 200nt long sequences centered on each peak as input. Resulting weight
matrices were, if necessary, reverse-complemented so they all had the same orientation as the reference
weight matrix in the Jaspar database (MA0142.1) (25). Each inferred weight matrix was then used to scan
the genome using the matchPWM function from the Biostrings R package (53). Matching sequences were
determined by requiring a log2-odds score of at least 10 over a uniform background. In cases where two
(or multiple) matches overlapped (ignoring their strands), only the match with the highest log2-odds score
was retained.
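The count scaling, enrichment and peak-union steps of this paragraph can be sketched as follows. This is an illustration rather than the original R code: function names are hypothetical, and peaks are assumed to be closed (start, end) intervals on a single chromosome.

```python
import math


def scaled_counts(n_ip, n_control, total_ip, total_control):
    """Scale raw peak counts down to the smaller of the two libraries."""
    smaller = min(total_ip, total_control)
    return smaller * n_ip / total_ip, smaller * n_control / total_control


def log2_enrichment(n_ip, n_control, total_ip, total_control, pseudo=8):
    """log2(ns_IP + pseudo) - log2(ns_control + pseudo), with a pseudo-count of 8 by default."""
    ns_ip, ns_control = scaled_counts(n_ip, n_control, total_ip, total_control)
    return math.log2(ns_ip + pseudo) - math.log2(ns_control + pseudo)


def merge_overlapping(peaks):
    """Union of (start, end) intervals; overlapping peaks are fused into one region."""
    merged = []
    for start, end in sorted(peaks):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

As a worked case, a peak with 120 IP reads and 20 control reads, in libraries of 2,000,000 and 1,000,000 reads, is scaled to 60 and 20 reads and has an enrichment of log2(68/28), which passes the cut-off of 1.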
In vivo analysis of Oct4 binding and accessibility at full and partial motifs
For the analysis focusing on full and partial motifs, we downloaded an additional previously published
ChIP-seq dataset for Oct4 (33) as well as ATAC-seq data for untreated and doxycycline-treated mouse ESCs
that contain doxycycline-sensitive Oct4 and Sox2 transgene (36, 54) from GEO: accessions GSM2417142
(Oct4 ChIP-seq), GSM2417127 (whole cell extract control), GSM2341271-6 (ATAC-seq) and GSE134652
(ATAC-seq). We used an additional Oct4 dataset (GSM2417142) to ensure that the results were
reproducible across different labs. This additional ChIP-seq dataset was again selected due to its limited
GC bias and well-matched control. The previously used Oct4 replicates (GSM1910644 and GSM1910646,
see above) were merged for this analysis. ATAC-seq reads were trimmed using cutadapt (55). Both ChIP-seq and ATAC-seq data were aligned as described for the Oct4 and Sox2 samples in the previous paragraph.
The weight matrix for Oct4-Sox2 was downloaded from Jaspar (MA0142.1) (25) and Oct4-Sox2 sites (full
motif) were predicted by scanning the genome and selecting for sites with log2-odds score (probability of
a given sequence under the weight matrix model versus a uniform background) of at least 10 using the
matchPWM function from the Biostrings package (53). HMG-POUS partial motif matches were determined
by scanning with only the first 11 bases of the weight matrix, excluding the last 4 bases which model the
specificity of the POUHD, using the same cut-off of 10. Only those partial motif matches where the 4 bases
of sequence downstream of the predicted site had a log2-odds score < 0, using the last 4 bases of the weight
matrix, were retained for further analysis. Analogously, only those predicted full motifs were retained that
had a log2-odds score >= 0 for the last 4 bases of the predicted site. In this way, we ensured that full motif
matches contained a 3’ sequence that can be bound by the POUHD and partial motif matches did not, using
the logic that a log2-odds of 0 means that the sequence is equally likely to come from the weight matrix and
a uniform background and thus represents a natural cut-off for motif distinction. Each predicted site was
enlarged to a window of 251bp centered at the start of the motif and only predicted sites for which at least
80% of all possible overlapping 50-mers within the enlarged sequence were mappable using QuasR default
parameters were retained for further analysis. In addition, only predicted sites for a full motif that did not
overlap with a HMG-POUS motif, and vice versa, within the 251bp window were used. Finally, promoters
were defined as the regions ±1000bp around transcription starts of all genes in the UCSC known Gene table
(http://genome.ucsc.edu), via the R package TxDb.Mmusculus.UCSC.mm10.knownGene (56) and only
distal predicted sites not overlapping with any promoter in this set were kept. ChIP-seq as well as ATAC-seq
counts on the (enlarged) predicted sites were determined using the QuasR function qCount, using a shift of
80bp for the ChIP-seq data and no shift for ATAC-seq. Log2 ChIP enrichments over the control were
determined as described in the previous paragraph. Library-size normalization was performed for both ChIP
and ATAC-seq data by normalizing to the smallest library as described above.
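The distinction between full and partial (HMG-POUS) motif matches can be sketched in code. This is an illustration only: the scanning in the study used matchPWM from the Biostrings R package, whereas the functions here are hypothetical, handle a single candidate sequence on one strand, and use a toy weight matrix built from the consensus rather than the actual JASPAR MA0142.1 values.

```python
import math

BASES = "ACGT"


def toy_pwm(consensus, p_match=0.85):
    """Hypothetical matrix for illustration only; NOT the JASPAR MA0142.1 values."""
    p_other = (1 - p_match) / 3
    return [{b: (p_match if b == c else p_other) for b in BASES} for c in consensus]


def log2_odds(seq, pwm, background=0.25):
    """Sum of per-position log2(p / background) against a uniform background."""
    return sum(math.log2(col[base] / background) for base, col in zip(seq, pwm))


def classify_site(seq, pwm, n_partial=11, cutoff=10.0):
    """Return 'full', 'partial' or None for a sequence as long as the weight matrix."""
    head, tail = seq[:n_partial], seq[n_partial:]
    tail_score = log2_odds(tail, pwm[n_partial:])  # last 4 bases (POUHD part)
    if log2_odds(seq, pwm) >= cutoff and tail_score >= 0:
        return "full"
    if log2_odds(head, pwm[:n_partial]) >= cutoff and tail_score < 0:
        return "partial"
    return None
```

Because the two classes are separated by the sign of the score over the last 4 bases, a site cannot be assigned to both. The mappability, overlap and promoter filters described in the text are not covered by this sketch.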
Fluorescence polarization (FP) assays
Flc-labelled DNA containing the canonical OCT4-SOX2 motif (5ʹ-Flc-GACCTTTGTTATGCAAATTAA-3ʹ) was used as a fluorescent tracer. Increasing amounts of OCT4 or SOX2 (0.3-2500 nM) were mixed with
tracer (10 nM final concentration) in a 384-well microplate (Greiner, 784076) and incubated for 15 min at
room temperature. The interaction was measured in a buffer containing 15 mM HEPES pH 7.4, 250 µM
TCEP, 75 mM NaCl, 10 mM KCl, 1 mM MgCl2, 0.1% (v/v) pluronic acid. Changes in fluorescence
polarization were monitored by a PHERAstar FS microplate reader (BMG Labtech) equipped with a
fluorescence polarization filter unit. The polarization units were converted to fraction bound as described
previously (57). The fraction bound was plotted versus OCT4 or SOX2 concentration and fitted assuming
a one-to-one binding model to determine the dissociation constant (Kd) using Prism 7 (GraphPad). Since
the oligonucleotide that was used contained a fluorescent label, we refer to these as apparent Kd (Kd (app)).
All measurements were performed in triplicate. For the competitive titration assays, the OCT4 or SOX2
bound to the fluorescent oligo tracer was back-titrated with unlabeled oligo or nucleosomes containing the
canonical motif at different sites. The competitive titration experiments were carried out by mixing tracer
(10 nM), OCT4 (300 nM) or SOX2 (150 nM) and increasing concentration of different nucleosomes or
DNA (0 - 3.2M). The fraction bound vs. the nucleosome or DNA concentration were fitted with a
6
nonlinear regression curve to obtain the IC50 values in Prism 7 (GraphPad). Two to three technical replicates
were measured for each reaction. We note that the assay does not allow us to differentiate between a specific
and nonspecific contribution to the binding.
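For readers who want to reproduce the forward-titration fit outside Prism, a one-to-one binding model can be sketched as below. The exact equation used in Prism is not stated here, so this sketch makes an explicit assumption: it uses the quadratic form of 1:1 binding, which accounts for depletion of free protein by the 10 nM tracer, and estimates Kd by a simple golden-section search. Function names and search bounds are hypothetical.

```python
import math


def fraction_bound(protein_nM, kd_nM, tracer_nM=10.0):
    """Fraction of tracer bound for 1:1 binding, accounting for ligand depletion."""
    b = protein_nM + tracer_nM + kd_nM
    complex_nM = (b - math.sqrt(b * b - 4.0 * protein_nM * tracer_nM)) / 2.0
    return complex_nM / tracer_nM


def fit_kd(protein_nM, observed, tracer_nM=10.0, lo=-2.0, hi=5.0, iters=100):
    """Least-squares estimate of Kd (nM) by golden-section search over log10(Kd)."""

    def sse(log_kd):
        kd = 10.0 ** log_kd
        return sum(
            (fraction_bound(p, kd, tracer_nM) - y) ** 2
            for p, y in zip(protein_nM, observed)
        )

    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    for _ in range(iters):
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if sse(c) < sse(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return 10.0 ** ((a + b) / 2.0)
```

The search assumes a single minimum of the squared error within the given bounds, which holds for well-behaved titration data but should be checked for noisy curves.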
Thermal stability assay of nucleosomes
Thermal stability assays (TSAs) of the nucleosomes were performed by the previously described method
(58). The nucleosomes (final concentration 1 µM) or complexes (1 µM nucleosome or DNA:2 µM OCT4-SOX2) were incubated with a temperature gradient from 26°C to 95°C, in steps of 1°C/min, using a
StepOnePlus™ Real-Time PCR unit (Applied Biosystems), in 1X binding buffer (BB) containing
5 × SYPRO Orange (Sigma-Aldrich). The buffer only background control was subtracted from the raw
fluorescence data and then normalized and plotted.
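The background subtraction and normalization can be sketched as below. The normalization scheme is not specified in the text, so min-max scaling to the range 0 to 1 is an assumption, and the function name is hypothetical.

```python
def normalize_melt_curve(sample, buffer_only):
    """Subtract the buffer-only trace, then rescale to 0..1 (min-max; an assumption)."""
    corrected = [s - b for s, b in zip(sample, buffer_only)]
    low, high = min(corrected), max(corrected)
    if high == low:
        raise ValueError("flat trace cannot be normalized")
    return [(c - low) / (high - low) for c in corrected]
```

Both traces are expected to be sampled at the same temperatures, one value per step of the gradient.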
Cryo-EM sample preparation
After nucleosome assembly, the nucleosomes were purified using a Mono Q 5/50 column (GE Healthcare)
and desalted into 20mM Hepes pH 7.4, 0.5mM TCEP. The NCP SHL6 (~65 µL, 5 µM) was then mixed with
a molar excess of GFP-OCT4 and SOX2 in a ~100 µL volume and incubated at room temperature for 30
minutes (1:3:3; NCP:OCT4:SOX2 molar ratio) in a binding buffer containing 20 mM HEPES pH 7.4, 1 mM
MgCl2, 10 mM KCl, and 0.5mM TCEP. The sample was then purified using a Superose 6 3.2/300 column
(GE Healthcare) into a buffer containing 20mM Hepes pH 7.4, 50mM NaCl, 10mM KCl, 1mM MgCl2 and
0.5mM TCEP or directly used for a GraFix gradient (29). Peak fractions were analysed by SDS-PAGE
stained with Coomassie and native PAGE stained with SYBR gold (Thermo Fisher) to identify protein-DNA complexes. The sample was concentrated using an Amicon Ultra-0.5mL centrifugal filter (Merck
Millipore) and either prepared directly for electron microscopy (non-crosslinked sample) or subjected to
crosslinking using the GraFix method (29). For GraFix crosslinking, the OCT4-SOX2-NCPSHL6 complexes
were layered on top of a 10%–30% (w/v) sucrose gradient (50 mM HEPES pH 7.4, 50 mM NaCl, 0.2 mM
TCEP) with an increasing concentration (0.18%–0.36% w/v) of glutaraldehyde (EMS) and subjected to
ultracentrifugation (Beckman SW40Ti rotor, 30000 rpm, 18 h, 4°C). After centrifugation, 300 µL fractions
were collected from the top of the gradient and peak fractions were analyzed by both native PAGE and
SDS-PAGE. Samples were quenched with 50mM Tris-HCl pH 7.5. The peak fractions were combined and
desalted to remove sucrose using a Zeba spin column (Thermo Fisher). The resulting sample was
concentrated with an Amicon-Ultra 0.5mL centrifugal filter to ~1 µM nucleosomes as determined by
measuring the DNA concentration at Abs260. After concentration, 3 µL of sample was applied to Quantifoil
holey carbon grids (R 1.2/1.3 200-mesh, Quantifoil Micro Tools). Glow discharging was carried out in a
Solarus plasma cleaner (Gatan) for 15s in a H2/O2 environment. Grids were blotted for 3s at 4°C at 100%
humidity in a Vitrobot Mark IV (Thermo Fisher), and then immediately plunged into liquid ethane.
Cryo-EM data collection
Data were collected automatically with EPU (Thermo Fisher) on a Cs-corrected (CEOS GmbH, Heidelberg,
Germany) Titan Krios (Thermo Fisher) electron microscope at 300 keV. Zero-energy loss micrographs
were recorded using a Gatan K2 summit direct electron detector (Gatan) in counting mode located after a
Quantum-LS energy filter (slit width of 20 eV). The acquisition was performed at a nominal magnification
of 130,000 × in EFTEM nanoprobe mode yielding a pixel size of 0.86 Å at the specimen level. The objective
aperture was 100 m. All datasets were recorded with exposure rates between 3.5-5 e-/(px·s) and the
exposures were fractionated into 40 frames. The targeted defocus values ranged from -0.25 to -2 m.
Cryo-EM image processing
Real-time evaluation along with acquisition with EPU (Thermo Fisher) was performed with CryoFLARE
(59). Drift correction was performed with the RELION 3.0 (60) motioncor implementation where a motion
corrected sum of all 40 frames was generated with and without applying a dose weighting scheme and CTF
was fitted using GCTF (61) on the non-dose-weighted sums. Particles were picked using crYOLO on the
dose-weighted sums (62). All datasets were processed in RELION 3.0 (60) including: 2D and 3D
classification, 3D refinement, particle polishing and CTF refinement. The resulting particles were imported
into cryoSPARCv2 (63) and a final non-uniform refinement was performed. The resolution values reported
for all reconstructions are based on the gold-standard Fourier shell correlation curve (FSC) at 0.143 criterion
(64, 65) and all the related FSC curves are corrected for the effects of soft masks using high-resolution
noise substitution (66). A negative B factor was applied to sharpen the maps automatically in PHENIX
(phenix.auto_sharpen) (67). Before sharpening, all maps have been filtered based on local resolution
estimated with MonoRes (Xmipp) (68).
All datasets were analysed for anisotropic effects and relative angular distribution using both cryo-EF (69)
and MonoDir (70). For cryo-EF analysis a particle size of 190 Å was used. The box size and FSC resolution
were taken from the final refinement. Final 3D density maps with an efficiency Eod > 0.7, as determined by
cryo-EF, were judged as having only limited angular distribution defects (as described in (69)). For both
TF-bound nucleosome reconstructions, OCT4-SOX2-NCPSHL-6 and OCT4-SOX2-NCPSHL+6, the Eod value
was within an acceptable range (Eod = 0.73, 0.78, respectively) and no additional normalization of angular
distribution was performed. The non-crosslinked nucleosome only sample (NCPSHL-6) showed significant
anisotropy, as determined by cryo-EF (Eod = 0.67). In order to decrease anisotropy and improve the quality
of the map, a 2D classification into 50 classes was performed on these particles (28,138 particles). Junk
classes containing only a small number of particles were discarded. The remaining 16 2D classes were
balanced by selecting a random subset of particles in each class corresponding to the number of particles in
the smallest class. This resulted in 6,302 particles. After subsequent refinement in cryoSPARCv2 of these
limited particles, cryo-EF was performed and the Eod increased to 0.79, with only a nominal decrease in
resolution (PSF best, 3.38 Å, PSF worst, 5.50 Å). Directional resolution and radial averages versus
resolution for all maps were calculated using monoDIR (70) (fig. S12).
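The class-balancing step used to reduce anisotropy can be sketched as follows. This is a generic illustration, not the procedure as run in RELION or cryoSPARC: the function name is hypothetical, classes are represented as lists of particle identifiers, and junk classes are assumed to have been removed beforehand.

```python
import random


def balance_classes(classes, seed=0):
    """Randomly subsample every 2D class down to the size of the smallest class.

    `classes` maps a class name to a list of particle identifiers.
    """
    target = min(len(members) for members in classes.values())
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    return {name: rng.sample(members, target) for name, members in classes.items()}
```

Sampling is without replacement, so each particle is selected at most once.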
Model building and refinement
A nucleosome template model was extracted from PDB entry 6NJ9 for subsequent interpretation of the
cryo-EM maps and was identified as having highest correlation with the cryo-EM map of the nucleosome
in a search of all available nucleosome models in the PDB (based on the highest cross correlation coefficient
calculated with PHENIX) (67). The template DNA was replaced with the specific 601-OCT4-SOX2
sequence based on the SOX2 motif and confirmed by the differences in purine and pyrimidine densities (in
regions with adequate resolution). Human histone coordinates were extracted from PDB entry 5Y0C and
rigid body docked. For the OCT4-SOX2-NCPSHL-6 structure, a homology model of human SOX2 HMG in
complex with bent DNA was prepared from a Mus musculus SOX2-HMG/OCT1-POU/DNA ternary
complex (PDB entry 1GT0; chains A, B, and D; 100% sequence identity for relevant part, see also fig. S4),
docked, and merged with the DNA from PDB entry 6NJ9. A homology model of human OCT4 POUs was
prepared from a Mus musculus OCT4-POU/DNA complex (PDB entry 3L1P; chain B; 93% sequence
identity for relevant part), and rigid body docked. Initial fitting of the template models into the cryo-EM
maps, and model building was carried out interactively with COOT (71) and ROSETTA (72). For all-atom
refinement, dihedral angle restraints for OCT4 and SOX2 were generated from the corresponding template
models (same as above) using PHENIX (67). Initial all-atom real-space refinement was carried out with
PHENIX applying reference model restraints (sidechain and backbone torsions) for OCT4 and SOX2 and
secondary structure restraints for protein (hydrogen bonds) and DNA (hydrogen bonds, planarity, stacking)
(67). Later refinement steps (including B factor fitting) were carried out with ROSETTA (72) using the
‘FastRelax’ protocol in combination with a density scoring function (73) and reference model restraints
(sidechain and backbone torsions) for OCT4 and SOX2 (converted from PHENIX). The weight for the
dihedral angle restraints was adjusted to allow a certain degree of freedom in order to prevent clashes as
well as geometry and density-fit outliers. Restraints for the covalently-attached crosslinker (modified PTD)
were generated with PHENIX and JLigand (74). MOLPROBITY (75) and PHENIX were used for model
validation. For the non-crosslinked OCT4-SOX2-NCPSHL-6 structure, the model of the crosslinked OCT4-SOX2-NCPSHL-6 structure was rigid body docked, followed by all-atom refinement as described above. For
the OCT4-SOX2-NCPSHL+6 structure, the OCT4 and SOX2 chains together with respective DNA fragments
from the OCT4-SOX2-NCPSHL-6 and NCPSHL-6 only structures were rigid body docked and re-combined
with COOT. A 3.45 Å mouse OCT4/SOX2/DNA complex structure (PDB: 6HT5) was used for validation
of the overall arrangement of OCT4-SOX2 on a juxtaposed OCT4-SOX2 DNA motif (see fig. S11C). The
sequence register was confirmed by purine/pyrimidine density patterns. All-atom refinement was carried
out as described above. In all models, sidechain atoms with missing or ambiguous density were marked by
setting their occupancies to zero (see fig. S14).
Density maps segmentation, figure preparation
Structural figures and cryo-EM segmented maps were produced with PyMOL (The PyMOL Molecular
Graphics System, Version 2.0 Schrödinger, LLC) and UCSF Chimera (version 1.13).
Calculation of clash scores and contact surface area
Clash scores for OCT4–nucleosome models were calculated using a PyMOL script (scanFactor.py) as
described previously (13). In brief, an OCT1 probe (1O4X) containing an appropriately positioned DNA
fragment for superimposing on a nucleosome template model was placed in all possible binding positions,
and the clash score for each taken as total number of atoms in OCT1 closer than an adjustable threshold
distance (1 Å default) to nucleosome atoms. The PyMOL script has also been deposited here:
https://github.com/aliciamichael/amichael/blob/master/scanFactor_var_super.py
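The clash score definition can be written out in plain Python as a sketch. This is not the deposited PyMOL script: it assumes the probe has already been superimposed on the nucleosome model, takes atoms as (x, y, z) coordinates in Å, and uses a brute-force distance check. The function name is hypothetical.

```python
def clash_score(probe_atoms, target_atoms, threshold=1.0):
    """Count probe atoms lying closer than `threshold` (Å) to any target atom."""
    limit_sq = threshold * threshold
    score = 0
    for px, py, pz in probe_atoms:
        for tx, ty, tz in target_atoms:
            if (px - tx) ** 2 + (py - ty) ** 2 + (pz - tz) ** 2 < limit_sq:
                score += 1
                break  # each probe atom is counted once
    return score
```

Repeating the calculation for every binding position of the probe gives a clash profile along the nucleosomal DNA.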
DNase I nucleosome footprinting assay
Nucleosome core particles (NCP) reconstituted with Widom 601 DNA containing an OCT4-SOX2 motif
51bp from the dyad and purified full-length human OCT4 or OCT4 and SOX2 (residues 37-118) were
mixed in a 1:2 molar ratio in BB buffer (20 mM HEPES pH 7.4, 1 mM MgCl2, 10 mM KCl, and 0.5mM
TCEP) and incubated on ice for ~30 minutes. Nucleosomes in the presence or absence of OCT4 and/or
SOX2 were treated with a titration (0.12U, 0.06U, 0.03U) of DNaseI (NEB M0303S) in the presence of
MgCl2 (2.5mM) and CaCl2 (0.5mM) for 5 minutes at 37°C. The reaction was stopped by adding an equal
volume of Stop Buffer (200 mM NaCl, 30 mM EDTA, 1% SDS) and chilled on ice for 10 min. Samples
were treated with Proteinase K (10 µg) for 2 hours and DNA retrieved using Ampure Beads (A63881). DNA
was used for sequence library preparation (NEBNext ChIP-seq, E6240S) with Dual indexing and sequenced
on an Illumina MiSeq (300bp paired-end). Sequences were mapped to the Widom 601 sequence (147bp)
containing the OCT4-SOX2 motif using the Bioconductor package QuasR with default settings (44), which
internally uses bowtie for read mapping (45). The start position of mapped reads, the DNaseI cut site, was
extracted and the counts were binned into 1bp bins across the length of the W601 sequence. Plots and
comparisons were done using 100,000 reads per replicate.
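The per-base cut-site counting can be sketched as below. Several points are assumptions: read starts are taken as 1-based positions that already represent the 5ʹ end of each read, and the 100,000 reads per replicate are obtained by random subsampling, which the text does not specify. The function name is hypothetical.

```python
import random
from collections import Counter


def cut_site_profile(read_starts, length=147, n_reads=100_000, seed=0):
    """Per-base counts of read start positions (1-based) after subsampling to n_reads."""
    rng = random.Random(seed)
    starts = list(read_starts)
    if len(starts) > n_reads:
        starts = rng.sample(starts, n_reads)  # subsampling is an assumption
    counts = Counter(starts)
    return [counts.get(pos, 0) for pos in range(1, length + 1)]
```

The returned list has one entry per base of the template, so profiles from different conditions can be compared position by position.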
Generation of single motif mES cell lines
The Recombinase-mediated Cassette Exchange (RMCE) insertion protocol was used to generate clonal
lines with variant sequences inserted at the same position (35, 76). Briefly, TC-1 ES cells (background
129S6/SvEvTac) carrying an RMCE selection cassette (described in (35)) were selected under hygromycin
(250 μg/ml, Roche, Switzerland) for 10 days. Next, 4 million cells were electroporated (Amaxa
nucleofection, Lonza, Switzerland) with 25 μg of L1-601/motif-1L plasmid and 15 μg of pIC-Cre. Negative
selection with 3 μM Ganciclovir (Roche, Switzerland) was started 2 days after transfection and continued
for 10 days. Pools of selected cells were clonally expanded and tested for successful insertion of the DNA
construct by PCR using primers recognizing the locus flanking the insertion site. For motif insertion, DNA
fragments containing the Widom 601 sequence were cloned into a plasmid containing a multiple cloning
site flanked by two inverted L1 Lox sites. All motifs were inserted into the SHL -6 position as used for
structural studies (see above). The optimal match to the weight matrix for Oct4-Sox2 from the Jaspar
database (MA0142.1) (25) was used to construct the full-length motif while those nucleotides having the
most detrimental contribution to a score corresponding to the POUHD domain-contacted nucleotides were
used to construct the partial HMGPOUS motif. Control sequences were constructed by using only those
nucleotides of the weight matrix model contacted by the HMG domain of Sox2 in the motif. For each
construct at least two independent clones were expanded and used for analysis.
Chromatin immunoprecipitation
ChIP experiments were carried out as described (77), starting with 70 μg of chromatin and 5 µl of Oct-4A
(C30A3C1) Rabbit mAb (Cell Signaling Technology, cat #5677S). Real-time PCR was performed using
SYBR Green chemistry (Applied Biosystems) and 1/20 of the ChIP reaction or 1/40 of input chromatin per
PCR reaction. Two overlapping primer sets were used targeting the RMCE insertion locus to test
enrichment, as well as a negative control primer set for ChIP enrichment (Mm10 genome build: chr12
47,899,688-47,899,802). All primer sequences are available upon request.
Fig. S1. SeEN-seq is highly reproducible and identifies specific sites of TF binding to nucleosomes. A,
SeEN-seq profiles are highly reproducible across a range of TF concentration (Pearson correlation >0.8).
B, SeEN-seq enrichment profiles are highly reproducible between technical replicates at single TF
concentrations (Pearson correlation >0.8). C, SeEN-Seq enrichment profiles shown across the nucleosome,
indicated by SHLs that describe where the minor groove faces away (±1, 2, etc.) or towards (±1.5, 2.5, etc.)
the histone octamer. Asterisks (*) indicate positions tested in (27). Values are the average of technical
replicates for the nucleosome pool only (no TF), or the nucleosome pool in the presence of OCT4, OCT4-SOX2, or SOX2 and the difference between OCT4-SOX2 and OCT4 alone. Error bars indicate s.d.
Fig. S2. SeEN-seq validation by independent methods and periodic binding to nucleosomal DNA. A,
Schematic of the 601 sequence indicating histone interactions. The location of the OCT4-SOX2 motif
(SOX2, orange; OCT4, red) for individual affinity measurements is indicated below. Grey bars indicate
regions of histone-DNA interactions known as the ‘TA steps’ in the Widom 601 nucleosome positioning
sequence (12). B, 10 nM of a fluorescein (Flc) labeled 21-bp DNA containing the OCT4-SOX2 motif
(OCT4-SOX2 oligo, 5’-Flc-GACCTTTGTTATGCAAATTAA) were mixed with 300 nM OCT4 full-length protein and counter-titrated with nucleosomes (Methods). Relative affinities are indicated as IC50
values. All data include two to three technical replicates and are shown as mean ± s.d. C, 10 nM of a Flc-21-bp-DNA containing the OCT4-SOX2 motif were mixed with 150 nM SOX2 (residues 37-118) and
counter titrated with nucleosomes. Relative affinities are indicated as IC50 values. All data include two to
three technical replicates and are shown as mean ± s.d. D, Fluorescence polarization forward titration
experiments using 10 nM Flc-OCT4-SOX2 oligo in the presence of increasing amounts of either full-length
OCT4, SOX2 (residues 37-118) or OCT4-SOX2 mixed at equimolar concentrations and titrated as
indicated. For OCT4-SOX2, the concentration indicates the concentration of the heterodimer. Kd (app):
apparent dissociation constant. E, As in B and C but counter-titrations were performed with the indicated
unlabeled oligonucleotides. Relative affinities are indicated as IC50 values. We note that these
measurements are unable to distinguish between non-specific and specific binding events and were fitted
using a total binding curve. All data include three technical replicates (n=3) and are shown as mean ± s.d.
F, SeEN-seq enrichments for OCT4 (grey, three technical replicates per position) and the 3bp-running
average across the nucleosome (blue). Indicated are observed regions of periodic binding at 10bp intervals.
G, Autocorrelation analysis of the nucleosome pool only (no TF), or in the presence of OCT4, OCT4-SOX2
and SOX2 SeEN-seq enrichments across the nucleosome, x-axis gives lag in bp and dashed lines indicate
95% confidence interval to indicate statistical significance. H, SeEN-seq enrichments for SHL-6.5 to SHL-5.5, with the corresponding solvent accessible domain of OCT4 indicated above. Solvent accessibility
was determined using atom clash score (see Methods). Asterisk (*) indicates SHL-6 position used for structure
determination. I, ChIPseq enrichments (Log2 - Immunoprecipitated/Input) on the joint set of peaks from
two replicates of Sox2 (Sox2 R1 and Sox2 R2) and Oct4 (Oct4 R1 and Oct4 R2). Bottom panels show
scatter plots of ChIP enrichments, top panels show the Pearson’s correlation coefficients. Peaks bound
strongly by Oct4 tend to also be strongly bound by Sox2. J, Sequence logos of the top weight matrix of
each dataset as inferred by HOMER on the top 500 peaks (ranked by enrichment over input). For easier
visual comparison, starting or trailing positions with very low information content were removed (first 3
positions of the Sox2 R1 and the first two and last position of the Oct4 R2 logo (information content <
0.041)) or positions with uniform nucleotide distributions were added at the beginning or end (1 at the end
for Sox2 R2 and 3 at the beginning and 2 at the end for Oct4 R1). K, The percentage of the top 500 peaks
possessing their respective motif (log2-odds motif score >= 10). L, SeEN-seq enrichment for those
positions bound at least 2-fold in the OCT4 only SeEN-seq profile, shown as box and whisker (25th
percentiles) for OCT4, OCT4-SOX2 and delta between them. Addition of SOX2 at these OCT4-bound loci
increases the SeEN-seq enrichment by 1.7 log2-fold on average.
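The 3-bp running average of panel F and the autocorrelation analysis of panel G can be sketched as follows. This is a hedged illustration rather than the authors' implementation: the function names are hypothetical, the edge handling of the running mean is an assumption, and the 95% bound of ±1.96/√N is the common white-noise convention, which may differ from the one used for the dashed lines in the figure.

```python
import math


def running_mean(values, window=3):
    """Centered running mean; positions near the ends use the neighbours available."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def autocorrelation(values, max_lag):
    """Sample autocorrelation for lags 0..max_lag (lag in bp for a per-bp profile)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    if denom == 0:
        raise ValueError("autocorrelation is undefined for a constant profile")
    return [
        sum(dev[i] * dev[i + lag] for i in range(n - lag)) / denom
        for lag in range(max_lag + 1)
    ]


def white_noise_bound(n, z=1.96):
    """Approximate 95% bound on the autocorrelation of uncorrelated data."""
    return z / math.sqrt(n)
```

A profile with 10-bp periodicity would be expected to show an autocorrelation peak at a lag of 10 bp that exceeds `white_noise_bound(len(profile))`.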
Fig. S3. Classification and refinement procedures for the OCT4-SOX2-NCPSHL-6 complex.
A, SeEN-seq enrichment of the specific position used for structural study (SHL-6), shown are values of
OCT4, OCT4-SOX2 and delta between them. B, Size-exclusion trace of GFP-OCT4, SOX2 (residues 37-118) and NCPSHL-6. Peak fractions were analyzed by SDS-PAGE and Coomassie blue stain. C,
Representative cryo-EM micrograph and reference-free 2D class averages for the OCT4-SOX2-NCPSHL-6
complex. The micrograph was denoised by JANNI (62). D, A Titan Krios 300keV microscope was used to
collect 5,702 micrographs. All dose-fractionated micrograph stacks were subjected to beam-induced motion
correction with MotionCorr in RELION 3.0 (60). Particles were picked with crYOLO, and only one round
of 2D classification was used to clean up particles before 3D classification in RELION 3.0 (60). A nucleosome model
(PDB: 6R94) was low-pass filtered to 60 Å and used as initial model for the first round of 3D classification.
Several rounds of 3D classification, including local searches, were necessary to obtain homogeneous
datasets. The last 3D classification divided the dataset into four models. Refinement of the best particles
with cryoSPARC using a non-uniform refinement led to a 3.1 Å resolution map. E, Gold-standard Fourier
shell correlation curve. F, Local resolution filtered map (MonoRes) (68). G, Angular distribution for the
OCT4-SOX2-NCPSHL-6. H, Local resolution filtered map (MonoRes) colored by protein chain identity.
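For readers unfamiliar with how a resolution value is read from a Fourier shell correlation curve such as the one in panel E, the sketch below finds the spatial frequency at which the FSC first falls below a chosen threshold (0.143 or 0.5, the two thresholds reported in Table S1) and converts it to Å. The function name and the use of linear interpolation between shells are assumptions; the refinement packages named in this caption apply their own procedures.

```python
def resolution_at_threshold(spatial_freqs, fsc, threshold=0.143):
    """Resolution (Å) at the first crossing of `threshold`.

    `spatial_freqs` are shell frequencies in 1/Å (increasing) and `fsc`
    the correlation in each shell. Returns None if the curve never drops
    below the threshold.
    """
    for i in range(1, len(fsc)):
        if fsc[i - 1] >= threshold > fsc[i]:
            # Linear interpolation between the two shells around the crossing.
            frac = (fsc[i - 1] - threshold) / (fsc[i - 1] - fsc[i])
            freq = spatial_freqs[i - 1] + frac * (spatial_freqs[i] - spatial_freqs[i - 1])
            return 1.0 / freq
    return None
```

Because the FSC generally decays with spatial frequency, the 0.5 threshold is crossed at a lower frequency than 0.143, so it yields a numerically larger (more conservative) resolution estimate.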
Fig. S4. OCT and SOX2 homology modeling. A, OCT4 alignment from the OCT4-SOX2-NCPSHL-6
structure aligned with chain B (residues 3-73) of OCT4-free DNA structure (PDB: 3L1P). The backbone
(C) of each model was used to calculate an r.m.s.d = 0.60 Å. using the PyMOL align function. B, SOX2DNA alignment from the OCT4-SOX2-NCPSHL-6 structure and the free DNA high-resolution crystal
structure (PDB:1GT0). SOX2 was aligned to chain D, residues 1-79 of 1GT0. The SOX-bound NCP DNA
was aligned to chain A, DNA bases 1-10 and chain B 40-48 of 1GT0. The backbone of the protein chain
(C) and the DNA (phosphoribose only) was used to calculate an r.m.s.d. = 0.97 Å using the PyMOL align
function. C, Multiple sequence alignment of OCT1 and OCT4 (human) and D, SOX2 (human and mouse)
DNA binding domain sequences using T-Coffee (78) and visualized in Jalview (79). The conserved SOX2
Phe48 and Met49 reported to induce DNA bending are indicated (30, 80).
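The r.m.s.d. values quoted in panels A and B are root mean square deviations over paired backbone atoms. A minimal sketch of that calculation is given below, assuming the two coordinate sets are already superposed and listed in matching order; the function name `rmsd` is hypothetical, and the superposition and outlier rejection performed by the PyMOL align function are not reproduced here.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation (Å) between paired (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))
```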
Fig. S5. Analysis and validation of the OCT4-SOX2-NCPSHL-6 structure. Representative, sharpened,
local-resolution filtered density with corresponding model cut-out segments of the OCT4-SOX2-NCPSHL-6
complex map using the PyMOL 2.0 carve function; including A, DNA segment (contour level, 9), B, H4
of the histone core (contour level, 4), C, OCT4 POUS domain (contour level, 4), and D, SOX2 density
(contour level, 4). Nucleotide and amino acid boundaries used for the cut-out are indicated in each panel.
E, Overlay of the OCT1-SOX2 structure (PDB:1O4X) including only the POUS domain and SOX2 (the
PDB:1O4X also includes the POUHD domain on free DNA) with the nucleosome bound structure (28). The
alignment was performed by superimposing on the DNA. The free-DNA bound structure is shown in solid
and the nucleosome-bound structure in transparent. F, Close-up view of OCT4 POUS only engaged on
DNA. Nucleotides corresponding to the POUS motif are shown in red and the entire OCT4-SOX2 motif is
shown with ribose and base rings. Residues at the DNA-protein interface are shown in sticks. G, 10 nM of
a Flc-21-bp-DNA containing the OCT4-SOX2 motif were mixed with 300 nM full-length OCT4 or OCT4
DNA binding domain only (residues 134-290) and counter titrated with nucleosomes. All data include three
technical replicates (n = 3) and are shown as mean ± s.d.
Fig. S6. OCT-SOX2 free DNA binding modality clashes with the nucleosome architecture. A, Residue
clashes for the OCT1 DNA binding domain (PDB: 1O4X, chain A) with the unbound nucleosome structure
determined by aligning the 2nd base of the OCT4 motif (ATGCAAT) with the nucleosome DNA at 1bp
intervals and calculating the residue clash score with a 1Å cut-off. B, OCT4 POUHD domain is blocked by
the native, non-crosslinked H2A-H2B dimer. Lysine crosslink between H2A and H2B molecules of distinct
H2A:H2B dimer pairs across the nucleosome dyad axis (H2A, chain C and residue 37; H2B, chain H and
residue 86). The inter-lysine crosslink was modelled with a pentanediol moiety (PTD, pink) and is included
in the OCT4-SOX2-NCPSHL-6 model. Contour level is 4.
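The clash analysis in panel A places the DNA binding domain at successive 1-bp registers and scores steric overlap with the nucleosome using a 1 Å cut-off. The sketch below illustrates only the counting step, under the assumption that a clash means an atom of the domain lying closer than the cut-off to any nucleosome atom; the exact score is defined in the Methods and may differ. All names are hypothetical, and the structural alignment that produces the per-position coordinates is not shown.

```python
import math


def count_clashing_atoms(tf_atoms, nucleosome_atoms, cutoff=1.0):
    """Count TF atoms closer than `cutoff` Å to at least one nucleosome atom."""
    clashes = 0
    for atom in tf_atoms:
        if any(math.dist(atom, other) < cutoff for other in nucleosome_atoms):
            clashes += 1
    return clashes


def clash_profile(tf_atoms_by_position, nucleosome_atoms, cutoff=1.0):
    """Map each motif position (bp) to the clash count of the TF placed there."""
    return {
        position: count_clashing_atoms(atoms, nucleosome_atoms, cutoff)
        for position, atoms in tf_atoms_by_position.items()
    }
```

The all-against-all distance search is quadratic in the number of atoms; a spatial grid or k-d tree would be the usual choice for full-size structures.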
Fig. S7. Classification and refinement procedures for the non-crosslinked OCT4-SOX2-NCPSHL-6
complex and the unbound NCPSHL-6. A, Representative cryo-EM micrograph and reference-free 2D class
averages for the non-crosslinked NCPSHL-6 and OCT4-SOX2-NCPSHL-6 complex. The micrograph was
denoised by JANNI (62). B, A Titan Krios 300keV microscope was used to collect 4,905 micrographs. All
dose-fractionated micrograph stacks were subjected to beam-induced motion correction with MotionCorr
in RELION 3.0 (60). Particles were picked with crYOLO, and only one round of 2D classification was used to clean
up particles before 3D classification in RELION 3.0 (60). A nucleosome model (PDB: 6R94) was low-pass
filtered to 60 Å and used as initial model for the first round of 3D classification. Several rounds of 3D
classification, including local searches, were necessary to obtain homogeneous datasets. Two classification
routes were followed for the bound and free nucleosome after the first round of 3D classification. The
28,138 particles contributing to the NCPSHL-6 reconstruction were filtered by further 2D classifications
and subset selection of limiting classes to decrease anisotropy (EOD, 0.66). The resulting 6,302 particles
were used in the final refinement and showed an increased EOD value of 0.79, consistent with a relatively
isotropic reconstruction. C, Refinement of the best particles with cryoSPARCv2 using a non-uniform
refinement led to a 3.49 Å resolution map (free nucleosome, NCPSHL-6) and D, Free nucleosome local
resolution filtered map, NCPSHL-6 (MonoRes) (68). E, Angular distribution for the free nucleosome, NCPSHL-6.
F, Fit of the crosslinked OCT4-SOX2-NCPSHL-6 model into the 4.15 Å resolution map (FSC, 0.143)
for the non-crosslinked OCT4-SOX2-NCPSHL-6 map. The non-crosslinked map was deblurred with ccpem
1.3.0, using refmac5 (81, 82). The individual models were built and refined with the corresponding cryo-EM map, and the backbone of the entire crosslinked model was compared
with the backbone of the entire non-crosslinked model (overall root mean square deviation, r.m.s.d., 1.3
Å). When comparing isolated OCT4-SOX2 and histone core chains between maps, the r.m.s.d. is equal to
1.4 Å and 1.8 Å, respectively. G, Cut-out density segment of the H2A:H2B dimer at the OCT4-SOX2 site
in the non-crosslinked map. Contour level is 5.5.
Fig. S8. OCT4-SOX2-induced DNA release does not induce rearrangements in the histone octamer
core. A, Zoom of SOX2-induced DNA kink highlighting the residues involved in intercalating the TT-step
(M49 and F48), consistent with previously reported crystal structures of SOX2 on free DNA (18). B,
Sharpened, local-resolution filtered density with corresponding model cut-out segment of the H3 N-terminal α-helix near the OCT4-SOX2 binding site (contour level, 3). C, Overlay of the unbound
nucleosome histone octamer core (grey) and the OCT4-SOX2 bound histone octamer core (colored). The
root mean square deviation (r.m.s.d.) of the histones when the structures are aligned on the DNA is equal
to 1.8 Å.
Fig. S9. DNaseI and thermal stability assays show OCT4 and OCT4-SOX2 destabilize the
nucleosome. A, DNaseI digestion profile across the nucleosome only (SHL-6), OCT4 and OCT4-SOX2 (OS); two replicates are shown per protein condition across a range of enzyme concentrations (0.12-0.03 Units
DNaseI). B, Pairwise correlations of DNase I measurements (from A), separated by protein condition. C,
Thermal shift assay of the Widom 601 containing the OCT4-SOX2 motif used for structure determination
(NCPSHL-6) or the Widom 601 nucleosome only (NCP601) in the presence or absence of OCT4 or OCT4-SOX2. Nucleosomes were mixed with buffer only or in a molar ratio of 1:2 (nucleosome:OCT4 or OCT4-SOX2). The raw values were normalized to one and plotted (Methods). These data are representative of
two separate experiments and within each experiment two technical replicates (n=2) and are shown as mean
± s.d. (solid line). The first peak, at a melting temperature of ~78 °C, corresponds to release of the H2A:H2B
dimer and the second peak, at ~83 °C, corresponds to release of H3:H4 and DNA (58). The grey bar indicates the
temperature of the peaks in the nucleosome only samples for reference. The DNASHL-6 sample is the
corresponding 153bp NCP DNA containing the OCT4-SOX2 motif (no histones) in the presence of OCT4-SOX2 as a control for the individual melting curve of OCT4-SOX2 and DNA.
Fig. S10. Classification and refinement procedures for the OCT4-SOX2-NCPSHL+6 complex.
A, Representative cryo-EM micrograph and reference-free 2D class averages for Dataset 1 of the OCT4-SOX2-NCPSHL+6 complex. The micrograph was denoised by JANNI (62). B, Representative cryo-EM
micrograph for Dataset 2 of the OCT4-SOX2-NCPSHL+6 complex. The micrograph was denoised by JANNI
(62). C, A Titan Krios 300 keV microscope was used to collect 3,669 (Dataset 1) and 4,387 (Dataset 2)
micrographs. All dose-fractionated micrograph stacks were subjected to beam-induced motion correction
with MotionCorr in RELION 3.0 (60). Particles were picked with crYOLO, and only one round of 2D
classification was used to clean up particles before 3D classification in RELION 3.0 (60). In the case of Dataset 2,
3D classification was performed directly after particle picking and extraction. A nucleosome model (PDB:
6R94) was low-pass filtered to 60 Å and used as initial model for the first round of 3D classification. Again,
several rounds of 3D classification, including local searches, were necessary to obtain homogeneous
datasets. 3D variability of the final refinement of Dataset 1 was then assessed using cryoSPARCv2 and a
volume that showed the best resolution for both TFs (as visualized and extracted by 3D display within
cryoSPARCv2) was used as a reference map for a subsequent 3D classification in RELION 3.0. The last
3D refinements from each independent dataset (Datasets 1 and 2) (RELION 3.0) were joined and a final 3D
classification was performed with eight classes. Final refinement of the best particles with the most
continuous OCT4 density with cryoSPARCv2 using a non-uniform refinement led to a 3.42 Å resolution
map. D, Gold-standard Fourier shell correlation curve from final cryoSPARC non-uniform refinement of
joined datasets. E, Local resolution filtered map colored by local resolution (MonoRes) (68). F, Local
resolution filtered map colored by chain (MonoRes) (68). G, Angular distribution for the OCT4-SOX2-NCPSHL+6.
Fig. S11. Model details of the OCT4-SOX2-NCPSHL+6 cryo-EM structure. A, OCT4 alignment from the
OCT4-SOX2-NCPSHL+6 structure aligned with chain B (residues 3-73) of OCT4-free DNA structure (PDB:
3L1P). The backbone (C) of each model was used to calculate an r.m.s.d = 0.59 Å. using the PyMOL align
function. B, SOX2-DNA alignment from the OCT4-SOX2-NCPSHL+6 structure and the free DNA highresolution crystal structure (PDB:1GT0). SOX2 was aligned to chain D, residues 1-79 of 1GT0. The SOXbound NCP DNA was aligned to chain A, DNA bases 1-10 and chain B 40-48 of 1GT0. The backbone of
the protein chain (C) and the DNA (phosphoribose only) was used to calculate an r.m.s.d. = 0.95 Å using
the PyMOL align function. C, Alignment of an OCT4-SOX2 crystal structure (PDB: 6HT5, 3.45Å
resolution) to the OCT4-SOX2-NCPSHL+6 structure. OCT4 (residues 130 – 201) and SOX2 chains from
PDB: 6HT5, together with the DNA were aligned to OCT4 and SOX2 chains of the SHL+6-bound structure.
The backbone of the protein chains for OCT4 and SOX2 (C) was used to calculate an r.m.s.d. = 0.69 Å
using the PyMOL align function. Representative, sharpened, local-resolution filtered density with
corresponding model cut-out segments of the OCT4-SOX2-NCPSHL+6 complex map using the PyMOL 2.0
carve function; including D, OCT4 POUS domain (contour level, 3), E, SOX2 density (contour level, 5)
and F, H4 of the histone core (contour level, 2).
Fig. S12. Local-directional resolution measurement and local anisotropy analysis with MonoDir.
A, D and G, Average directional resolution plots for the cross-linked OCT4-SOX2-NCPSHL-6, non-crosslinked NCPSHL-6 and cross-linked OCT4-SOX2-NCPSHL+6 complexes respectively. B, E and H, Polar
angular distribution plots showing the distribution of the highest local-directional resolutions for the cross-linked OCT4-SOX2-NCPSHL-6, non-crosslinked NCPSHL-6 and cross-linked OCT4-SOX2-NCPSHL+6
complexes respectively. C, F and I, Radial average of local-directional resolution maps (tangential-resolution map, radial-resolution map, high- and low-resolution maps and MonoRes map) of the cross-linked OCT4-SOX2-NCPSHL-6, non-crosslinked NCPSHL-6 and cross-linked OCT4-SOX2-NCPSHL+6
complexes respectively. Plots were generated using MonoDir implementation and visualized using Scipion
2.0 (Xmipp3) (68, 70).
Fig. S13. The OCT4-POUHD domain is not engaged with its motif at SHL+6. A, Representative cryo-EM micrograph and reference-free 2D class averages for Dataset 3 of the OCT4-SOX2-NCPSHL+6 complex.
The micrograph was denoised by JANNI (62). The acquisition was performed at a nominal magnification
of 105,000 × in EFTEM nanoprobe mode yielding a pixel size of 1.06 Å at the specimen level. B, A Titan
Krios 300keV microscope was used to collect 7,170 micrographs. All dose-fractionated micrograph stacks
were subjected to beam-induced motion correction with MotionCorr in RELION 3.0 (60). Particles were
picked with crYOLO, and only one round of 2D classification was used to clean up particles before 3D classification
in RELION 3.0 (60). Varied classification of this additional dataset yielded a 3D reconstruction of an OCT4-SOX2-bound nucleosome at SHL+6 at 4.5 Å containing 54,648 particles. In addition to reasonably well-resolved density for OCT4(POUS)-SOX2-HMG, this reconstruction also shows evidence of additional
density emanating from the docked OCT4 POUS that we tentatively assign to the POUHD (see also panel
C). C, Cryo-EM map of a 3D classification showing the density we tentatively assign to the POUHD that is
~17Å away from the most C-terminal residue of the POUS domain in the OCT4-SOX2-NCPSHL+6 model.
D, Model depicting the clash of the POUHD motif in this OCT4-SOX2 orientation with the neighboring
DNA gyre. Binding of the POUHD to its motif juxtaposed to the POUS motif on the same DNA strand, as
observed in the free DNA structure (PDB: 1O4X) would result in clashes with the nucleosome DNA gyre.
Fig. S14. Density correlation plots of the OCT4-SOX2-NCP bound models with corresponding maps.
Density correlation (CC) of the A, OCT4-SOX2-NCPSHL-6 map and B, OCT4-SOX2-NCPSHL+6 map with
the corresponding model. The DNA (chains I and J) shows increased flexibility and correspondingly lower
CC values at the ends. The histone proteins (chains A-H) for both models do not show significant variation,
with a CC value near 1.0 for all residues. For OCT4 and SOX2, sidechains that were not resolved by density
have been set to zero occupancy in the atomic model.
Fig. S15. The HMG-POUS partial motif is bound in vivo and requires Oct4 and Sox2 for accessibility.
A, Representation of SOX2 HMG and OCT4 POU domains interaction with the preferred motif sequence,
as the Oct4-Sox2 position weight matrix (MA0142.1). The dashed lines indicate the DNA bases contacted
by OCT4-SOX2 in the nucleosome-bound structure. B, Oct4 ChIP-qPCR data at the ectopic insertion locus
using a non-overlapping second primer set and endogenous control locus (cont. primer data as in Fig. 5B)
(*P < 0.05, error bars indicate SEM of at least two biological replicates). C, Heatmaps showing accessibility
(measured by ATAC-seq) at Oct4-bound sites in cells in the presence and absence of Oct4 (36). Displayed
are the top thousand Oct4-bound loci (ranked by accessibility in Oct4-expressing cells) for the canonical
and partial HMG-POUS motif. Library size-normalized read densities (51-bp smoothed) are shown ±
0.5kbp around the motif. D, A metaplot of ATAC-seq signal in mES cells before and after Oct4 knockdown
(36). Data as in C. E, Heatmaps showing accessibility (measured by ATAC-seq) at Oct4-bound sites in
cells upon knockdown of Oct4 in additional datasets (37). Displayed are loci as in C. F, As in E but
showing accessibility in cells before and after knockdown of Sox2 (37). G and H, Metaplot of OCT4
ATAC-seq signal in mES cells before and after Oct4 and Sox2 knockdown (37). Data as in E and F,
respectively.
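The library size normalization and 51-bp smoothing mentioned for the heatmaps in panel C can be illustrated with the short sketch below. It is not the authors' pipeline: scaling to reads per million is an assumed convention, the edge handling is an assumption, and the function name is hypothetical.

```python
def normalize_and_smooth(counts, library_size, window=51, scale=1e6):
    """Scale per-bp read counts by library size, then apply a centered running mean."""
    if window % 2 == 0:
        raise ValueError("window must be odd so that it can be centered")
    if library_size <= 0:
        raise ValueError("library_size must be positive")
    half = window // 2
    normalized = [c * scale / library_size for c in counts]
    smoothed = []
    for i in range(len(normalized)):
        # Near the ends, average over the positions that exist.
        lo = max(0, i - half)
        hi = min(len(normalized), i + half + 1)
        smoothed.append(sum(normalized[lo:hi]) / (hi - lo))
    return smoothed
```

For a window of ±0.5 kbp around a motif, `counts` would hold roughly 1,001 per-bp values centered on the motif position.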
Fig. S16. The OCT4 POUS clash score negatively correlates with SeEN-seq binding. A, The OCT4-alone SeEN-seq 3bp running mean average profile is plotted (solid red line) together with the POUS
nucleosome atom clash score (grey bars) or the B, POUHD nucleosome atom clash score. C, OCT4-SOX2
SeEN-seq 3bp running mean average profile is plotted (solid purple line) together with the POUS
nucleosome atom clash score (grey bars) or the D, POUHD nucleosome atom clash score (see Materials and
Methods).
Fig. S17. Model of OCT4-SOX2 binding in higher order chromatin structure. Model of the OCT4-SOX2-NCPSHL-6 nucleosome within a tetranucleosome (PDB: 5OY7) (39, 83). The OCT4-SOX2-NCPSHL-6
was aligned to the tetranucleosome structure using the histone chains. Only the DNA bound by the factors
of the OCT4-SOX2-NCPSHL-6 model is shown for clarity.
Table S1. Cryo-EM data collection, refinement and validation statistics

                                        OCT4-SOX2-NCPSHL-6      NCPSHL-6                OCT4-SOX2-NCPSHL+6
                                        (EMD-10406)             (EMD-10408)             (EMD-10864)
                                        (PDB 6T90)              (PDB 6T93)              (PDB 6YOV)

Data collection and processing
Microscope                              Titan Krios             Titan Krios             Titan Krios
Camera                                  K2                      K2                      K2
Magnification, nominal                  130,000                 130,000                 130,000
Magnification, calibrated               58,140                  58,140                  58,140
Voltage (keV)                           300                     300                     300
Total dose (e–/Å²)                      45                      45                      45
Number of frames                        40                      40                      40
Defocus range (μm)                      -0.25 – -2.0            -0.2 – -2.0             -0.2 – -2.0
Pixel size (Å)                          0.86                    0.86                    0.86
Energy filter slit width                20 eV                   20 eV                   20 eV
Acquisition software                    EPU                     EPU                     EPU
No. of micrographs                      5,702                   4,905                   Dataset 1: 3,669; Dataset 2: 4,387
Symmetry imposed                        C1                      C1                      C1
Initial particle images (no.)           1,391,576               853,266                 2,472,943 (2 datasets)
Final particle images (no.)             94,282                  6,302                   71,284
Map resolution, masked (Å)              3.05 (0.143)            3.49 (0.143)            3.42 (0.143)
  FSC threshold (0.5, 0.143)            3.30 (0.5)              4.15 (0.5)              4.12 (0.5)
Map resolution, unmasked (Å)            3.8 (0.143)             6.7 (0.143)             3.7 (0.143)
  FSC threshold (0.5, 0.143)            6.03 (0.5)              9.37 (0.5)              6.03 (0.5)
Resolution range due to anisotropy (Å)  2.03 – 4.14             2.42 – 5.49             2.44 – 4.37
  (best, worst PSF)
Efficiency (Eod)                        0.73                    0.79                    0.78
Map resolution range (Å)                3.0–11                  3.0–9                   3.0–12

Refinement
Refinement package                      Phenix, Rosetta         Phenix, Rosetta         Phenix, Rosetta
                                        real space              real space              real space
Resolution cut-off (Å)                  3.05                    3.49                    3.42
Initial models used (PDB codes)         6NJ9, 1GT0, 3L1P        6NJ9                    6T90
Model resolution (Å)                    3.05                    3.49                    3.42
  FSC threshold                         0.143                   0.143                   0.143
Map sharpening B factor (Å²)            -98                     -69                     -104
Model composition
  Non-hydrogen atoms                    13,031                  12,199                  12,298
  Protein residues                      903                     758                     898
  Nucleotides                           282                     302                     262
  Ligands                               9 (PTD)                                         8 (PTD)
B factors (Å²)
  Protein                               124                     82                      167
  DNA                                   181                     146                     225
R.m.s. deviations
  Bond lengths (Å)                      0.022                   0.014                   0.011
  Bond angles (°)                       1.56                    1.087                   1.056

Validation
  MolProbity score                      0.66                    0.73                    0.87
  Clashscore                            0.46                    0.73                    1.39
  Poor rotamers (%)                     0.00                    0.00                    0.00
  Ramachandran plot
    Favored (%)                         99.66                   99.73                   100.0
    Allowed (%)                         0.34                    0.27                    0.00
    Disallowed (%)                      0.00                    0.00                    0.00
  C-beta deviations                     0.0                     0.0                     0.0
  EMringer score                        4.3                     2.9                     1.8
  CaBLAM outliers (%)                   0.7                     0.6                     0.7

Model-to-data fit*
  CCmask                                0.81                    0.84                    0.82
  CCbox                                 0.84                    0.87                    0.88
  CCpeaks                               0.79                    0.79                    0.78
  CCvolume                              0.80                    0.82                    0.82
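Two of the data collection entries in Table S1 are related by simple arithmetic, sketched below. The pixel size follows from the calibrated magnification if one assumes a physical detector pixel of 5 μm for the K2 camera (a value not stated in this document), and the dose per frame is the total dose divided by the number of frames. Function names are hypothetical.

```python
def pixel_size_angstrom(calibrated_magnification, detector_pixel_um=5.0):
    """Pixel size at the specimen level in Å (1 μm = 10,000 Å).

    The 5 μm default is an assumed physical pixel size for the detector.
    """
    return detector_pixel_um * 1e4 / calibrated_magnification


def dose_per_frame(total_dose, n_frames):
    """Electron dose per frame in e–/Å², assuming equal dose in every frame."""
    return total_dose / n_frames


# Values taken from Table S1: calibrated magnification 58,140; 45 e–/Å² over 40 frames.
print(round(pixel_size_angstrom(58140), 2))  # 0.86
print(dose_per_frame(45, 40))                # 1.125
```

Under the stated assumption the calculation reproduces the tabulated pixel size of 0.86 Å.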
Movie S1. OCT4-SOX2 binding at SHL-6 removes DNA from the histone core. A morph video
modelling the structural change induced in the nucleosome upon OCT4-SOX2 binding at SHL-6. Morph is
between the DNA of the NCP-SHL-6 and OCT4-SOX2-NCP-SHL-6 models.
References and Notes
1. A. Soufi, M. F. Garcia, A. Jaroszewicz, N. Osman, M. Pellegrini, K. S. Zaret, Pioneer
transcription factors target partial DNA motifs on nucleosomes to initiate
reprogramming. Cell 161, 555–568 (2015). doi:10.1016/j.cell.2015.03.017 Medline
2. D. J. Rodda, J.-L. Chew, L.-H. Lim, Y.-H. Loh, B. Wang, H.-H. Ng, P. Robson,
Transcriptional regulation of nanog by OCT4 and SOX2. J. Biol. Chem. 280, 24731–
24737 (2005). doi:10.1074/jbc.M502573200 Medline
3. K. Takahashi, S. Yamanaka, Induction of pluripotent stem cells from mouse embryonic and
adult fibroblast cultures by defined factors. Cell 126, 663–676 (2006).
doi:10.1016/j.cell.2006.07.024 Medline
4. V. Malik, L. V. Glaser, D. Zimmer, S. Velychko, M. Weng, M. Holzner, M. Arend, Y. Chen,
Y. Srivastava, V. Veerapandian, Z. Shah, M. A. Esteban, H. Wang, J. Chen, H. R.
Schöler, A. P. Hutchins, S. H. Meijsing, S. Pott, R. Jauch, Pluripotency reprogramming
by competent and incompetent POU factors uncovers temporal dependency for Oct4 and
Sox2. Nat. Commun. 10, 3477 (2019). doi:10.1038/s41467-019-11054-7 Medline
5. D. C. Ambrosetti, C. Basilico, L. Dailey, Synergistic activation of the fibroblast growth factor
4 enhancer by Sox2 and Oct-3 depends on protein-protein interactions facilitated by a
specific spatial arrangement of factor binding sites. Mol. Cell. Biol. 17, 6321–6329
(1997). doi:10.1128/MCB.17.11.6321 Medline
6. T. Kumar Mistri, C. S. Lam, W. Arindrarto, D. Rodda, Y. H. Foo, W. Ping Ng, S. Ahmed, P.
Robson, T. Wohland, Quantitative Determination of Oct4-Sox2 Heterodimer Formation
with Nanog Promoter Element. Biophys. J. 100, 74a (2011).
doi:10.1016/j.bpj.2010.12.607
7. C. C. Adams, J. L. Workman, Binding of disparate transcriptional activators to nucleosomal
DNA is inherently cooperative. Mol. Cell. Biol. 15, 1405–1421 (1995).
doi:10.1128/MCB.15.3.1405 Medline
8. L. A. Mirny, Nucleosome-mediated cooperativity between transcription factors. Proc. Natl.
Acad. Sci. U.S.A. 107, 22534–22539 (2010). doi:10.1073/pnas.0913805107 Medline
9. K. S. Zaret, J. S. Carroll, Pioneer transcription factors: Establishing competence for gene
expression. Genes Dev. 25, 2227–2241 (2011). doi:10.1101/gad.176826.111 Medline
10. F. Zhu, L. Farnung, E. Kaasinen, B. Sahu, Y. Yin, B. Wei, S. O. Dodonova, K. R. Nitta, E.
Morgunova, M. Taipale, P. Cramer, J. Taipale, The interaction landscape between
transcription factors and the nucleosome. Nature 562, 76–81 (2018). doi:10.1038/s41586-018-0549-5 Medline
11. M. Fernandez Garcia, C. D. Moore, K. N. Schulz, O. Alberto, G. Donague, M. M. Harrison,
H. Zhu, K. S. Zaret, Structural Features of Transcription Factors Associating with
Nucleosome Binding. Mol. Cell 75, 921–932.e6 (2019).
doi:10.1016/j.molcel.2019.06.009 Medline
12. R. K. McGinty, S. Tan, Nucleosome structure and function. Chem. Rev. 115, 2255–2273
(2015). doi:10.1021/cr500373h Medline
13. S. Matsumoto, S. Cavadini, R. D. Bunker, R. S. Grand, A. Potenza, J. Rabl, J. Yamamoto, A.
D. Schenk, D. Schübeler, S. Iwai, K. Sugasawa, H. Kurumizaka, N. H. Thomä, DNA
damage detection in nucleosomes involves DNA register shifting. Nature 571, 79–84
(2019). doi:10.1038/s41586-019-1259-3 Medline
14. L. A. Cirillo, C. E. McPherson, P. Bossard, K. Stevens, S. Cherian, E. Y. Shim, K. L. Clark,
S. K. Burley, K. S. Zaret, Binding of the winged-helix transcription factor HNF3 to a
linker histone site on the nucleosome. EMBO J. 17, 244–254 (1998).
doi:10.1093/emboj/17.1.244 Medline
15. B. Fierz, M. G. Poirier, Biophysics of Chromatin Dynamics. Annu. Rev. Biophys. 48, 321–
345 (2019). doi:10.1146/annurev-biophys-070317-032847 Medline
16. G. Li, M. Levitus, C. Bustamante, J. Widom, Rapid spontaneous accessibility of nucleosomal
DNA. Nat. Struct. Mol. Biol. 12, 46–53 (2005). doi:10.1038/nsmb869 Medline
17. J. Huertas, C. M. MacCarthy, H. R. Schöler, V. Cojocaru, Nucleosomal DNA dynamics
mediate Oct4 pioneer factor binding. Biophys. J. S0006-3495(20)30032-1 (2020).
doi:10.1016/j.bpj.2019.12.038 Medline
18. A. Reményi, K. Lins, L. J. Nissen, R. Reinbold, H. R. Schöler, M. Wilmanns, Crystal
structure of a POU/HMG/DNA ternary complex suggests differential assembly of Oct4
and Sox2 on two enhancers. Genes Dev. 17, 2048–2059 (2003). doi:10.1101/gad.269303
Medline
19. D. Esch, J. Vahokoski, M. R. Groves, V. Pogenberg, V. Cojocaru, H. Vom Bruch, D. Han, H.
C. A. Drexler, M. J. Araúzo-Bravo, C. K. L. Ng, R. Jauch, M. Wilmanns, H. R. Schöler,
A unique Oct4 interface is crucial for reprogramming to pluripotency. Nat. Cell Biol. 15,
295–301 (2013). doi:10.1038/ncb2680 Medline
20. X. Yu, M. J. Buck, Defining TP53 pioneering capabilities with competitive nucleosome
binding assays. Genome Res. 29, 107–115 (2019). doi:10.1101/gr.234104.117 Medline
21. G. D. Stormo, Z. Zuo, Y. K. Chang, Spec-seq: Determining protein-DNA-binding specificity
by sequencing. Brief. Funct. Genomics 14, 30–38 (2015). doi:10.1093/bfgp/elu043
Medline
22. L. A. Boyer, T. I. Lee, M. F. Cole, S. E. Johnstone, S. S. Levine, J. P. Zucker, M. G.
Guenther, R. M. Kumar, H. L. Murray, R. G. Jenner, D. K. Gifford, D. A. Melton, R.
Jaenisch, R. A. Young, Core transcriptional regulatory circuitry in human embryonic
stem cells. Cell 122, 947–956 (2005). doi:10.1016/j.cell.2005.08.020 Medline
23. X. Chen, H. Xu, P. Yuan, F. Fang, M. Huss, V. B. Vega, E. Wong, Y. L. Orlov, W. Zhang, J.
Jiang, Y.-H. Loh, H. C. Yeo, Z. X. Yeo, V. Narang, K. R. Govindarajan, B. Leong, A.
Shahab, Y. Ruan, G. Bourque, W.-K. Sung, N. D. Clarke, C.-L. Wei, H.-H. Ng,
Integration of external signaling pathways with the core transcriptional network in
embryonic stem cells. Cell 133, 1106–1117 (2008). doi:10.1016/j.cell.2008.04.043
Medline
24. N. Tapia, C. MacCarthy, D. Esch, A. Gabriele Marthaler, U. Tiemann, M. J. Araúzo-Bravo,
R. Jauch, V. Cojocaru, H. R. Schöler, Dissecting the role of distinct OCT4-SOX2
heterodimer configurations in pluripotency. Sci. Rep. 5, 13533 (2015).
doi:10.1038/srep13533 Medline
25. A. Khan, O. Fornes, A. Stigliani, M. Gheorghe, J. A. Castro-Mondragon, R. van der Lee, A.
Bessy, J. Chèneby, S. R. Kulkarni, G. Tan, D. Baranasic, D. J. Arenillas, A. Sandelin, K.
Vandepoele, B. Lenhard, B. Ballester, W. W. Wasserman, F. Parcy, A. Mathelier,
JASPAR 2018: Update of the open-access database of transcription factor binding
profiles and its web framework. Nucleic Acids Res. 46, D260–D266 (2018).
doi:10.1093/nar/gkx1126 Medline
26. P. T. Lowary, J. Widom, New DNA sequence rules for high affinity binding to histone
octamer and sequence-directed nucleosome positioning. J. Mol. Biol. 276, 19–42 (1998).
doi:10.1006/jmbi.1997.1494 Medline
27. S. Li, E. B. Zheng, L. Zhao, S. Liu, Nonreciprocal and Conditional Cooperativity Directs the
Pioneer Activity of Pluripotency Transcription Factors. Cell Rep. 28, 2689–2703.e4
(2019). doi:10.1016/j.celrep.2019.07.103 Medline
28. D. C. Williams Jr., M. Cai, G. M. Clore, Molecular basis for synergistic transcriptional
activation by Oct1 and Sox2 revealed from the solution structure of the 42-kDa
Oct1.Sox2.Hoxb1-DNA ternary transcription factor complex. J. Biol. Chem. 279, 1449–
1457 (2004). doi:10.1074/jbc.M309790200 Medline
29. H. Stark, GraFix: Stabilization of fragile macromolecular complexes for single particle cryo-EM. Methods Enzymol. 481, 109–126 (2010). doi:10.1016/S0076-6879(10)81005-5
Medline
30. P. Scaffidi, M. E. Bianchi, Spatially precise DNA bending is an essential activity of the sox2
transcription factor. J. Biol. Chem. 276, 47296–47302 (2001).
doi:10.1074/jbc.M107619200 Medline
31. K. Luger, A. W. Mäder, R. K. Richmond, D. F. Sargent, T. J. Richmond, Crystal structure of
the nucleosome core particle at 2.8 A resolution. Nature 389, 251–260 (1997).
doi:10.1038/38444 Medline
32. M. P. Meers, D. H. Janssens, S. Henikoff, Pioneer Factor-Nucleosome Binding Events during
Differentiation Are Motif Encoded. Mol. Cell 75, 562–575.e5 (2019).
doi:10.1016/j.molcel.2019.05.025
33. C. Chronis, P. Fiziev, B. Papp, S. Butz, G. Bonora, S. Sabri, J. Ernst, K. Plath, Cooperative
Binding of Transcription Factors Orchestrates Reprogramming. Cell 168, 442–459.e20
(2017). doi:10.1016/j.cell.2016.12.016 Medline
34. Z. Liu, W. L. Kraus, Catalytic-Independent Functions of PARP-1 Determine Sox2 Pioneer
Activity at Intractable Genomic Loci. Mol. Cell 65, 589–603.e9 (2017).
doi:10.1016/j.molcel.2017.01.017 Medline
35. F. Lienert, C. Wirbelauer, I. Som, A. Dean, F. Mohn, D. Schübeler, Identification of genetic
elements that autonomously determine DNA methylation states. Nat. Genet. 43, 1091–
1097 (2011). doi:10.1038/ng.946 Medline
36. H. W. King, R. J. Klose, The pioneer factor OCT4 requires the chromatin remodeller BRG1
to support gene regulatory element function in mouse embryonic stem cells. eLife 6,
e22631 (2017). doi:10.7554/eLife.22631 Medline
37. E. T. Friman, C. Deluz, A. C. A. Meireles-Filho, S. Govindan, V. Gardeux, B. Deplancke, D.
M. Suter, Dynamic regulation of chromatin accessibility by pluripotency transcription
factors across the cell cycle. eLife 8, e50087 (2019). doi:10.7554/eLife.50087 Medline
38. M. A. Hall, A. Shundrovsky, L. Bai, R. M. Fulbright, J. T. Lis, M. D. Wang, High-resolution
dynamic mapping of histone-DNA interactions in a nucleosome. Nat. Struct. Mol. Biol.
16, 124–129 (2009). doi:10.1038/nsmb.1526 Medline
39. T. Schalch, S. Duda, D. F. Sargent, T. J. Richmond, X-ray structure of a tetranucleosome and
its implications for the chromatin fibre. Nature 436, 138–141 (2005).
doi:10.1038/nature03686 Medline
40. A. Osakabe, H. Tachiwana, W. Kagawa, N. Horikoshi, S. Matsumoto, M. Hasegawa, N.
Matsumoto, T. Toga, J. Yamamoto, F. Hanaoka, N. H. Thomä, K. Sugasawa, S. Iwai, H.
Kurumizaka, Structural basis of pyrimidine-pyrimidone (6-4) photoproduct recognition
by UV-DDB in the nucleosome. Sci. Rep. 5, 16330 (2015). doi:10.1038/srep16330
Medline
41. B. Fierz, C. Chatterjee, R. K. McGinty, M. Bar-Dagan, D. P. Raleigh, T. W. Muir, Histone
H2B ubiquitylation disrupts local and higher-order chromatin compaction. Nat. Chem.
Biol. 7, 113–119 (2011). doi:10.1038/nchembio.501 Medline
42. A. J. Ruthenburg, H. Li, T. A. Milne, S. Dewell, R. K. McGinty, M. Yuen, B. Ueberheide, Y.
Dou, T. W. Muir, D. J. Patel, C. D. Allis, Recognition of a mononucleosomal histone
modification pattern by BPTF via multivalent interactions. Cell 145, 692–706 (2011).
doi:10.1016/j.cell.2011.03.053 Medline
43. W. Abdulrahman, M. Uhring, I. Kolb-Cheynel, J.-M. Garnier, D. Moras, N. Rochel, D.
Busso, A. Poterszman, A set of baculovirus transfer vectors for screening of affinity tags
and parallel expression strategies. Anal. Biochem. 385, 383–385 (2009).
doi:10.1016/j.ab.2008.10.044 Medline
44. D. Gaidatzis, A. Lerch, F. Hahne, M. B. Stadler, QuasR: Quantification and annotation of
short reads in R. Bioinformatics 31, 1130–1132 (2015).
doi:10.1093/bioinformatics/btu781 Medline
45. B. Langmead, C. Trapnell, M. Pop, S. L. Salzberg, Ultrafast and memory-efficient alignment
of short DNA sequences to the human genome. Genome Biol. 10, R25 (2009).
doi:10.1186/gb-2009-10-3-r25 Medline
46. K.-S. Chan, B. Ripley, TSA: Time Series Analysis (R package version 1.2, 2018);
https://CRAN.R-project.org/package=TSA.
47. Y. Benjamini, T. P. Speed, Summarizing and correcting the GC content bias in
high-throughput sequencing. Nucleic Acids Res. 40, e72 (2012). doi:10.1093/nar/gks001
Medline
48. C. A. Meyer, X. S. Liu, Identifying and mitigating bias in next-generation sequencing
methods for chromatin biology. Nat. Rev. Genet. 15, 709–721 (2014).
doi:10.1038/nrg3788 Medline
49. M. Teng, R. A. Irizarry, Accounting for GC-content bias reduces systematic errors and batch
effects in ChIP-seq data. Genome Res. 27, 1930–1938 (2017). doi:10.1101/gr.220673.117
Medline
50. Y. Zhu, R. M. Stephens, P. S. Meltzer, S. R. Davis, SRAdb: Query and use public
next-generation sequencing data from within R. BMC Bioinformatics 14, 19 (2013).
doi:10.1186/1471-2105-14-19 Medline
51. Y. Zhang, T. Liu, C. A. Meyer, J. Eeckhoute, D. S. Johnson, B. E. Bernstein, C. Nusbaum, R.
M. Myers, M. Brown, W. Li, X. S. Liu, Model-based analysis of ChIP-Seq (MACS).
Genome Biol. 9, R137 (2008). doi:10.1186/gb-2008-9-9-r137 Medline
52. S. Heinz, C. Benner, N. Spann, E. Bertolino, Y. C. Lin, P. Laslo, J. X. Cheng, C. Murre, H.
Singh, C. K. Glass, Simple combinations of lineage-determining transcription factors
prime cis-regulatory elements required for macrophage and B cell identities. Mol. Cell
38, 576–589 (2010). doi:10.1016/j.molcel.2010.05.004 Medline
53. H. Pagès, P. Aboyoun, R. Gentleman, S. DebRoy, Biostrings: Efficient manipulation of
biological strings (R package version 2.52.0, 2019).
54. S. Masui, Y. Nakatake, Y. Toyooka, D. Shimosato, R. Yagi, K. Takahashi, H. Okochi, A.
Okuda, R. Matoba, A. A. Sharov, M. S. H. Ko, H. Niwa, Pluripotency governed by Sox2
via regulation of Oct3/4 expression in mouse embryonic stem cells. Nat. Cell Biol. 9,
625–635 (2007). doi:10.1038/ncb1589 Medline
55. M. Martin, Cutadapt removes adapter sequences from high-throughput sequencing reads.
EMBnet.journal 17, 10–12 (2011). doi:10.14806/ej.17.1.200
56. Team BC, Maintainer BP, TxDb.Mmusculus.UCSC.mm10.knownGene: Annotation package
for TxDb object(s) (R package version 3.4.7, 2019).
57. B. D. Marks, N. Qadir, H. C. Eliason, M. S. Shekhani, K. Doering, K. W. Vogel,
Multiparameter analysis of a screen for progesterone receptor ligands: Comparing
fluorescence lifetime and fluorescence polarization measurements. Assay Drug Dev.
Technol. 3, 613–622 (2005). doi:10.1089/adt.2005.3.613 Medline
58. H. Taguchi, N. Horikoshi, Y. Arimura, H. Kurumizaka, A method for evaluating nucleosome
stability with a protein-binding fluorescent dye. Methods 70, 119–126 (2014).
doi:10.1016/j.ymeth.2014.08.019 Medline
59. A. D. Schenk, S. Cavadini, N. H. Thomä, C. Genoud, Live analysis and reconstruction of
single-particle cryo-electron microscopy data with CryoFLARE. bioRxiv 861740
[Preprint]. 2 December 2019. https://doi.org/10.1101/861740.
60. J. Zivanov, T. Nakane, B. O. Forsberg, D. Kimanius, W. J. H. Hagen, E. Lindahl, S. H. W.
Scheres, New tools for automated high-resolution cryo-EM structure determination in
RELION-3. eLife 7, e42166 (2018). doi:10.7554/eLife.42166 Medline
61. K. Zhang, Gctf: Real-time CTF determination and correction. J. Struct. Biol. 193, 1–12
(2016). doi:10.1016/j.jsb.2015.11.003 Medline
62. T. Wagner, F. Merino, M. Stabrin, T. Moriya, C. Antoni, A. Apelbaum, P. Hagel, O. Sitsel,
T. Raisch, D. Prumbaum, D. Quentin, D. Roderer, S. Tacke, B. Siebolds, E. Schubert, T.
R. Shaikh, P. Lill, C. Gatsogiannis, S. Raunser, SPHIRE-crYOLO is a fast and accurate
fully automated particle picker for cryo-EM. Commun. Biol. 2, 218 (2019).
doi:10.1038/s42003-019-0437-z Medline
63. A. Punjani, J. L. Rubinstein, D. J. Fleet, M. A. Brubaker, cryoSPARC: Algorithms for rapid
unsupervised cryo-EM structure determination. Nat. Methods 14, 290–296 (2017).
doi:10.1038/nmeth.4169 Medline
64. S. H. Scheres, RELION: Implementation of a Bayesian approach to cryo-EM structure
determination. J. Struct. Biol. 180, 519–530 (2012). doi:10.1016/j.jsb.2012.09.006
Medline
65. P. B. Rosenthal, R. Henderson, Optimal determination of particle orientation, absolute hand,
and contrast loss in single-particle electron cryomicroscopy. J. Mol. Biol. 333, 721–745
(2003). doi:10.1016/j.jmb.2003.07.013 Medline
66. S. Chen, G. McMullan, A. R. Faruqi, G. N. Murshudov, J. M. Short, S. H. W. Scheres, R.
Henderson, High-resolution noise substitution to measure overfitting and validate
resolution in 3D structure determination by single particle electron cryomicroscopy.
Ultramicroscopy 135, 24–35 (2013). doi:10.1016/j.ultramic.2013.06.004 Medline
67. P. D. Adams, P. V. Afonine, G. Bunkóczi, V. B. Chen, I. W. Davis, N. Echols, J. J. Headd,
L.-W. Hung, G. J. Kapral, R. W. Grosse-Kunstleve, A. J. McCoy, N. W. Moriarty, R.
Oeffner, R. J. Read, D. C. Richardson, J. S. Richardson, T. C. Terwilliger, P. H. Zwart,
PHENIX: A comprehensive Python-based system for macromolecular structure solution.
Acta Cryst. D66, 213–221 (2010). doi:10.1107/S0907444909052925 Medline
68. J. M. de la Rosa-Trevín, J. Otón, R. Marabini, A. Zaldívar, J. Vargas, J. M. Carazo, C. O. S.
Sorzano, Xmipp 3.0: An improved software suite for image processing in electron
microscopy. J. Struct. Biol. 184, 321–328 (2013). doi:10.1016/j.jsb.2013.09.015 Medline
69. K. Naydenova, C. J. Russo, Measuring the effects of particle orientation to improve the
efficiency of electron cryomicroscopy. Nat. Commun. 8, 629 (2017).
doi:10.1038/s41467-017-00782-3 Medline
70. P. R. Baldwin, D. Lyumkis, Non-uniformity of projection distributions attenuates resolution
in Cryo-EM. Prog. Biophys. Mol. Biol. 150, 160–183 (2020).
doi:10.1016/j.pbiomolbio.2019.09.002 Medline
71. P. Emsley, K. Cowtan, Coot: Model-building tools for molecular graphics. Acta Cryst. D60,
2126–2132 (2004). doi:10.1107/S0907444904019158 Medline
72. T. C. Terwilliger, F. Dimaio, R. J. Read, D. Baker, G. Bunkóczi, P. D. Adams, R. W.
Grosse-Kunstleve, P. V. Afonine, N. Echols, phenix.mr_rosetta: Molecular replacement and
model rebuilding with Phenix and Rosetta. J. Struct. Funct. Genomics 13, 81–90 (2012).
doi:10.1007/s10969-012-9129-3 Medline
73. F. DiMaio, Y. Song, X. Li, M. J. Brunner, C. Xu, V. Conticello, E. Egelman, T. Marlovits,
Y. Cheng, D. Baker, Atomic-accuracy models from 4.5-Å cryo-electron microscopy data
with density-guided iterative local refinement. Nat. Methods 12, 361–365 (2015).
doi:10.1038/nmeth.3286 Medline
74. A. A. Lebedev, P. Young, M. N. Isupov, O. V. Moroz, A. A. Vagin, G. N. Murshudov,
JLigand: A graphical tool for the CCP4 template-restraint library. Acta Cryst. D68, 431–
440 (2012). doi:10.1107/S090744491200251X Medline
75. I. W. Davis, A. Leaver-Fay, V. B. Chen, J. N. Block, G. J. Kapral, X. Wang, L. W. Murray,
W. B. Arendall 3rd, J. Snoeyink, J. S. Richardson, D. C. Richardson, MolProbity:
All-atom contacts and structure validation for proteins and nucleic acids. Nucleic Acids Res.
35, W375–W383 (2007). doi:10.1093/nar/gkm216 Medline
76. Y. Q. Feng, J. Seibler, R. Alami, A. Eisen, K. A. Westerman, P. Leboulch, S. Fiering, E. E.
Bouhassira, Site-specific chromosomal integration in mammalian cells: Highly efficient
CRE recombinase-mediated cassette exchange. J. Mol. Biol. 292, 779–785 (1999).
doi:10.1006/jmbi.1999.3113 Medline
77. F. Mohn, M. Weber, M. Rebhan, T. C. Roloff, J. Richter, M. B. Stadler, M. Bibel, D.
Schübeler, Lineage-specific polycomb targets and de novo DNA methylation define
restriction and potential of neuronal progenitors. Mol. Cell 30, 755–766 (2008).
doi:10.1016/j.molcel.2008.05.007 Medline
78. C. Magis, J.-F. Taly, G. Bussotti, J.-M. Chang, P. Di Tommaso, I. Erb, J. Espinosa-Carrasco,
C. Notredame, T-Coffee: Tree-based consistency objective function for alignment
evaluation. Methods Mol. Biol. 1079, 117–129 (2014). doi:10.1007/978-1-62703-646-7_7
Medline
79. A. M. Waterhouse, J. B. Procter, D. M. Martin, M. Clamp, G. J. Barton, Jalview Version 2—
A multiple sequence alignment editor and analysis workbench. Bioinformatics 25, 1189–
1191 (2009). doi:10.1093/bioinformatics/btp033 Medline
80. L. Hou, Y. Srivastava, R. Jauch, Molecular basis for the genome engagement by Sox
proteins. Semin. Cell Dev. Biol. 63, 2–12 (2017). doi:10.1016/j.semcdb.2016.08.005
Medline
81. T. Burnley, C. M. Palmer, M. Winn, Recent developments in the CCP-EM software suite.
Acta Cryst. D73, 469–477 (2017). doi:10.1107/S2059798317007859 Medline
82. G. N. Murshudov, P. Skubák, A. A. Lebedev, N. S. Pannu, R. A. Steiner, R. A. Nicholls, M.
D. Winn, F. Long, A. A. Vagin, REFMAC5 for the refinement of macromolecular crystal
structures. Acta Cryst. D67, 355–367 (2011). doi:10.1107/S0907444911001314 Medline
83. B. Ekundayo, T. J. Richmond, T. Schalch, Capturing Structural Heterogeneity in Chromatin
Fibers. J. Mol. Biol. 429, 3031–3042 (2017). doi:10.1016/j.jmb.2017.09.002 Medline