ResponseToReviewers_WuTAL1revised

Response to reviews, MS ID#: GENOME/2013/164830
MS TITLE: Dynamic shifts in occupancy by TAL1 are guided by GATA factors and drive large-scale
reprogramming of gene expression during hematopoiesis
Weisheng Wu et al.
The comments from reviewers are in black, and our responses are in blue.
Reviewer 1
Reviewer 1 Comments for the Author...
In this manuscript, Wu and colleagues investigate the role of TAL1 in hematopoiesis, and specifically
how changes in the binding specificity of TAL1 are modulated by other transcription factors and
contribute to commitment to the erythroid lineage. The authors identify the GATA factors as key
transcription factors in this process, and demonstrate that the ensuing choreography of TAL1 occupancy
is highly dynamic through differentiation.
I am overall enthusiastic about the manuscript. My major critique is that the manuscript suffers, in my
opinion, from multiple sections that are very dense and difficult to work through. In several of these
cases, it was difficult for me to figure out what was the main point of the section, and it took several
reads before I understood. Much of the discussion closely recapitulates the results, and it would perhaps
be better if the discussion was focused on integrating the complex results into a bigger picture, and
putting them into context. Overall, I feel that this is interesting, revealing, and well-executed research
that would strongly benefit from efforts to streamline the text and to focus attention on the key points.
 We are gratified by the complimentary comments, and we have worked to improve clarity and
streamline the text in our revision. We revised the sections specifically commented on by the Reviewer,
and made revisions throughout to improve clarity. We re-wrote the Discussion to emphasize
how our results integrate into a larger picture.
Specific critiques / descriptions of some areas where writing could be improved:
I struggle with the analysis presented in Table 2 (pages 7-8) of the manuscript. In this section, the
authors demonstrate a divergence in the function of candidate TAL1 target genes. They describe how
TAL1 targets in precursor cells are enriched in functions such as proliferation and apoptosis, while
targets in megakaryocytic cells and erythroid cells are enriched in related functions, respectively. In
addition to being difficult to parse through, I was also distracted/confused by several results that seem
to suggest that the functional enrichment is not specific. For example, several of the megakaryocyte gene
categories are also enriched in TAL1 target genes that are specific to erythroid cells (e.g. abnormal
megakaryocyte morphology, p = 2E-129). Also, the authors describe enriched innate immunity in the
gene targets of Ter119- cells, but it appears that innate immunity is more strongly enriched in the G1E
cell type. Anything that could be done to make this section more approachable would, in my opinion,
greatly benefit the reader.
 The reviewer’s comment led us to make extensive revisions to this section. We now emphasize
from the start of the section two confounding factors in this analysis: (1) each gene is associated
with multiple TAL1 OSs, each of which can have a different occupancy pattern, and (2) the target
for each TAL1 OS is unknown. For the latter issue, we took two approaches. One is an inclusive
method that assigns every gene within an enhancer-promoter unit (EPU) as a potential target
for a TAL1 OS. The second, more exclusive method assigns the gene with the nearest TSS as the
target. In the inclusive approach, each TAL1 OS has multiple gene targets. These several
multiplicities allow a given gene to be included in the functional term enrichment for more than
one TAL1 OS type (each column in the Table). We emphasize even more strongly in the revised
text that the results are significant and robust, even with these serious confounding issues that
could have hidden the associations. We also added the point that some TAL1 OSs could be
playing a negative role, e.g. erythroid TAL1 OSs in megakaryocytic genes could be involved in
repression. Indeed, we found that the genes contributing to the enrichment for megakaryocytic
function in targets of TAL1 occupancy in erythroid cells tended to be expressed at higher levels
in megakaryocytes than in erythroblasts. This is now stated in the Results and presented as
Supplementary Figure 7. We also made extensive edits to improve clarity.
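For illustration only (this is not the code used for the manuscript), the two target-assignment approaches described above can be sketched in Python. The gene names, TSS coordinates, and EPU boundaries are hypothetical toy values on a single chromosome, and the function names are chosen for this sketch.

```python
from typing import Dict, List, Tuple

# Hypothetical toy data: gene name -> TSS coordinate (single chromosome).
GENES: Dict[str, int] = {"geneA": 1_000, "geneB": 5_000, "geneC": 9_000}
# Hypothetical enhancer-promoter units (EPUs) as half-open (start, end) intervals.
EPUS: List[Tuple[int, int]] = [(0, 6_000), (6_000, 12_000)]


def nearest_tss_target(site: int, genes: Dict[str, int]) -> str:
    """Exclusive method: the single gene whose TSS is closest to the occupied segment."""
    return min(genes, key=lambda g: abs(genes[g] - site))


def epu_targets(site: int, genes: Dict[str, int],
                epus: List[Tuple[int, int]]) -> List[str]:
    """Inclusive method: every gene whose TSS lies in the EPU containing the site."""
    for start, end in epus:
        if start <= site < end:
            return sorted(g for g, tss in genes.items() if start <= tss < end)
    return []  # the site falls outside every EPU


# A site at position 3,500 gets one target under the exclusive method
# and two targets under the inclusive method.
exclusive = nearest_tss_target(3_500, GENES)
inclusive = epu_targets(3_500, GENES, EPUS)
```

Under the inclusive method a single occupied segment contributes several candidate genes, which is one of the multiplicities that allows a gene to appear in the enrichment for more than one TAL1 OS type.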
I found Figure 5C confusing. I am not sure how to interpret TF binding in some cell types with TAL1
occupancy in different cell types. For example, what does the high overlap of GATA1 binding (and
comparatively less overlap with GATA2 binding) mean in the Gata1- cell line? What is driving TAL1
absent GATA1 and 2?
 We have re-written and expanded the presentation to address this issue. The reviewer is correct
that some of the TFs evaluated for overlap with TAL1 OSs are not present in the cell type in
which TAL1 was ascertained. We interpret these overlaps as reflecting re-use of the previously
bound DNA segments, e.g. early binding by GATA2 could mark places that are later bound by
TAL1 and GATA1. Likewise, GATA1 (and TAL1) can bind to places that were previously bound by
other factors. This is now explicitly stated in the revised text. Also, we added an analysis of GATA
switch sites in a later section. All this is consistent with our previous results that GATA1 binding
sites are largely pre-determined by the chromatin environment in the G1E cell line, even though GATA1
is absent there (Wu, et al. 2011). This is added to the revised Discussion.
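As a minimal sketch of how an overlap between TAL1 OSs in one cell type and TF-bound segments in another might be computed, the Python below reports the fraction of query intervals that overlap any reference interval. The coordinates are hypothetical, the intervals are assumed to be half-open and on one chromosome, and this is an illustration rather than the procedure used for Figure 5C.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open (start, end) on a single chromosome


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two half-open intervals share at least one position."""
    return a[0] < b[1] and b[0] < a[1]


def fraction_overlapping(query: List[Interval], reference: List[Interval]) -> float:
    """Fraction of query intervals that overlap at least one reference interval."""
    if not query:
        return 0.0
    hits = sum(any(overlaps(q, r) for r in reference) for q in query)
    return hits / len(query)


# Hypothetical example: TAL1 OSs called in one cell type, GATA2 OSs in an earlier one.
tal1_sites = [(100, 200), (500, 600), (900, 1_000)]
gata2_sites = [(150, 250), (590, 700)]
frac = fraction_overlapping(tal1_sites, gata2_sites)  # two of three sites overlap
```

The quadratic scan is adequate for a toy example; for genome-scale peak sets a sorted sweep or an interval tree would be the usual choice.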
In Figure 6B, the legend is confusing because it aligns with the cell condition. Moving the legend would
make the figure easier to read. In the same figure, I struggle to understand what it means for the sites
with indirect gain or retention of TAL1 to be depleted in _both_ repression and induction. Intuitively,
this seems contradictory -- are these sites just largely non-functional? Perhaps it would help to be more
clear about how the enrichment and depletion are measured in these analyses.
 We appreciate the recommendations for clarification. The legend with the + and – designations
was found to be redundant, and it was removed. The method for evaluating enrichment and
depletion, relative to the induction and repression frequency of all TAL1 occupied genes, now is
explained more completely in the Results and Methods. Also, two new graphs were generated
to convey the results more clearly (Fig. 6B). Because the reference point to calculate enrichment
of induction is independent of the reference point for enrichment of repression (the two dotted
lines now in the bar plot), it is possible for the presumptive target genes for a group of TAL1 OSs
to be depleted for both induction and repression. The question of whether the sites
associated with groups of genes depleted for both induction and repression could be
nonfunctional led to a new paragraph in the Discussion, where we acknowledge the possibility that
some of this binding could be “opportunistic” and point to further experiments to address this
issue.
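The point about independent reference values can be illustrated with a small Python sketch. It assumes, for illustration, that enrichment is expressed as the ratio of a group's response frequency to the corresponding frequency among all TAL1-occupied genes; the exact statistic in the manuscript may differ, and all numbers below are hypothetical.

```python
from typing import Tuple


def response_ratios(n_induced: int, n_repressed: int, n_genes: int,
                    ref_induced: float, ref_repressed: float) -> Tuple[float, float]:
    """Ratio of a group's induction and repression frequencies to their references.

    A ratio above 1 indicates enrichment and a ratio below 1 indicates depletion.
    The two references are separate numbers, so the two ratios vary independently.
    """
    if n_genes <= 0 or ref_induced <= 0 or ref_repressed <= 0:
        raise ValueError("group size and reference frequencies must be positive")
    induced_ratio = (n_induced / n_genes) / ref_induced
    repressed_ratio = (n_repressed / n_genes) / ref_repressed
    return induced_ratio, repressed_ratio


# Hypothetical reference: 30% of all TAL1-occupied genes are induced, 20% repressed,
# and the rest do not change. A group of 100 presumptive target genes with
# 15 induced and 10 repressed is then depleted for both responses.
ind, rep = response_ratios(15, 10, 100, ref_induced=0.30, ref_repressed=0.20)
both_depleted = ind < 1 and rep < 1
```

Because a third outcome (no change in expression) exists, a group dominated by unresponsive genes falls below both reference lines at once, which is the situation described for the sites with indirect gain or retention of TAL1.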
Reviewer 2
This manuscript reports the analysis of genomic occupancy by the key transcriptional regulator TAL1
across hematopoietic differentiation in progenitor cells, at various stages of erythroid maturation and in
megakaryocytes. One of the aims was to determine whether the genes regulated by TAL1 in the
different cell types are distinct and how differential binding might occur, in order to get better insight
into TAL1 function. The authors have compiled their own data with that from previously published
reports to correlate the pattern of TAL1 genomic occupancy with gene expression, histone modifications,
DNA motifs and co-occupancy by other critical transcription factors.
This is a very thorough and integrated analysis that offers a comprehensive map and a clear view of the
highly dynamic shifts in TAL1-occupied genomic sites across hematopoiesis. This is, to my knowledge,
the first time a study of that kind is reported. It is a very good demonstration of how large-scale data
analyses can validate and extend important previous observations, thereby linking individual studies in a
meaningful way. The authors have created a complete genomic platform that will facilitate further
analyses of some of the regulatory mechanisms underlying lineage specification and erythroid
differentiation. This is an important resource for the community that will certainly serve as a paradigm
for the study of other transcription factors and differentiation systems.
 We thank the reviewer for these comments.
Some interesting observations are made that could be further developed, as detailed below.
1. Page 8. The increasing number of TAL1 binding sites in the Cpox locus as erythroid differentiation
progresses is of interest but its significance is unclear and not discussed. Would inspection of the
sequences underlying the peaks provide some clues? Could the authors discuss how this might lead to
increased levels of expression as differentiation proceeds?
 We examined this locus more carefully in response to these questions. Matches to transcription
factor binding sites for GATA factors and KLF proteins, as well as E-boxes, were found underlying
the peaks. These motifs, especially the GATA and KLF motifs, are enriched at a large number of
TAL1 OSs, which is discussed in a separate section. Thus we did not add a motif analysis at this
point in the manuscript. The observation of multiple KLF binding site motifs motivated us to
examine another KLF1 occupancy dataset (Tallack et al, 2010, Genome Research), which showed
peaks overlapping the TAL1 OSs in Cpox, and we modified the figure to include this. Also, to
address the potential significance of the additional binding sites, we added the statement
“These additional sites of binding by TAL1 and other hematopoietic transcription factors may
serve to keep the Cpox gene expressed while the bulk of the genome is repressed during later
stages of erythroid maturation.” Inspection of both the microarray and RNA-seq data during
erythroid maturation shows that Cpox is already expressed at a moderate level in erythroid
progenitors, and its level of expression increases about two-fold during maturation. Thus the
additional TAL1 binding sites may not be needed for a strong increase in expression, at least for
this locus, but there is a need to block the encroaching repressive chromatin. We also added to
the Discussion some thoughts about the complex relationships between number of TF OSs and
expression levels.
2. Page 8. How do the authors explain the complex and changing binding pattern of TAL1 in some genes
such as Cbfa2t3, with limited changes in expression levels? As in point 1, what are the sequences
underlying each of those peaks and could different combinations of DNA motifs account for the
differential binding?
 This is a challenging observation, which we now emphasize more in the text in response to this
comment. After observing “This indicates a complex set of regulatory regions that are utilized
dynamically during hematopoiesis,” we expanded the text to say “Remarkably, these dynamic
changes in occupancy are not accompanied by large changes in expression of Cbfa2t3,
suggesting that distinct sets of TF-bound DNA segments are utilized in different lineages to
achieve a similar level of expression”. This was not what we expected, and it is not clear why
different sets of CRMs would be used in different lineages to achieve the same end, but we
agree with the reviewer that this is important to explicitly point out. We also discuss possible
explanations for the multiple binding sites per gene in the Discussion (complex regulation,
redundancy, opportunistic binding) and suggest experiments that could test them. As in our
response to comment 1, we devote a separate section to motif analysis, which leads to some
helpful insights about other TFs that help guide TAL1 to lineage-specific binding sites. Thus we
do not describe separately the motif occurrences underlying TAL1 OSs in Cbfa2t3.
3. Page 11. The analysis of TAL1 binding in G1E cells before and after induction of GATA1 expression
gives a good picture of the possible combinations of TAL1 and GATA1 binding and their association with
gene expression. It would be interesting to take this further and provide some data on how the interplay
between GATA2 and GATA1, as hinted at in the text, might explain TAL1 genomic binding. To do this, the
authors might want to incorporate in their study data from the recent paper by Suzuki et al (Genes to
Cells, 2013) who have contrasted GATA1 and GATA2 binding by ChIP in primary erythroid cells. This
might give molecular insights into how the shifts in TAL1 occupancy may occur during erythroid
differentiation and thereby strengthen one of the key points of the paper.
 We thank the reviewer for this suggestion to explore the role of GATA2. Suzuki et al (2013)
mapped sites of occupancy by GATA2 and GATA1 in a cell model system similar to the G1E
system, using ChIP-chip. They found about 1800 GATA2-occupied segments and 1600 GATA1
OSs. We wanted to compare their results to the approximately 4000 GATA2 OSs and 14,000
GATA1 OSs we had previously reported, but after multiple contacts with the authors, we were
not able to obtain their peak calls. It does appear from the results in the paper that we are
seeing similar patterns, albeit with quite different numbers of bound sites. Following the
suggestion of the reviewer, we added an analysis of how TAL1 occupancy is influenced by
GATA2 binding, in particular at the GATA switch sites. We added a panel (C) to Fig. 6, a
Supplementary Figure 10, a new paragraph at the end of the Results, and a new paragraph in
the Discussion. The new analysis reveals two prominent classes of TAL1 OSs at GATA switch sites.
The most abundant are sites at which TAL1 is retained after the switch, and these are in a
category associated with gene induction. The other class is characterized by a loss of TAL1 after
the switch, which is a well-known model for repression during erythroid differentiation.
Other points
4. The authors need to soften some of their statements with regards to the novelty of their findings. As
examples, the fact that Gata motifs are stronger determinants of TAL1 binding than Ebox motifs has
been reported previously: (i) before large-scale ChIP assays were performed, as mentioned in Palii et al
(Embo J 2011) and (ii) more recently in ChIP-seq studies by the authors themselves (Tripic et al) and
others. Similarly, the broader statement that recruitment of TAL1 to different genomic sites in different
cell contexts operates through association with distinct, additional regulators clearly comes through this
analysis to strengthen previous observations. It is important to mention that molecular mechanisms that
underlie the redistribution of TAL1 in different cell types are now coming to light, as described in El
Omari et al, Cell Reports, 2013.
 We agree with the reviewer and we have revised our text to ensure that the previous reports are
incorporated. We thank the reviewer for emphasizing the insights from the structural analysis in
El Omari et al, and we have incorporated these into the Discussion.
5. Page 4. The sentence referring to the heptad of TFs described by Wilson et al suggests the existence
of one large multiprotein complex containing the 7 TFs. I don’t think it has ever been shown that such a
complex exists; it is more likely that several smaller complexes may form on a given element. Please
rephrase.
 We have rephrased this to refer to co-associating proteins rather than a large complex.
6. The recent paper by Sanjuan-Pla et al, Nature 2013 identifies a platelet-primed HSC population and
should therefore be referred to when mentioning the priming of megakaryopoiesis in HPC7 cells.
 We thank the reviewer for pointing this out, and we now incorporate the reference as
recommended.
7. Suppl Figure S3. The correlation in signal strength for TAL1 and GATA1 ChIP-seq in G1E-ER4+E2 (p.11)
is not clear. Please provide additional examples.
 We thank the reviewer for pointing out this error. We meant to refer to Supplementary Figure
9C, and we have corrected this.
8. Page 9. The authors make the assumption that the TAL1-bound DNA segments are active in HPC7 cells.
However, as TAL1 interacts with co-repressors, one would assume that it can be associated with
repressed elements. The authors should define “active chromatin”.
 We now explicitly state our operational definition of “active chromatin” as being in a state
dominated by the histone modifications H3K4me1, H3K4me3, or H3K36me3. The issue of
association with co-repressors and how that impacts chromatin is interesting and it needs more
data. For all the TFs we have examined, most of the occupancy occurs in DNase-accessible
chromatin with histone modifications associated with activity. We rarely see TF binding to
chromatin with the repressive modifications (an example is in Fig. 4). We hope that future work
will allow a direct examination of chromatin states in HPC7 cells, or better still, in primary
multipotential progenitors. Given the current data, we think that stating this assumption that
the HPC7 binding is in “active chromatin” allows us to see some evidence of a more repressed
state for some genes upon maturation. However, that conclusion is dependent on the accuracy
of that initial assumption.
9. Page 9, last paragraph and page 10, second paragraph, Supplemental Figure 1 should be corrected to
Supplemental Figure 7.
 Thanks for pointing this out; we made the change.
10. Figure 6B. This figure is confusing. Please label the occupancy pattern as in Figure 7. The columns of
+ and – should be relabeled and ordered differently. It should be clear that the first column corresponds
to binding of TAL1 in G1E cells, the second column of GATA1 in G1E-ER4+E2 cells and the third column of
TAL1 in G1E-ER4+E2 cells (to reflect the representation of the occupancy pattern).
 We appreciate the recommendations to improve clarity, and we made several changes,
including dropping the + and – designations. As discussed in more detail in our reply to Reviewer
1, we also re-made the graphs to make it clearer how we assessed frequency of induction and
repression, and calculated enrichment or depletion relative to the frequency of those responses
for all genes that are potential targets for TAL1 occupancy.
11. Figure 8. The cartoon suggests direct interactions between TAL1/E47 and RUNX1, RUNX1 and EKLF,
GATA2 and ERG. Have these direct interactions been demonstrated or is this just for drawing purposes?
Please clarify in the legend.
 The reviewer is correct and we have clarified this in the legend. We also re-drew the figure to fit
better with the models in El Omari et al, 2013, and added a drawing for co-associations in
megakaryocytes.
12. Suppl Figures 3-6. The WT TAL1 track should be relabeled Epro TAL1.
 We made this change.